Stephen Winters-Hilt - Informatics and Machine Learning

Informatics and Machine Learning: From Martingales to Metaheuristics

Discover a thorough exploration of how to use computational, algorithmic, statistical, and informatics methods to analyze digital data.

Ad hoc signal acquisition refers to finding the solution for “this” situation (whatever “this” is) without consideration of wider application. In other words, the solution is strongly data dependent. Data dependent methodologies are, by definition, not defined at the outset, but must be invented as the data begins to be understood. As with data dependency in non‐evolutionary search metaheuristics, where no search method is guaranteed to always work well, here there is no optimal signal acquisition method known in advance. This simply restates a fundamental limit from non‐evolutionary search metaheuristics in another form [1, 3]. What can be done, however, is to assemble the core tools and techniques from which a solution can be constructed, and to perform a bootstrap algorithmic learning process with those tools (examples in what follows) to arrive at a functional signal acquisition on the data being analyzed. A universal, automated, bootstrap learning process may eventually be possible using evolutionary learning algorithms. This is related to the co‐evolutionary Free Lunch Theorem [1, 3], which is discussed in Chapter 12.

“Bootstrap” refers to a method of problem solving in which the problem is solved by seemingly paradoxical measures (the name references Baron von Munchausen, who freed the horse he was riding from a bog by pulling himself, and the horse with him, up by his bootstraps). Such algorithmic methods often involve repeated passes over the data sequence, with improved priors or a trained filter, among other things, to improve performance. The bootstrap amplifier from electrical engineering is an amplifier circuit where part of the output is used as input, particularly at start‐up (known as bootstrapping), allowing proper self‐initialization to a functional state (by amplifying ambient circuit noise in some cases). The bootstrap finite state automaton (FSA) proposed here is a meta‐algorithmic method in that performance “feedback” with learning is used in algorithmic refinements, with iterated meta‐algorithmic learning, to arrive at a functional signal acquisition status.
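
The idea of repeated passes with improved priors can be illustrated with a minimal sketch. This is not the book's bootstrap FSA. It only assumes that a crude first-pass estimate of the background statistics is refined by excluding the samples that the previous pass flagged as anomalous. The function name `bootstrap_baseline`, the cutoff `k`, and the pass limit `max_passes` are hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def bootstrap_baseline(samples, k=3.0, max_passes=10):
    """Refine a background mean/std estimate over repeated passes.

    Each pass uses the previous estimate as its prior: samples further
    than k standard deviations from the current mean are treated as
    signal and left out of the next background estimate.
    """
    # Crude first-pass prior: statistics of everything, signal included.
    mu, sigma = mean(samples), pstdev(samples)
    for _ in range(max_passes):
        background = [x for x in samples if abs(x - mu) <= k * sigma]
        if not background:
            break
        new_mu, new_sigma = mean(background), pstdev(background)
        if new_mu == mu and new_sigma == sigma:
            break  # estimate has stopped changing
        mu, sigma = new_mu, new_sigma
    return mu, sigma


# Usage: quiet background with a short burst of large values.
data = [0.0, 0.1, -0.1] * 20 + [5.0] * 5
baseline_mean, baseline_std = bootstrap_baseline(data)
```

The first pass is contaminated by the burst. Later passes exclude it, so the estimate settles on the background alone.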

Acquisition is often all that is needed in a signal analysis problem, where a basic means to acquire the signals is sought, to be followed by a basic statistical analysis on those signals and their occurrences. Various methods for signal acquisition using FSA constructs are described in what follows, with a focus on statistical anomalies to identify the presence of signal and “lock on” [1, 3]. The signal acquisition is initially guided only by statistical measures that recognize anomalies. Informatics methods and information theory measures are central to the design of a good FSA acquisition method, however, and will be reviewed in the signal acquisition context [1, 3], along with hidden Markov models (HMMs).

Thus, FSA processes allow signal regions to be identified, or “acquired,” in O(L) time. Furthermore, in that same order of time complexity, an entire panoply of statistical moments can also be computed on the signals (and used in a bootstrap learning process). The O(L) feature extraction of statistical moments on the signal region acquired may suffice for localized events and structures. For sequential information or events, however, there is often a non‐local, or extended structural, aspect to the signal sought. In these situations we need a general, powerful way to analyze sequential signal data that is stochastic (random, but with statistics, such as the average, that may be unchanging over time if the data is “stationary,” for example). The general method for performing stochastic sequential analysis (SSA) is via HMMs, as will be extensively described in Chapters 6 and 7, and briefly summarized in Section 1.5 that follows. HMM approaches require an identification of “states” in the signal analysis. If an identification of states is difficult, such as in situations where meaning can change according to context (e.g. language), then HMMs may not be useful. Text and language analytics are described in Chapters 5 and 13, and briefly outlined in the next section.
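
A minimal sketch of O(L) acquisition with moment extraction is given here, under stated assumptions. A two‐state automaton (background or signal) scans the data once and flags a sample as anomalous when it deviates from a known baseline by more than a fixed threshold. The names `acquire_signals`, `baseline`, `threshold`, and `min_length` are hypothetical. The book's FSA constructs may differ.

```python
def acquire_signals(samples, baseline, threshold, min_length=2):
    """Single pass over the data; returns acquired regions with moments.

    Each region is a dict with start (inclusive), end (exclusive),
    mean, and variance, all accumulated during the same scan.
    """
    regions = []
    start, n, total, total_sq = None, 0, 0.0, 0.0

    def close(end):
        # Keep only regions long enough to be more than a noise spike.
        if n >= min_length:
            m = total / n
            regions.append({"start": start, "end": end, "mean": m,
                            "variance": total_sq / n - m * m})

    for i, x in enumerate(samples):
        if abs(x - baseline) > threshold:
            if start is None:  # transition: background -> signal
                start, n, total, total_sq = i, 0, 0.0, 0.0
            n += 1
            total += x
            total_sq += x * x
        elif start is not None:  # transition: signal -> background
            close(i)
            start = None
    if start is not None:  # data ended while still in the signal state
        close(len(samples))
    return regions


# Usage: two acquired regions; the lone spike at index 7 is rejected.
found = acquire_signals([0, 0, 5, 6, 5, 0, 0, 7, 0, 4, 4], baseline=0, threshold=2)
```

Each sample is visited once and only running sums are kept, so the cost grows linearly with L. The variance formula used here is the simple sum‐of‐squares form, which is adequate for a sketch but can lose precision on large values.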

1.4 Feature Extraction and Language Analytics

The FSA sequential‐data signal processing, and extraction of statistical moments on windowed data, will be shown in Chapter 2 to be O(L), with L the size of the data (double the data and you double the processing time). If HMMs can be used, with their introduction of states (the sequential data is described as a sequence of “hidden” states), then the computational cost goes as O(LN²), where N is the number of states. If N = 10, then this could take 100 times more computational time than an FSA‐based O(L) computation, so HMMs can generally be a lot more expensive in terms of computational time. Even so, if you can benefit from an HMM it is generally feasible to use one, even if hardware specialization (CPU farm utilization, etc.) is required. The problem arises when you do not have a strong basis for an HMM application, e.g. when there is no strong basis for delineating the states of the system of communication under study. This is the problem encountered in the study of natural languages (where there is significant context dependency). In Chapter 5 we look into FSA analysis for language by doing some basic text analytics.
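
The source of the O(LN²) cost can be seen in a minimal sketch of the Viterbi algorithm, one standard HMM decoding method. It is offered as an illustration rather than as the implementation used in later chapters. The function name and the log‐probability table layout are assumptions. For each of the L observations, every one of the N destination states is compared against every one of the N source states.

```python
import math


def viterbi(observations, log_start, log_trans, log_emit):
    """Most probable hidden-state path for a sequence of observations.

    log_start[j]    : log P(first state is j)
    log_trans[i][j] : log P(next state is j | current state is i)
    log_emit[j][o]  : log P(observing o | state is j)
    """
    n_states = len(log_start)
    score = [log_start[j] + log_emit[j][observations[0]] for j in range(n_states)]
    back = []
    for obs in observations[1:]:              # L - 1 steps
        new_score, pointers = [], []
        for j in range(n_states):             # N destination states
            # N source states examined per destination: the N^2 factor.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emit[j][obs])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Usage: a two-state model with "sticky" states (illustrative numbers).
lg = math.log
start = [lg(0.5), lg(0.5)]
trans = [[lg(0.9), lg(0.1)], [lg(0.1), lg(0.9)]]
emit = [[lg(0.9), lg(0.1)], [lg(0.1), lg(0.9)]]
decoded = viterbi([0, 0, 0, 1, 1, 1], start, trans, emit)
```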

Chapter 5 shows some (very) basic extensions to an FSA analysis in applications to text. This begins with a simple frequency analysis on words, which for some classics (in their original languages) reveals important word‐frequency results with implied meanings intended by the author (polysemous word usage by Machiavelli, for example). The frequency of word groupings in a given text can be studied as well, with some useful results from texts of sufficient size with clear stylistic conventions by the author. Authors who structure their lines of text according to iambic pentameter (Shakespeare, for example) can also be identified from the profile (histogram) of syllables used on each line (i.e. 10 syllables per line will dominate for iambic pentameter).
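
A word and word‐pair frequency count of this kind can be sketched in a few lines. This is a minimal illustration and not the Chapter 5 code. The function name `word_counts` and the tokenization rule (runs of letters, with an optional internal apostrophe) are assumptions. Syllable counting is left out because it needs language‐specific rules.

```python
import re
from collections import Counter


def word_counts(text):
    """Return (word frequencies, adjacent word-pair frequencies)."""
    # [^\W\d_] matches any Unicode letter, so accented text is handled.
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    singles = Counter(words)
    pairs = Counter(zip(words, words[1:]))
    return singles, pairs


# Usage
singles, pairs = word_counts("To be, or not to be")
top_words = singles.most_common(2)
```

Both counters are filled in a single scan of the text, so the processing stays O(L).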

Text analytics can also extend what is still O(L) processing to mapping the mood or sentiment of text samples by use of word‐scored sentiment tables. The generation and use of such sentiment tables is its own craft, usually proprietary, so only minimal examples are given. Thus Chapter 5 shows an elaboration of FSA‐based analysis that might be done when there is no clear definition of state, such as in language. Natural language processing (NLP) in general encompasses a much more complete grammatical knowledge of the language, but in the end NLP and the FSA‐based “add‐on” both still suffer from not being able to manage word context easily (the states cannot simply be words, since words can have different meanings according to context). The inability to use HMMs was a blockade to a “universal translator” that has since been overcome with the use of Deep Learning using neural networks (NNs) (Chapter 13), where immense amounts of translation data, such as the massive corpus of dual‐language Canadian Government proceedings, are sufficient to train a translator (English–French). Most of the remaining chapters focus on situations where a clear delineation of signal state can be given, and which thus benefit from the use of HMMs.
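
A word‐scored sentiment table can be applied as in the following minimal sketch. The table entries, their scores, and the function name `sentiment_score` are all hypothetical. Real tables are far larger and, as noted above, usually proprietary.

```python
import re

# Hypothetical word-scored sentiment table (illustrative values only).
SENTIMENT = {
    "good": 1.0, "great": 2.0, "happy": 1.5,
    "bad": -1.0, "terrible": -2.0, "sad": -1.5,
}


def sentiment_score(text, table=SENTIMENT):
    """Average table score over the scored words; 0.0 if none are scored."""
    words = re.findall(r"[a-z']+", text.lower())
    scored = [table[w] for w in words if w in table]
    return sum(scored) / len(scored) if scored else 0.0


# Usage
mood = sentiment_score("A great day, but a terrible ending and a sad one.")
blind_spot = sentiment_score("not good")  # scored as positive
```

The second call shows the context problem in miniature: a per‐word table scores “not good” as positive because it has no notion of negation.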

1.5 Feature Extraction and Gene Structure Identification

HMMs offer a more sophisticated signal recognition process than FSAs, but with greater computational space and time complexity [125, 126]. Like electrical engineering signal processing, HMMs usually involve preprocessing that assumes linear system properties or assumes observation is frequency band limited and not time limited, and thereby inherit the time‐frequency uncertainty relations, Gabor limit, and Nyquist sampling relations. FSA methods can be used to recover (or extract) signal features missed by HMM or classical electrical engineering signal processing. Even if the signal sought is well understood, and a purely HMM‐based approach is possible, this is often needlessly computationally intensive (slow), especially in areas where there is no signal. To address this there are numerous hybrid FSA/HMM approaches (such as BLAST [127]) that benefit from the O(L) complexity of FSA processing on a length L signal, and apply the more targeted HMM processing at O(LN²) complexity (where there are N states in the HMM).
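
The benefit of a hybrid approach can be made concrete with a rough cost estimate. This is a sketch under assumptions and not a result from the text: constant factors are ignored, and f is a hypothetical fraction of the data that the FSA scan passes on to the HMM.

```latex
\[
C_{\mathrm{FSA}} \sim L, \qquad
C_{\mathrm{HMM}} \sim L N^{2}, \qquad
C_{\mathrm{hybrid}} \sim L + f\,L N^{2}
\]
% Assumed illustrative values: N = 10 states, f = 0.05
\[
C_{\mathrm{hybrid}} \sim L\,(1 + 0.05 \times 100) = 6L
\quad \text{versus} \quad
C_{\mathrm{HMM}} \sim 100L
\]
```

Under these assumed values the hybrid scheme costs roughly 6L rather than 100L, and the saving grows as signal regions become sparser.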