Francesca Lazzeri - Machine Learning for Time Series Forecasting with Python


Machine Learning for Time Series Forecasting with Python: summary, description, and annotation


Learn how to apply the principles of machine learning to time series modeling with this indispensable resource.
Despite the centrality of time series forecasting, few business analysts are familiar with the power or utility of applying machine learning to time series modeling. Author Francesca Lazzeri, a distinguished machine learning scientist and economist, corrects that deficiency by providing readers with a comprehensive and approachable explanation and treatment of the application of machine learning to time series forecasting.
Written for readers who have little to no experience in time series forecasting or machine learning, the book comprehensively covers all the topics necessary to:
Understand time series forecasting concepts, such as stationarity, horizon, trend, and seasonality
Prepare time series data for modeling
Evaluate time series forecasting models’ performance and accuracy
Understand when to use neural networks instead of traditional time series models in time series forecasting
Machine Learning for Time Series Forecasting with Python is full of real-world examples, resources, and concrete strategies to help readers explore and transform data and develop usable, practical time series forecasts.
Perfect for entry-level data scientists, business analysts, developers, and researchers, this book is an invaluable and indispensable guide to the fundamental and advanced concepts of machine learning applied to time series modeling.

Machine Learning for Time Series Forecasting with Python: online preview excerpt


The univariate or multivariate nature of your forecasting model – Univariate data is characterized by a single variable. Univariate analysis does not deal with causes or relationships. Its descriptive properties can be summarized by estimates of central tendency (mean, mode, median), dispersion (range, variance, maximum, minimum, quartiles, and standard deviation), and frequency distributions. Univariate data analysis cannot determine relationships between two or more variables: correlations, comparisons, causes, explanations, and contingencies between variables. It supplies no information on dependent and independent variables and, as such, is insufficient for any analysis involving more than one variable.
To obtain results from such multiple-indicator problems, data scientists usually use multivariate data analysis. This approach not only considers several characteristics in a model but also brings to light the effect of external variables.
Time series forecasting can be either univariate or multivariate. The term univariate time series refers to a series that consists of single observations recorded sequentially over equal time increments. Unlike in other areas of statistics, the univariate time series model contains lagged values of itself as independent variables (itl.nist.gov/div898/handbook/pmc/section4/pmc44.htm). These lag variables can play the role of independent variables, as in multiple regression. The multivariate time series model is an extension of the univariate case and involves two or more input variables. It does not limit itself to its own past but also incorporates the past of other variables. Multivariate processes arise when several related time series are observed simultaneously over time, instead of a single series as in the univariate case.
Examples of univariate time series models are the ARIMA models that we will discuss in Chapter 4, “Introduction to Some Classical Methods for Time Series Forecasting.” Considering this question with regard to inputs and outputs may add a further distinction. The number of variables may differ between the inputs and outputs; that is, the data may not be symmetrical. For example, you may have multiple variables as input to the model and only be interested in predicting one of the variables as output. In this case, the model assumes that the multiple input variables aid, and are required in, predicting the single output variable.
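To make the role of lag variables concrete, the following is a minimal sketch of how a series could be framed as input/output rows, first using only its own past (univariate) and then also using the past of a second series (multivariate). The function name make_lagged_rows, the parameter names, and the load and temperature values are hypothetical and invented for illustration; they are not taken from the book.

```python
def make_lagged_rows(target, n_lags, exogenous=None):
    """Frame a series as (inputs, output) rows built from lagged values.

    target:    list of observations recorded at equal time increments.
    n_lags:    how many past values of each series to use as inputs.
    exogenous: optional dict mapping a name to another series of the same
               length; when given, the framing becomes multivariate.
    """
    exogenous = exogenous or {}
    for name, series in exogenous.items():
        if len(series) != len(target):
            raise ValueError(f"series {name!r} must match the target length")

    rows = []
    for t in range(n_lags, len(target)):
        # Univariate part: the target's own past, oldest lag first.
        inputs = list(target[t - n_lags:t])
        # Multivariate part: the past of every other observed series.
        for series in exogenous.values():
            inputs.extend(series[t - n_lags:t])
        rows.append((inputs, target[t]))
    return rows


# Hypothetical hourly data, invented for illustration only.
load = [10, 12, 13, 15, 14, 16]
temperature = [20, 21, 23, 24, 22, 25]

univariate_rows = make_lagged_rows(load, n_lags=2)
multivariate_rows = make_lagged_rows(
    load, n_lags=2, exogenous={"temperature": temperature}
)

print(univariate_rows[0])    # ([10, 12], 13)
print(multivariate_rows[0])  # ([10, 12, 20, 21], 13)
```

In both framings the output is a single variable; only the inputs differ. This matches the asymmetric case described above, where several input variables are used to predict one output variable.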

Single-step or multi-step structure of your forecasting model – Time series forecasting typically describes predicting the observation at the next time step. This is called a one-step forecast, as only one time step is to be predicted. In contrast, in multiple-step or multi-step time series forecasting problems the goal is to predict a sequence of values in a time series. Many time series problems involve the task of predicting a sequence of values using only the values observed in the past (Cheng et al. 2006). Examples of this task include predicting the time series for crop yield, stock prices, traffic volume, and electrical power consumption. There are at least four commonly used strategies for making multi-step forecasts (Brownlee 2017):
Direct multi-step forecast: The direct method requires creating a separate model for each forecast time step. For example, in the case of predicting energy consumption for the next two hours, we would need to develop one model for forecasting energy consumption in the first hour and a separate model for forecasting energy consumption in the second hour.
Recursive multi-step forecast: Multi-step-ahead forecasting can be handled recursively, where a single time series model is created to forecast the next time step, and the following forecasts are then computed using previous forecasts. For example, in the case of forecasting energy consumption for the next two hours, we would need to develop a one-step forecasting model. This model would be used to predict the next hour's energy consumption, and this prediction would then be used as input to predict the energy consumption in the second hour.
Direct-recursive hybrid multi-step forecast: The direct and recursive strategies can be combined to offer the benefits of both methods (Brownlee 2017). For example, a distinct model can be built for each future time step, but each model may use the forecasts made by models at prior time steps as input values. In the case of predicting energy consumption for the next two hours, two models can be built, and the output from the first model is used as an input for the second model.
Multiple output forecast: The multiple output strategy requires developing one model that is capable of predicting the entire forecast sequence. For example, in the case of predicting energy consumption for the next two hours, we would develop one model and apply it to predict the next two hours in one single computation (Brownlee 2017).
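The difference between the recursive and direct strategies can be sketched for the two-hour energy example. This is only an illustration under stated assumptions: the model is a simple least-squares line that maps the value at one time step to the value some hours later, the function names (fit_line, fit_horizon_model, recursive_forecast, direct_forecast) are hypothetical, and the consumption values are invented. The hybrid and multiple output strategies are not covered here.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_horizon_model(series, horizon):
    """Fit a model mapping the value at time t to the value at t + horizon."""
    return fit_line(series[:-horizon], series[horizon:])


def recursive_forecast(series, steps):
    """One one-step model; each forecast is fed back in as the next input."""
    intercept, slope = fit_horizon_model(series, horizon=1)
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = intercept + slope * last
        forecasts.append(last)
    return forecasts


def direct_forecast(series, steps):
    """A separate model per horizon, each fed the last observed value."""
    forecasts = []
    for h in range(1, steps + 1):
        intercept, slope = fit_horizon_model(series, horizon=h)
        forecasts.append(intercept + slope * series[-1])
    return forecasts


# Hypothetical hourly energy consumption, invented for illustration only.
consumption = [100, 104, 108, 112, 116, 120]

recursive = recursive_forecast(consumption, steps=2)
direct = direct_forecast(consumption, steps=2)
print(recursive)
print(direct)
```

On this perfectly linear toy series both strategies give the same two forecasts. On real data they generally differ: the recursive strategy can accumulate the error of earlier forecasts, while the direct strategy needs one fitted model per time step.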

Contiguous or noncontiguous time series values of your forecasting model – Time series whose observations are separated by a consistent temporal interval (for example, every five minutes, every two hours, or every quarter) are defined as contiguous (Zuo et al. 2019). On the other hand, time series that are not uniform over time may be defined as noncontiguous; very often the reason is missing or corrupt values. Before jumping to the methods of data imputation, it is important to understand why data goes missing. The three most common reasons are:
Missing at random: The propensity for a data point to be missing is not related to the missing value itself but is related to some of the observed data.
Missing completely at random: The fact that a certain value is missing has nothing to do with its hypothetical value or with the values of other variables.
Missing not at random: Two possible reasons are that the missingness depends on the hypothetical value itself or that it depends on some other variable's value.
In the first two cases, it is generally considered safe to remove the data with missing values, depending on how often they occur, while in the third case removing observations with missing values can produce a bias in the model. There are different solutions for data imputation depending on the kind of problem you are trying to solve, and it is difficult to provide a general solution. Moreover, because time series data has a temporal ordering, only some statistical methodologies are appropriate for it.
I have identified some of the most commonly used methods and listed them as a structural guide in Figure 1.7.
Figure 1.7: Handling missing data
As you can observe from the graph in Figure 1.7, listwise deletion removes all data for an observation that has one or more missing values. Particularly if the missing data is limited to a small number of observations, you may just opt to eliminate those cases from the analysis. However, in most cases it is disadvantageous to use listwise deletion, because the assumption that data is missing completely at random is rarely supported. As a result, listwise deletion can produce biased parameters and estimates.
Pairwise deletion analyzes all cases in which the variables of interest are present and thus makes maximum use of the available data on an analysis-by-analysis basis. A strength of this technique is that it increases the power of your analysis, but it has many disadvantages. It assumes that the missing data is missing completely at random. If you delete pairwise, you will end up with different numbers of observations contributing to different parts of your model, which can make interpretation difficult.
Deleting columns is another option, but it is generally better to keep data than to discard it. Sometimes you can drop a variable if its data is missing for more than 60 percent of the observations, but only if that variable is insignificant. Having said that, imputation is generally preferred over dropping variables.
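As a rough sketch of the ideas above, the following shows one way to detect a noncontiguous series and to contrast listwise with pairwise deletion. The function names (find_gaps, listwise_delete, pairwise_mean), the column names, and all readings are hypothetical and invented for illustration. None marks a missing value, and one hourly timestamp is absent altogether.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step):
    """Return (previous, next) pairs whose spacing differs from `step`."""
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a != step]


def listwise_delete(rows):
    """Keep only the rows in which every variable is observed."""
    return [row for row in rows if all(v is not None for v in row.values())]


def pairwise_mean(rows, column):
    """Mean of one column using every row where that column is observed."""
    values = [row[column] for row in rows if row[column] is not None]
    return sum(values) / len(values)


# Hypothetical readings, invented for illustration only.
start = datetime(2024, 1, 1, 0, 0)
timestamps = [start + timedelta(hours=h) for h in (0, 1, 2, 4)]  # hour 3 absent
rows = [
    {"load": 10.0, "temp": 20.0},
    {"load": None, "temp": 21.0},
    {"load": 14.0, "temp": None},
    {"load": 16.0, "temp": 23.0},
]

gaps = find_gaps(timestamps, timedelta(hours=1))
complete = listwise_delete(rows)

print(len(gaps))       # number of places where the hourly spacing is broken
print(len(complete))   # rows that survive listwise deletion
print(pairwise_mean(rows, "load"), pairwise_mean(rows, "temp"))
```

Listwise deletion keeps two of the four rows here, whereas the pairwise means each use three rows, and not the same three. This is the situation described above, in which different numbers of observations contribute to different parts of an analysis.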
