Chapter 6
Figure 6.1: The machine learning model workflow
Figure 6.2: The modeling and scoring process
Figure 6.3: First few rows of the energy data set
Figure 6.4: Load data set plot
Figure 6.5: Load data set plot of the first week of July 2014
Figure 6.6: Web service deployment and consumption
Figure 6.7: Energy demand forecast end-to-end data flow
Machine Learning for Time Series Forecasting with Python®
Francesca Lazzeri, PhD

Time series data is an important source of information used for future decision making, strategy, and planning operations in different industries: from marketing and finance to education, healthcare, and robotics. In the past few decades, machine learning model-based forecasting has also become a very popular tool in the private and public sectors.
Currently, most resources and tutorials for machine learning model-based time series forecasting fall into two categories: code demonstration repositories for specific forecasting scenarios, without conceptual details, and academic-style explanations of the theory behind forecasting and its mathematical formulas. Both of these approaches are very helpful for learning purposes, and I highly recommend using those resources if you are interested in understanding the math behind theoretical hypotheses.
This book fills that gap: in order to solve real business problems, it is essential to have a systematic and well-structured forecasting framework that data scientists can use as a guideline and apply to real-world data science scenarios. The purpose of this hands-on book is to walk you through the core steps of a practical model development framework for building, training, evaluating, and deploying your time series forecasting models.
The first part of the book (Chapters 1 and 2) is dedicated to the conceptual introduction of time series, where you can learn the essential aspects of time series representations, modeling, and forecasting.
In the second part (Chapters 3 through 6), we dive into autoregressive and automated methods for forecasting time series data, such as moving average, autoregressive integrated moving average, and automated machine learning for time series data. I then introduce neural networks for time series forecasting, focusing on concepts such as recurrent neural networks (RNNs) and the comparison of different RNN units. Finally, I guide you through the most important steps of model deployment and operationalization on Azure.
Along the way, I show in practice how these models can be applied to real-world data science scenarios by providing examples and using a variety of open-source Python packages and Azure. With these guidelines in mind, you should be ready to deal with time series data in your everyday work and select the right tools to analyze it.
What Does This Book Cover?
This book offers a comprehensive introduction to the core concepts, terminology, approaches, and applications of machine learning and deep learning for time series forecasting: understanding these principles leads to more flexible and successful time series applications.
In particular, the following chapters are included:
Chapter 1: Overview of Time Series Forecasting This first chapter of the book is dedicated to the conceptual introduction of time series, where you can learn the essential aspects of time series representations, modeling, and forecasting, such as time series analysis and supervised learning for time series forecasting. We will also look at different Python libraries for time series data and how libraries such as pandas, statsmodels, and scikit-learn can help you with data handling, time series modeling, and machine learning, respectively. Finally, I will provide you with general advice for setting up your Python environment for time series forecasting.
Chapter 2: How to Design an End-to-End Time Series Forecasting Solution on the Cloud The purpose of this second chapter is to provide an end-to-end systematic guide for time series forecasting from a practical and business perspective by introducing a time series forecasting template and a real-world data science scenario that we use throughout this book to showcase some of the time series concepts, steps, and techniques discussed.
Chapter 3: Time Series Data Preparation In this chapter, I walk you through the most important steps to prepare your time series data for forecasting models. Good time series data preparation produces clean and well-curated data, which leads to more practical, accurate predictions. Python is a very powerful programming language for handling data, offering an assorted suite of libraries with excellent support for time series analysis, such as SciPy, NumPy, Matplotlib, pandas, statsmodels, and scikit-learn. You will also learn how to perform feature engineering on time series data, with two goals in mind: preparing the proper input data set that is compatible with the machine learning algorithm requirements and improving the performance of machine learning models.
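To give a rough idea of what feature engineering on time series data can look like, here is a minimal pandas sketch. It is an illustration under stated assumptions, not the approach used in Chapter 3: the helper name make_features, the column name load, the chosen lags, and the synthetic hourly values are all hypothetical and do not come from the book's energy data set. The sketch derives calendar features from the timestamp index and adds lagged copies of the series, so that each row holds the inputs a typical machine learning algorithm expects.

```python
import numpy as np
import pandas as pd


def make_features(series: pd.Series, lags=(1, 2, 24)) -> pd.DataFrame:
    """Turn a time-indexed series into a table of model inputs (hypothetical helper)."""
    frame = pd.DataFrame({"load": series})
    # Calendar features derived from the DatetimeIndex.
    frame["hour"] = frame.index.hour
    frame["dayofweek"] = frame.index.dayofweek
    # Lag features: the value observed `lag` time steps earlier.
    for lag in lags:
        frame[f"load_lag_{lag}"] = frame["load"].shift(lag)
    # The earliest rows lack a full lag history (NaN), so they are dropped.
    return frame.dropna()


# Synthetic hourly series with a daily cycle, for illustration only.
index = pd.date_range("2014-07-01", periods=72, freq=pd.Timedelta(hours=1))
values = 100 + 10 * np.sin(2 * np.pi * np.arange(72) / 24)
series = pd.Series(values, index=index)

features = make_features(series)
print(features.head())
```

Dropping the rows with incomplete lag history is only one possible choice; depending on the model, imputing those values may be preferable.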