Forecasting time series powerful and simple

January 15th
Powerful yet Simple
(or not that much)
Forecasting Time Series
AI and IoT Bulgaria Summit, 2022

Speaker Bio
• Software Architect @
o 19+ years professional experience
• Microsoft Azure MVP
• External Expert Horizon 2020, Eurostars-Eureka
• External Expert InnoFund Denmark, RIF Cyprus
• Business Interests
o Web Development, SOA, Integration
o IoT, Machine Learning, Computer Intelligence
o Security & Performance Optimization
• Contact
ivelin.andreev@icb.bg
www.linkedin.com/in/ivelin
www.slideshare.net/ivoandreev

Upcoming Events
Global Azure Bulgaria, 2022
May 14, 2022
Tickets (Eventbrite)
Sessions (Sessionize)

Agenda
• Time Series?
• Forecasting?
• ML.NET
• Azure ML Service
• ARIMA/AutoARIMA
• Regression
• FB Prophet
• Demo

Takeaways
Time Series
o Introduction to Hierarchical Time Series
o Overview of Time Series Forecasting Models
o Time Series Analysis with Python
ARIMA
o Time Series Forecasting with ARIMA models
o ARIMA, Auto ARIMA, Prophet, Regression (Youtube)
SSA
o A Brief Introduction to SSA
o Forecast Service Demand with Time Series Analysis and ML.NET
FB Prophet
o FB Prophet Quickstart (FB GitHub)
o Time Series Analysis using FB Prophet
o Generate Accurate Forecasts with FB Prophet in Python

Time Series – a sequence of observations taken over time
Forecasting – the process of predicting future values from historical data

Describing or Forecasting
• Data are Temporal
o Unlike other data, the order of points and their closeness in time are important
• Sample Data Look like…
• Time Series Analysis
o Understanding Time Series and underlying causes
o Create a mathematical model that describes data
o Determine seasonal patterns, trends, relations to external factors
o Note: assumptions are often in place (e.g. about the form of data)
• Forecasting
o Scientific predictions based on historical time-stamped data
o Univariate / Multivariate TS Forecasting
o Note: Explanatory power is often low
Time Value
2021-11-01T00:00:00+02:00 66
2021-11-01T01:00:00+02:00 29
2021-11-01T02:00:00+02:00 6
2021-11-01T03:00:00+02:00 8
2021-11-01T04:00:00+02:00 91
2021-11-01T05:00:00+02:00 145
2021-11-01T06:00:00+02:00 14
2021-11-01T07:00:00+02:00 19
2021-11-01T08:00:00+02:00 64
2021-11-01T09:00:00+02:00 4
2021-11-01T10:00:00+02:00 22
2021-11-01T11:00:00+02:00 65
2021-11-01T12:00:00+02:00 30
2021-11-01T13:00:00+02:00 152
2021-11-01T14:00:00+02:00 30
2021-11-01T15:00:00+02:00 17
2021-11-01T16:00:00+02:00 9
2021-11-01T17:00:00+02:00 11
2021-11-01T18:00:00+02:00 19
2021-11-01T19:00:00+02:00 76
2021-11-01T20:00:00+02:00 117
2021-11-01T21:00:00+02:00 152
2021-11-01T22:00:00+02:00 53
2021-11-01T23:00:00+02:00 3
2021-11-02T00:00:00+02:00 13

Practical Use Cases
• Sample Data Sources
o Sensor readings (environmental data, temperature, pressure, humidity)
o Financial market data
o Medical data (body parameters, heartbeat, pulse rate, blood pressure)
• Sample Scenarios
o Unit sales for each day in a store
o Number of passengers at a station
o Number of users of a web site
o Liters of hot water used in a household
o Stock price for a day
o Diesel price for the next week
o Water level of a dam during the year
o Body weight over the year ☺

Time Series come in
various flavours

Hierarchical Time Series Forecasting
• Hierarchical TS
o Evident hierarchical structure
o Lower levels are nested (e.g. geographical split)
• Grouped TS
o Multiple non-nested levels of detail (e.g. category, retailer, colour)
• Hierarchical Forecasting
o A collection of techniques rather than another methodology
o Generate forecast that is consistent across the whole hierarchy
o Forecasts shall add up
• Approaches
o Bottom up, Top-down
o Middle-out (Mixed) – Bottom-up (above middle), Top-down (below middle)
o Reconciliation – each level independently, Determine coefficients with linear regression
Bulgaria
  East
    Varna
    Burgas
  West
    Sofia
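
A minimal sketch of the bottom-up approach on the hierarchy above. The city-level numbers are made up for illustration; in practice each would come from its own forecasting model.

```python
# Hypothetical bottom-level forecasts (two future steps) for the cities on the slide.
city_forecasts = {
    "Varna": [120.0, 130.0],
    "Burgas": [80.0, 85.0],
    "Sofia": [300.0, 310.0],
}

# Hierarchy from the slide: Bulgaria -> East (Varna, Burgas), West (Sofia).
hierarchy = {
    "Bulgaria": ["East", "West"],
    "East": ["Varna", "Burgas"],
    "West": ["Sofia"],
}


def bottom_up(node):
    """Forecast for a node, obtained by summing the forecasts of its children."""
    if node in city_forecasts:  # leaf level: use its own forecast
        return city_forecasts[node]
    child_forecasts = [bottom_up(child) for child in hierarchy[node]]
    # Sum the children step by step, so every level adds up to its parent.
    return [sum(step_values) for step_values in zip(*child_forecasts)]


east = bottom_up("East")
country = bottom_up("Bulgaria")
print("East:", east)
print("Bulgaria:", country)
```

Because upper levels are computed only from the leaves, the forecasts are consistent across the hierarchy by construction. The trade-off is that noise in the bottom-level series propagates upwards.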

Quacks like Time Series, Moves like …
• Do you have enough data?
o More data = more options for aggregation, model tuning, model testing
• Time horizon for prediction?
o Shorter time horizon can be predicted with higher confidence
• Are forecasts updateable or static?
o Retrain after new data are available for more accurate results
• Frequency of forecasts?
o Downsampling and upsampling of data affect accuracy (in both directions)
• Is time series stationary?
o Time series properties do not depend on observation time?

Time Series Stationarity
• Stationarity
o Statistical properties of TS (mean, variance) do not depend on time of observation
o Rule of thumb: many classical models assume stationarity and forecast non-stationary data poorly
o Conclusion: Non-stationary TS data usually need to be converted to stationary
• Differencing
o Method to transform time series and remove time-dependent attributes (trend, seasonality)
o The difference can also be calculated over a larger lag (e.g. one season) instead of the previous observation
Note: Some TS forecasting methods (e.g. ARIMA) accept non-stationary input, as
differencing is performed internally. ARMA does require stationary input.
difference(t) = observation(t) - observation(t-1)
Example: 1 2 3 4 5 6 7 8 9 10
Differencing: 1 1 1 1 1 1 1 1 1
inverted(t) = differenced(t) + observation(t-1)
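
The two formulas above can be sketched in a few lines of plain Python. The function names and the `lag` parameter are illustrative choices, not taken from a specific library.

```python
def difference(series, lag=1):
    """difference(t) = observation(t) - observation(t - lag)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def invert_difference(first_values, differenced, lag=1):
    """Rebuild the series from its first `lag` observations and the differences."""
    restored = list(first_values)
    for d in differenced:
        # inverted(t) = differenced(t) + observation(t - lag)
        restored.append(d + restored[-lag])
    return restored


series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
diffed = difference(series)
print(diffed)
print(invert_difference(series[:1], diffed))
```

Note that the differenced series is shorter by `lag` values, so the first observations must be kept in order to invert the transformation.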

Time Series Analysis
Observations close in time are
often correlated

Time Series Analysis
TS Analysis provides techniques to understand data and break into components:
• Trend (Tt)
o Smooth general long term tendency to increase, decrease or both
• Seasonality (St)
o Rhythmic forces that operate on smaller intervals (e.g. 1h, 1d, 1w, 1m)
• Cyclic (Ct)
o Cyclic behaviour that repeats over a long period (e.g. 4y, 1y)
• Random Noise (Rt)
o Random irregular observations that cannot be explained (unpredictable)
Additive Model: Yt = Tt + St + Ct + Rt
Multipl. Model: Yt = Tt * St * Ct * Rt
Mixed Model: Yt = Tt * Ct + St * Rt; Yt = Tt + St * Ct * Rt
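
A minimal sketch of the additive model, assuming a synthetic series with only a linear trend and a seasonal pattern of even period (no cyclic or noise component). The trend is estimated with a centered moving average, and the seasonality as the average detrended value per position in the season. All numbers are made up for illustration.

```python
period = 4
seasonal_pattern = [5.0, -2.0, -4.0, 1.0]  # hypothetical pattern, sums to zero
n = 24
trend = [10.0 + 0.5 * t for t in range(n)]
# Additive model with Ct = Rt = 0: Yt = Tt + St
series = [trend[t] + seasonal_pattern[t % period] for t in range(n)]


def centered_moving_average(values, period):
    """Centered moving average for an even period; None where the window does not fit."""
    half = period // 2
    result = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        # End points get half weight, so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        result[t] = total / period
    return result


trend_estimate = centered_moving_average(series, period)

# Detrend, then average by position in the season to estimate St.
seasonal_estimate = []
for position in range(period):
    detrended = [
        series[t] - trend_estimate[t]
        for t in range(position, n, period)
        if trend_estimate[t] is not None
    ]
    seasonal_estimate.append(sum(detrended) / len(detrended))

print(seasonal_estimate)
```

On real data the remainder after removing trend and seasonality is the noise component, and the estimates are only approximate.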

Advanced
Observation: Time series tend to display significant autocorrelation
• Autocorrelation
o Measures the relationship between a TS and a lagged version of itself (T, T-k)
o Meaning: ±1 – perfect correlation; 0 – no correlation
• Measured with Pearson Correlation
o Preconditions: normal distribution, no significant outliers, continuous variables
o Cross-correlation – the correlation between two different series, observed across different lags
• Augmented Dickey-Fuller Test (Python statsmodels adfuller function)
o Null hypothesis (H0) – the TS has a unit root (non-stationary)
o Alternative hypothesis (HA) – the TS has no unit root (stationary)
o ADF p-value < 0.05: H0 rejected = TS is stationary
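
A small sketch of the lagged Pearson correlation described above, in plain Python. The hourly series is synthetic (a sine wave with a 24-step period), chosen so that the expected result is obvious.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def autocorrelation(series, lag):
    """Correlation between the series and itself shifted by `lag` steps (lag >= 1)."""
    return pearson(series[:-lag], series[lag:])


# Synthetic hourly data with a daily cycle, one week long.
hourly = [math.sin(2 * math.pi * t / 24) for t in range(24 * 7)]

print(round(autocorrelation(hourly, 24), 3))  # one full period apart
print(round(autocorrelation(hourly, 12), 3))  # half a period apart
```

A lag of one full period gives a correlation close to +1 and half a period close to -1, which is how a seasonal window size can be spotted from an autocorrelation plot.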

Common Data Preparation
• Imputation
o Replacing missing data with substitute values
• Frequency / Resampling
o Sampling frequency could be too high for a model compared to the forecast horizon
o Irregular time series may require resampling at regular intervals
• Outliers
o Extreme values need to be identified and handled
o Outlier = Value ∉ [Q1-1.5*IQR; Q3+1.5*IQR]
Choosing an imputation approach:
• Does missing data have meaning?
o YES (numerical) – convert missing values to a meaningful number
o NO – decide by type of data
• Type of data
o Large dataset, little data missing at random – remove instances with missing data
o Large, temporally ordered dataset – replace missing data with preceding values
o Otherwise – does the data follow a simple distribution?
• Does data follow a simple distribution?
o NO – impute with a simple ML model
o YES – impute with the mean value
o YES, with outliers – impute missing values with the median
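
A minimal sketch of two of the preparation steps, using only the Python standard library: flagging outliers with the 1.5*IQR rule and replacing missing values with the preceding observation. The readings are made up, and the quartiles use the default method of `statistics.quantiles`; other quartile conventions give slightly different bounds.

```python
import statistics


def iqr_bounds(values):
    """Return (Q1 - 1.5*IQR, Q3 + 1.5*IQR) for the given values."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def forward_fill(values):
    """Replace None with the preceding value; leading None values stay None."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical sensor readings; None marks a missing observation.
readings = [10, 12, None, 11, 13, 12, 11, 10, 95, None, 12, 11]

observed = [v for v in readings if v is not None]
low, high = iqr_bounds(observed)
outliers = [v for v in observed if v < low or v > high]
filled = forward_fill(readings)

print("bounds:", low, high)
print("outliers:", outliers)
print("filled:", filled)
```

Forward fill copies whatever came before, including an outlier, so it is usually safer to handle outliers first and impute afterwards.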

Forecasting Algorithms
Appreciate how genius was
made simple for you

Naïve Algorithms Baseline
Note: Naïve algorithms are often referred to as “benchmark models”
Naïve Model
• Forecasts for any horizon match the last value
SNaïve Model (Seasonal Naïve)
• Assumes a seasonal component with period T
• Forecast for each timestamp matches the value observed T timestamps earlier (the last season is repeated)
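
Both baselines fit in a few lines. This is a sketch in plain Python; the function names are illustrative.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every step of the horizon."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the last observed season for as long as the horizon requires."""
    last_season = history[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


history = [1, 2, 3, 4, 5, 6]
print(naive_forecast(history, horizon=3))
print(seasonal_naive_forecast(history, horizon=5, season_length=3))
```

Any more advanced model should beat these baselines on held-out data; if it does not, the extra complexity is not paying off.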

ARIMA (AutoRegressive Integrated Moving Average)
• Auto Regressive - linear combination of past values of the variable
o Assume that future will resemble the past
o Inaccurate when an unseen event happens
• Moving Average - linear combination of past forecast errors.
o Smooth impacts of short-term fluctuations
o Simple MA – arithmetic mean of the previous 5,10,20,100 etc. values
o Exponential MA - weighted average that gives greater importance to the most recent values
• Integrated – Differencing for stationary time series
• ARIMA Parameters
o p – number of lagged observations from the Past used to forecast the future
o d – degree of Differencing (number of times raw observations are differenced for stationarity)
o q – number of lagged forecast errors used (size of the moving-average window)
ARIMA(p,d,q) = const + (weighted sum of last p values) + (weighted sum of last q errors), after d rounds of differencing
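
A short sketch of the simple and exponential moving averages mentioned above, in plain Python. These are smoothing averages over the observed values; the MA term inside ARIMA is different, as it combines past forecast errors. The `alpha` smoothing factor is an illustrative parameter name.

```python
def simple_moving_average(values, window):
    """Arithmetic mean of the last `window` values, for each position where it fits."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def exponential_moving_average(values, alpha):
    """Weighted average that gives weight `alpha` to the most recent value."""
    ema = [values[0]]
    for value in values[1:]:
        ema.append(alpha * value + (1 - alpha) * ema[-1])
    return ema


print(simple_moving_average([1, 2, 3, 4, 5], window=3))
print(exponential_moving_average([10, 20, 30], alpha=0.5))
```

A larger window (or a smaller `alpha`) smooths more strongly, but reacts more slowly to real changes in the series.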

Seasonal ARIMA
• ARIMA (p,d,q) is a non-seasonal ARIMA
• SARIMA (p, d, q)(P, D, Q)m
o P – number of seasonal autoregressive terms
o D – seasonal differencing order (number of seasonal differences to make TS stationary)
o Q – moving-average order of the seasonal component
o m – periods in a season (e.g. 12 for monthly data)
• The parameter space becomes larger
• Grid search for optimal parameters

AutoARIMA
• Identifies the optimal parameters of ARIMA (p, d, q)
o pip install pmdarima (formerly pyramid-arima; mimics R auto.arima)
o .fit() does the magic
o Utilizes AIC (Akaike Information Criterion) to pick the best model (smaller = better)
o AIC = N*ln(SSe/N) + 2K (N – number of observations, SSe – sum of squared errors, K – number of model parameters)
• Conducts differencing tests to determine the order of differencing
• Pros
o Saves time
o One of the simplest techniques for TS forecasting
o Eliminates the need of in-depth statistics understanding
o Reduces the chance of human error due to misinterpretation
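from pmdarima import auto_arima  # pip install pmdarima
model = auto_arima(train)  # train: the training series; dozens of other optional arguments exist
forecast = model.predict(n_periods=24)  # the returned model is already fitted; 24 is an example horizon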
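
The least-squares form of AIC quoted above, N*ln(SSe/N) + 2K, can be sketched directly. The residuals below are hypothetical; in practice they are the in-sample errors of each candidate model.

```python
import math


def aic_least_squares(residuals, num_params):
    """AIC = N * ln(SSe / N) + 2K for a model fitted by least squares."""
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    return n * math.log(sse / n) + 2 * num_params


# Hypothetical residuals of two candidate models on the same data.
simple_model = aic_least_squares([1.0, -1.0, 1.0, -1.0], num_params=2)
complex_model = aic_least_squares([0.9, -0.9, 0.9, -0.9], num_params=5)

print(simple_model, complex_model)
print("preferred:", "simple" if simple_model < complex_model else "complex")
```

The 2K term penalizes extra parameters, so a slightly better fit does not automatically win over a simpler model.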

Singular Spectrum Analysis (SSA)
• Novel powerful technique
• 2 complementary stages
o Decomposition - extract independent components from time series
o Reconstruction – reconstruct the series for forecasting, after removing noise
• Pros
o Works with arbitrary statistical process
o No assumptions for data (i.e. stationarity)
• ML.NET ForecastBySsa Parameters
o trainSize – number of train samples (rows) from beginning (e.g. 300)
o seriesLength – length of series in buffer (how much data to use to train on)
o windowSize – length of the window on the series (seasonality)
o horizon – number of values to forecast (e.g. 24)
o confidenceLevel – degree of certainty (e.g. 0.95: the interval is expected to contain the real value 95% of the time)

SSA, How it Works
• How does it work
• Checkpoint
o Avoids replay of all previous data, provide only most recent observations
o But if this creates a drift, a clean retrain on last observations (i.e. 1 month) may be better
// Requires: using Microsoft.ML; using Microsoft.ML.Transforms.TimeSeries;
MLContext mlContext = new MLContext(); // All ML.NET operations are within context
IDataView dv = mlContext.Data.LoadFromTextFile(…); // Step 1: Load data from file
var pipeline = mlContext.Forecasting.ForecastBySsa([Parameters],…); // Step 2: SSA pipeline
SsaForecastingTransformer forecaster = pipeline.Fit(dv); // Step 3: Train on the data
… // Step 4: Evaluate (e.g. calculate RMSE)
var forecastEngine = forecaster.CreateTimeSeriesEngine<TimeSeriesData, ModelOutput>(mlContext);
ModelOutput forecast = forecastEngine.Predict(); // Step 5: Predict the next horizon values
forecastEngine.CheckPoint(mlContext, outputModelPath); // Save checkpoint
ITransformer model = mlContext.Model.Load(file, out DataViewSchema schema); // Load from checkpoint
forecastEngine = model.CreateTimeSeriesEngine<TimeSeriesData, ModelOutput>(mlContext);

Regression Model
• Forecasting Recap
o Data are ordered in series as {Time: Value} pairs; No external knowledge
• Regression
o Predicting a single numeric value
o Time Series Forecasting involves Regression under the hood
o Can be applied to non-ordered data
o Must be applied repeatedly (one prediction per step) to cover the whole forecast horizon
• Feature Engineering & Extraction
o Date – Year, Month, Day, Hour
o Lag – What has happened at T-1, T-2, T-12, T-24, T-48, T-n observations
o Delta – What is the difference from T-1, T-2, T-12, T-24, T-48, T-n observations
o Moving Average – Mean(2), Mean(12), Mean(24), Mean (48), …
o Sum – Sum(2), Sum(12), Sum(24), Sum(48),…
o Domain knowledge – Weather, Distance (not GPS), Ref. Price
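
A minimal sketch of the lag, delta, moving average and sum features in plain Python. The feature names, the default lags and the window size are illustrative. Every feature uses only values before time T, so the target does not leak into its own features.

```python
def build_features(values, lags=(1, 2), window=3):
    """Turn a series into rows of features plus the value to predict (target)."""
    # First index where every lag, delta and window is available.
    start = max(max(lags) + 1, window)
    rows = []
    for t in range(start, len(values)):
        row = {}
        for lag in lags:
            row[f"lag_{lag}"] = values[t - lag]
            # Change over the `lag` steps leading up to the last known value.
            row[f"delta_{lag}"] = values[t - 1] - values[t - 1 - lag]
        previous = values[t - window : t]
        row[f"mean_{window}"] = sum(previous) / window
        row[f"sum_{window}"] = sum(previous)
        row["target"] = values[t]
        rows.append(row)
    return rows


rows = build_features([1, 2, 4, 7, 11, 16])
for row in rows:
    print(row)
```

Each row can then be fed to any regression model. Date parts (year, month, day, hour) and domain features such as weather would be added as further columns in the same way.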

Azure ML Service
• Azure Auto ML (forecasting candidates include AutoARIMA under the hood)
o The easiest yet still powerful way to do ML
o Automates the iterative, time-consuming tasks of ML
o Azure Auto ML Python SDK
o Azure ML Studio (ML Studio Classic retires August 2024)
Upload File → Select Task Type → Parameters → Metrics

FB Prophet
• Created by Facebook in 2017
• Pros
o Trains quickly, highly accurate
o No background required (like AutoARIMA)
o Can also be used for multivariate TS analysis
o Handles outliers and missing data well
o Works best on series with strong seasonal effects and several seasons of training data
o Handles random changes due to special events (e.g. market events)
• Under the Hood
o Requires prophet Python package
o Uses additive regression model
Y(t) = Trend(t) + Seasonality(t) + Holiday(t) + Error(t)

Prophet – Easy to Use, Hard to Install
• Prophet does not run on Python 3.9
• What’s the easiest way?
o Install Azure Data Science VM (< 4 Cores is sluggish)
o Find the 3.8 Kernel from Jupyter Lab
o Activate kernel
• Use Conda package manager to install
o Conda has own C++ compiler to build the packages
o Select a channel
C:> activate py38_default
(py38_default) C:> conda install pystan -c conda-forge
(py38_default) C:> conda install -c conda-forge fbprophet
Demo
Ref. Comparing Prophet and Deep Learning to ARIMA in Forecasting Wholesale Food Prices (2021)
