January 15th
GLOBAL AI BOOTCAMP IS POWERED BY:
Powerful yet Simple
(or not that much)
Forecasting Time Series
AI and IoT Bulgaria Summit, 2022
Speaker Bio
• Software Architect @
o 19+ years professional experience
• Microsoft Azure MVP
• External Expert Horizon 2020, Eurostars-Eureka
• External Expert InnoFund Denmark, RIF Cyprus
• Business Interests
o Web Development, SOA, Integration
o IoT, Machine Learning, Computer Intelligence
o Security & Performance Optimization
• Contact
ivelin.andreev@icb.bg
www.linkedin.com/in/ivelin
www.slideshare.net/ivoandreev
Thanks to our Sponsors
Upcoming Events
Global Azure Bulgaria, 2022
May 14, 2022
Tickets (Eventbrite)
Sessions (Sessionize)
Agenda
• Time Series?
• Forecasting?
• ML.NET
• Azure ML Service
• ARIMA/AutoARIMA
• Regression
• FB Prophet
• Demo
Takeaways
Time Series
o Introduction to Hierarchical Time Series
o Overview of Time Series Forecasting Models
o Time Series Analysis with Python
ARIMA
o Time Series Forecasting with ARIMA models
o ARIMA, Auto ARIMA, Prophet, Regression (Youtube)
SSA
o A Brief Introduction to SSA
o Forecast Service Demand with Time Series Analysis and ML.NET
FB Prophet
o FB Prophet Quickstart (FB GitHub)
o Time Series Analysis using FB Prophet
o Generate Accurate Forecasts with FB Prophet in Python
Time Series – a sequence of observations taken over time
Forecasting – the process of predicting future values from historical data
Describing or Forecasting
• Data are Temporal
o Unlike other data, the fact that a point is close to another is important
• Sample Data Look like…
• Time Series Analysis
o Understanding Time Series and underlying causes
o Create a mathematical model that describes data
o Determine seasonal patterns, trends, relations to external factors
o Note: assumptions are often in place (i.e. the form of data)
• Forecasting
o Scientific predictions based on historical time-stamped data
o Univariate / Multivariate TS Forecasting
o Note: Explanatory power is often low
Time Value
2021-11-01T00:00:00+02:00 66
2021-11-01T01:00:00+02:00 29
2021-11-01T02:00:00+02:00 6
2021-11-01T03:00:00+02:00 8
2021-11-01T04:00:00+02:00 91
2021-11-01T05:00:00+02:00 145
2021-11-01T06:00:00+02:00 14
2021-11-01T07:00:00+02:00 19
2021-11-01T08:00:00+02:00 64
2021-11-01T09:00:00+02:00 4
2021-11-01T10:00:00+02:00 22
2021-11-01T11:00:00+02:00 65
2021-11-01T12:00:00+02:00 30
2021-11-01T13:00:00+02:00 152
2021-11-01T14:00:00+02:00 30
2021-11-01T15:00:00+02:00 17
2021-11-01T16:00:00+02:00 9
2021-11-01T17:00:00+02:00 11
2021-11-01T18:00:00+02:00 19
2021-11-01T19:00:00+02:00 76
2021-11-01T20:00:00+02:00 117
2021-11-01T21:00:00+02:00 152
2021-11-01T22:00:00+02:00 53
2021-11-01T23:00:00+02:00 3
2021-11-02T00:00:00+02:00 13
Practical Use Cases
• Sample Data Sources
o Sensor readings (environmental data, temperature, pressure, humidity)
o Financial market data
o Medical data (body parameters, heartbeat, pulse rate, blood pressure)
• Sample Scenarios
o Unit sales for each day in a store
o Number of passengers on a station
o Number of users of a web site
o Liters of usage of hot water in a household
o Stocks price for a day
o Diesel price for the next week
o Water level of a dam during the year
o Body weight over the year ☺
Time Series come in
various flavour types
Hierarchical Time Series Forecasting
• Hierarchical TS
o Evident hierarchical structure
o Lower levels are nested (i.e. geographical split)
• Grouped TS
o Multiple non-nested levels of detail (i.e. category, retailer, colour)
• Hierarchical Forecasting
o A collection of techniques rather than another methodology
o Generate forecast that is consistent across the whole hierarchy
o Forecasts shall add up
• Approaches
o Bottom up, Top-down
o Middle-out (Mixed) – Bottom-up (above middle), Top-down (below middle)
o Reconciliation – each level independently, Determine coefficients with linear regression
Example hierarchy:
Bulgaria
o East: Varna, Burgas
o West: Sofia
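The bottom-up approach can be sketched in a few lines of Python. This is only an illustration: the city-level forecast numbers are made up, and the helper name `sum_series` is hypothetical. Forecasts are produced at the lowest level and summed upward, so every level adds up by construction.

```python
# Hypothetical two-step forecasts for the lowest level of the example hierarchy.
city_forecasts = {"Varna": [10, 12], "Burgas": [7, 8], "Sofia": [20, 21]}
hierarchy = {"East": ["Varna", "Burgas"], "West": ["Sofia"]}


def sum_series(series_list):
    """Add several equally long forecast series step by step."""
    return [sum(step_values) for step_values in zip(*series_list)]


# Bottom-up: regions are sums of their cities, the country is the sum of its regions.
region_forecasts = {
    region: sum_series([city_forecasts[city] for city in cities])
    for region, cities in hierarchy.items()
}
country_forecast = sum_series(list(region_forecasts.values()))

print(region_forecasts)
print(country_forecast)
```

A top-down approach would go the other way: forecast the country total and split it by (for example) historical proportions.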
Quacks like Time Series, Moves like …
• Do you have enough data?
o More data = more options for aggregation, model tuning, model testing
• Time horizon for prediction?
o Shorter time horizon can be predicted with higher confidence
• Are forecasts updateable or static?
o Retrain after new data are available for more accurate results
• Frequency of forecasts?
o Downsampling and upsampling of data affect accuracy (in both directions)
• Is time series stationary?
o Time series properties do not depend on observation time?
Time Series Stationarity
• Stationarity
o Statistical properties of TS do not depend on time of observation (mean, variance)
o Rule of thumb: Non-stationary data are hard to model and cannot be forecasted reliably with methods that assume stationarity
o Conclusion: Non-stationary TS data usually need to be converted to stationary first
• Differencing
o Method to transform time series and remove time-dependent attributes (trend, seasonality)
o Lag difference could be calculated on a larger time window (i.e. window size)
Note: Some TS forecasting methods do not require stationary input (e.g. ARIMA), because
differencing is performed as a preliminary step. ARMA does require it.
difference(t) = observation(t) - observation(t-1)
Example: 1 2 3 4 5 6 7 8 9 10
Differencing: 1 1 1 1 1 1 1 1 1
inverted(t) = differenced(t) + observation(t-1)
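The two formulas above translate directly into code. The sketch below is plain Python with hypothetical helper names (`difference`, `invert_difference`); the `lag` argument is an assumption that generalises the lag-1 case to a larger window.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def invert_difference(first_values, diffs, lag=1):
    """Rebuild the original series from its first `lag` values and the differences."""
    restored = list(first_values)
    for d in diffs:
        # inverted(t) = differenced(t) + observation(t - lag)
        restored.append(d + restored[-lag])
    return restored


observations = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
diffs = difference(observations)
restored = invert_difference(observations[:1], diffs)

print(diffs)     # the trend is gone: every difference is 1
print(restored)  # the original series again
```

Note that the differenced series is shorter by `lag` elements, so the first `lag` observations have to be kept in order to invert the transformation.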
Time Series Analysis
Observations close in time are
often correlated
Time Series Analysis
TS Analysis provides techniques to understand data and break into components:
• Trend (Tt)
o Smooth general long term tendency to increase, decrease or both
• Seasonality (St)
o Rhythmic forces operate on smaller intervals (i.e. 1h, 1d, 1w, 1m)
• Cyclic (Ct)
o Cyclic behaviour that repeats over a long period (i.e. 4y, 1y)
• Random Noise (Rt)
o Random irregular observations that cannot be explained (unpredictable)
Additive Model: Yt = Tt + St + Ct + Rt
Multipl. Model: Yt = Tt * St * Ct * Rt
Mixed Model: Yt = Tt * Ct + St * Rt; Yt = Tt + St * Ct * Rt
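As an illustration of the additive model, the sketch below builds a synthetic series from a linear trend and a three-step seasonal pattern (cyclic and noise components are left out on purpose) and then recovers both parts with a centered moving average. All names and numbers are assumptions made for the example; real data would need a noise-tolerant method.

```python
def centered_moving_average(series, window):
    """Centered moving average for an odd window; edge positions stay None."""
    half = window // 2
    smoothed = [None] * len(series)
    for t in range(half, len(series) - half):
        smoothed[t] = sum(series[t - half:t + half + 1]) / window
    return smoothed


period = 3
trend = [float(t) for t in range(12)]                          # Tt
seasonal = [[-1.0, 0.0, 1.0][t % period] for t in range(12)]   # St, sums to 0 per season
y = [tt + st for tt, st in zip(trend, seasonal)]               # Yt = Tt + St

# Averaging over one full season cancels the seasonal part and leaves the trend.
trend_est = centered_moving_average(y, period)
# What remains after removing the trend is the seasonal component.
seasonal_est = [None if tt is None else obs - tt for obs, tt in zip(y, trend_est)]

print(trend_est)
print(seasonal_est)
```

For a multiplicative model the same idea applies with division instead of subtraction (or after taking logarithms).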
Advanced
Observation: Time series tend to display significant autocorrelation
• Autocorrelation
o Measures the relationship between a TS and a lagged version of itself (T, T-k)
o Meaning: ±1 – perfect correlation; 0 – no correlation
• Measured with Pearson Correlation
o Preconditions: normal distribution, no significant outliers, continuous variables
o Cross-correlation – the correlation between two series, observed across different lags
• Augmented Dickey-Fuller Test (python adfuller function)
o Null hypothesis (H0) – the TS has a unit root (non-stationary)
o Alternate hypothesis (HA) – the TS has no unit root (stationary)
• ADF p-value < 0.05
• H0 rejected = TS is stationary
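The lagged Pearson correlation described above can be computed without any library. The sketch below is a minimal illustration with hypothetical helper names and a synthetic series of period 4; it does not implement the Augmented Dickey-Fuller test, for which the slide points to the Python `adfuller` function.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)  # undefined (division by zero) for a constant series


def autocorrelation(series, lag):
    """Correlation between the series and a copy of itself shifted by `lag` steps."""
    return pearson(series[lag:], series[:-lag])


series = [1, 3, 5, 3] * 6  # synthetic series that repeats every 4 steps

print(autocorrelation(series, 4))  # close to +1: one full period apart
print(autocorrelation(series, 2))  # close to -1: half a period apart
```

A high autocorrelation at a lag equal to the suspected season length is a quick hint that a seasonal component is present.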
Common Data Preparation
• Imputation
o Replacing missing data with substitute values
• Frequency / Resampling
o Data frequency could be too high for a model compared to the forecast horizon
o Irregular time series may require resampling at regular intervals
• Outliers
o Extreme values need to be identified and handled
o Outlier = Value ∉ [Q1-1.5*IQR; Q3+1.5*IQR]
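The IQR rule above can be sketched with the Python standard library. The quartile method (`method="inclusive"`) and the sample readings are assumptions for the example; other quartile conventions shift the bounds slightly.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) = (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Values that fall outside the [Q1 - k*IQR; Q3 + k*IQR] interval."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


readings = [3, 4, 6, 8, 9, 11, 14, 17, 19, 152]  # made-up sample values
print(iqr_bounds(readings))
print(find_outliers(readings))
```

Whether a flagged value is removed, capped or kept is a separate decision that depends on the domain.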
Choosing an imputation method (decision flow):
• Does missing data have meaning?
o YES (numerical data): convert missing values to a meaningful number
o NO: choose by type of data
• Large dataset, little data missing at random: remove instances with missing data
• Large, temporally ordered dataset: replace missing data with preceding values
• Does data follow a simple distribution?
o YES: impute with mean value
o YES, with outliers: impute missing values with median
o NO: impute with a simple ML model
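Three of the simpler imputation strategies (preceding value, mean, median) can be sketched as follows. Missing values are represented as `None`, and the helper names and sample readings are assumptions made for this example.

```python
from statistics import mean, median


def forward_fill(series):
    """Replace each missing value with the preceding observed value."""
    filled, last_seen = [], None
    for value in series:
        if value is not None:
            last_seen = value
        filled.append(last_seen)  # a leading gap stays None: nothing precedes it
    return filled


def fill_with(series, statistic):
    """Replace missing values with a statistic (e.g. mean, median) of the observed ones."""
    observed = [v for v in series if v is not None]
    substitute = statistic(observed)
    return [substitute if v is None else v for v in series]


readings = [10, None, 12, None, None, 100, 11]

print(forward_fill(readings))
print(fill_with(readings, mean))    # pulled upward by the extreme value 100
print(fill_with(readings, median))  # more robust when outliers are present
```

The comparison of the last two lines shows why the median is preferred when the data contain outliers.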
Forecasting Algorithms
Appreciate how genius was
made simple for you
Naïve Algorithms Baseline
Note: Naïve algorithms are often referred to as “benchmark models”
Naïve Model
• Forecasts for any horizon match the last value
SNaïve Model (Seasonal Naïve)
• Assumes a seasonal component with time window T
• Forecast repeats the values observed in the last T timestamps (the most recent season)
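Both baselines fit in a few lines of Python. The function names and the sample history are hypothetical; the point is only to show what any more elaborate model has to beat.

```python
def naive_forecast(history, horizon):
    """Every future step equals the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Every future step equals the value observed one season earlier."""
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


history = [5, 7, 9, 6, 8, 10]  # two made-up seasons of length 3

print(naive_forecast(history, 4))
print(seasonal_naive_forecast(history, 5, season_length=3))
```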
ARIMA (AutoRegressive Integrated Moving Average)
• Auto Regressive - linear combination of past values of the variable
o Assume that future will resemble the past
o Inaccurate when an unseen event happens
• Moving Average - linear combination of past forecast errors.
o Smooth impacts of short-term fluctuations
o Simple MA – arithmetic mean of the previous 5,10,20,100 etc. values
o Exponential MA - weighted average that gives greater importance to the most recent values
• Integrated – Differencing to make the time series stationary
• ARIMA Parameters
o p – number of observations from the Past (lags) used to forecast the future
o d – degree of Differencing (number of times raw observations are differenced for stationarity)
o q – number of past forecast errors used (size of the moving average window over forecast Quality errors)
ARIMA(p,d,q) = const + (weighted sum last P values) + (weighted sum of last Q errors) after D differencing
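The formula above can be traced by hand for ARIMA(1,1,1). In the sketch below the constant, the coefficients and the last forecast error are hypothetical numbers chosen for the illustration, not fitted values; a real model estimates them from data.

```python
def arma_one_step(const, ar_coefs, ma_coefs, recent_values, recent_errors):
    """const + weighted sum of last p values + weighted sum of last q errors.

    `recent_values` and `recent_errors` are ordered most recent first.
    """
    ar_part = sum(phi * y for phi, y in zip(ar_coefs, recent_values))
    ma_part = sum(theta * e for theta, e in zip(ma_coefs, recent_errors))
    return const + ar_part + ma_part


observations = [10.0, 12.0, 15.0]

# d = 1: work on first differences instead of the raw observations.
diffs = [b - a for a, b in zip(observations, observations[1:])]

# p = 1, q = 1 with assumed parameters const=0.5, phi=0.5, theta=0.25
# and an assumed last forecast error of 1.0.
predicted_diff = arma_one_step(0.5, [0.5], [0.25], diffs[::-1], [1.0])

# Undo the differencing to get a forecast on the original scale.
forecast = observations[-1] + predicted_diff
print(predicted_diff, forecast)
```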
SeasonalARIMA
• ARIMA (p,d,q) is a non-seasonal ARIMA
• SARIMA (p, d, q)(P, D, Q)m
o P – number of seasonal autoregressive terms
o D – seasonal differencing order (number of seasonal differences needed to make the TS stationary)
o Q – moving-average order of the seasonal component
o m – periods in a season (i.e. 12 for monthly data)
• The parameter space becomes larger
• Grid search for optimal parameters
AutoARIMA
• Identifies the optimal parameters of ARIMA (p, d, q)
o pip install pmdarima (formerly pyramid-arima; mimics R auto.arima)
o .fit() does the magic
o Utilizes AIC (Akaike Information Criterion) to pick best model (smaller = better)
• AIC = N*ln(SSe/N) + 2K (N – number of observations, SSe – sum of squared errors, K – number of model parameters)
• Conducts differencing tests to determine the order of differencing
• Pros
o Saves time
o One of the simplest techniques for TS forecasting
o Eliminates the need of in-depth statistics understanding
o Reduces the chance of human error due to misinterpretation
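from pmdarima import auto_arima  # pmdarima is the current name of pyramid-arima
model = auto_arima(train)  # accepts many other optional arguments; returns an already fitted model
forecast = model.predict(n_periods=24)  # forecast the next 24 steps (24 is an arbitrary example)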
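To make the "smaller AIC = better" rule concrete, the sketch below evaluates the common least-squares form of the criterion, N*ln(SSe/N) + 2K, for two candidate models. The fitted values and parameter counts are invented for the example and the function name is hypothetical; a library such as AutoARIMA computes AIC from the model likelihood internally.

```python
from math import log


def aic_least_squares(actual, fitted, n_params):
    """AIC = N * ln(SSe / N) + 2K for a least-squares fit (requires SSe > 0)."""
    n = len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return n * log(sse / n) + 2 * n_params


actual = [1.0, 2.0, 3.0, 4.0]
fitted_a = [1.1, 1.9, 3.2, 3.8]  # hypothetical model A: tighter fit, 2 parameters
fitted_b = [1.5, 2.5, 2.5, 3.5]  # hypothetical model B: looser fit, 1 parameter

aic_a = aic_least_squares(actual, fitted_a, n_params=2)
aic_b = aic_least_squares(actual, fitted_b, n_params=1)

print(aic_a, aic_b)
print("preferred:", "A" if aic_a < aic_b else "B")
```

The 2K term penalises extra parameters, so a more complex model wins only if it reduces the error enough to pay for them.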
Singular Spectrum Analysis (SSA)
• Novel powerful technique
• 2 complementary stages
o Decomposition - extract independent components from time series
o Reconstruction – reconstruct the series for forecasting, after removing noise
• Pros
o Works with arbitrary statistical process
o No assumptions for data (i.e. stationarity)
• ML.NET ForecastBySsa Parameters
o trainSize – number of train samples (rows) from beginning (i.e. 300)
o seriesLength – length of series in buffer (how much data to use to train on)
o windowSize – length of the window on the series (seasonality)
o horizon – number of values to forecast (i.e. 24)
o confidenceLevel – degree of certainty (i.e. 95% of estimates to contain the real)
SSA, How it Works
• Checkpoint
o Avoids replay of all previous data, provide only most recent observations
o But if this creates a drift, a clean retrain on last observations (i.e. 1 month) may be better
// Requires: using Microsoft.ML; using Microsoft.ML.Transforms.TimeSeries;
MLContext mlContext = new MLContext(); //All ML.NET operations are within context
IDataView dv = mlContext.Data.LoadFromTextFile(…); //Step 1: Load data from file
var pipeline = mlContext.Forecasting.ForecastBySsa([Parameters],…); //Step 2: SSA Pipeline
SsaForecastingTransformer forecaster = pipeline.Fit(dv); //Step 3: Data training
… //Step 4: Evaluate (i.e. calculate RMSE)
//TimeSeriesData (input row) and ModelOutput (forecast) are user-defined classes
var forecastEngine = forecaster.CreateTimeSeriesEngine<TimeSeriesData, ModelOutput>(mlContext);
ModelOutput forecast = forecastEngine.Predict(); //Step 5: Create engine from trained model and predict
forecastEngine.CheckPoint(mlContext, outputModelPath); //Save Checkpoint
ITransformer model = mlContext.Model.Load(file, out DataViewSchema schema); //Load from Checkpoint
forecastEngine = model.CreateTimeSeriesEngine<TimeSeriesData, ModelOutput>(mlContext);
Regression Model
• Forecasting Recap
o Data are ordered in series as {Time: Value} pairs; No external knowledge
• Regression
o Predicting a single numeric value
o Time Series Forecasting involves Regression under the hood
o Can be applied to non-ordered data
o Shall be applied multiple times to predict the same horizon
• Feature Engineering & Extraction
o Date – Year, Month, Day, Hour
o Lag – What has happened at T-1, T-2, T-12, T-24, T-48, T-n observations
o Delta – What is the difference from T-1, T-2, T-12, T-24, T-48, T-n observations
o Moving Average – Mean(2), Mean(12), Mean(24), Mean (48), …
o Sum – Sum(2), Sum(12), Sum(24), Sum(48),…
o Domain knowledge – Weather, Distance (not GPS), Ref. Price
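The date, lag, delta, moving average and sum features listed above can be derived with the standard library alone. The sketch below uses the first five hourly values of the sample data shown earlier; the function name, the feature names and the choice of lags and window are assumptions for the illustration, and time zones are ignored.

```python
from datetime import datetime, timedelta


def build_features(timestamps, values, lags=(1, 2), window=2):
    """Build one feature row per observation, using only earlier observations."""
    start = max(max(lags) + 1, window)  # first index with enough history
    rows = []
    for t in range(start, len(values)):
        row = {"hour": timestamps[t].hour, "weekday": timestamps[t].weekday()}
        for k in lags:
            row[f"lag_{k}"] = values[t - k]                        # value k steps back
            row[f"delta_{k}"] = values[t - 1] - values[t - 1 - k]  # change over k steps
        recent = values[t - window:t]
        row[f"mean_{window}"] = sum(recent) / window
        row[f"sum_{window}"] = sum(recent)
        row["target"] = values[t]  # the value a regression model should predict
        rows.append(row)
    return rows


start_time = datetime(2021, 11, 1, 0, 0)
values = [66, 29, 6, 8, 91]
timestamps = [start_time + timedelta(hours=h) for h in range(len(values))]

rows = build_features(timestamps, values)
for row in rows:
    print(row)
```

Every feature is computed strictly from observations before time t, which avoids leaking the target into the inputs.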
Azure ML Service
• Azure Auto ML (Forecasting uses AutoARIMA under the hood)
o The easiest yet still powerful way to do ML
o Automates the iterative, time-consuming tasks of ML
o Azure Auto ML Python SDK
o Azure ML Studio – (ML Studio Classic retires August 2024)
Workflow: Upload File → Select Task Type → Parameters → Metrics
FB Prophet
• Created by Facebook in 2017
• Pros
o Trains quickly, highly accurate
o No background required (like AutoARIMA)
o Can also be used for multivariate TS analysis
o Handles outliers and missing data well
o Strong at series with seasonal effects and few seasons in training data
o Handles random changes due to special events (i.e. market events)
• Under the Hood
o Requires prophet Python package
o Uses additive regression model
Y(t) = Trend(t) + Seasonality(t) + Holiday(t) + Error(t)
Prophet – Easy to Use, Hard to Install
• Prophet does not run on Python 3.9
• What’s the easiest way to install it?
o Install Azure Data Science VM (< 4 Cores is sluggish)
o Find the 3.8 Kernel from Jupyter Lab
o Activate kernel
o Use Conda package manager to install
o Conda has own C++ compiler to build the packages
o Select a channel
C:> activate py38_default
(py38_default) C:> conda install pystan -c conda-forge
(py38_default) C:> conda install -c conda-forge fbprophet
Demo
Ref. Comparing Prophet and Deep Learning to ARIMA in
Forecasting Wholesale Food Prices (2021)
Thanks to our Sponsors

More Related Content

What's hot

Survival Analysis Lecture.ppt
Survival Analysis Lecture.pptSurvival Analysis Lecture.ppt
Survival Analysis Lecture.ppt
habtamu biazin
 
Diagnostic in poisson regression models
Diagnostic in poisson regression modelsDiagnostic in poisson regression models
Diagnostic in poisson regression models
University of Southampton
 
Time series forecasting with machine learning
Time series forecasting with machine learningTime series forecasting with machine learning
Time series forecasting with machine learning
Dr Wei Liu
 
MANOVA SPSS
MANOVA SPSSMANOVA SPSS
MANOVA SPSS
Dr Athar Khan
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
Ashraf Uddin
 
R workshop xiv--Survival Analysis with R
R workshop xiv--Survival Analysis with RR workshop xiv--Survival Analysis with R
R workshop xiv--Survival Analysis with R
Vivian S. Zhang
 
Quantitative data analysis - John Richardson
Quantitative data analysis - John RichardsonQuantitative data analysis - John Richardson
Quantitative data analysis - John Richardson
OUmethods
 
Big data case study collection
Big data   case study collectionBig data   case study collection
Big data case study collection
Luis Miguel Salgado
 
Data Analysis
Data AnalysisData Analysis
Data Analysis
Zeynab Moosavi
 
Data Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingData Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series Forecasting
Derek Kane
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Exploratory Data Analysis
Exploratory Data AnalysisExploratory Data Analysis
Exploratory Data Analysis
Umair Shafique
 
Research question sb_faculty
Research question sb_facultyResearch question sb_faculty
Research question sb_faculty
Sandeep Buttan
 
Data analysis and Presentation
Data analysis and PresentationData analysis and Presentation
Data analysis and Presentation
Jignesh Kariya
 
Data Analysis, Intepretation
Data Analysis, IntepretationData Analysis, Intepretation
missing-data-and-multiple-imputation-in-clinical-epidemiolog
missing-data-and-multiple-imputation-in-clinical-epidemiolog missing-data-and-multiple-imputation-in-clinical-epidemiolog
missing-data-and-multiple-imputation-in-clinical-epidemiolog
simbycris
 
What is qq plot ?
What is qq plot ?What is qq plot ?
What is qq plot ?
knowledgette
 
How to collect and organize data
How to collect and organize dataHow to collect and organize data
How to collect and organize data
Frieda Brioschi
 
Research Method
Research MethodResearch Method
Research Method
YeonYuRae
 

What's hot (20)

Survival Analysis Lecture.ppt
Survival Analysis Lecture.pptSurvival Analysis Lecture.ppt
Survival Analysis Lecture.ppt
 
Diagnostic in poisson regression models
Diagnostic in poisson regression modelsDiagnostic in poisson regression models
Diagnostic in poisson regression models
 
Time series forecasting with machine learning
Time series forecasting with machine learningTime series forecasting with machine learning
Time series forecasting with machine learning
 
MANOVA SPSS
MANOVA SPSSMANOVA SPSS
MANOVA SPSS
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
R workshop xiv--Survival Analysis with R
R workshop xiv--Survival Analysis with RR workshop xiv--Survival Analysis with R
R workshop xiv--Survival Analysis with R
 
Quantitative data analysis - John Richardson
Quantitative data analysis - John RichardsonQuantitative data analysis - John Richardson
Quantitative data analysis - John Richardson
 
Data analysis copy
Data analysis   copyData analysis   copy
Data analysis copy
 
Big data case study collection
Big data   case study collectionBig data   case study collection
Big data case study collection
 
Data Analysis
Data AnalysisData Analysis
Data Analysis
 
Data Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingData Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series Forecasting
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Exploratory Data Analysis
Exploratory Data AnalysisExploratory Data Analysis
Exploratory Data Analysis
 
Research question sb_faculty
Research question sb_facultyResearch question sb_faculty
Research question sb_faculty
 
Data analysis and Presentation
Data analysis and PresentationData analysis and Presentation
Data analysis and Presentation
 
Data Analysis, Intepretation
Data Analysis, IntepretationData Analysis, Intepretation
Data Analysis, Intepretation
 
missing-data-and-multiple-imputation-in-clinical-epidemiolog
missing-data-and-multiple-imputation-in-clinical-epidemiolog missing-data-and-multiple-imputation-in-clinical-epidemiolog
missing-data-and-multiple-imputation-in-clinical-epidemiolog
 
What is qq plot ?
What is qq plot ?What is qq plot ?
What is qq plot ?
 
How to collect and organize data
How to collect and organize dataHow to collect and organize data
How to collect and organize data
 
Research Method
Research MethodResearch Method
Research Method
 

Similar to Forecasting time series powerful and simple

Data Stream Management
Data Stream ManagementData Stream Management
Data Stream Management
k_tauhid
 
IoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDBIoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDB
Ivo Andreev
 
7 QC - NEW.ppt
7 QC - NEW.ppt7 QC - NEW.ppt
7 QC - NEW.ppt
AmitGajbhiye9
 
prediction of_inventory_management
prediction of_inventory_managementprediction of_inventory_management
prediction of_inventory_management
FEG
 
Practical deep learning for computer vision
Practical deep learning for computer visionPractical deep learning for computer vision
Practical deep learning for computer vision
Eran Shlomo
 
Ecm time series forecast
Ecm time series forecastEcm time series forecast
Ecm time series forecast
Ayapparaj SKS
 
Forecasting_CO2_Emissions.pptx
Forecasting_CO2_Emissions.pptxForecasting_CO2_Emissions.pptx
Forecasting_CO2_Emissions.pptx
MOINDALVS
 
Effective monitoring with StatsD
Effective monitoring with StatsDEffective monitoring with StatsD
Effective monitoring with StatsD
Datadog
 
Time Series Anomaly Detection with .net and Azure
Time Series Anomaly Detection with .net and AzureTime Series Anomaly Detection with .net and Azure
Time Series Anomaly Detection with .net and Azure
Marco Parenzan
 
Searching Algorithms
Searching AlgorithmsSearching Algorithms
Searching Algorithms
Afaq Mansoor Khan
 
Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016 Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016
Alex Gilgur
 
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEEuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
HONGJOO LEE
 
Time Series.pptx
Time Series.pptxTime Series.pptx
Time Series.pptx
Ramakrishna Reddy Bijjam
 
SQLBits Module 2 RStats Introduction to R and Statistics
SQLBits Module 2 RStats Introduction to R and StatisticsSQLBits Module 2 RStats Introduction to R and Statistics
SQLBits Module 2 RStats Introduction to R and Statistics
Jen Stirrup
 
Towards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model CheckingTowards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model Checking
Akos Hajdu
 
Demand time series analysis and forecasting
Demand time series analysis and forecastingDemand time series analysis and forecasting
Demand time series analysis and forecasting
M Baddar
 
forecast.ppt
forecast.pptforecast.ppt
forecast.ppt
RijuDasgupta
 
4Developers 2015: Measure to fail - Tomasz Kowalczewski
4Developers 2015: Measure to fail - Tomasz Kowalczewski4Developers 2015: Measure to fail - Tomasz Kowalczewski
4Developers 2015: Measure to fail - Tomasz Kowalczewski
PROIDEA
 
Measure to fail
Measure to failMeasure to fail
Measure to fail
Tomasz Kowalczewski
 
Automatic Forecasting at Scale
Automatic Forecasting at ScaleAutomatic Forecasting at Scale
Automatic Forecasting at Scale
Sean Taylor
 

Similar to Forecasting time series powerful and simple (20)

Data Stream Management
Data Stream ManagementData Stream Management
Data Stream Management
 
IoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDBIoT with Azure Machine Learning and InfluxDB
IoT with Azure Machine Learning and InfluxDB
 
7 QC - NEW.ppt
7 QC - NEW.ppt7 QC - NEW.ppt
7 QC - NEW.ppt
 
prediction of_inventory_management
prediction of_inventory_managementprediction of_inventory_management
prediction of_inventory_management
 
Practical deep learning for computer vision
Practical deep learning for computer visionPractical deep learning for computer vision
Practical deep learning for computer vision
 
Ecm time series forecast
Ecm time series forecastEcm time series forecast
Ecm time series forecast
 
Forecasting_CO2_Emissions.pptx
Forecasting_CO2_Emissions.pptxForecasting_CO2_Emissions.pptx
Forecasting_CO2_Emissions.pptx
 
Effective monitoring with StatsD
Effective monitoring with StatsDEffective monitoring with StatsD
Effective monitoring with StatsD
 
Time Series Anomaly Detection with .net and Azure
Time Series Anomaly Detection with .net and AzureTime Series Anomaly Detection with .net and Azure
Time Series Anomaly Detection with .net and Azure
 
Searching Algorithms
Searching AlgorithmsSearching Algorithms
Searching Algorithms
 
Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016 Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016
 
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOMEEuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
EuroPython 2017 - PyData - Deep Learning your Broadband Network @ HOME
 
Time Series.pptx
Time Series.pptxTime Series.pptx
Time Series.pptx
 
SQLBits Module 2 RStats Introduction to R and Statistics
SQLBits Module 2 RStats Introduction to R and StatisticsSQLBits Module 2 RStats Introduction to R and Statistics
SQLBits Module 2 RStats Introduction to R and Statistics
 
Towards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model CheckingTowards Evaluating Size Reduction Techniques for Software Model Checking
Towards Evaluating Size Reduction Techniques for Software Model Checking
 
Demand time series analysis and forecasting
Demand time series analysis and forecastingDemand time series analysis and forecasting
Demand time series analysis and forecasting
 
forecast.ppt
forecast.pptforecast.ppt
forecast.ppt
 
4Developers 2015: Measure to fail - Tomasz Kowalczewski
4Developers 2015: Measure to fail - Tomasz Kowalczewski4Developers 2015: Measure to fail - Tomasz Kowalczewski
4Developers 2015: Measure to fail - Tomasz Kowalczewski
 
Measure to fail
Measure to failMeasure to fail
Measure to fail
 
Automatic Forecasting at Scale
Automatic Forecasting at ScaleAutomatic Forecasting at Scale
Automatic Forecasting at Scale
 

More from Ivo Andreev

Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2
Ivo Andreev
 
Architecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for BusinessArchitecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for Business
Ivo Andreev
 
Cybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and BadCybersecurity Challenges with Generative AI - for Good and Bad
Cybersecurity Challenges with Generative AI - for Good and Bad
Ivo Andreev
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AI
Ivo Andreev
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
Ivo Andreev
 
OpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and MisconceptionsOpenAI GPT in Depth - Questions and Misconceptions
OpenAI GPT in Depth - Questions and Misconceptions
Ivo Andreev
 
Cutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for EveryoneCutting Edge Computer Vision for Everyone
Cutting Edge Computer Vision for Everyone
Ivo Andreev
 
Collecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn DataCollecting and Analysing Spaceborn Data
Collecting and Analysing Spaceborn Data
Ivo Andreev
 
Collecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure OrbitalCollecting and Analysing Satellite Data with Azure Orbital
Collecting and Analysing Satellite Data with Azure Orbital
Ivo Andreev
 
Language Studio and Custom Models
Language Studio and Custom ModelsLanguage Studio and Custom Models
Language Studio and Custom Models
Ivo Andreev
 
CosmosDB for IoT Scenarios
CosmosDB for IoT ScenariosCosmosDB for IoT Scenarios
CosmosDB for IoT Scenarios
Ivo Andreev
 
Constrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project BonsaiConstrained Optimization with Genetic Algorithms and Project Bonsai
Constrained Optimization with Genetic Algorithms and Project Bonsai
Ivo Andreev
 
Azure security guidelines for developers
Azure security guidelines for developers Azure security guidelines for developers
Azure security guidelines for developers
Ivo Andreev
 
Autonomous Machines with Project Bonsai
Autonomous Machines with Project BonsaiAutonomous Machines with Project Bonsai
Autonomous Machines with Project Bonsai
Ivo Andreev
 
Global azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure LighthouseGlobal azure virtual 2021 - Azure Lighthouse
Global azure virtual 2021 - Azure Lighthouse
Ivo Andreev
 
Flux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JSFlux QL - Nexgen Management of Time Series Inspired by JS
Flux QL - Nexgen Management of Time Series Inspired by JS
Ivo Andreev
 
Azure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challengesAzure architecture design patterns - proven solutions to common challenges
Azure architecture design patterns - proven solutions to common challenges
Ivo Andreev
 
Industrial IoT on Azure
Industrial IoT on AzureIndustrial IoT on Azure
Industrial IoT on Azure
Ivo Andreev
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
Ivo Andreev
 
Flying a Drone with JavaScript and Computer Vision
Flying a Drone with JavaScript and Computer VisionFlying a Drone with JavaScript and Computer Vision
Flying a Drone with JavaScript and Computer Vision
Ivo Andreev
 

More from Ivo Andreev (20)

Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2Cybersecurity and Generative AI - for Good and Bad vol.2
Cybersecurity and Generative AI - for Good and Bad vol.2
 
Architecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for BusinessArchitecting AI Solutions in Azure for Business
Architecting AI Solutions in Azure for Business
 
Cybersecurity Challenges with Generative AI - for Good and Bad

Forecasting time series powerful and simple

  • 1. January 15th GLOBAL AI BOOTCAMP IS POWERED BY: Powerful yet Simple (or not that much) Forecasting Time Series AI and IoT Bulgaria Summit, 2022
  • 2. Speaker Bio • Software Architect @ o 19+ years professional experience • Microsoft Azure MVP • External Expert Horizon 2020, Eurostars-Eureka • External Expert InnoFund Denmark, RIF Cyprus • Business Interests o Web Development, SOA, Integration o IoT, Machine Learning, Computer Intelligence o Security & Performance Optimization • Contact ivelin.andreev@icb.bg www.linkedin.com/in/ivelin www.slideshare.net/ivoandreev
  • 3. Thanks to our Sponsors
  • 4. Upcoming Events Global Azure Bulgaria, 2022 May 14, 2022 Tickets (Eventbrite) Sessions (Sessionize)
  • 7. Agenda • Time Series? • Forecasting? • ML.NET • Azure ML Service • ARIMA/AutoARIMA • Regression • FB Prophet • Demo
  • 8. Takeaways Time Series o Introduction to Hierarchical Time Series o Overview of Time Series Forecasting Models o Time Series Analysis with Python ARIMA o Time Series Forecasting with ARIMA models o ARIMA, Auto ARIMA, Prophet, Regression (Youtube) SSA o A Brief Introduction to SSA o Forecast Service Demand with Time Series Analysis and ML.NET FB Prophet o FB Prophet Quickstart (FB GitHub) o Time Series Analysis using FB Prophet o Generate Accurate Forecasts with FB Prophet in Python
  • 9. Time Series – a sequence of observations taken over time Forecasting – the process of predicting for new data
  • 10. Describing or Forecasting • Data are Temporal o Unlike other data, the fact that a point is close to another is important • Time Series Analysis o Understanding Time Series and underlying causes o Create a mathematical model that describes data o Determine seasonal patterns, trends, relations to external factors o Note: assumptions are often in place (i.e. the form of data) • Forecasting o Scientific predictions based on historical time-stamped data o Univariate / Multivariate TS Forecasting o Note: Explanatory power is often low • Sample Data Look like…
      Time                       Value
      2021-11-01T00:00:00+02:00  66
      2021-11-01T01:00:00+02:00  29
      2021-11-01T02:00:00+02:00  6
      2021-11-01T03:00:00+02:00  8
      2021-11-01T04:00:00+02:00  91
      2021-11-01T05:00:00+02:00  145
      2021-11-01T06:00:00+02:00  14
      2021-11-01T07:00:00+02:00  19
      2021-11-01T08:00:00+02:00  64
      2021-11-01T09:00:00+02:00  4
      2021-11-01T10:00:00+02:00  22
      2021-11-01T11:00:00+02:00  65
      2021-11-01T12:00:00+02:00  30
      2021-11-01T13:00:00+02:00  152
      2021-11-01T14:00:00+02:00  30
      2021-11-01T15:00:00+02:00  17
      2021-11-01T16:00:00+02:00  9
      2021-11-01T17:00:00+02:00  11
      2021-11-01T18:00:00+02:00  19
      2021-11-01T19:00:00+02:00  76
      2021-11-01T20:00:00+02:00  117
      2021-11-01T21:00:00+02:00  152
      2021-11-01T22:00:00+02:00  53
      2021-11-01T23:00:00+02:00  3
      2021-11-02T00:00:00+02:00  13
  • 11. Practical Use Cases • Sample Data Sources o Sensor readings (environmental data, temperature, pressure, humidity) o Financial market data o Medical data (body parameters, heartbeat, pulse rate, blood pressure) • Sample Scenarios o Unit sales for each day in a store o Number of passengers at a station o Number of users of a web site o Liters of hot water used in a household o Stock price for a day o Diesel price for the next week o Water level of a dam during the year o Body weight over the year ☺
  • 12. Time Series come in various flavours
  • 13. Hierarchical Time Series Forecasting • Hierarchical TS o Evident hierarchical structure o Lower levels are nested (e.g. geographical split) • Grouped TS o Multiple non-nested levels of detail (e.g. category, retailer, colour) • Hierarchical Forecasting o A collection of techniques rather than another methodology o Generate a forecast that is consistent across the whole hierarchy o Forecasts shall add up • Approaches o Bottom-up, Top-down o Middle-out (Mixed) – Bottom-up (above middle), Top-down (below middle) o Reconciliation – forecast each level independently, determine coefficients with linear regression • Example hierarchy: Bulgaria → East (Varna, Burgas), West (Sofia)
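The bottom-up approach from slide 13 can be sketched as follows. This is a minimal illustration, assuming forecasts already exist for the leaf series; the function name `bottom_up` and all forecast numbers are hypothetical, and only the Bulgaria / East / West layout comes from the slide.

```python
# Hierarchy from the slide: each parent maps to its nested children.
hierarchy = {
    "Bulgaria": ["East", "West"],
    "East": ["Varna", "Burgas"],
    "West": ["Sofia"],
}

# Hypothetical two-step-ahead forecasts for the leaf (city) series.
leaf_forecasts = {
    "Varna": [10, 12],
    "Burgas": [7, 8],
    "Sofia": [30, 33],
}


def bottom_up(node, hierarchy, leaf_forecasts):
    """Forecast a node as the step-wise sum of its children's forecasts."""
    if node not in hierarchy:  # leaf: use its own forecast
        return leaf_forecasts[node]
    children = [bottom_up(c, hierarchy, leaf_forecasts) for c in hierarchy[node]]
    return [sum(step) for step in zip(*children)]


east = bottom_up("East", hierarchy, leaf_forecasts)
country = bottom_up("Bulgaria", hierarchy, leaf_forecasts)
```

Because every parent is computed as a sum of its children, the forecasts add up across the hierarchy by construction, which is the consistency requirement the slide names.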
  • 14. Quacks like Time Series, Moves like … • Do you have enough data? o More data = more options for aggregation, model tuning, model testing • Time horizon for prediction? o Shorter time horizon can be predicted with higher confidence • Are forecasts updateable or static? o Retrain after new data are available for more accurate results • Frequency of forecasts? o Downsampling and upsampling of data affect accuracy (in both directions) • Is time series stationary? o Time series properties do not depend on observation time?
  • 15. Time Series Stationarity • Stationarity o Statistical properties of TS (mean, variance) do not depend on time of observation o Rule of thumb: non-stationary data are hard to forecast reliably with models that assume stationarity o Conclusion: for such models, non-stationary TS data need to be converted to stationary • Differencing o Method to transform time series and remove time-dependent attributes (trend, seasonality) o Lag difference could be calculated on a larger time window (i.e. window size) o difference(t) = observation(t) - observation(t-1) o Example: 1 2 3 4 5 6 7 8 9 10 → Differencing: 1 1 1 1 1 1 1 1 1 o inverted(t) = differenced(t) + observation(t-1) • Note: Some TS forecasting methods do not require stationary input (e.g. ARIMA), as preliminary differencing is performed (ARMA does, though)
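The differencing and inversion formulas from slide 15 can be sketched in plain Python. The function names and the `lag` parameter are illustrative assumptions; the example series is the one from the slide.

```python
def difference(series, lag=1):
    """difference(t) = observation(t) - observation(t - lag)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def invert_difference(first_values, differenced, lag=1):
    """Rebuild the series from its first `lag` observations and the differences."""
    restored = list(first_values)
    for d in differenced:
        # inverted(t) = differenced(t) + observation(t - lag)
        restored.append(d + restored[-lag])
    return restored


observations = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
diffed = difference(observations)
restored = invert_difference(observations[:1], diffed)
```

The linear trend in `observations` disappears after one round of differencing (every difference is 1), and the inverse step recovers the original values as long as the first observation is kept.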
  • 16. Time Series Analysis Observations close in time are often correlated
  • 17. Time Series Analysis TS Analysis provides techniques to understand data and break it into components: • Trend (Tt) o Smooth general long-term tendency to increase, decrease or both • Seasonality (St) o Rhythmic forces that operate over shorter intervals (e.g. 1h, 1d, 1w, 1m) • Cyclic (Ct) o Cyclic behaviour that repeats over a long period (e.g. 4y, 1y) • Random Noise (Rt) o Random irregular observations that cannot be explained (unpredictable) • Additive Model: Yt = Tt + St + Ct + Rt • Multiplicative Model: Yt = Tt * St * Ct * Rt • Mixed Model: Yt = Tt * Ct + St * Rt; Yt = Tt + St * Ct * Rt
  • 18. Advanced Observation: Time series tend to display significant autocorrelation • Autocorrelation o Measures the relationship between TS and a lagged version of it (T, T-k) o Meaning: ± 1 – perfect correlation; 0 – no correlation • Measured with Pearson Correlation o Preconditions: normal distribution, no significant outliers, continuous variables o Cross-correlation – the correlation is observed across different lags • Augmented Dickey-Fuller Test (adfuller function from the Python statsmodels package) o Null hypothesis (H0) – the TS has a unit root (non-stationary) o Alternate hypothesis (HA) – the TS is stationary • ADF p-value < 0.05 • H0 rejected = TS is stationary
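As a rough illustration of slide 18, the correlation between a series and a lagged copy of itself can be computed with a hand-written Pearson coefficient. The function names and the synthetic periodic series are assumptions made for this sketch; it does not replace a proper stationarity test such as ADF.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(series, k):
    """Correlate the series at time T with itself at time T-k (k >= 1)."""
    return pearson(series[k:], series[:-k])


# Synthetic series with a period of 4 observations.
series = [0, 1, 0, -1] * 6
same_phase = lagged_correlation(series, 4)    # one full period apart
opposite_phase = lagged_correlation(series, 2)  # half a period apart
```

A lag equal to the period gives a correlation close to +1 and half a period gives close to -1, which is how a seasonal pattern shows up when correlation is inspected across lags.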
  • 19. Common Data Preparation • Imputation o Replacing missing data with substitute values • Frequency / Resampling o Could be too high for a model compared to prediction front o Irregular time series may require resampling at regular intervals • Outliers o Extreme values need to be identified and handled o Outlier = Value ∉ [Q1-1.5*IQR; Q3+1.5*IQR] • Imputation decision flow o Does missing data have meaning? YES (numerical): convert missing values to a meaningful number o NO: check the type of data o Large dataset, little data missing at random: remove instances with missing data o Large, temporally ordered dataset: replace missing data with preceding values o Does data follow a simple distribution? NO: impute with a simple ML model; YES: impute with the mean value; YES, with outliers: impute missing values with the median
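The outlier rule from slide 19 (a value outside [Q1-1.5*IQR; Q3+1.5*IQR]) can be sketched with the standard library. The function name and the sample readings are hypothetical; note that quartile conventions differ between tools, and this sketch uses the default method of `statistics.quantiles`.

```python
from statistics import quantiles


def find_outliers(values, factor=1.5):
    """Return values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _median, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with one extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 12, 100]
outliers = find_outliers(readings)
```

Whether an identified outlier should be dropped, capped or kept depends on the domain; the rule only flags candidates.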
  • 20. Forecasting Algorithms Appreciate how genius was made simple for you
  • 21. Naïve Algorithms Baseline Note: Naïve algorithms are often referred to as “benchmark models” Naïve Model • Forecasts for any horizon match the last value SNaïve Model (Seasonal Naïve) • Assumes a seasonal component with time window T • Forecast repeats the values of the last T timestamps
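Both baselines from slide 21 fit in a few lines. This is a minimal sketch; the function names and the toy history are assumptions for illustration.

```python
def naive_forecast(history, horizon):
    """Every future step equals the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Every future step repeats the value from one season (T steps) earlier."""
    last_season = history[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


history = [1, 2, 3, 4, 1, 2, 3, 4]  # toy series with a season of T = 4
flat = naive_forecast(history, 3)
seasonal = seasonal_naive_forecast(history, 6, season_length=4)
```

A more elaborate model is usually only worth keeping if it beats these benchmarks on held-out data.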
  • 22. ARIMA (AutoRegressive Integrated Moving Average) • Auto Regressive – linear combination of past values of the variable o Assumes that the future will resemble the past o Inaccurate when an unseen event happens • Moving Average – linear combination of past forecast errors o Smooths the impact of short-term fluctuations o Simple MA – arithmetic mean of the previous 5, 10, 20, 100 etc. values o Exponential MA – weighted average that gives greater importance to the most recent values • Integrated – differencing to make the time series stationary • ARIMA Parameters o p – number of observations from the Past to forecast future o d – degree of Differencing (number of times raw observations are differenced for stationarity) o q – size of the window to calculate forecast Quality errors • ARIMA(p,d,q) = const + (weighted sum of last p values) + (weighted sum of last q errors) after d differencing
  • 23. Seasonal ARIMA • ARIMA (p, d, q) is a non-seasonal ARIMA • SARIMA (p, d, q)(P, D, Q)m o P – number of seasonal autoregressive terms o D – seasonal differencing order (number of seasonal transformations to make TS stationary) o Q – moving-average order of seasonal component o m – periods in a season (e.g. 12 for monthly data) • The parameter space becomes larger • Grid search for optimal parameters
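The two smoothing averages named on slide 22 can be sketched as below. The function names and the smoothing factor `alpha` are assumptions for illustration; these are the simple and exponential averages of the observed values, which is a different thing from the error-based MA term inside an ARIMA model.

```python
def simple_moving_average(values, window):
    """Arithmetic mean of the last `window` values at each position."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def exponential_moving_average(values, alpha):
    """Weighted average where recent values count more (0 < alpha <= 1)."""
    ema = [values[0]]  # seed with the first observation
    for v in values[1:]:
        ema.append(alpha * v + (1 - alpha) * ema[-1])
    return ema


sma = simple_moving_average([1, 2, 3, 4, 5], window=3)
ema = exponential_moving_average([10, 20, 30], alpha=0.5)
```

A larger `alpha` makes the exponential average react faster to new observations, while a larger `window` makes the simple average smoother.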
  • 24. AutoARIMA • Identifies the optimal parameters of ARIMA (p, d, q) o pip install pyramid-arima (mimics R auto.arima; the package has since been renamed pmdarima) o .fit() does the magic o Utilizes AIC (Akaike Information Criterion) to pick the best model (smaller = better) • AIC = N*ln(SSe/N) + 2K (N – number of observations, SSe – sum of squared errors, K – number of model parameters) • Conducts differencing tests to determine the order of differencing • Pros o Saves time o One of the simplest techniques for TS forecasting o Eliminates the need for in-depth understanding of statistics o Reduces the chance of human error due to misinterpretation • Code: model = auto_arima(train, [42 other optional arguments]); model.fit(train)
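To make the "smaller = better" selection on slide 24 concrete, here is a small sketch of the least-squares form of AIC, N*ln(SSe/N) + 2K. The function name, the candidate labels and their error figures are hypothetical, and an AutoARIMA implementation may use a likelihood-based variant of the criterion.

```python
import math


def aic_least_squares(n_obs, sse, n_params):
    """AIC = N * ln(SSe / N) + 2K for a model fitted by least squares."""
    return n_obs * math.log(sse / n_obs) + 2 * n_params


# Hypothetical candidates: (sum of squared errors, number of parameters).
candidates = {
    "small model": (52.0, 2),
    "large model": (51.5, 6),
}
n_obs = 100
scores = {
    name: aic_least_squares(n_obs, sse, k) for name, (sse, k) in candidates.items()
}
best = min(scores, key=scores.get)  # the lowest AIC wins
```

The larger model fits slightly better, but the 2K penalty outweighs the gain, so the smaller model is selected.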
  • 25. Singular Spectrum Analysis (SSA) • Novel powerful technique • 2 complementary stages o Decomposition – extract independent components from time series o Reconstruction – reconstruct the series for forecasting, after removing noise • Pros o Works with arbitrary statistical process o No assumptions for data (e.g. stationarity) • ML.NET ForecastBySsa Parameters o trainSize – number of train samples (rows) from beginning (e.g. 300) o seriesLength – length of series in buffer (how much data to use to train on) o windowSize – length of the window on the series (seasonality) o horizon – number of values to forecast (e.g. 24) o confidenceLevel – degree of certainty (e.g. 95% of intervals are expected to contain the real value)
  • 26. SSA, How it Works • Checkpoint o Avoids replay of all previous data, provide only most recent observations o But if this creates a drift, a clean retrain on last observations (i.e. 1 month) may be better • How does it work
      MLContext mlContext = new MLContext();                                        // All ML.NET operations are within context
      IDataView dv = mlContext.Data.LoadFromTextFile(…)                             // Step 1: Load data from file
      var pipeline = mlContext.Forecasting.ForecastBySsa([Parameters],…)            // Step 2: SSA Pipeline
      SsaForecastingTransformer forecaster = pipeline.Fit(dv);                      // Step 3: Data training
      …                                                                             // Step 4: Evaluate (i.e. calculate RMSE)
      var forecastEngine = forecaster.CreateTimeSeriesEngine(mlContext);
      ModelOutput forecast = forecastEngine.Predict();                              // Step 5: Load trained model and predict
      forecastEngine.CheckPoint(mlContext, outputModelPath);                        // Save Checkpoint
      model = mlContext.Model.Load(file, out DataViewSchema schema);                // Load from Checkpoint
      forecastEngine = model.CreateTimeSeriesEngine<TimeSeriesData, ChangePointPrediction>(mlContext);
  • 27. Regression Model • Forecasting Recap o Data are ordered in series as {Time: Value} pairs; No external knowledge • Regression o Predicting a single numeric value o Time Series Forecasting involves Regression under the hood o Can be applied to non-ordered data o Shall be applied multiple times to predict the same horizon • Feature Engineering & Extraction o Date – Year, Month, Day, Hour o Lag – What has happened at T-1, T-2, T-12, T-24, T-48, T-n observations o Delta – What is the difference from T-1, T-2, T-12, T-24, T-48, T-n observations o Moving Average – Mean(2), Mean(12), Mean(24), Mean (48), … o Sum – Sum(2), Sum(12), Sum(24), Sum(48),… o Domain knowledge – Weather, Distance (not GPS), Ref. Price
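The feature engineering list on slide 27 (date parts, lags, deltas, moving averages, sums) can be sketched for an hourly series as follows. The function name, the chosen lags and window, and the exact definition of the delta feature are assumptions for illustration; the six values are the first readings of the sample data from slide 10, with the timezone offset omitted.

```python
from datetime import datetime, timedelta
from statistics import mean


def build_features(timestamps, values, lags=(1, 2), window=2):
    """Turn a {time: value} series into regression rows with a target."""
    first = max(max(lags) + 1, window)  # earliest index with full history
    rows = []
    for t in range(first, len(values)):
        past = values[:t]  # only observations before t, to avoid leakage
        row = {
            "year": timestamps[t].year,
            "month": timestamps[t].month,
            "day": timestamps[t].day,
            "hour": timestamps[t].hour,
            f"mean_{window}": mean(past[-window:]),
            f"sum_{window}": sum(past[-window:]),
            "target": values[t],
        }
        for k in lags:
            row[f"lag_{k}"] = past[-k]  # value observed k steps earlier
            row[f"delta_{k}"] = past[-1] - past[-1 - k]  # change over last k steps
        rows.append(row)
    return rows


start = datetime(2021, 11, 1)
timestamps = [start + timedelta(hours=i) for i in range(6)]
values = [66, 29, 6, 8, 91, 145]
rows = build_features(timestamps, values)
```

Each row can then be fed to any regression algorithm; predicting several steps ahead means either applying the model repeatedly or training one model per horizon step.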
  • 28. Azure ML Service • Azure Auto ML (Forecasting uses AutoARIMA under the hood) o The easiest yet still powerful way to do ML o Optimizes the iterative, time-consuming tasks of ML o Azure Auto ML Python SDK o Azure ML Studio (ML Studio Classic retires August 2024) • Steps: Upload File, Select Task Type, Parameters, Metrics
  • 29. FB Prophet • Created by Facebook in 2017 • Pros o Trains quickly, highly accurate o No statistics background required (like AutoARIMA) o Can also be used for multivariate TS analysis o Handles outliers and missing data well o Strong at series with seasonal effects and few seasons in training data o Handles random changes due to special events (e.g. market events) • Under the Hood o Requires prophet Python package o Uses additive regression model Y(t) = Trend(t) + Seasonality(t) + Holiday(t) + Error(t)
  • 30. Prophet – Easy to Use, Hard to Install • Prophet does not run on Python 3.9 • What’s the easiest way? • Install Azure Data Science VM (< 4 Cores is sluggish) • Find the 3.8 Kernel from Jupyter Lab • Activate kernel • Use Conda package manager to install • Conda has its own C++ compiler to build the packages • Select a channel
      C:> activate py38_default
      (py38_default) C:> conda install pystan -c conda-forge
      (py38_default) C:> conda install -c conda-forge fbprophet
  • 31. Demo Ref. Comparing Prophet and Deep Learning to ARIMA in Forecasting Wholesale Food Prices (2021)
  • 32. Thanks to our Sponsors