SFSCon 2020
Statistical forecasting with exogenous factors
Ideas on how to improve time series models
Geri Skenderi
Prof. Marco Cristani
Outline
Introduction to Forecasting
Data and Statistical analysis
Method and Experiments
Conclusion
Introduction
Ancient forecasting
Introduction
Modern forecasting
George E.P. Box
1919 - 2013
$y'_t = c + \phi_1 y'_{t-1} + \cdots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} + \varepsilon_t$
ARIMA forecast equation for time point t
Rob J. Hyndman
1967
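As a rough illustration of the ARIMA forecast equation, the sketch below evaluates its right-hand side for one time point, given past values of the differenced series and past errors. The function name `arma_one_step` and all numbers are illustrative assumptions, not taken from the talk. The error at time t is unknown when forecasting, so it is set to its expected value of zero.

```python
def arma_one_step(c, phi, theta, y_hist, eps_hist):
    """One-step forecast of the differenced series y'_t.

    phi[i] multiplies y'_{t-1-i}; theta[j] multiplies eps_{t-1-j}.
    y_hist and eps_hist are ordered from oldest to newest.
    """
    if len(y_hist) < len(phi) or len(eps_hist) < len(theta):
        raise ValueError("not enough history for the requested orders")
    # Autoregressive part: weighted sum of the p most recent values.
    ar_part = sum(p * y_hist[-1 - i] for i, p in enumerate(phi))
    # Moving-average part: weighted sum of the q most recent errors.
    ma_part = sum(q * eps_hist[-1 - j] for j, q in enumerate(theta))
    return c + ar_part + ma_part


# Hypothetical coefficients and history (p = 2, q = 1).
forecast = arma_one_step(
    c=1.0, phi=[0.5, 0.2], theta=[0.4],
    y_hist=[2.0, 4.0], eps_hist=[0.5],
)
print(forecast)
```

In practice the coefficients are estimated from data by a library rather than chosen by hand.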
Introduction
Forecasting methods
Data
Time series
• Forecasts are performed on time series data, which can be defined as a sequence of
observations taken over regular time intervals.
• This data can be univariate or multivariate.
• In this presentation, we will see how to harness the power of multivariate data, in an
era where data is everywhere.
Data
The dataset
• Raw sales data provided by a large Italian fast-fashion company.
• Multiple features to pose a forecasting problem.
• We will focus on stores located in three major cities: Milan, Turin and Rome.
Data
The dataset
Data from two shops, both located inside malls.
Data
The dataset
Data from five shops, four in a mall and one on
the street.
Data
The dataset
Data from five shops, one in a mall and four on
the street.
Where do we go from here?
Use exogenous data to pose a multivariate
forecasting problem
Statistical Analysis
Milan Data
Sales and seasons
Check if certain periods of the year lead to
more/less sales.
Sales and the weather
Check if the weather leads to more/less
sales.
Sales and the day of the week
Check if certain days of the week lead to
more/less sales.
Statistical Analysis
Sales and day of the week
Statistical Analysis
Sales and seasons
Statistical Analysis
Sales and weather
● Intuition: The weather affects our mood and habits, so it could have an impact on sales and shopping behavior.
● More data pre-processing is involved. The external weather signals contain measurements, but also weather phenomena as categorical data:
1. Integer-encode the weather phenomena.
2. Create a binary weather signal: "rainy": -1 or "not rainy": 1.
3. The original sales signal must be additionally cleaned and modified.
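The weather pre-processing steps could be sketched as follows. The phenomenon labels, the helper names and the choice to count "storm" as rainy are assumptions made for illustration. Only the integer encoding and the -1 / 1 convention come from the slide.

```python
def integer_encode(labels):
    """Map each distinct label to an integer, in order of first appearance."""
    codes = {}
    encoded = []
    for label in labels:
        if label not in codes:
            codes[label] = len(codes)
        encoded.append(codes[label])
    return encoded, codes


def rain_signal(labels, rainy_labels=frozenset({"rain", "storm"})):
    """Binary weather signal: -1 for a rainy day, 1 otherwise."""
    return [-1 if label in rainy_labels else 1 for label in labels]


# Hypothetical daily weather phenomena.
phenomena = ["clear", "rain", "fog", "rain", "storm", "clear"]
encoded, codes = integer_encode(phenomena)
print(encoded)                 # [0, 1, 2, 1, 3, 0]
print(rain_signal(phenomena))  # [1, -1, 1, -1, -1, 1]
```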
Statistical Analysis
Sales and weather
● To keep the samples as homogeneous as possible, high-sales seasons are removed. The signal is then taken by year, divided into ordinary seasons, and further into work days and off days.
● This structure gives rise to the following set of variables, for each shop:
○ 4 months (March-June, September-December).
○ Approximately 4 weeks per month.
○ 4 days for the working days (Mon-Thu), 3 days for the weekend (Fri-Sun).
● Undersampling is used as a final step to remove statistical inconsistencies.
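The slides do not say which undersampling scheme was used. A minimal sketch, assuming plain random undersampling of the larger group, is shown below; the function name, the seed and the sales values are hypothetical.

```python
import random


def undersample(group_a, group_b, seed=0):
    """Randomly drop samples from the larger group until both have equal size."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    n = min(len(group_a), len(group_b))
    return rng.sample(group_a, n), rng.sample(group_b, n)


# Hypothetical daily sales for rainy and not-rainy days.
rainy = [120, 95, 140]
not_rainy = [60, 70, 55, 80, 65]
rainy_bal, not_rainy_bal = undersample(rainy, not_rainy)
print(len(rainy_bal), len(not_rainy_bal))  # 3 3
```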
Statistical Analysis
Sales and weather
Milan spring-summer 2017 workdays
           # Samples  Mean sold
Rainy      22         104.50
Not-Rainy  22         60.95
t-test p-value: 0.002

Milan spring-summer 2017 off-days
           # Samples  Mean sold
Rainy      16         171.56
Not-Rainy  16         122.25
t-test p-value: 0.02

Milan spring-summer 2019 workdays
           # Samples  Mean sold
Rainy      26         140.42
Not-Rainy  22         100.58
t-test p-value: 0.01

Milan spring-summer 2019 off-days
           # Samples  Mean sold
Rainy      18         266.00
Not-Rainy  18         185.28
t-test p-value: 0.001
An independent two-sample t-test is used to determine whether the difference in mean sales is significant.
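For reference, the statistic behind an independent two-sample t-test can be computed as in the sketch below. The slides do not state whether the pooled-variance (Student) or the Welch variant was used; the pooled form is assumed here, and the sample values are invented.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance of both groups.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Invented samples, only to show the call.
t_stat, dof = pooled_t_statistic([2, 4, 6], [1, 2, 3])
print(t_stat, dof)
```

The p-value follows from the t distribution with the returned degrees of freedom; a statistics library such as SciPy (`scipy.stats.ttest_ind`) reports both values directly.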
Method
Long Short-Term Memory
Method
EXO-LSTM
An LSTM network which makes use of exogenous temporal and weather data and
uses it for forecasting.
[Figure: EXO-LSTM diagram with an ENCODE block and the labels "Temporal data" and "Weather data"]
Method
Feeding data to the LSTM
● The data is transformed into input-output sequences.
● Standardization is applied: Z = (x - μ) / σ
● By the end of the whole procedure, the data is normalized, divided into sequences, and contains the original signal along with the exogenous temporal and weather signals.
[Figure: sequence of time steps t-7, t-6, t-5, t-4, t-3, t-2, t-1, t]
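A minimal sketch of standardization followed by the split into input-output sequences is given below. The window length of 7 is an assumption read off the t-7 ... t labels in the figure, the population standard deviation is assumed for σ, and the sales values are made up.

```python
from statistics import mean, pstdev


def standardize(values):
    """Apply Z = (x - mu) / sigma using the mean and std of `values`."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]


def make_sequences(series, window=7):
    """Split a series into (input window, next value) pairs."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


# Made-up univariate sales signal with 10 observations.
sales = [10, 12, 9, 14, 20, 25, 18, 11, 13, 10]
pairs = make_sequences(standardize(sales), window=7)
print(len(pairs))  # 3
```

With exogenous signals, each element of the series would be a feature vector instead of a single number. The mean and standard deviation would normally be computed on the training split only.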
Experiments
Long term forecasting
● Uses weekly data, with the goal of having an efficient way of checking possible trends during the year.
● SARIMA and SARIMAX are used as classical methods, while the LSTM is used as the deep learning representative.
● Four LSTM architectures were built:
○ MONO-LSTM: Trained on the univariate total sales signal.
○ EXO1-LSTM: Trained on the total sales signal and the exogenous temporal signal.
○ EXO2-LSTM: Trained on the total sales signal and the exogenous weather signal.
○ EXO3-LSTM: Trained on all available features: the sales, weather and temporal signals.
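The four variants differ only in which signals they see. One way to express that is a feature map like the one below. The column names `sales`, `temporal` and `weather` and the helper `build_inputs` are hypothetical; the network itself is not shown.

```python
# Which input signals each model variant is trained on.
FEATURE_SETS = {
    "MONO-LSTM": ["sales"],
    "EXO1-LSTM": ["sales", "temporal"],
    "EXO2-LSTM": ["sales", "weather"],
    "EXO3-LSTM": ["sales", "temporal", "weather"],
}


def build_inputs(rows, model_name):
    """Keep only the signals that the given model variant is trained on."""
    columns = FEATURE_SETS[model_name]
    return [[row[col] for col in columns] for row in rows]


# Two hypothetical time steps.
rows = [
    {"sales": 120.0, "temporal": 3, "weather": -1},
    {"sales": 95.0, "temporal": 4, "weather": 1},
]
print(build_inputs(rows, "EXO2-LSTM"))  # [[120.0, -1], [95.0, 1]]
```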
RMSE of weekly forecasting based on location
Dataset  SARIMA      SARIMAX     MONO-LSTM   EXO1-LSTM   EXO2-LSTM   EXO3-LSTM
Milan    299.299334  320.983730  269.000266  259.300154  294.289433  276.278388
Rome     107.928035  103.974567  149.296986  95.339082   127.586807  117.149285
Turin    137.123418  142.099703  146.110333  137.757753  174.867309  163.300861

SMAPE of weekly forecasting based on location
Dataset  SARIMA      SARIMAX     MONO-LSTM   EXO1-LSTM   EXO2-LSTM   EXO3-LSTM
Milan    6.099036    8.041129    7.412993    7.515069    8.366195    8.250909
Rome     12.846811   12.232047   16.026479   10.029953   12.068225   12.251298
Turin    11.014910   12.382435   13.085100   12.106654   13.937898   13.659325

RMSE and SMAPE of weekly forecasting based on all the data
Measure  SARIMA      SARIMAX     MONO-LSTM   EXO1-LSTM   EXO2-LSTM   EXO3-LSTM
RMSE     181.450262  189.019333  188.135862  164.132330  198.914516  185.576178
SMAPE    9.986919    10.885204   12.174857   9.883892    11.457439   11.387177
At the weekly scale, the effects of the weather are less pronounced and harder to interpret, while temporal information such as the period of the year is key.
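The two error measures reported in the weekly results can be computed as sketched below. SMAPE has several definitions in the literature and the slides do not say which one was used; the common form, with the mean of the absolute actual and forecast values in the denominator, is assumed here. The numbers are illustrative.

```python
from math import sqrt


def rmse(actual, forecast):
    """Root mean squared error."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def smape(actual, forecast):
    """Symmetric MAPE in percent; assumes no pair is zero in both series."""
    terms = [
        abs(f - a) / ((abs(a) + abs(f)) / 2)
        for a, f in zip(actual, forecast)
    ]
    return 100 * sum(terms) / len(terms)


# Illustrative values only.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(rmse(actual, forecast))
print(smape(actual, forecast))
```

RMSE is expressed in the units of the sales signal, so it is not comparable across cities with different sales volumes. SMAPE is a percentage and is easier to compare across cities.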
Experiments
Short term forecasting
● Uses daily data, with the goal of knowing what is going to happen each day so that the shelves can be filled with products.
● We use only the four LSTM architectures here, since the best of them (EXO1-LSTM) already surpassed the classical baselines on the aggregated weekly data.
● This forecasting strategy, along with the different EXO-LSTM models, can also aid us in understanding customer behaviour.
RMSE of daily forecasting based on location
Dataset  MONO-LSTM   EXO1-LSTM   EXO2-LSTM   EXO3-LSTM
Milan    92.681814   97.419602   92.756764   87.843425
Rome     103.320297  101.16842   107.342307  99.044623
Turin    40.815775   34.641631   35.821523   36.963587

SMAPE of daily forecasting based on location
Dataset  MONO-LSTM   EXO1-LSTM   EXO2-LSTM   EXO3-LSTM
Milan    17.082440   19.892359   20.611555   16.801701
Rome     23.468583   21.877664   22.764811   22.043665
Turin    20.751887   18.803312   18.672907   17.939686

RMSE and SMAPE of daily forecasting based on all the data
Measure  MONO-LSTM   EXO1-LSTM   EXO2-LSTM   EXO3-LSTM
RMSE     78.939295   77.743218   78.640198   74.617212
SMAPE    20.434304   20.191111   20.683091   18.928351
The effects of the weather and time together become clear and are important for
improving forecasts and understanding customer behaviour.
[Figure legend: Ground Truth, MONO-LSTM, EXO1-LSTM, EXO2-LSTM, EXO3-LSTM]
Here we can see how knowing that it is going to rain helps the EXO-LSTM forecast more sales
than when it does not possess this information (MONO).
Conclusion
Key takeaways: How to improve time series models?
1. Make sure you are making the best use out of your data and other available data.
2. Analyze your data to see if you can extract “hidden” information.
3. Use neural models when dealing with multivariate problems.
4. Don’t be afraid of trying new things.
Conclusion
Future work you could focus on
Various Neural Architectures
Ensemble Methods
Feature Selection
Conclusion
Code and experiment details
Available for free on my GitHub:
https://github.com/geriskenderi/exo-lstm
Get in Touch
We would love to hear your thoughts
LINKEDIN: @geriskenderi
TWITTER: @geriskenderi
EMAIL: geri.skenderi@univr.it
TWITTER: @MarcoCristani
EMAIL: marco.cristani@univr.it
THANK YOU

Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStrDeep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
Deep Dive: Getting Funded with Jason Jason Lemkin Founder & CEO @ SaaStr
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
 

SFScon 2020 - Geri Skenderi - Statistical forecasting with exogenous factors

  • 1. SFSCon 2020 Statistical forecasting with exogenous factors Ideas on how to improve time series models Geri Skenderi Prof. Marco Cristani
  • 2. Outline Introduction to Forecasting Data and Statistical analysis Method and Experiments Conclusion
  • 5. Introduction: Modern forecasting. George E.P. Box (1919 - 2013), Rob J. Hyndman (1967). ARIMA forecast equation for time point t: $y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$
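As a minimal sketch of how the ARIMA equation on this slide yields a one-step forecast, the snippet below sums the autoregressive and moving-average terms on the differenced series. The function name `arma_one_step` and every coefficient value are hypothetical and chosen only for illustration; the current error term is replaced by its expected value of zero.

```python
def arma_one_step(c, phi, theta, past_values, past_errors):
    """One-step forecast of the differenced series y'_t.

    past_values[0] is y'_{t-1}, past_values[1] is y'_{t-2}, and so on;
    past_errors uses the same ordering for the residuals.
    """
    if len(past_values) < len(phi) or len(past_errors) < len(theta):
        raise ValueError("not enough history for the given model orders")
    ar_part = sum(p * y for p, y in zip(phi, past_values))   # phi_i * y'_{t-i}
    ma_part = sum(q * e for q, e in zip(theta, past_errors))  # theta_j * eps_{t-j}
    return c + ar_part + ma_part


# Hypothetical coefficients and history (p = 2, q = 1), for illustration only.
forecast = arma_one_step(
    c=0.5,
    phi=[0.6, -0.2],
    theta=[0.3],
    past_values=[2.0, 1.0],
    past_errors=[0.1],
)
print(forecast)
```

In practice the coefficients are estimated from data by a fitting routine; the sketch only shows how the fitted terms combine.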
  • 8. Data: Time series
      • Forecasts are performed on time series data, which can be defined as a sequence of observations taken at regular time intervals.
      • This data can be univariate or multivariate.
      • In this presentation, we will see how to harness the power of multivariate data, in an era where data is everywhere.
  • 9. Data: The dataset
      • Raw sales data provided by a big Italian fast-fashion company.
      • Multiple features to pose a forecasting problem.
      • We will focus on stores located in three major cities: Milan, Turin and Rome.
  • 10. Data The dataset Data from two shops, both located inside malls.
  • 11. Data The dataset Data from five shops, four in a mall and one on the street.
  • 12. Data The dataset Data from five shops, one in a mall and four on the street.
  • 13. Where do we go from here? Use exogenous data to pose a multivariate forecasting problem
  • 14. Statistical Analysis: Milan Data
      • Sales and seasons: Check if certain periods of the year lead to more/less sales.
      • Sales and the weather: Check if the weather leads to more/less sales.
      • Sales and the day of the week: Check if certain days of the week lead to more/less sales.
  • 17. Statistical Analysis: Sales and weather
      ● Intuition: The weather affects our mood and habits, so it could have an impact on sales and shopping behavior.
      ● More data pre-processing is involved. The external weather signals contain measurements, but also weather phenomena as categorical data:
          1. Integer-encode the weather phenomena.
          2. Create a binary weather signal: "rainy": -1 or "not rainy": 1.
          3. The original sales signal must additionally be cleaned and modified.
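The first two pre-processing steps on this slide can be sketched as follows. The phenomenon labels and the set of labels counted as "rainy" are assumptions made for illustration, since the slides do not list the dataset's actual categories.

```python
# Hypothetical weather phenomena; the real dataset's labels are not shown in the slides.
observations = ["clear", "rain", "fog", "rain", "clouds"]

# Step 1: integer-encode each distinct phenomenon in order of first appearance.
codes = {}
for label in observations:
    codes.setdefault(label, len(codes))
encoded = [codes[label] for label in observations]

# Step 2: binary weather signal, -1 for rainy and 1 for not rainy (as on the slide).
RAINY_LABELS = {"rain"}  # assumption: which phenomena count as "rainy"
binary = [-1 if label in RAINY_LABELS else 1 for label in observations]

print(encoded)
print(binary)
```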
  • 18. Statistical Analysis: Sales and weather
      ● To keep the data homogeneous, high-sales seasons are removed. The signal is then taken by year, divided into ordinary seasons, and further divided into work days and off days.
      ● This structure gives rise to the following set of variables for each shop:
          ○ 4 months per ordinary season (March-June, September-December).
          ○ Approximately 4 weeks per month.
          ○ 4 days for the working days (Mon-Thu), 3 days for the weekend (Fri-Sun).
      ● Undersampling is used as a final step to remove statistical inconsistencies.
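The slide does not say how the undersampling is done. One common reading, assumed here, is random undersampling that trims both groups to the size of the smaller one; the function name `undersample` and the sample values are hypothetical.

```python
import random


def undersample(group_a, group_b, seed=0):
    """Randomly trim both groups to the size of the smaller one."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    n = min(len(group_a), len(group_b))
    return rng.sample(group_a, n), rng.sample(group_b, n)


# Hypothetical daily sales counts, for illustration only.
rainy_sales = [110, 95, 120, 101, 130, 88]
dry_sales = [60, 72, 55, 64]
rainy_balanced, dry_balanced = undersample(rainy_sales, dry_sales)
print(len(rainy_balanced), len(dry_balanced))
```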
  • 19. Statistical Analysis: Sales and weather
      An independent two-sample t-test is used to determine whether there is a significant difference.

      Period                               Rainy: # samples   Rainy: mean sold   Not-Rainy: # samples   Not-Rainy: mean sold   t-test p-value
      Milan spring-summer 2017 workdays    22                 104.50             22                     60.95                  0.002
      Milan spring-summer 2017 off-days    16                 171.56             16                     122.25                 0.02
      Milan spring-summer 2019 workdays    26                 140.42             22                     100.58                 0.01
      Milan spring-summer 2019 off-days    18                 266.00             18                     185.28                 0.001
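A rough sketch of the statistic behind an independent two-sample t-test is shown below, using only the standard library. It assumes the pooled-variance (Student) form; the slides do not say whether that or Welch's variant was used. The function name `two_sample_t` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    na, nb = len(a), len(b)
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical items sold per day, for illustration only.
rainy = [110, 95, 120, 101]
not_rainy = [60, 72, 55, 64]
t_stat, dof = two_sample_t(rainy, not_rainy)
print(t_stat, dof)
```

Turning the statistic into a p-value requires the CDF of the t distribution, which the standard library does not provide; `scipy.stats.ttest_ind` computes both in one call.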
  • 22. Method: EXO-LSTM. An LSTM network that takes exogenous temporal and weather data as additional inputs and uses them for forecasting. (Diagram: temporal data and weather data feeding an "ENCODE" block.)
  • 23. Method: Feeding data to the LSTM
      ● The data is transformed into input-output sequences.
      ● Standardization is applied: Z = (x - 𝜇)/σ
      ● By the end of the whole procedure, the data is normalized, divided into sequences, and contains the original signal along with the exogenous temporal and weather signals.
      (Diagram: time steps t-7, t-6, t-5, t-4, t-3, t-2, t-1, t.)
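The standardization and sequence-building steps on this slide might look roughly like the sketch below. The window of seven input steps predicting step t is read off the slide's t-7 ... t diagram and is an assumption, as are the function names, the feature layout `[sales, temporal, weather]`, the use of the population standard deviation, and all sample values.

```python
from statistics import mean, pstdev


def standardize(series):
    """Apply Z = (x - mu) / sigma; assumes the series is not constant."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]


def make_sequences(rows, window=7):
    """Pair each window of feature rows with the sales value that follows it."""
    pairs = []
    for end in range(window, len(rows)):
        inputs = rows[end - window:end]  # steps t-window .. t-1
        target = rows[end][0]            # sales (first feature) at step t
        pairs.append((inputs, target))
    return pairs


# Hypothetical signals, for illustration only.
sales = [10, 12, 9, 14, 11, 13, 15, 16, 12, 10]
temporal = [1, 2, 3, 4, 5, 6, 7, 1, 2, 3]     # e.g. day of the week
weather = [1, 1, -1, 1, -1, 1, 1, 1, -1, 1]   # -1 rainy, 1 not rainy
rows = [list(r) for r in zip(standardize(sales), temporal, weather)]
pairs = make_sequences(rows, window=7)
print(len(pairs))
```

In a real pipeline the mean and standard deviation would typically be computed on the training split only and reused for validation and test data.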
  • 24. Experiments: Long term forecasting
      ● Uses weekly data, with the goal of having an efficient way of checking possible trends during the year.
      ● SARIMA and SARIMAX are used as classical methods, while the LSTM is used as the deep learning representative.
      ● Four LSTM architectures were built:
          ○ MONO-LSTM: Trained on the univariate total sales signal.
          ○ EXO1-LSTM: Trained on the total sales signal and the exogenous temporal signal.
          ○ EXO2-LSTM: Trained on the total sales signal and the exogenous weather signal.
          ○ EXO3-LSTM: Trained on all the available features: sales, weather and temporal signals.
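The four variants differ only in which signals they receive as input. A small sketch of that mapping is given below; the feature names `sales`, `temporal` and `weather`, the helper `select_features`, and the sample record are hypothetical and not taken from the authors' code.

```python
# Input signals per architecture, following the list on the slide.
FEATURES = {
    "MONO-LSTM": ["sales"],
    "EXO1-LSTM": ["sales", "temporal"],
    "EXO2-LSTM": ["sales", "weather"],
    "EXO3-LSTM": ["sales", "temporal", "weather"],
}


def select_features(record, model_name):
    """Pick the input values a given architecture would be trained on."""
    return [record[name] for name in FEATURES[model_name]]


# Hypothetical single time step, for illustration only.
record = {"sales": 120.0, "temporal": 14, "weather": -1}
for name in FEATURES:
    print(name, select_features(record, name))
```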
  • 25. RMSE of weekly forecasting based on location

      Dataset   SARIMA       SARIMAX      MONO-LSTM    EXO1-LSTM    EXO2-LSTM    EXO3-LSTM
      Milan     299.299334   320.983730   269.000266   259.300154   294.289433   276.278388
      Rome      107.928035   103.974567   149.296986   95.339082    127.586807   117.149285
      Turin     137.123418   142.099703   146.110333   137.757753   174.867309   163.300861

      SMAPE of weekly forecasting based on location

      Dataset   SARIMA       SARIMAX      MONO-LSTM    EXO1-LSTM    EXO2-LSTM    EXO3-LSTM
      Milan     6.099036     8.041129     7.412993     7.515069     8.366195     8.250909
      Rome      12.846811    12.232047    16.026479    10.029953    12.068225    12.251298
      Turin     11.014910    12.382435    13.085100    12.106654    13.937898    13.659325

      RMSE and SMAPE of weekly forecasting based on all the data

      Measure   SARIMA       SARIMAX      MONO-LSTM    EXO1-LSTM    EXO2-LSTM    EXO3-LSTM
      RMSE      181.450262   189.019333   188.135862   164.132330   198.914516   185.576178
      SMAPE     9.986919     10.885204    12.174857    9.883892     11.457439    11.387177
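For reference, the two error measures reported on this slide can be sketched as below. The slides do not state which SMAPE variant was used; the form here, the mean of 2|f - a| / (|a| + |f|) expressed in percent, is one common definition and is an assumption. The sample values are hypothetical.

```python
from math import sqrt


def rmse(actual, forecast):
    """Root mean squared error."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def smape(actual, forecast):
    """Symmetric MAPE in percent; undefined when a pair is (0, 0)."""
    terms = [2 * abs(f - a) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)


# Hypothetical weekly sales and forecasts, for illustration only.
actual = [100, 120, 90]
forecast = [110, 115, 95]
print(rmse(actual, forecast), smape(actual, forecast))
```

RMSE is expressed in the units of the sales signal, so it is not directly comparable across cities with different sales volumes, whereas SMAPE is scale-free.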
  • 26. The effects of the weather are less crucial and harder to interpret, while temporal information such as the period of the year is key.
  • 27. Experiments: Short term forecasting
      ● Uses daily data, with the goal of knowing what will happen each day so that the shelves can be stocked accordingly.
      ● We use only the four LSTM architectures here, as we have already seen they surpass the baselines.
      ● This forecasting strategy, along with the different EXO-LSTM models, can also help us understand customer behaviour.
  • 28. RMSE of daily forecasting based on location

      Dataset   MONO-LSTM    EXO1-LSTM    EXO2-LSTM    EXO3-LSTM
      Milan     92.681814    97.419602    92.756764    87.843425
      Rome      103.320297   101.16842    107.342307   99.044623
      Turin     40.815775    34.641631    35.821523    36.963587

      SMAPE of daily forecasting based on location

      Dataset   MONO-LSTM    EXO1-LSTM    EXO2-LSTM    EXO3-LSTM
      Milan     17.082440    19.892359    20.611555    16.801701
      Rome      23.468583    21.877664    22.764811    22.043665
      Turin     20.751887    18.803312    18.672907    17.939686

      RMSE and SMAPE of daily forecasting based on all the data

      Measure   MONO-LSTM    EXO1-LSTM    EXO2-LSTM    EXO3-LSTM
      RMSE      78.939295    77.743218    78.640198    74.617212
      SMAPE     20.434304    20.191111    20.683091    18.928351
  • 29. The effects of the weather and time together become clear and are important for improving forecasts and understanding customer behaviour. (Plot legend: Ground Truth, MONO-LSTM, EXO1-LSTM, EXO2-LSTM, EXO3-LSTM.)
  • 30. Here we can see how knowing that it is going to rain helps the EXO-LSTM forecast more sales than when it does not possess this information (MONO).
  • 32. Conclusion: Key takeaways. How to improve time series models?
      1. Make sure you are making the best use of your data and other available data.
      2. Analyze your data to see if you can extract “hidden” information.
      3. Use neural models when dealing with multivariate problems.
      4. Don’t be afraid of trying new things.
  • 33. Conclusion: Future work you could focus on
      • Various neural architectures
      • Ensemble methods
      • Feature selection
  • 34. Conclusion: Code and experiment details. Available for free on my GitHub: https://github.com/geriskenderi/exo-lstm
  • 35. Get in Touch We would love to hear your thoughts LINKEDIN @geriskenderi TWITTER @geriskenderi EMAIL geri.skenderi@univr.it TWITTER @MarcoCristani EMAIL marco.cristani@univr.it