Forecasting (Revenue for S&P 500 Companies)
Using the baselineforecast Package
by Konstantin Golyaev
Microsoft Azure Machine Learning
Konstantin Golyaev, useR! 2016, Stanford, CA, 6/30/2016
Motivation
• “Prediction is very difficult, especially about the future”
• © Niels Bohr (allegedly)
• We want to:
• Forecast multiple time series at different horizons
• Leverage useful external information, when available
• Employ state-of-the-art methods
Note: won’t show any results due to five-minute time constraint 
Two Ways to Forecast
1. Time-series methods (ARIMA, ETS, STL, etc.)
• Great for modeling trend and seasonality
2. Regression-based methods (elastic net, random forest, boosted regression trees, etc.)
• Derive power from external information (features)
Can we get the best of both worlds?
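Illustration
• Take a small window of the series
• Fit a model to it, make forecasts a few steps ahead
• Move the window forward
• Repeat the process
• Continue until out of data, combine results when done
[Figure: forecasts from successive windows lined up against the actual values: y_7 with f_{7|6}, y_8 with f_{8|6} and f_{8|7}, y_9 with f_{9|7}, and so on.]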
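The window procedure on this slide can be sketched in a few lines. This is only an illustration in Python, not the package's R code: the function name `rolling_origin_forecasts`, the fixed window width, and the window-mean "model" are all assumptions made to keep the example short. Any real learner would replace the placeholder forecast.

```python
from statistics import mean


def rolling_origin_forecasts(series, window, horizon, step=1):
    """Slide a fixed-size window over `series` and forecast `horizon` steps ahead.

    Returns a list of (origin, target, forecast, actual) tuples, where
    `origin` and `target` are 1-based positions, so a row (6, 7, f, y)
    corresponds to f_{7|6} lined up against y_7.
    """
    results = []
    origin = window  # 1-based position of the last observation in the window
    while origin < len(series):
        train = series[origin - window:origin]
        # Placeholder model (an assumption): the window mean, reused for every step ahead.
        forecast = mean(train)
        for h in range(1, horizon + 1):
            target = origin + h
            if target > len(series):
                break  # no actual value left to compare against
            results.append((origin, target, forecast, series[target - 1]))
        origin += step  # move the window forward
    return results


# Made-up numbers purely for illustration.
y = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 16.0, 18.0, 17.0]
rows = rolling_origin_forecasts(y, window=6, horizon=2)
for origin, target, forecast, actual in rows:
    print(f"f_{{{target}|{origin}}} = {forecast:.2f}   y_{target} = {actual:.2f}")
```

The returned rows are the "combined results": each forecast sits next to the actual value it was trying to predict, which is the table a regression learner or an error metric can consume.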
What Else Can We Do?
Date-Based Features
Examples:
• Year
• Quarter
• Month
• Week
• Holidays
• Etc…
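As a rough illustration of the features listed above, the sketch below derives them from a single date using only the Python standard library. The function name `date_features` and the tiny `HOLIDAYS` set are hypothetical; the slides do not say how the package computes these, and a real project would load a proper holiday calendar.

```python
from datetime import date

# Hypothetical holiday calendar; a real project would load one for its market.
HOLIDAYS = {date(2016, 1, 1), date(2016, 7, 4), date(2016, 12, 25)}


def date_features(d):
    """Return calendar features for a single date as a plain dict."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "week": d.isocalendar()[1],  # ISO 8601 week number
        "is_holiday": d in HOLIDAYS,
    }


features = date_features(date(2016, 6, 30))
print(features)
```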
Lags or Other Functions of y_t
• Position-based lags in R come out wrong when the series has gaps in its index (e.g. missing months/days)
• So we implemented gap-aware lags ourselves
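To make the gap problem concrete, here is a small Python sketch (not the package's implementation, which the slides do not show). It contrasts a position-based lag with one that looks up the previous calendar month. The helpers `add_months` and `lag_by_index`, and the choice to represent a monthly series as a dict keyed by `(year, month)`, are assumptions for illustration.

```python
def add_months(year, month, offset):
    """Shift a (year, month) pair by `offset` months (negative = back in time)."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def lag_by_index(series, k):
    """Lag a monthly series by k months using the calendar, not row position.

    `series` maps (year, month) -> value. Months whose lagged period is
    missing from the index get None rather than a value from the wrong month.
    """
    return {
        (year, month): series.get(add_months(year, month, -k))
        for (year, month) in series
    }


# Made-up data; March 2016 is missing from the index.
y = {(2016, 1): 100.0, (2016, 2): 110.0, (2016, 4): 130.0, (2016, 5): 140.0}

# Position-based lag: shift the values down one row, ignoring the gap.
keys = sorted(y)
positional = dict(zip(keys, [None] + [y[key] for key in keys[:-1]]))

calendar = lag_by_index(y, 1)
print(positional[(2016, 4)])  # February's value, which is two months back
print(calendar[(2016, 4)])    # None, since March was never observed
```

Returning a missing value for April is the safer behaviour: the model sees that the one-month lag is unknown instead of silently training on a two-month lag.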
External Series as Features
• This is very much problem-specific
• What we used in various projects:
• Macroeconomic data from Federal Reserve Economic Data (FRED)
• Web search trends from Bing, Google, etc.
• Tweets scored for sentiment
• External business drivers such as promotions
Implementation
• All code is combined into the baselineforecast R package
• Function ConstructDataset() takes the series y_t and external data X_t, and returns a data frame with target and features
• Function FitModel() interfaces with the caret package to train any regression learning algorithm and perform time series cross-validation
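The slides do not show the signature or internals of ConstructDataset(), so the following is only a guess at the general idea, written in Python rather than R: join the target series with external data and lagged targets into one row per period. The name `build_feature_rows`, the integer period index, and the rule of dropping incomplete rows are all assumptions.

```python
def build_feature_rows(y, x, lags=(1,)):
    """Join a target series with external data into one row per period.

    `y` maps period -> target value; `x` maps period -> dict of external
    features. Periods are consecutive integers here (an assumption made
    to keep the sketch short), so a lag of k is simply period - k.
    Rows with any missing lag or missing external data are dropped.
    """
    rows = []
    for period in sorted(y):
        row = {"period": period, "target": y[period]}
        for k in lags:
            row[f"lag_{k}"] = y.get(period - k)
        # A period without external data gets a None marker and is dropped below.
        row.update(x.get(period, {"missing_x": None}))
        if all(value is not None for value in row.values()):
            rows.append(row)
    return rows


# Made-up numbers purely for illustration.
y = {1: 10.0, 2: 11.0, 3: 13.0, 4: 12.0}
x = {1: {"promo": 0}, 2: {"promo": 1}, 3: {"promo": 0}, 4: {"promo": 1}}

table = build_feature_rows(y, x, lags=(1, 2))
for row in table:
    print(row)
```

A table of this shape (one target column plus feature columns) is what lets a generic regression learner be applied to a forecasting problem.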
Future Work
• Exploratory Data Analysis
• Computing Prediction Intervals
• Decide on the license/distribution model
Have questions?
Ping me at Konstantin.Golyaev@Microsoft.com
