Advances in automatic time series forecasting

Invited talk, Australian Statistical Conference, Adelaide, July 2012.


  1. Rob J Hyndman. Advances in automatic time series forecasting. (Title slide.)
  2. Outline: 1 Motivation; 2 Exponential smoothing; 3 ARIMA modelling; 4 Time series with complex seasonality; 5 Hierarchical and grouped time series; 6 Functional time series; 7 Grouped functional time series.
  3. Motivation. (Section title slide, followed by a sequence of motivating figures with no text.)
  9. Motivation:
      1. It is common in business to have over 1000 products that need forecasting at least monthly.
      2. Forecasts are often required by people who are untrained in time series analysis.
      3. Some types of data can be decomposed into a large number of univariate time series that each need to be forecast.
      Specifications: automatic forecasting algorithms must
      - determine an appropriate time series model;
      - estimate the parameters;
      - compute the forecasts with prediction intervals.
  14. Example: corticosteroid sales. [Figure: monthly corticosteroid drug sales in Australia; total scripts (millions) by year, 1995-2010.]
  15. Example: corticosteroid sales. [Figure: automatic ARIMA forecasts of the same series.]
  16. Outline (section divider; next section: 2 Exponential smoothing).
  17. Exponential smoothing methods:

                                          Seasonal component
          Trend component                 N (None)   A (Additive)   M (Multiplicative)
          N  (None)                       N,N        N,A            N,M
          A  (Additive)                   A,N        A,A            A,M
          Ad (Additive damped)            Ad,N       Ad,A           Ad,M
          M  (Multiplicative)             M,N        M,A            M,M
          Md (Multiplicative damped)      Md,N       Md,A           Md,M

      The named methods:
      - N,N: simple exponential smoothing
      - A,N: Holt's linear method
      - Ad,N: additive damped trend method
      - M,N: exponential trend method
      - Md,N: multiplicative damped trend method
      - A,A: additive Holt-Winters' method
      - A,M: multiplicative Holt-Winters' method

      There are 15 separate exponential smoothing methods; each can have additive or multiplicative errors, giving 30 separate models.
  27. General notation: E,T,S, read as "ExponenTial Smoothing", where the three letters denote the Error, Trend and Seasonal components. Examples:
      - A,N,N: simple exponential smoothing with additive errors
      - A,A,N: Holt's linear method with additive errors
      - M,A,M: multiplicative Holt-Winters' method with multiplicative errors
      Innovations state space models: all ETS models can be written in innovations state space form; the additive- and multiplicative-error versions give the same point forecasts but different prediction intervals.
  34. Automatic forecasting (from Hyndman et al., IJF, 2002):
      - Apply each of the 30 models that are appropriate to the data.
      - Optimize parameters and initial values using MLE (or some other criterion).
      - Select the best model using the AIC: AIC = −2 log(Likelihood) + 2p, where p = number of parameters.
      - Produce forecasts using the best model.
      - Obtain prediction intervals from the underlying state space model.
      The method performed very well in the M3 competition. (A sketch of the selection step follows below.)
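A minimal sketch of the AIC selection step in R, assuming the monthly h02 series shown on the following slides is available (it ships with Hyndman's data packages); ets() performs the same search automatically over all admissible models, so the explicit loop here is purely illustrative:

    library(forecast)
    # Fit a few fixed specifications; each string is (Error, Trend, Seasonal).
    candidates <- c("ANN", "AAN", "MAM")
    fits <- lapply(candidates, function(m) ets(h02, model = m))
    # AIC = -2 log(Likelihood) + 2p for each fitted model; keep the smallest.
    aics <- sapply(fits, function(f) f$aic)
    best <- fits[[which.min(aics)]]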
  40. Exponential smoothing. [Figure: automatic ETS forecasts of total scripts (millions), 1995-2010.]
  41. Automatic ETS forecasts in R:

          library(forecast)
          fit <- ets(h02)
          fcast <- forecast(fit)
          plot(fcast)

      [Figure: the resulting forecasts with prediction intervals.]
  42. The fitted model:

          > fit
          ETS(M,Md,M)
          Smoothing parameters:
            alpha = 0.3318, beta = 4e-04, gamma = 1e-04, phi = 0.9695
          Initial states:
            l = 0.4003, b = 1.0233
            s = 0.8575 0.8183 0.7559 0.7627 0.6873 1.2884
                1.3456 1.1867 1.1653 1.1033 1.0398 0.9893
          sigma: 0.0651
                 AIC       AICc        BIC
          -121.97999 -118.68967  -65.57195
  43. References:
      - Hyndman, Koehler, Snyder & Grose (2002). "A state space framework for automatic forecasting using exponential smoothing methods". International Journal of Forecasting 18(3), 439-454.
      - Hyndman, Koehler, Ord & Snyder (2008). Forecasting with Exponential Smoothing: The State Space Approach. Berlin: Springer-Verlag. www.exponentialsmoothing.net
      - Hyndman (2012). forecast: Forecasting functions for time series. cran.r-project.org/package=forecast
  44. Outline (section divider; next section: 3 ARIMA modelling).
  45. How does auto.arima() work? A non-seasonal ARIMA process:

          φ(B)(1 − B)^d y_t = c + θ(B)ε_t

      We need to select appropriate orders p, q and d, and decide whether to include the constant c.
      The Hyndman & Khandakar (JSS, 2008) algorithm:
      - Select the number of differences d via the KPSS unit root test.
      - Select p, q and c by minimising the AIC.
      - Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
  47. How does auto.arima() work? A seasonal ARIMA process:

          Φ(B^m)φ(B)(1 − B)^d (1 − B^m)^D y_t = c + Θ(B^m)θ(B)ε_t

      We need to select appropriate orders p, q, d, P, Q and D, and decide whether to include the constant c.
      The Hyndman & Khandakar (JSS, 2008) algorithm:
      - Select the number of differences d via the KPSS unit root test.
      - Select D using the OCSB unit root test.
      - Select p, q, P, Q and c by minimising the AIC.
      - Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants. (A sketch of these steps in R follows below.)
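The individual steps are exposed as functions in the forecast package; a minimal sketch, again assuming h02 is loaded (note that recent versions of auto.arima() minimise the bias-corrected AICc by default):

    library(forecast)
    d <- ndiffs(h02, test = "kpss")    # number of first differences (KPSS test)
    D <- nsdiffs(h02, test = "ocsb")   # number of seasonal differences (OCSB test)
    # Stepwise search over p, q, P, Q and the constant c.
    fit <- auto.arima(h02, d = d, D = D, stepwise = TRUE)
    summary(fit)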
  48. Auto ARIMA. [Figure: automatic ARIMA forecasts of total scripts (millions), 1995-2010.]
  49. Automatic ARIMA forecasts in R:

          fit <- auto.arima(h02)
          fcast <- forecast(fit)
          plot(fcast)

      [Figure: the resulting forecasts with prediction intervals.]
  50. The fitted model:

          > fit
          Series: h02
          ARIMA(3,1,3)(0,1,1)[12]
          Coefficients:
                    ar1      ar2     ar3      ma1     ma2     ma3     sma1
                -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
          s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651
          sigma^2 estimated as 0.002706: log likelihood = 290.25
          AIC = -564.5   AICc = -563.71   BIC = -538.48
  51. References:
      - Hyndman & Khandakar (2008). "Automatic time series forecasting: the forecast package for R". Journal of Statistical Software 27(3).
      - Hyndman (2011). "Major changes to the forecast package". robjhyndman.com/researchtips/forecast3/
      - Hyndman (2012). forecast: Forecasting functions for time series. cran.r-project.org/package=forecast
  52. Outline (section divider; next section: 4 Time series with complex seasonality).
  53. Examples. [Figure: US finished motor gasoline products, thousands of barrels per day, weekly, 1992-2004.]
  54. Examples. [Figure: number of calls to a large American bank (7am-9pm), per 5-minute interval, 3 March to 12 May.]
  55. Examples. [Figure: Turkish electricity demand (GW), daily, 2000-2008.]
  56. TBATS model (Trigonometric seasonality, Box-Cox transform, ARMA errors, Trend, Seasonal components). Let y_t denote the observation at time t.

      Box-Cox transformation:
          y_t^(ω) = (y_t^ω − 1)/ω  if ω ≠ 0;    y_t^(ω) = log y_t  if ω = 0.

      Measurement equation, with M seasonal periods m_1, ..., m_M:
          y_t^(ω) = l_{t−1} + φ b_{t−1} + Σ_{i=1..M} s^(i)_{t−m_i} + d_t

      Global and local trend:
          l_t = l_{t−1} + φ b_{t−1} + α d_t
          b_t = (1 − φ) b + φ b_{t−1} + β d_t

      ARMA(p,q) error:
          d_t = Σ_{i=1..p} φ_i d_{t−i} + Σ_{j=1..q} θ_j ε_{t−j} + ε_t

      Fourier-like (trigonometric) seasonal terms, with k_i harmonics for period m_i:
          s^(i)_t = Σ_{j=1..k_i} s^(i)_{j,t}
          s^(i)_{j,t}  =  s^(i)_{j,t−1} cos λ^(i)_j + s*^(i)_{j,t−1} sin λ^(i)_j + γ^(i)_1 d_t
          s*^(i)_{j,t} = −s^(i)_{j,t−1} sin λ^(i)_j + s*^(i)_{j,t−1} cos λ^(i)_j + γ^(i)_2 d_t
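In the forecast package, multiple seasonal periods are declared with msts() and the model above is fitted with tbats(); a hedged sketch for the call-centre example, where the object name calls is illustrative and the two periods (169 five-minute intervals per day, 845 per week) are taken from the fitted model on a later slide:

    library(forecast)
    calls_msts <- msts(calls, seasonal.periods = c(169, 845))
    fit <- tbats(calls_msts)   # omega, trend, ARMA orders and k_i chosen automatically
    fcast <- forecast(fit, h = 169 * 14)   # forecast two weeks ahead
    plot(fcast)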
  63. Forecasts from TBATS(0.999, {2,2}, 1, {<52.1785714285714,8>}):

          fit <- tbats(gasoline)
          fcast <- forecast(fit)
          plot(fcast)

      [Figure: forecasts of US gasoline, thousands of barrels per day.]
  64. Forecasts from TBATS(1, {3,1}, 0.987, {<169,5>, <845,3>}):

          fit <- tbats(callcentre)
          fcast <- forecast(fit)
          plot(fcast)

      [Figure: forecasts of call arrivals per 5-minute interval.]
  65. Forecasts from TBATS(0, {5,3}, 0.997, {<7,3>, <354.37,12>, <365.25,4>}):

          fit <- tbats(turk)
          fcast <- forecast(fit)
          plot(fcast)

      [Figure: forecasts of Turkish electricity demand (GW).]
  66. References:
      - The automatic algorithm is described in De Livera, Hyndman & Snyder (2011). "Forecasting time series with complex seasonal patterns using exponential smoothing". Journal of the American Statistical Association 106(496), 1513-1527.
      - A slightly improved algorithm is implemented in Hyndman (2012). forecast: Forecasting functions for time series. cran.r-project.org/package=forecast
      - More work required!
  67. Outline (section divider; next section: 5 Hierarchical and grouped time series).
  68. Introduction. [Diagram: a two-level hierarchy with Total at the top; A, B and C below; and AA, AB, AC, BA, BB, BC, CA, CB, CC at the bottom.]
      Examples: manufacturing product hierarchies; pharmaceutical sales; net labour turnover.
  72. Hierarchical/grouped time series:
      - A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.
      - A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways. Example: daily numbers of calls to HP call centres are grouped by product type and by location of call centre.
      - Forecasts should be "aggregate consistent", unbiased and minimum variance.
      - How should forecast intervals be computed?
  77. Notation:
      - Y_t: observed aggregate of all series at time t.
      - Y_{X,t}: observation on series X at time t.
      - Y_{i,t}: vector of all series at level i at time t.
      - K: number of levels in the hierarchy (excluding the Total).
      - Y_t (in bold) = [Y_t, Y_{1,t}, ..., Y_{K,t}]′: all series stacked from the top down.
      For the one-level hierarchy with Total above A, B and C (illustrated numerically below):

          Y_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}]′ = S Y_{K,t},  where Y_{K,t} = [Y_{A,t}, Y_{B,t}, Y_{C,t}]′ and

          S = [1 1 1;
               1 0 0;
               0 1 0;
               0 0 1]

      is the summing matrix.
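A quick numerical illustration of the identity Y_t = S Y_{K,t} in base R, with made-up bottom-level values:

    S <- rbind(c(1, 1, 1),   # Total = A + B + C
               diag(3))      # bottom-level series A, B, C
    y_bottom <- c(4, 2, 3)   # hypothetical observations on A, B, C at time t
    S %*% y_bottom           # stacked vector (Total, A, B, C) = (9, 4, 2, 3)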
  83. Hierarchical data. For a two-level hierarchy (Total; A, B, C; with children AX, AY, AZ, BX, BY, BZ, CX, CY, CZ):

          Y_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}, Y_{AX,t}, Y_{AY,t}, Y_{AZ,t}, Y_{BX,t}, Y_{BY,t}, Y_{BZ,t}, Y_{CX,t}, Y_{CY,t}, Y_{CZ,t}]′ = S Y_{K,t},

      where Y_{K,t} holds the nine bottom-level series and the 13 × 9 summing matrix is

          S = [1 1 1 1 1 1 1 1 1;
               1 1 1 0 0 0 0 0 0;
               0 0 0 1 1 1 0 0 0;
               0 0 0 0 0 0 1 1 1;
               I_9]

      with I_9 the 9 × 9 identity matrix.
  85. Grouped data. The bottom-level series AX, AY, BX, BY can be aggregated in two non-hierarchical ways, by letter (A, B) or by symbol (X, Y):

          Y_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{X,t}, Y_{Y,t}, Y_{AX,t}, Y_{AY,t}, Y_{BX,t}, Y_{BY,t}]′ = S Y_{K,t},

          S = [1 1 1 1;
               1 1 0 0;
               0 0 1 1;
               1 0 1 0;
               0 1 0 1;
               1 0 0 0;
               0 1 0 0;
               0 0 1 0;
               0 0 0 1]

      with Y_{K,t} = [Y_{AX,t}, Y_{AY,t}, Y_{BX,t}, Y_{BY,t}]′.
  87. Forecasts. Key idea: forecast reconciliation.
      - Ignore the structural constraints and forecast every series of interest independently.
      - Adjust the forecasts to impose the constraints.
      Let Ŷ_n(h) be the vector of initial ("base") forecasts for horizon h, made at time n, stacked in the same order as Y_t. Since Y_t = S Y_{K,t}, write

          Ŷ_n(h) = S β_n(h) + ε_h,

      where β_n(h) = E[Y_{K,n+h} | Y_1, ..., Y_n] and ε_h has zero mean and covariance matrix Σ_h. Estimate β_n(h) using GLS? The revised forecasts are then Ỹ_n(h) = S β̂_n(h).
  95. Optimal combination forecasts. The GLS estimate gives

          Ỹ_n(h) = S β̂_n(h) = S(S′ Σ_h† S)^{−1} S′ Σ_h† Ŷ_n(h)

      (revised forecasts on the left, base forecasts on the right), where Σ_h† is a generalized inverse of Σ_h.
      - Problem: we don't know Σ_h, and it is hard to estimate.
      - Solution: assume ε_h ≈ S ε_{K,h}, where ε_{K,h} is the forecast error at the bottom level. Then Σ_h ≈ S Ω_h S′, where Ω_h = Var(ε_{K,h}). If the Moore-Penrose generalized inverse is used, then (S′ Σ_h† S)^{−1} S′ Σ_h† = (S′S)^{−1} S′ (a derivation follows below), and so

          Ỹ_n(h) = S(S′S)^{−1} S′ Ŷ_n(h).
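The slides state the Moore-Penrose simplification without proof; the following short derivation (an editorial reconstruction, not part of the deck) assumes Σ_h = S Ω_h S′ holds exactly, with Ω_h invertible and S of full column rank:

    \[
    \Sigma_h^{\dagger} = S(S'S)^{-1}\,\Omega_h^{-1}\,(S'S)^{-1}S'
    \]
    % This satisfies the four Moore-Penrose conditions; for instance,
    % \Sigma_h \Sigma_h^{\dagger} \Sigma_h
    %   = S\Omega_h S'\, S(S'S)^{-1}\Omega_h^{-1}(S'S)^{-1}S'\, S\Omega_h S'
    %   = S\Omega_h S' = \Sigma_h .
    \[
    S'\Sigma_h^{\dagger}S = \Omega_h^{-1},
    \qquad
    S'\Sigma_h^{\dagger} = \Omega_h^{-1}(S'S)^{-1}S',
    \]
    \[
    (S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger}
    = \Omega_h\,\Omega_h^{-1}(S'S)^{-1}S'
    = (S'S)^{-1}S'.
    \]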
  105. Optimal combination forecasts: Ỹ_n(h) = S(S′S)^{−1} S′ Ŷ_n(h).
      - GLS reduces to OLS.
      - This is an optimal weighted average of the base forecasts.
      - There are computational difficulties in big hierarchies due to the size of the S matrix.
      - The weights S(S′S)^{−1}S′ are independent of the data!
      For the one-level hierarchy (Total; A, B, C), the weights S(S′S)^{−1}S′ are (checked numerically below):

          [ 0.75  0.25  0.25  0.25;
            0.25  0.75 −0.25 −0.25;
            0.25 −0.25  0.75 −0.25;
            0.25 −0.25 −0.25  0.75]
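These weights are easy to verify in base R, reusing the summing matrix for the small hierarchy:

    S <- rbind(c(1, 1, 1), diag(3))        # Total / A, B, C
    W <- S %*% solve(t(S) %*% S) %*% t(S)  # S (S'S)^{-1} S'
    round(W, 2)                            # reproduces the 0.75 / +-0.25 matrix above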
  112. Optimal combination forecasts. For the two-level hierarchy (Total; A, B, C; AA, ..., CC), the 13 × 13 weight matrix S(S′S)^{−1}S′ is:

          0.69  0.23  0.23  0.23  0.08  0.08  0.08  0.08  0.08  0.08  0.08  0.08  0.08
          0.23  0.58 −0.17 −0.17  0.19  0.19  0.19 −0.06 −0.06 −0.06 −0.06 −0.06 −0.06
          0.23 −0.17  0.58 −0.17 −0.06 −0.06 −0.06  0.19  0.19  0.19 −0.06 −0.06 −0.06
          0.23 −0.17 −0.17  0.58 −0.06 −0.06 −0.06 −0.06 −0.06 −0.06  0.19  0.19  0.19
          0.08  0.19 −0.06 −0.06  0.73 −0.27 −0.27 −0.02 −0.02 −0.02 −0.02 −0.02 −0.02
          0.08  0.19 −0.06 −0.06 −0.27  0.73 −0.27 −0.02 −0.02 −0.02 −0.02 −0.02 −0.02
          0.08  0.19 −0.06 −0.06 −0.27 −0.27  0.73 −0.02 −0.02 −0.02 −0.02 −0.02 −0.02
          0.08 −0.06  0.19 −0.06 −0.02 −0.02 −0.02  0.73 −0.27 −0.27 −0.02 −0.02 −0.02
          0.08 −0.06  0.19 −0.06 −0.02 −0.02 −0.02 −0.27  0.73 −0.27 −0.02 −0.02 −0.02
          0.08 −0.06  0.19 −0.06 −0.02 −0.02 −0.02 −0.27 −0.27  0.73 −0.02 −0.02 −0.02
          0.08 −0.06 −0.06  0.19 −0.02 −0.02 −0.02 −0.02 −0.02 −0.02  0.73 −0.27 −0.27
          0.08 −0.06 −0.06  0.19 −0.02 −0.02 −0.02 −0.02 −0.02 −0.02 −0.27  0.73 −0.27
          0.08 −0.06 −0.06  0.19 −0.02 −0.02 −0.02 −0.02 −0.02 −0.02 −0.27 −0.27  0.73
  114. Features and problems:

          Ỹ_n(h) = S(S′S)^{−1} S′ Ŷ_n(h),    Var[Ỹ_n(h)] = S Ω_h S′ = Σ_h.

      - Covariates can be included in the base forecasts.
      - Point forecasts are always aggregate consistent.
      - We need to estimate Ω_h to produce prediction intervals.
      - A very simple and flexible method: it can work with any hierarchical or grouped time series. (A sketch using the hts package follows below.)
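These reconciliation methods are implemented in Hyndman's hts package; a hedged sketch for the two-level hierarchy, in which the simulated bottom-level matrix bts is purely illustrative:

    library(hts)
    set.seed(1)
    # Nine simulated bottom-level monthly series for Total / A,B,C / AA..CC.
    bts <- ts(matrix(rnorm(120 * 9, mean = 10), ncol = 9), frequency = 12)
    y <- hts(bts, nodes = list(3, c(3, 3, 3)))   # 3 middle nodes, 3 children each
    # method = "comb" reconciles base ETS forecasts via S (S'S)^{-1} S'.
    fc <- forecast(y, h = 12, method = "comb", fmethod = "ets")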
