# Advances in automatic time series forecasting

Invited talk, Australian Statistical Conference, Adelaide, July 2012.



1. Title slide: Rob J Hyndman, "Advances in automatic time series forecasting".
2. Outline:
   1. Motivation
   2. Exponential smoothing
   3. ARIMA modelling
   4. Time series with complex seasonality
   5. Hierarchical and grouped time series
   6. Functional time series
   7. Grouped functional time series
3–8. Motivation. [Sequence of motivating example plots; no text beyond the section title.]
9–13. Motivation:
   1. Common in business to have over 1000 products that need forecasting at least monthly.
   2. Forecasts are often required by people who are untrained in time series analysis.
   3. Some types of data can be decomposed into a large number of univariate time series that need to be forecast.

   Specifications. Automatic forecasting algorithms must:
   - determine an appropriate time series model;
   - estimate the parameters;
   - compute the forecasts with prediction intervals.
14. Example: corticosteroid sales. [Plot: monthly corticosteroid drug sales in Australia, 1995–2010; total scripts (millions), roughly 0.4 to 1.6.]
15. Example: corticosteroid sales. [Plot: automatic ARIMA forecasts for the same series.]
16. Outline (repeated section divider; next: 2 Exponential smoothing).
17–26. Exponential smoothing methods. The taxonomy crosses a trend component with a seasonal component:

   | Trend component | N (None) | A (Additive) | M (Multiplicative) |
   |---|---|---|---|
   | N (None) | N,N | N,A | N,M |
   | A (Additive) | A,N | A,A | A,M |
   | Ad (Additive damped) | Ad,N | Ad,A | Ad,M |
   | M (Multiplicative) | M,N | M,A | M,M |
   | Md (Multiplicative damped) | Md,N | Md,A | Md,M |

   (Columns give the seasonal component.) Familiar special cases:
   - N,N: simple exponential smoothing
   - A,N: Holt's linear method
   - Ad,N: additive damped trend method
   - M,N: exponential trend method
   - Md,N: multiplicative damped trend method

   There are 15 separate exponential smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models.
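The counts on these slides follow directly from the taxonomy: five trend types crossed with three seasonal types give the 15 methods, and the additive/multiplicative error choice doubles that. A quick Python check (illustrative only; the string labels are ad hoc, not identifiers from the forecast package):

```python
from itertools import product

errors = ["A", "M"]                      # additive or multiplicative error
trends = ["N", "A", "Ad", "M", "Md"]     # including damped variants
seasons = ["N", "A", "M"]

methods = list(product(trends, seasons))           # the 15 smoothing methods
models = list(product(errors, trends, seasons))    # the 30 state space models

print(len(methods), len(models))  # 15 30
```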
27–32. General notation: ETS (ExponenTial Smoothing), read as Error, Trend, Seasonal. Examples:
   - A,N,N: simple exponential smoothing with additive errors
   - A,A,N: Holt's linear method with additive errors
   - M,A,M: multiplicative Holt-Winters' method with multiplicative errors
33. Innovations state space models:
   - All ETS models can be written in innovations state space form.
   - Additive and multiplicative versions give the same point forecasts but different prediction intervals.
34–39. Automatic forecasting. From Hyndman et al. (IJF, 2002):
   - Apply each of the 30 models that are appropriate to the data.
   - Optimize parameters and initial values using MLE (or some other criterion).
   - Select the best model using the AIC: AIC = −2 log(Likelihood) + 2p, where p = number of parameters.
   - Produce forecasts using the best model.
   - Obtain prediction intervals using the underlying state space model.

   The method performed very well in the M3 competition.
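The AIC selection step can be sketched generically: fit every candidate, then keep the AIC minimiser. A minimal Python illustration (the model names, log-likelihoods and parameter counts below are hypothetical stand-ins, not fitted values):

```python
def aic(log_likelihood, n_params):
    """AIC = -2 log(Likelihood) + 2p, as on the slide."""
    return -2.0 * log_likelihood + 2 * n_params

# Hypothetical fitted candidates: (ETS label, maximised log-likelihood, #parameters).
candidates = [
    ("A,N,N", -120.3, 3),
    ("A,A,N", -115.8, 5),
    ("M,A,M", -110.1, 17),
]

# The richest model has the best likelihood, but the AIC penalty can still reject it.
best = min(candidates, key=lambda c: aic(c[1], c[2]))
print(best[0])  # A,A,N
```

Note how the 17-parameter model attains the highest likelihood yet loses on AIC, which is exactly the over-fitting guard the criterion provides.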
40. Exponential smoothing. [Plot: automatic ETS forecasts of the corticosteroid sales series, 1995–2010.]
41. Exponential smoothing: the same automatic ETS forecasts, produced with:

   ```r
   library(forecast)
   fit <- ets(h02)
   fcast <- forecast(fit)
   plot(fcast)
   ```
42. Exponential smoothing: the fitted model.

   ```r
   > fit
   ETS(M,Md,M)

     Smoothing parameters:
       alpha = 0.3318
       beta  = 4e-04
       gamma = 1e-04
       phi   = 0.9695

     Initial states:
       l = 0.4003
       b = 1.0233
       s = 0.8575 0.8183 0.7559 0.7627 0.6873 1.2884
           1.3456 1.1867 1.1653 1.1033 1.0398 0.9893

     sigma: 0.0651

          AIC       AICc        BIC
   -121.97999 -118.68967  -65.57195
   ```
43. References
   - Hyndman, Koehler, Snyder & Grose (2002). "A state space framework for automatic forecasting using exponential smoothing methods". *International Journal of Forecasting* 18(3), 439–454.
   - Hyndman, Koehler, Ord & Snyder (2008). *Forecasting with exponential smoothing: the state space approach*. Berlin: Springer-Verlag. www.exponentialsmoothing.net
   - Hyndman (2012). *forecast: Forecasting functions for time series*. cran.r-project.org/package=forecast
44. Outline (repeated section divider; next: 3 ARIMA modelling).
45–46. How does auto.arima() work?

   A non-seasonal ARIMA process: $\phi(B)(1 - B)^d y_t = c + \theta(B)\varepsilon_t$. We need to select appropriate orders p, d, q, and whether to include c.

   Hyndman & Khandakar (JSS, 2008) algorithm:
   - Select the number of differences d via the KPSS unit root test.
   - Select p, q, c by minimising the AIC.
   - Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
47. How does auto.arima() work? The seasonal case.

   A seasonal ARIMA process: $\Phi(B^m)\phi(B)(1 - B)^d (1 - B^m)^D y_t = c + \Theta(B^m)\theta(B)\varepsilon_t$. We need to select appropriate orders p, d, q, P, D, Q, and whether to include c.

   Hyndman & Khandakar (JSS, 2008) algorithm:
   - Select the number of differences d via the KPSS unit root test.
   - Select D using the OCSB unit root test.
   - Select p, q, P, Q, c by minimising the AIC.
   - Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
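The stepwise traversal can be pictured as a greedy neighbourhood search over model orders. The Python sketch below is a simplification of that idea, not the actual auto.arima() implementation (which also varies d, D and the constant, and uses estimation shortcuts); the toy_aic function is a stand-in for fitting a model and computing its AIC:

```python
def stepwise_search(score, p_max=5, q_max=5, start=(2, 2)):
    """Greedy stepwise search over ARMA orders (p, q): start from a simple
    model, repeatedly move to the best-scoring neighbour, and stop when no
    neighbour improves the criterion."""
    current = start
    best_score = score(current)
    improved = True
    while improved:
        improved = False
        p, q = current
        neighbours = [(p + dp, q + dq)
                      for dp in (-1, 0, 1) for dq in (-1, 0, 1)
                      if (dp, dq) != (0, 0)
                      and 0 <= p + dp <= p_max and 0 <= q + dq <= q_max]
        for cand in neighbours:
            s = score(cand)          # stand-in for: fit ARIMA(cand), return AIC
            if s < best_score:
                best_score, current, improved = s, cand, True
    return current

# Toy criterion with a unique minimum at (1, 3); stands in for the AIC surface.
toy_aic = lambda pq: (pq[0] - 1) ** 2 + (pq[1] - 3) ** 2
print(stepwise_search(toy_aic))  # (1, 3)
```

Greedy search like this fits far fewer models than an exhaustive grid, which is what makes the algorithm fast enough for thousands of series.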
48. Auto ARIMA. [Plot: automatic ARIMA forecasts of the corticosteroid sales series, 1995–2010.]
49. Auto ARIMA: the same automatic ARIMA forecasts, produced with:

   ```r
   fit <- auto.arima(h02)
   fcast <- forecast(fit)
   plot(fcast)
   ```
50. Auto ARIMA: the fitted model.

   ```r
   > fit
   Series: h02
   ARIMA(3,1,3)(0,1,1)[12]

   Coefficients:
             ar1      ar2     ar3      ma1     ma2     ma3     sma1
         -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
   s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651

   sigma^2 estimated as 0.002706:  log likelihood=290.25
   AIC=-564.5   AICc=-563.71   BIC=-538.48
   ```
51. References
   - Hyndman & Khandakar (2008). "Automatic time series forecasting: the forecast package for R". *Journal of Statistical Software* 26(3).
   - Hyndman (2011). "Major changes to the forecast package". robjhyndman.com/researchtips/forecast3/
   - Hyndman (2012). *forecast: Forecasting functions for time series*. cran.r-project.org/package=forecast
52. Outline (repeated section divider; next: 4 Time series with complex seasonality).
53. Examples. [Plot: US finished motor gasoline products, weekly, 1992–2004; thousands of barrels per day, roughly 6500 to 9500.]
54. Examples. [Plot: number of calls to a large American bank (7am–9pm), 5-minute intervals, 3 March to 12 May.]
55. Examples. [Plot: Turkish electricity demand, daily, 2000–2008; electricity demand (GW), roughly 10 to 25.]
56–62. TBATS model (Trigonometric seasonality, Box-Cox transform, ARMA errors, Trend, Seasonal components). Here $y_t$ is the observation at time $t$.

   Box-Cox transformation:
   $$y_t^{(\omega)} = \begin{cases} (y_t^{\omega} - 1)/\omega & \text{if } \omega \neq 0;\\ \log y_t & \text{if } \omega = 0. \end{cases}$$

   Measurement equation with $M$ seasonal periods:
   $$y_t^{(\omega)} = \ell_{t-1} + \phi b_{t-1} + \sum_{i=1}^{M} s_{t-m_i}^{(i)} + d_t$$

   Global and local trend:
   $$\ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha d_t, \qquad b_t = (1 - \phi)b + \phi b_{t-1} + \beta d_t$$

   ARMA($p$, $q$) error:
   $$d_t = \sum_{i=1}^{p} \varphi_i d_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$$

   Fourier-like (trigonometric) seasonal terms:
   $$s_t^{(i)} = \sum_{j=1}^{k_i} s_{j,t}^{(i)}, \qquad
   s_{j,t}^{(i)} = s_{j,t-1}^{(i)} \cos\lambda_j^{(i)} + s_{j,t-1}^{*(i)} \sin\lambda_j^{(i)} + \gamma_1^{(i)} d_t, \qquad
   s_{j,t}^{*(i)} = -s_{j,t-1}^{(i)} \sin\lambda_j^{(i)} + s_{j,t-1}^{*(i)} \cos\lambda_j^{(i)} + \gamma_2^{(i)} d_t$$
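The Box-Cox transform at the top of the model is easy to check numerically, including the fact that the $\omega \neq 0$ branch tends to $\log y$ as $\omega \to 0$. A small Python sketch (omega is the Box-Cox parameter $\omega$ from the slide):

```python
import math

def box_cox(y, omega):
    """Box-Cox transform as used by TBATS:
    (y^omega - 1)/omega when omega != 0, log(y) when omega == 0."""
    return math.log(y) if omega == 0 else (y ** omega - 1) / omega

print(round(box_cox(math.e, 0), 6))  # 1.0  (log branch)
print(box_cox(5.0, 1.0))             # 4.0  (omega = 1 is just a shift)
```

For small nonzero omega the two branches agree closely, which is why the transform is continuous in the parameter.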
63. Examples. [Plot: forecasts from TBATS(0.999, {2,2}, 1, {<52.1785714285714,8>}) for the US gasoline data.]

   ```r
   fit <- tbats(gasoline)
   fcast <- forecast(fit)
   plot(fcast)
   ```
64. Examples. [Plot: forecasts from TBATS(1, {3,1}, 0.987, {<169,5>, <845,3>}) for the call centre data.]

   ```r
   fit <- tbats(callcentre)
   fcast <- forecast(fit)
   plot(fcast)
   ```
65. Examples. [Plot: forecasts from TBATS(0, {5,3}, 0.997, {<7,3>, <354.37,12>, <365.25,4>}) for the Turkish electricity demand data.]

   ```r
   fit <- tbats(turk)
   fcast <- forecast(fit)
   plot(fcast)
   ```
66. References
   - The automatic algorithm is described in De Livera, Hyndman & Snyder (2011). "Forecasting time series with complex seasonal patterns using exponential smoothing". *Journal of the American Statistical Association* 106(496), 1513–1527.
   - A slightly improved algorithm is implemented in Hyndman (2012). *forecast: Forecasting functions for time series*. cran.r-project.org/package=forecast
   - More work required!
67. Outline (repeated section divider; next: 5 Hierarchical and grouped time series).
68–71. Introduction. [Hierarchy diagram: Total at the top; A, B, C below; AA, AB, AC, BA, BB, BC, CA, CB, CC at the bottom.]

   Examples: manufacturing product hierarchies; pharmaceutical sales; net labour turnover.
72–76. Hierarchical/grouped time series
   - A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.
   - A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways. Example: daily numbers of calls to HP call centres are grouped by product type and by location of the call centre.
   - Forecasts should be "aggregate consistent", unbiased, and minimum variance.
   - How should forecast intervals be computed?
77–82. Notation (illustrated on the one-level hierarchy Total → A, B, C):
   - $Y_t$: observed aggregate of all series at time $t$.
   - $Y_{X,t}$: observation on series X at time $t$.
   - $\mathbf{Y}_{i,t}$: vector of all series at level $i$ at time $t$.
   - $K$: number of levels in the hierarchy (excluding the Total).

   Stacking all series gives $\mathbf{Y}_t = [Y_t, \mathbf{Y}_{1,t}, \ldots, \mathbf{Y}_{K,t}]'$. For this hierarchy,
   $$\mathbf{Y}_t = \begin{bmatrix} Y_t\\ Y_{A,t}\\ Y_{B,t}\\ Y_{C,t} \end{bmatrix}
   = \underbrace{\begin{bmatrix} 1 & 1 & 1\\ 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}}_{\mathbf{S}}
   \underbrace{\begin{bmatrix} Y_{A,t}\\ Y_{B,t}\\ Y_{C,t} \end{bmatrix}}_{\mathbf{Y}_{K,t}},
   \qquad\text{i.e.}\qquad \mathbf{Y}_t = \mathbf{S}\,\mathbf{Y}_{K,t}.$$
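The identity $\mathbf{Y}_t = \mathbf{S}\,\mathbf{Y}_{K,t}$ can be verified directly for the small Total → A, B, C hierarchy. A Python sketch with hypothetical bottom-level values (plain lists, no matrix library needed):

```python
# Summing matrix S for the hierarchy Total -> A, B, C (slide notation):
# stacking order of Y_t is [Total, A, B, C], bottom level is [A, B, C].
S = [
    [1, 1, 1],   # Total = A + B + C
    [1, 0, 0],   # A
    [0, 1, 0],   # B
    [0, 0, 1],   # C
]

def aggregate(S, y_bottom):
    """Compute Y_t = S y_bottom by plain matrix-vector multiplication."""
    return [sum(s * y for s, y in zip(row, y_bottom)) for row in S]

y_all = aggregate(S, [2.0, 3.0, 5.0])  # hypothetical bottom-level observations
print(y_all)  # [10.0, 2.0, 3.0, 5.0]
```

Because S is fixed by the hierarchy, any set of bottom-level forecasts multiplied by S is automatically aggregate consistent.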
83. 83. Hierarchical data
Two-level hierarchy: Total -> A, B, C; A -> AX, AY, AZ; B -> BX, BY, BZ; C -> CX, CY, CZ.
  Yt = [Yt, YA,t, YB,t, YC,t, YAX,t, YAY,t, YAZ,t, YBX,t, YBY,t, YBZ,t, YCX,t, YCY,t, YCZ,t]' = S YK,t,
where YK,t = [YAX,t, YAY,t, YAZ,t, YBX,t, YBY,t, YBZ,t, YCX,t, YCY,t, YCZ,t]' and
      [ 1 1 1 1 1 1 1 1 1 ]
      [ 1 1 1 0 0 0 0 0 0 ]
      [ 0 0 0 1 1 1 0 0 0 ]
  S = [ 0 0 0 0 0 0 1 1 1 ]
      [        I_9         ]
with I_9 the 9 x 9 identity matrix for the bottom-level rows.
Advances in automatic time series forecasting Hierarchical and grouped time series 32
85. 85. Grouped data
Two crossed groupings of the same bottom-level series: Total -> A, B and Total -> X, Y, with bottom level AX, AY, BX, BY.
  Yt = [Yt, YA,t, YB,t, YX,t, YY,t, YAX,t, YAY,t, YBX,t, YBY,t]' = S YK,t,
where YK,t = [YAX,t, YAY,t, YBX,t, YBY,t]' and
      [ 1 1 1 1 ]
      [ 1 1 0 0 ]
      [ 0 0 1 1 ]
  S = [ 1 0 1 0 ]
      [ 0 1 0 1 ]
      [   I_4   ]
Advances in automatic time series forecasting Hierarchical and grouped time series 33
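For grouped data the same summing-matrix idea applies, but both groupings must aggregate to the same total. A minimal sketch with hypothetical values, checking the cross-constraints:

```python
import numpy as np

# Summing matrix for the 2x2 grouped structure (A, B crossed with X, Y).
# Bottom-level order: AX, AY, BX, BY.
S = np.array([
    [1, 1, 1, 1],  # Total
    [1, 1, 0, 0],  # A = AX + AY
    [0, 0, 1, 1],  # B = BX + BY
    [1, 0, 1, 0],  # X = AX + BX
    [0, 1, 0, 1],  # Y = AY + BY
    [1, 0, 0, 0],  # AX
    [0, 1, 0, 0],  # AY
    [0, 0, 1, 0],  # BX
    [0, 0, 0, 1],  # BY
])

Y_bottom = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical observations
Y_t = S @ Y_bottom

# Both groupings aggregate to the same total: A + B == X + Y == Total.
assert Y_t[1] + Y_t[2] == Y_t[0] == Y_t[3] + Y_t[4]
```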
87. 87. Forecasts
Key idea: forecast reconciliation
  - Ignore structural constraints and forecast every series of interest independently.
  - Adjust forecasts to impose constraints.
Let Y^n(h) be the vector of initial forecasts for horizon h, made at time n, stacked in the same order as Yt.
Since Yt = S YK,t, write Y^n(h) = S Bn(h) + eh, where
  - Bn(h) = E[YK,n+h | Y1, ..., Yn];
  - eh has zero mean and covariance matrix Sigma_h.
Estimate Bn(h) using GLS?
Revised forecasts: Y~n(h) = S B^n(h).
Advances in automatic time series forecasting Hierarchical and grouped time series 34
97. 97. Optimal combination forecasts
  Y~n(h) = S B^n(h) = S (S' Sigma_h† S)^{-1} S' Sigma_h† Y^n(h)
  (revised forecasts on the left; base forecasts Y^n(h) on the right)
  - Sigma_h† is a generalized inverse of Sigma_h.
  - Problem: we don't know Sigma_h and it is hard to estimate.
  - Solution: assume eh ≈ S eK,h, where eK,h is the forecast error at the bottom level. Then Sigma_h ≈ S Omega_h S', where Omega_h = Var(eK,h).
  - If the Moore-Penrose generalized inverse is used, then (S' Sigma_h† S)^{-1} S' Sigma_h† = (S'S)^{-1} S', so
      Y~n(h) = S (S'S)^{-1} S' Y^n(h).
Advances in automatic time series forecasting Hierarchical and grouped time series 35
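The OLS special case of the reconciliation formula is easy to sketch. Below, the base forecasts are hypothetical and deliberately incoherent (the Total does not equal A + B + C); reconciliation projects them onto the coherent subspace:

```python
import numpy as np

# Summing matrix for Total -> A, B, C.
S = np.array([[1, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]], dtype=float)

# Hypothetical incoherent base forecasts: 30 + 40 + 20 = 90 != 100.
y_hat = np.array([100.0, 30.0, 40.0, 20.0])

# OLS reconciliation (the Moore-Penrose special case on the slide):
# B^n(h) = (S'S)^{-1} S' y_hat, then y_tilde = S B^n(h).
beta_hat = np.linalg.solve(S.T @ S, S.T @ y_hat)
y_tilde = S @ beta_hat

# Revised forecasts are coherent: the Total equals the sum of A, B, C.
assert np.isclose(y_tilde[0], y_tilde[1:].sum())
```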
111. 111. Optimal combination forecasts
  Y~n(h) = S (S'S)^{-1} S' Y^n(h)
  - GLS = OLS.
  - Optimal weighted average of the base forecasts.
  - Computational difficulties in big hierarchies due to the size of the S matrix.
  - Optimal weights are S (S'S)^{-1} S'.
  - The weights are independent of the data!
For the hierarchy Total -> A, B, C:
                   [  0.75  0.25  0.25  0.25 ]
  S(S'S)^{-1}S' =  [  0.25  0.75 -0.25 -0.25 ]
                   [  0.25 -0.25  0.75 -0.25 ]
                   [  0.25 -0.25 -0.25  0.75 ]
Advances in automatic time series forecasting Hierarchical and grouped time series 36
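The weights matrix above can be reproduced directly; a short check that S(S'S)^{-1}S' gives the stated 0.75 / ±0.25 values:

```python
import numpy as np

# Summing matrix for Total -> A, B, C.
S = np.array([[1, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]], dtype=float)

# Combination weights applied to the base forecasts.
W = S @ np.linalg.inv(S.T @ S) @ S.T
print(np.round(W, 2))  # matches the 0.75 / +-0.25 matrix on the slide
```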
112. 112. Optimal combination forecasts
Two-level hierarchy: Total -> A, B, C; A -> AA, AB, AC; B -> BA, BB, BC; C -> CA, CB, CC.
Weights: S(S'S)^{-1}S' =
  [ 0.69  0.23  0.23  0.23  0.08  0.08  0.08  0.08  0.08  0.08  0.08  0.08  0.08 ]
  [ 0.23  0.58 -0.17 -0.17  0.19  0.19  0.19 -0.06 -0.06 -0.06 -0.06 -0.06 -0.06 ]
  [ 0.23 -0.17  0.58 -0.17 -0.06 -0.06 -0.06  0.19  0.19  0.19 -0.06 -0.06 -0.06 ]
  [ 0.23 -0.17 -0.17  0.58 -0.06 -0.06 -0.06 -0.06 -0.06 -0.06  0.19  0.19  0.19 ]
  [ 0.08  0.19 -0.06 -0.06  0.73 -0.27 -0.27 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 ]
  [ 0.08  0.19 -0.06 -0.06 -0.27  0.73 -0.27 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 ]
  [ 0.08  0.19 -0.06 -0.06 -0.27 -0.27  0.73 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 ]
  [ 0.08 -0.06  0.19 -0.06 -0.02 -0.02 -0.02  0.73 -0.27 -0.27 -0.02 -0.02 -0.02 ]
  [ 0.08 -0.06  0.19 -0.06 -0.02 -0.02 -0.02 -0.27  0.73 -0.27 -0.02 -0.02 -0.02 ]
  [ 0.08 -0.06  0.19 -0.06 -0.02 -0.02 -0.02 -0.27 -0.27  0.73 -0.02 -0.02 -0.02 ]
  [ 0.08 -0.06 -0.06  0.19 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02  0.73 -0.27 -0.27 ]
  [ 0.08 -0.06 -0.06  0.19 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.27  0.73 -0.27 ]
  [ 0.08 -0.06 -0.06  0.19 -0.02 -0.02 -0.02 -0.02 -0.02 -0.02 -0.27 -0.27  0.73 ]
Advances in automatic time series forecasting Hierarchical and grouped time series 37
114. 114. Features and problems
  Y~n(h) = S (S'S)^{-1} S' Y^n(h),    Var[Y~n(h)] = S Omega_h S' = Sigma_h.
  - Covariates can be included in the base forecasts.
  - Point forecasts are always consistent: they respect the aggregation structure.
  - Need to estimate Omega_h to produce prediction intervals.
  - Very simple and flexible method; it can work with any hierarchical or grouped time series.
Advances in automatic time series forecasting Hierarchical and grouped time series 38
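Under the slide's approximation eh ≈ S eK,h, the variance of the revised forecasts follows from Omega_h by the usual linear-transformation rule. A minimal sketch with a hypothetical diagonal Omega_h:

```python
import numpy as np

# Summing matrix for Total -> A, B, C.
S = np.array([[1, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]], dtype=float)

# Hypothetical bottom-level forecast-error covariance (diagonal for simplicity).
Omega_h = np.diag([4.0, 1.0, 1.0])

# Var[Y~n(h)] = S Omega_h S' under the slide's approximation.
Sigma_tilde = S @ Omega_h @ S.T

# With a diagonal Omega_h, the variance of the Total forecast is just the
# sum of the bottom-level variances (all cross-covariances are zero).
assert np.isclose(Sigma_tilde[0, 0], Omega_h.trace())
```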