
Keynote talk given at the International Symposium on Forecasting, Seoul, South Korea. 25 June 2013


- 1. Rob J Hyndman: Forecasting without forecasters
- 2. Outline 1 Motivation 2 Exponential smoothing 3 ARIMA modelling 4 Time series with complex seasonality 5 Hierarchical and grouped time series 6 Functional time series Forecasting without forecasters Motivation 2
- 3. Motivation Forecasting without forecasters Motivation 3
- 8. Motivation
  1 Common in business to have over 1000 products that need forecasting at least monthly.
  2 Forecasts are often required by people who are untrained in time series analysis.
  3 Some types of data can be decomposed into a large number of univariate time series that need to be forecast.
  Specifications: automatic forecasting algorithms must:
  • determine an appropriate time series model;
  • estimate the parameters;
  • compute the forecasts with prediction intervals.
  Forecasting without forecasters Motivation 4
- 13. Example: Asian sheep Forecasting without forecasters Motivation 5 [Figure: Numbers of sheep in Asia; x-axis Year (1960–2010), y-axis millions of sheep]
- 14. Example: Asian sheep Forecasting without forecasters Motivation 5 [Figure: Automatic ETS forecasts; x-axis Year (1960–2010), y-axis millions of sheep]
- 15. Example: Corticosteroid sales Forecasting without forecasters Motivation 6 [Figure: Monthly corticosteroid drug sales in Australia; x-axis Year (1995–2010), y-axis total scripts (millions)]
- 16. Example: Corticosteroid sales Forecasting without forecasters Motivation 6 [Figure: Automatic ARIMA forecasts; x-axis Year (1995–2010), y-axis total scripts (millions)]
- 17. M3 competition Forecasting without forecasters Motivation 7
  • 3003 time series.
  • Early comparison of automatic forecasting algorithms.
  • Best-performing methods undocumented.
  • Limited subsequent research on general automatic forecasting algorithms.
- 22. Outline 1 Motivation 2 Exponential smoothing 3 ARIMA modelling 4 Time series with complex seasonality 5 Hierarchical and grouped time series 6 Functional time series Forecasting without forecasters Exponential smoothing 8
- 23. Exponential smoothing Forecasting without forecasters Exponential smoothing 9
  Classic reference: Makridakis, Wheelwright and Hyndman (1998) Forecasting: methods and applications, 3rd ed., Wiley: NY.
  • “Unfortunately, exponential smoothing methods do not allow the easy calculation of prediction intervals.” (MWH, p.177)
  • No satisfactory way to select an exponential smoothing method.
  Current reference: Hyndman and Athanasopoulos (2013) Forecasting: principles and practice, OTexts: Australia. OTexts.com/fpp.
- 27. Exponential smoothing methods
  Trend component \ Seasonal component: N (None) | A (Additive) | M (Multiplicative)
  N (None):                   N,N  | N,A  | N,M
  A (Additive):               A,N  | A,A  | A,M
  Ad (Additive damped):       Ad,N | Ad,A | Ad,M
  M (Multiplicative):         M,N  | M,A  | M,M
  Md (Multiplicative damped): Md,N | Md,A | Md,M
  N,N: Simple exponential smoothing
  A,N: Holt’s linear method
  Ad,N: Additive damped trend method
  M,N: Exponential trend method
  Md,N: Multiplicative damped trend method
  A,A: Additive Holt-Winters’ method
  A,M: Multiplicative Holt-Winters’ method
  There are 15 separate exponential smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models.
  Forecasting without forecasters Exponential smoothing 10
- 37. Exponential smoothing methods
  General notation E T S: Error, Trend, Seasonal (ExponenTial Smoothing).
  Examples:
  A,N,N: Simple exponential smoothing with additive errors
  A,A,N: Holt’s linear method with additive errors
  M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors
  Innovations state space models:
  • All ETS models can be written in innovations state space form (IJF, 2002).
  • Additive and multiplicative versions give the same point forecasts but different prediction intervals.
  Forecasting without forecasters Exponential smoothing 11
- 44. Automatic forecasting From Hyndman et al. (IJF, 2002):
  • Apply each of the 30 models that are appropriate to the data.
  • Optimize parameters and initial values using MLE (or some other criterion).
  • Select the best method using the AIC: AIC = −2 log(Likelihood) + 2p, where p = number of parameters.
  • Produce forecasts using the best method.
  • Obtain prediction intervals using the underlying state space model.
  Forecasting without forecasters Exponential smoothing 12
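The selection loop above can be sketched in R. This is a minimal illustration, assuming the `forecast` package; the candidate list and the `AirPassengers` series are stand-ins, not part of the original algorithm, which searches all 30 admissible models.

```r
# Sketch of AIC-based ETS model selection (illustrative subset of models).
library(forecast)

candidates <- c("ANN", "AAN", "MAM")   # a few ETS specifications (Error, Trend, Seasonal)
fits <- lapply(candidates, function(m) ets(AirPassengers, model = m))
aics <- sapply(fits, AIC)              # AIC = -2 log(Likelihood) + 2p
best <- fits[[which.min(aics)]]        # keep the lowest-AIC model
fc <- forecast(best, h = 12)           # point forecasts + prediction intervals
```

In practice `ets(AirPassengers)` alone performs this search over all admissible models; the explicit loop only makes the AIC comparison visible.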
- 48. Exponential smoothing Forecasting without forecasters Exponential smoothing 13 [Figure: Forecasts from ETS(M,A,N); x-axis Year (1960–2010), y-axis millions of sheep]
- 49. Exponential smoothing fit <- ets(livestock); fcast <- forecast(fit); plot(fcast) Forecasting without forecasters Exponential smoothing 14 [Figure: Forecasts from ETS(M,A,N)]
- 50. Exponential smoothing Forecasting without forecasters Exponential smoothing 15 [Figure: Forecasts from ETS(M,Md,M); x-axis Year (1995–2010), y-axis total scripts (millions)]
- 51. Exponential smoothing fit <- ets(h02); fcast <- forecast(fit); plot(fcast) Forecasting without forecasters Exponential smoothing 16 [Figure: Forecasts from ETS(M,Md,M)]
- 52. M3 comparisons
  Method        MAPE   sMAPE  MASE
  Theta         17.83  12.86  1.40
  ForecastPro   18.00  13.06  1.47
  ETS additive  18.58  13.69  1.48
  ETS           19.33  13.57  1.59
  Forecasting without forecasters Exponential smoothing 17
- 53. References RJ Hyndman, AB Koehler, RD Snyder, and S Grose (2002). “A state space framework for automatic forecasting using exponential smoothing methods”. International Journal of Forecasting 18(3), 439–454. RJ Hyndman, AB Koehler, JK Ord, and RD Snyder (2008). Forecasting with exponential smoothing: the state space approach. Springer-Verlag. RJ Hyndman and G Athanasopoulos (2013). Forecasting: principles and practice. OTexts. OTexts.com/fpp/. Forecasting without forecasters Exponential smoothing 18
- 54. Outline 1 Motivation 2 Exponential smoothing 3 ARIMA modelling 4 Time series with complex seasonality 5 Hierarchical and grouped time series 6 Functional time series Forecasting without forecasters ARIMA modelling 19
- 55. ARIMA modelling Forecasting without forecasters ARIMA modelling 20
  Classic reference: Makridakis, Wheelwright and Hyndman (1998) Forecasting: methods and applications, 3rd ed., Wiley: NY.
  • “There is such a bewildering variety of ARIMA models, it can be difficult to decide which model is most appropriate for a given set of data.” (MWH, p.347)
- 57. Auto ARIMA Forecasting without forecasters ARIMA modelling 21 [Figure: Forecasts from ARIMA(0,1,0) with drift; x-axis Year (1960–2010), y-axis millions of sheep]
- 58. Auto ARIMA fit <- auto.arima(livestock); fcast <- forecast(fit); plot(fcast) Forecasting without forecasters ARIMA modelling 22 [Figure: Forecasts from ARIMA(0,1,0) with drift]
- 59. Auto ARIMA Forecasting without forecasters ARIMA modelling 23 [Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12]; x-axis Year (1995–2010), y-axis total scripts (millions)]
- 60. Auto ARIMA fit <- auto.arima(h02); fcast <- forecast(fit); plot(fcast) Forecasting without forecasters ARIMA modelling 24 [Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12]]
- 61. How does auto.arima() work? A non-seasonal ARIMA process: φ(B)(1 − B)^d y_t = c + θ(B)ε_t. Need to select appropriate orders p, q, d, and whether to include c.
  Hyndman & Khandakar (JSS, 2008) algorithm:
  • Select the number of differences d via the KPSS unit root test.
  • Select p, q, c by minimising the AIC.
  • Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
  Algorithm choices driven by forecast accuracy.
  Forecasting without forecasters ARIMA modelling 25
- 64. How does auto.arima() work? A seasonal ARIMA process: Φ(B^m)φ(B)(1 − B)^d (1 − B^m)^D y_t = c + Θ(B^m)θ(B)ε_t. Need to select appropriate orders p, q, d, P, Q, D, and whether to include c.
  Hyndman & Khandakar (JSS, 2008) algorithm:
  • Select the number of differences d via the KPSS unit root test.
  • Select D using the OCSB unit root test.
  • Select p, q, P, Q, c by minimising the AIC.
  • Use a stepwise search to traverse the model space, starting with a simple model and considering nearby variants.
  Forecasting without forecasters ARIMA modelling 26
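The unit-root steps of the algorithm can be exposed directly in R. A minimal sketch, assuming the `forecast` package; `AirPassengers` is an illustrative series, and `nsdiffs(..., test = "ocsb")` is the explicit form of the OCSB choice described above (newer package versions default to a different seasonal test).

```r
# Sketch of the differencing choices behind auto.arima().
library(forecast)

y <- log(AirPassengers)                 # variance-stabilised monthly series
d <- ndiffs(y, test = "kpss")           # KPSS unit root test chooses d
D <- nsdiffs(y, test = "ocsb")          # OCSB unit root test chooses D
fit <- auto.arima(y, stepwise = TRUE)   # stepwise AIC search over p, q, P, Q, c
```

`auto.arima()` performs all of these steps internally; calling `ndiffs()` and `nsdiffs()` separately only shows where d and D come from.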
- 65. M3 comparisons
  Method        MAPE   sMAPE  MASE
  Theta         17.83  12.86  1.40
  ForecastPro   18.00  13.06  1.47
  BJauto        19.14  13.73  1.55
  AutoARIMA     18.98  13.75  1.47
  ETS-additive  18.58  13.69  1.48
  ETS           19.33  13.57  1.59
  ETS-ARIMA     18.17  13.11  1.44
  Forecasting without forecasters ARIMA modelling 27
- 66. M3 conclusions
  MYTHS:
  • Simple methods do better.
  • Exponential smoothing is better than ARIMA.
  FACTS:
  • The best methods are hybrid approaches.
  • ETS-ARIMA (the simple average of ETS-additive and AutoARIMA) is the only fully documented method that is comparable to the M3 competition winners.
  • I have an algorithm that does better than all of these, but it takes too long to be practical.
  Forecasting without forecasters ARIMA modelling 28
- 72. References RJ Hyndman and Y Khandakar (2008). “Automatic time series forecasting: the forecast package for R”. Journal of Statistical Software 26(3). RJ Hyndman (2011). “Major changes to the forecast package”. robjhyndman.com/hyndsight/forecast3/. RJ Hyndman and G Athanasopoulos (2013). Forecasting: principles and practice. OTexts. OTexts.com/fpp/. Forecasting without forecasters ARIMA modelling 29
- 73. Outline 1 Motivation 2 Exponential smoothing 3 ARIMA modelling 4 Time series with complex seasonality 5 Hierarchical and grouped time series 6 Functional time series Forecasting without forecasters Time series with complex seasonality 30
- 74. Examples Forecasting without forecasters Time series with complex seasonality 31 [Figure: US finished motor gasoline products; x-axis Weeks (1992–2004), y-axis thousands of barrels per day]
- 75. Examples Forecasting without forecasters Time series with complex seasonality 31 [Figure: Number of calls to large American bank (7am–9pm); x-axis 5 minute intervals (3 March–12 May), y-axis number of call arrivals]
- 76. Examples Forecasting without forecasters Time series with complex seasonality 31 [Figure: Turkish electricity demand; x-axis Days (2000–2008), y-axis electricity demand (GW)]
- 77. TBATS model TBATS Trigonometric terms for seasonality Box-Cox transformations for heterogeneity ARMA errors for short-term dynamics Trend (possibly damped) Seasonal (including multiple and non-integer periods) Forecasting without forecasters Time series with complex seasonality 32
- 78. Examples fit <- tbats(gasoline); fcast <- forecast(fit); plot(fcast) Forecasting without forecasters Time series with complex seasonality 33 [Figure: Forecasts from TBATS(0.999, {2,2}, 1, {<52.1785714285714,8>}); x-axis Weeks, y-axis thousands of barrels per day]
- 79. Examples fit <- tbats(callcentre); fcast <- forecast(fit); plot(fcast) Forecasting without forecasters Time series with complex seasonality 34 [Figure: Forecasts from TBATS(1, {3,1}, 0.987, {<169,5>, <845,3>}); x-axis 5 minute intervals (3 March–9 June), y-axis number of call arrivals]
- 80. Examples fit <- tbats(turk); fcast <- forecast(fit); plot(fcast) Forecasting without forecasters Time series with complex seasonality 35 [Figure: Forecasts from TBATS(0, {5,3}, 0.997, {<7,3>, <354.37,12>, <365.25,4>}); x-axis Days (2000–2010), y-axis electricity demand (GW)]
- 81. References Automatic algorithm described in AM De Livera, RJ Hyndman, and RD Snyder (2011). “Forecasting time series with complex seasonal patterns using exponential smoothing”. Journal of the American Statistical Association 106(496), 1513–1527. Slightly improved algorithm implemented in RJ Hyndman (2012). forecast: Forecasting functions for time series. cran.r-project.org/package=forecast. More work required! Forecasting without forecasters Time series with complex seasonality 36
- 82. Outline 1 Motivation 2 Exponential smoothing 3 ARIMA modelling 4 Time series with complex seasonality 5 Hierarchical and grouped time series 6 Functional time series Forecasting without forecasters Hierarchical and grouped time series 37
- 83. Introduction Total A AA AB AC B BA BB BC C CA CB CC Examples Manufacturing product hierarchies Pharmaceutical sales Net labour turnover Forecasting without forecasters Hierarchical and grouped time series 38
- 87. Hierarchical/grouped time series A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure. Example: Pharmaceutical products are organized in a hierarchy under the Anatomical Therapeutic Chemical (ATC) Classiﬁcation System. A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways. Example: daily numbers of calls to HP call centres are grouped by product type and location of call centre. Forecasting without forecasters Hierarchical and grouped time series 39
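Hierarchies like these can be handled end to end in R. A minimal sketch using the `hts` package (an assumption; the slides do not name a package here), with simulated bottom-level series standing in for real data.

```r
# Sketch: build a two-level hierarchy (Total -> three children) and
# forecast every level at once.
library(hts)

set.seed(1)
bts <- ts(matrix(abs(rnorm(72, mean = 50)), ncol = 3),
          start = 2000, frequency = 12)     # three bottom-level monthly series
y <- hts(bts, nodes = list(3))              # Total with three children
fc <- forecast(y, h = 12, fmethod = "ets")  # forecasts at all levels of the hierarchy
```

The point of the structure is that forecasts of the children and of the total can be made consistent with each other, rather than forecast independently.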
- 91. Hierarchical data A two-level hierarchy: Total is disaggregated into series A, B and C. Stacking all series gives

$$ \mathbf{Y}_t = \begin{bmatrix} Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{S} \underbrace{\begin{bmatrix} Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \end{bmatrix}}_{B_t}, \qquad \mathbf{Y}_t = S B_t $$

Notation: $Y_t$: observed aggregate of all series at time t. $Y_{X,t}$: observation on series X at time t. $B_t$: vector of all series at the bottom level at time t.
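The identity $\mathbf{Y}_t = S B_t$ is plain matrix aggregation and is easy to check numerically. A minimal Python/NumPy sketch for the Total/A/B/C hierarchy above (illustrative only; the talk's own examples use R):

```python
import numpy as np

# Summing matrix for the two-level hierarchy: Total -> A, B, C
S = np.array([
    [1, 1, 1],   # Total = A + B + C
    [1, 0, 0],   # A
    [0, 1, 0],   # B
    [0, 0, 1],   # C
])

B_t = np.array([10.0, 20.0, 30.0])   # bottom-level observations at time t
Y_t = S @ B_t                        # full stacked vector [Total, A, B, C]
print(Y_t)                           # [60. 10. 20. 30.]
```

The top element equals the sum of the bottom-level elements by construction, which is exactly the aggregate consistency that reconciled forecasts must preserve.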
- 97. Grouped data Two alternative hierarchies over the same four bottom-level series: Total → A, B (with A = AX + AY and B = BX + BY), and Total → X, Y (with X = AX + BX and Y = AY + BY). Stacking every series of interest:

$$ \mathbf{Y}_t = \begin{bmatrix} Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{X,t} \\ Y_{Y,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t} \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}}_{S} \underbrace{\begin{bmatrix} Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t} \end{bmatrix}}_{B_t}, \qquad \mathbf{Y}_t = S B_t $$
- 100. Forecasts Key idea: forecast reconciliation. Ignore the structural constraints and forecast every series of interest independently; then adjust the forecasts to impose the constraints. Let $\hat{\mathbf{Y}}_n(h)$ be the vector of initial (base) forecasts for horizon h, made at time n, stacked in the same order as $\mathbf{Y}_t$. Optimal reconciled forecasts:

$$ \tilde{\mathbf{Y}}_n(h) = S(S'S)^{-1}S'\,\hat{\mathbf{Y}}_n(h) $$

The reconciliation weights $S(S'S)^{-1}S'$ depend only on the structure of the hierarchy; they are independent of the data and of the covariance structure of the hierarchy.
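The reconciliation formula is a single OLS projection, so it fits in a few lines of NumPy. A sketch for the Total/A/B/C hierarchy, with deliberately inconsistent base forecasts (illustrative only; the numbers are made up):

```python
import numpy as np

# Summing matrix for Total -> A, B, C
S = np.array([[1, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]], dtype=float)

# Base forecasts for [Total, A, B, C]: the Total forecast (100)
# is NOT the sum of the child forecasts (30 + 40 + 30.5 = 100.5).
y_hat = np.array([100.0, 30.0, 40.0, 30.5])

# Optimal combination: y_tilde = S (S'S)^{-1} S' y_hat
P = S @ np.linalg.inv(S.T @ S) @ S.T
y_tilde = P @ y_hat
print(y_tilde)   # reconciled [Total, A, B, C]
```

After projection, the reconciled total exactly equals the sum of the reconciled children (here 100.125, splitting the 0.5 discrepancy across all four series), and the weight matrix P was computed without ever looking at the data's covariance.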
- 104. Features Forget "bottom up" or "top down": this approach combines all forecasts optimally, and outperforms both bottom-up and top-down, especially for middle levels. Covariates can be included in the base forecasts. Adjustments can be made to base forecasts at any level. Point forecasts are always aggregate consistent. Very simple and flexible method: it can work with any hierarchical or grouped time series, and is conceptually easy to implement (OLS on base forecasts).
- 111. Challenges Computational difficulties in big hierarchies, due to the size of the S matrix and near-singular behavior of (S'S). Need to estimate the covariance matrix to produce prediction intervals.
- 113. Example using R For the grouped structure above (Total → A, B; A → AX, AY; B → BX, BY):

    library(hts)
    # bts is a matrix containing the bottom-level time series
    # g describes the grouping/hierarchical structure
    y <- hts(bts, g=c(1,1,2,2))

    # Forecast 10-step-ahead using the optimal combination method
    # ETS used for each series by default
    fc <- forecast(y, h=10)

    # Select your own methods
    ally <- allts(y)
    allf <- matrix(, nrow=10, ncol=ncol(ally))
    for(i in 1:ncol(ally))
      allf[,i] <- mymethod(ally[,i], h=10)
    allf <- ts(allf, start=2004)
    # Reconcile forecasts so they add up
    fc2 <- combinef(allf, Smatrix(y))
- 117. References RJ Hyndman, RA Ahmed, G Athanasopoulos, and HL Shang (2011). “Optimal combination forecasts for hierarchical time series”. Computational Statistics and Data Analysis 55(9), 2579–2589 RJ Hyndman, RA Ahmed, and HL Shang (2013). hts: Hierarchical time series. cran.r-project.org/package=hts. RJ Hyndman and G Athanasopoulos (2013). Forecasting: principles and practice. OTexts. OTexts.com/fpp/. Forecasting without forecasters Hierarchical and grouped time series 48
- 118. Outline 1 Motivation 2 Exponential smoothing 3 ARIMA modelling 4 Time series with complex seasonality 5 Hierarchical and grouped time series 6 Functional time series Forecasting without forecasters Functional time series 49
- 119. Fertility rates [Figure: Australia fertility rates (1921), fertility rate by age 15–50.]
- 120. Functional data model Let $f_{t,x}$ be the observed data in period t at age x, t = 1, ..., n.

$$ f_t(x) = \mu(x) + \sum_{k=1}^{K} \beta_{t,k}\,\phi_k(x) + e_t(x) $$

The decomposition separates time and age to allow forecasting. Estimate $\mu(x)$ as the mean of $f_t(x)$ across years. Estimate $\beta_{t,k}$ and $\phi_k(x)$ using functional (weighted) principal components. Univariate models are used for automatic forecasting of the scores $\{\beta_{t,k}\}$.
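The decomposition above can be sketched with an ordinary SVD on synthetic data. This is a schematic stand-in only: the real method (fdm() in the demography package) uses weighted, robust functional principal components and proper univariate models for the scores, whereas here the data, the plain SVD, and the drift forecast of the scores are all simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "rates": n years of curves observed at p ages, K components kept
n, p, K = 40, 30, 2
x = np.linspace(15, 50, p)
f = (np.exp(-0.5 * ((x - 30) / 5) ** 2)[None, :]     # age profile
     * np.linspace(1.0, 0.6, n)[:, None]             # declining level over years
     + 0.01 * rng.standard_normal((n, p)))           # noise e_t(x)

mu = f.mean(axis=0)                      # mu(x): mean curve across years
U, s, Vt = np.linalg.svd(f - mu, full_matrices=False)
phi = Vt[:K]                             # phi_k(x): basis functions (K x p)
beta = (f - mu) @ phi.T                  # beta_{t,k}: scores (n x K)

# Forecast each score series one step ahead; a random walk with drift
# stands in for the automatically chosen univariate models in the talk.
drift = (beta[-1] - beta[0]) / (n - 1)
beta_next = beta[-1] + drift

f_next = mu + beta_next @ phi            # forecast curve f_{n+1}(x)
```

Because time enters only through the scores, forecasting the whole curve reduces to K univariate forecasting problems plus the fixed functions mu(x) and phi_k(x).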
- 125. Fertility application [Figure: Australia fertility rates (1921–2006), fertility rate by age.]
- 126. Fertility model [Figure: estimated mean curve mu(x) and basis functions phi_1(x), phi_2(x) by age, with the corresponding score series beta_1 and beta_2 over 1920–2000.]
- 127. Forecasts of ft(x) [Figure: Australia fertility rates (1921–2006) with forecast curves and 80% prediction intervals.]
- 131. R code

    library(demography)
    plot(aus.fert)
    fit <- fdm(aus.fert)
    fc <- forecast(fit)

[Figure: Australia fertility rates (1921–2006).]
- 132. References RJ Hyndman and S Ullah (2007). “Robust forecasting of mortality and fertility rates: A functional data approach”. Computational Statistics and Data Analysis 51(10), 4942–4956 RJ Hyndman and HL Shang (2009). “Forecasting functional time series (with discussion)”. Journal of the Korean Statistical Society 38(3), 199–221 RJ Hyndman (2012). demography: Forecasting mortality, fertility, migration and population data. cran.r-project.org/package=demography. Forecasting without forecasters Functional time series 56
- 133. For further information robjhyndman.com Slides and references for this talk. Links to all papers and books. Links to R packages. A blog about forecasting research. Forecasting without forecasters Functional time series 57
