Time series forecasting

Roman Merkulov, InData Labs

AI Day Minsk, October 14, 2017

Published in: Data & Analytics

  1. Time Series Forecasting. Roman Merkulov, Data Scientist at InData Labs
  2. Time series data: a time-ordered sequence of observations of a random variable, equally spaced in time.
  3. Cross-sectional data vs time series data
     - Cross-sectional data: a random sample from a population; observations are i.i.d. N-vectors.
     - Time series data: observations of a single object whose properties vary over time; observations are not i.i.d.
     - Panel data: cross-sectional + time series data.
  4. Tasks
     - identification (estimating model structure and parameters)
     - forecasting (predicting a target value in the future)
  5. Applications
     - Historically: macroeconomics; financial markets; ecology, hydrology, etc.
     - Nowadays: retail, telecom, banking, everywhere!
  6. Typical time series structure
     - y: observed value
     - f: trend component
     - c: cyclic component
     - s: seasonal component
     - e: innovation
  7. Stationarity
  8. Non-stationarity. Time series types:
     - trend stationary
     - difference stationary (unit root processes)
     - explosive processes
  9. Stationarity test
     - Dickey-Fuller
     - Augmented Dickey-Fuller
  10. Stationarity test
      - Dickey-Fuller
      - Augmented Dickey-Fuller
      - KPSS, Phillips-Perron and many others
  11. Stationarity test
      - Dickey-Fuller
      - Augmented Dickey-Fuller
      - KPSS, Phillips-Perron and many others
      ADF-stat = -1.92646465481, p-value = 0.0515955016175
  12. Prediction intervals. Atmospheric concentrations of CO2.
  13. Prediction intervals. El Nino dataset. The data contain the averaged monthly sea surface temperature in degrees Celsius of the Pacific Ocean, between 0-10 degrees South and 90-80 degrees West, from 1950 to 2010.
  14. Prediction intervals. Time series giving the monthly male deaths from bronchitis, emphysema and asthma in the UK, 1974–1979.
  15. AR(p). Stationarity condition
  16. MA(q)
  17. Autocorrelation function. Partial autocorrelation function. US quarterly inflation data (from 1959 to 2010)
  18. Autocorrelation function. Partial autocorrelation function.
  19. Autocorrelation function. Partial autocorrelation function.
  20. Multicollinearity
      1) large variance of parameter estimates
      2) statistical insignificance of estimates
      3) unstable predictions
  21. Notes on Data Preparation
      - Distortions in data: misobservations, outliers, structural breaks
      - Data adjustments: average by months, per-capita data, price index, dummy variables
  22. Transformations: Box-Cox transformation, log transformation
  23. Differencing
  24. Seasonal differencing
  25. ARMA(p, q). Wold’s decomposition theorem implies that any covariance-stationary process can be approximated arbitrarily well by an ARMA model.
  26. ARIMA(p, d, q). Difference the time series until it is stationary, fit an ARMA model to the differenced series, then apply the inverse transformation.
  27. SARMA (p, q) (P, Q); SARIMA (p, d, q) (P, D, Q); SARIMAX (p, d, q) (P, D, Q); VAR(p), VARMA(p, q), etc. Back to multivariate regression...
  28. Model selection
  29. Box-Jenkins approach. Steps:
      1. Identification
         a. assessing stationarity
         b. detrending
         c. removing seasonality
         d. differencing
         e. ACF, PACF analysis
      2. Building candidate models
         a. model order and parameter estimation
         b. model selection
      3. Residual diagnostics
         a. assessing normality
         b. assessing autocorrelation
  30. Residual diagnostics
      - Normality: Jarque-Bera test
      - Serial autocorrelation: Durbin-Watson test, portmanteau test, Ljung-Box test
      - Homoscedasticity: White’s test, Bartlett’s test
  31. Cross-validation for time series data
  32. Forecasting quality metrics
  33. Alternatives
      - multivariate regression
      - GBM
      - SSM
      - NN
      - VECM & cointegration (Johansen approach)
      - Naive, Seasonal Naive, Drift method :)
      - ...
  34. Resources
      - https://www.otexts.org/fpp
      - https://www.slideshare.net/hyndman/automatic-time-series-forecasting
      - https://www.slideshare.net/seanjtaylor/automatic-forecasting-at-scale
      - https://www.youtube.com/user/SpartacanUsuals/videos
      - https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html
  35. Thanks for your attention! Contacts: r_merkulov@indatalabs.com, merkylovecom@mail.ru, https://www.linkedin.com/in/roman-merkulov-a61804a4/
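
The differencing, seasonal differencing and ARIMA slides describe a recipe: difference the series, model the differenced series, then reverse the transformation. The slide formulas did not survive in this transcript, so what follows is only a minimal sketch in plain Python, not the author's code. The function names `difference` and `undifference` are hypothetical, the toy series is made up, and a naive "repeat the last value" forecast stands in for the ARMA fit.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag.

    lag=1 is ordinary differencing; lag=season_length is seasonal differencing.
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(diffed, head, lag=1):
    """Invert difference(): `head` holds the first `lag` original values."""
    restored = list(head)
    for d in diffed:
        # Each original value is its difference plus the value `lag` steps back.
        restored.append(d + restored[-lag])
    return restored


# Toy data (an assumption for illustration): linear trend plus a period-4 season.
season = [3, -1, -2, 0]
series = [2 * t + season[t % 4] for t in range(12)]

# Seasonal differencing removes both the seasonal pattern and the linear trend here.
seasonal_diff = difference(series, lag=4)

# Stand-in for the ARMA step: forecast the differenced series by its last value.
diff_forecast = seasonal_diff[-1]

# Reverse transformation: add back the observation one season earlier.
next_value = diff_forecast + series[-4]

print(seasonal_diff)
print(next_value)
```

On real data the differenced series would not be constant, and the naive step would be replaced by a fitted ARMA model; the inversion step stays the same.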
