Academia sinica jan-2015

Visualizing and forecasting big time series data

  1. Rob J Hyndman: Visualising and forecasting big time series data
  2. Outline: (1) Examples of big time series; (2) Time series visualisation; (3) BLUF: Best Linear Unbiased Forecasts; (4) Application: Australian tourism; (5) Application: Australian labour market; (6) Fast computation tricks; (7) hts package for R; (8) References.
  4. 1. Australian tourism demand. Quarterly data on visitor nights from 1998:Q1 to 2013:Q4. From: National Visitor Survey, based on annual interviews of 120,000 Australians aged 15+, collected by Tourism Research Australia. Split by 7 states, 27 zones and 76 regions (a geographical hierarchy). Also split by purpose of travel: Holiday; Visiting friends and relatives (VFR); Business; Other. 304 bottom-level series (76 regions by 4 purposes).
  6. 2. Labour market participation. Australia and New Zealand Standard Classification of Occupations: 8 major groups; 43 sub-major groups; 97 minor groups; 359 unit groups; 1023 occupations. Example: statistician. 2 Professionals > 22 Business, Human Resource and Marketing Professionals > 224 Information and Organisation Professionals > 2241 Actuaries, Mathematicians and Statisticians > 224113 Statistician.
  8. 3. PBS sales. ATC drug classification: A Alimentary tract and metabolism; B Blood and blood forming organs; C Cardiovascular system; D Dermatologicals; G Genito-urinary system and sex hormones; H Systemic hormonal preparations, excluding sex hormones and insulins; J Anti-infectives for systemic use; L Antineoplastic and immunomodulating agents; M Musculo-skeletal system; N Nervous system; P Antiparasitic products, insecticides and repellents; R Respiratory system; S Sensory organs; V Various.
  9. 3. PBS sales. ATC drug classification, example path: A Alimentary tract and metabolism (14 classes at this level) > A10 Drugs used in diabetes (84 classes at this level) > A10B Blood glucose lowering drugs > A10BA Biguanides > A10BA02 Metformin.
  13. 4. Spectacle sales. Monthly sales data from 2000 to 2014. Provided by a large spectacle manufacturer. Split by brand (26), gender (3), price range (6), materials (4), and stores (600). About a million bottom-level series.
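As a quick check on the "about a million" figure, the count of bottom-level series can be sketched as the product of the category counts on the slide. This assumes every combination of brand, gender, price range, material and store actually occurs, which the slide does not state, so the result is best read as an upper bound.

```python
from math import prod

# Category counts as given on the slide.
splits = {
    "brand": 26,
    "gender": 3,
    "price range": 6,
    "materials": 4,
    "stores": 600,
}

# One bottom-level series per combination of categories
# (assumption: all combinations are present in the data).
n_bottom = prod(splits.values())
print(n_bottom)  # 1123200
```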
  17. Hierarchical time series. A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure. Tree: Total splits into A, B and C; A into AA, AB, AC; B into BA, BB, BC; C into CA, CB, CC. Examples: net labour turnover; pharmaceutical sales; tourism by state and region.
  20. Grouped time series. A grouped time series is a collection of time series that can be grouped together in a number of non-hierarchical ways. Two alternative trees over the same bottom-level series: Total splits into A (AX, AY) and B (BX, BY); or Total splits into X (AX, BX) and Y (AY, BY). Examples: tourism by state and purpose of travel; glasses by brand and store.
  21. Outline, section 2: Time series visualisation.
  22. Victorian tourism data. [Figure: one time plot per bottom-level series, regions BAA to BEG by purpose of travel (Hol, Vis, Bus, Oth).]
  23. Kite diagrams. Start from a line graph profile. Duplicate it and flip the copy around the horizontal axis. Fill the colour.
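The three steps above can be sketched as a function that turns one series into the closed outline of its kite. The function name and the sample values are hypothetical, and filling the polygon is left to whichever plotting library is used.

```python
def kite_outline(values):
    """Return the (x, y) vertices of the closed kite polygon for one series."""
    xs = range(len(values))
    # Step 1: the line graph profile.
    upper = [(x, v) for x, v in zip(xs, values)]
    # Step 2: duplicate the profile and flip it around the horizontal axis.
    lower = [(x, -v) for x, v in zip(xs, values)]
    # Walk along the top edge, then back along the flipped edge,
    # so the vertices enclose the area to be filled (step 3).
    return upper + lower[::-1]


polygon = kite_outline([1.0, 3.0, 2.0])  # hypothetical values
```

The width of the kite at each time point is then twice the value of the series, so wide stretches mark periods of high activity.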
  24. Kite diagrams: Victorian tourism. [Figure: kite diagrams for regions BAA to BEG, 2000 to 2010, in columns Holiday, VFR, Business and Other; panel title: Victoria.]
  26. Kite diagrams: Victorian tourism. [Figure: same layout as the previous slide; panel title: Victoria: scaled.]
  27. An STL decomposition. [Figure: STL decomposition of tourism demand for holidays in Peninsula; panels: data, seasonal, trend, remainder; time axis 2000 to 2010.]
  29. Seasonal stacked bar chart. Place positive values above the origin and negative values below it. Map the bar length to the magnitude. Encode quarters by colours. [Figure: Holiday panel; x-axis: Regions (BAA to BEG); y-axis: Seasonal component, -1.0 to 1.0; colours: Qtr Q1 to Q4.]
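A minimal sketch of the stacking rule for a single region, assuming the seasonal component of each quarter is already available as a number. The function name and the sample values are hypothetical.

```python
def stack_segments(components):
    """Map each quarter to the (start, end) of its bar segment.

    Positive values stack upwards from the origin, negative values
    stack downwards, and each segment's length equals the magnitude.
    """
    segments = {}
    pos_top = 0.0     # current top of the positive stack
    neg_bottom = 0.0  # current bottom of the negative stack
    for quarter, value in components.items():
        if value >= 0:
            segments[quarter] = (pos_top, pos_top + value)
            pos_top += value
        else:
            segments[quarter] = (neg_bottom + value, neg_bottom)
            neg_bottom += value
    return segments


# Hypothetical seasonal components for one region.
segments = stack_segments({"Q1": 0.5, "Q2": -0.25, "Q3": -0.5, "Q4": 0.25})
```

Each quarter would then be drawn in its own colour between its start and end values.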
  31. Seasonal stacked bar chart: VIC. [Figure: four panels (Holiday, VFR, Business, Other); x-axis: Regions (BAA to BEG); y-axis: Seasonal component, -1.0 to 1.0; colours: Qtr Q1 to Q4.]
  32. Corrgram of remainder. Compute the correlations among the remainder components. Render both the sign and magnitude using a colour mapping of two hues. Order variables according to the first principal component of the correlations.
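The ordering step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the method used for the slides: the leading eigenvector of the correlation matrix is found by power iteration from an arbitrary start vector, the function names are hypothetical, and the four short series are made-up stand-ins for STL remainders. Constant series are not handled.

```python
from math import sqrt


def correlation_matrix(series):
    """Pearson correlations between equal-length, non-constant series."""
    centred = []
    for x in series:
        mean = sum(x) / len(x)
        centred.append([v - mean for v in x])
    norms = [sqrt(sum(v * v for v in c)) for c in centred]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centred, norms)]
        for ci, ni in zip(centred, norms)
    ]


def first_pc_order(corr, iterations=200):
    """Order variables by their loading on the leading eigenvector."""
    n = len(corr)
    v = [1.0 + i / n for i in range(n)]  # arbitrary non-uniform start
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return sorted(range(n), key=lambda i: v[i])


# Made-up remainders: series 0 and 2 move together, 1 and 3 mirror them.
remainders = [
    [1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [1, 2, 3, 5, 4],
    [5, 4, 3, 1, 2],
]
order = first_pc_order(correlation_matrix(remainders))
```

With this ordering, series that are positively correlated end up next to each other, which is what makes blocks of similar colour visible in the corrgram.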
  33. Corrgram of remainder: VIC. [Figure: corrgram of all Victorian series, regions BAA to BEG by purpose (Hol, Vis, Bus, Oth); colour scale -1 to 1.]
  34. Corrgram of remainder: VIC. [Figure: corrgram of the Victorian holiday series only; colour scale -1 to 1.]
  35. Corrgram of remainder: TAS. [Figure: corrgram of the Tasmanian series, regions FAA, FBA, FBB, FCA and FCB by purpose (Hol, Vis, Bus, Oth); colour scale -1 to 1.]
  36. Principal components decomposition. [Figure: time plots of the first three PCs (PC1, PC2, PC3), 2000 to 2010.]
  37. Principal components decomposition. [Figure: season plot of PC1, by month, Jan to Dec.]
  38. Principal components decomposition. [Figure: season plot of PC2, by month, Jan to Dec.]
  39. Principal components decomposition. [Figure: season plot of PC3, by month, Jan to Dec.]
  40. Principal components decomposition. [Figure: scatterplot of Loading 2 against Loading 1, points coloured by state: NSW, VIC, QLD, SA, TAS, NT, WA.]
  41. Principal components decomposition. [Figure: scatterplot of Loading 2 against Loading 1, points coloured by purpose: Hol, Vis, Bus, Oth.]
  48. Feature analysis. Summarize each time series with a feature vector: strength of trend; summer seasonality; winter seasonality; Box-Pierce statistic on remainder of STL; lumpiness (variance of annual variances of remainder). Do PCA on the feature matrix.
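Two of these features are simple enough to sketch directly from their definitions. The code below assumes the STL remainder is already available as a list of numbers, uses population variances for lumpiness (the slide does not say which variance is meant), and uses the textbook Box-Pierce form Q = n * sum of squared autocorrelations up to a chosen lag. Function names and the choice of lag are hypothetical.

```python
from statistics import pvariance


def lumpiness(remainder, period=4):
    """Variance of the per-year variances of the remainder.

    `period` is the number of observations per year (4 for quarterly
    data); an incomplete final year is dropped.
    """
    years = [remainder[i:i + period]
             for i in range(0, len(remainder) - period + 1, period)]
    return pvariance([pvariance(year) for year in years])


def box_pierce(x, max_lag):
    """Box-Pierce statistic: n times the sum of squared autocorrelations."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(d[t] * d[t - k] for t in range(k, n)) / c0
        q += r_k ** 2
    return n * q


# Made-up quarterly remainder covering two years.
remainder = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0]
lump = lumpiness(remainder)            # variances 1.0 and 4.0 -> 2.25
q_stat = box_pierce(remainder[:4], 1)  # r_1 = -0.75 -> 4 * 0.5625 = 2.25
```

Computing such a feature vector for every series gives the feature matrix on which the PCA is run.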
  49. Feature analysis. [Figure: PCA biplot of the feature matrix, PC1 (39.1% explained var.) against PC2 (23.6% explained var.), with arrows for trend, summer, winter, corr and lumpy; points grouped by purpose: Bus, Hol, Oth, Vis.]
  50. Feature analysis. [Figure: the same biplot with points grouped by state: NSW, NT, QLD, SA, TAS, VIC, WA.]
  51. Outline, section 3: BLUF: Best Linear Unbiased Forecasts.
  58. Hierarchical/grouped time series. Forecasts should be "aggregate consistent", unbiased and minimum variance. Existing methods: bottom-up; top-down; middle-out. How to compute forecast intervals? Most research is concerned with the relative performance of existing methods.
  64. Top-down method. Advantages: works well in presence of low counts; single forecasting model easy to build; provides reliable forecasts for aggregate levels. Disadvantages: loss of information, especially individual series dynamics; distribution of forecasts to lower levels can be difficult; no prediction intervals.
  69. Bottom-up method. Advantages: no loss of information; better captures dynamics of individual series. Disadvantages: large number of series to be forecast; constructing forecasting models is harder because of noisy data at bottom level; no prediction intervals.
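The two methods can be contrasted with a small sketch for a one-level hierarchy. The forecasts themselves are taken as given numbers here. For the top-down split, historical shares of the total are used as one possible way of distributing the aggregate forecast; that choice, the function names and all values are assumptions made for illustration.

```python
def bottom_up(bottom_forecasts):
    """Forecast the total as the sum of the bottom-level forecasts."""
    return sum(bottom_forecasts.values())


def top_down(total_forecast, history):
    """Split a forecast of the total using historical shares.

    `history` maps each bottom-level series to its past observations.
    """
    totals = {name: sum(values) for name, values in history.items()}
    grand_total = sum(totals.values())
    return {name: total_forecast * t / grand_total
            for name, t in totals.items()}


# Hypothetical numbers for two bottom-level series.
history = {"A": [10, 20], "B": [30, 40]}
split = top_down(200, history)                 # A gets 30%, B gets 70%
total = bottom_up({"A": 55.0, "B": 150.0})     # 205.0
```

Both results are aggregate consistent by construction, but each uses forecasts from only one level of the hierarchy.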
  73. The BLUF approach. Hyndman et al (CSDA 2011) proposed a new statistical framework for forecasting hierarchical time series which: (1) provides point forecasts that are consistent across the hierarchy; (2) allows for correlations and interaction between series at each level; (3) provides estimates of forecast uncertainty which are consistent across the hierarchy; (4) allows for ad hoc adjustments and inclusion of covariates at any level.
  79. Hierarchical data. Tree: Total with children A, B and C. $Y_t$: observed aggregate of all series at time $t$. $Y_{X,t}$: observation on series $X$ at time $t$. $B_t$: vector of all series at bottom level in time $t$. $$y_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}]' = \underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{S} \underbrace{\begin{bmatrix} Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \end{bmatrix}}_{B_t}, \qquad y_t = S B_t$$
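A minimal sketch of how the summing matrix $S$ could be built from a tree and applied to the bottom-level vector. The tree is passed as a dictionary from parent to children; the function name and the bottom-level values are hypothetical. Rows follow the slide's order: Total first, then each level in turn.

```python
def summing_matrix(children, root="Total"):
    """Return (nodes, leaves, S) for a tree given as {parent: [children]}.

    Row i of S has a 1 in column j when leaf j lies at or below node i.
    """
    # Breadth-first order: Total, then level 1, then level 2, ...
    nodes, queue = [], [root]
    while queue:
        node = queue.pop(0)
        nodes.append(node)
        queue.extend(children.get(node, []))
    leaves = [n for n in nodes if n not in children]

    def leaves_under(node):
        if node not in children:
            return {node}
        return set().union(*(leaves_under(c) for c in children[node]))

    S = [[1 if leaf in leaves_under(node) else 0 for leaf in leaves]
         for node in nodes]
    return nodes, leaves, S


nodes, leaves, S = summing_matrix({"Total": ["A", "B", "C"]})

b_t = [3.0, 5.0, 2.0]  # hypothetical bottom-level observations
y_t = [sum(s * b for s, b in zip(row, b_t)) for row in S]  # y_t = S B_t
```

The same function handles deeper trees: adding children AX, AY, AZ under A (and likewise for B and C) yields a matrix with 13 rows and 9 columns.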
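  82. 82. Hierarchical data
      Hierarchy: Total, with children A (AX, AY, AZ), B (BX, BY, BZ) and C (CX, CY, CZ).
      $$ y_t = \begin{bmatrix} Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t} \end{bmatrix} = \underbrace{\begin{bmatrix}
      1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
      1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
      1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
      \end{bmatrix}}_{S} \underbrace{\begin{bmatrix} Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t} \end{bmatrix}}_{B_t} $$
      $$ y_t = S B_t $$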
  85. 85. Grouped data
      Two crossed groupings (A/B and X/Y):
          AX   AY   | A
          BX   BY   | B
          ----------+------
          X    Y    | Total
      $$ y_t = \begin{bmatrix} Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{X,t} \\ Y_{Y,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t} \end{bmatrix} = \underbrace{\begin{bmatrix}
      1 & 1 & 1 & 1 \\
      1 & 1 & 0 & 0 \\
      0 & 0 & 1 & 1 \\
      1 & 0 & 1 & 0 \\
      0 & 1 & 0 & 1 \\
      1 & 0 & 0 & 0 \\
      0 & 1 & 0 & 0 \\
      0 & 0 & 1 & 0 \\
      0 & 0 & 0 & 1
      \end{bmatrix}}_{S} \underbrace{\begin{bmatrix} Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t} \end{bmatrix}}_{B_t} $$
      $$ y_t = S B_t $$
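For grouped data the rows of $S$ are membership indicators, so the matrix can be generated from the labels. The sketch below does this for the A/B by X/Y example; the helper `indicator_row` and the row ordering logic are assumptions made for illustration.

```python
from itertools import product

# Two crossed grouping factors (labels taken from the slide).
first = ["A", "B"]
second = ["X", "Y"]
bottom = [a + x for a, x in product(first, second)]  # AX, AY, BX, BY


def indicator_row(predicate):
    """One row of S: 1 where the bottom-level series belongs to the aggregate."""
    return [1 if predicate(name) else 0 for name in bottom]


# Insertion order gives the stacking order: Total, A, B, X, Y, then the bottom level.
rows = {"Total": indicator_row(lambda name: True)}
for a in first:
    rows[a] = indicator_row(lambda name, a=a: name[0] == a)
for x in second:
    rows[x] = indicator_row(lambda name, x=x: name[1] == x)
for series in bottom:
    rows[series] = indicator_row(lambda name, series=series: name == series)

S = list(rows.values())
for label, row in rows.items():
    print(f"{label:>5}: {row}")
```

Unlike a strict hierarchy, the X and Y rows cut across the A and B rows, which is why no single tree can represent the structure.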
  91. 91. Forecasting notation
      Let $\hat{y}_n(h)$ be the vector of initial $h$-step forecasts, made at time $n$, stacked in the same order as $y_t$. (They may not add up.)
      Hierarchical forecasting methods are of the form
      $$ \tilde{y}_n(h) = S P \hat{y}_n(h) $$
      for some matrix $P$.
      $P$ extracts and combines the base forecasts $\hat{y}_n(h)$ to get bottom-level forecasts.
      $S$ adds them up.
      Revised reconciled forecasts: $\tilde{y}_n(h)$.
  94. 94. Bottom-up forecasts
      $$ \tilde{y}_n(h) = S P \hat{y}_n(h) $$
      Bottom-up forecasts are obtained using $P = [\,0 \mid I\,]$, where $0$ is a null matrix and $I$ is an identity matrix.
      The $P$ matrix extracts only the bottom-level forecasts from $\hat{y}_n(h)$.
      $S$ adds them up to give the bottom-up forecasts.
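A small sketch of the bottom-up case for the Total/A/B/C hierarchy. The base forecasts in `y_hat` are made-up numbers chosen so that they do not add up; `matmul` is a hypothetical helper written here to avoid any external dependency.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hierarchy Total -> {A, B, C}: 4 series in total, 3 at the bottom level.
S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
n_bottom = len(S[0])
n_upper = len(S) - n_bottom

# P = [0 | I]: zero columns for the upper levels, identity for the bottom level.
P = [[0] * n_upper + [1 if i == j else 0 for j in range(n_bottom)]
     for i in range(n_bottom)]

# Hypothetical base forecasts (Total, A, B, C); note 100 != 30 + 40 + 20.
y_hat = [[100], [30], [40], [20]]

# Reconciled forecasts: the Total is replaced by the sum of the bottom level.
y_tilde = matmul(S, matmul(P, y_hat))
print([row[0] for row in y_tilde])
```

The base forecast for Total is discarded entirely, which is the usual criticism of bottom-up: information at the aggregate level is ignored.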
  97. 97. Top-down forecasts
      $$ \tilde{y}_n(h) = S P \hat{y}_n(h) $$
      Top-down forecasts are obtained using $P = [\,p \mid 0\,]$, where $p = [p_1, p_2, \dots, p_{m_K}]'$ is a vector of proportions that sum to one.
      $P$ distributes forecasts of the aggregate to the lowest-level series.
      Different methods of top-down forecasting lead to different proportionality vectors $p$.
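The same hierarchy with a top-down $P$. The proportions in `p` and the base forecasts are assumed values for illustration; how $p$ is chosen (historical shares, forecast shares, and so on) is exactly what distinguishes the top-down variants.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Hypothetical proportions for A, B, C; they must sum to one.
p = [0.5, 0.25, 0.25]

# P = [p | 0]: only the Total forecast is used, split according to p.
P = [[p_i] + [0] * (len(S) - 1) for p_i in p]

# Hypothetical base forecasts (Total, A, B, C).
y_hat = [[100], [30], [40], [20]]

y_tilde = matmul(S, matmul(P, y_hat))
print([row[0] for row in y_tilde])
```

Here the base forecasts for A, B and C play no role at all: only the Total forecast and the proportions determine the result.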
  105. 105. General properties: bias
      $$ \tilde{y}_n(h) = S P \hat{y}_n(h) $$
      Assume the base forecasts $\hat{y}_n(h)$ are unbiased:
      $$ E[\hat{y}_n(h) \mid y_1, \dots, y_n] = E[y_{n+h} \mid y_1, \dots, y_n] $$
      Let $\hat{B}_n(h)$ be the bottom-level base forecasts with $\beta_n(h) = E[\hat{B}_n(h) \mid y_1, \dots, y_n]$.
      Then $E[\hat{y}_n(h)] = S\beta_n(h)$.
      We want the revised forecasts to be unbiased:
      $$ E[\tilde{y}_n(h)] = S P S \beta_n(h) = S \beta_n(h). $$
      The result holds provided $SPS = S$.
      This is true for bottom-up, but not for any top-down or middle-out method.
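The condition $SPS = S$ is easy to check numerically. The sketch below does so for a bottom-up and a top-down $P$ on the Total/A/B/C hierarchy; the proportions and the function name `preserves_unbiasedness` are assumptions for illustration.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def preserves_unbiasedness(S, P):
    """Check the condition S P S = S."""
    return matmul(S, matmul(P, S)) == S


S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Bottom-up: P = [0 | I].
P_bottom_up = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Top-down: P = [p | 0] with hypothetical proportions p.
p = [0.5, 0.25, 0.25]
P_top_down = [[p_i, 0, 0, 0] for p_i in p]

print("bottom-up:", preserves_unbiasedness(S, P_bottom_up))
print("top-down: ", preserves_unbiasedness(S, P_top_down))
```

For top-down, $PS = p\,1'$ has rank one, so it cannot equal the identity whenever there is more than one bottom-level series; that is why the condition fails for every choice of $p$.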
  108. 108. General properties: variance
      $$ \tilde{y}_n(h) = S P \hat{y}_n(h) $$
      Let the variance of the base forecasts $\hat{y}_n(h)$ be given by
      $$ \Sigma_h = \mathrm{Var}[\hat{y}_n(h) \mid y_1, \dots, y_n]. $$
      Then the variance of the revised forecasts is given by
      $$ \mathrm{Var}[\tilde{y}_n(h) \mid y_1, \dots, y_n] = S P \Sigma_h P' S'. $$
      This is a general result for all existing methods.
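A sketch of the variance formula $S P \Sigma_h P' S'$ for the bottom-up $P$. The diagonal $\Sigma_h$ is an assumed toy value; in practice it has to be estimated and is rarely diagonal.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
P = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # bottom-up, P = [0 | I]

# Hypothetical base-forecast variances for (Total, A, B, C), no covariances.
Sigma_h = [
    [4, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
]

SP = matmul(S, P)
var_tilde = matmul(matmul(SP, Sigma_h), transpose(SP))
for row in var_tilde:
    print(row)
```

With bottom-up, the reconciled Total inherits the sum of the bottom-level variances and is correlated with each bottom-level series, while the variance of the Total base forecast drops out.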
  111. 111. BLUF via trace minimization
      Theorem: Over all $P$ satisfying $SPS = S$, the problem
      $$ \min_P \operatorname{trace}[S P \Sigma_h P' S'] $$
      has solution
      $$ P = (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger. $$
      $\Sigma_h^\dagger$ is a generalized inverse of $\Sigma_h$.
      Equivalent to the GLS estimate of the regression $\hat{y}_n(h) = S\beta_n(h) + \varepsilon_h$, where $\varepsilon_h \sim N(0, \Sigma_h)$.
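A short check, added here as a worked step rather than taken from the slide, that the stated solution satisfies the unbiasedness constraint (assuming $S'\Sigma_h^\dagger S$ is invertible):

$$ S P S = S\,(S' \Sigma_h^\dagger S)^{-1}\,(S' \Sigma_h^\dagger S) = S. $$

It also makes the regression view explicit: $P\hat{y}_n(h) = (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger \hat{y}_n(h)$ is the GLS estimate $\hat{\beta}_n(h)$ of the bottom-level means, and $\tilde{y}_n(h) = S\hat{\beta}_n(h)$ is the corresponding vector of fitted values.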
  118. 118. Optimal combination forecasts
      $$ \underbrace{\tilde{y}_n(h)}_{\text{Revised forecasts}} = S P \hat{y}_n(h) = S (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger \underbrace{\hat{y}_n(h)}_{\text{Initial forecasts}} $$
      $\Sigma_h^\dagger$ is a generalized inverse of $\Sigma_h$.
      $$ \mathrm{Var}[\tilde{y}_n(h) \mid y_1, \dots, y_n] = S (S' \Sigma_h^\dagger S)^{-1} S' $$
      Problem: $\Sigma_h$ is hard to estimate.
  124. 124. Optimal combination forecasts
      $$ \underbrace{\tilde{y}_n(h)}_{\text{Revised forecasts}} = S (S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger \underbrace{\hat{y}_n(h)}_{\text{Base forecasts}} $$
      Solution 1: OLS
      Assume $\varepsilon_h \approx S \varepsilon_{B,h}$, where $\varepsilon_{B,h}$ is the forecast error at the bottom level.
      Then $\Sigma_h \approx S \Omega_h S'$, where $\Omega_h = \mathrm{Var}(\varepsilon_{B,h})$.
      If the Moore-Penrose generalized inverse is used, then $(S' \Sigma_h^\dagger S)^{-1} S' \Sigma_h^\dagger = (S'S)^{-1} S'$.
      $$ \tilde{y}_n(h) = S (S'S)^{-1} S' \hat{y}_n(h) $$
  126. 126. Optimal combination forecasts
      Hierarchy: Total, with children A, B and C.
      $$ \tilde{y}_n(h) = S (S'S)^{-1} S' \hat{y}_n(h) $$
      Weights:
      $$ S (S'S)^{-1} S' = \begin{bmatrix}
      0.75 & 0.25 & 0.25 & 0.25 \\
      0.25 & 0.75 & -0.25 & -0.25 \\
      0.25 & -0.25 & 0.75 & -0.25 \\
      0.25 & -0.25 & -0.25 & 0.75
      \end{bmatrix} $$
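The weight matrix can be reproduced directly from $S$. The sketch below uses exact fractions and a small Gauss-Jordan inverse so that it needs only the standard library; the helper names (`inverse`, `ols_weights`) are assumptions, and a dense inverse like this is only sensible for tiny hierarchies.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def inverse(A):
    """Invert a small square matrix by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [x / scale for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]


def ols_weights(S):
    """Return the OLS reconciliation weights S (S'S)^{-1} S'."""
    St = transpose(S)
    return matmul(matmul(S, inverse(matmul(St, S))), St)


S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
W = ols_weights(S)
for row in W:
    print([float(x) for x in row])
```

Each row of the result shows how one reconciled forecast is formed from all four base forecasts; the weights depend only on the structure encoded in $S$, not on the data.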
  128. 128. Optimal combination forecasts
      Hierarchy: Total, with children A (AA, AB, AC), B (BA, BB, BC) and C (CA, CB, CC).
      Weights (rows and columns ordered Total, A, B, C, AA, AB, AC, BA, BB, BC, CA, CB, CC):
      $$ S (S'S)^{-1} S' = \left[\begin{array}{rrrrrrrrrrrrr}
      0.69 & 0.23 & 0.23 & 0.23 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 \\
      0.23 & 0.58 & -0.17 & -0.17 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 \\
      0.23 & -0.17 & 0.58 & -0.17 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 \\
      0.23 & -0.17 & -0.17 & 0.58 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 \\
      0.08 & 0.19 & -0.06 & -0.06 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
      0.08 & 0.19 & -0.06 & -0.06 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
      0.08 & 0.19 & -0.06 & -0.06 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
      0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 \\
      0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 \\
      0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 \\
      0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 \\
      0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 \\
      0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73
      \end{array}\right] $$
  134. 134. Features
      Covariates can be included in the initial forecasts.
      Adjustments can be made to the initial forecasts at any level.
      Very simple and flexible method. Can work with any hierarchical or grouped time series.
      $SPS = S$, so the reconciled forecasts are unbiased.
      Conceptually easy to implement: OLS on the base forecasts.
      Weights are independent of the data and of the covariance structure of the hierarchy.
  137. 137. Challenges
      $$ \tilde{y}_n(h) = S (S'S)^{-1} S' \hat{y}_n(h) $$
      Computational difficulties in big hierarchies due to the size of the $S$ matrix and the singular behavior of $(S'S)$.
      Need to estimate the covariance matrix to produce prediction intervals.
      Ignores the covariance matrix in computing point forecasts.
  142. 142. Optimal combination forecasts
      $$ \tilde{y}_n(h) = S (S' \Sigma_1^\dagger S)^{-1} S' \Sigma_1^\dagger \hat{y}_n(h) $$
      Solution 1: OLS
      Approximate $\Sigma_1^\dagger$ by $cI$.
      Solution 2: Rescaling
      Suppose we approximate $\Sigma_1$ by its diagonal.
      Let $\Lambda = [\operatorname{diagonal}(\Sigma_1)]^{-1}$ contain the inverse one-step forecast variances.
      $$ \tilde{y}_n(h) = S (S' \Lambda S)^{-1} S' \Lambda \hat{y}_n(h) $$
  147. 147. Optimal reconciled forecasts
      $$ \underbrace{\tilde{y}_n(h)}_{\text{Revised forecasts}} = S \hat{\beta}_n(h) = S (S' \Lambda S)^{-1} S' \Lambda \underbrace{\hat{y}_n(h)}_{\text{Initial forecasts}} $$
      Easy to estimate, and places weight where we have the best forecasts.
      Ignores covariances.
      For large numbers of time series, we need to do the calculation without explicitly forming $S$ or $(S' \Lambda S)^{-1}$ or $S' \Lambda$.
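A minimal sketch of the rescaled (weighted least squares) reconciliation on the Total/A/B/C hierarchy. It solves the normal equations $(S' \Lambda S)\beta = S' \Lambda \hat{y}_n(h)$ rather than forming the inverse, but it still builds dense matrices, so it illustrates the formula only and is not one of the fast computation methods needed for large hierarchies. The variances and base forecasts are assumed values, and `solve` and `wls_reconcile` are hypothetical helper names.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (small dense systems only)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [x / scale for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]


def wls_reconcile(S, variances, y_hat):
    """Return S (S' L S)^{-1} S' L y_hat with L = diag(1 / variances)."""
    lam = [1 / Fraction(v) for v in variances]
    n, m = len(S), len(S[0])
    # Normal equations: (S' L S) beta = S' L y_hat.
    A = [[sum(S[k][i] * lam[k] * S[k][j] for k in range(n)) for j in range(m)]
         for i in range(m)]
    rhs = [sum(S[k][i] * lam[k] * y_hat[k] for k in range(n)) for i in range(m)]
    beta = solve(A, rhs)
    # Reconciled forecasts are the fitted values S beta.
    return [sum(S[k][j] * beta[j] for j in range(m)) for k in range(n)]


S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_hat = [100, 30, 40, 20]   # hypothetical base forecasts (Total, A, B, C)
variances = [4, 1, 1, 1]    # hypothetical one-step forecast variances

y_wls = wls_reconcile(S, variances, y_hat)
y_ols = wls_reconcile(S, [1, 1, 1, 1], y_hat)  # equal variances reduce to OLS
print([float(x) for x in y_wls])
print([float(x) for x in y_ols])
```

Because the Total forecast is given the largest variance here, the weighted solution moves the reconciled Total further towards the sum of the bottom-level forecasts than OLS does.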
  151. 151. Australian tourism
      Hierarchy: States (7), Zones (27), Regions (82).
      Base forecasts: ETS (exponential smoothing) models.
  152. 152. Base forecasts [Figure: Domestic tourism forecasts, Total. x-axis: Year; y-axis: Visitor nights.]
  153. 153. Base forecasts [Figure: Domestic tourism forecasts, NSW. x-axis: Year; y-axis: Visitor nights.]
  154. 154. Base forecasts [Figure: Domestic tourism forecasts, VIC. x-axis: Year; y-axis: Visitor nights.]
  155. 155. Base forecasts [Figure: Domestic tourism forecasts, Nth.Coast.NSW. x-axis: Year; y-axis: Visitor nights.]
  156. 156. Base forecasts [Figure: Domestic tourism forecasts, Metro.QLD. x-axis: Year; y-axis: Visitor nights.]
  157. 157. Base forecasts [Figure: Domestic tourism forecasts, Sth.WA. x-axis: Year; y-axis: Visitor nights.]
  158. 158. Base forecasts [Figure: Domestic tourism forecasts, X201.Melbourne. x-axis: Year; y-axis: Visitor nights.]
  159. 159. Base forecasts [Figure: Domestic tourism forecasts, X402.Murraylands. x-axis: Year; y-axis: Visitor nights.]
  160. 160. Base forecasts [Figure: Domestic tourism forecasts, X809.Daly. x-axis: Year; y-axis: Visitor nights.]
  161. 161. Reconciled forecasts
       [Figure, 3 builds: reconciled forecasts (time axis 2000 to 2010). Top level: Total. Next level: NSW, VIC, QLD, Other. Next level: Sydney, Other NSW, Melbourne, Other VIC, GC and Brisbane, Other QLD, Capital cities, Other]
  167. 167. Forecast evaluation
       - Select models using all observations.
       - Re-estimate models using the first 12 observations and generate 1- to 8-step-ahead forecasts.
       - Increase the sample size one observation at a time, re-estimate the models, and generate forecasts until the end of the sample.
       - In total: 24 1-step-ahead, 23 2-step-ahead, ..., down to 17 8-step-ahead forecasts for evaluation.
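The rolling-origin scheme above can be sketched in base R as follows. This is only an illustration under stated assumptions: the series is simulated, its length (36 observations) is chosen so that the forecast counts match the slide, and a seasonal naive forecast stands in for the ETS models used in the talk.

```r
# Rolling-origin forecast evaluation on one simulated quarterly series.
# Assumptions (not from the slides): simulated data, 36 observations,
# and a seasonal naive forecast in place of ETS.
set.seed(1)
n_obs  <- 36   # total number of observations (assumed)
n_init <- 12   # size of the first training window
h_max  <- 8    # longest forecast horizon
y <- ts(100 + 10 * sin(2 * pi * (1:n_obs) / 4) + rnorm(n_obs), frequency = 4)

# Seasonal naive forecast: repeat the last observed year.
snaive_fc <- function(train, h) {
  last_year <- tail(as.numeric(train), 4)
  rep(last_year, length.out = h)
}

# One row per forecast origin, one column per horizon.
ape <- matrix(NA_real_, nrow = n_obs - n_init, ncol = h_max)
for (i in seq_len(n_obs - n_init)) {
  origin  <- n_init + i - 1                # last observation in the training set
  h_avail <- min(h_max, n_obs - origin)    # horizons that still have actual values
  fc      <- snaive_fc(y[1:origin], h_avail)
  actual  <- y[(origin + 1):(origin + h_avail)]
  ape[i, seq_len(h_avail)] <- 100 * abs((actual - fc) / actual)
}

n_forecasts <- colSums(!is.na(ape))   # number of forecasts at each horizon
mape <- colMeans(ape, na.rm = TRUE)   # MAPE (%) by horizon
```

With these settings `n_forecasts` runs from 24 at h = 1 down to 17 at h = 8, which is the pattern stated on the slide.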
  168. 168. Hierarchy: states, zones, regions
       MAPE (%) by forecast horizon h

                                 h = 1   h = 2   h = 4   h = 6   h = 8   Average
       Top level: Australia
         Bottom-up                3.79    3.58    4.01    4.55    4.24      4.06
         OLS                      3.83    3.66    3.88    4.19    4.25      3.94
         Scaling (st. dev.)       3.68    3.56    3.97    4.57    4.25      4.04
       Level: States
         Bottom-up               10.70   10.52   10.85   11.46   11.27     11.03
         OLS                     11.07   10.58   11.13   11.62   12.21     11.35
         Scaling (st. dev.)      10.44   10.17   10.47   10.97   10.98     10.67
       Level: Zones
         Bottom-up               14.99   14.97   14.98   15.69   15.65     15.32
         OLS                     15.16   15.06   15.27   15.74   16.15     15.48
         Scaling (st. dev.)      14.63   14.62   14.68   15.17   15.25     14.94
       Bottom level: Regions
         Bottom-up               33.12   32.54   32.26   33.74   33.96     33.18
         OLS                     35.89   33.86   34.26   36.06   37.49     35.43
         Scaling (st. dev.)      31.68   31.22   31.08   32.41   32.77     31.89
  169. 169. Outline: section 5, Application: Australian labour market
  171. 171. ANZSCO
       Australia and New Zealand Standard Classification of Occupations
       - 8 major groups
         - 43 sub-major groups
           - 97 minor groups
             - 359 unit groups
               - 1023 occupations
       Example: statistician
         2       Professionals
         22      Business, Human Resource and Marketing Professionals
         224     Information and Organisation Professionals
         2241    Actuaries, Mathematicians and Statisticians
         224113  Statistician
  175. 175. Australian Labour Market data
       [Figure: time series at hierarchy Levels 0 to 4 (time axis 1990 to 2010). The Level 1 panel shows the 8 major groups: 1. Managers; 2. Professionals; 3. Technicians and trade workers; 4. Community and personal services workers; 5. Clerical and administrative workers; 6. Sales workers; 7. Machinery operators and drivers; 8. Labourers. The lower three panels show the largest sub-groups at each level.]
       [Figure: base forecasts and reconciled forecasts at Levels 0 to 4 (year axis 2010 to 2015). Base forecasts are from auto.arima(). The largest changes are shown for each level.]
  176. 176. Forecast evaluation (rolling origin)
       RMSE by forecast horizon h

                      h = 1   h = 2   h = 3   h = 4   h = 5   h = 6   h = 7   h = 8   Average
       Top level
         Bottom-up    74.71  102.02  121.70  131.17  147.08  157.12  169.60  178.93    135.29
         OLS          52.20   77.77  101.50  119.03  138.27  150.75  160.04  166.38    120.74
         WLS          61.77   86.32  107.26  119.33  137.01  146.88  156.71  162.38    122.21
       Level 1
         Bottom-up    21.59   27.33   30.81   32.94   35.45   37.10   39.00   40.51     33.09
         OLS          21.89   28.55   32.74   35.58   38.82   41.24   43.34   45.49     35.96
         WLS          20.58   26.19   29.71   31.84   34.36   35.89   37.53   38.86     31.87
       Level 2
         Bottom-up     8.78   10.72   11.79   12.42   13.13   13.61   14.14   14.65     12.40
         OLS           9.02   11.19   12.34   13.04   13.92   14.56   15.17   15.77     13.13
         WLS           8.58   10.48   11.54   12.15   12.88   13.36   13.87   14.36     12.15
       Level 3
         Bottom-up     5.44    6.57    7.17    7.53    7.94    8.27    8.60    8.89      7.55
         OLS           5.55    6.78    7.42    7.81    8.29    8.68    9.04    9.37      7.87
         WLS           5.35    6.46    7.06    7.42    7.84    8.17    8.48    8.76      7.44
       Bottom level
         Bottom-up     2.35    2.79    3.02    3.15    3.29    3.42    3.54    3.65      3.15
         OLS           2.40    2.86    3.10    3.24    3.41    3.55    3.68    3.80      3.25
         WLS           2.34    2.77    2.99    3.12    3.27    3.40    3.52    3.63      3.13
  177. 177. Outline: section 6, Fast computation tricks
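  178. 178. Fast computation: hierarchical data
       Tree: Total has children A, B, C; A has children AX, AY, AZ; B has children BX, BY, BZ; C has children CX, CY, CZ.
       Rows ordered level by level:
       \[
       \boldsymbol{y}_t =
       \begin{pmatrix}
         Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
       \end{pmatrix}
       =
       \underbrace{\begin{pmatrix}
         1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
         1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
         0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
         0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
         1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
         0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
         0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
         0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
         0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
         0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
         0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
         0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
         0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
       \end{pmatrix}}_{\boldsymbol{S}}
       \underbrace{\begin{pmatrix}
         Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
       \end{pmatrix}}_{\boldsymbol{B}_t}
       \]
       That is, $\boldsymbol{y}_t = \boldsymbol{S}\boldsymbol{B}_t$.
  179. 179. Fast computation: hierarchical data
       The same hierarchy with rows ordered tree by tree:
       \[
       \boldsymbol{y}_t =
       \begin{pmatrix}
         Y_t \\ Y_{A,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{B,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{C,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
       \end{pmatrix}
       =
       \underbrace{\begin{pmatrix}
         1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
         1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
         1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
         0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
         0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
         0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
         0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
         0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
         0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
         0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
         0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
         0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
         0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
       \end{pmatrix}}_{\boldsymbol{S}}
       \underbrace{\begin{pmatrix}
         Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
       \end{pmatrix}}_{\boldsymbol{B}_t}
       \]
       Again $\boldsymbol{y}_t = \boldsymbol{S}\boldsymbol{B}_t$.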
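As a small check of the identity y_t = S B_t, the summing matrix for this 3 x 3 example can be built directly in base R. The bottom-level values `B_t` below are made up for illustration, and the variable names are choices of this sketch rather than anything from the slides.

```r
# Summing matrix S for the example hierarchy: Total -> A, B, C -> 3 children each.
bottom <- c("AX", "AY", "AZ", "BX", "BY", "BZ", "CX", "CY", "CZ")
parent <- substr(bottom, 1, 1)   # "A", "B" or "C" for each bottom-level series

total_row <- matrix(1, nrow = 1, ncol = 9, dimnames = list("Total", bottom))
middle <- t(sapply(c("A", "B", "C"), function(p) as.numeric(parent == p)))
colnames(middle) <- bottom
bottom_rows <- diag(9)
dimnames(bottom_rows) <- list(bottom, bottom)

# Level-by-level row ordering (13 x 9).
S <- rbind(total_row, middle, bottom_rows)

# Tree-by-tree row ordering: same rows, permuted.
tree_order <- c("Total", "A", "AX", "AY", "AZ", "B", "BX", "BY", "BZ",
                "C", "CX", "CY", "CZ")
S_tree <- S[tree_order, ]

B_t <- 1:9                 # hypothetical bottom-level observations
y_t <- drop(S %*% B_t)     # all 13 series, aggregated from the bottom level
```

Reordering the rows of S only permutes the elements of y_t; the bottom-level vector B_t is unchanged.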
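  181. 181. Fast computation: hierarchies
       Think of the hierarchy as a tree of trees: Total at the root, with sub-trees $T_1, T_2, \dots, T_K$.
       Then the summing matrix contains $K$ smaller summing matrices:
       \[
       \boldsymbol{S} =
       \begin{pmatrix}
         \boldsymbol{1}'_{n_1} & \boldsymbol{1}'_{n_2} & \cdots & \boldsymbol{1}'_{n_K} \\
         \boldsymbol{S}_1 & \boldsymbol{0} & \cdots & \boldsymbol{0} \\
         \boldsymbol{0} & \boldsymbol{S}_2 & \cdots & \boldsymbol{0} \\
         \vdots & \vdots & \ddots & \vdots \\
         \boldsymbol{0} & \boldsymbol{0} & \cdots & \boldsymbol{S}_K
       \end{pmatrix}
       \]
       where $\boldsymbol{1}_n$ is an $n$-vector of ones and tree $T_i$ has $n_i$ terminal nodes.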
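The tree-of-trees structure suggests a recursive construction of S. The sketch below is one way to write it in base R; the nested-list representation of the tree (a terminal node is `NULL`, an internal node is a list of its children) and the function name `summing_matrix` are assumptions of this example, not the representation used by the hts package.

```r
# Build a summing matrix recursively from a nested-list description of the tree.
# Assumed representation: a terminal node is NULL; an internal node is a list
# of its child sub-trees.
summing_matrix <- function(tree) {
  if (is.null(tree)) return(matrix(1, 1, 1))     # a single terminal node
  subs <- lapply(tree, summing_matrix)           # S_1, ..., S_K
  n_k  <- vapply(subs, ncol, integer(1))         # terminal nodes per sub-tree
  m_k  <- vapply(subs, nrow, integer(1))         # all nodes per sub-tree
  blocks <- matrix(0, sum(m_k), sum(n_k))        # block-diagonal part
  row0 <- 0
  col0 <- 0
  for (k in seq_along(subs)) {
    blocks[row0 + seq_len(m_k[k]), col0 + seq_len(n_k[k])] <- subs[[k]]
    row0 <- row0 + m_k[k]
    col0 <- col0 + n_k[k]
  }
  rbind(rep(1, sum(n_k)), blocks)                # top row is (1'_{n_1}, ..., 1'_{n_K})
}

leaf3 <- list(NULL, NULL, NULL)                  # a parent with three terminal children
S <- summing_matrix(list(leaf3, leaf3, leaf3))   # Total -> A, B, C -> 3 children each
```

For this input the result is the 13 x 9 summing matrix in tree-by-tree row order. Each recursive call only ever builds the summing matrix of one sub-tree, which is the property the fast algorithm exploits.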
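  183. 183. Fast computation: hierarchies
       \[
       \boldsymbol{S}'\boldsymbol{\Lambda}\boldsymbol{S} =
       \begin{pmatrix}
         \boldsymbol{S}_1'\boldsymbol{\Lambda}_1\boldsymbol{S}_1 & \boldsymbol{0} & \cdots & \boldsymbol{0} \\
         \boldsymbol{0} & \boldsymbol{S}_2'\boldsymbol{\Lambda}_2\boldsymbol{S}_2 & \cdots & \boldsymbol{0} \\
         \vdots & \vdots & \ddots & \vdots \\
         \boldsymbol{0} & \boldsymbol{0} & \cdots & \boldsymbol{S}_K'\boldsymbol{\Lambda}_K\boldsymbol{S}_K
       \end{pmatrix}
       + \lambda_0 \boldsymbol{J}_n
       \]
       - $\lambda_0$ is the top-left element of $\boldsymbol{\Lambda}$;
       - $\boldsymbol{\Lambda}_k$ is a block of $\boldsymbol{\Lambda}$, corresponding to tree $T_k$;
       - $\boldsymbol{J}_n$ is an $n \times n$ matrix of ones;
       - $n = \sum_k n_k$.
       Now apply the Sherman-Morrison formula ...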
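For reference, the Sherman-Morrison step can be written out explicitly. The symbol A for the block-diagonal part is introduced here for convenience and does not appear on the slides; the derivation assumes λ0 > 0 and that each block S_k'Λ_kS_k is invertible.

```latex
\[
\boldsymbol{S}'\boldsymbol{\Lambda}\boldsymbol{S}
  = \boldsymbol{A} + \lambda_0 \boldsymbol{1}_n \boldsymbol{1}_n',
\qquad
\boldsymbol{A} = \operatorname{diag}\bigl(
  \boldsymbol{S}_1'\boldsymbol{\Lambda}_1\boldsymbol{S}_1, \dots,
  \boldsymbol{S}_K'\boldsymbol{\Lambda}_K\boldsymbol{S}_K \bigr),
\qquad
\boldsymbol{J}_n = \boldsymbol{1}_n \boldsymbol{1}_n' .
\]
\[
\bigl(\boldsymbol{A} + \lambda_0 \boldsymbol{1}_n \boldsymbol{1}_n'\bigr)^{-1}
  = \boldsymbol{A}^{-1}
  - \frac{\boldsymbol{A}^{-1}\boldsymbol{1}_n \boldsymbol{1}_n' \boldsymbol{A}^{-1}}
         {\lambda_0^{-1} + \boldsymbol{1}_n' \boldsymbol{A}^{-1} \boldsymbol{1}_n},
\qquad
\boldsymbol{1}_n' \boldsymbol{A}^{-1} \boldsymbol{1}_n
  = \sum_{k=1}^{K} \boldsymbol{1}_{n_k}'
    (\boldsymbol{S}_k'\boldsymbol{\Lambda}_k\boldsymbol{S}_k)^{-1}
    \boldsymbol{1}_{n_k} .
\]
```

The denominator is the quantity the following slide calls c^{-1}, and the numerator, partitioned by sub-tree, is the matrix it calls S_0.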
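  185. 185. Fast computation: hierarchies
       \[
       (\boldsymbol{S}'\boldsymbol{\Lambda}\boldsymbol{S})^{-1} =
       \begin{pmatrix}
         (\boldsymbol{S}_1'\boldsymbol{\Lambda}_1\boldsymbol{S}_1)^{-1} & \boldsymbol{0} & \cdots & \boldsymbol{0} \\
         \boldsymbol{0} & (\boldsymbol{S}_2'\boldsymbol{\Lambda}_2\boldsymbol{S}_2)^{-1} & \cdots & \boldsymbol{0} \\
         \vdots & \vdots & \ddots & \vdots \\
         \boldsymbol{0} & \boldsymbol{0} & \cdots & (\boldsymbol{S}_K'\boldsymbol{\Lambda}_K\boldsymbol{S}_K)^{-1}
       \end{pmatrix}
       - c\,\boldsymbol{S}_0
       \]
       - $\boldsymbol{S}_0$ can be partitioned into $K^2$ blocks, with the $(k, \ell)$ block (of dimension $n_k \times n_\ell$) being
         $(\boldsymbol{S}_k'\boldsymbol{\Lambda}_k\boldsymbol{S}_k)^{-1}\, \boldsymbol{J}_{n_k,n_\ell}\, (\boldsymbol{S}_\ell'\boldsymbol{\Lambda}_\ell\boldsymbol{S}_\ell)^{-1}$.
       - $\boldsymbol{J}_{n_k,n_\ell}$ is an $n_k \times n_\ell$ matrix of ones.
       - $c^{-1} = \lambda_0^{-1} + \sum_k \boldsymbol{1}'_{n_k} (\boldsymbol{S}_k'\boldsymbol{\Lambda}_k\boldsymbol{S}_k)^{-1} \boldsymbol{1}_{n_k}$.
       - Each $\boldsymbol{S}_k'\boldsymbol{\Lambda}_k\boldsymbol{S}_k$ can be inverted similarly.
       - $\boldsymbol{S}'\boldsymbol{\Lambda}\boldsymbol{y}$ can also be computed recursively.
       The recursive calculations can be done in such a way that we never store any of the large matrices involved.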
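The block-inverse formula can be checked numerically on the small 3 x 3 example. In the sketch below the diagonal weights in Λ are random positive numbers chosen only for the check, and the full matrices S and Λ are formed solely to obtain a reference answer by direct inversion; a real implementation would avoid forming them.

```r
# Numerical check of the block-inverse formula on the 3 x 3 example hierarchy.
# Rows are in tree order: Total, A, AX, AY, AZ, B, ...
# Lambda is an arbitrary positive diagonal matrix (an assumption of this check).
set.seed(42)
K   <- 3
S_k <- rbind(rep(1, 3), diag(3))     # summing matrix of one sub-tree (4 x 3)
n_k <- ncol(S_k)
n   <- K * n_k

lambda0  <- runif(1, 0.5, 2)         # weight attached to Total
Lambda_k <- replicate(K, diag(runif(nrow(S_k), 0.5, 2)), simplify = FALSE)

# Reference answer: form the full S and Lambda and invert directly.
S      <- rbind(rep(1, n), kronecker(diag(K), S_k))
Lambda <- diag(c(lambda0, unlist(lapply(Lambda_k, diag))))
direct <- solve(t(S) %*% Lambda %*% S)

# Block route: invert each small S_k' Lambda_k S_k, then apply the rank-one correction.
A_inv   <- lapply(Lambda_k, function(L) solve(t(S_k) %*% L %*% S_k))
u       <- unlist(lapply(A_inv, rowSums))    # stacked (S_k' Lambda_k S_k)^{-1} 1
c_const <- 1 / (1 / lambda0 + sum(unlist(lapply(A_inv, sum))))

block_inv <- matrix(0, n, n)
for (k in seq_len(K)) {
  idx <- (k - 1) * n_k + seq_len(n_k)
  block_inv[idx, idx] <- A_inv[[k]]
}
via_blocks   <- block_inv - c_const * tcrossprod(u)   # S_0 equals u u'
max_abs_diff <- max(abs(direct - via_blocks))
```

Here S_0 is formed as the outer product u u', which is the same matrix as the block definition on the slide because each block inverse is symmetric. `max_abs_diff` should be at the level of floating-point rounding error.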
  188. 188. Fast computation
       When the time series are not strictly hierarchical and have more than two grouping variables:
       - Use sparse matrix storage and arithmetic.
       - Use iterative approximation for inverting large sparse matrices: Paige & Saunders (1982), ACM Trans. Math. Software.
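To illustrate the idea of an iterative solve that touches S only through matrix-vector products, here is a base-R sketch using conjugate gradients on the weighted normal equations. This is a stand-in chosen for brevity, not the LSQR algorithm of Paige & Saunders and not the code used in hts; the function name `cg_solve`, the small 2 x 3 grouped structure and the simulated forecasts are all assumptions. With a sparse-matrix package, the two product functions would wrap sparse products instead of dense ones.

```r
# Matrix-free conjugate gradients for (S' Lambda S) beta = S' Lambda y_hat.
# A stand-in for LSQR (not LSQR itself); S enters only through products.
cg_solve <- function(S_mult, St_mult, lambda, y_hat, n, tol = 1e-10, max_iter = 200) {
  normal_mult <- function(v) St_mult(lambda * S_mult(v))   # v -> S' Lambda S v
  rhs  <- St_mult(lambda * y_hat)
  beta <- numeric(n)
  r <- rhs                           # residual, since beta starts at zero
  p <- r
  rs_old <- sum(r^2)
  for (iter in seq_len(max_iter)) {
    Ap    <- normal_mult(p)
    alpha <- rs_old / sum(p * Ap)
    beta  <- beta + alpha * p
    r     <- r - alpha * Ap
    rs_new <- sum(r^2)
    if (sqrt(rs_new) <= tol * sqrt(sum(rhs^2))) break      # relative residual test
    p <- r + (rs_new / rs_old) * p
    rs_old <- rs_new
  }
  beta
}

# Hypothetical grouped structure: 2 states crossed with 3 purposes (6 bottom series).
state   <- rep(1:2, each = 3)
purpose <- rep(1:3, times = 2)
S <- rbind(rep(1, 6),
           t(sapply(1:2, function(s) as.numeric(state == s))),
           t(sapply(1:3, function(q) as.numeric(purpose == q))),
           diag(6))                                        # 12 x 6

set.seed(7)
y_hat  <- drop(S %*% runif(6, 50, 100)) + rnorm(12)        # incoherent base forecasts
lambda <- runif(12, 0.5, 2)                                # diagonal of Lambda

beta_cg <- cg_solve(function(v) drop(S %*% v),
                    function(w) drop(crossprod(S, w)),
                    lambda, y_hat, n = 6)
beta_direct <- drop(solve(crossprod(S, lambda * S), crossprod(S, lambda * y_hat)))
reconciled  <- drop(S %*% beta_cg)                         # coherent forecasts S beta
```

On this small example `beta_cg` agrees with the direct solution `beta_direct`, and the reconciled forecasts add up across both grouping variables by construction.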
  189. 189. Outline: section 7, hts package for R
  190. 190. hts package for R
       hts: Hierarchical and grouped time series
       Methods for analysing and forecasting hierarchical and grouped time series
       Version:     4.3
       Depends:     forecast (≥ 5.0)
       Imports:     SparseM, parallel, utils
       Published:   2014-06-10
       Author:      Rob J Hyndman, Earo Wang and Alan Lee
       Maintainer:  Rob J Hyndman <Rob.Hyndman at monash.edu>
       BugReports:  https://github.com/robjhyndman/hts/issues
       License:     GPL (≥ 2)
  193. 193. Example using R
         library(hts)

         # bts is a matrix containing the bottom-level time series
         # nodes describes the hierarchical structure:
         #   Total has 2 children (A, B); A has 3 children (AX, AY, AZ); B has 2 (BX, BY)
         y <- hts(bts, nodes = list(2, c(3, 2)))

         # Forecast 10 steps ahead using the WLS combination method
         # ETS is used for each series by default
         fc <- forecast(y, h = 10)
  194. 194. forecast.gts function
       Usage
         forecast(object, h,
                  method = c("comb", "bu", "mo", "tdgsf", "tdgsa", "tdfp"),
                  fmethod = c("ets", "rw", "arima"),
                  weights = c("sd", "none", "nseries"),
                  positive = FALSE, parallel = FALSE, num.cores = 2, ...)
       Arguments
         object     Hierarchical time series object of class gts.
         h          Forecast horizon.
         method     Method for distributing forecasts within the hierarchy.
         fmethod    Forecasting method to use.
         positive   If TRUE, forecasts are forced to be strictly positive.
         weights    Weights used for the "optimal combination" method. When weights = "sd", it takes account of the standard deviation of forecasts.
         parallel   If TRUE, allow parallel processing.
         num.cores  If parallel = TRUE, specify how many cores are going to be used.
  195. 195. Outline: section 8, References
  197. 197. References
       - RJ Hyndman, RA Ahmed, G Athanasopoulos, and HL Shang (2011). "Optimal combination forecasts for hierarchical time series". Computational Statistics & Data Analysis 55(9), 2579–2589.
       - RJ Hyndman, AJ Lee, and E Wang (2014). "Fast computation of reconciled forecasts for hierarchical and grouped time series". Working paper 17/14. Department of Econometrics & Business Statistics, Monash University.
       - RJ Hyndman, AJ Lee, and E Wang (2014). hts: Hierarchical and grouped time series. cran.r-project.org/package=hts.
       - RJ Hyndman and G Athanasopoulos (2014). Forecasting: principles and practice. OTexts. OTexts.org/fpp/.
       Papers and R code: robjhyndman.com
       Email: Rob.Hyndman@monash.edu