Grouped time-series forecasting: Application to regional infant mortality counts
1. Motivation Data Method Result Conclusion
Han Lin Shang and Peter W. F. Smith
University of Southampton
Motivation
1 Multiple time series can be disaggregated by
hierarchical/grouped structure
2 Hyndman, Ahmed, Athanasopoulos and Shang (2010, CSDA)
considered four hierarchical methods, but did not consider the
construction of prediction intervals for hierarchical/grouped
time series
3 Present a parametric bootstrap method to construct prediction
intervals
4 Apply to infant mortality forecasting
Data
Consider regional infant mortality counts from 1933 to 2003,
available in the hts package
[Figure: map of Australia labelling the eight regions and their capital cities: Western Australia (Perth), South Australia (Adelaide), Northern Territory (Darwin), Queensland (Brisbane), New South Wales (Sydney), Victoria (Melbourne), Tasmania (Hobart), Capital Territory (Canberra)]
Data
1 Hierarchical structure is expressed below
   Level             Number of series
   Australia          1
   Gender             2
   State              8
   Gender × State    16
   Total             27
2 Since multiple time series can be disaggregated by state first
or gender first, our data are called grouped time series
3 Forecast regional infant mortality count from 2004 to 2013
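The grouped structure in the table can be written as a matrix that maps the 16 bottom-level (gender × state) series to all 27 series. The sketch below is illustrative only: the name `S` (the "summing matrix" of the hierarchical forecasting literature), the label ordering, and the use of NumPy are assumptions, not taken from the slides or from the hts package.

```python
import numpy as np

# State and gender labels follow the slides; their ordering here is an assumption.
states = ["NSW", "VIC", "QLD", "SA", "WA", "NT", "ACT", "TAS"]
genders = ["Female", "Male"]

# Bottom level: one series per gender-by-state combination (2 x 8 = 16).
bottom = [(g, s) for g in genders for s in states]

rows, labels = [], []

# Australia: the sum of every bottom-level series.
rows.append(np.ones(len(bottom)))
labels.append("Australia")

# Gender level: sum over states within each gender.
for g in genders:
    rows.append(np.array([1.0 if b[0] == g else 0.0 for b in bottom]))
    labels.append(g)

# State level: sum over genders within each state.
for s in states:
    rows.append(np.array([1.0 if b[1] == s else 0.0 for b in bottom]))
    labels.append(s)

# Gender x State level: each bottom series maps to itself.
for i, (g, s) in enumerate(bottom):
    row = np.zeros(len(bottom))
    row[i] = 1.0
    rows.append(row)
    labels.append(f"{s}_{g}")

S = np.vstack(rows)  # shape (27, 16): 1 + 2 + 8 + 16 rows
print(S.shape)
```

Because the same bottom-level series feed both the gender rows and the state rows, the structure is grouped rather than strictly hierarchical.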
Hierarchical tree
Total
  Male:   VIC NSW QLD SA WA ACT NT TAS
  Female: VIC NSW QLD SA WA ACT NT TAS
Figure: A two-level hierarchical tree diagram.
Bottom-up method
1 Generate base (or independent) forecasts for each series at the
bottom level
2 Aggregate these upwards to produce revised forecasts
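3 E.g., $\bar{Y}_{\text{Male},h} = \bar{Y}^{\text{VIC}}_{\text{Male},h} + \cdots + \bar{Y}^{\text{NT}}_{\text{Male},h}$,
$\bar{Y}_{\text{Total},h} = \bar{Y}_{\text{Male},h} + \bar{Y}_{\text{Female},h}$, where $h$ represents the forecast horizon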
4 At the bottom level, base forecasts = revised forecasts
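The aggregation step can be sketched as follows. This is a minimal illustration, not the hts implementation: the function name `bottom_up`, the dictionary layout, and the numbers are all made up for the example.

```python
import numpy as np

def bottom_up(bottom_forecasts):
    """Aggregate bottom-level forecasts upwards.

    bottom_forecasts maps (gender, state) -> forecasts over horizons 1..H.
    Returns revised forecasts by gender, by state, and for the total.
    """
    by_gender, by_state = {}, {}
    total = None
    for (gender, state), f in bottom_forecasts.items():
        f = np.asarray(f, dtype=float)
        by_gender[gender] = by_gender.get(gender, 0.0) + f
        by_state[state] = by_state.get(state, 0.0) + f
        total = f.copy() if total is None else total + f
    return by_gender, by_state, total

# Made-up forecasts for two states and horizons h = 1, 2 (illustration only).
demo = {
    ("Male", "VIC"): [10.0, 9.0],
    ("Male", "NSW"): [12.0, 11.0],
    ("Female", "VIC"): [8.0, 7.0],
    ("Female", "NSW"): [9.0, 8.0],
}
by_gender, by_state, total = bottom_up(demo)
print(by_gender["Male"], by_gender["Female"], total)
```

The revised forecasts are coherent by construction: the gender forecasts add up to the total, and so do the state forecasts.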
Bottom-up in action
[Figure: time series plots of the counts against year (axis ticks 1940 to 2000) in four panels. Level 0: total. Level 1: female, male. Level 2: nsw, vic, qld, sa, wa, nt, actot, tas. Level 3: nsw_f, vic_f, qld_f, sa_f, wa_f, nt_f, actot_f, tas_f, nsw_m, vic_m, qld_m, sa_m, wa_m, nt_m, actot_m, tas_m.]
Point forecast accuracy: data design
1 For each series at the bottom level, select the optimal exponential
smoothing model based on an information criterion, such as AIC
(by default) or BIC
2 Re-estimate the parameters of the model using a rolling window
approach, with an initial fitting period of 1933 to 1993
3 Produce one- to ten-step-ahead forecasts
4 Iterate the process, increasing the training period by one year
each time until 2003
5 This gives 10 one-step-ahead forecasts, 9 two-step-ahead
forecasts, ..., and 1 ten-step-ahead forecast
6 The advantage of the rolling window approach is that forecast
accuracy can be assessed for each horizon
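The evaluation design above can be sketched in a few lines. The code is an illustration under stated assumptions: `ses_forecast` is a hand-written simple exponential smoothing with a fixed smoothing parameter, standing in for the model selected by AIC, and the series is synthetic rather than the infant mortality data.

```python
import numpy as np

def ses_forecast(y, alpha=0.5, h=10):
    """Simple exponential smoothing with a flat h-step-ahead forecast.
    Stand-in for the selected exponential smoothing model (alpha is fixed)."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return np.full(h, level)

def rolling_origin(y, years, first_end=1993, last_year=2003, max_h=10):
    """Collect forecast errors by horizon, growing the training period
    by one year at each iteration."""
    errors = {h: [] for h in range(1, max_h + 1)}
    idx = {yr: i for i, yr in enumerate(years)}
    for end in range(first_end, last_year):
        train = y[: idx[end] + 1]          # data from 1933 up to `end`
        fc = ses_forecast(train, h=max_h)  # refit on every window
        for h in range(1, max_h + 1):
            target = end + h
            if target <= last_year:        # only score years we observed
                errors[h].append(y[idx[target]] - fc[h - 1])
    return errors

years = list(range(1933, 2004))
rng = np.random.default_rng(0)
# Synthetic downward-trending series, for illustration only.
y = 1000 - 10 * np.arange(len(years)) + rng.normal(0, 20, len(years))

errors = rolling_origin(y, years)
counts = [len(errors[h]) for h in range(1, 11)]
print(counts)  # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```

The counts reproduce the 10 one-step-ahead down to 1 ten-step-ahead forecasts described in the list.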
Point forecast accuracy: evaluation
To compare point forecast accuracy between the base and
bottom-up forecasts for all series, calculate the mean absolute
percentage error,
$$\mathrm{MAPE}_h = \frac{1}{(11-h) \times m} \sum_{i=n}^{n+(10-h)} \sum_{j=1}^{m} \left| \frac{Y_{i+h,j} - \hat{Y}_{i+h,j}}{Y_{i+h,j}} \right|,$$
where $\hat{Y}_{i+h,j}$ is the forecast of $Y_{i+h,j}$, $m$ represents the total number of time series in the hierarchy,
and $h = 1, 2, \ldots, 10$
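For a single horizon, the error measure amounts to averaging absolute relative errors over all forecast origins and all series. The sketch below assumes the holdout values and forecasts are already arranged as arrays of shape (number of origins, m); the function name `mape` and the numbers are illustrative. It returns a fraction, so multiply by 100 if a percentage scale is wanted.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error for one horizon h.

    actual, forecast: arrays of shape (11 - h, m) with holdout values and
    their h-step-ahead forecasts. Returns a fraction, not a percentage.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)))

# Two forecast origins, two series; every forecast is off by 10%.
actual = np.array([[100.0, 50.0], [80.0, 40.0]])
forecast = np.array([[110.0, 45.0], [72.0, 44.0]])
value = mape(actual, forecast)
print(round(value, 6))  # 0.1
```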
Point forecast result
         Level 0         Level 1         Level 2         Level 3
h        Base    BU      Base    BU      Base    BU      Base    BU
1        4.26    5.35    5.59    5.72    14.76   14.03   20.98   20.98
2        6.25    5.96    7.38    6.23    16.32   16.20   25.50   25.50
3        8.27    6.51    10.26   6.86    18.95   18.95   30.55   30.55
4        11.94   10.73   14.71   10.34   22.40   22.11   34.55   34.55
5        19.02   9.37    16.48   10.47   24.87   25.96   39.58   39.58
6        16.46   6.16    17.60   6.18    27.75   27.74   41.99   41.99
7        19.59   9.46    19.55   9.58    31.66   34.43   47.57   47.57
8        20.30   9.74    24.50   10.03   34.61   39.32   54.78   54.78
9        28.71   11.62   29.72   12.02   33.41   40.38   52.97   52.97
10       32.40   27.55   32.42   26.15   37.66   45.66   61.32   61.32
Mean     16.72   10.25   17.82   10.36   26.24   28.48   40.98   40.98
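Averaged over horizons, the bottom-up (BU) method outperforms the independent (base) forecasts
(without group structure) at the top two levels (Levels 0 and 1), but not at the state level (Level 2).
At the bottom level (Level 3) the two sets of forecasts coincide.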
Construction of interval forecasts
1 Provide pointwise interval forecasts for assessing uncertainty
2 Proposed method fits within the framework of parametric
bootstrapping
3 Draw bootstrap samples from the fitted exponential smoothing
model for each series at the bottom level
4 For each bootstrap sample, we construct the group structure and
obtain point forecasts
5 Based on the bootstrapped forecasts, we assess the variability of
the point forecasts by constructing prediction intervals
6 Computationally, the simulate.ets function in the forecast
package was used
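One simplified reading of these steps is sketched below. It is not the `simulate.ets` implementation: it assumes a simple exponential smoothing model (ETS(A,N,N)) with a fixed smoothing parameter and Gaussian errors that are independent across series, treats each simulated future path as a bootstrapped forecast, and aggregates only to the total. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ses(y, alpha):
    """Run ETS(A,N,N) through y; return the final level and residual s.d."""
    level = y[0]
    resid = []
    for obs in y[1:]:
        resid.append(obs - level)               # one-step-ahead error
        level = level + alpha * (obs - level)
    return level, float(np.std(resid, ddof=1))

def simulate_paths(level, sigma, alpha, h, n_boot, rng):
    """Simulate n_boot future sample paths of length h from the fitted model."""
    paths = np.empty((n_boot, h))
    lev = np.full(n_boot, level, dtype=float)
    for step in range(h):
        eps = rng.normal(0.0, sigma, n_boot)
        paths[:, step] = lev + eps              # y_t = l_{t-1} + e_t
        lev = lev + alpha * eps                 # l_t = l_{t-1} + alpha * e_t
    return paths

def bootstrap_total_interval(bottom_series, h=10, n_boot=1000,
                             alpha=0.3, coverage=0.8, seed=0):
    """Pointwise prediction interval for the total, built bottom-up."""
    rng = np.random.default_rng(seed)
    sims = np.stack([
        simulate_paths(*fit_ses(np.asarray(y, dtype=float), alpha),
                       alpha, h, n_boot, rng)
        for y in bottom_series
    ])                                          # (n_series, n_boot, h)
    total = sims.sum(axis=0)                    # aggregate each draw upwards
    lower, upper = np.quantile(
        total, [(1 - coverage) / 2, (1 + coverage) / 2], axis=0)
    return lower, upper

# Synthetic bottom-level series, for illustration only.
rng = np.random.default_rng(1)
series = [200 - np.arange(71) + rng.normal(0, 10, 71) for _ in range(4)]
lower, upper = bootstrap_total_interval(series)
```

Taking quantiles of the aggregated draws, rather than adding up per-series interval bounds, is what lets the interval at an upper level reflect the variability of the summed series.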
Demonstration of interval forecasts
Present 80% pointwise prediction interval of the regional infant
mortality counts from 2004 to 2013 at the top two levels
[Figure: two panels plotting Count against Year (axis ticks 1940 to 2000). (a) Level 0: Total. (b) Level 1: Male and Female.]
Infant mortality counts are forecast to continue decreasing. The
variability of the forecasts is higher for males than for females
Interval forecast accuracy
1 Given a sample path $[Y_1, \ldots, Y_n]$, where $Y_t$ is a column vector
of values across the entire hierarchy, we constructed the
h-step-ahead interval forecasts
2 Let $L_{n+h|n}(p)$ and $U_{n+h|n}(p)$ be the lower and upper bounds,
where $p$ symbolizes the nominal coverage probability
3 Conditioning on holdout data, the indicator variable is
$$I_{n+h,j} = \begin{cases} 1 & \text{if } Y_{n+h,j} \in [L_{n+h|n,j}(p),\, U_{n+h|n,j}(p)] \\ 0 & \text{if } Y_{n+h,j} \notin [L_{n+h|n,j}(p),\, U_{n+h|n,j}(p)] \end{cases} \qquad j = 1, \ldots, m$$
Empirical coverage probability
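Empirical coverage probability (ECP) is defined as the proportion of
holdout observations that fall inside their prediction intervals,
$$\mathrm{ECP}_h = \frac{\sum_{l=n}^{n+(10-h)} \sum_{j=1}^{m} I_{l+h,j}}{m \times (11-h)}, \qquad h = 1, \ldots, 10$$

h     1     2     3     4     5     6     7     8     9     10
ECP   0.71  0.72  0.75  0.69  0.64  0.73  0.72  0.69  0.72  0.74
Table: Empirical coverage probability at a nominal coverage of 0.8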
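Computing coverage for one horizon amounts to counting how many holdout observations land inside their intervals. The sketch below assumes the holdout values and interval bounds are stored in arrays of a common shape; the function name and the numbers are illustrative.

```python
import numpy as np

def empirical_coverage(actual, lower, upper):
    """Share of holdout values lying inside [lower, upper]."""
    actual, lower, upper = (np.asarray(a, dtype=float)
                            for a in (actual, lower, upper))
    inside = (actual >= lower) & (actual <= upper)  # the indicator variable
    return float(inside.mean())

# Four holdout values; the second one falls outside its interval.
ecp = empirical_coverage(
    actual=[1.0, 5.0, 9.0, 3.0],
    lower=[0.0, 6.0, 8.0, 2.0],
    upper=[2.0, 7.0, 10.0, 4.0],
)
print(ecp)  # 0.75
```

An ECP below the nominal value, as in the table, indicates intervals that are somewhat too narrow.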
Hypothesis testing: interval forecast accuracy
1 To test whether the ECP differs from the nominal coverage
probability, we computed log likelihood-ratio test statistics
(see Christoffersen 1998 for more details)
2 Christoffersen (1998) proposed a test for unconditional
coverage, a test for independence of the indicator sequence, and a
joint test of coverage and independence (conditional coverage)
3 At the nominal coverage probability of 0.8, the log likelihood-ratio
statistics are
h     1     2     3     4     5     6     7     8     9     10
LR    5.73  4.55  1.87  3.24  9.23  5.28  5.94  4.03  2.55  5.01
Table: Critical value is 5.99 at the 5% level of significance
4 At the 5% level of significance, only 1 of the 10 statistics
(h = 5) exceeds the critical value
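The three statistics can be sketched for a generic 0/1 indicator sequence as below. This is an illustration, not the code behind the table: how the indicators were ordered across series and forecast origins is not stated in the slides, the function names are hypothetical, and the joint statistic is taken here as the sum of the other two.

```python
import math

def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def christoffersen_lr(hits, p):
    """Return (LR_uc, LR_ind, LR_cc) for a 0/1 sequence and nominal coverage p."""
    hits = [int(v) for v in hits]
    n1 = sum(hits)
    n0 = len(hits) - n1
    pi_hat = n1 / len(hits)

    # Unconditional coverage: nominal p against the observed hit rate.
    lr_uc = -2 * ((_xlogy(n0, 1 - p) + _xlogy(n1, p))
                  - (_xlogy(n0, 1 - pi_hat) + _xlogy(n1, pi_hat)))

    # Transition counts n[a][b]: indicator value a followed by value b.
    n = [[0, 0], [0, 0]]
    for a, b in zip(hits[:-1], hits[1:]):
        n[a][b] += 1
    pi01 = n[0][1] / (n[0][0] + n[0][1]) if (n[0][0] + n[0][1]) else 0.0
    pi11 = n[1][1] / (n[1][0] + n[1][1]) if (n[1][0] + n[1][1]) else 0.0
    pi2 = (n[0][1] + n[1][1]) / (len(hits) - 1)

    # Independence: first-order Markov chain against an i.i.d. sequence.
    ll_markov = (_xlogy(n[0][0], 1 - pi01) + _xlogy(n[0][1], pi01)
                 + _xlogy(n[1][0], 1 - pi11) + _xlogy(n[1][1], pi11))
    ll_indep = (_xlogy(n[0][0] + n[1][0], 1 - pi2)
                + _xlogy(n[0][1] + n[1][1], pi2))
    lr_ind = -2 * (ll_indep - ll_markov)

    return lr_uc, lr_ind, lr_uc + lr_ind

# Chi-square(2) 95% quantile in closed form: -2 * ln(0.05), about 5.99.
critical = -2 * math.log(0.05)

hits = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]   # made-up sequence with 80% coverage
lr_uc, lr_ind, lr_cc = christoffersen_lr(hits, p=0.8)
print(lr_cc > critical)
```

The critical value of 5.99 quoted in the table matches a chi-square distribution with two degrees of freedom, which is the reference distribution of the joint test.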
Conclusion
1 Revisited the bottom-up method
2 Applied it to the regional infant mortality counts in Australia
3 Performed evaluation of point forecast accuracy
4 Proposed a parametric bootstrap method to construct
prediction intervals
5 Performed evaluation of interval forecast accuracy
6 Carried out hypothesis testing of interval forecast accuracy
Future research
1 Parametric bootstrapping is expected to work for other
hierarchical/grouped time series forecasting methods, such as
top-down methods
2 Modeling age-specific mortality counts hierarchically and
coherently
3 Extension from mortality count to mortality rate
Thank you
A draft is available upon request from H.Shang@soton.ac.uk