Grouped time-series forecasting: Application to regional infant mortality counts

Han Lin Shang and Peter W. F. Smith
University of Southampton

Motivation
1 Multiple time series can be disaggregated by a
hierarchical/grouped structure
2 Hyndman, Ahmed, Athanasopoulos and Shang (2010, CSDA)
considered four hierarchical methods, but did not consider the
construction of prediction intervals for hierarchical/grouped
time series
3 Present a parametric bootstrap method to construct prediction
intervals
4 Apply it to infant mortality forecasting

Data
Consider regional infant mortality counts from 1933 to 2003,
available in the hts package
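Figure: Map of Australia showing the eight states and territories
(Western Australia, South Australia, Northern Territory, Queensland,
New South Wales, Victoria, Tasmania, Capital Territory) and their
capital cities (Perth, Adelaide, Darwin, Brisbane, Sydney, Melbourne,
Hobart, Canberra).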

Data
1 The hierarchical structure is summarised below
   Level             Number of series
   Australia          1
   Gender             2
   State              8
   Gender × State    16
   Total             27
2 Since the series can be disaggregated by state first
or by gender first, our data are called grouped time series
3 Forecast regional infant mortality counts from 2004 to 2013

Hierarchical tree
Total
  Male:   VIC NSW QLD SA WA ACT NT TAS
  Female: VIC NSW QLD SA WA ACT NT TAS
Figure: A two-level hierarchical tree diagram.

Bottom-up method
1 Generate base (or independent) forecasts for each series at the
bottom level
2 Aggregate these upwards to produce revised forecasts
3 E.g., $\bar{Y}_{\text{Male},h} = \bar{Y}^{\text{VIC}}_{\text{Male},h} + \cdots + \bar{Y}^{\text{NT}}_{\text{Male},h}$ and
$\bar{Y}_{\text{Total},h} = \bar{Y}_{\text{Male},h} + \bar{Y}_{\text{Female},h}$, where $h$ represents the forecast horizon
4 At the bottom level, base forecasts = revised forecasts
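
The aggregation step can be illustrated with a minimal sketch. It is not the hts implementation: the function name `bottom_up`, the key layout and the toy forecast values are assumptions made for illustration only. It takes one bottom-level forecast per sex and state for a single horizon and sums upwards to give all 27 series.

```python
STATES = ["VIC", "NSW", "QLD", "SA", "WA", "ACT", "NT", "TAS"]
SEXES = ["Male", "Female"]


def bottom_up(bottom):
    """Aggregate bottom-level forecasts, keyed by (sex, state), up the groups.

    Returns a dict with 16 bottom-level, 8 state, 2 sex and 1 total forecast.
    """
    # Bottom level: revised forecasts equal the base forecasts.
    revised = {f"{sex}_{state}": bottom[(sex, state)]
               for sex in SEXES for state in STATES}
    # Sum over states within each sex.
    for sex in SEXES:
        revised[sex] = sum(bottom[(sex, state)] for state in STATES)
    # Sum over sexes within each state.
    for state in STATES:
        revised[state] = sum(bottom[(sex, state)] for sex in SEXES)
    # The total is the sum of the two sex-level series.
    revised["Total"] = revised["Male"] + revised["Female"]
    return revised


# Made-up base forecasts (10 deaths for every sex-by-state series).
toy_bottom = {(sex, state): 10.0 for sex in SEXES for state in STATES}
toy_revised = bottom_up(toy_bottom)
print(len(toy_revised), toy_revised["Total"])
```

Because every upper-level value is a plain sum of bottom-level values, the revised forecasts add up consistently whichever way the data are grouped (by state first or by sex first).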

Bottom-up in action
Figure: Time-series plots by year (1940 to 2000) for each level of
the hierarchy. Level 0: total. Level 1: female, male. Level 2: nsw,
vic, qld, sa, wa, nt, actot, tas. Level 3: each state by sex
(nsw_f, ..., tas_f, nsw_m, ..., tas_m).

Point forecast accuracy: data design
1 For each series at the bottom level, select the optimal
exponential smoothing model using an information criterion,
such as AIC (the default) or BIC
2 Re-estimate the model parameters using a rolling window
approach, with an initial fitting period of 1933 to 1993
3 Produce one- to ten-step-ahead forecasts
4 Iterate the process, increasing the training period by one
year at a time until 2003
5 This gives 10 one-step-ahead forecasts, 9 two-step-ahead
forecasts, ..., and 1 ten-step-ahead forecast
6 The advantage of the rolling window approach is that forecast
accuracy can be assessed for each horizon
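
The forecast counts in item 5 can be checked with a short sketch. It assumes that the training period ends in each year from 1993 to 2002, so that at least one observed year remains for evaluation, and that a forecast is only evaluated when its target year is no later than 2003. The constant and function names are hypothetical.

```python
FIRST_TRAIN_END = 1993   # end of the initial fitting period
LAST_OBSERVED = 2003     # last year with observed data
MAX_HORIZON = 10


def forecasts_per_horizon():
    """Count how many h-step-ahead forecasts can be evaluated, for each h."""
    counts = {h: 0 for h in range(1, MAX_HORIZON + 1)}
    # Expand the training window by one year at a time (assumed to stop at 2002).
    for train_end in range(FIRST_TRAIN_END, LAST_OBSERVED):
        for h in range(1, MAX_HORIZON + 1):
            # Keep the forecast only if the target year has been observed.
            if train_end + h <= LAST_OBSERVED:
                counts[h] += 1
    return counts


print(forecasts_per_horizon())
```

Under these assumptions, horizon h is evaluated 11 − h times, which is the divisor that appears in the accuracy measures.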

Point forecast accuracy: evaluation
To compare point forecast accuracy between the base and
bottom-up forecasts for all series, calculate the mean absolute
percentage error,
$$\mathrm{MAPE}_h = \frac{1}{(11-h) \times m} \sum_{i=n}^{n+(10-h)} \sum_{j=1}^{m} \left| \frac{Y_{i+h,j} - \hat{Y}_{i+h,j}}{Y_{i+h,j}} \right|,$$
where $\hat{Y}_{i+h,j}$ denotes the forecast of $Y_{i+h,j}$, $m$ represents the
total number of time series in the hierarchy, and $h = 1, 2, \ldots, 10$
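
As a rough illustration of the formula, the sketch below averages the absolute relative errors over forecast origins and series for a single horizon. The function name and the toy numbers are invented. The formula as written has no factor of 100, so the result is a proportion; multiply by 100 to express it as a percentage.

```python
def mape(actual, forecast):
    """Mean absolute percentage error for one horizon.

    actual[i][j] and forecast[i][j] hold the observed and forecast values
    for forecast origin i and series j. The result is a proportion.
    """
    terms = [
        abs((a - f) / a)
        for actual_row, forecast_row in zip(actual, forecast)
        for a, f in zip(actual_row, forecast_row)
    ]
    return sum(terms) / len(terms)


# Made-up values: 2 forecast origins, 2 series.
toy_actual = [[100.0, 200.0], [50.0, 400.0]]
toy_forecast = [[110.0, 180.0], [45.0, 400.0]]
toy_mape = mape(toy_actual, toy_forecast)
print(toy_mape)
```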

Point forecast result
        Level 0        Level 1        Level 2        Level 3
h         Base     BU    Base     BU    Base     BU    Base     BU
1         4.26   5.35    5.59   5.72   14.76  14.03   20.98  20.98
2         6.25   5.96    7.38   6.23   16.32  16.20   25.50  25.50
3         8.27   6.51   10.26   6.86   18.95  18.95   30.55  30.55
4        11.94  10.73   14.71  10.34   22.40  22.11   34.55  34.55
5        19.02   9.37   16.48  10.47   24.87  25.96   39.58  39.58
6        16.46   6.16   17.60   6.18   27.75  27.74   41.99  41.99
7        19.59   9.46   19.55   9.58   31.66  34.43   47.57  47.57
8        20.30   9.74   24.50  10.03   34.61  39.32   54.78  54.78
9        28.71  11.62   29.72  12.02   33.41  40.38   52.97  52.97
10       32.40  27.55   32.42  26.15   37.66  45.66   61.32  61.32
Mean     16.72  10.25   17.82  10.36   26.24  28.48   40.98  40.98
On average, the bottom-up (BU) method outperforms the independent
(base) forecasts, which ignore the group structure, at the top two
levels (Levels 0 and 1), but not at the state level (Level 2)

Construction of interval forecasts
1 Provide pointwise interval forecasts to assess forecast uncertainty
2 The proposed method fits within the framework of parametric
bootstrapping
3 Draw bootstrap samples from the fitted exponential smoothing
model for each series at the bottom level
4 For each bootstrap sample, construct the group structure and
obtain point forecasts
5 Based on the bootstrapped forecasts, assess the variability of
the point forecasts by constructing prediction intervals
6 Computationally, the simulate.ets function in the forecast
package was used
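
The steps above can be sketched in simplified form. This is not the authors' implementation and does not call simulate.ets: it swaps in a hand-coded simple exponential smoothing model with Gaussian errors and a fixed smoothing parameter, and all function names, parameter defaults and toy data are assumptions. It simulates bootstrap series from the fitted bottom-level models, recomputes the point forecasts, aggregates them bottom-up to the total, and takes empirical quantiles.

```python
import random
import statistics


def ses_fit(y, alpha):
    """Run simple exponential smoothing; return (final level, residual sd)."""
    level = y[0]
    residuals = []
    for obs in y[1:]:
        err = obs - level
        residuals.append(err)
        level += alpha * err
    return level, statistics.pstdev(residuals)


def simulate_series(start_level, sigma, alpha, n, rng):
    """Simulate n observations from the fitted model with Gaussian errors."""
    level, path = start_level, []
    for _ in range(n):
        err = rng.gauss(0.0, sigma)
        path.append(level + err)
        level += alpha * err
    return path


def bootstrap_interval(bottom, alpha=0.3, n_boot=500, coverage=0.8, seed=1):
    """Prediction interval for the total of the bottom-level point forecasts."""
    rng = random.Random(seed)
    fits = {name: ses_fit(y, alpha) for name, y in bottom.items()}
    totals = []
    for _ in range(n_boot):
        total = 0.0
        for name, y in bottom.items():
            _, sigma = fits[name]
            boot = simulate_series(y[0], sigma, alpha, len(y), rng)
            # The point forecast from this model is the final smoothed level.
            level, _ = ses_fit(boot, alpha)
            total += level  # bottom-up aggregation to the top level
        totals.append(total)
    totals.sort()
    tail = (1.0 - coverage) / 2.0
    lower = totals[int(tail * n_boot)]
    upper = totals[int((1.0 - tail) * n_boot) - 1]
    return lower, upper


# Made-up bottom-level series (two series only, for brevity).
toy_series = {
    "vic_f": [52.0, 48.0, 50.0, 45.0, 44.0, 41.0, 40.0, 38.0],
    "vic_m": [66.0, 61.0, 63.0, 58.0, 55.0, 54.0, 50.0, 49.0],
}
lower, upper = bootstrap_interval(toy_series)
print(lower, upper)
```

Intervals for other levels would follow the same pattern, with the bootstrapped forecasts summed over the relevant subset of bottom-level series.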

Demonstration of interval forecasts
Present 80% pointwise prediction intervals for the regional infant
mortality counts from 2004 to 2013 at the top two levels
Figure: Counts and 80% prediction intervals by year (horizontal axis:
Year; vertical axis: Count). (a) Level 0: Total. (b) Level 1: Male
and Female.
Infant mortality counts are forecast to continue decreasing. The
male forecasts show greater variability than the female forecasts

Interval forecast accuracy
1 Given a sample path $[Y_1, \ldots, Y_n]$, where $Y_t$ is a column vector
of values across the entire hierarchy, we constructed the
h-step-ahead interval forecasts
2 Let $L_{n+h|n}(p)$ and $U_{n+h|n}(p)$ be the lower and upper bounds,
where $p$ denotes the nominal coverage probability

Empirical coverage probability
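Empirical coverage probability (ECP) is defined as
$$\mathrm{ECP}_h = 1 - \frac{\sum_{l=n}^{n+(10-h)} \sum_{j=1}^{m} I_{l+h,j}}{m \times (11-h)}, \qquad h = 1, \ldots, 10,$$
where $I_{l+h,j}$ equals 1 if $Y_{l+h,j}$ lies outside its prediction
interval and 0 otherwise
h    1    2    3    4    5    6    7    8    9    10
ECP  0.71 0.72 0.75 0.69 0.64 0.73 0.72 0.69 0.72 0.74
Table: Empirical coverage probability at a nominal coverage probability of 0.8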
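
A minimal sketch of the ECP calculation for one horizon follows. It assumes the observations and interval bounds have been flattened into plain lists over all forecast origins and series; the function name and toy values are invented for illustration.

```python
def empirical_coverage(actual, lower, upper):
    """One minus the share of observations falling outside their interval."""
    outside = 0
    count = 0
    for y, lo, hi in zip(actual, lower, upper):
        # Indicator equals 1 when the observation misses the interval.
        outside += int(y < lo or y > hi)
        count += 1
    return 1.0 - outside / count


# Made-up values: one of five observations lies above its upper bound.
toy_obs = [10.0, 12.0, 9.0, 15.0, 11.0]
toy_lower = [8.0] * 5
toy_upper = [13.0] * 5
toy_ecp = empirical_coverage(toy_obs, toy_lower, toy_upper)
print(toy_ecp)
```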

Hypothesis testing: interval forecast accuracy
1 To test whether the ECP differs from the nominal coverage
probability, we computed log likelihood-ratio test statistics
(see Christoffersen 1998 for more details)
2 Christoffersen (1998) proposed a test for unconditional
coverage, a test for independence of the indicator sequence, and a
joint test of conditional coverage and independence
3 At the nominal coverage probability of 0.8, the log likelihood-ratio
statistics are
h    1    2    3    4    5    6    7    8    9    10
LR   5.73 4.55 1.87 3.24 9.23 5.28 5.94 4.03 2.55 5.01
Table: Critical value is 5.99 at the 5% level of significance
4 At the 5% level of significance, only 1 of the 10 statistics
(h = 5) is greater than the critical value
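
For orientation, the sketch below computes only the unconditional coverage component of Christoffersen's (1998) likelihood-ratio tests, which compares the observed share of covered observations with the nominal probability. It is not the authors' code, and the function name and toy data are assumptions. This component is referred to a chi-square distribution with 1 degree of freedom (critical value 3.84 at the 5% level); the critical value of 5.99 quoted above corresponds to 2 degrees of freedom, which matches the joint test rather than this simpler one.

```python
import math


def lr_unconditional_coverage(covered, nominal):
    """Likelihood-ratio statistic for unconditional coverage.

    covered is a sequence of 0/1 flags (1 = observation inside its interval);
    nominal is the nominal coverage probability, strictly between 0 and 1.
    """
    n = len(covered)
    n1 = sum(covered)      # number of covered observations
    n0 = n - n1            # number of misses
    pi_hat = n1 / n        # observed coverage

    def loglik(p):
        # Binomial log-likelihood, skipping terms with zero count.
        value = 0.0
        if n1:
            value += n1 * math.log(p)
        if n0:
            value += n0 * math.log(1.0 - p)
        return value

    return -2.0 * (loglik(nominal) - loglik(pi_hat))


# Made-up indicator sequences.
lr_match = lr_unconditional_coverage([1] * 8 + [0] * 2, 0.8)   # coverage 0.8
lr_low = lr_unconditional_coverage([1] * 5 + [0] * 5, 0.8)     # coverage 0.5
print(lr_match, lr_low)
```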

Conclusion
1 Revisited the bottom-up method
2 Applied it to the regional infant mortality counts in Australia
3 Performed evaluation of point forecast accuracy
4 Proposed a parametric bootstrap method to construct
prediction intervals
5 Performed evaluation of interval forecast accuracy
6 Carried out hypothesis testing of interval forecast accuracy

Future research
1 Parametric bootstrapping is expected to work for other
hierarchical/grouped time series forecasting methods, such as
top-down methods
2 Modeling age-specific mortality counts hierarchically and
coherently
3 Extension from mortality counts to mortality rates

Thank you
A draft is available upon request from H.Shang@soton.ac.uk
