# Paris2012 session1

State Space Model by Pr. Koopman


1. Forecasting in State Space: theory and practice

    Siem Jan Koopman, http://personal.vu.nl/s.j.koopman

    Department of Econometrics, VU University Amsterdam; Tinbergen Institute, 2012
2. Program

    Lectures:

    - Introduction to UC models
    - State space methods
    - Forecasting time series with different components
    - Practice of forecasting with illustrations

    Exercises and assignments will be part of the course.
3. Time Series

    A time series is a set of observations $y_t$, each one recorded at a specific time $t$. The observations are ordered over time. We assume that we have $n$ observations, $t = 1, \ldots, n$.

    Examples of time series are:

    - Number of cars sold each year
    - Gross Domestic Product of a country
    - Stock prices during one day
    - Number of firm defaults

    Our purpose is to identify and to model the serial or "dynamic" correlation structure in the time series. Time series analysis may be relevant for economic policy, financial decision making and forecasting.
4. Example: Nile data

    [Figure: Nile data, 1870 to 1970.]
5. Example: GDP growth, quarter by quarter

    [Figure: quarterly GDP growth, 2000 to 2012.]
6. Example: winner boat races Cambridge/Oxford

    [Figure: boat race winner series on a 0 to 1 scale, 1840 to 2000.]
7. Example: US monthly unemployment
8. Sources for time series data

    Data sources:

    - US economics: http://research.stlouisfed.org/fred2/
    - DK book data: http://www.ssfpack.com/dkbook.html
    - Financial data: Datastream, Yahoo Finance
10. White noise processes

    The simplest example of a stationary process is a white noise (WN) process, which we usually denote as $\varepsilon_t$. A white noise process is a sequence of uncorrelated random variables, each with zero mean and constant variance $\sigma_\varepsilon^2$:

    $$\varepsilon_t \sim \mathrm{WN}(0, \sigma_\varepsilon^2).$$

    The autocovariance function is equal to zero for lags $h > 0$:

    $$\gamma_Y(h) = \begin{cases} \sigma_\varepsilon^2 & \text{if } h = 0, \\ 0 & \text{if } h \neq 0. \end{cases}$$
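
The white noise definition can be checked numerically. The sketch below assumes Gaussian white noise with unit variance and an arbitrary seed; `sample_acf` is a helper written for this illustration and is not part of the slides.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a 1-D series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    c0 = np.dot(d, d) / n  # sample variance (lag-0 autocovariance)
    return np.array([np.dot(d[h:], d[:n - h]) / n / c0
                     for h in range(max_lag + 1)])

rng = np.random.default_rng(0)          # seed chosen arbitrarily
eps = rng.normal(0.0, 1.0, size=500)    # assumed: Gaussian WN, sigma_eps = 1
acf = sample_acf(eps, 50)

# Lag 0 is 1 by construction; the other lags should scatter around zero,
# roughly within +/- 2 / sqrt(500).
print(acf[:5])
```
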
11. White noise realisations

    [Figure: White Noise, 500 observations.]
12. White noise ACF and SACF

    [Figure, four panels: theoretical ACF (max lag = 50), theoretical ACF (max lag = 500), sample ACF (n = 50), sample ACF (n = 500).]
13. Random Walk processes

    If $\varepsilon_1, \varepsilon_2, \ldots$ come from a white noise process with variance $\sigma^2$, then the process $\{Y_t\}$ with

    $$Y_t = \varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_t \quad \text{for } t = 1, 2, \ldots$$

    is called a random walk. A recursive way to define a random walk is:

    $$Y_t = Y_{t-1} + \varepsilon_t \quad \text{for } t = 2, 3, \ldots, \qquad Y_1 = \varepsilon_1.$$
14. Random Walk properties I

    A random walk is not stationary, because the variance of $Y_t$ is time-varying:

    $$E(Y_t) = E(\varepsilon_1 + \ldots + \varepsilon_t) = 0,$$
    $$\mathrm{Var}(Y_t) = E(Y_t^2) = E[(\varepsilon_1 + \ldots + \varepsilon_t)^2] = t\sigma^2.$$

    The autocovariance function is equal to:

    $$\gamma(t, t-h) = E(Y_t Y_{t-h}) = E\Big[\Big(\sum_{j=1}^{t-h} \varepsilon_j + \sum_{j=t-h+1}^{t} \varepsilon_j\Big)\Big(\sum_{j=1}^{t-h} \varepsilon_j\Big)\Big] = (t-h)\sigma^2.$$

    This means that the variance and the autocovariances go to infinity if $t \to \infty$.
15. Random Walk properties II

    The autocorrelation of $Y_t$ and $Y_{t-h}$ is

    $$\rho(t, t-h) = \frac{\gamma(t, t-h)}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-h})}} = \frac{(t-h)\sigma^2}{\sqrt{(t\sigma^2)((t-h)\sigma^2)}} = \frac{\sqrt{t-h}}{\sqrt{t}}.$$
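
The random walk moments can be checked by simulation. This is a minimal Monte Carlo sketch; the number of paths, the seed, the Gaussian disturbances and the choice $t = 100$, $h = 36$ are all assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)           # seed chosen arbitrarily
n_paths, n, sigma = 20000, 100, 1.0      # assumed simulation settings
eps = rng.normal(0.0, sigma, size=(n_paths, n))
Y = np.cumsum(eps, axis=1)               # Y[:, t-1] holds Y_t = eps_1 + ... + eps_t

t, h = 100, 36
var_t = Y[:, t - 1].var()                                # compare with t * sigma^2
corr = np.corrcoef(Y[:, t - 1], Y[:, t - h - 1])[0, 1]   # corr(Y_t, Y_{t-h})
theory_corr = np.sqrt((t - h) / t)                       # sqrt(64 / 100)

print(var_t, t * sigma**2)
print(corr, theory_corr)
```

Across paths, the simulated variance should be close to $t\sigma^2$ and the simulated correlation close to $\sqrt{(t-h)/t}$, up to Monte Carlo error.
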
16. RW realisation

    [Figure: Random Walk, 500 observations.]
17. RW sample ACF

    [Figure: sample ACF, 50 lags.]
18. Seatbelt Law

    [Figure: Seatbelt Law series, time axis 70 to 85.]
19. Classical Decomposition

    A basic model for representing a time series is the additive model

    $$y_t = \mu_t + \gamma_t + \varepsilon_t, \qquad t = 1, \ldots, n,$$

    also known as the Classical Decomposition, where

    - $y_t$ = observation,
    - $\mu_t$ = slowly changing component (trend),
    - $\gamma_t$ = periodic component (seasonal),
    - $\varepsilon_t$ = irregular component (disturbance).

    In a Structural Time Series Model (STSM) or Unobserved Components Model (UCM), the right-hand-side components are modelled explicitly as stochastic processes.
20. Nile data

    [Figure: Nile data, 1870 to 1970.]
21. Local Level Model

    - Components can be deterministic functions of time (e.g. polynomials), or stochastic processes;
    - Deterministic example: $y_t = \mu + \varepsilon_t$ with $\varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2)$.
    - Stochastic example: the Random Walk plus Noise, or Local Level model:

      $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
      $$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2);$$

    - The disturbances $\varepsilon_t$, $\eta_s$ are independent for all $s, t$;
    - The model is incomplete without a specification for $\mu_1$ (note the non-stationarity): $\mu_1 \sim N(a, P)$.
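22. Local Level Model

    $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
    $$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2),$$
    $$\mu_1 \sim N(a, P).$$

    - The level $\mu_t$ and the irregular $\varepsilon_t$ are unobserved;
    - Parameters: $a$, $P$, $\sigma_\varepsilon^2$, $\sigma_\eta^2$;
    - Trivial special cases:
        - $\sigma_\eta^2 = 0 \implies y_t \sim \mathrm{NID}(\mu_1, \sigma_\varepsilon^2)$ (WN with constant level);
        - $\sigma_\varepsilon^2 = 0 \implies y_{t+1} = y_t + \eta_t$ (pure RW);
    - Local Level is a model representation for EWMA forecasting.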
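
A minimal simulation sketch of the local level model follows. The function name `simulate_local_level`, its argument names and the default seed are assumptions made for this illustration; the two special cases listed above can be reproduced by setting one of the variances to zero.

```python
import numpy as np

def simulate_local_level(n, sigma2_eps, sigma2_eta, a=0.0, P=0.0, seed=0):
    """Draw y_1..y_n and mu_1..mu_n from the local level model (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    mu = np.empty(n)
    mu[0] = a + np.sqrt(P) * rng.standard_normal()           # mu_1 ~ N(a, P)
    eta = np.sqrt(sigma2_eta) * rng.standard_normal(n - 1)   # level disturbances
    mu[1:] = mu[0] + np.cumsum(eta)                          # mu_{t+1} = mu_t + eta_t
    y = mu + np.sqrt(sigma2_eps) * rng.standard_normal(n)    # y_t = mu_t + eps_t
    return y, mu

# Three variance settings, here with n = 100 observations.
for s2_eps, s2_eta in [(0.1, 1.0), (1.0, 1.0), (1.0, 0.1)]:
    y, mu = simulate_local_level(100, s2_eps, s2_eta)
    print(s2_eps, s2_eta, np.var(y - mu))
```
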
23. Simulated LL Data

    [Figure: simulated $y$ and $\mu$ with $\sigma_\varepsilon^2 = 0.1$, $\sigma_\eta^2 = 1$, for $t$ = 0 to 100.]
24. Simulated LL Data

    [Figure: simulated data with $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 1$, for $t$ = 0 to 100.]
25. Simulated LL Data

    [Figure: simulated data with $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 0.1$, for $t$ = 0 to 100.]
26. Simulated LL Data

    [Figure, three panels showing $y$ and $\mu$: $\sigma_\varepsilon^2 = 0.1$, $\sigma_\eta^2 = 1$; $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 1$; $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 0.1$.]
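27. Properties of the LL model

    $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
    $$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2).$$

    - First difference is stationary:

      $$\Delta y_t = \Delta\mu_t + \Delta\varepsilon_t = \eta_{t-1} + \varepsilon_t - \varepsilon_{t-1}.$$

    - Dynamic properties of $\Delta y_t$:

      $$E(\Delta y_t) = 0,$$
      $$\gamma_0 = E(\Delta y_t \Delta y_t) = \sigma_\eta^2 + 2\sigma_\varepsilon^2,$$
      $$\gamma_1 = E(\Delta y_t \Delta y_{t-1}) = -\sigma_\varepsilon^2,$$
      $$\gamma_\tau = E(\Delta y_t \Delta y_{t-\tau}) = 0 \quad \text{for } \tau \geq 2.$$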
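28. Properties of the LL model

    - The ACF of $\Delta y_t$ is

      $$\rho_1 = \frac{-\sigma_\varepsilon^2}{\sigma_\eta^2 + 2\sigma_\varepsilon^2} = -\frac{1}{q + 2}, \qquad q = \sigma_\eta^2 / \sigma_\varepsilon^2,$$
      $$\rho_\tau = 0, \qquad \tau \geq 2.$$

    - $q$ is called the signal-noise ratio;
    - The model for $\Delta y_t$ is MA(1) with restricted parameters such that $-1/2 \leq \rho_1 \leq 0$, i.e., $y_t$ is ARIMA(0,1,1);
    - Write $\Delta y_t = \xi_t + \theta \xi_{t-1}$, $\xi_t \sim \mathrm{NID}(0, \sigma^2)$, to solve for $\theta$:

      $$\theta = \frac{1}{2}\left(\sqrt{q^2 + 4q} - 2 - q\right).$$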
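
The mapping from $q$ to $\theta$ can be verified numerically: an MA(1) process with coefficient $\theta$ has first-order autocorrelation $\theta / (1 + \theta^2)$, which should equal $-1/(q+2)$. The function names below are chosen for this sketch only.

```python
import math

def ma1_theta(q):
    """MA(1) coefficient implied by signal-noise ratio q (formula from the slide)."""
    return 0.5 * (math.sqrt(q * q + 4.0 * q) - 2.0 - q)

def ma1_rho1(theta):
    """First-order autocorrelation of an MA(1) process with coefficient theta."""
    return theta / (1.0 + theta * theta)

# The last two columns should agree for every q.
for q in (0.1, 1.0, 10.0):
    theta = ma1_theta(q)
    print(q, theta, ma1_rho1(theta), -1.0 / (q + 2.0))
```

As $q \to 0$ the coefficient tends to $-1$, and as $q$ grows it tends to $0$, which is consistent with the restriction $-1/2 \leq \rho_1 \leq 0$.
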
29. Local Level Model

    - The model parameters are estimated by Maximum Likelihood;
    - Advantages of model based approach: assumptions can be tested, parameters are estimated rather than "calibrated";
    - Estimated model can be used for signal extraction;
    - The estimated level $\mu_t$ is obtained as a locally weighted average;
    - The distribution of weights can be compared with Kernel functions in nonparametric regressions;
    - Within the model, our methods yield MMSE forecasts.
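
One common way to evaluate the Gaussian likelihood of the local level model is the Kalman filter with the prediction error decomposition. The slide does not spell out the recursions, so the sketch below is an assumption about how the estimation could be set up; the function name, the argument names and the large initial variance `P1` (a rough stand-in for a diffuse prior, not an exact diffuse likelihood) are all choices made for this illustration.

```python
import numpy as np

def local_level_filter(y, sigma2_eps, sigma2_eta, a1=0.0, P1=1e7):
    """Kalman filter for the local level model (illustrative sketch).

    Returns the one-step-ahead predictions of the level, the prediction
    of the level for time n + 1, and the Gaussian log-likelihood.
    """
    y = np.asarray(y, dtype=float)
    a, P = a1, P1                      # mean and variance of mu_1
    loglik = 0.0
    pred = np.empty(y.size)
    for t, yt in enumerate(y):
        pred[t] = a
        v = yt - a                     # one-step-ahead prediction error
        F = P + sigma2_eps             # prediction error variance
        K = P / F                      # Kalman gain
        loglik += -0.5 * (np.log(2.0 * np.pi * F) + v * v / F)
        a = a + K * v                  # predicted level for the next period
        P = P * (1.0 - K) + sigma2_eta
    return pred, a, loglik

pred, a_next, loglik = local_level_filter([1.0, 2.0, 3.0, 4.0], 1.0, 0.5)
print(a_next, loglik)
```

Maximum likelihood estimates of the two variances would then be obtained by maximising `loglik` over `sigma2_eps` and `sigma2_eta` with a numerical optimiser, which is not shown here. Because the model has a random walk level, `a_next` is also the point forecast for all future observations.
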
30. Signal Extraction and Weights for the Nile Data

    [Figure, three rows of two panels: data and estimated level (left, 1880 to 1960) and weights (right, lags −20 to 20).]
31. Local Linear Trend Model

    The LLT model extends the LL model with a slope:

    $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
    $$\mu_{t+1} = \beta_t + \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2),$$
    $$\beta_{t+1} = \beta_t + \xi_t, \qquad \xi_t \sim \mathrm{NID}(0, \sigma_\xi^2).$$

    - All disturbances are independent at all lags and leads;
    - Initial distributions for $\beta_1$, $\mu_1$ need to be specified;
    - If $\sigma_\xi^2 = 0$ the trend is a random walk with constant drift $\beta_1$ (for $\beta_1 = 0$ the model reduces to a LL model);
    - If additionally $\sigma_\eta^2 = 0$ the trend is a straight line with slope $\beta_1$ and intercept $\mu_1$;
    - If $\sigma_\xi^2 > 0$ but $\sigma_\eta^2 = 0$, the trend is a smooth curve, or an Integrated Random Walk.
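
The special cases of the local linear trend model are easy to see in a simulation. The sketch below treats $\mu_1$ and $\beta_1$ as fixed starting values rather than random draws; that simplification, the function name and the seed are assumptions for this illustration.

```python
import numpy as np

def simulate_llt(n, sigma2_eps, sigma2_eta, sigma2_xi, mu1=0.0, beta1=0.0, seed=0):
    """Simulate y, mu and beta from the local linear trend model (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    mu = np.empty(n)
    beta = np.empty(n)
    mu[0], beta[0] = mu1, beta1        # fixed initial level and slope (assumption)
    for t in range(n - 1):
        mu[t + 1] = mu[t] + beta[t] + np.sqrt(sigma2_eta) * rng.standard_normal()
        beta[t + 1] = beta[t] + np.sqrt(sigma2_xi) * rng.standard_normal()
    y = mu + np.sqrt(sigma2_eps) * rng.standard_normal(n)
    return y, mu, beta

# Integrated random walk: no level disturbance, stochastic slope.
y, mu, beta = simulate_llt(100, 1.0, 0.0, 0.01, mu1=0.0, beta1=0.1)
print(mu[:5], beta[:5])
```

With `sigma2_eta = 0` and `sigma2_xi = 0` the simulated trend is the straight line `mu1 + beta1 * (t - 1)`.
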
32. Trend and Slope in LLT Model

    [Figure, two panels: trend $\mu$ and slope $\beta$, for $t$ = 0 to 100.]
33. Trend and Slope in Integrated Random Walk Model

    [Figure, two panels: trend $\mu$ and slope $\beta$, for $t$ = 0 to 100.]
34. Local Linear Trend Model

    - Reduced form of LLT is ARIMA(0,2,2);
    - LLT provides a model for Holt-Winters forecasting;
    - Smooth LLT provides a model for spline-fitting;
    - Smoother trends: higher order Random Walks $\Delta^d \mu_t = \eta_t$.
35. Seasonal Effects

    We have seen specifications for $\mu_t$ in the basic model

    $$y_t = \mu_t + \gamma_t + \varepsilon_t.$$

    Now we will consider the seasonal term $\gamma_t$. Let $s$ denote the number of "seasons" in the data:

    - $s = 12$ for monthly data,
    - $s = 4$ for quarterly data,
    - $s = 7$ for daily data when modelling a weekly pattern.
36. Dummy Seasonal

    The simplest way to model seasonal effects is by using dummy variables. The effect summed over the seasons should equal zero:

    $$\gamma_{t+1} = -\sum_{j=1}^{s-1} \gamma_{t+1-j}.$$

    To allow the pattern to change over time, we introduce a new disturbance term:

    $$\gamma_{t+1} = -\sum_{j=1}^{s-1} \gamma_{t+1-j} + \omega_t, \qquad \omega_t \sim \mathrm{NID}(0, \sigma_\omega^2).$$

    The expectation of the sum of the seasonal effects is zero.
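
The dummy seasonal recursion can be sketched in a few lines. The function name, the way the first $s - 1$ effects are passed in, and the quarterly starting values are assumptions for this illustration.

```python
import numpy as np

def simulate_dummy_seasonal(n, s, init, sigma2_omega=0.0, seed=0):
    """Simulate gamma_1..gamma_n; `init` holds the first s - 1 seasonal effects."""
    rng = np.random.default_rng(seed)
    gamma = np.empty(n)
    gamma[: s - 1] = init
    for t in range(s - 1, n):
        omega = np.sqrt(sigma2_omega) * rng.standard_normal()
        # New effect = minus the sum of the previous s - 1 effects, plus disturbance.
        gamma[t] = -gamma[t - (s - 1): t].sum() + omega
    return gamma

# Quarterly example (s = 4) with made-up starting effects.
fixed = simulate_dummy_seasonal(12, 4, [1.0, 2.0, -0.5])
moving = simulate_dummy_seasonal(12, 4, [1.0, 2.0, -0.5], sigma2_omega=0.05)
print(fixed)
print(moving)
```

Without the disturbance, any $s$ consecutive effects sum to zero and the pattern repeats exactly; with $\sigma_\omega^2 > 0$ the sum is zero only in expectation.
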
37. Trigonometric Seasonal

    Defining $\gamma_{jt}$ as the effect of season $j$ at time $t$, an alternative specification for the seasonal pattern is

    $$\gamma_t = \sum_{j=1}^{[s/2]} \gamma_{jt},$$
    $$\gamma_{j,t+1} = \gamma_{jt} \cos\lambda_j + \gamma_{jt}^* \sin\lambda_j + \omega_{jt},$$
    $$\gamma_{j,t+1}^* = -\gamma_{jt} \sin\lambda_j + \gamma_{jt}^* \cos\lambda_j + \omega_{jt}^*,$$
    $$\omega_{jt}, \omega_{jt}^* \sim \mathrm{NID}(0, \sigma_\omega^2), \qquad \lambda_j = 2\pi j / s.$$

    - Without the disturbance, the trigonometric specification is identical to the deterministic dummy specification.
    - The autocorrelation in the trigonometric specification lasts through more lags: changes occur in a smoother way.
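
The trigonometric recursion can be sketched as a set of rotations, one per frequency $\lambda_j$. The function name, the argument names and the starting values below are assumptions for this illustration; `gamma0` and `gamma0_star` hold the initial $\gamma_{j1}$ and $\gamma_{j1}^*$ for $j = 1, \ldots, [s/2]$.

```python
import numpy as np

def simulate_trig_seasonal(n, s, gamma0, gamma0_star, sigma2_omega=0.0, seed=0):
    """Simulate the trigonometric seasonal gamma_1..gamma_n (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m = s // 2                                     # number of frequencies [s/2]
    lam = 2.0 * np.pi * np.arange(1, m + 1) / s    # lambda_j = 2 pi j / s
    c, sn = np.cos(lam), np.sin(lam)
    g = np.asarray(gamma0, dtype=float).copy()
    g_star = np.asarray(gamma0_star, dtype=float).copy()
    sd = np.sqrt(sigma2_omega)
    out = np.empty(n)
    for t in range(n):
        out[t] = g.sum()                           # gamma_t = sum over j of gamma_jt
        omega = sd * rng.standard_normal(m)
        omega_star = sd * rng.standard_normal(m)
        # Both updates use the values from time t (tuple assignment).
        g, g_star = (g * c + g_star * sn + omega,
                     -g * sn + g_star * c + omega_star)
    return out

# Quarterly example (s = 4, two frequencies) with made-up starting values.
det = simulate_trig_seasonal(12, 4, [1.0, 0.5], [-0.3, 0.0])
print(det)
```

Without disturbances the output repeats with period $s$ and sums to zero over any $s$ consecutive periods, which matches the behaviour of the deterministic dummy specification.
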
38. Unobserved Component Models

    - Different specifications for the trend and the seasonal can be freely combined.
    - Other components of interest, like cycles, explanatory variables, intervention effects and outliers, are easily added.
    - UC models are Multiple Source of Errors models. The reduced form is a Single Source of Errors model.
    - We model non-stationarity directly.
    - Components have an explicit interpretation: the model is not just a forecasting device.
39. Seatbelt Law

    [Figure: Seatbelt Law series, time axis 70 to 85.]
40. Seatbelt Law: decomposition

    [Figure, three panels: drivers with Level+Reg; drivers-Seasonal; drivers-Irregular; time axis 70 to 85.]
41. Seatbelt Law: forecasting

    [Figure: Seatbelt Law series with forecasts, time axis 70 to 85.]
42. Textbooks

    - A.C. Harvey (1989). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press.
    - G. Kitagawa & W. Gersch (1996). Smoothness Priors Analysis of Time Series. Springer-Verlag.
    - J. Harrison & M. West (1997). Bayesian Forecasting and Dynamic Models. Springer-Verlag.
    - J. Durbin & S.J. Koopman (2001). Time Series Analysis by State Space Methods. Oxford University Press.
    - J.J.F. Commandeur & S.J. Koopman (2007). An Introduction to State Space Time Series Analysis. Oxford University Press.
43. Exercises

    1. Consider the LL model (see slides, see DK chapter 2).
        - The reduced form is an ARIMA(0,1,1) process. Derive the relationship between the signal-to-noise ratio $q$ of the LL model and the $\theta$ coefficient of the ARIMA model;
        - Derive the reduced form in the case $\eta_t = \sqrt{q}\,\varepsilon_t$ and notice the difference with the general case;
        - Give the elements of the mean vector and variance matrix of $y = (y_1, \ldots, y_n)'$ when $y_t$ is generated by a LL model for $t = 1, \ldots, n$.
    2. Consider the LLT model (see slides, see DK section 3.2.1).
        - Show that the reduced form is an ARIMA(0,2,2) process;
        - Discuss the initial values for level and slope of LLT;
        - Relate the LLT model forecasts with the Holt-Winters method of forecasting. Comment.