Paris2012 session2
State space methods for signal extraction, parameter estimation and forecasting

Siem Jan Koopman
http://personal.vu.nl/s.j.koopman
Department of Econometrics, VU University Amsterdam
Tinbergen Institute, 2012
Regression Model in Time Series

For a time series of a dependent variable yt and a set of regressors Xt, we can consider the regression model

  yt = µ + Xt β + εt,   εt ∼ NID(0, σ²),   t = 1, ..., n.

For large n, is it reasonable to assume constant coefficients µ, β and σ²? If we have strong reasons to believe that µ and β are time-varying, the model may be written as

  yt = µt + Xt βt + εt,   µt+1 = µt + ηt,   βt+1 = βt + ξt.

This is an example of a linear state space model.
State Space Model

The linear Gaussian state space model is defined in three parts:
→ State equation: αt+1 = Tt αt + Rt ζt,  ζt ∼ NID(0, Qt),
→ Observation equation: yt = Zt αt + εt,  εt ∼ NID(0, Ht),
→ Initial state distribution: α1 ∼ N(a1, P1).

Notice that
  • ζt and εs are independent for all t, s, and independent of α1;
  • the observation yt can be multivariate;
  • the state vector αt is unobserved;
  • the matrices Tt, Zt, Rt, Qt, Ht determine the structure of the model.
State Space Model

  • The state space model is linear and Gaussian: properties and results of the multivariate normal distribution therefore apply;
  • the state vector αt evolves as a VAR(1) process;
  • the system matrices usually contain unknown parameters;
  • estimation therefore has two aspects:
      • measuring the unobservable state (prediction, filtering and smoothing);
      • estimation of unknown parameters (maximum likelihood estimation);
  • state space methods offer a unified approach to a wide range of models and techniques: dynamic regression, ARIMA, UC models, latent variable models, spline-fitting and many ad hoc filters;
  • next, some well-known model specifications in state space form ...
Regression with Time-Varying Coefficients

General state space model:

  αt+1 = Tt αt + Rt ζt,   ζt ∼ NID(0, Qt),
  yt = Zt αt + εt,   εt ∼ NID(0, Ht).

Put the regressors in Zt (that is, Zt = Xt) and set Tt = I, Rt = I. The result is the regression model yt = Xt βt + εt with time-varying coefficient βt = αt following a random walk.

In case Qt = 0, the time-varying regression model reduces to the standard linear regression model yt = Xt β + εt with β = α1. Many dynamic specifications for the regression coefficient can be considered, see below.
AR(1) Process in State Space Form

Consider the AR(1) model yt+1 = φ yt + ζt; in state space form

  αt+1 = Tt αt + Rt ζt,   ζt ∼ NID(0, Qt),
  yt = Zt αt + εt,   εt ∼ NID(0, Ht),

with state vector αt and system matrices

  Zt = 1,  Ht = 0,  Tt = φ,  Rt = 1,  Qt = σ².

We have:
  • Zt = 1 and Ht = 0 imply that αt = yt;
  • the state equation implies yt+1 = φ yt + ζt with ζt ∼ NID(0, σ²);
  • this is the AR(1) model!
AR(2) in State Space Form

Consider the AR(2) model yt+1 = φ1 yt + φ2 yt−1 + ζt; in state space form

  αt+1 = Tt αt + Rt ζt,   ζt ∼ NID(0, Qt),
  yt = Zt αt + εt,   εt ∼ NID(0, Ht),

with 2 × 1 state vector αt and system matrices

  Zt = (1  0),  Ht = 0,
  Tt = [ φ1  1 ]      Rt = [ 1 ]
       [ φ2  0 ],          [ 0 ],   Qt = σ².

We have:
  • Zt and Ht = 0 imply that α1t = yt;
  • the first state equation implies yt+1 = φ1 yt + α2t + ζt with ζt ∼ NID(0, σ²);
  • the second state equation implies α2,t+1 = φ2 yt.
MA(1) in State Space Form

Consider the MA(1) model yt+1 = ζt + θζt−1; in state space form

  αt+1 = Tt αt + Rt ζt,   ζt ∼ NID(0, Qt),
  yt = Zt αt + εt,   εt ∼ NID(0, Ht),

with 2 × 1 state vector αt and system matrices

  Zt = (1  0),  Ht = 0,
  Tt = [ 0  1 ]      Rt = [ 1 ]
       [ 0  0 ],          [ θ ],   Qt = σ².

We have:
  • Zt and Ht = 0 imply that α1t = yt;
  • the first state equation implies yt+1 = α2t + ζt with ζt ∼ NID(0, σ²);
  • the second state equation implies α2,t+1 = θζt.
General ARMA in State Space Form

Consider the ARMA(2,1) model

  yt = φ1 yt−1 + φ2 yt−2 + ζt + θζt−1;

in state space form

  αt = (yt,  φ2 yt−1 + θζt)′,
  Zt = (1  0),  Ht = 0,
  Tt = [ φ1  1 ]      Rt = [ 1 ]
       [ φ2  0 ],          [ θ ],   Qt = σ².

All ARIMA(p, d, q) models have a (non-unique) state space representation.
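As a sketch, the ARMA(2,1) system matrices above can be written out numerically and checked by simulating the state recursion. The parameter values φ1 = 0.5, φ2 = 0.3, θ = 0.4 and σ² = 1 below are illustrative choices, not taken from the slides; note the index shift: the ζt driving the state recursion at time t appears as the shock dated t + 1 in the ARMA equation.

```python
import numpy as np

# Illustrative (hypothetical) ARMA(2,1) parameters, sigma^2 = 1:
phi1, phi2, theta = 0.5, 0.3, 0.4

T = np.array([[phi1, 1.0],
              [phi2, 0.0]])        # transition matrix T_t
R = np.array([1.0, theta])         # disturbance loading R_t
Z = np.array([1.0, 0.0])           # observation vector Z_t, H_t = 0

# Simulate the state recursion and verify the ARMA relation:
rng = np.random.default_rng(0)
n = 6
z = rng.standard_normal(n)         # zeta draws
alpha = np.zeros((n + 1, 2))
y = np.zeros(n)
for t in range(n):
    y[t] = Z @ alpha[t]            # y_t equals the first state element
    alpha[t + 1] = T @ alpha[t] + R * z[t]

# Each y_t obeys y_t = phi1 y_{t-1} + phi2 y_{t-2} + zeta + theta*zeta_lag,
# where z[t-1] plays the role of the ARMA shock dated t:
for t in range(2, n):
    assert abs(y[t] - (phi1*y[t-1] + phi2*y[t-2] + z[t-1] + theta*z[t-2])) < 1e-12
```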
UC Models in State Space Form

State space model: αt+1 = Tt αt + Rt ζt,  yt = Zt αt + εt.

LL model µt+1 = µt + ηt and yt = µt + εt:

  αt = µt,  Tt = 1,  Rt = 1,  Qt = ση²,  Zt = 1,  Ht = σε².

LLT model µt+1 = µt + βt + ηt, βt+1 = βt + ξt and yt = µt + εt:

  αt = [ µt ]     Tt = [ 1  1 ]     Rt = [ 1  0 ]     Qt = [ ση²  0  ]
       [ βt ],         [ 0  1 ],         [ 0  1 ],         [ 0   σξ² ],
  Zt = (1  0),  Ht = σε².
UC Models in State Space Form

State space model: αt+1 = Tt αt + Rt ζt,  yt = Zt αt + εt.

LLT model with season: µt+1 = µt + βt + ηt, βt+1 = βt + ξt, S(L)γt+1 = ωt and yt = µt + γt + εt:

  αt = (µt, βt, γt, γt−1, γt−2)′,

       [ 1  1   0   0   0 ]        [ 1  0  0 ]
       [ 0  1   0   0   0 ]        [ 0  1  0 ]        [ ση²  0    0  ]
  Tt = [ 0  0  −1  −1  −1 ],  Rt = [ 0  0  1 ],  Qt = [ 0   σξ²   0  ]
       [ 0  0   1   0   0 ]        [ 0  0  0 ]        [ 0    0   σω² ],
       [ 0  0   0   1   0 ]        [ 0  0  0 ]

  Zt = (1  0  1  0  0),  Ht = σε².
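To make the 5 × 5 layout concrete, the LLT-plus-seasonal matrices can be built directly. This is a sketch; the variance values are illustrative placeholders, not estimates from the slides.

```python
import numpy as np

# Illustrative (hypothetical) variances:
s2_eta, s2_xi, s2_omega, s2_eps = 1.0, 0.1, 0.5, 1.0

# State alpha_t = (mu_t, beta_t, gamma_t, gamma_{t-1}, gamma_{t-2})'
T = np.array([[1., 1.,  0.,  0.,  0.],   # mu_{t+1} = mu_t + beta_t
              [0., 1.,  0.,  0.,  0.],   # beta_{t+1} = beta_t
              [0., 0., -1., -1., -1.],   # gamma_{t+1} = -(gamma_t + gamma_{t-1} + gamma_{t-2})
              [0., 0.,  1.,  0.,  0.],   # shift gamma_t down one slot
              [0., 0.,  0.,  1.,  0.]])
R = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.],
              [0., 0., 0.]])
Q = np.diag([s2_eta, s2_xi, s2_omega])
Z = np.array([1., 0., 1., 0., 0.])       # y_t = mu_t + gamma_t + eps_t
H = s2_eps

# Sanity check: the seasonal row sums the last three states, sign flipped,
# and the observation picks out mu_t + gamma_t:
alpha = np.array([0., 0., 1., 2., 3.])
assert (T @ alpha)[2] == -6.0
assert Z @ alpha == 1.0
```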
Kalman Filter

  • The Kalman filter calculates the mean and variance of the unobserved state, given the observations.
  • The state is Gaussian: the complete distribution is characterised by the mean and variance.
  • The filter is a recursive algorithm; the current best estimate is updated whenever a new observation is obtained.
  • To start the recursion, we need a1 and P1, which we assumed given.
  • There are various ways to initialise the filter when a1 and P1 are unknown, which we will not discuss here.
Kalman Filter

The unobserved state αt can be estimated from the observations with the Kalman filter:

  vt = yt − Zt at,
  Ft = Zt Pt Zt′ + Ht,
  Kt = Tt Pt Zt′ Ft⁻¹,
  at+1 = Tt at + Kt vt,
  Pt+1 = Tt Pt Tt′ + Rt Qt Rt′ − Kt Ft Kt′,

for t = 1, ..., n, starting with given values for a1 and P1. Writing Yt = {y1, ..., yt},

  at+1 = E(αt+1 | Yt),  Pt+1 = Var(αt+1 | Yt).
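The recursions above translate almost line by line into code. The sketch below assumes a time-invariant system and a univariate observation yt; the function and variable names are ours, not from the slides.

```python
import numpy as np

def kalman_filter(y, T, Z, R, Q, H, a1, P1):
    """Kalman filter for a time-invariant model with univariate y_t.
    Z is a length-m vector, Q an r x r matrix, H a scalar variance."""
    n, m = len(y), len(a1)
    a = np.zeros((n + 1, m)); P = np.zeros((n + 1, m, m))
    v = np.zeros(n); F = np.zeros(n)
    a[0], P[0] = a1, P1
    for t in range(n):
        v[t] = y[t] - Z @ a[t]                 # prediction error v_t
        F[t] = Z @ P[t] @ Z + H                # its variance F_t
        K = T @ P[t] @ Z / F[t]                # Kalman gain K_t
        a[t + 1] = T @ a[t] + K * v[t]         # state prediction a_{t+1}
        P[t + 1] = T @ P[t] @ T.T + R @ Q @ R.T - np.outer(K, K) * F[t]
    return a, P, v, F

# Local level model as a check: T = Z = R = 1, Q = H = 1
a, P, v, F = kalman_filter(np.array([1.0, 0.5]),
                           np.eye(1), np.array([1.0]), np.eye(1),
                           np.array([[1.0]]), 1.0,
                           np.array([0.0]), np.eye(1))
```

For the local level case the first step gives v1 = 1, F1 = 2, K1 = 0.5, so a2 = 0.5 and P2 = 1.5, which can be checked by hand against the recursions.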
Lemma I: Multivariate Normal Regression

Suppose x and y are jointly Normally distributed vectors with means µx and µy, variance matrices Σxx and Σyy, and covariance matrix Σxy. Then

  E(x | y) = µx + Σxy Σyy⁻¹ (y − µy),
  Var(x | y) = Σxx − Σxy Σyy⁻¹ Σxy′.

Define e = x − E(x | y). Then

  Var(e) = Var([x − µx] − Σxy Σyy⁻¹ [y − µy]) = Σxx − Σxy Σyy⁻¹ Σxy′ = Var(x | y),

and

  Cov(e, y) = Cov([x − µx] − Σxy Σyy⁻¹ [y − µy], y) = Σxy − Σxy = 0.
Lemma II: An Extension

The proof of the Kalman filter uses a slight extension of the lemma from multivariate Normal regression theory. Suppose x, y and z are jointly Normally distributed vectors with E(z) = 0 and Σyz = 0. Then

  E(x | y, z) = E(x | y) + Σxz Σzz⁻¹ z,
  Var(x | y, z) = Var(x | y) − Σxz Σzz⁻¹ Σxz′.

Can you give a proof of Lemma II?
Hint: apply Lemma I to x and y* = (y′, z′)′.
Kalman Filter: Derivation

State space model: αt+1 = Tt αt + Rt ζt,  yt = Zt αt + εt.

  • Writing Yt = {y1, ..., yt}, define at+1 = E(αt+1 | Yt), Pt+1 = Var(αt+1 | Yt);
  • the prediction error is
      vt = yt − E(yt | Yt−1) = yt − E(Zt αt + εt | Yt−1) = yt − Zt E(αt | Yt−1) = yt − Zt at;
  • it follows that vt = Zt (αt − at) + εt and E(vt) = 0;
  • the prediction error variance is Ft = Var(vt) = Zt Pt Zt′ + Ht.
Kalman Filter: Derivation

State space model: αt+1 = Tt αt + Rt ζt,  yt = Zt αt + εt.

  • We have Yt = {Yt−1, yt} = {Yt−1, vt} and E(vt yt−j) = 0 for j = 1, ..., t − 1;
  • apply the lemma E(x | y, z) = E(x | y) + Σxz Σzz⁻¹ z with x = αt+1, y = Yt−1 and z = vt = Zt (αt − at) + εt;
  • it follows that E(αt+1 | Yt−1) = Tt at;
  • further, E(αt+1 vt′) = Tt E(αt vt′) + Rt E(ζt vt′) = Tt Pt Zt′;
  • applying the lemma, we obtain the state update
      at+1 = E(αt+1 | Yt−1, yt) = Tt at + Tt Pt Zt′ Ft⁻¹ vt = Tt at + Kt vt,
    with Kt = Tt Pt Zt′ Ft⁻¹.
Kalman Filter: Derivation

  • Our best prediction of yt is Zt at. When the actual observation arrives, calculate the prediction error vt = yt − Zt at and its variance Ft = Zt Pt Zt′ + Ht. The new best estimate of the state mean is based on both the old estimate at and the new information vt:
      at+1 = Tt at + Kt vt,
    and similarly for the variance:
      Pt+1 = Tt Pt Tt′ + Rt Qt Rt′ − Kt Ft Kt′.
    Can you derive the updating equation for Pt+1?
  • The Kalman gain Kt = Tt Pt Zt′ Ft⁻¹ is the optimal weighting matrix for the new evidence.
  • You should be able to replicate the proof of the Kalman filter for the Local Level Model (DK, Chapter 2).
Kalman Filter Illustration

[Figure: Nile data with filtered level a_t; state variance P_t; prediction error v_t; prediction error variance F_t.]
Prediction and Filtering

This version of the Kalman filter computes the predictions of the states directly:

  at+1 = E(αt+1 | Yt),  Pt+1 = Var(αt+1 | Yt).

The filtered estimates are defined by

  at|t = E(αt | Yt),  Pt|t = Var(αt | Yt),

for t = 1, ..., n. The Kalman filter can be "re-ordered" to compute both:

  at|t = at + Mt vt,  Pt|t = Pt − Mt Ft Mt′,
  at+1 = Tt at|t,  Pt+1 = Tt Pt|t Tt′ + Rt Qt Rt′,

with Mt = Pt Zt′ Ft⁻¹ for t = 1, ..., n, starting with a1 and P1. Compare Mt with Kt. Verify this version of the KF using the earlier derivations.
Smoothing

  • The filter calculates the mean and variance conditional on Yt;
  • the Kalman smoother calculates the mean and variance conditional on the full set of observations Yn;
  • after the filtered estimates are calculated, the smoothing recursion starts at the last observation and runs back to the first.

Define

  α̂t = E(αt | Yn),  Vt = Var(αt | Yn),
  rt = weighted sum of future innovations,  Nt = Var(rt),  Lt = Tt − Kt Zt.

Starting with rn = 0, Nn = 0, the smoothing recursions are given by

  rt−1 = Zt′ Ft⁻¹ vt + Lt′ rt,  Nt−1 = Zt′ Ft⁻¹ Zt + Lt′ Nt Lt,
  α̂t = at + Pt rt−1,  Vt = Pt − Pt Nt−1 Pt.
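For the local level model the forward filter and backward smoothing recursion fit in a few lines. The sketch below specialises to a scalar state (Tt = Zt = Rt = 1); the names are ours.

```python
import numpy as np

def llm_smoother(y, q, h, a1, p1):
    """Kalman filter and smoother for the local level model:
    y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t,
    with Var(eta_t) = q and Var(eps_t) = h."""
    n = len(y)
    a = np.zeros(n + 1); p = np.zeros(n + 1)
    v = np.zeros(n); f = np.zeros(n); k = np.zeros(n)
    a[0], p[0] = a1, p1
    for t in range(n):                       # forward pass: Kalman filter
        v[t] = y[t] - a[t]
        f[t] = p[t] + h
        k[t] = p[t] / f[t]
        a[t + 1] = a[t] + k[t] * v[t]
        p[t + 1] = p[t] * (1 - k[t]) + q
    alpha_hat = np.zeros(n); V = np.zeros(n)
    r, N = 0.0, 0.0                          # r_n = 0, N_n = 0
    for t in range(n - 1, -1, -1):           # backward pass: smoother
        L = 1.0 - k[t]                       # L_t = T_t - K_t Z_t
        r = v[t] / f[t] + L * r              # r_{t-1}
        N = 1.0 / f[t] + L * L * N           # N_{t-1}
        alpha_hat[t] = a[t] + p[t] * r
        V[t] = p[t] - p[t] * N * p[t]
    return alpha_hat, V
```

With a single observation (y1 = 1, a1 = 0, p1 = q = h = 1), the smoothed state coincides with the filtered estimate a1|1 = 0.5, as it must when there are no future observations.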
Smoothing Illustration

[Figure: Nile observations with smoothed state; smoothed state variance V_t; smoothing cumulant r_t; its variance N_t.]
Filtering and Smoothing

[Figure: Nile observations, 1870–1970, with filtered level and smoothed level.]
Missing Observations

Missing observations are very easy to handle in Kalman filtering:
  • suppose yj is missing;
  • put vj = 0, Kj = 0 and Fj = ∞ in the algorithm;
  • proceed with the calculations as normal.

The filter algorithm extrapolates according to the state equation until a new observation arrives. The smoother interpolates between observations.
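The bullet points above correspond to a one-line branch in the filter loop. A local level sketch, with NaN marking a missing yt (names are ours):

```python
import numpy as np

def llm_filter_missing(y, q, h, a1, p1):
    """Local level Kalman filter that skips missing (NaN) observations:
    equivalent to setting v_j = 0, K_j = 0 and F_j = infinity."""
    n = len(y)
    a = np.zeros(n + 1); p = np.zeros(n + 1)
    a[0], p[0] = a1, p1
    for t in range(n):
        if np.isnan(y[t]):
            a[t + 1] = a[t]          # pure extrapolation: state unchanged
            p[t + 1] = p[t] + q      # uncertainty keeps growing
        else:
            v = y[t] - a[t]
            f = p[t] + h
            k = p[t] / f
            a[t + 1] = a[t] + k * v
            p[t + 1] = p[t] * (1 - k) + q
    return a, p

a, p = llm_filter_missing(np.array([1.0, np.nan, np.nan]), 1.0, 1.0, 0.0, 1.0)
```

During the gap the state estimate stays flat while its variance grows by q each period; appending NaN values for future periods gives out-of-sample forecasts in exactly the same way.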
Missing Observations

[Figure: Nile observations with gaps; filtered state a_t and its variance P_t, 1870–1970.]
Missing Observations, Filter and Smoother

[Figure: filtered state with variance P_t; smoothed state with variance V_t.]
Forecasting

Forecasting requires no extra theory: just treat future observations as missing:
  • put vj = 0, Kj = 0 and Fj = ∞ for j = n + 1, ..., n + k;
  • proceed with the calculations as normal;
  • the forecast for yj is Zj aj.
Forecasting

[Figure: Nile observations with forecasts a_t extended to 2000 and forecast variance P_t.]
Parameter Estimation

The system matrices in a state space model typically depend on a parameter vector ψ. The model is completely Gaussian; we estimate by maximum likelihood. The loglikelihood of a time series is

  log L = Σ_{t=1}^{n} log p(yt | Yt−1).

In the state space model, p(yt | Yt−1) is a Gaussian density with mean Zt at and variance Ft:

  log L = −(nN/2) log 2π − (1/2) Σ_{t=1}^{n} ( log |Ft| + vt′ Ft⁻¹ vt ),

with vt and Ft from the Kalman filter. This is called the prediction error decomposition of the likelihood. Estimation proceeds by numerically maximising log L.
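The prediction error decomposition makes the likelihood a by-product of the filter. Below is a sketch for the local level model, with a crude grid search standing in for a proper numerical optimiser; the data, the grid and the initialisation a1 = 0, p1 = 1 are illustrative choices (proper initialisation is not covered on these slides).

```python
import numpy as np

def llm_loglik(y, q, h=1.0, a1=0.0, p1=1.0):
    """Gaussian log-likelihood of the local level model via the
    prediction error decomposition (v_t, F_t from the Kalman filter)."""
    a, p, ll = a1, p1, 0.0
    for obs in y:
        v = obs - a
        f = p + h
        ll -= 0.5 * (np.log(2.0 * np.pi) + np.log(f) + v * v / f)
        k = p / f
        a += k * v
        p = p * (1.0 - k) + q
    return ll

# Numerically maximise over the signal-to-noise ratio q = s2_eta / s2_eps:
y = np.array([1.0, 1.2, 0.8, 1.1, 0.9, 1.3])
grid = np.linspace(0.01, 2.0, 200)
q_hat = grid[np.argmax([llm_loglik(y, q) for q in grid])]
```

In practice one would replace the grid by a quasi-Newton optimiser over log q, but the grid makes the "numerically maximising log L" step explicit.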
ML Estimate of Nile Data

[Figure: Nile observations with the estimated level for q = 0, q = 1000, and q = 0.0973 (the ML estimate), 1870–1970.]
Diagnostics

  • Null hypothesis: the standardised residuals satisfy
      vt / √Ft ∼ NID(0, 1);
  • apply standard tests for Normality, heteroskedasticity and serial correlation;
  • a recursive algorithm is available to calculate smoothed disturbances (auxiliary residuals), which can be used to detect breaks and outliers;
  • model comparison and parameter restrictions: use likelihood-based procedures (LR test, AIC, BIC).
Nile Data Residuals Diagnostics

[Figure: standardised residuals of the Nile data; correlogram; density with N(s = 0.996) overlay; QQ plot against the normal.]
Assignment 1 - Exercises 1 and 2

1. Formulate the following models in state space form:
     • ARMA(3,1) process;
     • ARMA(1,3) process;
     • ARIMA(3,1,1) process;
     • AR(3) plus noise model;
     • random walk plus AR(3) process.
   Please discuss the initial conditions for αt in all cases.

2. Derive a Kalman filter for the local level model

     yt = µt + ξt,   ξt ∼ N(0, σξ²),
     ∆µt+1 = ηt ∼ N(0, ση²),

   with E(ξt ηt) = σξη = 0 and E(ξt ηs) = 0 for all t ≠ s. Also discuss the problem of missing observations in this case.
Assignment 1 - Exercise 3

3. Consider a time series yt that is modelled by a state space model, and consider a time series vector of k exogenous variables Xt. The dependent time series yt depends on the explanatory variables Xt in a linear way.
     • How would you introduce Xt as regression effects in the state space model?
     • Can you allow the transition matrix Tt to depend on Xt too?
     • Can you allow the transition matrix Tt to depend on yt?
   For the last two questions, please also consider maximum likelihood estimation of the coefficients within the system matrices.
Assignment 1 - Computing Work

  • Consider Chapter 2 of the Durbin and Koopman book.
  • There are 8 figures in Chapter 2.
  • Write computer code that reproduces all these figures in Chapter 2, except Fig. 2.4.
  • Write short documentation for the Ox program.