Forecasting in State Space: theory and practice

                Siem Jan Koopman

           http://personal.vu.nl/s.j.koopman
               Department of Econometrics
                VU University Amsterdam
                   Tinbergen Institute
                          2012
Program



Lectures :
  • Introduction to UC models
  • State space methods
  • Forecasting time series with different components
  • Practice of Forecasting with Illustrations
Exercises and assignments will be part of the course.




Time Series
A time series is a set of observations $y_t$, each one recorded at a
specific time $t$.
The observations are ordered over time.
We assume that we have $n$ observations, $t = 1, \dots, n$.
Examples of time series are:
  • Number of cars sold each year
  • Gross Domestic Product of a country
  • Stock prices during one day
  • Number of firm defaults
Our purpose is to identify and to model the serial or “dynamic”
correlation structure in the time series.
Time series analysis may be relevant for economic policy, financial
decision making and forecasting.
Example: Nile data

[Figure: Nile data; horizontal axis 1870 to 1970, vertical axis 500 to 1400.]
Example: GDP growth, quarter by quarter


[Figure: GDP growth, quarter by quarter; horizontal axis 2000 to 2012, vertical axis −4 to 5.]
Example: winners of the Cambridge/Oxford boat races

[Figure: boat race winners by year; horizontal axis 1840 to 2000, vertical axis 0.1 to 1.0.]
Example: US monthly unemployment




Sources for time series data




Data sources :
  • US economics :
    http://research.stlouisfed.org/fred2/
  • DK book data : http://www.ssfpack.com/dkbook.html
  • Financial data : Datastream, Yahoo Finance




White noise processes

The simplest example of a stationary process is a white noise (WN)
process, which we usually denote as $\varepsilon_t$.
A white noise process is a sequence of uncorrelated random
variables, each with zero mean and constant variance $\sigma_\varepsilon^2$:

$$\varepsilon_t \sim \mathrm{WN}(0, \sigma_\varepsilon^2).$$

The autocovariance function is equal to zero for all lags $h \neq 0$:

$$\gamma_\varepsilon(h) = \begin{cases} \sigma_\varepsilon^2 & \text{if } h = 0, \\ 0 & \text{if } h \neq 0. \end{cases}$$
White noise realisations
[Figure: White Noise, 500 observations.]
White noise ACF and SACF
[Figure, four panels: theoretical ACF with max lag = 50 and max lag = 500 (top row); sample ACF for n = 50 and n = 500 (bottom row).]
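The white noise slides can be reproduced with a short simulation. The sketch below is an illustration only: it assumes Gaussian white noise with unit variance, an arbitrary seed, and a hypothetical helper `sample_acf`; the band $\pm 2/\sqrt{n}$ is the usual rough 95% guide for the sample ACF of white noise.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n          # sample variance (lag 0)
    acf = []
    for h in range(max_lag + 1):
        ch = sum(dev[t] * dev[t - h] for t in range(h, n)) / n
        acf.append(ch / c0)
    return acf


rng = random.Random(12345)                    # fixed seed, chosen arbitrarily
n = 500
eps = [rng.gauss(0.0, 1.0) for _ in range(n)]  # Gaussian white noise, sigma = 1

acf = sample_acf(eps, max_lag=10)
band = 2.0 / n ** 0.5                         # rough 95% band for white noise
for h, r in enumerate(acf):
    print(f"lag {h:2d}: {r:+.3f}  (band +/-{band:.3f})")
```

The theoretical ACF is zero at every nonzero lag, whereas the sample ACF fluctuates around zero, with smaller fluctuations as n grows.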
Random Walk processes


If $\varepsilon_1, \varepsilon_2, \dots$ come from a white noise process with variance $\sigma^2$,
then the process $\{Y_t\}$ with

$$Y_t = \varepsilon_1 + \varepsilon_2 + \dots + \varepsilon_t \qquad \text{for } t = 1, 2, \dots$$

is called a random walk.
A recursive way to define a random walk is:

$$Y_1 = \varepsilon_1, \qquad Y_t = Y_{t-1} + \varepsilon_t \quad \text{for } t = 2, 3, \dots$$
Random Walk properties I
A random walk is not stationary, because the variance of $Y_t$ is
time-varying:

$$E(Y_t) = E(\varepsilon_1 + \dots + \varepsilon_t) = 0,$$
$$\mathrm{Var}(Y_t) = E(Y_t^2) = E[(\varepsilon_1 + \dots + \varepsilon_t)^2] = t\sigma^2.$$

The autocovariance function, for $0 \le h < t$, is equal to:

$$\gamma(t, t-h) = E(Y_t Y_{t-h}) = E\Big[\Big(\sum_{j=1}^{t-h} \varepsilon_j + \sum_{j=t-h+1}^{t} \varepsilon_j\Big)\Big(\sum_{j=1}^{t-h} \varepsilon_j\Big)\Big] = (t-h)\sigma^2.$$

This means that the variance and the autocovariances go to
infinity as $t \to \infty$.
Random Walk properties II



The autocorrelation of $Y_t$ and $Y_{t-h}$ is

$$\rho(t, t-h) = \frac{\gamma(t, t-h)}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-h})}} = \frac{(t-h)\sigma^2}{\sqrt{(t\sigma^2)((t-h)\sigma^2)}} = \frac{\sqrt{t-h}}{\sqrt{t}}.$$
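The variance and autocorrelation formulas for the random walk can be checked by Monte Carlo. This is a minimal sketch under assumptions that are not in the slides: Gaussian disturbances, $\sigma = 1$, $t = 50$, $h = 10$, 4000 simulated paths and an arbitrary seed; the function name is hypothetical.

```python
import math
import random


def simulate_random_walk(n, sigma, rng):
    """Return Y_1, ..., Y_n with Y_1 = eps_1 and Y_t = Y_{t-1} + eps_t."""
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, sigma)
        path.append(level)
    return path


rng = random.Random(2012)                 # arbitrary seed
sigma, t, h, n_paths = 1.0, 50, 10, 4000

paths = [simulate_random_walk(t, sigma, rng) for _ in range(n_paths)]
y_t = [p[t - 1] for p in paths]           # Y_t (Python lists are 0-based)
y_th = [p[t - h - 1] for p in paths]      # Y_{t-h}

# E(Y_t) = 0 is known, so second moments are taken around zero.
var_t = sum(v * v for v in y_t) / n_paths
var_th = sum(v * v for v in y_th) / n_paths
cov = sum(a * b for a, b in zip(y_t, y_th)) / n_paths
rho_mc = cov / math.sqrt(var_t * var_th)
rho_theory = math.sqrt((t - h) / t)

print(f"Var(Y_t):    simulated {var_t:.2f}, theory {t * sigma ** 2:.2f}")
print(f"rho(t, t-h): simulated {rho_mc:.3f}, theory {rho_theory:.3f}")
```

For fixed h the theoretical autocorrelation approaches 1 as t grows, which is consistent with the slowly decaying sample ACF of a random walk.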
RW realisation

[Figure: Random Walk, 500 observations.]
RW sample ACF

[Figure: sample ACF of the random walk, 50 lags.]
Seatbelt Law
[Figure: Seatbelt Law series; horizontal axis 70 to 85, vertical axis 7.0 to 7.9.]
Classical Decomposition

A basic model for representing a time series is the additive model

$$y_t = \mu_t + \gamma_t + \varepsilon_t, \qquad t = 1, \dots, n,$$

also known as the Classical Decomposition.

    $y_t$ = observation,
    $\mu_t$ = slowly changing component (trend),
    $\gamma_t$ = periodic component (seasonal),
    $\varepsilon_t$ = irregular component (disturbance).

In a Structural Time Series Model (STSM) or Unobserved
Components Model (UCM), the right-hand-side components are modelled
explicitly as stochastic processes.
Nile data
[Figure: Nile data; horizontal axis 1870 to 1970, vertical axis 500 to 1400.]
Local Level Model
• Components can be deterministic functions of time (e.g.
  polynomials), or stochastic processes;
• Deterministic example: $y_t = \mu + \varepsilon_t$ with $\varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2)$;
• Stochastic example: the Random Walk plus Noise, or
  Local Level model:

$$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
$$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2);$$

• The disturbances $\varepsilon_t$, $\eta_s$ are independent for all $s, t$;
• The model is incomplete without a specification for $\mu_1$ (note
  the non-stationarity):

$$\mu_1 \sim N(a, P).$$
Local Level Model


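$$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
$$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2),$$
$$\mu_1 \sim N(a, P).$$

• The level $\mu_t$ and the irregular $\varepsilon_t$ are unobserved;
• Parameters: $a$, $P$, $\sigma_\varepsilon^2$, $\sigma_\eta^2$;
• Trivial special cases:
     • $\sigma_\eta^2 = 0 \Rightarrow y_t \sim \mathrm{NID}(\mu_1, \sigma_\varepsilon^2)$ (WN with constant level);
     • $\sigma_\varepsilon^2 = 0 \Rightarrow y_{t+1} = y_t + \eta_t$ (pure RW);
• Local Level is a model representation for EWMA (exponentially weighted moving average) forecasting.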
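The local level model is easy to simulate directly from its two equations. The following is a sketch, not the author's code: the function name, the seeds and the default $\mu_1 \sim N(0, 0)$ are assumptions made for illustration. The three variance pairs are the ones labelled on the "Simulated LL Data" slides.

```python
import random


def simulate_local_level(n, sigma2_eps, sigma2_eta, a=0.0, P=0.0, seed=0):
    """Simulate y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t, mu_1 ~ N(a, P)."""
    rng = random.Random(seed)
    mu = rng.gauss(a, P ** 0.5)               # initial level mu_1
    y, level = [], []
    for _ in range(n):
        level.append(mu)
        y.append(mu + rng.gauss(0.0, sigma2_eps ** 0.5))   # observation equation
        mu = mu + rng.gauss(0.0, sigma2_eta ** 0.5)        # level equation
    return y, level


# Variance settings (sigma2_eps, sigma2_eta) as on the simulation slides.
for s2_eps, s2_eta in [(0.1, 1.0), (1.0, 1.0), (1.0, 0.1)]:
    y, level = simulate_local_level(100, s2_eps, s2_eta, seed=1)
    print(s2_eps, s2_eta, round(min(y), 2), round(max(y), 2))

# The two trivial special cases.
y_wn, level_wn = simulate_local_level(100, 1.0, 0.0, a=5.0)  # constant level
y_rw, level_rw = simulate_local_level(100, 0.0, 1.0)         # pure random walk
```

With $\sigma_\eta^2 = 0$ the level never moves, and with $\sigma_\varepsilon^2 = 0$ the observations coincide with the level, which matches the special cases listed above.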
Simulated LL Data
[Figure: simulated local level data with $\sigma_\varepsilon^2 = 0.1$, $\sigma_\eta^2 = 1$; series $y$ and level $\mu$, 100 observations.]
Simulated LL Data
[Figure: simulated local level data with $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 1$; 100 observations.]
Simulated LL Data
[Figure: simulated local level data with $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 0.1$; 100 observations.]
Simulated LL Data
[Figure, three panels: simulated series $y$ and level $\mu$, 100 observations each, for ($\sigma_\varepsilon^2 = 0.1$, $\sigma_\eta^2 = 1$), ($\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 1$) and ($\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 0.1$).]
Properties of the LL model

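$$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
$$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2),$$

• First difference is stationary:

$$\Delta y_t = \Delta\mu_t + \Delta\varepsilon_t = \eta_{t-1} + \varepsilon_t - \varepsilon_{t-1}.$$

• Dynamic properties of $\Delta y_t$:

$$E(\Delta y_t) = 0,$$
$$\gamma_0 = E(\Delta y_t \Delta y_t) = \sigma_\eta^2 + 2\sigma_\varepsilon^2,$$
$$\gamma_1 = E(\Delta y_t \Delta y_{t-1}) = -\sigma_\varepsilon^2,$$
$$\gamma_\tau = E(\Delta y_t \Delta y_{t-\tau}) = 0 \quad \text{for } \tau \ge 2.$$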
Properties of the LL model
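• The ACF of $\Delta y_t$ is

$$\rho_1 = \frac{-\sigma_\varepsilon^2}{\sigma_\eta^2 + 2\sigma_\varepsilon^2} = -\frac{1}{q+2}, \qquad q = \sigma_\eta^2 / \sigma_\varepsilon^2,$$
$$\rho_\tau = 0, \qquad \tau \ge 2.$$

• $q$ is called the signal-noise ratio;
• The model for $\Delta y_t$ is MA(1) with restricted parameters such
  that

$$-1/2 \le \rho_1 \le 0,$$

  i.e., $y_t$ is ARIMA(0,1,1);
• Write $\Delta y_t = \xi_t + \theta\xi_{t-1}$, $\xi_t \sim \mathrm{NID}(0, \sigma^2)$, to solve for $\theta$:

$$\theta = \frac{1}{2}\left(\sqrt{q^2 + 4q} - 2 - q\right).$$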
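The mapping from $q$ to $\theta$ can be checked numerically: an MA(1) process with coefficient $\theta$ has first-order autocorrelation $\theta/(1+\theta^2)$, and this should equal $-1/(q+2)$. The sketch below uses hypothetical function names and a few arbitrary values of $q$.

```python
import math


def ll_rho1(q):
    """First-order autocorrelation of the differenced local level series."""
    return -1.0 / (q + 2.0)


def ll_theta(q):
    """Invertible MA(1) coefficient implied by the signal-noise ratio q."""
    return 0.5 * (math.sqrt(q * q + 4.0 * q) - 2.0 - q)


def ma1_rho1(theta):
    """First-order autocorrelation of an MA(1) process with coefficient theta."""
    return theta / (1.0 + theta * theta)


for q in (0.1, 1.0, 10.0):
    theta = ll_theta(q)
    print(f"q = {q:5.1f}  theta = {theta:+.4f}  "
          f"rho1 (LL) = {ll_rho1(q):+.4f}  rho1 (MA) = {ma1_rho1(theta):+.4f}")
```

As $q$ goes to zero, $\theta$ goes to $-1$ and $\rho_1$ to $-1/2$; as $q$ grows, both approach zero, which is the pure random walk case.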
Local Level Model


• The model parameters are estimated by Maximum Likelihood;
• Advantages of model based approach: assumptions can be
  tested, parameters are estimated rather than “calibrated”;
• Estimated model can be used for signal extraction;
• The estimated level µt is obtained as a locally weighted
  average;
• The distribution of weights can be compared with Kernel
  functions in nonparametric regressions;
• Within the model, our methods yield MMSE forecasts.




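To make the link between the local level model, Maximum Likelihood and EWMA forecasting concrete, here is a minimal sketch of the Kalman filter recursions for the local level model in their standard textbook form (they are not written out on the slides). The function name, the toy data, the variance values and the large initial variance `p1` standing in for a diffuse prior are all assumptions made for illustration.

```python
import math


def local_level_filter(y, sigma2_eps, sigma2_eta, a1=0.0, p1=1e7):
    """Kalman filter for the local level model.

    Returns the one-step-ahead predictions a_t, the Kalman gains K_t, the
    Gaussian log-likelihood (prediction error decomposition) and the
    forecast a_{n+1}. p1 is a large variance approximating a diffuse prior.
    """
    a, p = a1, p1
    preds, gains, loglik = [], [], 0.0
    for obs in y:
        v = obs - a                      # prediction error
        f = p + sigma2_eps               # prediction error variance
        k = p / f                        # Kalman gain
        loglik += -0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        preds.append(a)
        gains.append(k)
        a = a + k * v                    # predicted level for t + 1
        p = p * (1.0 - k) + sigma2_eta   # its variance
    return preds, gains, loglik, a


sigma2_eps, sigma2_eta = 1.0, 1.0        # illustrative values, q = 1
y = [0.1 * t + math.sin(0.7 * t) for t in range(25)]   # arbitrary toy data

preds, gains, loglik, forecast = local_level_filter(y, sigma2_eps, sigma2_eta)

q = sigma2_eta / sigma2_eps
theta = 0.5 * (math.sqrt(q * q + 4.0 * q) - 2.0 - q)
print(f"steady-state gain {gains[-1]:.6f}, 1 + theta = {1.0 + theta:.6f}")
print(f"log-likelihood {loglik:.3f}, forecast of y_(n+1): {forecast:.3f}")
```

Once the gain has converged, the update $a_{t+1} = (1 - K)a_t + K y_t$ is an EWMA with weight $K = 1 + \theta$. Maximum Likelihood estimation would maximise `loglik` over the two variances with a numerical optimiser, which is omitted here.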
Signal Extraction and Weights for the Nile Data
[Figure, three rows: Nile data and estimated level (left panels, 1880 to 1960) with the corresponding weights at lags −20 to 20 (right panels).]
Local Linear Trend Model
The LLT model extends the LL model with a slope:

$$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
$$\mu_{t+1} = \beta_t + \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2),$$
$$\beta_{t+1} = \beta_t + \xi_t, \qquad \xi_t \sim \mathrm{NID}(0, \sigma_\xi^2).$$

  • All disturbances are independent at all lags and leads;
  • Initial distributions for $\beta_1$, $\mu_1$ need to be specified;
  • If $\sigma_\xi^2 = 0$ the trend is a random walk with constant drift $\beta_1$
    (for $\beta_1 = 0$ the model reduces to a LL model);
  • If additionally $\sigma_\eta^2 = 0$ the trend is a straight line with slope $\beta_1$
    and intercept $\mu_1$;
  • If $\sigma_\xi^2 > 0$ but $\sigma_\eta^2 = 0$, the trend is a smooth curve, or an
    Integrated Random Walk.
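The special cases of the LLT model can be seen by simulating it with some variances set to zero. This is a sketch only; the function name, seeds and parameter values are assumptions and not taken from the slides.

```python
import random


def simulate_llt(n, s2_eps, s2_eta, s2_xi, mu1=0.0, beta1=0.0, seed=0):
    """Simulate the local linear trend model; returns (y, mu, beta)."""
    rng = random.Random(seed)
    mu, beta = mu1, beta1
    y, level, slope = [], [], []
    for _ in range(n):
        level.append(mu)
        slope.append(beta)
        y.append(mu + rng.gauss(0.0, s2_eps ** 0.5))       # observation
        mu = mu + beta + rng.gauss(0.0, s2_eta ** 0.5)     # level update
        beta = beta + rng.gauss(0.0, s2_xi ** 0.5)         # slope update
    return y, level, slope


# Straight-line trend: both trend variances are zero.
_, line, _ = simulate_llt(5, 1.0, 0.0, 0.0, mu1=2.0, beta1=0.5)
print(line)   # [2.0, 2.5, 3.0, 3.5, 4.0]

# Random walk with drift, and integrated random walk (smooth trend).
_, drift, _ = simulate_llt(100, 1.0, 0.5, 0.0, beta1=0.25, seed=1)
_, smooth, slope = simulate_llt(100, 1.0, 0.0, 0.01, seed=2)
```

In the integrated random walk case the level changes only through the slope, so successive differences of `smooth` reproduce `slope`.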
Trend and Slope in LLT Model

[Figure, two panels: trend $\mu$ (top) and slope $\beta$ (bottom) in the LLT model, 100 observations.]
Trend and Slope in Integrated Random Walk Model

[Figure, two panels: trend $\mu$ (top) and slope $\beta$ (bottom) in the Integrated Random Walk model, 100 observations.]
Local Linear Trend Model



• Reduced form of LLT is ARIMA(0,2,2);
• LLT provides a model for Holt-Winters forecasting;
• Smooth LLT provides a model for spline-fitting;
• Smoother trends: higher order Random Walks

$$\Delta^d \mu_t = \eta_t.$$
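The ARIMA(0,2,2) reduced form follows from differencing twice, by the same argument used for the LL model. A sketch of the derivation, assuming mutually independent disturbances as stated for the LLT model:

```latex
\begin{aligned}
\Delta \mu_t &= \beta_{t-1} + \eta_{t-1}, \\
\Delta^2 y_t &= \xi_{t-2} + \eta_{t-1} - \eta_{t-2}
               + \varepsilon_t - 2\varepsilon_{t-1} + \varepsilon_{t-2}, \\
\gamma_0 &= \sigma_\xi^2 + 2\sigma_\eta^2 + 6\sigma_\varepsilon^2, \\
\gamma_1 &= -\sigma_\eta^2 - 4\sigma_\varepsilon^2, \\
\gamma_2 &= \sigma_\varepsilon^2, \\
\gamma_\tau &= 0 \quad \text{for } \tau \ge 3.
\end{aligned}
```

Because the autocovariances of $\Delta^2 y_t$ vanish beyond lag 2, $\Delta^2 y_t$ has an MA(2) representation, so $y_t$ is ARIMA(0,2,2) with restricted parameters.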
Seasonal Effects


We have seen specifications for $\mu_t$ in the basic model

$$y_t = \mu_t + \gamma_t + \varepsilon_t.$$

Now we will consider the seasonal term $\gamma_t$. Let $s$ denote the
number of ‘seasons’ in the data:
  • $s = 12$ for monthly data,
  • $s = 4$ for quarterly data,
  • $s = 7$ for daily data when modelling a weekly pattern.
Dummy Seasonal

The simplest way to model seasonal effects is by using dummy
variables. The effect summed over the seasons should equal zero:

$$\gamma_{t+1} = -\sum_{j=1}^{s-1} \gamma_{t+1-j}.$$

To allow the pattern to change over time, we introduce a new
disturbance term:

$$\gamma_{t+1} = -\sum_{j=1}^{s-1} \gamma_{t+1-j} + \omega_t, \qquad \omega_t \sim \mathrm{NID}(0, \sigma_\omega^2).$$

The expectation of the sum of the seasonal effects is zero.
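The dummy seasonal recursion can be run directly. In this sketch the function name, the quarterly starting effects and the seed are arbitrary assumptions; setting `sigma_omega` to zero gives the fixed pattern, and a positive value lets the pattern drift.

```python
import random


def simulate_dummy_seasonal(initial, n, sigma_omega=0.0, seed=0):
    """Extend s - 1 initial effects by gamma_{t+1} = -sum(last s - 1 effects) + omega_t."""
    rng = random.Random(seed)
    gamma = list(initial)                # gamma_1, ..., gamma_{s-1}
    k = len(initial)                     # k = s - 1
    while len(gamma) < n:
        omega = rng.gauss(0.0, sigma_omega)
        gamma.append(-sum(gamma[-k:]) + omega)
    return gamma


s = 4                                    # quarterly data
start = [1.0, -2.0, 0.5]                 # arbitrary effects for the first s - 1 quarters
fixed = simulate_dummy_seasonal(start, 12)
print(fixed)                             # the pattern repeats every s = 4 periods

moving = simulate_dummy_seasonal(start, 12, sigma_omega=0.1, seed=3)
print([round(g, 2) for g in moving])     # pattern changes slowly over time
```

In the fixed case every window of $s$ consecutive effects sums to exactly zero; with the disturbance the sum is zero only in expectation.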
Trigonometric Seasonal
Defining $\gamma_{jt}$ as the effect of season $j$ at time $t$, an alternative
specification for the seasonal pattern is

$$\gamma_t = \sum_{j=1}^{[s/2]} \gamma_{jt},$$
$$\gamma_{j,t+1} = \gamma_{jt} \cos\lambda_j + \gamma_{jt}^* \sin\lambda_j + \omega_{jt},$$
$$\gamma_{j,t+1}^* = -\gamma_{jt} \sin\lambda_j + \gamma_{jt}^* \cos\lambda_j + \omega_{jt}^*,$$
$$\omega_{jt}, \omega_{jt}^* \sim \mathrm{NID}(0, \sigma_\omega^2), \qquad \lambda_j = 2\pi j/s.$$

  • Without the disturbance, the trigonometric specification is
    identical to the deterministic dummy specification.
  • The autocorrelation in the trigonometric specification lasts
    through more lags: changes occur in a smoother way.
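Without disturbances the trigonometric recursion is a set of rotations, one per seasonal frequency, so the resulting pattern is periodic and sums to zero over a full period. A minimal sketch, with a hypothetical function name and arbitrary starting values:

```python
import math


def trig_seasonal_path(s, gamma0, gamma_star0, n):
    """Deterministic trigonometric seasonal (all disturbances set to zero).

    gamma0[j-1] and gamma_star0[j-1] are starting values for harmonic
    j = 1, ..., [s/2]. Returns gamma_t = sum_j gamma_{jt} for t = 1, ..., n.
    """
    m = s // 2
    lam = [2.0 * math.pi * j / s for j in range(1, m + 1)]
    g, g_star = list(gamma0), list(gamma_star0)
    path = []
    for _ in range(n):
        path.append(sum(g))
        for j in range(m):
            c, sn = math.cos(lam[j]), math.sin(lam[j])
            # Both updates use the values from time t (tuple assignment).
            g[j], g_star[j] = (g[j] * c + g_star[j] * sn,
                               -g[j] * sn + g_star[j] * c)
    return path


s = 4
path = trig_seasonal_path(s, gamma0=[1.0, 0.5], gamma_star0=[-0.3, 0.0], n=3 * s)
print([round(v, 3) for v in path])
```

Adding independent Gaussian draws for $\omega_{jt}$ and $\omega_{jt}^*$ inside the inner loop would give the stochastic version.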
Unobserved Component Models


• Different specifications for the trend and the seasonal can be
  freely combined.
• Other components of interest, like cycles, explanatory
  variables, intervention effects, outliers, are easily added.
• UC models are Multiple Source of Errors models. The reduced
  form is a Single Source of Errors model.
• We model non-stationarity directly.
• Components have an explicit interpretation: the model is not
  just a forecasting device.




Seatbelt Law
[Figure: Seatbelt Law series; horizontal axis 70 to 85, vertical axis 7.0 to 7.9.]
Seatbelt Law: decomposition

[Figure, three panels: drivers with Level+Reg (top); drivers−Seasonal (middle); drivers−Irregular (bottom); horizontal axis 70 to 85.]
Seatbelt Law: forecasting
[Figure: Seatbelt Law forecasting; horizontal axis 70 to 85, vertical axis 7.0 to 7.9.]
Textbooks


• A.C. Harvey (1989). Forecasting, Structural Time Series
  Models and the Kalman Filter. Cambridge University Press.
• G. Kitagawa & W. Gersch (1996). Smoothness Priors Analysis
  of Time Series. Springer-Verlag.
• J. Harrison & M. West (1997). Bayesian Forecasting and
  Dynamic Models. Springer-Verlag.
• J. Durbin & S.J. Koopman (2001). Time Series Analysis by
  State Space Methods. Oxford University Press.
• J.J.F. Commandeur & S.J. Koopman (2007). An Introduction
  to State Space Time Series Analysis. Oxford University Press.
Exercises

1. Consider the LL model (see slides; see DK, i.e. Durbin & Koopman (2001), chapter 2).
     • The reduced form is an ARIMA(0,1,1) process. Derive the
       relationship between the signal-to-noise ratio $q$ of the LL model and
       the $\theta$ coefficient of the ARIMA model;
     • Derive the reduced form in the case $\eta_t = \sqrt{q}\,\varepsilon_t$ and notice the
       difference with the general case;
     • Give the elements of the mean vector and variance matrix of
       $y = (y_1, \dots, y_n)'$ when $y_t$ is generated by a LL model for
       $t = 1, \dots, n$.

2. Consider the LLT model (see slides; see DK section 3.2.1).
     • Show that the reduced form is an ARIMA(0,2,2) process;
     • Discuss the initial values for the level and slope of the LLT model;
     • Relate the LLT model forecasts to the Holt-Winters method
       of forecasting. Comment.

More Related Content

What's hot

11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
Alexander Decker
 
Optimalpolicyhandout
OptimalpolicyhandoutOptimalpolicyhandout
Optimalpolicyhandout
NBER
 
5. cem granger causality ecm
5. cem granger causality  ecm 5. cem granger causality  ecm
5. cem granger causality ecm
Quang Hoang
 

What's hot (18)

Parameter Estimation in Stochastic Differential Equations by Continuous Optim...
Parameter Estimation in Stochastic Differential Equations by Continuous Optim...Parameter Estimation in Stochastic Differential Equations by Continuous Optim...
Parameter Estimation in Stochastic Differential Equations by Continuous Optim...
 
Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...
 
Camera calibration
Camera calibrationCamera calibration
Camera calibration
 
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
11.[104 111]analytical solution for telegraph equation by modified of sumudu ...
 
Random Matrix Theory and Machine Learning - Part 2
Random Matrix Theory and Machine Learning - Part 2Random Matrix Theory and Machine Learning - Part 2
Random Matrix Theory and Machine Learning - Part 2
 
Optimalpolicyhandout
OptimalpolicyhandoutOptimalpolicyhandout
Optimalpolicyhandout
 
Is the Macroeconomy Locally Unstable and Why Should We Care?
Is the Macroeconomy Locally Unstable and Why Should We Care?Is the Macroeconomy Locally Unstable and Why Should We Care?
Is the Macroeconomy Locally Unstable and Why Should We Care?
 
Lesson 2: A Catalog of Essential Functions
Lesson 2: A Catalog of Essential FunctionsLesson 2: A Catalog of Essential Functions
Lesson 2: A Catalog of Essential Functions
 
Random Matrix Theory and Machine Learning - Part 1
Random Matrix Theory and Machine Learning - Part 1Random Matrix Theory and Machine Learning - Part 1
Random Matrix Theory and Machine Learning - Part 1
 
Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4Random Matrix Theory and Machine Learning - Part 4
Random Matrix Theory and Machine Learning - Part 4
 
Geodesic Method in Computer Vision and Graphics
Geodesic Method in Computer Vision and GraphicsGeodesic Method in Computer Vision and Graphics
Geodesic Method in Computer Vision and Graphics
 
On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...On the solvability of a system of forward-backward linear equations with unbo...
On the solvability of a system of forward-backward linear equations with unbo...
 
Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3Random Matrix Theory and Machine Learning - Part 3
Random Matrix Theory and Machine Learning - Part 3
 
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
2018 MUMS Fall Course - Statistical and Mathematical Techniques for Sensitivi...
 
5. cem granger causality ecm
5. cem granger causality  ecm 5. cem granger causality  ecm
5. cem granger causality ecm
 
Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...
 
YSC 2013
YSC 2013YSC 2013
YSC 2013
 
Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...Estimation of the score vector and observed information matrix in intractable...
Estimation of the score vector and observed information matrix in intractable...
 

Viewers also liked

Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06
Cdiscount
 
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Cdiscount
 
Scm prix blé_2012_11_06
Scm prix blé_2012_11_06Scm prix blé_2012_11_06
Scm prix blé_2012_11_06
Cdiscount
 
Paris2012 session4
Paris2012 session4Paris2012 session4
Paris2012 session4
Cdiscount
 
Présentation Olivier Biau Random forests et conjoncture
Présentation Olivier Biau Random forests et conjoncturePrésentation Olivier Biau Random forests et conjoncture
Présentation Olivier Biau Random forests et conjoncture
Cdiscount
 
Forecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business SurveysForecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business Surveys
Cdiscount
 
Prévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnellesPrévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnelles
Cdiscount
 
FLTauR - Construction de modèles de prévision sous r avec le package caret
FLTauR - Construction de modèles de prévision sous r avec le package caretFLTauR - Construction de modèles de prévision sous r avec le package caret
FLTauR - Construction de modèles de prévision sous r avec le package caret
jfeudeline
 

Viewers also liked (20)

Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06
 
Scm risques
Scm risquesScm risques
Scm risques
 
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
 
Scm prix blé_2012_11_06
Scm prix blé_2012_11_06Scm prix blé_2012_11_06
Scm prix blé_2012_11_06
 
Robust sequentiel learning
Robust sequentiel learningRobust sequentiel learning
Robust sequentiel learning
 
Paris2012 session4
Paris2012 session4Paris2012 session4
Paris2012 session4
 
Prévisions trafic aérien
Prévisions trafic aérienPrévisions trafic aérien
Prévisions trafic aérien
 
Présentation Olivier Biau Random forests et conjoncture
Présentation Olivier Biau Random forests et conjoncturePrésentation Olivier Biau Random forests et conjoncture
Présentation Olivier Biau Random forests et conjoncture
 
Ranking binaire, agrégation multiclasses
Ranking binaire, agrégation multiclasses Ranking binaire, agrégation multiclasses
Ranking binaire, agrégation multiclasses
 
Forecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business SurveysForecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business Surveys
 
Prévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnellesPrévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnelles
 
Prediction in dynamic Graphs
Prediction in dynamic GraphsPrediction in dynamic Graphs
Prediction in dynamic Graphs
 
Gur1009
Gur1009Gur1009
Gur1009
 
R2DOCX : R + WORD
R2DOCX : R + WORDR2DOCX : R + WORD
R2DOCX : R + WORD
 
Big data with r
Big data with rBig data with r
Big data with r
 
R Devtools
R DevtoolsR Devtools
R Devtools
 
Presentation r markdown
Presentation r markdown Presentation r markdown
Presentation r markdown
 
R versur Python
R versur PythonR versur Python
R versur Python
 
Présentation G.Biau Random Forests
Présentation G.Biau Random ForestsPrésentation G.Biau Random Forests
Présentation G.Biau Random Forests
 
FLTauR - Construction de modèles de prévision sous r avec le package caret
FLTauR - Construction de modèles de prévision sous r avec le package caretFLTauR - Construction de modèles de prévision sous r avec le package caret
FLTauR - Construction de modèles de prévision sous r avec le package caret
 

Similar to Paris2012 session1

SolutionsPlease see answer in bold letters.Note pi = 3.14.docx
SolutionsPlease see answer in bold letters.Note pi = 3.14.docxSolutionsPlease see answer in bold letters.Note pi = 3.14.docx
SolutionsPlease see answer in bold letters.Note pi = 3.14.docx
rafbolet0
 
Eonometrics for acct and finance ch 6 2023 (2).pdf
Eonometrics for acct and finance ch 6   2023 (2).pdfEonometrics for acct and finance ch 6   2023 (2).pdf
Eonometrics for acct and finance ch 6 2023 (2).pdf
GETAHUNASSEFAALEMU
 
MATLAB sessions Laboratory 6MAT 275 Laboratory 6Forced .docx
MATLAB sessions Laboratory 6MAT 275 Laboratory 6Forced .docxMATLAB sessions Laboratory 6MAT 275 Laboratory 6Forced .docx
MATLAB sessions Laboratory 6MAT 275 Laboratory 6Forced .docx
andreecapon
 
Amth250 octave matlab some solutions (1)
Amth250 octave matlab some solutions (1)Amth250 octave matlab some solutions (1)
Amth250 octave matlab some solutions (1)
asghar123456
 
Chapter 4: Modern Location Theory of the Firm
Chapter 4: Modern Location Theory of the FirmChapter 4: Modern Location Theory of the Firm
Chapter 4: Modern Location Theory of the Firm
DISPAR
 
11.[95 103]solution of telegraph equation by modified of double sumudu transf...
11.[95 103]solution of telegraph equation by modified of double sumudu transf...11.[95 103]solution of telegraph equation by modified of double sumudu transf...
11.[95 103]solution of telegraph equation by modified of double sumudu transf...
Alexander Decker
 
Chapter 14 solutions_to_exercises(engineering circuit analysis 7th)
Chapter 14 solutions_to_exercises(engineering circuit analysis 7th)Chapter 14 solutions_to_exercises(engineering circuit analysis 7th)
Chapter 14 solutions_to_exercises(engineering circuit analysis 7th)
Maamoun Hennache
 
Wide sense stationary process in digital communication
Wide sense stationary process in digital communicationWide sense stationary process in digital communication
Wide sense stationary process in digital communication
VitthalGavhane1
 
Ee443 phase locked loop - presentation - schwappach and brandy
Ee443   phase locked loop - presentation - schwappach and brandyEe443   phase locked loop - presentation - schwappach and brandy
Ee443 phase locked loop - presentation - schwappach and brandy
Loren Schwappach
 

Paris2012 session1

  • 1. Forecasting in State Space: theory and practice Siem Jan Koopman http://personal.vu.nl/s.j.koopman Department of Econometrics VU University Amsterdam Tinbergen Institute 2012
  • 2. Program
    Lectures:
    • Introduction to UC models
    • State space methods
    • Forecasting time series with different components
    • Practice of Forecasting with Illustrations
    Exercises and assignments will be part of the course.
  • 3. Time Series
    A time series is a set of observations $y_t$, each one recorded at a specific time $t$. The observations are ordered over time. We assume that $n$ observations are available, $t = 1, \ldots, n$.
    Examples of time series are:
    • Number of cars sold each year
    • Gross Domestic Product of a country
    • Stock prices during one day
    • Number of firm defaults
    Our purpose is to identify and to model the serial or "dynamic" correlation structure in the time series. Time series analysis may be relevant for economic policy, financial decision making and forecasting.
  • 4. Example: Nile data [Figure: time series plot of the Nile data, 1870 to 1970; vertical axis from 500 to 1400]
  • 5. Example: GDP growth, quarter by quarter [Figure: quarterly GDP growth, 2000 to 2012; vertical axis from −4 to 5]
  • 6. Example: winner boat races Cambridge/Oxford [Figure: winner of the Cambridge/Oxford boat races, 1840 to 2000; vertical axis from 0.1 to 1.0]
  • 7. Example: US monthly unemployment
  • 8. Sources for time series data
    Data sources:
    • US economics: http://research.stlouisfed.org/fred2/
    • DK book data: http://www.ssfpack.com/dkbook.html
    • Financial data: Datastream, Yahoo Finance
  • 10. White noise processes
    The simplest example of a stationary process is a white noise (WN) process, which we usually denote as $\varepsilon_t$. A white noise process is a sequence of uncorrelated random variables, each with zero mean and constant variance $\sigma_\varepsilon^2$:
    $$\varepsilon_t \sim \mathrm{WN}(0, \sigma_\varepsilon^2).$$
    The autocovariance function is equal to zero for lags $h > 0$:
    $$\gamma_Y(h) = \begin{cases} \sigma_\varepsilon^2 & \text{if } h = 0, \\ 0 & \text{if } h \neq 0. \end{cases}$$
  • 11. White noise realisations [Figure: white noise, 500 observations]
  • 12. White noise ACF and SACF [Figure: four panels; theoretical ACF with max lag 50 and with max lag 500, sample ACF with n = 50 and with n = 500]
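The contrast between the theoretical and the sample ACF of white noise can be reproduced with a few lines of Python. This is only a sketch: the Gaussian distribution, the seed, the helper name `sample_acf` and the approximate 95% band $\pm 1.96/\sqrt{n}$ are assumptions made for illustration and are not taken from the slides.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (autocovariances scaled by 1/n)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    c0 = np.dot(d, d) / n
    return np.array([np.dot(d[h:], d[:n - h]) / n / c0 for h in range(max_lag + 1)])

rng = np.random.default_rng(0)            # arbitrary seed (assumption)
eps = rng.normal(0.0, 1.0, size=500)      # Gaussian white noise with variance 1
acf = sample_acf(eps, 50)

# For white noise the sample autocorrelations at lags h > 0 scatter around zero,
# roughly within +/- 1.96 / sqrt(n).
band = 1.96 / np.sqrt(len(eps))
print("share of lags 1..50 inside the band:", np.mean(np.abs(acf[1:]) < band))
```

The theoretical ACF is exactly zero at every lag $h > 0$; the sample ACF is not, which is why a band of this kind is commonly used to judge whether the deviations are noteworthy.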
  • 13. Random Walk processes
    If $\varepsilon_1, \varepsilon_2, \ldots$ come from a white noise process with variance $\sigma^2$, then the process $\{Y_t\}$ with
    $$Y_t = \varepsilon_1 + \varepsilon_2 + \ldots + \varepsilon_t \quad \text{for } t = 1, 2, \ldots$$
    is called a random walk. A recursive way to define a random walk is:
    $$Y_t = Y_{t-1} + \varepsilon_t \quad \text{for } t = 2, 3, \ldots, \qquad Y_1 = \varepsilon_1.$$
  • 14. Random Walk properties I
    A random walk is not stationary, because the variance of $Y_t$ is time-varying:
    $$E(Y_t) = E(\varepsilon_1 + \ldots + \varepsilon_t) = 0,$$
    $$\mathrm{Var}(Y_t) = E(Y_t^2) = E[(\varepsilon_1 + \ldots + \varepsilon_t)^2] = t\sigma^2.$$
    The autocovariance function is equal to:
    $$\gamma(t, t-h) = E(Y_t Y_{t-h}) = E\Big[\Big(\sum_{j=1}^{t-h} \varepsilon_j + \sum_{j=t-h+1}^{t} \varepsilon_j\Big)\Big(\sum_{j=1}^{t-h} \varepsilon_j\Big)\Big] = (t-h)\sigma^2.$$
    This means that the variance and the autocovariances go to infinity if $t \to \infty$.
  • 15. Random Walk properties II
    The autocorrelation of $Y_t$ and $Y_{t-h}$ is
    $$\rho(t, t-h) = \frac{\gamma(t, t-h)}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-h})}} = \frac{(t-h)\sigma^2}{\sqrt{(t\sigma^2)((t-h)\sigma^2)}} = \frac{\sqrt{t-h}}{\sqrt{t}}.$$
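The random walk moments derived above can be checked by Monte Carlo. The sketch below assumes Gaussian increments, $\sigma^2 = 1$, and the illustrative choice $t = 100$, $h = 36$, for which the formulas give $\mathrm{Var}(Y_t) = 100$, $\gamma(t, t-h) = 64$ and $\rho(t, t-h) = 0.8$; the number of simulated paths and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed (assumption)
sigma2 = 1.0
n_paths, n = 20000, 100

# Each row is one random walk: Y_t = eps_1 + ... + eps_t
eps = rng.normal(0.0, np.sqrt(sigma2), size=(n_paths, n))
Y = np.cumsum(eps, axis=1)                # column t-1 holds Y_t

t, h = 100, 36
y_t, y_lag = Y[:, t - 1], Y[:, t - h - 1]

var_t = y_t.var()                         # theory: t * sigma2
cov = np.mean(y_t * y_lag)                # theory: (t - h) * sigma2 (means are zero)
corr = np.corrcoef(y_t, y_lag)[0, 1]      # theory: sqrt((t - h) / t)

print("Var(Y_t):", var_t, "vs", t * sigma2)
print("gamma(t, t-h):", cov, "vs", (t - h) * sigma2)
print("rho(t, t-h):", corr, "vs", np.sqrt((t - h) / t))
```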
  • 16. RW realisation [Figure: random walk, 500 observations]
  • 17. RW sample ACF [Figure: sample ACF of the random walk, 50 lags]
  • 19. Classical Decomposition
    A basic model for representing a time series is the additive model
    $$y_t = \mu_t + \gamma_t + \varepsilon_t, \qquad t = 1, \ldots, n,$$
    also known as the Classical Decomposition, where
    • $y_t$ = observation,
    • $\mu_t$ = slowly changing component (trend),
    • $\gamma_t$ = periodic component (seasonal),
    • $\varepsilon_t$ = irregular component (disturbance).
    In a Structural Time Series Model (STSM) or Unobserved Components Model (UCM), the components on the right-hand side are modelled explicitly as stochastic processes.
  • 20. Nile data [Figure: time series plot of the Nile data, 1870 to 1970; vertical axis from 500 to 1400]
  • 21. Local Level Model
    • Components can be deterministic functions of time (e.g. polynomials), or stochastic processes;
    • Deterministic example: $y_t = \mu + \varepsilon_t$ with $\varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2)$;
    • Stochastic example: the Random Walk plus Noise, or Local Level model:
      $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
      $$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2);$$
    • The disturbances $\varepsilon_t$, $\eta_s$ are independent for all $s, t$;
    • The model is incomplete without a specification for $\mu_1$ (note the non-stationarity): $\mu_1 \sim N(a, P)$.
  • 22. Local Level Model
    $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
    $$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2), \qquad \mu_1 \sim N(a, P).$$
    • The level $\mu_t$ and the irregular $\varepsilon_t$ are unobserved;
    • Parameters: $a$, $P$, $\sigma_\varepsilon^2$, $\sigma_\eta^2$;
    • Trivial special cases:
      • $\sigma_\eta^2 = 0 \implies y_t \sim \mathrm{NID}(\mu_1, \sigma_\varepsilon^2)$ (WN with constant level);
      • $\sigma_\varepsilon^2 = 0 \implies y_{t+1} = y_t + \eta_t$ (pure RW);
    • Local Level is a model representation for EWMA forecasting.
  • 23. Simulated LL Data [Figure: simulated $y$ and $\mu$ with $\sigma_\varepsilon^2 = 0.1$, $\sigma_\eta^2 = 1$; 100 observations]
  • 24. Simulated LL Data [Figure: simulated $y$ and $\mu$ with $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 1$; 100 observations]
  • 25. Simulated LL Data [Figure: simulated $y$ and $\mu$ with $\sigma_\varepsilon^2 = 1$, $\sigma_\eta^2 = 0.1$; 100 observations]
  • 26. Simulated LL Data [Figure: the three simulated series shown together in stacked panels]
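Series like those in the simulated-data slides can be generated directly from the local level equations. The following is a minimal sketch, not the code used for the slides: the function name `simulate_local_level`, the default initial values $a = 0$, $P = 0$ and the seed are assumptions; only the three variance pairs are taken from the slides.

```python
import numpy as np

def simulate_local_level(n, sigma2_eps, sigma2_eta, a=0.0, P=0.0, rng=None):
    """Simulate y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t, with mu_1 ~ N(a, P)."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.empty(n)
    mu[0] = a + np.sqrt(P) * rng.normal()
    eta = rng.normal(0.0, np.sqrt(sigma2_eta), size=n - 1)
    mu[1:] = mu[0] + np.cumsum(eta)                 # random walk level
    y = mu + rng.normal(0.0, np.sqrt(sigma2_eps), size=n)
    return y, mu

rng = np.random.default_rng(3)                      # arbitrary seed (assumption)
# Variance pairs (sigma2_eps, sigma2_eta) as in the simulated-data slides
for s2_eps, s2_eta in [(0.1, 1.0), (1.0, 1.0), (1.0, 0.1)]:
    y, mu = simulate_local_level(100, s2_eps, s2_eta, rng=rng)
    print(s2_eps, s2_eta, "sample variance of y - mu:", np.var(y - mu))
```

Setting `sigma2_eta=0` gives white noise around a constant level, and `sigma2_eps=0` gives a pure random walk, matching the two trivial special cases of the model.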
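  • 27. Properties of the LL model
    $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
    $$\mu_{t+1} = \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2).$$
    • First difference is stationary:
      $$\Delta y_t = \Delta\mu_t + \Delta\varepsilon_t = \eta_{t-1} + \varepsilon_t - \varepsilon_{t-1}.$$
    • Dynamic properties of $\Delta y_t$:
      $$E(\Delta y_t) = 0,$$
      $$\gamma_0 = E(\Delta y_t \Delta y_t) = \sigma_\eta^2 + 2\sigma_\varepsilon^2,$$
      $$\gamma_1 = E(\Delta y_t \Delta y_{t-1}) = -\sigma_\varepsilon^2,$$
      $$\gamma_\tau = E(\Delta y_t \Delta y_{t-\tau}) = 0 \quad \text{for } \tau \ge 2.$$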
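  • 28. Properties of the LL model
    • The ACF of $\Delta y_t$ is
      $$\rho_1 = \frac{-\sigma_\varepsilon^2}{\sigma_\eta^2 + 2\sigma_\varepsilon^2} = -\frac{1}{q+2}, \qquad q = \sigma_\eta^2/\sigma_\varepsilon^2,$$
      $$\rho_\tau = 0, \qquad \tau \ge 2.$$
    • $q$ is called the signal-noise ratio;
    • The model for $\Delta y_t$ is MA(1) with restricted parameters such that $-1/2 \le \rho_1 \le 0$, i.e., $y_t$ is ARIMA(0,1,1);
    • Write $\Delta y_t = \xi_t + \theta\xi_{t-1}$, $\xi_t \sim \mathrm{NID}(0, \sigma^2)$, to solve for $\theta$:
      $$\theta = \frac{1}{2}\left(\sqrt{q^2 + 4q} - 2 - q\right).$$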
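The mapping from the signal-noise ratio $q$ to the MA(1) coefficient $\theta$ can be verified numerically: an MA(1) process has first autocorrelation $\theta/(1+\theta^2)$, which should equal $-1/(q+2)$. The function names below are illustrative, and the values of $q$ are arbitrary.

```python
import math

def theta_from_q(q):
    """Invertible MA(1) coefficient implied by the local level model with signal-noise ratio q."""
    return 0.5 * (math.sqrt(q * q + 4.0 * q) - 2.0 - q)

def rho1_ma1(theta):
    """First autocorrelation of the MA(1) process xi_t + theta * xi_{t-1}."""
    return theta / (1.0 + theta * theta)

for q in (0.1, 1.0, 10.0):
    theta = theta_from_q(q)
    # The last two printed columns should agree: rho_1 from theta and -1 / (q + 2)
    print(q, theta, rho1_ma1(theta), -1.0 / (q + 2.0))
```

As $q \to 0$ the coefficient approaches $-1$, and as $q \to \infty$ it approaches $0$, so $\theta$ always lies in $[-1, 0]$ and the implied $\rho_1$ in $[-1/2, 0]$.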
  • 29. Local Level Model
    • The model parameters are estimated by Maximum Likelihood;
    • Advantages of model based approach: assumptions can be tested, parameters are estimated rather than "calibrated";
    • Estimated model can be used for signal extraction;
    • The estimated level $\mu_t$ is obtained as a locally weighted average;
    • The distribution of weights can be compared with Kernel functions in nonparametric regressions;
    • Within the model, our methods yield MMSE forecasts.
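The slides do not spell out the estimation algorithm at this point. As a sketch of how likelihood evaluation and forecasting might look, the block below implements the standard Kalman filter recursions for the local level model and evaluates the Gaussian log-likelihood through the prediction error decomposition. The function name `local_level_filter`, the large initial variance `P1` (a crude stand-in for a diffuse prior), the simulated data and the coarse variance grid are all assumptions made for illustration.

```python
import numpy as np

def local_level_filter(y, sigma2_eps, sigma2_eta, a1=0.0, P1=1e7):
    """Kalman filter for the local level model.

    Returns predicted levels a[0..n] (a[t] is the prediction of mu_{t+1} given
    y_1..y_t), their variances P[0..n], and the Gaussian log-likelihood.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    a = np.empty(n + 1)
    P = np.empty(n + 1)
    a[0], P[0] = a1, P1
    loglik = 0.0
    for t in range(n):
        v = y[t] - a[t]                   # one-step prediction error
        F = P[t] + sigma2_eps             # prediction error variance
        K = P[t] / F                      # Kalman gain
        a[t + 1] = a[t] + K * v
        P[t + 1] = P[t] * sigma2_eps / F + sigma2_eta   # equals P_t (1 - K_t) + sigma2_eta
        loglik += -0.5 * (np.log(2.0 * np.pi) + np.log(F) + v * v / F)
    return a, P, loglik

# Hypothetical data simulated from the model, so the true variances are known
rng = np.random.default_rng(2)
mu = np.cumsum(rng.normal(0.0, 1.0, 200))           # sigma2_eta = 1.0
y = mu + rng.normal(0.0, np.sqrt(0.5), 200)         # sigma2_eps = 0.5

# Crude maximum likelihood: evaluate the log-likelihood on a coarse grid
grid = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
best = max((local_level_filter(y, se, sh)[2], se, sh) for se in grid for sh in grid)
a, P, _ = local_level_filter(y, best[1], best[2])
print("grid estimate (sigma2_eps, sigma2_eta):", best[1], best[2])
print("one-step forecast:", a[-1], "forecast variance:", P[-1] + best[1])
```

In practice a numerical optimiser would replace the grid. Note also that the gain `K` settles quickly to a constant, so the level update `a[t+1] = a[t] + K * v` becomes an exponentially weighted moving average, which is the sense in which the local level model underlies EWMA forecasting.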
  • 30. Signal Extraction and Weights for the Nile Data [Figure: three rows of paired panels; left: data and estimated level, 1880 to 1960; right: the corresponding weights over lags −20 to 20]
  • 31. Local Linear Trend Model
    The LLT model extends the LL model with a slope:
    $$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma_\varepsilon^2),$$
    $$\mu_{t+1} = \beta_t + \mu_t + \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, \sigma_\eta^2),$$
    $$\beta_{t+1} = \beta_t + \xi_t, \qquad \xi_t \sim \mathrm{NID}(0, \sigma_\xi^2).$$
    • All disturbances are independent at all lags and leads;
    • Initial distributions for $\beta_1$, $\mu_1$ need to be specified;
    • If $\sigma_\xi^2 = 0$ the trend is a random walk with constant drift $\beta_1$ (for $\beta_1 = 0$ the model reduces to a LL model);
    • If additionally $\sigma_\eta^2 = 0$ the trend is a straight line with slope $\beta_1$ and intercept $\mu_1$;
    • If $\sigma_\xi^2 > 0$ but $\sigma_\eta^2 = 0$, the trend is a smooth curve, or an Integrated Random Walk.
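The special cases of the LLT model can be explored by simulation. In this sketch the function name `simulate_llt` and the use of fixed initial values `mu1` and `beta1` (rather than initial distributions) are simplifying assumptions; the variance settings in the usage lines are arbitrary.

```python
import numpy as np

def simulate_llt(n, sigma2_eps, sigma2_eta, sigma2_xi, mu1=0.0, beta1=0.0, rng=None):
    """Simulate the local linear trend model with fixed initial level and slope."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.empty(n)
    beta = np.empty(n)
    mu[0], beta[0] = mu1, beta1
    for t in range(n - 1):
        mu[t + 1] = mu[t] + beta[t] + rng.normal(0.0, np.sqrt(sigma2_eta))
        beta[t + 1] = beta[t] + rng.normal(0.0, np.sqrt(sigma2_xi))
    y = mu + rng.normal(0.0, np.sqrt(sigma2_eps), size=n)
    return y, mu, beta

rng = np.random.default_rng(4)                      # arbitrary seed (assumption)
# Random walk with drift: sigma2_xi = 0, so the slope stays at beta1
y_drift, mu_drift, beta_drift = simulate_llt(100, 1.0, 0.5, 0.0, beta1=0.2, rng=rng)
# Integrated random walk: sigma2_eta = 0, sigma2_xi > 0 gives a smooth trend
y_irw, mu_irw, beta_irw = simulate_llt(100, 1.0, 0.0, 0.01, rng=rng)
print("slope range (drift case):", beta_drift.min(), beta_drift.max())
print("slope range (IRW case):", beta_irw.min(), beta_irw.max())
```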
  • 32. Trend and Slope in LLT Model [Figure: two panels over 100 observations; trend $\mu$ and slope $\beta$]
  • 33. Trend and Slope in Integrated Random Walk Model [Figure: two panels over 100 observations; trend $\mu$ and slope $\beta$]
  • 34. Local Linear Trend Model
    • Reduced form of LLT is ARIMA(0,2,2);
    • LLT provides a model for Holt-Winters forecasting;
    • Smooth LLT provides a model for spline-fitting;
    • Smoother trends: higher order Random Walks $\Delta^d \mu_t = \eta_t$.
  • 35. Seasonal Effects
    We have seen specifications for $\mu_t$ in the basic model
    $$y_t = \mu_t + \gamma_t + \varepsilon_t.$$
    Now we will consider the seasonal term $\gamma_t$. Let $s$ denote the number of 'seasons' in the data:
    • $s = 12$ for monthly data,
    • $s = 4$ for quarterly data,
    • $s = 7$ for daily data when modelling a weekly pattern.
  • 36. Dummy Seasonal
    The simplest way to model seasonal effects is by using dummy variables. The effect summed over the seasons should equal zero:
    $$\gamma_{t+1} = -\sum_{j=1}^{s-1} \gamma_{t+1-j}.$$
    To allow the pattern to change over time, we introduce a new disturbance term:
    $$\gamma_{t+1} = -\sum_{j=1}^{s-1} \gamma_{t+1-j} + \omega_t, \qquad \omega_t \sim \mathrm{NID}(0, \sigma_\omega^2).$$
    The expectation of the sum of the seasonal effects is zero.
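The dummy seasonal recursion is easy to trace in code. The sketch below uses a hypothetical helper `simulate_dummy_seasonal` that takes the first $s-1$ seasonal effects as given; the quarterly starting values are made up. With $\sigma_\omega^2 = 0$ the pattern repeats exactly and every $s$ consecutive effects sum to zero, while with $\sigma_\omega^2 > 0$ the pattern drifts.

```python
import numpy as np

def simulate_dummy_seasonal(n, s, init, sigma2_omega=0.0, rng=None):
    """Generate gamma_1..gamma_n from gamma_{t+1} = -(sum of the previous s-1 effects) + omega_t."""
    rng = np.random.default_rng() if rng is None else rng
    gamma = list(init)
    if len(gamma) != s - 1:
        raise ValueError("init must contain the first s - 1 seasonal effects")
    while len(gamma) < n:
        omega = rng.normal(0.0, np.sqrt(sigma2_omega))
        gamma.append(-sum(gamma[-(s - 1):]) + omega)
    return np.array(gamma[:n])

# Deterministic quarterly pattern (s = 4) with made-up starting effects
g_fixed = simulate_dummy_seasonal(12, 4, [1.0, -2.0, 0.5])
print(g_fixed)

# Stochastic version: the seasonal pattern changes slowly over time
g_stoch = simulate_dummy_seasonal(12, 4, [1.0, -2.0, 0.5], sigma2_omega=0.05,
                                  rng=np.random.default_rng(5))
print(g_stoch)
```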
  • 37. Trigonometric Seasonal
    Defining $\gamma_{jt}$ as the effect of season $j$ at time $t$, an alternative specification for the seasonal pattern is
    $$\gamma_t = \sum_{j=1}^{[s/2]} \gamma_{jt},$$
    $$\gamma_{j,t+1} = \gamma_{jt}\cos\lambda_j + \gamma^*_{jt}\sin\lambda_j + \omega_{jt},$$
    $$\gamma^*_{j,t+1} = -\gamma_{jt}\sin\lambda_j + \gamma^*_{jt}\cos\lambda_j + \omega^*_{jt},$$
    $$\omega_{jt}, \omega^*_{jt} \sim \mathrm{NID}(0, \sigma_\omega^2), \qquad \lambda_j = 2\pi j/s.$$
    • Without the disturbance, the trigonometric specification is identical to the deterministic dummy specification.
    • The autocorrelation in the trigonometric specification lasts through more lags: changes occur in a smoother way.
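One harmonic of the trigonometric seasonal is a rotation by the angle $\lambda_j$ at each time step. The sketch below traces a single harmonic without disturbances for monthly data; the helper name `trig_seasonal_step` and the starting values $\gamma_{j1} = 1$, $\gamma^*_{j1} = 0$ are assumptions chosen for illustration.

```python
import numpy as np

def trig_seasonal_step(gamma, gamma_star, lam, omega=0.0, omega_star=0.0):
    """One step of the trigonometric seasonal recursion for a single harmonic."""
    c, sn = np.cos(lam), np.sin(lam)
    return (gamma * c + gamma_star * sn + omega,
            -gamma * sn + gamma_star * c + omega_star)

s, j = 12, 1                      # monthly data, first harmonic
lam = 2.0 * np.pi * j / s
g, g_star = 1.0, 0.0              # illustrative starting values
path = [g]
for _ in range(s):
    g, g_star = trig_seasonal_step(g, g_star, lam)
    path.append(g)

# Without disturbances the harmonic traces cos(lam * t) and repeats after s steps
print(np.round(path, 3))
```

Passing non-zero `omega` and `omega_star` draws at each step lets the amplitude and phase of the harmonic evolve over time, which is how the stochastic version allows the seasonal pattern to change.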
  • 38. Unobserved Component Models
    • Different specifications for the trend and the seasonal can be freely combined.
    • Other components of interest, like cycles, explanatory variables, intervention effects and outliers, are easily added.
    • UC models are Multiple Source of Errors models. The reduced form is a Single Source of Errors model.
    • We model non-stationarity directly.
    • Components have an explicit interpretation: the model is not just a forecasting device.
  • 40. Seatbelt Law: decomposition [Figure: three panels, horizontal axis from 70 to 85; drivers with Level+Reg, drivers Seasonal, drivers Irregular]
  • 42. Textbooks
    • A. C. Harvey (1989). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press.
    • G. Kitagawa & W. Gersch (1996). Smoothness Priors Analysis of Time Series. Springer-Verlag.
    • J. Harrison & M. West (1997). Bayesian Forecasting and Dynamic Models. Springer-Verlag.
    • J. Durbin & S. J. Koopman (2001). Time Series Analysis by State Space Methods. Oxford University Press.
    • J. J. F. Commandeur & S. J. Koopman (2007). An Introduction to State Space Time Series Analysis. Oxford University Press.
  • 43. Exercises
    1. Consider LL model (see slides, see DK chapter 2).
      • Reduced form is ARIMA(0,1,1) process. Derive the relationship between signal-to-noise ratio $q$ of LL model and the $\theta$ coefficient of the ARIMA model;
      • Derive the reduced form in the case $\eta_t = \sqrt{q}\,\varepsilon_t$ and notice the difference in the general case.
      • Give the elements of the mean vector and variance matrix of $y = (y_1, \ldots, y_n)'$ when $y_t$ is generated by a LL model for $t = 1, \ldots, n$.
    2. Consider LLT model (see slides, see DK section 3.2.1).
      • Show that the reduced form is an ARIMA(0,2,2) process;
      • Discuss the initial values for level and slope of LLT;
      • Relate the LLT model forecasts with the Holt-Winters method of forecasting. Comment.