Regression Analysis:
Autocorrelation
Peter Adamu
Department of Economics
Nile University of Nigeria
Email:
peteradamu@gmail.com
1
Outline
2
 The Nature of Autocorrelation
 OLS Estimation in the Presence of
Autocorrelation
 The BLUE Estimator in the Presence of
Autocorrelation
 Consequences of using OLS in the
Presence of Autocorrelation
 Autocorrelation in Dynamic Models
Learning Outcomes
3
 To recap the OLS assumptions that
autocorrelation violates
 To discuss the nature of Autocorrelation
 To introduce the consequences of
Autocorrelation
 To detect Autocorrelation
 To rectify/correct Autocorrelation
Autocorrelation: Definition
4
 Assumption of CLRM
 No autocorrelation: the error terms from two different
periods are not correlated.
 Cov(μi, μj) = 0, i ≠ j, where i and j are two different time periods.
 Autocorrelation is the violation of this assumption.
 Autocorrelation implies that the error term from one time
period depends in some systematic way on error terms
from other time periods.
 If the error term in one period is correlated with the error term
in the previous time period, there is first-order autocorrelation.
 Autocorrelation is associated with time series data.
 In which the observations are ordered in chronological order.
Autocorrelation: Definition (cont.)
5
 Autocorrelation means that the disturbance term
(error term) for any observation is related to
the disturbance terms of other observations.
 Example, Yt = β0 + β1X1t + β2X2t + μt
μt = α0 + α1μt-1 + εt
 (Y) Output = f [labor (X1), capital (X2)] for the
period of 1990-2002.
 Note:
 The error term captures factors that cannot be explained
by the independent variables.
 The error term also captures shocks.
Autocorrelation: Definition (cont.)
6
 If in 1994, there was a big fire that disrupted the
production of output, there is no reason to believe that
this disruption will be carried over to next year.
 If output is lower in 1994, there is no reason to expect it to
be lower the next year.
 Since the fire did not affect labor and capital, the shock
in 1994 will affect only the output of 1994.
 This disruption of output in 1994 is not explained by labor
and capital. Thus, it is captured by the error term.
 If there is no autocorrelation, then the output disruption
will only happen to year 1994.
 If there is autocorrelation, the output disruption (shock)
will be carried to year 1995, 1996 and so on.
Autocorrelation: Implications
7
1. Estimates remain unbiased
 Regardless of autocorrelation, the estimates of the
β’s will still be centered around the true population
β’s.
2. Variance
 The property of minimum variance no longer holds
i.e. not efficient.
 This means that the variance and SE will generally
be biased – no longer BLUE.
3. Hypothesis testing
 The usual confidence intervals and hypothesis tests
based upon the t and F distribution are no longer
reliable.
Autocorrelation: Causes
8
1. Inertia from business cycles – time series
exhibit cycles.
 In the upswing, the value of a series at one point is
greater than its previous value. There is
momentum built into the series, and it continues
until something happens.
2. Specification errors – omitted variables
(excluded variable case).
3. Specification errors – incorrect functional form
(e.g. a linear form when a non-linear one should be
used).
4. Cobweb phenomenon – decisions take time to
implement.
 E.g. The price of agricultural output increases, but the
supply of output will only increase later.
Autocorrelation: Causes
9
5. Lags – the dependent variable depends on its
previous values.
 Known as an autoregressive model.
6. Manipulation of data
 Smoothing (e.g. a quarterly average obtained by
adding the 3 monthly values and dividing by 3).
 Interpolation or extrapolation (e.g. interpolating
annual data to obtain quarterly data).
7. Nonstationary data
Autocorrelation: Tests/Indicators
Informal test:
1. Graphical Method (not accurate).
Formal tests:
2. Durbin-Watson d Test (for first-order
autocorrelation only).
3. Breusch-Godfrey LM test (for higher-order
autocorrelation).
10
Autocorrelation: Tests/Indicators
11
1. Graphical Method
 This approach is useful as a starting
point.
 After running an OLS model, investigate
the residuals.
 Useful to look at scatter plots of (see the Stata sketch below):
 et against time t
 et against et-1
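A minimal Stata sketch of these plots (a sketch only, assuming the dataset is already tsset and uses illustrative variable names y and x):

* fit the model and save the residuals
quietly regress y x
predict uhat, residuals
* plot the residuals against time t
tsline uhat
* plot the residuals against their one-period lag
twoway scatter uhat L.uhat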
Autocorrelation: Tests/Indicators…
1. Graphical Method (cont.)
 E.g. Pattern of autocorrelation: [scatter plot of residuals showing a systematic pattern]
12
Autocorrelation: Tests/Indicators….
 E.g. Pattern of no autocorrelation: [scatter plot of ut against ut-1 showing no systematic pattern]
13
Autocorrelation: Tests/Indicators…
14
2. Durbin-Watson d Test
 Used to test for first-order autocorrelation
[AR(1)] only.
Yt = β0 + β1X1t + μt
μt = ρμt-1 + vt
where
 vt is a well-behaved, purely random error term.
 ρ is the coefficient of autocorrelation (-1 ≤ ρ ≤ 1).
Autocorrelation: Tests/Indicators…
15
2. Durbin-Watson d Test
 Assumptions underlying the d statistic:
i. The regression model must include the
intercept term.
Yt = β0 + β1X1t + μt
ii. The independent variables, Xs, are
nonstochastic; their values are fixed in
repeated sampling.
iii. The error terms are generated by the first-order
autoregressive scheme:
μt = ρμt-1 + vt
Autocorrelation: Tests/Indicators…
16
2. Durbin-Watson d Test (cont.)
 Assumptions underlying the d statistic:
iv. The error term μt is assumed to be normally
distributed.
v. The regression does not contain the lagged
dependent variable (Yt-1), i.e. it is not an
autoregressive model,
e.g.
Yt = β0 + β1X1t + β2X2t + β3Yt-1 + μt
where Yt-1 is the one-period lagged value of Y.
vi. No missing observations in the data.
Autocorrelation: Tests/Indicators…
17
2
2
ˆ
 t
or
ˆ
n
n
 2
t
1
2. Durbin-Watson d Test (cont.)
 Steps involved in the Durbin-Watson Test:
i. Run the OLS and obtain the residuals μt.
ii. Compute the d statistics.
n
t t 1
d  t 2

 ˆt
ˆt1
  t 2

ˆ
ˆ
and d ≈ 2 (1 –
ρ)
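A hedged Stata sketch of step ii, computing d directly from saved residuals (assumes tsset data and a residual series uhat created with predict after regress):

* numerator: sum of squared changes in the residuals, t = 2,...,n
generate d_num = (uhat - L.uhat)^2
quietly summarize d_num
scalar num = r(sum)
* denominator: sum of squared residuals, t = 1,...,n
generate d_den = uhat^2
quietly summarize d_den
scalar den = r(sum)
display "Durbin-Watson d = " num/den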
Autocorrelation: Tests/Indicators…
18
2. Durbin-Watson d Test (cont.)
 Steps involved in the Durbin-Watson Test:
iii. For the given sample size and number of explanatory variables,
find the critical dL and dU values.
H0: No first-order autocorrelation (ρ = 0)
H1: First-order autocorrelation (ρ ≠ 0)
iv. Make a conclusion based on the decision rules: reject H0
(positive autocorrelation) if d < dL; do not reject H0 if d > dU;
the test is inconclusive if dL ≤ d ≤ dU. For negative
autocorrelation, compare 4 – d with the same bounds.
Durbin-Watson Table
[Durbin-Watson critical value table (dL and dU), 5% level of significance]
19
Durbin-Watson Table
[Durbin-Watson critical value table, continued]
20
Autocorrelation: Tests/Indicators…
21
2. Durbin-Watson d Test (cont.)
d ≈ 2(1 – ρ)
 ρ is the coefficient of autocorrelation (-1 ≤ ρ ≤ 1).
 Perfect negative autocorrelation. (ρ = -1 or d = 4)
 No autocorrelation. (ρ = 0 or d = 2)
 Perfect positive autocorrelation. (ρ = 1 or d = 0)
2. Durbin-Watson d Test (cont.)
The d statistic is approximately equal to 2(1 – r),
where r is the correlation between ût and ût-1.
Hence it varies from 0 (extreme positive
autocorrelation) to 4 (extreme negative
autocorrelation), with a value of 2 indicating no
autocorrelation.
22
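As a worked illustration (the value of r here is only assumed, roughly matching the Stata example later in these slides): if r ≈ 0.77, then d ≈ 2(1 – 0.77) = 0.46, signalling strong positive autocorrelation; if instead r ≈ –0.5, then d ≈ 2(1 + 0.5) = 3, signalling negative autocorrelation.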
Autocorrelation: Tests/Indicators…
Example 1: Durbin-Watson d Test
23
 If the number of observations is n = 35, the number of
independent variables (excluding the constant term)
is k = 3, and the DW d statistic = 1.53, do we have
first-order autocorrelation?
ANSWER:
 Refer to Durbin-Watson d Statistic table (0.05
level of significance).
 For n = 35, and k = 3, dL =
and dU = .
Example 1: Answer
 DW d statistic = 1.53, .
 Conclusion: .
24
Autocorrelation: Tests/Indicators…
2. Durbin-Watson d Test (cont.)
 The Durbin-Watson test is unusual in that it has
an inconclusive region.
 One cannot look at a table for the critical value
of d but can find only upper and lower bounds
for the critical value (dU and dL, respectively).
 Hence if the calculated value of d falls within
those bounds, the test will be inconclusive.
Example 2: Labour Demand Model
 Consider the following firm-level labour demand
model based on time series data.
lt = β0 + β1wt + β2yt + εt, t = 1, …, T
In this equation t indexes the year, l is the log of labour
demand, w is the log wage faced by the firm, y is the log of
its expected output, and ε represents the usual
disturbance/error term.
Example 2: Labour Demand Model…
 If the ’s are independently, identically and
normally distributed, we can estimate the β’s
using OLS and use the t and F statistics for
inference, just like we did in the cross section
models
 But in time series data the assumption that the
errors are independent of one another should
not be taken for granted. Usually the sign of the
last period’s  could be a good indicator of
the sign of this period’s .
 Test using DW test:
The following labour demand model is estimated on the
basis of 32 observations:
Dependent Variable: L
Method: Least Squares
Sample: 1 32
Included observations: 32
Variable Coefficient Std. Error t-Statistic Prob.
Y 0.335329 0.073666 4.552018 0.0001
W 0.136913 0.062084 2.205274 0.0355
C 2.472616 0.433455 5.704434 0.0000
R-squared 0.984128 Mean dependent var 13.16161
Adjusted R-squared 0.983034 S.D. dependent var 0.686189
S.E. of regression 0.089379 Akaike info criterion -1.902791
Sum squared resid 0.231672 Schwarz criterion -1.765378
Log likelihood 33.44466 Hannan-Quinn criter. -1.857243
F-statistic 899.0740 Durbin-Watson stat 0.431615
Prob(F-statistic) 0.000000
Example 2: Labour Demand Model…
 Eviews give a d statistics of 0.4316, also
indicating that there are 3 regressors and 32
observations in the model
 The number of regressors without the intercept
is of course 2, and at 5% level of significance the
tabulated critical values are dL = and dU
=
.
 Conclusion:
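The same test can be reproduced in Stata; a minimal sketch, assuming the series are stored as l, w and y with a year identifier (variable names are illustrative, since the output above comes from EViews):

* declare a yearly time series and re-estimate the labour demand model
tsset year
regress l w y
* Durbin-Watson d statistic
estat dwatson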
Example: Autocorrelation test using STATA
30
Time Y X
1 591.43 11.32
2 542.23 10.74
3 502.82 12.86
4 674.96 15.73
5 743.04 16.14
6 811.1 16.24
7 768.69 18.76
8 767.06 20.21
9 675.45 22.41
10 742.87 22.19
11 734.66 24.24
12 812.33 24.97
13 922.1 25.13
14 982.24 27.07
15 974.38 27.49
16 898.35 23.63
17 911.51 27.16
18 833.37 29.62
19 798.83 27.93
20 795.84 29.38
21 713.51 32.21
22 752.36 31.4
23 729.95 32.34
24 679.64 25.31
. tsset time
        time variable:  time, 1 to 24
                delta:  1 unit

. reg y x

      Source |       SS       df       MS              Number of obs =      24
-------------+------------------------------           F(  1,    22) =   10.89
       Model |  113875.122     1  113875.122           Prob > F      =  0.0033
    Residual |  230075.133    22  10457.9606           R-squared     =  0.3311
-------------+------------------------------           Adj R-squared =  0.3007
       Total |  343950.255    23  14954.3589           Root MSE      =  102.26

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   10.76397   3.261983     3.30   0.003     3.999035    17.52891
       _cons |    516.263   78.20027     6.60   0.000     354.0856    678.4404
------------------------------------------------------------------------------

. estat dwatson

Durbin-Watson d-statistic(  2,    24) =  .463818

. estat durbinalt

Durbin's alternative test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-----------------------------------------------------------
       1     |         28.947               1                   0.0000
---------------------------------------------------------------------------
                        H0: no serial correlation
[Conclusion?]
Limitation of DW test
 The Durbin-Watson test is restrictive because:
(i) it depends on the data matrix,
(ii) it has an inconclusive region,
(iii) it may not be accurate if the error
term is not AR(1), and
(iv) it does not allow for lagged dependent
variables (Yt-1) as independent variables.
Autocorrelation: Tests/Indicators…
32
3. Breusch-Godfrey (BG) Test, Lagrange
Multiplier (LM) Test
 The BG test is a less restrictive alternative,
suitable for higher-order autocorrelation.
 Suppose the error term has pth-order
autocorrelation:
μt = ρ1μt-1 + ρ2μt-2 + … + ρpμt-p + εt
where εt is the classical error term.
Autocorrelation: Tests/Indicators…
33
3. Breusch-Godfrey (BG) Test, Lagrange
Multiplier (LM) Test…
 If all autoregressive coefficients are
simultaneously zero, there is no autocorrelation
of any order. Thus, the null hypothesis of the BG
test is:
H0: ρ1 = ρ2 = … = ρp = 0 (no autocorrelation)
vs
HA: μt follows AR(p) or MA(p), p ≥ 1 (autocorrelation)
34
Autocorrelation: Tests/Indicators…
3. Breusch-Godfrey (BG) Test, Lagrange Multiplier (LM)
Test (cont.)
 Breusch-Godfrey test procedures (a Stata sketch of the steps follows):
Step 1: Run the regression model by the usual OLS and obtain
the residuals, ût.
Step 2: Regress ût on the original Xt (all Xs) and on ût-1, ût-2, …, ût-p
(lagged values of the residuals from step 1) and obtain R² from
this auxiliary regression:
ût = α1 + α2Xt + ρ̂1ût-1 + ρ̂2ût-2 + … + ρ̂pût-p + εt
Step 3: Compute (n – p)R², which asymptotically follows
the chi-square distribution with p degrees of freedom,
where n = no. of obs. and p = order of autocorrelation.
Step 4: If (n – p)R² exceeds the critical χ² value, reject the null
hypothesis of no autocorrelation. Otherwise, do not reject
H0.
Breusch-Godfrey Lagrange Multiplier (LM) Test
using STATA (after estimating OLS regression)
35
. reg y x

      Source |       SS       df       MS              Number of obs =      24
-------------+------------------------------           F(  1,    22) =   10.89
       Model |  113875.122     1  113875.122           Prob > F      =  0.0033
    Residual |  230075.133    22  10457.9606           R-squared     =  0.3311
-------------+------------------------------           Adj R-squared =  0.3007
       Total |  343950.255    23  14954.3589           Root MSE      =  102.26

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   10.76397   3.261983     3.30   0.003     3.999035    17.52891
       _cons |    516.263   78.20027     6.60   0.000     354.0856    678.4404
------------------------------------------------------------------------------

. estat bgodfrey

Breusch-Godfrey LM test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-----------------------------------------------------------
       1     |         13.909               1                   0.0002
---------------------------------------------------------------------------
                        H0: no serial correlation
36
Autocorrelation: Remedies
 If autocorrelation is detected, we can transform the
equation using (i) the method of Generalized
Least Squares (GLS), or (ii) the Cochrane-Orcutt
iterative procedure.
 Assume the autocorrelation problem is first-order
autocorrelation.
 Assume the model is
Yt = β0 + β1X1t + μt   (1)
and the error term follows AR(1):
μt = ρμt-1 + εt, -1 < ρ < 1   (2)
Thus, the model becomes
Yt = β0 + β1X1t + ρμt-1 + εt
Autocorrelation: Remedies (cont.)
37
 Write the regression model with a one-period lag:
Yt-1 = β0 + β1X1,t-1 + μt-1
 Multiply the lagged regression by ρ on both sides to obtain:
ρYt-1 = ρβ0 + ρβ1X1,t-1 + ρμt-1   (3)
 Subtract equation (3) from the original model (1):
(Yt - ρYt-1) = (β0 - ρβ0) + (β1X1t - ρβ1X1,t-1) + (μt - ρμt-1)
(Yt - ρYt-1) = β0(1 - ρ) + β1(X1t - ρX1,t-1) + εt, since μt - ρμt-1 = εt from (2).
 Re-write:
Yt* = β0* + β1X1t* + εt
where
Yt* = (Yt - ρYt-1), β0* = β0(1 - ρ), X1t* = (X1t - ρX1,t-1)
Autocorrelation: Remedies (cont.)
38
 Since the error term in this equation satisfies the
OLS assumptions, we can apply OLS to the
transformed variables Y* and X*.
Yt* = β0* + β1X1t* + εt
 This involves regressing Y on X, not in the original
form, but in quasi-differenced form, obtained by
subtracting a proportion (= ρ) of the value of a
variable in the previous time period from its value
in the current time period.
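In Stata this quasi-differencing is automated by the prais command; a minimal sketch with the example's y and x (data assumed to be tsset):

* Prais-Winsten FGLS: keeps the first observation via the transformation on the next slide
prais y x
* Cochrane-Orcutt iterative procedure: drops the first observation
prais y x, corc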
Autocorrelation: Remedies (cont.)
 In the differencing procedure, we lose one
observation because the first observation has no
antecedent.
 To avoid this loss of the first observation, the
first observations of Y and X are transformed as
follows:
Y1* = √(1 - ρ²) Y1 and X1* = √(1 - ρ²) X1
 Obtain ρ from the AR(1) scheme in Eq (2).
39
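A hedged sketch of the manual transformation, assuming ρ is taken from the DW statistic as ρ̂ = 1 - d/2 ≈ 1 - 0.4638/2 ≈ 0.768 (an assumption that reproduces the transformed columns in Example 3 below; data assumed sorted by time via tsset):

* quasi-difference all observations, then apply the first-observation fix
scalar rho = 1 - 0.463818/2
generate ystar = y - rho*L.y
generate xstar = x - rho*L.x
replace ystar = sqrt(1 - rho^2)*y in 1
replace xstar = sqrt(1 - rho^2)*x in 1
regress ystar xstar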
Example 3
40
Y X
1999M1 591.43 11.32
1999M2 542.23 10.74
1999M3 502.82 12.86
1999M4 674.96 15.73
1999M5 743.04 16.14
1999M6 811.10 16.24
1999M7 768.69 18.76
1999M8 767.06 20.21
1999M9 675.45 22.41
1999M10 742.87 22.19
1999M11 734.66 24.24
1999M12 812.33 24.97
2000M1 922.10 25.13
2000M2 982.24 27.07
2000M3 974.38 27.49
2000M4 898.35 23.63
2000M5 911.51 27.16
2000M6 833.37 29.62
2000M7 798.83 27.93
2000M8 795.84 29.38
2000M9 713.51 32.21
2000M10 752.36 31.40
2000M11 729.95 32.34
2000M12 679.64 25.31
41
Example 3 (cont.)
Dependent Variable: Y
Method: Least Squares
Sample: 1999M01 2000M12
Included observations: 24
Variable Coefficient Std. Error t-Statistic Prob.
X 10.76397 3.261983 3.299825 0.0033
C 516.2630 78.20027 6.601806 0.0000
R-squared 0.331080 Mean dependent var 764.9467
Adjusted R-squared 0.300675 S.D. dependent var 122.2880
S.E. of regression 102.2642 Akaike info criterion 12.17265
Sum squared resid 230075.2 Schwarz criterion 12.27082
Log likelihood -144.0718 Hannan-Quinn criter. 12.19870
F-statistic 10.88884 Durbin-Watson stat 0.463818
Prob(F-statistic) 0.003263
• DW d statistic = .
• n = .
• k = .
Example 3 (cont.)
42
 DW d statistic =
 n =
 k =
 Decision :
 ρ =
Example 3 (cont.) - transform the data using ρ
43
Y X Y(t-1) X(t-1) Y(t) - ρY(t-1) X(t) - ρX(t-1)
1999M1 591.43 11.32 378.7103 7.2485
1999M2 542.23 10.74 591.43 11.32 87.952617 2.045108
1999M3 502.82 12.86 542.23 10.74 86.333137 4.610606
1999M4 674.96 15.73 502.82 12.86 288.743958 5.852234
1999M5 743.04 16.14 674.96 15.73 224.603224 4.057787
1999M6 811.10 16.24 743.04 16.14 240.370976 3.842866
1999M7 768.69 18.76 811.10 16.24 145.68409 6.286056
1999M8 767.06 20.21 768.69 18.76 176.629211 5.800444
1999M9 675.45 22.41 767.06 20.21 86.271214 6.886699
1999M10 742.87 22.19 675.45 22.41 224.056855 4.976879
1999M11 734.66 24.24 742.87 22.19 164.061553 7.195861
1999M12 812.33 24.97 734.66 24.24 248.037654 6.351256
2000M1 922.10 25.13 812.33 24.97 298.149327 5.950543
2000M2 982.24 27.07 922.10 25.13 273.97499 7.767647
2000M3 974.38 27.49 982.24 27.07 219.921456 6.697533
2000M4 898.35 23.63 974.38 27.49 149.928722 2.514931
2000M5 911.51 27.16 898.35 23.63 221.487365 9.009797
2000M6 833.37 29.62 911.51 27.16 133.239169 8.758404
2000M7 798.83 27.93 833.37 29.62 158.718503 5.178878
2000M8 795.84 29.38 798.83 27.93 182.258677 7.926967
2000M9 713.51 32.21 795.84 29.38 102.225296 9.643222
2000M10 752.36 31.40 713.51 32.21 204.312969 6.659499
2000M11 729.95 32.34 752.36 31.40 152.062284 8.22166
2000M12 679.64 25.31 729.95 32.34 118.965405 0.469646
Example 3 (cont.)
44
Dependent Variable: YSTAR
Method: Least Squares
Sample: 1999M01 2000M12
Included observations: 24
Variable Coefficient Std. Error t-Statistic Prob.
XSTAR 5.943227 6.968103 0.852919 0.4029
C 154.6314 44.53969 3.471767 0.0022
R-squared 0.032008 Mean dependent var 190.2791
Adjusted R-squared -0.011991 S.D. dependent var 74.96486
S.E. of regression 75.41298 Akaike info criterion 11.56349
Sum squared resid 125116.6 Schwarz criterion 11.66166
Log likelihood -136.7619 Hannan-Quinn criter. 11.58954
F-statistic 0.727471 Durbin-Watson stat 1.698745
Prob(F-statistic) 0.402895
XSTAR = X* = Xt – ρXt-1
dL = 1.273 and dU = 1.446, DW stat = .
Decision: .
Autocorrelation: Remedies (cont.)
 But a word of caution: to get a good estimate of
ρ, GLS requires a large sample size, as it relies
on asymptotic properties.
 Analogous to White standard errors for
heteroskedasticity, there are Newey-West
standard errors that adjust conventionally
measured standard errors to account for serial
correlation and heteroskedasticity.
Autocorrelation: Remedies (cont.)
 To generate results with Newey-West standard errors in Stata:
. newey y x, lag(2) [Need to include lag]

Regression with Newey-West standard errors      Number of obs  =        24
maximum lag: 2                                  F(  1,    22)  =      4.86
                                                Prob > F       =    0.0383

------------------------------------------------------------------------------
             |             Newey-West
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   10.76397   4.884593     2.20   0.038     .6339483      20.894
       _cons |    516.263   106.6412     4.84   0.000     295.1027    737.4233
------------------------------------------------------------------------------

. newey y x, lag(1) [Need to include lag]

Regression with Newey-West standard errors      Number of obs  =        24
maximum lag: 1                                  F(  1,    22)  =      6.10
                                                Prob > F       =    0.0218

------------------------------------------------------------------------------
             |             Newey-West
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   10.76397   4.359792     2.47   0.022     1.722318    19.80563
       _cons |    516.263   97.16329     5.31   0.000     314.7587    717.7673
------------------------------------------------------------------------------
47
Autocorrelation: Remedies (cont.)
Notice that the coefficients are the same as the
OLS coefficients (see the OLS output in Example 3),
but the standard errors are different, and they are
valid for making inference.