Cointegration analysis: Modelling the complex interdependencies between financial assets

Dr. Edward Thomas Jones
Bangor Business School
Bangor University
E-mail: e.t.jones@bangor.ac.uk
February 21, 2017

Cointegration analysis The non-stationary process
The drunkard and his dog
The notion of cointegration: An adaptation of the drunkard’s walk

Cointegration analysis Contents
Overview
Background
Stationarity
Modelling dynamic patterns of association
Cointegration analysis
References
Appendix

Background Correlation analysis
Measuring simple relationships
Correlation (e.g. Pearson’s product-moment coefficient) is an easy and well-understood
approach to measuring the relationship between two variables. It is widely used in
academia and industry through risk management and hedging calculations to explain the
relationship between the movement of financial assets as well as economic data series.
$$\rho_{X,Y} = \operatorname{corr}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$
where ρX,Y is the correlation coefficient, X and Y are two random variables with
expected values µX and µY and standard deviations σX and σY , and cov(X, Y ) is
the covariance of X and Y .
A correlation value can sit anywhere between -1 (perfect negative relationship) and +1
(perfect positive relationship).
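As an illustration, the formula can be evaluated directly. The sketch below uses only the Python standard library; the function name `pearson` and the return figures are hypothetical and chosen purely for demonstration.

```python
import math

def pearson(x, y):
    """Pearson correlation: cov(X, Y) / (sigma_X * sigma_Y)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov / math.sqrt(var_x * var_y)

# Hypothetical daily returns for two assets (illustrative numbers only).
returns_a = [0.010, -0.020, 0.015, 0.005, -0.010]
returns_b = [0.008, -0.015, 0.012, 0.007, -0.012]
rho = pearson(returns_a, returns_b)
print(f"correlation = {rho:.3f}")
```

The same divisor n is used for the covariance and both variances, so it cancels and the result does not depend on whether n or n − 1 is used.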

Issues with a simple correlation measure
Despite being a popular tool for measuring relationships, correlation is unstable and
of limited use (see Pearl, 2000):
- Two equal-sized sub-samples from the same dataset (one up to a given date, the other
beyond that date) can imply very different relationships between the series.
- Correlation analysis is only valid for stationary series; that is, series whose mean
and variance do not change over time (Alexander and Dimitriu, 2002). This
condition usually requires de-trending the series before performing
correlation analysis, which results in a loss of valuable information.
- De-trending the data series before analysis removes any possibility of detecting a
common trend, and the relationship becomes difficult to interpret when
different approaches are taken to de-trend each series. This problem is amplified
when the series are integrated of different orders, and so must be differenced a
different number of times to become stationary.

Stationarity Definition of a stationary series
What is a stationary series?
A stationary series has a mean and variance that doesn’t change over time and does not
follow any trends. That is, the joint probability distribution does not change over time.
$$E(y_t) = E(y_{t-s}) = \mu$$
$$E[(y_t - \mu)^2] = E[(y_{t-s} - \mu)^2] = \sigma_y^2$$
In addition, there is also covariance stationarity which requires the autocovariances of
the series to be unaffected by a change in time.
$$E[(y_t - \mu)(y_{t-s} - \mu)] = E[(y_{t-j} - \mu)(y_{t-j-s} - \mu)] = \delta_s$$
where $\mu$, $\sigma_y^2$, and $\delta_s$ are constants.
A series is integrated of order d if it must be differenced d times in order to become
stationary. A stationary series is by definition an I(0) process (i.e. no integration
required).
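To make the order of integration concrete, the following sketch builds an I(1) and an I(2) series by cumulating a short list of shocks and then recovers the shocks by differencing. The helper name `difference` and the shock values are assumptions made for illustration.

```python
from itertools import accumulate

def difference(series, d=1):
    """Apply first differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out

shocks = [1, -1, 2, 0, -2, 1]      # stand-in for an I(0) series
i1 = list(accumulate(shocks))      # cumulated once:  1, 0, 2, 2, 0, 1
i2 = list(accumulate(i1))          # cumulated twice: 1, 1, 3, 5, 5, 6

# Differencing the I(2) series once gives the I(1) series (minus its first
# point); differencing twice gives back the original shocks.
print(difference(i2, 1))
print(difference(i2, 2))
```

Each round of differencing costs one observation, which is why the recovered series are shorter than the originals.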

Stationarity Identifying a stationary series
The Dickey Fuller (DF) test
One of three statistical tests can be used to determine whether a data series has a unit
root; that is, whether it is stationary (no unit root) or non-stationary (unit root). The
Augmented Dickey-Fuller (ADF) test, an extension of the Dickey-Fuller (DF) test, is the
most commonly used test for stationarity.
The Dickey-Fuller test fits a trend line using the lagged values of the series ($x_{t-1}$)
as the x-values and the differences between consecutive data points ($x_t - x_{t-1}$) as the
y-values. In this way, the trend line models the relationship between each point and its
change to the next point.
A stationary series would behave in such a way that:
- if the x points are small, they would generally be followed by a large positive shift,
and
- if the x points are large, they would be followed by a large negative shift.
In this way, large points would descend and small points would ascend towards a mean.
Points in the middle would in general have a small gradient and the overall gradient of
the trend-line would be negative.

The Augmented Dickey Fuller (ADF) test
This gradient is denoted by $a_1$ in the equation:
$$x_t - x_{t-1} = a_0 + a_1 x_{t-1} + \varepsilon_t$$
It equals the autoregressive root minus one, so a unit root corresponds to $a_1 = 0$.
A t-test with special critical values (tabulated by Dickey and Fuller) tells us
whether the estimate of this gradient is indicative of a stationary series or not. The
t-statistic is calculated by dividing the estimate of $a_1$ by its standard error.
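The Dickey-Fuller regression can be sketched with the standard library alone. The function names `dickey_fuller` and `simulate_ar1`, the seed, and the simulated series are all assumptions for illustration; the resulting t-statistic must be compared with Dickey-Fuller critical values (roughly −2.86 at the 5% level for a regression with a constant in large samples), not with the usual t-table.

```python
import math
import random

def dickey_fuller(x):
    """OLS of (x_t - x_{t-1}) on a constant and x_{t-1}.

    Returns (a0, a1, t_stat), where t_stat = a1 / standard error of a1.
    """
    lagged = x[:-1]
    delta = [b - a for a, b in zip(x[:-1], x[1:])]
    n = len(delta)
    mean_l = sum(lagged) / n
    mean_d = sum(delta) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    sxy = sum((l - mean_l) * (d - mean_d) for l, d in zip(lagged, delta))
    a1 = sxy / sxx
    a0 = mean_d - a1 * mean_l
    rss = sum((d - a0 - a1 * l) ** 2 for l, d in zip(lagged, delta))
    se_a1 = math.sqrt(rss / (n - 2) / sxx)
    return a0, a1, a1 / se_a1

def simulate_ar1(phi, n, seed):
    """x_t = phi * x_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

stationary = simulate_ar1(phi=0.5, n=500, seed=1)   # mean-reverting
random_walk = simulate_ar1(phi=1.0, n=500, seed=1)  # unit root

_, a1_stat, t_stat = dickey_fuller(stationary)
_, a1_rw, t_rw = dickey_fuller(random_walk)
print(f"stationary:  a1 = {a1_stat:.3f}, t = {t_stat:.2f}")
print(f"random walk: a1 = {a1_rw:.3f}, t = {t_rw:.2f}")
```

For the mean-reverting series the gradient is clearly negative; for the random walk it is close to zero, which is the pattern the test is designed to detect.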
The ADF test is an extended version of the DF test, which controls for structural
effects (i.e. autocorrelation) in the time series.
$$x_t - x_{t-1} = a_0 + b_1 t + a_1 x_{t-1} + d_1 \Delta x_{t-1} + \ldots + d_{p-1} \Delta x_{t-p+1} + \varepsilon_t$$
where $b_1$ is the coefficient of the time trend, $d_i$ are the coefficients of the lagged
first differences $\Delta x_{t-i} = x_{t-i} - x_{t-i-1}$ of the series, and p is the lag order of
the autoregressive process. Once the structural effects have been controlled for by the
$b_1$ and $d_i$ coefficients, we are left with testing the $a_1$ coefficient.
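A minimal sketch of the ADF regression, assuming NumPy is available. The function name `adf_t_stat`, its arguments, and the simulated data are hypothetical; the code only computes the t-statistic on the lagged level and leaves the comparison with tabulated ADF critical values to the reader.

```python
import numpy as np

def adf_t_stat(x, n_lagged_diffs=1, trend=True):
    """t-statistic on a1 in the ADF regression
    dx_t = a0 + b1*t + a1*x_{t-1} + sum_i d_i*dx_{t-i} + e_t."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    k = n_lagged_diffs
    y = dx[k:]                                  # dx_t for the usable sample
    cols = [np.ones_like(y)]
    if trend:
        cols.append(np.arange(len(y), dtype=float))
    cols.append(x[k:-1])                        # lagged level x_{t-1}
    for i in range(1, k + 1):
        cols.append(dx[k - i:len(dx) - i])      # lagged difference dx_{t-i}
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    idx = 2 if trend else 1                     # column holding x_{t-1}
    return beta[idx] / np.sqrt(cov[idx, idx])

rng = np.random.default_rng(0)
shocks = rng.standard_normal(500)
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.5 * ar1[t - 1] + shocks[t]       # stationary AR(1)
walk = np.cumsum(shocks)                        # random walk (unit root)

t_ar1 = adf_t_stat(ar1, n_lagged_diffs=2)
t_walk = adf_t_stat(walk, n_lagged_diffs=2)
print(f"AR(1): t = {t_ar1:.2f}, random walk: t = {t_walk:.2f}")
```

Setting `n_lagged_diffs=0` and `trend=False` reduces this to the plain DF regression with a constant.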

The Augmented Dickey Fuller (ADF) test (cont.)
In some situations, outliers can greatly influence the gradient of a curve. It is advisable
to use a certain amount of judgement in performing this test:
- consider the graphs of the series,
- as well as the graph of the trend line fitted through the scattered x(t−1) and
(xt − xt−1) points,
- notice outliers that may influence stationarity as well as large shifts in the data
that may cause an otherwise stationary series to appear non-stationary.
By including lags of order p, the ADF formulation allows for higher-order autoregressive
processes. This means that the lag length p has to be determined when applying this
test.

The Phillips-Perron (P-P) test
An alternative approach to identifying a unit root is the Phillips-Perron (P-P) test,
which builds upon the Dickey-Fuller test.
The two tests differ in how they address the possibility that the process generating
the data series has a higher order of autocorrelation than is admitted in the test
equation:
- the ADF test addresses this issue by introducing lags of the first differences
($d_i \Delta x_{t-i}$) as regressors in the test equation,
- the P-P test makes a non-parametric correction to the t-test statistic by using
Newey-West standard errors to account for serial correlation.
Given this correction, the P-P test is robust with respect to unspecified autocorrelation
and heteroscedasticity in the disturbance term ε of the test equation.

Stationarity Lag selection for use in a statistical test
Identifying the number of lags p
Before calculating a test for unit root, it is necessary to decide upon the number of lags
p to be included in the test equation.
- too many lags could increase the error in the forecast,
- too few could leave out relevant information.
The number of lags to be included in the test equation can be calculated by using
either:
- Schwarz's Bayesian Information Criterion (SBIC),
- Akaike's Information Criterion (AIC), or
- the Hannan-Quinn Information Criterion (HQIC).
When all three agree, the lag selection is clear. According to Ivanov and Kilian (2001):
- AIC tends to be more accurate for monthly data,
- HQIC works better for quarterly data on samples over 120 observations, and
- SBIC works fine with any sample size for quarterly data.
In most cases, the preferred model is one that has the fewest parameters to estimate.
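The selection step can be sketched as follows, assuming NumPy and a univariate autoregression as the test equation. The function name `ar_information_criteria` is hypothetical, and the criteria are written in their common per-observation form, ln(RSS/n) plus a penalty, which is an assumption rather than the only convention in use.

```python
import numpy as np

def ar_information_criteria(x, max_lag):
    """AIC and SBIC for AR(p), p = 0..max_lag, fitted on a common sample."""
    x = np.asarray(x, dtype=float)
    y = x[max_lag:]                      # same observations for every p
    n = len(y)
    results = {}
    for p in range(max_lag + 1):
        lags = [x[max_lag - i:len(x) - i] for i in range(1, p + 1)]
        X = np.column_stack([np.ones(n)] + lags)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        k = X.shape[1]                   # constant plus p lag coefficients
        aic = np.log(rss / n) + 2 * k / n
        sbic = np.log(rss / n) + k * np.log(n) / n
        results[p] = (aic, sbic)
    return results

# Simulated AR(1) series (illustrative parameter values).
rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + rng.standard_normal()

results = ar_information_criteria(x, max_lag=4)
best_aic = min(results, key=lambda p: results[p][0])
best_sbic = min(results, key=lambda p: results[p][1])
print("lag chosen by AIC:", best_aic, "| lag chosen by SBIC:", best_sbic)
```

Because the SBIC penalty grows with ln(n), it never selects more lags than AIC on the same sample, in line with the preference for fewer parameters.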

Modelling dynamic patterns of association Bivariate VAR models
Vector Autoregressive (VAR) models
Vector Autoregressive (VAR) models provide a framework for modelling dynamic
relationships between stationary time series variables (Sims, 1980).
In a bivariate first-order VAR(1) model, there are two variables $y_{1t}$ and $y_{2t}$. Each
depends partly on the lagged values $y_{1,t-1}$ and $y_{2,t-1}$, and partly on its own random
disturbance term, $u_{1t}$ or $u_{2t}$:
$$y_{1t} = \beta_{10} + \beta_{11} y_{1,t-1} + \alpha_{11} y_{2,t-1} + u_{1t}$$
$$y_{2t} = \beta_{20} + \alpha_{21} y_{1,t-1} + \beta_{21} y_{2,t-1} + u_{2t}$$
Higher-order VAR models simply add further lagged terms on the right-hand side. A
bivariate VAR(p) model would be:
$$y_{1t} = \beta_{10} + \beta_{11} y_{1,t-1} + \alpha_{11} y_{2,t-1} + \ldots + \beta_{1p} y_{1,t-p} + \alpha_{1p} y_{2,t-p} + u_{1t}$$
$$y_{2t} = \beta_{20} + \alpha_{21} y_{1,t-1} + \beta_{21} y_{2,t-1} + \ldots + \alpha_{2p} y_{1,t-p} + \beta_{2p} y_{2,t-p} + u_{2t}$$

Modelling dynamic patterns of association Multivariate VAR models
Vector Autoregressive (VAR) models (cont.)
A VAR model does not have to be bivariate; it can be multivariate. We could have a
system of equations for three, four or any number of variables. However, as the number
of equations and the lag-length increases, the number of parameters escalates rapidly.
The estimation of larger VAR models can run into degrees of freedom problems of too
many parameters and too few observations.
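Matrix notation provides a more concise representation of VAR models. The bivariate
first-order VAR(1) model . . .
$$y_{1t} = \beta_{10} + \beta_{11} y_{1,t-1} + \alpha_{11} y_{2,t-1} + u_{1t}$$
$$y_{2t} = \beta_{20} + \alpha_{21} y_{1,t-1} + \beta_{21} y_{2,t-1} + u_{2t}$$
. . . can be written in matrix form as:
$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$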

The matrix formula can be written more concisely as:
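$$Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$$
where $Y_t$, $Y_{t-1}$, and $u_t$ are (2 × 1) column vectors, $\beta_0$ is a (2 × 1) column vector, and
$\beta_1$ is a (2 × 2) matrix:
$$Y_t = \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix}, \quad Y_{t-1} = \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix}, \quad u_t = \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}, \quad \beta_0 = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix}, \quad \beta_1 = \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix}$$
By extension, in the most general case, an m-variate VAR(p) model is written:
$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \ldots + \beta_p Y_{t-p} + u_t$$
where $Y_t$, $Y_{t-1}$, and $u_t$ are (m × 1), $\beta_0$ is (m × 1), and $\beta_1, \ldots, \beta_p$ are (m × m) matrices.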

Modelling dynamic patterns of association VARs with exogenous variables
It is also possible to specify VARs in which the present values of the endogenous
variables y1t and y2t are determined partly by their own history, and partly by the present
and/or past values of other exogenous variables.
A first-order bivariate VAR with s exogenous variables collected together in the vector
$X_t$, whose current values influence the current values of the two endogenous variables
$y_{1t}$ and $y_{2t}$, takes the following form:
$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \Psi X_t + u_t$$
where $Y_t$, $Y_{t-1}$, and $u_t$ are (2 × 1), $X_t$ is (s × 1), $\beta_0$ is (2 × 1), $\beta_1$ is (2 × 2), and
$\Psi$ is a (2 × s) matrix.

Modelling dynamic patterns of association Estimations of VARs
If T observations on all variables are available, a full set of lagged variables will be
available only for observations t = p + 1, p + 2, . . . , T. Therefore the number of
observations that can be used in the estimation of the VAR(p) model is T − p.
Equation-by-equation OLS could be used to obtain estimates of the coefficients of the
matrices β0, β1, . . . , βp. However, in practice it is usual to estimate these coefficients
using a technique called Seemingly Unrelated Regressions (SUR) Estimation.
When the variables included on the right-hand side of each equation are identical, SUR
produces the same estimated coefficients as equation-by-equation OLS; but it produces
different standard errors and hypothesis test statistics.
If the variables included on the right-hand-side of each equation are not identical, SUR
and equation-by-equation OLS produce different estimated coefficients.
With SUR, the estimated variance-covariance matrix of the disturbance terms takes into
account any contemporaneous correlation between the disturbance terms of different
equations (this is ignored in equation-by-equation OLS).
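The equation-by-equation OLS route can be sketched as below, assuming NumPy. All parameter values, the seed, and the variable names are hypothetical. SUR itself is not implemented here; because every equation of a VAR has the same regressors, the OLS coefficients shown coincide with the SUR coefficients.

```python
import numpy as np

rng = np.random.default_rng(42)
beta0 = np.array([0.1, -0.2])
beta1 = np.array([[0.5, 0.2],
                  [0.1, 0.3]])        # eigenvalues inside the unit circle
T = 2000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = beta0 + beta1 @ Y[t - 1] + rng.standard_normal(2)

# Regressors: a constant and Y_{t-1}; T - p = T - 1 usable observations.
X = np.column_stack([np.ones(T - 1), Y[:-1]])
target = Y[1:]
coef, *_ = np.linalg.lstsq(X, target, rcond=None)   # shape (3, 2)
beta0_hat = coef[0]
beta1_hat = coef[1:].T      # row i holds the coefficients of equation i

# Contemporaneous variance-covariance matrix of the residuals.
resid = target - X @ coef
sigma_hat = resid.T @ resid / (T - 1)
print(np.round(beta1_hat, 2))
```

Passing the two dependent variables as columns of one target matrix runs both OLS regressions in a single least-squares call.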

Modelling dynamic patterns of association Lags selection for use in VAR models
Identifying the number of lags p in VAR models
Information criteria can be used to determine the most appropriate value of p, the
lag-length, based on the determinant of the variance covariance matrix of the
disturbance terms:
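$$\Sigma = E(u_t u_t') \quad \text{where} \quad u_t = \begin{pmatrix} u_{1t} \\ u_{2t} \\ \vdots \\ u_{mt} \end{pmatrix}$$
and $u_{it}$ is the disturbance term of the i'th equation.
$$\Rightarrow \Sigma = \begin{pmatrix} \operatorname{var}(u_{1t}) & \operatorname{cov}(u_{1t}, u_{2t}) & \ldots & \operatorname{cov}(u_{1t}, u_{mt}) \\ \operatorname{cov}(u_{2t}, u_{1t}) & \operatorname{var}(u_{2t}) & \ldots & \operatorname{cov}(u_{2t}, u_{mt}) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}(u_{mt}, u_{1t}) & \operatorname{cov}(u_{mt}, u_{2t}) & \ldots & \operatorname{var}(u_{mt}) \end{pmatrix}$$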

Identifying the number of lags p in VAR models (cont.)
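If the $u_{it}$'s are normally distributed, the maximum likelihood estimator of $\Sigma$ is:
$$\Sigma_{EST} = \begin{pmatrix} \sum e_{1t}^2 / n & \sum e_{1t} e_{2t} / n & \ldots & \sum e_{1t} e_{mt} / n \\ \sum e_{2t} e_{1t} / n & \sum e_{2t}^2 / n & \ldots & \sum e_{2t} e_{mt} / n \\ \vdots & \vdots & \ddots & \vdots \\ \sum e_{mt} e_{1t} / n & \sum e_{mt} e_{2t} / n & \ldots & \sum e_{mt}^2 / n \end{pmatrix}$$
where $e_{it}$ is the residual for the t'th observation in the i'th equation of the estimated
model, and n is the number of observations used in the estimation.
The determinant of $\Sigma_{EST}$, denoted $|\Sigma_{EST}|$, is calculated from the elements of $\Sigma_{EST}$. The
determinant of any matrix can be interpreted as a single numerical summary of the
'information' that is contained in the matrix. $|\Sigma_{EST}|$ plays a role similar to that of the
residual sum of squares ($\sum e_t^2$) in a single-equation regression.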

Identifying the number of lags p in VAR models (cont.)
The multivariate versions of the Akaike and Schwarz Information Criteria are:
$$MAIC = \ln(|\Sigma_{EST}|) + \frac{2k}{T - p}$$
$$MSIC = \ln(|\Sigma_{EST}|) + \frac{k \ln(T - p)}{T - p}$$
where k is the total number of estimated coefficients (across all m equations in the
VAR), and (T − p) is the number of observations used in the estimation.
The specification (i.e. the value of p) which produces the smallest MAIC or MSIC
should be selected.
Note, comparison should only be drawn between versions of the model that have been
estimated over the same observations; that is, when comparing a VAR(p) with a
VAR(p-1) model, do not include the extra available observation in the estimation of the
VAR(p-1).
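A sketch of the comparison, assuming NumPy. The helper names `fit_var`, `maic`, `msic` and `select_lag` are hypothetical. To respect the note about using the same observations, every candidate VAR is estimated from observation `max_lag` onwards, so the number of observations entering the criteria is T minus the largest lag considered.

```python
import numpy as np

def fit_var(Y, p, start):
    """OLS fit of a VAR(p) on observations t = start, ..., T-1.
    Returns (Sigma_EST, k, n_obs)."""
    T, m = Y.shape
    target = Y[start:]
    lags = [Y[start - i:T - i] for i in range(1, p + 1)]
    X = np.column_stack([np.ones(T - start)] + lags)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n_obs = T - start
    sigma_est = resid.T @ resid / n_obs   # ML estimator: divide by n
    k = coef.size                         # coefficients across all equations
    return sigma_est, k, n_obs

def maic(sigma_est, k, n_obs):
    return np.log(np.linalg.det(sigma_est)) + 2 * k / n_obs

def msic(sigma_est, k, n_obs):
    return np.log(np.linalg.det(sigma_est)) + k * np.log(n_obs) / n_obs

def select_lag(Y, max_lag):
    """MAIC and MSIC for VAR(1)..VAR(max_lag) over the same observations."""
    scores = {}
    for p in range(1, max_lag + 1):
        sigma_est, k, n_obs = fit_var(Y, p, start=max_lag)
        scores[p] = (maic(sigma_est, k, n_obs), msic(sigma_est, k, n_obs))
    return scores

# Simulated bivariate VAR(1) (illustrative parameter values).
rng = np.random.default_rng(7)
A = np.array([[0.5, 0.2], [0.1, 0.3]])
Y = np.zeros((400, 2))
for t in range(1, 400):
    Y[t] = A @ Y[t - 1] + rng.standard_normal(2)

scores = select_lag(Y, max_lag=4)
best_by_msic = min(scores, key=lambda p: scores[p][1])
print("lag chosen by MSIC:", best_by_msic)
```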

Cointegration analysis Definition of cointegration
Non-stationary VAR models
If $y_t$ and $x_t$ are non-stationary, the estimation of the VAR model is subject to the
'spurious regression' problem, and modelling cannot proceed without adjustment. In
some cases, however, it is possible to find a pair of constants, $\pi_1$ and $\pi_2$, such that:
$$v_t = y_t - \pi_1 - \pi_2 x_t$$
is stationary or I(0), even when $y_t$ and $x_t$ are both non-stationary or I(1). Note that $v_t$ is
just a linear function of $y_t$ and $x_t$. If this is possible (i.e. if $v_t$ is stationary), then $y_t$ and
$x_t$ are said to be cointegrated.
If two non-stationary time series variables are cointegrated, they tend to 'move together'
over time; in other words, they are bound together by a long-run equilibrium
relationship.

Non-stationary VAR models (cont.)
Before investigating cointegration, we might originally have been thinking of fitting one
of the following specifications:
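$$y_t = \beta_{10} + \beta_{20} x_t + \alpha_1 y_{t-1} + \beta_{21} x_{t-1} + \varepsilon_t$$
$$y_t = \beta_{10} + \beta_{20} x_t + \alpha_1 y_{t-1} + \beta_{21} x_{t-1} + \alpha_2 y_{t-2} + \beta_{22} x_{t-2} + \varepsilon_t$$
. . . or however many lags are required.
Both equations are specified in terms of I(1) variables. However, if $y_t$ and $x_t$ are
cointegrated, both specifications can easily be rearranged so that they contain I(0)
variables only. For the first specification:
$$y_t - y_{t-1} = \beta_{10} + \beta_{20}(x_t - x_{t-1}) + (\alpha_1 - 1) y_{t-1} + (\beta_{21} + \beta_{20}) x_{t-1} + \varepsilon_t$$
$$\Rightarrow \Delta y_t = \delta_{20} \Delta x_t + \Psi (y_{t-1} - \pi_1 - \pi_2 x_{t-1}) + \varepsilon_t$$
where $\delta_{20} = \beta_{20}$, $\Psi = \alpha_1 - 1$, $\pi_1 = \beta_{10}/(1 - \alpha_1)$, $\pi_2 = (\beta_{21} + \beta_{20})/(1 - \alpha_1)$.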
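The rearrangement can be checked numerically. The sketch below uses hypothetical coefficient values and data points, and a hypothetical helper `ecm_parameters`; it confirms that the levels form and the error-correction form give the same value of y at time t.

```python
def ecm_parameters(beta10, beta20, alpha1, beta21):
    """Map the levels-form coefficients to the error-correction form."""
    if alpha1 == 1:
        raise ValueError("alpha1 must differ from 1")
    delta20 = beta20
    psi = alpha1 - 1
    pi1 = beta10 / (1 - alpha1)
    pi2 = (beta21 + beta20) / (1 - alpha1)
    return delta20, psi, pi1, pi2

# Hypothetical coefficients and data points, chosen only for the check.
beta10, beta20, alpha1, beta21 = 0.4, 0.3, 0.8, 0.1
y_prev, x_prev, x_now, eps = 5.0, 2.0, 2.5, 0.05

# Levels form: y_t = b10 + b20*x_t + a1*y_{t-1} + b21*x_{t-1} + e_t
y_levels = beta10 + beta20 * x_now + alpha1 * y_prev + beta21 * x_prev + eps

# Error-correction form:
# dy_t = d20*dx_t + psi*(y_{t-1} - pi1 - pi2*x_{t-1}) + e_t
delta20, psi, pi1, pi2 = ecm_parameters(beta10, beta20, alpha1, beta21)
dy = delta20 * (x_now - x_prev) + psi * (y_prev - pi1 - pi2 * x_prev) + eps
y_ecm = y_prev + dy

print(y_levels, y_ecm)
```

The check fails only when α1 = 1, in which case π1 and π2 are undefined and no error-correction term exists.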

A general definition of cointegration
The aim of cointegration is to detect any stochastic trends in the series and use these
trends for a dynamic analysis of correlation.
With cointegration, if two or more time series are non-stationary but there exists a
linear combination of them that is stationary, then we can correctly conduct
hypothesis testing concerning the relationship between the variables (Engle and
Granger, 1987).
The main advantage of cointegration analysis, as compared to a standard measure of
relationship such as correlation, is that it allows the use of the entire information set
when the series are non-stationary.
Furthermore, cointegration is able to explain the long-run behaviour of related series,
whereas correlation usually lacks stability, because it is a short-run measure of
co-dependency.

A technical definition of cointegration
We say that the components of the vector $\tilde{X}_t$ are cointegrated of order (d, b), which is
denoted by $\tilde{X}_t \sim CI(d, b)$, if:
- all components of $\tilde{X}_t$ are integrated of order d, and
- there exists a vector $\tilde{\beta}$ such that the linear combination
$$\tilde{\beta}' \tilde{X}_t = \beta_1 X_{1t} + \beta_2 X_{2t} + \ldots + \beta_n X_{nt}$$
is integrated of order (d − b), where b > 0. The vector $\tilde{\beta}$ is the cointegrating vector (CV).
While the amount of historical data required to support the cointegrating relationship
may be large, the attempt to use the same sample to estimate correlation coefficients
may face many obstacles such as outliers and volatility clustering.

Important aspects of cointegration
- Cointegration refers to a linear combination of non-stationary variables. The CV is
not unique. For example, if (β1, β2, . . . , βn) is a CV, then for any non-zero λ,
(λβ1, λβ2, . . . , λβn) is also a CV. Typically the CV is normalised with respect to
x1t by selecting λ = 1/β1.
- All variables must be integrated of the same order. This is a prior condition for the
presence of a cointegrating relationship. The converse is not true; this condition
does not imply that all similarly integrated variables are cointegrated, and in fact
this is usually not the case.
- If the vector $\tilde{X}_t$ has n components, there may be as many as (n − 1) linearly
independent cointegrating vectors. For example, if n = 2 then there can be at
most one independent CV.

Cointegration analysis Cointegration and the error correction model
The error correction model
$$\Delta y_t = \delta_{20} \Delta x_t + \Psi (y_{t-1} - \pi_1 - \pi_2 x_{t-1}) + \varepsilon_t$$
is known as an error correction model. On the right-hand side:
- The term in $\Delta x_t$ represents the model's short-run dynamics. It contains
information about the extent to which current changes in $x_t$ influence current
changes in $y_t$ (i.e. $\Delta x_t$ influences $\Delta y_t$).
- The term $\Psi (y_{t-1} - \pi_1 - \pi_2 x_{t-1})$, which can also be written $\Psi v_{t-1}$, is known as
the error correction mechanism. Recall that $y_{t-1} = \pi_1 + \pi_2 x_{t-1}$ represents the long-run
equilibrium relationship between $x_t$ and $y_t$. Accordingly:
  - if $v_{t-1} = y_{t-1} - \pi_1 - \pi_2 x_{t-1} > 0$, $y_{t-1}$ was above its equilibrium value at t − 1;
  - if $v_{t-1} = y_{t-1} - \pi_1 - \pi_2 x_{t-1} < 0$, $y_{t-1}$ was below its equilibrium value at t − 1.
We expect $\Psi < 0$, so that $v_{t-1} > 0 \Rightarrow \Psi v_{t-1} < 0$ and a tendency for $\Delta y_t < 0$ ($y_t$
falling), and $v_{t-1} < 0 \Rightarrow \Psi v_{t-1} > 0$ and a tendency for $\Delta y_t > 0$ ($y_t$ rising). The error
correction model incorporates an adjustment process, pushing $y_t$ towards equilibrium at
time t whenever $y_t$ was out of equilibrium at time t − 1.

Cointegration analysis The Engle-Granger’s approach
A two-step approach for identifying cointegration
Engle and Granger (1987) proposed a simple two-step (residual-based) approach to testing
for cointegration and incorporating the relationship into an estimated model. Other tests
have been developed, including the Johansen procedure, which allows for more than one
cointegrating relationship among several series (the Engle-Granger approach can identify
at most one).
- The Engle-Granger approach begins by testing each series individually for its
order of integration. If the individual time series are integrated of different
orders then it can be concluded with certainty that they are not cointegrated.
- The Engle-Granger procedure is applicable only if both variables are non-stationary
and I(1). Assume that xt and yt are both I(1). If either or both of xt and yt are
non-stationary and I(2), then the procedure could be applied by using the
first differences of the I(2) variable.

Cointegration analysis Engle-Granger’s approach
Engle-Granger’s approach: Step 1
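Obtain the estimated cointegration regression: $y_t = \pi_1 + \pi_2 x_t + v_t$.
Save the residuals $v_t = y_t - \pi_1 - \pi_2 x_t$ (evaluated at the estimated $\pi_1$ and $\pi_2$), and
test $v_t$ for stationarity using a DF or ADF-type procedure:
Test $H_0: \rho = 0$ against $H_1: \rho < 0$ in one of:
$$\Delta v_t = \rho v_{t-1} + \varepsilon_t$$
$$\text{or} \quad \Delta v_t = \rho v_{t-1} + \delta_1 \Delta v_{t-1} + \varepsilon_t$$
$$\text{or} \quad \Delta v_t = \rho v_{t-1} + \delta_1 \Delta v_{t-1} + \delta_2 \Delta v_{t-2} + \varepsilon_t$$
- Accept H0 ⇒ vt is non-stationary ⇒ stop (yt and xt are not cointegrated)
- Reject H0 ⇒ vt is stationary ⇒ proceed to step 2 (yt and xt are cointegrated)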

- AIC or SBIC can be used to select the lag-length for the DF/ADF-type
autoregression. Selecting the correct lag-length is important because the result of
the cointegration test is sensitive to the lag-length chosen.
- No constant term or trend is required in the DF/ADF-type autoregression, because
the sample mean of vt is zero and vt is untrended.
- A separate set of critical values (produced by Engle and Granger) is required to
determine acceptance or rejection of H0. In the multivariate case, the critical
values are dependent on the number of xt's included on the right-hand side of the
cointegration regression.
- The standard Dickey-Fuller critical values cannot be used, because {vt} are
residuals from the regression of yt on xt. This regression will already have introduced
an element of 'smoothing' into {vt}, which the Engle-Granger critical values take
into account.
- In common with the DF and ADF tests, the Engle-Granger cointegration test has
low power in small samples. It is common not to find evidence of cointegration
with this test.

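Engle-Granger's approach: Step 2
Obtain the estimated error correction model by estimating one of the following using
OLS:
$$\Delta y_t = \delta_{20} \Delta x_t + \Psi v_{t-1} + \varepsilon_t$$
$$\text{or} \quad \Delta y_t = \delta_{20} \Delta x_t + \delta_{11} \Delta y_{t-1} + \delta_{21} \Delta x_{t-1} + \Psi v_{t-1} + \varepsilon_t$$
- AIC or SBIC can be used to select the lag-length for the lagged $\Delta y_t$'s and $\Delta x_t$'s in
the error correction model.
- It is not possible to perform hypothesis tests on $\pi_1$ and $\pi_2$, which is a serious
limitation of the Engle-Granger procedure, but hypothesis tests on $\delta_{20}$, $\delta_{11}$, $\delta_{21}$ and
$\Psi$ can be performed.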
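Both steps can be sketched on simulated data, assuming NumPy. The helper `ols`, the seed, and the data-generating values (π1 = 1, π2 = 2, a stationary deviation with autoregressive coefficient 0.5) are assumptions for illustration. The residual-based t-statistic must be compared with Engle-Granger critical values, which this sketch does not tabulate.

```python
import numpy as np

def ols(X, y):
    """Return coefficients, residuals and conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, resid, se

# Simulated cointegrated pair (all parameter values are illustrative).
rng = np.random.default_rng(3)
n = 500
x = np.cumsum(rng.standard_normal(n))              # I(1) driver
v = np.zeros(n)
for t in range(1, n):
    v[t] = 0.5 * v[t - 1] + rng.standard_normal()  # stationary deviation
y = 1.0 + 2.0 * x + v                              # y_t = pi1 + pi2*x_t + v_t

# Step 1: cointegrating regression, then a DF regression on the residuals
# (no constant, no trend, no lagged differences in this simple version).
(pi1_hat, pi2_hat), v_hat, _ = ols(np.column_stack([np.ones(n), x]), y)
rho, _, rho_se = ols(v_hat[:-1, None], np.diff(v_hat))
eg_t_stat = rho[0] / rho_se[0]

# Step 2: error correction model dy_t = d20*dx_t + psi*v_{t-1} + e_t
ecm_X = np.column_stack([np.diff(x), v_hat[:-1]])
(delta20_hat, psi_hat), _, _ = ols(ecm_X, np.diff(y))

print(f"pi2 = {pi2_hat:.3f}, EG t = {eg_t_stat:.2f}, psi = {psi_hat:.3f}")
```

In this design the true error-correction coefficient is 0.5 − 1 = −0.5, so a negative estimate of Ψ is expected.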

Drawbacks of the Engle-Granger’s approach
If the variables are in fact cointegrated, then OLS regression yields superconsistent
estimates of the cointegrating parameters π1 and π2 (the CV). Stock (1987) showed that
OLS coefficient estimates converge faster towards their parameter values in
the presence of a cointegrating relationship than in regressions involving
stationary variables. If the deviations from the long-run equilibrium, vt, are found to be
stationary, I(0), then the yt and xt sequences are cointegrated of order (1,1). The
augmented Dickey-Fuller test can be used to determine the stationarity of the residual
series vt.
The Engle and Granger approach is relatively straightforward and easily implemented in
practice. However, there are significant drawbacks to this approach:
- The Engle and Granger test for cointegration uses residuals from either of the two
equilibrium equations (yt regressed on xt, or xt regressed on yt), so the result can
depend on which variable is placed on the left-hand side.
- The major problem with the Engle and Granger procedure is that it relies on a
two-step estimator, so any error introduced in the first step is carried into the second.

Cointegration analysis Johansen’s approach
Cointegration as a special case of VAR
The Johansen (1988) maximum likelihood estimators circumvent the use of a two-step
estimator and in doing so avoid the drawbacks faced by Engle and Granger.
Instead, the Johansen (1988) procedure relies heavily on the relationship between the
rank of a matrix and its characteristic roots. Johansen (1988) demonstrated that
cointegration can also be modelled with a modiﬁed Vector Autoregression (VAR)
framework. In order to keep notation as simple as possible, this topic will be discussed
for the bivariate case only. Consider the following bivariate VAR(1) model:
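$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$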
Suppose y1t and y2t are both non-stationary or I(1), but a linear combination of y1t and
y2t exists which is stationary or I(0). Therefore y1t and y2t are cointegrated. In this case,
the bivariate VAR(1) model can be reparameterised so that it is expressed in terms of
I(0) variables only, as follows:
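$$\begin{pmatrix} y_{1t} - y_{1,t-1} \\ y_{2t} - y_{2,t-1} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} - 1 & \alpha_{11} \\ \alpha_{21} & \beta_{21} - 1 \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$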

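Specification of a Vector Error Correction Model (VECM)
$$\Rightarrow \begin{pmatrix} \Delta y_{1t} \\ \Delta y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$
where $\pi_{11} = \beta_{11} - 1$, $\pi_{12} = \alpha_{11}$, $\pi_{21} = \alpha_{21}$, $\pi_{22} = \beta_{21} - 1$, or, more concisely:
$$\Delta Y_t = \beta_0 + \pi Y_{t-1} + u_t$$
The above equation is known as a Vector Error Correction Model (VECM) representation
of the bivariate VAR(1) model. Now consider the bivariate VAR(2) model:
$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} \beta_{12} & \alpha_{12} \\ \alpha_{22} & \beta_{22} \end{pmatrix} \begin{pmatrix} y_{1,t-2} \\ y_{2,t-2} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$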
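Again, if $y_{1t}$ and $y_{2t}$ are both I(1), but a linear combination of $y_{1t}$ and $y_{2t}$ exists which is
I(0), then the above bivariate VAR(2) model can be reparameterised and expressed in
terms of I(0) variables only, as follows:
$$\begin{pmatrix} \Delta y_{1t} \\ \Delta y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} \delta_{11} & \gamma_{12} \\ \gamma_{21} & \delta_{22} \end{pmatrix} \begin{pmatrix} \Delta y_{1,t-1} \\ \Delta y_{2,t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$
where $\pi_{11} = \beta_{11} + \beta_{12} - 1$, . . . , $\delta_{11} = -\beta_{12}$, . . . and so on.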
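The VAR(2) to VECM mapping can be verified numerically, assuming NumPy. The coefficient matrices and data points below are hypothetical; `B1` and `B2` stand for the first-lag and second-lag coefficient matrices, `pi` for the matrix on the lagged level, and `gamma` for the matrix on the lagged difference.

```python
import numpy as np

# Hypothetical VAR(2) coefficient matrices (values chosen for illustration).
beta0 = np.array([0.1, 0.2])
B1 = np.array([[0.6, 0.3],
               [0.2, 0.5]])       # [[beta11, alpha11], [alpha21, beta21]]
B2 = np.array([[0.1, -0.2],
               [0.05, 0.3]])      # [[beta12, alpha12], [alpha22, beta22]]

pi = B1 + B2 - np.eye(2)          # e.g. pi11 = beta11 + beta12 - 1
gamma = -B2                       # e.g. delta11 = -beta12

y_lag1 = np.array([1.5, -0.7])    # Y_{t-1}
y_lag2 = np.array([1.1, -0.2])    # Y_{t-2}
u = np.array([0.03, -0.01])       # disturbance at time t

# Levels form and VECM form evaluated at the same data.
y_var = beta0 + B1 @ y_lag1 + B2 @ y_lag2 + u
dy_vecm = beta0 + pi @ y_lag1 + gamma @ (y_lag1 - y_lag2) + u

print(y_var - y_lag1, dy_vecm)
```

The two printed vectors agree because the VECM is an exact rearrangement of the VAR, not an approximation.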

Cointegration as a special case of VAR (cont.)
In the Engle-Granger formulation, there is a presumption that yt is partly determined by
xt . Accordingly ∆yt depends on ∆xt , as well as ∆yt−1 and ∆xt−1.
In the Johansen formulation, y1t and y2t are treated symmetrically, and no causation is
assumed between the current values in either direction. ∆y1t depends only on the lagged
values ∆y1t−1, ∆y2t−1 (and higher-order lags if applicable); similarly, ∆y2t depends only
on the lagged values ∆y1t−1, ∆y2t−1 etc.

Conditions for cointegration
Johansen showed that the condition for a stationary or I(0) linear combination of y1t and
y2t to exist (the condition for y1t and y2t to be cointegrated) depends on the rank of the
matrix
$$\pi = \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix}$$
in the VECMs.
- If y1t and y2t are both stationary or I(0), rank (π) = 2. In this case, any linear
combination of y1t and y2t is stationary. The matrix π contains 2 cointegrating
vectors. In one sense, y1t and y2t are trivially cointegrated. However, it would not
be common practice to refer to y1t and y2t as cointegrated in this case.
- If y1t and y2t are both non-stationary or I(1), but a linear combination of y1t and
y2t exists which is stationary or I(0), rank (π) = 1 and y1t and y2t are
cointegrated. The matrix π contains 1 cointegrating vector.
- If y1t and y2t are both non-stationary or I(1), and no stationary linear combination
of y1t and y2t exists, rank (π) = 0 and y1t and y2t are not cointegrated. The
matrix π contains no cointegrating vectors.
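The three cases can be illustrated with known coefficient matrices, assuming NumPy. The helper name `pi_rank` and the three example matrices are hypothetical. In practice π has to be estimated, so its rank is tested statistically rather than computed directly as here.

```python
import numpy as np

def pi_rank(beta1):
    """Rank of pi = beta1 - I for a bivariate VAR(1)."""
    return int(np.linalg.matrix_rank(beta1 - np.eye(2)))

both_stationary = np.array([[0.5, 0.0],
                            [0.0, 0.3]])   # two independent AR(1) series
cointegrated = np.array([[0.5, 0.5],
                         [0.0, 1.0]])      # y2 is a random walk, y1 tracks it
independent_walks = np.eye(2)              # two unrelated random walks

ranks = [pi_rank(m) for m in (both_stationary, cointegrated, independent_walks)]
print(ranks)   # [2, 1, 0]
```

In the middle case, y1t − y2t follows a stationary AR(1) process with coefficient 0.5, so (1, −1) is the cointegrating vector.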

Interpreting the long-run equilibrium relationship
If rank (π) = 1, it is possible to decompose π as follows:
$$\pi = \begin{pmatrix} \pi_{11} & \pi_{12} \\ \pi_{21} & \pi_{22} \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \begin{pmatrix} 1 & -b_2 \end{pmatrix} = \begin{pmatrix} a_1 & -a_1 b_2 \\ a_2 & -a_2 b_2 \end{pmatrix}$$
$a_1$ and $a_2$ are known as the adjustment parameters. These parameters play a similar role
to Ψ seen in the previous error correction model. $(1, -b_2)$ is known as the cointegrating
vector.
In equilibrium, nothing is changing and there is no disturbance. In order to study the
long-run equilibrium relationship between y1t and y2t, we set the first-differenced
variables and the error terms to zero.

Testing for cointegration
In the Johansen VECM formulation, to test for cointegration we need to test hypotheses
concerning the rank of the matrix π. Let r denote rank (π).
Johansen developed two test statistics, known as the trace statistic and the maximal
eigenvalue statistic. Using Johansen's notation, these are denoted λtrace and λmax.
The formulation of the null hypothesis differs very slightly between the two procedures.
In both cases, acceptance or rejection of the null is decided by comparing the test
statistic with special critical values compiled by Johansen.

Testing for cointegration (cont.)
For a bivariate model, the tests would be carried out in two stages:
Stage 1    λtrace       λmax
H0         r = 0        r = 0
H1         r > 0        r = 1
Accept H0 ⇒ series are I(1) and not cointegrated ⇒ STOP.
Reject H0 ⇒ proceed to Stage 2.
Stage 2    λtrace       λmax
H0         r ≤ 1        r ≤ 1
H1         r = 2        r = 2
Accept H0 ⇒ series are I(1) and cointegrated.
Reject H0 ⇒ series are I(0).
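The slides do not spell out how the two statistics are computed. The sketch below assumes NumPy and uses the standard formulas λtrace(r) = −T Σ ln(1 − λi) over the eigenvalues beyond the r largest, and λmax(r) = −T ln(1 − λ(r+1)), for a VAR(1) in VECM form with an unrestricted constant. The function name `johansen_stats` and the simulated system are hypothetical, and the statistics still have to be compared with Johansen's tabulated critical values.

```python
import numpy as np

def johansen_stats(Y):
    """Eigenvalues, trace and max-eigenvalue statistics for
    dY_t = b0 + pi * Y_{t-1} + u_t (VAR(1), unrestricted constant)."""
    Y = np.asarray(Y, dtype=float)
    dY = np.diff(Y, axis=0)
    lagged = Y[:-1]
    T = len(dY)
    r0 = dY - dY.mean(axis=0)            # partial out the constant
    r1 = lagged - lagged.mean(axis=0)
    s00 = r0.T @ r0 / T
    s11 = r1.T @ r1 / T
    s01 = r0.T @ r1 / T
    # Eigenvalues of S11^{-1} S10 S00^{-1} S01 (squared canonical correlations).
    m = np.linalg.solve(s11, s01.T) @ np.linalg.solve(s00, s01)
    eig = np.sort(np.linalg.eigvals(m).real)[::-1]   # descending order
    trace = [-T * np.sum(np.log(1 - eig[r:])) for r in range(len(eig))]
    max_eig = [-T * np.log(1 - eig[r]) for r in range(len(eig))]
    return eig, trace, max_eig

# Simulated cointegrated system: y2 is a random walk and y1 tracks it.
rng = np.random.default_rng(11)
B = np.array([[0.5, 0.5],
              [0.0, 1.0]])
Y = np.zeros((500, 2))
for t in range(1, 500):
    Y[t] = B @ Y[t - 1] + rng.standard_normal(2)

eig, trace, max_eig = johansen_stats(Y)
print("eigenvalues:", np.round(eig, 3))
print("trace statistics (r = 0, r <= 1):", np.round(trace, 2))
```

`trace[0]` and `max_eig[0]` are the Stage 1 statistics and `trace[1]` and `max_eig[1]` the Stage 2 statistics; in the bivariate case the two Stage 2 statistics coincide.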

Cointegration analysis References
References
Alexander, C. and Dimitriu, A. (2002) The Cointegration Alpha: Enhanced Index Tracking and Long-Short Equity Market Neutral Strategies. ISMA Finance Discussion Paper No. 2002-08, June, pp. 1-55.
Dickey, D. and Fuller, W. (1979) Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association, 74, pp. 427-431.
Engle, R. and Granger, C. (1987) Co-integration and Error Correction: Representation, Estimation and Testing. Econometrica, 55(2), pp. 251-276.
Johansen, S. (1988) Statistical Analysis of Cointegration Vectors. Journal of Economic Dynamics and Control, 12, pp. 231-254.
Pearl, J. (2000) Causal Inference Without Counterfactuals: Comment. Journal of the American Statistical Association, 95(450), pp. 428-431.
Stock, J. (1987) Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors. Econometrica, 55(5), pp. 1035-1056.

Appendix Granger causality test
Granger causality test
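Granger (1969) describes a test that assesses whether the past values of one variable
'cause' the current values of another variable.
Consider the bivariate VAR(2) model:
$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} \beta_{12} & \alpha_{12} \\ \alpha_{22} & \beta_{22} \end{pmatrix} \begin{pmatrix} y_{1,t-2} \\ y_{2,t-2} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$
This model is equivalent to:
$$y_{1t} = \beta_{10} + \beta_{11} y_{1,t-1} + \alpha_{11} y_{2,t-1} + \beta_{12} y_{1,t-2} + \alpha_{12} y_{2,t-2} + u_{1t}$$
$$y_{2t} = \beta_{20} + \alpha_{21} y_{1,t-1} + \beta_{21} y_{2,t-1} + \alpha_{22} y_{1,t-2} + \beta_{22} y_{2,t-2} + u_{2t}$$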

Hypothesis of Granger causality testing
The terminology of Granger causality testing:
- The variable y1 Granger-causes the variable y2 if the coefficients on y1t−1 and
y1t−2 are jointly significant in the equation for y2t.
- The variable y2 Granger-causes the variable y1 if the coefficients on y2t−1 and
y2t−2 are jointly significant in the equation for y1t.
Normally, Granger causality tests involve testing restrictions on one equation at a time.
A standard F-test can be used to determine acceptance or rejection of the following:
- H0 : α21 = α22 = 0. If we can reject this null hypothesis, we can infer that y1
Granger-causes y2.
- H0 : α11 = α12 = 0. If we can reject this null hypothesis, we can infer that y2
Granger-causes y1.
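The F-test can be sketched by comparing a restricted and an unrestricted regression, assuming NumPy. The function names `rss` and `granger_f`, the seed, and the simulated system (in which lagged y1 enters the equation for y2 but not the reverse) are assumptions for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(cause, effect, p=2):
    """F-statistic for H0: the p lags of `cause` have zero coefficients
    in the equation for `effect` (which also contains p own lags)."""
    n = len(effect)
    y = effect[p:]
    own = [effect[p - i:n - i] for i in range(1, p + 1)]
    other = [cause[p - i:n - i] for i in range(1, p + 1)]
    X_restricted = np.column_stack([np.ones(n - p)] + own)
    X_full = np.column_stack([np.ones(n - p)] + own + other)
    rss_r, rss_u = rss(X_restricted, y), rss(X_full, y)
    dof = len(y) - X_full.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

# Simulated system: y1 drives y2 with a one-period lag, not the reverse.
rng = np.random.default_rng(5)
n = 500
y1 = np.zeros(n)
y2 = np.zeros(n)
for t in range(1, n):
    y1[t] = 0.5 * y1[t - 1] + rng.standard_normal()
    y2[t] = 0.3 * y2[t - 1] + 0.5 * y1[t - 1] + rng.standard_normal()

f_y1_to_y2 = granger_f(y1, y2)   # H0: alpha21 = alpha22 = 0
f_y2_to_y1 = granger_f(y2, y1)   # H0: alpha11 = alpha12 = 0
print(f"F(y1 -> y2) = {f_y1_to_y2:.1f}, F(y2 -> y1) = {f_y2_to_y1:.2f}")
```

Each statistic is compared with the F distribution with p and (n − p − 2p − 1) degrees of freedom, here 2 and 493.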
