1. Chapter 22 Time Series Econometrics: Forecasting 791
22.10 Measuring Volatility in Financial Time Series:
The ARCH and GARCH Models
As noted in the introduction to this chapter, financial time series, such as stock prices,
exchange rates, inflation rates, etc., often exhibit the phenomenon of volatility clustering,
that is, periods in which their prices show wide swings for an extended time period
followed by periods in which there is relative calm. As Philip Franses notes:
Since such [financial time series] data reflect the result of trading among buyers and sellers at,
for example, stock markets, various sources of news and other exogenous economic events
may have an impact on the time series pattern of asset prices. Given that news can lead to
various interpretations, and also given that specific economic events like an oil crisis can last
for some time, we often observe that large positive and large negative observations in financial
time series tend to appear in clusters.21
Knowledge of volatility is of crucial importance in many areas. For example, consider-
able macroeconometric work has been done in studying the variability of inflation over
time. For some decision makers, inflation in itself may not be bad, but its variability is bad
because it makes financial planning difficult.
The same is true of importers, exporters, and traders in foreign exchange markets, for
variability in the exchange rates means huge losses or profits. Investors in the stock market
are obviously interested in the volatility of stock prices, for high volatility could mean huge
losses or gains and hence greater uncertainty. In volatile markets it is difficult for compa-
nies to raise capital in the capital markets.
How do we model financial time series that may experience such volatility? For example,
how do we model time series of stock prices, exchange rates, inflation, etc.? A characteristic
of most of these financial time series is that in their level form they are random
walks; that is, they are nonstationary. On the other hand, in the first difference form, they
are generally stationary, as we saw in the case of GDP series in the previous chapter, even
though GDP is not strictly a financial time series.
Therefore, instead of modeling the levels of financial time series, why not model their first
differences? But these first differences often exhibit wide swings, or volatility, suggesting
that the variance of financial time series varies over time. How can we model such “varying
variance”? This is where the so-called autoregressive conditional heteroscedasticity
(ARCH) model originally developed by Engle comes in handy.22
As the name suggests, heteroscedasticity, or unequal variance, may have an autoregres-
sive structure in that heteroscedasticity observed over different periods may be autocorre-
lated. To see what all this means, let us consider a concrete example.
21 Philip Hans Franses, Time Series Models for Business and Economic Forecasting, Cambridge University
Press, New York, 1998, p. 155.
22 R. Engle, “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United
Kingdom Inflation,” Econometrica, vol. 50, no. 4, 1982, pp. 987–1007. See also A. Bera and M.
Higgins, “ARCH Models: Properties, Estimation and Testing,” Journal of Economic Surveys, vol. 7, 1993,
pp. 305–366.
EXAMPLE 22.1 U.S./U.K. Exchange Rate: An Example
Figure 22.6 gives logs of the monthly U.S./U.K. exchange rate (dollars per pound) for the
period 1971–2007, for a total of 444 monthly observations. As you can see from this
figure, there are considerable ups and downs in the exchange rate over the sample period.
To see this more vividly, in Figure 22.7 we plot the changes in the logs of the exchange
guj75772_ch22.qxd 01/09/2008 01:36 PM Page 791
2. 792 Part Four Simultaneous-Equation Models and Time Series Econometrics
rate; note that changes in the log of a variable denote relative changes, which, if multi-
plied by 100, give percentage changes. As you can observe, the relative changes in the
U.S./U.K. exchange rate show periods of wide swings for some time periods and periods
of rather moderate swings in other time periods, thus exemplifying the phenomenon of
volatility clustering.
Now the practical question is: How do we statistically measure volatility? Let us
illustrate this with our exchange rate example.
Let $Y_t$ = U.S./U.K. exchange rate
$Y_t^*$ = log of $Y_t$
$dY_t^* = Y_t^* - Y_{t-1}^*$ = relative change in the exchange rate
$d\bar{Y}_t^*$ = mean of $dY_t^*$
$X_t = dY_t^* - d\bar{Y}_t^*$
FIGURE 22.6 Log of U.S./U.K. exchange rate, 1971–2007 (monthly). [Plot not reproduced; vertical axis: log exchange rate, 0 to 1.2; horizontal axis: year, 1971 to 2007.]
FIGURE 22.7 Change in the log of U.S./U.K. exchange rate. [Plot not reproduced; vertical axis: change in log exchange rate, −0.15 to 0.15; horizontal axis: year, 1971 to 2007.]
23 You might wonder why we do not use the variance of $X_t$, that is, $\sum X_t^2 / n$, as a measure of volatility. This
is because we want to take into account changing volatility of asset prices over time. If we use the
variance of $X_t$, it will only be a single value for a given data set.
Thus, $X_t$ is the mean-adjusted relative change in the exchange rate. Now we can use $X_t^2$
as a measure of volatility. Being a squared quantity, its value will be high in periods when
there are big changes in the prices of financial assets and its value will be comparatively
small when there are modest changes in the prices of financial assets.23
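The construction of the volatility measure can be sketched in a few lines of Python. This is only an illustration: the function name and the six exchange-rate values are our own hypothetical choices, not the data plotted in Figure 22.6.

```python
import math

def squared_mean_adjusted_changes(levels):
    """Return (X_t, X_t^2) for a list of positive levels Y_t."""
    logs = [math.log(y) for y in levels]                 # Y*_t = log of Y_t
    changes = [b - a for a, b in zip(logs, logs[1:])]    # dY*_t = Y*_t - Y*_{t-1}
    mean_change = sum(changes) / len(changes)            # mean of dY*_t
    x = [c - mean_change for c in changes]               # X_t, mean-adjusted change
    x_squared = [v * v for v in x]                       # X_t^2, the volatility proxy
    return x, x_squared

# Hypothetical exchange-rate levels (dollars per pound), for illustration only.
rates = [2.40, 2.45, 2.38, 2.50, 2.20, 2.25]
x, x2 = squared_mean_adjusted_changes(rates)
for t, (xt, x2t) in enumerate(zip(x, x2), start=1):
    print(f"t={t}  X_t={xt:+.5f}  X_t^2={x2t:.6f}")
```

In this made-up series the largest value of the squared term falls in the period with the biggest swing (the move from 2.50 to 2.20), which is the behavior described above.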
Accepting $X_t^2$ as a measure of volatility, how do we know if it changes over time?
Suppose we consider the following AR(1), or ARIMA (1, 0, 0), model:
$$X_t^2 = \beta_0 + \beta_1 X_{t-1}^2 + u_t \qquad (22.10.1)$$
This model postulates that volatility in the current period is related to its value in the
previous period plus a white noise error term. If $\beta_1$ is positive, it suggests that if volatility was
high in the previous period, it will continue to be high in the current period, indicating
volatility clustering. If $\beta_1$ is zero, then there is no volatility clustering. The statistical
significance of the estimated $\beta_1$ can be judged by the usual t test.
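To make the mechanics concrete, the following sketch fits the AR(1) regression of the squared series on its own lag by OLS and reports the t ratio of the slope. Everything in it is an assumption made for illustration: the data are simulated with hypothetical parameter values (they are not the exchange rate data), and the helper names `ols_simple` and `simulate_arch1` are our own.

```python
import math
import random

def ols_simple(x, y):
    """OLS of y on a constant and x. Returns (b0, b1, t ratio of b1, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    se_b1 = math.sqrt(rss / (n - 2) / sxx)     # usual OLS standard error of the slope
    return b0, b1, b1 / se_b1, 1.0 - rss / tss

def simulate_arch1(n, alpha0, alpha1, seed=12345):
    """Simulated series whose conditional variance is alpha0 + alpha1 * (previous value)^2."""
    rng = random.Random(seed)
    series, previous = [], 0.0
    for _ in range(n):
        sigma = math.sqrt(alpha0 + alpha1 * previous ** 2)
        previous = rng.gauss(0.0, sigma)
        series.append(previous)
    return series

# Hypothetical parameter values; the simulated series has zero mean by construction,
# so it plays the role of the mean-adjusted change X_t.
x = simulate_arch1(2000, alpha0=0.0004, alpha1=0.3)
x2 = [v * v for v in x]
b0, b1, t_b1, r2 = ols_simple(x2[:-1], x2[1:])   # regress X_t^2 on X_{t-1}^2
print(f"b0 = {b0:.6f}, b1 = {b1:.4f}, t(b1) = {t_b1:.2f}, R^2 = {r2:.4f}")
```

A positive and statistically significant slope in such a regression is what the text interprets as evidence of volatility clustering.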
There is nothing to prevent us from considering an AR(p) model of volatility such that
$$X_t^2 = \beta_0 + \beta_1 X_{t-1}^2 + \beta_2 X_{t-2}^2 + \cdots + \beta_p X_{t-p}^2 + u_t \qquad (22.10.2)$$
This model suggests that volatility in the current period is related to volatility in the past p
periods, the value of p being an empirical question. This empirical question can be resolved
by one or more of the model selection criteria that we discussed in Chapter 13 (e.g., the
Akaike information measure). We can test the significance of any individual β coefficient by
the t test and the collective significance of two or more coefficients by the usual F test.
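One way to let the data choose p is to fit the AR(p) regression for several values of p on a common sample and compare an information criterion. The sketch below is a minimal, standard-library illustration under our own assumptions: it uses one common form of the Akaike criterion, ln AIC = 2k/n + ln(RSS/n), simulated data with hypothetical parameters, and helper names (`solve_linear`, `ols`, `log_aic_by_lag`) that are ours.

```python
import math
import random

def solve_linear(a, b):
    """Solve the square system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def ols(rows, y):
    """OLS via the normal equations; each row must already contain the constant 1.0."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    return beta, rss

def log_aic_by_lag(x2, max_p):
    """ln AIC for AR(p) fits of x2, p = 1..max_p, all estimated on the same observations."""
    y = x2[max_p:]
    n = len(y)
    out = {}
    for p in range(1, max_p + 1):
        rows = [[1.0] + [x2[t - j] for j in range(1, p + 1)] for t in range(max_p, len(x2))]
        _, rss = ols(rows, y)
        out[p] = 2.0 * (p + 1) / n + math.log(rss / n)   # k = p slopes plus the intercept
    return out

# Simulated ARCH(1)-type data with hypothetical parameters, for illustration only.
rng = random.Random(99)
previous, x2 = 0.0, []
for _ in range(1500):
    previous = rng.gauss(0.0, math.sqrt(0.0004 + 0.3 * previous ** 2))
    x2.append(previous ** 2)

aic = log_aic_by_lag(x2, max_p=4)
best_p = min(aic, key=aic.get)
for p in sorted(aic):
    print(f"p = {p}: ln AIC = {aic[p]:.4f}")
print("lag length with the smallest ln AIC:", best_p)
```

Holding the estimation sample fixed across values of p matters here; otherwise the criteria would be computed on different numbers of observations and would not be comparable.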
Model (22.10.1) is an example of an ARCH(1) model and Eq. (22.10.2) is called an
ARCH(p) model, where p represents the number of autoregressive terms in the model.
Before proceeding further, let us illustrate the ARCH model with the U.S./U.K.
exchange rate data. The results of the ARCH(1) model were as follows.
$$\begin{aligned}
\widehat{X_t^2} &= 0.00043 + 0.23036\,X_{t-1}^2 \\
t &= (7.71) \qquad (4.97) \qquad\qquad (22.10.3) \\
R^2 &= 0.0531 \qquad d = 1.9933
\end{aligned}$$
where $X_t^2$ is as defined before.
Since the coefficient of the lagged term is highly significant (p value of about 0.000),
volatility clustering appears to be present in this instance. We tried higher-order ARCH
models, but only the ARCH(1) model turned out to be significant.
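As a rough worked check on what such estimates imply, one can take expectations on both sides of the AR(1) model for the squared series. The calculation is only a sketch: it assumes that the squared series is stationary with $0 \le \beta_1 < 1$, and it treats the reported point estimates (0.00043 and 0.23036) as if they were the true parameters.

$$E(X_t^2) = \beta_0 + \beta_1 E(X_{t-1}^2) \;\Longrightarrow\; E(X_t^2) = \frac{\beta_0}{1 - \beta_1} \approx \frac{0.00043}{1 - 0.23036} \approx 0.00056$$

The square root of this figure is about 0.024, so under these assumptions a typical mean-adjusted monthly relative change in the exchange rate is on the order of 2.4 percent.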
How would we test for the ARCH effect in a regression model in general that is based
on time series data? To be more specific, let us consider the k-variable linear regression
model:
$$Y_t = \beta_1 + \beta_2 X_{2t} + \cdots + \beta_k X_{kt} + u_t \qquad (22.10.4)$$
and assume that conditional on the information available at time (t − 1), the disturbance
term is distributed as
$$u_t \sim N\left[0, \left(\alpha_0 + \alpha_1 u_{t-1}^2\right)\right] \qquad (22.10.5)$$
that is, $u_t$ is normally distributed with zero mean and
$$\operatorname{var}(u_t) = \alpha_0 + \alpha_1 u_{t-1}^2 \qquad (22.10.6)$$
that is, the variance of $u_t$ follows an ARCH(1) process.
The normality of $u_t$ is not new to us. What is new is that the variance of u at time t is
dependent on the squared disturbance at time (t − 1), thus giving the appearance of serial
correlation.24 Of course, the error variance may depend not only on one lagged term of
the squared error term but also on several lagged squared terms as follows:
$$\operatorname{var}(u_t) = \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \cdots + \alpha_p u_{t-p}^2 \qquad (22.10.7)$$
If there is no autocorrelation in the error variance, we have
H0: α1 = α2 = · · · = αp = 0 (22.10.8)
in which case var(ut) = α0, and we do not have the ARCH effect.
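Since we do not directly observe $\sigma_t^2$, Engle has shown that the preceding null hypothesis
can easily be tested by running the following regression:
$$\hat{u}_t^2 = \hat{\alpha}_0 + \hat{\alpha}_1 \hat{u}_{t-1}^2 + \hat{\alpha}_2 \hat{u}_{t-2}^2 + \cdots + \hat{\alpha}_p \hat{u}_{t-p}^2 \qquad (22.10.9)$$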
where ût, as usual, denotes the OLS residuals obtained from the original regression model
(22.10.4).
One can test the null hypothesis $H_0$ by the usual F test, or alternatively, by computing
$nR^2$, where $R^2$ is the coefficient of determination from the auxiliary regression (22.10.9).
It can be shown that
$$nR^2 \overset{\text{asy}}{\sim} \chi_p^2 \qquad (22.10.10)$$
that is, in large samples $nR^2$ follows the chi-square distribution with df equal to the
number of autoregressive terms in the auxiliary regression.
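The nR² version of the test is easy to compute by hand. The sketch below handles only the simplest case, p = 1, so that the statistic is compared with the 5 percent chi-square critical value for 1 df (3.841). The simulated series, the parameter values, and the function names are hypothetical; the simulated errors simply stand in for the OLS residuals of the original regression.

```python
import math
import random

CHI2_1DF_5PCT = 3.841  # 5 percent critical value, chi-square with 1 df

def arch_lm_statistic(residuals):
    """Return n * R^2 from regressing e_t^2 on a constant and e_{t-1}^2 (the p = 1 case)."""
    squared = [e * e for e in residuals]
    x, y = squared[:-1], squared[1:]
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r_squared = sxy * sxy / (sxx * syy)    # R^2 of an OLS fit with one regressor
    return n * r_squared

def simulate_errors(n, alpha0, alpha1, seed):
    """Errors with conditional variance alpha0 + alpha1 * e_{t-1}^2 (alpha1 = 0: no ARCH)."""
    rng = random.Random(seed)
    errors, previous = [], 0.0
    for _ in range(n):
        previous = rng.gauss(0.0, math.sqrt(alpha0 + alpha1 * previous ** 2))
        errors.append(previous)
    return errors

white_noise = simulate_errors(2000, alpha0=0.0004, alpha1=0.0, seed=1)
arch_errors = simulate_errors(2000, alpha0=0.0004, alpha1=0.3, seed=1)
for label, series in (("no ARCH", white_noise), ("ARCH(1)", arch_errors)):
    stat = arch_lm_statistic(series)
    verdict = "reject H0" if stat > CHI2_1DF_5PCT else "do not reject H0"
    print(f"{label:8s} n*R^2 = {stat:8.2f} -> {verdict}")
```

As a back-of-the-envelope use of the same rule with the exchange rate regression (22.10.3): taking n to be roughly 442 (our own count of usable observations, not a figure reported in the text), nR² is about 442 × 0.0531 ≈ 23.5, far above 3.841, which agrees with the conclusion drawn from the t test.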
Before we proceed to illustrate, make sure that you do not confuse autocorrelation of
the error term as discussed in Chapter 12 and the ARCH model. In the ARCH model it is
the (conditional) variance of ut that depends on the (squared) previous error terms, thus
giving the impression of autocorrelation.
24 A technical note: Remember that for our classical linear model the variance of $u_t$ was assumed to be
$\sigma^2$, which in the present context becomes unconditional variance. If $\alpha_1 < 1$, the stability condition,
we can write $\sigma^2 = \alpha_0 + \alpha_1 \sigma^2$; that is, $\sigma^2 = \alpha_0/(1 - \alpha_1)$. This shows that the unconditional variance
of u does not depend on t, but does depend on the ARCH parameter $\alpha_1$.
25 This graph and the regression results presented in this example are based on the data collected by
Gary Koop, Analysis of Economic Data, John Wiley & Sons, New York, 2000 (data from the data disk). The
monthly percentage change in the stock price index can be regarded as a rate of return on the index.
EXAMPLE 22.2 New York Stock Exchange Price Changes
As a further illustration of the ARCH effect, Figure 22.8 presents monthly percentage
change in the NYSE (New York Stock Exchange) Index for the period 1966–2002.25 It is
evident from this graph that the percent price changes in the NYSE Index exhibit considerable
volatility. Notice especially the wide swing around the 1987 crash in stock prices.
To capture the volatility in the stock return seen in the figure, let us consider a very
simple model:
Yt = β1 + ut (22.10.11)
where Yt = percent change in the NYSE stock index and ut = random error term.
Notice that besides the intercept, there is no other explanatory variable in the model.
From the data, we obtained the following OLS regression:
$$\begin{aligned}
\hat{Y}_t &= 0.00574 \\
t &= (3.36) \qquad\qquad (22.10.12) \\
d &= 1.4915
\end{aligned}$$
What does this intercept denote? It is simply the average percent rate of return on the
NYSE index, or the mean value of $Y_t$ (can you verify this?). Thus over the sample period
the average monthly return on the NYSE index was about 0.00574 (that is, roughly 0.57 percent
per month if, as the scale of Figure 22.8 suggests, the changes are recorded as decimal fractions).
Now we obtain the residuals from the preceding regression and estimate the ARCH(1)
model, which gave the following results:
$$\begin{aligned}
\hat{u}_t^2 &= 0.000007 + 0.25406\,\hat{u}_{t-1}^2 \\
t &= (0.000) \qquad (5.52) \qquad\qquad (22.10.13) \\
R^2 &= 0.0645 \qquad d = 1.9464
\end{aligned}$$
where $\hat{u}_t$ is the estimated residual from regression (22.10.12).
Since the lagged squared disturbance term is statistically significant (p value of about
0.000), it seems the error variances are correlated; that is, there is an ARCH effect. We tried
higher-order ARCH models but only ARCH(1) was statistically significant.
FIGURE 22.8 Monthly percent change in the NYSE Price Index, 1966–2002. [Plot not reproduced; vertical axis: change, %, −0.15 to 0.15; horizontal axis: year, 1966 to 2001.]
What to Do If ARCH Is Present
Recall that we have discussed several methods of correcting for heteroscedasticity, which
basically involved applying OLS to transformed data. Remember that OLS applied to
transformed data is generalized least squares (GLS). If the ARCH effect is found, we will have
to use GLS. We will not pursue the technical details, for they are beyond the scope of
this book.26 Fortunately, software packages such as EViews, SHAZAM, MICROFIT, and
PC-GIVE now have user-friendly routines to estimate such models.
26 Consult Russell Davidson and James G. MacKinnon, Estimation and Inference in Econometrics, Oxford
University Press, New York, 1993, Section 16.4, and William H. Greene, Econometric Analysis, 4th ed.,
Prentice Hall, Englewood Cliffs, NJ, 2000, Section 18.5.
A Word on the Durbin–Watson d and the ARCH Effect
We have reminded the reader several times that a significant d statistic may not always
mean that there is significant autocorrelation in the data at hand. Very often a significant d
value is an indication of the model specification errors that we discussed in Chapter 13.
Now we have an additional specification error, due to the ARCH effect. Therefore, in a time
series regression, if a significant d value is obtained, we should test for the ARCH effect
before accepting the d statistic at its face value. An example is given in Exercise 22.23.
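For reference, the d statistic itself is simple to compute from a series of residuals. The sketch below is a minimal illustration with hypothetical residual values; the function name is our own.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared successive differences over the sum of squares."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Hypothetical residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every period: d well above 2
persistent = [1.0, 0.9, 0.8, 0.7]      # slowly changing residuals: d close to 0
print("d (alternating):", durbin_watson(alternating))
print("d (persistent): ", round(durbin_watson(persistent), 4))
```

A value of d far from 2 only signals that something is wrong with the residuals; as the text stresses, it does not by itself say whether the cause is autocorrelation, another specification error, or an ARCH effect.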
A Note on the GARCH Model
Since its “discovery” in 1982, ARCH modeling has become a growth industry, with all
kinds of variations on the original model. One that has become popular is the generalized
autoregressive conditional heteroscedasticity (GARCH) model, originally proposed
by Bollerslev.27 The simplest GARCH model is the GARCH(1, 1) model, which can be
written as:
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 \sigma_{t-1}^2 \qquad (22.10.14)$$
which says that the conditional variance of u at time t depends not only on the squared error
term in the previous time period (as in ARCH[1]) but also on its conditional variance in the
previous time period. This model can be generalized to a GARCH(p, q) model in which there
are p lagged terms of the squared error term and q terms of the lagged conditional variances.
We will not pursue the technical details of these models, as they are involved, except
to point out that, by repeatedly substituting for the lagged conditional variance, a
GARCH(1, 1) model can be written as an ARCH model of infinite order whose weights on past
squared errors decline geometrically; this is why a low-order GARCH model can stand in
for a long ARCH(p) model.28
For our U.S./U.K. exchange rate and NYSE stock return examples, we have already
stated that higher-order ARCH terms were not significant, suggesting that perhaps the
longer memory of a GARCH(1, 1) model is not needed in these cases.
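How a GARCH(1, 1) model relates to pure ARCH models can be checked numerically. Substituting repeatedly for the lagged conditional variance in (22.10.14) expresses the conditional variance as a constant plus a weighted sum of all past squared errors, with weights that decline geometrically. The sketch below compares the recursion with this expanded form; the parameter values, the shocks, and the function names are hypothetical choices made for illustration.

```python
def garch11_variances(u, alpha0, alpha1, alpha2, sigma2_0):
    """Conditional variances from the recursion (22.10.14); entry t uses u[t-1]."""
    sigma2 = [sigma2_0]
    for u_prev in u:
        sigma2.append(alpha0 + alpha1 * u_prev ** 2 + alpha2 * sigma2[-1])
    return sigma2

def expanded_variance(u, t, alpha0, alpha1, alpha2, sigma2_0):
    """The same variance written in terms of lagged squared errors only (plus start-up term)."""
    constant = alpha0 * (1.0 - alpha2 ** t) / (1.0 - alpha2)
    arch_part = alpha1 * sum(alpha2 ** (j - 1) * u[t - j] ** 2 for j in range(1, t + 1))
    return constant + arch_part + alpha2 ** t * sigma2_0

# Hypothetical parameters and shocks, for illustration only.
A0, A1, A2 = 0.0001, 0.10, 0.80
shocks = [0.01, -0.03, 0.02, 0.05, -0.04, 0.00, 0.01]
start = A0 / (1.0 - A1 - A2)          # start the recursion at the unconditional variance

recursive = garch11_variances(shocks, A0, A1, A2, start)
for t in range(1, len(shocks) + 1):
    direct = expanded_variance(shocks, t, A0, A1, A2, start)
    print(f"t={t}: recursion = {recursive[t]:.8f}, expanded form = {direct:.8f}")

# Implied weight on the squared error j periods back: alpha1 * alpha2^(j-1).
print("ARCH weights:", [round(A1 * A2 ** (j - 1), 4) for j in range(1, 6)])
```

Because the weights never reach exactly zero, no finite-order ARCH(p) model reproduces the GARCH(1, 1) variance exactly, although a sufficiently long one can approximate it.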
22.11 Concluding Examples
We conclude this chapter by considering a few additional examples that illustrate some of
the points we have made in this chapter.
EXAMPLE 22.3 The Relationship between the Help-Wanted Index (HWI) and the Unemployment Rate (UN) from January 1969 to January 2000
To study causality between HWI and UN, two indicators of labor market conditions in the
United States, Marc A. Giammatteo considered the following regression model:29
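$$\mathrm{HWI}_t = \alpha_0 + \sum_{i=1}^{25} \alpha_i\,\mathrm{UN}_{t-i} + \sum_{j=1}^{25} \beta_j\,\mathrm{HWI}_{t-j} \qquad (22.11.1)$$
$$\mathrm{UN}_t = \alpha_0 + \sum_{i=1}^{25} \lambda_i\,\mathrm{UN}_{t-i} + \sum_{j=1}^{25} \delta_j\,\mathrm{HWI}_{t-j} \qquad (22.11.2)$$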
To save space we will not present the actual regression results, but the main conclusion
that emerges from this study is that there is bilateral causality between the two labor
market indicators and this conclusion did not change when the lag length was varied. The
data on HWI and UN are given on the textbook website as Table 22.5.
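The logic of a causality test based on regressions of this kind can be sketched as an F test that compares a regression on own lags only (restricted) with one that adds lags of the other variable (unrestricted). The code below is an illustration under our own assumptions: it uses 2 lags rather than 25, simulated data in which causality runs in one direction only (unlike the bilateral result reported for HWI and UN), and hypothetical helper names.

```python
import random

def solve_linear(a, b):
    """Solve the square system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def rss_ols(rows, y):
    """Residual sum of squares from OLS of y on the columns in rows (constant included)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))

def granger_f(y, x, lags):
    """F statistic for H0: lags of x do not help to predict y, given lags of y."""
    target = y[lags:]
    restricted, unrestricted = [], []
    for t in range(lags, len(y)):
        own = [y[t - j] for j in range(1, lags + 1)]
        other = [x[t - j] for j in range(1, lags + 1)]
        restricted.append([1.0] + own)
        unrestricted.append([1.0] + own + other)
    rss_r, rss_ur = rss_ols(restricted, target), rss_ols(unrestricted, target)
    n, k = len(target), 2 * lags + 1
    return ((rss_r - rss_ur) / lags) / (rss_ur / (n - k))

# Simulated data: x affects y with a one-period lag, but y does not affect x.
rng = random.Random(7)
n_obs = 300
x_series = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
y_series = [0.0]
for t in range(1, n_obs):
    y_series.append(0.5 * y_series[t - 1] + 0.8 * x_series[t - 1] + rng.gauss(0.0, 1.0))

f_x_to_y = granger_f(y_series, x_series, lags=2)
f_y_to_x = granger_f(x_series, y_series, lags=2)
print(f"F for 'x does not cause y': {f_x_to_y:.2f}")
print(f"F for 'y does not cause x': {f_y_to_x:.2f}")
```

Each F value would be compared with the critical F value for (lags, n − k) degrees of freedom; bilateral causality corresponds to rejecting the null hypothesis in both directions.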
27 T. Bollerslev, “Generalized Autoregressive Conditional Heteroscedasticity,” Journal of Econometrics,
vol. 31, 1986, pp. 307–326.
28 For details, see Davidson and MacKinnon, op. cit., pp. 558–560.
29 Marc A. Giammatteo (West Point, Class of 2000), “The Relationship between the Help Wanted Index
and the Unemployment Rate,” unpublished term paper. (Notations altered to conform to our notation.)
EXAMPLE 22.4 ARIMA Modeling of the Yen/Dollar Exchange Rate: January 1971 to April 2008
The yen/dollar exchange rate (¥/$) is a key exchange rate. From the logarithms of the
monthly ¥/$, it was found that in the level form this exchange rate showed the typical pat-
tern of a nonstationary time series. But examining the first differences, it was found that
they were stationary; the graph here pretty much resembles Figure 22.8.
Unit root analysis confirmed that the first differences of the logs of ¥/$ were stationary.
After examining the correlogram of the log first differences, we estimated the following
MA(1) model:
$$\begin{aligned}
\hat{Y}_t &= -0.0028 - 0.3300\,u_{t-1} \\
t &= (-1.71) \qquad (-7.32) \qquad\qquad (22.11.3) \\
R^2 &= 0.1012 \qquad d = 1.9808
\end{aligned}$$
where Yt = first differences of the logs of ¥/$ and u = a white noise error term.
To save space, we have provided the data underlying the preceding analysis on the
textbook website in Table 22.6. Using these data, the reader is urged to try other models
and compare their forecasting performances.
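As a starting point for such comparisons, the following sketch shows how a one-step-ahead forecast can be produced from an estimated MA(1) model. The two coefficients are those reported in (22.11.3); everything else is our own assumption, including the function name, the start-up value of zero for the unobserved initial error, and the short series of hypothetical log differences.

```python
MU = -0.0028      # intercept reported in (22.11.3)
THETA = -0.3300   # MA(1) coefficient reported in (22.11.3)

def ma1_one_step_forecast(y, mu=MU, theta=THETA):
    """Rebuild the errors recursively (initial error assumed 0) and forecast the next value."""
    u_prev = 0.0
    for observed in y:
        fitted = mu + theta * u_prev    # model prediction for this period
        u_prev = observed - fitted      # implied error for this period
    return mu + theta * u_prev          # forecast for the period after the sample

# Hypothetical first differences of the log exchange rate, for illustration only.
log_changes = [0.012, -0.020, 0.005, -0.015, 0.030]
print(f"one-step-ahead forecast: {ma1_one_step_forecast(log_changes):+.5f}")
print(f"forecast two or more steps ahead: {MU:+.5f}")
```

Because an MA(1) process has a memory of only one period, forecasts two or more periods ahead collapse to the intercept, so any forecasting comparison with other models is most informative at the one-step horizon.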
EXAMPLE 22.5 ARCH Model of the U.S. Inflation Rate: January 1947 to March 2008
To see if the ARCH effect is present in the U.S. inflation rate as measured by the CPI, we
obtained CPI data from January 1947 to March 2008. The plot of the logarithms of the CPI
showed that the time series was nonstationary. But the plot of the first differences of the
logs of the CPI, as shown in Figure 22.9, shows considerable volatility even though the
first differences are stationary.
Following the procedure outlined in regressions (22.10.12) and (22.10.13), we first
regressed the logged first differences of CPI on a constant and obtained residuals from this
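equation. Squaring these residuals, we obtained the following ARCH(2) model:
$$\begin{aligned}
\hat{u}_t^2 &= 0.000028 + 0.12125\,\hat{u}_{t-1}^2 + 0.08718\,\hat{u}_{t-2}^2 \\
t &= (5.42) \qquad (3.34) \qquad (2.41) \qquad\qquad (22.11.4) \\
R^2 &= 0.026 \qquad d = 2.0214
\end{aligned}$$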
As you can see, there is quite a bit of persistence in the volatility, as volatility in the current
month depends on volatility in the preceding 2 months. The reader is advised to obtain
CPI data from government sources and try to see if another model, preferably a GARCH
model, does a better job.
FIGURE 22.9 First differences of the logs of CPI. [Plot not reproduced; vertical axis: first differences, −0.03 to 0.07; horizontal axis: year, 1947 to 2007.]
Summary and Conclusions
1. Box–Jenkins and VAR approaches to economic forecasting are alternatives to traditional
single- and simultaneous-equation models.
2. To forecast the values of a time series, the basic Box–Jenkins strategy is as follows:
a. First examine the series for stationarity. This step can be done by computing the
autocorrelation function (ACF) and the partial autocorrelation function (PACF) or by
a formal unit root analysis. The correlograms associated with ACF and PACF are
often good visual diagnostic tools.
b. If the time series is not stationary, difference it one or more times to achieve stationarity.
c. The ACF and PACF of the stationary time series are then computed to find out if the series
is purely autoregressive or purely of the moving average type or a mixture of the two.
From broad guidelines given in Table 22.1 one can then determine the values of p and q in
the ARMA process to be fitted. At this stage the chosen ARMA(p, q) model is tentative.
d. The tentative model is then estimated.
e. The residuals from this tentative model are examined to find out if they are white
noise. If they are, the tentative model is probably a good approximation to the underlying
stochastic process. If they are not, the process is started all over again. Therefore,
the Box–Jenkins method is iterative.
f. The model finally selected can be used for forecasting.
3. The VAR approach to forecasting considers several time series at a time. The distinguishing
features of VAR are as follows:
a. It is a truly simultaneous system in that all variables are regarded as endogenous.
b. In VAR modeling the value of a variable is expressed as a linear function of the past,
or lagged, values of that variable and all other variables included in the model.
c. If each equation contains the same number of lagged variables in the system, it can
be estimated by OLS without resorting to any systems method, such as two-stage
least squares (2SLS) or seemingly unrelated regressions (SURE).
d. This simplicity of VAR modeling may be its drawback. In view of the limited number
of observations that are generally available in most economic analyses, introduction
of several lags of each variable can consume a lot of degrees of freedom.30
e. If there are several lags in each equation, it is not always easy to interpret each coefficient,
especially if the signs of the coefficients alternate. For this reason one examines
the impulse response function (IRF) in VAR modeling to find out how the dependent
variable responds to a shock administered to one or more equations in the system.
f. There is considerable debate and controversy about the superiority of the various forecasting
methods. Single-equation, simultaneous-equation, Box–Jenkins, and VAR
methods of forecasting have their admirers as well as their detractors. All one can
say is that there is no single method that will suit all situations. If that were the case,
there would be no need for discussing the various alternatives. One thing is sure:
The Box–Jenkins and VAR methodologies have now become an integral part of
econometrics.
30 Followers of Bayesian statistics believe that this problem can be minimized. See R. Litterman,
“A Statistical Approach to Economic Forecasting,” Journal of Business and Economic Statistics,
vol. 4, 1986, pp. 1–4.
4. We also considered in this chapter a special class of models, ARCH and GARCH,
which are especially useful in analyzing financial time series, such as stock prices,
inflation rates, and exchange rates. A distinguishing feature of these models is that the
error variance may be correlated over time because of the phenomenon of volatility
clustering. In this connection we also pointed out that in many cases a significant
Durbin–Watson d may in fact be due to the ARCH or GARCH effect.
5. There are variants of ARCH and GARCH models, but we have not considered them in
this chapter due to space constraints. Some of these other models are: GARCH-M
(GARCH in mean), TGARCH (threshold GARCH), and EGARCH (exponential
GARCH). A discussion of these models can be found in the references.31
31 See Walter Enders, Applied Econometric Time Series, 2d ed., John Wiley & Sons, New York, 2004. For
an application-oriented discussion, see Dimitrios Asteriou and Stephen Hall, Applied Econometrics: A
Modern Approach, revised edition, Palgrave/Macmillan, New York, 2007, Chapter 14.
EXERCISES
Questions
22.1. What are the major methods of economic forecasting?
22.2. What are the major differences between simultaneous-equation and Box–Jenkins
approaches to economic forecasting?
22.3. Outline the major steps involved in the application of the Box–Jenkins approach to
forecasting.
22.4. What happens if Box–Jenkins techniques are applied to time series that are
nonstationary?
22.5. What are the differences between Box–Jenkins and VAR approaches to economic
forecasting?
22.6. In what sense is VAR atheoretic?
22.7. “If the primary object is forecasting, VAR will do the job.” Critically evaluate this
statement.
22.8. Since the number of lags to be introduced in a VAR model can be a subjective ques-
tion, how does one decide how many lags to introduce in a concrete application?
22.9. Comment on this statement: “Box–Jenkins and VAR are prime examples of
measurement without theory.”
22.10. What is the connection, if any, between Granger causality tests and VAR modeling?
Empirical Exercises
22.11. Consider the data on log DPI (personal disposable income) introduced in Section 21.1
(see the book’s website for the actual data). Suppose you want to fit a suitable ARIMA
model to these data. Outline the steps involved in carrying out this task.
22.12. Repeat Exercise 22.11 for the LPCE (personal consumption expenditure) data
introduced in Section 21.1 (again, see the book’s website for the actual data).
22.13. Repeat Exercise 22.11 for the LCP.
22.14. Repeat Exercise 22.11 for the LDIVIDENDS.
22.15. In Section 13.9 you were introduced to the Schwarz Information criterion (SIC) to
determine lag length. How would you use this criterion to determine the appropri-
ate lag length in a VAR model?
22.16. Using the data on LPCE and LDPI introduced in Section 21.1 (see the book’s website
for the actual data), develop a bivariate VAR model for the period 1970–I to
2006–IV. Use this model to forecast the values of these variables for the four quarters
of 2007 and compare the forecast values with the actual values given in the dataset.