These notes are an eﬀort to integrate the body of knowledge encountered in
the Finance PhD classes at the University of North Carolina. The original
draft of this document was developed to prepare for the area comprehensive
exams. As such, the presentation is in a condensed format. The theoretical
models are derived in a skeleton form, with the focus on the set up and key
steps rather than line-by-line explanations. Similarly, empirical work is sum-
marized in terms of purpose, methodology (where important), ﬁndings, and
ﬁt with the literature. Throughout the manuscript special attention is given
to tying together the ideas in different models and areas. Providing this
structure should make the material easier to remember and more meaningful.
The organization of the paper is as follows. Chapter 2 covers asset pricing,
both theoretical and empirical. A separate chapter on fixed income securities
follows. Chapter 4 covers derivative securities, from both a binomial
perspective and a continuous time framework. Chapter 5 covers the main
topics in corporate ﬁnance, again both theoretically and empirically. Next
is a chapter on market microstructure and information economics, which is
largely theoretical. A chapter on international ﬁnance concludes the main
body of the document. Lastly is a chapter covering important mathematical,
statistical, and econometric issues.
An eﬀort is made to preserve notational consistency, but inevitably there
will be deviations. Bold is used for vectors x and matrices X. Time subscripts
are dropped unless needed for clarity. Generally t is the current time, T is
the end, and τ is the time between two dates. Random variables are given a
tilde only as needed. Expectations are with respect to the true probabilities
2 CHAPTER 1. INTRODUCTION
P unless otherwise denoted. The risk-neutral measure is represented by Q.
R denotes a gross return, whereas r is a net return. When working with
the pricing kernel models it is useful to have a notation for element-wise
operations. I use to denote element-by-element multiplication and to
represent such a division.
This document draws heavily from a variety of sources, including Bhat-
tacharya and Constantinides (1989), Cochrane (1998), Campbell, Lo, and
MacKinlay (1997), Huang and Litzenberger (1988), Ingersoll (1987), Jarrow,
Maksimovic, and Ziemba (1995), as well as lecture notes from Dong-Hyun
Ahn, Jennifer Conrad, and Dick Rendleman.
June 30, 1998
There are three primary approaches to pricing assets. The equilibrium approach
begins with agents' preferences (e.g., over expected returns or consumption).
Agents maximize expected utility subject to budget constraints
and market clearing conditions. Equilibrium models price all assets simul-
taneously and in equilibrium there is no arbitrage. The arbitrage approach
takes a diﬀerent point of view. It takes as given the prices of basis assets,
which can be combined to generate other payoﬀs. The absence of arbitrage
implies unique prices for these synthetic assets when markets are (locally)
complete. If markets are incomplete, it may be the case that there is a range
of admissible prices. Unfortunately, it is generally not possible to recover
a supporting equilibrium from the arbitrage approach. Somewhat paradox-
ically, the arbitrage approach may in fact admit arbitrage opportunities in
the sense that selecting diﬀerent basis assets may give diﬀerent prices. The
ﬁnal approach focuses on the pricing kernel. This approach shares many of
the features of the ﬁrst two approaches and provides a unifying framework.
Under this paradigm, all assets can be priced by the relation p = E[mx].
Asset pricing models diﬀer in the speciﬁcation of the pricing kernel m.
One question that arises immediately in asset pricing is the decision to
work in discrete or continuous time. The discrete time models were developed
first, and they have the benefit of a more intuitive feel. Continuous time
models have a number of advantages. With a single state variable, returns
are perfectly instantaneously correlated, which simplifies the analysis. More
4 CHAPTER 2. ASSET PRICING
generally, moments higher than the second vanish in continuous time.
Conditional asset pricing models have become popular in response to
the failure of unconditional models. A conditional model can capture time-
varying expected returns and/or risk premiums.
This chapter develops each theoretical approach, discussing the under-
lying assumptions and the resulting implications. Derivations are provided
for each, and an eﬀort is made to show the connections among the models.
The chapter concludes with a summary of the major empirical results and
methodologies. We begin the chapter by reviewing portfolio theory.
2.2 Portfolio Theory
Portfolio theory is concerned with the investors’ decision to consume or save
and the portfolio selection decision. The theory develops many of the re-
sults that appear in the CAPM framework. These results follow from mean-
variance mathematics, not from any economic model. Early works were due
to Markowitz (1959), who moved the thinking from maximizing E[R] to con-
sideration of both mean and variance. The principle of diversiﬁcation comes
from this work.
Under certain conditions, we can consider only the mean and variance of
asset returns. One suﬃcient condition is quadratic utility (see Section 2.3.1).
The other sufficient condition is multivariate normality of asset returns.
Although neither of these assumptions is likely to hold, the resulting analysis
provides an intuitively appealing framework.
2.2.1 Single Period Optimization Problem
In terms of notation, consider a vector of asset weights w, returns r, expected
returns µ, and a variance-covariance matrix Σ. Investors minimize variance,
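subject to achieving a particular return and the portfolio weights summing
to one:
$$\min_{w}\; \tfrac{1}{2}\, w'\Sigma w + \lambda(\mu_p - w'\mu) + \gamma(1 - w'\iota) \qquad (2.1)$$
The first order conditions are
$$\Sigma w = \lambda\mu + \gamma\iota, \qquad \mu_p = w'\mu, \qquad 1 = w'\iota.$$
Solve for w to get
$$w = \lambda\Sigma^{-1}\mu + \gamma\Sigma^{-1}\iota.$$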
Frontier portfolios are linear combinations of two portfolios. Premultiply by
$\iota'$ and $\mu'$, then define $A = \mu'\Sigma^{-1}\iota$, $B = \mu'\Sigma^{-1}\mu$, $C = \iota'\Sigma^{-1}\iota$, and $D = BC - A^2$.
Combine the expression for w and the FOCs to get λ = (Cµp − A)/D and
γ = (B − Aµp)/D. This gives
$$w_p = g + h\mu_p \qquad (2.3)$$
where $g = [B\Sigma^{-1}\iota - A\Sigma^{-1}\mu]/D$ and $h = [C\Sigma^{-1}\mu - A\Sigma^{-1}\iota]/D$. Note
that $\iota'g = 1$, $\iota'h = 0$, $\mu'g = 0$, and $\mu'h = 1$.
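As a numerical illustration of the frontier weights $w_p = g + h\mu_p$, the following Python sketch computes A, B, C, D, g, and h for a made-up three-asset economy and forms the frontier portfolio for a target return. The inputs mu and Sigma are hypothetical numbers, and the helper names solve, dot, and frontier are illustrative assumptions rather than anything from the original notes.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def frontier(mu, Sigma):
    """Return (g, h, A, B, C, D) such that w_p = g + h * mu_p."""
    iota = [1.0] * len(mu)
    s_iota = solve(Sigma, iota)  # Sigma^{-1} iota
    s_mu = solve(Sigma, mu)      # Sigma^{-1} mu
    A = dot(mu, s_iota)
    B = dot(mu, s_mu)
    C = dot(iota, s_iota)
    D = B * C - A * A
    g = [(B * si - A * sm) / D for si, sm in zip(s_iota, s_mu)]
    h = [(C * sm - A * si) / D for si, sm in zip(s_iota, s_mu)]
    return g, h, A, B, C, D


# Hypothetical inputs (illustrative numbers only).
mu = [0.05, 0.08, 0.12]
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]

g, h, A, B, C, D = frontier(mu, Sigma)
mu_p = 0.10  # target expected return
w_p = [gi + hi * mu_p for gi, hi in zip(g, h)]
print("frontier weights:", w_p)
```

The weights sum to one and deliver the target mean by construction. The frontier variance can also be checked against $\lambda\mu_p + \gamma = (C\mu_p^2 - 2A\mu_p + B)/D$, which follows from the expressions for λ and γ given above.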
2.2.2 Key Results
From here, we can establish a number of results [see Markowitz (1959) and
• The eﬃcient frontier is a hyperbola in µ-σ space.
• The global minimum variance portfolio o has standard deviation $\sqrt{1/C}$ and expected return $A/C$.
• o is positively correlated with all other minimum variance portfolios
and its covariance with these portfolios is its variance, 1/C.
• A frontier portfolio p is efficient if $\mu_p \ge A/C$.
• For all frontier portfolios except the minimum variance portfolio, there
exists a unique orthogonal frontier portfolio z with $w_z'\Sigma w_p = 0$.
• All portfolios on the eﬃcient frontier are positively correlated. More
generally, ρp,j = SRj/SRp where p is on the eﬃcient frontier. (??)
• $\mu_g = 0$, $\mu_{g+h} = 1$.
• The portfolios g and g + h span the entire frontier.
• Any n ≥ 2 frontier portfolios can span the entire frontier.
• If wi is eﬃcient then wq = A wi is eﬃcient (A diagonal, trace(A) = 1,
and Aii ≥ 0 ∀ i).
• The covariance between the returns of a frontier portfolio p and any
other portfolio n (not necessarily on the frontier) is λµn + γ.
• µn = µz + β(µp − µz), where σp,z = 0
• geometry of tangency lines.
A beta representation is easy to derive from the FOCs.
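$$w_p'\Sigma w_p = \sigma_p^2 = \lambda\mu_p + \gamma$$
$$w_i'\Sigma w_p = \sigma_{ip} = \lambda\mu_i + \gamma$$
$$w_z'\Sigma w_p = \sigma_{zp} = \lambda\mu_z + \gamma = 0$$
where z is the portfolio orthogonal to p (or $r_f$ in the SL model) and i is any
portfolio. Since the third equation equals zero, subtract it from the first two:
$$\sigma_p^2 = \lambda\mu_p + \gamma - (\lambda\mu_z + \gamma) = \lambda(\mu_p - \mu_z)$$
$$\sigma_{ip} = \lambda\mu_i + \gamma - (\lambda\mu_z + \gamma) = \lambda(\mu_i - \mu_z)$$
Solving for λ and rearranging gives the desired result, with $\beta_{ip} = \sigma_{ip}/\sigma_p^2$:
$$\mu_i = \mu_z + \beta_{ip}(\mu_p - \mu_z). \qquad (2.4)$$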
Note that the beta is measured relative to portfolio p which is currently
unspeciﬁed. The important point of Roll’s critique is that this representation
is a mathematical result from the set up of the minimization problem. It does
not have any economic content unless we specify p as a particular portfolio.
His critique also says that the CAPM is not testable because the market
portfolio includes all assets, which we cannot measure.
2.2.3 Multiperiod Portfolio Choice
In moving to a multiperiod setting the agent now considers future expected
consumption. Time subscripts are for indexing only; other subscripts denote
partial derivatives. The agent solves
$$\max\; E_t\Big[\sum_{s=t}^{T-1} U(C_s)\Big] + E_t[B(W_T)].$$
Define the indirect utility function as
$$J(W_t) \equiv \max\; E_t\Big[\sum_{s=t}^{T-1} U(C_s)\Big] + E_t[B(W_T)]$$
with $J(W_T) = B(W_T)$. At T − 1, indirect utility is
$$J(W_{T-1}) = \max\; U(C_{T-1}) + E_{T-1}[J(W_T)]$$
where $W_T = [W_{T-1} - C_{T-1}]\,[\sum_i w_i(R_i - R_f) + R_f]$. The first order conditions are
$$U_C - E_{T-1}[B_W R^*] = 0 \quad\text{and}\quad E_{T-1}[B_W(R_i - R_f)] = 0.$$
This generalizes to
$$J(W_\tau) = \max\; U(C_\tau) + E_\tau[J(W_{\tau+1})]$$
$$U_C = E_\tau[J_W R^*] \quad\text{and}\quad E_\tau[J_W(R_i - R_f)] = 0.$$
With log utility optimal consumption depends only on current wealth
and not on the investment opportunity set. Consumption is a speciﬁc pro-
portion of wealth and investors choose portfolios as in a single period setting
by equating the marginal utilities across assets. With power utility, opti-
mal consumption does depend on the investment opportunity set although
investment decisions are independent of consumption. With more general
HARA utility both consumption and portfolio choice depend on wealth.
2.3 Equilibrium Asset Pricing Theory
The equilibrium approach begins with agents’ preferences and maximizes
expected utility subject to budget constraints and market clearing condi-
tions. This approach has the advantage of internal consistency (no arbitrage
opportunities) and providing comparative statics. Models may be general
equilibrium or partial (e.g., take the riskless rate as given). A disadvantage
of these models is that they require taking a stand on preferences, and this of-
ten involves a tradeoﬀ between reality and tractability. The standard CAPM
is set in a single-period discrete world, whereas the ICAPM and CCAPM are
multi-period models in continuous time.
2.3.1 Utility Functions
Utility functions are the foundation of equilibrium asset pricing models.
Specifying a utility function determines the features of the agents' preferences,
which in turn affect how assets are priced in the economy. Here we
will discuss several important classes of utility functions in a nested framework.
Many of the commonly used functions are special cases of more general
speciﬁcations. This section also brieﬂy discusses aggregation, representative
agents, and the implications for asset pricing.
One desirable feature is time-separability of a utility function. This means
that an agent's consumption today does not affect his consumption preferences
in the future (no hangovers):
$$u(c_0, \ldots, c_T) = E_t\Big[\sum_{\tau=0}^{T} u_\tau(c_\tau)\Big]$$
This is a strong assumption, but it greatly simpliﬁes much of the analysis.
Durability is one source of nonseparability. Models that relax the separability
assumption include habit persistence, “keeping up with the Joneses,” and the
Epstein-Zin class of recursive preferences models.
Risk-averse agents have utility functions that are concave in wealth (or
consumption). In this case, u[E(c)] ≥ E[u(c)] (by Jensen’s Inequality). It is
expected utility we care about. Concave utility functions mean the agent is
better off with a certain outcome than with a risky outcome that has the same expected value.
There are several measures of risk aversion. $R_A = -u''/u'$ is the Arrow-Pratt
measure of absolute risk aversion, which applies to small risks. The
larger is $R_A$, the larger the risk premium required to induce the agent to invest
in risky assets. CARA means the agent keeps a fixed dollar amount invested
in the risky asset as wealth changes. Models based on CARA preferences do
not have income eﬀects. IARA implies the risky asset is an inferior good —
those with more wealth take less risk. This doesn’t make sense if one thinks
about a subsistence level of wealth.
To measure relative risk aversion, we use $R_R(C) = R_A(C)\,C$. This measure
describes proportional changes in the risky asset investment for changes
in wealth. The wealth elasticity of demand is unity for CRRA utility functions
and greater than one for DRRA functions. With CRRA, agents invest
a constant proportion of their wealth in the risky assets, whereas with DRRA
the fraction of wealth invested in the risky asset increases with initial wealth.
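The two risk aversion measures can be checked numerically. The sketch below approximates $u'$ and $u''$ by central finite differences and evaluates $R_A$ and $R_R$ at two wealth levels for negative exponential, power, and log utility. The parameter values ALPHA and GAMMA, the step size eps, and the function names are illustrative assumptions.

```python
import math


def risk_aversion(u, w, eps=1e-4):
    """Return (absolute, relative) Arrow-Pratt risk aversion at wealth w,
    using central finite differences for u' and u''."""
    u1 = (u(w + eps) - u(w - eps)) / (2 * eps)
    u2 = (u(w + eps) - 2 * u(w) + u(w - eps)) / eps ** 2
    ra = -u2 / u1
    return ra, ra * w


ALPHA = 2.0  # hypothetical CARA coefficient
GAMMA = 0.5  # hypothetical power-utility exponent (gamma < 1)


def neg_exp(w):
    """Negative exponential utility, -exp(-alpha W)."""
    return -math.exp(-ALPHA * w)


def power(w):
    """Power utility, W^gamma / gamma."""
    return w ** GAMMA / GAMMA


for name, u in [("negative exponential", neg_exp),
                ("power", power),
                ("log", math.log)]:
    for w in (1.0, 2.0):
        ra, rr = risk_aversion(u, w)
        print(f"{name:22s} W={w:.1f}  R_A={ra:.4f}  R_R={rr:.4f}")
```

Up to finite-difference error, the negative exponential case should show a constant $R_A$ equal to ALPHA, while the power and log cases should show a constant $R_R$ (1 − GAMMA and 1, respectively) as wealth changes.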
Table 2.1: Common Utility Functions: HARA, $u = \frac{1-\gamma}{\gamma}\left(\frac{aW}{1-\gamma} + b\right)^{\gamma}$

Case             u              γ       b     Features
Quadratic                       2             IARA, M-V
Negative Exp.    −e^{−αW}       −∞      1     CARA = α
Power            W^γ/γ          < 1     0     CRRA = 1 − γ
Log              log(W)         0       0     CRRA = 1
The HARA (hyperbolic absolute risk aversion) family nests many commonly
used classes of utility functions. Table 2.1 summarizes the features of
common utility functions.
With a riskless asset, quadratic or HARA utility implies two-fund separa-
tion. If there is not a riskless asset, quadratic or CRRA utility provides this
result. With the exception of quadratic utility (which has its own undesirable properties),
restrictions on utility functions alone do not imply mean-variance
preferences, and therefore do not imply the CAPM.
Equilibrium models rely on the ability to aggregate over individuals in
the economy. A complete or effectively complete market guarantees the existence
of a representative agent. The representative agent's utility function
is completely determined by individual agents’ preferences and wealths and
is independent of available assets only when all investors have HARA utility.
The risk aversion of the representative agent is the harmonic mean of indi-
vidual risk aversions, and will be less than or equal to the wealth-weighted
average. It is easier to establish the existence of a representative agent than
it is to aggregate demands. In many cases, however, we are interested in the
less diﬃcult task of aggregating demand only at the equilibrium price.
2.3.2 CAPM Theory
• homogeneous expectations (distinguishes from portfolio theory)
• Quadratic utility or multivariate normality of returns
• rational, risk-averse investors
• perfect capital markets
• unrestricted short selling (Black)
• borrow and lend at riskless rate (SL)
Derivation of Sharpe-Lintner Model
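$$\min_w\; \tfrac{1}{2}\,w'\Sigma w + \lambda[\mu_p - \mu_f - w'(\mu - \mu_f\iota)]$$
The first order conditions are
$$\Sigma w = \lambda(\mu - \mu_f\iota)$$
$$\mu_p - \mu_f = w'(\mu - \mu_f\iota)$$
Solving for λ,
$$\lambda = w'\Sigma w\,[w'(\mu - \mu_f\iota)]^{-1}$$
$$\mu - \mu_f\iota = \Sigma w/\lambda = \frac{\Sigma w}{w'\Sigma w}(\mu_p - \mu_f) = \beta(\mu_p - \mu_f)$$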
Investors will only hold a combination of the riskfree asset and a tangency
portfolio. With homogeneous expectations the portfolio p must be the value-
weighted market portfolio M.
$$\mu - \mu_f\iota = \beta(\mu_M - \mu_f)$$
Derivation of Black Model
Black’s (1972) CAPM adds one assumption to give the portfolio math results
economic content. With investor homogeneity, all investors will hold eﬃcient
portfolios. Since the value weighted market portfolio is a linear combination
of these efficient portfolios, it too is efficient. We can then rewrite (2.4) as
$$\mu_i = \mu_z + \beta_i(\mu_M - \mu_z).$$
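Alternatively, we can maximize expected return for a given portfolio variance:
$$L = w'\mu + \mu_z(1 - \iota'w) + \lambda(\sigma^2 - w'\Sigma w)$$
The first order conditions are
$$\sigma^2 = w'\Sigma w, \qquad 1 = \iota'w, \qquad \mu = \mu_z\iota + 2\lambda\Sigma w.$$
Premultiplying the last condition by $w'$ gives
$$w'\mu = \mu_z + 2\lambda\sigma^2.$$
For the market portfolio $2\lambda = (\mu_M - \mu_z)/\sigma_M^2$. For a generic asset,
$$\mu_i = \mu_z + (\mu_M - \mu_z)\sigma_{iM}/\sigma_M^2 = \mu_z + \beta_i(\mu_M - \mu_z).$$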
The assets that covary negatively with the market tend to pay off when the
market is doing poorly. These assets are valuable to investors in smoothing
their wealth. Since they are valuable, investors will pay a high price and
accept a low return. Thus, assets with low or negative betas will have low
(or possibly negative) expected returns. Higher risk aversion increases the
risk-return tradeoff. This is measured by the Sharpe ratio $(\mu_M - \mu_f)/\sigma_M$, the slope
of the CML.
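To make the beta pricing relation $\mu_i = \mu_z + \beta_i(\mu_M - \mu_z)$ concrete, the following sketch computes betas from a small discrete distribution of returns and plugs them into the relation. The state probabilities, the return vectors, and the zero-beta rate mu_z are made-up numbers, and the function names are illustrative assumptions.

```python
def mean(x, probs):
    """Probability-weighted mean."""
    return sum(p * xi for p, xi in zip(probs, x))


def cov(x, y, probs):
    """Probability-weighted covariance."""
    mx, my = mean(x, probs), mean(y, probs)
    return sum(p * (xi - mx) * (yi - my) for p, xi, yi in zip(probs, x, y))


def zero_beta_capm(r_i, r_m, probs, mu_z):
    """Return (beta_i, model-implied expected return) from
    mu_i = mu_z + beta_i * (mu_M - mu_z)."""
    beta = cov(r_i, r_m, probs) / cov(r_m, r_m, probs)
    return beta, mu_z + beta * (mean(r_m, probs) - mu_z)


# Hypothetical three-state economy (illustrative numbers only).
probs = [0.25, 0.50, 0.25]
r_m = [-0.10, 0.08, 0.26]      # market return by state
r_hedge = [0.06, 0.02, -0.02]  # pays off when the market does poorly
r_cyc = [-0.25, 0.10, 0.45]    # amplifies market moves
mu_z = 0.03                    # assumed zero-beta (or riskless) rate

beta_h, mu_h = zero_beta_capm(r_hedge, r_m, probs, mu_z)
beta_c, mu_c = zero_beta_capm(r_cyc, r_m, probs, mu_z)
print(f"hedge asset:    beta={beta_h:.4f}, implied mean={mu_h:.4f}")
print(f"cyclical asset: beta={beta_c:.4f}, implied mean={mu_c:.4f}")
```

In this example the asset that covaries negatively with the market has a negative beta and a model-implied expected return below mu_z, in line with the intuition given in the text.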
2.3.3 ICAPM Theory
The intertemporal capital asset pricing model and consumption capital asset
pricing model extend the standard CAPM intuition to a multi-period set-
ting. The ICAPM replaces dependence on quadratic utility/normal returns
with the assumption of a GBM process which implies normally distributed
returns. In the continuous time setting, higher moments do not matter, improving
tractability of the model. An advantage over the CAPM is that utility
can be state-dependent, although the time-separability assumption remains.
With constant risk tolerance utility functions and constant investment op-
portunities, optimal portfolio choices are also constant. When the investment
opportunity set changes, so will portfolio allocations.
Merton’s (1973) ICAPM begins with the speciﬁcation of asset price paths.
Demands are determined by investors maximizing current and expected future
utility, subject to their budget constraints. Preferences are instantaneously
state-independent and depend only on immediate consumption. The in-
direct utility function, which is the maximized utility of future wealth, is
state-dependent. A collection of state variables are suﬃcient statistics for
summarizing the investment opportunities. Investors hedge against adverse
changes in the investment opportunity set, with the end goal being a hedge
against changes in consumption.
• limited liability
• perfect markets
• no restrictions on trading volume/short selling
• always in equilibrium
• borrow/lend at same rate
• continuous-time trading
• state variable has continuous sample path
• ﬁrst 2 return moments exist, higher moments unimportant
• returns have a compact distribution
• time-separable preferences
• $\tilde r_i = \alpha_i\,dt + \sigma_i\,dz_i$
Under certain conditions, we have two-fund separation and the CAPM:
1. log utility (this means $J_{Wx} = 0$, investors do not want to hedge)
2. $\sigma_{ix} = 0\ \forall\, i$ (no hedge is possible)
The following derivation is for a single state variable x. The more general
case of a vector of state variables is similar.
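$$dW = -C\,dt + [W - C\,dt]\,w'r \qquad\qquad dx = \mu\,dt + s\,\tilde\varepsilon_x$$
$$E_t[dW] = [W w'\alpha - C]\,dt \qquad\qquad E(dx) = \mu\,dt$$
$$\mathrm{var}(dW) = W^2 w'\Sigma w\,dt \qquad\qquad \mathrm{var}(dx) = s^2\,dt$$
$$\mathrm{cov}(dx, r) = \rho_{ix}\,s\,\sigma_i\,dt = \sigma_{ix}\,dt$$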
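$$J(W, x, t) = \max\; E_t\Big[\int_t^{t+dt} U(C, s)\,ds + J(W + dW, x + dx, t + dt)\Big]$$
$$J(W + dW, x + dx, t + dt) = J(W, x, t) + J_t\,dt + J_W\,dW + J_x\,dx + \tfrac{1}{2}J_{WW}(dW)^2 + \tfrac{1}{2}J_{xx}(dx)^2 + J_{Wx}\,dW\,dx + J_{tx}\,dt\,dx + J_{tW}\,dt\,dW + \phi$$
where φ contains higher-order terms.
$$E[J(\cdot, \cdot, \cdot)] = J + J_t\,dt + J_W E[dW] + J_x E[dx] + \tfrac{1}{2}J_{WW}\,\mathrm{var}(dW) + \tfrac{1}{2}J_{xx}\,\mathrm{var}(dx) + J_{Wx}\,\mathrm{cov}(dW, dx) \qquad (2.6)$$
$$0 = \max\Big[U(C, t) + J_t + J_W(-C + W w'\alpha) + \tfrac{W^2}{2}J_{WW}\,w'\Sigma w + J_x\mu + \tfrac{s^2}{2}J_{xx} + J_{Wx}W w'\sigma_{ix}\Big]dt \qquad (2.7)$$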
FOCs (with portfolio constraint $\sum_{i=0}^{N} w_i\alpha_i = r_f + \sum_{i=1}^{N} w_i(\alpha_i - r_f)$):
$$U_C = J_W \quad\text{(envelope condition)}$$
$$W J_W(\alpha - r_f\iota) + W^2 J_{WW}\Sigma w + W J_{Wx}\sigma_{ix} = 0$$
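Now solve for optimal portfolio weights
$$w^* = -\frac{J_W}{W J_{WW}}\Sigma^{-1}(\alpha - r_f\iota) - \frac{J_{Wx}}{W J_{WW}}\Sigma^{-1}\sigma_{ix}$$
Define
$$D \equiv -\frac{J_W}{W J_{WW}}\,\iota'\Sigma^{-1}(\alpha - r_f\iota), \qquad H \equiv -\frac{J_{Wx}}{W J_{WW}}\,\iota'\Sigma^{-1}\sigma_{ix},$$
$$t = \frac{\Sigma^{-1}(\alpha - r_f\iota)}{\iota'\Sigma^{-1}(\alpha - r_f\iota)}, \qquad h = \frac{\Sigma^{-1}\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}},$$
so that $w^* = Dt + Hh$. Further, $\iota't = \iota'h = 1$ so t and h are portfolios.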
This gives three-fund separation, with the third fund being the riskless asset.
h is the “hedge portfolio,” and has the highest correlation with the state
variable x. This set up generalizes with a vector of state variables, in which
case we have dim(x) + 2-fund separation.
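Define $a_k = -J_W/J_{WW}$ and $b_k = -J_{Wx}/J_{WW}$ where k indexes the investor.
Rewrite the second FOC as:
$$J_W(\alpha - r_f\iota) + J_{WW}W\Sigma w^* + J_{Wx}\sigma_{ix} = 0$$
$$a_k(\alpha - r_f\iota) = W_k\Sigma w_k - b_k\sigma_{ix}$$
Sum over all investors and divide by $\sum_k a_k$:
$$(\alpha - r_f\iota) = A\Sigma\mu - B\sigma_{ix} \quad\text{or}\quad (\alpha_i - r_f) = A\sigma_{im} - B\sigma_{ix}$$
where $A = \sum_k W_k/\sum_k a_k$, $B = \sum_k b_k/\sum_k a_k$, and $\mu = \sum_k w_k W_k/\sum_k W_k$
(average investment in each asset across investors). Now multiply by $\mu'$ and
$h'$ to get
$$\alpha_m - r = A\sigma_m^2 - B\sigma_{mx}, \qquad \alpha_h - r = A\sigma_{hm} - B\sigma_{hx}.$$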
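Solving for A and B and substituting,
$$\alpha_i - r = \frac{\sigma_{im}\sigma_{hx} - \sigma_{ix}\sigma_{mh}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_m - r) + \frac{\sigma_{ix}\sigma_m^2 - \sigma_{im}\sigma_{mx}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_h - r)$$
$$= \beta_i^m(x)(\alpha_m - r) + \beta_i^h(x)(\alpha_h - r)$$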
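The βs have the interpretation of regression coefficients in an IV regression,
where x serves as an instrument for h. Note that
$$\sigma_{ih} = \Sigma h = \frac{\Sigma\Sigma^{-1}\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}} = \frac{\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}}.$$
Therefore, $\sigma_{ix} = k\sigma_{ih}$. This trick generalizes to cov(j, x) = k cov(j, h) where
$k = \iota'\Sigma^{-1}\sigma_{ix}$. Terms depending on x can be factored from the betas so
$\beta_i^m(x) = \beta_i^m$ and $\beta_i^h(x) = \beta_i^h$.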
2.3.4 CCAPM Theory
The CCAPM, due to Breeden (1979), is very much like the ICAPM with
consumption growth as the single state variable. In the ICAPM investors
hedge against changes in the state variables because these represent changes
in the investment opportunity set, and therefore, changes in consumption.
The CCAPM goes directly to hedging against changes in consumption. The
model is also similar to the static CAPM, where end of period wealth mat-
tered. Since the CAPM is one period, end of period wealth is the same
as consumption. A key assumption in the CCAPM is additively separable
preferences, which gives state independence of direct utility.
To make more clear the link between the ICAPM and the CCAPM, note
that in the ICAPM agents set the marginal utility of wealth equal to the
marginal utility of consumption along the optimal consumption path. This
is the envelope condition, UC = JW . If markets are complete, then perfect
hedges for the state variables can be formed and all individuals will have per-
fectly (instantaneously) correlated consumption policies. This is an analogue
to all individuals holding the market portfolio in the static CAPM.
In many ways, the CCAPM is the most fundamental of the equilibrium
models. It is illogical to choose the CAPM or ICAPM because you think
the consumption-based model is wrong. The only reason for choosing an
alternative model is because the consumption data to test the model may be
The combination of portfolios h and t which the investor chooses minimize
the variance in consumption, not wealth.
The CCAPM can be derived as a simple modification to the previous
derivation of the ICAPM. Since $U_C = J_W$ at the optimum, $J_{WW} = U_{CC}C^*_W$
and $J_{Wx} = U_{CC}C^*_x$. Substituting into (2.8) and rearranging,
$$-\frac{U_C}{U_{CC}}(\alpha - r_f\iota) = W C^*_W\Sigma w + C^*_x\sigma_{ix}.$$
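The covariance between the return on asset i and consumption growth is
$$\sigma^k_{iC} = E[(\alpha\,dt + \Sigma\,dz)(C_t\,dt + C_W\,dW + C_x\,dx + \phi)] = W C_W\Sigma w + C_x\sigma_{ix}.$$
Noting that this is different for each agent k and letting $T^k = -U_C/U_{CC}$,
$$(\alpha_i - r_f) = \sigma^k_{iC}/T^k.$$
Summing over all investors we get
$$(\alpha_i - r_f) = T^{-1}\sigma_{iC},$$
where $T = \sum_k T^k$ and $\sigma_{iC} = \sum_k \sigma^k_{iC}$.
Defining a reference portfolio C,
$$\sigma_C^2 = w_C'\sigma_{iC} = T(\alpha_C - r_f).$$
Solving for T and substituting,
$$(\alpha_i - r_f) = \frac{\sigma_{iC}}{\sigma_C^2}(\alpha_C - r_f) = \beta_{iC}(\alpha_C - r_f).$$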
Note that if the consumption portfolio is not itself a traded asset then the
portfolio with the maximum correlation with consumption can be used. The
same basic intuition applies, but this results in the same kind of instrumental
variable flavor as in the previous presentation of the ICAPM. If consumption
is available, it serves as the single variable driving the returns process. When
it is not available we include additional state variables to use as instruments.
2.3.5 The CIR Model
Cox, Ingersoll, and Ross (1985) derive a general equilibrium model with endogenous production and stochastic
technology shocks. Distribution of production depends on the state vari-
ables Y , which are changing randomly. This model ﬁlls a void in the litera-
ture in that it endogenously determines the equilibrium price path, given the
speciﬁcation of technology. Recall the ICAPM begins with a speciﬁcation of
the price path then determines the equilibrium demand.
• single physical good
• n production activities follow (2.9)
• k state variables follow (2.10)
• contingent claims for the single good, whose value follows (2.11)
• competitive markets
• endogenously determined instantaneous borrowing/lending rate r
• fixed number of identical individuals who maximize $E\int U[C(s), Y(s), s]\,ds$
• continuous investing and trading with no transactions costs
• there exists a unique $J$ and $\hat v$
• (technical) $v \in V$ is the class of admissible controls
• (technical) $J$, $a^*$ are sufficiently differentiable.
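n Production Activities
$$d\eta(t) = I_\eta\,\alpha(Y, t)\,dt + I_\eta\,G(Y, t)\,dw(t) \qquad (2.9)$$
k State Variables
$$dY(t) = \mu(Y, t)\,dt + S(Y, t)\,dw(t) \qquad (2.10)$$
Value of Contingent Claim i
$$dF^i = (F^i\beta_i - \delta_i)\,dt + F^i h_i\,dw(t) \qquad (2.11)$$
The budget constraint is
$$dW = \Big[\sum_i a_i W(\alpha_i - r) + \sum_i b_i W(\beta_i - r) + rW - C\Big]dt + \sum_i a_i W\Big(\sum_j g_{ij}\,dw_j\Big) + \sum_i b_i W\Big(\sum_j h_{ij}\,dw_j\Big)$$
$$dW = W\mu(W)\,dt + W\sum_j q_j\,dw_j$$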
Let $K(v(t), W(t), Y(t), t) \equiv E\int U(v(s), Y(s), s)\,ds$ and define $L^v$
as the differential operator
$$L^v(t)K = \mu(W)WK_W + \cdots$$
Let the indirect utility function J(W, Y, t) be the solution to
$$\max_v\,[L^v(t)J + U(v, Y, t)] + J_t = 0.$$
J has many of the same properties as U, such as being increasing and strictly
concave in W.
Defining $\Psi = L^vJ + U$, we get the following necessary and sufficient conditions:
• $\Psi_C = U_C - J_W \le 0$
• $C\Psi_C = 0$
• $\Psi_a = [\alpha - r]WJ_W + [GG'a + GH'b]W^2J_{WW} + GS'WJ_{WY} \le 0$
• $a'\Psi_a = 0$
• $\Psi_b = [\beta - r]WJ_W + [HG'a + HH'b]W^2J_{WW} + HS'WJ_{WY} = 0$
Solving for $\hat C$, $\hat a$, $\hat b$, we obtain a PDE for J. The equilibrium satisfies these
conditions and markets clear: $\sum_i a_i = 1$ and $b_i = 0\ \forall\, i$.
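The expected rate of return on wealth is $a^{*\prime}\alpha$. r is the negative of the expected
rate of change in the MU of wealth, or $a^{*\prime}\alpha$ plus the covariance between the rate
of return on wealth and the rate of change in the MU of wealth:
$$r = -E\Big[\frac{dJ_W}{J_W}\Big] = a^{*\prime}\alpha + \mathrm{cov}\Big(\frac{dW}{W}, \frac{dJ_W}{J_W}\Big).$$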
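The expected rate of return on the ith contingent claim is
$$(\beta_i - r)F^i = [\phi_W\ \ \phi_Y]\,[F^i_W\ \ F^i_Y]'$$
where
$$\phi_W = \Big(-\frac{J_{WW}}{J_W}\Big)\mathrm{var}(W) + \sum_i\Big(-\frac{J_{WY_i}}{J_W}\Big)\mathrm{cov}(W, Y_i) = (a^{*\prime}\alpha - r)W$$
$$\phi_{Y_i} = \Big(-\frac{J_{WW}}{J_W}\Big)\mathrm{cov}(W, Y_i) + \sum_j\Big(-\frac{J_{WY_j}}{J_W}\Big)\mathrm{cov}(Y_i, Y_j)$$
Alternatively, we can write
$$\beta_i = r - \mathrm{cov}(F^i, J_W)/(F^i J_W).$$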
The expected return on a contingent claim is the riskfree rate plus a linear
combination of the ﬁrst partials of the asset price with respect to W and Y .
The weights are the φ coeﬃcients, which are much like factor risk premiums
in the APT or hedge portfolios in the ICAPM. The φs do not depend on the
contingent claim itself and are the same for all claims.
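If U is not state-dependent, we get a CCAPM-type result, with $\phi_W = \big(-\frac{U_{CC}}{U_C}\big)\mathrm{cov}(C^*, W)$
and $\phi_Y = \big(-\frac{U_{CC}}{U_C}\big)\mathrm{cov}(C^*, Y)$, giving $(\beta_i - r)F^i = \big(-\frac{U_{CC}}{U_C}\big)\mathrm{cov}(C^*, F^i)$.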
The expected excess return on an asset is proportional to its covariance with
optimal consumption. We can then express relative rates of return in a way
that does not depend (explicitly) on preferences.
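Fundamental Valuation Equation
$$\tfrac{1}{2}\mathrm{var}(W)F_{WW} + \sum_i \mathrm{cov}(W, Y_i)F_{WY_i} + \tfrac{1}{2}\sum_i\sum_j \mathrm{cov}(Y_i, Y_j)F_{Y_iY_j} + \sum_i\Big[\mu_i - \Big(-\frac{J_{WW}}{J_W}\Big)\mathrm{cov}(W, Y_i) - \sum_j\Big(-\frac{J_{WY_j}}{J_W}\Big)\mathrm{cov}(Y_i, Y_j)\Big]F_{Y_i} + [rW - C^*]F_W + F_t - rF + \delta(W, Y, t) = 0 \qquad (2.14)$$
where r and $C^*$ are functions of W, Y, and t. This PDE holds for any contingent
claim, with boundary conditions and δ depending on the terms of the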
claim. The PDE can price assets with payoﬀs (i) contingent on crossing a
barrier, (ii) contingent on not crossing a barrier, and/or (iii) ﬂow payoﬀs.
We can focus on the system of equations:
$$dW(t) = [a^{*\prime}\alpha W - C^*]\,dt + a^{*\prime}GW\,dw(t)$$
$$dY(t) = \mu(Y, t)\,dt + S(Y, t)\,dw(t)$$
or a second system with a different drift term reflecting a change of measure:
$$dW(t) = [a^{*\prime}\alpha W - C^* - \phi_W]\,dt + a^{*\prime}GW\,dw(t)$$
$$dY(t) = [\mu(Y, t) - \phi_Y]\,dt + S(Y, t)\,dw(t)$$
The expression $J_W(W(s), Y(s), s)/J_W(W(t), Y(t), t)$ is the conditional pricing kernel.
2.4 Arbitrage Asset Pricing
Arbitrage pricing takes a set of basis assets as given and uses them to price
2.4.1 State Contingent Claims
State contingent claims, or Arrow-Debreu securities, are the building blocks
for all assets. These securities pay $1 in a speciﬁed state and zero otherwise.
Ross (1977b) shows the absence of arbitrage implies the existence of state
contingent prices and, therefore, of a linear pricing operator. This is really
just a spanning result. We can write $p(x) = \sum_s \phi(s)x(s)$. This says the
price of security x is the sum over all states of the price of a dollar in each
state φ(s) scaled by the size of the payoﬀ in each state x(s). Harrison and
Kreps (1979) extend this to show that this operator can be represented as
an expectation with respect to a martingale measure.
Let D denote an (n × n) matrix of asset payoﬀs with typical element dij,
where i denotes the state and j the security. This matrix is a collection of
vectors dj of asset payoﬀs. α is an n-vector of weights, b an n-vector of
payoﬀs. φ is the price vector for the n Arrow-Debreu securities and p the
prices of the complex securities. We have the following pricing relations
$$D'\phi = p \quad\text{and}\quad D\alpha = b$$
with $\iota'\phi = \frac{1}{1 + r_f} = p_f$ and $(1 + r_f)\iota'\phi = \iota'\pi = 1$. $\pi = f(\theta, \lambda)$ is the vector of risk-neutral
probabilities, a function of the true probabilities θ and risk aversion λ.
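A small two-state example may help fix the pricing relations. The sketch below backs out the state prices φ from the prices of two complex securities via $D'\phi = p$, recovers the riskless rate and the risk-neutral probabilities $\pi = (1 + r_f)\phi$, and then prices a new payoff. The payoff matrix, the prices, and the call strike are made-up numbers, and the helper solve_2x2 is an illustrative assumption.

```python
def solve_2x2(M, b):
    """Solve a 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - b[0] * M[1][0]) / det]


# Rows are states, columns are securities (d_ij: state i, security j).
# Security 0 is a riskless bond; security 1 is a stock (hypothetical payoffs).
D = [[1.0, 2.0],
     [1.0, 0.5]]
p = [0.95, 1.00]  # hypothetical prices of the bond and the stock

# State prices solve D' phi = p.
D_T = [[D[0][0], D[1][0]],
       [D[0][1], D[1][1]]]
phi = solve_2x2(D_T, p)

p_f = sum(phi)                         # iota' phi = 1 / (1 + r_f)
r_f = 1.0 / p_f - 1.0
pi = [(1.0 + r_f) * s for s in phi]    # risk-neutral probabilities

# Price a call on the stock with (hypothetical) strike 1.0 using state prices.
call_payoff = [max(D[s][1] - 1.0, 0.0) for s in range(2)]
call_price = sum(f * x for f, x in zip(phi, call_payoff))

print("state prices:", phi)
print("riskless rate:", r_f)
print("risk-neutral probabilities:", pi)
print("call price:", call_price)
```

Both state prices come out positive here, which is what the absence of arbitrage requires. With as many linearly independent securities as states, the market is complete and φ is unique.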
2.4.2 Arbitrage Pricing Theory
The APT, originally developed by Ross (1976), has generated a tremendous
literature of theoretical extensions and a wide range of empirical tests. The
intuition is simple. Assume returns follow a factor-model, meaning returns
depend on the realization of factors and (quasi-) orthogonal shocks.¹ The
factors are not diversifiable, whereas the orthogonal shocks are in some sense.
The theory is silent on what the factors are, or even the number of factors.
A key idea is the factor-mimicking portfolio.
There are really three diﬀerent cases of the APT, depending on the as-
sumptions about the structure of the Ω matrix of “idiosyncratic” covariances.
If we have an exact or noiseless factor model, then Ω is the zero matrix and
an exact arbitrage argument will hold. Alternatively, we could have a strict
factor model in which the matrix is diagonal so there is no correlation across
assets. Large diversiﬁed portfolios cause the idiosyncratic variance to go to
zero. We appeal to an asymptotic arbitrage argument in which there is no
arbitrage on average, although speciﬁc securities may be mispriced. Finally,
we could allow for a more general correlation structure where Ω may con-
tain non-zero oﬀ-diagonal elements. This approximate factor model allows
for idiosyncratic correlations (e.g., industries) and requires restrictions on
the covariances of returns such that the idiosyncratic part is diversified away
while the factors remain. The controversy over the structure of Ω has major
implications for the testability of the model.
The APT has a flavor very similar to the ICAPM, although it arises
from a diﬀerent viewpoint. In the end, both models specify expected returns
as a function of a linear combination of their covariances with variables (factors
and state variables, respectively). This link arises because it is implied
by the absence of arbitrage. The additional assumptions in the equilibrium
model serve to determine the risk premium associated with each state variable.²
The model has been extended in a number of other ways including dynamic,
conditional, nonlinear, and international versions. Tests of the model have
also followed several paths, broadly categorized as cross-sectional or time series.
In general, tests reject the model but find it provides more favorable
performance than models like the CAPM.
¹ By quasi-orthogonal shocks I mean that some correlation among the residuals is allowed.
² Actually, models such as the CAPM are partial equilibrium models and take the
riskless rate and market price of risk as given. Richer models such as CIR introduce
production uncertainty and are able to more completely characterize the economy.
This derivation is based on the strict factor version. The exact APT deriva-
tion will also work under this approach. Modiﬁcations for the approximate
APT are mentioned at the end. It is very important to understand that the
APT starts with a characterization of realized returns r, and uses statistical
properties to say something about expected returns µ.
$$r_t = \mu_t + \nu_t = \mu_t + Bf_t + u_t \qquad (2.15)$$
$$E[r_t] = \mu_t, \qquad f_t \sim N(0, I), \qquad u_t \sim N(0, \Omega)$$
where Ω is diagonal. $f_t$ is a factor vector and B a loading matrix, which
together give the unexpected factor-related return. Return covariances are
$$E[\nu_t\nu_t'] = B\,E[f_tf_t']\,B' + \Omega = BB' + \Omega = \Psi.$$
As an aside, define Φ such that $\Phi\Phi' = I$, giving $B = D\Phi$, a rotation. Therefore,
$\Psi = DD' + \Omega$, illustrating the rotational indeterminacy.
Next form a portfolio with weights w. The portfolio variance is
$$\sigma_p^2 = w'\Psi w = w'BB'w + w'\Omega w \approx w'BB'w.$$
The strategy is to choose w such that $w'BB'w = 0$ without making an
investment, $\iota'w = 0$. To find a w think of this as a regression of µ on [ι B].
$$\mu = \lambda_0\iota + B\lambda + w. \qquad (2.16)$$
The normal equations from the regression give $\iota'w = 0$ and $B'w = 0$, which
implies $w'BB'w = 0$ as desired.
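To find $w'r$, insert (2.16) into (2.15) to get
$$r_p = w'(\lambda_0\iota + B\lambda + w) + w'Bf + w'u.$$
Taking expectations and using the orthogonality conditions, $\mu_p = w'w$. Because this
zero-investment portfolio bears (approximately) no risk, the absence of arbitrage
requires $\mu_p = w'w \approx 0$. This validates (2.16), which can be written as
$$\mu_t \approx \lambda_0\iota + B\lambda. \qquad (2.17)$$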
If a factor is negatively correlated with the IMRS the model implies a positive
in (2.16), where N indexes the number of assets, a sequence of
arbitrage portfolios satisfies the Ross pricing bound if $w_N' w_N$ does not go to
infinity with $N$. The approximate factor model is derived by requiring that
as $N \to \infty$ the smallest eigenvalue of $B'B \to \infty$ while the largest eigenvalue
of Ω → 0. That is, the factors are pervasive while the idiosyncratic part is
2.5 Pricing Kernel Approach
The pricing kernel approach is in many ways a hybrid of the equilibrium and
arbitrage approaches. The focus is to specify the pricing kernel $m$ that
makes the Euler equation hold:
$$p_t = E_t[m_{t+\tau} x_{t+\tau}] \qquad (2.18)$$
This seemingly simple expression is complex enough to cover pricing for any
asset. The expression can be modiﬁed to handle returns, excess returns,
stocks, bonds, options, etc. The meaning of the payoff $x$ and the price $p$
change, but the same intuition applies.
The expected return on an asset is negatively related to its covariance with
the stochastic discount factor. Assets whose returns vary positively with the
sdf pay oﬀ when the marginal utility is high. That is, they provide wealth
in the states when it is most valuable to investors. Consequently, investors
are willing to pay high prices and accept low returns for these assets.
Footnote: The pricing kernel lives by many names, including the stochastic discount
factor (sdf), intertemporal marginal rate of substitution (IMRS), or benchmark pricing
variable. It is incorrectly referred to as the Radon-Nikodym derivative, Arrow-Debreu
price, or state-contingent claims price (unless the riskless rate is zero). While on
naming conventions, the risk-neutral probability measure is also referred to as the
equivalent martingale measure (EMM).
There are basically two ways of doing business. One is to take the IMRS as
given and interpret (2.18) as the Euler equation arising from the consumer’s
optimization problem. The goal would then be to explain asset returns. The
other view is to take the returns as given and explore the implications for $m$.
The characteristics of m depend upon the structure of the economy. If
the law of one price (LOP) is satisﬁed, there will exist (at least one) m such
that (2.18) holds. In the absence of arbitrage (NA), m is strictly positive. If
markets are complete then m is unique.
This presentation is for a discrete time, multiperiod model. Define the
consumption set $c \in B(e^i, p) \subset R \times X$. The budget constraints are
$c(0) = e(0) - \theta'p$ and $c(T, \omega) = e(T, \omega) + \theta'd(\omega)$. Combining these two equations,
$\hat{D}\theta = \hat{c} - \hat{e}$. The attainable set $\hat{D}\theta = \hat{c}$ ignores the initial endowment. I
will abuse notation and consistency by letting $Q$ and $\pi^*$ refer to the EMM.
The latter is more appropriate for discrete settings. Also, dividend (payoff)
vectors and matrices are indicated by $d$ and $D$.
Deﬁnition 1 The market is complete iﬀ every consumption process is at-
tainable (M = X), or iﬀ rank(D) = k.
Definition 2 An arbitrage strategy has non-negative, non-zero consumption
with $e(0) = 0$; that is, $\hat{D}\theta \geq 0$ with $\hat{D}\theta \neq 0$.
Definition 3 An Equivalent Martingale Measure $Q$ (or $\pi^*$) satisfies
$p = E^Q[d]/R_f$.
$Q$ exists iff there is no arbitrage, or iff an equilibrium exists. If markets
are complete then $Q$ is unique.
Deﬁnition 4 A price functional Φ : R × M → R (Π : M → R) satisﬁes
Φ(c) = c(0) + Π(c(T)) = c(0) + θ p for any θ such that c(T) = θ d.
This implies B(e, p) can be expressed as Φ(nc) = 0 where nc(t) ≡ c(t)−
e(t) ∈ M.
$\Pi$ is unique even in an incomplete market and exists if there is an
equilibrium. A price system is viable iff there is no arbitrage, iff $Q$ exists, or iff
$\Phi$ (or $\Pi$) exists.
Definition 5 $\Psi : X \to R$ is an extension of $\Pi$ if for all $x \in M$, $\Psi(x) = \Pi(x)$.
A sequence of scaled prices is a Q-martingale.
2.5.2 Diﬀerent Expectations
Denote the price of asset $x$, a package of state-contingent claims, as $p(x)$.
$$p(x) = \sum_s \phi(s)x(s) = \sum_s \pi(s)\frac{\phi(s)}{\pi(s)}x(s) = E^P[mx]$$
where $\pi(s)$ is the (true) probability of state $s$. It follows then that $m(s) =
\phi(s)/\pi(s)$. To move to risk-neutral probabilities define $\pi^*(s) \equiv R_f m(s)\pi(s) = R_f\phi(s)$,
where $1/R_f = \sum_s \phi(s) = E[m]$. Then
$$p(x) = \frac{1}{R_f}\sum_s \pi^*(s)x(s) = \frac{E^Q[x]}{R_f}.$$
These results imply
$$p(x) = E^Q[x]/R_f = E^P[mx], \qquad \frac{\pi^*(s)}{\pi(s)} = R_f m(s).$$
The risk neutral probabilities give greater weight to states with high marginal
utility, the “bad” states. In discrete time, the “change of measure” is
$$\frac{\pi^*}{\pi} = R_f m = \frac{m}{E[m]}.$$
In continuous time the analogous expression is
$$\frac{dQ}{dP} = \lim_{n\to\infty}\frac{f_n^Q(x_1, \ldots, x_n)}{f_n^P(x_1, \ldots, x_n)}$$
where $f_n(\cdot)$ represents the joint likelihood under the respective measure. This
expression is the Radon-Nikodym derivative, and is the limit of the likelihood
ratios. This random variable satisfies
$$E^Q(x_T) = E^P\left[\frac{dQ}{dP}\,x_T\right].$$
2.5.3 Asset Pricing with m
This analysis is useful in pricing assets. For a collection of assets in an
economy, the price is the risk-neutral expectation of the future value, discounted
back to the present at the riskless rate
$$p = \frac{1}{R_f}D'\pi^*.$$
If the market is complete, $Q$ is unique ($\pi^*$ is identifiable in a discrete setting)
and we can invert the payoff matrix to solve for the probabilities
$$\pi^* = R_f (D')^{-1}p.$$
If the market is not complete it is often possible to get a range of admissible
EMMs. Further restrictions may result from imposing the NA condition that
the pricing kernel be positive.
Recall that dividing by the riskless rate will give the Arrow-Debreu prices
$\phi = \pi^*/R_f = (D')^{-1}p$. Furthermore, the pricing kernel is
$$m = p_f\frac{\pi^*}{\pi} = \frac{(D')^{-1}p}{\pi},$$
where the division is element by element.
Once the EMM or pricing kernel are known they can be used to price any asset.
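As an illustration of the inversion, here is a minimal sketch with two states and two assets (a riskless bond and a stock). The payoffs, prices, and true probabilities are hypothetical, and `solve_2x2` is a small helper written for this example only. The variable `d_t` plays the role of $D'$ (rows are assets, columns are states).

```python
# Complete market with two states and two assets; all numbers are hypothetical.

def solve_2x2(a, rhs):
    """Solve a z = rhs for a 2x2 matrix a (list of rows) by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    z0 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    z1 = (a[0][0] * rhs[1] - rhs[0] * a[1][0]) / det
    return [z0, z1]

d_t = [[1.0, 1.0],   # riskless bond pays 1 in both states
       [2.0, 0.5]]   # stock pays 2 in the good state, 0.5 in the bad state
p = [0.95, 1.20]     # current prices of bond and stock
pi = [0.6, 0.4]      # true probabilities of the good and bad state

phi = solve_2x2(d_t, p)                 # Arrow-Debreu prices, (D')^{-1} p
r_f = 1.0 / sum(phi)                    # gross riskless rate, 1 / E[m]
pi_star = [r_f * f for f in phi]        # risk-neutral probabilities
m = [f / q for f, q in zip(phi, pi)]    # pricing kernel, state by state

# The stock price is recovered under either measure.
stock_price_q = sum(ps * x for ps, x in zip(pi_star, d_t[1])) / r_f
stock_price_p = sum(q * k * x for q, k, x in zip(pi, m, d_t[1]))

print("phi:", [round(f, 4) for f in phi])
print("pi*:", [round(ps, 4) for ps in pi_star])
print("m:  ", [round(k, 4) for k in m])
print("stock price under Q and P:", round(stock_price_q, 4), round(stock_price_p, 4))
```

In this example the bad state carries the higher value of $m$ and receives more weight under $\pi^*$ than under $\pi$, in line with the discussion of the change of measure.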
2.5.4 The Agent’s Problem
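There is a relationship between the pricing kernel and equilibrium approaches.
The agent will
$$\max_{c,\,c(s)} \; u(c) + \sum_s \beta\pi(s)u[c(s)] \quad \text{s.t.} \quad c + \sum_s \phi(s)c(s) = y + \sum_s \phi(s)y(s).$$
The first order conditions are
$$u'(c) = \lambda, \qquad \beta\pi(s)u'[c(s)] = \lambda\phi(s),$$
so that
$$\phi(s) = \beta\pi(s)\frac{u'[c(s)]}{u'(c)}, \qquad m(s) = \frac{\phi(s)}{\pi(s)} = \beta\frac{u'[c(s)]}{u'(c)}.$$
Thus $m(s_1)/m(s_2) = u'[c(s_1)]/u'[c(s_2)]$, so $m$ gives the marginal rate of
substitution between date and state contingent claims. In equilibrium, marginal
utility growth should be the same for all consumers $i$ and $j$
$$\beta_i\frac{u_i'[c_i(s)]}{u_i'(c_i)} = \beta_j\frac{u_j'[c_j(s)]}{u_j'(c_j)} = m(s).$$
Hence $m$ is referred to as the IMRS. Taking the expectation of either $m$ or
IMRS gives the price of a riskless bond.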
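To make the first order conditions concrete, the sketch below assumes power (CRRA) utility, $u'(c) = c^{-\gamma}$, which is my choice for illustration and is not imposed in the text above. The discount factor, risk aversion, consumption levels, and probabilities are all hypothetical.

```python
# Pricing kernel implied by the agent's problem under assumed CRRA utility.
# Parameters and consumption levels are hypothetical.

beta = 0.98          # subjective discount factor
gamma = 2.0          # coefficient of relative risk aversion
c0 = 1.00            # consumption today
c1 = [1.05, 0.97]    # consumption in the good and bad state
pi = [0.6, 0.4]      # true state probabilities

def marginal_utility(c, gamma):
    """u'(c) for power utility u(c) = c**(1 - gamma) / (1 - gamma)."""
    return c ** (-gamma)

# m(s) = beta * u'[c(s)] / u'(c)
m = [beta * marginal_utility(c, gamma) / marginal_utility(c0, gamma) for c in c1]

# State prices phi(s) = pi(s) m(s); their sum is E[m], the riskless bond price.
phi = [q * k for q, k in zip(pi, m)]
bond_price = sum(phi)
r_f = 1.0 / bond_price

print("m:", [round(k, 4) for k in m])
print("bond price E[m]:", round(bond_price, 4), "gross riskless rate:", round(r_f, 4))
```

The low-consumption state has the higher marginal utility and therefore the higher $m$, and the ratio $m(s_1)/m(s_2)$ equals the ratio of marginal utilities.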
2.5.5 The Main Results
Using the deﬁnition of covariance and (2.18)
1 = E[mR] = E[m]E[R] + cov(m, R) (2.19)
It follows immediately that if there is a riskless asset Rf = 1/E[m], or pf =
E[m]. Without a riskless asset, we can view 1/E[m] as a “shadow” riskfree
rate, or a zero beta return. Note that the expectations have been under the
true probability measure P.
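Using the above results,
$$E[R_i] = R_f - \frac{cov(m, R_i)}{E[m]} = R_f + \frac{cov(m, R_i)}{var(m)}\left(-\frac{var(m)}{E[m]}\right) = R_f + \beta_{i,m}\lambda_m$$
which is a beta pricing model.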
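The equivalence between $1 = E[mR]$ and the beta representation can be verified state by state. This is a small sketch with three hypothetical states; the kernel values, probabilities, and payoff are invented for the example.

```python
# Check E[R] = R_f + beta_{i,m} * lambda_m in a three-state example.
# All inputs are hypothetical.

pi = [0.3, 0.4, 0.3]        # true probabilities
m = [1.10, 0.95, 0.80]      # pricing kernel by state
x = [0.80, 1.00, 1.30]      # payoff of a risky asset by state

def expect(values):
    return sum(q * v for q, v in zip(pi, values))

price = expect([k * v for k, v in zip(m, x)])      # p = E[mx]
gross = [v / price for v in x]                     # R = x / p, so E[mR] = 1

mean_m = expect(m)
mean_r = expect(gross)
cov_mr = expect([k * r for k, r in zip(m, gross)]) - mean_m * mean_r
var_m = expect([k * k for k in m]) - mean_m ** 2

r_f = 1.0 / mean_m              # riskless (or shadow riskless) rate
beta_im = cov_mr / var_m        # beta of the asset on m
lambda_m = -var_m / mean_m      # price of m-risk, negative by construction

print("E[R]:", round(mean_r, 6))
print("R_f + beta * lambda:", round(r_f + beta_im * lambda_m, 6))
```

Because this payoff is low when $m$ is high, its beta on $m$ is negative and its expected return exceeds the riskless rate.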
Relation between m, β models, and MV frontier
• $p = E[mx] \Rightarrow \beta$: $m$, $x^*$, or $R^*$ can serve as reference variables. If
$m = b'f$, then $f$, proj$(f|X)$, or proj$(f|R)$ can be used.
• $p = E[mx] \Rightarrow$ mean-variance frontier which includes $R^*$
• $\beta \Rightarrow p = E[mx]$: $m = b'f$
• MV frontier $\Rightarrow p = E[mx]$: $m = a + bR^{mv}$
• MV frontier $\Rightarrow \beta$ model with $R^{mv}$ as a reference variable.

Table 2.2: Common Pricing Kernels
Model          Pricing kernel m
CAPM           $a + bR_{W,t+1}$
ICAPM          $a + \sum_{k=1}^{K} b_k f_{k,t+1}$
CCAPM          $\beta u'(c_{t+1})/u'(c_t)$
APT            $b'f$
Black-Scholes  $\exp[-(r + \frac{1}{2}\sigma^2)\tau + \sigma\,dZ]$

Since mean-variance eﬃciency implies a single beta representation, some
single beta representation can always be found. The asset pricing model says
that a particular portfolio (e.g., the market) will be mean-variance eﬃcient.
In other words, the content of a model comes from m = f(·), not p = E[mx].
Also, given any multi-factor or multi-beta representation, we can always
ﬁnd a single beta representation. The relationship between the ICAPM and
CCAPM is an example of this.
m as a Portfolio
The portfolio that maximizes squared correlation with $m$ is a minimum
variance portfolio. $m^*$, the projection, also prices assets and can replace $m$,
since the projection error $\varepsilon$ is orthogonal to the payoffs:
$$p = E[mx] = E[(m^* + \varepsilon)x] = E[m^*x]$$
2.5.6 Hansen-Jagannathan Bounds
The Hansen and Jagannathan (1991) bounds are an important addition to
asset pricing. Instead of a binary reject/fail to reject result, the HJ bounds
oﬀer some insights as to why the model may be rejected. The model is most
useful for testing models like the consumption model where m is explicitly
speciﬁed. The model is useless for evaluating factor models that do not
specify the factors since there are always some factor-mimicking portfolios
that will work ex post.
Working with excess returns, $E[m r^e_i] = 0$, so $E[m]E[r^e_i] = -cov(m, r^e_i) =
-\rho_{m,i}\sigma_m\sigma_i$. Since $|\rho| \leq 1$,
$$\frac{\sigma_m}{E[m]} \geq \frac{|E[r^e_i]|}{\sigma_i}. \qquad (2.21)$$
This holds for any asset $i$, including the one with the maximum Sharpe ratio,
so the right-hand side can be taken to be the maximal Sharpe ratio $SR$. To be
clear, the maximal Sharpe ratio measures the excess return on the tangency
portfolio relative to its standard deviation (assuming a one-factor world).
Both the excess return on the tangent portfolio and the SR depend on Rf .
Rewriting as σm = E[m]SR, the H-J bound is a function of E[m]. As
we change E[m], we get a new Rf , a new tangency portfolio, and a new
Sharpe ratio. Plotting σm as a function of E[m] gives us the locus of points
comprising the H-J bound. Note that if we know $R_f$, then the bound is just
a point. These results are based on the law of one price (LOP), and do not
use the no arbitrage (NA) restriction that $m > 0$.
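The locus of points can be traced out numerically. The sketch below treats the means and covariance matrix of two gross returns as known (the values are hypothetical) and, for each candidate $E[m]$, computes $\sigma_m = E[m]\,SR$ with $R_f = 1/E[m]$. The helper names are my own, and the explicit 2x2 inverse is used only to keep the example free of external libraries.

```python
import math

# LOP version of the Hansen-Jagannathan bound for two risky assets.
# Return moments below are hypothetical.

def max_sharpe_squared(mu_e, cov):
    """mu_e' cov^{-1} mu_e for two assets, using the explicit 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    v0 = (d * mu_e[0] - b * mu_e[1]) / det
    v1 = (-c * mu_e[0] + a * mu_e[1]) / det
    return mu_e[0] * v0 + mu_e[1] * v1

def hj_bound(mean_m, mu, cov):
    """Lower bound on sigma_m for a given E[m]: E[m] times the maximal Sharpe ratio."""
    r_f = 1.0 / mean_m
    mu_e = [mi - r_f for mi in mu]
    return mean_m * math.sqrt(max_sharpe_squared(mu_e, cov))

mu = [1.08, 1.03]                  # expected gross returns
cov = [[0.0400, 0.0020],
       [0.0020, 0.0025]]           # return covariance matrix

frontier = [(em, hj_bound(em, mu, cov)) for em in (0.94, 0.96, 0.98, 1.00)]
for em, bound in frontier:
    print("E[m] =", em, " sigma_m >=", round(bound, 4))
```

Each choice of $E[m]$ implies a different $R_f$ and hence a different tangency portfolio, which is why the bound is a curve rather than a single number unless $R_f$ is known.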
By imposing the NA restriction we can sharpen the bound given in (2.21).
The NA bound is very similar to the LOP bound for moderate values of
E[m], but as E[m] becomes more extreme (higher SR), the NA bound is
much stricter (higher). For payoffs $x$ and Lagrange multipliers $\lambda$ and $\delta$,
$$m^+ = [\lambda + \delta'x]^+$$
subject to $E[m^+] = E[m]$ and $E[m^+x] = p$. This nonlinear problem can
generally be solved numerically. $m^+$ has the interpretation of a call option
with zero strike price on a portfolio of payoffs $[1 \;\; x']'$.
The H-J bound analysis has been extended in several ways. Snow (1991)
generalizes the model to include any moment of m. In this setting the bounds
are more sensitive to outliers. Other extensions include incorporating trans-
actions costs, utilizing cross-moments, and analyzing pricing errors as a way
to detect speciﬁcation errors. One example is adding diﬀerent sets of assets
and seeing how much the bound shifts up.
2.6 Conditioning Information
The difference between a conditional and unconditional model is the
information set used. If payoffs and discount factors (and therefore, prices) are
iid, then conditional and unconditional models are the same. Define
$$\text{UMV iff } E[R_{p^*}^2] \leq E[R_p^2] \;\; \forall R_p \text{ s.t. } E[R_{p^*}] = E[R_p]$$
$$\text{CMV iff } E_t[R_{p^*}^2] \leq E_t[R_p^2] \;\; \forall R_p \text{ s.t. } E_t[R_{p^*}] = E_t[R_p]$$
By iterated expectations, this gives UMV ⊆ CMV. If a portfolio is UMV it
must be CMV, but the converse need not be true. We can also consider the
set of minimum variance portfolios conditional on Z, CMVZ. Then CMV
includes CMVZ , which in turn includes UMV. A conditional factor pricing
model does not imply an unconditional model. An unconditional model does
imply a conditional model.
From here we can say that it is possible to reject that a portfolio is UMV
or CMVZ, but we can not reject CMV since the information set for CMV
is unobservable. This is similar to the issue raised by Roll (1977); rejecting
UMV does not imply rejection of CMV. Cochrane (1998) refers to this as the
Hansen and Richard (1987) critique. The use of scaled factors (i.e., scaled
by instruments in the proper information set) is a partial solution.
If the test is based on 1 = E[mR] for some particular m, then it is possible
to test without the complete information set. Recall $m^*$ can replace $m$ in
(2.18), so $m^*$ is also CMV and is a function of the unobserved information set.
The use of conditional models allows for time-varying expected returns.
This time variation can arise due to changes in the risk premium or because
of conditional covariances (β changes through time). The ARCH-GARCH
family of models is often used to capture the time series behavior of condi-
2.7 Market Eﬃciency
Examining the link between the theoretical asset pricing models and empir-
ical tests requires a position on market eﬃciency. The general idea behind
market eﬃciency is that prices reﬂect available information. Of course a more
precise deﬁnition of available information and the implications of reﬂecting
this information are necessary.
The early view of market efficiency was the random walk. In this model
the series of innovations is independent. Empirical evidence during this
period found that prices are consistent with a random walk. The apparent
implications of this model are that prices are not driven by supply/demand and
there is no point in fundamental analysis. In fact, the random walk does not
have these implications since slowly adjusting prices would allow profitable
trading strategies. A problem with the random walk is that it simultaneously
requires rational investors to eliminate profitable trading opportunities, but
also assumes investors irrationally pay for security analysis.
The martingale model was proposed as an alternative to the random walk
by Samuelson in the mid-1960s. A random variable $x_{t+1}$ is a martingale with
respect to an information set $\Phi_t$ if
$$E[x_{t+1} \mid \Phi_t] = x_t.$$
A fair game has the property that $E[y_{t+1} \mid \Phi_t] = 0$. Returns are a fair game if
prices and dividends follow a martingale. Finding a variable that can predict
returns means either that returns are not a martingale or that the variable
is not in the information set. More recent versions of market efficiency also
assume rational expectations.
The martingale will hold when investors have common, constant rate
of time preferences, homogeneous beliefs, and are risk-neutral. Note that
risk neutrality implies a martingale, but does not imply a random walk.
The reason is that a martingale allows dependence of higher moments on
the information set, whereas the random walk does not. Allowing for risk
aversion does not go very far in reconciling the martingale model with the
There are several reasons not to base market eﬃciency on the martingale
model. In a setting such as the ICAPM, conditional expected returns depend
on dividends. Since dividends are autocorrelated the conditional expected
returns are partially forecastable in violation of the martingale model. Time
variation in the risk premium may also lead to failure of the martingale model.
Finally, most empirical tests have a joint hypothesis problem. Rejecting a
model may mean either the model is wrong or the market is ineﬃcient.
2.8 Empirical Asset Pricing
2.8.1 Properties of Asset Returns
Normality offers nice features in modeling asset prices; however, departures
from normality have been extensively documented. Relative to the normal
distribution, asset returns exhibit skewness and excess kurtosis. Matters are
complicated further by serial correlation in returns.
Table 2.3: Patterns In Returns
Factor                 Relation  Comment
Size                   –         Banz (1981)
E/P                    +         Basu (1977)
Dividend Yield         +
Term structure slope
Expected Inflation     –
Credit quality         +         also related to volatility
                                 Jegadeesh and Titman (1993)
There is evidence that lagged variables are useful in predicting stock and
bond returns. Many of the results documented in the U.S. are also present
in other countries. Table 2.3 provides an overview of these patterns.
Interpretation of these patterns is difficult since many of these variables
are highly correlated, and much of the relation each has with returns comes
in January. At longer time horizons some of the eﬀects, such as size and
E/P, tend to reverse themselves. A common criticism is that these variables
may be correlated with the true β when estimates of β are noisy. Chan &
Chen () show that average size and estimated beta in size-sorted portfolios
are almost perfectly negatively correlated.
Another issue that arises in interpretation of the cross-sectional regulari-
ties is whether they are all capturing the same underlying phenomenon. This
is especially likely considering price is in many of the variables.
Attempts to disentangle the effects are inconclusive. Some researchers
claim size subsumes E/P, while others claim the opposite. Fama and French
(1992) claim that size and B/M together subsume E/P (and beta). Given
the way these tests are designed, the B/M variable may actually be a proxy
for the true beta. A stock that recently declined in price will have a high
B/M. This stock is also likely to be more levered than before its decline, so
it is now riskier and should have a higher beta. However the beta estimate
is generally based on returns several years prior, so the recent downturn is
likely to be washed out. In the end, the estimated beta may be too low, and
the high B/M may capture the added risk of the stock. Alternatively, the
B/M results may be due to survivorship biases in the COMPUSTAT tapes.
There are several calendar related patterns in returns. Most famous is
the January eﬀect, where returns are much larger in January than in other
months. Possible explanations include tax-based trading, window dressing
by institutions, and liquidity trading. The January eﬀect is most pronounced
for small ﬁrms.
The weekend eﬀect describes the large negative returns from Friday close
to Monday close. It is not clear that all the abnormal return is due to the
weekend period, but Monday returns alone do not seem to account for the
entire eﬀect. International evidence is mixed with respect to weekly patterns,
but many of the Asian markets have a Tuesday eﬀect, which corresponds to
Monday trading in the U.S. There is some evidence that most of the returns
each month occur during the ﬁrst two weeks. This may be due to portfolio
rebalancing caused by month-end salaries. Finally, there is a holiday eﬀect,
where one third of the annual returns occur on the trading days preceding
the eight holidays on which the market is closed. (This figure is misleading
since positive and negative returns cancel out.)
In a clever paper Berk (1995) addresses the fact that price is directly
related to size. The basic logic is very simple: risky firms will be discounted
at a higher rate, therefore current market values will be smaller. This will
give the appearance that small firms have higher returns, even though firm
size (future cashflows) and risk may be unrelated. Consider a set of firms
with log future cash flows $c$, log price $p$, and log return $r = c - p$. Further
assume size and risk are independent. Now regress returns on beginning-of-period size
$$r = \alpha_1 + \beta_1 p + \varepsilon_1.$$
The sign of $\beta_1$ depends on the covariance between $r$ and $p$
$$cov(r, p) = cov(r, c - r) = cov(c, r) - var(r) = -var(r) < 0,$$
where $cov(c, r) = 0$ because size and risk are independent.
Thus we should expect a negative relation between ﬁrm size and returns.
Now consider a regression of actual returns on expected (model) returns $\hat{r}$
$$r = \alpha_2 + \beta_2\hat{r} + \varepsilon_2.$$
Take the pricing errors $\varepsilon_2$ and regress them on current size
$$\varepsilon_2 = \alpha_3 + \beta_3 p + \varepsilon_3.$$
The sign of this regression coefficient depends on the covariance
$$cov(p, \varepsilon_2) = cov(c - r, \varepsilon_2) = cov(c, r - \alpha_2 - \beta_2\hat{r}) - cov(\alpha_2 + \beta_2\hat{r} + \varepsilon_2, \varepsilon_2) = -var(\varepsilon_2) < 0.$$
This shows that size is negatively related to pricing errors. How much of the
variation in actual returns is explained by size? Decompose the $R^2$
$$R^2 = \frac{cov(r, p)^2}{var(r)\,var(p)} = \frac{var(r)}{var(c - r)} = \frac{var(r)}{var(r) + var(c)}.$$
The larger the variation in cashflows the lower is the $R^2$. The basic
conclusion of the article is that market value will end up capturing unmeasured risk.
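Berk's argument is easy to reproduce by simulation. The sketch below draws log cash flows and log returns independently (the distributions and parameters are my own assumptions, chosen so that var(c) = 1 and var(r) = 0.25), builds log price as p = c - r, and regresses r on p. Under these assumptions the slope should be close to -var(r)/(var(r) + var(c)) = -0.2 and the $R^2$ close to 0.2, up to sampling error.

```python
import random

# Simulated version of the Berk (1995) size argument.
# Distributional choices are hypothetical; c and r are independent by construction.

random.seed(0)
n = 20000
c = [random.gauss(0.0, 1.0) for _ in range(n)]   # log future cash flows
r = [random.gauss(0.1, 0.5) for _ in range(n)]   # log returns (risk), independent of c
p = [ci - ri for ci, ri in zip(c, r)]            # log price, from r = c - p

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)

slope = cov(r, p) / cov(p, p)                          # beta_1 in r = a + b p + e
r_squared = cov(r, p) ** 2 / (cov(r, r) * cov(p, p))   # share of var(r) explained by p

print("slope of r on p:", round(slope, 3))
print("R-squared:", round(r_squared, 3))
```

Raising the variance of `c` in the simulation lowers the $R^2$, matching the decomposition above.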
Time Series Patterns
Asset returns contain patterns in autocorrelations summarized in Table 2.4.
Using CRSP stock returns from 1962–1994, portfolio autocorrelations range
from 1.3% to 43.1%. Autocorrelations increase with shorter time horizons
and are higher in equally-weighted portfolios than value-weighted portfolios.
Both of these eﬀects are likely due to higher autocorrelation in smaller stocks,
which may be due to non-synchronous trading. There is weak evidence of
negative autocorrelations in multi-year returns. In most cases the economic
signiﬁcance of the autocorrelations may be small, as is the proportion of the
total variance explained. Individual stocks, especially smaller ones, tend to
have negative autocorrelation.
Table 2.4: Correlation Patterns
Horizon   Individual  Portfolio
Daily     –           +
Weekly    –           +
Monthly   –           +
The random walk hypothesis implies the variance of asset returns scales with
time; a T-period return should have a variance T times as large as a
one-period return. The variance ratio $VR(T) = var[r_t(T)]/(T\,var[r_t])$ should
therefore equal one. A similar statistic can be derived using variance differences.
Finite sample properties can be significantly improved by using overlapping
observations and making appropriate degrees of freedom adjustments.
Positive autocorrelations suggest variance ratios greater than one. For the
equally-weighted portfolios, this seems to be the case, with V R(2) ≈ 1.2, and
increasing with longer-horizons. V R(16) ranges from 1.5 to 1.9, depending on
the time period (this eﬀect is getting smaller as time goes on). These results
disappear in value-weighted portfolios. Looking at size-sorted portfolios, the
variance ratios are largest for the small-stock portfolios and are close to one
for the stocks in the largest decile. For individual securities the variance
ratios are close to one in general, and less than one for the longer horizons.
This is because there is some negative autocorrelation in individual security
returns due to the bid-ask spread.
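A bare-bones variance ratio can be computed as follows. This is a sketch only: it uses overlapping q-period sums but omits the degrees of freedom adjustments mentioned above, and the two return series are artificial patterns chosen to produce positive and negative first-order autocorrelation.

```python
# Simple variance ratio with overlapping observations (no small-sample corrections).
# The return series below are artificial and purely illustrative.

def variance_ratio(returns, q):
    """var of overlapping q-period returns divided by q times the one-period variance."""
    n = len(returns)
    mean = sum(returns) / n
    var_1 = sum((x - mean) ** 2 for x in returns) / n
    q_sums = [sum(returns[i:i + q]) for i in range(n - q + 1)]
    var_q = sum((s - q * mean) ** 2 for s in q_sums) / len(q_sums)
    return var_q / (q * var_1)

persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0] * 20   # runs of equal signs
reversing = [1.0, -1.0] * 50                          # sign flips every period

vr_persistent = variance_ratio(persistent, 2)
vr_reversing = variance_ratio(reversing, 2)

print("VR(2), positively autocorrelated series:", round(vr_persistent, 3))
print("VR(2), negatively autocorrelated series:", round(vr_reversing, 3))
```

The positively autocorrelated series gives a ratio above one and the reversing series a ratio below one, the same qualitative contrast as between portfolios and individual securities.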
The combination of negative autocorrelation in individual securities and
positive autocorrelation in portfolios gives rise to positive cross-autocorrelations.
This phenomenon can be summarized as a stronger correlation between
current small-stock returns and lagged large-stock returns than between current
large-stock returns and lagged small-stock returns. More directly, large stocks
tend to lead smaller stocks. This can help explain the apparent profitability
of contrarian strategies.
Shiller () and Summers () present models where stock prices have fads or bub-
bles, causing large slowly decaying swings from fundamental values. Shorter
horizon portfolio returns have little autocorrelation, while returns at longer
horizons have strong negative autocorrelation. Empirical evidence supports
these models, although the tests are based on small sample sizes and lack
power. Other empirical results indicate that the variance grows more slowly
than the time horizon, also consistent with the model. A general problem
is that irrational bubbles in stock prices are not distinguishable from
rational time-varying expected returns. Long-horizon returns are also pre-
dictable with other variables such as D/P and E/P. These variables can
explain roughly a quarter of the variation in two to four year returns, much
more than is possible for shorter horizons.
DeBondt and Thaler propose their contrarian viewpoint, where buying losers and
selling winners (measured over 3 to 5 year periods) produces excess returns. Others
have argued that the excess returns are due to differences in risk, although
a rebuttal paper from DeBondt and Thaler disagrees. It is possible that the
contrarian results are due to a size effect or some type of distressed-firm effect.
2.8.2 General Procedures
Multivariate tests can eliminate the errors-in-variables problem and increase
the precision of parameter estimates. This type of test still does not say why
the model is rejected. Consider a multi-beta model of the form
$$E_t[R_{i,t+1}] = \lambda_{0,t} + \sum_{j=1}^{K}\beta_{i,j}\lambda_{j,t}.$$
To test this using a multivariate regression
$$R_{i,t+1} = \alpha_i + \sum_{j=1}^{K}\beta_{i,j}R_{j,t+1} + \varepsilon_{i,t+1} \qquad (2.22)$$
the intercept restriction is $\alpha_i = \lambda_0(1 - \sum_j \beta_{i,j})$. This is equivalent to
mean-variance intersection, meaning that the minimum variance boundaries of all
the asset returns and minimum variance portfolios intersect at a single point.
In other words, a combination of mimicking portfolios lies on the mean-variance frontier.
The multivariate regression in the restricted form uses TN observations
to estimate N + 1 parameters. The unrestricted model has 2N parameters
to estimate. Tests with longer time series have more power, while those with
more assets have a larger size. The restrictions can be tested with the Wald
(W), likelihood ratio (LR), or Lagrange multiplier (LM) statistics. These are
all asymptotically $\chi^2$ but may differ in finite samples.
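The finite-sample differences can be seen in the textbook single-equation case. As an assumption for this sketch, I use the standard result for linear restrictions in a normal linear regression, where LR and LM are monotonic transformations of W: LR = T ln(1 + W/T) and LM = W/(1 + W/T). The multivariate statistics used in the papers discussed later carry additional adjustments that are not reproduced here.

```python
import math

# Relation among Wald, LR, and LM statistics in the classical linear regression
# with linear restrictions (an assumed textbook setting, not the multivariate case).

def lr_from_wald(wald, t):
    """Likelihood ratio statistic implied by a Wald statistic with t observations."""
    return t * math.log(1.0 + wald / t)

def lm_from_wald(wald, t):
    """Lagrange multiplier statistic implied by a Wald statistic with t observations."""
    return wald / (1.0 + wald / t)

t = 60   # hypothetical number of time series observations
for wald in (1.0, 5.0, 20.0):
    lr = lr_from_wald(wald, t)
    lm = lm_from_wald(wald, t)
    print("W =", wald, " LR =", round(lr, 3), " LM =", round(lm, 3))
```

Since all three are compared with the same asymptotic critical value, W rejects most often and LM least often in a given sample; the gap shrinks as T grows.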
2.8.3 CAPM Tests
The only testable implications of the CAPM are that the market is mean-
variance eﬃcient, and for the SL model that the intercept is zero. Roll
(1977) indicates that this is inherently impossible to do since the market is
unobservable. “Rejecting” the model may simply mean that the proxy is not
mean variance efficient. Conversely, “failing to reject” may mean that the
proxy is mean variance eﬃcient. In either case, we have not said anything
about the mean-variance eﬃciency of the market. Further, there are always
some portfolios which are mean-variance eﬃcient. There is also the issue
with conditioning information. The CAPM can hold conditionally but fail
unconditionally. Without knowing what conditioning information to use, the
models are diﬃcult to test.
Stambaugh (1982) examines the sensitivity to excluded assets in the mar-
ket proxy, ﬁnding inferences are similar regardless of the speciﬁc composition
of the proxy. Kandel and Stambaugh (1987) and Shanken (1987) estimate
the upper bound on the correlation between the proxy and the true market
needed to overturn rejection of the model. As long as the correlation is at
least 0.70, inferences would not change. Roll and Ross (1994) counter by
saying that if the true market portfolio is eﬃcient, cross-sectional relations
between expected return and beta are very sensitive to the proxy choice.
As in any statistical test, there is a tradeoﬀ between size and power.
Adding assets tends to increase the size of a test in ﬁnite samples. A longer
time series can considerably increase the power of a test. GMM tests have
become popular since they do not rely on normality, homoskedasticity, or
The early evidence was generally supportive of the CAPM, in that the
evidence seemed consistent with mean-variance eﬃciency of the “market”
portfolio. Representative studies include Fama and MacBeth (1973), Black,
Jensen, and Scholes (1972), and Blume and Friend (1973). In the mid-1970s
the “anomalies” literature developed [see Fama (1991) for a review].
Common criticisms of these “anomalies” are sample selection and data
snooping biases. Kothari, Shanken, and Sloan (1995) claim that sample
selection biases drive the results of Fama and French (1992), although Fama
and French (1996b) dispute this claim.
Fama and MacBeth (1973), hereafter FM, introduce what has become a classic methodology for empirical
asset pricing tests. They test the Black and SL CAPMs using monthly
portfolio returns and the equally-weighted NYSE as the market. Their tests
examine (i) the linearity of the risk-return tradeoﬀ, (ii) if variables other
than β matter, (iii) if the risk premium is positive, and (iv) if the return on
the zero-beta portfolio is equal to the riskless rate.
The procedure is as follows. First, portfolios are formed using estimated
β of individual securities over a four year period. Since measurement error
will systematically aﬀect these portfolios, the betas are reestimated over a
ﬁve year period and averaged across assets to get portfolio β. The β for
each portfolio is recalculated each month over the next four years to cover
delistings. Returns for each of the 20 portfolios are regressed on the port-
folio betas. This is repeated each month, and the estimated coeﬃcients are
averaged over time.
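The final step, monthly cross-sectional regressions followed by time-series averaging, can be sketched as below. The portfolio betas are taken as given rather than estimated in earlier passes, and the returns are simulated from a hypothetical one-factor model, so this illustrates only the mechanics and not FM's data or results. The function names are my own.

```python
import math
import random

# Second-pass Fama-MacBeth regressions on simulated data.
# Betas are treated as known; the true premia below are hypothetical.

def cross_section_ols(returns, betas):
    """One month: regress portfolio returns on a constant and beta."""
    n = len(returns)
    b_bar = sum(betas) / n
    r_bar = sum(returns) / n
    s_bb = sum((b - b_bar) ** 2 for b in betas)
    s_br = sum((b - b_bar) * (x - r_bar) for b, x in zip(betas, returns))
    gamma1 = s_br / s_bb
    gamma0 = r_bar - gamma1 * b_bar
    return gamma0, gamma1

def fama_macbeth(returns_by_month, betas):
    """Average the monthly coefficients; t-statistic from their time-series variation."""
    g0, g1 = [], []
    for month in returns_by_month:
        a, b = cross_section_ols(month, betas)
        g0.append(a)
        g1.append(b)
    t = len(g1)
    g0_bar = sum(g0) / t
    g1_bar = sum(g1) / t
    var_g1 = sum((g - g1_bar) ** 2 for g in g1) / (t - 1)
    t_stat = g1_bar / math.sqrt(var_g1 / t)
    return g0_bar, g1_bar, t_stat

random.seed(1)
betas = [0.5 + i / 19 for i in range(20)]   # 20 portfolio betas from 0.5 to 1.5
returns_by_month = []
for _ in range(240):
    premium = random.gauss(0.006, 0.04)     # realized factor premium this month
    returns_by_month.append(
        [0.004 + premium * b + random.gauss(0.0, 0.02) for b in betas]
    )

g0_bar, g1_bar, t_g1 = fama_macbeth(returns_by_month, betas)
print("average intercept:", round(g0_bar, 4))
print("average risk premium:", round(g1_bar, 4), " t-stat:", round(t_g1, 2))
```

The t-statistic here uses only the time-series variation of the monthly slopes, so it ignores measurement error in the betas.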
The results are generally supportive of the Black model but the estimated
riskless rate is higher than the market rate. Additional regressions including
$\beta^2$ and the asset-specific risk indicate that the risk-return relation is linear
and there is no reward for bearing unsystematic risk.
Extensions by Litzenberger and Ramaswamy (1979) and Shanken (1992)
explicitly adjust standard errors for the EIV bias rather than form portfolios.
Shanken (1992) shows that the standard errors in Fama and MacBeth (1973)
do not properly reﬂect measurement error in β, overstating the precision of
the risk premium estimates.
Black, Jensen & Scholes (1972)
The controversial Fama and French (1992) paper has generated a signiﬁcant
debate in the literature. The general goal of the paper is to assess the relative
importance of beta, size, B/M, leverage, and E/P in determining the cross-
section of expected returns. These variables had been previously documented
as important in the “anomalies” literature. Their general ﬁndings are that
beta is not systematically related to returns, while size and B/M subsume
the other factors.
The methodology employed is basically an extension of the Fama and
MacBeth (1973) procedure. The new steps involve the combination of ac-
counting and market data. All accounting data for the ﬁscal year ending
t − 1 is combined with returns measured from July of year t to June of t + 1.
Stock price data used to construct accounting ratios is from the beginning of
year t, while the size measure is from June of year t. This procedure ensures
all explanatory variables are known prior to the return.
In order to preserve the ﬁrm-speciﬁc accounting information, portfolios
are not used in the same way as in FM. Instead, portfolios are used to
calculate betas, which are then assigned to all ﬁrms in that portfolio. The
portfolios are formed by ﬁrst forming size deciles, then forming beta deciles
within each size decile. In both sorts, breakpoints are set based on only the
NYSE firms. With these 100 portfolios, each portfolio beta is calculated as
the sum of the coefficients on the current and prior month CRSP value-weighted
returns. The beta for a particular stock can change over time as the stock
moves into diﬀerent portfolios.
This two-way sorting procedure produces variation in beta that is unre-
lated to size. Univariate statistics show that average returns are related to
size, but unrelated to beta. This evidence is conﬁrmed by the FM regressions.
Gibbons (1982) introduces a multivariate test of the CAPM and rejects
CAPM soundly using LR. He uses the CRSP equally-weighted index as the
market, estimates β over a 5 year period, and forms 40 portfolios. This
multivariate methodology avoids the EIV problem, provides more precise risk
premium estimates, and has more power than previous tests. The nonlinear
restriction on the intercept is linearized with a Taylor-series expansion.
Stambaugh (1982) shows inferences are not sensitive to proxy choice, but
are sensitive to the asset choice. He argues that W lacks power, LR has
the wrong size, and LM is closest to its asymptotic distribution. Using a
portfolio of stocks, bonds, and preferred stock, he fails to reject linearity (Black
CAPM), but rejects SL. Using fewer assets he rejects both models.
Shanken (1985) provides the asymptotic results for the multivariate tests in
Gibbons (1982). He shows that $LM < LR < Q^*$ ($= W$). These statistics are
all transformations of one another. Shanken uses $Q^A_C$, which includes
considerations for sample size and degrees of freedom adjustments. Recalculating
Gibbons’ LR statistic, Shanken shows p = 0.75, so the rejection inference is
The cross-section regression test (CSRT) used in this paper does not
require specifying $H_A$. The procedure estimates beta in a first stage, then
uses these betas in cross-sectional regressions. The CAPM is rejected using the
equally-weighted CRSP index.
MacKinlay (1987) discusses the power of multivariate SL CAPM tests. He finds
that tests against an unspecified alternative have low power. The type of
deviation from the model is important in determining power. These tests have
reasonable power against cross-sectional random deviations. However, these
tests have low power against omitted factors. He rejects in some subperiods
but fails to reject overall.
2.8.4 ICAPM/CCAPM Tests
Tests of a multi-beta model are similar to CAPM tests in that they are
really tests of the mean-variance efficiency of a particular combination of
portfolios. There is mixed evidence about the importance of durable goods.
Habit persistence models perform better in goodness-of-ﬁt tests, but still do
not explain the ﬁrst moment of the equity premium puzzle.
Reject model. See QM notes for more details.
Mehra and Prescott (1985)
The equity premium puzzle arises because extreme risk aversion parameters
are needed to make the low volatility of aggregate consumption growth in the
U.S. consistent with the returns on both equity and T-bills. Some of these
results may arise partially because of poorly measured consumption data, but
eﬀorts to correct for this still lead to rejections of the model. One possible
(partial) explanation for the equity premium puzzle is incomplete markets,
which may result in the overestimation of risk aversion. One experiment
using log utility (CRRA = 1) results in an estimate based on aggregate
consumption of CRRA = 3. Weil () presents the same puzzle from the
perspective of the riskless asset.
2.8.5 APT Tests
The testable implications of the APT given in (2.17) are
1. λi = 0 for any i
2. λ0(= rf ) ≥ 0 (debated)
Again, the test really amounts to seeing if a particular combination of
portfolios is mean-variance efficient.
To make the intertemporal APT testable, certain restrictions need to be
imposed. One alternative is to assume that (i) the observed set of assets has
a factor structure, (ii) the noise terms of the observed assets are uncorrelated
with the noise terms on the unobserved assets, and (iii) the factors span the
state variables. Alternatively, we can assume logarithmic utility in which
case the intertemporal APT reduces to the APT. These requirements are
very similar to the ICAPM.
As mentioned in Section 2.4.2, the APT has features which make testing
diﬃcult. In fact, one view is that APT is not testable [e.g., Shanken (1982),
Reisman (1992)], whereas others [Ingersoll (1984), others ??] claim it is. The
primary reason for this disagreement is the approximate nature of the model.
Are deviations from the exact model due to the approximation or are they
genuine deviations from the model itself? The test then becomes a joint test
of the model and the additional assumptions needed to impose the exact pric-
ing relation. The APT and ICAPM are not empirically distinguishable. The
“pervasive factors” in the APT world can coincide with the “state variables”
in the ICAPM world.
The test of the model requires estimation of both the factor loadings (B)
and the factor prices (F). The two primary testing approaches diﬀer in the
order these variables are estimated. Cross-sectional tests estimate (B) in the
time series, then use these estimates for a number of firms to estimate (F) in
the cross-section. The time series tests perform the estimation in the reverse
order.
Fama and MacBeth (1973) provide the basic approach for the cross-
sectional test [see Section 2.8.3 for details]. Some of these tests estimate
the factors statistically while others use economic speciﬁcations. Chen, Roll,
and Ross (1986) specify ﬁve economic variables as factors: industrial produc-
tion, unexpected inﬂation, changes in expected inﬂation, credit quality, and
a term premium. They find that the specification is good in the sense that
many of these factors are priced and additional factors such as the market
return, consumption growth, and changes in oil prices are not priced. Chan,
Chen, and Hsieh (1984) perform a study similar to CRR, but are also able to
explain the size anomaly. However, Shanken and Weinstein (1990) reply that
these two studies are sensitive to the portfolio formation used. Speciﬁcally,
forming size-based portfolios at the end of the estimation period causes mis-
estimation of the βs to show up systematically in the size portfolios, biasing
the subsequent risk premium estimate.
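The two-pass cross-sectional procedure described above can be sketched on simulated data. This is only an illustration under stated assumptions: the function name `two_pass`, the exact two-factor return generating process, and every parameter value are hypothetical and are not taken from any of the cited studies. Loadings are estimated asset by asset in the time series, and factor prices are then estimated period by period in the cross-section and averaged.

```python
import numpy as np

def two_pass(returns, factors):
    """Two-pass estimate of factor loadings B and factor prices lambda.

    returns: (T, N) array of asset returns; factors: (T, K) factor realizations.
    """
    T, N = returns.shape
    # First pass (time series): regress each asset on the factors to get B, shape (N, K).
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)
    B = coef[1:].T
    # Second pass (cross-section): each period, regress the N returns on [1, B].
    Z = np.column_stack([np.ones(N), B])
    lam_t, *_ = np.linalg.lstsq(Z, returns.T, rcond=None)
    lam_t = lam_t.T                                  # (T, K+1): one estimate per period
    lam = lam_t.mean(axis=0)                         # time-series average of the estimates
    t_stat = lam / (lam_t.std(axis=0, ddof=1) / np.sqrt(T))
    return B, lam, t_stat

# Simulated data from an exact two-factor model (all numbers are hypothetical).
rng = np.random.default_rng(0)
T, N = 2000, 30
lam_true = np.array([0.6, 0.4])
B_true = rng.uniform(0.5, 1.5, size=(N, 2))
shocks = rng.standard_normal((T, 2))                 # mean-zero factor innovations
returns = (lam_true + shocks) @ B_true.T + 0.5 * rng.standard_normal((T, N))

B_hat, lam_hat, t_stat = two_pass(returns, shocks)
print("intercept and factor prices:", np.round(lam_hat, 3))
```

Because the first-pass betas are estimated with error, the second pass suffers from an errors-in-variables problem. In this sketch the long time series keeps that error small, whereas applied work typically groups assets into portfolios to reduce it.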
The time series test method was originally proposed by Black, Jensen,
and Scholes (1972). Factor prices are estimated in the first pass, and the
sensitivities to them in the second pass. The null hypothesis is that the intercept is
zero (or $\alpha = (1 - B_i)\lambda_0$ in the absence of a riskless asset).
In summary, the tests of the APT generally reject the model, but the
APT seems to perform better than alternatives such as CAPM. The APT
has been used in applications which oﬀer indirect evidence of its success as
well. In fund performance tests, the model indicates fund managers have
negative Jensen’s alphas, which is similar to the result from the CAPM models
(the magnitudes diﬀer though). In calculating the cost of capital, CAPM
and APT yield similar results. In event studies the APT does not seem to
oﬀer much gain over a single factor model.
2.8.6 Present Value Relations
The history of volatility and returns tests shows a flip-flop of results. The
early variance bounds tests rejected the present value models, whereas the
returns tests failed to reject. More recently, volatility bounds tests provide
mixed evidence, but the returns tests now reject the model.
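Denote the “perfect foresight” price, the present value of the dividends
actually realized after $t$, by $p_t^*$.
Then $p_t = E[p_t^*]$ or
$$p_t^* = E[p_t^*] + \varepsilon_t = p_t + \varepsilon_t.$$
Because the forecast error $\varepsilon_t$ is uncorrelated with the forecast $p_t$,
$$\mathrm{var}(p_t^*) = \mathrm{var}(p_t) + \mathrm{var}(\varepsilon_t) \geq \mathrm{var}(p_t) \qquad (2.23)$$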
This says actual prices should be less volatile than the “model” price from
the dividend series. In fact, we ﬁnd the opposite. Actual prices are more
volatile than would be expected from dividends.
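A small simulation can illustrate what inequality (2.23) implies when the present value model holds. The sketch below assumes, purely for illustration, AR(1) dividends, a constant discount rate `r`, and a terminal perfect foresight price set equal to the model price at the end of the sample. All parameter values and variable names are hypothetical. Under these assumptions the model price has a closed form, and the perfect foresight price follows from a backward recursion on realized dividends.

```python
import numpy as np

rng = np.random.default_rng(0)
T, r, rho, mu, sigma_u = 5000, 0.05, 0.9, 10.0, 1.0   # hypothetical values

# Simulate AR(1) dividends: d_t - mu = rho * (d_{t-1} - mu) + u_t.
d = np.empty(T)
d[0] = mu
for t in range(1, T):
    d[t] = mu + rho * (d[t - 1] - mu) + sigma_u * rng.standard_normal()

# Model price p_t = sum_{k>=1} E_t[d_{t+k}] / (1+r)^k, in closed form for AR(1).
p = mu / r + (d - mu) * rho / (1.0 + r - rho)

# Perfect foresight price from realized dividends, with terminal value p*_T = p_T.
p_star = np.empty(T)
p_star[-1] = p[-1]
for t in range(T - 2, -1, -1):
    p_star[t] = (p_star[t + 1] + d[t + 1]) / (1.0 + r)

eps = p_star - p                                      # forecast error
var_p, var_p_star = p.var(), p_star.var()
print(f"var(p) = {var_p:.1f}, var(p*) = {var_p_star:.1f}")
```

In data generated this way the perfect foresight price is the more volatile series, which is the direction the bound requires; the empirical finding reported above is that actual prices violate it.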
There are several problems with the above test. First, the price series
is nonstationary so it needs to be modiﬁed. Second, the inﬁnite sum is a
problem in a ﬁnite sample. This can be overcome by including a terminal
value in the distant future. Third, the observed dividend series is not a series
of independent observations, but rather a single realization. This creates a
small sample problem in implementing the test. Fourth, there is no way to
capture time-varying expected returns in this framework. Finally, diﬀerent
speciﬁcations of the investors’ information sets lead to diﬀerent critical val-
ues, making interpretation diﬃcult. In summary, there are several necessary
adjustments to the variance bounds test. Even after making these adjust-
ments, there is no way to hold size constant so there is no way to meaningfully
compare the power of this test to alternatives.
Shiller (1981) uses the perfect foresight price decomposition to derive
variance bounds. He finds the actual price is five to thirteen times more
volatile than the perfect foresight price. His analysis indicates that the price
change volatility is highest when information about dividends is revealed
smoothly. Large, occasional information releases result in prices with lower
variance but higher kurtosis.
Tests of long horizon returns have found that there is significant negative
autocorrelation over the three to five year horizon, indicating a tendency for
A model-free version is not subject to the nuisance parameter problem which
plagues the variance bounds test. Both the model-free and the model-based
orthogonality tests are better-behaved econometrically than the returns tests.
The pricing of bonds diﬀers from pricing other assets such as equity primarily
because bonds are nonlinear. A bond has:
1. ﬁxed, known maturity
2. ﬁxed, known terminal (face) value
3. ﬁxed, known periodic cash ﬂows
4. more thinly traded (at least “older” issues)
Term structure models can be viewed as time series models of the stochastic
discount factor.
3.2 Term Structure Basics
3.3 Inﬂation and Returns
3.4 Forward Rates
Forward rates had been viewed simply as forecasts of expected future spot
rates (PEH). Fama (??) shows that the forward rates also contain expecta-
tions of the premium above one month T-bills.
• Holding period return is the change in log price on a particular bond
from one period to the next.
• The forward rate is the diﬀerence in the log prices of bonds of diﬀerent
maturities at the same point in time.
• Premium is the holding period return less the one month spot rate.
Fama uses a regression approach to separate the information about expected
future spot rates from information about the expected premium.
1. premium = f(forward - spot)
2. ∆ spot = f(forward - spot)
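The definitions and the two regressions above can be written out for a two-period bond. This is a hedged sketch rather than Fama's actual implementation: the array layout `logp[t, n]`, the helper names, and the simulated prices are all assumptions made for illustration. For the two-period bond, the premium and the change in the spot rate add up to the forward-spot spread, so in this construction the two OLS slopes sum to one.

```python
import numpy as np

def fama_variables(logp):
    """logp[t, n]: log price at date t of a discount bond maturing in n periods.

    Column 0 is a bond at maturity, so logp[:, 0] == 0.
    """
    spot = -logp[:, 1]                       # one-period spot rate
    forward = logp[:, :-1] - logp[:, 1:]     # forward[t, n-1] = p(n-1, t) - p(n, t)
    hpr = logp[1:, :-1] - logp[:-1, 1:]      # hpr[t, n-1] = p(n-1, t+1) - p(n, t)
    premium = hpr - spot[:-1, None]          # holding period return less the spot rate
    return spot, forward, hpr, premium

def ols_slope(y, x):
    """Slope coefficient from a univariate OLS regression of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

# Hypothetical log prices for one- and two-period bonds.
rng = np.random.default_rng(1)
T = 500
spot_sim = 0.005 + 0.001 * rng.standard_normal(T)
fwd_sim = 0.006 + 0.5 * (spot_sim - 0.005) + 0.001 * rng.standard_normal(T)
logp = np.column_stack([np.zeros(T), -spot_sim, -(spot_sim + fwd_sim)])

spot, forward, hpr, premium = fama_variables(logp)
spread = forward[:-1, 1] - spot[:-1]           # forward-spot spread at t
b_premium = ols_slope(premium[:, 1], spread)   # regression 1: premium on the spread
b_spot = ols_slope(np.diff(spot), spread)      # regression 2: spot change on the spread
print(round(b_premium, 3), round(b_spot, 3))
```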
Results are that forward rates can predict premiums which vary through
time and the expected future spot rate up to ﬁve months out. Froot has a
response to Fama’s ﬁnding, suggesting that Fama ignores systematic expec-
Find that forward rate forecasts of near-term changes in interest rates are
poor, but forecast power increases at longer time horizons. Interpret this as
evidence of a slow mean-reverting process. Also ﬁnd evidence of time-varying
expected premiums, and that the ordering of risks and rewards changes with
the business cycle.
An aﬃne yield model implies a latent variable structure for bond returns.
Fewer state variables than forecasting variables puts testable restrictions on
forecasting equations for bond returns. Reject CIR with non-matched
maturities (avoids measurement error). Addresses source of errors, their
consequences, and how the choice of instruments affects the outcome of the tests.
3.5 Bond Pricing
Like any asset, bonds can be priced using the pricing kernel approach presented
in Section 2.5. Begin with the fundamental pricing equation
$$1 = E_t[M_{t+1} R_{n,t+1}].$$
The uppercase M is used to distinguish it from logs and the n subscript
indicates the time to maturity. The return can obviously be expressed as
the relative price change $R_{n,t+1} = P_{n-1,t+1}/P_{n,t}$. Substituting this into the
pricing equation gives
$$P_{n,t} = E_t[M_{t+1} P_{n-1,t+1}].$$
Recursive substitution and the fact that the bond is worth a dollar at
maturity gives another representation
$$P_{n,t} = E_t[M_{t+1} \cdots M_{t+n}].$$
In this light a bond pricing model is really a time series model of the stochastic
discount factor.
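The recursive substitution can be spelled out as follows. The steps use only the one-period pricing equation, the law of iterated expectations, and the terminal condition $P_{0,t+n} = 1$; the notation follows the equations above.

```latex
\begin{align*}
P_{n,t} &= E_t\left[M_{t+1} P_{n-1,t+1}\right] \\
        &= E_t\left[M_{t+1}\, E_{t+1}\left[M_{t+2} P_{n-2,t+2}\right]\right]
         = E_t\left[M_{t+1} M_{t+2} P_{n-2,t+2}\right] \\
        &\;\;\vdots \\
        &= E_t\left[M_{t+1} \cdots M_{t+n} P_{0,t+n}\right]
         = E_t\left[M_{t+1} \cdots M_{t+n}\right],
         \qquad P_{0,t+n} = 1 .
\end{align*}
```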
Fixed income models are broadly categorized as either stochastic interest
rate models or stochastic term structure models. Stochastic interest rate
models begin by specifying a process dr for the short rate. The problem with
this approach is that the model price of the bond may not equal the market
price. The short rate process also implies prices for bonds of other maturities
and these may be mispriced as well. The stochastic term structure models
use the observed market prices and estimates of the volatility structure to
infer the stochastic process of the short rate. This information is then used
to get a distribution for the bond price.
3.6 Aﬃne Models
Affine yield models represent a class of relatively simple models in which all
relevant variables are conditionally log-normal and log yields are linear in
state variables. Affine forward rates imply affine yields. Taking logs of the
pricing equation gives
$$p_{n,t} = E_t[m_{t+1} + p_{n-1,t+1}] + \tfrac{1}{2}\mathrm{var}_t(m_{t+1} + p_{n-1,t+1}).$$
A model with k state variables implies that the term structure can be
summarized by the levels of k bond yields at each point in time and the
constant coeﬃcients relating the bond yields. In this sense aﬃne yield models
are linear; they are non-linear in the evolutionary process of the k basis yields
and the relation between the cross-sectional coeﬃcients and the underlying
parameters of the model.
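Table 3.1: Single Factor Stochastic Interest Rate Models
$dr = (\alpha + \beta r)\,dt + \sigma r^{\gamma}\,dZ$
(a blank entry means the parameter is unrestricted)

Model         α    β    γ     Specification
Merton (ABM)       0    0     $\alpha\,dt + \sigma\,dZ$
Vasicek                 0     $(\alpha + \beta r)\,dt + \sigma\,dZ$
CIR SR                  1/2   $(\alpha + \beta r)\,dt + \sigma\sqrt{r}\,dZ$
Courtadon               1     $(\alpha + \beta r)\,dt + \sigma r\,dZ$
Dothan        0    0    1     $\sigma r\,dZ$
GBM           0         1     $\beta r\,dt + \sigma r\,dZ$
CIR VR        0    0    3/2   $\sigma r^{3/2}\,dZ$
CEV           0               $\beta r\,dt + \sigma r^{\gamma}\,dZ$
Duffie-Kan              1/2   $(\alpha_1 + \beta_1 r)\,dt + (\alpha_2 + \beta_2 r)^{\gamma}\,dZ$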
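The nesting in Table 3.1 can be made concrete by treating each model as a set of parameter restrictions on the general process and simulating it with a simple Euler discretization. This is a minimal sketch: the function names, step size, and parameter values are hypothetical, and the Euler scheme is one convenient choice rather than the method of any cited paper. The Duffie-Kan row is left out because its diffusion term is not of the form $\sigma r^{\gamma}$.

```python
import numpy as np

# Restrictions from Table 3.1; a missing key means the parameter is unrestricted.
RESTRICTIONS = {
    "Merton (ABM)": {"beta": 0.0, "gamma": 0.0},
    "Vasicek": {"gamma": 0.0},
    "CIR SR": {"gamma": 0.5},
    "Courtadon": {"gamma": 1.0},
    "Dothan": {"alpha": 0.0, "beta": 0.0, "gamma": 1.0},
    "GBM": {"alpha": 0.0, "gamma": 1.0},
    "CIR VR": {"alpha": 0.0, "beta": 0.0, "gamma": 1.5},
    "CEV": {"alpha": 0.0},
}

def restricted(model, params):
    """Impose the model's restrictions on a full parameter set."""
    return {**params, **RESTRICTIONS[model]}

def simulate(model, params, r0=0.05, dt=1 / 252, n_steps=2520, seed=0):
    """Euler scheme for dr = (alpha + beta*r) dt + sigma * r**gamma dZ."""
    p = restricted(model, params)
    rng = np.random.default_rng(seed)
    r = np.empty(n_steps + 1)
    r[0] = r0
    for i in range(n_steps):
        dz = np.sqrt(dt) * rng.standard_normal()
        level = max(r[i], 0.0)   # guard: a fractional power of a negative rate is undefined
        drift = (p["alpha"] + p["beta"] * r[i]) * dt
        r[i + 1] = r[i] + drift + p["sigma"] * level ** p["gamma"] * dz
    return r

# Hypothetical unrestricted parameters; the long-run mean is -alpha/beta = 0.06.
base = {"alpha": 0.03, "beta": -0.5, "sigma": 0.02, "gamma": 0.75}
vasicek_path = simulate("Vasicek", base)
cir_path = simulate("CIR SR", base)
no_noise = simulate("Vasicek", {**base, "sigma": 0.0})   # pure mean reversion
print(round(no_noise[-1], 4))
```

With the noise switched off, the Vasicek path drifts toward $-\alpha/\beta$, which makes the role of $\beta < 0$ as the mean reversion parameter easy to see.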
• distribution of the SDF is conditionally lognormal;
• bond prices are jointly lognormal with the SDF;
• (additional strong assumptions): homoskedastic mt+1 (Vasicek)
• Log prices (and yields) are aﬃne in state variables.
• Analytic solution of pricing equations (outside aﬃne yield generally
requires numerical solutions e.g., Black, Derman, and Toy).
• Trivial rejection of model without addition of an error term.
• Limits the way in which interest rate volatility can change with the
level of interest rates.
• Implies risk premia on long bonds always have the same sign (single-factor models).
• Applies to real bonds only ?
• The model can be renormalized so that the yields themselves are the
state variables (e.g., a two-factor model would use two yields).
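$$dr = \kappa(\theta - r)\,dt + \sigma\,dB$$
$$y_{1t} = x_t - \beta^2\sigma^2/2 \quad \text{and} \quad -p_{nt} = A_n + B_n x_t$$
To get this model begin by writing the SDF as a forecast and an innovation
$$-m_{t+1} = x_t + \varepsilon_{t+1}.$$
The sign is a convention. Assume that $x_{t+1}$ follows an AR(1) process and,
for simplicity, that $\varepsilon_{t+1}$ is proportional to its innovation $\xi_{t+1}$, which has variance $\sigma^2$:
$$x_{t+1} - \mu = \phi(x_t - \mu) + \xi_{t+1} \quad \text{and} \quad \varepsilon_{t+1} = \beta\xi_{t+1}.$$
Now consider the log price of a one period bond
$$p_{1,t} = E_t[m_{t+1}] + \tfrac{1}{2}\mathrm{var}(m_{t+1}) = -x_t + \beta^2\sigma^2/2.$$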
• Allows interest rates to be negative (OK for real, not nominal).
• Can handle rising, inverted, and humped yield curves, but not inverted
humped curves.
• Price of interest rate risk is a constant that does not depend on the
level of the short rate.
• Interest rate changes have constant variance.
• Limiting forward rate cannot be both finite and time-varying.
• Log forward rate curve tends to slope downwards unless β is suﬃciently
• Random walk is a special case.
• B measures the sensitivity of the n-period bond return to the one-
period interest rate (and the state variable). This sensitivity increases
in maturity, and is always less than the maturity.
• Average short rate is $\mu - \beta^2\sigma^2/2$.
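The coefficients $A_n$ and $B_n$ in $-p_{nt} = A_n + B_n x_t$ can be computed recursively. The recursion below is a sketch obtained by substituting the affine guess into the log pricing equation $p_{n,t} = E_t[m_{t+1} + p_{n-1,t+1}] + \tfrac{1}{2}\mathrm{var}(m_{t+1} + p_{n-1,t+1})$ with $-m_{t+1} = x_t + \beta\xi_{t+1}$ and $\mathrm{var}(\xi_{t+1}) = \sigma^2$, which gives $B_n = 1 + \phi B_{n-1}$ and $A_n = A_{n-1} + (1-\phi)\mu B_{n-1} - (\beta + B_{n-1})^2\sigma^2/2$ with $A_0 = B_0 = 0$. The function names and parameter values are hypothetical.

```python
import numpy as np

def affine_coefficients(n_max, mu, phi, beta, sigma):
    """Return arrays A[0..n_max], B[0..n_max] with -p_n = A[n] + B[n] * x."""
    A = np.zeros(n_max + 1)
    B = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        B[n] = 1.0 + phi * B[n - 1]
        A[n] = (A[n - 1] + (1.0 - phi) * mu * B[n - 1]
                - 0.5 * (beta + B[n - 1]) ** 2 * sigma ** 2)
    return A, B

def yields(x, A, B):
    """Log yields y_n = -p_n / n for maturities n = 1, ..., n_max."""
    n = np.arange(1, len(A))
    return (A[1:] + B[1:] * x) / n

# Hypothetical parameter values.
mu, phi, beta, sigma = 0.01, 0.95, -10.0, 0.002
A, B = affine_coefficients(120, mu, phi, beta, sigma)
curve = yields(mu, A, B)          # yield curve when the state is at its mean
print(np.round(curve[[0, 11, 59, 119]], 5))
```

For $n = 1$ the recursion reproduces $y_{1t} = x_t - \beta^2\sigma^2/2$, and with $\phi < 1$ the loading $B_n = (1 - \phi^n)/(1 - \phi)$ rises with maturity while staying below $n$ for $n > 1$, in line with the bullet on $B$ above.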
3.6.2 The CIR Model
$$dr = \kappa(\theta - r)\,dt + \sigma\sqrt{r}\,dB$$
The basic CIR model is a general equilibrium, continuous time model of the
real returns on the asset in an economy [see section 2.3.5]. The general model
is specialized to the term structure in ?. The asset is used to smooth con-
sumption, so its value depends on its hedging eﬀectiveness, or its covariance
with consumption. The model is derived in an option pricing framework
by constructing a riskless synthetic portfolio, which must earn the riskless
rate in equilibrium. The hedge portfolio is constructed of bonds of diﬀering
maturities; it is assumed that the market price of risk is the same for bonds
of all maturities. A recursive approach must be used to solve the model.
Although the model claims to endogenously derive the interest rate process,
it is a direct consequence of the speciﬁcation of the state variable.
• identical individuals with time-additive log utility (Dunn and Singleton
relax this assumption but do not have much success)
• $x_{t+i}$ and $m_{t+i}$ are normal conditional on $x_t$ for $i = 1$, but non-normal
for $i > 1$.
• $y_{1t} = -p_{1t} = x_t(1 - \beta^2\sigma^2/2)$; $y_{1t}$ is proportional to the state variable
and its conditional variance is proportional to its level.
• restricts interest rates to be positive
• Variance proportional to the state variable.
• All bond returns are perfectly correlated (general prediction of all
• Prices are a deterministic function of the parameters, the short rate,
and maturity; an error term must be speciﬁed to keep the model
• The long rate converges to a constant.
• Stable parameters (λ, κ, θ, σ).
• Forward rate fnt = −B2
• time variation in term premia ?
3.6.3 Duﬃe-Kan Class
The Duﬃe-Kan model is the most general aﬃne model possible. It nests all
the common models as special cases.
$$dr = \kappa(\theta - r)\,dt + \sqrt{\alpha + \beta r}\,dZ$$
3.6.4 Other Single Factor Models
• Non-linear models (γ = 3/2)
• Non-parametric models
• Markov switching models
• Higher-order ARMA processes
• Several state variables
3.7 Multi-Factor Models
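Longstaff and Schwartz (2-factor)
$$-m_{t+1} = x_{1t} + x_{2t} + x_{1t}^{1/2}\beta\xi_{1,t+1}$$
$$p_{1t} = -x_{1t} - x_{2t} + x_{1t}\beta^2\sigma_1^2/2$$
where $\sigma_1^2$ is the variance of $\xi_{1,t+1}$.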
• second factor (instantaneous variance of changes in short rate) avoids
implication that all bond returns are perfectly correlated
• variance of innovation to log SDF is proportional to the level of x1t and
is conditionally correlated with x1t but not with x2t.
• One-period yield is no longer proportional to x1t and the short rate
alone is no longer suﬃcient to describe the state of the economy.
• The model is a generalization of the square-root model
• it can also generate inverted humped yield curves.
• Whenever the SDF can be expressed as the sum of two independent
processes, the resulting term structure is the sum of the term structures
that would exist under each of these processes.
3.8 Empirical Tests
3.8.1 Brown & Dybvig (1986)
• Nominal, prices, cross-sectional, ML
Table 3.2: Summary of Empirical Results

Study        Data  Method  Approach                      Results
BD (1986)    P,N   C,ML    iid errors                    r̂ > r, σ not constant
BS (1994)    P,R   C,ML                                  Unstable est., don't support
                                                         mean reversion, σ > 0 binds
CKLS (1992)  Y,N   TS,GMM  assume normality              reject γ < 1, unconstr. γ = 1.5,
                                                         mean reversion not important
GR (1993)    Y,R   TS,GMM  forecast R from N,            fail to reject CIR, plausible
                           use non-central χ²            estimates, fit short bonds better
PS (1994)    P,N   TS,ML   second factor for inflation   unstable/unrealistic estimates,
                                                         reject original and two factor CIR
LS (1992)    Y,N   C,GMM   second factor for volatility  reject single factor model,
                           estimated with GARCH          2-factor holds for short and int bonds

Data: Price or Yield; Nominal or Real.
Method: Cross-section or Time series, Econometric Method.
• Assume pricing errors are iid - a strong assumption given the diﬀerences
in trading frequency across maturities; an alternative is to assume vari-
ance increases with maturity and is correlated across maturities.
• Estimated r systematically overstates implied short rates (recall Fama
MacBeth; Merton’s model of heterogeneous information sets).
• Find estimated variance is erratic, although similar in magnitude to
CIR weekly time series estimates.
• Find annual average of implied standard deviation ($\hat{r}$) appears to be
an unbiased predictor of time series estimate of the standard deviation
of changes in the short rate.
• Bills appear to be better described by the model than bonds.
• Discount issues’ prices are underestimated, premiums are overestimated.
• Evidence that the errors are not iid.
3.8.2 Brown & Schaefer (1994)
• Real, prices, cross-sectional, ML
• CIR model is generally able to replicate observed yield curve shapes
• Pricing errors are generally within the bid–ask spread
• Parameter estimates are unstable, especially κ + λ
• Positivity constraint on $\sigma^2$ binds in many cases
• Cross-sectional estimates of variance are not unbiased estimates of the
time series estimates.
• evidence on mean reversion is generally not supportive
3.8.3 Chan, Karolyi, Longstaﬀ & Sanders (1992)
CKLS present a generalized model that nests eight popular interest rate
models:
$$dr = (\alpha + \beta r)\,dt + \sigma r^{\gamma}\,dZ$$
• Nominal, yields, time series, GMM
• The γ term seems to be the most important; models with γ < 1 are
all rejected, and those with γ = 1.5 fare the best. The unrestricted
estimate of γ is 1.5, and is signiﬁcantly diﬀerent than unity.
• The mean reversion process, which adds considerable complexity to the
model, does not appear to be of major importance.
• Results are troubling for single-factor affine yield models. First, without
mean reversion, the term structure may increase initially, but will then be
downward sloping. Second, with $\gamma > 0.5$, the models become
intractable and must be solved numerically.
3.8.4 Gibbons & Ramaswamy (1993)
• Forecast real returns on nominal bonds in a time series setting (assume
inﬂation is independent of the real SDF ?)
• GMM in a time series
• Fail to reject CIR, obtain plausible parameter estimates
• Reject with oﬀ-the-run bonds (measurement error and a small sample).
• Model ﬁts short end of term structure better than longer maturities.
• Find some evidence of autocorrelation in returns
3.8.5 Pearson & Sun (1994)
• Nominal, prices, time series, ML
• Generalize square-root model to allow the variance of the state variable
to be linear in the level of the state variable.
• Also include a second factor — expected inﬂation.
• Reject original and two-factor CIR model.
• Unrealistic parameter estimates:
• Unstable parameter estimates (across datasets).
• Within sample prediction has no power and is little better than a naive
prediction of current values.
3.8.6 Longstaﬀ & Schwartz (1992)
• second factor for volatility estimated using GARCH
• test cross-sectional restrictions with GMM
• Find model holds for both short- and intermediate-term maturities
• Reject single-factor model