Handbook of the Economics of Finance


    • HANDBOOKS IN ECONOMICS 21 Series Editors KENNETH J. ARROW MICHAEL D. INTRILIGATOR Amsterdam • Boston • Heidelberg • London • New York • Oxford Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo
    • HANDBOOK OF THE ECONOMICS OF FINANCE VOLUME 1B FINANCIAL MARKETS AND ASSET PRICING Edited by GEORGE M. CONSTANTINIDES University of Chicago MILTON HARRIS University of Chicago and RENÉ M. STULZ Ohio State University 2003 Amsterdam • Boston • Heidelberg • London • New York • Oxford Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo
    • ELSEVIER B.V. Sara Burgerhartstraat 25 P.O. Box 211, 1000 AE Amsterdam, The Netherlands © 2003 Elsevier B.V. All rights reserved. This work is protected under copyright by Elsevier, and the following terms and conditions apply to its use: Photocopying Single photocopies of single chapters may be made for personal use as allowed by national copyright laws. Permission of the Publisher and payment of a fee is required for all other photocopying, including multiple or systematic copying, copying for advertising or promotional purposes, resale, and all forms of document delivery. Special rates are available for educational institutions that wish to make photocopies for non-profit educational classroom use. Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax (+44) 1865 853333, e-mail: permissions@elsevier.com. You may also complete your request on-line via the Elsevier home page (http://www.elsevier.com) by selecting ‘Customer Support’ and then ‘Obtaining Permissions’. In the USA, users may clear permissions and make payments through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA; phone: (+1) (978) 7508400, fax: (+1) (978) 7504744, and in the UK through the Copyright Licensing Agency Rapid Clearance Service (CLARCS), 90 Tottenham Court Road, London W1P 0LP, UK; phone: (+44) 20 7631 5555; fax: (+44) 20 7631 5500. Other countries may have a local reprographic rights agency for payments. Derivative Works Tables of contents may be reproduced for internal circulation, but permission of Elsevier is required for external resale or distribution of such material. Permission of the Publisher is required for all other derivative works, including compilations and translations. Electronic Storage or Usage Permission of the Publisher is required to store or use electronically any material contained in this work, including any chapter or part of a chapter. 
Except as outlined above, no part of this work may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written permission of the Publisher. Address permissions requests to: Elsevier’s Science & Technology Rights Department, at the phone, fax and e-mail addresses noted above. Notice No responsibility is assumed by the Publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made. First edition 2003 Library of Congress Cataloging-in-Publication Data A catalog record from the Library of Congress has been applied for. British Library Cataloguing in Publication Data A catalogue record from the British Library has been applied for. ISBN: 0-444-50298-X (set, comprising vols. 1A & 1B) ISBN: 0-444-51362-0 (vol. 1A) ISBN: 0-444-51363-9 (vol. 1B) ISSN: 0169-7218 (Handbooks in Economics Series) ∞ The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper). Printed in The Netherlands.
    • INTRODUCTION TO THE SERIES The aim of the Handbooks in Economics series is to produce Handbooks for various branches of economics, each of which is a definitive source, reference, and teaching supplement for use by professional researchers and advanced graduate students. Each Handbook provides self-contained surveys of the current state of a branch of economics in the form of chapters prepared by leading specialists on various aspects of this branch of economics. These surveys summarize not only received results but also newer developments, from recent journal articles and discussion papers. Some original material is also included, but the main goal is to provide comprehensive and accessible surveys. The Handbooks are intended to provide not only useful reference volumes for professional collections but also possible supplementary readings for advanced courses for graduate students in economics. KENNETH J. ARROW and MICHAEL D. INTRILIGATOR PUBLISHER’S NOTE For a complete overview of the Handbooks in Economics Series, please refer to the listing at the end of this volume.
CONTENTS OF THE HANDBOOK

VOLUME 1A: CORPORATE FINANCE

Chapter 1. Corporate Governance and Control
MARCO BECHT, PATRICK BOLTON and AILSA RÖELL

Chapter 2. Agency, Information and Corporate Investment
JEREMY C. STEIN

Chapter 3. Corporate Investment Policy
MICHAEL J. BRENNAN

Chapter 4. Financing of Corporations
STEWART C. MYERS

Chapter 5. Investment Banking and Security Issuance
JAY R. RITTER

Chapter 6. Financial Innovation
PETER TUFANO

Chapter 7. Payout Policy
FRANKLIN ALLEN and RONI MICHAELY

Chapter 8. Financial Intermediation
GARY GORTON and ANDREW WINTON

Chapter 9. Market Microstructure
HANS R. STOLL

VOLUME 1B: FINANCIAL MARKETS AND ASSET PRICING

Chapter 10. Arbitrage, State Prices and Portfolio Theory
PHILIP H. DYBVIG and STEPHEN A. ROSS

Chapter 11. Intertemporal Asset-Pricing Theory
DARRELL DUFFIE

Chapter 12. Tests of Multi-Factor Pricing Models, Volatility, and Portfolio Performance
WAYNE E. FERSON

Chapter 13. Consumption-Based Asset Pricing
JOHN Y. CAMPBELL

Chapter 14. The Equity Premium in Retrospect
RAJNISH MEHRA and EDWARD C. PRESCOTT

Chapter 15. Anomalies and Market Efficiency
G. WILLIAM SCHWERT

Chapter 16. Are Financial Assets Priced Locally or Globally?
G. ANDREW KAROLYI and RENÉ M. STULZ

Chapter 17. Microstructure and Asset Pricing
DAVID EASLEY and MAUREEN O’HARA

Chapter 18. A Survey of Behavioral Finance
NICHOLAS C. BARBERIS and RICHARD H. THALER

Finance, Optimization, and the Irreducibly Irrational Component of Human Behavior
ROBERT J. SHILLER

Chapter 19. Derivatives
ROBERT E. WHALEY

Chapter 20. Fixed Income Pricing
QIANG DAI and KENNETH J. SINGLETON
    • PREFACE Financial economics applies the techniques of economic analysis to understand the savings and investment decisions by individuals, the investment, financing and payout decisions by firms, the level and properties of interest rates and prices of financial assets and derivatives, and the economic role of financial intermediaries. Until the 1950s, finance was viewed primarily as the study of financial institutional detail and was hardly accorded the status of a mainstream field of economics. This perception was epitomized by the difficulty Harry Markowitz had in receiving a PhD degree in the economics department at the University of Chicago for work that eventually would earn him a Nobel prize in economic science. This state of affairs changed in the second half of the 20th century with a revolution that took place from the 1950s to the early 1970s. At that time, key progress was made in understanding the financial decisions of individuals and firms and their implications for the pricing of common stocks, debt, and interest rates. Harry Markowitz, William Sharpe, James Tobin, and others showed how individuals concerned about their expected future wealth and its variance make investment decisions. Their key results showing the benefits of diversification, that wealth is optimally allocated across funds that are common across individuals, and that investors are rewarded for bearing risks that are not diversifiable, are now the basis for much of the investment industry. Merton Miller and Franco Modigliani showed that the concept of arbitrage is a powerful tool to understand the implications of firm capital structures for firm value. In a world without frictions, they showed that a firm’s value is unrelated to its capital structure. Eugene Fama put forth the efficient markets hypothesis and led the way in its empirical investigation. 
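The Modigliani–Miller irrelevance result mentioned above can be written in one line; the sketch below is a standard textbook statement of Proposition I and its arbitrage logic, not taken from the Handbook's chapters.

```latex
% Modigliani--Miller Proposition I (frictionless markets, no taxes):
% a levered firm is worth exactly as much as an identical unlevered firm.
V_L \;=\; E_L + D_L \;=\; V_U
% Arbitrage sketch: if $V_L > V_U$, buy a fraction $\alpha$ of the unlevered
% firm's shares and borrow $\alpha D_L$ on personal account. The position pays
% $\alpha(X - rD_L)$, replicating levered equity, at cost $\alpha(V_U - D_L)$,
% which is less than the price $\alpha E_L = \alpha(V_L - D_L)$ of levered
% equity -- a riskless profit that competition must eliminate.
```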
Finally, Fischer Black, Robert Merton and Myron Scholes provided one of the most elegant theories in all of economics: the theory of how to price financial derivatives in markets without frictions. Following the revolution brought about by these fathers of modern finance, the field of finance has experienced tremendous progress. Along the way, it influenced public policy throughout the world in a major way, played a crucial role in the growth of a new $100 trillion derivatives industry, and affected how firms are managed everywhere. However, finance also evolved from being at best a junior partner in economics to being often a leader. Key concepts and theories first developed in finance led to progress in other fields of economics. It is now common among economists to use theories of arbitrage, rational expectations, equilibrium, agency relations, and information asymmetries that were first developed in finance. The committee for the
Alfred Nobel Memorial Prize in economic science eventually recognized this state of affairs. Markowitz, Merton, Miller, Modigliani, Scholes, Sharpe, and Tobin received Nobel prizes for contributions in financial economics.

This Handbook presents the state of the field of finance fifty years after this revolution in modern finance started. The surveys are written by leaders in financial economics. They provide a comprehensive report on developments in both theory and empirical testing in finance at a level that, while rigorous, is nevertheless accessible to researchers not intimate with the field and doctoral students in economics, finance and related fields. By summarizing the state of the art and pointing out as-yet unresolved questions, this Handbook should prove an invaluable resource to researchers planning to contribute to the field and an excellent pedagogical tool for teaching doctoral students. The book is divided into two Volumes, corresponding to the traditional taxonomy of finance: corporate finance (1A) and financial markets and asset pricing (1B).

1. Corporate finance

Corporate finance is concerned with how businesses work, in particular, how they allocate capital (traditionally, “the capital budgeting decision”) and how they obtain capital (“the financing decision”). Though managers play no independent role in the work of Miller and Modigliani, major contributions in finance since then have shown that managers maximize their own objectives. To understand the firm’s decisions, it is therefore necessary to understand the forces that lead managers to maximize the wealth of shareholders. For example, a number of researchers have emphasized the positive and negative roles of large shareholders in aligning incentives of managers and shareholders.
The part of the Handbook devoted to corporate finance starts with an overview, entitled Corporate Governance and Control, by Marco Becht, Patrick Bolton, and Ailsa Röell (Chapter 1) of the framework in which managerial activities take place. Their broad survey covers everything about corporate governance, from its history and importance to theories and empirical evidence to cross-country comparisons.

Following the survey of corporate governance in Chapter 1, two complementary essays discuss the investment decision. In Agency, Information and Corporate Investment, Jeremy Stein (Chapter 2) focuses on the effects of agency problems and asymmetric information on the allocation of capital, both across firms and within firms. This survey does not address the issue of how to value a proposed investment project, given information about the project. That topic is considered in Corporate Investment Policy by Michael Brennan in Chapter 3. Brennan draws out the implications of recent developments in asset pricing, including option pricing techniques and tax considerations, for evaluating investment projects.

In Chapter 4, Financing of Corporations, the focus moves to the financing decision. Stewart Myers provides an overview of the research that seeks to explain firms’ capital structure, that is, the types and proportions of securities firms use to finance their
investments. Myers covers the traditional theories that attempt to explain proportions of debt and equity financing as well as more recent theories that attempt to explain the characteristics of the securities issued. In assessing the different capital structure theories, he concludes that he does not expect that there will ever be “one” capital structure theory that applies to all firms. Rather, he believes that we will always use different theories to explain the behavior of different types of firms.

In Chapter 5, Investment Banking and Security Issuance, Jay Ritter is concerned with how firms raise equity and the role of investment banks in that process. He examines both initial public offerings and seasoned equity offerings. A striking result discovered first by Ritter is that firms that issue equity experience poor long-term stock returns afterwards. This result has led to a number of vigorous controversies that Ritter reviews in this chapter.

Firms may also obtain capital by issuing securities other than equity and debt. A hallmark of the last thirty years has been the tremendous amount of financial innovation that has taken place. Though some of the innovations fizzled and others provided fodder to crooks, financial innovation can enable firms to undertake profitable projects that otherwise they would not be able to undertake. In Chapter 6, Financial Innovation, Peter Tufano delves deeper into the issues of security design and financial innovation. He reviews the process of financial innovation and explanations of the quantity of innovation.

Investors do not purchase equity without expecting a return from their investment. In one of their classic papers, Miller and Modigliani show that, in the absence of frictions, dividend policy is irrelevant for firm value. Since then, a large literature has developed that identifies when dividend policy matters and when it does not.
Franklin Allen and Roni Michaely (Chapter 7) survey this literature in their essay entitled Payout Policy. Allen and Michaely consider the roles of taxes, asymmetric information, incomplete contracting and transaction costs in determining payouts to equity holders, both dividends and share repurchases. Chapter 8, Financial Intermediation, focuses more directly on the role financial intermediaries play. Although some investment is funded directly through capital markets, according to Gary Gorton and Andrew Winton, the vast majority of external investment flows through financial intermediaries. In Chapter 8, Gorton and Winton survey the literature on financial intermediation with emphasis on banking. They explore why intermediaries exist, discuss banking crises, and examine why and how they are regulated. Exchanges on which securities are traded play a crucial role in intermediating between individuals who want to buy securities and others who want to sell them. In many ways, they are special types of corporations whose workings affect the value of financial securities as well as the size of financial markets. The Handbook contains two chapters that deal with the issues of how securities are traded. Market Microstructure, by Hans Stoll (Chapter 9), focuses on how exchanges perform their functions as financial intermediaries and therefore is included in this part. Stoll examines explanations of the bid-ask spread, the empirical evidence for these explanations, and the implications for market design. Microstructure and Asset Pricing,
by Maureen O’Hara and David Easley (Chapter 17), examines the implications of how securities trade for the properties of securities returns and is included in Volume 1B on Financial Markets and Asset Pricing.

2. Financial markets and asset pricing

A central theme in finance and economics is the pursuit of an understanding of how the prices of financial securities are determined in financial markets. Currently, there is immense interest among academics, policy makers, and practitioners in whether these markets get prices right, fueled in part by the large daily volatility in prices and by the large increase in stock prices over most of the 1990s, followed by the sharp decrease in prices at the turn of the century. Our understanding of how securities are priced is far from complete. In the early 1960s, Eugene Fama from the University of Chicago established the foundations for the “efficient markets” view that financial markets are highly effective in incorporating information into asset prices. This view led to a large body of empirical and theoretical work. Some of the chapters in this part of the Handbook review that body of work, but the “efficient markets” view has been challenged by the emergence of a new, controversial field, behavioral finance, which seeks to show that psychological biases of individuals affect the pricing of securities. There is therefore divergence of opinion and critical reexamination of given doctrine. This is fertile ground for creative thinking and innovation.

In Volume 1B of the Handbook, we invite the reader to partake in this intellectual odyssey. We present eleven original essays on the economics of financial markets. The divergence of opinion and puzzles presented in these essays belie the incredible progress made by financial economists over the second half of the 20th century that laid the foundations for future research. The modern quantitative approach to finance has its origins in neoclassical economics.
In the opening essay titled Arbitrage, State Prices and Portfolio Theory (Chapter 10), Philip Dybvig and Stephen Ross illustrate a surprisingly large amount of the intuition and intellectual content of modern finance in the context of a single-period, perfect-markets neoclassical model. They discuss the fundamental theorems of asset pricing – the consequences of the absence of arbitrage, optimal portfolio choice, the properties of efficient portfolios, aggregation, the capital asset-pricing model (CAPM), mutual fund separation, and the arbitrage pricing theory (APT). A number of these notions may be traced to the original contributions of Stephen Ross.

In his essay titled Intertemporal Asset Pricing Theory (Chapter 11), Darrell Duffie provides a systematic development of the theory of intertemporal asset pricing, first in a discrete-time setting and then in a continuous-time setting. As applications of the basic theory, Duffie also presents comprehensive treatments of the term structure of interest rates and fixed-income pricing, derivative pricing, and the pricing of corporate securities with default modeled both as an endogenous and an exogenous process.
These applications are discussed in further detail in some of the subsequent essays. Duffie’s essay is comprehensive and authoritative and may serve as the basis of an entire 2nd-year PhD-level course on asset pricing.

Historically, the empirically testable implications of asset-pricing theory have been couched in terms of the mean-variance efficiency of a given portfolio, the validity of a multifactor pricing model with given factors, or the validity of a given stochastic discount factor. Furthermore, different methodologies have been developed and applied in the testing of these implications. In Tests of Multi-Factor Pricing Models, Volatility, and Portfolio Performance (Chapter 12), Wayne Ferson discusses the empirical methodologies applied in testing asset-pricing models. He points out that these three statements of the empirically testable implications are essentially equivalent and that the seemingly different empirical methodologies are equivalent as well.

In his essay titled Consumption-Based Asset Pricing (Chapter 13), John Campbell begins by reviewing the salient features of the joint behavior of equity returns, aggregate dividends, the interest rate, and aggregate consumption in the USA. Features that challenge existing asset-pricing theory include, but are not limited to, the “equity premium puzzle”: the finding that the low covariance of the growth rate of aggregate consumption with equity returns is a major stumbling block in explaining the mean aggregate equity premium and the cross-section of asset returns, in the context of the representative-consumer, time-separable-preferences models examined by Grossman and Shiller (1981), Hansen and Singleton (1982), and Mehra and Prescott (1985). Campbell also examines data from other countries to see which features of the USA data are pervasive.
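The equity premium puzzle described here can be summarized in one standard relation. The sketch below uses the usual power-utility, lognormal approximation; it is a textbook restatement rather than a result from the chapters, and the magnitudes quoted are rough orders of magnitude.

```latex
% Representative consumer with power utility and relative risk aversion gamma;
% r_e: log equity return, r_f: log riskless rate, c: aggregate consumption.
\mathbb{E}[r_{e}] - r_{f} + \tfrac{1}{2}\sigma_{e}^{2}
  \;\approx\; \gamma \,\operatorname{Cov}\!\left(r_{e},\, \Delta \ln c\right)
% In postwar US data the covariance on the right is tiny (a few tenths of a
% percent per year) while the measured premium on the left is roughly 6% per
% year, so the relation holds only for an implausibly large gamma -- this is
% the puzzle.
```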
He then proceeds to relate these findings to recent developments in asset-pricing theory that relax various assumptions of the standard asset-pricing model.

In a closely related essay titled The Equity Premium in Retrospect (Chapter 14), Rajnish Mehra and Edward Prescott – the researchers who coined the term – critically reexamine the data sources used to document the equity premium puzzle in the USA and other major industrial countries. They then proceed to relate these findings to recent developments in asset-pricing theory by employing the methodological tool of calibration, as opposed to the standard empirical estimation of model parameters and the testing of over-identifying restrictions. Mehra and Prescott have different views than Campbell as to which assumptions of the standard asset-pricing model need to be relaxed in order to address the stylized empirical findings.

Why are these questions important? First and foremost, financial markets play a central role in the allocation of investment capital and in the sharing of risk. Failure to answer these questions suggests that our understanding of the fundamental process of capital allocation is highly imperfect. Second, the basic economic paradigm employed in analyzing financial markets is closely related to the paradigm employed in the study of business cycles and growth. Failure to explain the stylized facts of financial markets calls into question the appropriateness of the related paradigms for the study of macroeconomic issues. The above two essays convey correctly the status quo that the puzzle
is at the forefront of academic interest and that views regarding its resolution are divergent.

Several goals are accomplished in William Schwert’s comprehensive and incisive essay titled Anomalies and Market Efficiency (Chapter 15). First, Schwert discusses cross-sectional and time-series regularities in asset returns, both at the aggregate and disaggregate level. These include the size, book-to-market, momentum, and dividend yield effects. Second, Schwert discusses differences in returns realized by different types of investors, including individual and institutional investors. Third, he evaluates the role of measurement issues in many of the papers that study anomalies, including the difficult issues associated with long-horizon return performance. Finally, Schwert discusses the implications of the anomalies literature for asset-pricing and corporate finance theories. In discussing the informational efficiency of the market, Schwert points out that tests of market efficiency are also joint tests of market efficiency and a particular equilibrium asset-pricing model.

In the essay titled Are Financial Assets Priced Locally or Globally? (Chapter 16), Andrew Karolyi and René Stulz discuss the theoretical implications of and empirical evidence concerning asset-pricing theory as it applies to international equities markets. They explain that country-risk premia are determined internationally, but the evidence is weak on whether international factors affect the cross-section of expected returns. A long-standing puzzle in international finance is that investors invest more heavily in domestic equities than predicted by the theory. Karolyi and Stulz argue that barriers to international investment only partly resolve the home-bias puzzle. They conclude that contagion – the linkage of international markets – may be far less prevalent than commonly assumed.
At frequencies lower than the daily frequency, asset-pricing theory generally ignores the role of the microstructure of financial markets. In their essay titled Microstructure and Asset Pricing (Chapter 17), David Easley and Maureen O’Hara survey the theoretical and empirical literature linking microstructure factors to long-run returns, and focus on why stock prices might be expected to reflect premia related to liquidity or informational asymmetries. They show that asset-pricing dynamics may be better understood by recognizing the role played by microstructure factors and the linkages of microstructure and fundamental economic variables. All the models that are discussed in the essays by Campbell, Mehra and Prescott, Schwert, Karolyi and Stulz, and Easley and O’Hara are variations of the neoclassical asset-pricing model. The model is rational, in that investors process information rationally and have unambiguously defined preferences over consumption. Naturally, the model allows for market incompleteness, market imperfections, informational asymmetries, and learning. The model also allows for differences among assets for liquidity, transaction costs, tax status, and other institutional factors. Many of these variations are explored in the above essays. In their essay titled A Survey of Behavioral Finance (Chapter 18), Nicholas Barberis and Richard Thaler provide a counterpoint to the rational model by providing explanations of the cross-sectional and time-series regularities in asset returns by
relying on economic models that are less than fully rational. These include cultural and psychological factors and tap into the rich and burgeoning literature on behavioral economics and finance. Robert Shiller, who is, along with Richard Thaler, one of the founders of behavioral finance, provides his personal perspective on behavioral finance in his statement titled Finance, Optimization, and the Irreducibly Irrational Component of Human Behavior.

One of the towering achievements in finance in the second half of the 20th century is the celebrated option-pricing theory of Black and Scholes (1973) and Merton (1973). The model has had a profound influence on the course of economic thought. In his essay titled Derivatives (Chapter 19), Robert Whaley provides comprehensive coverage of the topic. Following a historical overview of futures and options, he proceeds to derive the implications of the law of one price and then the Black–Scholes–Merton theory. He concludes with a systematic coverage of the empirical evidence and a discussion of the social costs and benefits associated with the introduction of derivatives. Whaley’s thorough and insightful essay provides an easy entry to an important topic that many economists find intimidating.

In their essay titled Fixed-Income Pricing (Chapter 20), Qiang Dai and Ken Singleton survey the literature on fixed-income pricing models, including term structure models, fixed-income derivatives, and models of defaultable securities. They point out that this literature is vast, with both the academic and practitioner communities having proposed a wide variety of models. In guiding the reader through these models, they explain that different applications call for different models based on the trade-offs of complexity, flexibility, tractability, and data availability – the “art” of modeling.
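For reference, the Black–Scholes–Merton price of a European call, the formula at the heart of the theory surveyed in Whaley's chapter, can be stated compactly in its standard form:

```latex
% European call on a non-dividend-paying stock: price S, strike K, maturity T,
% volatility sigma, constant riskless rate r; Phi is the standard normal CDF.
C(S,T) \;=\; S\,\Phi(d_{1}) \;-\; K e^{-rT}\,\Phi(d_{2}),
\qquad
d_{1} \;=\; \frac{\ln(S/K) + \left(r + \tfrac{1}{2}\sigma^{2}\right)T}{\sigma\sqrt{T}},
\qquad
d_{2} \;=\; d_{1} - \sigma\sqrt{T}.
```

Notably, the stock's expected return does not appear anywhere in the formula: the price is pinned down by arbitrage alone, which is what makes the result so powerful.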
The Dai and Singleton essay, combined with Duffie’s earlier essay, provides an insightful and authoritative introduction to the world of fixed-income pricing models at the advanced MBA and PhD levels.

We hope that the contributions represented by these essays communicate the excitement of financial economics to beginners and specialists alike and stimulate further research. We thank Rodolfo Martell for his help in processing the papers for publication.

GEORGE M. CONSTANTINIDES, University of Chicago, Chicago
MILTON HARRIS, University of Chicago, Chicago
RENÉ STULZ, Ohio State University, Columbus

References

Black, F., and M.S. Scholes (1973), “The pricing of options and corporate liabilities”, Journal of Political Economy 81:637−654.
Grossman, S.J., and R.J. Shiller (1981), “The determinants of the variability of stock market prices”, American Economic Review Papers and Proceedings 71:222−227.
Hansen, L.P., and K.J. Singleton (1982), “Generalized instrumental variables estimation of nonlinear rational expectations models”, Econometrica 50:1269−1288.
Mehra, R., and E.C. Prescott (1985), “The equity premium: a puzzle”, Journal of Monetary Economics 15:145−161.
Merton, R.C. (1973), “Theory of rational option pricing”, Bell Journal of Economics and Management Science 4:141−183.
CONTENTS OF VOLUME 1B

Introduction to the Series v
Contents of the Handbook vii
Preface ix

FINANCIAL MARKETS AND ASSET PRICING

Chapter 10. Arbitrage, State Prices and Portfolio Theory
PHILIP H. DYBVIG and STEPHEN A. ROSS 605
Abstract 606
Keywords 606
1. Introduction 607
2. Portfolio problems 607
3. Absence of arbitrage and preference-free results 612
3.1. Fundamental theorem of asset pricing 614
3.2. Pricing rule representation theorem 616
4. Various analyses: Arrow–Debreu world 618
4.1. Optimal portfolio choice 619
4.2. Efficient portfolios 619
4.3. Aggregation 620
4.4. Asset pricing 621
4.5. Payoff distribution pricing 622
5. Capital asset pricing model (CAPM) 624
6. Mutual fund separation theory 629
6.1. Preference approach 629
6.2. Beliefs 631
7. Arbitrage pricing theory (APT) 633
8. Conclusion 634
References 634

Chapter 11. Intertemporal Asset Pricing Theory
DARRELL DUFFIE 639
Abstract 641
Keywords 641
1. Introduction 642
2. Basic theory 642
2.1. Setup 643
2.2. Arbitrage, state prices, and martingales 644
2.3. Individual agent optimality 646
2.4. Habit and recursive utilities 647
2.5. Equilibrium and Pareto optimality 649
2.6. Equilibrium asset pricing 651
2.7. Breeden’s consumption-based CAPM 653
2.8. Arbitrage and martingale measures 654
2.9. Valuation of redundant securities 656
2.10. American exercise policies and valuation 657
3. Continuous-time modeling 661
3.1. Trading gains for Brownian prices 662
3.2. Martingale trading gains 663
3.3. The Black–Scholes option-pricing formula 665
3.4. Ito’s Formula 668
3.5. Arbitrage modeling 670
3.6. Numeraire invariance 670
3.7. State prices and doubling strategies 671
3.8. Equivalent martingale measures 672
3.9. Girsanov and market prices of risk 672
3.10. Black–Scholes again 676
3.11. Complete markets 677
3.12. Optimal trading and consumption 678
3.13. Martingale solution to Merton’s problem 682
4. Term-structure models 686
4.1. One-factor models 687
4.2. Term-structure derivatives 691
4.3. Fundamental solution 693
4.4. Multifactor term-structure models 695
4.5. Affine models 696
4.6. The HJM model of forward rates 699
5. Derivative pricing 702
5.1. Forward and futures prices 702
5.2. Options and stochastic volatility 705
5.3. Option valuation by transform analysis 708
6. Corporate securities 711
6.1. Endogenous default timing 712
6.2. Example: Brownian dividend growth 713
6.3. Taxes, bankruptcy costs, capital structure 717
6.4. Intensity-based modeling of default 719
6.5. Zero-recovery bond pricing 721
6.6. Pricing with recovery at default 722
6.7. Default-adjusted short rate 724
References 725

Chapter 12. Tests of Multifactor Pricing Models, Volatility Bounds and Portfolio Performance
WAYNE E. FERSON 743
Abstract 745
Keywords 745
1. Introduction 746
2. Multifactor asset-pricing models: Review and integration 748
2.1. The stochastic discount factor representation 748
2.2. Expected risk premiums 750
2.3. Return predictability 751
2.4. Consumption-based asset-pricing models 753
2.5. Multi-beta pricing models 754
2.6. Mean-variance efficiency with conditioning information 760
2.7. Choosing the factors 765
3. Modern variance bounds 768
3.1. The Hansen–Jagannathan bounds 768
3.2. Variance bounds with conditioning information 770
3.3. The Hansen–Jagannathan distance 773
4. Methodology and tests of multifactor asset-pricing models 774
4.1. The Generalized Method of Moments approach 774
4.2. Cross-sectional regression methods 775
4.3. Multivariate regression and beta-pricing models 781
5. Conditional performance evaluation 785
5.1. Stochastic discount factor formulation 787
5.2. Beta-pricing formulation 788
5.3. Using portfolio weights 790
5.4. Conditional market-timing models 792
5.5. Empirical evidence on conditional performance 793
6. Conclusions 794
References 795

Chapter 13. Consumption-Based Asset Pricing
JOHN Y. CAMPBELL 803
Abstract 804
Keywords 804
1. Introduction 805
2. International stock market data 810
3. The equity premium puzzle 816
3.1. The stochastic discount factor 816
3.2. Consumption-based asset pricing with power utility 819
3.3. The risk-free rate puzzle 824
3.4. Bond returns and the equity-premium and risk-free rate puzzles 827
3.5. Separating risk aversion and intertemporal substitution 828
4. The dynamics of asset returns and consumption 832
4.1. Time-variation in conditional expectations 832
4.2. A loglinear asset-pricing framework 836
4.3. The equity volatility puzzle 840
4.4. Implications for the equity premium puzzle 845
4.5. What does the stock market forecast? 849
4.6. Changing volatility in stock returns 857
4.7. What does the bond market forecast? 859
5. Cyclical variation in the price of risk 866
5.1. Habit formation 866
5.2. Models with heterogeneous agents 873
5.3. Irrational expectations 876
6. Some implications for macroeconomics 879
References 881

Chapter 14. The Equity Premium in Retrospect
RAJNISH MEHRA and EDWARD C. PRESCOTT 889
Abstract 890
Keywords 890
1. Introduction 891
2. The equity premium: history 891
2.1. Facts 891
2.2. Data sources 892
2.3. Estimates of the equity premium 894
2.4. Variation in the equity premium over time 897
3. Is the equity premium due to a premium for bearing non-diversifiable risk? 899
3.1. Standard preferences 902
3.2. Estimating the equity risk premium versus estimating the risk aversion parameter 912
3.3. Alternative preference structures 913
3.4. Idiosyncratic and uninsurable income risk 918
3.5. Models incorporating a disaster state and survivorship bias 920
4. Is the equity premium due to borrowing constraints, a liquidity premium or taxes? 921
4.1. Borrowing constraints 921
4.2. Liquidity premium 924
    • Contents of Volume 1B xxi 4.3. Taxes and regulation 924 5. An equity premium in the future? 927 Appendix A 928 Appendix B. The original analysis of the equity premium puzzle 930 B.1. The economy, asset prices and returns 930 References 935 Chapter 15 Anomalies and Market Efficiency G. WILLIAM SCHWERT 939 Abstract 941 Keywords 941 1. Introduction 942 2. Selected empirical regularities 943 2.1. Predictable differences in returns across assets 943 2.2. Predictable differences in returns through time 951 3. Returns to different types of investors 956 3.1. Individual investors 956 3.2. Institutional investors 958 3.3. Limits to arbitrage 961 4. Long-run returns 961 4.1. Returns to firms issuing equity 962 4.2. Returns to bidder firms 964 5. Implications for asset pricing 966 5.1. The search for risk factors 966 5.2. Conditional asset pricing 967 5.3. Excess volatility 967 5.4. The role of behavioral finance 967 6. Implications for corporate finance 968 6.1. Firm size and liquidity 968 6.2. Book-to-market effects 968 6.3. Slow reaction to corporate financial policy 969 7. Conclusions 970 References 970 Chapter 16 Are Financial Assets Priced Locally or Globally? G. ANDREW KAROLYI and REN´E M. STULZ 975 Abstract 976 Keywords 976 1. Introduction 977 2. The perfect financial markets model 978 2.1. Identical consumption-opportunity sets across countries 979
    • xxii Contents of Volume 1B 2.2. Different consumption-opportunity sets across countries 982 2.3. A general approach 988 2.4. Empirical evidence on asset pricing using perfect market models 992 3. Home bias 997 4. Flows, spillovers, and contagion 1004 4.1. Flows and returns 1007 4.2. Correlations, spillovers, and contagion 1010 5. Conclusion 1014 References 1014 Chapter 17 Microstructure and Asset Pricing DAVID EASLEY and MAUREEN O’HARA 1021 Abstract 1022 Keywords 1022 1. Introduction 1023 2. Equilibrium asset pricing 1024 3. Asset pricing in the short-run 1025 3.1. The mechanics of pricing behavior 1026 3.2. The adjustment of prices to information 1029 3.3. Statistical and structural models of microstructure data 1031 3.4. Volume and price movements 1033 4. Asset pricing in the long-run 1035 4.1. Liquidity 1036 4.2. Information 1041 5. Linking microstructure and asset pricing: puzzles for researchers 1044 References 1047 Chapter 18 A Survey of Behavioral Finance NICHOLAS BARBERIS and RICHARD THALER 1053 Abstract 1054 Keywords 1054 1. Introduction 1055 2. Limits to arbitrage 1056 2.1. Market efficiency 1056 2.2. Theory 1058 2.3. Evidence 1061 3. Psychology 1065 3.1. Beliefs 1065 3.2. Preferences 1069 4. Application: The aggregate stock market 1075 4.1. The equity premium puzzle 1078
    • Contents of Volume 1B xxiii 4.2. The volatility puzzle 1083 5. Application: The cross-section of average returns 1087 5.1. Belief-based models 1092 5.2. Belief-based models with institutional frictions 1095 5.3. Preferences 1097 6. Application: Closed-end funds and comovement 1098 6.1. Closed-end funds 1098 6.2. Comovement 1099 7. Application: Investor behavior 1101 7.1. Insufficient diversification 1101 7.2. Naive diversification 1103 7.3. Excessive trading 1103 7.4. The selling decision 1104 7.5. The buying decision 1105 8. Application: Corporate finance 1106 8.1. Security issuance, capital structure and investment 1106 8.2. Dividends 1109 8.3. Models of managerial irrationality 1111 9. Conclusion 1113 Appendix A 1115 References 1116 Finance, Optimization, and the Irreducibly Irrational Component of Human Behavior ROBERT J. SHILLER 1125 Chapter 19 Derivatives ROBERT E. WHALEY 1129 Abstract 1131 Keywords 1131 1. Introduction 1132 2. Background 1133 3. No-arbitrage pricing relations 1139 3.1. Carrying costs 1140 3.2. Valuing forward/futures using the no-arbitrage principle 1141 3.3. Valuing options using the no-arbitrage principle 1143 4. Option valuation 1148 4.1. The Black–Scholes/Merton option valuation theory 1149 4.2. Analytical formulas 1151 4.3. Approximation methods 1157 4.4. Generalizations 1164
    • xxiv Contents of Volume 1B 5. Studies of no-arbitrage price relations 1166 5.1. Forward/futures prices 1167 5.2. Option prices 1169 5.3. Summary and analysis 1173 6. Studies of option valuation models 1173 6.1. Pricing errors/implied volatility anomalies 1174 6.2. Trading simulations 1176 6.3. Informational content of implied volatility 1179 6.4. Summary and analysis 1181 7. Social costs/benefits of derivatives trading 1189 7.1. Contract introductions 1189 7.2. Contract expirations 1193 7.3. Market synchronization 1194 7.4. Summary and analysis 1197 8. Summary 1198 References 1199 Chapter 20 Fixed-Income Pricing QIANG DAI and KENNETH J. SINGLETON 1207 Abstract 1208 Keywords 1208 1. Introduction 1209 2. Fixed-income pricing in a diffusion setting 1210 2.1. The term structure 1210 2.2. Fixed-income securities with deterministic payoffs 1211 2.3. Fixed-income securities with state-dependent payoffs 1212 2.4. Fixed-income securities with stopping times 1213 3. Dynamic term-structure models for default-free bonds 1215 3.1. One-factor dynamic term-structure models 1215 3.2. Multi-factor dynamic term-structure models 1218 4. Dynamic term-structure models with jump diffusions 1222 5. Dynamic term-structure models with regime shifts 1223 6. Dynamic term-structure models with rating migrations 1225 6.1. Fractional recovery of market value 1225 6.2. Fractional recovery of par, payable at maturity 1228 6.3. Fractional recovery of par, payable at default 1229 6.4. Pricing defaultable coupon bonds 1229 6.5. Pricing Eurodollar swaps 1230 7. Pricing of fixed-income derivatives 1231 7.1. Derivatives pricing using dynamic term-structure models 1231 7.2. Derivatives pricing using forward-rate models 1232 7.3. Defaultable forward-rate models with rating migrations 1234
    • Contents of Volume 1B xxv 7.4. The LIBOR market model 1237 7.5. The swaption market model 1241 References 1242 Subject Index I-1
Chapter 10

ARBITRAGE, STATE PRICES AND PORTFOLIO THEORY

PHILIP H. DYBVIG
Washington University in Saint Louis

STEPHEN A. ROSS
MIT

Contents
   Abstract
   Keywords
   1. Introduction
   2. Portfolio problems
   3. Absence of arbitrage and preference-free results
      3.1. Fundamental theorem of asset pricing
      3.2. Pricing rule representation theorem
   4. Various analyses: Arrow–Debreu world
      4.1. Optimal portfolio choice
      4.2. Efficient portfolios
      4.3. Aggregation
      4.4. Asset pricing
      4.5. Payoff distribution pricing
   5. Capital asset pricing model (CAPM)
   6. Mutual fund separation theory
      6.1. Preference approach
      6.2. Beliefs
   7. Arbitrage pricing theory (APT)
   8. Conclusion
   References

Handbook of the Economics of Finance, Edited by G.M. Constantinides, M. Harris and R. Stulz
© 2003 Elsevier B.V. All rights reserved
Abstract

Neoclassical financial models provide the foundation for our understanding of finance. This chapter introduces the main ideas of neoclassical finance in a single-period context that avoids the technical difficulties of continuous-time models, but preserves the principal intuitions of the subject. The starting point of the analysis is the formulation of standard portfolio choice problems.

A central conceptual result is the Fundamental Theorem of Asset Pricing, which asserts the equivalence of absence of arbitrage, the existence of a positive linear pricing rule, and the existence of an optimum for some agent who prefers more to less. A related conceptual result is the Pricing Rule Representation Theorem, which asserts that a positive linear pricing rule can be represented using state prices, risk-neutral expectations, or a state-price density. Different equivalent representations are useful in different contexts.

Many applied results can be derived from the first-order conditions of the portfolio choice problem. The first-order conditions say that marginal utility in each state is proportional to a consistent state-price density, where the constant of proportionality is determined by the budget constraint. If markets are complete, the implicit state-price density is uniquely determined by investment opportunities and must be the same as viewed by all agents, thus simplifying the choice problem. Solving the first-order conditions for quantities gives us optimal portfolio choice, solving them for prices gives us asset pricing models, solving them for utilities gives us preferences, and solving them for probabilities gives us beliefs. We look at two popular asset pricing models, the CAPM and the APT, as well as complete-markets pricing. In the case of the CAPM, the first-order conditions link nicely to the traditional measures of portfolio performance.
Further conceptual results include aggregation and mutual fund separation theory, both of which are useful for understanding equilibrium and asset pricing.

Keywords

arbitrage, arbitrage pricing theory, investments, portfolio choice, asset pricing, complete markets, mean-variance analysis, performance measurement, mutual fund separation, aggregation

JEL classification: G11, G12
1. Introduction

The modern quantitative approach to finance has its original roots in neoclassical economics. Neoclassical economics studies an idealized world in which markets work smoothly without impediments such as transaction costs, taxes, asymmetry of information, or indivisibilities. This chapter considers what we learn from single-period neoclassical models in finance. While dynamic models are becoming more and more common, single-period models contain a surprisingly large amount of the intuition and intellectual content of modern finance, and are also commonly used by investment practitioners for the construction of optimal portfolios and communication of investment results. Focusing on a single period is also consistent with an important theme. While general equilibrium theory seeks great generality and abstraction, finance has work to be done and seeks specific models with strong assumptions and definite implications that can be tested and implemented in practice.

2. Portfolio problems

In our analysis, there are two points of time, 0 and 1, with an interval of time in between during which nothing happens. At time zero, our champion (the agent) is making decisions that will affect the allocation of consumption between nonrandom consumption, $c_0$, at time 0, and random consumption $\{c_w\}$ across states $w = 1, 2, \ldots, W$ revealed at time 1. At time 0 and in each state at time 1, there is a single consumption good, and therefore consumption at time 0 or in a state at time 1 is a real number. This abstraction of a single good is obviously not "true" in any literal sense, but this is not a problem, and indeed any useful theoretical model is much simpler than reality. The abstraction does, however, face us with the question of how to interpret our simple model (in this case with a single good) in a practical context that is more complex (has multiple goods).
In using a single-good model, there are two usual practices: either use nominal values and measure consumption in dollars, or use real values and measure consumption in inflation-adjusted dollars. Depending on the context, one or the other can make the most sense.

Following the usual practice from general equilibrium theory of thinking of units of consumption at various times and in different states of nature as different goods, a typical consumption vector is $C \equiv \{c_0, c_1, \ldots, c_W\}$, where the real number $c_0$ denotes consumption of the single good at time zero, and the vector $c \equiv \{c_1, \ldots, c_W\}$ of real numbers $c_1, \ldots, c_W$ denotes random consumption of the single good in each state $1, \ldots, W$ at time 1. If this were a typical exercise in general equilibrium theory, we would have a price vector for consumption across goods. For example, we might have the following choice problem, which is named after two great pioneers of general equilibrium theory, Kenneth Arrow and Gerard Debreu:
Problem 1: Arrow–Debreu Problem.
Choose consumptions $C \equiv \{c_0, c_1, \ldots, c_W\}$ to maximize utility of consumption $U(C)$ subject to the budget constraint
$$c_0 + \sum_{w=1}^{W} p_w c_w = W. \qquad (1)$$

Here, $U(\cdot)$ is the utility function that represents preferences, $p$ is the price vector, and $W$ is wealth, which might be replaced by the market value of an endowment. We are taking consumption at time 0 to be the numeraire, and $p_w$ is the price of the Arrow–Debreu security which is a claim to one unit of consumption at time 1 in state $w$.

The first-order condition for Problem 1 is the existence of a positive Lagrange multiplier $\lambda$ (the marginal utility of wealth) such that $U_0(c_0) = \lambda$, and for all $w = 1, \ldots, W$, $U_w(c_w) = \lambda p_w$, where $U_w$ denotes the partial derivative of $U$ with respect to $c_w$. This is the usual result from neoclassical economics that the gradient of the utility function is proportional to prices. Specializing to the leading case in finance of time-separable von Neumann–Morgenstern preferences, named after John von Neumann and Oskar Morgenstern (1944), two great pioneers of utility theory, we have that
$$U(C) = v(c_0) + \sum_{w=1}^{W} \pi_w u(c_w).$$
We will take $v$ and $u$ to be differentiable, strictly increasing (more is preferred to less), and strictly concave (risk averse). Here, $\pi_w$ is the probability of state $w$. In this case, the first-order condition is the existence of $\lambda$ such that
$$v'(c_0) = \lambda, \qquad (2)$$
and for all $w = 1, 2, \ldots, W$,
$$\pi_w u'(c_w) = \lambda p_w, \qquad (3)$$
or equivalently
$$u'(c_w) = \lambda \rho_w, \qquad (4)$$
where $\rho_w \equiv p_w / \pi_w$ is the state-price density (also called the stochastic discount factor or pricing kernel), which is a measure of priced relative scarcity in state of nature $w$. Therefore, the marginal utility of consumption in a state is proportional to the relative scarcity.
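For intuition, the first-order conditions solve in closed form for log utility. The sketch below uses made-up probabilities, state prices, and wealth, not numbers from the chapter: with $v = u = \ln$, conditions (2)–(4) give $c_0 = 1/\lambda$ and $c_w = \pi_w/(\lambda p_w)$, and substituting into the budget constraint (1) yields $\lambda = 2/W$.

```python
import numpy as np

# Closed-form solution of Problem 1 for log utility (illustrative numbers only).
# FOCs: v'(c0) = 1/c0 = lam  and  pi_w * u'(c_w) = pi_w / c_w = lam * p_w,
# so c0 = 1/lam and c_w = pi_w / (lam * p_w); budget (1) then gives lam = 2/W.
pi = np.array([0.5, 0.3, 0.2])   # state probabilities (assumed)
p = np.array([0.4, 0.35, 0.25])  # Arrow-Debreu state prices (assumed)
W = 10.0                         # wealth

lam = 2.0 / W                    # marginal utility of wealth
c0 = 1.0 / lam
c = pi / (lam * p)

rho = p / pi                     # state-price density rho_w = p_w / pi_w
assert np.isclose(c0 + p @ c, W)          # budget constraint (1) holds
assert np.allclose(1.0 / c, lam * rho)    # FOC (4): u'(c_w) = lam * rho_w
```

Note how consumption is high in states whose state-price density is low (cheap states) and low in expensive states, which is the "priced relative scarcity" interpretation above.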
There is a solution if the problem is feasible, prices and probabilities are positive, the von Neumann–Morgenstern utility function is increasing and strictly concave, and the Inada condition $\lim_{c \uparrow \infty} u'(c) = 0$ is satisfied (proving the existence of a solution requires more assumptions in continuous-state models). There are
different motivations of von Neumann–Morgenstern preferences in the literature, and the probabilities may be objective or subjective. What is important for us is that the von Neumann–Morgenstern utility function represents preferences in the sense that expected utility is higher for more preferred consumption patterns.² Using von Neumann–Morgenstern preferences has been popular in part because of axiomatic derivations of the theory [see, for example, Herstein and Milnor (1953) or Luce and Raiffa (1957, Chapter 2)]. There is also a large literature on alternatives and extensions to von Neumann–Morgenstern preferences. For single-period models, see Knight (1921), Bewley (1988), Machina (1982), Blume, Brandenburger and Dekel (1991) and Fishburn (1988). There is an even richer set of models in multiple periods, for example, time-separable von Neumann–Morgenstern (the traditional standard), habit formation [e.g., Duesenberry (1949), Pollak (1970), Abel (1990), Constantinides (1991) and Dybvig (1995)], local substitutability over time [Hindy and Huang (1992)], interpersonal dependence [Duesenberry (1949) and Abel (1990)], preference for resolution of uncertainty [Kreps and Porteus (1978)], time preference dependent on consumption [Bergman (1985)], and general recursive utility [Epstein and Zin (1989)].

Recently, there have also been some attempts to revive the age-old idea of studying financial situations using psychological theories [like prospect theory, Kahneman and Tversky (1979)]. Unfortunately, these models do not translate well to financial markets. For example, framing matters in prospect theory: an agent may make different decisions when facing identical decision problems that are merely described differently.
However, this is an alien concept for financial economists, and when they proxy for it in models they substitute something more familiar [for example, some history dependence, as in Barberis, Huang and Santos (2001)]. Another problem with the psychological theories is that they tend to be isolated stories rather than a general specification, and they are often hard to generalize. For example, prospect theory says that agents put extra weight on very unlikely outcomes, but it is not at all clear what this means in a model with a continuum of states. This literature also has problems with ex post explanations (positive correlations of returns are underreaction, negative correlations are overreaction) and with a lack of clarity about how much is going on that cannot be explained by traditional models (and much of it can).

[Footnote 2: Later, when we look at multiple-agent results, we will also make the neoclassical assumption of identical beliefs, which is probably most naturally motivated by common objective beliefs.]

In actual financial markets, Arrow–Debreu securities do not trade directly, even if they can be constructed indirectly using a portfolio of securities. A security is characterized by its cash flows. This description would not be adequate for analysis of taxes, since different sources of cash flow might have very different tax treatment, but we are looking at models without taxes. For an asset like a common stock or a bond, the cash flow might be negative at time 0, from payment of the price, and positive or zero in each state at time 1, the positive amount coming from any repayment of
principal, dividends, coupons, or proceeds from sale of the asset. For a futures contract, the cash flow would be 0 at time 0, and the cash flow in different states at time 1 could be positive, negative, or zero, depending on news about the value of the underlying commodity. In general, we think of the negative of the initial cash flow as the price of a security. We denote by $P = \{P_1, \ldots, P_N\}$ the vector of prices of the $N$ securities $1, \ldots, N$, and we denote by $X$ the payoff matrix. We have that $P_n$ is the price we pay for one unit of security $n$, and $X_{wn}$ is the payoff per unit of security $n$ at time 1 in state of nature $w$. With the choice of a portfolio of assets, our choice problem might become:

Problem 2: First Portfolio Choice Problem.
Choose portfolio holdings $Q \equiv \{Q_1, \ldots, Q_N\}$ and consumptions $C \equiv \{c_0, \ldots, c_W\}$ to maximize utility of consumption $U(C)$ subject to the portfolio payoffs $c \equiv \{c_1, \ldots, c_W\} = XQ$ and the budget constraint $c_0 + P^\top Q = W$.

Here, $Q$ is the vector of portfolio weights. Time 0 consumption is the numeraire, and wealth $W$ is now measured in time 0 consumption units, with the entire endowment received at time 0. In the budget constraint, the term $P^\top Q$ is the cost of the portfolio holding, which is the sum across securities $n$ of the price $P_n$ times the number of shares or other units $Q_n$. The matrix product $XQ$ says that the consumption in state $w$ is $c_w = \sum_n X_{wn} Q_n$, i.e., the sum across securities $n$ of the payoff $X_{wn}$ of security $n$ in state $w$, times the number of shares or other units $Q_n$ of security $n$ our champion is holding.

The first-order condition for Problem 2 is the existence of a vector of shadow prices $p$ and a Lagrange multiplier $\lambda$ such that
$$\pi_w u'(c_w) = \lambda p_w, \qquad (5)$$
where
$$P^\top = p^\top X. \qquad (6)$$
The first equation is the same as in the Arrow–Debreu model, with an implicit shadow price vector in place of the given Arrow–Debreu prices.
The second equation is a pricing equation that says the prices of all assets must be consistent with the shadow prices of the states. For the Arrow–Debreu model itself, the state-space tableau $X$ is $I$, the identity matrix, and the price vector $P$ is $p$, the vector of Arrow–Debreu state prices; in that case the pricing equation determines the shadow prices as equal to the state prices.

Even if the assets are not the Arrow–Debreu securities, Problem 2 may be essentially equivalent to the Arrow–Debreu model in Problem 1. In economic terms, the important feature of the Arrow–Debreu problem is that all payoff patterns are spanned, i.e., each potential payoff pattern can be generated at some price by some portfolio of assets. Linear algebra tells us that all payoff patterns can be generated if the payoff matrix $X$
has full row rank. If $X$ has full row rank, $p$ is determined (or over-determined) by Equation (6). If $p$ is uniquely determined by the pricing equation (and therefore also all Arrow–Debreu assets can be purchased as portfolios of assets in the economy), we say that markets are complete, and for all practical purposes we are in an Arrow–Debreu world. For the choice problem to have a solution for any agent who prefers more to less, we also need the price of each payoff pattern to be unique (the "law of one price") and positive, or else there would be an arbitrage (i.e., a "money pump" or a "free lunch"). If there is no arbitrage, then there is at least one vector of positive state prices $p$ solving the pricing equation (6). There is an arbitrage if the vector of state prices is over-determined, or if all consistent vectors of state prices assign a negative or zero price to some state. The notion of absence of arbitrage is a central concept in finance, and we develop its implications more fully in the section on preference-free results.

So far, we have been stating portfolio problems in prices and quantities, as we would in general equilibrium theory. However, it is also common to describe assets in terms of rates of return, which are relative price changes (often expressed as percentages). The return to security $n$ is the relative change in its total value, including any dividends, splits, warrant issues, coupons, stock issues, and the like, as well as the change in the price. There is no absolute standard of what is meant by "return": in different contexts this can be the rate of return, one plus the rate of return, or the difference between two rates of return. It is necessary to figure out which is intended by asking or from context.
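The pricing equation (6) and the completeness condition can be made concrete in a small numerical sketch. The market below, with made-up prices and payoffs (a riskless bond, a stock, and a claim paying only in the first state), has a payoff matrix of full row rank, so solving the pricing equation recovers a unique state-price vector, which here turns out to be strictly positive:

```python
import numpy as np

# Hypothetical complete market: rows of X are states, columns are securities
# (a riskless bond, a stock, and a claim that pays only in state 1).
X = np.array([[1.0, 2.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.5, 0.0]])          # X[w, n]: payoff of security n in state w
P = np.array([0.95, 1.10, 0.30])        # observed time-0 prices (assumed)

# Completeness: every payoff pattern is spanned iff X has full row rank.
assert np.linalg.matrix_rank(X) == X.shape[0]

# Equation (6) reads P_n = sum_w p_w * X[w, n]; solve the linear system for p.
p = np.linalg.solve(X.T, P)             # here p works out to [0.3, 0.35, 0.3]

# A strictly positive solution is a consistent positive linear pricing rule,
# so these prices admit no arbitrage.
assert (p > 0).all()
```

If the solved state-price vector had a zero or negative component, or if the system had no solution at all (a law-of-one-price violation), the quoted prices would admit an arbitrage, which is the case developed in the next section.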
Using the notation above, the rate of return on security $n$ in state $w$ is $r_{wn} = (X_{wn} - P_n)/P_n$.³ Often, consumption at the outset is suppressed, and we specialize to von Neumann–Morgenstern expected utility. In this case, we have the following common form of portfolio problem.

Problem 3: Portfolio Problem using Returns.
Choose portfolio proportions $q \equiv \{q_1, \ldots, q_N\}$ and consumptions $c \equiv \{c_1, \ldots, c_W\}$ to maximize expected utility of consumption $\sum_{w=1}^{W} \pi_w u(c_w)$ subject to the consumption equation $c = W(1 + r)q$ (where $1 + r$ denotes the $W \times N$ matrix of gross returns $1 + r_{wn}$) and the budget constraint $q^\top \mathbf{1} = 1$.

Here, $\pi = \{\pi_1, \ldots, \pi_W\}$ is the vector of state probabilities, $u(\cdot)$ is the von Neumann–Morgenstern utility function, and $\mathbf{1}$ is a vector of 1's. The dimensionality of $\mathbf{1}$ is determined implicitly from the context; here it is the number of assets. The first-order condition for an optimum is the existence of a shadow state-price density vector $\rho$ and a shadow marginal utility of wealth $\lambda$ such that
$$u'(c_w) = \lambda \rho_w \qquad (7)$$

[Footnote 3: One unfortunate thing about returns is that they are not defined for contracts (like futures) that have zero price. However, this can be finessed formally by bundling a futures contract with a bond or other asset when defining the securities, and unbundling them when interpreting the results. Bundling and unbundling do not change the underlying economics, due to the linearity of consumptions and constraints in the portfolio choice problem.]
and
$$\mathbf{1} = E[(1 + r)\rho]. \qquad (8)$$
These equations say that the state-price density is consistent with the marginal valuation by the agent and with pricing in the market.

As our final typical problem, let us consider a mean-variance optimization. This optimization is predicated on the assumption that investors care only about mean and variance (typically preferring more mean and less variance), so we have a utility function $V(m, v)$ in mean $m$ and variance $v$. For this problem, suppose there is a risk-free asset paying a return $r$ (although the market-level implications of mean-variance analysis can also be derived in a general model without a risk-free asset). In this case, portfolio proportions in the risky assets are unconstrained (need not sum to 1) because the slack can be taken up by the risk-free asset. We denote by $\mu$ the vector of mean risky asset returns and by $\Sigma$ the covariance matrix of risky returns. Then our champion solves the following choice problem.

Problem 4: Mean-variance optimization.
Choose portfolio proportions $q \equiv \{q_1, \ldots, q_N\}$ to maximize the mean-variance utility function $V(r + (\mu - r\mathbf{1})^\top q,\ q^\top \Sigma q)$.

The first-order condition for the problem is
$$\mu - r\mathbf{1} = \lambda \Sigma q, \qquad (9)$$
where $q$ is the optimal vector of portfolio proportions and $\lambda$ is twice the marginal rate of substitution $-V_v(m, v)/V_m(m, v)$, evaluated at $m = r + (\mu - r\mathbf{1})^\top q$ and $v = q^\top \Sigma q$. The first-order condition (9) says that the mean excess return of each asset is proportional to that asset's marginal contribution to the variance of the agent's optimal portfolio.

We have seen a few of the typical types of portfolio problems. There are many variations. The problem might be stated in terms of excess returns (rate of return less a risk-free rate) or total return (one plus the rate of return). Or, we might constrain portfolio holdings to be positive (no short sales), or we might require consumption to be nonnegative (limited liability).
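The first-order condition (9) can be checked numerically. The sketch below uses made-up values for $\mu$, $\Sigma$, $r$, and $\lambda$ (none are from the chapter); solving (9) for the optimal proportions gives $q = \Sigma^{-1}(\mu - r\mathbf{1})/\lambda$:

```python
import numpy as np

# Problem 4 via its first-order condition (9): mu - r*1 = lam * Sigma @ q,
# so the optimal risky holdings are q = Sigma^{-1}(mu - r*1) / lam.
# All inputs below are illustrative assumptions.
r = 0.02                               # risk-free rate
mu = np.array([0.08, 0.05])            # mean risky returns (assumed)
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.02]])       # covariance of risky returns (assumed)
lam = 2.5                              # assumed value of the multiplier

q = np.linalg.solve(Sigma, mu - r) / lam

# Verify the first-order condition (9), then compute the mean and variance
# that enter the agent's utility V(m, v).
assert np.allclose(mu - r, lam * (Sigma @ q))
mean = r + (mu - r) @ q                # m = r + (mu - r*1)'q
var = q @ Sigma @ q                    # v = q'Sigma q
```

Note that $\lambda$ is the only agent-specific input: every agent holds the risky assets in the same proportions $\Sigma^{-1}(\mu - r\mathbf{1})$, scaled up or down, which foreshadows the mutual fund separation and CAPM results later in the chapter.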
Many other variations adapt the basic portfolio problem to handle institutional features not present in a neoclassical formulation, such as transaction costs, bid–ask spreads, or taxes. These extensions are very interesting, but beyond the scope of what we are doing here, which is to explore the neoclassical foundations.

3. Absence of arbitrage and preference-free results

Before considering specific solutions and applications, let us consider some general results that are useful for thinking about portfolio choice. These results are preference-free in the sense that they do not depend on any specific assumptions about preferences
but only depend on an assumption that agents prefer more to less. Central to this section is the notion of an arbitrage, which is a "money pump" or a "free lunch". If there is arbitrage, linearity of the neoclassical problem implies that any candidate optimum can be dominated by adding the arbitrage. As a result, no agent who prefers more to less would have an optimum if there exists arbitrage. Furthermore, this seemingly weak assumption is enough to obtain two useful theorems. The Fundamental Theorem of Asset Pricing says that the following are equivalent: absence of arbitrage, existence of a consistent positive linear pricing rule, and existence of an optimum for some hypothetical agent who prefers more to less. The Pricing Rule Representation Theorem gives different equivalent forms for the consistent positive linear pricing rule, using state prices, risk-neutral probabilities (martingale valuation), a state-price density (or stochastic discount factor or pricing kernel), or an abstract positive linear operator. The results in this section are from Cox and Ross (1975), Ross (1976c, 1978b) and Dybvig and Ross (1987). The results have been formalized in continuous time by Harrison and Kreps (1979) and Harrison and Pliska (1981).

Occasionally, the theorems in this section can be applied directly to obtain an interesting result. For example, linearity of the pricing rule is enough to derive put-call parity without constructing the arbitrage. More often, the results in this section help to answer conceptual questions. For example, an option pricing formula that is derived using absence of arbitrage is always consistent with equilibrium, as can be seen from the Fundamental Theorem.
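The put-call parity example can be checked directly: for any positive linear pricing rule, the state-by-state payoff identity $\max(S-K,0) - \max(K-S,0) = S - K$ forces the parity relation, whatever the state prices are. A small numerical sketch with hypothetical payoffs and arbitrary positive state prices:

```python
import numpy as np

# Put-call parity from linearity alone: for ANY positive state-price vector p,
# price(call) - price(put) = price(stock) - K * price(bond), because
# max(S-K,0) - max(K-S,0) = S - K holds in every state.
# The numbers below are illustrative, not from the chapter.
S = np.array([0.5, 1.0, 1.5, 2.0])   # stock payoff across four states (assumed)
K = 1.2                              # strike (assumed)
bond = np.ones_like(S)               # riskless bond paying 1 in every state

rng = np.random.default_rng(0)
p = rng.uniform(0.1, 0.4, size=4)    # arbitrary positive state prices

price = lambda x: p @ x              # the linear pricing rule, price = p'x
call = np.maximum(S - K, 0.0)
put = np.maximum(K - S, 0.0)

assert np.isclose(price(call) - price(put), price(S) - K * price(bond))
```

Because the check holds for any draw of positive state prices, the relation is preference-free in exactly the sense described above: it uses only linearity and positivity of the pricing rule.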
By the Fundamental Theorem, absence of arbitrage implies there is an optimum for some hypothetical agent who prefers more to less; we can therefore construct an equilibrium in the single-agent pure exchange economy in which this agent is endowed with the optimal holding. By construction, the equilibrium in this economy will have the desired pricing, and therefore any no-arbitrage pricing result is consistent with some equilibrium.

In this section, we will work in the context of Problem 2. An arbitrage is a change in the portfolio that makes all agents who prefer more to less better off. We make all such agents better off if we increase consumption at some time and in some state of nature, and we never decrease consumption. By combining the two constraints in Problem 2, we can write the consumption $C$ associated with any portfolio choice $Q$ using the stacked matrix equation
$$C = \begin{pmatrix} W \\ 0 \end{pmatrix} + \begin{pmatrix} -P^\top \\ X \end{pmatrix} Q.$$
The first row, $W - P^\top Q$, is consumption at time 0, which is wealth $W$ less the cost of our portfolio. The remaining rows, $XQ$, give the random consumption across states at time 1. Now, when we move from the portfolio choice $Q$ to the portfolio choice $Q + h$, the initial wealth term cancels and the change in consumption can be written as
$$\Delta C = \begin{pmatrix} -P^\top \\ X \end{pmatrix} h.$$
614 P.H. Dybvig and S.A. Ross

This will be an arbitrage if ΔC is never negative and is positive in at least one component, which we will write as⁴

ΔC > 0, i.e., [−P′; X] h > 0.

Some authors describe taxonomies of different types of arbitrage: perhaps a negative price today and zero payoff tomorrow, a zero price today and a nonnegative but not identically zero payoff tomorrow, or a negative price today and a positive payoff tomorrow. These are all examples of arbitrages that are subsumed by our general formula. The important thing is that there is an increase in consumption in some state of nature at some point of time and there is never any decrease in consumption.

3.1. Fundamental theorem of asset pricing

Theorem 1: Fundamental Theorem of Asset Pricing. The following conditions on prices P and payoffs X are equivalent:
(i) Absence of arbitrage: (∄h)([−P′; X] h > 0).
(ii) Existence of a consistent positive linear pricing rule (positive state prices): (∃p ≫ 0)(P′ = p′X).
(iii) Some agent with strictly increasing preferences U has an optimum in Problem 2.

Proof: We prove the equivalence by showing (i) ⇒ (ii), (ii) ⇒ (iii), and (iii) ⇒ (i).

(i) ⇒ (ii): This is the most subtle part, and it follows from a separation theorem or the duality theorem of linear programming. From the definition of absence of arbitrage, we have that the sets S1 ≡ {[−P′; X] h | h ∈ Rⁿ} and S2 ≡ {x ∈ R^{W+1} | x > 0} must be disjoint. Therefore, there is a separating hyperplane z such that z′x = 0 for all x ∈ S1 and z′x > 0 for all x ∈ S2. [See Karlin (1959), Theorem B3.5.] Normalizing so that the first component (the shadow price of time-0 consumption) is 1, we will see that p defined by (1, p′)′ = z/z₀ is the consistent linear pricing rule we seek. Constancy

⁴ We use the following terminology for vector inequalities: (x ≥ y) ⇔ (∀i)(xᵢ ≥ yᵢ); (x > y) ⇔ ((x ≥ y) & (∃i)(xᵢ > yᵢ)); and (x ≫ y) ⇔ (∀i)(xᵢ > yᵢ).
of z′x for x ∈ S1 implies that (1, p′)[−P′; X] = 0′, which is to say that P′ = p′X, i.e., p is a consistent linear pricing rule. Furthermore, z′x positive for x ∈ S2 implies z ≫ 0 and consequently p ≫ 0, and p is indeed the desired consistent positive linear pricing rule.

(ii) ⇒ (iii): This part is proven by construction. Let U(C) = (1, p′) C; then Q = 0 solves Problem 2. To see this, note that the objective function U(C) is constant and equal to W for all Q:

U(C) = (1, p′) C = (1, p′)([W; 0] + [−P′; X] Q) = W + (−P′ + p′X) Q = W.

(The motivation of this construction is the observation that the existence of the consistent linear pricing rule with state prices p implies that all feasible consumptions satisfy (1, p′) C = W.)

(iii) ⇒ (i): This part is obvious, since any candidate optimum is dominated by adding an arbitrage, and therefore there can be no arbitrage if there is an optimum. More formally, adding an arbitrage implies a change of consumption ΔC > 0, which implies an increase in U(C).

One feature of the proof that may seem strange is the degeneracy (linearity) of the utility function whose existence is constructed. This was all that was needed for this proof, but it could also be constructed to be strictly concave, additively separable over time, and of the von Neumann–Morgenstern class for given probabilities. Assuming any of these restrictions on the class would make some parts of the theorem weaker [(iii) implies (i) and (ii)] at the same time that it makes other parts stronger [(i) or (ii) implies (iii)]. The point is that the theorem is still true if (iii) is replaced by a much more restrictive class that imposes on U any or all of strict concavity, some order of differentiability, additive separability over time, and a von Neumann–Morgenstern form with or without specifying the probabilities in advance.
All of these classes are restrictive enough to rule out arbitrage, and general enough to contain a utility function that admits an optimum when there is no arbitrage.

The statement and proof of the theorem are a little more subtle if the state space is infinite-dimensional. The separation theorem is topological in nature, so we must restrict our attention to a topologically relevant subset of the nonnegative random variables. Also, we may lose the separating hyperplane theorem because the interior of the positive orthant is empty in most of these spaces (unless we use the sup-norm topology, in which case the dual is very large and includes dual vectors that do not support state prices). However, with some definition of arbitrage in limits, the economic content of the Fundamental Theorem can be maintained.
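In a finite, complete market the Fundamental Theorem can be checked constructively: solve P′ = p′X for the state prices and test positivity. A minimal numerical sketch with an invented two-state, two-asset market (not from the chapter):

```python
import numpy as np

# Hypothetical two-state, two-asset market (numbers invented for illustration).
# Rows of X are states, columns are assets: a riskless bond and a stock.
P = np.array([0.95, 1.00])            # time-0 asset prices
X = np.array([[1.0, 1.3],             # payoffs in the "up" state
              [1.0, 0.8]])            # payoffs in the "down" state

# A consistent linear pricing rule requires P' = p'X, i.e. P = X'p, for some
# state-price vector p; with a square, invertible X the solution is unique.
p = np.linalg.solve(X.T, P)

# The Fundamental Theorem: no arbitrage iff p is strictly positive.
no_arbitrage = bool(np.all(p > 0))
print(p, no_arbitrage)
```

Here p = (0.48, 0.47) ≫ 0, so no portfolio h can generate a nonnegative, nonzero consumption change. With incomplete markets (rectangular X) the same test becomes a linear-programming feasibility problem rather than a linear solve.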
3.2. Pricing rule representation theorem

Depending on the context, there are different useful ways of representing the pricing rule. For some abstract applications (like proving put–call parity), it is easiest to use a general abstract representation as a linear operator L(c) such that c > 0 ⇒ L(c) > 0. For asset pricing applications, it is often useful to use either the state-price representation we used in the Fundamental Theorem, L(c) = Σ_ω p_ω c_ω, or risk-neutral probabilities, L(c) = (1 + r*)⁻¹ E*[c_ω] = (1 + r*)⁻¹ Σ_ω π*_ω c_ω. The intuition behind the risk-neutral representation (or martingale representation⁵) is that the price is the expected discounted value computed using a shadow risk-free rate (equal to the actual risk-free rate if there is one) and artificial risk-neutral probabilities π* that assign positive probability to the same states as do the true probabilities. Risk-neutral pricing says that all investments are fair gambles once we have adjusted for time preference by discounting and for risk preference by adjusting the probabilities. The final representation uses the state-price density (or stochastic discount factor) ρ to write L(c) = E[ρ_ω c_ω] = Σ_ω π_ω ρ_ω c_ω. The state-price density simplifies first-order conditions of portfolio choice problems because the state-price density measures priced scarcity of consumption. The state-price density is also handy for continuous-state models in which individual states have zero state probabilities and state prices but there exists a well-defined positive ratio of the two.

Theorem 2: Pricing Rule Representation Theorem.
The consistent positive linear pricing rule can be represented equivalently using
(i) an abstract linear functional L(c) that is positive: (c > 0) ⇒ (L(c) > 0);
(ii) positive state prices p ≫ 0: L(c) = Σ_{ω=1}^{W} p_ω c_ω;
(iii) positive risk-neutral probabilities π* ≫ 0 summing to 1, with associated shadow risk-free rate r*: L(c) = (1 + r*)⁻¹ E*[c_ω] ≡ (1 + r*)⁻¹ Σ_ω π*_ω c_ω;
(iv) positive state-price densities ρ ≫ 0: L(c) = E[ρc] ≡ Σ_ω π_ω ρ_ω c_ω.

Proof: (i) ⇒ (ii): This is the known form of a linear operator on R^W; p ≫ 0 follows from the positivity of L.
(ii) ⇒ (iii): Note first that the shadow risk-free rate must price the riskless asset c = 1: Σ_{ω=1}^{W} p_ω · 1 = (1 + r*)⁻¹ E*[1], which implies (since E*[1] = 1) that r* = 1/(p′1) − 1. Then, matching coefficients in Σ_{ω=1}^{W} p_ω c_ω = (1 + r*)⁻¹ Σ_ω π*_ω c_ω,

⁵ The reason for the term “martingale representation” is that using the risk-neutral probabilities makes the discounted price process a martingale, which is a stochastic process that does not increase or decrease on average.
we have that π* = p/(1′p), which sums to 1 as required and inherits positivity from p.
(iii) ⇒ (iv): Simply let ρ_ω = (1 + r*)⁻¹ π*_ω/π_ω (which is the same as p_ω/π_ω).
(iv) ⇒ (i): Immediate.

Perhaps what is most remarkable about the Fundamental Theorem and the Representation Theorem is that neither probabilities nor preferences appear in the determination of the pricing operator, beyond the initial identification of which states have nonzero probability and the assumption that more is preferred to less. It is this observation that empowers the theory of derivative asset pricing, and it is, for example, the reason why the Black–Scholes option price does not depend on the mean return on the underlying stock. Preferences and beliefs are, however, in the background: in equilibrium, they would influence the price vector P and/or the payoff matrix X (or the mean return process for the Black–Scholes stock).

Although the focus of this chapter is on the single-period model, we should note that the various representations have natural multiperiod extensions. The abstract linear functional and state prices have essentially the same form, noting that cash flows now extend across time as well as states of nature and that there are also conditional versions of the formula at each date and contingency. In some models, the information set is generated by the sample path of security prices; in this case the state of nature is a sample path through the tree of potential security prices. For the state-price density in multiple periods, there is in general a state-price-density process {ρ_t} whose relatives can be used for valuation. For example, the value at time s of receiving subsequent cash flows c_{s+1}, c_{s+2}, . . . , c_t is given by

P_s = Σ_{τ=s+1}^{t} E_s[(ρ_τ/ρ_s) c_τ],   (10)

where E_s[·] denotes expectation conditional on information available at time s.
Basically, this follows from iterated expectations and from defining ρ_t as a cumulative product of single-period ρ's. Similarly, we can write risk-neutral valuation as

P_s = E*_s[ P_t / ((1 + r*_{s+1})(1 + r*_{s+2}) · · · (1 + r*_t)) ].   (11)

Note that unless the risk-free rate is nonrandom, we cannot take the discount factors out of the expectation.⁶ This is because of the way that the law of iterated expectations works. For example, consider the value V₀ at time 0 of a cash flow c₂ received at time 2:

V₀ = (1 + r*₁)⁻¹ E*₀[V₁] = (1 + r*₁)⁻¹ E*₀[(1 + r*₂)⁻¹ E*₁[c₂]] = (1 + r*₁)⁻¹ E*₀[(1 + r*₂)⁻¹ c₂].   (12)

⁶ It would be possible to treat the whole time period from s to t as a single period and apply the pricing result to that large period, in which case the discounting would be at the appropriate (t − s)-period rate. The problem with this is that the risk-neutral probabilities would be different for each pair of dates, which is unnecessarily cumbersome.
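Both points can be checked numerically. The sketch below uses invented numbers: first it confirms that the three single-period representations of Theorem 2 price a payoff identically; then it shows that, with a stochastic second-period spot rate correlated with the cash flow, pulling the discount factor outside the expectation in Equation (12) gives a different (wrong) value.

```python
import numpy as np

# --- Single-period: the three representations of Theorem 2 agree. ---
p  = np.array([0.48, 0.47])        # state prices (invented)
pi = np.array([0.50, 0.50])        # true probabilities (invented)
c  = np.array([1.3, 0.8])          # payoff to value

price_sp = p @ c                                  # (ii) state prices
r_star = 1.0 / p.sum() - 1.0                      # shadow risk-free rate
pi_star = p / p.sum()                             # risk-neutral probabilities
price_rn = (pi_star @ c) / (1.0 + r_star)         # (iii) martingale valuation
rho = p / pi                                      # state-price density
price_spd = np.sum(pi * rho * c)                  # (iv) E[rho c]

# --- Multiperiod: the discount factor must stay inside the expectation. ---
r1 = 0.05                       # time-0 spot rate, known at time 0
r2 = np.array([0.10, 0.02])     # time-1 spot rate, by time-1 state
c2 = np.array([110.0, 100.0])   # time-2 cash flow, by time-1 state
q  = np.array([0.5, 0.5])       # risk-neutral probabilities of the states

V0_correct = np.sum(q * c2 / ((1 + r1) * (1 + r2)))           # Eq. (12)
V0_wrong = np.sum(q / (1 + r2)) * np.sum(q * c2) / (1 + r1)   # factored out

print(price_sp, price_rn, price_spd)   # all equal
print(V0_correct, V0_wrong)            # differ: c2 and r2 are correlated
```

The two valuations differ precisely because c₂ and r*₂ move together; if they were independent, the expected discount factor could come outside, as footnote 7 below notes.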
Now, (1 + r*₁)⁻¹ is outside the expectation (as (1 + r*_{s+1})⁻¹ could be in Equation (11)), but (1 + r*₂)⁻¹ cannot come outside the expectation unless it is nonrandom.⁷ So, it is best to remember that when interest rates are stochastic, discounting for risk-neutral valuation should use the rolled-over spot rate, within the expectation.

4. Various analyses: Arrow–Debreu world

The portfolio problem is the starting point for many types of analysis in finance. Here are some implications that can be drawn from portfolio problems (usually through the first-order conditions):
• optimal portfolio choice (asset allocation or stock selection)
• portfolio efficiency
• aggregation and market-level implications
• asset pricing and performance measurement
• payoff distribution pricing
• recovery or estimation of preferences
• inference of expectations

We can think of many of these distinctions as a question of what we are solving for when we look at the first-order conditions. In optimal portfolio choice and its aggregation, we are solving for the portfolio choice given the preferences and beliefs about returns. In asset pricing, we are computing the prices (or restrictions on expected returns) given preferences, beliefs about payoffs, and the optimal choice (which is itself often derived using an aggregation result). In recovery, we derive preferences from beliefs and idealized observations about portfolio choice, e.g. at all wealth levels. Estimation of preferences is similar, but works with noisy observations of demand at a finite set of data points and uses a restriction on the functional form or smoothing in the statistical procedure to identify preferences. And inference of expectations derives probability beliefs from preferences, prices, and the (observed) optimal demand. In this section, we illustrate the various analyses in the case of an Arrow–Debreu world.
Analysis of the complete-markets model has been developed by many people over a period of time. Some of the more important works include some of the original work on competitive equilibrium, such as Arrow and Debreu (1954), Debreu (1959) and Arrow and Hahn (1971), as well as some early work specific to security markets, such as Arrow (1964), Rubinstein (1976), Ross (1976b), Banz and Miller (1978) and Breeden and Litzenberger (1978). There are also many papers set in

⁷ In the special case in which c₂ is uncorrelated with (1 + r*₂)⁻¹ (or, in multiple periods, if cash flows are all independent of shadow interest rate moves), we can take the expected discount factor outside the expectation. In this case, we can use the multiperiod risk-free discount bond rate for discounting a simple expected final cash flow. However, in general, it is best to remember the general formula (11) with the rates in the denominator inside the expectation.
multiple periods that contributed to the finance of complete markets; although not strictly within the scope of this chapter, we mention just a few here: Black and Scholes (1973), Merton (1971, 1973), Cox, Ross and Rubinstein (1979) and Breeden (1979).

4.1. Optimal portfolio choice

The optimal portfolio choice is the choice of consumptions (c₀, c₁, . . . , c_W) and Lagrange multiplier λ that solve the budget constraint (1) and the first-order conditions (2) and (3). If the inverse I(·) of u′(·) and the inverse J(·) of v′(·) are both known analytically, then finding the optimum can be done using a one-dimensional monotone search for the λ such that

J(λ) + Σ_{ω=1}^{W} p_ω I(λ p_ω/π_ω) = W.

In some special cases, we can solve the optimization analytically. For logarithmic utility, v(c) = log(c) and u(c) = δ log(c) for some δ > 0, optimal consumption is given by c₀ = W/(1 + δ) and c_ω = π_ω W δ/((1 + δ) p_ω) (for ω = 1, . . . , W). The portfolio choice can also be solved analytically for quadratic utility.

4.2. Efficient portfolios

Efficient portfolios are the ones that are chosen by some agent in a given class of utility functions. For the Arrow–Debreu problem, we might take the class of utility functions to be the class of differentiable, increasing and strictly concave time-separable von Neumann–Morgenstern utility functions U(c) = v(c₀) + Σ_{ω=1}^{W} π_ω u(c_ω).⁸ Since u(·) is increasing and strictly concave, (c_ω > c_{ω′}) ⇔ (u′(c_ω) < u′(c_{ω′})). Consequently, the first-order condition (4) implies that (c_ω > c_{ω′}) ⇔ (ρ_ω < ρ_{ω′}). Since the state-price density ρ_ω ≡ p_ω/π_ω is a measure of priced social scarcity in state ω, this says that we consume less in states in which consumption is more expensive. This necessary condition for efficiency is also sufficient; if consumption reverses the order across states of the state-price density, then it is easy to construct a utility function that satisfies the first-order conditions.
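The closed-form log-utility solution from Section 4.1 can be checked numerically, including the efficiency property just described: the optimum exhausts the budget, and consumption is ordered opposite to the state-price density. All numbers below are invented for illustration.

```python
# Closed-form Arrow-Debreu portfolio choice for logarithmic utility,
# v(c) = log(c), u(c) = delta*log(c): c0 = W/(1+delta) and
# c_w = pi_w * W * delta / ((1+delta) * p_w). Numbers are illustrative.
W_wealth = 100.0
delta = 0.95                      # time-preference weight
pi = [1/3, 1/3, 1/3]              # state probabilities
p  = [0.30, 0.20, 0.40]           # state prices

c0 = W_wealth / (1 + delta)
c1 = [pi_w * W_wealth * delta / ((1 + delta) * p_w)
      for pi_w, p_w in zip(pi, p)]

# The solution exhausts the budget: c0 + sum_w p_w c_w = W.
budget = c0 + sum(p_w * c_w for p_w, c_w in zip(p, c1))

# State-price density rho_w = p_w/pi_w; consumption is decreasing in rho.
rho = [p_w / pi_w for p_w, pi_w in zip(p, pi)]
print(c0, c1, budget, rho)
```

Here ρ = (0.9, 0.6, 1.2), and the agent indeed consumes most in state 2 (cheapest) and least in state 3 (dearest).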
Formally:

Theorem 3: Arrow–Debreu Portfolio Efficiency. Consider a complete-markets world (in which agents solve Problem 1) in which state prices and probabilities are all strictly positive, and let U be the class of differentiable, increasing and strictly concave time-separable von Neumann–Morgenstern utility functions of the form U(c) = v(c₀) + Σ_{ω=1}^{W} π_ω u(c_ω). Then there exists a utility function in the class U that chooses the consumption vector c satisfying the budget constraint if and only if consumptions at time 1 are in the opposite order from the state-price densities, i.e., (∀ω, ω′ ∈ {1, . . . , W})((c_ω > c_{ω′}) ⇔ (ρ_ω < ρ_{ω′})).

⁸ A non-time-separable version would be of the form U(c) = Σ_{ω=1}^{W} π_ω u(c₀, c_ω).
Proof: The “only if” part follows directly from the first-order condition and concavity, as noted in the paragraph above. For the “if” part, we are given a consumption vector with the appropriate ordering, and we will construct a utility function that will choose it and satisfy the first-order condition with λ = 1. For this, choose v(c) = −exp(−(c − c₀)) (so that v′(c₀) = 1, as required by Equation 2), and choose u′(c) to be any strictly positive and strictly decreasing function satisfying u′(c_ω) = ρ_ω for all ω ∈ {1, 2, . . . , W}, for example, by “connecting the dots” (with appropriate treatment past the endpoints) in the graph of ρ_ω as a function of c_ω. Integrating this function yields a felicity function u(·) such that the von Neumann–Morgenstern utility function satisfies the first-order conditions, and by concavity this first-order solution is a solution.

Friendly warning. There are many notions of efficiency in finance: Pareto efficiency, informational efficiency, and the portfolio efficiency we have mentioned are three leading examples. A common mistake in heuristic arguments is to assume incorrectly that one sense of efficiency necessarily implies another.

4.3. Aggregation

Aggregation results typically show what features of individual portfolio choice are preserved at the market level. Many asset pricing results follow from aggregation and the first-order conditions. The most common type of aggregation result is the efficiency of the market portfolio. For most classes of preferences we consider, the efficient set is unchanged by rescaling wealth, and consequently the market portfolio is always efficient if and only if the efficient set is convex. This is because the market portfolio is a rescaled version of the average of the individual portfolios. (If the portfolios are written in terms of proportions, no rescaling is needed.)
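Theorem 3's characterization is easy to operationalize: time-1 consumption is efficient exactly when it is anti-monotone in the state-price density. The small sketch below (invented numbers) also previews the aggregation point: summing two consumption vectors that share this ordering preserves it.

```python
# Efficiency test from Theorem 3: time-1 consumption is efficient iff it is
# ordered opposite to the state-price density. Numbers are illustrative.
rho = [0.9, 0.6, 1.2]   # state-price density by state

def anti_monotone(c, rho):
    # Check (c_i > c_j) <=> (rho_i < rho_j) over all pairs of states.
    n = len(c)
    return all((c[i] > c[j]) == (rho[i] < rho[j])
               for i in range(n) for j in range(n) if i != j)

c1 = [3.0, 4.0, 2.0]     # agent 1: efficient (decreasing in rho)
c2 = [5.0, 9.0, 1.0]     # agent 2: efficient
c_bad = [2.0, 1.0, 3.0]  # inefficient: consumes most in the dearest state

# Summing efficient consumptions preserves the ordering across states,
# which is the heart of why the market portfolio is efficient here.
market = [a + b for a, b in zip(c1, c2)]
print(anti_monotone(c1, rho), anti_monotone(c_bad, rho),
      anti_monotone(market, rho))
```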
When the market portfolio is efficient, we can invert the first-order condition for the hypothetical agent who holds the market portfolio to obtain the pricing rule. In the Arrow–Debreu world, the market portfolio is always efficient. This is because the ordering across states is preserved when we sum individual portfolio choices to form the market portfolio. Consider agents m = 1, . . . , M with felicity functions v¹(·), . . . , v^M(·) and u¹(·), . . . , u^M(·) and optimal consumptions C^{1*}, . . . , C^{M*}. The following results are close relatives of standard results in general equilibrium theory.

Theorem 4: Aggregation Theorem. In a pure exchange equilibrium in a complete market, (i) all agents order time-1 consumption in the same order across states, (ii) aggregate time-1 consumption is in the same order across states, (iii) the equilibrium is Pareto optimal, and (iv) there is a time-separable von Neumann–Morgenstern utility function that would choose aggregate consumption optimally.

Proof: (i) and (ii): Immediate, given Theorem 3.
(iii) Let λ^m be the Lagrange multiplier at the optimum in the first-order condition of agent m's decision problem. Consider the problem of maximizing the linear social welfare function with weights 1/λ^m, namely

Σ_{m=1}^{M} (1/λ^m) [ v^m(c^m₀) + Σ_{ω=1}^{W} π_ω u^m(c^m_ω) ].

It is easy to verify from the first-order conditions that the equilibrium consumptions solve this problem too. This is a concave optimization, so the first-order conditions are sufficient, and since the welfare weights are positive the solution must be Pareto optimal (or else a Pareto improvement would increase the objective function).

(iv) Define v^A(c) ≡ max{ Σ_{m=1}^{M} (1/λ^m) v^m(c^m) : Σ_m c^m = c } to be the first-period aggregate felicity function and define u^A(c) ≡ max{ Σ_{m=1}^{M} (1/λ^m) u^m(c^m) : Σ_m c^m = c } to be the second-period aggregate felicity function. Then the utility function v^A(c₀) + E[u^A(c_ω)] is a time-separable von Neumann–Morgenstern utility function that would choose the market's aggregate consumption, since the objective function is the same as for the social welfare problem described in the proof of (iii).

There is a different perspective that gives an alternative proof of the existence of a representative agent in (iv). The existence of a representative agent follows from the convexity of the set of efficient portfolios derived earlier. The main condition we require to make this work is that the efficient set of portfolio proportions is the same at all wealth levels, which is true here and typically of the cases we consider.

4.4. Asset pricing

Asset pricing gets its name from the valuation of cash flows, although asset pricing formulas may be expressed in several different ways, for example as a formula explaining expected returns across assets or as a moment condition satisfied by returns that can be tested econometrically.
Let v^A(·) and u^A(·) represent the preferences of the hypothetical agent who holds aggregate consumption, as guaranteed by the Aggregation Theorem 4. Then we can solve the first-order conditions (2) and (3) to compute p_ω = π_ω u^A′(c^A_ω)/v^A′(c^A₀), and therefore the time-0 valuation of the time-1 cash flow vector {c₁, . . . , c_W} is

L(c₁, . . . , c_W) = Σ_{ω=1}^{W} π_ω [u^A′(c^A_ω)/v^A′(c^A₀)] c_ω = E[ (u^A′(c^A_ω)/v^A′(c^A₀)) c_ω ].   (13)

This formula (with state-price density ρ_ω = u^A′(c^A_ω)/v^A′(c^A₀)) is the right one for pricing assets, but asset pricing equations are more often expressed as explanations of mean
returns across assets or as moment conditions satisfied by returns. Define the rate of return (the relative value change) for some asset as r_ω ≡ (c_ω − P)/P, where c_ω is the asset value in state ω and P is the asset's price. Letting r_f be the risk-free rate of return (or the riskless interest rate), which must be

r_f = 1/E[u^A′(c^A_ω)/v^A′(c^A₀)] − 1,   (14)

we have that Equation (13) implies

E[r_ω] = r_f − (1 + r_f) cov( u^A′(c^A_ω)/v^A′(c^A₀), r_ω ),   (15)

so that the risk premium (the excess of expected return over the risk-free rate) is proportional to the negative of the covariance of return with the state-price density. This is the representation of asset pricing in terms of expected returns, and it is also the so-called consumption-based capital asset pricing model (CCAPM) that is more commonly studied in a multiperiod setting. Either of the pricing relations could be used as moment conditions in an asset-pricing test, but it is more common to use the moment condition

1 = E[ (u^A′(c^A_ω)/v^A′(c^A₀)) (1 + r_ω) ]   (16)

to test the CCAPM. These same equations characterize pricing for just about all the pricing models (perhaps with optimal consumption for some agent in place of aggregate consumption). Recall that the first-order conditions are just about the same whether markets are complete or incomplete. The main difference is that the state prices are shadow prices (Lagrange multipliers) when markets are incomplete, but actual asset prices in complete markets. Either way, the first-order conditions are consistent with the same asset pricing equations.

4.5. Payoff distribution pricing

For von Neumann–Morgenstern preferences (expected utility theory) and more general Machina preferences, preferences depend only on distributions of returns and payoffs and do not depend on the specific states in which those returns are realized. Consider, for example, a simple example with three equally probable states, π₁ = π₂ = π₃ = 1/3.
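Such a three-state setting is also convenient for checking the pricing relations (14)–(16) above. The sketch below uses the state prices p = (0.3, 0.2, 0.4) that appear in the example that follows, with an asset payoff invented for illustration; the state-price density stands in for u^A′(c^A_ω)/v^A′(c^A₀).

```python
import numpy as np

# Discrete-state check of the pricing relations (14)-(16).
pi  = np.array([1/3, 1/3, 1/3])     # true probabilities
p   = np.array([0.30, 0.20, 0.40])  # state prices
rho = p / pi                        # state-price density

c = np.array([2.0, 1.0, 2.0])       # an asset payoff across states
P = np.sum(p * c)                   # its price, here 1.6
r = c / P - 1.0                     # its return by state

# Risk-free rate, Eq. (14): 1 + rf = 1/E[rho]
rf = 1.0 / np.sum(pi * rho) - 1.0

# Expected-return form, Eq. (15): E[r] = rf - (1+rf)*cov(rho, r)
cov = np.sum(pi * rho * r) - np.sum(pi * rho) * np.sum(pi * r)
lhs = np.sum(pi * r)
rhs = rf - (1.0 + rf) * cov

# Moment-condition form, Eq. (16): E[rho*(1+r)] = 1
moment = np.sum(pi * rho * (1.0 + r))
print(P, rf, lhs, rhs, moment)
```

Note that this asset pays most in the dearest state, so cov(ρ, r) > 0 and its expected return sits below r_f: insurance-like payoffs earn a negative risk premium.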
Suppose that an individual has to choose one of the following payoff vectors for consumption at time 1: c^a = (1, 2, 2), c^b = (2, 1, 2), and c^c = (2, 2, 1). These three consumption patterns have the same distribution of consumption, giving consumption of 1 with probability 1/3 and consumption of 2 with probability 2/3. Therefore, an agent with von Neumann–Morgenstern preferences or more general Machina preferences
would find all these consumption vectors equally attractive. However, they do not all cost the same unless the state-price density (and, in the example, the state price) is the same in all states. But having the state-price density the same in all states is a risk-neutral world – all consumption bundles priced at their expected value – which is not very interesting, since all risk-averse agents would choose a riskless investment.⁹ In general, we expect the state-price density to be highest in states of social scarcity, when the market is down or the economy is in recession, since buying consumption in states of scarcity is a form of insurance.

Suppose that the state-price vector is p = (0.3, 0.2, 0.4). Then the prices of the bundles can be computed as p′c^a = 0.3 × 1 + 0.2 × 2 + 0.4 × 2 = 1.5, p′c^b = 0.3 × 2 + 0.2 × 1 + 0.4 × 2 = 1.6, and p′c^c = 0.3 × 2 + 0.2 × 2 + 0.4 × 1 = 1.4. The cheapest consumption pattern is c^c, which places the larger consumptions in the cheap states and the smallest consumption in the most expensive state. This gives us a very useful cash-value measure of the inefficiency of the other strategies. An agent will save 1.5 − 1.4 = 0.1 in cash up front by choosing c^c instead of c^a, or 1.6 − 1.4 = 0.2 in cash up front by choosing c^c instead of c^b. Therefore, we can interpret 0.1 as a lower bound on the amount of inefficiency in c^a, since any agent would pay that amount to swap to c^c and perhaps more to swap to something better. The only assumption we need for this result is that the agent has preferences (such as von Neumann–Morgenstern preferences or Machina preferences) that care only about the distribution of consumption and not about the identity of the particular states in which different parts of the distribution are realized. The general result is based on the “deep theoretical insight” that you should “buy more when it is cheaper”.
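The example's arithmetic, and the re-pairing logic behind it, can be sketched directly. The code below reprices the three bundles and computes the cheapest cost of their common distribution by pairing the largest consumption with the smallest state-price density (a simple sort-based version that is valid for equally probable states, as here):

```python
# Bundle prices from the text's example, and the cheapest (distributional)
# cost of their common distribution; states are equally probable.
pi = [1/3, 1/3, 1/3]
p = [0.3, 0.2, 0.4]
rho = [pw / piw for pw, piw in zip(p, pi)]   # state-price density

price = lambda c: sum(pw * cw for pw, cw in zip(p, c))
c_a, c_b, c_c = [1, 2, 2], [2, 1, 2], [2, 2, 1]
print(price(c_a), price(c_b), price(c_c))    # about 1.5, 1.6, 1.4

def distributional_price(c):
    # Valid for equally probable states: sort consumption descending and
    # rho ascending, then price the re-paired (efficient) pattern.
    c_desc = sorted(c, reverse=True)
    rho_asc = sorted(rho)
    return sum(piw * rw * cw for piw, rw, cw in zip(pi, rho_asc, c_desc))

# All three bundles share one distribution, hence one distributional price:
print(distributional_price(c_a))   # about 1.4, the cost of c_c
```

The gap between a bundle's actual price and its distributional price (0.1 for c^a, 0.2 for c^b) is the cash-value lower bound on inefficiency described above.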
This means that efficient consumption is decreasing in the state-price density. We can compute the (lower bound on the) inefficiency of a portfolio by reordering its consumption in the reverse order of the state-price density and computing the decline in cost. The payoff-distributional price of a consumption pattern is the price of getting the same distribution the cheapest possible way (in reverse order of the state-price density).

There is a nice general formula for the distributional price. Let F_c(·) be the cumulative distribution function of consumption and let i_c(·) be its inverse. Similarly, let F_ρ(·) be the cumulative distribution function of the state-price density and let i_ρ(·) be its inverse. Let c* be the efficient consumption pattern with distribution function F_c(·). Then the distributional price of the consumption pattern can be written as

E[c* ρ] = ∫_{z=0}^{1} i_c(z) i_ρ(1 − z) dz.   (17)

In this expression, z has units of probability and labels the states ordered in reverse order of the state-price density, i_ρ(1 − z) is the state-price density in state z, and i_c(z) is the optimal

⁹ This is different from there existing a change of probability that gives risk-neutral pricing. In a risk-neutral world, the actual probabilities are also risk-neutral probabilities.
consumption c* in state z. This formula is simplest to understand for a continuous state space, but it also makes sense for finitely many equally probable states, as in the example, provided we define the inverse distribution function at mass points in the natural way.

The original analysis of payoff distribution pricing for complete frictionless markets was presented by Dybvig (1988a,b). Payoff distribution pricing can also be used in a model with incomplete markets or frictions, as developed by Jouini and Kallal (2001), but that analysis is beyond the scope of this chapter.

5. Capital asset pricing model (CAPM)

The Capital Asset Pricing Model (CAPM) is an asset-pricing model based on equilibrium with agents having mean-variance preferences (as in Problem 4). It is based on the mean-variance analysis pioneered by Markowitz (1952, 1959) and Tobin (1958), and it was extended to an equilibrium model by Sharpe (1964) and Lintner (1965). Even though there are many more modern pricing models, the CAPM is still the most important. This model gives us most of our basic intuitions about the trade-off between risk and return, about how market risk is priced, and about how idiosyncratic risk is not priced. The CAPM is also widely used in practice, not only in the derivation of optimal portfolios but also in the ex post assessment of performance. Sometimes people still refer to mean-variance analysis by the term Modern Portfolio Theory without intending a joke, even though we are approaching its 50th anniversary.

In theoretical work, the mean-variance preferences assumed in Problem 4 are usually motivated by joint normality of returns (a restriction on beliefs) or by a restriction on preferences (a quadratic von Neumann–Morgenstern utility function).
When returns are jointly normal, so are portfolio returns, so the entire distribution of a portfolio's return (and therefore any utility that depends only on the distribution) is determined by the mean and variance. For quadratic utility, there is an algebraic relation between expected utility and the mean and variance. Letting u(c) = k₁ + k₂c − k₃c²,

E[u(c)] = k₁ + k₂E[c] − k₃E[c²] = k₁ + k₂E[c] − k₃(var(c) + (E[c])²),   (18)

which depends on the preference parameters k₁, k₂, and k₃ and the mean and variance of c, and not on other features of the distribution (such as skewness or kurtosis). Neither assumption is literally true, but we must remember that models must be simpler than the world if they are to be useful.

You may wonder why we need to motivate the representation of preferences by the utility function V(m, v), since it may seem very intuitive to write down preferences for risk and return directly. However, it is actually a little strange to assume that these preferences apply to all random variables. For example, if there is a trade-off between risk and return (so the agent cares about risk), then there should exist m₁ > m₂ and v₁ > v₂ > 0 such that V(m₁, v₁) < V(m₂, v₂), so the agent would turn
down the higher return because of the higher risk. However, it is easy to construct random variables x₁ > x₂ that have means m₁ and m₂ and variances v₁ and v₂. In other words, a non-trivial mean-variance utility function (one that does not simply maximize the mean) cannot always prefer more to less. The two typical motivations of mean-variance preferences have different resolutions of this conundrum. Quadratic utility does not prefer more to less, so there is no inconsistency. This is not a nice feature of quadratic utility, but it may not be a fatal problem either. Multivariate normality does not define preferences for all random variables, and in particular the random variables that generate the paradox are not available. When using any model, we need to think about whether the unrealistic features of the model are important for the application at hand.

Fig. 1. The efficient frontier in means and standard deviations. [Figure: the horizontal axis is the standard deviation of return √(q′Σq); the vertical axis is the mean return r + q′(μ − r1); shown are the riskless rate r_f, the risky-asset frontier F, the frontier E including the riskless asset, the market portfolio q^M, and an inefficient portfolio q^i.]

Many important features of the CAPM are illustrated by Figures 1 and 2. In Figure 1, F is the efficient frontier of risky asset returns in means and standard deviations. Other feasible portfolios of risky assets will plot to the right of F and will not be chosen by any agent who can choose only among the risky assets and prefers less risk at a given mean. And agents who choose a higher mean at a given standard deviation will only choose risky portfolios on the upper branch of F, which is called the positively efficient frontier of risky assets.
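The geometry of Figure 1 can be sketched numerically. Under the standard mean-variance setup (not spelled out in the chapter at this point), every frontier portfolio's risky holdings are proportional to Σ⁻¹(μ − r1), which is two-fund separation; all such portfolios share the same maximal slope (Sharpe ratio), while other portfolios fall short. The expected returns and covariance matrix below are invented for illustration.

```python
import numpy as np

# Mean-variance frontier with a riskless asset; illustrative inputs.
r = 0.03
mu = np.array([0.08, 0.12])             # expected returns on two risky assets
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])        # return covariance matrix

base = np.linalg.solve(Sigma, mu - r)   # direction Sigma^{-1}(mu - r*1)
q_a = 0.5 * base                        # a conservative frontier portfolio
q_b = 2.0 * base                        # an aggressive frontier portfolio

def sharpe(q):
    # Slope in Figure 1: excess mean over standard deviation of return.
    mean = r + q @ (mu - r)
    sd = np.sqrt(q @ Sigma @ q)
    return (mean - r) / sd

q_off = np.array([1.0, 0.0])            # a portfolio off the frontier
print(sharpe(q_a), sharpe(q_b), sharpe(q_off))
```

Scaling `base` up or down traces out the line E in Figure 1; the two frontier portfolios differ only in how they mix the riskless asset with the same risky "mutual fund."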
When the risk-free asset r_f is available, all agents preferring a higher mean at a given standard deviation will choose a portfolio along the frontier E.¹⁰ One important feature in either case is two-fund separation, namely, that the entire frontier F or E is spanned by two portfolios, which can be chosen to be the portfolios at any two distinct points on the frontier. This is called a

¹⁰ For agents who prefer less risk at a given mean but may not prefer a higher mean at a given level of risk, there is another branch of E below that is the reflection of its continuation to the left of the axis.
“mutual fund separation” result because we can separate the portfolio choice problem into two stages: first find two “mutual funds” (portfolios) spanning the efficient frontier (which can be chosen independently of preferences), and then find the mixture of the two funds appropriate for the particular preferences. For a typical agent who prefers more to less and prefers to avoid risk, preferences are increasing up and to the left in Figure 1. A more risk-averse agent will choose a portfolio on the lower left part of the frontier, with low return but low risk, and a less risk-averse agent will choose a portfolio on the upper right part of the frontier, accepting higher risk in exchange for higher return.

Figure 1 also illustrates the Sharpe ratio [Sharpe (1966)], which is used for performance measurement. The line through the riskless asset r_f and the market portfolio q^M has a slope in Figure 1 that is larger than the slope for any inefficient portfolio such as q^i. The slope of the line through a particular portfolio is the Sharpe ratio for that portfolio. The Sharpe ratio is largest for an efficient portfolio, and the shortfall below that amount is a measure of inefficiency for any other portfolio. (An even greater Sharpe ratio would be possible if the efficient proxy is inefficient in sample or if we are considering a portfolio, say from an informed trading strategy, that is not a fixed portfolio of the assets.) In practice, due to random sampling error, even an efficient portfolio will have a measured Sharpe ratio that is not the largest value. When stock returns are Gaussian, there is an important connection between the measured Sharpe ratio of the market portfolio and the likelihood ratio test of the CAPM [Gibbons, Ross and Shanken (1989)].

Figure 2 shows the security market line, which quantifies the relation between risk and return in the CAPM.
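The performance measures attached to the security market line (beta, Jensen's alpha, and the Treynor index, discussed below) can be computed from return samples. The sketch uses invented return observations; with real data, sampling error dominates, as the text warns.

```python
import numpy as np

# Beta as the slope of a regression of asset returns on market returns,
# Jensen's alpha as the deviation from the security market line, and the
# Treynor index. The return samples are invented for illustration.
rf = 0.02
r_m = np.array([0.08, -0.03, 0.12, 0.05, -0.01])   # market returns
r_a = np.array([0.10, -0.05, 0.16, 0.07, -0.02])   # asset returns

beta = np.cov(r_a, r_m, ddof=1)[0, 1] / np.var(r_m, ddof=1)
alpha = np.mean(r_a) - rf - beta * (np.mean(r_m) - rf)

# Treynor index: slope through the portfolio and rf in beta-return space.
treynor = (np.mean(r_a) - rf) / beta
print(beta, alpha, treynor)
```

Here the asset has beta above 1 and a small positive alpha, so its Treynor index also exceeds the market's excess return; with different leverage the two measures can rank portfolios differently, as noted below.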
Risk is measured using the beta coefficient, which is the slope coefficient of a linear regression of the asset’s return on the market’s return. If the CAPM is true, all assets and portfolios will plot on the Security Market Line (SML) that goes through the risk-free asset rf and the market portfolio of risky assets qM. In practice, measured asset returns are affected by random sampling error; if the CAPM is true, it is entirely random whether a portfolio will plot above or below the security market line ex post. The use of beta as the appropriate measure of risk tells us that investors are rewarded for taking on market risk (correlated with market returns), not for taking on idiosyncratic risk (uncorrelated with the market). If the security market line tells us how much of a reward is justified for a given amount of risk, it makes intuitive sense that deviations from the security market line can be used to measure superior or inferior performance. This is the intuition behind the Treynor Index and Jensen’s alpha [Treynor (1965) and Jensen (1969)]. For example, in Figure 2, Jensen’s alpha for qs is αs > 0, indicating superior performance, and Jensen’s alpha for qu is αu < 0, indicating underperformance. Unfortunately, any formal motivation for using Jensen’s alpha must come from outside the CAPM, since if the CAPM is true then the expected value of Jensen’s alpha is zero and the realized value is purely random. Theoretical models that incorporate superior performance from information-gathering have given mixed results on the value of using the security market line for measuring performance: a superior performer with
Ch. 10: Arbitrage, State Prices and Portfolio Theory 627

[Fig. 2. The security market line connecting risk and return. The horizontal axis is beta, $\beta = q^\top \Sigma q_M / (q_M^\top \Sigma q_M)$, and the vertical axis is mean return, $r_f + \beta\, q_M^\top(\mu - r_f\mathbf{1})$; the line passes through the riskless asset $r_f$ and the market portfolio $q_M$ at $\beta = 1$, with portfolios $q_s$ and $q_u$ plotting off the line at distances $\alpha_s$ and $\alpha_u$.]

security-specific information will have a positive Jensen’s alpha, but for market timing a superior performer may have a negative Jensen’s alpha and may even plot inside the efficient frontier for static strategies [Mayers and Rice (1979) and Dybvig and Ross (1985)].

The Treynor Index is the slope of the line through the evaluated portfolio and the risk-free asset in the security market line diagram, Figure 2. Performance is determined by comparing a portfolio’s Treynor Index to that of the market; a larger Treynor Index indicates better performance. The Treynor Index will indicate superior or inferior performance compared to the market the same as the Jensen measure does. However, the ordering of superior or inferior performers can differ because the Treynor measure is adjusted for leverage.

The main results of the CAPM can be derived from the first-order condition (9). The first-order condition for agent n is

$$\mu - r\mathbf{1} = \lambda_n \Sigma q_n, \qquad (19)$$

where $\lambda_n = 2V_{nv}(m, v)/V_{nm}(m, v)$, evaluated at the optimum $m = r + (\mu - r\mathbf{1})^\top q_n$ and $v = q_n^\top \Sigma q_n$. Now, the market portfolio is the wealth-weighted average of all agents’ portfolios,

$$q_M = \frac{\sum_{n=1}^N w_n q_n}{\sum_{n=1}^N w_n}, \qquad (20)$$

and consequently we have the wealth-weighted average of the first-order conditions

$$\mu - r\mathbf{1} = \lambda_M \Sigma q_M, \qquad (21)$$
where

$$\lambda_M = \frac{\sum_{n=1}^N w_n \lambda_n}{\sum_{n=1}^N w_n}. \qquad (22)$$

We can plug in the market portfolio to solve for $\lambda_M$ and we obtain

$$\mu - r\mathbf{1} = \frac{\Sigma q_M}{q_M^\top \Sigma q_M}\,(\mu_M - r), \qquad (23)$$

where $\mu_M \equiv q_M^\top \mu$ is the mean return on the market portfolio of risky assets. Applying Equation (23) to obtain the expected excess return of a portfolio q of risky assets (with $q^\top\mathbf{1} = 1$, since a portfolio of risky assets does not include any holdings of the risk-free asset), we have that11

$$q^\top\mu - r = \lambda_M\, q^\top \Sigma q_M = \beta_q\,(\mu_M - r), \qquad (24)$$

where $\beta_q$ is the portfolio’s beta, which is the slope coefficient of a regression of the portfolio q’s return on the market return,

$$\beta_q \equiv \frac{q^\top \Sigma q_M}{q_M^\top \Sigma q_M}. \qquad (25)$$

The SML equation we plotted in Figure 2 is Equation (24). For a portfolio q, Jensen’s alpha is given by

$$q^\top\mu - r - \beta_q\,(\mu_M - r), \qquad (26)$$

its Treynor Index is

$$\frac{q^\top\mu - r}{\beta_q}, \qquad (27)$$

and its Sharpe ratio is

$$\frac{q^\top\mu - r}{\sqrt{q^\top \Sigma q}}. \qquad (28)$$

The portfolios encountered in practice are actively managed, and the formulas for these performance measures would be more complex than for the simple fixed mix

11 We looked at the simpler case in the text, but the same pricing result holds for a portfolio including a holding in the riskless asset. In this case, the expected return on the portfolio is $q^\top\mu + (1 - q^\top\mathbf{1})\,r$ and the expected excess return is $q^\top\mu + (1 - q^\top\mathbf{1})\,r - r = q^\top\mu - (q^\top\mathbf{1})\,r$.
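These relations can be checked numerically. The sketch below (an arbitrary positive-definite Σ and hypothetical values of r and λM, not taken from the chapter) imposes Equation (21) and confirms that Equations (24)–(27) then hold exactly: every portfolio of risky assets has zero Jensen's alpha and a Treynor Index equal to μM − r:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical market: covariance matrix Sigma, market weights qM, rate r.
n = 4
A = rng.normal(size=(n, n))
Sigma = A @ A.T + np.eye(n)        # a positive-definite covariance matrix
qM = np.full(n, 1.0 / n)           # market portfolio of risky assets
r, lam = 0.02, 2.5

# Impose the CAPM first-order condition (21): mu - r 1 = lam * Sigma qM.
mu = r + lam * Sigma @ qM
muM = qM @ mu                      # mean return on the market

def beta(q):
    return (q @ Sigma @ qM) / (qM @ Sigma @ qM)

def jensen_alpha(q):
    return q @ mu - r - beta(q) * (muM - r)

def treynor(q):
    return (q @ mu - r) / beta(q)

# Under the exact CAPM every portfolio plots on the SML: alpha = 0 and the
# Treynor Index equals the market excess return muM - r.
q = rng.dirichlet(np.ones(n))      # an arbitrary portfolio of risky assets
assert abs(jensen_alpha(q)) < 1e-9
assert abs(treynor(q) - (muM - r)) < 1e-9
```

The check mirrors the algebra in the text: premultiplying Equation (21) by q gives Equation (24), so alpha vanishes identically and any measured alpha in data is attributable to sampling error or model failure.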
of assets q. However, the concepts are unchanged with the natural adaptations, e.g., replacing $q^\top\mu$ by the sample mean return on the portfolio and replacing $\beta_q = q^\top \Sigma q_M / (q_M^\top \Sigma q_M)$ by the estimated slope from the regression of the portfolio return on the market return.

6. Mutual fund separation theory

The general portfolio problem for arbitrary preferences and distributions is sufficiently rich to allow for nearly any sort of qualitative behavior [see Hart (1975) for negative results or Cass and Stiglitz (1972) for positive results in special cases]. In an effort to simplify this problem and obtain results that allow for aggregation, so that the general behavior of the market can be understood in terms of the primitive properties of risk aversion and of the underlying distributions, a collection of results known as separation results has been developed. Mutual fund separation is the separation of portfolio choice into two stages. The first stage is the selection of a small set of “mutual funds” (portfolios) among which choice is to be made, and the second stage is the selection of an allocation to the mutual funds. We have “k-fund separation” for a particular class of distributions and a particular class of utility functions if for each joint return distribution in the class there exist k funds that can be used in the two-step procedure while making agents with utility in the class, at any wealth level, just as well off as choosing in the whole market. The important restriction is that the choice of the funds is done once for the entire class of utility functions. In the literature, there are two general approaches: one approach [Hakansson (1969) and Cass and Stiglitz (1970)] restricts utility functions and has relatively unrestricted distributions, while the other approach [Ross (1978a)] restricts distributions and has relatively unrestricted utility functions.
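Before turning to the two approaches, the two-stage procedure itself can be illustrated in the mean-variance case (a sketch with hypothetical numbers; the quadratic objective E[r_p] − (A/2)var(r_p) is used here only for concreteness and is not the chapter's general setting). Stage one picks the riskless asset and the tangency fund once and for all; stage two picks each investor's mixture, which depends only on risk aversion:

```python
import numpy as np

# Hypothetical two-asset market (illustrative numbers only).
mu = np.array([0.08, 0.12])
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
rf = 0.03

# Stage 1 (preference-free): the two funds are the riskless asset and the
# tangency portfolio of risky assets.
w = np.linalg.solve(Sigma, mu - rf)
fund = w / w.sum()
mu_f, var_f = fund @ mu, fund @ Sigma @ fund

# Stage 2 (preference-dependent): an investor maximizing E[r_p] - (A/2) var(r_p)
# over mixtures puts the fraction a = (mu_f - rf)/(A var_f) in the risky fund
# and the remainder in the riskless asset.
def mixture(A):
    return (mu_f - rf) / (A * var_f)

# Solving the full problem directly gives the same answer: the unconstrained
# optimum q = Sigma^{-1}(mu - rf 1)/A is exactly the stage-2 scaling of the fund.
A = 4.0
direct = np.linalg.solve(Sigma, mu - rf) / A
assert np.allclose(direct, mixture(A) * fund)

# More risk-averse investors simply hold less of the same risky fund.
assert mixture(8.0) < mixture(2.0)
```

The point of the final assertions is the separation property: the funds are chosen without reference to preferences, and only the scalar mixture varies across investors.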
Either approach is useful for deriving asset pricing results because, for example, if individual investors hold mixtures of two funds, then the market portfolio must be a mixture of the same two funds. 6.1. Preference approach The preference approach focuses on classes of special utility functions. Many of the results involve utility functions that have properties of homotheticity or invariance. It is important that we require the same funds to work for each utility function at all wealth levels, since this avoids “accidental” cases such as a set containing any two utility functions over returns. Analysis in this section will use Problem 3, in some cases adding the assumption that one of the assets is riskless. First, we consider one-fund separation, which requires all portfolio choices to lie in a ray. Given the budget constraint, this implies that the portfolio choice is just proportional to wealth. For this to happen at all prices, the preferences have to be
homothetic. And, given the von Neumann–Morgenstern restriction, this is equivalent to either logarithmic utility, $u(c) = \log(c)$, or power utility, $u(c) = c^{1-R}/(1-R)$.

Theorem 5: One-fund separation from preferences. The following are equivalent properties of a nonempty class U of utility functions: (1) For each joint distribution of security returns there exists a single portfolio q such that every u ∈ U is just as well off choosing a multiple of q as choosing from the entire market. (2) The class U consists of a single utility function (up to an affine transform that leaves preferences unchanged) of the form $u(c) = \log(c)$ or $u(c) = c^{1-R}/(1-R)$.

Proof: (2)⇒(1) Let u be the single utility function in U. The objective function in terms of portfolio proportions is $E[u(w\,q^\top r)]$. In the log case, this is $E[\log(w\,q^\top r)] = \log(w) + E[\log(q^\top r)]$, and maximizing the objective is the same as maximizing the second term, which does not depend on w. In the power case, the objective is $E[(w\,q^\top r)^{1-R}/(1-R)] = w^{1-R}\,E[(q^\top r)^{1-R}/(1-R)]$, and maximizing the objective is the same as maximizing the second factor, which does not depend on w. In either case, choosing the proportions that work at one wealth level gives a portfolio in proportions that will be optimal at all wealth levels.

(1)⇒(2) Suppose u is an element of the class U. Then the first-order condition for an optimum implies

$$E[(1 + r - \gamma\mathbf{1})\,u'(W q^\top (1 + r))] = 0, \qquad (29)$$

where $\gamma = \lambda / E[u'(W q^\top(1 + r))]$. In general, ø must satisfy Equation (8) and may vary with W, but for complete markets ø is uniquely determined by Equation (8) and may be taken as given. For the same portfolio weights q to be optimal for all W, it follows that the derivative of the first-order condition with respect to W is zero, and for complete markets we have

$$E[(1 + r - \gamma\mathbf{1})\,q^\top(1 + r)\,u''(W q^\top (1 + r))] = 0. \qquad (30)$$

Now, one-fund separation implies that in all complete markets Equation (29) implies Equation (30), but the only way this can always be true is if everywhere

$$c\,u''(c) = -R\,u'(c), \qquad (31)$$

where R = 1 implies logarithmic utility and any other R > 0 implies power utility. (R < 0 corresponds to a convex utility function.) And all utility functions in the class U must correspond to the same preferences, or else it is easy to construct a 2-state counterexample.

The utility functions in the theorem comprise the Constant Relative Risk Aversion (CRRA) class, for which the Arrow–Pratt coefficient of relative risk aversion $-c\,u''(c)/u'(c)$ is a constant. [See Arrow (1965) and Pratt (1964, 1976).12] Other special utility functions lead to two-fund separation if there is a riskless asset. The Constant Absolute Risk Aversion (CARA) class of utility functions of the form $u(c) = -\exp(-Ac)/A$, for which the Arrow–Pratt coefficient of absolute risk aversion $-u''(c)/u'(c)$ is constant, leads to a special two-fund separation result in which the risky portfolio holding is constant and only the investment in the risk-free asset changes as wealth changes. When there is a riskless asset, there is also two-fund separation in the larger Linear Risk Tolerance (LRT) class, which encompasses the other two classes as well as wealth-translated relative risk aversion preferences of the form $u(c) = \log(c - c_0)$ or $u(c) = (c - c_0)^{1-R}/(1-R)$. The linear risk tolerance class is defined by the risk tolerance $-u'(c)/u''(c)$ having the linear form $a(c - c_0)$. We can include in this class the satiated utility functions of the form $-(c_0 - c)^{1-R}/(1-R)$ defined for $c \le c_0$ (and typically extended to $c > c_0$ in the obvious way in the quadratic case R = −1). With quadratic utility, we have a special result of two-fund separation even without a risk-free asset, due to linearity of marginal utility. In these results, all utility functions in the class U must have the same power (or the same absolute risk aversion coefficient for exponential utility) but can have different translates c0 (though exponential utility is unchanged under translation). For details and proofs, see Cass and Stiglitz (1970).

6.2. Beliefs

We have already seen one case of separation based on beliefs, which is in mean-variance analysis motivated by multivariate normality, as discussed in the section on the CAPM.
Mean-variance preferences can also be derived from more general transformed spherically distributed preferences discussed by Chamberlain (1983).13 We turn now to a strictly more general class, the separating distributions of Ross (1978a). The central intuition behind the separating distributions is that risk-averse agents will not choose to take on risk without any reward. This is the same intuition as in mean-variance analysis, but it is somewhat more subtle because risk can no longer be characterized by variance for general concave von Neumann–Morgenstern preferences.

The appropriate definition of risk is related to Jensen’s inequality, which says that for any convex function f(·) and any random variable x, $E[f(x)] \ge f(E[x])$, with strict inequality if f(·) is strictly convex and x is not (almost surely) constant. A risk-averse von Neumann–Morgenstern utility function u(·) is concave (so that −u(·) is convex), and consequently for any random consumption c, $E[u(c)] \le u(E[c])$, with strict inequality for strictly concave u and nonconstant c. More importantly for portfolio choice problems, we can use Jensen’s inequality and the law of iterated expectations to conclude that adding conditional-mean-zero noise makes a risk-averse agent worse off. That gives us the following useful result:

Lemma 1. If E[e|c] = 0 and u is concave, then

$$E[u(c + e)] \le E[u(c)]. \qquad (32)$$

Proof:

$$E[u(c + e)] = E\big[E[u(c + e)\,|\,c]\big] \le E\big[u(E[c + e\,|\,c])\big] = E[u(c)], \qquad (33)$$

by Jensen’s inequality and the law of iterated expectations.

In fact, it can be shown that one random variable is dominated by another with the same mean for all concave utility functions if and only if the first has the same distribution as the second plus conditional-mean-zero noise. This is one of the results of the theory of Stochastic Dominance, which was pioneered by Quirk and Saposnik (1962) and Hadar and Russell (1969) and was popularized by Rothschild and Stiglitz (1970, 1971).

The basic idea behind the separating distributions is that there are k funds (e.g., 2 funds for 2-fund separation) such that everything else is equal to some portfolio of the k funds, plus conditional-mean-zero noise. Formally, we have

Theorem 6. Consider a world with k funds that are portfolios with weights $y^1, \ldots, y^k$, each summing to one: $(\forall j)\; \mathbf{1}^\top y^j = 1$ (or in matrix notation, $y\mathbf{1} = \mathbf{1}$).14 Further assume that the return on each asset i can be written as

$$r_i = \sum_{j=1}^k b_{ij}\,(y^j)^\top r + e_i \qquad (34)$$

(i.e., $r = b\,y\,r + e$, where y is the matrix whose rows are the fund weights), where $\sum_{j=1}^k b_{ij} = 1$ (i.e., $b\mathbf{1} = \mathbf{1}$) and, for every linear combination $h^\top y\,r$ of the fund returns, e is conditional-mean-zero noise:

$$E[e\,|\,h^\top y\,r] = 0. \qquad (35)$$

Then any agent with increasing and concave von Neumann–Morgenstern preferences will be just as happy choosing a portfolio of the k funds as choosing from the entire market. More formally, for each monotone and concave u and for each feasible portfolio q with $\mathbf{1}^\top q = 1$, there exists a portfolio h of the funds with $\mathbf{1}^\top h = 1$ such that $E[u(h^\top r)] \ge E[u(q^\top r)]$.

Proof: Consider any portfolio q with $\mathbf{1}^\top q = 1$. Then

$$q^\top r = q^\top(b\,y\,r + e) = q^\top b\,y\,r + q^\top e. \qquad (36)$$

But $y^\top b^\top q$ is a valid portfolio because $\mathbf{1}^\top y^\top b^\top q = \mathbf{1}^\top b^\top q = \mathbf{1}^\top q = 1$. And the second term is conditional-mean-zero noise. Therefore, by Lemma 1, all agents with concave preferences would be at least as happy to switch from q to the portfolio $y^\top b^\top q$, which is a portfolio of the k funds (with fund weights $b^\top q$).

In the case of 1- and 2-fund separating distributions, the characterization is necessary as well as sufficient [see Ross (1978a)]. In the CAPM derived using multivariate normality, it is easy to show that the SML implies that each mean-variance inefficient portfolio has a payoff equal to the payoff of the efficient portfolio with the same mean plus conditional-mean-zero noise. Given that the mean-variance frontier is spanned by two portfolios, we see that the CAPM with multivariate normality is indeed in the class of 2-fund separating distributions.

7. Arbitrage pricing theory (APT)

The Arbitrage Pricing Theory, which was introduced in Ross (1976a,c), is a model of security pricing that generalizes the pricing relation in the CAPM and also builds on the intuition of the separating distributions. First, we start with a factor model of returns of the sort studied in statistics:

$$r = \mu + b f + e, \qquad (37)$$

where μ is a vector of mean returns (unrestricted at the moment but to be restricted by the theory), f is a vector of factor returns, of dimensionality much less than r, b is a matrix of factor loadings, and e is a vector of uncorrelated idiosyncratic noise terms.

12 Some other ways of comparing risk aversion are given by Kihlström, Romer and Williams (1981) and Ross (1981).
13 Another special case of one-fund separation is the symmetric case of Samuelson (1967).
14 As discussed earlier, the dimension of 1 is determined by context; in $y\mathbf{1} = \mathbf{1}$, the first occurrence of 1 is n × 1 and the second occurrence of 1 is k × 1.
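As a quick simulation sketch of this factor structure (all parameter values below are made up for illustration), the covariance decomposition and the diversifiability of the idiosyncratic terms can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical one-factor model r = mu + b f + e with n assets, T draws.
n, T = 50, 100_000
mu = np.full(n, 0.01)
b = rng.uniform(0.5, 1.5, size=n)       # factor loadings
d = rng.uniform(0.01, 0.05, size=n)     # idiosyncratic variances (diagonal of D)

f = rng.normal(size=T)                  # factor with unit variance
e = rng.normal(scale=np.sqrt(d), size=(T, n))
r = mu + np.outer(f, b) + e             # T x n panel of returns

# var(r) = b b' + D: check one diagonal entry against its sample counterpart.
assert abs(r[:, 0].var() - (b[0]**2 + d[0])) < 0.05

# Diversification: an equal-weighted portfolio keeps its factor exposure (the
# average loading) but its idiosyncratic variance is only mean(d)/n.
q = np.full(n, 1.0 / n)
assert q @ np.diag(d) @ q < d.mean() / 10
```

The second assertion is the economic heart of the APT intuition developed below: idiosyncratic variance of a well-diversified portfolio shrinks like 1/n, so only factor exposure can command a risk premium.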
We can represent the restriction to the factor model by writing the covariance matrix as

$$\operatorname{var}(r) = b\,b^\top + D, \qquad (38)$$

where we have assumed an orthonormal set of factors with the identity matrix as covariance matrix (without loss of generality, because we can always work with a linear transformation), and where D is a diagonal matrix which is the covariance matrix of the vector of security-specific noise terms e. The factor model is a useful restriction for
empirical work on security returns: given that we typically have many securities relative to the number of time periods, the full covariance matrix is not identified, but a sufficiently low-dimensional factor model has many fewer parameters and can be estimated.

One intuition of the APT is that idiosyncratic risk is not very important economically and should not be priced. Another intuition of the APT is that compensation for risk should be linear or else there will be arbitrage. For example, if there is a single factor and two assets have different exposures to the factor (betas), excess return must be proportional to the risk exposure. Suppose the compensation per unit risk is larger for the asset with the larger risk exposure. Then a portfolio mixture of the risk-free asset and the high-risk asset will have the same risk exposure as the low-risk asset but a higher expected return, and combining a long position in the mixture with a short position in the low-risk asset gives a pure profit. This profit will be riskless in the absence of idiosyncratic risk; it will be profitable for some agents if idiosyncratic risk is diversifiable. Conversely, if the compensation per unit risk is larger for the asset with the lower risk exposure, the other asset can be dominated by a combination of a long position in the less risky asset with a short (borrowing) position in the risk-free asset.

The main consequence of the APT is a pricing equation that looks like a multifactor version of the CAPM equation:

$$\mu = r_f\mathbf{1} + b\,\gamma. \qquad (39)$$

Here, $r_f$ is the risk-free rate and γ is the vector of factor risk premia. There are several approaches to motivating this APT pricing equation; see, for example, Ross (1976a,c), Dybvig (1983), or Grinblatt and Titman (1983). The APT shares important features of the CAPM: the value of diversification, compensation for taking on systematic risk, and no compensation for taking on idiosyncratic risk.
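A small numerical check of Equation (39), with hypothetical loadings and premia: if expected returns are exactly μ = rf1 + bγ, then any zero-net-investment portfolio with no factor exposure has zero expected payoff, which is the no-arbitrage content of the linear pricing relation:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical exact APT: n assets, k factors, mu = rf 1 + b gamma.
n, k = 6, 2
rf = 0.02
b = rng.normal(size=(n, k))             # factor loadings
gamma = np.array([0.04, 0.01])          # factor risk premia
mu = rf + b @ gamma

# An "arbitrage portfolio" h uses no wealth (1'h = 0) and takes no factor
# risk (b'h = 0); under Equation (39) its expected payoff must be zero.
M = np.vstack([np.ones(n), b.T])        # stack the constraints
h = np.linalg.svd(M)[2][-1]             # a vector in the null space of M
assert np.allclose(M @ h, 0.0)
assert abs(h @ mu) < 1e-10
```

Conversely, if some asset's mean violated the linear relation, the same construction would produce a costless, factor-risk-free portfolio with a strictly positive expected payoff, which is the approximate arbitrage exploited in the motivations cited above.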
The main difference is that there may be multiple factors, and that the priced factors are the common factors that appear in many securities, not necessarily just the market factor.

8. Conclusion

On reflection, it is surprising that even our simplest context of a single-period neoclassical model of investments has such a rich theoretical development. We have hit on many of the highlights, but even so we cannot claim an exhaustive review of all that is known.

References

Abel, A.B. (1990), “Asset prices under habit formation and catching up with the Joneses”, American Economic Review 80:38−42.
Arrow, K., and G. Debreu (1954), “Existence of an equilibrium for a competitive economy”, Econometrica 22:265−290.
Arrow, K., and F. Hahn (1971), General Competitive Analysis (Holden-Day, San Francisco).
Arrow, K.J. (1964), “The role of securities in the optimal allocation of risk-bearing”, Review of Economic Studies 31:91−96.
Arrow, K.J. (1965), “Aspects of the theory of risk-bearing”, Yrjö Jahnsson Lectures (Yrjö Jahnssonin Säätiö, Helsinki).
Banz, R.W., and M.H. Miller (1978), “Prices for state-contingent claims: some estimates and applications”, Journal of Business 51:653−672.
Barberis, N., M. Huang and T. Santos (2001), “Prospect theory and asset prices”, Quarterly Journal of Economics 116:1−53.
Bergman, Y. (1985), “Time preference and capital asset pricing models”, Journal of Financial Economics 14:145−159.
Bewley, T. (1988), “Knightian uncertainty”, Nancy Schwartz Lecture (Northwestern MEDS, Evanston).
Black, F., and M. Scholes (1973), “The pricing of options and corporate liabilities”, Journal of Political Economy 81:637−654.
Blume, L., A. Brandenburger and E. Dekel (1991), “Lexicographic probabilities and choice under uncertainty”, Econometrica 59:61−79.
Breeden, D.T. (1979), “An intertemporal asset pricing model with stochastic consumption and investment opportunities”, Journal of Financial Economics 7:265−296.
Breeden, D.T., and R.H. Litzenberger (1978), “Prices of state-contingent claims implicit in option prices”, Journal of Business 51:621−651.
Cass, D., and J.E. Stiglitz (1970), “The structure of investor preferences and separability in portfolio allocation: a contribution to the pure theory of mutual funds”, Journal of Economic Theory 2:122−160.
Cass, D., and J.E. Stiglitz (1972), “Risk aversion and wealth effects on portfolios with many assets”, Review of Economic Studies 39:331−354.
Chamberlain, G.
(1983), “A characterization of the distributions that imply mean-variance utility functions”, Journal of Economic Theory 29:185−201.
Constantinides, G. (1990), “Habit formation: a resolution of the equity premium puzzle”, Journal of Political Economy 98:519−543.
Cox, J.C., and S.A. Ross (1975), “A survey of some new results in financial option pricing theory”, Journal of Finance 31:383−402.
Cox, J.C., S.A. Ross and M. Rubinstein (1979), “Option pricing: a simplified approach”, Journal of Financial Economics 7:229−263.
Debreu, G. (1959), Theory of Value: An Axiomatic Analysis of Economic Equilibrium (Yale University Press, New Haven).
Duesenberry, J.S. (1949), Income, Saving, and the Theory of Consumer Behavior (Harvard University Press, Cambridge, MA).
Dybvig, P.H. (1983), “An explicit bound on individual assets’ deviations from APT pricing in a finite economy”, Journal of Financial Economics 12:483−496.
Dybvig, P.H. (1988a), “Distributional analysis of portfolio choice”, Journal of Business 61:369−393.
Dybvig, P.H. (1988b), “Inefficient dynamic portfolio strategies, or how to throw away a million dollars in the stock market”, Review of Financial Studies 1:67−88.
Dybvig, P.H. (1995), “Duesenberry’s ratcheting of consumption: optimal dynamic consumption and investment given intolerance for any decline in standard of living”, Review of Economic Studies 62:287−313.
Dybvig, P.H., and S.A. Ross (1985), “Differential information and performance measurement using a security market line”, Journal of Finance 40:383−399.
Dybvig, P.H., and S.A. Ross (1987), “Arbitrage”, in: J. Eatwell, M. Milgate and P. Newman, eds., The New Palgrave: A Dictionary of Economics (Macmillan, London) pp. 100–106.
Epstein, L.G., and S.E. Zin (1989), “Substitution, risk aversion, and the temporal behavior of consumption and asset returns: a theoretical framework”, Econometrica 57:937−969.
Fishburn, P. (1988), Nonlinear Preference and Utility Theory (Johns Hopkins, Baltimore).
Gibbons, M.R., S.A. Ross and J. Shanken (1989), “A test of the efficiency of a given portfolio”, Econometrica 57:1121−1152.
Grinblatt, M., and S. Titman (1983), “Factor pricing in a finite economy”, Journal of Financial Economics 12:497−507.
Hadar, J., and W.R. Russell (1969), “Rules for ordering uncertain prospects”, American Economic Review 59:25−34.
Hakansson, N.H. (1969), “Risk disposition and the separation property in portfolio selection”, Journal of Financial and Quantitative Analysis 4:401−416.
Harrison, J.M., and D.M. Kreps (1979), “Martingales and arbitrage in multiperiod securities markets”, Journal of Economic Theory 20:381−408.
Harrison, J.M., and S. Pliska (1981), “Martingales and stochastic integrals in the theory of continuous trading”, Stochastic Processes and Their Applications 11:215−260.
Hart, O.D. (1975), “Some negative results on the existence of comparative statics results in portfolio theory”, The Review of Economic Studies 42:615−621.
Herstein, I.N., and J. Milnor (1953), “An axiomatic approach to measurable utility”, Econometrica 21:291−297.
Hindy, A., and C. Huang (1992), “On intertemporal preferences for uncertain consumption: a continuous time approach”, Econometrica 60:781−801.
Jensen, M.C. (1969), “Risk, the pricing of capital assets, and the evaluation of investment portfolios”, Journal of Business 42:167−247.
Jouini, E., and H. Kallal (2001), “Efficient trading strategies in the presence of market frictions”, Review of Financial Studies 14:343−369.
Kahneman, D., and A. Tversky (1979), “Prospect theory: an analysis of decision under risk”, Econometrica 47:263−292.
Karlin, S.
(1959), Mathematical Methods and Theory in Games, Programming, and Economics (Addison-Wesley, Reading, MA).
Kihlström, R.E., D. Romer and S. Williams (1981), “Risk aversion with random initial wealth”, Econometrica 49:911−920.
Knight, F.H. (1921), Risk, Uncertainty, and Profit (Houghton Mifflin, New York).
Kreps, D., and E. Porteus (1978), “Temporal resolution of uncertainty and dynamic choice theory”, Econometrica 46:185−200.
Lintner, J. (1965), “The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets”, Review of Economics and Statistics 47:13−37.
Luce, R.D., and H. Raiffa (1957), Games and Decisions (Wiley, New York).
Machina, M.J. (1982), “‘Expected Utility’ analysis without the independence axiom”, Econometrica 50:277−324.
Markowitz, H. (1952), “Portfolio selection”, Journal of Finance 7:77−91.
Markowitz, H. (1959), Portfolio Selection: Efficient Diversification of Investments (Wiley, New York).
Mayers, D., and E.M. Rice (1979), “Measuring portfolio performance and the empirical content of asset pricing models”, Journal of Financial Economics 7:3−28.
Merton, R.C. (1971), “Optimal consumption and portfolio rules in a continuous-time model”, Journal of Economic Theory 3:373−413.
Merton, R.C. (1973), “An intertemporal capital asset pricing model”, Econometrica 41:867−887.
Pollak, R.A. (1970), “Habit formation and dynamic demand functions”, Journal of Political Economy 78:745−763.
Pratt, J.W. (1964), “Risk aversion in the small and the large”, Econometrica 32:122−136.
Pratt, J.W. (1976), “Risk aversion in the small and the large (erratum)”, Econometrica 44:420.
Quirk, J.P., and R. Saposnik (1962), “Admissibility and measurable utility functions”, Review of Economic Studies 29:140−146.
Ross, S.A. (1976a), “The arbitrage theory of capital asset pricing”, Journal of Economic Theory 13:341−360.
Ross, S.A. (1976b), “Options and efficiency”, Quarterly Journal of Economics 90:75−89.
Ross, S.A. (1976c), “Return, risk, and arbitrage”, in: I. Friend and J. Bicksler, eds., Risk and Return in Finance 1 (Ballinger, Cambridge, MA) pp. 189–218.
Ross, S.A. (1978a), “Mutual fund separation in financial theory – the separating distributions”, Journal of Economic Theory 17:254−286.
Ross, S.A. (1978b), “A simple approach to the valuation of risky streams”, Journal of Business 51:453−475.
Ross, S.A. (1981), “Some stronger measures of risk aversion in the small and the large”, Econometrica 49:621−638.
Rothschild, M., and J.E. Stiglitz (1970), “Increasing risk: I. A definition”, Journal of Economic Theory 2:225−243. See also “Addendum to ‘Increasing risk: I. A definition’ ”, Journal of Economic Theory 5:306.
Rothschild, M., and J.E. Stiglitz (1971), “Increasing risk: II. Its economic consequences”, Journal of Economic Theory 3:66−84.
Rubinstein, M. (1976), “The valuation of uncertain income streams and the pricing of options”, Bell Journal of Economics 7:407−425.
Samuelson, P.A. (1967), “General proof that diversification pays”, Journal of Financial and Quantitative Analysis 2:1−13.
Sharpe, W.F. (1964), “Capital asset prices: a theory of market equilibrium under conditions of risk”, Journal of Finance 19:425−442.
Sharpe, W.F. (1966), “Mutual fund performance”, Journal of Business 39:119−138.
Tobin, J. (1958), “Liquidity preference as behavior towards risk”, Review of Economic Studies 25:65−86.
Treynor, J. (1965), “How to rate management of investment funds”, Harvard Business Review 43:63−75.
von Neumann, J., and O.
Morgenstern (1944), Theory of Games and Economic Behavior (Princeton University Press, Princeton, NJ).
Chapter 11

INTERTEMPORAL ASSET PRICING THEORY

DARRELL DUFFIE°
Stanford University

Contents

Abstract 641
Keywords 641
1. Introduction 642
2. Basic theory 642
   2.1. Setup 643
   2.2. Arbitrage, state prices, and martingales 644
   2.3. Individual agent optimality 646
   2.4. Habit and recursive utilities 647
   2.5. Equilibrium and Pareto optimality 649
   2.6. Equilibrium asset pricing 651
   2.7. Breeden’s consumption-based CAPM 653
   2.8. Arbitrage and martingale measures 654
   2.9. Valuation of redundant securities 656
   2.10. American exercise policies and valuation 657
3. Continuous-time modeling 661
   3.1. Trading gains for Brownian prices 662
   3.2. Martingale trading gains 663
   3.3. The Black–Scholes option-pricing formula 665
   3.4. Ito’s Formula 668
   3.5. Arbitrage modeling 670
   3.6. Numeraire invariance 670
   3.7. State prices and doubling strategies 671
   3.8. Equivalent martingale measures 672
   3.9. Girsanov and market prices of risk 672
   3.10. Black–Scholes again 676
   3.11. Complete markets 677

° I am grateful for impetus from George Constantinides and René Stulz, and for inspiration and guidance from many collaborators and Stanford colleagues. Address: Graduate School of Business, Stanford University, Stanford CA 94305-5015 USA; or email at duffie@stanford.edu. The latest draft can be downloaded at www.stanford.edu/~duffie/.

Handbook of the Economics of Finance, Edited by G.M. Constantinides, M. Harris and R. Stulz

This chapter is a moderately revised and updated version of work that appeared originally in “Dynamic Asset Pricing Theory”, © 2002 Princeton University Press.
   3.12. Optimal trading and consumption 678
   3.13. Martingale solution to Merton’s problem 682
4. Term-structure models 686
   4.1. One-factor models 687
   4.2. Term-structure derivatives 691
   4.3. Fundamental solution 693
   4.4. Multifactor term-structure models 695
   4.5. Affine models 696
   4.6. The HJM model of forward rates 699
5. Derivative pricing 702
   5.1. Forward and futures prices 702
   5.2. Options and stochastic volatility 705
   5.3. Option valuation by transform analysis 708
6. Corporate securities 711
   6.1. Endogenous default timing 712
   6.2. Example: Brownian dividend growth 713
   6.3. Taxes, bankruptcy costs, capital structure 717
   6.4. Intensity-based modeling of default 719
   6.5. Zero-recovery bond pricing 721
   6.6. Pricing with recovery at default 722
   6.7. Default-adjusted short rate 724
References 725
Abstract

This is a survey of the basic theoretical foundations of intertemporal asset pricing theory. The broader theory is first reviewed in a simple discrete-time setting, emphasizing the key role of state prices. The existence of state prices is equivalent to the absence of arbitrage. State prices, which can be obtained from optimizing investors’ marginal rates of substitution, can be used to price contingent claims. In equilibrium, under locally quadratic utility, this leads to Breeden’s consumption-based capital asset pricing model. American options call for special handling. After extending the basic modeling approach to continuous-time settings, we turn to such applications as the dynamics of the term structure of interest rates, futures and forwards, option pricing under jumps and stochastic volatility, and the market valuation of corporate securities. The pricing of defaultable corporate debt is treated from a direct analysis of the incentives or ability of the firm to pay, and also by standard reduced-form methods that take as given an intensity process for default. This survey does not consider asymmetric information, and assumes price-taking behavior and the absence of transactions costs and many other market imperfections.

Keywords

asset pricing, state pricing, option pricing, interest rates, bond pricing

JEL classification: G12, G13, E43, E44
    • 642 D. Duffie 1. Introduction This is a survey of “classical” intertemporal asset pricing theory. A central objective of this theory is to reduce asset-pricing problems to the identification of “state prices”, a notion of Arrow (1953) from which any security has an implied value as the weighted sum of its future cash flows, state by state, time by time, with weights given by the associated state prices. Such state prices may be viewed as the marginal rates of substitution among state-time consumption opportunities, for any unconstrained investor, with respect to a numeraire good. Under many types of market imperfections, state prices may not exist, or may be of relatively less use or meaning. While market imperfections constitute an important thrust of recent advances in asset pricing theory, they will play a limited role in this survey, given the limitations of space and the priority that should be accorded to first principles based on perfect markets. Section 2 of this survey provides the conceptual foundations of the broader theory in a simple discrete-time setting. After extending the basic modeling approach to a continuous-time setting in Section 3, we turn in Section 4 to term-structure modeling, in Section 5 to derivative pricing, and in Section 6 to corporate securities. The theory of optimal portfolio and consumption choice is closely linked to the theory of asset pricing, for example through the relationship between state prices and marginal rates of substitution at optimality. While this connection is emphasized, for example in Sections 2.3–2.4 and 3.12–3.13, the theory of optimal portfolio and consumption choice, particularly in dynamic incomplete-markets settings, has become so extensive as to defy a proper summary in the context of a reasonably sized survey of asset-pricing theory. The interested reader is especially directed to the treatments of Karatzas and Shreve (1998) and Schroder and Skiadas (1999, 2002). 
For ease of reference, as there is at most one theorem per sub-section, we refer to a theorem by its subsection number, and likewise for lemmas and propositions. For example, the unique proposition of Section 2.9 is called “Proposition 2.9”. 2. Basic theory Radner (1967, 1972) originated our standard approach to a dynamic equilibrium of “plans, prices, and expectations,” extending the static approach of Arrow (1953) and Debreu (1953).1 After formulating this standard model, this section provides the equivalence of no arbitrage and state prices, and shows how state prices may be derived from investors’ marginal rates of substitution among state-time consumption opportunities. Given state prices, we examine the pricing of derivative securities, such 1 The model of Debreu (1953) appears in Chapter 7 of Debreu (1959). For more details in a finance setting, see Dothan (1990). The monograph by Magill and Quinzii (1996) is a comprehensive survey of the theory of general equilibrium in a setting such as this.
as European and American options, whose payoffs can be replicated by trading the underlying primitive securities.

2.1. Setup

We begin for simplicity with a setting in which uncertainty is modeled as some finite set Ω of states, with associated probabilities. We fix a set F of events, called a tribe, also known as a σ-algebra, which is the collection of subsets of Ω that can be assigned a probability. The usual rules of probability apply.2 We let P(A) denote the probability of an event A.

There are T + 1 dates: 0, 1, . . . , T. At each of these, a tribe F_t ⊂ F is the set of events corresponding to the information available at time t. Any event in F_t is known at time t to be true or false. We adopt the usual convention that F_t ⊂ F_s whenever t ≤ s, meaning that events are never "forgotten". For simplicity, we also take it that events in F_0 have probability 0 or 1, meaning roughly that there is no information at time t = 0. Taken altogether, the filtration F = {F_0, . . . , F_T}, sometimes called an information structure, represents how information is revealed through time. For any random variable Y, we let E_t(Y) = E(Y | F_t) denote the conditional expectation of Y given F_t. In order to simplify things, for any two random variables Y and Z, we always write "Y = Z" if the probability that Y ≠ Z is zero.

An adapted process is a sequence X = {X_0, . . . , X_T} such that, for each t, X_t is a random variable with respect to (Ω, F_t). Informally, this means that X_t is observable at time t. An adapted process X is a martingale if, for any times t and s > t, we have E_t(X_s) = X_t.

A security is a claim to an adapted dividend process, say d, with d_t denoting the dividend paid by the security at time t. Each security has an adapted security-price process S, so that S_t is the price of the security, ex dividend, at time t. That is, at each time t, the security pays its dividend d_t and is then available for trade at the price S_t.
This convention implies that d_0 plays no role in determining ex-dividend prices. The cum-dividend security price at time t is S_t + d_t.

We suppose that there are N securities defined by an R^N-valued adapted dividend process d = (d^(1), . . . , d^(N)). These securities have some adapted price process S = (S^(1), . . . , S^(N)). A trading strategy is an adapted process q in R^N. Here, q_t represents the portfolio held after trading at time t. The dividend process d^q generated by a trading strategy q is defined by

    d^q_t = q_{t−1} · (S_t + d_t) − q_t · S_t,    (1)

with "q_{−1}" taken to be zero by convention.

2 The triple (Ω, F, P) is a probability space, as defined for example by Jacod and Protter (2000).
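For concreteness, Equation (1) can be evaluated in a small numerical sketch; the prices, dividends and strategy below are purely illustrative and not drawn from the text:

```python
def strategy_dividends(theta, S, d):
    """Dividend process generated by a trading strategy, Equation (1):
    d^q_t = q_{t-1} . (S_t + d_t) - q_t . S_t,  with q_{-1} = 0."""
    dot = lambda x, y: sum(a * b for a, b in zip(x, y))
    dq = []
    q_prev = [0.0] * len(S[0])                    # q_{-1} = 0 by convention
    for q_t, S_t, d_t in zip(theta, S, d):
        cum = [s + x for s, x in zip(S_t, d_t)]   # cum-dividend price S_t + d_t
        dq.append(dot(q_prev, cum) - dot(q_t, S_t))
        q_prev = q_t
    return dq

# Hypothetical two-security, three-date example (all numbers illustrative):
S = [[10.0, 1.0], [11.0, 1.0], [0.0, 0.0]]     # ex-dividend prices, S_T = 0
d = [[0.0, 0.0], [0.5, 0.05], [12.0, 1.05]]    # dividends
theta = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]   # buy security 1 at 0, sell at 1
dq = strategy_dividends(theta, S, d)           # [-10.0, 11.5, 0.0]
```

The first entry is the (negative of the) cost of setting up the position; later entries are the strategy's net payouts.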
2.2. Arbitrage, state prices, and martingales

Given a dividend–price pair (d, S) for N securities, a trading strategy q is an arbitrage if d^q > 0 (that is, if d^q ≥ 0 and d^q ≠ 0). An arbitrage is thus a trading strategy that costs nothing to form, never generates losses, and, with positive probability, will produce strictly positive gains at some time. One of the precepts of modern asset pricing theory is a notion of efficient markets under which there is no arbitrage. This is a reasonable axiom, for in the presence of an arbitrage, any rational investor who prefers to increase his dividends would undertake such arbitrages without limit, so markets could not be in equilibrium, in a sense that we shall see more formally later in this section. We will first explore the implications of no arbitrage for the representation of security prices in terms of "state prices", the first step toward which is made with the following result.

Proposition. There is no arbitrage if and only if there is a strictly positive adapted process p such that, for any trading strategy q,

    E( ∑_{t=0}^{T} p_t d^q_t ) = 0.

Proof: Let Q denote the space of trading strategies. For any q and f in Q and scalars a and b, we have a d^q + b d^f = d^{aq + bf}. Thus, the marketed subspace M = {d^q : q ∈ Q} of dividend processes generated by trading strategies is a linear subspace of the space L of adapted processes. Let L_+ = {c ∈ L : c ≥ 0}. There is no arbitrage if and only if the cone L_+ and the marketed subspace M intersect precisely at zero. Suppose there is no arbitrage. The Separating Hyperplane Theorem, in a version for closed convex cones that is sometimes called Stiemke's Lemma [see Appendix B of Duffie (2001)], implies the existence of a nonzero linear functional F such that F(x) < F(y) for each x in M and each nonzero y in L_+. Since M is a linear subspace, this implies that F(x) = 0 for each x in M, and thus that F(y) > 0 for each nonzero y in L_+. This implies that F is strictly increasing.
By the Riesz representation theorem, for any such linear functional F there is a unique adapted process p, called the Riesz representation of F, such that

    F(x) = E( ∑_{t=0}^{T} p_t x_t ),    x ∈ L.

As F is strictly increasing, p is strictly positive, that is, P(p_t > 0) = 1 for all t. The converse follows from the fact that if d^q > 0 and p is a strictly positive process, then E( ∑_{t=0}^{T} p_t d^q_t ) > 0.
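In a one-period, two-state market the proposition reduces to solving a linear system: the market is arbitrage-free exactly when the state prices solving S_0(i) = ∑_s ψ(s) X(i, s) are strictly positive, where X(i, s) is security i's cum-dividend payoff in state s. A sketch with hypothetical numbers:

```python
# Hypothetical one-period, two-state market (all numbers illustrative).
X = [[11.0, 9.0],     # risky security payoff in states (up, down)
     [1.05, 1.05]]    # riskless payoff
S0 = [10.0, 1.0]      # time-0 prices

# Solve the 2x2 linear system  X . psi = S0  by Cramer's rule.
det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
psi = [(S0[0] * X[1][1] - X[0][1] * S0[1]) / det,
       (X[0][0] * S0[1] - S0[0] * X[1][0]) / det]

# No arbitrage if and only if every state price is strictly positive.
no_arbitrage = all(p > 0 for p in psi)
```

For these numbers the state prices are (5/7, 5/21), both positive, and their sum equals the riskless discount factor 1/1.05.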
For convenience, we call any strictly positive adapted process a deflator. A deflator p is a state-price density if, for all t,

    S_t = (1/p_t) E_t( ∑_{j=t+1}^{T} p_j d_j ).    (2)

A state-price density is sometimes called a state-price deflator, a pricing kernel, or a marginal-rate-of-substitution process. For t = T, the right-hand side of Equation (2) is zero, so S_T = 0 whenever there is a state-price density. It can be shown as an exercise that a deflator p is a state-price density if and only if, for any trading strategy q,

    q_t · S_t = (1/p_t) E_t( ∑_{j=t+1}^{T} p_j d^q_j ),    t < T,    (3)

meaning roughly that the market value of a trading strategy is, at any time, the state-price discounted expected future dividends generated by the strategy.

The gain process G for (d, S) is defined by G_t = S_t + ∑_{j=1}^{t} d_j, the price plus accumulated dividend. Given a deflator g, the deflated gain process G^g is defined by G^g_t = g_t S_t + ∑_{j=1}^{t} g_j d_j. We can think of deflation as a change of numeraire.

Theorem. The dividend–price pair (d, S) admits no arbitrage if and only if there is a state-price density. A deflator p is a state-price density if and only if S_T = 0 and the state-price-deflated gain process G^p is a martingale.

Proof: It can be shown as an easy exercise that a deflator p is a state-price density if and only if S_T = 0 and the state-price-deflated gain process G^p is a martingale. Suppose there is no arbitrage. Then S_T = 0, for otherwise the strategy q is an arbitrage when defined by q_t = 0, t < T, q_T = −S_T. By the previous proposition, there is some deflator p such that E( ∑_{t=0}^{T} p_t d^q_t ) = 0 for any strategy q. We must prove Equation (2), or equivalently, that G^p is a martingale. Doob's Optional Sampling Theorem states that an adapted process X is a martingale if and only if E(X_τ) = X_0 for any stopping time τ ≤ T.
Consider, for an arbitrary security n and an arbitrary stopping time τ ≤ T, the trading strategy q defined by q^(k) = 0 for k ≠ n, and q^(n)_t = 1 for t < τ, with q^(n)_t = 0 for t ≥ τ. Since E( ∑_{t=0}^{T} p_t d^q_t ) = 0, we have

    E( −S^(n)_0 p_0 + ∑_{t=1}^{τ} p_t d^(n)_t + p_τ S^(n)_τ ) = 0,

implying that the p-deflated gain process G^{n,p} of security n satisfies G^{n,p}_0 = E(G^{n,p}_τ). Since τ is arbitrary, G^{n,p} is a martingale, and since n is arbitrary, G^p is a martingale. This shows that absence of arbitrage implies the existence of a state-price density. The converse is easy.
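The martingale characterization in the theorem can be checked in a one-step binomial sketch. The move sizes, short rate and probabilities below are assumptions made purely for illustration; the deflator is built from the ratio of risk-neutral to actual move probabilities:

```python
# One-step binomial sketch (illustrative parameters, not from the text).
# With deflator p_1 = (likelihood ratio of risk-neutral over actual
# probabilities) / (1 + r) and p_0 = 1, the deflated price of a
# non-dividend-paying security is a martingale: E(p_1 * S_1) = S_0.
u, dn, r = 1.2, 0.9, 0.05
p_up = 0.6                              # actual probability of an up move
q_up = (1 + r - dn) / (u - dn)          # risk-neutral up probability
S0 = 100.0

expected = 0.0
for move, prob in ((u, p_up), (dn, 1.0 - p_up)):
    q_prob = q_up if move == u else 1.0 - q_up
    deflator = (q_prob / prob) / (1.0 + r)    # p_1 in this state
    expected += prob * deflator * (S0 * move)

martingale_gap = expected - S0               # zero, up to rounding
```

The gap vanishes because q_up is constructed so that the risk-neutral expected gross return equals 1 + r.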
The proof is motivated by those of Harrison and Kreps (1979) and Harrison and Pliska (1981) for a similar result to follow in this section regarding the notion of an "equivalent martingale measure". Ross (1987), Prisman (1985), Kabanov and Stricker (2001), and Schachermayer (2001) show the impact of taxes or transactions costs on the state-pricing model.

2.3. Individual agent optimality

We introduce an agent, defined by a strictly increasing3 utility function U on the set L_+ of nonnegative adapted "consumption" processes, and by an endowment process e in L_+. Given a dividend–price process (d, S), a trading strategy q leaves the agent with the total consumption process e + d^q. Thus the agent has the budget-feasible consumption set

    C = {e + d^q ∈ L_+ : q ∈ Q},

and the problem

    sup_{c ∈ C} U(c).    (4)

The existence of a solution to Problem (4) implies the absence of arbitrage. Conversely, if U is continuous,4 then the absence of arbitrage implies that there exists a solution to Problem (4). (This follows from the fact that the feasible consumption set C is compact if and only if there is no arbitrage.)

Assuming that (4) has a strictly positive solution c* and that U is continuously differentiable at c*, we can use the first-order conditions for optimality to characterize security prices in terms of the derivatives of the utility function U at c*. Specifically, for any c in L, the derivative of U at c* in the direction c is g′(0), where g(a) = U(c* + ac) for any scalar a sufficiently small in absolute value. That is, g′(0) is the marginal rate of improvement of utility as one moves in the direction c away from c*. This directional derivative is denoted ∇U(c*; c). Because U is continuously differentiable at c*, the function that maps c to ∇U(c*; c) is linear. Since d^q is a budget-feasible direction of change for any trading strategy q, the first-order conditions for optimality of c* imply that

    ∇U(c*; d^q) = 0,    q ∈ Q.
We now have a characterization of a state-price density.

3 A function f : L → R is strictly increasing if f(c) > f(b) whenever c > b.
4 For purposes of checking continuity or the closedness of sets in L, we will say that c_n converges to c if E( ∑_{t=0}^{T} |c_n(t) − c(t)| ) → 0. Then U is continuous if U(c_n) → U(c) whenever c_n → c.
Proposition. Suppose that Problem (4) has a strictly positive solution c* and that U has a strictly positive continuous derivative at c*. Then there is no arbitrage, and a state-price density is given by the Riesz representation p of ∇U(c*), defined by

    ∇U(c*; x) = E( ∑_{t=0}^{T} p_t x_t ),    x ∈ L.

The Riesz representation of the utility gradient is also sometimes called the marginal-rates-of-substitution process. Despite our standing assumption that U is strictly increasing, ∇U(c*; ·) need not in general be strictly increasing, but is so if U is concave.

As an example, suppose U has the additive form

    U(c) = E( ∑_{t=0}^{T} u_t(c_t) ),    c ∈ L_+,    (5)

for some u_t : R_+ → R, t ≥ 0. It is an exercise to show that if ∇U(c) exists, then

    ∇U(c; x) = E( ∑_{t=0}^{T} u′_t(c_t) x_t ).    (6)

If, for all t, u_t is concave with an unbounded derivative and e is strictly positive, then any solution c* to Problem (4) is strictly positive.

Corollary. Suppose U is defined by Equation (5). Under the conditions of the Proposition, for any time t < T,

    S_t = (1/u′_t(c*_t)) E_t( u′_{t+1}(c*_{t+1}) (S_{t+1} + d_{t+1}) ).

This result is often called the stochastic Euler equation, made famous in a time-homogeneous Markov setting by Lucas (1978). A precursor is due to LeRoy (1973).

2.4. Habit and recursive utilities

The additive utility model is extremely restrictive, and routinely found to be inconsistent with experimental evidence on choice under uncertainty, as for example in Plott (1986). We will illustrate the state pricing associated with some simple extensions of the additive utility model, such as "habit-formation" utility and "recursive utility".
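Before turning to these extensions, the stochastic Euler equation above can be illustrated numerically. The CRRA felicity u_t(c) = β^t c^{1−γ}/(1−γ), so that u′_t(c) = β^t c^{−γ}, and all consumption and payoff numbers below are assumptions made purely for the sketch:

```python
# Sketch of the stochastic Euler equation under an assumed CRRA felicity;
# all numbers are hypothetical.
beta, gamma = 0.95, 2.0
c_now = 1.0                                  # optimal consumption at t
states = [                                   # (probability, c_{t+1}, S_{t+1}+d_{t+1})
    (0.5, 1.10, 11.0),
    (0.5, 0.95, 9.0),
]

# S_t = E_t[ u'_{t+1}(c_{t+1}) (S_{t+1} + d_{t+1}) ] / u'_t(c_t),
# with u'_{t+1}(c_{t+1}) / u'_t(c_t) = beta * (c_{t+1}/c_t)^(-gamma).
S_t = sum(
    prob * beta * (c_next / c_now) ** (-gamma) * payoff
    for prob, c_next, payoff in states
)
```

The payoff in the low-consumption state is weighted up by the high marginal rate of substitution there, which is the mechanism behind the consumption-based pricing results later in the chapter.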
An example of a habit-formation utility is some U : L_+ → R with

    U(c) = E( ∑_{t=0}^{T} u(c_t, h_t) ),

where u : R_+ × R → R is continuously differentiable and, for any t, the "habit" level of consumption is defined by h_t = ∑_{j=1}^{t} a_j c_{t−j} for some a ∈ R^T_+. For example, we could take a_j = g^j for g ∈ (0, 1), which gives geometrically declining weights on past consumption. A natural motivation is that the relative desire to consume may be increased if one has become accustomed to high levels of consumption. By applying the chain rule, we can calculate the Riesz representation p of the gradient of U at a strictly positive consumption process c as

    p_t = u_c(c_t, h_t) + E_t( ∑_{s>t} u_h(c_s, h_s) a_{s−t} ),

where u_c and u_h denote the partial derivatives of u with respect to its first and second arguments, respectively. The habit-formation utility model was developed by Dunn and Singleton (1986) and, in continuous time, by Ryder and Heal (1973), and has been applied to asset-pricing problems by Constantinides (1990), Sundaresan (1989) and Chapman (1998).

Recursive utility, inspired by Koopmans (1960), Kreps and Porteus (1978) and Selden (1978), was developed for general discrete-time multi-period asset-pricing applications by Epstein and Zin (1989), who take a utility of the form U(c) = V_0, where the "utility process" V is defined recursively, backward in time from T, by

    V_t = F(c_t, ~V_{t+1} | F_t),

where ~V_{t+1} | F_t denotes the probability distribution of V_{t+1} given F_t, where F is a measurable real-valued function whose first argument is a non-negative real number and whose second argument is a probability distribution, and finally where we take V_{T+1} to be a fixed exogenously specified random variable. One may view V_t as the utility at time t for present and future consumption, noting the dependence on the future consumption stream through the conditional distribution of the following period's utility.
As a special case, for example, consider F(x, m) = f (x, E[h(Ym)]) , (7) where f is a function in two real variables, h(·) is a “felicity” function in one variable, and Ym is any random variable whose probability distribution is m. This special case of the “Kreps–Porteus utility” aggregates the role of the conditional distribution of future consumption through an “expected utility of next period’s utility”. If h and J
    • Ch. 11: Intertemporal Asset Pricing Theory 649 are concave and increasing functions, then U is concave and increasing. If h(v) = v and if f (x, y) = u(x) + by for some u: R+ → R and constant b > 0, then (for VT + 1 = 0) we recover the special case of additive utility given by U(c) = E t bt u(ct) . “Non-expected-utility” aggregation of future consumption utility can be based, for example, upon the local-expected-utility model of Machina (1982) and the betweenness-certainty-equivalent model of Chew (1983, 1989), Dekel (1989) and Gul and Lantto (1990). With recursive utility, as opposed to additive utility, it need not be the case that the degree of risk aversion is completely determined by the elasticity of intertemporal substitution. For the special case (Equation 7) of expected-utility aggregation, and with differentiability throughout, we have the utility gradient representation pt = f1 (ct, Et [h (Vt + 1)]) s < t f2 (cs, Es [h (Vs + 1)]) Es h (Vs + 1) , where fi denotes the partial derivative of f with respect to its ith argument. Recursive utility allows for preference over early or late resolution of uncertainty (which have no impact on additive utility). This is relevant for asset prices, as for example in the context of remarks by Ross (1989), and as shown by Skiadas (1998) and Duffie, Schroder and Skiadas (1997). Grant, Kajii and Polak (2000) have more to say on preferences for the resolution of information. The equilibrium state-price density associated with recursive utility is computed in a Markovian setting by Kan (1995).5 For further justification and properties of recursive utility, see Chew and Epstein (1991) and Skiadas (1997, 1998). For further implications for asset pricing, see Epstein (1988, 1992), Epstein and Zin (1999) and Giovannini and Weil (1989). 2.5. Equilibrium and Pareto optimality Now, we explore the implications of multi-agent equilibrium for state prices. 
A key objective is to link state prices with important macro-economic variables that are, hopefully, observable, such as total economy-wide consumption. Suppose there are m agents. Agent i is defined as above by a strictly increasing utility function Ui: L+ → R and an endowment process e(i) in L+. Given a dividend 5 Kan (1993) further explored the utility gradient representation of recursive utility in this setting.
    • 650 D. Duffie process d for N securities, an equilibrium is a collection (q(1) , . . . , q(m) , S), where S is a security-price process and, for each agent i, q(i) is a trading strategy solving sup q ∈ Q Ui e(i) + dq , with m i = 1 q(i) = 0. We define markets to be complete if, for each process x in L, there is some trading strategy q with dq t = xt, t 1. Complete markets thus means that any consumption process x can be obtained by investing some amount at time 0 in a trading strategy that, at each future period t, generates the dividend xt. The First Welfare Theorem is that complete-markets equilibria provide efficient consumption allocations. Specifically, an allocation (c(1) , . . . , c(m) ) of consumption processes to the m agents is feasible if c(1) + · · · + c(m) e(1) + · · · + e(m) , and is Pareto optimal if there is no feasible allocation (b(1) , . . . , b(m) ) such that Ui(b(i) ) Ui(c(i) ) for all i, with strict inequality for some i. Any equilibrium (q(1) , . . . , q(m) , S) has an associated feasible consumption allocation (c(1) , . . . , c(m) ) defined by letting c(i) − e(i) be the dividend process generated by q(i) . First Welfare Theorem. Suppose (q(1) , . . . , q(m) , S) is an equilibrium and markets are complete. Then the associated consumption allocation is Pareto optimal. An easy proof is due to Arrow (1951). Suppose, with the objective of obtaining a contradiction, that (c(1) , . . . , c(m) ) is the consumption allocation of a complete- markets equilibrium and that there is a feasible allocation (b(1) , . . . , b(m) ) such that Ui(b(i) ) Ui(c(i) ) for all i, with strict inequality for some i. Because of equilibrium, there is no arbitrage, and therefore a state-price density p. For any consumption process x, let p · x = E( t ptxt). We have p · b(i) p · c(i) , for otherwise, given complete markets, the utility of c(i) can be increased strictly by some feasible trading strategy generating b(i) − e(i) . 
Similarly, for at least some agent, we also have p · b(i) > p · c(i) . Thus p · i b(i) > p · i c(i) = p · i e(i) , the equality from the market-clearing condition i q(i) = 0. This is impossible, however, for feasibility implies that i b(i) i e(i) . This contradiction implies the result. Duffie and Huang (1985) characterize the number of securities necessary for complete markets. Roughly speaking, extending the spanning insight of Arrow (1953) to allow for dynamic spanning, it is necessary (and generically sufficient) that there are at least as many securities as the maximal number of mutually exclusive events of positive conditional probability that could be revealed between two dates. For example, if the information generated at each date is that of a coin toss, then complete markets requires a minimum of two securities, and almost any two will suffice. Cox, Ross
    • Ch. 11: Intertemporal Asset Pricing Theory 651 and Rubinstein (1979) provide the classical example in which one of the original securities has “binomial” returns and the other has riskless returns. That is, S = (Y, Z) is strictly positive, and, for all t < T, we have dt = 0, Yt + 1/Yt is a Bernoulli trial, and Zt + 1/Zt is a constant. More generally, however, to be assured of complete markets given the minimal number of securities, one must verify that the price process, which is endogenous, is not among the rare set that is associated with a reduced market span, a point emphasized by Hart (1975) and dealt with by Magill and Shafer (1990). In general, the dependence of the marketed subspace on endogenous security price processes makes the demonstration and calculation of an equilibrium problematic. Conditions for the generic existence of equilibrium in incomplete markets are given by Duffie and Shafer (1985, 1986). The literature on this topic is extensive.6 Hahn (1994) raises some philosophical issues regarding the possibility of complete markets and efficiency, in a setting in which endogenous uncertainty may be of concern to investors. The Pareto inefficiency of incomplete markets equilibrium consumption allocations, and notions of constrained efficiency, are discussed by Hart (1975), Kreps (1979) (and references therein), Citanna, Kajii and Villanacci (1994), Citanna and Villanacci (1993) and Pan (1993, 1995). The optimality of individual portfolio and consumption choices in incomplete markets in this setting is given a dual interpretation by He and Pag`es (1993). [Girotto and Ortu (1994) offer related remarks.] Methods for computation of equilibrium with incomplete markets are developed by Brown, DeMarzo and Eaves (1996a,b), Cuoco and He (1994), DeMarzo and Eaves (1996) and Dumas and Maenhout (2002). Kraus and Litzenberger (1975) and Stapleton and Subrahmanyam (1978) gave early parametric examples of equilibrium. 2.6. 
Equilibrium asset pricing We will review a representative-agent state-pricing model of Constantinides (1982). The idea is to deduce a state-price density from aggregate, rather than individual, consumption behavior. Among other advantages, this allows for a version of the 6 Bottazzi (1995) has a somewhat more advanced version of existence in single-period multiple- commodity version. Related existence topics are studied by Bottazzi and Hens (1996), Hens (1991) and Zhou (1997). The literature is reviewed in depth by Geanakoplos (1990). Alternative proofs of existence of equilibrium are given in the 2-period version of the model by Geanakoplos and Shafer (1990), Hirsch, Magill and Mas-Colell (1990) and Husseini, Lasry and Magill (1990); and in a T- period version by Florenzano and Gourdel (1994). If one defines security dividends in nominal terms, rather than in units of consumption, then equilibria always exist under standard technical conditions on preferences and endowments, as shown by Cass (1984), Werner (1985), Duffie (1987) and Gottardi and Hens (1996), although equilibrium may be indeterminate, as shown by Cass (1989) and Geanakoplos and Mas-Colell (1989). On this point, see also Kydland and Prescott (1991), Mas-Colell (1991) and Cass (1991). Surveys of general equilibrium models in incomplete markets settings are given by Cass (1991), Duffie (1992), Geanakoplos (1990), Magill and Quinzii (1996) and Magill and Shafer (1991). Hindy and Huang (1993) show the implications of linear collateral constraints on security valuation.
    • 652 D. Duffie consumption-based capital asset pricing model of Breeden (1979) in the special case of locally-quadratic utility. We define, for each vector l in Rm + of “agent weights”, the utility function Ul: L+ → R by Ul(x) = sup (c(1), ..., c(m)) m i = 1 liUi(ci ) subject to c(1) + · · · + c(m) x. (8) Proposition. Suppose for all i that Ui is concave and strictly increasing. Suppose that (q(1) , . . . , q(m) , S) is an equilibrium and that markets are complete. Then there exists some nonzero l ∈ Rm + such that (0, S) is a (no-trade) equilibrium for the one-agent economy [(Ul, e), d], where e = e(1) + · · · + e(m) . With this l and with x = e = e(1) + · · · + e(m) , problem (8) is solved by the equilibrium consumption allocation. A method of proof, as well as the intuition for this proposition, is that with complete markets, a state-price density p represents Lagrange multipliers for consumption in the various periods and states for all of the agents simultaneously, as well as for some representative agent (Ul, e), whose agent-weight vector l defines a hyperplane separating the set of feasible utility improvements from Rm + . [See, for example, Duffie (2001) for details. This notion of “representative agent” is weaker than that associated with aggregation in the sense of Gorman (1953).] Corollary 1. If, moreover, Ul is continuously differentiable at e, then l can be chosen so that a state-price density is given by the Riesz representation of ∇Ul(e). Corollary 2. Suppose, for each i, that Ui is of the additive form Ui(c) = E T t = 0 uit(ct) . Then Ul is also additive, with Ul(c) = E T t = 0 ult(ct) , where ult( y) = sup x ∈ Rm + m i = 1 liuit(xi) subject to x1 + · · · + xm y. In this case, the differentiability of Ul at e implies that for any times t and t t, St = 1 ult(et) Et ⎡ ⎣ult (et ) St + t j = t + 1 ulj(ej)dj ⎤ ⎦ . (9)
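The construction in problem (8) can be illustrated for two agents with assumed logarithmic felicities u_i(x) = log(x), a case chosen here only because the optimal allocation has the simple closed form x_i = λ_i y / (λ_1 + λ_2); a brute-force grid search confirms it:

```python
import math

# Representative-agent sketch: two agents, assumed log felicities, weights lam.
# Maximize lam1*log(x1) + lam2*log(x2) subject to x1 + x2 = y.
lam1, lam2, y = 2.0, 1.0, 3.0

# Brute-force the allocation on a fine grid (endpoints excluded to avoid log 0).
best_x1, best_val = None, -float("inf")
n = 100_000
for k in range(1, n):
    x1 = y * k / n
    val = lam1 * math.log(x1) + lam2 * math.log(y - x1)
    if val > best_val:
        best_x1, best_val = x1, val

# Closed-form proportional allocation, which the grid search should recover.
x1_closed = lam1 * y / (lam1 + lam2)    # = 2.0
```

The resulting maximized value, as a function of y, is the representative agent's felicity u_λ(y), whose derivative prices securities as in Equation (9).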
    • Ch. 11: Intertemporal Asset Pricing Theory 653 2.7. Breeden’s consumption-based CAPM The consumption-based capital asset-pricing model (CAPM) of Breeden (1979) extends the results of Rubinstein (1976) by showing that if agents have additive utility that is, locally quadratic, then expected asset returns are linear with respect to their covariances with aggregate consumption, as will be stated more carefully shortly. Notably, the result does not depend on complete markets. Locally quadratic additive utility is an extremely strong assumption. (It does not violate monotonicity, as utility need not be quadratic at all levels). Breeden actually worked in a continuous-time setting of Brownian information, reviewed shortly, within which smooth additive utility functions are automatically locally quadratic, in a sense that is sufficient to recover a continuous-time analogue of the following consumption-based CAPM.7 In a one- period setting, the consumption-based CAPM corresponds to the classical CAPM of Sharpe (1964). First, we need some preliminary definitions. The return at time t + 1 on a trading strategy q whose market value qt · St is non-zero is Rq t + 1 = qt · (St + 1 + dt + 1) qt · St . There is short-term riskless borrowing if, for each given time t < T, there is a trading strategy q with Ft-conditionally deterministic return, denoted rt. We refer to the sequence {r0, r1, . . . , rT − 1} of such short-term risk-free returns as the associated “short-rate process”, even though rT is not defined. Conditional on Ft, we let vart(·) and covt(·) denote variance and covariance, respectively. Proposition: Consumption-based CAPM. Suppose, for each agent i, that the utility Ui(·) is of the additive form Ui(c) = E[ T t = 0 uit(ct)], and moreover that, for equilibrium consumption processes c(1) , . . . , c(m) , we have uit(c(i) t ) = ait + bitc(i) t , where ait and bit > 0 are constants. Let S be the associated equilibrium price process of the securities. 
Then, for any time t,

    S_t = A_t E_t(d_{t+1} + S_{t+1}) − B_t E_t[ (S_{t+1} + d_{t+1}) e_{t+1} ],

for adapted strictly positive scalar processes A and B.

For a given time t, suppose that there is riskless borrowing at the short rate r_t. Then there is a trading strategy with the property that its return R*_{t+1} has maximal F_t-conditional correlation with the aggregate consumption e_{t+1} (among all trading strategies). Suppose, moreover, that

7 For a theorem and proof, see Duffie and Zame (1989).
there is riskless borrowing at the short rate r_t and that var_t(R*_{t+1}) is strictly positive. Then, for any trading strategy q with return R^q_{t+1},

    E_t( R^q_{t+1} ) − r_t = b^q_t [ E_t( R*_{t+1} ) − r_t ],

where

    b^q_t = cov_t(R^q_{t+1}, R*_{t+1}) / var_t(R*_{t+1}).

The essence of the result is that the expected return of any security, in excess of risk-free rates, is increasing in the degree to which the security's return depends (in the sense of regression) on aggregate consumption. This is natural; there is an average preference in favor of securities that are hedges against aggregate economic performance. While the consumption-based CAPM does not depend on complete markets, its reliance on locally-quadratic expected utility, and otherwise perfect markets, is limiting, and its empirical performance is mixed, at best. For some evidence, see for example Hansen and Jagannathan (1990).

2.8. Arbitrage and martingale measures

This section shows the equivalence between the absence of arbitrage and the existence of "risk-neutral" probabilities, under which, roughly speaking, the price of a security is the sum of its expected discounted dividends. This idea, stemming from Cox and Ross (1976), was developed into the notion of equivalent martingale measures by Harrison and Kreps (1979).

We suppose throughout this subsection that there is short-term riskless borrowing at some uniquely defined short-rate process r. We can define, for any times t and τ ≤ T,

    R_{t,τ} = (1 + r_t)(1 + r_{t+1}) · · · (1 + r_{τ−1}),

the payback at time τ of one unit of account borrowed risklessly at time t and "rolled over" in short-term borrowing repeatedly until date τ. It would be a simple situation, both computationally and conceptually, if any security's price were merely the expected discounted dividends of the security. Of course, this is unlikely to be the case in a market with risk-averse investors.
We can nevertheless come close to this sort of characterization of security prices by adjusting the original probability measure P. For this, we define a new probability measure Q to be equivalent to P if Q and P assign zero probabilities to the same events. An equivalent probability measure Q is an equivalent martingale measure if

    S_t = E^Q_t( ∑_{j=t+1}^{T} d_j / R_{t,j} ),    t < T,

where E^Q denotes expectation under Q, and E^Q_t(X) = E^Q(X | F_t) for any random variable X.
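In a one-period market with hypothetical payoffs, risk-neutral probabilities can be read off strictly positive state prices by scaling with the riskless return, q(s) = ψ(s)(1 + r), after which the price is the expected payoff under Q discounted at r:

```python
# One-period sketch (all numbers hypothetical, two states: up, down).
X_risky = [12.0, 8.0]       # risky cum-dividend payoff S_1 + d_1 per state
X_safe = [1.04, 1.04]       # riskless payoff, so 1 + r = 1.04
S0_risky, S0_safe = 9.5, 1.0

# State prices psi from the 2x2 system (Cramer's rule).
det = X_risky[0] * X_safe[1] - X_risky[1] * X_safe[0]
psi = [(S0_risky * X_safe[1] - X_risky[1] * S0_safe) / det,
       (X_risky[0] * S0_safe - S0_risky * X_safe[0]) / det]

r = X_safe[0] / S0_safe - 1.0
q = [p * (1.0 + r) for p in psi]    # risk-neutral probabilities, summing to one
price = sum(qs * x for qs, x in zip(q, X_risky)) / (1.0 + r)  # recovers 9.5
```

Discounting expected payoffs under Q at the short rate reproduces the market price, which is exactly the equivalent-martingale-measure property in the definition above.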
Ch. 11: Intertemporal Asset Pricing Theory 655

It is easy to show that $Q$ is an equivalent martingale measure if and only if, for any trading strategy $q$,
$$q_t \cdot S_t = E^Q_t\!\left( \sum_{j=t+1}^{T} \frac{d^q_j}{R_{t,j}} \right), \quad t < T. \qquad (10)$$
We will show that the absence of arbitrage is equivalent to the existence of an equivalent martingale measure.

The deflator $g$ defined by $g_t = R^{-1}_{0,t}$ defines the discounted gain process $G^g$, by
$$G^g_t = g_t S_t + \sum_{j=1}^{t} g_j d_j.$$
The word "martingale" in the term "equivalent martingale measure" comes from the following equivalence.

Lemma. A probability measure $Q$ equivalent to $P$ is an equivalent martingale measure for $(d, S)$ if and only if $S_T = 0$ and the discounted gain process $G^g$ is a martingale with respect to $Q$.

If, for example, a security pays no dividends before $T$, then the property described by the lemma is that the discounted price process is a $Q$-martingale.

We already know that the absence of arbitrage is equivalent to the existence of a state-price density $p$. A probability measure $Q$ equivalent to $P$ can be defined in terms of a Radon–Nikodym derivative, a strictly positive random variable $\frac{dQ}{dP}$ with $E\!\left(\frac{dQ}{dP}\right) = 1$, via the definition of expectation with respect to $Q$ given by $E^Q(Z) = E\!\left(\frac{dQ}{dP} Z\right)$, for any random variable $Z$. We will consider the measure $Q$ defined by $\frac{dQ}{dP} = x_T$, where
$$x_T = \frac{p_T R_{0,T}}{p_0}.$$
(Indeed, one can check by applying the definition of a state-price density to the payoff $R_{0,T}$ that $x_T$ is strictly positive and of expectation 1.) The density process $x$ for $Q$ is defined by $x_t = E_t(x_T)$. Bayes' Rule implies that for any times $t$ and $j > t$, and any $\mathcal{F}_j$-measurable random variable $Z_j$,
$$E^Q_t(Z_j) = \frac{1}{x_t} E_t(x_j Z_j). \qquad (11)$$
Fixing some time $t < T$, consider a trading strategy $q$ that invests one unit of account at time $t$ and repeatedly rolls the value over in short-term riskless borrowing until time $T$, with final value $R_{t,T}$. That is, $q_t \cdot S_t = 1$ and $d^q_T = R_{t,T}$.
Relation (3) then implies that
$$p_t = E_t\!\left(p_T R_{t,T}\right) = E_t\!\left(\frac{p_T R_{0,T}}{R_{0,t}}\right) = \frac{E_t(x_T p_0)}{R_{0,t}} = \frac{x_t p_0}{R_{0,t}}. \qquad (12)$$
From Equations (11), (12), and the definition of a state-price density, Equation (10) is satisfied, so $Q$ is indeed an equivalent martingale measure. We have shown the following result.
Theorem. There is no arbitrage if and only if there exists an equivalent martingale measure. Moreover, $p$ is a state-price density if and only if an equivalent martingale measure $Q$ has the density process $x$ defined by $x_t = R_{0,t}\, p_t / p_0$.

This martingale approach simplifies many asset-pricing problems that might otherwise appear to be quite complex, and applies much more generally than indicated here. For example, the assumption of short-term borrowing is merely a convenience, and one can typically obtain an equivalent martingale measure after normalizing prices and dividends by the price of some particular security (or trading strategy). Girotto and Ortu (1996) present general results of this type for this finite-dimensional setting. Dalang, Morton and Willinger (1990) gave a general discrete-time result on the equivalence of no arbitrage and the existence of an equivalent martingale measure, covering even the case with infinitely many states.

2.9. Valuation of redundant securities

Suppose that the dividend–price pair $(d, S)$ for the $N$ given securities is arbitrage-free, with an associated state-price density $p$. Now consider the introduction of a new security with dividend process $\hat d$ and price process $\hat S$. We say that $\hat d$ is redundant given $(d, S)$ if there exists a trading strategy $q$, with respect to only the original security dividend–price process $(d, S)$, that replicates $\hat d$, in the sense that $d^q_t = \hat d_t$, $t \geq 1$.

If $\hat d$ is redundant given $(d, S)$, then the absence of arbitrage for the "augmented" dividend–price process $[(d, \hat d), (S, \hat S)]$ implies that $\hat S_t = Y_t$, where
$$Y_t = \frac{1}{p_t} E_t\!\left( \sum_{j=t+1}^{T} p_j \hat d_j \right), \quad t < T.$$
If this were not the case, there would be an arbitrage, as follows. For example, suppose that for some stopping time $\tau$, we have $\hat S_\tau > Y_\tau$, and that $\tau \leq T$ with strictly positive probability. We can then define the strategy:
(a) Sell the redundant security $\hat d$ at time $\tau$ for $\hat S_\tau$, and hold this position until $T$.
(b) Invest $q_\tau \cdot S_\tau$ at time $\tau$ in the replicating strategy $q$, and follow this strategy until $T$.
Since the dividends generated by this combined strategy (a)–(b) after $\tau$ are zero, the only dividend is at $\tau$, for the amount $\hat S_\tau - Y_\tau > 0$, which means that this is an arbitrage. Likewise, if $\hat S_\tau < Y_\tau$ for some non-trivial stopping time $\tau$, the opposite strategy is an arbitrage. We have shown the following.

Proposition. Suppose $(d, S)$ is arbitrage-free with state-price density $p$. Let $\hat d$ be a redundant dividend process with price process $\hat S$. Then the augmented dividend–price pair $[(d, \hat d), (S, \hat S)]$ is arbitrage-free if and only if it has $p$ as a state-price density.

In applications, it is often assumed that $(d, S)$ generates complete markets, in which case any additional security is redundant, as in the classical "binomial" model of
    • Ch. 11: Intertemporal Asset Pricing Theory 657 Cox, Ross and Rubinstein (1979), and its continuous-time analogue, the Black–Scholes option pricing model, coming up in the next section. Complete markets means that every new security is redundant. Theorem. Suppose that FT = F and there is no arbitrage. Then markets are complete if and only if there is a unique equivalent martingale measure. Banz and Miller (1978) and Breeden and Litzenberger (1978) explore the ability to deduce state prices from the valuation of derivative securities. 2.10. American exercise policies and valuation We now extend our pricing framework to include a family of securities, called “American,” for which there is discretion regarding the timing of cash flows. Given an adapted process X , each finite-valued stopping time t generates a dividend process dX ,t defined by dX ,t t = 0, t Ñ t, and dX ,t t = Xt . In this context, a finite-valued stopping time is an exercise policy, determining the time at which to accept payment. Any exercise policy t is constrained by t t, for some expiration time t T. (In what follows, we might take t to be a stopping time, which is useful for the case of certain knockout options.) We say that (X , t) defines an American security. The exercise policy is selected by the holder of the security. Once exercised, the security has no remaining cash flows. A standard example is an American put option on a security with price process p. The American put gives the holder of the option the right, but not the obligation, to sell the underlying security for a fixed exercise price at any time before a given expiration time t. If the option has an exercise price K and expiration time t < T, then Xt = (K − pt)+ , t t, and Xt = 0, t > t. We will suppose that, in addition to an American security (X , t), there are securities with an arbitrage-free dividend-price process (d, S) that generates complete markets. 
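A sketch of the binomial model of Cox, Ross and Rubinstein (1979) just cited, with hypothetical parameters: since the stock and bond generate complete markets, any European claim is redundant, and backward induction under the risk-neutral probability delivers its arbitrage-free price $Y_t$:

```python
def crr_price(S0, u, d, r, n, payoff):
    """Price of a redundant European claim on a Cox-Ross-Rubinstein tree,
    by backward induction under the risk-neutral probability q."""
    q = (1.0 + r - d) / (u - d)
    # terminal values at the n + 1 nodes
    v = [payoff(S0 * u**j * d**(n - j)) for j in range(n + 1)]
    for step in range(n, 0, -1):
        v = [(q * v[j + 1] + (1.0 - q) * v[j]) / (1.0 + r) for j in range(step)]
    return v[0]

price = crr_price(100.0, 1.1, 0.95, 0.02, 3, lambda s: max(s - 100.0, 0.0))
print(round(price, 4))
```

At each node, the claim's value equals the cost of the one-period replicating stock-and-bond portfolio, which is how redundancy forces this price.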
The assumption of complete markets will dramatically simplify our analysis since it implies, for any exercise policy t, that the dividend process dX ,t is redundant given (d, S). For notational convenience, we assume that 0 < t < T. Let p be a state-price density associated with (d, S). From Proposition 2.9, given any exercise policy t, the American security’s dividend process dX ,t has an associated cum-dividend price process, say Vt , which, in the absence of arbitrage, satisfies Vt t = 1 pt Et (pt Xt ) , t t. This value does not depend on which state-price density is chosen because, with complete markets, state-price densities are identical up to a positive scaling. We consider the optimal stopping problem V∗ 0 ≡ max t ∈ T (0) Vt 0 , (13) where, for any time t t, we let T (t) denote the set of stopping times bounded below by t and above by t. A solution to Equation (13) is called a rational exercise policy
    • 658 D. Duffie for the American security X , in the sense that it maximizes the initial arbitrage-free value of the resulting claim. Merton (1973) was the first to attack American option valuation systematically using this arbitrage-based viewpoint. We claim that, in the absence of arbitrage, the actual initial price V0 for the American security must be V∗ 0 . In order to see this, suppose first that V∗ 0 > V0. Then one could buy the American security, adopt for it a rational exercise policy t, and also undertake a trading strategy replicating −dX ,t . Since V∗ 0 = E(pt Xt )/p0, this replication involves an initial payoff of V∗ 0 , and the net effect is a total initial dividend of V∗ 0 − V0 > 0 and zero dividends after time 0, which defines an arbitrage. Thus the absence of arbitrage easily leads to the conclusion that V0 V∗ 0 . It remains to show that the absence of arbitrage also implies the opposite inequality V0 V∗ 0 . Suppose that V0 > V∗ 0 . One could sell the American security at time 0 for V0. We will show that for an initial investment of V∗ 0 , one can “super-replicate” the payoff at exercise demanded by the holder of the American security, regardless of the exercise policy used. Specifically, a super-replicating trading strategy for (X , t, d, S) is a trading strategy q involving only the securities with dividend-price process (d, S) that has the following properties: (a) dq t = 0 for 0 < t < t, and (b) Vq t Xt for all t t, where Vq t is the cum-dividend market value of q at time t. Regardless of the exercise policy t used by the holder of the security, the payment of Xt demanded at time t is dominated by the market value Vq t of a super-replicating strategy q. (In effect, one modifies q by liquidating the portfolio qt at time t, so that the actual trading strategy f associated with the arbitrage is defined by ft = qt for t < t and ft = 0 for t t.) Now, suppose q is super-replicating, with Vq 0 = V∗ 0 . 
If, indeed, V0 > V∗ 0 then the strategy of selling the American security and adopting a super-replicating strategy, liquidating at exercise, effectively defines an arbitrage. This notion of arbitrage for American securities, an extension of the definition of arbitrage used earlier, is reasonable because a super-replicating strategy does not depend on the exercise policy adopted by the holder (or sequence of holders over time) of the American security. It would be unreasonable to call a strategy involving a short position in the American security an “arbitrage” if, in carrying it out, one requires knowledge of the exercise policy for the American security that will be adopted by other agents that hold the security over time, who may after all act “irrationally.” The approach to American security valuation given here is similar to the continuous- time treatments of Bensoussan (1984) and Karatzas (1988), who do not formally connect the valuation of American securities with the absence of arbitrage, but rather deal with the similar notion of “fair price”. Proposition. Given (X , t, d, S), suppose (d, S) is arbitrage free and generates complete markets. Then there is a super-replicating trading strategy q for (X , t, d, S) with the initial value Vq 0 = V∗ 0 .
    • Ch. 11: Intertemporal Asset Pricing Theory 659 In order to construct a super-replicating strategy with the desired property, we will make a short excursion into the theory of optimal stopping. For any process Y in L, the Snell envelope W of Y is defined by Wt = max t ∈ T (t) Et(Yt ), 0 t t. It can be shown that, naturally, for any t < t, Wt = max[Yt, Et(Wt + 1)], which can be viewed as the Bellman equation for optimal stopping. Thus Wt Et(Wt + 1), implying that W is a supermartingale, implying that we can decompose W in the form W = Z − A, for some martingale Z and some increasing adapted8 process A with A0 = 0. In order to prove the above proposition, we define Y by Yt = Xtpt, and let W, Z, and A be defined as above. By the definition of complete markets, there is a trading strategy q with the property that • dq t = 0 for 0 < t < t; • dq t = Zt /pt ; • dq t = 0 for t > t. Property (a) defining a super-replicating strategy is satisfied by this strategy q. From the fact that Z is a martingale and the definition of a state-price density, the cum- dividend value Vq satisfies ptVq t = Et pt dq t = Et (Zt ) = Zt, t t. (14) From Equation (14) and the fact that A0 = 0, we know that Vq 0 = V∗ 0 because Z0 = W0 = p0V∗ 0 . Since Zt − At = Wt Yt for all t, from Equation (14) we also know that Vq t = Zt pt 1 pt (Yt + At) = Xt + At pt Xt, t t, the last inequality following from the fact that At 0 for all t. Thus the dominance property (b) defining a super-replicating strategy is also satisfied, and q is indeed a super-replicating strategy with Vq 0 = V∗ 0 . This proves the proposition and implies that, unless there is an arbitrage, the initial price V0 of the American security is equal to the market value V∗ 0 associated with a rational exercise policy. The Snell envelope W is also the key to showing that a rational exercise policy is given by the dynamic-programming solution t0 = min{t: Wt = Yt}. In order to verify this, suppose that t is a rational exercise policy. 
Then Wt = Yt . (This can be seen 8 More can be said, in that At can be taken to be Ft−1-measurable.
    • 660 D. Duffie from the fact that Wt Yt , and if Wt > Yt then t cannot be rational.) From this fact, any rational exercise policy t has the property that t t0 . For any such t, we have Et0 [Y(t)] W(t0 ) = Y(t0 ), and the law of iterated expectations implies that E[Y(t)] E[Y(t0 )], so t0 is indeed rational. We have shown the following. Theorem. Given (X , t, d, S), suppose that (d, S) admits no arbitrage and generates complete markets. Let p be a state-price deflator. Let W be the Snell envelope of X p up to the expiration time t. Then a rational exercise policy for (X , t, d, S) is given by t0 = min{t: Wt = ptXt}. The unique initial cum-dividend arbitrage-free price of the American security is V∗ 0 = 1 p0 E X (t0 ) p(t0 ) . In terms of the equivalent martingale measure Q defined in Section 2.8, we can also write the optimal stopping problem (13) in the form V∗ 0 = max t ∈ T (0) EQ Xt R0,t . (15) An optimal exercise time is t0 = min{t: V∗ t = Xt}, where V∗ t = Wt/pt is the price of the American option at time t. This representation of the rational-exercise problem is sometimes convenient. For example, let us consider the case of an American call option on a security with price process p. We have Xt = ( pt − K)+ for some exercise price K. Suppose the underlying security has no dividends before or at the expiration time t. We suppose positive interest rates, meaning that Rt,s 1 for all t and s t. With these assumptions, we will show that it is never optimal to exercise the call option before its expiration date t. This property is sometimes called “no early exercise”, or “better alive than dead”. We define the “discounted price process” p∗ by p∗ t = pt/R0,t. The fact that the underlying security pays dividends only after the expiration time t implies, by Lemma 2.8, that p∗ is a Q-martingale at least up to the expiration time t. That is, for t s t, we have EQ t ( p∗ s ) = p∗ t .
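The Snell-envelope recursion $W_t = \max[Y_t, E_t(W_{t+1})]$ described above can be sketched on a binomial tree (hypothetical parameters), pricing an American put by backward induction under the equivalent martingale measure:

```python
def american_put(S0, u, d, r, n, K, european=False):
    """Backward induction W = max(exercise value, continuation value) on a
    binomial tree; with european=True the exercise comparison is skipped."""
    q = (1.0 + r - d) / (u - d)                      # risk-neutral probability
    v = [max(K - S0 * u**j * d**(n - j), 0.0) for j in range(n + 1)]
    for step in range(n, 0, -1):
        cont = [(q * v[j + 1] + (1.0 - q) * v[j]) / (1.0 + r)
                for j in range(step)]
        if european:
            v = cont
        else:
            # Snell envelope: compare continuation with immediate exercise
            v = [max(cont[j], max(K - S0 * u**j * d**(step - 1 - j), 0.0))
                 for j in range(step)]
    return v[0]

amer = american_put(100.0, 1.1, 0.95, 0.03, 50, 100.0)
euro = american_put(100.0, 1.1, 0.95, 0.03, 50, 100.0, european=True)
print(amer >= euro)  # True: the exercise option can only add value
```

The rational exercise policy is the first node at which the envelope equals the exercise value, matching $\tau_0 = \min\{t: W_t = Y_t\}$ in the text.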
With positive interest rates, we have, for any stopping time $\tau \leq \bar t$ (writing $\bar t$ for the expiration time),
$$\begin{aligned}
E^Q\!\left[\frac{(p_\tau - K)^+}{R_{0,\tau}}\right]
&= E^Q\!\left[\left(p^*_\tau - \frac{K}{R_{0,\tau}}\right)^{\!+}\right]
 = E^Q\!\left[\left(E^Q_\tau\!\left(p^*_{\bar t}\right) - \frac{K}{R_{0,\tau}}\right)^{\!+}\right] \\
&\leq E^Q\!\left[E^Q_\tau\!\left(\left(p^*_{\bar t} - \frac{K}{R_{0,\tau}}\right)^{\!+}\right)\right]
 = E^Q\!\left[\left(p^*_{\bar t} - \frac{K}{R_{0,\tau}}\right)^{\!+}\right] \\
&\leq E^Q\!\left[\left(p^*_{\bar t} - \frac{K}{R_{0,\bar t}}\right)^{\!+}\right]
 = E^Q\!\left[\frac{(p_{\bar t} - K)^+}{R_{0,\bar t}}\right],
\end{aligned}$$
the first inequality by Jensen's inequality, the second by the positivity of interest rates. It follows that $\bar t$ is a rational exercise policy. In typical cases, $\bar t$ is the unique rational exercise policy.

If the underlying security pays dividends before expiration, then early exercise of the American call is, in certain cases, optimal. From the fact that the put payoff is increasing in the strike price (as opposed to decreasing for the call option), the second inequality above is reversed for the case of a put option, and one can guess that early exercise of the American put is sometimes optimal.

Difficulties can arise with the valuation of American securities in incomplete markets. For example, the exercise policy may play a role in determining the marketed subspace, and therefore a role in pricing securities. If the state-price density depends on the exercise policy, it could even turn out that the notion of a rational exercise policy is not well defined.

3. Continuous-time modeling

Many problems are more tractable, or have solutions appearing in a more natural form, when treated in a continuous-time setting. We first introduce the Brownian model of uncertainty and continuous security trading, and then derive partial differential equations for the arbitrage-free prices of derivative securities. The classic example is the Black–Scholes option-pricing formula. We then examine the connection between equivalent martingale measures and the "market price of risk" that arises from Girsanov's Theorem. Finally, we briefly connect the theory of security valuation with that of optimal portfolio and consumption choice, using the elegant martingale approach of Cox and Huang (1989).
    • 662 D. Duffie 3.1. Trading gains for Brownian prices We fix a probability space (W, F, P). A process is a measurable9 function on W × [0, ∞) into R. The value of a process X at time t is the random variable variously written as Xt, X (t), or X (·, t): W → R. A standard Brownian motion is a process B defined by the following properties: (a) B0 = 0 almost surely; (b) Normality: for any times t and s > t, Bs − Bt is normally distributed with mean zero and variance s − t; (c) Independent increments: for any times t0, . . . , tn such that 0 t0 < t1 < · · · < tn < ∞, the random variables B(t0), B(t1) − B(t0), . . . , B(tn) − B(tn − 1) are independently distributed; and (d) Continuity: for each w in W, the sample path t → B(w, t) is continuous. It is a nontrivial fact, whose proof has a colorful history, that (W, F, P) can be constructed so that there exist standard Brownian motions. In perhaps the first scientific work involving Brownian motion, Bachelier (1900) proposed Brownian motion as a model of stock prices. We will follow his lead for the time being and suppose that a given standard Brownian motion B is the price process of a security. Later we consider more general classes of price processes. We fix the standard filtration F = {Ft: t 0} of B, defined for example in Protter (1990). Roughly speaking,10 Ft is the set of events that can be distinguished as true or false by observation of B until time t. Our first task is to build a model of trading gains based on the possibility of continual adjustment of the position held. A trading strategy is an adapted process q specifying at each state w and time t the number qt(w) of units of the security to hold. If a strategy q is a constant, say q, between two dates t and s > t, then the total gain between those two dates is q(Bs − Bt), the quantity held multiplied by the price change. So long as the trading strategy q is piecewise constant, we would have no difficulty in defining the total gain between any two times. 
For example, suppose, for some stopping times T0, . . . , TN with 0 = T0 < T1 < · · · < TN = T, and for any n, we have q(t) = q(Tn − 1) for all t ∈ [Tn − 1, Tn). Then we define the total gain from trade as T 0 qt dBt = N n = 1 q (Tn − 1) [B (Tn) − B (Tn − 1)] . (16) More generally, in order to make for a good model of trading gains for trading strategies that are not necessarily piecewise constant, a trading strategy q is required to satisfy the technical condition that T 0 q2 t dt < ∞ almost surely for each T. We let L2 denote the space of adapted processes satisfying this integrability restriction. 9 See Duffie (2001) for technical definitions not provided here. 10 The standard filtration is augmented, so that Ft contains all null sets of F.
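Equation (16) and linearity of the resulting trading gains can be illustrated path-by-path in a small discretized simulation (grid size and strategies are hypothetical):

```python
import random

random.seed(7)
T, n = 1.0, 1000
dt = T / n
# simulate one Brownian sample path on a grid of n steps
B = [0.0]
for _ in range(n):
    B.append(B[-1] + random.gauss(0.0, dt ** 0.5))

def gains(theta):
    """Total gain sum_k theta_k (B_{k+1} - B_k) for a strategy held constant
    on each grid interval, as in Equation (16)."""
    return sum(theta[k] * (B[k + 1] - B[k]) for k in range(n))

theta1 = [1.0] * n                         # buy and hold one unit
theta2 = [float(k % 2) for k in range(n)]  # alternate in and out of the market
a, b = 2.0, -3.0
combo = [a * theta1[k] + b * theta2[k] for k in range(n)]
# linearity of the stochastic integral on this path:
print(abs(gains(combo) - (a * gains(theta1) + b * gains(theta2))) < 1e-9)  # True
```

Buy-and-hold recovers the telescoping sum: its gain is exactly $B_T - B_0$, the price change itself.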
For each $q$ in $L^2$ there is an adapted process with continuous sample paths, denoted $\int q\, dB$, that is called the stochastic integral of $q$ with respect to $B$. A full definition of $\int q\, dB$ is outlined in a standard source such as Karatzas and Shreve (1988). The value of the stochastic integral $\int q\, dB$ at time $T$ is usually denoted $\int_0^T q_t\, dB_t$, and represents the total gain generated up to time $T$ by trading the security with price process $B$ according to the trading strategy $q$.

The stochastic integral $\int q\, dB$ has the properties that one would expect from a good model of trading gains. In particular, Equation (16) is satisfied for piecewise-constant $q$, and in general the stochastic integral is linear, in that, for any $q$ and $f$ in $L^2$ and any scalars $a$ and $b$, the process $aq + bf$ is also in $L^2$, and, for any time $T > 0$,
$$\int_0^T (a q_t + b f_t)\, dB_t = a \int_0^T q_t\, dB_t + b \int_0^T f_t\, dB_t. \qquad (17)$$

3.2. Martingale trading gains

The properties of standard Brownian motion imply that $B$ is a martingale. (This follows basically from the property that its increments are independent and of zero expectation.) One must impose technical conditions on $q$, however, in order to ensure that $\int q\, dB$ is also a martingale. This is natural; it should be impossible to generate an expected profit by trading a security that never experiences an expected price change. The following basic proposition can be found, for example, in Protter (1990).

Proposition. If $E\!\left[\left(\int_0^T q_t^2\, dt\right)^{1/2}\right] < \infty$ for all $T > 0$, then $\int q\, dB$ is a martingale.

As a model of security-price processes, standard Brownian motion is too restrictive for most purposes. Consider, more generally, an Ito process, meaning a process $S$ of the form
$$S_t = x + \int_0^t m_s\, ds + \int_0^t s_s\, dB_s, \qquad (18)$$
where $x$ is a real number, $s$ is in $L^2$, and $m$ is in $L^1$, meaning that $m$ is an adapted process such that $\int_0^t |m_s|\, ds < \infty$ almost surely for all $t$. It is common to write Equation (18) in the informal "differential" form
$$dS_t = m_t\, dt + s_t\, dB_t.$$
One often thinks intuitively of dSt as the “increment” of S at time t, made up of two parts, the “locally riskless” part mt dt, and the “locally uncertain” part st dBt.
    • 664 D. Duffie In order to further interpret this differential representation of an Ito process, suppose that s and m have continuous sample paths and are bounded. It is then literally the case that for any time t, d dt Et (St ) t = t = mt almost surely, (19) and d dt vart (St ) t = t = s2 t almost surely, (20) where the derivatives are taken from the right, and where, for any random variable X with finite variance, vart(X ) ≡ Et(X 2 ) − [Et(X )]2 is the Ft-conditional variance of X . In this sense of Equations (19) and (20), we can interpret mt as the rate of change of the expectation of S, conditional on information available at time t, and likewise interpret s2 t as the rate of change of the conditional variance of S at time t. One sometimes reads the associated abuses of notation “Et(dSt) = mt dt” and “vart(dSt) = s2 t dt”. Of course, dSt is not even a random variable, so this sort of characterization is not rigorously justified and is used purely for its intuitive content. We will refer to m and s as the drift and diffusion processes of S, respectively. For an Ito process S of the form (18), let L(S) be the set whose elements are processes q with {qtmt: t 0} in L1 and {qtst: t 0} in L2 . For q in L(S), we define the stochastic integral q dS as the Ito process q dS given by T 0 qt dSt = T 0 qtmt dt + T 0 qtst dBt, T 0. Assuming no dividends, we also refer to q dS as the gain process generated by the trading strategy q, given the price process S. We will have occasion to refer to adapted processes q and f that are equal almost everywhere, by which we mean that E( ∞ 0 |qt − ft| dt) = 0. In fact, we shall write “q = f” whenever q = f almost everywhere. This is a natural convention, for suppose that X and Y are Ito processes with X0 = Y0 and with dXt = mt dt + st dBt and dYt = at dt + bt dBt. 
Since stochastic integrals are defined for our purposes as continuous sample-path processes, it turns out that Xt = Yt for all t almost surely if and only if m = a almost everywhere and s = b almost everywhere. We call this the unique decomposition property of Ito processes. Ito’s Formula is the basis for explicit solutions to asset-pricing problems in a continuous-time setting. Ito’s Formula. Suppose X is an Ito process with dXt = mt dt + st dBt and f : R2 → R is twice continuously differentiable. Then the process Y, defined by Yt = f (Xt, t), is an Ito process with dYt = fx (Xt, t) mt + ft (Xt, t) + 1 2 fxx (Xt, t) s2 t dt + fx (Xt, t) st dBt.
    • Ch. 11: Intertemporal Asset Pricing Theory 665 A generalization of Ito’s Formula appears later in this section. 3.3. The Black–Scholes option-pricing formula We turn to one of the most important ideas in finance theory, the model of Black and Scholes (1973) for pricing options. Together with the method of proof provided by Robert Merton, this model revolutionized the practice of derivative pricing and risk management, and has changed the entire path of asset-pricing theory. Consider a security, to be called a stock, with price process St = x eat + sB(t) , t 0, where x > 0, a, and s are constants. Such a process, called a geometric Brownian motion, is often called log-normal because, for any t, log(St) = log(x) + at + sBt is normally distributed. Moreover, since Xt ≡ at + sBt = t 0 a ds + t 0 s dBs defines an Ito process X with constant drift a and diffusion s, Ito’s Formula implies that S is an Ito process and that dSt = mSt dt + sSt dBt; S0 = x, where m = a + s2 /2. From Equations (19) and (20), at any time t, the rate of change of the conditional mean of St is mSt, and the rate of change of the conditional variance is s2 S2 t , so that, per dollar invested in this security at time t, one may think of m as the “instantaneous” expected rate of return, and s as the “instantaneous” standard deviation of the rate of return. The coefficient s is also known as the volatility of S. A geometric Brownian motion is a natural two-parameter model of a security-price process because of these simple interpretations of m and s. Consider a second security, to be called a bond, with the price process b defined by bt = b0 ert , t 0, for some constants b0 > 0 and r. We have the obvious interpretation of r as the continually compounding short rate. Since {rt: t 0} is trivially an Ito process, b is also an Ito process with dbt = rbt dt. 
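The interpretation of $m = a + s^2/2$ as the expected rate of return can be checked by Monte Carlo, simulating the geometric Brownian motion exactly from its closed form $S_t = x\, e^{a t + s B_t}$ (all parameter values below are hypothetical):

```python
import math
import random

random.seed(0)
x, a, s, t = 100.0, 0.03, 0.2, 1.0
m = a + s * s / 2.0                # drift in dS_t = m S_t dt + s S_t dB_t

n_paths = 200_000
total = 0.0
for _ in range(n_paths):
    B_t = random.gauss(0.0, math.sqrt(t))       # B_t ~ N(0, t)
    total += x * math.exp(a * t + s * B_t)      # exact simulation, no stepping
mc_mean = total / n_paths

print(mc_mean, x * math.exp(m * t))  # the two should agree closely
```

The agreement reflects the lognormal moment $E[e^{s B_t}] = e^{s^2 t / 2}$, so that $E[S_t] = x\, e^{m t}$.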
A pair (a, b) consisting of trading strategies a for the stock and b for the bond is said to be self-financing if it generates no dividends before T (either positive or negative), meaning that, for all t, atSt + bt bt = a0S0 + b0 b0 + t 0 au dSu + t 0 bu dbu. (21) This self-financing condition, conveniently defined by Harrison and Kreps (1979), is merely a statement that the current portfolio value (on the left-hand side) is precisely
the initial investment plus any trading gains, and therefore that no dividend "inflow" or "outflow" is generated.

Now consider a third security, an option. We begin with the case of a European call option on the stock, giving its owner the right, but not the obligation, to buy the stock at a given exercise price $K$ on a given exercise date $T$. The option's price process $Y$ is as yet unknown except for the fact that $Y_T = (S_T - K)^+ \equiv \max(S_T - K, 0)$, which follows from the fact that the option is rationally exercised if and only if $S_T > K$.

Suppose that the option is redundant, in that there exists a self-financing trading strategy $(a, b)$ in the stock and bond with $a_T S_T + b_T b_T = Y_T$. If $a_0 S_0 + b_0 b_0 < Y_0$, then one could sell the option for $Y_0$, make an initial investment of $a_0 S_0 + b_0 b_0$ in the trading strategy $(a, b)$, and at time $T$ liquidate the entire portfolio $(-1, a_T, b_T)$ of option, stock, and bond with payoff $-Y_T + a_T S_T + b_T b_T = 0$. The initial profit $Y_0 - a_0 S_0 - b_0 b_0 > 0$ is thus riskless, so the trading strategy $(-1, a, b)$ would be an arbitrage. Likewise, if $a_0 S_0 + b_0 b_0 > Y_0$, the strategy $(1, -a, -b)$ is an arbitrage. Thus, if there is no arbitrage, $Y_0 = a_0 S_0 + b_0 b_0$. The same arguments applied at each date $t$ imply that in the absence of arbitrage, $Y_t = a_t S_t + b_t b_t$. A full and careful definition of continuous-time arbitrage will be given later, but for now we can proceed without much ambiguity at this informal level. Our immediate objective is to show the following.

The Black–Scholes Formula. If there is no arbitrage, then, for all $t < T$, $Y_t = C(S_t, t)$, where
$$C(x, t) = x F(z) - e^{-r(T-t)} K F\!\left(z - s\sqrt{T-t}\right), \qquad (22)$$
with
$$z = \frac{\log(x/K) + \left(r + s^2/2\right)(T-t)}{s\sqrt{T-t}},$$
where $F$ is the cumulative standard normal distribution function.

The Black and Scholes (1973) formula was extended by Merton (1973, 1977), and subsequently given literally hundreds of further extensions and applications.
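Formula (22) translates directly into code, using the error function for the standard normal cdf $F$; the benchmark value below is a standard textbook check (at the money, one year to expiration):

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function F."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(x, t, K, T, r, s):
    """European call price C(x, t) from Equation (22)."""
    tau = T - t
    z = (math.log(x / K) + (r + s * s / 2.0) * tau) / (s * math.sqrt(tau))
    return x * norm_cdf(z) - math.exp(-r * tau) * K * norm_cdf(z - s * math.sqrt(tau))

# standard check: S = K = 100, r = 5%, s = 20%, one year to expiration
print(round(black_scholes_call(100.0, 0.0, 100.0, 1.0, 0.05, 0.2), 4))  # 10.4506
```

Note that the stock's drift parameter does not appear: only the volatility $s$ and the short rate $r$ enter the price.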
Cox and Rubinstein (1985) is a standard reference on options, while Hull (2000) has further applications and references. We will see different ways to arrive at the Black–Scholes formula. Although not the shortest argument, the following is perhaps the most obvious and constructive.11 We start by assuming that Yt = C(St, t), t < T, without knowledge of the function C aside from the assumption that it is twice continuously differentiable on (0, ∞) × [0, T) 11 The line of exposition here is based on Gabay (1982) and Duffie (1988). Andreasen, Jensen and Poulsen (1998) provide numerous alternative methods of deriving the Black–Scholes Formula. The basic approach of using continuous-time self-financing strategies as the basis for making arbitrage arguments is due to Merton (1977).
    • Ch. 11: Intertemporal Asset Pricing Theory 667 (allowing an application of Ito’s Formula). This will lead us to deduce Equation (22), justifying the assumption and proving the result at the same time. Based on our assumption that Yt = C(St, t) and Ito’s Formula, dYt = mY (t) dt + Cx(St, t) sSt dBt, t < T, (23) where mY (t) = Cx(St, t) mSt + Ct(St, t) + 1 2 Cxx(St, t) s2 S2 t . Now suppose there is a self-financing trading strategy (a, b) with atSt + bt bt = Yt, t ∈ [0, T]. (24) This assumption will also be justified shortly. Equations (21) and (24), along with the linearity of stochastic integration, imply that dYt = at dSt + bt dbt = (atmSt + bt btr) dt + atsSt dBt. (25) Based on the unique decomposition property of Ito processes, in order that the trading strategy (a, b) satisfies both Equation (23) and Equation (25), we must “match coefficients separately in both dBt and dt”. Specifically, we choose at so that atsSt = Cx(St, t) sSt; for this, we let at = Cx(St, t). From Equation (24) and Yt = C(St, t), we then have Cx(St, t) St + bt bt = C(St, t), or bt = 1 bt [C (St, t) − Cx (St, t) St] . (26) Finally, “matching coefficients in dt” from Equations (23) and (25) leaves, for t < T, − rC (St, t) + Ct (St, t) + rStCx (St, t) + 1 2 s2 S2 t Cxx (St, t) = 0. (27) In order for Equation (27) to hold, it is enough that C satisfies the partial differential equation (PDE) − rC(x, t) + Ct(x, t) + rxCx(x, t) + 1 2 s2 x2 Cxx(x, t) = 0, (28) for (x, t) ∈ (0, ∞) × [0, T). The fact that YT = C(ST , T) = (ST − K)+ supplies the boundary condition: C(x, T) = (x − K)+ , x ∈ (0, ∞). (29) By direct calculation of derivatives, one can show as an exercise that Equation (22) is a solution to Equations (28) and (29). All of this seems to confirm that C(S0, 0), with C defined by the Black–Scholes formula (22), is a good candidate for the initial price of
the option. In order to confirm this pricing, suppose to the contrary that $Y_0 > C(S_0, 0)$, where $C$ is defined by Equation (22). Consider the strategy $(-1, a, b)$ in the option, stock, and bond, with $a_t = C_x(S_t, t)$ and $b_t$ given by Equation (26) for $t < T$. We can choose $a_T$ and $b_T$ arbitrarily so that Equation (24) is satisfied; this does not affect the self-financing condition (21) because the value of the trading strategy at a single point in time has no effect on the stochastic integral. The result is that $(a, b)$ is self-financing by construction and that $a_T S_T + b_T b_T = Y_T = (S_T - K)^+$. This strategy therefore nets an initial riskless profit of
$$Y_0 - a_0 S_0 - b_0 b_0 = Y_0 - C(S_0, 0) > 0,$$
which defines an arbitrage. Likewise, if $Y_0 < C(S_0, 0)$, the trading strategy $(+1, -a, -b)$ is an arbitrage. Thus, it is indeed a necessary condition for the absence of arbitrage that $Y_0 = C(S_0, 0)$. Sufficiency is a more delicate matter. Under mild technical conditions on trading strategies that will follow, the Black–Scholes formula for the option price is also sufficient for the absence of arbitrage.

Transactions costs play havoc with the sort of reasoning just applied. For example, if brokerage fees are any positive fixed fraction of the market value of stock trades, the stock-trading strategy $a$ constructed above would call for infinite total brokerage fees, since, in effect, the number of shares traded is infinite! Leland (1985) has shown, nevertheless, that the Black–Scholes formula applies approximately, for small proportional transactions costs, once one artificially elevates the volatility parameter to compensate for the transactions costs.

3.4. Ito's Formula

Ito's Formula is extended to the case of multidimensional Brownian motion as follows. A standard Brownian motion in $R^d$ is defined by $B = (B^1, \ldots, B^d)$, where $B^1, \ldots, B^d$ are independent standard Brownian motions.
We fix a standard Brownian motion B in Rd , restricted to some time interval [0, T], on a given probability space (W, F, P). We also fix the standard filtration F = {Ft: t ∈ [0, T]} of B. For simplicity, we take F to be FT . For an Rd -valued process q = (q(1) , . . . , q(d) ) with q(i) in L2 for each i, the stochastic integral ∫ q dB is defined by

∫_0^t qs dBs = Σ_{i=1}^{d} ∫_0^t q(i)_s dB(i)_s. (30)

An Ito process is now defined as one of the form

Xt = x + ∫_0^t ms ds + ∫_0^t qs dBs,

where m is a drift (with ∫_0^t |ms| ds < ∞ almost surely) and ∫_0^t qs dBs is defined as in Equation (30). In this case, we call q the diffusion of X .
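As a numerical aside, Equation (30) can be illustrated by discretizing time. In the sketch below (the constant integrand q is chosen arbitrarily), the integral of q against a two-dimensional B is computed as the sum of coordinate-wise one-dimensional integrals; it has mean zero and, by the Ito isometry, variance ∫_0^1 |q_s|² ds.

```python
import random
from math import sqrt

def stochastic_integral(theta, n_steps, rng):
    """Approximate the integral of theta dB over [0, 1] for a constant
    R^d-valued theta, as the coordinate-wise sum in Equation (30)."""
    dt = 1.0 / n_steps
    total = 0.0
    for _ in range(n_steps):
        # One independent Brownian increment per coordinate of B.
        total += sum(th * rng.gauss(0.0, sqrt(dt)) for th in theta)
    return total

rng = random.Random(0)
samples = [stochastic_integral((1.0, 2.0), 50, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
# Ito isometry: the variance should be |theta|^2 * 1 = 1 + 4 = 5.
var = sum((x - mean) ** 2 for x in samples) / len(samples)
```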
We say that X = (X (1) , . . . , X (N) ) is an Ito process in RN if, for each i, X (i) is an Ito process. The drift of X is the RN -valued process m whose ith coordinate is the drift of X (i) . The diffusion of X is the RN × d -matrix-valued process s whose ith row is the diffusion of X (i) . In this case, we use the notation dXt = mt dt + st dBt. (31) Ito's Formula. Suppose X is the Ito process in RN given by Equation (31) and f : RN × [0, ∞) → R is C2,1 ; that is, f has at least two continuous derivatives with respect to its first (x) argument, and at least one continuous derivative with respect to its second (t) argument. Then { f (Xt, t): t ≥ 0} is an Ito process and, for any time t,

f (Xt, t) = f (X0, 0) + ∫_0^t D f (Xs, s) ds + ∫_0^t fx (Xs, s) ss dBs,

where

D f (Xt, t) = fx (Xt, t) mt + ft (Xt, t) + (1/2) tr[ st st^T fxx (Xt, t) ].

Here, fx, ft, and fxx denote the obvious partial derivatives of f , valued in RN , R, and RN × N respectively, and tr(A) denotes the trace of a square matrix A (the sum of its diagonal elements). If X is an Ito process in RN with dXt = mt dt + st dBt and q = (q1 , . . . , qN ) is a vector of adapted processes such that q · m is in L1 and, for each i, q · si is in L2 , then we say that q is in L(X ), which means that the stochastic integral ∫ q dX exists as an Ito process when defined by

∫_0^T qt dXt ≡ ∫_0^T qt · mt dt + ∫_0^T st^T qt dBt, T ≥ 0.

If X and Y are real-valued Ito processes with dXt = mX (t) dt + sX (t) dBt and dYt = mY (t) dt + sY (t) dBt, then Ito's Formula (for N = 2) implies that the product Z = XY is an Ito process, with

dZt = Xt dYt + Yt dXt + sX (t) · sY (t) dt.
(32) If mX , mY , sX , and sY are bounded and have continuous sample paths (weaker conditions would suffice), then it follows from Equation (32) that

(d/ds) covt (Xs, Ys) |_{s = t} = sX (t) · sY (t)

almost surely, where covt(Xs, Ys) = Et(XsYs) − Et(Xs) Et(Ys), and where the derivative is taken from the right, extending the intuition developed with Equations (19) and (20).
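This covariance formula can be checked by simulation. In the sketch below (the diffusion values are hypothetical), X and Y have constant diffusions sX and sY and are driven by the same two-dimensional Brownian motion; the sample covariance of their increments over a short interval dt, divided by dt, approximates sX · sY.

```python
import random
from math import sqrt

def instant_cov_rate(sX=(0.3, 0.1), sY=(0.2, -0.1), dt=0.01, n=20000, seed=5):
    """Estimate (d/ds) cov_t(Xs, Ys) at s = t for constant diffusions sX, sY
    driven by the same two-dimensional B (drifts do not affect the
    covariance and are set to zero here)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        dB1 = rng.gauss(0.0, sqrt(dt))
        dB2 = rng.gauss(0.0, sqrt(dt))
        xs.append(sX[0] * dB1 + sX[1] * dB2)   # increment of X
        ys.append(sY[0] * dB1 + sY[1] * dB2)   # increment of Y
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return cov / dt

rate = instant_cov_rate()   # near sX . sY = 0.06 - 0.01 = 0.05
```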
    • 670 D. Duffie 3.5. Arbitrage modeling Now, we turn to a more careful definition of arbitrage for purposes of establishing a close link between the absence of arbitrage and the existence of state prices. Suppose the price processes of N given securities form an Ito process X = (X (1) , . . . , X (N) ) in RN . We suppose, for technical regularity, that each security price process is in the space H2 containing any Ito process Y with dYt = a(t) dt + b(t) dB(t) for which E t 0 a(s) ds 2 < ∞ and E t 0 b(s) · b(s) ds < ∞. We will suppose that the securities pay no dividends during the time interval [0, T), and that XT is the vector of cum-dividend security prices at time T. A trading strategy q is an RN -valued process q in L(X ), meaning simply that the stochastic integral q dX defining trading gains is well defined. A trading strategy q is self-financing if qt · Xt = q0 · X0 + t 0 qs dXs, t T. (33) We suppose that there is some short-rate process, a process r with the property that T 0 |rt| dt is finite almost surely and, for some security with strictly positive price process b, bt = b0 exp t 0 rs ds , t ∈ [0, T]. (34) In this case, dbt = rt bt dt, allowing us to view rt as the riskless short-term continuously compounding rate of interest, in an instantaneous sense, and to view bt as the market value of an account that is continually reinvested at the short-term interest rate r. A self-financing strategy q is an arbitrage if q0 · X0 < 0 and qT · XT 0, or if q0 · X0 0 and qT · XT > 0. Our first goal is to characterize the properties of a price process X that admits no arbitrage, at least after placing some reasonable restrictions on trading strategies. 3.6. Numeraire invariance It is often convenient to renormalize all security prices, sometimes relative to a particular price process. A deflator is a strictly positive Ito process. We can deflate the previously given security price process X by a deflator Y to get the new price process X Y defined by X Y t = XtYt. 
Such a renormalization has essentially no economic effects, as suggested by the following result.
    • Ch. 11: Intertemporal Asset Pricing Theory 671 Numeraire Invariance Theorem. Suppose Y is a deflator. Then a trading strategy q is self-financing with respect to X if and only if q is self-financing with respect to X Y . The proof is an application of Ito’s Formula. We have the following corollary, which is immediate from the Numeraire Invariance Theorem, the strict positivity of Y, and the definition of an arbitrage. On numeraire invariance in more general settings, see Huang (1985a) and Protter (2001).12 Corollary. Suppose Y is a deflator. A trading strategy is an arbitrage with respect to X if and only if it is an arbitrage with respect to the deflated price process X Y . 3.7. State prices and doubling strategies Paralleling the terminology of Section 2.2, a state-price density is a deflator p with the property that the deflated price process X p is a martingale. Other terms used for this concept in the literature are state-price deflator, marginal-rate-of-substitution process, and pricing kernel. In the discrete-state discrete-time setting of Section 2, we found that there is a state-price density if and only if there is no arbitrage. In a general continuous-time setting, this result is “almost” true, up to some technical issues. A technical nuisance in a continuous-time setting is that, without some frictions limiting trade, arbitrage is to be expected. For example, one may think of a series of bets on fair and independent coin tosses at times 1/2, 3/4, 7/8, and so on. Suppose one’s goal is to earn a riskless profit of a by time 1, where a is some arbitrarily large number. One can bet a on heads for the first coin toss at time 1/2. If the first toss comes up heads, one stops. Otherwise, one owes a to one’s opponent. A bet of 2a on heads for the second toss at time 3/4 produces the desired profit if heads comes up at that time. In that case, one stops. Otherwise, one is down 3a and bets 4a on the third toss, and so on. 
Because there is an infinite number of potential tosses, one will eventually stop with a riskless profit of a (almost surely), because the probability of losing on every one of an infinite number of tosses is (1/2) · (1/2) · (1/2) · · · = 0. This is a classic “doubling strategy” that can be ruled out either by a technical limitation, such as limiting the total number of bets, or by a credit restriction limiting the total amount that one is allowed to be in debt. For the case of continuous-time trading strategies,13 we will eliminate the possibility of “doubling strategies” with a credit constraint, defining the set Q(X ) of self-financing trading strategies satisfying the non-negative wealth restriction qt · Xt 0 for all t. An alternative is to restrict trading strategies with a technical integrability condition, as reviewed in Duffie (2001). The next result is based on Dybvig and Huang (1988). 12 For more on the role of numeraire, see Geman, El Karoui and Rochet (1995). 13 An actual continuous-time “doubling” strategy can be found in Karatzas (1993).
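The coin-toss doubling strategy just described is straightforward to simulate. In the sketch below, every run ends with the fixed profit a (here 1.0), while the interim debt varies without bound across runs, which is exactly what a credit restriction rules out.

```python
import random

def doubling_strategy(alpha, rng, max_tosses=1000):
    """Bet alpha, 2*alpha, 4*alpha, ... on fair coin tosses until the first
    head. Returns (net_profit, worst_debt)."""
    bet, losses = alpha, 0.0
    for _ in range(max_tosses):
        if rng.random() < 0.5:        # heads: win the current bet and stop
            return bet - losses, losses
        losses += bet                 # tails: the bet is owed; double it
        bet *= 2.0
    raise RuntimeError("a tail run this long has probability 2**-1000")

rng = random.Random(42)
results = [doubling_strategy(1.0, rng) for _ in range(10000)]
profits = {p for p, _ in results}         # always exactly {1.0}
worst_debt = max(d for _, d in results)   # unbounded across runs
```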
    • 672 D. Duffie Proposition. If there is a state-price density, then there is no arbitrage in Q(X ). Weaker no-arbitrage conditions based on a lower bound on wealth or on integrability conditions, are summarized in Duffie (2001), who provides a standard proof of this result. 3.8. Equivalent martingale measures In the finite-state setting of Section 2, it was shown that the existence of a state- price deflator is equivalent to the existence of an equivalent martingale measure (after some deflation). Here, we say that Q is an equivalent martingale measure for the price process X if Q is equivalent to P (they have the same events of zero probability), and if X is a martingale under Q. Theorem. If the price process X admits an equivalent martingale measure, then there is no arbitrage in Q(X ). In most cases, the theorem is applied along the lines of the following corollary, a consequence of the corollary to the Numeraire Invariance Theorem of Section 3.6. Corollary. If there is a deflator Y such that the deflated price process X Y admits an equivalent martingale measure, then there is no arbitrage in Q(X ). As in the finite-state case, the absence of arbitrage and the existence of equivalent martingale measures are, in spirit, identical properties, although there are some technical distinctions in this infinite-dimensional setting. Inspired from early work by Kreps (1981), Delbaen and Schachermayer (1998) showed the equivalence, after deflation by a numeraire deflator, between no free lunch with vanishing risk (a slight strengthening of the notion of no arbitrage) and the existence of a local martingale measure.14 3.9. Girsanov and market prices of risk We now look for convenient conditions on X supporting the existence of an equivalent martingale measure. We will also see how to calculate such a measure, and conditions for the uniqueness of such a measure, which is in spirit equivalent to complete markets. 
This is precisely the case for the finite-state setting of Theorem 2.9. The basic approach is from Harrison and Kreps (1979) and Harrison and Pliska (1981), who coined most of the terms and developed most of the techniques and basic results. Huang (1985a,b) generalized the basic theory. The development here 14 For related results, see Ansel and Stricker (1992a,b), Back and Pliska (1987), Cassese (1996), Duffie and Huang (1986), El Karoui and Quenez (1995), Frittelli and Lakner (1995), Jacod and Shiryaev (1998), Kabanov (1997), Kabanov and Kramkov (1995), Kusuoka (1993), Lakner (1993), Levental and Skorohod (1995), Rogers (1994), Schachermayer (1992, 1994, 2002), Schweizer (1992) and Stricker (1990).
    • Ch. 11: Intertemporal Asset Pricing Theory 673 differs in some minor ways. Most of the results extend to an abstract filtration, not necessarily generated by Brownian motion, but the following important property of Brownian filtrations is somewhat special. Martingale Representation Theorem. For any martingale x, there exists some Rd - valued process q such that the stochastic integral q dB exists and such that, for all t, xt = x0 + t 0 qs dBs. Now, we consider any given probability measure Q equivalent to P, with density process x defined by (11). By the martingale representation theorem, we can express the martingale x in terms of a stochastic integral of the form dxt = gt dBt, for some adapted process g = (g(1) , . . . , g(d) ) with T 0 gt · gt dt < ∞ almost surely. Girsanov’s Theorem states that a standard Brownian motion BQ in Rd under Q is defined by BQ 0 = 0 and dBQ t = dBt + ht dt, where ht = −gt/xt. Suppose the price process X of the N given securities (possibly after some change of numeraire) is an Ito process in RN , with dXt = mt dt + st dBt. We can therefore write dXt = (mt − stht) dt + st dBQ t . If X is to be a Q-martingale, then its drift under Q must be zero, which means that, almost everywhere, s(w, t) h(w, t) = m(w, t), (w, t) ∈ W × [0, T]. (35) Thus, the existence of a solution h to the system (35) of linear equations (almost everywhere) is necessary for the existence of an equivalent martingale measure for X . Under additional technical conditions, we will find that it is also sufficient. We can also view a solution h to Equation (35) as providing a proportional relationship between mean rates of change of prices (m) and the amounts (s) of “risk” in price changes stemming from the underlying d Brownian motions. For this reason, any such solution h is called a market-price-of-risk process for X . The idea is that hi(t) is the “unit price”, measured in price drift, of bearing exposure to the increment of B(i) at time t. 
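As a concrete instance of Equation (35), the following sketch (the drift and diffusion values are made up) solves sh = m for a market price of risk at a fixed (w, t) in the case N = d = 2, using Cramer's rule; when s is singular, no solution need exist.

```python
def market_price_of_risk(sigma, mu):
    """Solve sigma h = mu (Equation 35) for h by Cramer's rule, for a
    2 x 2 diffusion matrix sigma at a fixed (w, t)."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("sigma is singular; no market price of risk need exist")
    return ((mu[0] * d - b * mu[1]) / det,
            (a * mu[1] - mu[0] * c) / det)

# Hypothetical drifts and diffusions for two (deflated) securities.
sigma = ((0.2, 0.05), (0.0, 0.3))
mu = (0.04, 0.03)
h = market_price_of_risk(sigma, mu)   # h = (0.175, 0.1)
```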
A numeraire deflator is a deflator that is the reciprocal of the price process of one of the securities. It is usually the case that one first chooses some numeraire deflator Y,
    • 674 D. Duffie and then calculates the market price of risk for the deflated price process X Y . This is technically convenient because one of the securities, the “numeraire”, has a price that is always 1 after such a deflation. If there is a short-rate process r, a typical numeraire deflator is given by Y, where Yt = exp(− t 0 rs ds). If there is no market price of risk, one may guess that something is “wrong”, as the following result confirms. Lemma. Let Y be a numeraire deflator. If there is no market-price-of-risk process for X Y , then there are arbitrages in Q(X ), and there is no equivalent martingale measure for X Y . Proof: Suppose X Y has drift process mY and diffusion sY , and that there is no solution h to sY h = mY . Then, as a matter of linear algebra, there exists an adapted process q taking values that are row vectors in RN such that qsY ≡ 0 and qmY Ñ 0. By replacing q(w, t) with zero for any (w, t) such that q(w, t) mY (w, t) < 0, we can arrange to have qmY > 0. (This works provided the resulting process q is not identically zero; in that case the same procedure applied to −q works.) Finally, because the numeraire security associated with the deflator has a price that is identically equal to 1 after deflation, we can also choose the trading strategy for the numeraire so that, in addition to the above properties, q is self-financing. That is, assuming without loss of generality that the numeraire security is the last security, we can let q(N) t = − N − 1 i = 1 q(i) t X Y,(i) t + t 0 q(i) s dX Y,(i) s . It follows that q is a self-financing trading strategy with q0 · X Y 0 = 0, whose wealth process W, defined by Wt = qt · X Y t , is increasing and not constant. In particular, q is in Q(X Y ). It follows that q is an arbitrage for X Y , and therefore (by Numeraire Invariance) for X . 
Finally, the reasoning leading to Equation (35) implies that if there is no market- price-of-risk process, then there can be no equivalent martingale measure for X Y . For any Rd -valued adapted process h in L(B), we let xh be defined by xh t = exp − t 0 hs dBs − 1 2 t 0 hs · hs ds . (36) Ito’s Formula implies that dxh t = −xh t ht dBt. Novikov’s Condition, a sufficient technical condition for x to be a martingale, is that E exp 1 2 T 0 hs · hs ds < ∞. Theorem. If X has a market price of risk process h satisfying Novikov’s condition, and moreover xh T has finite variance, then there is an equivalent martingale measure for X , and there is no arbitrage in Q(X ).
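Before the proof, the objects in this theorem can be illustrated by simulation for a constant market price of risk h (the value below is hypothetical): xh T = exp(−hBT − h²T/2) should have P-mean 1, being a density, and since Bt + ht is a standard Brownian motion under Q, BT has Q-mean −hT, that is, EP(xh T BT) = −hT.

```python
import random
from math import exp, sqrt

def girsanov_check(h=0.5, T=1.0, n=20000, seed=3):
    """Monte Carlo check for constant h: xi_T = exp(-h B_T - h*h*T/2)
    has P-mean 1, and E_P[xi_T B_T] = -h*T, since B_t + h*t is a
    standard Brownian motion under the measure with density xi_T."""
    rng = random.Random(seed)
    mean_xi, mean_xiB = 0.0, 0.0
    for _ in range(n):
        BT = rng.gauss(0.0, sqrt(T))
        xi = exp(-h * BT - 0.5 * h * h * T)
        mean_xi += xi / n
        mean_xiB += xi * BT / n
    return mean_xi, mean_xiB

m1, m2 = girsanov_check()   # near 1.0 and -h*T = -0.5, respectively
```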
    • Ch. 11: Intertemporal Asset Pricing Theory 675 Proof: By Novikov’s Condition, xh is a positive martingale. We have xh 0 = e0 = 1, so xh is indeed the density process of an equivalent probability measure Q defined by dQ dP = xh T . By Girsanov’s Theorem, a standard Brownian motion BQ in Rd under Q is defined by dBQ t = dBt + ht dt. Thus dXt = st dBQ t . As dQ dP has finite variance and each security price process X (i) is by assumption in H2 , we know by the Cauchy–Schwartz Inequality that EQ T 0 s(i) (t) · s(i) (t) dt 1/2 = EP T 0 s(i) (t) · s(i) (t) dt 1/2 dQ dP , is finite. Thus, X (i) is a Q-martingale by Proposition 3.2, and Q is therefore an equivalent martingale measure. The lack of arbitrage in Q(X ) follows from Theorem 3.8. Putting this result together with the previous lemma, we see that the existence of a market-price-of-risk process is necessary and, coupled with a technical integrability condition, sufficient for the absence of “well-behaved” arbitrages and the existence of an equivalent martingale measure. Huang and Pag`es (1992) give an extension to the case of an infinite-time horizon. For uniqueness of equivalent martingale measures, we can use the fact that, for any such measure Q, Girsanov’s Theorem implies that we must have dQ dP = xh T , for some market price of risk h. If s(w, t) is of maximal rank d, however, there can be at most one solution h(w, t) to Equation (35). This maximal rank condition is equivalent to the condition that the span of the rows of s(w, t) is all of Rd . Proposition. If rank(s) = d almost everywhere, then there is at most one market price of risk and at most one equivalent martingale measure. If there is a unique market-price-of-risk process, then rank(s) = d almost everywhere. With incomplete markets, significant attention in the literature has been paid to the issue of “which equivalent martingale measure to use” for the purpose of pricing contingent claims that are not redundant. 
Babbs and Selby (1996), Bühlmann, Delbaen, Embrechts and Shiryaev (1998), and Föllmer and Schweizer (1990) suggest some selection criteria or parameterization for equivalent martingale measures in incomplete markets. In particular, Artzner (1995), Bajeux-Besnainou and Portait (1997), Dijkstra (1996), Johnson (1994) and Long (1990) address the numeraire portfolio, also called growth-optimal portfolio, as a device for selecting a state-price density. Little of this literature offers an economic theory for the use of a particular measure for pricing new contingent claims that are not already traded (or replicated) by the given primitive securities.
    • 676 D. Duffie 3.10. Black–Scholes again Suppose the given security-price process is X = (S(1) , . . . , S(N − 1) , b), where, for S = (S(1) , . . . , S(N − 1) ), dSt = mt dt + st dBt, and dbt = rt bt dt; b0 > 0, where m, s, and r are adapted processes (valued in RN − 1 , R(N − 1) × d , and R, respectively). We also suppose for technical convenience that the short-rate process r is bounded. Then Y = b−1 is a convenient numeraire deflator, and we let Z = SY. By Ito’s Formula, dZt = −rtZt + mt bt dt + st bt dBt. In order to apply Theorem 3.9 to the deflated price process X = (Z, 1), it would be enough to know that Z has a market price of risk h and that the variance of xh T is finite. Given this, there would be an equivalent martingale measure Q and no arbitrage in Q(X ). Suppose, for the moment, that this is the case. By Girsanov’s Theorem, there is a standard Brownian motion BQ in Rd under Q such that dZt = st bt dBQ t . Because S = bZ, another application of Ito’s Formula yields dSt = rtSt dt + st dBQ t . (37) Equation (37) is an important intermediate result for arbitrage-free asset pricing, giving an explicit expression for security prices under a probability measure Q with the property that the “discounted” price process S/b is a martingale. For example, this leads to an easy recovery of the Black–Scholes formula, as follows. Suppose that, of the securities with price processes S(1) , . . . , S(N − 1) , one is a call option on another. For convenience, we denote the price process of the call option by U and the price process of the underlying security by V, so that UT = (VT − K)+ , for expiration at time T with some given exercise price K. Because UY is by assumption a martingale under Q, we have Ut = btEQ t UT bT = EQ t exp − T t r(s) ds (VT − K)+ . (38) The reader may verify that this is the Black–Scholes formula for the case of d = 1, V0 > 0, and with constants r and non-zero s such that for all
    • Ch. 11: Intertemporal Asset Pricing Theory 677 t, rt = r and dVt = VtmV (t) dt + Vts dBt, where mV is a bounded adapted process. Indeed, in this case, Z has a market-price-of-risk process h such that xh T has finite variance, an exercise, so the assumption of an equivalent martingale measure is justified. More precisely, it is sufficient for the absence of arbitrage that the option-price process is given by Equation (38). Necessity of the Black– Scholes formula for the absence of arbitrages in Q(X ) is addressed in Duffie (2001). We can already see, however, that the expectation in Equation (38) defining the Black–Scholes formula does not depend on which equivalent martingale measure Q one chooses, so one should expect that the Black–Scholes formula (38) is also necessary for the absence of arbitrage. If Equation (38) is not satisfied, for instance, there cannot be an equivalent martingale measure for S/b. Unfortunately, and for purely technical reasons, this is not enough to imply directly the necessity of Equation (38) for the absence of well-behaved arbitrage, because we do not have a precise equivalence between the absence of arbitrage and the existence of equivalent martingale measures. In the Black–Scholes setting, s is of maximal rank d = 1 almost everywhere. Thus, from Proposition 3.9, there is exactly one equivalent martingale measure. The detailed calculations of Girsanov’s Theorem appear nowhere in the actual solution (37) for the “risk-neutral behavior” of arbitrage-free security prices, which can be given by inspection in terms of s and r only. 3.11. Complete markets We say that a random variable W can be replicated by a self-financing trading strategy q if W = qT · XT . Our basic objective in this section is to give a simple spanning condition on the diffusion s of the price process X under which, up to technical integrability conditions, any random variable can be replicated (without resorting to “doubling strategies”). Proposition. 
Suppose Y is a numeraire deflator and Q is an equivalent martingale measure for the deflated price process X Y . Suppose the diffusion sY of X Y is of maximal rank d almost everywhere. Let W be any random variable with EQ (|WYT |) < ∞. Then there is a self-financing trading strategy q that replicates W and whose deflated market-value process {qt · X Y t : t ≤ T} is a Q-martingale. Proof: Without loss of generality, the numeraire is the last of the N securities, so we write X Y = (Z, 1). Let BQ be the standard Brownian motion in Rd under Q obtained by Girsanov's Theorem. The martingale representation property, applied to the Q-martingale {EQ t (WYT ): t ∈ [0, T]}, implies that there is some f such that

EQ t (WYT ) = EQ (WYT ) + ∫_0^t fs dBQ s , t ∈ [0, T]. (39)
    • 678 D. Duffie By the rank assumption on sY and the fact that sY Nt = 0, there are adapted processes q(1) , . . . , q(N − 1) solving N − 1 j = 1 q( j) t sY jt = ft , t ∈ [0, T]. (40) Let q(N) be defined by q(N) t = EQ (WYT ) + N − 1 i = 1 t 0 q(i) s dZ(i) s − q(i) t Z(i) t . (41) Then q = (q(1) , . . . , q(N) ) is self-financing and qT · X Y T = WYT . By the Numeraire Invariance Theorem, q is also self-financing with respect to X and qT · XT = W. As f dBQ is by construction a Q-martingale, Equations (39–41) imply that {qt · X Y t : 0 t T} is a Q-martingale. The property that the deflated market-value process {qt · X Y t : 0 t T} is a Q-martingale ensures that there is no use of doubling strategies. For example, if W 0, then the martingale property implies that qt · Xt 0 for all t. Analogues to some of the results in this section for the case of market imperfections such as portfolio constraints or transactions costs are provided by Ahn, Dayal, Grannan and Swindle (1995), Bergman (1995), Constantinides and Zariphopoulou (1999, 2001), Cvitani´c and Karatzas (1993), Davis and Clark (1993), Grannan and Swindle (1996), Henrotte (1991), Jouini and Kallal (1993), Karatzas and Kou (1998), Kusuoka (1992, 1995), Soner, Shreve and Cvitani´c (1994) and Whalley and Wilmott (1997). Many of these results are asymptotic, for “small” proportional transactions costs, based on the approach of Leland (1985). 3.12. Optimal trading and consumption We now apply the “martingale” characterization of the cost of replicating an arbitrary payoff, given in the last proposition, to the problem of optimal portfolio and consumption processes. The setting is Merton’s problem, as formulated and solved in certain settings, for geometric Brownian prices, by Merton (1971). 
Merton used the method of dynamic programming, solving the associated Hamilton–Jacobi–Bellman (HJB) equation.15 A major alternative method is the martingale approach to optimal investment, which reached a key stage of development with Cox and Huang (1989), who treat the agent's candidate consumption choice as though it is a derivative security, and maximize 15 The book of Fleming and Soner (1993) treats HJB equations and stochastic control problems, emphasizing the use of viscosity methods.
    • Ch. 11: Intertemporal Asset Pricing Theory 679 the agent’s utility subject to a wealth constraint on the arbitrage-free price of the consumption. Since that price can be calculated in terms of the given state-price density, the result is a simple static optimization problem.16 Karatzas and Shreve (1998) provide a comprehensive treatment of optimal portfolio and consumption processes in this setting. Fixing a probability space (W, F, P) and the standard filtration {Ft: t 0} of a standard Brownian motion B in Rd , we suppose that X = (X (0) , X (1) , . . . , X (N) ) is an Ito process in RN + 1 for the prices of N + 1 securities, with dX (i) t = m(i) t X (i) t dt + X (i) t s(i) t dBt; X (i) 0 > 0, (42) where m = (m(0) , . . . , m(N) ) and the RN × d -valued process s are bounded adapted processes. Letting s(i) denote the ith row of s, we suppose that s(0) = 0, so that we can treat m(0) as the short-rate process r. A special case of this setup is to have geometric Brownian security prices and a constant short rate, which was the setting of Merton’s original problem. We assume for simplicity that N = d. The excess expected returns of the “risky” securities are defined by the RN -valued process l given by l(i) t = m(i) t − rt. A deflated price process X is defined by Xt = Xt exp(− t 0 rs ds). We assume that s is invertible (almost everywhere) and that the market-price-of-risk process h for X , defined by ht = s−1 t lt, is bounded. It follows that markets are complete (in the sense of Proposition 3.11) and that there are no arbitrages meeting the standard credit constraint of non-negative wealth. In this setting, a state-price density p is defined by pt = exp − t 0 rs ds xt, (43) where xh is the density process defined by Equation (36) for an equivalent martingale measure Q, after deflation by exp[ t 0 −r(s) ds]. 
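In the constant-coefficient case (the parameter values below are hypothetical), the pricing role of the state-price density (43) is easy to check by Monte Carlo: deflating the stock price by p should give a martingale, so E(pT ST) = S0.

```python
import random
from math import exp, sqrt

def deflated_price_mean(mu=0.10, r=0.05, s=0.2, S0=100.0, T=1.0,
                        n=50000, seed=11):
    """With constant coefficients, h = (mu - r)/s and the state-price
    density of Equation (43) at time T is
    p_T = exp(-r*T) * exp(-h*B_T - h*h*T/2); E[p_T S_T] should be S0."""
    rng = random.Random(seed)
    h = (mu - r) / s
    total = 0.0
    for _ in range(n):
        BT = rng.gauss(0.0, sqrt(T))
        ST = S0 * exp((mu - 0.5 * s * s) * T + s * BT)    # stock under P
        pT = exp(-r * T) * exp(-h * BT - 0.5 * h * h * T)
        total += pT * ST
    return total / n

mean_defl = deflated_price_mean()   # close to S0 = 100
```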
Utility is defined over the space D of consumption pairs (c, Z), where c is an adapted nonnegative consumption-rate process with T 0 ct dt < ∞ almost surely, and Z is an FT -measurable nonnegative random variable describing terminal lump-sum consumption. Specifically, U: D → R is defined by U(c, Z) = E T 0 u(ct, t) dt + F(Z) , (44) where • F: R+ → R is increasing and concave with F(0) = 0; 16 The related literature is immense, and includes Cox (1983), Pliska (1986), Cox and Huang (1991), Back (1986, 1991), Back and Pliska (1987), Duffie and Skiadas (1994), Foldes (1978a,b, 1990, 1991a,b, 1992, 2001), Harrison and Kreps (1979), Huang (1985b), Huang and Pag`es (1992), Karatzas, Lehoczky and Shreve (1987), Lakner and Slud (1991), Pag`es (1987) and Xu and Shreve (1992).
    • 680 D. Duffie • u: R+ × [0, T] → R is continuous and, for each t in [0, T], u(·, t): R+ → R is increasing and concave, with u(0, t) = 0; • F is strictly concave or zero, or for each t in [0, T ], u(·, t) is strictly concave or zero. • At least one of u and F is non-zero. A trading strategy is a process q = (q(0) , . . . , q(N) ) in L(X ), meaning merely that the gain-from-trade stochastic integral q dX exists. Given an initial wealth w > 0, we say that (c, Z, q) is budget-feasible if (c, Z) is a consumption choice in D and q is a trading strategy satisfying qt · Xt = w + t 0 qs dXs − t 0 cs ds 0, t ∈ [0, T ], (45) and qT · XT Z. (46) The first restriction (45) is that the current market value qt · Xt of the trading strategy is non-negative, a credit constraint, and is equal to its initial value w, plus any gains from security trade, less the cumulative consumption to date. The second restriction (46) is that the terminal portfolio value is sufficient to cover the terminal consumption. We now have the problem, for each initial wealth w, sup (c,Z,q) ∈ L(w) U(c, Z), (47) where L(w) is the set of budget-feasible choices at wealth w. First, we state an extension of the numeraire invariance result of Section 3.4, which obtains from an application of Ito’s Formula. Lemma. Let Y be any deflator. Given an initial wealth w 0, a strategy (c, Z, q) is budget-feasible given price process X if and only if it is budget feasible after deflation, that is, qt · X Y t = wY0 + t 0 qs dX Y s − t 0 Yscs ds 0, t ∈ [0, T ], (48) and qT · X Y T ZYT . (49) With numeraire invariance, we can reduce the dynamic trading and consumption problem to a static optimization problem subject to an initial wealth constraint, as follows.
    • Ch. 11: Intertemporal Asset Pricing Theory 681 Proposition. Given a consumption choice (c, Z) in D, there exists a trading strategy q such that (c, Z, q) is budget-feasible at initial wealth w if and only if E pT Z + T 0 ptct dt w. (50) Proof: Suppose (c, Z, q) is budget-feasible. Applying the previous numeraire-invariance lemma to the state-price deflator p, and using the fact that p0 = x0 = 1, we have w + T 0 qt dX p t pT Z + T 0 ptct dt. (51) Because X p is a martingale under P, the process M, defined by Mt = w + t 0 qs dX p s , is a non-negative local martingale, and therefore a supermartingale. For the definitions of local martingale and supermartingale, and for this property, see for example Protter (1990). By the supermartingale property, M0 E(MT ). Taking expectations through Equation (51) thus leaves Equation (50). Conversely, suppose (c, Z) satisfies Equation (50), and let M be the Q-martingale defined by Mt = EQ t e−rT Z + T 0 e−rt ct dt . By Girsanov’s Theorem, a standard Brownian motion BQ in Rd under Q is defined by dBQ t = dBt + ht dt, and BQ has the martingale representation property. Thus, there is some f = (f(1) , . . . , f(d) ) in L(BQ ) such that Mt = M0 + t 0 fs dBQ s , t ∈ [0, T ], where M0 w. For the deflator Y defined by Yt = exp[− t 0 r(s) ds], we also know that X = X Y is a Q-martingale. From the definitions of the market price of risk h and of BQ , dX (i) t = X (i) t s(i) t dBQ t , 1 i N. Because st is invertible and X is strictly positive with continuous sample paths, we can choose q(i) in L(X (i) ) for each i N such that q(1) t X (1) t , . . . , q(N) t X (N) t st = ft , t ∈ [0, T].
    • 682 D. Duffie This implies that Mt = M0 + N i = 1 t 0 q(i) s dX (i) s . (52) We can also let q(0) t = w + N i = 1 t 0 q(i) s dX (i) s − N i = 1 q(i) t X (i) t − t 0 e−rs cs ds. (53) From Equation (50) and the fact that xt = pt exp[ t 0 r(s) ds] defines the density process for Q, M0 = EQ e−rT Z + T 0 e−rt ct dt w. (54) From Equations (53) and (52), and the fact that q(0) dX (0) = 0, qt · Xt = w + t 0 qs dXs − t 0 e−rs cs ds, = w + Mt − M0 − t 0 e−rs cs ds, = w − M0 + EQ t T t e−rs cs ds + e−rT Z 0, using Equation (54). With numeraire invariance, Equation (45) follows. We can also use the same inequality for t = T, Equation (54), and the fact that MT = exp[− T 0 r(s) ds] Z + T 0 exp[− t 0 r(s) ds] ct dt to obtain Equation (46). Thus, (c, Z, q) is budget-feasible. Corollary. Given a consumption choice (c∗ , Z∗ ) in D and some initial wealth w, there exists a trading strategy q∗ such that (c∗ , Z∗ , q∗ ) solves Merton’s problem (47) if and only if (c∗ , Z∗ ) solves the problem sup (c,Z) ∈ D U(c, Z) subject to E T 0 ptct dt + pT Z w. (55) 3.13. Martingale solution to Merton’s problem We are now in a position to obtain a relatively explicit solution to Merton’s problem (47) by using the equivalent formulation (55).
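To preview the argument with a concrete special case (all parameters below are hypothetical): take F = 0, constant r, m, and s, and CRRA utility u(c, t) = e^{−ρt} c^{1−γ}/(1 − γ). The first-order condition for (55) sets marginal utility equal to a multiple g of the state-price density, giving ct = (g e^{ρt} pt)^{−1/γ}, with g pinned down by making the budget constraint bind. Because pt is lognormal here, the required initial wealth is a deterministic integral in t, and g can be found by bisection:

```python
from math import exp

def required_wealth(g, gamma=2.0, rho=0.05, r=0.05, mu=0.10, s=0.2,
                    T=1.0, n=1000):
    """E[ integral_0^T p_t c_t dt ] for c_t = (g e^{rho t} p_t)^{-1/gamma},
    using the closed-form moment E[p_t^(1 - 1/gamma)] of the lognormal
    p_t; the time integral is evaluated by the trapezoidal rule."""
    h = (mu - r) / s                   # market price of risk
    b = 1.0 - 1.0 / gamma
    def f(t):
        return exp(-rho * t / gamma
                   - b * (r + 0.5 * h * h) * t
                   + 0.5 * b * b * h * h * t)
    dt = T / n
    integral = dt * (0.5 * f(0.0) + sum(f(i * dt) for i in range(1, n))
                     + 0.5 * f(T))
    return g ** (-1.0 / gamma) * integral

def solve_multiplier(w):
    """Bisect (in log scale) for the unique g with required_wealth(g) = w;
    the function is continuous and strictly decreasing, as in the text."""
    lo, hi = 1e-8, 1e8
    for _ in range(200):
        mid = (lo * hi) ** 0.5
        if required_wealth(mid) > w:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

g_star = solve_multiplier(100.0)    # initial wealth w = 100
check = required_wealth(g_star)     # recovers w = 100
```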
    • Ch. 11: Intertemporal Asset Pricing Theory 683 By the Saddle Point Theorem and the strict monotonicity of U, (c∗ , Z∗ ) solves (55) if and only if there is a scalar Lagrange multiplier g∗ > 0 such that, first: (c∗ , Z∗ ) solves the unconstrained problem sup (c,Z) ∈ D L (c, Z; g∗ ) , (56) where, for any g 0, L (c, Z; g) = U(c, Z) − gE pT Z + T 0 ptct dt − w , (57) and second, (c∗ , Z∗ ) satisfies the complementary-slackness condition E pT Z∗ + T 0 ptc∗ t dt = w. (58) We can summarize our progress on Merton’s problem (47) as follows. Proposition. Given some (c∗ , Z∗ ) in D, there is a trading strategy q∗ such that (c∗ , Z∗ , q∗ ) solves Merton’s problem (47) if and only if there is a constant g∗ > 0 such that (c∗ , Z∗ ) solves Equation (56) and E(pT Z∗ + T 0 ptc∗ t dt) = w. In order to obtain intuition for the solution of (56), we begin with some arbitrary g > 0 and treat U(c, Z) = E[ T 0 u(ct, t) dt + F(Z)] intuitively by thinking of “E” and “ ” as finite sums, in which case the first-order conditions for optimality of (c∗ , Z∗ ) 0 for the problem sup(c,Z) L(c, Z; g), assuming differentiability of u and F, are uc (c∗ t , t) − gpt = 0, t ∈ [0, T], (59) and F (Z∗ ) − gpT = 0. (60) Solving, we have c∗ t = I (gpt, t) , t ∈ [0, T], (61) and Z∗ = IF (gpT ) , (62) where I(·, t) inverts17 uc(·, t) and where IF inverts F . We will confirm these conjectured forms (61) and (62) of the solution in the next theorem. Under strict 17 If u = 0, we take I = 0. If F = 0, we take IF = 0.
concavity of u or F, the inversions I(\cdot, t) and I_F, respectively, are continuous and strictly decreasing. A decreasing function \hat{w}: (0, \infty) \to R is therefore defined by

    \hat{w}(g) = E\left[ \int_0^T p_t I(g p_t, t) \, dt + p_T I_F(g p_T) \right].    (63)

(We have not yet ruled out the possibility that the expectation may be +\infty.) All of this implies that (c^*, Z^*) of Equations (61) and (62) solves Problem (55) provided the required initial investment \hat{w}(g) is equal to the endowed initial wealth w. This leaves an equation \hat{w}(g) = w to solve for the "correct" Lagrange multiplier g^*, and with that an explicit solution to the optimal consumption policy for Merton's problem.

We now consider properties of u and F guaranteeing that \hat{w}(g) = w can be solved for a unique g^* > 0. A strictly concave increasing function F: R_+ \to R that is differentiable on (0, \infty) satisfies Inada conditions if \inf_x F'(x) = 0 and \sup_x F'(x) = +\infty. If F satisfies these Inada conditions, then the inverse I_F of F' is well defined as a strictly decreasing continuous function on (0, \infty) whose image is (0, \infty).

Condition A. Either F is zero or F is differentiable on (0, \infty), strictly concave, and satisfies Inada conditions. Either u is zero or, for all t, u(\cdot, t) is differentiable on (0, \infty), strictly concave, and satisfies Inada conditions. For each g > 0, \hat{w}(g) is finite.

We recall the standing assumption that at least one of u and F is nonzero. The assumption of finiteness of \hat{w}(\cdot) has been shown by Kramkov and Schachermayer (1999) to follow from natural regularity conditions.

Theorem. Under Condition A and the standing conditions on m, s, and r, for any w > 0, Merton's problem has the optimal consumption policy given by Equations (61) and (62) for a unique scalar g > 0.

Proof: Under Condition A, the Dominated Convergence Theorem implies that \hat{w}(\cdot) is continuous. Because one or both of I(\cdot, t) and I_F(\cdot) have (0, \infty) as their image and are strictly decreasing, \hat{w}(\cdot) inherits these two properties.
From this, given any initial wealth w > 0, there is a unique g^* with \hat{w}(g^*) = w. Let (c^*, Z^*) be defined by Equations (61) and (62), taking g = g^*. The previous proposition tells us there is a trading strategy q^* such that (c^*, Z^*, q^*) is budget-feasible. Let (c, Z, q) be any budget-feasible choice. The previous proposition also implies that (c, Z) satisfies Equation (50). For each (\omega, t), the first-order conditions (59) and (60) are sufficient (by concavity of u and F) for optimality of c^*(\omega, t) and Z^*(\omega) in the problems

    \sup_{c \in [0,\infty)} u(c, t) - g^* p(\omega, t) c,   and   \sup_{Z \in [0,\infty)} F(Z) - g^* p(\omega, T) Z,
respectively. Thus,

    u(c^*_t, t) - g^* p_t c^*_t \ge u(c_t, t) - g^* p_t c_t,  0 \le t \le T,    (64)

and

    F(Z^*) - g^* p_T Z^* \ge F(Z) - g^* p_T Z.    (65)

Integrating Equation (64) from 0 to T, adding Equation (65), taking expectations, and then applying the complementary-slackness condition (58) and the budget constraint (50), leaves U(c^*, Z^*) \ge U(c, Z). As (c, Z, q) is arbitrary, this implies the optimality of (c^*, Z^*, q^*).

In practice, solving the equation \hat{w}(g^*) = w for g^* may require a one-dimensional numerical search, which is straightforward because \hat{w}(\cdot) is strictly monotone. This result, giving a relatively explicit consumption solution to Merton's problem, has been extended in many directions, even generalizing the assumption of additive utility to allow for habit formation or recursive utility, as shown by Schroder and Skiadas (1999).

For a specific example, we treat terminal consumption only by taking u \equiv 0, and we let F(w) = w^a / a for a \in (0, 1). Then c^* = 0 and the calculations above imply that \hat{w}(g) = E[ p_T (g p_T)^{1/(a-1)} ]. Solving \hat{w}(g^*) = w for g^* leaves

    g^* = w^{a-1} \left( E\left[ p_T^{a/(a-1)} \right] \right)^{1-a}.

From Equation (62), Z^* = I_F(g^* p_T). Although this approach generates a straightforward solution for the optimal consumption policy, the form of the optimal trading strategy can be difficult to determine. For the special case of geometric Brownian price processes (constant m and s) and a constant short rate r, we can calculate that Z^* = W_T, where W is the geometric Brownian wealth process obtained from

    dW_t = W_t (r + f \cdot l) \, dt + W_t f^\top s \, dB_t;  W_0 = w,

where f = (s s^\top)^{-1} l / (1 - a) is the vector of fixed optimal portfolio fractions. More generally, in a Markov setting, one can derive a PDE for the wealth process, as for the pricing approach to the Black–Scholes option-pricing formula, and from the derivatives of the solution function obtain the associated trading strategy.
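The one-dimensional search for the multiplier mentioned above can be sketched concretely. A minimal illustration for the terminal-consumption example, assuming (my choice, not from the text) a log-normal distribution for the state-price density p_T and illustrative parameter values:

```python
import math
import random

# CRRA terminal-consumption example: u = 0, F(w) = w**a / a, so that
# what(g) = E[ p_T * (g * p_T)**(1/(a-1)) ] is strictly decreasing in g.
a = 0.5           # exponent in (0, 1), illustrative
w = 1.0           # initial wealth

random.seed(0)
# Illustrative log-normal samples of the state-price density p_T.
mu, s = -0.05, 0.2
pT = [math.exp(mu + s * random.gauss(0.0, 1.0)) for _ in range(50_000)]

def what(g):
    """Monte Carlo estimate of the required initial investment w-hat(g)."""
    return sum(p * (g * p) ** (1.0 / (a - 1.0)) for p in pT) / len(pT)

# Bisection on a wide bracket; the root is unique by strict monotonicity.
lo, hi = 1e-6, 1e6
for _ in range(80):
    mid = math.sqrt(lo * hi)          # geometric midpoint for the wide bracket
    if what(mid) > w:
        lo = mid                      # required investment too large: raise g
    else:
        hi = mid
g_star = math.sqrt(lo * hi)

# Closed form from the text: g* = w**(a-1) * (E[p_T**(a/(a-1))])**(1-a).
E_term = sum(p ** (a / (a - 1.0)) for p in pT) / len(pT)
g_closed = w ** (a - 1.0) * E_term ** (1.0 - a)
```

On the same sample the bisection recovers the closed-form multiplier, illustrating that only monotonicity of the function is needed when no closed form is available.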
Merton’s original stochastic-control approach, in a Markov setting, gives explicit solutions for the optimal trading strategy in terms of the derivatives of the value function solving the HJB equation. Although there are only a few examples in which these derivatives are
    • 686 D. Duffie known explicitly, they can be approximated by a numerical solution of the Hamilton– Jacobi–Bellman equation. This martingale approach to solving Problem (47) has been extended with duality techniques and other methods to cases of investment with constraints, including incomplete markets. See, for example, Cvitani´c and Karatzas (1996), Cvitani´c, Wang and Schachermayer (2001), Cuoco (1997), and the many sources cited by Karatzas and Shreve (1998). 4. Term-structure models This section reviews models of the term structure of interest rates. These models are used to analyze the dynamic behavior of bond yields and their relationships with macro-economic covariates, and also for the pricing and hedging of fixed-income securities, those whose future payoffs are contingent on future interest rates. Term- structure modeling is one of the most active and sophisticated areas of application of financial theory to everyday business problems, ranging from managing the risk of a bond portfolio to the design and pricing of collateralized mortgage obligations. In this section, we treat default-free instruments. In Section 6, we turn to defaultable bonds. This section provides only a small skeleton of the extensive literature on term-structure models. More extensive notes to the literature are found in Duffie (2001) and in the surveys by Dai and Singleton (2003) and Piazzesi (2002). We first treat the standard “single-factor” examples of Merton (1974), Cox, Ingersoll and Ross (1985a), Dothan (1978), Vasicek (1977), Black, Derman and Toy (1990), and some of their variants. These models treat the entire term structure of interest rates at any time as a function of a single state variable, the short rate of interest. We will then turn to multi-factor models, including multi-factor affine models, extending the Cox– Ingersoll–Ross and Vasicek models. 
Finally, we turn to the term-structure framework of Heath, Jarrow and Morton (1992), which allows, under technical conditions, any initial term structure of forward interest rates and any process for the conditional volatilities and correlations of these forward rates. Numerical tractability is essential for practical and econometric applications. One must fit model parameters from time-series or cross-sectional data on bond and derivative prices. A fitted model may be used to price or hedge related contingent claims. Typical numerical methods include “binomial trees,” Fourier- transform methods, Monte-Carlo simulation, and finite-difference solution of PDEs. Even the “zero curve” of discounts must be fitted to the prices of coupon bonds.18 In 18 See Adams and Van Deventer (1994), Coleman, Fisher and Ibbotson (1992), Diament (1993), Fisher, Nychka and Zervos (1994), Jaschke (1996), Konno and Takase (1995, 1996) and Svensson and Dahlquist (1996). Consistency of the curve-fitting method with an underlying term-structure model is examined by Bj¨ork and Christensen (1999), Bj¨ork and Gombani (1999) and Filipovi´c (1999).
econometric applications, bond or option prices must be solved repeatedly for a large sample of dates and instruments, for each of many candidate parameter choices.

We fix a probability space (\Omega, F, P) and a filtration F = \{F_t : 0 \le t \le T\} satisfying the usual conditions,^19 as well as a short-rate process r. We have departed from a dependence on Brownian information in order to allow for "surprise jumps", which are important in certain applications. A zero-coupon bond maturing at some future time s > t pays no dividends before time s, and offers a fixed lump-sum payment at time s that we can take without loss of generality to be 1 unit of account. Although it is not always essential to do so, we assume throughout that such a bond exists for each maturity date s. One of our main objectives is to characterize the price L_{t,s} at time t of the s-maturity bond, and its behavior over time.

We fix some equivalent martingale measure Q, after taking as a numeraire for deflation purposes the market value \exp[\int_0^t r(s)\,ds] of investments rolled over at the short-rate process r. The price at time t of the zero-coupon bond maturing at s is then

    L_{t,s} \equiv E^Q_t\left[ \exp\left( -\int_t^s r(u)\,du \right) \right].    (66)

The term structure is often expressed in terms of the yield curve. The continuously compounding yield y_{t,\tau} on a zero-coupon bond maturing at time t + \tau is defined by

    y_{t,\tau} = - \frac{\log L_{t, t+\tau}}{\tau}.

The term structure can also be represented in terms of forward interest rates, as explained later in this section.

4.1.
One-factor models A one-factor term-structure model means a model of r that satisfies a stochastic differential equation (SDE) of the form drt = m(rt, t) dt + s(rt, t) dBQ t , (67) where BQ is a standard Brownian motion under Q and where m: R × [0, T] → R and s: R × [0, T] → Rd satisfy technical conditions guaranteeing the existence of a solution to Equation (67) such that, for all t and s t, the price Lt,s of the zero-coupon bond maturing at s is finite and well defined by Equation (66). The one-factor models are so named because the Markov property (under Q) of the solution r to Equation (67) implies, from Equation (66), that the short rate is the only 19 For these technical conditions, see for example, Protter (1990).
Table 1
Common single-factor model parameters, Equation (68)

    Model                                  K0   K1   K2   H0   H1    \nu
    Cox, Ingersoll and Ross (1985a)        •    •              •    0.5
    Pearson and Sun (1994)                 •    •         •    •    0.5
    Dothan (1978)                                              •    1.0
    Brennan and Schwartz (1977)            •    •              •    1.0
    Merton (1974), Ho and Lee (1986)       •              •         1.0
    Vasicek (1977)                         •    •         •         1.0
    Black and Karasinski (1991)                 •    •         •    1.0
    Constantinides and Ingersoll (1984)                        •    1.5

state variable, or "factor", on which the current yield curve depends. That is, for all t and s \ge t, we can write y_{t,s} = F(t, s, r_t), for some fixed F: [0, T] \times [0, T] \times R \to R.

Table 1 shows many of the parametric examples of one-factor models appearing in the literature, with their conventional names. Each of these models is a special case of the SDE

    dr_t = [K_{0t} + K_{1t} r_t + K_{2t} r_t \log(r_t)] \, dt + [H_{0t} + H_{1t} r_t]^{\nu} \, dB^Q_t,    (68)

for deterministic coefficients K_{0t}, K_{1t}, K_{2t}, H_{0t} and H_{1t} depending continuously on t, and for some exponent \nu \in [0.5, 1.5]. Coefficient restrictions, and restrictions on the space of possible short rates, are needed for the existence and uniqueness of solutions. For each model, Table 1 shows the associated exponent \nu, and uses the symbol "•" to indicate those coefficients that appear in nonzero form. We can view a negative coefficient K_{1t} as a mean-reversion parameter, in that a higher short rate generates a lower drift, and vice versa. Empirically speaking, mean reversion is widely believed to be a useful attribute to include in single-factor short-rate models.^20 Non-parametric single-factor models are estimated by Aït-Sahalia (1996a,b, 2002). The empirical evidence, as examined for example by Dai and Singleton (2000), however, points strongly toward multifactor extensions, to which we will turn shortly.

20 In most cases, the original versions of these models had constant coefficients, and were only later extended to allow K_{it} and H_{it} to depend on t, for practical reasons, such as calibration of the model to a given set of bond and option prices.
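The parametric family (68) nests each model in Table 1 through its coefficient restrictions. A minimal simulation sketch of Equation (68) under Q, using an Euler discretization with constant Vasicek-type coefficients (K2 = H1 = 0; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Constant coefficients of Equation (68), Vasicek case:
# dr = (K0 + K1*r) dt + (H0 + H1*r)**nu dB, with K2 = H1 = 0.
K0, K1, H0, H1, nu = 0.01, -0.2, 0.01, 0.0, 1.0   # illustrative values
r0, T, steps, paths = 0.03, 10.0, 1000, 20_000
dt = T / steps

r = np.full(paths, r0)
for _ in range(steps):
    drift = K0 + K1 * r                 # K2 = 0: no r*log(r) term
    diffusion = (H0 + H1 * r) ** nu     # H1 = 0: constant volatility
    r = r + drift * dt + diffusion * np.sqrt(dt) * rng.standard_normal(paths)

# With K1 < 0 the short rate mean-reverts toward -K0/K1 = 0.05,
# and E[r_T] = 0.05 + exp(K1*T)*(r0 - 0.05).
print(r.mean())
```

With negative K1, the cross-sectional mean of the simulated short rates settles near the long-run level, illustrating the mean-reversion role of that coefficient.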
The Gaussian short-rate model of Merton (1974), who originated much of the approach taken here, was extended by Ho and Lee (1986), who developed the idea of calibrating the model to the current yield curve. The calibration idea was further developed by Black, Derman and Toy (1990), Hull and White (1990, 1993) and Black and Karasinski (1991), among others. Option evaluation and other applications of the Gaussian model are provided by Carverhill (1988), Jamshidian (1989a,b,c, 1991a, 1993b) and El Karoui and Rochet (1989). A popular special case of the Black–Karasinski model is the Black–Derman–Toy model.
For essentially any single-factor model, the term structure can be computed (numerically, if not explicitly) by taking advantage of the Feynman–Kac relationship between SDEs and PDEs. Fixing for convenience the maturity date s, the Feynman–Kac approach implies from Equation (66), under technical conditions on m and s, for all t, that L_{t,s} = f(r_t, t), where f \in C^{2,1}(R \times [0, T)) solves the PDE

    D f(x, t) - x f(x, t) = 0,  (x, t) \in R \times [0, s),    (69)

with boundary condition f(x, s) = 1, x \in R, where

    D f(x, t) = f_t(x, t) + f_x(x, t) m(x, t) + \tfrac{1}{2} f_{xx}(x, t) s(x, t)^2.

This PDE can be quickly solved using standard finite-difference numerical algorithms.

A subset of the models considered in Table 1, those with K_2 = H_1 = 0, are Gaussian.^21 Special cases are the models of Merton (1974) (often called "Ho–Lee") and Vasicek (1977). For a Gaussian model, we can show that bond-price processes are log-normal (under Q) by defining a new process y satisfying dy_t = -r_t \, dt, and noting that (r, y) is a two-dimensional Gaussian Markov process. Thus, for any t and s \ge t, the random variable y_s - y_t = -\int_t^s r_u \, du is normally distributed under Q, with a mean m(t, s) and variance v(s - t), conditional on F_t, that are easily computed in terms of r_t, K_0, K_1, and H_0. The conditional variance v(s - t) is deterministic. The conditional mean m(t, s) is of the form a(s - t) + b(s - t) r_t, for coefficients a(s - t) and b(s - t) whose calculation is left to the reader. It follows that

    L_{t,s} = E^Q_t\left[ \exp\left( -\int_t^s r_u \, du \right) \right]
            = \exp[ m(t, s) + v(s - t)/2 ]
            = \exp[ \bar{a}(s - t) + b(s - t) r(t) ],

where \bar{a}(s - t) = a(s - t) + v(s - t)/2. Because r_t is normally distributed under Q, this means that any zero-coupon bond price is log-normally distributed under Q. Using this property, one can compute bond-option prices in this setting using the original Black–Scholes formula.
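The finite-difference solution of the term-structure PDE (69) mentioned above can be sketched directly. The example below uses Vasicek coefficients so the result can be checked against that model's well-known closed-form bond price; the grid sizes, boundary treatment, and parameter values are my choices:

```python
import numpy as np

# Vasicek dynamics under Q (illustrative parameters):
# dr = kappa*(theta - r) dt + sigma dB
kappa, theta, sigma = 0.2, 0.05, 0.01
s = 1.0                                  # bond maturity

# Explicit backward scheme for  f_t + m*f_x + 0.5*sigma^2*f_xx - x*f = 0
# with terminal condition f(x, s) = 1.
x = np.linspace(-0.10, 0.20, 301)
dx = x[1] - x[0]
steps = 400                              # keeps the explicit scheme stable
dt = s / steps

f = np.ones_like(x)
for _ in range(steps):
    fx = (f[2:] - f[:-2]) / (2 * dx)                 # central first derivative
    fxx = (f[2:] - 2 * f[1:-1] + f[:-2]) / dx**2     # second derivative
    m = kappa * (theta - x[1:-1])
    f_new = f.copy()
    f_new[1:-1] = f[1:-1] + dt * (m * fx + 0.5 * sigma**2 * fxx
                                  - x[1:-1] * f[1:-1])
    f_new[0] = 2 * f_new[1] - f_new[2]               # linear boundary extrapolation
    f_new[-1] = 2 * f_new[-2] - f_new[-3]
    f = f_new

# Closed-form Vasicek price for comparison (standard expressions).
B = (1 - np.exp(-kappa * s)) / kappa
A = np.exp((theta - sigma**2 / (2 * kappa**2)) * (B - s)
           - sigma**2 * B**2 / (4 * kappa))
r0 = 0.05
fd_price = np.interp(r0, x, f)
exact = A * np.exp(-B * r0)
```

The explicit scheme is the simplest choice; implicit or Crank–Nicolson schemes relax its time-step restriction at the cost of a tridiagonal solve per step.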
For this, a key simplifying trick of Jamshidian (1989b) is to adopt as a new numeraire the zero-coupon bond maturing at the expiration date of the option. The associated equivalent martingale measure is sometimes called the forward measure. 21 By a Gaussian process, we mean that the short rates r(t1), . . . , r(tk ) at any finite set {t1, . . . , tk } of times have a joint normal distribution under Q.
    • 690 D. Duffie Under the new numeraire and the forward measure, the price of the bond underlying the option is log-normally distributed with a variance that is easily calculated, and the Black–Scholes formula can be applied. Aside from the simplicity of the Gaussian model, this explicit computation is one of its main advantages in applications. An undesirable feature of the Gaussian model, however, is that it implies that the short rate and yields on bonds of any maturity are negative with positive probability at any future date. While negative interest rates are sometimes plausible when expressed in “real” (consumption numeraire) terms, it is common in practice to express term structures in nominal terms, relative to the price of money. In nominal terms, negative bond yields imply a kind of arbitrage. In order to describe this arbitrage, we can formally view money as a security with no dividends whose price process is identically equal to 1. (This definition in itself is an arbitrage!) If a particular zero-coupon bond were to offer a negative yield, consider a short position in the bond (that is, borrowing) and a long position of an equal number of units of money, both held to the maturity of the bond. With a negative bond yield, the initial bond price is larger than 1, implying that this position is an arbitrage. To address properly the role of money in supporting nonnegative interest rates would, however, require a rather wide detour into monetary theory and the institutional features of money markets. Let us merely leave this issue with the sense that allowing negative interest rates is not necessarily “wrong,” but is somewhat undesirable. Gaussian short-rate models are nevertheless frequently used because they are relatively tractable and in light of the low likelihood that they would assign to negative interest rates within a reasonably short time, with reasonable choices for the coefficient functions. 
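The forward-measure computation described above can be sketched for the Vasicek special case, where it yields a Black–Scholes-type formula for a European call on a zero-coupon bond, in the spirit of Jamshidian (1989b). The bond-price and volatility expressions below are the standard Gaussian-model results, and all parameter values are illustrative:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def vasicek_bond(r0, tau, kappa, theta, sigma):
    """Closed-form Vasicek zero-coupon bond price for time-to-maturity tau."""
    B = (1 - math.exp(-kappa * tau)) / kappa
    A = math.exp((theta - sigma**2 / (2 * kappa**2)) * (B - tau)
                 - sigma**2 * B**2 / (4 * kappa))
    return A * math.exp(-B * r0)

def bond_call(r0, s, u, strike, kappa, theta, sigma):
    """European call, expiry s, on the zero-coupon bond maturing at u > s.
    Under the forward measure (numeraire: the s-maturity bond) the underlying
    bond is log-normal, so a Black-Scholes-type formula applies."""
    Ls = vasicek_bond(r0, s, kappa, theta, sigma)   # numeraire bond
    Lu = vasicek_bond(r0, u, kappa, theta, sigma)   # underlying bond
    B = (1 - math.exp(-kappa * (u - s))) / kappa
    sig_p = B * sigma * math.sqrt((1 - math.exp(-2 * kappa * s)) / (2 * kappa))
    d1 = math.log(Lu / (strike * Ls)) / sig_p + 0.5 * sig_p
    d2 = d1 - sig_p
    return Lu * phi(d1) - strike * Ls * phi(d2)

# Illustrative parameters.
price = bond_call(r0=0.05, s=1.0, u=2.0, strike=0.94,
                  kappa=0.2, theta=0.05, sigma=0.01)
```

As the rate volatility shrinks, the option value collapses to the (forward) intrinsic value Lu − strike·Ls, as expected from a Black-type formula.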
One of the best-known single-factor term-structure models is that of Cox, Ingersoll and Ross (1985b), the "CIR model", which exploits the stochastic properties of the diffusion model of population sizes of Feller (1951). For constant coefficient functions K_0, K_1, and H_1, the CIR drift and diffusion functions, m and s, may be written in the form

    m(x, t) = \kappa (\bar{x} - x);  s(x, t) = C \sqrt{x},  x \ge 0,    (70)

for constants \kappa, \bar{x}, and C. Provided \kappa and \bar{x} are non-negative, there is a nonnegative solution to the associated SDE (67). (Karatzas and Shreve (1988) offer a standard proof.) Given r_0, provided \kappa \bar{x} > C^2, we know that r_t has a non-central \chi^2 distribution under Q, with parameters that are known explicitly. The drift \kappa(\bar{x} - r_t) indicates reversion of r_t toward a stationary risk-neutral mean \bar{x} at a rate \kappa, in the sense that

    E^Q(r_t) = \bar{x} + e^{-\kappa t} (r_0 - \bar{x}),

which tends to \bar{x} as t goes to +\infty. Cox, Ingersoll and Ross (1985b) show how the coefficients \kappa, \bar{x}, and C can be calculated in a general equilibrium setting in terms of the utility function and endowment of a representative agent. For the CIR model,
it can be verified by direct computation of the derivatives that the solution for the term-structure PDE (69) is

    f(x, t) = \exp[ a(s - t) + b(s - t) x ],    (71)

where

    a(u) = \frac{2\kappa\bar{x}}{C^2} \left( \log\left( 2\gamma \exp[(\gamma + \kappa) u / 2] \right) - \log\left[ (\gamma + \kappa)(e^{\gamma u} - 1) + 2\gamma \right] \right),

    b(u) = \frac{2 (1 - e^{\gamma u})}{(\gamma + \kappa)(e^{\gamma u} - 1) + 2\gamma},

for \gamma = (\kappa^2 + 2 C^2)^{1/2}.

The Gaussian and Cox–Ingersoll–Ross models are special cases of single-factor models with the property that the solution f of the term-structure PDE (69) is given by the exponential-affine form (71) for some coefficients a(\cdot) and b(\cdot) that are continuously differentiable. For all t, the yield -\log[ f(x, t) ]/(s - t) obtained from Equation (71) is affine in x. We therefore call any such model an affine term-structure model. (A function g: R^k \to R, for some k, is affine if there are constants a and b in R^k such that for all x, g(x) = a + b \cdot x.) It turns out that, technicalities aside, m and s^2 are affine in x if and only if the term structure is itself affine in x. The idea that an affine term-structure model is typically associated with affine drift m and squared diffusion s^2 is foreshadowed in Cox, Ingersoll and Ross (1985b) and Hull and White (1990), and is explicit in Brown and Schaefer (1994). Filipović (2001a) provides a definitive result for affine term-structure models in a one-dimensional state space. We will get to multi-factor models shortly. The special cases associated with the Gaussian model and the CIR model have explicit solutions for a and b.

Cherif, El Karoui, Myneni and Viswanathan (1995), Constantinides (1992), El Karoui, Myneni and Viswanathan (1992), Jamshidian (1996) and Rogers (1995) characterize a model in which the short rate is a linear-quadratic form in a multivariate Markov Gaussian process.
This “LQG” class of models overlaps with the general affine models, as for example in Piazzesi (1999), although it remains to be seen how we would maximally nest the affine and quadratic Gaussian models in a simple and tractable framework. 4.2. Term-structure derivatives An important application of term-structure models is the arbitrage-free valuation of derivatives. Some of the most common derivatives are listed below, abstracting from many institutional details that can be found in a standard reference such as Sundaresan (1997). (a) A European option expiring at time s on a zero-coupon bond maturing at some later time u, with strike price p, is a claim to (Ls,u − p)+ at s.
(b) A forward-rate agreement (FRA) calls for a net payment by the fixed-rate payer of c^* - c(s) at time s, where c^* is a fixed payment and c(s) is a floating-rate payment for a time-to-maturity d, in arrears, meaning that c(s) = L^{-1}_{s-d, s} - 1 is the simple interest rate applying at time s - d for loans maturing at time s. In practice, we usually have a time to maturity, d, of one quarter or one half year. When originally sold, the fixed-rate payment c^* is usually set so that the FRA is at market, meaning of zero market value. Cox, Ingersoll and Ross (1981), Duffie and Stanton (1988) and Grinblatt and Jegadeesh (1996) consider the relative pricing of futures and forwards.

(c) An interest-rate swap is a portfolio of FRAs maturing at a given increasing sequence t(1), t(2), \ldots, t(n) of coupon dates. The inter-coupon interval t(i) - t(i-1) is usually 3 months or 6 months. The associated FRA for date t(i) calls for a net payment by the fixed-rate payer of c^* - c(t(i)), where the floating-rate payment received is c(t(i)) = L^{-1}_{t(i-1), t(i)} - 1, and the fixed-rate payment c^* is the same for all coupon dates. At initiation, the swap is usually at market, meaning that the fixed rate c^* is chosen so that the swap is of zero market value. Ignoring default risk and market imperfections, this would imply that the fixed-rate coupon c^* is the par coupon rate. That is, the at-market swap rate c^* is set at the origination date t of the swap so that

    1 = c^* \left( L_{t, t(1)} + \cdots + L_{t, t(n)} \right) + L_{t, t(n)},

meaning that c^* is the coupon rate on a par bond, one whose face value and initial market value are the same. Swap markets are analyzed by Brace and Musiela (1994), Carr and Chen (1996), Collin-Dufresne and Solnik (2001), Duffie and Huang (1996), Duffie and Singleton (1997), El Karoui and Geman (1994) and Sundaresan (1997). For institutional and general economic features of swap markets, see Lang, Litzenberger and Liu (1998) and Litzenberger (1992).
(d) A cap can be viewed as portfolio of “caplet” payments of the form (c(t(i)) − c∗ )+ , for a sequence of payment dates t(1), t(2), . . . , t(n) and floating rates c(t(i)) that are defined as for a swap. The fixed rate c∗ is set with the terms of the cap contract. For the valuation of caps, see, for example, Chen and Scott (1995), Clewlow, Pang and Strickland (1997), Miltersen, Sandmann and Sondermann (1997), and Scott (1997). The basic idea is to view a caplet as a put option on a zero-coupon bond. (e) A floor is defined symmetrically with a cap, replacing (c(t(i)) − c∗ )+ with (c∗ − c(t(i)))+ . (f) A swaption is an option to enter into a swap at a given strike rate c∗ at some exercise time. If the future time is fixed, the swaption is European. Pricing of European swaptions is developed in Gaussian settings by Jamshidian (1989a,b,c, 1991a), and more generally in affine settings by Berndt (2002), Collin-Dufresne and Goldstein (2002) and Singleton and Umantsev (2003). An important variant, the Bermudan swaption, allows exercise at any of a given set of successive coupon
    • Ch. 11: Intertemporal Asset Pricing Theory 693 dates. For valuation methods, see Andersen and Andreasen (2000b) and Longstaff and Schwartz (2001). Jamshidian (2001) and Rutkowski (1996, 1998) offer general treatments of LIBOR (London Interbank Offering Rate) derivative modeling.22 Path-dependent derivative securities, such as mortgage-backed securities, sometimes call for additional state variables.23 In a one-factor setting, suppose a derivative has a payoff at some given time s defined by g(rs). By the definition of an equivalent martingale measure, the price at time t for such a security is F (rt, t) ≡ EQ t exp − s t ru du g (rs) . Under technical conditions on m, s and g, we know that F solves the PDE, for (x, t) ∈ R × [0, s), Ft(x, t) + Fx(x, t) m(x, t) + 1 2 Fxx(x, t) s(x, t)2 − xF(x, t) = 0, (72) with boundary condition F(x, s) = g(x), x ∈ R. For example, the valuation of a zero-coupon bond option is given, in a one-factor setting, by the solution F to Equation (72), with boundary value g(x) = [ f (x, s) − p]+ , where f (x, s) is the price at time s of a zero-coupon bond maturing at u. 4.3. Fundamental solution Under technical conditions, we can also express the solution F of the PDE (72) for the value of a derivative term-structure security in the form F(x, t) = +∞ −∞ G(x, t, y, s) g( y) dy, (73) where G is the fundamental solution of the PDE (72). 
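Returning to the swap of item (c) above: the at-market swap rate is determined entirely by the zero-coupon discount factors at the coupon dates. A minimal sketch, ignoring day-count and accrual-fraction conventions and using illustrative numbers:

```python
def par_swap_rate(discounts):
    """At-market (par) swap rate c* solving
        1 = c* * (L_{t,t(1)} + ... + L_{t,t(n)}) + L_{t,t(n)},
    given the discount factors L_{t,t(1)}, ..., L_{t,t(n)}."""
    return (1.0 - discounts[-1]) / sum(discounts)

# Example: flat annually compounded rate of 4% with annual coupon dates.
r = 0.04
discounts = [1.0 / (1.0 + r) ** i for i in range(1, 11)]
c_star = par_swap_rate(discounts)
```

With a flat annually compounded curve and annual coupons, the par swap rate equals the flat rate itself, a useful sanity check on any discount-curve implementation.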
One may think of G(x, t, y, s) dy as the price at time t, state x, of an “infinitesimal security” paying one unit of account in 22 On the valuation of other specific forms of term-structure derivatives, see Artzner and Roger (1993), Bajeux-Besnainou and Portait (1998), Brace and Musiela (1994), Chacko and Das (2002), Chen and Scott (1992, 1993), Cherubini and Esposito (1995), Chesney, Elliott and Gibson (1993), Cohen (1995), Daher, Romano and Zacklad (1992), D´ecamps and Rochet (1997), El Karoui, Lepage, Myneni, Roseau and Viswanathan (1991a,b), and Turnbull (1993), Fleming and Whaley (1994) (wildcard options), Ingersoll (1977) (convertible bonds), Jamshidian (1993a, 1994) (diff swaps and quantos), Jarrow and Turnbull (1994), Longstaff (1990) (yield options), and Turnbull (1995). 23 The pricing of mortgage-backed securities based on term-structure models is pursued by Boudoukh, Richardson, Stanton and Whitelaw (1997), Cheyette (1996), Jakobsen (1992), Stanton (1995) and Stanton and Wallace (1995, 1998), who also review some of the related literature.
    • 694 D. Duffie the event that the state is at level y at time s, and nothing otherwise. One can compute the fundamental solution G by solving a PDE that is “dual” to Equation (72), in the following sense. Under technical conditions, for each (x, t) in R × [0, T), a function y ∈ C2,1 (R × (0, T]) is defined by y( y, s) = G(x, t, y, s), and solves the forward Kolmogorov equation (also known as the Fokker–Planck equation): D∗ y( y, s) − yy( y, s) = 0, (74) where D∗ y( y, s) = −ys( y, s) − ð ðy [y( y, s) m( y, s)] + 1 2 ð2 ðy2 y( y, s) s( y, s)2 . The “intuitive” boundary condition for Equation (74) is obtained from the role of G in pricing securities. Imagine that the current short rate at time t is x, and consider an instrument that pays one unit of account immediately, if and only if the current short rate is some number y. Presumably this contingent claim is valued at 1 unit of account if x = y, and otherwise has no value. From continuity in s, one can thus think of y(·, s) as the density at time s of a measure on R that converges as s ↓ t to a probability measure n with n({x}) = 1, sometimes called the Dirac measure at x. This initial boundary condition on y can be made more precise. See, for example, Karatzas and Shreve (1988) for details. Applications to term-structure modeling of the fundamental solution, sometimes erroneously called the “Green’s function,” are illustrated by Dash (1989), Beaglehole (1990), Beaglehole and Tenney (1991), B¨uttler and Waldvogel (1996), Dai (1994) and Jamshidian (1991b). For example, Beaglehole and Tenney (1991) show that the fundamental solution G of the Cox–Ingersoll–Ross model (70) is given explicitly in terms of the parameters ú, x and C by G(x, 0, y, t) = f(t) Iq f(t) xy e−gt exp [f(t)( y + x e−gt) − h(x + úxt − y)] egt y x q/2 , where g = (ú2 + 2C2 )1/2 , h = (ú − g)/C2 , f(t) = 2g C2(1 − e−gt) , q = 2úx C2 − 1, and Iq(·) is the modified Bessel function of the first kind of order q. 
For time-independent m and s, as with the CIR model, we have, for all t and s > t, G(x, t, y, s) = G(x, 0, y, s - t). The fundamental solution for the Dothan (log-normal) short-rate model can be deduced from the form of the solution by Hogan and Weintraub (1993) of what they call the "conditional discounting function". Chen (1996) provides the fundamental
    • Ch. 11: Intertemporal Asset Pricing Theory 695 solution for his 3-factor affine model. Van Steenkiste and Foresi (1999) provide a general treatment of fundamental solutions of the PDE for affine models. For more technical details and references see, for example, Karatzas and Shreve (1988). Given the fundamental solution G, the derivative asset-price function F is more easily computed by numerically integrating Equation (73) than from a direct numerical attack on the PDE (72). Thus, given a sufficient number of derivative securities whose prices must be computed, it may be worth the effort to compute G. 4.4. Multifactor term-structure models The one-factor model (67) for the short rate is limiting. Even a casual review of the empirical properties of the term structure, for example as reviewed in the surveys of Dai and Singleton (2003) and Piazzesi (2002), shows the significant potential improvements in fit offered by a multifactor term-structure model. While terminology varies from place to place, by a “multifactor” model we mean a model in which the short rate is of the form rt = R(Xt, t), t 0, where X is a Markov process with a state space D that is some subset of Rk , for k > 1. For example, in much of the literature, X is an Ito process solving a stochastic differential equation of the form dXt = m(Xt, t) dt + s(Xt, t) dBQ t , (75) where BQ is a standard Brownian motion in Rd under Q and the given functions R, m and s on D × [0, ∞) into R, Rk and Rk × d , respectively, satisfy enough technical regularity to guarantee that Equation (75) has a unique solution and that the term structure (66) is well defined. In empirical applications, one often supposes that the state process X also satisfies a stochastic differential equation under the probability measure P, in order to exploit the time-series behavior of observed prices and price- determining variables in estimating the model. There are various approaches for identifying the state vector Xt. 
In certain models, some or all elements of the state vector Xt are latent, that is, unobservable to the modeler, except insofar as they can be inferred from prices that depend on the levels of X . For example, k state variables might be identified from bond yields at k distinct maturities. Alternatively, one might use both bond and bond option prices, as in Singleton and Umantsev (2003) or Collin-Dufresne and Goldstein (2001b, 2002). This is typically possible once one knows the parameters, as explained below, but the parameters must of course be estimated at the same time as the latent states are estimated. This latent-variable approach has nevertheless been popular in much of the empirical literature. Notable examples include Dai and Singleton (2000), and references cited by them. Another approach is to take some or all of the state variables to be directly observable variables, such as macro-economic determinants of the business cycle
    • 696 D. Duffie and inflation, that are thought to play a role in determining the term structure. This approach has also been explored by Piazzesi (1999), among others.24 A derivative security, in this setting, can often be represented in terms of some real- valued terminal payment function g on Rk , for some maturity date s T. By the definition of an equivalent martingale measure, the associated derivative security price is F (Xt, t) = EQ t exp − s t R (Xu, u) du g (Xs) . For the case of a diffusion state process X satisfying Equation (75), extending Equation (72), under technical conditions we have the PDE characterization DF(x, t) − R(x, t) F(x, t) = 0, (x, t) ∈ D × [0, s), (76) with boundary condition F(x, s) = g(x), x ∈ D, (77) where DF(x, t) = Ft(x, t) + Fx(x, t) m(x, t) + 1 2 tr s(x, t) s(x, t) Fxx(x, t) . The case of a zero-coupon bond is g(x) ≡ 1. Under technical conditions, we can also express the solution F, as in Equation (73), in terms of the fundamental solution G of the PDE (76). 4.5. Affine models Many financial applications including term-structure modeling are based on a state process that is Markov, under some reference probability measure that, depending on the application, may or may not be an equivalent martingale measure. We will fix the probability measure P for the current discussion. A useful assumption is that the Markov state process is “affine”. While several equivalent definitions of the class of affine processes can be usefully applied, perhaps the simplest definition of the affine property for a Markov process X in a state space D ⊂ Rd is that its conditional characteristic function is of the form, for any u ∈ Rd , E (exp [iu · X (t)] | X (s)) = exp [f(t − s, u) + y(t − s, u) · X (s)] . (78) for some deterministic coefficients f(t − s, u) and y(t − s, u). 
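The affine property (78) can be checked directly in the simplest case, a Gaussian Ornstein–Uhlenbeck state process (the Vasicek state variable). The conditional-moment expressions below are standard OU facts rather than formulas from the text, and the parameter values are illustrative:

```python
import numpy as np

# Ornstein-Uhlenbeck state process dX = kappa*(theta - X) dt + sigma dB,
# the simplest affine example; all parameter values are illustrative.
kappa, theta, sigma = 0.5, 0.05, 0.2
x0, t, u = 0.03, 1.0, 2.0

# Conditional mean and variance of X_t given X_0 = x0.
m = theta + (x0 - theta) * np.exp(-kappa * t)
v = sigma**2 * (1 - np.exp(-2 * kappa * t)) / (2 * kappa)

# Normal characteristic function exp(i*u*m - u^2*v/2), rearranged into the
# exponential-affine form exp(phi + psi * x0) of Equation (78):
psi = 1j * u * np.exp(-kappa * t)
phi = 1j * u * theta * (1 - np.exp(-kappa * t)) - u**2 * v / 2
cf_affine = np.exp(phi + psi * x0)

# Monte Carlo check using exact sampling of the OU transition.
rng = np.random.default_rng(0)
X = m + np.sqrt(v) * rng.standard_normal(200_000)
cf_mc = np.exp(1j * u * X).mean()
```

The exponent is affine in the initial state x0, exactly as Equation (78) requires, and the simulated characteristic function agrees with the closed form up to Monte Carlo error.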
Duffie, Filipović and Schachermayer (2003) show that, for a time-homogeneous25 affine process $X$ with a

24 See also Babbs and Webber (1994), Balduzzi, Bertola, Foresi and Klapper (1998) and Piazzesi (1997). On modeling the term structure of real interest rates, see Brown and Schaefer (1996) and Pennacchi (1991).
25 Filipović (2001b) extends to the time-inhomogeneous case.
state space of the form $\mathbb{R}^n_+ \times \mathbb{R}^{d-n}$, provided the coefficients $\varphi(\cdot)$ and $\psi(\cdot)$ of the characteristic function are differentiable and their derivatives are continuous at 0, the affine process $X$ must be a jump-diffusion process, in that
$$dX_t = \mu(X_t)\, dt + \sigma(X_t)\, dB_t + dJ_t, \qquad (79)$$
for a standard Brownian motion $B$ in $\mathbb{R}^d$ and a pure-jump process $J$, and moreover the drift $\mu(X_t)$, the "instantaneous" covariance matrix $\sigma(X_t)\sigma(X_t)^\top$, and the jump measure associated with $J$ must all have affine dependence on the state $X_t$. This result also provides necessary and sufficient conditions on the coefficients of the drift, diffusion, and jump measure for the process to be a well-defined affine process, and provides that the coefficients $\varphi(\cdot, u)$ and $\psi(\cdot, u)$ of the characteristic function satisfy a certain (generalized Riccati) ordinary differential equation (ODE), the key to tractability for this class of processes.26 Conversely, any jump-diffusion whose coefficients are of this affine class is an affine process in the sense of Equation (78). A complete statement of this result is found in Duffie, Filipović and Schachermayer (2003).

Simple examples of affine processes used in financial modeling are the Gaussian Ornstein–Uhlenbeck model, applied to interest rates by Vasicek (1977), and the Feller (1951) diffusion, applied to interest-rate modeling by Cox, Ingersoll and Ross (1985b), as already mentioned in the context of one-factor models. A general multivariate class of affine jump-diffusion models was introduced by Duffie and Kan (1996) for term-structure modeling. Dai and Singleton (2000) classified 3-dimensional affine diffusion models, and found evidence in U.S. swap-rate data that both time-varying conditional variances and negatively correlated state variables are essential ingredients to explaining the historical behavior of term structures.
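As a numerical illustration of the affine property (78), one can check the Gaussian Ornstein–Uhlenbeck (Vasicek) case by simulation. The following Python sketch is not from the chapter; all parameter values are illustrative. It compares a Monte Carlo estimate of the conditional characteristic function with its exponential-affine form.

```python
import numpy as np

# Monte Carlo check of the affine property (78) for the Vasicek model
#   dX_t = kappa*(theta - X_t) dt + sigma dB_t.
# X_t given X_0 = x0 is normal with mean theta + (x0 - theta)*exp(-kappa*t)
# and variance sigma^2*(1 - exp(-2*kappa*t))/(2*kappa), so its characteristic
# function is exponential-affine in x0. All parameter values are illustrative.
kappa, theta, sigma = 0.8, 0.05, 0.02
x0, t, u = 0.03, 2.0, 3.0

rng = np.random.default_rng(0)
n, steps = 200_000, 400
dt = t / steps
x = np.full(n, x0)
for _ in range(steps):                      # Euler scheme under the reference measure
    x += kappa * (theta - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
mc_cf = np.mean(np.exp(1j * u * x))         # Monte Carlo characteristic function

# Exponential-affine form exp(phi + psi*x0), as in Equation (78)
psi = 1j * u * np.exp(-kappa * t)
var = sigma**2 * (1 - np.exp(-2 * kappa * t)) / (2 * kappa)
phi = 1j * u * theta * (1 - np.exp(-kappa * t)) - 0.5 * u**2 * var
exact_cf = np.exp(phi + psi * x0)
print(abs(mc_cf - exact_cf))                # small: simulation agrees with (78)
```

Here the Gaussian transition law makes the exponential-affine coefficients explicit; for general affine jump-diffusions they are instead obtained from the generalized Riccati ODEs.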
For option pricing, there is a substantial literature building on the particular affine stochastic-volatility model for currency and equity prices proposed by Heston (1993). Bates (1997), Bakshi, Cao and Chen (1997), Bakshi and Madan (2000) and Duffie, Pan and Singleton (2000) brought more general affine models to bear in order to allow for stochastic volatility and jumps, while maintaining and exploiting the simple property (78).

A key property related to Equation (78) is that, for any affine function $R: D \to \mathbb{R}$ and any $w \in \mathbb{R}^d$, subject only to technical conditions reviewed in Duffie, Filipović and Schachermayer (2003),
$$E_t\left[\exp\left(-\int_t^s R(X(u))\, du + w \cdot X(s)\right)\right] = \exp\left[\alpha(s-t) + \beta(s-t) \cdot X(t)\right], \qquad (80)$$
for coefficients $\alpha(\cdot)$ and $\beta(\cdot)$ that satisfy generalized Riccati ODEs (with real boundary conditions) of the same type solved by $\varphi$ and $\psi$ of Equation (78), respectively.

26 Recent work, yet to be distributed, by Martino Graselli of CREST, Paris, and Claudio Tebaldi, provides explicit solutions for the Riccati equations of multi-factor affine diffusion processes.
In order to get a quick sense of how the Riccati equations for $\alpha(\cdot)$ and $\beta(\cdot)$ arise, we consider the special case of an affine diffusion process $X$ solving the stochastic differential equation (79), with state space $D = \mathbb{R}_+$, and with $\mu(x) = a + bx$ and $\sigma^2(x) = c^2 x$, for constant coefficients $a$, $b$ and $c$. (This is the continuous branching process of Feller (1951).) We let $R(x) = \rho_0 + \rho_1 x$, for constants $\rho_0$ and $\rho_1$, and apply the Feynman–Kac partial differential equation (PDE) (69) to the candidate solution $\exp[\alpha(s-t) + \beta(s-t)\,x]$ of Equation (80). After calculating all terms of the PDE and then dividing each term by the common factor $\exp[\alpha(s-t) + \beta(s-t)\,x]$, we arrive at
$$-\alpha'(z) - \beta'(z)\,x + \beta(z)(a + bx) + \tfrac{1}{2}\beta(z)^2 c^2 x - \rho_0 - \rho_1 x = 0, \qquad (81)$$
for all $z \ge 0$. Collecting terms in $x$, we have
$$u(z)\,x + v(z) = 0, \qquad (82)$$
where
$$u(z) = -\beta'(z) + \beta(z)\,b + \tfrac{1}{2}\beta(z)^2 c^2 - \rho_1, \qquad (83)$$
$$v(z) = -\alpha'(z) + \beta(z)\,a - \rho_0. \qquad (84)$$
Because Equation (82) must hold for all $x$, it must be the case that $u(z) = v(z) = 0$. This leaves the Riccati equations
$$\beta'(z) = \beta(z)\,b + \tfrac{1}{2}\beta(z)^2 c^2 - \rho_1, \qquad (85)$$
$$\alpha'(z) = \beta(z)\,a - \rho_0, \qquad (86)$$
with the boundary conditions $\alpha(0) = 0$ and $\beta(0) = w$, from Equation (80) for $s = t$. The explicit solutions for $\alpha(z)$ and $\beta(z)$ were stated earlier for the CIR model (the case $w = 0$), and are given explicitly in a more general case with jumps, called a "basic affine process", in Duffie and Gârleanu (2001).

Beyond the Gaussian case, any Ornstein–Uhlenbeck process, whether driven by a Brownian motion (as for the Vasicek model) or by a more general Lévy process with jumps, as in Sato (1999), is affine. Moreover, any continuous-branching process with immigration (CBI process), including multi-type extensions of the Feller process, is affine. [See Kawazu and Watanabe (1971).] Conversely, an affine process in $\mathbb{R}^d_+$ is a CBI process.

For term-structure modeling,27 the state process $X$ is typically assumed to be affine under a given equivalent martingale measure $Q$.
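The coefficient ODEs (85)–(86) are also easy to integrate numerically. As a check on the derivation, the following Python sketch (parameter values illustrative, not from the chapter) integrates them for the case $w = 0$ and compares the resulting bond price with the familiar CIR closed form.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Numerical check of Equation (80) in the Feller/CIR special case
#   dX_t = (a + b X_t) dt + c sqrt(X_t) dB_t,  R(x) = rho0 + rho1*x,  w = 0,
# where exp[alpha(z) + beta(z)*x] is the zero-coupon bond price at maturity z.
# alpha, beta denote the coefficient functions of Equations (85)-(86).
kappa, theta, sigma = 0.5, 0.05, 0.10          # CIR in mean-reverting form
a, b, c = kappa * theta, -kappa, sigma         # drift a + b*x, diffusion c*sqrt(x)
rho0, rho1 = 0.0, 1.0
tau, x0 = 5.0, 0.04

def riccati(z, y):
    alpha, beta = y
    return [beta * a - rho0,                           # Equation (86)
            beta * b + 0.5 * beta**2 * c**2 - rho1]    # Equation (85)

sol = solve_ivp(riccati, [0.0, tau], [0.0, 0.0], rtol=1e-10, atol=1e-12)
alpha, beta = sol.y[:, -1]
price_ode = np.exp(alpha + beta * x0)

# Standard closed-form CIR bond price, for comparison
g = np.sqrt(kappa**2 + 2 * sigma**2)
denom = (g + kappa) * (np.exp(g * tau) - 1) + 2 * g
B = 2 * (np.exp(g * tau) - 1) / denom
A = (2 * g * np.exp((g + kappa) * tau / 2) / denom) ** (2 * kappa * theta / sigma**2)
price_cf = A * np.exp(-B * x0)
print(price_ode, price_cf)  # agree to high precision
```

The numerical route generalizes directly to multivariate affine models for which no closed form is available, which is the sense in which the Riccati ODEs are "the key to tractability".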
For econometric modeling of

27 Special cases of affine term-structure models include those of Balduzzi, Das and Foresi (1998), Balduzzi, Das, Foresi and Sundaram (1996), Baz and Das (1996), Berardi and Esposito (1999), Chen (1996), Cox, Ingersoll and Ross (1985b), Das (1993, 1995, 1997, 1998), Das and Foresi (1996), Duffie and Kan (1996), Duffie, Pedersen and Singleton (2003), Heston (1988), Langetieg (1980), Longstaff and Schwartz (1992, 1993), Pang and Hodges (1995) and Selby and Strickland (1993).
bond yields, the affine assumption is sometimes also made under the data-generating measure $P$, although Duffee (1999b) suggests that this is overly restrictive from an empirical viewpoint, at least for 3-factor models of interest rates in the USA that do not have jumps. For general reviews of this issue, and summaries of the empirical evidence on affine term-structure models, see Dai and Singleton (2003) and Piazzesi (2002). The affine class allows for the analytic calculation of prices of options on zero-coupon bonds and other derivative securities, as reviewed in Section 5, and extends to the case of defaultable models, as we show in Section 6. For related computational results, see Liu, Pan and Pedersen (1999) and Van Steenkiste and Foresi (1999). Singleton (2001) exploits the explicit form of the characteristic function of affine models to provide a class of moment conditions for econometric estimation.

4.6. The HJM model of forward rates

We turn to the term-structure model of Heath, Jarrow and Morton (1992). Until this point, we have taken as the primitive a model of the short-rate process of the form $r_t = R(X_t, t)$, where (under some equivalent martingale measure) $X$ is a finite-dimensional Markov process. This approach has analytical advantages, especially for derivative pricing and statistical modeling. A more general approach that is especially popular in business applications is to model directly the risk-neutral stochastic behavior of the entire term structure of interest rates. This is the essence of the Heath–Jarrow–Morton (HJM) model. The remainder of this section is a summary of the basic elements of the HJM model.

If the discount $L_{t,s}$ is differentiable with respect to the maturity date $s$, a mild regularity, we can write
$$L_{t,s} = \exp\left(-\int_t^s f(t,u)\, du\right),$$
where
$$f(t,u) = -\frac{1}{L_{t,u}} \frac{\partial L_{t,u}}{\partial u}.$$
The term structure can thus be represented in terms of the instantaneous forward rates $\{f(t,u): u \ge t\}$.
The HJM approach is to take as primitive a particular stochastic model of these forward rates. First, for each fixed maturity date $s$, one models the one-dimensional forward-rate process $f(\cdot, s) = \{f(t,s): 0 \le t \le s\}$ as an Ito process, in that
$$f(t,s) = f(0,s) + \int_0^t \mu(u,s)\, du + \int_0^t \sigma(u,s)\, dB^Q_u, \quad 0 \le t \le s, \qquad (87)$$
where $\mu(\cdot, s) = \{\mu(t,s): 0 \le t \le s\}$ and $\sigma(\cdot, s) = \{\sigma(t,s): 0 \le t \le s\}$ are adapted processes valued in $\mathbb{R}$ and $\mathbb{R}^d$, respectively, such that Equation (87) is well defined.28 Under purely technical conditions, it must be the case that
$$\mu(t,s) = \sigma(t,s) \cdot \int_t^s \sigma(t,u)\, du. \qquad (88)$$
In order to confirm this key risk-neutral drift restriction (88), consider the $Q$-martingale $M$ defined by
$$M_t = E^Q_t\left[\exp\left(-\int_0^s r_u\, du\right)\right] = \exp\left(-\int_0^t r_u\, du\right) L_{t,s} = \exp(X_t + Y_t), \qquad (89)$$
where
$$X_t = -\int_0^t r_u\, du; \qquad Y_t = -\int_t^s f(t,u)\, du.$$
We can view $Y$ as an infinite sum of the Ito processes for forward rates over all maturities ranging from $t$ to $s$. Under technical conditions29 for Fubini's Theorem for stochastic integrals, we thus have
$$dY_t = \mu_Y(t)\, dt + \sigma_Y(t)\, dB^Q_t,$$
where
$$\mu_Y(t) = f(t,t) - \int_t^s \mu(t,u)\, du \quad \text{and} \quad \sigma_Y(t) = -\int_t^s \sigma(t,u)\, du.$$
We can then apply Ito's Formula in the usual way to $M_t = e^{X_t + Y_t}$ and obtain the drift under $Q$ of $M$ as
$$\mu_M(t) = M_t\left[\mu_Y(t) + \tfrac{1}{2}\,\sigma_Y(t) \cdot \sigma_Y(t) - r_t\right].$$
Because $M$ is a $Q$-martingale, we must have $\mu_M = 0$, so, substituting $\mu_Y(t)$ into this equation, we obtain
$$\int_t^s \mu(t,u)\, du = \tfrac{1}{2}\left(\int_t^s \sigma(t,u)\, du\right) \cdot \left(\int_t^s \sigma(t,u)\, du\right).$$
Taking the derivative of each side with respect to $s$ then leaves the risk-neutral drift restriction (88), which in turn provides, naturally, the property that $r(t) = f(t,t)$.

28 The necessary and sufficient condition is that, almost surely, $\int_0^s |\mu(t,s)|\, dt < \infty$ and $\int_0^s \sigma(t,s) \cdot \sigma(t,s)\, dt < \infty$.
29 In addition to measurability, it suffices that $\mu(t,u,\omega)$ and $\sigma(t,u,\omega)$ are uniformly bounded and, for each $\omega$, continuous in $(t,u)$. For weaker conditions, see Protter (1990).
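The drift restriction (88) can be seen at work in the simplest special case of a constant forward-rate volatility $\sigma(t,s) = \sigma$ (a Ho–Lee type model), for which $\mu(t,s) = \sigma^2(s-t)$ and hence $r_t = f(0,t) + \sigma^2 t^2/2 + \sigma B^Q_t$. The following Python sketch (a Monte Carlo illustration with illustrative parameters and a flat initial curve, not from the chapter) checks that, with the restriction imposed, risk-neutral discounting recovers the initial bond price.

```python
import numpy as np

# Monte Carlo illustration of the HJM drift restriction (88) for constant
# volatility sigma(t,s) = sigma, where mu(t,s) = sigma^2*(s - t) and
#   r_t = f(0,t) + sigma^2 * t^2 / 2 + sigma * B_t.
# Under the restriction, E^Q[exp(-int_0^s r_u du)] should equal the initial
# bond price exp(-int_0^s f(0,u) du). Parameter values are illustrative.
f0, sigma, s = 0.03, 0.01, 1.0           # flat initial forward curve f(0,u) = f0
n, steps = 200_000, 200
dt = s / steps
rng = np.random.default_rng(1)

B = np.zeros(n)                          # Brownian paths under Q
integral_r = np.zeros(n)                 # running integral of the short rate
for k in range(steps):
    B += np.sqrt(dt) * rng.standard_normal(n)
    t = (k + 1) * dt
    r = f0 + 0.5 * sigma**2 * t**2 + sigma * B
    integral_r += r * dt

mc_price = np.mean(np.exp(-integral_r))
print(mc_price, np.exp(-f0 * s))         # agree up to discretization/MC error
```

The $\sigma^2 t^2/2$ convexity term in the short rate is exactly what the restriction (88) contributes; omitting it would bias the simulated bond price away from $\exp(-\int_0^s f(0,u)\,du)$.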
Thus, the initial forward rates $\{f(0,s): 0 \le s \le T\}$ and the forward-rate "volatility" process $\sigma$ can be specified with nothing more than technical restrictions, and these are enough to determine all bond and interest-rate derivative price processes. Aside from the Gaussian special case associated with deterministic volatility $\sigma(t,s)$, however, most valuation work in the HJM setting is typically done by Monte Carlo simulation. Special cases aside,30 there is no finite-dimensional state variable for the HJM model, so PDE-based computational methods cannot be used.

The HJM model has been extensively treated in the case of Gaussian instantaneous forward rates by Jamshidian (1989b), who developed the forward-measure approach, and Jamshidian (1989a,c, 1991a) and El Karoui and Rochet (1989), and extended by El Karoui, Lepage, Myneni, Roseau and Viswanathan (1991a,b), El Karoui and Lacoste (1992), Frachot (1995), Frachot, Janci and Lacoste (1993), Frachot and Lesne (1993) and Miltersen (1994). A related model of log-normal discrete-period interest rates, the "market model", was developed by Miltersen, Sandmann and Sondermann (1997).31

Musiela (1994b) suggested treating the entire forward-rate curve $g(t,u) = f(t, t+u)$, $0 \le u < \infty$, itself as a Markov process. Here, $u$ indexes time to maturity, not date of maturity. That is, we treat the term structure $g(t) = g(t, \cdot)$ as an element of some convenient state space $S$ of real-valued continuously differentiable functions on $[0, \infty)$. Now, letting $v(t,u) = \sigma(t, t+u)$, the risk-neutral drift restriction (88) on $f$, and enough regularity, imply the stochastic partial differential equation (SPDE) for $g$ given by
$$dg(t,u) = \frac{\partial g(t,u)}{\partial u}\, dt + V(t,u)\, dt + v(t,u)\, dB^Q_t,$$
where
$$V(t,u) = v(t,u) \cdot \int_0^u v(t,z)\, dz.$$
This formulation is an example of a rather delicate class of SPDEs that are called "hyperbolic".
Existence is usually not shown, or shown only in a "weak sense", as by Kusuoka (2000). The idea is nevertheless elegant and potentially important in getting a parsimonious treatment of the yield curve as a Markov process. One may even allow

30 See Au and Thurston (1993), Bhar and Chiarella (1995), Cheyette (1995), Jeffrey (1995), Musiela (1994b), Ritchken and Sankarasubramaniam (1992) and Ritchken and Trevor (1993).
31 See also Andersen and Andreasen (2000a), Brace and Musiela (1995), Dothan (1978), Goldberg (1998), Goldys, Musiela and Sondermann (1994), Hansen and Jorgensen (2000), Hogan and Weintraub (1993), Jamshidian (1997a,b, 2001), Sandmann and Sondermann (1997), Miltersen, Sandmann and Sondermann (1997), Musiela (1994a) and Vargiolu (1999). A related log-normal futures-price term-structure model is due to Heath (1998).
the Brownian motion $B^Q$ to be "infinite-dimensional". For related work in this setting, sometimes called a string, random field, or SPDE model of the term structure, see Cont (1998), Jong and Santa-Clara (1999), Goldstein (1997, 2000), Goldys and Musiela (1996), Hamza and Klebaner (1995), Kennedy (1994), Kusuoka (2000), Musiela and Sondermann (1994), Pang (1996), Santa-Clara and Sornette (2001) and Sornette (1998).

5. Derivative pricing

We turn to a review of the pricing of derivative securities, treating first futures and forwards, and then turning to options. The literature is immense, and we shall again merely provide a brief summary of results. Again, we fix a probability space $(\Omega, \mathcal{F}, P)$ and a filtration $\mathbb{F} = \{\mathcal{F}_t: 0 \le t \le T\}$ satisfying the usual conditions, as well as a short-rate process $r$.

5.1. Forward and futures prices

We briefly address the pricing of forward and futures contracts, an important class of derivatives. The forward contract is the simpler of these two closely related securities. Let $W$ be an $\mathcal{F}_T$-measurable finite-variance random variable underlying the claim payable to a holder of the forward contract at its delivery date $T$. For example, with a forward contract for delivery of a foreign currency at time $T$, the random variable $W$ is the market value at time $T$ of the foreign currency. The forward-price process $F$ is defined by the fact that one forward contract at time $t$ is a commitment to pay the net amount $F_t - W$ at time $T$, with no other cash flows at any time. In particular, the true price of a forward contract, at the contract date, is zero.

We fix an equivalent martingale measure $Q$ for the available securities, after deflation by $\exp[-\int_0^t r_u\, du]$, where $r$ is a short-rate process that, for convenience, is assumed to be bounded. The dividend process $H$ defined by the forward contract made at time $t$ is given by $H_s = 0$, $s < T$, and $H_T = W - F_t$.
Because the true price of the forward contract at $t$ is zero,
$$0 = E^Q_t\left[\exp\left(-\int_t^T r_s\, ds\right)(W - F_t)\right].$$
Solving for the forward price,
$$F_t = \frac{E^Q_t\left[\exp\left(-\int_t^T r_s\, ds\right) W\right]}{E^Q_t\left[\exp\left(-\int_t^T r_s\, ds\right)\right]}.$$
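As a quick numerical illustration of this ratio formula (not from the chapter), take $W = S_T$ for a geometric Brownian motion $S$ under $Q$ with a constant short rate $r$; the discount factor then cancels and the forward price reduces to $S_0 e^{rT}$. A Python sketch with illustrative parameters:

```python
import numpy as np

# Monte Carlo check of the forward-price formula in the special case
# W = S_T, with S a geometric Brownian motion under Q and a constant short
# rate r, so that F_0 = E^Q[e^{-rT} S_T] / E^Q[e^{-rT}] = S_0 * exp(r*T).
# Parameter values are illustrative.
S0, r, vol, T = 100.0, 0.04, 0.3, 1.0
rng = np.random.default_rng(2)
Z = rng.standard_normal(500_000)
ST = S0 * np.exp((r - 0.5 * vol**2) * T + vol * np.sqrt(T) * Z)

disc = np.exp(-r * T)                       # deterministic discount factor
F0 = np.mean(disc * ST) / disc              # numerator over denominator of the formula
print(F0, S0 * np.exp(r * T))               # agree up to Monte Carlo error
```

With stochastic interest rates correlated with $W$, the discount factor no longer cancels, which is the source of the forward–futures distinction discussed below.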
If we assume that there exists at time $t$ a zero-coupon riskless bond maturing at time $T$, with price $L_{t,T}$, then
$$F_t = \frac{1}{L_{t,T}}\, E^Q_t\left[\exp\left(-\int_t^T r_s\, ds\right) W\right].$$
If $r$ and $W$ are statistically independent with respect to $Q$, we have the simplified expression $F_t = E^Q_t(W)$, implying that the forward price is a $Q$-martingale. This would be true, for instance, if the short-rate process $r$ is deterministic.

As an example, suppose that the forward contract is for delivery at time $T$ of one unit of a particular security with price process $S$ and cumulative dividend process $D$. In particular, $W = S_T$. We can obtain a more concrete representation of the forward price, as follows. We have
$$F_t = \frac{1}{L_{t,T}}\left[S_t - E^Q_t\left(\int_t^T \exp\left(-\int_t^s r_u\, du\right) dD_s\right)\right].$$
If the short-rate process $r$ is deterministic, we can simplify further to
$$F_t = \frac{S_t}{L_{t,T}} - E^Q_t\left[\int_t^T \exp\left(\int_s^T r_u\, du\right) dD_s\right], \qquad (90)$$
which is known as the cost-of-carry formula for forward prices, for the case in which interest rates and dividends are deterministic.

As with a forward contract, a futures contract with delivery date $T$ is keyed to some delivery value $W$, which we take to be an $\mathcal{F}_T$-measurable random variable with finite variance. The contract is completely defined by a futures-price process $\Phi$ with the property that $\Phi_T = W$. As we shall see, the contract is literally a security whose price process is zero and whose cumulative dividend process is $\Phi$. In other words, changes in the futures price are credited to the holder of the contract as they occur. This definition is an abstraction of the traditional notion of a futures contract, which calls for the holder of one contract at the delivery time $T$ to accept delivery of some asset (whose spot market value at $T$ is represented here by $W$) in return for simultaneous payment of the current futures price $\Phi_T$.
Likewise, the holder of $-1$ contract, also known as a short position of one contract, is traditionally obliged to make delivery of the same underlying asset in exchange for the current futures price $\Phi_T$. This informally justifies the property $\Phi_T = W$ of the futures-price process $\Phi$ given in the definition above. Roughly speaking, if $\Phi_T$ were not equal to $W$ (and if we continue to neglect transactions costs and other details), there would be a delivery arbitrage. We won't explicitly define a delivery arbitrage, since it would only complicate the analysis of futures prices that follows. Informally, however, in the event that $W > \Phi_T$, one could buy one futures contract at time $T$, accept immediate delivery of the underlying asset in return for payment of $\Phi_T$, and sell the asset at its market value $W$, for a profit of $W - \Phi_T$. Thus, the potential of delivery
arbitrage will naturally equate $\Phi_T$ with the delivery value $W$. This is sometimes known as the principle of convergence.

Many modern futures contracts have streamlined procedures that avoid the delivery process. For these, the only link that exists with the notion of delivery is that the terminal futures price $\Phi_T$ is contractually equated to some such variable $W$, which could be the price of some commodity or security, or even some abstract variable of general economic interest such as a price deflator. This procedure, finessing the actual delivery of some asset, is known as cash settlement. In any case, whether based on cash settlement or the absence of delivery arbitrage, we shall always take it by definition that the delivery futures price $\Phi_T$ is equal to the given delivery value $W$.

The institutional feature of futures markets that is central to our analysis of futures prices is resettlement, the process that generates daily or even more frequent payments to and from the holders of futures contracts based on changes in the futures price. As with the expression "forward price", the term "futures price" can be misleading, in that the futures price $\Phi_t$ at time $t$ is not at all the price of the contract. Instead, at each resettlement time $t$, an investor who has held $q$ futures contracts since the last resettlement time, say $s$, receives the resettlement payment $q(\Phi_t - \Phi_s)$, following the simplest resettlement recipe. More complicated resettlement arrangements often apply in practice. The continuous-time abstraction is to take the futures-price process $\Phi$ to be an Ito process and a futures position process to be some $q$ in $L(\Phi)$ generating the resettlement gain $\int q\, d\Phi$ as a cumulative-dividend process. In particular, as we have already stated in its definition, the futures-price process $\Phi$ is itself, formally speaking, the cumulative-dividend process associated with the contract.
The true price process is zero, since (again ignoring some of the detailed institutional procedures) there is no payment against the contract due at the time a contract is bought or sold.

The futures-price process $\Phi$ can now be characterized as follows. We suppose that the short-rate process $r$ is bounded. For all $t$, let $Y_t = \exp[-\int_0^t r_s\, ds]$. Because $\Phi$ is, strictly speaking, the cumulative-dividend process associated with the futures contract, and since the true-price process of the contract is zero, the fact that the risk-neutral discounted gain is a martingale implies that
$$0 = E^Q_t\left(\int_t^T Y_s\, d\Phi_s\right), \quad t \le T,$$
from which it follows that the stochastic integral $\int Y\, d\Phi$ is a $Q$-martingale. Because $r$ is bounded, there are constants $k_1 > 0$ and $k_2$ such that $k_1 \le Y_t \le k_2$ for all $t$. The process $\int Y\, d\Phi$ is therefore a $Q$-martingale if and only if $\Phi$ is also a $Q$-martingale. Since $\Phi_T = W$, we have deduced a convenient representation for the futures-price process:
$$\Phi_t = E^Q_t(W), \quad t \in [0, T]. \qquad (91)$$
If $r$ and $W$ are statistically independent under $Q$, the futures-price process $\Phi$ and the forward-price process $F$ are thus identical. Otherwise, as pointed out by Cox, Ingersoll
and Ross (1981), there is a distinction based on the correlation between changes in futures prices and interest rates.

5.2. Options and stochastic volatility

The Black–Scholes formula, which treats option prices under constant volatility, can be extended to cases with stochastic volatility, which is crucial in many markets from an empirical viewpoint. We will briefly examine several basic approaches, and then turn to the computation of option prices using the Fourier-transform method introduced by Stein and Stein (1991) and first exploited in an affine setting by Heston (1993).

We recall that the Black–Scholes option-pricing formula is of the form $C(x, p, r, t, \sigma)$, for $C: \mathbb{R}^5_+ \to \mathbb{R}_+$, where $x$ is the current underlying asset price, $p$ is the exercise price, $r$ is the constant short rate, $t$ is the time to expiration, and $\sigma$ is the volatility coefficient for the underlying asset. For each fixed $(x, p, r, t)$ with non-zero $x$ and $t$, the map from $\sigma$ to $C(x, p, r, t, \sigma)$ is strictly increasing. We may therefore invert and obtain the volatility from the option price. That is, we can define an implied-volatility function $I: \mathbb{R}^5_+ \to \mathbb{R}_+$ by
$$c = C(x, p, r, t, I(x, p, r, t, c)), \qquad (92)$$
for all sufficiently large $c \in \mathbb{R}_+$.

If $c_1$ is the Black–Scholes price of an option on a given asset at strike $p_1$ and expiration $t_1$, and $c_2$ is the Black–Scholes price of an option on the same asset at strike $p_2$ and expiration $t_2$, then the associated implied volatilities $I(x, p_1, r, t_1, c_1)$ and $I(x, p_2, r, t_2, c_2)$ must be identical, if indeed the assumptions underlying the Black–Scholes formula apply literally, and in particular if the underlying asset-price process has the constant volatility of a geometric Brownian motion.
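Because $C$ is strictly increasing in its volatility argument, the implied-volatility function $I$ of Equation (92) can be computed with any one-dimensional root finder. A Python sketch (input values illustrative):

```python
import math
from scipy.optimize import brentq
from scipy.stats import norm

# Sketch of the implied-volatility function I of Equation (92): the
# Black-Scholes call price is strictly increasing in its volatility
# argument, so it can be inverted with a bracketing root finder.
# Input values below are illustrative.

def black_scholes_call(x, p, r, t, sigma):
    """Black-Scholes price C(x, p, r, t, sigma) of a European call."""
    d1 = (math.log(x / p) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return x * norm.cdf(d1) - p * math.exp(-r * t) * norm.cdf(d2)

def implied_vol(x, p, r, t, c):
    """Solve c = C(x, p, r, t, sigma) for sigma on a bracketing interval."""
    return brentq(lambda s: black_scholes_call(x, p, r, t, s) - c, 1e-6, 5.0)

c = black_scholes_call(100.0, 95.0, 0.03, 0.5, 0.25)
iv = implied_vol(100.0, 95.0, 0.03, 0.5, c)
print(iv)  # recovers 0.25
```

Applying such an inversion to market prices across strikes and maturities is exactly how the "smile curves" discussed next are produced.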
It has been widely noted, however, that actual market prices for European options on the same underlying asset have associated Black–Scholes implied volatilities that vary with both exercise price and expiration date. For example, in certain markets at certain times, the implied volatilities of options with a given exercise date depend on strike prices in a manner that is often termed a smile curve. Figure 1 illustrates the dependence of Black–Scholes implied volatilities on moneyness (the ratio of strike price to futures price) for various S&P 500 index options on November 2, 1993. Other forms of systematic deviation from constant implied volatilities have been noted, both over time and across various derivatives at a point in time.

Three major lines of modeling address these systematic deviations from the assumptions underlying the Black–Scholes model. In all of these, a key step is to generalize the underlying log-normal price process by replacing the constant volatility parameter $\sigma$ of the Black–Scholes model with $\sqrt{V_t}$, for an adapted non-negative process $V$ with $\int_0^T V_t\, dt < \infty$, such that the underlying asset price process $S$ satisfies
$$dS_t = r_t S_t\, dt + S_t \sqrt{V_t}\, d\varepsilon^S_t, \qquad (93)$$
where $B^Q$ is a standard Brownian motion in $\mathbb{R}^d$ under the given equivalent martingale measure $Q$, and $\varepsilon^S = c_S \cdot B^Q$ is a standard Brownian motion under $Q$ obtained from any $c_S$ in $\mathbb{R}^d$ with unit norm.

Fig. 1. "Smile curves" implied by S&P 500 index options at six different times to expiration, from market data for November 2, 1993.

In the first class of models, $V_t = v(S_t, t)$, for some function $v: \mathbb{R} \times [0,T] \to \mathbb{R}$ satisfying technical regularity conditions. In practical applications, the function $v$, or its discrete-time discrete-state analogue, is often "calibrated" to the available option prices. This approach, sometimes referred to as the implied-tree model, was developed by Dupire (1994), Rubinstein (1995) and Jackwerth and Rubinstein (1996).

For a second class of models, called autoregressive conditional heteroscedastic, or ARCH, the volatility depends on the path of squared returns, as formulated by Engle (1982). The GARCH (generalized ARCH) variant has the squared volatility $V_t$ at time $t$ of the discrete-period return $R_{t+1} = \log S_{t+1} - \log S_t$ adjusting according to the recursive formula
$$V_t = a + b V_{t-1} + c R_t^2, \qquad (94)$$
for fixed coefficients $a$, $b$ and $c$ satisfying regularity conditions. By taking a time period of length $h$, normalizing in a natural way, and taking limits, a natural continuous-time limiting behavior for volatility is simply a deterministic mean-reverting process $V$ satisfying the ordinary differential equation
$$\frac{dV(t)}{dt} = \kappa\left(\bar v - V(t)\right). \qquad (95)$$
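The recursion (94) is straightforward to simulate. The following Python sketch (illustrative coefficients, with the returns generated as $R_t = \sqrt{V_{t-1}}\,\epsilon_t$ for i.i.d. standard normal shocks, an assumption not spelled out in the text above) checks that the long-run sample average of $V_t$ approaches the stationary mean $a/(1-b-c)$:

```python
import numpy as np

# Simulation sketch of the GARCH recursion (94),
#   V_t = a + b*V_{t-1} + c*R_t^2,
# with returns R_t = sqrt(V_{t-1})*eps_t for i.i.d. standard normal eps_t.
# Taking expectations gives E[V] = a/(1 - b - c) when b + c < 1, which the
# long-run sample average should approach. Coefficients are illustrative.
a, b, c = 0.1, 0.5, 0.3
steps = 200_000
rng = np.random.default_rng(3)
eps = rng.standard_normal(steps)

V = a / (1 - b - c)                  # start at the stationary mean
path = np.empty(steps)
for t in range(steps):
    R = np.sqrt(V) * eps[t]          # period return given current variance
    V = a + b * V + c * R**2         # recursion (94)
    path[t] = V

print(path.mean(), a / (1 - b - c))  # sample mean near the stationary mean
```

Shrinking the period length while renormalizing the coefficients is the limiting operation that produces the deterministic mean-reverting dynamics (95).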
Corradi (2000) explains that this deterministic continuous-time limit is more natural than the stochastic limit of Nelson (1990).

For both the implied-tree approach and the GARCH approach, the volatility process $V$ depends only on the underlying asset prices; volatility is not a separate source of risk. In a third approach, however, the increments of the squared-volatility process $V$ depend on Brownian motions that are not perfectly correlated with $\varepsilon^S$. For example, in a simple "one-factor" setting,
$$dV_t = \mu_V(V_t)\, dt + \sigma_V(V_t)\, d\varepsilon^V_t, \qquad (96)$$
where $\varepsilon^V = c_V \cdot B^Q$ is a standard Brownian motion under $Q$, for some constant vector $c_V$ of unit norm. As we shall see, the correlation parameter $c_{SV} = c_S \cdot c_V$ has an important influence on option prices. The price of a European option at exercise price $p$ and expiration at time $t$ is
$$f(S_s, V_s, s) = E^Q_s\left[\exp[-r(t-s)]\,(S_t - p)^+\right],$$
which can be solved, for example, by reducing to a PDE and applying, if necessary, a finite-difference approach.

In many settings, a pronounced skew to the smile, as in Figure 1, indicates an important potential role for correlation between the increments of the return-driving and volatility-driving Brownian motions, $\varepsilon^S$ and $\varepsilon^V$. This role is borne out directly by the correlation apparent from time-series data on implied volatilities and returns for certain important asset classes, as indicated for example by Pan (2002). A tractable model that allows for the skew effects of correlation is the Heston model, the special case of Equation (96) for which
$$dV_t = \kappa\left(\bar v - V_t\right) dt + \sigma_v \sqrt{V_t}\, d\varepsilon^V_t, \qquad (97)$$
for positive coefficients $\kappa$, $\bar v$ and $\sigma_v$ that play the same respective roles for $V$ as for a Cox–Ingersoll–Ross interest-rate model.
Indeed, this Feller diffusion model of volatility (97) is sometimes called a "CIR volatility model". In the original Heston model, the short rate is a constant, say $r$, and option prices can be computed analytically, using transform methods explained later in this section, in terms of the parameters $(r, c_{SV}, \kappa, \bar v, \sigma_v)$ of the Heston model, as well as the initial volatility $V_0$, the initial underlying price $S_0$, the strike price, and the expiration time.

Figure 2 shows the "smile curves", for the same options illustrated in Figure 1, that are implied by the Heston model for parameters, including $V_0$, chosen to minimize the sum of squared differences between actual and theoretical option prices, a calibration approach popularized for this application by Bates (1997). Notably, the distinctly downward slopes, often called skews, are captured with a negative correlation coefficient $c_{SV}$. Adopting a short rate $r = 0.0319$ that roughly captures the effects of contemporary short-term interest rates, the remaining coefficients of the Heston model are calibrated to $c_{SV} = -0.66$, $\kappa = 19.66$, $\bar v = 0.017$, $\sigma_v = 1.516$, and $\sqrt{V_0} = 0.094$.
Fig. 2. "Smile curves" calculated for S&P 500 index options at six different exercise dates, November 2, 1993, using the Heston model.

Going beyond the calibration approach, time-series data on both options and underlying prices have been used simultaneously to fit the parameters of various stochastic-volatility models, for example by Aït-Sahalia, Wang and Yared (2001), Benzoni (2002), Chernov and Ghysels (2000), Guo (1998), Pan (2002), Poteshman (1998) and Renault and Touzi (1992). The empirical evidence for S&P 500 index returns and option prices suggests that the Heston model is overly restrictive for these data. For example, Pan (2002) rejects the Heston model in favor of a generalization with jumps in returns, proposed by Bates (1997), that is a special case of the affine model for option pricing to which we now turn.

5.3. Option valuation by transform analysis

We now address the calculation of option prices with stochastic volatility and jumps in an affine setting of the type already introduced for term-structure modeling, a special case being the model of Heston (1993). We use an approach based on transform analysis that was initiated by Stein and Stein (1991) and Heston (1993), allowing for relatively rich and tractable specifications of stochastic interest rates and volatility, and for jumps. This approach and the underlying stochastic models were subsequently generalized by Bakshi, Cao and Chen (1997), Bakshi and Madan (2000), Bates (1997) and Duffie, Pan and Singleton (2000).

We assume that there is a state process $X$ that is affine under $Q$ in a state space $D \subset \mathbb{R}^k$, and that the short-rate process $r$ is of the affine form $r_t = \rho_0 + \rho_1 \cdot X_t$,
for coefficients $\rho_0$ in $\mathbb{R}$ and $\rho_1$ in $\mathbb{R}^k$. The price process $S$ underlying the options in question is assumed to be of the exponential-affine form $S_t = \exp[a(t) + b(t) \cdot X(t)]$, for potentially time-dependent coefficients $a(t)$ in $\mathbb{R}$ and $b(t)$ in $\mathbb{R}^k$. An example would be the price of an equity, a foreign currency, or, as shown earlier in the context of affine term-structure models, the price of a zero-coupon bond.

The Heston model (97) is a special case, for an affine process $X = (X^{(1)}, X^{(2)})$, with $X^{(1)}_t = Y_t \equiv \log(S_t)$ and $X^{(2)}_t = V_t$, and with a constant short rate $r = \rho_0$. From Ito's Formula,
$$dY_t = \left(r - \tfrac{1}{2} V_t\right) dt + \sqrt{V_t}\, d\varepsilon^S_t, \qquad (98)$$
which indeed makes the state vector $X_t = (Y_t, V_t)$ an affine process, whose state space is $D = \mathbb{R} \times [0, \infty)$, as we can see from the fact that the drift and instantaneous covariance matrix of $X_t$ are affine with respect to $X_t$. The underlying asset price is indeed of the desired exponential-affine form because $S_t = e^{Y_t}$. We will return to the Heston model shortly with some explicit results on option valuation. One of the affine models generalizing Heston's that was tested by Pan (2002) took
$$dY_t = \left(r - \tfrac{1}{2} V_t\right) dt + \sqrt{V_t}\, d\varepsilon^S_t + dZ_t, \qquad (99)$$
where, under the equivalent martingale measure $Q$, $Z$ is a pure-jump process whose jump times have an arrival intensity (as defined in Section 6) that is affine with respect to the volatility process $V$, and whose jump sizes are independent normals.

For the general affine case, suppose we are interested in valuing a European call option on the underlying security, with strike price $p$ and exercise date $t$. We have the initial option price
$$U_0 = E^Q\left[\exp\left(-\int_0^t r_u\, du\right)(S_t - p)^+\right].$$
Letting $A$ denote the exercise event $\{\omega: S(\omega, t) \ge p\}$, we have the option price
$$U_0 = E^Q\left[\exp\left(-\int_0^t r_s\, ds\right)\left(S_t 1_A - p\, 1_A\right)\right].$$
Because $S(t) = \exp[a(t) + b(t) \cdot X(t)]$,
$$U_0 = e^{a(t)}\, G\big({-\log p + a(t)};\ t,\ b(t),\ {-b(t)}\big) - p\, G\big({-\log p + a(t)};\ t,\ 0,\ {-b(t)}\big), \qquad (100)$$
where, for any $y \in \mathbb{R}$ and for any coefficient vectors $d$ and $\bar d$ in $\mathbb{R}^k$,
$$G(y;\ t, d, \bar d) = E^Q\left[\exp\left(-\int_0^t r_s\, ds\right) \exp[d \cdot X(t)]\ 1_{\bar d \cdot X(t) \le y}\right]. \qquad (101)$$
So, if we can compute the function $G$, we can obtain the prices of options of any strike and exercise date. Likewise, the prices of European puts, interest-rate caps,
chooser options, and many other derivatives can be derived in terms of $G$. For example, following this approach of Heston (1993), the valuation of discount-bond options and caps in an affine setting was undertaken by Chen and Scott (1995), Duffie, Pan and Singleton (2000), Nunes, Clewlow and Hodges (1999) and Scaillet (1996).

We note, for fixed $(t, d, \bar d)$, assuming $E^Q\left(\exp[-\int_0^t r_u\, du]\, \exp[d \cdot X(t)]\right) < \infty$, that $G(\cdot;\ t, d, \bar d)$ is a bounded increasing function. For any such function $g: \mathbb{R} \to [0, \infty)$, an associated transform $\hat g: \mathbb{R} \to \mathbb{C}$, where $\mathbb{C}$ is the set of complex numbers, is defined by
$$\hat g(z) = \int_{-\infty}^{+\infty} e^{izy}\, dg(y), \qquad (102)$$
where $i$ is the usual imaginary number, often denoted $\sqrt{-1}$. Depending on one's conventions, one may refer to $\hat g$ as the Fourier transform of $g$. Under the technical condition that $\int_{-\infty}^{+\infty} |\hat g(z)|\, dz < \infty$, we have the Lévy Inversion Formula
$$g(y) = \frac{\hat g(0)}{2} - \frac{1}{\pi} \int_0^{\infty} \frac{1}{z}\, \mathrm{Im}\left[e^{-izy}\, \hat g(z)\right] dz, \qquad (103)$$
where $\mathrm{Im}(c)$ denotes the imaginary part of a complex number $c$.

For the case $g(\cdot) = G(\cdot;\ t, d, \bar d)$, with associated transform $\hat G(\cdot;\ t, d, \bar d)$, we can compute $G(y;\ t, d, \bar d)$ from Equation (103), typically by computing the integral in Equation (103) numerically, and thereby obtain option prices from Equation (100). Our final objective is therefore to compute the transform $\hat G$. Fixing $z$, and applying Fubini's Theorem to Equation (102), we have $\hat G(z;\ t, d, \bar d) = f(X_0, 0)$, where $f: D \times [0, t] \to \mathbb{C}$ is defined by
$$f(X_s, s) = E^Q\left[\exp\left(-\int_s^t r_u\, du\right) \exp[d \cdot X(t)]\, \exp[iz \bar d \cdot X(t)]\ \Big|\ X_s\right]. \qquad (104)$$
From Equation (104), the same separation-of-variables arguments used to treat the affine term-structure models imply, under technical regularity conditions, that
$$f(x, s) = \exp[\alpha(t-s) + \beta(t-s) \cdot x], \qquad (105)$$
where $(\alpha, \beta)$ solves the generalized Riccati ordinary differential equation (ODE) associated with the affine model and the coefficients $\rho_0$ and $\rho_1$ of the short rate.
The solutions for a(·) and b(·) are complex numbers, in light of the complex boundary condition b(0) = d + izd. For technical details, see Duffie, Filipovi´c and Schachermayer (2003). Thus, under technical conditions, we have our transform G(z; t, d, d), evaluated at a particular z. We then have the option-pricing formula (100), where G( y; t, d, d) is obtained from the inversion formula (103) applied to the transforms G(·; t, b(t), −b(t)) and G(·; t, 0, −b(t)).
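To make the inversion step concrete, here is a minimal numerical sketch of the Lévy Inversion Formula (103). The function and parameter names are ours, and rather than an affine transform Ĝ we use a transform known in closed form, the characteristic function of the standard normal, so that the recovered g is the standard normal distribution function and the output can be checked against known values.

```python
import numpy as np
from scipy.integrate import quad

def levy_inversion(g_hat, y, z_max=50.0):
    """Recover g(y) from its transform g_hat via Equation (103):
    g(y) = g_hat(0)/2 - (1/pi) * int_0^inf Im[exp(-i z y) g_hat(z)] / z dz.
    The integral is truncated at z_max, which must be large enough that
    |g_hat| is negligible beyond it."""
    integrand = lambda z: (np.exp(-1j * z * y) * g_hat(z)).imag / z
    integral, _ = quad(integrand, 1e-10, z_max, limit=200)
    return g_hat(0.0).real / 2.0 - integral / np.pi

# Check on a case with a known answer: if g is the standard normal CDF,
# its transform (102) is the characteristic function exp(-z^2/2).
phi = lambda z: np.exp(-z * z / 2.0) + 0j

print(levy_inversion(phi, 0.0))   # should recover Phi(0) = 0.5
print(levy_inversion(phi, 1.96))  # should recover Phi(1.96), about 0.975
```

The same quadrature, applied to the transforms Ĝ(·; t, b(t), −b(t)) and Ĝ(·; t, 0, −b(t)) obtained from the Riccati ODEs, yields the option price (100).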
• Ch. 11: Intertemporal Asset Pricing Theory 711

For option pricing with the Heston model, we require only the transform ψ(u) = e^{−rt} E^Q(exp[uY(t)]), for some particular choices of u ∈ ℂ. Heston (1993) solved the Riccati equation for this case, arriving at

\[ \psi(u) = \exp\left( \bar a(t, u) + u Y(0) + \bar b(t, u)\, V(0) \right), \]

where, letting b = u σ_v c_{SV} − κ, a = u(1 − u), and γ = \sqrt{b^2 + a\sigma_v^2},

\[ \bar b(t, u) = \frac{-a\left(1 - e^{-\gamma t}\right)}{2\gamma - (\gamma + b)\left(1 - e^{-\gamma t}\right)}, \]

\[ \bar a(t, u) = rt(u - 1) - \frac{\kappa \bar v}{\sigma_v^2} \left[ (\gamma + b)\, t + 2 \log\!\left( 1 - \frac{\gamma + b}{2\gamma}\left(1 - e^{-\gamma t}\right) \right) \right]. \]

Other special cases for which one can compute explicit solutions are cited in Duffie, Pan and Singleton (2000).

6. Corporate securities

This section offers a basic review of the valuation of equities and corporate liabilities, beginning with some standard issues regarding the capital structure of a firm. Then, we turn to models of the valuation of defaultable debt that are based on an assumed stochastic arrival intensity of the stopping time defining default. The use of intensity-based defaultable bond pricing models was instigated by Artzner and Delbaen (1990, 1992, 1995), Lando (1994, 1998) and Jarrow and Turnbull (1995), and has become commonplace in business applications among banks and investment banks.

We begin with an extremely simple model of the stochastic behavior of the market values of assets, equity, and debt. We may think of equity and debt, at this first pass, as derivatives with respect to the total market value of the firm, as proposed by Black and Scholes (1973) and Merton (1974). In the simplest case, equity is merely a call option on the assets of the firm, struck at the level of liabilities, with possible exercise at the maturity date of the debt.32 At first, we are in a setting of perfect capital markets, where the results of Modigliani and Miller (1958) imply the irrelevance of capital structure for the total market value of the firm.
Later, we introduce market imperfections and increase the degree of control that may be exercised by holders of equity and debt. With this, the theory becomes more complex and less like a derivative valuation model. There are many more interesting variations than could be addressed well in the space available here.

32 Geske (1977) used compound option modeling to extend the Black–Scholes–Merton model to cases of debt at various maturities.
Our objective is merely to convey some sense of the types of issues and standard modeling approaches. We let B be a standard Brownian motion in ℝᵈ on a complete probability space (Ω, F, P), and fix the standard filtration {F_t: t ≥ 0} of B. Later, we allow for information revealed by "Poisson-like arrivals", in order to tractably model "sudden-surprise" defaults that cannot be easily treated in a setting of Brownian information.

6.1. Endogenous default timing

We assume a constant short rate r and take as given a martingale measure Q, in the infinite-horizon sense of Huang and Pagès (1992), after deflation by e^{−rt}. The resources of a given firm are assumed to consist of cash flows at a rate δ_t for each time t, where δ is an adapted process with ∫₀ᵗ |δ_s| ds < ∞ almost surely for all t. The market value of the assets of the firm at time t is defined as the market value A_t of the future cash flows. That is,

\[ A_t = E^Q_t\!\left( \int_t^{\infty} \exp\left[-r(s - t)\right] \delta_s \, ds \right). \tag{106} \]

We assume that A_t is well defined and finite for all t. The martingale representation theorem implies that

\[ dA_t = (rA_t - \delta_t)\, dt + \sigma_t\, dB^Q_t, \tag{107} \]

for some adapted ℝᵈ-valued process σ such that ∫₀ᵀ σ_t·σ_t dt < ∞ for all T ∈ [0, ∞), and where B^Q is the standard Brownian motion in ℝᵈ under Q obtained from B and Girsanov's Theorem.33

We suppose that the original owners of the firm chose its capital structure to consist of a single bond as its debt, and pure equity, defined in detail below. The bond and equity investors have already paid the original owners for these securities. Before we consider the effects of market imperfections, the total of the market values of equity and debt must be the market value A of the assets, which is a given process, so the design of the capital structure is irrelevant from the viewpoint of maximizing the total value received by the original owners of the firm.

For simplicity, we suppose that the bond promises to pay coupons at a constant total rate c, continually in time, until default. This sort of bond is sometimes called a consol. Equityholders receive the residual cash flow in the form of dividends at the rate δ_t − c at time t, until default. At default, the firm's future cash flows are assigned to debtholders.

33 For an explanation of how Girsanov's Theorem applies in an infinite-horizon setting, see for example the last section of Chapter 6 of Duffie (2001), based on Huang and Pagès (1992).

The equityholders' dividend rate, δ_t − c, may have negative outcomes. It is commonly stipulated, however, that equity claimants have limited liability, meaning that they should not experience negative cash flows. One can arrange for limited liability by dilution of equity.34 Equityholders are assumed to have the contractual right to declare default at any stopping time T, at which time equityholders give up to debtholders the rights to all future cash flows, a contractual arrangement termed strict priority, or sometimes absolute priority. We assume that equityholders are not permitted to delay liquidation after the value A of the firm reaches 0, so we ignore the possibility that A_T < 0. We could also consider the option of equityholders to change the firm's production technology, or to call in the debt for some price. The bond contract may convey to debtholders, under a protective covenant, the right to force liquidation at any stopping time t at which the asset value A_t is as low or lower than some stipulated level. We ignore this feature for brevity.

6.2. Example: Brownian dividend growth

We turn to a specific model proposed by Fisher, Heinkel and Zechner (1989), and explicitly solved by Leland (1994), for optimal default timing and for the valuation of equity and debt. Once we allow for taxes and bankruptcy distress costs,35 capital structure matters, and, within the following simple parametric framework, Leland (1994) calculated the initial capital structure that maximizes the total initial market value of the firm.

Suppose the cash-flow rate process δ is a geometric Brownian motion under Q, in that

\[ d\delta_t = \mu \delta_t\, dt + \sigma \delta_t\, dB^Q_t, \]

for constants μ and σ, where B^Q is a standard Brownian motion under Q. We assume throughout that μ < r, so that, from Equation (106), A is finite and

\[ dA_t = \mu A_t\, dt + \sigma A_t\, dB^Q_t. \]

34 That is, so long as the market value of equity remains strictly positive, newly issued equity can be sold into the market so as to continually finance the negative portion (c − δ_t)⁺ of the residual cash flow. While dilution increases the quantity of shares outstanding, it does not alter the total market value of all shares, and so is a relatively simple modeling device. Moreover, dilution is irrelevant to individual shareholders, who would in any case be in a position to avoid negative cash flows by selling their own shares as necessary to finance the negative portion of their dividends, with the same effect as if the firm had diluted their shares for this purpose. We are ignoring here any frictional costs of equity issuance or trading.

35 The model was further elaborated to treat coupon debt of finite maturity in Leland and Toft (1996), endogenous calling of debt and re-capitalization in Leland (1998) and Uhrig-Homburg (1998), incomplete observation by bond investors, with default intensity, in Duffie and Lando (2001), and alternative approaches to default recovery by Anderson and Sundaresan (1996), Anderson, Pan and Sundaresan (2001), Décamps and Faure-Grimaud (2000, 2002), Fan and Sundaresan (2000), Mella-Barral (1999) and Mella-Barral and Perraudin (1997).
We calculate that δ_t = (r − μ)A_t. For any given constant K ∈ (0, A₀), the market value of a security that claims one unit of account at the hitting time τ(K) = inf{t: A_t ≤ K} is, at any time t < τ(K),

\[ E^Q_t\!\left( \exp\left[-r\left(\tau(K) - t\right)\right] \right) = \left( \frac{A_t}{K} \right)^{-\gamma}, \tag{108} \]

where

\[ \gamma = \frac{m + \sqrt{m^2 + 2r\sigma^2}}{\sigma^2}, \]

and where m = μ − σ²/2. This can be shown by applying Itô's Formula to see that e^{−rt}(A_t/K)^{−γ} is a Q-martingale.

Let us consider for simplicity the case in which bondholders have no protective covenant. Then, equityholders declare default at a stopping time that attains the maximum equity valuation

\[ w(A_0) \equiv \sup_{T \in \mathcal{T}} E^Q\!\left( \int_0^T e^{-rt} (\delta_t - c)\, dt \right), \tag{109} \]

where 𝒯 is the set of stopping times. We naturally conjecture that the maximization problem (109) is solved by a hitting time of the form τ(A_B) = inf{t: A_t ≤ A_B}, for some default-triggering level A_B of assets to be determined. Black and Cox (1976) developed the idea of default at the first passage of assets to a sufficiently low level, but used an exogenous default boundary. Longstaff and Schwartz (1995) extended this approach to allow for stochastic default-free interest rates. Their work was then refined by Collin-Dufresne and Goldstein (2001a).

Given this conjectured form τ(A_B) for the optimal default time, we further conjecture from Itô's Formula that the equity value function w: (0, ∞) → [0, ∞) defined by Equation (109) solves the ODE

\[ \mathcal{D}w(x) - rw(x) + (r - \mu)x - c = 0, \quad x > A_B, \tag{110} \]

where

\[ \mathcal{D}w(x) = w'(x)\,\mu x + \tfrac{1}{2}\, w''(x)\, \sigma^2 x^2, \tag{111} \]

with the absolute-priority boundary condition

\[ w(x) = 0, \quad x \le A_B. \tag{112} \]

Finally, we conjecture the smooth-pasting condition

\[ w'(A_B) = 0, \tag{113} \]

based on Equation (112) and continuity of the first derivative w′(·) at A_B. Although not an obvious requirement for optimality, the smooth-pasting condition, sometimes called the high-order-contact condition, has proven to be a fruitful method by which to conjecture solutions, as follows. If we are correct in conjecturing that the optimal default time is of the form τ(A_B) = inf{t: A_t ≤ A_B}, then, given an initial asset level A₀ = x > A_B, the value of equity must be

\[ w(x) = x - A_B \left( \frac{x}{A_B} \right)^{-\gamma} - \frac{c}{r} \left[ 1 - \left( \frac{x}{A_B} \right)^{-\gamma} \right]. \tag{114} \]

This conjectured value of equity is merely the market value x of the total future cash flows of the firm, less a deduction equal to the market value of the debtholders' claim to A_B at the default time τ(A_B) using Equation (108), less another deduction equal to the market value of coupon payments to bondholders before default. The market value of those coupon payments is easily computed as the present value c/r of coupons paid at the rate c from time 0 to time +∞, less the present value of coupons paid at the rate c from the default time τ(A_B) until +∞, again using Equation (108).

In order to complete our conjecture, we apply the smooth-pasting condition w′(A_B) = 0 to this functional form (114), and by calculation obtain the conjectured default-triggering asset level as

\[ A_B = \beta c, \tag{115} \]

where

\[ \beta = \frac{\gamma}{r(1 + \gamma)}. \tag{116} \]

We are ready to state and verify this result of Leland (1994).

Proposition. The default timing problem (109) is solved by inf{t: A_t ≤ βc}. The associated initial market value w(A₀) of equity is W(A₀, c), where

\[ W(x, c) = 0, \quad x \le \beta c, \tag{117} \]

and

\[ W(x, c) = x - \beta c \left( \frac{x}{\beta c} \right)^{-\gamma} - \frac{c}{r} \left[ 1 - \left( \frac{x}{\beta c} \right)^{-\gamma} \right], \quad x \ge \beta c. \tag{118} \]

The initial value of debt is A₀ − W(A₀, c).

Proof: First, it may be checked by calculation that W(·, c) satisfies the differential equation (110) and the smooth-pasting condition (113). Itô's Formula applies to C² (twice continuously differentiable) functions. In our case, although W(·, c) need not be C², it is convex, is C¹, and is C² except at βc, where W_x(βc, c) = 0. Under these conditions, we obtain the result of applying Itô's Formula as

\[ W(A_s, c) = W(A_0, c) + \int_0^s \mathcal{D}W(A_t, c)\, dt + \int_0^s W_x(A_t, c)\, \sigma A_t\, dB^Q_t, \]

where 𝒟W(x, c) is defined as usual by

\[ \mathcal{D}W(x, c) = W_x(x, c)\, \mu x + \tfrac{1}{2}\, W_{xx}(x, c)\, \sigma^2 x^2, \]

except at x = βc, where we may replace "W_{xx}(βc, c)" with zero. [This slight extension of Itô's Formula is found, for example, in Karatzas and Shreve (1988), p. 219.] For each time t, let

\[ q_t = e^{-rt}\, W(A_t, c) + \int_0^t e^{-rs} \left( (r - \mu) A_s - c \right) ds. \]

From Itô's Formula,

\[ dq_t = e^{-rt} f(A_t)\, dt + e^{-rt}\, W_x(A_t, c)\, \sigma A_t\, dB^Q_t, \tag{119} \]

where f(x) = 𝒟W(x, c) − rW(x, c) + (r − μ)x − c. Because W_x is bounded, the last term of Equation (119) defines a Q-martingale. For x ≤ βc, we have both W(x, c) = 0 and (r − μ)x − c ≤ 0, so f(x) ≤ 0. For x > βc, we have Equation (110), and therefore f(x) = 0. The drift of q is therefore never positive, and for any stopping time T we have q₀ ≥ E^Q(q_T), or equivalently,

\[ W(A_0, c) \ge E^Q\!\left( \int_0^T e^{-rs} (\delta_s - c)\, ds + e^{-rT}\, W(A_T, c) \right). \]

For the particular stopping time τ(βc), we have

\[ W(A_0, c) = E^Q\!\left( \int_0^{\tau(\beta c)} e^{-rs} (\delta_s - c)\, ds \right), \]

using the boundary condition (117) and the fact that f(x) = 0 for x > βc. So, for any stopping time T,

\[ W(A_0, c) = E^Q\!\left( \int_0^{\tau(\beta c)} e^{-rs} (\delta_s - c)\, ds \right) \ge E^Q\!\left( \int_0^T e^{-rs} (\delta_s - c)\, ds + e^{-rT}\, W(A_T, c) \right) \ge E^Q\!\left( \int_0^T e^{-rs} (\delta_s - c)\, ds \right), \]

using the non-negativity of W for the last inequality. This implies the optimality of the stopping time τ(βc) and verification of the proposed solution W(A₀, c) of Equation (109).
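The closed form in the Proposition is easy to evaluate numerically. The sketch below (function name and parameter values are our own illustrative choices, not from the text) implements Equations (114)–(118) and checks the boundary condition (117) and the smooth-pasting condition (113) at A_B = βc.

```python
import math

def leland_equity(x, c, r, mu, sigma):
    """Equity value W(x, c) of Equations (117)-(118): asset level x, consol
    coupon rate c, short rate r, risk-neutral asset drift mu < r, volatility sigma."""
    m = mu - sigma ** 2 / 2.0
    gamma = (m + math.sqrt(m ** 2 + 2.0 * r * sigma ** 2)) / sigma ** 2
    A_B = gamma / (r * (1.0 + gamma)) * c            # default boundary, (115)-(116)
    if x <= A_B:
        return 0.0                                   # absolute-priority boundary (117)
    hit = (x / A_B) ** (-gamma)                      # value of 1 unit paid at default, (108)
    return x - A_B * hit - (c / r) * (1.0 - hit)     # Equation (118)

# Illustrative parameters (our own choices)
r, mu, sigma, c = 0.05, 0.02, 0.25, 5.0
m = mu - sigma ** 2 / 2.0
gamma = (m + math.sqrt(m ** 2 + 2.0 * r * sigma ** 2)) / sigma ** 2
A_B = gamma / (r * (1.0 + gamma)) * c

print(leland_equity(A_B, c, r, mu, sigma))                # 0.0 at the default boundary
print(leland_equity(A_B + 1e-6, c, r, mu, sigma) / 1e-6)  # smooth pasting: difference quotient is ~0
```

Equity is worth less than the assets at any level x, the gap being the value of debt A₀ − W(A₀, c) from the Proposition.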
Boyarchenko and Levendorskiĭ (2002), Hilberink and Rogers (2002) and Zhou (2000) extend this first-passage model of optimal default timing to the case of jump-diffusion asset processes.

6.3. Taxes, bankruptcy costs, capital structure

In order to see how the original owners of the firm may have a strict but limited incentive to issue debt, we introduce two market imperfections:
• A tax deduction, at a tax rate of θ, on interest expense, so that the after-tax effective coupon rate paid by the firm is (1 − θ)c.
• Bankruptcy costs, so that, with default at time t, the assets of the firm are disposed of at a salvage value of Â_t ≤ A_t, where Â is a given continuous adapted process.

We also consider more carefully the formulation of an equilibrium, in which equityholders and bondholders each exercise their own rights so as to maximize the market values of their own securities, given correct conjectures regarding the equilibrium policy of the other claimant. Because the total of the market values of equity and debt is not the fixed process A, new considerations arise, including inefficiencies. That is, in an equilibrium, the total of the market values of equity and bond may be strictly less than maximal, for example because of default that is premature from the viewpoint of maximizing the total value of the firm. An unrestricted central planner could in such a case split the firm's cash flows between equityholders and bondholders so as to achieve strictly larger market values for each than the equilibrium values of their respective securities. Absent the tax shield on debt, the original owner of the firm, who selects a capital structure at time 0 so as to maximize the total initial market value of all corporate securities, would have avoided a capital structure that involves an inefficiency of this type. For example, an all-equity firm would avoid bankruptcy costs.

In order to illustrate the endogenous choice of capital structure based on the tradeoff between the values of tax shields and of bankruptcy losses, we extend the example of Section 6.2 by assuming a tax rate of θ ∈ (0, 1) and bankruptcy recovery Â = φA, for a constant fractional recovery rate φ ∈ [0, 1]. For simplicity, we assume no protective covenant.

The equity valuation and optimal default timing problem is identical to Equation (109), except that equityholders treat the effective coupon rate as the after-tax rate c(1 − θ). Thus, the optimal equity market value is W(A₀, c(1 − θ)), where W(x, y) is given by Equations (117) and (118). The optimal default time is T* = inf{t: A_t ≤ β(1 − θ)c}. For a given coupon rate c, the bankruptcy recovery rate φ has no effect on the equity value. The market value U(A₀, c) of debt, at asset level A₀ and coupon rate c, is indeed affected by distress costs, in that

\[ U(x, c) = \varphi x, \quad x \le \beta(1 - \theta)c, \tag{120} \]

and, for x ≥ β(1 − θ)c,

\[ U(x, c) = \varphi \beta c (1 - \theta) \left( \frac{x}{\beta c (1 - \theta)} \right)^{-\gamma} + \frac{c}{r} \left[ 1 - \left( \frac{x}{\beta c (1 - \theta)} \right)^{-\gamma} \right]. \tag{121} \]

The first term of Equation (121) is the market value of the payment of the recovery value φA(T*) = φβc(1 − θ) at default, using Equation (108). The second term is the market value of receiving the coupon rate c until T*. The capital structure that maximizes the market value received by the initial owners for sale of equity and debt can now be determined from the coupon rate c* solving

\[ \sup_c \; \left\{ U(A_0, c) + W\!\left(A_0, (1 - \theta)c\right) \right\}. \tag{122} \]

Leland (1994) provides an explicit solution for c*, which then allows one to easily examine the resolution of the tradeoff between the market value

\[ H(A_0, c) = \frac{\theta c}{r} \left[ 1 - \left( \frac{A_0}{\beta c (1 - \theta)} \right)^{-\gamma} \right] \]

of tax shields and the market value

\[ h(A_0, c) = (1 - \varphi)\, \beta c (1 - \theta) \left( \frac{A_0}{\beta c (1 - \theta)} \right)^{-\gamma} \]

of financial distress costs associated with bankruptcy; one may check from Equations (117)–(121) that U(A₀, c) + W(A₀, (1 − θ)c) = A₀ + H(A₀, c) − h(A₀, c). The coupon rate that solves Equation (122) is therefore that which maximizes H(A₀, c) − h(A₀, c), the benefit–cost difference. Although the tax shield is valuable to the firm, it is merely a transfer from somewhere else in the economy. The bankruptcy distress cost, however, involves a net social cost, illustrating one of the inefficiencies caused by taxes. Leland and Toft (1996) extend the model so as to treat bonds of finite maturity with discrete coupons.

One can also allow for multiple classes of debtholders, each with its own contractual cash flows and rights. For example, bonds are conventionally classified by priority, so that, at liquidation, senior bondholders are contractually entitled to cash flows resulting from liquidation up to the total face value of senior debt (in proportion to the face values of the respective senior bonds, and normally without regard to maturity dates). If the most senior class of debtholders can be paid off in full, the next most senior class is assigned liquidation cash flows, and so on, to the lowest subordination class.
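A rough numerical illustration of the tradeoff (122): the sketch below (all parameter values and names are our own illustrative choices) implements W and U from Equations (117)–(121) and locates the value-maximizing coupon by crude grid search, in place of Leland's explicit formula for c*.

```python
import math

# Illustrative parameters (our own choices, not from the text)
r, mu, sigma = 0.05, 0.02, 0.25
theta, phi = 0.35, 0.60          # tax rate and fractional recovery at default
A0 = 100.0

m = mu - sigma ** 2 / 2.0
gamma = (m + math.sqrt(m ** 2 + 2.0 * r * sigma ** 2)) / sigma ** 2
beta = gamma / (r * (1.0 + gamma))

def W(x, y):
    """Equity value at asset level x and effective coupon y, Equations (117)-(118)."""
    AB = beta * y
    if x <= AB:
        return 0.0
    k = (x / AB) ** (-gamma)
    return x - AB * k - (y / r) * (1.0 - k)

def U(x, c):
    """Debt value at asset level x and coupon c, Equations (120)-(121)."""
    AB = beta * (1.0 - theta) * c
    if x <= AB:
        return phi * x
    k = (x / AB) ** (-gamma)
    return phi * AB * k + (c / r) * (1.0 - k)

def total_value(c):
    """Proceeds to the initial owners, the objective of Equation (122)."""
    return U(A0, c) + W(A0, (1.0 - theta) * c)

grid = [0.01 * i for i in range(1, 1500)]
c_star = max(grid, key=total_value)
print(c_star, total_value(c_star))   # optimal coupon and levered firm value
```

Leverage is valuable here: total_value(c*) exceeds the all-equity value A₀, the gap being the optimized difference H − h between tax-shield and distress-cost values.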
Some bonds may be secured by certain identified assets, or collateralized, in effect giving them seniority over the liquidation value resulting from those cash flows, before any unsecured bonds may be paid according to the seniority of unsecured claims. In practice, the overall priority structure may be rather complicated. Corporate bonds are often callable, within certain time restrictions. Not infrequently, corporate bonds may be converted to equity at pre-arranged conversion ratios (number
of shares for a given face value) at the timing option of bondholders. Such convertible bonds present a challenging set of valuation issues, some examined by Brennan and Schwartz (1980) and Nyborg (1996). Occasionally, corporate bonds are puttable, that is, may be sold back to the issuer at a pre-arranged price at the option of bondholders. One can also allow for adjustments in capital structure, normally instigated by equityholders, that result in the issuing and retiring of securities, subject to legal restrictions, some of which may be embedded in debt contracts.

6.4. Intensity-based modeling of default

This section introduces a model for a default time as a stopping time τ with a given intensity process λ, as defined below. From the joint behavior of λ, the short-rate process r, the promised payment of the security, and the model of recovery at default, as well as risk premia, one can characterize the stochastic behavior of the term structure of yields on defaultable bonds. In applications, default intensities may be modeled as functions of observable variables that are linked with the likelihood of default, such as debt-to-equity ratios, asset volatility measures, other accounting measures of indebtedness, market equity prices, bond yield spreads, industry performance measures, and macroeconomic variables related to the business cycle. This dependence could, but in practice does not usually, arise endogenously from a model of the ability or incentives of the firm to make payments on its debt. Because the approach presented here does not depend on the specific setting of a firm, it has also been applied to the valuation of defaultable sovereign debt, as in Duffie, Pedersen and Singleton (2003) and Pagès (2000).

We fix a complete probability space (Ω, F, P) and a filtration {G_t: t ≥ 0} satisfying the usual conditions.
At some points, it will be important to make a distinction between an adapted process and a predictable process. A predictable process is, intuitively speaking, one whose value at any time t depends only on the information in the underlying filtration that is available up to, but not including, time t. Protter (1990) provides a full definition.

A non-explosive counting process K (for example, a Poisson process) has an intensity λ if λ is a predictable non-negative process satisfying ∫₀ᵗ λ_s ds < ∞ almost surely for all t, with the property that a local martingale M, the compensated counting process, is given by

\[ M_t = K_t - \int_0^t \lambda_s \, ds. \tag{123} \]

The compensated counting process M is a martingale if, for all t, we have E(∫₀ᵗ λ_s ds) < ∞. A standard reference on counting processes is Brémaud (1981). For simplicity, we will say that a stopping time τ has an intensity λ if τ is the first jump time of a non-explosive counting process whose intensity process is λ. The accompanying intuition is that, at any time t and state ω with t < τ(ω), the G_t-conditional probability of an arrival before t + Δ is approximately λ(ω, t)Δ, for small Δ. This intuition is justified in the sense of derivatives if λ is bounded and continuous, and under weaker conditions.

A stopping time τ is non-trivial if P(τ ∈ (0, ∞)) > 0. If a stopping time τ is non-trivial and if the filtration {G_t: t ≥ 0} is the standard filtration of some Brownian motion B in ℝᵈ, then τ could not have an intensity. We know this from the fact that, if {G_t: t ≥ 0} is the standard filtration of B, then the associated compensated counting process M of Equation (123) (indeed, any local martingale) could be represented as a stochastic integral with respect to B, and therefore cannot jump, but M must jump at τ. In order to have an intensity, a stopping time must be totally inaccessible, roughly meaning that it cannot be "foretold" by an increasing sequence of stopping times that converges to it. An inaccessible stopping time is a "sudden surprise", but there are no such surprises on a Brownian filtration! As an illustration, we could imagine that the firm's equityholders or managers are equipped with some Brownian filtration for purposes of determining their optimal default time τ, but that bondholders have imperfect monitoring, and may view τ as having an intensity with respect to the bondholders' own filtration {G_t: t ≥ 0}, which contains less information than the Brownian filtration. Such a situation arises in Duffie and Lando (2001).

We say that τ is doubly stochastic with intensity λ if the underlying counting process whose first jump time is τ is doubly stochastic with intensity λ. This means roughly that, conditional on the intensity process, the counting process is a Poisson process with that same (conditionally deterministic) intensity.
The doubly-stochastic property thus implies that, for t < τ, using the law of iterated expectations,

\[ P(\tau > s \mid G_t) = E\!\left[ P\!\left(\tau > s \mid G_t, \{\lambda_u: t \le u \le s\}\right) \Bigm| G_t \right] = E\!\left[ \exp\!\left( -\int_t^s \lambda(u)\, du \right) \Bigm| G_t \right], \tag{124} \]

using the fact that the probability of no jump between t and s of a Poisson process with time-varying (deterministic) intensity λ is exp[−∫ₜˢ λ(u) du]. This property (124) is convenient for calculations, because evaluating E(exp[−∫ₜˢ λ(u) du] | G_t) is computationally equivalent to the pricing of a default-free zero-coupon bond, treating λ as a short rate. Indeed, this analogy is also quite helpful for intuition and suggests tractable models for intensities based on models of the short rate that are tractable for default-free term-structure modeling. As we shall see, it would be sufficient for Equation (124) that λ_t = Λ(X_t, t) for some measurable Λ: ℝⁿ × [0, ∞) → [0, ∞), where X in ℝⁿ solves a stochastic differential equation of the form

\[ dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dB_t, \tag{125} \]

for some (G_t)-standard Brownian motion B in ℝᵈ. More generally, Equation (124) follows from assuming that the doubly-stochastic counting process K whose first jump time is τ is driven by some filtration {F_t: t ≥ 0}. This means roughly that, for any t, conditional on F_t, the distribution of K during [0, t] is that of a Poisson process with time-varying conditionally deterministic intensity λ. A complete definition is provided in Duffie (2001).36

For purposes of the market valuation of bonds and other securities whose cash flows are sensitive to default timing, we would want to have a risk-neutral intensity process, that is, an intensity process λ^Q for the default time τ that is associated with (Ω, F, Q) and the given filtration {G_t: t ≥ 0}, where Q is an equivalent martingale measure. In this case, we call λ^Q the Q-intensity of τ. (As usual, there may be more than one equivalent martingale measure.) Such an intensity always exists, as shown by Artzner and Delbaen (1995), but the doubly-stochastic property may be lost with a change of measure [Kusuoka (1999)]. The ratio λ^Q/λ (for λ strictly positive) is in some sense a multiplicative risk premium for the uncertainty associated with the timing of default. This issue is pursued by Jarrow, Lando and Yu (2003), who provide sufficient conditions for no default-timing risk premium (but allowing nevertheless a default risk premium).

6.5. Zero-recovery bond pricing

We fix a short-rate process r and an equivalent martingale measure Q after deflation by exp[−∫₀ᵗ r(u) du]. We consider the valuation of a security that pays F·1_{τ > s} at a given time s > 0, where F is a G_s-measurable bounded random variable. Because 1_{τ > s} is the random variable that is 1 in the event of no default by s and zero otherwise, we may view F as the contractually promised payment of the security at time s, with default by s leading to no payment. The case of a defaultable zero-coupon bond is treated by letting F = 1. In the next sub-section, we will consider recovery at default.
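The doubly-stochastic survival formula (124) can be illustrated by simulation: conditional on an intensity path, default is the first jump of a Poisson process, so τ > T exactly when an independent unit-exponential variable exceeds ∫₀ᵀ λ_u du. The sketch below (a CIR-type intensity with Euler discretization and parameter values of our own choosing) compares the direct Monte Carlo estimate of P(τ > T) with the right-hand side of (124).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative square-root (CIR-type) intensity, Euler-discretized:
# d lambda_t = kappa*(lbar - lambda_t) dt + sig*sqrt(lambda_t) dB_t
kappa, lbar, sig, lam0 = 0.5, 0.02, 0.1, 0.02
T, n_steps, n_paths = 5.0, 500, 50_000
dt = T / n_steps

lam = np.full(n_paths, lam0)
cum = np.zeros(n_paths)                      # integral of lambda over [0, T]
for _ in range(n_steps):
    cum += lam * dt
    dB = rng.normal(0.0, np.sqrt(dt), n_paths)
    lam = np.abs(lam + kappa * (lbar - lam) * dt + sig * np.sqrt(lam) * dB)

# Doubly-stochastic default time: conditional on the path, the counting
# process is Poisson, so tau > T iff a unit exponential exceeds cum.
E = rng.exponential(1.0, n_paths)
survival_mc = np.mean(E > cum)               # direct estimate of P(tau > T)
survival_formula = np.mean(np.exp(-cum))     # Equation (124) with t = 0, s = T

print(survival_mc, survival_formula)         # the two estimates should agree
```

Setting a discount rate r alongside λ turns the second estimator into the zero-recovery bond price of the theorem below, which discounts at r + λ^Q.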
From the definition of Q as an equivalent martingale measure, the price S_t of this security at any time t < s is

\[ S_t = E^Q_t\!\left( \exp\!\left[ -\int_t^s r(u)\, du \right] 1_{\tau > s}\, F \right), \tag{126} \]

where E^Q_t denotes G_t-conditional expectation under Q. From Equation (126) and the fact that τ is a stopping time, S_t must be zero for all t ≥ τ. Under Q, the default time τ is assumed to have a Q-intensity process λ^Q.

Theorem. Suppose that F, r and λ^Q are bounded and that τ is doubly stochastic under Q, driven by a filtration {F_t: t ≥ 0} such that r is (F_t)-adapted and F is F_s-measurable. Fix any t < s. Then, for t ≥ τ, we have S_t = 0, and for t < τ,

\[ S_t = E^Q_t\!\left( \exp\!\left[ -\int_t^s \left( r(u) + \lambda^Q(u) \right) du \right] F \right). \tag{127} \]

36 Included in the definition is the condition that λ is (F_t)-predictable, that F_t ⊂ G_t, and that {F_t: t ≥ 0} satisfies the usual conditions.

This theorem is based on Lando (1998).37 The idea of this representation (127) of the pre-default price is that discounting for default that occurs at an intensity is analogous to discounting at the short rate r.

Proof: From Equation (126), the law of iterated expectations, and the assumption that r is (F_t)-adapted and F is F_s-measurable,

\[ S_t = E^Q\!\left[ E^Q\!\left( \exp\!\left[-\int_t^s r(u)\, du\right] 1_{\tau > s}\, F \Bigm| F_s \vee G_t \right) \Bigm| G_t \right] = E^Q\!\left[ \exp\!\left(-\int_t^s r(u)\, du\right) F\, E^Q\!\left( 1_{\tau > s} \mid F_s \vee G_t \right) \Bigm| G_t \right]. \]

The result then follows from the implication of double stochasticity that Q(τ > s | F_s ∨ G_t) = exp[−∫ₜˢ λ^Q(u) du].

As a special case, suppose the filtration {F_t: t ≥ 0} is that generated by a process X that is affine under Q and valued in D ⊂ ℝᵈ. It is natural to allow dependence of λ^Q, r and F on the state process X in the sense that

\[ \lambda^Q_t = \Lambda(X_t), \qquad r_t = \rho(X_t), \qquad F = \exp\left[ f\!\left(X(s)\right) \right], \tag{128} \]

where Λ, ρ and f are affine on D. Under the technical regularity in Duffie, Filipović and Schachermayer (2003), relation (127) then implies that, for t < τ, we have

\[ S_t = \exp\left[ a(s - t) + b(s - t)\cdot X(t) \right], \tag{129} \]

for coefficients a(·) and b(·) that are computed from the associated generalized Riccati equations.

6.6. Pricing with recovery at default

The next step is to consider the recovery of some random payoff W at the default time τ, if default occurs before the maturity date s of the security. We adopt the assumptions of Theorem 6.5, and add the assumption that W = w_τ, where w is a bounded predictable process that is also adapted to the driving filtration {F_t: t ≥ 0}.
37 Additional work in this vein is by Bielecki and Rutkowski (1999a,b, 2001), Cooper and Mello (1991, 1992), Das and Sundaram (2000), Das and Tufano (1995), Davydov, Linetsky and Lotz (1999), Duffie (1998), Duffie and Huang (1996), Duffie, Schroder and Skiadas (1996), Duffie and Singleton (1999), Elliott, Jeanblanc and Yor (2000), Hull and White (1992, 1995), Jarrow and Yu (2001), Jeanblanc and Rutkowski (2000), Madan and Unal (1998) and Nielsen and Ronn (1995).
    • Ch. 11: Intertemporal Asset Pricing Theory 723 The market value at any time t < min(s, t) of any default recovery is, by definition of the equivalent martingale measure Q, given by Jt = EQ t exp t t −r(u) du 1{t s}wt . (130) The doubly-stochastic assumption implies that t has a probability density under Q, at any time u in [t, s], conditional on Gt ∨ Fs, and on the event that t > t, of q(t, u) = exp u t −lQ (z) dz lQ (u). Thus, using the same iterated-expectations argument of the proof of Theorem 6.5, we have, on the event that t > t, Jt = EQ EQ exp t t −r(z) dz 1{t s}wt Fs ∨ Gt Gt = EQ s t exp u t −r(z) dz q(t, u) wu du Gt = s t F(t, u) du, using Fubini’s Theorem, where F(t, u) = EQ t exp − u t [lQ (z) + r(z)] dz lQ (u) w(u) . (131) We summarize the main defaultable valuation result as follows. Theorem. Consider a security that pays F at s if t > s, and otherwise pays wt at t. Suppose that w, F, lQ and r are bounded. Suppose that t is doubly stochastic under Q, driven by a filtration {Ft: t 0} with the property that r and w are (Ft)-adapted and F is Fs-measurable. Then, for t t, we have St = 0, and for t < t, St = EQ t exp − s t (r(u) + lQ (u)) du F + s t F(t, u) du. (132) These results are based on Duffie, Schroder and Skiadas (1996) and Lando (1994, 1998). Sch¨onbucher (1998) extends to treat the case of recovery W which is not of the form wt for some predictable process w, but rather allows the recovery to be revealed just at the default time t. For details on this construction, see Duffie (2002).
    • 724 D. Duffie In the affine state-space setting described at the end of the previous section, F(t, u) can be computed by our usual “affine” methods, provided that w is of form wt = ea + b·X (t) for constant coefficients a and b. In this case, under technical regularity, F(t, u) = exp [a(u − t) + b(u − t) · X (t)] [c(u − t) + C(u − t) · X (t)] , (133) for readily computed deterministic coefficients a, b, c and C, as in Duffie, Pan and Singleton (2000). This still leaves the task of numerical computation of the integral s t F(t, u) du. For the price of a typical defaultable bond promising periodic coupons followed by its principal at maturity, one may sum the prices of the coupons and of the principal, treating each of these payments as though it were a separate zero-coupon bond. An often-used assumption, although one that need not apply in practice, is that there is no default recovery for coupons remaining to be paid as of the time of default, and that bonds of different maturities have the same recovery of principal. In any case, convenient parametric assumptions, based for example on an affine driving process X , lead to straightforward computation of a term structure of defaultable bond yields that may be applied in practical situations, such as the valuation of credit derivatives, a class of derivative securities designed to transfer credit risk that is treated in Duffie and Singleton (2003). For the case of defaultable bonds with embedded American options, the most typical cases being callable or convertible bonds, the usual resort is valuation by some numerical implementation of the associated dynamic programming problems. See Berndt (2002). 6.7. Default-adjusted short rate In the setting of Theorem 6.6, a particularly simple pricing representation can be based on the definition of a predictable process ° for the fractional loss in market value at default, according to (1 − °t ) (St−) = wt . 
(134)

Manipulation left to the reader shows that, under the conditions of Theorem 6.6, for $t < \tau$,
$$
S_t = E^Q_t\!\left[\exp\!\left(-\int_t^s \left[r(u) + \ell(u)\,\lambda^Q(u)\right] du\right) F\right]. \tag{135}
$$
This valuation model (135) is from Duffie and Singleton (1999), and is based on a precursor in Pye (1974). The representation (135) is particularly convenient if we take $\ell$ as an exogenously given fractional-loss process, as it allows for the application of standard valuation methods, treating the payoff $F$ as default-free, but accounting for the
intensity and severity of default losses through the "default-adjusted" short-rate process $r + \ell\lambda^Q$. The adjustment $\ell\lambda^Q$ is in fact the risk-neutral mean rate of proportional loss in market value due to default. Notably, the bond price depends on the intensity $\lambda^Q$ and the fractional loss $\ell$ at default only through the product $\ell\lambda^Q$. For example, doubling $\lambda^Q$ and halving $\ell$ has no effect on the bond price process.

Suppose, for example, that $\tau$ is doubly stochastic, driven by the filtration of a state process $X$ that is affine under $Q$, and that we take $r_t + \ell_t \lambda^Q_t = R(X_t)$ and $F = \exp[\,f(X(T))]$, for affine $R(\cdot)$ and $f(\cdot)$. Then, under regularity conditions, we obtain at each time $t$ before default a bond price of the simple form (129), again for coefficients solving the associated Generalized Riccati equation. Using this affine approach to default-adjusted short rates, Duffee (1999a) provides an empirical model of risk-neutral default intensities for corporate bonds.³⁸

References

Adams, K., and D. Van Deventer (1994), "Fitting yield curves and forward rate curves with maximum smoothness", Journal of Fixed Income 4 (June):52−62.
Ahn, H., M. Dayal, E. Grannan and G. Swindle (1995), "Hedging with transaction costs", Annals of Applied Probability 8:341−366.
Andersen, L., and J. Andreasen (2000a), "Volatility skews and extensions of the Libor Market Model", Applied Mathematical Finance 7(1):1−32.
Andersen, L., and J. Andreasen (2000b), "Jump-diffusion processes: volatility smile fitting and numerical methods for option pricing", Review of Derivatives Research 4:231−262.
Anderson, R., and S. Sundaresan (1996), "Design and valuation of debt contracts", Review of Financial Studies 9:37−68.
Anderson, R., Y. Pan and S. Sundaresan (2001), "Corporate bond yield spreads and the term structure", Finance 21(2):14−37.
Andreasen, J., B. Jensen and R.
Poulsen (1998), "Eight valuation methods in financial mathematics: the Black–Scholes formula as an example", Mathematical Scientist 23:18−40.
Ansel, J., and C. Stricker (1992a), "Quelques remarques sur un théorème de Yan", Working Paper (Université de Franche-Comté).
Ansel, J., and C. Stricker (1992b), "Lois de martingales, densités et décomposition de Föllmer–Schweizer", Annales de l'Institut Henri Poincaré Probabilités Statistiques 28(3):375−392.
Arrow, K. (1951), "An extension of the basic theorems of classical welfare economics", in: J. Neyman, ed., Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability (University of California Press, Berkeley, CA) pp. 507−532.
Arrow, K. (1953), "Le rôle des valeurs boursières pour la répartition la meilleure des risques", Économétrie. Colloq. Internat. Centre National de la Recherche Scientifique 40 (Paris 1952):41−47;

³⁸ For related empirical work on sovereign debt, see Duffie, Pedersen and Singleton (2003) and Pagès (2000).
discussion, pp. 47−48, C.N.R.S. (Paris 1953). English Translation: 1964, Review of Economic Studies 31:91−96.
Artzner, P. (1995), "References for the numeraire portfolio", Working Paper (Institut de Recherche Mathématique Avancée, Université Louis Pasteur et CNRS, et Laboratoire de Recherche en Gestion).
Artzner, P., and F. Delbaen (1990), "'Finem lauda' or the risk of swaps", Insurance: Mathematics and Economics 9:295−303.
Artzner, P., and F. Delbaen (1992), "Credit risk and prepayment option", ASTIN Bulletin 22:81−96.
Artzner, P., and F. Delbaen (1995), "Default risk and incomplete insurance markets", Mathematical Finance 5:187−195.
Artzner, P., and P. Roger (1993), "Definition and valuation of optional coupon reinvestment bonds", Finance 14:7−22.
Aït-Sahalia, Y. (1996a), "Nonparametric pricing of interest rate derivative securities", Econometrica 64:527−560.
Aït-Sahalia, Y. (1996b), "Testing continuous-time models of the spot interest rate", Review of Financial Studies 9:385−426.
Aït-Sahalia, Y. (2002), "Telling from discrete data whether the underlying continuous-time model is a diffusion", Journal of Finance 57(5):2075−2113.
Aït-Sahalia, Y., Y. Wang and F. Yared (2001), "Do option markets correctly price the probabilities of movement of the underlying asset?", Journal of Econometrics 102:67−110.
Au, K., and D. Thurston (1993), "Markovian term structure movements", Working Paper (School of Banking and Finance, University of New South Wales).
Babbs, S., and M. Selby (1996), "Pricing by arbitrage in incomplete markets", Mathematical Finance 8:163−168.
Babbs, S., and N. Webber (1994), "A theory of the term structure with an official short rate", Working Paper (Financial Options Research Centre, Warwick Business School).
Bachelier, L. (1900), "Théorie de la spéculation", Annales Scientifiques de l'École Normale Supérieure, 3ème série 17:21−88. Translation: 1964, in: P.
Cootner, ed., The Random Character of Stock Market Prices (MIT Press, Cambridge, MA) pp. 17−79.
Back, K. (1986), "Securities market equilibrium without bankruptcy: contingent claim valuation and the martingale property", Working Paper (Center for Mathematical Studies in Economics and Management Science, Northwestern University).
Back, K. (1991), "Asset pricing for general processes", Journal of Mathematical Economics 20:371−396.
Back, K., and S. Pliska (1987), "The shadow price of information in continuous time decision problems", Stochastics 22:151−186.
Bajeux-Besnainou, I., and R. Portait (1997), "The numeraire portfolio: a new methodology for financial theory", The European Journal of Finance 3:291−309.
Bajeux-Besnainou, I., and R. Portait (1998), "Pricing derivative securities with a multi-factor gaussian model", Applied Mathematical Finance 5:1−19.
Bakshi, G., and D. Madan (2000), "Spanning and derivative security valuation", Journal of Financial Economics 55:205−238.
Bakshi, G., C. Cao and Z. Chen (1997), "Empirical performance of alternative option pricing models", Journal of Finance 52:2003−2049.
Balduzzi, P., S. Das, S. Foresi and R. Sundaram (1996), "A simple approach to three factor affine term structure models", Journal of Fixed Income 6 (December):43−53.
Balduzzi, P., G. Bertola, S. Foresi and L. Klapper (1998), "Interest rate targeting and the dynamics of short-term interest rates", Journal of Money, Credit, and Banking 30:26−50.
Balduzzi, P., S. Das and S. Foresi (1998), "The central tendency: a second factor in bond yields", Review of Economics and Statistics 80:62−72.
Banz, R., and M. Miller (1978), "Prices for state-contingent claims: some evidence and applications", Journal of Business 51:653−672.
Bates, D. (1997), "Post-'87 crash fears in S&P 500 futures options", Journal of Econometrics 94:181−238.
Baz, J., and S. Das (1996), "Analytical approximations of the term structure for jump-diffusion processes: a numerical analysis", Journal of Fixed Income 6(1):78−86.
Beaglehole, D. (1990), "Tax clientele and stochastic processes in the gilt market", Working Paper (Graduate School of Business, University of Chicago).
Beaglehole, D., and M. Tenney (1991), "General solutions of some interest rate contingent claim pricing equations", Journal of Fixed Income 1:69−84.
Bensoussan, A. (1984), "On the theory of option pricing", Acta Applicandae Mathematicae 2:139−158.
Benzoni, L. (2002), "Pricing options under stochastic volatility: an empirical investigation", Working Paper (Carlson School of Management, University of Minnesota).
Berardi, A., and M. Esposito (1999), "A base model for multifactor specifications of the term structure", Economic Notes 28:145−170.
Bergman, Y. (1995), "Option pricing with differential interest rates", The Review of Financial Studies 8:475−500.
Berndt, A. (2002), "Estimating the term structure of credit spreads: callable corporate debt", Working Paper (Department of Statistics, Stanford University).
Bhar, R., and C. Chiarella (1995), "Transformation of Heath–Jarrow–Morton models to Markovian systems", Working Paper (School of Finance and Economics, University of Technology, Sydney).
Bielecki, T., and M. Rutkowski (1999a), "Credit risk modelling: a multiple ratings case", Working Paper (Northeastern Illinois University and Technical University of Warsaw).
Bielecki, T., and M. Rutkowski (1999b), "Modelling of the defaultable term structure: conditionally Markov approach", Working Paper (Northeastern Illinois University and Technical University of Warsaw).
Bielecki, T., and M. Rutkowski (2001), "Credit risk modelling: intensity based approach", in: E. Jouini, J. Cvitanić and M.
Musiela, eds., Option Pricing, Interest Rates and Risk Management (Cambridge University Press) pp. 399−457.
Björk, T., and B. Christensen (1999), "Interest rate dynamics and consistent forward rate curves", Mathematical Finance 22:17−23.
Björk, T., and A. Gombani (1999), "Minimal realizations of interest rate models", Finance and Stochastics 3:413−432.
Black, F., and J. Cox (1976), "Valuing corporate securities: some effects of bond indenture provisions", Journal of Finance 31:351−367.
Black, F., and P. Karasinski (1991), "Bond and option pricing when short rates are lognormal", Financial Analysts Journal (July–August), pp. 52−59.
Black, F., and M. Scholes (1973), "The pricing of options and corporate liabilities", Journal of Political Economy 81:637−654.
Black, F., E. Derman and W. Toy (1990), "A one-factor model of interest rates and its application to treasury bond options", Financial Analysts Journal (January–February), pp. 33−39.
Bottazzi, J.-M. (1995), "Existence of equilibria with incomplete markets: the case of smooth returns", Journal of Mathematical Economics 24:59−72.
Bottazzi, J.-M., and T. Hens (1996), "Excess demand functions and incomplete markets", Journal of Economic Theory 68:49−63.
Boudoukh, J., M. Richardson, R. Stanton and R. Whitelaw (1997), "Pricing mortgage-backed securities in a multifactor interest rate environment: a multivariate density estimation approach", Review of Financial Studies 10:405−446.
Boyarchenko, S., and S. Levendorskiĭ (2002), "Perpetual American options under Lévy processes", SIAM Journal on Control and Optimization 40(6):1663−1696.
Brace, A., and M. Musiela (1994), "Swap derivatives in a Gaussian HJM framework", Working Paper (Treasury Group, Citibank, Sydney, Australia).
Brace, A., and M. Musiela (1995), "The market model of interest rate dynamics", Mathematical Finance 7:127−155.
Breeden, D. (1979), "An intertemporal asset pricing model with stochastic consumption and investment opportunities", Journal of Financial Economics 7:265−296.
Breeden, D., and R. Litzenberger (1978), "Prices of state-contingent claims implicit in option prices", Journal of Business 51:621−651.
Brémaud, P. (1981), Point Processes and Queues: Martingale Dynamics (Springer, New York).
Brennan, M., and E. Schwartz (1977), "Savings bonds, retractable bonds and callable bonds", Journal of Financial Economics 5:67−88.
Brennan, M., and E. Schwartz (1980), "Analyzing convertible bonds", Journal of Financial and Quantitative Analysis 10:907−929.
Brown, D., P. DeMarzo and C. Eaves (1996a), "Computing equilibria when asset markets are incomplete", Econometrica 64:1−27.
Brown, D., P. DeMarzo and C. Eaves (1996b), "Computing zeros of sections of vector bundles using homotopies and relocalization", Mathematics of Operations Research 21:26−43.
Brown, R., and S. Schaefer (1994), "Interest rate volatility and the shape of the term structure", Philosophical Transactions of the Royal Society: Physical Sciences and Engineering 347:449−598.
Brown, R., and S. Schaefer (1996), "Ten years of the real term structure: 1984–1994", Journal of Fixed Income 6 (March):6−22.
Bühlmann, H., F. Delbaen, P. Embrechts and A. Shiryaev (1998), "On Esscher transforms in discrete finance models", ASTIN Bulletin 28:171−186.
Büttler, H., and J. Waldvogel (1996), "Pricing callable bonds by means of Green's function", Mathematical Finance 6:53−88.
Carr, P., and R. Chen (1996), "Valuing bond futures and the quality option", Working Paper (Johnson Graduate School of Management, Cornell University).
Carverhill, A. (1988), "The Ho and Lee term structure theory: a continuous time version", Working Paper (Financial Options Research Centre, University of Warwick).
Cass, D.
(1984), "Competitive equilibria in incomplete financial markets", Working Paper (Center for Analytic Research in Economics and the Social Sciences, University of Pennsylvania).
Cass, D. (1989), "Sunspots and incomplete financial markets: the leading example", in: G. Feiwel, ed., The Economics of Imperfect Competition and Employment: Joan Robinson and Beyond (Macmillan, London) pp. 677−693.
Cass, D. (1991), "Incomplete financial markets and indeterminacy of financial equilibrium", in: J.-J. Laffont, ed., Advances in Economic Theory (Cambridge University Press, Cambridge) pp. 263−288.
Cassese, G. (1996), "An elementary remark on martingale equivalence and the fundamental theorem of asset pricing", Working Paper (Istituto di Economia Politica, Università Commerciale "Luigi Bocconi", Milan).
Chacko, G., and S. Das (2002), "Pricing interest rate derivatives: a general approach", Review of Financial Studies 15(1):195−241.
Chapman, D. (1998), "Habit formation, consumption, and state-prices", Econometrica 66:1223−1230.
Chen, L. (1996), Stochastic Mean and Stochastic Volatility: A Three-Factor Model of the Term Structure of Interest Rates and Its Application to the Pricing of Interest Rate Derivatives: Part I (Blackwell Publishers, Oxford).
Chen, R.-R., and L. Scott (1992), "Pricing interest rate options in a two-factor Cox–Ingersoll–Ross model of the term structure", Review of Financial Studies 5:613−636.
Chen, R.-R., and L. Scott (1993), "Pricing interest rate futures options with futures-style margining", Journal of Futures Markets 13:15−22.
Chen, R.-R., and L. Scott (1995), "Interest rate options in multifactor Cox–Ingersoll–Ross models of the term structure", Journal of Derivatives 3:53−72.
Cherif, T., N. El Karoui, R. Myneni and R. Viswanathan (1995), "Arbitrage pricing and hedging of quanto options and interest rate claims with quadratic gaussian state variables", Working Paper (Laboratoire de Probabilités, Université de Paris VI).
Chernov, M., and E. Ghysels (2000), "A study towards a unified approach to the joint estimation of objective and risk neutral measures for the purpose of options valuation", Journal of Financial Economics 56:407−458.
Cherubini, U., and M. Esposito (1995), "Options in and on interest rate futures contracts: results from martingale pricing theory", Applied Mathematical Finance 2:1−15.
Chesney, M., R. Elliott and R. Gibson (1993), "Analytical solution for the pricing of American bond and yield options", Mathematical Finance 3:277−294.
Chew, S.-H. (1983), "A generalization of the quasilinear mean with applications to the measurement of income inequality and decision theory resolving the Allais paradox", Econometrica 51:1065−1092.
Chew, S.-H. (1989), "Axiomatic utility theories with the betweenness property", Annals of Operations Research 19:273−298.
Chew, S.-H., and L. Epstein (1991), "Recursive utility under uncertainty", in: A. Khan and N. Yannelis, eds., Equilibrium Theory with an Infinite Number of Commodities (Springer, New York) pp. 353−369.
Cheyette, O. (1995), "Markov representation of the Heath–Jarrow–Morton model", Working Paper (BARRA Inc., Berkeley, California).
Cheyette, O. (1996), "Implied prepayments", Working Paper (BARRA Inc., Berkeley, California).
Citanna, A., and A. Villanacci (1993), "On generic Pareto improvement in competitive economies with incomplete asset structure", Working Paper (Center for Analytic Research in Economics and the Social Sciences, University of Pennsylvania).
Citanna, A., A. Kajii and A. Villanacci (1994), "Constrained suboptimality in incomplete markets: a general approach and two applications", Economic Theory 11:495−521.
Clewlow, L., K. Pang and C. Strickland (1997), "Efficient pricing of caps and swaptions in a multi-factor gaussian interest rate model", Working Paper (University of Warwick).
Cohen, H. (1995), "Isolating the wild card option", Mathematical Finance 2:155−166.
Coleman, T., L. Fisher and R. Ibbotson (1992), "Estimating the term structure of interest rates from data that include the prices of coupon bonds", Journal of Fixed Income 2 (September):85−116.
Collin-Dufresne, P., and R. Goldstein (2001a), "Do credit spreads reflect stationary leverage ratios?", Journal of Finance 56(5):1929−1958.
Collin-Dufresne, P., and R. Goldstein (2001b), "Stochastic correlation and the relative pricing of caps and swaptions in a generalized-affine framework", Working Paper (Carnegie Mellon University).
Collin-Dufresne, P., and R. Goldstein (2002), "Pricing swaptions within the affine framework", Journal of Derivatives 10(1):1−18.
Collin-Dufresne, P., and B. Solnik (2001), "On the term structure of default premia in the swap and Libor markets", Journal of Finance 56:1095−1116.
Constantinides, G. (1982), "Intertemporal asset pricing with heterogeneous consumers and without demand aggregation", Journal of Business 55:253−267.
Constantinides, G. (1990), "Habit formation: a resolution of the equity premium puzzle", Journal of Political Economy 98:519−543.
Constantinides, G. (1992), "A theory of the nominal term structure of interest rates", Review of Financial Studies 5:531−552.
Constantinides, G., and J. Ingersoll (1984), "Optimal bond trading with personal taxes", Journal of Financial Economics 13:299−335.
Constantinides, G., and T. Zariphopoulou (1999), "Bounds on prices of contingent claims in an intertemporal economy with proportional transaction costs and general preferences", Finance and Stochastics 3:345−369.
Constantinides, G., and T.
Zariphopoulou (2001), “Bounds on option prices in an intertemporal setting with proportional transaction costs and multiple securities”, Mathematical Finance 11:331−346.
Cont, R. (1998), "Modeling term structure dynamics: an infinite dimensional approach", Working Paper (Centre de Mathématiques Appliquées, École Polytechnique, Palaiseau, France).
Cooper, I., and A. Mello (1991), "The default risk of swaps", Journal of Finance 46:597−620.
Cooper, I., and A. Mello (1992), "Pricing and optimal use of forward contracts with default risk", Working Paper (Department of Finance, London Business School, University of London).
Corradi, V. (2000), "Degenerate continuous time limits of GARCH and GARCH-type processes", Journal of Econometrics 96:145−153.
Cox, J. (1983), "Optimal consumption and portfolio rules when assets follow a diffusion process", Working Paper (Graduate School of Business, Stanford University).
Cox, J., and C.-F. Huang (1989), "Optimal consumption and portfolio policies when asset prices follow a diffusion process", Journal of Economic Theory 49:33−83.
Cox, J., and C.-F. Huang (1991), "A variational problem arising in financial economics with an application to a portfolio turnpike theorem", Journal of Mathematical Economics 20:465−488.
Cox, J., and S. Ross (1976), "The valuation of options for alternative stochastic processes", Journal of Financial Economics 3:145−166.
Cox, J., and M. Rubinstein (1985), Options Markets (Prentice-Hall, Englewood Cliffs, NJ).
Cox, J., S. Ross and M. Rubinstein (1979), "Option pricing: a simplified approach", Journal of Financial Economics 7:229−263.
Cox, J., J. Ingersoll and S. Ross (1981), "The relation between forward prices and futures prices", Journal of Financial Economics 9:321−346.
Cox, J., J. Ingersoll and S. Ross (1985a), "An intertemporal general equilibrium model of asset prices", Econometrica 53:363−384.
Cox, J., J. Ingersoll and S. Ross (1985b), "A theory of the term structure of interest rates", Econometrica 53:385−408.
Cuoco, D.
(1997), "Optimal consumption and equilibrium prices with portfolio constraints and stochastic income", Journal of Economic Theory 72:33−73.
Cuoco, D., and H. He (1994), "Dynamic equilibrium in finite-dimensional economies with incomplete financial markets", Working Paper (Wharton School, University of Pennsylvania).
Cvitanić, J., and I. Karatzas (1993), "Hedging contingent claims with constrained portfolios", Annals of Applied Probability 3:652−681.
Cvitanić, J., and I. Karatzas (1996), "Hedging and portfolio optimization under transaction costs: a martingale approach", Mathematical Finance 6:133−165.
Cvitanić, J., H. Wang and W. Schachermayer (2001), "Utility maximization in incomplete markets with random endowment", Finance and Stochastics 5:259−272.
Daher, C., M. Romano and G. Zacklad (1992), "Détermination du prix de produits optionnels obligatoires à partir d'un modèle multi-facteurs de la courbe des taux", Working Paper (Caisse Autonome de Refinancement, Paris).
Dai, Q. (1994), "Implied Green's function in a no-arbitrage Markov model of the instantaneous short rate", Working Paper (Graduate School of Business, Stanford University).
Dai, Q., and K. Singleton (2000), "Specification analysis of affine term structure models", Journal of Finance 55:1943−1978.
Dai, Q., and K. Singleton (2003), "Term structure modelling in theory and reality", Review of Financial Studies, forthcoming.
Dalang, R., A. Morton and W. Willinger (1990), "Equivalent martingale measures and no-arbitrage in stochastic securities market models", Stochastics and Stochastic Reports 29:185−201.
Das, S. (1993), "Mean rate shifts and alternative models of the interest rate: theory and evidence", Working Paper (Department of Finance, New York University).
Das, S. (1995), "Pricing interest rate derivatives with arbitrary skewness and kurtosis: a simple approach to jump-diffusion bond option pricing", Working Paper (Division of Research, Harvard Business School).
Das, S. (1997), "Discrete-time bond and option pricing for jump-diffusion processes", Review of Derivatives Research 1:211−243.
Das, S. (1998), "The surprise element: interest rates as jump diffusions", NBER Working Paper 6631; Journal of Econometrics, under review.
Das, S., and S. Foresi (1996), "Exact solutions for bond and option prices with systematic jump risk", Review of Derivatives Research 1:7−24.
Das, S., and R. Sundaram (2000), "A discrete-time approach to arbitrage-free pricing of credit derivatives", Management Science 46:46−62.
Das, S., and P. Tufano (1995), "Pricing credit-sensitive debt when interest rates, credit ratings and credit spreads are stochastic", Journal of Financial Engineering 5(2):161−198.
Dash, J. (1989), "Path integrals and options − I", Working Paper (Financial Strategies Group, Merrill Lynch Capital Markets, New York).
Davis, M., and M. Clark (1993), "Analysis of financial models including transactions costs", Working Paper (Imperial College, University of London).
Davydov, D., V. Linetsky and C. Lotz (1999), "The hazard-rate approach to pricing risky debt: two analytically tractable examples", Working Paper (Department of Economics, University of Michigan).
Debreu, G. (1953), "Une économie de l'incertain", Working Paper (Électricité de France).
Debreu, G. (1959), Theory of Value, Cowles Foundation Monograph 17 (Yale University Press, New Haven, CT).
Décamps, J.-P., and A. Faure-Grimaud (2000), "Bankruptcy costs, ex post renegotiation and gambling for resurrection", Finance (December).
Décamps, J.-P., and A. Faure-Grimaud (2002), "Should I stay or should I go? Excessive continuation and dynamic agency costs of debt", European Economic Review 46(9):1623−1644.
Décamps, J.-P., and J.-C. Rochet (1997), "A variational approach for pricing options and corporate bonds", Economic Theory 9:557−569.
Dekel, E.
(1989), "Asset demands without the independence axiom", Econometrica 57:163−169.
Delbaen, F., and W. Schachermayer (1998), "The fundamental theorem of asset pricing for unbounded stochastic processes", Mathematische Annalen 312:215−250.
DeMarzo, P., and B. Eaves (1996), "A homotopy, Grassmann manifold, and relocalization for computing equilibria of GEI", Journal of Mathematical Economics 26:479−497.
Diament, P. (1993), "Semi-empirical smooth fit to the treasury yield curve", Working Paper (Graduate School of Business, Columbia University).
Dijkstra, T. (1996), "On numeraires and growth-optimum portfolios", Working Paper (Faculty of Economics, University of Groningen).
Dothan, M. (1978), "On the term structure of interest rates", Journal of Financial Economics 7:229−264.
Dothan, M. (1990), Prices in Financial Markets (Oxford University Press, New York).
Duffee, G. (1999a), "Estimating the price of default risk", Review of Financial Studies 12:197−226.
Duffee, G. (1999b), "Forecasting future interest rates: are affine models failures?", Working Paper (Federal Reserve Board).
Duffie, D. (1987), "Stochastic equilibria with incomplete financial markets", Journal of Economic Theory 41:405−416; Corrigendum: 1989, 49:384.
Duffie, D. (1988), "An extension of the Black–Scholes model of security valuation", Journal of Economic Theory 46:194−204.
Duffie, D. (1992), The Nature of Incomplete Markets (Cambridge University Press, Cambridge) pp. 214−262.
Duffie, D. (1998), "Defaultable term structures with fractional recovery of par", Working Paper (Graduate School of Business, Stanford University, Stanford, CA).
Duffie, D. (2001), Dynamic Asset Pricing Theory, 3rd Edition (Princeton University Press, Princeton, NJ).
Duffie, D. (2002), "A short course on credit risk modeling with affine processes", Working Paper (Graduate School of Business, Stanford University, Stanford, CA).
Duffie, D., and N. Gârleanu (2001), "Risk and valuation of collateralized debt obligations", Financial Analysts Journal 57 (1, January–February):41−62.
Duffie, D., and C.-F. Huang (1985), "Implementing Arrow–Debreu equilibria by continuous trading of few long-lived securities", Econometrica 53:1337−1356.
Duffie, D., and C.-F. Huang (1986), "Multiperiod security markets with differential information: martingales and resolution times", Journal of Mathematical Economics 15:283−303.
Duffie, D., and M. Huang (1996), "Swap rates and credit quality", Journal of Finance 51:921−949.
Duffie, D., and R. Kan (1996), "A yield-factor model of interest rates", Mathematical Finance 6:379−406.
Duffie, D., and D. Lando (2001), "Term structures of credit spreads with incomplete accounting information", Econometrica 69:633−664.
Duffie, D., and W. Shafer (1985), "Equilibrium in incomplete markets I: a basic model of generic existence", Journal of Mathematical Economics 14:285−300.
Duffie, D., and W. Shafer (1986), "Equilibrium in incomplete markets II: generic existence in stochastic economies", Journal of Mathematical Economics 15:199−216.
Duffie, D., and K. Singleton (1997), "An econometric model of the term structure of interest rate swap yields", Journal of Finance 52:1287−1321.
Duffie, D., and K. Singleton (1999), "Modeling term structures of defaultable bonds", Review of Financial Studies 12:687−720.
Duffie, D., and K. Singleton (2003), Credit Risk: Pricing, Measurement, and Management (Princeton University Press, Princeton, NJ).
Duffie, D., and C. Skiadas (1994), "Continuous-time security pricing: a utility gradient approach", Journal of Mathematical Economics 23:107−132.
Duffie, D., and R. Stanton (1988), "Pricing continuously resettled contingent claims", Journal of Economic Dynamics and Control 16:561−574.
Duffie, D., and W. Zame (1989), "The consumption-based capital asset pricing model", Econometrica 57:1279−1297.
Duffie, D., M. Schroder and C.
Skiadas (1996), "Recursive valuation of defaultable securities and the timing of the resolution of uncertainty", Annals of Applied Probability 6:1075−1090.
Duffie, D., M. Schroder and C. Skiadas (1997), "A term structure model with preferences for the timing of resolution of uncertainty", Economic Theory 9:3−22.
Duffie, D., J. Pan and K. Singleton (2000), "Transform analysis and asset pricing for affine jump-diffusions", Econometrica 68:1343−1376.
Duffie, D., L. Pedersen and K. Singleton (2003), "Modeling sovereign yield spreads: a case study of Russian debt", The Journal of Finance 58:119−160.
Duffie, D., D. Filipović and W. Schachermayer (2003), "Affine processes and applications in finance", Annals of Applied Probability 13, forthcoming.
Dumas, B., and P. Maenhout (2002), "A central planning approach to dynamic incomplete markets", Working Paper (INSEAD, France).
Dunn, K., and K. Singleton (1986), "Modeling the term structure of interest rates under nonseparable utility and durability of goods", Journal of Financial Economics 17:27−55.
Dupire, B. (1994), "Pricing with a smile", Risk (January), pp. 18−20.
Dybvig, P., and C.-F. Huang (1988), "Nonnegative wealth, absence of arbitrage, and feasible consumption plans", Review of Financial Studies 1:377−401.
El Karoui, N., and H. Geman (1994), "A probabilistic approach to the valuation of general floating-rate notes with an application to interest rate swaps", Advances in Futures and Options Research 7:47−63.
El Karoui, N., and V. Lacoste (1992), "Multifactor models of the term structure of interest rates", Working Paper (Laboratoire de Probabilités, Université de Paris VI).
El Karoui, N., and M. Quenez (1995), "Dynamic programming and pricing of contingent claims in an incomplete market", SIAM Journal of Control and Optimization 33:29−66.
El Karoui, N., and J.-C. Rochet (1989), "A pricing formula for options on coupon bonds", Working Paper (October, Laboratoire de Probabilités, Université de Paris VI).
El Karoui, N., C. Lepage, R. Myneni, N. Roseau and R. Viswanathan (1991a), "The pricing and hedging of interest rate claims: applications", Working Paper (Laboratoire de Probabilités, Université de Paris VI).
El Karoui, N., C. Lepage, R. Myneni, N. Roseau and R. Viswanathan (1991b), "The valuation and hedging of contingent claims with gaussian Markov interest rates", Working Paper (Laboratoire de Probabilités, Université de Paris VI).
El Karoui, N., R. Myneni and R. Viswanathan (1992), "Arbitrage pricing and hedging of interest rate claims with state variables I: theory", Working Paper (Laboratoire de Probabilités, Université de Paris VI).
Elliott, R., M. Jeanblanc and M. Yor (2000), "On models of default risk", Mathematical Finance 10:77−106.
Engle, R. (1982), "Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation", Econometrica 50:987−1008.
Epstein, L. (1988), "Risk aversion and asset prices", Journal of Monetary Economics 22:179−192.
Epstein, L. (1992), "Behavior under risk: recent developments in theory and application", in: J.-J. Laffont, ed., Advances in Economic Theory (Cambridge University Press, Cambridge) pp. 1−63.
Epstein, L., and S. Zin (1989), "Substitution, risk aversion and the temporal behavior of consumption and asset returns I: a theoretical framework", Econometrica 57:937−969.
Epstein, L., and S. Zin (1999), "Substitution, risk aversion and the temporal behavior of consumption and asset returns: an empirical analysis", Journal of Political Economy 99:263−286.
Fan, H., and S. Sundaresan (2000), "Debt valuation, renegotiations and optimal dividend policy", Review of Financial Studies 13(4):1057−1099.
Feller, W. (1951), "Two singular diffusion problems", Annals of Mathematics 54:173−182.
Filipović, D. (1999), "A note on the Nelson–Siegel family", Mathematical Finance 9:349−359.
Filipović, D.
(2001a), “A general characterization of one factor affine term structure models”, Finance and Stochastics 5:389−412. Filipovi´c, D. (2001b), “Time-inhomogeneous affine processes”, Working Paper (Department of Operations Research and Financial Engineering, Princeton University) submitted. Fisher, E., R. Heinkel and J. Zechner (1989), “Dynamic capital structure choice: theory and tests”, Journal of Finance 44:19−40. Fisher, M., D. Nychka and D. Zervos (1994), “Fitting the term structure of interest rates with smoothing splines”, Working Paper (Board of Governors of the Federal Reserve Board, Washington, DC). Fleming, J., and R. Whaley (1994), “The value of wildcard options”, Journal of Finance 1:215−236. Fleming, W., and M. Soner (1993), Controlled Markov Processes and Viscosity Solutions (Springer, New York). Florenzano, M., and P. Gourdel (1994), “T-period economies with incomplete markets”, Economics Letters 44:91−97. Foldes, L. (1978a), “Martingale conditions for optimal saving – discrete time”, Journal of Mathematical Economics 5:83−96. Foldes, L. (1978b), “Optimal saving and risk in continuous time”, Review of Economic Studies 45: 39−65. Foldes, L. (1990), “Conditions for optimality in the infinite-horizon portfolio-cum-saving problem with semimartingale investments”, Stochastics and Stochastics Reports 29:133−170. Foldes, L. (1991a), “Certainty equivalence in the continuous-time portfolio-cum-saving model”, in: M.H.A. Davis and R.J. Elliott, eds., Applied Stochastic Analysis (Gordon and Breach, London) pp. 343–387. Foldes, L. (1991b), “Optimal sure portfolio plans”, Mathematical Finance 1:15−55. Foldes, L. (1992), “Existence and uniqueness of an optimum in the infinite-horizon portfolio-cum-saving model with semimartingale investments”, Stochastic and Stochastic Reports 41:241−267. Foldes, L. (2001), “The optimal consumption function in a brownian model of accumulation, Part A:
    • 734 D. Duffie the consumption function as solution of a boundary value problem”, Journal of Economic Dynamics and Control 25:1951−1971. F¨ollmer, H., and M. Schweizer (1990), “Hedging of contingent claims under incomplete information”, in: M. Davis and R. Elliott, eds., Applied Stochastic Analysis (Gordon and Breach, London) pp. 389−414. Frachot, A. (1995), “Factor models of domestic and foreign interest rates with stochastic volatilities”, Mathematical Finance 5:167−185. Frachot, A., and J.-P. Lesne (1993), “Econometrics of linear factor models of interest rates”, Working Paper (Banque de France, Paris). Frachot, A., D. Janci and V. Lacoste (1993), “Factor analysis of the term structure: a probabilistic approach”, Working Paper (Banque de France, Paris). Frittelli, M., and P. Lakner (1995), “Arbitrage and free lunch in a general financial market model; the fundamental theorem of asset pricing”, Mathematical Finance 5:237−261. Gabay, D. (1982), “Stochastic processes in models of financial markets”, Working Paper; in: Proceedings of the IFIP Conference on Control of Distributed Systems, Toulouse (Pergamon Press, Toulouse). Geanakoplos, J. (1990), “An introduction to general equilibrium with incomplete asset markets”, Journal of Mathematical Economics 19:1−38. Geanakoplos, J., and A. Mas-Colell (1989), “Real indeterminacy with financial assets”, Journal of Economic Theory 47:22−38. Geanakoplos, J., and W. Shafer (1990), “Solving systems of simultaneous equations in economics”, Journal of Mathematical Economics 19:69−94. Geman, H., N. El Karoui and J. Rochet (1995), “Changes of num´eraire, changes of probability measure and option pricing”, Journal of Applied Probability 32:443−458. Geske, R. (1977), “The valuation of corporate liabilities as compound options”, Journal of Financial Economics 7:63−81. Giovannini, A., and P. 
Weil (1989), “Risk aversion and intertemporal substitution in the capital asset pricing model”, Working Paper w2824 (National Bureau of Economic Research, Cambridge, MA). Girotto, B., and F. Ortu (1994), “Consumption and portfolio policies with incomplete markets and short-sale contraints in the finite-dimensional case: some remarks”, Mathematical Finance 4:69−73. Girotto, B., and F. Ortu (1996), “Existence of equivalent martingale measures in finite dimensional securities markets”, Journal of Economic Theory 69:262−277. Goldberg, L. (1998), “Volatility of the short rate in the rational lognormal model”, Finance and Stochastics 2:199−211. Goldstein, R. (1997), “Beyond HJM: fitting the current term structure while maintaining a Markovian system”, Working Paper (Fisher College of Business, The Ohio State University). Goldstein, R. (2000), “The term structure of interest rates as a random field”, Review of Financial Studies 13:365−384. Goldys, B., and M. Musiela (1996), “On partial differential equations related to term structure models”, Working Paper (School of Mathematics, The University of New South Wales, Sydney, Australia). Goldys, B., M. Musiela and D. Sondermann (1994), “Lognormality of rates and term structure models”, Working Paper (School of Mathematics, University of New South Wales). Gorman, W. (1953), “Community preference fields”, Econometrica 21:63−80. Gottardi, P., and T. Hens (1996), “The survival assumption and existence of competitive equilibria when asset markets are incomplete”, Journal of Economic Theory 71:313−323. Grannan, E., and G. Swindle (1996), “Minimizing transaction costs of option hedging strategies”, Mathematical Finance 6:341−364. Grant, S., A. Kajii and B. Polak (2000), “Temporal resolution of uncertainty and recursive non-expected utility models”, Econometrica 68:425−434. Grinblatt, M., and N. Jegadeesh (1996), “The relative pricing of eurodollar futures and forward contracts”, Journal of Finance 51:1499−1522. Gul, F., and O. 
Lantto (1990), “Betweenness satisfying preferences and dynamic choice”, Journal of Economic Theory 52:162−177.
Guo, D. (1998), “The risk premium of volatility implicit in currency options”, Journal of Business and Economic Statistics 16:498−507.
Hahn, F. (1994), “On economies with Arrow securities”, Working Paper (Department of Economics, Cambridge University).
Hamza, K., and F. Klebaner (1995), “A stochastic partial differential equation for term structure of interest rates”, Working Paper (Department of Statistics, The University of Melbourne).
Hansen, A., and P. Jorgensen (2000), “Fast and accurate approximation of bond prices when short interest rates are log-normal”, Journal of Computational Finance 3(3):27−45.
Hansen, L., and R. Jagannathan (1990), “Implications of security market data for models of dynamic economies”, Journal of Political Economy 99:225−262.
Harrison, M., and D. Kreps (1979), “Martingales and arbitrage in multiperiod securities markets”, Journal of Economic Theory 20:381−408.
Harrison, M., and S. Pliska (1981), “Martingales and stochastic integrals in the theory of continuous trading”, Stochastic Processes and Their Applications 11:215−260.
Hart, O. (1975), “On the optimality of equilibrium when the market structure is incomplete”, Journal of Economic Theory 11:418−430.
He, H., and H. Pagès (1993), “Labor income, borrowing constraints, and equilibrium asset prices”, Economic Theory 3:663−696.
Heath, D. (1998), “Some new term structure models”, Working Paper (Department of Mathematical Sciences, Carnegie Mellon University).
Heath, D., R. Jarrow and A. Morton (1992), “Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation”, Econometrica 60:77−106.
Henrotte, P. (1991), “Transactions costs and duplication strategies”, Working Paper (Graduate School of Business, Stanford University).
Hens, T. (1991), “Structure of general equilibrium models with incomplete markets”, Working Paper (Department of Economics, University of Bonn).
Heston, S. (1988), “Testing continuous time models of the term structure of interest rates”, Working Paper (Graduate School of Industrial Administration, Carnegie-Mellon University).
Heston, S. (1993), “A closed-form solution for options with stochastic volatility with applications to bond and currency options”, Review of Financial Studies 6:327−344.
Hilberink, B., and L.C.G. Rogers (2002), “Optimal capital structure and endogenous default”, Finance and Stochastics 6(2):237−263.
Hindy, A., and M. Huang (1993), “Asset pricing with linear collateral constraints”, Working Paper (Graduate School of Business, Stanford University).
Hirsch, M., M. Magill and A. Mas-Colell (1990), “A geometric approach to a class of equilibrium existence theorems”, Journal of Mathematical Economics 19:95−106.
Ho, T., and S. Lee (1986), “Term structure movements and pricing interest rate contingent claims”, Journal of Finance 41:1011−1029.
Hogan, M., and K. Weintraub (1993), “The lognormal interest rate model and eurodollar futures”, Working Paper (Citibank, New York).
Huang, C.-F. (1985a), “Information structures and equilibrium asset prices”, Journal of Economic Theory 31:33−71.
Huang, C.-F. (1985b), “Information structures and viable price systems”, Journal of Mathematical Economics 14:215−240.
Huang, C.-F., and H. Pagès (1992), “Optimal consumption and portfolio policies with an infinite horizon: existence and convergence”, Annals of Applied Probability 2:36−64.
Hull, J. (2000), Options, Futures, and Other Derivative Securities, 4th Edition (Prentice-Hall, Englewood Cliffs, NJ).
Hull, J., and A. White (1990), “Pricing interest rate derivative securities”, Review of Financial Studies 3:573−592.
Hull, J., and A. White (1992), “The price of default”, Risk 5:101−103.
Hull, J., and A. White (1993), “One-factor interest-rate models and the valuation of interest-rate derivative securities”, Journal of Financial and Quantitative Analysis 28:235−254.
Hull, J., and A. White (1995), “The impact of default risk on the prices of options and other derivative securities”, Journal of Banking and Finance 19:299−322.
Husseini, S., J.-M. Lasry and M. Magill (1990), “Existence of equilibrium with incomplete markets”, Journal of Mathematical Economics 19:39−68.
Ingersoll, J. (1977), “An examination of corporate call policies on convertible securities”, Journal of Finance 32:463−478.
Jackwerth, J., and M. Rubinstein (1996), “Recovering probability distributions from option prices”, Journal of Finance 51:1611−1631.
Jacod, J., and P. Protter (2000), Probability Essentials (Springer, New York).
Jacod, J., and A. Shiryaev (1998), “Local martingales and the fundamental asset pricing theorems in the discrete-time case”, Finance and Stochastics 2:259−274.
Jakobsen, S. (1992), “Prepayment and the valuation of Danish mortgage-backed bonds”, Ph.D. Dissertation (The Aarhus School of Business, Denmark).
Jamshidian, F. (1989a), “Closed-form solution for American options on coupon bonds in the general gaussian interest rate model”, Working Paper (Financial Strategies Group, Merrill Lynch Capital Markets, New York).
Jamshidian, F. (1989b), “An exact bond option formula”, Journal of Finance 44:205−209.
Jamshidian, F. (1989c), “The multifactor gaussian interest rate model and implementation”, Working Paper (Financial Strategies Group, Merrill Lynch Capital Markets, New York).
Jamshidian, F. (1991a), “Bond and option evaluation in the gaussian interest rate model”, Research in Finance 9:131−170.
Jamshidian, F. (1991b), “Forward induction and construction of yield curve diffusion models”, Journal of Fixed Income (June), pp. 62−74.
Jamshidian, F. (1993a), “Hedging and evaluating diff swaps”, Working Paper (Fuji International Finance PLC, London).
Jamshidian, F. (1993b), “Options and futures evaluation with deterministic volatilities”, Mathematical Finance 3:149−159.
Jamshidian, F. (1994), “Hedging quantos, differential swaps and ratios”, Applied Mathematical Finance 1:1−20.
Jamshidian, F. (1996), “Bond, futures and option evaluation in the quadratic interest rate model”, Applied Mathematical Finance 3:93−115.
Jamshidian, F. (1997a), “Libor and swap market models and measures”, Finance and Stochastics 1:293−330.
Jamshidian, F. (1997b), “Pricing and hedging European swaptions with deterministic (lognormal) forward swap volatility”, Finance and Stochastics 1:293−330.
Jamshidian, F. (2001), “Libor market model with semimartingales”, in: E. Jouini, J. Cvitanic and M. Musiela, eds., Option Pricing, Interest Rates and Risk Management, Handbooks in Mathematical Finance (Cambridge University Press) Part II, Ch. 10.
Jarrow, R., and S. Turnbull (1994), “Delta, gamma and bucket hedging of interest rate derivatives”, Applied Mathematical Finance 1:21−48.
Jarrow, R., and S. Turnbull (1995), “Pricing derivatives on financial securities subject to credit risk”, Journal of Finance 50:53−85.
Jarrow, R., and F. Yu (2001), “Counterparty risk and the pricing of defaultable securities”, Journal of Finance 56(5):1765−1799.
Jarrow, R., D. Lando and F. Yu (2003), “Default risk and diversification: theory and application”, Working Paper (Cornell University).
Jaschke, S. (1996), “Arbitrage bounds for the term structure of interest rates”, Finance and Stochastics 2:29−40.
Jeanblanc, M., and M. Rutkowski (2000), “Modelling of default risk: an overview”, in: Modern Mathematical Finance: Theory and Practice (Higher Education Press, Beijing) pp. 171−269.
Jeffrey, A. (1995), “Single factor Heath–Jarrow–Morton term structure models based on Markov spot interest rate”, Journal of Financial and Quantitative Analysis 30:619−643.
Johnson, B. (1994), “Dynamic asset pricing theory: the search for implementable results”, Working Paper (Engineering-Economic Systems Department, Stanford University).
Jong, F.D., and P. Santa-Clara (1999), “The dynamics of the forward interest rate curve: a formulation with state variables”, Journal of Financial and Quantitative Analysis 34:131−157.
Jouini, E., and H. Kallal (1993), “Efficient trading strategies in the presence of market frictions”, Working Paper (CREST-ENSAE, Paris).
Kabanov, Y. (1997), “On the FTAP of Kreps–Delbaen–Schachermayer”, in: Statistics and Control of Stochastic Processes, Moscow 1995/1996 (World Scientific, River Edge, NJ) pp. 191−203.
Kabanov, Y., and D. Kramkov (1995), “Large financial markets: asymptotic arbitrage and contiguity”, Theory of Probability and its Applications 39:182−187.
Kabanov, Y., and C. Stricker (2001), “The Harrison–Pliska arbitrage pricing theorem under transactions costs”, Journal of Mathematical Economics 35:185−196.
Kan, R. (1993), “Gradient of the representative agent utility when agents have stochastic recursive preferences”, Working Paper (Graduate School of Business, Stanford University).
Kan, R. (1995), “Structure of Pareto optima when agents have stochastic recursive preferences”, Journal of Economic Theory 66:626−631.
Karatzas, I. (1988), “On the pricing of American options”, Applied Mathematics and Optimization 17:37−60.
Karatzas, I. (1993), “IMA tutorial lectures 1–3: Minneapolis”, Working Paper (Department of Statistics, Columbia University).
Karatzas, I., and S.-G. Kou (1998), “Hedging American contingent claims with constrained portfolios”, Finance and Stochastics 2:215−258.
Karatzas, I., and S. Shreve (1988), Brownian Motion and Stochastic Calculus (Springer, New York).
Karatzas, I., and S. Shreve (1998), Methods of Mathematical Finance (Springer, New York).
Karatzas, I., J. Lehoczky and S. Shreve (1987), “Optimal portfolio and consumption decisions for a ‘small investor’ on a finite horizon”, SIAM Journal of Control and Optimization 25:1157−1186.
Kawazu, K., and S. Watanabe (1971), “Branching processes with immigration and related limit theorems”, Theory of Probability and its Applications 16:36−54.
Kennedy, D. (1994), “The term structure of interest rates as a gaussian random field”, Mathematical Finance 4:247−258.
Konno, H., and T. Takase (1995), “A constrained least square approach to the estimation of the term structure of interest rates”, Financial Engineering and the Japanese Markets 2:169−179.
Konno, H., and T. Takase (1996), “On the de-facto convex structure of a least square problem for estimating the term structure of interest rates”, Financial Engineering and the Japanese Markets 3:77−85.
Koopmans, T. (1960), “Stationary utility and impatience”, Econometrica 28:287−309.
Kramkov, D., and W. Schachermayer (1999), “The asymptotic elasticity of utility functions and optimal investment in incomplete markets”, Annals of Applied Probability 9(3):904−950.
Kraus, A., and R. Litzenberger (1975), “Market equilibrium in a multiperiod state preference model with logarithmic utility”, Journal of Finance 30:1213−1227.
Kreps, D. (1979), “Three essays on capital markets”, Working Paper (Institute for Mathematical Studies in the Social Sciences, Stanford University).
Kreps, D. (1981), “Arbitrage and equilibrium in economies with infinitely many commodities”, Journal of Mathematical Economics 8:15−35.
Kreps, D., and E. Porteus (1978), “Temporal resolution of uncertainty and dynamic choice theory”, Econometrica 46:185−200.
Kusuoka, S. (1992), “Consistent price system when transaction costs exist”, Working Paper (Research Institute for Mathematical Sciences, Kyoto University).
Kusuoka, S. (1993), “A remark on arbitrage and martingale measure”, Publ. RIMS, Kyoto University 29:833−840.
Kusuoka, S. (1995), “Limit theorem on option replication cost with transaction costs”, Annals of Applied Probability 11:1283−1301.
Kusuoka, S. (1999), “A remark on default risk models”, Advances in Mathematical Economics 1:69−82.
Kusuoka, S. (2000), “Term structure and SPDE”, Advances in Mathematical Economics 2:67−85.
Kydland, F., and E. Prescott (1991), “Indeterminacy in incomplete market economies”, Economic Theory 1:45−62.
Lakner, P. (1993), “Equivalent local martingale measures and free lunch in a stochastic model of finance with continuous trading”, Working Paper (Statistics and Operations Research Department, New York University).
Lakner, P., and E. Slud (1991), “Optimal consumption by a bond investor: the case of random interest rate adapted to a point process”, SIAM Journal of Control and Optimization 29:638−655.
Lando, D. (1994), “Three essays on contingent claims pricing”, Ph.D. Dissertation (Statistics Center, Cornell University).
Lando, D. (1998), “On Cox processes and credit risky securities”, Review of Derivatives Research 2:99−120.
Lang, L., R. Litzenberger and A. Liu (1998), “Determinants of interest rate swap spreads”, Journal of Banking and Finance 22:1507−1532.
Langetieg, T. (1980), “A multivariate model of the term structure”, Journal of Finance 35:71−97.
Leland, H. (1985), “Option pricing and replication with transactions costs”, Journal of Finance 40:1283−1301.
Leland, H. (1994), “Corporate debt value, bond covenants, and optimal capital structure”, Journal of Finance 49:1213−1252.
Leland, H. (1998), “Agency costs, risk management, and capital structure”, Journal of Finance 53:1213−1242.
Leland, H., and K. Toft (1996), “Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads”, Journal of Finance 51:987−1019.
LeRoy, S. (1973), “Risk aversion and the martingale property of asset prices”, International Economic Review 14:436−446.
Levental, S., and A. Skorohod (1995), “A necessary and sufficient condition for absence of arbitrage with tame portfolios”, Annals of Applied Probability 5:906−925.
Litzenberger, R. (1992), “Swaps: plain and fanciful”, Journal of Finance 47:831−850.
Liu, J., J. Pan and L. Pedersen (1999), “Density-based inference in affine jump-diffusions”, Working Paper (Graduate School of Business, Stanford University).
Long, J. (1990), “The numeraire portfolio”, Journal of Financial Economics 26:29−69.
Longstaff, F. (1990), “The valuation of options on yields”, Journal of Financial Economics 26:97−121.
Longstaff, F., and E. Schwartz (1992), “Interest rate volatility and the term structure: a two-factor general equilibrium model”, Journal of Finance 47:1259−1282.
Longstaff, F., and E.S. Schwartz (1993), “Implementation of the Longstaff–Schwartz interest rate model”, Journal of Fixed Income 3:7−14.
Longstaff, F., and E.S. Schwartz (1995), “A simple approach to valuing risky fixed and floating rate debt”, Journal of Finance 50:789−819.
Longstaff, F., and E.S. Schwartz (2001), “Valuing American options by simulation: a simple least-squares approach”, Review of Financial Studies 14(1):113−147.
Lucas, R. (1978), “Asset prices in an exchange economy”, Econometrica 46:1429−1445.
Machina, M. (1982), “‘Expected utility’ analysis without the independence axiom”, Econometrica 50:277−323.
Madan, D., and H. Unal (1998), “Pricing the risks of default”, Review of Derivatives Research 2:121−160.
Magill, M., and M. Quinzii (1996), Theory of Incomplete Markets (MIT Press, Cambridge, MA).
Magill, M., and W. Shafer (1990), “Characterization of generically complete real asset structures”, Journal of Mathematical Economics 19:167−194.
Magill, M., and W. Shafer (1991), “Incomplete markets”, in: W. Hildenbrand and H. Sonnenschein, eds., Handbook of Mathematical Economics, Vol. 4 (Elsevier, Amsterdam) pp. 1523−1614.
Mas-Colell, A. (1991), “Indeterminacy in incomplete market economies”, Economic Theory 1:45−62.
Mella-Barral, P. (1999), “Dynamics of default and debt reorganization”, Review of Financial Studies 12:535−578.
Mella-Barral, P., and W. Perraudin (1997), “Strategic debt service”, Journal of Finance 52:531−556.
Merton, R. (1971), “Optimum consumption and portfolio rules in a continuous time model”, Journal of Economic Theory 3:373−413; Erratum: 1973, 6:213−214.
Merton, R. (1973), “Theory of rational option pricing”, Bell Journal of Economics and Management Science 4:141−183.
Merton, R. (1974), “On the pricing of corporate debt: the risk structure of interest rates”, Journal of Finance 29:449−470.
Merton, R. (1977), “On the pricing of contingent claims and the Modigliani–Miller theorem”, Journal of Financial Economics 5:241−250.
Miltersen, K. (1994), “An arbitrage theory of the term structure of interest rates”, Annals of Applied Probability 4:953−967.
Miltersen, K., K. Sandmann and D. Sondermann (1997), “Closed form solutions for term structure derivatives with log-normal interest rates”, Journal of Finance 52:409−430.
Modigliani, F., and M. Miller (1958), “The cost of capital, corporation finance, and the theory of investment”, American Economic Review 48:261−297.
Musiela, M. (1994a), “Nominal annual rates and lognormal volatility structure”, Working Paper (Department of Mathematics, University of New South Wales, Sydney).
Musiela, M. (1994b), “Stochastic PDEs and term structure models”, Working Paper (Department of Mathematics, University of New South Wales, Sydney).
Musiela, M., and D. Sondermann (1994), “Different dynamical specifications of the term structure of interest rates and their implications”, Working Paper (Department of Mathematics, University of New South Wales, Sydney).
Nelson, D. (1990), “ARCH models as diffusion approximations”, Journal of Econometrics 45:7−38.
Nielsen, S., and E. Ronn (1995), “The valuation of default risk in corporate bonds and interest rate swaps”, Working Paper (Department of Management Science and Information Systems, University of Texas at Austin).
Nunes, J., L. Clewlow and S. Hodges (1999), “Interest rate derivatives in a Duffie and Kan model with stochastic volatility: an Arrow–Debreu pricing approach”, Review of Derivatives Research 3:5−66.
Nyborg, K. (1996), “The use and pricing of convertible bonds”, Applied Mathematical Finance 3:167−190.
Pagès, H. (1987), “Optimal consumption and portfolio policies when markets are incomplete”, Working Paper (Department of Economics, Massachusetts Institute of Technology).
Pagès, H. (2000), “Estimating Brazilian sovereign risk from Brady bond prices”, Working Paper (Bank of France).
Pan, J. (2002), “The jump-risk premia implicit in options: evidence from an integrated time-series study”, Journal of Financial Economics 63:3−50.
Pan, W.-H. (1993), “Constrained efficient allocations in incomplete markets: characterization and implementation”, Working Paper (Department of Economics, University of Rochester).
Pan, W.-H. (1995), “A second welfare theorem for constrained efficient allocations in incomplete markets”, Journal of Mathematical Economics 24:577−599.
Pang, K. (1996), “Multi-factor gaussian HJM approximation to Kennedy and calibration to caps and swaptions prices”, Working Paper (Financial Options Research Center, Warwick Business School, University of Warwick).
Pang, K., and S. Hodges (1995), “Non-negative affine yield models of the term structure”, Working Paper (Financial Options Research Center, Warwick Business School, University of Warwick).
Pearson, N., and T.-S. Sun (1994), “An empirical examination of the Cox, Ingersoll, and Ross model of the term structure of interest rates using the method of maximum likelihood”, Journal of Finance 54:929−959.
Pennacchi, G. (1991), “Identifying the dynamics of real interest rates and inflation: evidence using survey data”, Review of Financial Studies 4:53−86.
Piazzesi, M. (1997), “An affine model of the term structure of interest rates with macroeconomic factors”, Working Paper (Stanford University).
Piazzesi, M. (1999), “A linear-quadratic jump-diffusion model with scheduled and unscheduled announcements”, Working Paper (Stanford University).
Piazzesi, M. (2002), “Affine term structure models”, in: Y. Aït-Sahalia and L.P. Hansen, eds., Handbook of Financial Economics (Elsevier, Amsterdam) forthcoming.
Pliska, S. (1986), “A stochastic calculus model of continuous trading: optimal portfolios”, Mathematics of Operations Research 11:371−382.
Plott, C. (1986), “Rational choice in experimental markets”, Journal of Business 59:S301−S327.
Poteshman, A. (1998), “Estimating a general stochastic variance model from options prices”, Working Paper (Graduate School of Business, University of Chicago, Chicago, IL).
Prisman, E. (1985), “Valuation of risky assets in arbitrage free economies with frictions”, Working Paper (Department of Finance, University of Arizona, Tucson, AZ).
Protter, P. (1990), Stochastic Integration and Differential Equations (Springer, New York).
Protter, P. (2001), “A partial introduction to financial asset pricing theory”, Stochastic Processes and their Applications 91(2):169−203.
Pye, G. (1974), “Gauging the default premium”, Financial Analysts Journal (January–February), pp. 49−52.
Radner, R. (1967), “Equilibre des marchés à terme et au comptant en cas d’incertitude”, Cahiers d’Économétrie 4:35−52.
Radner, R. (1972), “Existence of equilibrium of plans, prices, and price expectations in a sequence of markets”, Econometrica 40:289−303.
Renault, E., and N. Touzi (1992), “Stochastic volatility models: statistical inference from implied volatilities”, Working Paper (GREMAQ IDEI, Toulouse, and CREST, Paris, France).
Ritchken, P., and L. Sankarasubramanian (1992), “Valuing claims when interest rates have stochastic volatility”, Working Paper (Department of Finance, University of Southern California).
Ritchken, P., and R. Trevor (1993), “On finite state Markovian representations of the term structure”, Working Paper (Department of Finance, University of Southern California).
Rogers, C. (1994), “Equivalent martingale measures and no-arbitrage”, Stochastics and Stochastics Reports 51:1−9.
Rogers, C. (1995), “Which model for term-structure of interest rates should one use?”, in: Mathematical Finance, IMA Volumes in Mathematics and its Applications, Vol. 65 (Springer, New York) pp. 93−116.
Ross, S. (1987), “Arbitrage and martingales with taxation”, Journal of Political Economy 95:371−393.
Ross, S. (1989), “Information and volatility: the no-arbitrage martingale approach to timing and resolution irrelevancy”, Journal of Finance 44:1−17.
Rubinstein, M. (1976), “The valuation of uncertain income streams and the pricing of options”, Bell Journal of Economics 7:407−425.
Rubinstein, M. (1995), “As simple as one, two, three”, Risk 8(January):44−47.
Rutkowski, M. (1996), “Valuation and hedging of contingent claims in the HJM model with deterministic volatilities”, Applied Mathematical Finance 3:237−267.
Rutkowski, M. (1998), “Dynamics of spot, forward, and futures Libor rates”, International Journal of Theoretical and Applied Finance 1:425−445.
Ryder, H., and G. Heal (1973), “Optimal growth with intertemporally dependent preferences”, Review of Economic Studies 40:1−31.
Sandmann, K., and D. Sondermann (1997), “On the stability of lognormal interest rate models”, Mathematical Finance 7:119−125.
Santa-Clara, P., and D. Sornette (2001), “The dynamics of the forward interest rate curve with stochastic string shocks”, Review of Financial Studies 14:149−185.
Sato, K. (1999), Lévy Processes and Infinitely Divisible Distributions (Cambridge University Press, Cambridge). Translated from the 1990 Japanese original, revised by the author.
Scaillet, O. (1996), “Compound and exchange options in the affine term structure model”, Applied Mathematical Finance 3:75−92.
Schachermayer, W. (1992), “A Hilbert-space proof of the fundamental theorem of asset pricing”, Insurance: Mathematics and Economics 11:249−257.
Schachermayer, W. (1994), “Martingale measures for discrete-time processes with infinite horizon”, Mathematical Finance 4:25−56.
Schachermayer, W. (2001), “The fundamental theorem of asset pricing under proportional transaction costs in finite discrete time”, Working Paper (Institut für Statistik der Universität Wien).
Schachermayer, W. (2002), “No arbitrage: on the work of David Kreps”, Positivity 6:359−368.
Schönbucher, P. (1998), “Term structure modelling of defaultable bonds”, Review of Derivatives Research 2:161−192.
Schroder, M., and C. Skiadas (1999), “Optimal consumption and portfolio selection with stochastic differential utility”, Journal of Economic Theory 89:68−126.
Schroder, M., and C. Skiadas (2002), “An isomorphism between asset pricing models with and without linear habit formation”, Review of Financial Studies 15:1189−1221.
Schweizer, M. (1992), “Martingale densities for general asset prices”, Journal of Mathematical Economics 21:363−378.
Scott, L. (1997), “The valuation of interest rate derivatives in a multi-factor Cox–Ingersoll–Ross model that matches the initial term structure”, Working Paper (Morgan Stanley, New York).
Selby, M., and C. Strickland (1993), “Computing the Fong and Vasicek pure discount bond price formula”, Working Paper (FORC Preprint 93/42, October 1993, University of Warwick).
Selden, L. (1978), “A new representation of preference over ‘certain × uncertain’ consumption pairs: the ‘ordinal certainty equivalent’ hypothesis”, Econometrica 46:1045−1060.
Sharpe, W. (1964), “Capital asset prices: a theory of market equilibrium under conditions of risk”, Journal of Finance 19:425−442.
Singleton, K. (2001), “Estimation of affine asset pricing models using the empirical characteristic function”, Journal of Econometrics 102:111−141.
Singleton, K., and L. Umantsev (2003), “Pricing coupon-bond options and swaptions in affine term structure models”, Mathematical Finance, forthcoming.
Skiadas, C. (1997), “Conditioning and aggregation of preferences”, Econometrica 65:347−367.
Skiadas, C. (1998), “Recursive utility and preferences for information”, Economic Theory 12:293−312.
Soner, M., S. Shreve and J. Cvitanić (1994), “There is no nontrivial hedging portfolio for option pricing with transaction costs”, Annals of Applied Probability 5:327−355.
Sornette, D. (1998), “String formulation of the dynamics of the forward interest rate curve”, European Physical Journal B 3:125−137.
Stanton, R. (1995), “Rational prepayment and the valuation of mortgage-backed securities”, Review of Financial Studies 8:677−708.
Stanton, R., and N. Wallace (1995), “ARM wrestling: valuing adjustable rate mortgages indexed to the eleventh district cost of funds”, Real Estate Economics 23:311−345.
Stanton, R., and N. Wallace (1998), “Mortgage choice: what’s the point?”, Real Estate Economics 26:173−205.
    • 742 D. Duffie Stapleton, R., and M. Subrahmanyam (1978), “A multiperiod equilibrium asset pricing model”, Econometrica 46:1077−1093. Stein, E., and J. Stein (1991), “Stock price distributions with stochastic volatility: an analytic approach”, Review of Financial Studies 4:725−752. Stricker, C. (1990), “Arbitrage et lois de martingale”, Annales de l’Institut Henri Poincar´e 26:451−460. Sundaresan, S. (1989), “Intertemporally dependent preferences in the theories of consumption, portfolio choice and equilibrium asset pricing”, Review of Financial Studies 2:73−89. Sundaresan, S. (1997), Fixed Income Markets and Their Derivatives (South-Western, Cincinnati, OH). Svensson, L., and M. Dahlquist (1996), “Estimating the term structure of interest rates for monetary policy analysis”, Scandinavian Journal of Economics 98:163−183. Turnbull, S.M. (1993), “Pricing and hedging diff swaps”, Journal of Financial Engineering (December): 297−334. Turnbull, S.M. (1995), “Interest rate digital options and range notes”, Journal of Derivatives 3:92−101. Uhrig-Homburg, M. (1998), “Endogenous bankruptcy when issuance is costly”, Working Paper 98-13 (Lehrstuhl f¨ur Finanzierung, University of Mannheim). Van Steenkiste, R., and S. Foresi (1999), “Arrow–Debreu prices for affine models”, Working Paper (Salomon Smith Barney, Inc., Goldman Sachs Asset Management). Vargiolu, T. (1999), “Invariant measures for the Musiela equation with deterministic diffusion term”, Finance and Stochastics 3:483−492. Vasicek, O. (1977), “An equilibrium characterization of the term structure”, Journal of Financial Economics 5:177−188. Werner, J. (1985), “Equilibrium in economies with incomplete financial markets”, Journal of Economic Theory 36:110−119. Whalley, A., and P. Wilmott (1997), “An asymptotic analysis of an optimal hedging model for options with transaction costs”, Mathematical Finance 7:307−324. Xu, G.-L., and S. 
Shreve (1992), “A duality method for optimal consumption and investment under short-selling prohibition. I. General market coefficients”, Annals of Applied Probability 2:87−112. Zhou, C.-S. (2000), “A jump-diffusion approach to modeling credit risk and valuing defaultable securities”, Working Paper (Federal Reserve Board, Washington, DC). Zhou, Y.-Q. (1997), “The global structure of equilibrium manifold in incomplete markets”, Journal of Mathematical Economics 27:91−111.
Chapter 12

TESTS OF MULTIFACTOR PRICING MODELS, VOLATILITY BOUNDS AND PORTFOLIO PERFORMANCE

WAYNE E. FERSON°

Carroll School of Management, Boston College

Contents

Abstract
Keywords
1. Introduction
2. Multifactor asset-pricing models: Review and integration
   2.1. The stochastic discount factor representation
   2.2. Expected risk premiums
   2.3. Return predictability
   2.4. Consumption-based asset-pricing models
   2.5. Multi-beta pricing models
      2.5.1. Relation to the stochastic discount factor
      2.5.2. Relation to mean-variance efficiency
      2.5.3. A large-markets interpretation
   2.6. Mean-variance efficiency with conditioning information
      2.6.1. Conditional versus unconditional efficiency
      2.6.2. Implications for tests
   2.7. Choosing the factors
3. Modern variance bounds
   3.1. The Hansen–Jagannathan bounds
   3.2. Variance bounds with conditioning information
      3.2.1. Efficient portfolio bounds
      3.2.2. Optimal bounds
      3.2.3. Discussion
   3.3. The Hansen–Jagannathan distance
4. Methodology and tests of multifactor asset-pricing models
   4.1. The Generalized Method of Moments approach

° The author acknowledges financial support from the Collins Chair in Finance at Boston College and the Pigott-PACCAR professorship at the University of Washington. He is also grateful to Geert Bekaert, John Cochrane, George Constantinides and Ludan Liu for helpful comments and suggestions.

Handbook of the Economics of Finance, Edited by G.M. Constantinides, M. Harris and R. Stulz
© 2003 Elsevier B.V. All rights reserved
   4.2. Cross-sectional regression methods
      4.2.1. The Fama–MacBeth approach
      4.2.2. Interpreting the estimates
      4.2.3. A caveat
      4.2.4. Errors-in-betas
   4.3. Multivariate regression and beta-pricing models
      4.3.1. Comparing the SDF and beta-pricing approaches
5. Conditional performance evaluation
      5.0.1. A numerical example
   5.1. Stochastic discount factor formulation
      5.1.1. Invariance to the number of funds
      5.1.2. Additional issues
   5.2. Beta-pricing formulation
   5.3. Using portfolio weights
      5.3.1. Conditional performance attribution
      5.3.2. Interim trading bias
   5.4. Conditional market-timing models
   5.5. Empirical evidence on conditional performance
6. Conclusions
References
Abstract

Three concepts: stochastic discount factors, multi-beta pricing and mean-variance efficiency, are at the core of modern empirical asset pricing. This chapter reviews these paradigms and the relations among them, concentrating on conditional asset-pricing models where lagged variables serve as instruments for publicly available information. The different paradigms are associated with different empirical methods. We review the variance bounds of Hansen and Jagannathan (1991), concentrating on extensions for conditioning information. Hansen's (1982) Generalized Method of Moments (GMM) is briefly reviewed as an organizing principle. Then, cross-sectional regression approaches as developed by Fama and MacBeth (1973) are reviewed and used to interpret empirical factors, such as those advocated by Fama and French (1993, 1996). Next, we review the multivariate regression approach, popularized in the finance literature by Gibbons (1982) and others. A regression approach, with a beta-pricing formulation, and a GMM approach with a stochastic discount factor formulation, may be considered competing paradigms for empirical work in asset pricing. This discussion clarifies the relations between the various approaches. Finally, we bring the models and methods together, with a review of the recent conditional performance evaluation literature, concentrating on mutual funds and pension funds.

Keywords

stochastic discount factor, performance evaluation, asset pricing, portfolio efficiency, volatility bounds, predicting returns

JEL classification: A23, C1, C31, C51, D91, E20, G10, G11, G12, G14, G23
1. Introduction

The asset-pricing models of modern finance describe the prices or expected rates of return of financial assets, which are claims traded in financial markets. Examples of financial assets are common stocks, bonds, options, futures and other "derivatives", so named because they derive their values from other, underlying assets. Asset-pricing models are based on two central concepts. The first is the no-arbitrage principle, which states that market forces tend to align prices so as to eliminate arbitrage opportunities. An arbitrage opportunity arises when assets can be combined in a portfolio with zero cost, no chance of a loss and a positive probability of a gain. In Chapter 10 of this volume, Dybvig and Ross describe this theory. The second central concept in asset pricing is financial market equilibrium. Investors' desired holdings of financial assets derive from an optimization problem. In equilibrium the first-order conditions of the optimization problem must be satisfied, and asset-pricing models follow from these conditions. When the agent considers the consequences of the investment decision for more than a single period in the future, intertemporal asset-pricing models result. These models are reviewed by Campbell in Chapter 13 of this volume, and by Duffie in Chapter 11. The present chapter reviews multi-factor asset-pricing models from an empiricist's perspective. Multi-factor models can be motivated by either the no-arbitrage principle or by an equilibrium model. Their distinguishing feature is that expected asset returns are determined by a linear combination of their covariances with variables representing the risk factors. This chapter has two main objectives. The first is to integrate the various empirical models and their tests in a self-contained discussion. The second is to review the application to the problem of measuring investment performance.
This chapter concentrates heavily on the role of conditioning information, in the form of lagged variables that serve as instruments for publicly available information. I think that developments in this area, conditional asset pricing, represent some of the most significant advances in empirical asset-pricing research in recent years. The models described in this chapter are set in the classical world of perfectly efficient financial markets, and perfectly rational economic agents. Of course, a great deal of research is devoted to understanding asset prices under market imperfections like information and transactions costs, as several chapters in this volume amply illustrate. For asset pricing emphasizing human imperfections (behavioral finance) see Chapter 18 of this volume by Barberis and Thaler. The perfect-markets models reviewed here represent a baseline, and a starting point for understanding these more complex issues. Work in empirical asset pricing over the last few years has provided a markedly improved understanding of the relations among the various asset-pricing models. Bits and pieces of this are scattered across a number of published papers, and some is “common” knowledge, shared by aficionados. This chapter provides an integrative discussion, refining the earlier review in Ferson (1995) to reflect what I hope is an improved understanding.
Much of our understanding of how asset-pricing models' empirical predictions are related flows from representing the models as stochastic discount factors. Section 2 presents the stochastic discount factor approach, briefly illustrates a few examples of stochastic discount factors, and then relates the representation to beta pricing and to mean-variance efficiency. These three concepts: stochastic discount factors, beta pricing and mean-variance efficiency, are at the core of modern empirical asset pricing. We show the relations among these three concepts, and a "large-markets" interpretation of these relations. The discussion then proceeds to refinements of these issues in the presence of conditioning information. Section 2 ends with a brief discussion of how the risk factors have been identified in the empirical literature, and what the empirical evidence has to say about the factors. Section 3 begins with a fundamental empirical application of the stochastic discount factor approach – the variance bounds originally developed by Hansen and Jagannathan (1991). Unlike the case where a model identifies a particular stochastic discount factor, the question in the Hansen–Jagannathan bounds is: Given a set of asset returns, and some conditioning information, what can be said about the set of stochastic discount factors that could properly "price" the assets? By now, a number of reviews of the original Hansen–Jagannathan bounds are available in the literature. The discussion here is brief, quickly moving on to focus on less well-known refinements of the bounds to incorporate conditioning information. Section 4 discusses empirical methods, starting with Hansen's (1982) Generalized Method of Moments (GMM). This important approach has also been the subject of several review articles and textbook chapters.
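The basic, unconditional Hansen–Jagannathan bound admits a compact numerical sketch: any SDF m with mean v = E(m) that prices a set of gross returns R, in the sense E(mR) = 1, must satisfy Var(m) ≥ (1 − vμ)′Σ⁻¹(1 − vμ), where μ and Σ are the mean vector and covariance matrix of R. The moments below are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical unconditional moments for two gross asset returns
mu = np.array([1.06, 1.02])             # E[R]
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.02]])        # Cov(R)

def hj_bound(v, mu, Sigma):
    """Minimum Var(m) over all SDFs with E[m] = v that satisfy E[m R] = 1."""
    e = np.ones_like(mu) - v * mu       # pricing errors of the constant SDF m = v
    return e @ np.linalg.solve(Sigma, e)

# Tracing the bound over candidate values of E[m] gives the HJ frontier
grid = np.linspace(0.90, 1.00, 5)
frontier = [hj_bound(v, mu, Sigma) for v in grid]
print(frontier)
```

A model-implied SDF whose mean and variance plot below this frontier cannot price these assets; that is the diagnostic use of the bounds discussed in Section 3.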
We briefly review the use of the GMM to estimate stochastic discount factor models. This section is included only to make the latter parts of the chapter accessible to a reader who is not already familiar with the GMM. Section 4 then discusses two special cases that remain important in empirical asset pricing. The first is the cross-sectional regression approach, as developed by Fama and MacBeth (1973), and the second is the multivariate regression approach, popularized in the finance literature following Gibbons (1982). Once the mainstay of empirical work on asset pricing, cross-sectional regression continues to be used and useful. Our main focus is on the economic interpretation of the estimates. The discussion attempts to shed light on recent studies that employ the empirical factors advocated by Fama and French (1993, 1996), or generalizations of that approach. The multivariate regression approach to testing portfolio efficiency can be motivated by its immunity to the errors-in-variables problem that plagues the two step, cross-sectional regression approach. The multivariate approach is also elegant, and provides a nice intuition for the statistical tests. A regression approach, with a beta pricing formulation, and a GMM approach with a stochastic discount factor formulation, may be considered as competing paradigms for empirical work in asset pricing. However, under the same distributional assumptions, and when the same moments are estimated, the two approaches are essentially equivalent. The present discussion attempts to clarify these points, and suggests how to think about the choice of empirical method.
Section 5 brings the models and methods together, in a review of the relatively recent literature on conditional performance evaluation. The problem of measuring the performance of managed portfolios has been the subject of research for more than 30 years. Traditional measures use unconditional expected returns, estimated by sample averages, as the baseline. However, if expected returns and risks vary over time, this may confuse common time-variation in fund risk and market risk premiums with average performance. In this way, traditional methods can ascribe abnormal performance to an investment strategy that trades mechanically, based only on public information. Conditional performance evaluation attempts to control these biases, while delivering potentially more powerful performance measures, by using lagged instruments to control for time-varying expectations. Section 5 reviews the main models for conditional performance evaluation, and includes a summary of the empirical evidence. Finally, Section 6 of this chapter offers concluding remarks.

2. Multifactor asset-pricing models: Review and integration

2.1. The stochastic discount factor representation

Virtually all asset-pricing models are special cases of the fundamental equation:

   P_t = E_t{m_{t+1} (P_{t+1} + D_{t+1})},   (1)

where P_t is the price of the asset at time t and D_{t+1} is the amount of any dividends, interest or other payments received at time t+1. The market-wide random variable m_{t+1} is the stochastic discount factor (SDF).¹ The prices are obtained by "discounting" the payoffs using the SDF, or multiplying by m_{t+1}, so that the expected "present value" of the payoff is equal to the price. The notation E_t{·} denotes the conditional expectation, given a market-wide information set, Ω_t. Since empiricists don't get to see Ω_t, it will be convenient to consider expectations conditioned on an observable subset of instruments, Z_t. These expectations are denoted as E(·|Z_t).
When Z_t is the null information set, we have the unconditional expectation, denoted as E(·). Empirical work on asset-pricing models like Equation (1) typically relies on rational expectations, interpreted as the assumption that the expectation terms in the model are mathematical conditional expectations. Taking the expected values of Equation (1), rational expectations implies that versions of Equation (1) must hold for the expectations E(·|Z_t) and E(·).

¹ The random variable m_{t+1} is also known as the pricing kernel, benchmark pricing variable, or intertemporal marginal rate of substitution, depending on the context. The representation (1) goes at least back to Beja (1971), while the term "stochastic discount factor" is usually ascribed to Hansen and Richard (1987).
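That versions of Equation (1) hold under E(·|Z_t) and E(·) is just the law of iterated expectations, which a small simulation can illustrate. The data-generating process below is hypothetical: conditional on an instrument Z_t, the riskless rate is known, the SDF is its inverse, and returns have conditional mean equal to that rate, so the conditional pricing error is mean-zero by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000

# Hypothetical lagged instrument, observable at time t
Z = rng.integers(0, 2, size=T)          # Z_t in {0, 1}
rf = np.where(Z == 1, 1.05, 1.01)       # conditionally known riskless gross rate

m = 1.0 / rf                            # SDF consistent with that rate
R = rf + 0.10 * rng.standard_normal(T)  # conditional mean of R is rf

u = m * R - 1.0                         # pricing error from Equation (1)
# Under rational expectations the error is unpredictable, so both the
# unconditional moment and the Z-scaled moment are approximately zero
print(u.mean(), (u * Z).mean())
```

The scaled moment E{(m_{t+1}R_{t+1} − 1)Z_t} = 0 is the form such conditional restrictions take in the GMM tests reviewed in Section 4.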
Assuming nonzero prices, Equation (1) is equivalent to:

   E(m_{t+1} R_{t+1} − 1 | Ω_t) = 0,   (2)

where R_{t+1} is the N-vector of primitive asset gross returns and 1 is an N-vector of ones. The gross return R_{i,t+1} is defined as (P_{i,t+1} + D_{i,t+1})/P_{i,t}. We say that a SDF "prices" the assets if Equations (1) and (2) are satisfied. Empirical tests of asset-pricing models often work directly with Equation (2) and the relevant definition of m_{t+1}. Without more structure, Equations (1) and (2) have no content because it is almost always possible to find a random variable m_{t+1} for which the equations hold. There will be some m_{t+1} that "works", in this sense, as long as there are no redundant asset returns.² With the restriction that m_{t+1} is a strictly positive random variable, Equation (1) becomes equivalent to the no-arbitrage principle, which says that all portfolios of assets with payoffs that can never be negative, but which are positive with positive probability, must have positive prices [Beja (1971), Rubinstein (1976), Ross (1977), Harrison and Kreps (1979), Hansen and Richard (1987)]. The no-arbitrage condition does not uniquely identify m_{t+1} unless markets are complete. In that case, m_{t+1} is equal to primitive state prices divided by state probabilities. To see this, write Equation (1) as P_{i,t} = E_t{m_{t+1} X_{i,t+1}}, where X_{i,t+1} = P_{i,t+1} + D_{i,t+1}. In a discrete-state setting,

   P_{i,t} = Σ_s p_s X_{i,s} = Σ_s q_s (p_s/q_s) X_{i,s},

where q_s is the probability that state s will occur and p_s is the state price, equal to the value at time t of one unit of the numeraire to be paid at time t+1 if state s occurs at time t+1. X_{i,s} is the total payoff of the security i at time t+1 if state s occurs. Comparing this expression with Equation (1) shows that m_s = p_s/q_s > 0 is the value of the SDF in state s.
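The complete-markets identification m_s = p_s/q_s can be checked directly. The two-state prices, probabilities and payoffs below are hypothetical, chosen only to make the algebra concrete:

```python
import numpy as np

# Hypothetical two-state economy: state prices p and state probabilities q
q = np.array([0.6, 0.4])      # probabilities q_s
p = np.array([0.55, 0.40])    # state prices p_s (value today of 1 unit in state s)
m = p / q                     # SDF in each state: m_s = p_s / q_s > 0

# Payoff of some security across the states (price plus dividend at t+1)
X = np.array([1.10, 0.95])

price_via_sdf = np.sum(q * m * X)            # E{m X}, as in Equation (1)
price_via_state_prices = np.sum(p * X)       # sum over states of p_s X_s
print(price_via_sdf, price_via_state_prices)
```

The two calculations coincide by construction, and strict positivity of m in every state is the discrete-state form of the no-arbitrage restriction.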
While the no-arbitrage principle places some restrictions on m_{t+1}, empirical work often explores the implications of equilibrium models for the SDF, based on investor optimization. Consider the Bellman equation for a representative consumer-investor's optimization:

   J(W_t, s_t) ≡ Max E_t{U(C_t, ·) + J(W_{t+1}, s_{t+1})},   (3)

where U(C_t, ·) is the direct utility of consumption expenditures at time t, and J(·) is the indirect utility of wealth. The notation allows the direct utility of current consumption expenditures to depend on variables such as past consumption expenditures or other state variables, s_t. The state variables are sufficient statistics, given wealth, for the utility of future wealth in an optimal consumption-investment plan. Thus, changes in the state variables represent future consumption-investment opportunity risk. The budget constraint is: W_{t+1} = (W_t − C_t) x′R_{t+1}, where x is the portfolio weight vector, subject to x′1 = 1.

² For example, take a sample of assets with a nonsingular second moment matrix and let m_{t+1} be [1′(E_t{R_{t+1}R′_{t+1}})⁻¹] R_{t+1}.
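The construction in footnote 2 can be verified numerically: with a nonsingular second-moment matrix, the candidate m_{t+1} = [1′(E{R_{t+1}R′_{t+1}})⁻¹]R_{t+1} prices every asset by construction. The return draws below are hypothetical, with sample moments standing in for the expectations:

```python
import numpy as np

rng = np.random.default_rng(4)
T, N = 5000, 3

# Hypothetical correlated gross-return draws (nonsingular second moments)
C = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
R = 1.03 + 0.1 * rng.standard_normal((T, N)) @ np.linalg.cholesky(C)

A = (R.T @ R) / T                 # sample second-moment matrix E{R R'}
w = np.linalg.solve(A, np.ones(N))
m = R @ w                         # m = [1'(E{R R'})^{-1}] R, as in footnote 2

# This m "prices" every asset in sample: the mean of m*R_i equals 1 for each i
print((R * m[:, None]).mean(axis=0))
```

This is why Equations (1) and (2) have no content without further structure: some such m always exists mechanically.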
If the allocation of resources to consumption and investment assets is optimal, it is not possible to obtain higher utility by changing the allocation. Suppose an investor considers reducing consumption at time t to purchase more of (any) asset. The expected utility cost at time t of the foregone consumption is the expected marginal utility of consumption expenditures, U_c(C_t, ·) > 0 (where a subscript denotes partial derivative), multiplied by the price P_{i,t} of the asset, measured in the numeraire unit. The expected utility gain of selling the investment asset and consuming the proceeds at time t+1 is E_t{(P_{i,t+1} + D_{i,t+1}) J_w(W_{t+1}, s_{t+1})}. If the allocation maximizes expected utility, the following must hold: P_{i,t} E_t{U_c(C_t, ·)} = E_t{(P_{i,t+1} + D_{i,t+1}) J_w(W_{t+1}, s_{t+1})}, which is equivalent to Equation (1), with

   m_{t+1} = J_w(W_{t+1}, s_{t+1}) / E_t{U_c(C_t, ·)}.   (4)

The m_{t+1} in Equation (4) is the intertemporal marginal rate of substitution (IMRS) of the consumer-investor, and Equations (2) and (4) combined are the intertemporal Euler equation. Asset-pricing models typically focus on the relation of security returns to aggregate quantities. To get there, it is necessary to aggregate the Euler equations of individuals to obtain equilibrium expressions in terms of aggregate quantities. Theoretical conditions which justify the use of aggregate quantities are discussed by Wilson (1968), Rubinstein (1974) and Constantinides (1982), among others. Some recent empirical work does not assume aggregation, but relies on panels of disaggregated data. Examples include Zeldes (1989), Brav, Constantinides and Geczy (2002) and Balduzzi and Yao (2001). Multiple-factor models for asset pricing follow when m_{t+1} can be written as a function of several factors.
Equation (4) suggests that likely candidates for the factors are variables that proxy for consumer wealth, consumption expenditures or the state variables – the variables that determine the marginal utility of future wealth in an optimal consumption-investment plan.

2.2. Expected risk premiums

Typically, empirical work focuses on expressions for expected returns and excess rates of return. Expected excess returns are related to the risk factors that create variation in m_{t+1}. Consider any asset return R_{i,t+1} and a reference asset return, R_{0,t+1}. Define the excess return of asset i, relative to the reference asset, as r_{i,t+1} = R_{i,t+1} − R_{0,t+1}. If Equation (2) holds for both assets it implies:

   E_t{m_{t+1} r_{i,t+1}} = 0 for all i.   (5)

Use the definition of covariance to expand Equation (5) into the product of expectations plus the covariance, obtaining:

   E_t{r_{i,t+1}} = Cov_t(r_{i,t+1}; −m_{t+1}) / E_t{m_{t+1}}, for all i,   (6)
where Cov_t(·;·) is the conditional covariance. Equation (6) is a general expression for the expected excess return, from which most of the expressions in the literature can be derived. The conditional covariance of return with the SDF, m_{t+1}, is a very general measure of systematic risk. Asset-pricing models say that assets earn expected return premiums for their systematic risk, not their total risk (i.e., variance of return). The covariance with −m_{t+1} is systematic risk because it measures the component of the return that contributes to fluctuations in the marginal utility of wealth. If we regressed the asset return on the SDF, the residual in the regression would capture the "unsystematic" risk and would not be "priced", or command a risk premium. If the conditional covariance with the SDF is zero for a particular asset, the expected excess return of that asset should be zero.³ The more negative the covariance with m_{t+1}, the less desirable is the distribution of the random return, as the larger payoffs tend to occur when the marginal utility is low. The expected compensation for holding assets with this feature must be higher than for those with a more desirable distribution. Expected risk premiums should therefore differ across assets in proportion to their conditional covariances with −m_{t+1}.

2.3. Return predictability

Rational expectations implies that the difference between return realizations and the expectations in the model should be unrelated to the information that the expectations in the model are conditioned on. For example, Equation (2) says that the conditional expectation of the product of m_{t+1} and R_{i,t+1} is the constant, 1.0. Therefore, 1 − m_{t+1}R_{i,t+1} should not be predictably different from zero using any information available at time t.
If we run a regression of 1 − m_{t+1}R_{i,t+1} on any lagged variable, Z_t, the regression coefficients should be zero. If there is predictability in a return R_{i,t+1} using instruments Z_t, the model implies that the predictability is removed when R_{i,t+1} is multiplied by the correct m_{t+1}. This is the sense in which conditional asset-pricing models are asked to "explain" predictable variation in asset returns. This view generalizes the older "random walk" model of stock values, which states that stock returns should be completely unpredictable. That model is a special case which can be motivated by risk neutrality. Under risk neutrality the IMRS, m_{t+1}, is a constant. Therefore, in this case the model implies that the return R_{i,t+1} should not differ predictably from a constant. Conditional asset pricing presumes the existence of some return predictability. There should be instruments Z_t for which E(R_{t+1}|Z_t) or E(m_{t+1}|Z_t) vary over time, in order

³ Equation (6) is weaker than Equation (2), since Equation (6) is equivalent to E_t{m_{t+1}R_{i,t+1}} = Δ_t, all i, where Δ_t is a constant across assets, while Equation (2) restricts Δ_t = 1. Therefore, empirical tests based on Equation (6) do not exploit all of the restrictions implied by a model that may be stated in the form of Equation (2).
for the equation E(m_{t+1}R_{t+1} − 1|Z_t) = 0 to have empirical bite.⁴ Interest in predicting security-market returns is about as old as the security markets themselves. Fama (1970) reviews the early evidence and Schwert, in Chapter 15 of this volume, reviews "anomalies" based on predictability. One body of literature uses lagged returns to predict future stock returns, attempting to exploit serial dependence. High-frequency serial dependence, such as daily or intra-day patterns, is often considered to represent the effects of market microstructure, such as bid–ask spreads [e.g., Roll (1984)] and nonsynchronous trading of the stocks in an index [e.g., Scholes and Williams (1977)]. Serial dependence at longer horizons may represent predictable changes in the expected returns. Conrad and Kaul (1989) report serial dependence in weekly returns. Jegadeesh and Titman (1993) find that relatively high-return, "winner" stocks tend to repeat their performance over three- to nine-month horizons. DeBondt and Thaler (1985) find that past high-return stocks perform poorly over the next five years, and Fama and French (1988) find negative serial dependence over two- to five-year horizons. These serial-dependence patterns motivate a large number of studies which attempt to assess the economic magnitude and statistical robustness of the implied predictability, or to explain the predictability as an economic phenomenon. For more comprehensive reviews, see Campbell, Lo and MacKinlay (1997) or Kaul (1996). Research in this area continues, and it's fair to say that the jury is still out on the issue of predictability using lagged returns. A second body of literature studies predictability using other lagged variables as instruments. Fama and French (1989) assemble a list of variables from studies in the early 1980s that, as of this writing, remain the workhorse instruments for conditional asset-pricing models.
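The central idea of Section 2.3 – returns may be predictable with Z_t, yet the scaled error 1 − m_{t+1}R_{t+1} should not be – can be sketched with simulated data. The setup below is hypothetical: the instrument moves the conditionally expected return, and the SDF is its inverse, as would hold for a conditionally riskless payoff.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50_000

# Hypothetical lagged instrument (think of a demeaned dividend yield)
Z = rng.standard_normal(T)
mu_R = 1.02 + 0.02 * np.tanh(Z)               # conditionally expected gross return
R = mu_R + 0.15 * rng.standard_normal(T)      # realized return, predictable via Z
m = 1.0 / mu_R                                # a correct SDF for this structure

X = np.column_stack([np.ones(T), Z])

# Returns themselves are predictable with the instrument ...
slope_R = np.linalg.lstsq(X, R, rcond=None)[0][1]

# ... but the model error 1 - m*R should not be
u = 1.0 - m * R
slope_u = np.linalg.lstsq(X, u, rcond=None)[0][1]
print(slope_R, slope_u)
```

The first slope is reliably positive while the second is statistically indistinguishable from zero; a wrong SDF would leave predictability in u, which is what conditional asset-pricing tests look for.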
These variables include the lagged dividend yield of a stock market index, a yield spread of long-term government bonds relative to short-term bonds, and a yield spread of low-grade (high default risk) corporate bonds over high-grade bonds. In addition, studies often include the level of a short-term interest rate [Fama and Schwert (1977), Ferson (1989)] and the lagged excess return of a medium-term over a short-term Treasury bill [Campbell (1987), Ferson and Harvey (1991)]. Recently proposed instruments include an aggregate book-to-market ratio [Pontiff and Schall (1998)] and lagged consumption-to-wealth ratios [Lettau and Ludvigson (2001)]. Of course, many other predictor variables have been proposed and more will doubtless be proposed in the future. Predictability using lagged instruments remains controversial, and there are some good reasons to question the predictability. Studies have identified various statistical biases in predictive regressions [e.g., Hansen and Hodrick (1980), Stambaugh (1999),

⁴ At one level this is easy. Since E(m_{t+1}|Z_t) should be the inverse of a risk-free return, all we need is observable risk-free rates that vary over time. Ferson (1989) shows that the behavior of stock returns and short-term interest rates imply that conditional covariances of returns with m_{t+1} must also vary over time.
Ang and Bekaert (2001), Ferson, Sarkissian and Simin (2002)], questioned the stability of the predictive relations across economic regimes [e.g., Kim, Nelson and Startz (1991)] and raised the possibility that the lagged instruments arise solely through data mining [e.g., Lo and MacKinlay (1990), Foster, Smith and Whaley (1997)]. A reasonable response to these concerns is to see if the predictive relations hold out-of-sample. This kind of evidence is also mixed. Some studies find support for predictability in step-ahead or out-of-sample exercises [e.g., Fama and French (1989), Pesaran and Timmermann (1995)]. Similar instruments show some ability to predict returns outside the context of the USA, where they arose [e.g., Harvey (1991), Solnik (1993), Ferson and Harvey (1993, 1999)]. However, other studies conclude that predictability using the standard lagged instruments does not hold [e.g., Goyal and Welch (1999), Simin (2002)]. It seems that research on the predictability of security returns will always be interesting, and conditional asset-pricing models should be useful in framing many future investigations of these issues.

2.4. Consumption-based asset-pricing models

In these models the economic agent maximizes a lifetime utility function of consumption (including possibly a bequest to heirs). Consumption models may be derived from Equation (4) by exploiting the envelope condition, U_c(·) = J_w(·), which states that the marginal utility of consumption must be equal to the marginal utility of wealth if the consumer has optimized the tradeoff between the amount consumed and the amount invested. Breeden (1979) derived a consumption-based asset-pricing model in continuous time, assuming that the preferences are time-additive.
The utility function for the lifetime stream of consumption is Σ_t β^t U(C_t), where β is a time preference parameter and U(·) is increasing and concave in current consumption, C_t. Breeden's model is a linearization of Equation (2) which follows from the assumption that asset values and consumption follow diffusion processes [Bhattacharya (1981), Grossman and Shiller (1982)]. A discrete-time version follows Rubinstein (1976) and Lucas (1978), assuming a power utility function:

   U(C) = (C^{1−α} − 1) / (1 − α),   (7)

where α > 0 is the concavity parameter. This function displays constant relative risk aversion⁵ equal to α. Using Equation (7) and the envelope condition, the IMRS in Equation (4) becomes:

   m_{t+1} = β (C_{t+1}/C_t)^{−α}.   (8)

⁵ Relative risk aversion in consumption is defined as −C u″(C)/u′(C). Absolute risk aversion is −u″(C)/u′(C). Ferson (1983) studies a consumption-based asset-pricing model with constant absolute risk aversion.
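Equation (8) is easy to put to work numerically. The calibration below is hypothetical (lognormal consumption growth, β = 0.98, α = 3): the simulated SDF prices a riskless payoff through E{m}R_f = 1, and prices a claim on consumption growth through Equation (1), which then earns a premium over the implied riskless rate because its payoff covaries negatively with marginal utility.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500_000

# Hypothetical calibration of the power-utility SDF m = beta * g**(-alpha)
beta, alpha = 0.98, 3.0
g = np.exp(0.02 + 0.03 * rng.standard_normal(n))   # consumption growth C_{t+1}/C_t
m = beta * g ** (-alpha)

# Implied riskless rate: E{m} * Rf = 1 (Equation 2 for a constant return)
Rf = 1.0 / m.mean()

# A claim paying consumption growth: price from Equation (1), P = E{m g}
P = np.mean(m * g)
expected_return = g.mean() / P                     # E{payoff} / price

print(Rf, expected_return - Rf)
```

Because m is decreasing in g, Cov(m, g) < 0, so the consumption claim's expected return exceeds R_f; raising α widens this premium, the channel stressed in the consumption-based literature cited above.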
A large literature has tested the pricing Equation (1), with the SDF given by the consumption model (8), and generalizations of that model.⁶

2.5. Multi-beta pricing models

The vast majority of the empirical work on asset-pricing models involves expressions for expected returns, stated in terms of beta coefficients relative to one or more portfolios or factors. The beta is the regression coefficient of the asset return on the factor. Multi-beta models have more than one risk factor and more than one beta for each asset. The Arbitrage Pricing Theory (APT) leads to approximate expressions for expected returns with multiple beta coefficients. Models based on investor optimization and equilibrium lead to exact expressions.⁷ Both of these approaches lead to models with the following form:

   E_t{R_{i,t+1}} = λ_{0t} + Σ_{j=1}^{K} β_{ijt} λ_{jt}, for all i.   (9)

The β_{i1t}, …, β_{iKt} are the time-t betas of asset i relative to the K risk factors F_{j,t+1}, j = 1, …, K. These betas are the conditional multiple-regression coefficients of the assets on the factors. The λ_{j,t}, j = 1, …, K are the factor risk premiums, which represent increments to the expected return per unit of type-j beta. These premiums do not depend on the specific security i. λ_{0,t} is the expected zero-beta rate. This is the expected return of any security that is uncorrelated with each of the K factors in the model (i.e., β_{0jt} = 0, j = 1, …, K). If there is a risk-free asset, then λ_{0,t} is the return of this asset.

2.5.1. Relation to the stochastic discount factor

We first show how a multi-beta model can be derived as a special case of the SDF representation, when the factors capture the relevant systematic risks. We take this to mean that the error terms, u_{i,t+1}, in a regression of returns on the factors are not "priced"; that is, they are uncorrelated with m_{t+1}: Cov_t(u_{i,t+1}, m_{t+1}) = 0. We then state the general equivalence between the two representations.
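The equivalence asserted here can be verified with explicit moments. The two-factor numbers below are hypothetical: the SDF is linear in the factors, each asset's expected return is constructed so that E{m_{t+1}R_{i,t+1}} = 1 holds exactly, and the beta-pricing form of Equation (9), with zero-beta rate 1/E{m} and premiums λ_j = Cov(F_j, −m)/E{m}, recovers those expected returns without error.

```python
import numpy as np

# Hypothetical two-factor setup: m = c0 + c'(F - E[F]), with known moments
c0 = 0.96                                 # E{m}
c = np.array([-2.0, -1.0])                # loadings of the SDF on the factors
SigmaF = np.array([[0.010, 0.002],
                   [0.002, 0.005]])       # Cov(F)

# Multi-beta quantities implied by the SDF
lam0 = 1.0 / c0                           # zero-beta rate, 1/E{m}
lam = -(SigmaF @ c) / c0                  # lambda_j = Cov(F_j, -m) / E{m}

# Conditional betas of three assets on the two factors
B = np.array([[1.0, 0.0],
              [0.0, 1.5],
              [0.8, 0.7]])

# Expected returns chosen so each asset satisfies E{m R} = 1 exactly
# (with regression residuals uncorrelated with m, they drop out)
ER = (1.0 - B @ SigmaF @ c) / c0

# The beta-pricing representation of Equation (9) recovers the same values
ER_beta = lam0 + B @ lam
print(ER, ER_beta)
```

With negative SDF loadings the factor premiums come out positive, matching the sign logic of Equation (6): factors that covary negatively with marginal utility command positive premiums.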
^6 An important generalization allows for nonseparabilities in the U_c(C_t, ·) function in Equation (4), as may be implied by the durability of consumer goods, habit persistence in the preferences for consumption over time, or nonseparability of preferences across states of nature. Singleton (1990), Ferson (1995) and Campbell, in Chapter 13 of this volume, review this literature.
^7 The multiple-beta equilibrium model was developed in continuous time by Merton (1973), Breeden (1979) and Cox, Ingersoll and Ross (1985). Long (1974), Sharpe (1977), Cragg and Malkiel (1982), Connor (1984), Dybvig (1983), Grinblatt and Titman (1983) and Shanken (1987) provide multi-beta interpretations of equilibrium models in discrete time.

This equivalence was
Ch. 12: Tests of Multifactor Pricing Models, Volatility Bounds and Portfolio Performance  755

first discussed, for the case of a single-factor model, by Dybvig and Ingersoll (1982). The general, multi-factor case follows from Ferson and Jagannathan (1996). Let R_{0,t+1} be a zero-beta portfolio, and λ_{0,t} the expected return on the zero-beta portfolio. Equation (6) implies:

E_t(R_{i,t+1}) = λ_{0,t} + Cov_t(R_{i,t+1}, −m_{t+1}) / E_t(m_{t+1}).     (10)

Substituting the regression model R_{i,t+1} = a_i + Σ_j b_{ijt} F_{j,t+1} + u_{i,t+1} into the right-hand side of (10) and assuming that Cov_t(u_{i,t+1}, m_{t+1}) = 0 implies:

E_t(R_{i,t+1}) = λ_{0,t} + Σ_{j=1}^{K} b_{ijt} [Cov_t(F_{j,t+1}, −m_{t+1}) / E_t(m_{t+1})].     (11)

The risk premium per unit of type-j beta is λ_{j,t} = Cov_t(F_{j,t+1}, −m_{t+1}) / E_t(m_{t+1}). In the special case where the factor F_{j,t+1} is a traded asset return, Equation (11) implies that λ_{j,t} = E_t(F_{j,t+1}) − λ_{0,t}; the expected risk premium equals the factor portfolio's expected excess return.

Equation (11) is useful because it provides intuition about the signs and magnitudes of expected risk premiums for particular factors. The intuition is the same as in Equation (6) above. If a risk factor F_{j,t+1} is negatively correlated with m_{t+1}, the model implies that a positive risk premium is associated with that factor's beta. A factor that is positively related to marginal utility should carry a negative premium, because its big payoffs come when the value of payoffs is high. This implies a high present value and a low expected return. Expected risk premiums for a factor should also change over time if the conditional covariances of the factor with the scaled marginal utility m_{t+1}/E_t(m_{t+1}) vary over time.

The steps that take us from Equation (6) to Equation (11) can be reversed, so the SDF and multi-beta representations are, in fact, equivalent. The formal statement is:

Lemma 1 [Ferson and Jagannathan (1996)].
The stochastic discount factor representation (2) and the multi-beta model (9) are equivalent, where

m_{t+1} = c_{0t} + c_{1t} F_{1,t+1} + · · · + c_{Kt} F_{K,t+1},     (12)

with

c_{0t} = [1 + Σ_k λ_{k,t} E_t(F_{k,t+1}) / Var_t(F_{k,t+1})] / λ_{0,t},

and

c_{jt} = −λ_{j,t} / [λ_{0,t} Var_t(F_{j,t+1})],  j = 1, . . . , K.
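As a numerical check under hypothetical conditional moments (a single traded factor, so λ_1 = E(F) − λ_0 by the remark following Equation (11)), the coefficients of Equation (12) deliver an m that prices both the factor and the zero-beta rate:

```python
import numpy as np

# Hypothetical conditional moments for one traded factor F (a gross return).
EF, VarF = 1.08, 0.04        # E_t(F), Var_t(F)
lam0 = 1.02                  # zero-beta (gross) rate
lam1 = EF - lam0             # premium for a traded factor: lambda_1 = E(F) - lambda_0

# Coefficients of m = c0 + c1*F from Equation (12), single-factor case.
c0 = (1.0 + lam1 * EF / VarF) / lam0
c1 = -lam1 / (lam0 * VarF)

# Pricing checks, using E(mX) = c0*E(X) + c1*E(F*X):
E_m = c0 + c1 * EF                        # should equal 1/lambda_0
E_mF = c0 * EF + c1 * (VarF + EF ** 2)    # should equal 1, since F is a return
```

E(m) = 1/λ_0 prices the zero-beta rate and E(mF) = 1 prices the traded factor, which is the content of the equivalence in Lemma 1 for this simple case.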
For a proof, see Ferson and Jagannathan (1996).

If the factors are not traded asset returns, then it is typically necessary to estimate the expected risk premiums for the factors, λ_{k,t}. These may be identified as the conditional expected excess returns on factor-mimicking portfolios. A factor-mimicking portfolio is defined as a portfolio whose return can be used in place of a factor in the model. There are several ways to obtain mimicking portfolios, as described in more detail below.^8

2.5.2. Relation to mean-variance efficiency

The concept of a minimum-variance portfolio is central in the asset-pricing literature. A portfolio R_{p,t+1} is a minimum-variance portfolio if no portfolio with the same expected return has a smaller variance. Roll (1977) and others have shown that the portfolio R_{p,t+1} is a minimum-variance portfolio if and only if a beta-pricing model holds:^9

E_t{R_{i,t+1} − R_{pz,t+1}} = b_{ipt} E_t{R_{p,t+1} − R_{pz,t+1}},  all i,     (13)

where b_{ipt} = Cov_t(R_{i,t+1}, R_{p,t+1}) / Var_t(R_{p,t+1}).

In Equation (13), b_{ipt} is the conditional beta of R_{i,t+1} relative to R_{p,t+1}, and R_{pz,t+1} is a zero-beta asset relative to R_{p,t+1}. A zero-beta asset satisfies Cov_t(R_{pz,t+1}, R_{p,t+1}) = 0. Equation (13) is essentially a restatement of the first-order condition for the optimization problem that defines a minimum-variance portfolio.

Equation (13) first appeared as an asset-pricing model in the famous Capital Asset Pricing Model (CAPM) of Sharpe (1964), Lintner (1965) and Black (1972). The CAPM is equivalent to the statement that the market portfolio R_{m,t+1} is mean-variance efficient. The market portfolio is the portfolio of all marketed assets, weighted according to their relative total values. The portfolio is mean-variance efficient if it satisfies Equation (13) and also E_t(R_{m,t+1} − R_{mz,t+1}) > 0.
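One direction of Roll's result can be illustrated numerically (all moments below are hypothetical): solve for a minimum-variance portfolio with a given mean and verify that Equation (13) holds exactly for every asset:

```python
import numpy as np

# Hypothetical means and covariance matrix for three assets.
mu = np.array([1.05, 1.08, 1.11])
Sigma = np.array([[0.040, 0.006, 0.004],
                  [0.006, 0.090, 0.010],
                  [0.004, 0.010, 0.160]])

# Minimum-variance portfolio with target mean m0:
#   min w'Sigma w  s.t.  w'mu = m0, w'1 = 1.
# The first-order conditions give the linear system solved below.
m0, ones = 1.08, np.ones(3)
A = np.block([[2 * Sigma, mu[:, None], ones[:, None]],
              [mu[None, :], np.zeros((1, 2))],
              [ones[None, :], np.zeros((1, 2))]])
w = np.linalg.solve(A, np.array([0.0, 0.0, 0.0, m0, 1.0]))[:3]

# Betas on the frontier portfolio and its zero-beta rate.
cov_ip = Sigma @ w                 # Cov(R_i, R_p) for each asset i
var_p = w @ Sigma @ w
beta = cov_ip / var_p
# The FOC makes mu exactly affine in cov_ip; recover the intercept (zero-beta rate).
slope = (mu[0] - mu[1]) / (cov_ip[0] - cov_ip[1])
lam0 = mu[0] - slope * cov_ip[0]

# Equation (13): E(R_i) - lam0 = beta_i * (E(R_p) - lam0), for all i.
lhs = mu - lam0
rhs = beta * (m0 - lam0)
```

Because the frontier first-order condition is Σw = aμ + b·1 for scalars a and b, expected returns are exactly affine in covariances with the frontier portfolio, which is why the beta-pricing relation holds with no approximation here.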
When the factors are traded assets like a market portfolio, or when mimicking portfolios are used, the multi-beta model in Equation (9) is equivalent to the statement that a combination of the factor portfolios is minimum-variance efficient.^10 Therefore,

^8 Breeden (1979, footnote 7) derives maximum-correlation mimicking portfolios. Grinblatt and Titman (1987), Shanken (1987), Lehmann and Modest (1988) and Huberman, Kandel and Stambaugh (1987) provide further characterizations of mimicking portfolios when there is no conditioning information. Ferson and Siegel (2002b) and Ferson, Siegel and Xu (2002) consider cases where there is conditioning information.
^9 It is assumed that the portfolio R_{p,t+1} is not the global minimum-variance portfolio, that is, the portfolio with the smallest variance over all levels of expected return. This is because the betas of all assets on the global minimum-variance portfolio are identical.
^10 This result is proved by Grinblatt and Titman (1987), Shanken (1987) and Huberman, Kandel and Stambaugh (1987), and reviewed by Ferson (1995).
multiple-beta asset-pricing models like Equation (9) always imply that combinations of particular portfolios are minimum-variance efficient. This correspondence is exploited by Gibbons, Ross and Shanken (1989) and Kandel and Stambaugh (1989), among others, to develop tests of multi-beta models based on mean-variance efficiency. Such tests are discussed in Section 4 below.

Since the SDF representation in Equation (2) is equivalent to a multi-beta expression for expected returns, and a multi-beta model is equivalent to a statement about minimum-variance efficiency, it follows that the SDF representation is equivalent to a statement about minimum-variance efficiency. Let's now complete the loop.

Lemma 2. A portfolio which maximizes squared (conditional) correlation with m_{t+1} in Equation (2) is a minimum (conditional) variance portfolio.

Proof: Consider the conditional projection of m_{t+1} on the vector of returns R_{t+1}. The coefficient vector is w_t ≡ [E_t(R_{t+1} R_{t+1}′)]^{−1} E_t(R_{t+1} m_{t+1}) = [E_t(R_{t+1} R_{t+1}′)]^{−1} 1. Define the portfolio return R_{p,t+1} = [w_t/(w_t′1)]′ R_{t+1}. The portfolio maximizes the squared conditional correlation with m_{t+1}. The regression of m_{t+1} on the vector of returns can be written as m_{t+1} = (w_t′1) R_{p,t+1} + e_{t+1}. The error term e_{t+1} is conditionally uncorrelated with R_{i,t+1} for all i, and therefore with R_{p,t+1}.^11 Substituting for m_{t+1} in Equation (6) from this regression produces:

E_t{r_{i,t+1}} = [Cov_t(r_{i,t+1}, R_{p,t+1}) / Cov_t(r_{p,t+1}, R_{p,t+1})] E_t{r_{p,t+1}},  all i,     (14)

where r_{i,t+1} and r_{p,t+1} are excess returns. If the reference asset for the excess returns is taken to be R_{pz,t+1}, a zero-beta asset for R_{p,t+1}, then Equation (13) follows directly from Equation (14). Equation (13) implies that R_{p,t+1} is a conditional minimum-variance portfolio.
We have seen that exact multi-beta pricing is equivalent to the statement that E(m R_i) = 1 for all i, under the assumption that m is a linear function of the factors, and also equivalent to the statement that a portfolio of the factors is a minimum-variance-efficient portfolio. Thus, we have equivalence among the three paradigms: exact multi-beta pricing, stochastic discount factors, and mean-variance efficiency.

2.5.3. A large-markets interpretation

This section describes how the three paradigms of empirical asset pricing work in the large markets of the Arbitrage Pricing Theory (APT) of Ross (1976), as refined

^11 The fitted values of the regression will have the same pricing implications as m_{t+1}. That is, m*_{t+1} = (w_t′1) R_{p,t+1} can replace m_{t+1} in Equation (1). Note that when the covariance matrix of asset returns is nonsingular, m*_{t+1} is the unique SDF (i.e., satisfies Equation (1)) which is also an asset return. An SDF which satisfies Equation (1) is not in general an asset return, nor is it unique, unless markets are complete. If markets are complete, m*_{t+1} is perfectly correlated with m_{t+1} [Hansen and Richard (1987)].
by Chamberlain (1983) and Chamberlain and Rothschild (1983). For this purpose, we ignore the existence of any “conditioning information” and suppress the time subscripts and related notation. [For arbitrage pricing relations with conditioning information, see Stambaugh (1983).] Assume that the following data-generating model describes equity returns in excess of a risk-free asset:

r_i = E(r_i) + b_i′ f + e_i,     (15)

where E( f ) = 0 = E(e_i f ) for all i, and f_t = F_t − E(F_t) are the unexpected factor returns. We can normalize the factors to have the identity as their covariance matrix; the b_i absorb the normalization. The N × N covariance matrix of the asset returns can then be expressed as:

Cov(r) ≡ Σ = BB′ + V,     (16)

where V is the covariance matrix of the residual vector e, B is the N × K matrix of the b_i, and Σ is assumed to be nonsingular for all N. The factor model assumes that the eigenvalues of V are bounded as N → ∞, while the K nonzero eigenvalues of BB′ become infinite as N → ∞. Thus, the covariance matrix Σ has K unbounded and N − K bounded eigenvalues as N becomes large. This is called an “approximate factor structure”, to distinguish it from an “exact” factor structure, where V is assumed to be diagonal.

The factor model in Equation (16) decomposes the variances of returns into “pervasive” and “nonsystematic” risks. If x is an N-vector of portfolio weights, the portfolio variance is x′Σx, where λ_max(Σ) x′x ≥ x′Σx ≥ λ_min(Σ) x′x, λ_min(Σ) being the smallest eigenvalue of Σ and λ_max(Σ) the largest. Following Chamberlain (1983), a portfolio is “well diversified” iff x′x → 0 as N grows without bound. For example, an equally weighted portfolio is well diversified; in this case x′x = (1/N) → 0. The bounded eigenvalues of V imply that V captures the component of portfolio risk that is not pervasive or systematic, in the sense that this part of the variance vanishes in a well-diversified portfolio.
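A quick simulation with invented parameters illustrates the approximate factor structure: when Σ = BB′ + V with V = σ²I, Σ has exactly K eigenvalues that grow with N while the remaining N − K equal σ²:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, sigma2 = 200, 3, 0.1        # hypothetical market size, factors, residual variance

B = rng.standard_normal((N, K))    # N x K factor loadings with O(1) entries
Sigma = B @ B.T + sigma2 * np.eye(N)   # approximate factor structure with V = sigma2 * I

eigs = np.sort(np.linalg.eigvalsh(Sigma))[::-1]
# K "pervasive" eigenvalues of order N; the remaining N - K are bounded
# (here exactly sigma2, since BB' has rank K and commutes with the identity).
pervasive, bounded = eigs[:K], eigs[K:]

# An equally weighted portfolio is well diversified: x'x = 1/N -> 0,
# so its nonsystematic variance x'Vx = sigma2/N vanishes as N grows.
x = np.full(N, 1.0 / N)
idio_var = x @ (sigma2 * np.eye(N)) @ x
```

Re-running with larger N shows the top K eigenvalues growing roughly linearly in N while the rest stay at σ², which is the eigenvalue separation the text describes.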
The exploding eigenvalues of BB′ imply that the common factor risks are pervasive, in the sense that they remain in a large, well-diversified portfolio.

The arbitrage pricing theory of Ross (1976) asserts that α′α < ∞ as N grows without bound, where α is the N-vector of “alphas”, or expected abnormal returns, measured as the differences between the left- and right-hand sides of Equation (9), using the APT factors in the multi-beta model. The alphas are the differences between the assets' expected returns and the returns predicted by the multi-beta model, sometimes called the “pricing errors”. The Ross APT implies that the multi-beta model's pricing errors are “small”, on average, in a large market. If α′α < ∞ as N grows, then the cross-asset average of the squared pricing errors, (α′α)/N, must go to zero as N grows.
To see how the approximate beta pricing of the APT relates to the other paradigms of empirical asset pricing, we first describe how the pricing errors in the beta-pricing model are related to those of a stochastic discount factor representation. If we define α_m = E(mR) − 1, where m is linear in the APT factors, then it follows from Equations (10) and (11) that α_m = E(m)α; the beta-pricing and stochastic discount factor alphas are proportional, where the risk-free rate determines the constant of proportionality. Provided that the risk-free rate is bounded above −100%, E(m) is bounded, and α′α is bounded above if and only if α_m′α_m is bounded above. Thus, the Ross APT has the same implications for the pricing errors in the stochastic discount factor and beta-pricing paradigms.

The third paradigm is mean-variance efficiency. We know that a combination of the APT factors is minimum-variance efficient if and only if α = 0. Thus, under the Ross APT a combination of the factors is not minimum-variance efficient. However, an upper bound on α′α implies a lower bound on the correlation between a minimum-variance combination of the factors and a minimum-variance-efficient portfolio.

To see how the Ross APT restricts the correlation between the factors and a minimum-variance-efficient portfolio, we need two facts. The first is the “law of conservation of squared Sharpe ratios”, developed as Equation (52) below (p. 782). Here we state the law as S²(r) = α′Σ^{−1}α + S²( f ), where S²( f ) is the maximum squared Sharpe ratio that can be obtained by a portfolio of the factors,^12 S²(r) is the squared Sharpe ratio of a minimum-variance-efficient portfolio using all of the assets, and Σ is the covariance matrix of the assets' excess returns.
The second fact, which follows from Equation (13), describes the correlation, ρ, between the minimum-variance-efficient portfolio of the factors and the minimum-variance-efficient portfolio that uses all of the assets: ρ = S( f )/S(r). Combining these results,

S²(r) − S²( f ) = α′Σ^{−1}α ≤ α′α λ_max(Σ^{−1}) = α′α/λ_min(Σ).

Substituting for S(r) in terms of ρ and S( f ), we arrive at:

[1/ρ² − 1] ≤ α′α / [λ_min(Σ) S²( f )].

Thus, an upper bound on α′α places a lower bound on the squared correlation between the minimum-variance factor portfolio and a minimum-variance-efficient portfolio of the assets.

The “exact” version of the APT asserts that α′α → 0 as N grows without bound; thus, the pricing errors of all assets go to zero as the market gets large. This version of the model requires stronger economic assumptions, as described by Connor (1984). Chamberlain (1983) shows that the exact APT is equivalent to the statement that all minimum-variance portfolios are well diversified, and are thus combinations of the APT factors. In this case, we are essentially back to the original equivalence between the three paradigms holding as N gets large. That is, we have E(mR) − 1 = 0 if and only if α = 0 when m is linear in the APT factors, and equivalently, a combination of the factors is a minimum-variance-efficient portfolio in the large market.

^12 The Sharpe ratio is the expected excess return divided by the standard deviation of return.
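The inequality chain above can be verified numerically for any positive-definite Σ and pricing-error vector α; the values below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 10

# Arbitrary positive-definite covariance matrix and small pricing-error vector.
A = rng.standard_normal((N, N))
Sigma = A @ A.T + np.eye(N)
alpha = 0.01 * rng.standard_normal(N)

quad = alpha @ np.linalg.solve(Sigma, alpha)   # alpha' Sigma^{-1} alpha
lam_min = np.linalg.eigvalsh(Sigma).min()      # smallest eigenvalue of Sigma

S2_f = 0.25                # hypothetical squared Sharpe ratio of the factor portfolio
S2_r = quad + S2_f         # law of conservation of squared Sharpe ratios
rho = np.sqrt(S2_f / S2_r) # correlation between factor portfolio and efficient portfolio

lhs = 1.0 / rho ** 2 - 1.0                       # left side of the correlation bound
bound = (alpha @ alpha) / (lam_min * S2_f)       # right side of the correlation bound
```

Since α′Σ^{−1}α ≤ α′α/λ_min(Σ) always holds for a positive-definite Σ, a tighter upper bound on the pricing errors mechanically forces ρ toward one.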
2.6. Mean-variance efficiency with conditioning information

Most asset-pricing models are stated in terms of expected asset returns, covariances and betas, conditional on the available public information at time t. However, empirical tests traditionally examine unconditional expected returns and betas, or use instruments that are a subset of the available public information. Given the equivalence between mean-variance efficiency and the other asset-pricing representations, it follows that all tests of asset-pricing models using portfolios have examined whether particular portfolios are either unconditionally minimum variance, or minimum variance conditional on a subset of the information. To understand how such tests are related to the theories, we need to examine different concepts of efficiency when there is conditioning information.

When there is conditioning information, minimum-variance efficiency may be defined in terms of the conditional means and variances (conditional efficiency), or in terms of unconditional moments. When the objective is to minimize the unconditional variance for a given unconditional mean, but portfolio strategies may be functions of the information, we have (unconditional) minimum-variance efficiency with respect to the conditioning information.

Unconditional efficiency with respect to conditioning information may seem confusing, because the conditioning information may be employed by the portfolio strategy while unconditional expectations about that portfolio's returns are used to define efficiency. However, this information structure is actually quite common. Often the agent conducting a portfolio optimization uses more information than is available to the observer of the outcomes. If the observer does not have the conditioning information, he or she can only form unconditional, or less informed, expectations. Dybvig and Ross (1985) provide an example.
Consider a portfolio manager who is evaluated based on the unconditional mean and variance of the portfolio return. The manager may use conditioning information about future returns in forming the portfolio. Dybvig and Ross show that the manager's conditionally efficient portfolio will typically not appear efficient to the uninformed investor. The portfolio that maximizes the manager's measured performance in this setting is the unconditionally efficient portfolio with respect to the information.

Ferson and Siegel (2001) derive efficient-portfolio strategies with respect to conditioning information and illustrate their properties. Consider an example with two assets: a riskless asset (with gross rate of return R_f) and a risky asset with gross return R. The risky asset's return is written as:

R = R_f + μ(Z) + ε,     (17)

where μ(Z) = E(R − R_f | Z). The conditional variance of the return given Z is S(Z). The problem to be solved is:

Min_{x(Z)} Var{x(Z)(R − R_f)}  subject to:  μ_p = R_f + E{x(Z)(R − R_f)}.     (18)
The weight function x(Z) specifies the fraction invested in the risky asset, as a function of the conditioning information Z. Here we provide a constructive derivation.^13 Let Λ(Z)^{−1} = E{(R − R_f)² | Z} = S(Z) + μ(Z)², and write the Lagrangian:

L(x | Z) = E[ x(Z) Λ(Z)^{−1} x(Z) − 2λ { x(Z) μ(Z) − (μ_p − R_f) } ].     (19)

Let ŵ(Z) = x(Z) + a y(Z), where y(Z) is any function and x(Z) is the optimal solution. If we consider L(ŵ(Z) | Z) = L(x(Z) + a y(Z) | Z), then optimality of x(Z) requires ∂L/∂a |_{a=0} = 0, i.e., E[{x(Z) Λ(Z)^{−1} − λ μ(Z)} y(Z)] = 0. Since this must hold for all functions y(Z), it implies x(Z) Λ(Z)^{−1} − λ μ(Z) = 0. Solving for x(Z) and evaluating λ by substituting the solution back into the constraint E{x(Z) μ(Z)} = μ_p − R_f gives the solution for the unconditionally mean-variance efficient strategy with respect to the information Z:

x(Z) = ζ^{−1} (μ_p − R_f) μ(Z) / [μ(Z)² + S(Z)],     (20)

where

ζ = E{ μ(Z)² / [μ(Z)² + S(Z)] }.     (21)

The minimized variance implied by this solution is:

σ_p² = (μ_p − R_f)² (1 − ζ)/ζ.     (22)

Figure 1 gives an empirical example of the optimal weight as a function of the conditional expected excess return μ(Z), for a given unconditional mean μ_p equal to 11.1% per year. This figure matches the Standard and Poor's 500 stock index return for the 1963–1994 sample period. The example assumes homoskedasticity, where S(Z) is a constant. The weight is shown for several values of R², defined as the ratio of the variance of the conditional mean to the variance of the stock index return. As R² approaches zero the weight becomes a constant function. When the conditional expected excess return of the risky asset is zero, the weight in the risky asset is zero. For conditional expected excess returns near zero, the efficient weight appears monotone and nearly linear in μ(Z). This is similar to other utility-maximizing strategies.
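A minimal sketch of Equations (20)–(22), using a hypothetical discrete distribution for the signal (homoskedastic case, S(Z) constant), confirms that the strategy hits the target unconditional mean exactly and that the weight shrinks for extreme signals:

```python
import numpy as np

# Hypothetical signal: mu(Z) takes a few discrete values with equal probability.
mu_Z = np.array([-0.10, -0.02, 0.0, 0.02, 0.10, 1.00])  # conditional expected excess returns
prob = np.full(len(mu_Z), 1.0 / len(mu_Z))
S = 0.04              # constant conditional variance S(Z) (homoskedastic case)
Rf, mu_p = 1.02, 1.07 # hypothetical riskless rate and target unconditional mean

# Equations (21) and (20).
zeta = np.sum(prob * mu_Z ** 2 / (mu_Z ** 2 + S))
x = (mu_p - Rf) / zeta * mu_Z / (mu_Z ** 2 + S)

# Unconditional mean of the strategy: Rf + E{x(Z) mu(Z)} should equal mu_p exactly.
mean_p = Rf + np.sum(prob * x * mu_Z)

# Minimized unconditional variance, Equation (22).
var_p = (mu_p - Rf) ** 2 * (1 - zeta) / zeta
```

Note that the weight at the extreme signal μ(Z) = 1.00 is smaller than at μ(Z) = 0.10, illustrating the “conservative” response to extreme signals discussed below.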
For example, assuming a normal distribution, the strategy that maximizes an exponential utility is linear in the conditional expected return. Kim and Omberg (1996) and Campbell and Viceira (1999) solve intertemporal portfolio

^13 Thanks to Ludan Liu.
Fig. 1. Optimal weight versus signal. The R² values are 0.0045, 0.105 and 0.355.

problems and find that the portfolio weights are approximately linear in the state variables. Thus, traditional solutions to the portfolio optimization problem imply portfolio weights that are highly sensitive to extreme values of the signal. For example, if the signal is normally distributed, a linear portfolio weight is unbounded. The weight in Equation (20) satisfies x(Z) → 0 as μ(Z) → ±∞. After a certain point, even an optimistic extreme signal leads to purchasing less of the risky asset, when the objective is to attain a given unconditional mean return with the smallest unconditional variance. Intuitively, an extremely high expected return presents an opportunity to reduce risk by taking a small position in the risky asset this period, without compromising the average portfolio performance.^14

The “conservative” nature of the solution implies an interesting “agency” problem in a portfolio management context. A portfolio manager who is evaluated, as is common in practice, on the basis of unconditional mean return relative to unconditional return volatility may be induced to adopt a conservative response to extreme signals in order to maximize measured performance.

2.6.1. Conditional versus unconditional efficiency

Hansen and Richard (1987) show that in the set of returns that can be generated using conditioning information, an unconditionally efficient strategy with respect to

^14 The precise shape of the curve depends on the homoskedasticity assumption used in Figure 1. However, according to Equation (20), if there is heteroskedasticity, where an extreme value of the signal is associated with a large conditional variance, the conservative behavior of the strategy is reinforced.
Ferson and Siegel (2001) show that the solution for an n-asset example also implies the portfolio weight is a bounded function of the signal. They also note that the graph of the unconditionally efficient portfolio weight is similar to the redescending influence curves used in robust statistics [e.g., Hampel (1974), Goodall (1983) and Carroll (1989)]. This suggests that the unconditionally efficient portfolios may be empirically robust.
the information must be conditionally efficient, but the reverse is not true. The relation between conditional and unconditional efficiency with respect to conditioning information may be understood in terms of the utility functions for which the solutions are optimal. Ferson and Siegel (2001) show that an unconditionally efficient portfolio with respect to the conditioning information maximizes the conditional expectation of a quadratic utility function in a single-period problem. Since quadratic-utility agents choose mean-variance-efficient portfolios, this implies that an unconditionally efficient portfolio must be a conditionally mean-variance-efficient portfolio. However, other utility functions also lead to conditionally mean-variance-efficient portfolios. One example is the exponential utility function previously mentioned, when returns are normally distributed conditional on the information. Ferson and Siegel show that the solution for the unconditionally efficient portfolio with respect to the information is unique, so the exponential-utility agent chooses a conditionally mean-variance-efficient portfolio that is not unconditionally efficient with respect to the information. Thus, conditional efficiency does not imply unconditional efficiency with respect to the information. The unconditionally efficient portfolios with respect to given conditioning information are a subset of the conditionally efficient portfolios with respect to the same information.

The relation between conditionally and unconditionally efficient portfolios with respect to given conditioning information can be represented as in Figure 2, with the unconditional mean return on the y-axis and the unconditional standard deviation on the x-axis.
The usual “fixed-weight” mean-standard deviation boundary, which ignores the conditioning information, is the curve farthest to the right in the lower portion of the figure. There are an infinite number of conditionally efficient portfolio strategies, some examples of which are depicted by the other curves.^15 Some of the conditionally efficient strategies can plot inside the fixed-weight strategy that ignores the conditioning information, as shown by Dybvig and Ross (1985) and illustrated by one of the examples in the figure. The unconditionally efficient strategy with respect to the information Z is the outer envelope of all the conditionally efficient strategies. This is shown as the left-most curve in Figure 2. Hansen and Richard (1987) provide a formal characterization and prove that the outer envelope in Figure 2 has the familiar properties associated with mean-standard deviation boundaries when there is no conditioning information, e.g., two-fund separation. See Ingersoll (1987).

^15 To see that there are an infinite number, note that a conditionally minimum-variance efficient strategy solves:

Min_{x(Z)} Var{x(Z)′R | Z}  s.t.  E{x(Z)′R | Z} = T(Z),

where T(Z) is the target conditional mean return. Each of the infinite possible specifications for the function T(Z) implies a conditionally efficient strategy.
Fig. 2. Minimum-variance boundaries.

2.6.2. Implications for tests

These concepts of minimum-variance efficiency have important implications for tests of asset-pricing models. In principle, we can devise tests to reject the hypothesis that a portfolio is unconditionally efficient, or efficient conditional on some observed instruments, but we cannot tell if a portfolio is efficient given all the public information, W. If we interpret asset-pricing models as identifying which portfolios are conditionally efficient given W, we have a problem. The collection of minimum-variance portfolios, conditional on the market information set W, is larger than the set of minimum-variance portfolios conditional on an observable subset of instruments, Z. Thus, even if we reject the hypothesis that a portfolio is efficient given Z, we cannot infer that it is inefficient given W.

This is similar to the Roll (1977) critique of tests of the CAPM. Roll pointed out that since the market portfolio of the CAPM cannot be measured, the CAPM cannot be tested without making assumptions about the unobserved market return. The problem here is that we cannot test the conditional CAPM because the full information set W is not observed, unless we make assumptions about the unobserved information set. This problem is present even if the true market portfolio return could be measured.^16

There is an important exception to this conundrum. When tests are based on Equation (2), it is possible to test the model without observing the complete information set, when m_{t+1} depends only on observable data and model parameters. Equation (2) implies that E(m_{t+1} R_{t+1} | Z_t) = 1, so tests may proceed using the observed instruments Z_t. This is the case, for example, in versions of the consumption-based asset-pricing model, when the relevant consumption can be measured.
Given a model

^16 See Wheatley (1989) for a critique of the earliest conditional asset-pricing studies based on similar logic.
in which m_{t+1} is a function of observed data and parameters, it is also possible to use the concept of unconditional efficiency with respect to given information, Z, to conduct tests of the model. This approach is developed in Ferson and Siegel (2002b).

2.7. Choosing the factors

A beta-pricing model has no empirical content until the factors are specified, since there will always be a minimum-variance portfolio that satisfies Equation (13). The minimum-variance portfolio can serve as a single factor in Equation (12). Therefore, the empirical content of the model is the discipline imposed in selecting the factors. There have been three main approaches to specifying empirical factors for multiple-beta asset-pricing models.

One approach is to use statistical factor-analytic or principal-components methods. This approach is motivated by the APT, where the “right” factors are the ones that capture all the pervasive (unbounded-eigenvalue) risk, leaving only nonsystematic (bounded-eigenvalue) risk in the residuals. That approach is pursued by Roll and Ross (1980) and Connor and Korajczyk (1986, 1988), among others. The advantage of the Connor–Korajczyk approach is that the factor extraction is conducted under essentially the same large-markets assumptions that lead to the APT. This lends some rigor to the tests. The disadvantage is that purely statistical factors provide little economic intuition. Burmeister and McElroy (1988) augment statistical factors with a market portfolio and illustrate how to “rotate” the factors, to interpret them relative to more intuitive economic variables.

In a second approach the risk factors are explicitly chosen economic variables or portfolios, selected on the basis of economic intuition [e.g., Chen, Roll and Ross (1986), Ferson and Harvey (1991), Campbell (1993) and Cochrane (1996)]. Here is where Equation (4) should come in.
According to that equation, the factors should be related to consumer wealth, consumption expenditures, and the sufficient statistics for the marginal utility of future wealth in an optimal consumption-investment plan. A third approach for choosing factors uses the cross-sectional empirical relation of stock returns to firm attributes. For example, portfolios are formed by ranking stocks on firm characteristics that are observed to be correlated with the cross-section of average returns. Perhaps the most famous current example is the three-factor model of Fama and French (1993, 1996). Fama and French group common stocks according to their “size” (market value of equity) and their ratios of book value to market value of equity per share. Previous studies such as Keim (1983) and Reinganum (1981) found that stock returns are related to these attributes. Fama and French use the returns of small stocks in excess of large stocks, and the returns of high book-to-market in excess of low book-to-market stocks, as two “factors”. This approach is critiqued on methodological grounds in Section 4.2. The empirical literature which examines multiple-beta pricing models is vast. Fama (1991), Connor and Korajczyk (1995) and Harvey and Kirby (1996) provide selective reviews. Studies typically focus on particular factors, and may mix the three approaches to factor selection. There is scant empirical evidence that focuses directly
on the general question: which of the three methods of factor selection is superior? The answer to this question depends on the application to which the multiple-beta model is put.

In their role as empirical models for security returns, multiple-beta models are used for essentially three things. First, they are used to explain the cross-section of average returns on different securities. This relates to Equation (6), where expected returns differ according to the return covariances with m_{t+1}. Second, the models are used to explain predictable patterns in security returns over time. This is the main goal of conditional asset pricing, as discussed in Section 2.1. Finally, multiple-beta models are used to explain the contemporaneous variance of security returns, through the variation of the risk factors. This relates more to the multiple regression models that are often associated with multiple-beta expected-return models like Equation (9).

The cross-section of expected returns is central for a number of applications. The models' fitted expected returns serve as estimates of “required” returns, in relation to risk. They are used, among other things, for the cost of equity capital, an important input in corporate project-selection problems [see the surveys of Bruner et al. (1998) and Graham and Harvey (2001)], and for portfolio construction and performance evaluation (see Section 5).

There are problems in evaluating the three approaches to factor selection for this purpose. First, the results depend crucially on the “test assets”, or portfolios for which the models are evaluated. If portfolios are formed to emphasize cross-sectional variation in a particular dimension, thus de-emphasizing others,^17 then a model that “explains” that particular dimension will look good. For example, the Fama–French (1993) three-factor model emphasizes size and book-to-market.
Fama and French (1996) find that it captures the cross section of average returns pretty well in size and book-to-market sorted portfolios. However, when confronted with industry returns [Fama and French (1997)] or with cross-sectional variation in average returns related to the momentum effect of Jegadeesh and Titman (1993), the model performs poorly. This issue of portfolio formation has muddled some attempts in the literature to distinguish between the explanatory power of security characteristics and that of betas on related factors, for the cross-section of average returns [e.g., Daniel and Titman (1997)]. See Berk (2000) for an analysis and critique. The second problem in evaluating the methods of factor selection relates to the discussion of conditional and unconditional efficiency in Section 2.6.1. A model may identify a conditionally efficient portfolio that is nevertheless unconditionally inefficient. In other words, conditional covariances with a portfolio return could provide an exact description of the cross section of conditional expected returns, while at the same time average returns are not explained by their unconditional covariances with the same portfolio return. To see this algebraically, take the unconditional expectation of Equation (10), and recall that the expectation of the conditional covariance differs
17 The total sum of squares in any sample must equal the across-group sum of squares plus the within-group sum of squares.
• Ch. 12: Tests of Multifactor Pricing Models, Volatility Bounds and Portfolio Performance 767
from the unconditional covariance by the covariance of the conditional means. If the assets i differ in their values of Cov{E_t(R_{i,t+1}), E_t(m_{t+1})}, we have a problem. Recall that E_t(m_{t+1}) is the inverse of the risk-free return. Evidence from Fama and Schwert (1977) and Ferson (1989) shows that different assets’ expected returns have different sensitivities to measures of risk-free interest rates, so this problem may be a serious one. Although these caveats make it difficult to interpret the evidence in relation to theory, it is still interesting to know which models provide a good empirical description of the cross section of average returns. A few studies compare the alternative approaches to factor selection from this perspective. Lehmann and Modest (1987) take no stand on which approach is superior, but observe that predicted expected returns for a sample of mutual funds can be very sensitive to using the CAPM versus the APT, as well as to different approaches for implementing the APT. Farnsworth et al. (2002) compare a collection of SDF models, including: (1) three-factor models based on the asymptotic principal components of Connor and Korajczyk (1986); (2) three traded economic factors relating to the stock market, government bonds and low-grade corporate bonds; and (3) the three-factor model of Fama and French (1993, 1996). They estimate the models in a common sample of nine primitive “assets”: portfolios emphasizing variation in equity size, book-to-market and momentum, as well as bond market returns. They find that the principal-components-based model is the worst-performing model in this group for explaining the cross section of average returns, as summarized by the Hansen–Jagannathan distance measure described in the next section.
Explaining predictability in security returns is another important and controversial application for multi-beta asset-pricing models. Much of the controversy relates to the interpretation. Fama (1970, 1991) emphasizes that evidence relating to market efficiency involves a “joint hypothesis”. A model of equilibrium (essentially, a specification for the SDF) is jointly tested with the hypothesis that markets are informationally efficient with respect to particular information. If the tests reject, then logically the market could be inefficient or the SDF model could be wrong. From this perspective, predictability that cannot be explained using any of the standard asset-pricing models suggests market inefficiency; or alternatively, the need to move beyond the standard models. A few studies have compared the alternative approaches to factor selection for the purpose of explaining return predictability. Ferson and Korajczyk (1995) compare economic factors similar to those chosen by Chen, Roll and Ross (1986) with the asymptotic principal components of Connor and Korajczyk (1986). They study predictability in one-month to two-year returns based on a list of “standard” lagged instruments discussed above, estimating the fraction of the predictable variance of return that is captured by the models. They find that single-factor models can capture about 60% of the predictable variance in a sample of industry returns, while five-factor models capture about 80%. These results are not highly sensitive to the return horizon. The performance of a five-principal-components model and that of a five-prespecified-factor model are broadly similar for capturing predictability in returns for
all of the horizons. Farnsworth et al. (2002) find that, among the three-factor models, the Fama–French model performs the worst for explaining predictability in their study. Additional evidence that this model performs poorly for capturing return predictability is presented by Kirby (1998) and Ferson and Harvey (1999). Factors with good contemporaneous explanatory power for security returns are useful for risk modeling and for controlling systematic variance in some research contexts. A regression of security returns on a selection of factors does not impose an asset pricing model unless the regression coefficients are restricted: examples are given in Section 4. But it is easier to draw general conclusions about the empirical performance of the methods of factor selection in this setting. In a given sample, a factor analytic approach constructs factors to be highly correlated with the asset returns. If in-sample, contemporaneous correlation is the goal, this approach almost has to be the most effective. Choosing economic variables is likely to be the worst approach, because security returns, and stock returns in particular, are only weakly correlated with most economic data [e.g., Roll (1988)]. Indeed, this low contemporaneous correlation is one motivation for the use of mimicking portfolios, described in Section 4, to replace factors based on economic data in empirical models.

3. Modern variance bounds

3.1. The Hansen–Jagannathan bounds

Hansen and Jagannathan (HJ, 1991) showed how the fundamental asset pricing Equation (1) places restrictions on the mean and variance of m_{t+1}. These restrictions depend only on the sample of assets, and thus provide a diagnostic tool for comparing different models of m_{t+1}. If a candidate for m_{t+1}, corresponding to a particular theory, fails to satisfy the HJ bounds, then it cannot satisfy Equation (1).
Recent papers refine and extend the HJ bounds in several directions, and a number of papers and textbooks provide basic reviews [see Ferson (1995)].18 We briefly review the case where there is no conditioning information, then move on to extensions with conditioning information. Assume that the random column n-vector R of the assets’ gross returns has mean E(R) = μ and covariance matrix Σ. When there is no conditioning information
18 Snow (1991) considers selected higher moments of the returns distribution. Bansal and Lehmann (1997) derive restrictions on E[ln(m)] that involve all higher moments of m and reduce to the HJ bounds if returns are lognormally distributed. Balduzzi and Kallal (1997) incorporate the implications for the risk premium on an economic variable. Cochrane and Hansen (1992) state restrictions in terms of the correlation between the stochastic discount factor and returns, while Cochrane and Saá-Requejo (2000) consider bounds on the Sharpe ratios of assets’ pricing errors. Hansen, Heaton and Luttmer (1995) develop asymptotic distribution theory for specification errors on stochastic discount factors, where the HJ bounds are a special case, and Ferson and Siegel (2002a) evaluate these standard errors by simulation.
a stochastic discount factor is defined as any random variable m such that E(mR) = 1. Hansen and Jagannathan (1991) show that the stochastic discount factor with minimum variance for its expectation E(m) is given by:

m* = E(m) + [1 − E(m) μ]′ Σ^{−1} [R − μ],   (23)

where 1 denotes an n-vector of ones, and the minimum variance for an SDF is the variance of m*:

Var(m) ≥ [1 − E(m) μ]′ Σ^{−1} [1 − E(m) μ].   (24)

The proof is instructive. Consider a regression of any m satisfying E(mR) = 1 on the asset returns, R. The fitted value is m* = E(m) + Cov(m, R)′ Σ^{−1} [R − μ], and m = m* + e, where e is the regression error satisfying E(e) = 0 = E(eR). Since m* is a linear function of R, it follows that E(em*) = 0. Thus, Var(m) = Var(m*) + Var(e) ≥ Var(m*). Finally, expanding E(mR) = 1 = E(m) μ + Cov(m, R) and substituting for Cov(m, R), we arrive at Equation (23). The right-hand side of Equation (24) is just the variance of the m* in Equation (23). The HJ bound is related to the maximum Sharpe ratio that can be obtained by a portfolio of the assets. The Sharpe ratio is defined as the ratio of the expected excess return to the standard deviation of the portfolio return. If the vector of assets’ expected excess returns is μ − E(m)^{−1} 1 and Σ is the covariance matrix, the square of the maximum Sharpe ratio is [μ − E(m)^{−1} 1]′ Σ^{−1} [μ − E(m)^{−1} 1]. Thus, from Equation (24) the lower bound on the variance of stochastic discount factors is the maximum squared Sharpe ratio multiplied by E(m)². The larger is the maximum squared Sharpe ratio for a given E(m), the tighter is the bound on Var(m) and the more potential SDFs can be ruled out. The Hansen and Jagannathan (1991) region for {E(m), σ(m)} is given by the square root of Equation (24). The boundary of this region is the minimum value of the standard deviation, σ(m), for each value of E(m).
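The bound in Equation (24) is easy to compute from sample moments. Below is a minimal sketch on simulated gross returns (the means and covariances are arbitrary placeholders of ours), which also checks the Sharpe-ratio link just described.

```python
import numpy as np

def hj_bound(R, Em):
    """Lower bound on Var(m) at a given E(m), Equation (24):
    [1 - E(m) mu]' Sigma^{-1} [1 - E(m) mu], with sample mu and Sigma."""
    mu = R.mean(axis=0)
    Sigma = np.cov(R, rowvar=False)
    d = np.ones_like(mu) - Em * mu
    return d @ np.linalg.solve(Sigma, d)

# Simulated gross returns on two assets (placeholder numbers).
rng = np.random.default_rng(0)
R = 1.0 + rng.multivariate_normal([0.02, 0.05],
                                  [[0.01, 0.002], [0.002, 0.04]], size=2000)
Em = 0.98
bound = hj_bound(R, Em)

# Sharpe-ratio link: the bound equals E(m)^2 times the maximum squared
# Sharpe ratio, computed with risk-free rate 1/E(m).
mu = R.mean(axis=0)
Sigma = np.cov(R, rowvar=False)
e = mu - 1.0 / Em
max_sharpe_sq = e @ np.linalg.solve(Sigma, e)
```

Sweeping `Em` over a grid of values traces out the boundary of the {E(m), σ(m)} region ("the cup") described in the text.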
Some empirical examples are illustrated in Figure 3, corresponding to the different versions of the bounds described below. The bounds are drawn for quarterly data similar to Hansen and Jagannathan, consisting of 3-, 6-, 9- and 12-month Treasury bill returns for the 1964–1986 sample period. For a given hyperbola in Figure 3, as we vary E(m) we move around the {E(m), σ(m)} boundary. In order for an SDF to satisfy E(mR) = 1, its mean and standard deviation must plot above the boundary, “inside the cup”. The points shown by the “×” symbols are the sample means and standard deviations of the m_{t+1} of Equation (8), using quarterly total consumption data per capita in the USA over the same sample period, and various values of the relative risk aversion, a. Note that the SDF does not plot inside even the lowest cup for many values of a. In fact, the SDF just touches the boundary of Figure 3 when a = 71. The SDF does not enter the highest cups for any value of risk aversion. The simple consumption model does not produce SDFs that are volatile enough. This is a version of the equity premium puzzle
[Fig. 3. Hansen–Jagannathan bounds. The figure plots the minimum value of Std(m) against E(m).]
of Mehra and Prescott (1985), reviewed by Mehra and Prescott in Chapter 14 of this volume. The bound in Equation (24) is not the sharpest lower bound on σ(m) that can be derived. Hansen and Jagannathan (1991) show how imposing that m_{t+1} is a strictly positive random variable can sharpen the bound. Computing the bounds imposing positivity requires a numerical search procedure. Another way to sharpen the bounds is through the use of conditioning information.

3.2. Variance bounds with conditioning information

The preceding analysis is based on the unconditional moments. With conditioning information Z, we may consider a stochastic discount factor for (R, Z) to be any random variable m such that E(mR | Z) = 1 for all realizations of Z. In principle, everything above could be stated for conditional means and variances, an approach pursued by Gallant, Hansen and Tauchen (1990). This would complicate Figure 3, because we would have to show a new figure for each realization of Z_t. Alternatively, Hansen and Jagannathan (1991) describe a clever way to extend the analysis to partially exploit the information in a set of lagged instruments, while using unconditional moments to describe the bound. Equation (5) implies that for any set of instruments E{m_{t+1} (r_{i,t+1} ⊗ Z_t) | Z_t} = 0, and therefore E{m_{t+1} (r_{i,t+1} ⊗ Z_t)} = 0, where ⊗ is the Kronecker product. If we view {r_{i,t+1} ⊗ Z_t} as the excess returns to a set of “dynamic” trading strategies, the preceding analysis goes through essentially unchanged. (The trading rule holds, at time t, Z_t units of asset i long and Z_t units of the zero-th asset short.) The approach of Hansen and Jagannathan (1991) is just one way to implement HJ bounds that use conditioning information. To understand how alternative approaches to conditioning information can refine the bounds, let r_{t+1} be the vector of excess returns. In this case, Equation (5) implies the following:

E{m_{t+1} r_{t+1} f(Z_t)} = 0 for all functions f(·),   (25)

where the unconditional expectation is assumed to exist. In other words, if we consider r_{t+1} f(Z_t) to represent a possible dynamic trading strategy, then the presence of conditioning information Z_t says that m_{t+1} should price all the dynamic trading strategies, not just r_{t+1} ⊗ Z_t. The larger is the set of strategies for which Equation (25) is required to hold, the smaller is the set of m_{t+1}’s that can satisfy the condition, and the tighter are the bounds. This is the motivation for extending HJ’s original approach, in order to use the information efficiently. Three versions of HJ bounds with conditioning information have appeared in the literature. These may be understood through Equation (25). First are the multiplicative bounds of Hansen and Jagannathan (1991), who choose f(·) to be the linear function, I ⊗ Z_t. Second are the efficient portfolio bounds of Ferson and Siegel (2002a), where f(·) is the set of portfolio weights that may depend on Z_t and sum to 1. Finally, the optimal bounds of Gallant, Hansen and Tauchen (1990) require Equation (25) to hold for all functions f(·).

3.2.1. Efficient portfolio bounds

Efficient portfolio bounds are based on the unconditionally efficient portfolios with respect to the information Z, derived by Ferson and Siegel (2001) and discussed in Section 2.6. Since these portfolios maximize the Sharpe ratio over all dynamic strategies x(Z) whose weights sum to 1.0, they efficiently use the information in Z to tighten the bounds. For given (R, Z), the solutions describe an unconditional mean-standard-deviation boundary, as depicted in Figure 2.
Fixed-weight combinations of any two portfolios on an unconditional mean-standard-deviation boundary can describe the entire boundary [Hansen and Richard (1987)]. Thus, efficient portfolio bounds can be formed from two “arbitrary” portfolios from the boundary.

3.2.2. Optimal bounds

Gallant, Hansen and Tauchen (1990) derive optimal bounds that do not restrict to portfolio functions, with weights that sum to 1.0. The solution for the optimal bounds is presented in Ferson and Siegel (2002a) as follows. First, define the following conditional portfolio constants, which are analogous to the efficient-set constants used in the traditional mean-variance analysis [see, e.g., Ingersoll (1987)]:

a(Z) = 1′ Σ(Z)^{−1} 1,  b(Z) = 1′ Σ(Z)^{−1} μ(Z),  and  g(Z) = μ(Z)′ Σ(Z)^{−1} μ(Z),   (26)

where μ(Z) and Σ(Z) are the conditional mean and covariance functions. The stochastic
discount factor m for (R, Z) with minimum variance for its expectation E(m) is given by

m*(Z) = z(Z) + [1 − z(Z) μ(Z)]′ Σ(Z)^{−1} [R − μ(Z)],   (27)

where z(Z) is the conditional mean of m given Z, defined as:

z(Z) = E(m | Z) = {1 + g(Z)}^{−1} ( b(Z) + [E(m) − E{b(Z)/(1 + g(Z))}] / E[{1 + g(Z)}^{−1}] ).   (28)

The unconditional variance of m*(Z) is:

Var(m*(Z)) = [E(m) − E{b(Z)/(1 + g(Z))}]² / E[{1 + g(Z)}^{−1}] + E[a(Z)] − [E(m)]² − E[b(Z)²/(1 + g(Z))].   (29)

Equation (29) may be used directly to compute the optimal HJ bounds. To implement the bound, it is necessary to specify the conditional mean function μ(Z) and the conditional covariance function Σ(Z). Then, the four unconditional expectations that appear in Equation (29) may be estimated from the corresponding sample means, independent of the value of E(m).

3.2.3. Discussion

For given conditioning information, Z, the optimal bounds provide the greatest lower bound on stochastic discount factors and thus the highest, most restrictive cup. The efficient portfolio bounds incorporate an additional restriction to functions that are portfolio weights, which sum to 1.0 at each date. This reduces the flexibility of the efficient portfolio bounds to exploit the conditioning information, and thus they do not attain the greatest lower bound. Intuitively, suppose there was only one asset. Then the restricted weight could not respond at all to the conditioning information. The multiplicative bound of Hansen and Jagannathan (1991) does not restrict portfolio weights, but neither does it attempt to use the conditioning information efficiently. Bekaert and Liu (2002) and Basu and Stremme (2003) further discuss the relations among the HJ bounds with conditioning information. Ferson and Siegel (2002a) conduct a simulation study of HJ bounds with conditioning information.
They find that sample values of the bounds are upwardly biased, the bias becoming substantial when the number of assets is large relative to the number of time-series observations. This means that studies using the biased bounds run a risk of rejecting too many models for the stochastic discount factor. They derive a finite-sample adjustment for the bounds and show that it helps control the bias.
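For intuition on the optimal-bound formula, here is a minimal sketch of Equation (29) for the special case of a single risky asset; the `optimal_bound` helper and its constant numerical inputs are ours, for illustration only. As a check, when the conditional moments do not vary with Z, Equation (29) collapses to the one-asset unconditional bound (1 − E(m)μ)²/σ².

```python
import numpy as np

def optimal_bound(mu_z, sig2_z, Em):
    """Equation (29) for a single risky asset, with the conditional mean mu(Z)
    and conditional variance sigma^2(Z) evaluated along a sample of Z draws.
    The four unconditional expectations are replaced by sample means."""
    a = 1.0 / sig2_z               # a(Z) = 1' Sigma(Z)^{-1} 1
    b = mu_z / sig2_z              # b(Z) = 1' Sigma(Z)^{-1} mu(Z)
    g = mu_z**2 / sig2_z           # g(Z) = mu(Z)' Sigma(Z)^{-1} mu(Z)
    term = (Em - np.mean(b / (1 + g)))**2 / np.mean(1 / (1 + g))
    return term + np.mean(a) - Em**2 - np.mean(b**2 / (1 + g))

# Degenerate check: constant conditional moments (Z carries no information).
mu, sig2, Em = 1.05, 0.04, 0.95
const_bound = optimal_bound(np.full(100, mu), np.full(100, sig2), Em)
uncond_bound = (1 - Em * mu) ** 2 / sig2   # one-asset unconditional bound, Eq. (24)
```

In a real application `mu_z` and `sig2_z` would come from a specified conditional-moment model evaluated at the sample of instruments, as the text describes.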
Sarkissian (2002) uses the adjustments and finds that they change the inferences about international consumption-based asset-pricing models. The evidence to date leads to several conclusions. First, the multiplicative bounds of HJ can be terribly biased in realistic samples. It is important to use the finite-sample adjustment to their location. Second, the optimal bounds of Gallant, Hansen and Tauchen (1990) are more difficult to implement than the multiplicative bounds, requiring the specification of conditional means and variances, but they are the tightest bounds. While not as biased as the multiplicative bounds, they are significantly biased in some finite samples. The finite-sample adjustment should be used and improves their accuracy. Third, the efficient-portfolio bounds are similar in complexity to the optimal bounds, also requiring the specification of the conditional moments. However, unlike the optimal bounds, they remain valid (but inefficient) when these moments are not specified correctly. They have the smallest sampling error variances of all the bounds with conditioning information. They are not as biased as either the optimal or multiplicative bounds, but finite-sample adjustment is still useful.

3.3. The Hansen–Jagannathan distance

Hansen and Jagannathan (1997) develop measures of misspecification for models of the stochastic discount factor. They consider m(f), a “candidate” stochastic discount factor, as may be proposed by an asset-pricing model. Since the candidate SDF is misspecified, E(m(f) R − 1 | Z) ≠ 0. They propose a measure of how “close” m(f) is to a stochastic discount factor that “works”. We first consider the case where there is no conditioning information. It is easy to show that a particular SDF, formed from the asset returns, m* = [1′ E(RR′)^{−1}] R, is one that “works” for pricing R.
Hansen and Jagannathan measure how close m(f) is to m*. They do this by first projecting m(f) on the returns R to get the fitted value m̂ = [E(m(f) R)′ E(RR′)^{−1}] R. They then measure the mean square distance between the fitted values and m*. This is the HJ distance measure:

HJD = E[(m̂ − m*)²],   (30)

where sample averages are used in practice to estimate the expectations. Note that m̂ − m* = [E(m(f) R) − 1]′ E(RR′)^{−1} R, so we may write HJD as: [E(m(f) R) − 1]′ E(RR′)^{−1} [E(m(f) R) − 1]. This leads to a couple of nice interpretations. First, if we let g = E(m(f) R) − 1 and W = E(RR′)^{−1}, then HJD = g′Wg is Hansen’s J-test with a particular W, described in the next section. Second, by analogy with the T² test, HJD measures the “most mispriced” return. To see this interpretation, recall that g = α = E{m(f) R − 1} is a measure of expected pricing error using m(f). The alpha of a portfolio with weight vector x is the scalar α_p = x′α. Consider the problem of finding the absolutely most mispriced portfolio, relative to its second moment:

Max_x  2x′α + λ [x′ E(RR′) x − E(r_p²)],

where λ is a Lagrange multiplier. The maximized value of α_p²/E(r_p²) is α′ E(RR′)^{−1} α, equivalent to the HJD measure.
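A sketch of the distance computation with no conditioning information, using sample averages for the expectations as described above; the simulated data and the function name are ours.

```python
import numpy as np

def hj_distance_sq(m, R):
    """HJD = g' W g with g = E[m R] - 1 and W = E[R R']^{-1},
    estimated with sample averages (no conditioning information)."""
    T = R.shape[0]
    g = (m[:, None] * R).mean(axis=0) - 1.0
    W = np.linalg.inv(R.T @ R / T)
    return g @ W @ g

rng = np.random.default_rng(3)
R = 1.0 + rng.multivariate_normal([0.01, 0.03],
                                  [[0.02, 0.0], [0.0, 0.05]], size=1000)
T = R.shape[0]

# The SDF m* = [1' E(RR')^{-1}] R prices R exactly in sample: zero distance.
m_star = R @ np.linalg.solve(R.T @ R / T, np.ones(R.shape[1]))

# A constant (misspecified) candidate SDF has a positive distance.
m_const = np.full(T, 0.99)
```

Comparing `hj_distance_sq` across candidate SDFs on a common set of assets is the model-comparison use described in the text.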
    • 774 W.E. Ferson When there is conditioning information in the form of lagged instruments, Z, then a correctly specified SDF has E(m(f) R − 1|Z) = 0, which implies E[m(f)(R ⊗ Z) − (1 ⊗ Z)] = E(0 ⊗ Z) = 0. Let ´z ≡ Z./E(Z), where ./ denotes element-by-element division. The previous equation holds only if E[m(f)(R ⊗ ´z) − (1 ⊗ ´z)] = 0, or E[m(f)(R ⊗ ´z)] = 1. If we define ´R ≡ R ⊗ ´z, we have E[m(f) ´R] = 1, and we can proceed as before using ´R instead of R. 4. Methodology and tests of multifactor asset-pricing models The method of moments is briefly reviewed as a general way to test models based on Equation (2). This general framework is then specialized to discuss various tests of asset-pricing models. The special cases include cross-sectional regressions and multivariate regressions. 4.1. The Generalized Method of Moments approach Let xt + 1 be a vector of observable variables. Given a model which specifies mt + 1 = m(q, xt + 1), estimation of the parameters q and tests of the model can proceed under weak assumptions, using the Generalized Method of Moments (GMM), as developed by Hansen (1982). Define the model error term: ui,t + 1 = m (q, xt + 1) Ri,t + 1 − 1. (31) Suppose that we have a sample of N assets and T time periods. Combine the error terms from Equation (31) into a T × N matrix u, with typical row ut + 1. Equation (2) and the model for mt + 1 imply that E(ui,t + 1 | Zt) = 0 for all i and t, and therefore E(ut + 1 ⊗ Zt) = 0 for all t. The condition E(ut + 1 ⊗ Zt) = 0 says that ut + 1 is orthogonal to Zt, and is therefore called an orthogonality condition. Define an N × L matrix of sample mean orthogonality conditions: vec(Z u/T), where Z is a T × L matrix of observed instruments with typical row Zt , a subset of the available information at time t.19 Hansen’s (1982) GMM estimates of q are obtained as follows. Search for parameter values that make g close to zero by minimizing a quadratic form g Wg, where W is a fixed NL × NL weighting matrix. 
Hansen (1982) shows that the estimators of θ that minimize g′Wg are consistent and asymptotically normal, for any fixed W. If W is chosen to be the inverse of a consistent estimate of the covariance matrix of the orthogonality conditions, g, the estimators are asymptotically efficient in the class of
19 The vec(·) operator stacks the columns of a matrix. We assume that the same instruments are used for each of the asset equations. In general, each asset equation could use a different set of instruments, which complicates the notation.
estimators that minimize g′Wg for fixed W’s. The asymptotic variance matrix of the GMM estimator of the parameter vector is then:

Cov(θ̂) ≈ [T (∂g/∂θ)′ W (∂g/∂θ)]^{−1},   (32)

where ∂g/∂θ is an NL × dim(θ) matrix of derivatives. Hansen (1982) also shows that J = T g′Wg is asymptotically chi-square distributed, with degrees of freedom equal to the difference between the number of orthogonality conditions NL and the number of parameters, dim(θ). This is Hansen’s J-statistic, mentioned in the last section, which serves as a goodness-of-fit statistic for the model. Several choices for the weighting matrix W are available. A simple version of the optimal choice, where W = Cov(g)^{−1}, is:

Cov(g) = (1/T) ∑_t g_t g_t′ = (1/T) ∑_t [u_{t+1} u_{t+1}′ ⊗ Z_t Z_t′],   (33)

where ⊗ denotes the Kronecker product. This case applies when the error terms u_t, and therefore the moment conditions g_t, are serially uncorrelated. More general cases, and more detailed reviews, are available in Hamilton (1994), Ferson (1995), Harvey and Kirby (1996) and Cochrane (2001), among others.

4.2. Cross-sectional regression methods

Much of the early empirical work on asset pricing used cross-sectional regressions of returns on estimates of market betas [e.g., Lintner, reported in Douglas (1969)]. The approach remains popular. Multiple-beta models, in particular, are often studied using this technique [e.g., Chen, Roll and Ross (1986), Ferson and Harvey (1991), Fama and French (1993), Lettau and Ludvigson (2001)]. Cross-sectional regression is appealing because it is an intuitive approach. Taking the simple CAPM as an example, we hypothesize:

E(R_i) = R_f + β_i E(R_m − R_f),  i = 1, …, N.

The model implies that the cross-sectional relation between mean returns and betas has a slope equal to the expected excess return of the market. The intercept should be a risk-free return, or a zero-beta portfolio expected return.
Let’s start our discussion of cross-sectional regressions with a classical two-step approach, similar to that of Black, Jensen and Scholes (1972) or Fama and MacBeth (1973). For the first step, suppose that market betas are constant over time. The betas come from the time-series regression:

r_{it} = a_i + r_{mt} β_i + e_{it},  t = 1, …, T for each i.   (34)

For now we ignore the estimation error in the time-series estimates of beta. This will be discussed later. The second step is a cross-sectional regression for each month:

r_{it} = λ_{0t} + λ_{1t} β_i + u_{it},  i = 1, …, N.   (35)

If we are testing a multi-factor asset-pricing model there could be K > 1 betas; then r_{mt} is a vector of K excess returns, β_i is a K-vector of betas, and λ_{1t} is a K-vector of slope coefficients.
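The two steps can be sketched as follows on simulated single-factor data; the premium, noise levels and sample sizes are arbitrary choices of ours.

```python
import numpy as np

# Simulated single-factor data: Equation (34) then Equation (35).
rng = np.random.default_rng(1)
T, N = 600, 25
beta = rng.uniform(0.5, 1.5, N)                # true betas
f = 0.5 + rng.normal(0.0, 2.0, T)              # factor excess return, mean premium 0.5
r = f[:, None] * beta[None, :] + rng.normal(0.0, 1.0, (T, N))  # excess returns

# Step 1: time-series regressions of each asset on the factor, Eq. (34).
X = np.column_stack([np.ones(T), f])
beta_hat = np.linalg.lstsq(X, r, rcond=None)[0][1]

# Step 2: cross-sectional regression of returns on betas, month by month, Eq. (35).
Z = np.column_stack([np.ones(N), beta_hat])
lam = np.linalg.lstsq(Z, r.T, rcond=None)[0]   # 2 x T: rows are (lambda0_t, lambda1_t)
lam1_mean = lam[1].mean()                      # time-series average premium estimate
```

The time series of λ̂₁t produced in the second step is the input to the Fama–MacBeth standard errors discussed next.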
It is instructive to consider the GMM solution to the cross-sectional regression estimator. Define g_t = (1/N) ∑_i (r_{it} − λ_{0t} − λ_{1t}′β_i) ⊗ (1, β_i′)′. Choose the parameters to minimize g_t′Wg_t for some weighting matrix, W. Here the model is exactly identified, with the same number of parameters as moment conditions, so the GMM solution may be obtained by setting g_t = 0. This results in:

λ̂_{0t} = (1/N) ∑_i r_{it} − λ̂_{1t}′ [(1/N) ∑_i β_i],
λ̂_{1t} = [∑_i β_i β_i′]^{−1} ∑_i β_i (r_{it} − λ̂_{0t}).   (36)

Iteratively solving Equation (36) yields estimates of the premium for market beta, λ_{1t}, and the zero-beta return, λ_{0t}, similar to Black, Jensen and Scholes (1972). If the risk-free rate is known, we have a cross-sectional regression of excess returns on betas. Of course, we want to use the cross-sectional regression to test hypotheses on the coefficients. For example, the hypothesis that E(λ_{1t}) = 0 says that the expected market risk premium is zero, or that beta has no cross-sectional explanatory power for returns. Alternatively, we may hypothesize that E(λ_{1t} − r_{mt}) = 0, which says that the premium is the market excess return. Standard errors for the coefficients may be obtained from Equation (36), which implies that λ̂_{1t} − λ_{1t} = [∑_i β_i β_i′]^{−1} ∑_i β_i u_{it}, so that

Var(λ̂_{1t} − λ_{1t}) = [∑_i β_i β_i′]^{−1} Var(∑_i β_i u_{it}) [∑_i β_i β_i′]^{−1} = (B′B)^{−1} B′ Cov(u_t) B (B′B)^{−1},   (37)

where B is the N × K matrix of betas. Note that the variance of the estimators given by Equation (37) is not the same as the OLS solution, σ_u² (B′B)^{−1}, where σ_u² is a scalar, that one would obtain using a standard regression package to run a cross-sectional regression. Only in the special case where Cov(u_t) = σ_u² I_N are the OLS standard errors correct. This would occur if the cross-sectional regression errors were uncorrelated across assets and homoskedastic across assets – a very unlikely scenario for stock market return data.

4.2.1. The Fama–MacBeth approach

Fama and MacBeth (1973) devise a simple and clever way to get estimates of the standard errors, while accounting for cross-sectional dependence. They suggest using the time series of the estimators from a sequence of cross-sectional regressions, one for each month in the sample, to compute the standard error of the mean
coefficient. In testing the hypothesis that λ_{1t} = 0, they propose a simple t-ratio: [(1/T) ∑_t λ̂_{1t}] / se((1/T) ∑_t λ̂_{1t}), where the standard error is estimated by:

se((1/T) ∑_t λ̂_{1t}) ≈ (1/√T) { (1/T) ∑_t [λ̂_{1t} − (1/T) ∑_t λ̂_{1t}]² }^{1/2}.   (38)

We can evaluate this using the previous equations. First, the sample variance of λ̂_{1t} is examined under the null hypothesis that λ_{1t} is zero:

E{ (1/T) ∑_t [λ̂_{1t} − (1/T) ∑_t λ̂_{1t}]² }
 = E{ (1/T) ∑_t [λ_{1t} + (B′B)^{−1} B′u_t − (1/T) ∑_t (λ_{1t} + (B′B)^{−1} B′u_t)]² }
 = E{ (1/T) ∑_t [(B′B)^{−1} B′ (u_t − (1/T) ∑_t u_t)]² }, if λ_{1t} = 0,
 = (B′B)^{−1} B′ E{ (1/T) ∑_t [u_t − (1/T) ∑_t u_t][u_t − (1/T) ∑_t u_t]′ } B (B′B)^{−1}
 = (B′B)^{−1} B′ [Cov(u)] B (B′B)^{−1},

which is the same as Equation (37). If we assume the λ̂_{1t} are uncorrelated over time, with a constant variance, we have Var{(1/T) ∑_t λ̂_{1t}} = (1/T²) ∑_t Var{λ̂_{1t}}. Estimating Var{λ̂_{1t}} with the sample variance of the time series of the λ̂_{1t}, as in Equation (38), produces the correct result. Thus, if we ignore estimation error in the betas, and assume that stock returns are serially uncorrelated, then under the null hypothesis that λ_{1t} = 0, the Fama–MacBeth approach delivers the correct standard errors.

4.2.2. Interpreting the estimates

Fama (1976) provides an intuitive interpretation of the cross-sectional regression estimators as portfolio returns. To fix ideas, start with the cross-sectional regression R_{it} = λ_{0t} + λ_{1t} β_i + u_{it}, i = 1, …, N. Let there be a single beta (K = 1). The CAPM implies that E(λ_{1t}) = E(R_{mt} − R_{ft}) and E(λ_{0t} − R_{ft}) = 0. Under the standard assumptions that make OLS best linear unbiased, the cross-sectional estimator solves:

λ̂_{1t} solves  Min_{w_i} ∑_i û_{it}²,  subject to:
 Unbiased: E(λ̂_{1t}) = λ_{1t};
 Linear: λ̂_{1t} = ∑_i w_i R_{it}.   (39)

We can use these conditions to characterize λ̂_{1t} as a portfolio.20 In particular:

E(λ̂_{1t}) = E(∑_i w_i R_{it}) = E(∑_i w_i [λ_{0t} + λ_{1t} β_i + u_{it}]) = λ_{1t}

20 Since u_{it} is likely to be correlated across assets, GLS is better in theory.
This amounts to a transformation of the asset returns and their betas into a different set of portfolios, then running OLS on the new portfolios. Therefore, the intuition here will translate.
implies (∑_i w_i) = 0 and (∑_i w_i β_i) = 1. This shows that the portfolio has weights {w_i} on the assets which sum to zero, and has a beta equal to one.21 The first condition, (∑_i w_i) = 0, says that the return is an excess return. The second condition, (∑_i w_i β_i) = 1, says that the portfolio beta must equal 1.0. In order for the weights to sum to zero while the beta is positive, the portfolio must be “long” (positive weights) in high-beta securities, and also “short” (negative weights) in low-beta securities. A similar analysis restricts the intercept estimator, implying that (∑_i w_i) = 1 and (∑_i w_i β_i) = 0. Thus, the intercept is a fully invested portfolio with no “systematic”, or factor-related, risk. Its expected return should therefore be the zero-beta rate. If there is a risk-free security, this should be the risk-free rate. The Fama–MacBeth cross-sectional regression coefficients represent one way to obtain the excess returns on mimicking portfolios for the risk factors. This is especially useful if the factors are not traded excess returns. Regression betas on nontraded variables, such as consumption, GNP growth or inflation, can be used. In this case, the Fama–MacBeth coefficients deliver excess returns, whose expected values are risk premiums for the factors. Indeed, the preceding analysis goes through if the cross-sectional regressors are not betas. For example, studies have used attributes such as firm size, dividend yield, or book-to-market ratio in place of beta coefficients.

4.2.3. A caveat

The Fama–MacBeth procedure constructs a “factor-mimicking” portfolio for anything that we put on the right-hand side of the regression. This raises a potentially serious caveat. If a firm attribute is used that represents an anomaly, even if completely unrelated to risk, the procedure can deliver a mimicking portfolio return that may appear to work as a risk factor. This caveat is explored by Ferson, Sarkissian and Simin (1999).
For a simple illustration, consider the following hypothetical regression:

R_it = λ_0t + λ_1t a_i + u_it,  i = 1, ..., N,  (40)

where a_i is an anomaly in the average return of asset i. Let A_i ≡ a_i − (1/N) Σ_i a_i. Then the OLS Fama–MacBeth slope estimator constructs the portfolio:

R_pt = λ̂_1t = Σ_i w_i R_it,  w_i = A_i / Σ_i A_i².  (41)

Suppose we used R_pt as a "factor" in an asset-pricing model. Would it appear to "price", i.e., would returns be linear in covariances with R_pt?

Cov(R_t, R_pt) = Cov(R) w = Cov(R) A / Σ_i A_i²,  (42)

[21] This condition would also apply in a multi-beta context, in which case the coefficient for a particular beta is a portfolio return with unit beta on the particular factor. Unbiasedness would also imply that the betas on the other factors equal zero, so the portfolio targets only the risk represented by the factor in question.
Ch. 12: Tests of Multifactor Pricing Models, Volatility Bounds and Portfolio Performance

where A is the N-vector of the A_i's. If Cov(R) A ∝ A, then the vector of covariances with R_pt = λ̂_1t will appear to "explain" the cross-section of expected asset returns. For example, suppose Cov(R) = I. Then returns are independent and there is no systematic risk. But Cov(R) A ∝ A, so the "factor" R_pt, formed by the Fama–MacBeth approach, will appear to work perfectly, in the sense that covariances with the factor return will exactly explain the cross-section of expected returns!

Similar results should be obtained when "spread" portfolios replace the Fama–MacBeth coefficients, as in Fama and French (1993, 1996). Spread portfolios are formed as the difference between a high-attribute portfolio return and a low-attribute portfolio return. Thus, they are long the high-attribute stocks, short the low-attribute stocks, and their weights sum to zero. A cross-sectional regression coefficient for stock returns on the attribute is also a linear combination of the returns with weights that sum to zero; the implied portfolio is long in high-attribute stocks and short in low-attribute stocks. If a multiple regression is used, it has zero exposure to the other regressors, and subject to these conditions it has minimum variance. A spread portfolio has a similar property if multiple independent sorts are used, as in Fama and French (1996), to control for the other attributes. While a spread portfolio does not explicitly minimize variance subject to these conditions, it avoids estimation error. Ferson, Sarkissian and Simin (1999) provide an example where Fama–MacBeth coefficients and spread portfolios, similar to Fama and French (1996), produce similar results in the face of an anomaly in asset returns. Their example shows that an arbitrary attribute, bearing an anomalous relation to returns, can be repackaged as a spurious risk factor.
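The Cov(R) = I case can be reproduced numerically. In the sketch below (Python; the attribute values and the premium on the anomaly are invented for illustration), the attribute is unrelated to risk by construction, yet covariances with the constructed "factor" fit the cross-section of expected returns exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10
a = rng.uniform(0.0, 0.05, N)        # anomalous attribute, unrelated to risk
A = a - a.mean()                     # demeaned attribute
w = A / (A @ A)                      # Fama-MacBeth slope-portfolio weights (Eq. 41)

cov_R = np.eye(N)                    # Cov(R) = I: no systematic risk at all
cov_with_factor = cov_R @ w          # Cov(R_t, R_pt) (Eq. 42), proportional to A

l0, l1 = 0.01, 1.0                   # hypothetical premium on the anomaly
ER = l0 + l1 * a                     # expected returns contain the anomaly

# Cross-sectional regression of E(R) on betas with respect to the spurious factor:
beta = cov_with_factor / (w @ cov_R @ w)
X = np.column_stack([np.ones(N), beta])
coef, *_ = np.linalg.lstsq(X, ER, rcond=None)
resid = ER - X @ coef
print(np.abs(resid).max())           # essentially zero: a perfect spurious "fit"
```

The betas on the constructed factor equal the demeaned attribute itself, so the fit is exact even though no asset carries any systematic risk.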
Recent studies employing the approach of Fama and French (1996) do not use arbitrary anomalous attributes. Some of the most empirically powerful characteristics for the cross-sectional prediction of returns are ratios with market price per share in the denominator. Berk (1995) emphasizes that the price of any stock is the value of its future cash flows discounted by future returns, so an anomalous pattern in the cross-section of returns would produce a corresponding pattern in book-to-market ratios or other proxies for cash flow to price. A cross-sectional regression of returns on these ratios will pick out the anomalous patterns. Thus, the use of valuation ratios such as book-to-market as a sorting criterion increases the risk of creating a spurious risk factor. In the real world, empirically measured attributes may be correlated with systematic risk and also with anomalous patterns in returns. The net result of the two effects, risk versus anomaly, is complicated and model-specific. However, equity market databases are inherently unbalanced panels, with more stocks than quarters or months. As new data on equity attributes become widely accessible, more studies will sort securities according to their attributes. The important caveat is that sorting procedures are subtle and easily abused. More work is needed to improve our understanding of the properties of such approaches.
4.2.4. Errors-in-betas

When the cross-sectional regression uses betas that are measured with error, two main issues arise. First, the cross-sectional regression coefficients suffer from a classical "attenuation bias". Second, the standard errors are biased. Early studies that used cross-sectional regression also used portfolio grouping procedures, attempting to minimize errors in the betas and to ensure that the remaining errors were uncorrelated with the other error terms in the model. More recently, empirical studies have taken to sorting stocks in order to accentuate some anomaly in the data, such as firm size, book-to-market, etc., in order to "challenge" the asset-pricing model more forcefully. Thus, concerns about errors in the betas remain relevant.

Consider first the cross-sectional regression model with no errors in the betas:

R_t = λ_0t + B λ_t + u_t,  (43)

where Cov(u_t, B) = 0 and R_t is an N-vector of returns. Assume that we do not get to see the true B; instead we have B*, where:

B* = B + v = true + "noise",  Cov(v, B) = 0.  (44)

Using the first-stage time-series or GMM estimation, we can get an estimate of Cov(v), the cross-sectional covariance matrix of the errors in betas. If we run the cross-sectional regression on the noisy betas:

R_t = λ_0t + B* λ_t + e_t,  (45)

then

λ̂_1 →p Cov(B*)⁻¹ Cov(B*, R_t) = [Cov(B) + Cov(v)]⁻¹ Cov(B + v, B λ_t + u_t) = [Cov(B) + Cov(v)]⁻¹ {Cov(B) λ_t}.

Theil (1971) proposes an adjusted estimator to control the bias:

λ*_t ≡ [Cov(B*) − Cov(v)]⁻¹ Cov(B*) λ̂_1 →p λ_t.  (46)

This estimator is used by Black and Scholes (1974) and Litzenberger and Ramaswamy (1979, 1982).

Most of the preceding analysis assumes that the same betas are used in each cross-sectional month. Under this simplifying assumption, errors in betas imply that the cross-sectional regression coefficients are not independent over time, because the same beta (with error) is used in each month.
Shanken (1992) shows how to correct Fama–MacBeth standard errors for this fact. In principle, the cleanest way to deal with errors-in-betas is to estimate the time-series model of betas and the cross-sectional regression simultaneously, thus accounting for the estimation error. This is the subject of the next section.
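The attenuation bias in (45) and the Theil-type adjustment (46) can be illustrated in the scalar case. The sketch below (Python) uses simulated betas with a known measurement-error variance; all parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5000
l0, l1 = 0.01, 0.05                     # true zero-beta rate and risk premium

B = rng.normal(1.0, 0.5, N)             # true betas
v = rng.normal(0.0, 0.5, N)             # measurement error, Var(v) = 0.25
B_star = B + v                          # noisy betas (Eq. 44)
R = l0 + B * l1 + rng.normal(0, 0.001, N)  # cross-section of returns (Eq. 43)

# Cross-sectional OLS on the noisy betas is attenuated toward zero (Eq. 45):
X = np.column_stack([np.ones(N), B_star])
l1_hat = np.linalg.lstsq(X, R, rcond=None)[0][1]

# Theil-type adjustment (Eq. 46), scalar case, with Var(v) = 0.25 treated as known:
var_bs = np.var(B_star, ddof=1)
l1_adj = var_bs / (var_bs - 0.25) * l1_hat

print(l1_hat, l1_adj)   # attenuated (about half of 0.05) vs. adjusted (near 0.05)
```

Here Var(B) = Var(v), so the attenuation factor is about one half; the adjusted estimator recovers the true premium up to sampling error.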
4.3. Multivariate regression and beta-pricing models

Tests of portfolio efficiency using multivariate regression analysis and maximum likelihood became popular in empirical finance following the work of Jobson and Korkie (1982, 1985), Gibbons (1982) and Stambaugh (1982). The traditional focus of this literature is tests of the CAPM, which implies that the market portfolio is mean-variance efficient. However, given the discussion in Section 2.3, multibeta models and stochastic discount factor models also imply that some portfolio is minimum-variance efficient, and the techniques reviewed here can be applied. Following the traditional literature, we ignore the presence of conditioning information in this section. [For tests of efficiency with conditioning information, see Ferson and Siegel (2002b).]

For simplicity, consider the case where a risk-free asset exists. Let r_t = {R_it − R_Ft}_i be an N-vector of excess returns, and let r_pt = R_pt − R_Ft be the excess return of the particular portfolio to be tested. The null hypothesis to be tested is that R_pt is a minimum-variance portfolio, which from Equation (13) is equivalent to:

E(r_t) = b E(r_pt);  b ≡ Cov(r_t, r_pt) Var(r_pt)⁻¹,  (47)

when there is a given risk-free rate. The tests are based on a regression model:

r_t = a + b r_pt + û_t,  û_t ~ iid(0, Ω),  (48)

where the null hypothesis implies that the vector of alphas, or intercepts, is a = 0.

MacKinlay and Richardson (1991) illustrate that it is easy to use the GMM to implement the tests of portfolio efficiency. To do so, one can form the moment conditions:

û_t = r_t − a − b r_pt,  Z_t = (1, r_pt)′,  g = (1/T) Σ_t (û_t ⊗ Z_t).  (49)

The parameters are φ = (a′, b′)′. Choosing the parameters to minimize g′Wg, we have the GMM estimators, which are the same as seemingly-unrelated OLS.
These are consistent and asymptotically normal, even without the assumption that the error terms are independent and identically distributed over time. It is assumed that the data are stationary, that E(û_t) = 0 = E(û_t r_pt), and that other technical conditions given by Hansen (1982) hold. If the assumptions that justify OLS as best linear unbiased are imposed, the GMM delivers the OLS standard errors as well. If the GMM uses the "optimal" weighting matrix, W = (1/T)[Cov(g)]⁻¹, then the asymptotic variance of the parameters is given by Equation (32). Imposing that û_t ~ iid(0, Ω), the GLS standard errors fall out as a special case. Several tests for the hypothesis that a = 0 are available using the GMM. [See, e.g.,
Newey and West (1987)]. One example is the Wald test, which may be formed as T a′ ACov(a)⁻¹ a, where ACov(·) denotes the asymptotic covariance.[22] The Wald statistic is asymptotically distributed as a chi-squared variable with degrees of freedom equal to the dimension of a.

Much of the literature works in a normal, maximum-likelihood setting. In this case, the log of the likelihood function to be maximized is:

ln L = −(NT/2) ln(2π) − (T/2) ln|Ω| − (1/2) Σ_t (r_t − a − b r_pt)′ Ω⁻¹ (r_t − a − b r_pt).  (50)

Standard tests for the hypothesis that a = 0 are compared by Buse (1982) and Gibbons, Ross and Shanken (1989), and most of the standard tests have been used to test the efficiency of stock market indexes, as in the CAPM. Examples include the likelihood ratio test [Gibbons (1982)], the Lagrange multiplier test [Stambaugh (1982)] and the Wald test [Gibbons, Ross and Shanken (1989)]. The Wald test is of particular interest. This is not because of its sampling performance, which is typically the worst of the three, but because it leads to a graphical interpretation that provides some economic intuition for the tests. Since the likelihood ratio and Lagrange multiplier tests are simple transformations of the Wald statistic, as shown by Buse (1982), a similar intuition applies to them.

We first need some facts about squared Sharpe ratios. The Sharpe ratio of r_p is E(r_p)/σ(r_p), the ratio of the expected excess return to its standard deviation. Let S²(r) be the maximum squared Sharpe ratio that can be obtained using fixed-weight portfolios of the N assets:

S²(r) ≡ Max_x (x′E(r))² / (x′Σx) = E(r)′ Σ⁻¹ E(r),  (51)

where the second equality follows from solving the calculus problem. The maximum squared Sharpe ratio in a sample of assets is related to the squared Sharpe ratio of a tested portfolio, r_p, included among the test assets, through a quadratic form in the alphas. I call this result the:

Law of Conservation of Squared Sharpe Ratios.  S²(r) = a′ Σ⁻¹ a + S²(r_p).
(52)

[22] The notation is as follows: √T(Ĉ − C) converges in distribution to a vector with mean zero and variance ACov(C). Thus, the asymptotic approximation to the finite-sample variance of Ĉ is (1/T) ACov(C).
A proof uses the fact that, in the stacked regression model, a′ Σ⁻¹ b = 0.[23]

Proof:
S²(r) = E(r)′ Σ⁻¹ E(r)
      = [a + b E(r_p)]′ Σ⁻¹ [a + b E(r_p)]
      = a′ Σ⁻¹ a + 2 E(r_p)′ b′ Σ⁻¹ a + E(r_p)′ b′ Σ⁻¹ b E(r_p)
      = a′ Σ⁻¹ a + E(r_p)′ b′ Σ⁻¹ b E(r_p)
      = a′ Σ⁻¹ a + E(r_p)′ Var(r_p)⁻¹ E(r_p)
      = a′ Σ⁻¹ a + S²(r_p).

[23] This occurs when the right-hand-side variable(s) are simple combinations of the test assets. In a stacked regression model, r = a + r_p b + u, where r_p = rW is a combination of the test assets with weights given by the N × k matrix W. Using the definition b = (W′ΣW)⁻¹W′Σ, where Σ is the covariance matrix of r, then:

a Σ⁻¹ b′ = E(r − r_p b) Σ⁻¹ [(W′ΣW)⁻¹ W′Σ]′
         = E(r − r_p b) Σ⁻¹ ΣW (W′ΣW)⁻¹
         = E(rW)(W′ΣW)⁻¹ − E(rW)(W′ΣW)⁻¹
         = 0.

Note also that Var(r_p) = W′ΣW and b Σ⁻¹ b′ = (W′ΣW)⁻¹.

The law states that the highest squared Sharpe ratio obtainable in the sample is equal to the squared Sharpe ratio of the tested portfolio, plus a sort of squared Sharpe ratio based on the alphas. If the alphas are zero, the two Sharpe ratios are the same and the tested portfolio is efficient. When the tested portfolio is not efficient, the quadratic form in the alphas tells how far it is from efficient. This is similar to our previous discussion of how a quadratic form in the APT pricing errors bounds the correlation between a combination of the APT factor portfolios and a minimum-variance efficient portfolio. MacKinlay (1995) develops the interpretation of portfolios whose weights are proportional to Σ⁻¹a, which have many interesting properties.

The law of conservation of squared Sharpe ratios provides a graphical interpretation of the Wald test statistic. Using the law, and the fact that in a multivariate regression model the covariance matrix of the intercept estimator is proportional to the covariance
matrix of the left-hand-side asset returns, or Cov(a) = (1 + S²(r_p)) Σ, we may write the Wald test as:

Wald = T a′ Cov(a)⁻¹ a
     = T (1 + S²(r_p))⁻¹ a′ Σ⁻¹ a
     = T (1 + S²(r_p))⁻¹ [S²(r) − S²(r_p)].  (53)

Thus, the test may be interpreted as a normalized difference between S²(r), the maximum squared Sharpe ratio in the sample of tested assets, and S²(r_p), the squared ratio for the tested portfolio. If the tested portfolio presents a Sharpe ratio that is "close" to that of the sample efficient portfolio, the value of the test statistic is small and we should not reject efficiency. If the tested portfolio lies far inside the sample mean-variance frontier, we are likely to reject its efficiency.

4.3.1. Comparing the SDF and beta-pricing approaches

Before the mid-1980s, most of the empirical asset-pricing literature used the beta-pricing representation (9) and regression-based approaches or MLE. Then the SDF representation combined with the GMM began to take hold. The latter combination is appealing, since E_t{mR − 1} = 0 leads naturally to moment conditions for the GMM, and it is easy to multiply by lagged instruments in order to use conditioning information. Recent studies have started to explore the tradeoffs between these approaches; see Kan and Zhou (1999), Jagannathan and Wang (2002) and Cochrane (2001).

We have seen that both cross-sectional and time-series regressions are special cases of the GMM. So is maximum likelihood. If we use the GMM on the first-order conditions, or "scores", of the likelihood function (50), we get quasi-maximum-likelihood estimators. If we further impose normality, then the information matrix identity leads to the MLE standard errors for the parameters, and therefore to the Cramér–Rao lower bound [see Hamilton (1994, Chapter 14)]. The implication is that the tradeoff between the approaches has little to do with GMM versus MLE or regression, but has everything to do with the set of moments that are examined.
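Returning briefly to the efficiency tests, the law of conservation of squared Sharpe ratios (52) and the Wald form (53) can be checked at the population level. The sketch below (Python) uses an invented mean vector and covariance matrix, with the tested portfolio a fixed combination of the test assets:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
mu = rng.uniform(0.02, 0.10, N)          # hypothetical expected excess returns E(r)
C = rng.normal(0.0, 0.3, (N, N))
Sigma = C @ C.T + 0.1 * np.eye(N)        # positive-definite covariance of r

wp = np.array([0.4, 0.3, 0.2, 0.1])      # tested portfolio: a combination of the assets
mup, varp = wp @ mu, wp @ Sigma @ wp

b = Sigma @ wp / varp                    # betas of the N assets on r_p
a = mu - b * mup                         # alphas: E(r) - b E(r_p)

S2_r = mu @ np.linalg.solve(Sigma, mu)   # maximum squared Sharpe ratio (Eq. 51)
S2_rp = mup**2 / varp                    # squared Sharpe ratio of the tested portfolio
quad = a @ np.linalg.solve(Sigma, a)     # quadratic form in the alphas

print(S2_r, quad + S2_rp)                # equal: the law (Eq. 52)

T = 60
wald = T * (S2_r - S2_rp) / (1 + S2_rp)  # population analogue of the Wald form (Eq. 53)
```

Because the tested portfolio is spanned by the test assets, the identity holds exactly, and the Wald quantity is nonnegative: it grows as the tested portfolio moves inside the frontier.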
The SDF representation and the beta-pricing formulation can lead us to examine different moments. When they do, their empirical results can differ. This can be illustrated in the context of a recent debate. Kan and Zhou (1999) consider returns in excess of a risk-free rate, r, comparing beta pricing with the SDF approach. Ignoring conditioning information, beta pricing says r_t = b(f_t + λ) + u_t, where f_t ≡ F_t − E(F_t) is the mean-centered factor, and the moments are E(u_t) = E(u_t f_t) = 0. The stochastic discount factor is m_t = 1 − b′f_t, and the SDF moment conditions for the excess returns are E{r_t(1 − b′f_t)} = 0. Kan and Zhou find that the SDF approach is much less efficient than beta pricing. However, the moments being used are not the same. Kan and Zhou implicitly assume that E(F_t) is known,
ignoring the moment condition for estimating E(F_t) and the sampling variation that this moment condition generates. Jagannathan and Wang (2002) and Cochrane (2001) show that when the two methods correctly exploit the same moments, they deliver nearly identical results.

5. Conditional performance evaluation

Classical measures of investment performance compare the return of a managed portfolio to that of a benchmark. For example, an alpha for a fund may be calculated as the average return in excess of a risk-free rate, minus a fixed beta times the average excess return of a benchmark portfolio. Using the market portfolio of the CAPM as the benchmark, Jensen (1968) advocated such a risk-adjusted measure of performance. These classical measures are "unconditional", in the sense that the expected returns and betas in the model are unconditional moments, estimated by past averages. If expected returns and risks vary over time, the classical approach is likely to be unreliable. For example, if the risk exposure of a managed portfolio varies predictably with the business cycle, but the manager has no superior forecasting ability, a traditional approach to performance measurement will confuse the common variation in fund risk and expected market returns with abnormal performance. In conditional performance evaluation, we model the expected returns and risk measures, attempting to account for their changes with the state of the economy and thus controlling for common variation. The problem of confounding variation in mutual fund risks and market returns has long been recognized [e.g., Jensen (1972), Grant (1977)], but previous studies interpreted it as reflecting superior information or market-timing ability.
Conditional performance evaluation takes the view that a managed portfolio strategy which can be replicated using readily available public information should not be judged as having superior performance. For example, in a conditional approach, a mechanical market-timing rule using lagged interest-rate data is not a value-adding strategy. Only managers who correctly use more information than is generally publicly available are considered to have potentially superior ability. Conditional performance evaluation is therefore consistent with a version of market efficiency, in the semi-strong-form sense of Fama (1970).

The beauty of a conditional approach to performance evaluation is that it can accommodate whatever standard of superior information is held to be appropriate, through the choice of the lagged instruments used to represent the public information. Given a set of lagged instruments, managers who trade mechanically in response to these variables get no credit. In practice, the trading behavior of managers may overlay complex portfolio dynamics on the dynamics of the underlying assets they trade. The desire to handle such dynamic strategies further motivates a conditional approach.
5.1. A numerical example

The appeal of a conditional model for performance evaluation can be illustrated with the following highly stylized numerical example. Assume that there are two equally likely states of the market as reflected in investors' expectations; say, a "Bull" state and a "Bear" state. In a Bull market, assume that the expected return of the S&P500 is 20%, and in a Bear market, it is 0%. The risk-free return to cash is 5%. Assume that all investors share these views; the current state of expected market returns is common knowledge. In this case, an investment strategy using as its only information the current state of the market will not yield abnormal returns.

Now imagine a mutual fund which holds the S&P500 in a Bull market and holds cash in a Bear market. Conditional on a Bull market, the beta of the fund is 1.0, the fund's expected return is 20%, equal to the S&P500, and the fund's alpha is zero. Conditional on a Bear market, the fund's beta is 0.0, the expected return of the fund is the risk-free return, 5%, and the alpha is again zero. A conditional approach to performance evaluation correctly reports an alpha of zero in each state.

By contrast, an unconditional approach to performance evaluation incorrectly reports an alpha greater than zero for our hypothetical mutual fund. The unconditional beta of the fund[24] is 0.75. The unconditional expected return of the fund is 0.5(0.20) + 0.5(0.05) = 0.125. The unconditional expected return of the S&P500 is 0.5(0.20) + 0.5(0.0) = 0.10, and the unconditional alpha of the fund is therefore: (0.125 − 0.05) − 0.75(0.10 − 0.05) = 0.0375. The unconditional approach leads to the mistaken conclusion that the manager has positive abnormal performance. But the manager's performance does not reflect superior skill or ability; it just reflects the fund's decision to take on more market risk in times when the risk is more highly rewarded in the market.
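The example's arithmetic is easy to verify. The following Python lines reproduce the unconditional beta, the spurious unconditional alpha, and the zero conditional alphas:

```python
p = 0.5                                    # equally likely Bull and Bear states
mkt = {"bull": 0.20, "bear": 0.00}         # expected S&P500 return by state
rf = 0.05                                  # return to cash
fund = {"bull": mkt["bull"], "bear": rf}   # fund: index in Bull, cash in Bear

E_m = sum(p * mkt[s] for s in mkt)         # 0.10
E_f = sum(p * fund[s] for s in fund)       # 0.125

cov_fm = sum(p * (fund[s] - E_f) * (mkt[s] - E_m) for s in mkt)   # 0.0075
var_m = sum(p * (mkt[s] - E_m) ** 2 for s in mkt)                 # 0.01
beta_u = cov_fm / var_m                                           # 0.75

alpha_u = (E_f - rf) - beta_u * (E_m - rf)                    # 0.0375, spurious
alpha_bull = (fund["bull"] - rf) - 1.0 * (mkt["bull"] - rf)   # 0: beta is 1 in Bull
alpha_bear = (fund["bear"] - rf) - 0.0 * (mkt["bear"] - rf)   # 0: beta is 0 in Bear
print(beta_u, alpha_u, alpha_bull, alpha_bear)
```

The unconditional alpha is positive even though the conditional alpha is zero in both states, which is exactly the bias the conditional approach removes.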
[24] The calculation is as follows. The unconditional beta is Cov(F, M)/Var(M), where F is the fund return and M is the market return. The numerator is:

Cov(F, M) = E{(F − E(F))(M − E(M)) | Bull} × Prob(Bull) + E{(F − E(F))(M − E(M)) | Bear} × Prob(Bear)
          = {(0.20 − 0.125)(0.20 − 0.10)} × 0.5 + {(0.05 − 0.125)(0 − 0.10)} × 0.5 = 0.0075.

The denominator is:

Var(M) = E{(M − E(M))² | Bull} × Prob(Bull) + E{(M − E(M))² | Bear} × Prob(Bear)
       = (0.20 − 0.10)² × 0.5 + (0.0 − 0.10)² × 0.5 = 0.01.

The beta is therefore 0.0075/0.01 = 0.75.

Investors who have access to the same information about the
economic state would not be willing to pay the fund management fees to use this common knowledge.

5.2. Stochastic discount factor formulation

For a given SDF we may define a fund's conditional SDF alpha, following Chen and Knez (1996) and Farnsworth et al. (2002), as:

α_pt ≡ E(m_{t+1} R_{p,t+1} | Z_t) − 1,  (54)

where one dollar invested with the fund at time t returns R_{p,t+1} dollars at time t+1. If the SDF prices a set of "primitive" assets, R_{t+1}, then α_pt will be zero when the fund (costlessly) forms a portfolio of the primitive assets, if the portfolio strategy uses only the public information at time t. In that case R_{p,t+1} = x(Z_t)′R_{t+1}, where x(Z_t) is the portfolio weight vector. Then Equation (2) implies that α_pt = [E(m_{t+1} x(Z_t)′R_{t+1} | Z_t)] − 1 = x(Z_t)′[E(m_{t+1} R_{t+1} | Z_t)] − 1 = x(Z_t)′1 − 1 = 0.

Consider an example where m_{t+1} is the intertemporal marginal rate of substitution for a representative investor, and Equation (2) is the Euler equation which must be satisfied in equilibrium. If the consumer has access to a fund for which the conditional alpha is not zero, he or she will wish to adjust the portfolio, purchasing more of the fund if alpha is positive and less if alpha is negative.

The SDF alpha depends on the model for the SDF, and the SDF is not unique unless markets are complete. Thus, different SDFs can produce different measured performance. This mirrors the classical approaches to performance evaluation, where performance is sensitive to the benchmark.[25] While α_pt is in general a function of Z_t, it is simpler to discuss the estimation of α_p = E(α_pt). The parameter α_p is the expectation of the conditional alpha, defined by Equation (54).
Thus, we examine the average abnormal performance of a fund.[26] A useful approach for estimating SDF alphas in this case is to form a system of equations as follows:

u_1t = [m_{t+1} R_{t+1} − 1] ⊗ Z_t,
u_2t = α_p − m_{t+1} R_{p,t+1} + 1.  (55)

The sample moment condition is g = T⁻¹ Σ_t (u_1t, u_2t)′. We can use the GMM to simultaneously estimate the parameters of the SDF model and the fund's SDF alpha.

[25] Roll (1978), Dybvig and Ross (1985), Brown and Brown (1987), Chen, Copeland and Mayers (1987), Lehmann and Modest (1988) and Grinblatt and Titman (1989b) address this issue in the beta-pricing context. Farnsworth et al. (2002) provide empirical evidence for the SDF setting.

[26] For a discussion of time-varying conditional alphas, see Christopherson, Ferson and Glassman (1998a).
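In the unconditional case (Z_t = 1), the moments in (55) are simple sample averages. The sketch below (Python) builds artificial data in which a toy SDF prices a primitive asset by construction and a hypothetical fund adds two cents of SDF alpha per dollar invested; all numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 100_000
m = np.exp(rng.normal(-0.01, 0.05, T))   # toy lognormal SDF
m *= (1 / 1.01) / m.mean()               # normalize so the sample mean of m is 1/1.01

eps = rng.normal(0.0, 0.10, T)           # return noise, independent of m
R = (1.0 + eps) / m.mean()               # primitive gross return: E(mR) = 1 by construction
alpha_true = 0.02
Rp = R + alpha_true / m.mean()           # "fund" payoff with an SDF alpha of 0.02

# Unconditional versions of the moments in system (55):
u1 = m * R - 1.0                         # pricing error of the primitive asset, mean ~ 0
ap_hat = np.mean(m * Rp - 1.0)           # estimate of the fund's SDF alpha, ~ 0.02
print(u1.mean(), ap_hat)
```

In the full system, u_1 would be interacted with the instruments Z_t and the SDF parameters estimated jointly by GMM; the point here is only that the fund's alpha is the mean pricing error of its return under the SDF.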
5.2.1. Invariance to the number of funds

The system (55) may be estimated using a two-step approach, where the parameters of the model for m_{t+1} are estimated in the first step and the fitted SDF is used to estimate alphas in the second step. Farnsworth et al. (2002) find that simultaneous estimation is dramatically more efficient. However, a potential problem with the simultaneous approach is that the number of moment conditions grows substantially if many funds are to be evaluated, and there are more funds than months in most of the available data sets. Fortunately, Farnsworth et al. (2002) show that we can estimate the joint system separately for each fund without loss of generality. Estimating a version of system (55) one fund at a time is equivalent to estimating a system with many funds simultaneously: the estimates of α_p and the standard errors for any subset of funds are invariant to the presence of another subset of funds in the system.

5.2.2. Additional issues

Farnsworth et al. (2002) consider two sets of linear factor models for m_{t+1}. One is based on nontraded factors (e.g., industrial production) and another is based on traded factors (e.g., the S&P500 index). For the traded-factor models, they find that it is important to impose the restriction that the model price the traded factors. For example, in the unconditional CAPM, m_{t+1} = a + b R_{m,t+1}, where R_{m,t+1} is the gross market return. Requiring the model to price the market return and also a zero-beta return, we have:

E[(a + b R_{m,t+1}) R_{m,t+1}] = 1 and E[(a + b R_{m,t+1}) R_{0,t+1}] = 1.  (56)

These two conditions identify the parameters a(·) and b(·) as functions of the first and second moments of the market index and the zero-beta return, as shown previously in Lemma 1. Farnsworth et al. also find that it is important to impose the restriction that the model prices the risk-free asset.
This identifies the conditional mean of the SDF: E(m_{t+1} | Z_t) = R_{ft}⁻¹, when R_{ft} is included in Z_t. Nontraded-factor models, in particular, are much less accurate when they are not forced to price the risk-free asset.

5.3. Beta-pricing formulation

Ferson and Schadt (1996) modify Jensen's alpha and two simple market-timing models to incorporate conditioning information. They start with a conditional CAPM, which implies that Equation (57) is satisfied for the assets available to portfolio managers.
They show it is easy to extend the analysis beyond the CAPM to a conditional multiple-beta model.

r_{it+1} = b_{im}(Z_t) r_{mt+1} + u_{i,t+1},  i = 0, ..., N,  t = 0, ..., T − 1,
E(u_{i,t+1} | Z_t) = 0,
E(u_{i,t+1} r_{mt+1} | Z_t) = 0.  (57)

The b_{im}(Z_t) are the time-t conditional market betas of the excess return of asset i. The second equation of (57) follows from the conditional CAPM assumption, and the third equation says that the b_{im}(Z_t) are conditional regression coefficients. Equation (57) implies that a portfolio strategy which depends only on the public information Z_t will satisfy a similar regression. The intercept, or "alpha", of the regression should be zero, and the error term should not be related to the public information variables.[27]

Under the hypothesis that the manager uses no more information than Z_t, the portfolio beta, b_{pm}(Z_t), is a function only of Z_t. Using a Taylor series, we can approximate this function linearly:

b_{pm}(Z_t) = b_{0p} + B_p′ z_t,  (58)

where z_t = Z_t − E(Z) is the vector of deviations of Z_t from its unconditional mean, and B_p is a vector with dimension equal to the dimension of Z_t. The coefficient b_{0p} may be interpreted as an "average beta", i.e., the unconditional mean of the conditional beta, E(b_{pm}(Z_t)). The elements of B_p are the response coefficients of the conditional beta with respect to the information variables Z_t.

Equations (57) and (58) imply a regression of a managed portfolio's excess return on the market excess return and its product with the lagged information:

r_{pt+1} = a_p + d_{1p} r_{mt+1} + d_{2p}′ (z_t r_{mt+1}) + e_{pt+1},  (59)

where the model implies a_p = 0, d_{1p} = b_{0p}, and d_{2p} = B_p. The regression (59) may be interpreted as a multi-factor model, where the excess market return is the first factor and the products of the market and the lagged information variables are additional factors.
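Regression (59) is just OLS on an augmented set of regressors. A simulated sketch (Python; the instrument, beta function, and noise levels are all made up) in which the fund's beta varies with z_t but the true conditional alpha is zero:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 2000
z = rng.normal(size=T)                        # demeaned lagged instrument z_t
rm = 0.005 + rng.normal(0.0, 0.04, T)         # market excess return
b0, Bp = 1.0, 0.3                             # average beta and its response to z (Eq. 58)
rp = (b0 + Bp * z) * rm + rng.normal(0.0, 0.01, T)  # fund excess return, true alpha = 0

# Eq. (59): regress r_p on a constant, r_m, and the product z_t * r_m
X = np.column_stack([np.ones(T), rm, z * rm])
ap, d1, d2 = np.linalg.lstsq(X, rp, rcond=None)[0]
print(ap, d1, d2)   # ap near 0, d1 near b0 = 1.0, d2 near Bp = 0.3
```

In applications, r_p would be a fund's observed excess return and z_t lagged instruments such as interest rates or dividend yields; a significantly positive estimate of a_p then indicates performance beyond mechanical responses to public information.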
[27] That is, if R_{p,t+1} = x(Z_t)′R_{t+1}, where x(·) is an N-vector of weights and R_{t+1} is the N-vector of the available risky security returns, then the portfolio excess return will satisfy the conditional CAPM, with b_{pm}(Z_t) = x(Z_t)′b_m(Z_t), where b_m(Z_t) is the vector of the securities' conditional betas. The error term in the regression for the portfolio strategy is u_{p,t+1} = x(Z_t)′u_{t+1}, where u_{t+1} is the vector of the u_{i,t+1}'s, and therefore E(u_{p,t+1} | Z_t) = E(x(Z_t)′u_{t+1} | Z_t) = x(Z_t)′E(u_{t+1} | Z_t) = 0.

The additional factors may be interpreted as the returns to dynamic strategies which hold z_t units of the market index, financed by borrowing or selling z_t in Treasury bills. The coefficient a_p is the average difference between the managed portfolio's excess return and the excess return to the dynamic strategies which replicate its time-varying risk exposure. A manager with a positive
alpha in this setting is one whose average return is higher than the average return of the conditional-beta-replicating strategies.

5.4. Using portfolio weights

The previously discussed performance-measurement techniques are all returns-based. The strength of returns-based methodologies is their minimal information requirements: one needs only returns on the managed portfolio and data for the model of m_{t+1}. However, this ignores potentially useful information that is often available in practice: the composition of the managed portfolio. Grinblatt and Titman (1989a, 1993) propose a weight-based measure of mutual fund performance. Their measure combines portfolio weights with unconditional moments to measure performance. Ferson and Khang (2002) argue that the use of portfolio weights may be especially important in a conditional setting. When expected returns are time-varying and managers trade between return observation dates, returns-based approaches are likely to be biased. This "interim trading bias" can be avoided by using portfolio weights in a conditional setting.

The intuition behind weight-based performance measures can be motivated with a single-period model in which an investor maximizes the expected utility of terminal wealth:

Max_x E{U(W_0 [R_f + x′r]) | Z, S},  (60)

where R_f is the risk-free rate, r is the vector of risky-asset returns in excess of the risk-free rate, W_0 is the initial wealth, x is the vector of portfolio weights on the risky assets, Z is public information available at time 0, and S is private information available at time 0. Private information, by definition, is correlated with r, conditional on Z.
If returns are conditionally normal, the first- and second-order conditions for the maximization when the investor has nonincreasing absolute risk aversion imply [see Khang (1997)] that

E{x(Z, S)′ [r − E(r | Z)] | Z} > 0,  (61)

where x(Z, S) is the optimal weight vector and r − E(r | Z) are the unexpected, or abnormal, returns from the perspective of an observer with the public information. Conditional on the public information, the sum of the conditional covariances between the weights of a manager with private information, S, and the abnormal returns for the securities in the portfolio is positive. If the manager has no private information, the covariance is zero.

Ferson and Khang (2002) study a Conditional Weight Measure (CWM) that follows from Equation (61). They introduce a "benchmark" weight, x_b, that is in the public information set Z, so Equation (61) implies

E{[x(Z, S) − x_b]′ [r − E(r | Z)] | Z} > 0,  (62)

if the manager has superior information, S. Because x_b is a constant given Z, it will not affect the conditional covariance. Weight changes are advantageous on statistical
Ch. 12: Tests of Multifactor Pricing Models, Volatility Bounds and Portfolio Performance 791

grounds, as the levels of the weights may be nonstationary. Other benchmark weights could be used when a particular benchmark is suggested by the application.

5.3.1. Conditional performance attribution

Traditional regression-based analysis sometimes interprets the regression as providing a decomposition of the sources of a fund’s returns. For example, the fund’s beta times the market excess return is the component of the fund’s excess return due to overall market exposure. Conditional performance measures allow refinements of such decompositions. For example, in the Ferson and Schadt regression (59), we have a component due to average market exposure and one due to the mechanical use of the instruments, $Z_t$, to track the time-varying exposure. The average conditional alpha is the difference between the fund return and the dynamic beta-matched strategy. Weight-based measures allow a similar decomposition. Consider the following identity for the unconditional covariance:

$$\sum_j \operatorname{Cov}\left( \Delta x_j,\, r_j \right) = \sum_j E\left[ \operatorname{Cov}\left( \Delta x_j,\, r_j \,|\, Z \right) \right] + \sum_j \operatorname{Cov}\left( E\left[ \Delta x_j \,|\, Z \right],\, E\left[ r_j \,|\, Z \right] \right), \tag{63}$$

where $\Delta x_j \equiv x_{jt} - x_{bjt}$. The left-hand side is the unconditional weight measure (UWM), as in Grinblatt and Titman (1993). The second term is the “average” conditional weight measure, equal to the unconditional mean of Equation (62). The third term captures the variation in the weight changes associated with changes in the expected returns, conditioned on public information. By comparing the conditional and unconditional measures, the third term may be calculated as a residual.

Equation (63) decomposes the manager’s total return from active trading into a component attributable to private information (the first term on the right) and a component attributable to the public information. For example, the second component may be compared with the investor’s cost of monitoring the public information.
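As a numerical check on this decomposition, the sketch below (the two-state signal, weight rule and parameter values are invented assumptions, not from the chapter) simulates a single asset and verifies the sample analogue of Equation (63): the unconditional weight measure equals the average conditional covariance plus the covariance of the conditional means.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 400_000

# Two-state public signal Z with different conditional mean returns.
Z = rng.integers(0, 2, T)
mu = np.where(Z == 1, 0.02, -0.02)          # E(r | Z)
r = mu + rng.normal(0.0, 0.05, T)

# Weight change: one part tracks the public signal, one part tracks the
# abnormal return (standing in for private information), plus noise.
dx = 0.3 * mu + 1.5 * (r - mu) + rng.normal(0.0, 0.01, T)

def cov(a, b):
    return np.mean((a - a.mean()) * (b - b.mean()))

uwm = cov(dx, r)  # left-hand side of (63): unconditional weight measure

# E[ Cov(dx, r | Z) ]: probability-weighted within-state covariances.
avg_cond = sum(np.mean(Z == s) * cov(dx[Z == s], r[Z == s]) for s in (0, 1))

# Cov( E[dx|Z], E[r|Z] ): covariance of the fitted conditional means.
e_dx = np.where(Z == 1, dx[Z == 1].mean(), dx[Z == 0].mean())
e_r = np.where(Z == 1, r[Z == 1].mean(), r[Z == 0].mean())
cond_mean_cov = cov(e_dx, e_r)

print(uwm, avg_cond + cond_mean_cov)  # the two sides agree
```

With these sample estimators the identity holds exactly (up to floating-point error), mirroring the law of total covariance behind Equation (63).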
The first component is the performance the investor could not obtain without the manager, even if he chose to monitor the public information. Isolating this component enables an investor to compensate a manager for his use of private information.

5.3.2. Interim trading bias

The conditional weight-based approach can control an interim trading bias, which arises when we depart from the assumption that returns are independently and identically distributed (iid) over time, and is therefore especially relevant to a conditional setting. Consider an example where returns are measured over two “periods”, but a manager trades each period. The manager has neutral performance, but the portfolio weights for the second period can be a function of public information at the intervening date. If returns are iid, this creates no bias, as there is no information at the intervening date that is correlated with the second-period return. However, if expected returns vary with public information, then a manager who observes and trades
on public information at the intervening date generates a return for the second period from the conditional distribution. His two-period portfolio strategy will contain more than the public information at the beginning of the first period, and a returns-based measure over the two periods will detect this as “superior” information. Goetzmann, Ingersoll and Ivkovic (2000) address interim trading bias by simulating the multiperiod returns generated by the option to trade between return observation dates. A conditional weight-based measure avoids the problem by examining the conditional covariance between the manager’s weights at the beginning of the first period and the subsequent two-period returns. The ability of the manager to trade at the intervening period thus creates no interim trading bias. Of course, managers may engage in interim trading based on superior information to enhance performance, and a weight-based measure will not record these interim trading effects. Interim trading thus presents a bias to returns-based measures under the null hypothesis that managers possess only public information. Under the alternative hypothesis of superior ability, a weight-based measure may have limited power to detect the ability. Thus, the cost of using a weight-based measure to avoid bias is a potential loss of power. Ferson and Khang (2002) evaluate these tradeoffs and conclude that the conditional weight-based measure is attractive.

5.4. Conditional market-timing models

In a market-timing context, the goal of conditional performance evaluation is to distinguish timing ability that merely reflects publicly available information, as captured by the set of lagged instrumental variables, from timing based on better information. We may call such informed timing ability conditional market timing.
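The interim trading bias described in Section 5.3.2 can be made concrete with a small simulation (a sketch under invented assumptions: the signal variance, weights and response coefficient are made up). A manager who trades at the intervening date on purely public information earns a higher average two-period return than a constant-weight strategy, which a returns-based measure would misread as skill; the weight at the start of the measurement period, however, is constant and cannot covary with the subsequent two-period return.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000  # number of two-period episodes

r1 = rng.normal(0.0, 0.05, n)        # period-1 excess return, unpredictable
z = rng.normal(0.0, 0.02, n)         # public signal seen between the periods
r2 = z + rng.normal(0.0, 0.05, n)    # period-2 return, predictable from z

w1 = 0.5                             # weight chosen at the start of period 1
w2 = 0.5 + 5.0 * z                   # interim trade on public information only

managed = w1 * r1 + w2 * r2          # two-period excess return with trading
static = w1 * r1 + w1 * r2           # no interim trading

# A returns-based comparison over the two-period horizon sees extra average
# return, even though the manager used no private information at all.
print(managed.mean() - static.mean())   # positive: spurious "performance"
```

The spurious component equals the weight response times the variance of the public signal; the start-of-period weight `w1` is a constant, so the conditional weight measure correctly assigns it zero covariance with the two-period return.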
A classic market-timing regression, when there is no conditioning information, is the quadratic regression of Treynor and Mazuy (1966):

$$r_{p,t+1} = a_p + b_p\, r_{m,t+1} + \gamma_{tmu} \left[ r_{m,t+1} \right]^2 + v_{p,t+1}, \tag{64}$$

where the coefficient $\gamma_{tmu}$ measures market timing ability. Admati et al. (1986) describe a model in which a manager with constant absolute risk aversion, in a normally distributed world, observes at time $t$ a private signal equal to the future market return plus noise, $r_{m,t+1} + \eta$. The manager’s response is to change the portfolio beta as a linear function of the signal. They show that the $\gamma_{tmu}$ coefficient in regression (64) is positive if the manager increases market exposure when the signal about the future market return is positive. In a conditional model, the part of the correlation of fund betas with the future market return that can be attributed to the public information is not considered to reflect market timing ability. Ferson and Schadt (1996) develop a conditional version of the Treynor–Mazuy regression:

$$r_{p,t+1} = a_p + b_p\, r_{m,t+1} + C_p' \left( z_t\, r_{m,t+1} \right) + \gamma_{tmc} \left[ r_{m,t+1} \right]^2 + v_{p,t+1}, \tag{65}$$

where the coefficient vector $C_p$ captures the linear response of the manager’s beta to the public information, $Z_t$. The term $C_p'(z_t\, r_{m,t+1})$ controls for the public information
effect, which would bias the coefficients in the original Treynor–Mazuy model. The coefficient $\gamma_{tmc}$ measures the sensitivity of the manager’s beta to the private market-timing signal.

Merton and Henriksson (1981) and Henriksson (1984) describe an alternative model of market timing in which the quadratic term in Equation (64) is replaced by an option payoff, $\max(0,\, r_{m,t+1})$. This reflects the idea that market timers may be thought of as delivering (hopefully, attractively priced) put options on the market index. Ferson and Schadt (1996) develop a conditional version of this model as well.

Becker et al. (1999) develop conditional market-timing models with explicit performance benchmarks. In this case, managers maximize the utility of their portfolio returns in excess of a benchmark portfolio return. In practice, performance benchmarks often represent an important component of managers’ incentive systems. Such benchmarks have been controversial in the academic literature. Starks (1987), Grinblatt and Titman (1989c) and Admati and Pfleiderer (1997) argue that benchmarks do not properly align managers’ incentives. Carpenter, Dybvig and Farnsworth (2000) provide a theoretical justification for benchmarks, used in combination with investment restrictions. Becker et al. simultaneously estimate the fund managers’ risk aversion for tracking error and the precision of the market-timing signal in a sample of more than 400 U.S. mutual funds for 1976–94, including a subsample with explicit asset allocation objectives. The estimates suggest that U.S. equity mutual funds behave as risk-averse, benchmark investors, but little evidence of timing ability is found.

5.5. Empirical evidence on conditional performance

Traditional measures of the average abnormal performance of mutual funds, like Jensen’s alpha, are observed to be negative more often than positive across the many studies.
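The timing regressions (64) and (65) above can be estimated by ordinary least squares. The sketch below (simulated data; the parameter values, signal structure and sample size are invented for illustration, not from the chapter) generates a fund whose beta responds both to a public instrument and to a noisy private signal about the market, then recovers the conditional timing coefficient from the conditional Treynor–Mazuy regression.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 100_000

z = rng.normal(0.0, 1.0, T)            # lagged public instrument (demeaned)
rm = 0.01 + rng.normal(0.0, 0.04, T)   # market excess return

# The fund's beta responds to the public instrument and to a noisy private
# signal about the coming market return (an Admati et al.-style timer).
b0, c_true, g_true = 1.0, 0.2, 0.5
signal = rm + rng.normal(0.0, 0.04, T)
beta = b0 + c_true * z + g_true * signal
rp = beta * rm + rng.normal(0.0, 0.02, T)

# Conditional Treynor-Mazuy regression (65):
#   rp = a + b*rm + C*(z*rm) + g_tmc*rm**2 + v
X = np.column_stack([np.ones(T), rm, z * rm, rm ** 2])
coef, *_ = np.linalg.lstsq(X, rp, rcond=None)
a_hat, b_hat, c_hat, g_hat = coef
print(b_hat, c_hat, g_hat)  # close to the true 1.0, 0.2, 0.5
```

Replacing `rm ** 2` with `np.maximum(0.0, rm)` in the design matrix gives a sketch of the Merton–Henriksson option-payoff variant discussed above.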
For example, Jensen (1968) used the CAPM to conclude that a typical fund has neutral performance only after adding back expenses. Traditional measures of market timing often find that any significant market-timing ability is perversely “negative”, suggesting that investors could time the market by doing the opposite of a typical fund. Such results make little economic sense, which suggests that they may be spurious.

Conditional performance evaluation takes the view that a mechanically managed portfolio strategy using only public information does not have abnormal performance. A manager’s return is therefore compared with such a benchmark, mechanically constructed using public information to match the time-varying risk of the fund.

The empirical evidence suggests that conditional performance measures can produce results different from the classical methods. Ferson and Schadt (1996) find evidence that funds’ risk exposures change in response to public information on the economy, such as the level of interest rates and dividend yields. Using conditional models, Ferson and Schadt (1996), Kryzanowski, Lalancette and To (1997) and Zheng (1999) find that the distribution of mutual fund alphas shifts to the right and is centered near zero. Ferson and Warther (1996) attribute differences between unconditional and conditional alphas to predictable flows of public
money into funds. Inflows are correlated with reduced market exposure at times when the public expects high returns, reflecting larger cash holdings at such times. In pension funds, which are not subject to high-frequency flows of public money, no overall shift in the distribution of fund alphas is found when moving to conditional models [Christopherson et al. (1998b)].

Once we control for public information variables, there seems to be little evidence that mutual funds have conditional timing ability for the level of the market return. Busse (1999) asks whether fund returns contain information about market volatility. He finds evidence, using daily data, that funds may shift their market exposures in response to changes in second moments. Further research in this direction is clearly warranted.

Farnsworth et al. (2002) use a variety of SDF models to evaluate performance in a monthly sample of U.S. equity mutual funds. They find that many of the SDF models are biased. The average bias is about −0.19% per month for unconditional models and −0.12% for conditional models. This is less than two standard errors, as a typical standard error is 0.1% per month. They find that the average mutual fund alpha is no worse than that of a hypothetical stock-picking fund with neutral performance. Adding back average expenses of about 0.17% per month to the mutual fund alphas (since the actual funds pay expenses, while the hypothetical funds do not), the average fund’s performance is slightly higher than that of hypothetical funds with no ability.

Ferson and Khang (2002) develop the conditional, weight-based approach to measuring performance. Using a sample of equity pension fund managers, 1985–1994, they find that the traditional, returns-based alphas of the funds are positive, consistent with previous studies of pension fund performance. However, these alphas are smaller than the potential effects of interim trading bias.
By using instruments for public information combined with portfolio weights, their conditional weight-based measures find that the pension funds also have neutral performance. Thus, the empirical evidence based on conditional performance measures suggests that abnormal fund performance, controlling for public information, is rare.

6. Conclusions

This chapter has reviewed tests of multifactor asset pricing models, volatility bounds and portfolio performance. We developed three essentially equivalent paradigms: beta pricing, stochastic discount factors and minimum-variance efficiency, and we discussed each approach in the context of conditional asset-pricing models. These models are stated in terms of expected returns and risk measures, conditioned on available information about the state of the economy. Conditional models are most interesting when there are observable instruments that can track time-varying expected returns and security risks. The evidence for such predictability in returns is both extensive and controversial. Conditional asset-pricing models should provide a useful framework for many continuing investigations.
The three paradigms of empirical asset pricing have traditionally been linked with particular empirical methods. The stochastic discount factor paradigm seems to fit naturally with the generalized method of moments. Mean-variance efficiency tests have (since the early 1980s) most commonly employed multivariate regression, and multibeta models seem to cry out for cross-sectional regressions. But this pairing of the models and methods is not sacrosanct. In fact, any of these empirical methods can be paired with any of the paradigms, and recent studies are beginning to explore the possibilities and tradeoffs. I am coming to the view that the set of moments the investigator chooses to examine is the key issue. Different approaches can lead one to examine different moments, and when they do, the results will differ.

Conditional performance evaluation provides an example where the models and methods meet the data. Empirical work in this area essentially applies conditional asset-pricing models to the returns of managed portfolios. The evidence to date shows that conditional models make a difference. I expect these approaches to yield more interesting insights in the future about the behavior and performance of mutual funds, pension funds, hedge funds and other professionally managed portfolios. Conditional asset pri