Post Modern Investment Management
An overview of post modern investment management concepts and techniques


Presentation Transcript

    • Post Modern Investment Management A Scientific Approach for the Complex World An Overview March 2010 Mikhail Munenzon, CFA, CAIA 1
    • Table of Contents
      • Returns 5
      • Volatility 31
      • Risk Measurement 39
      • Risk and Return 52
      • Dependencies 60
      • Risk Management 73
      • Markets 76
      • Human Decision Making and Investor Edge 78
      • Forecasting 81
      • Optimization 88
      • Hedging 101
      • Asset Allocation 103
      • Manager Evaluation and Selection 105
      • Concluding remarks 108
      • References 110 2
    • Everything must be made as simple as possible but not simpler – Albert Einstein • I’d rather be vaguely right than precisely wrong – John Maynard Keynes • There are three kinds of lies: lies, damned lies and statistics – Benjamin Disraeli • Scientific approach – tools are empirically tested and measurable; applied consistently in a transparent manner to allow for improvement and outside verification 3
    • Format for each topic • Motivating question of practical significance for investor • Key classical tools and current practice • Key post modern tools • Particular attention paid to assumptions underlying various analytical tools and empirical support for them • Focus is on introducing principles of and intuition behind practically useful tools through verbal and visual methods and minimizing technical details that might obscure the value of such tools 4
    • Returns • Motivation – given a time series of returns of a security, what useful information can be extracted to understand its features and behavior? [Charts: SP500 daily return and SP500 cumulative return, 1/2/1990 - 1/26/2010] 5
    • Returns (cont.) – Classical Perspective • Returns behavior follows a normal (Gaussian) distribution – First introduced by de Moivre – Later further developed by Gauss to study and forecast planetary motions – Even if returns are not normal for a small sample, such a time series will converge to the normal distribution for a large sample (via the Central Limit Theorem) • Implications – Only need the mean and volatility of returns (will discuss later in more detail) to assess the probability of events – Symmetrical process (skewness is 0; returns greater than the mean are as likely as returns below the mean) – Extreme events are rare – deviations from the mean within 2*volatility add up to about 95.5% of total probability and 3*volatility deviations add up to about 99.7% (kurtosis is 3, or excess kurtosis is 0 => measure of fat tails) [Chart: histogram of 100,000 randomly generated values from the normal distribution] 6
    • Returns (cont.) – Classical Perspective • Returns behavior follows a random walk – On any given day, price rise is as likely as price decline as rational actors (see more in human decision making section) fully and quickly incorporate all information in prices • Implications – Past prices cannot provide any valuable information with regard to future prices (serial correlation, relationship of current prices with its lagged values is non-existent or 0) – Any forecasting attempts are unlikely to add any value 7
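If returns really were iid normal, sample skewness would sit near 0 and kurtosis near 3, as the classical perspective above implies. A minimal numpy sketch on synthetic data (the mean and volatility chosen are illustrative, not estimates from the deck):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical iid normal "daily returns" (mean and vol chosen for illustration)
returns = rng.normal(loc=0.0004, scale=0.012, size=100_000)

def skewness(x):
    d = x - x.mean()
    return (d ** 3).mean() / x.std() ** 3

def kurtosis(x):
    d = x - x.mean()
    return (d ** 4).mean() / x.std() ** 4

print(round(skewness(returns), 2))   # near 0 under normality
print(round(kurtosis(returns), 2))   # near 3 under normality
```

The empirical tables on the following slides show how far actual asset-class returns deviate from these benchmark values.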
    • Returns (cont.) – Empirical Evidence from Traditional Asset Classes plus VIX (daily data, 1/2/1990 - 1/31/2010)

                                       SPX       GSCI      NAREIT    JPMAGG    VIX
      Arithmetic avg return            0.0357%   0.0253%   0.0554%   0.0264%   0.1763%
      Compounded avg return            0.0291%   0.0158%   0.0426%   0.0261%   0.0068%
      max                              11.6%     7.8%      18.4%     1.3%      64.2%
      min                              -9.0%     -16.9%    -19.5%    -1.5%     -25.9%
      vol                              1.2%      1.4%      1.6%      0.3%      5.9%
      Skewness                         0.00      -0.40     0.47      -0.14     1.23
      Kurtosis                         12.71     10.74     33.85     4.77      10.91
      Number of days                   5,238     5,238     5,238     5,238     5,238
      Normality at 95% confidence?     No        No        No        No        No
        p-values                       0.1%      0.1%      0.1%      0.1%      0.1%
      No serial corr. at 95% conf.?    No        No        No        No        No
        p-values                       0.0%      0.0%      0.0%      0.0%      0.0%
      Cumulative Return                358.7%    129.2%    833.0%    292.9%    42.8%
      % of days with positive returns  51.7%     49.3%     51.9%     53.0%     45.8% 8
    • Returns (cont.) – Empirical Evidence from Traditional Asset Classes plus VIX Notes: Jarque-Bera test was used to evaluate normality of a time series; null hypothesis is stated in the question. Ljung-Box test with 20 lags was used to evaluate serial correlation of a time series; null hypothesis is stated in the question. SPX - SP500 Total Return GSCI - SP GSCI NAREIT - FTSE EPRA/NAREIT US Total Return JPMAGG - JPM Morgan Aggregate Bond Total Return VIX - VIX Index 9
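The Jarque-Bera and Ljung-Box tests cited in the notes can be sketched directly from their textbook formulas. This is a hand-rolled illustration on made-up samples, not the implementation behind the tables; the fat-tailed and autocorrelated inputs are constructed so that both nulls are clearly rejected:

```python
import numpy as np
from scipy import stats

def jarque_bera(x):
    """JB statistic n/6 * (S^2 + (K-3)^2 / 4); null hypothesis: normality."""
    n, d = len(x), x - x.mean()
    s = (d ** 3).mean() / x.std() ** 3              # sample skewness
    k = (d ** 4).mean() / x.std() ** 4              # sample kurtosis
    jb = n / 6 * (s ** 2 + (k - 3) ** 2 / 4)
    return jb, stats.chi2.sf(jb, df=2)              # p-value from chi2(2)

def ljung_box(x, lags=20):
    """Q statistic over `lags` autocorrelations; null: no serial correlation."""
    n, d = len(x), x - x.mean()
    acf = [np.dot(d[:-k], d[k:]) / np.dot(d, d) for k in range(1, lags + 1)]
    q = n * (n + 2) * sum(r ** 2 / (n - k) for k, r in enumerate(acf, 1))
    return q, stats.chi2.sf(q, df=lags)

rng = np.random.default_rng(1)
fat_tailed = rng.standard_t(df=3, size=5000)        # clearly non-normal sample
_, p_jb = jarque_bera(fat_tailed)

ar1 = np.zeros(5000)                                # clearly autocorrelated sample
for t in range(1, 5000):
    ar1[t] = 0.5 * ar1[t - 1] + rng.normal()
_, p_lb = ljung_box(ar1)

print(p_jb < 0.05, p_lb < 0.05)   # both nulls rejected on these inputs
```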
    • Returns (cont.) – Empirical Evidence from Alternative Investment Strategies • CISDM Alternative Strategies Indices, 12/31/1992 - 12/31/2009, monthly data

                              TOTAL EW  Convert Arb  Distressed  Merger Arb  CTA      Macro    Long/Short  FOF      EMN      EM       FI Arb
      Arithmetic avg return    1.03%     0.79%        0.91%       0.80%      0.70%    0.78%    0.96%       0.62%    0.67%    0.92%    0.55%
      Compounded avg return    1.00%     0.78%        0.90%       0.79%      0.67%    0.77%    0.93%       0.61%    0.67%    0.85%    0.54%
      max                      8.37%     4.71%        5.26%       4.74%      7.86%    8.61%    9.40%       4.50%    2.76%    12.13%   5.21%
      min                     -8.82%    -11.49%      -10.59%     -5.61%     -5.43%   -5.36%   -9.42%      -6.40%   -2.10%   -26.25%  -9.44%
      vol                      2.16%     1.44%        1.83%       1.12%      2.49%    1.62%    2.20%       1.42%    0.58%    3.80%    1.40%
      Skewness                -0.63     -3.92        -1.94       -0.80       0.41     1.19    -0.25       -1.31    -0.43    -2.12    -2.92
      Kurtosis                 6.49      33.46        13.38       8.64       3.06     7.67     5.75        8.29     6.23     15.94    24.28
      Cumulative return      762.97%   433.42%      589.06%     449.07%    319.22%  420.27%  641.14%     275.27%  319.76%  518.67%  118.47%

      Note: fixed income arbitrage is from 12/31/1997. 10
    • Returns (cont.) – Empirical Evidence from Alternative Investment Strategies [Histograms of monthly returns, 12/31/1991 - 12/31/2009: CISDM Convertible Arb Index, CISDM Distressed Index, CISDM EW CTA Index, CISDM Macro Index] 11
    • Returns (cont.) – Empirical Evidence from Alternative Investment Strategies • Note on the prior page how most of the colored graph for convertible arb and distressed is below the mean (negative skewness) and the left part of the graph is thick and long (fat tails, or large extreme losses) • Results for CTA and macro are exactly the opposite • For more detail on traditional asset classes, see Munenzon (2010) 12
    • Returns (cont.) – Empirical Evidence • Significant non-normality – skewness and kurtosis; mean and volatility are not enough to describe the return distribution • Negative skewness and leptokurtosis (fat tails), especially for many alternative strategies – the opposite of what investors generally prefer: consistent, positive returns • Convergence to normal distribution does not occur for large samples • Short term persistence – Significant serial correlation as measured by autocorrelation coefficients • For liquid asset classes at daily frequency, the magnitude of the first order autocorrelation is generally close to 0 (e.g., for SP500, it is -0.0589) • For other time horizons and securities, it can be much larger (e.g., 0.58 for Convertible arb, 0.43 for Distressed vs 0.15 for Macro, 0.01 for CTA) – Positive momentum effects when measured over 9-24 month periods => securities with strong performance over the past 6-12 months are likely to continue to outperform in the next 3-12 months (research starting with Jegadeesh et al (1993) for US; later extended to international securities) • Over the very short term (week and month), returns may show negative momentum – ‘Smoothness’ of data (strong positive autocorrelation) implies that volatility on raw data will understate the true range of outcomes (more later in risk section) • Long term persistence – Losers of the past 3-5 years are likely to be winners of the next 3-5 years – Can be measured by the Hurst exponent (.5 => no persistence; <.5 => anti persistence; >.5 => positive persistence); using the full sample at daily frequency => • SP500 => .18 • GSCI => .2 • NAREIT => .22 • JPMAGG => .18 • VIX => .19 13
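One simple way to estimate the Hurst exponent mentioned above is from the scaling of price-difference dispersion with lag. This is only one of several estimators (the slide does not specify which was used), and the random-walk input here is synthetic:

```python
import numpy as np

def hurst(prices, max_lag=100):
    """Estimate the Hurst exponent from the scaling law
    std(p[t+lag] - p[t]) ~ lag**H.
    H ~ .5 => no persistence; < .5 => anti-persistence; > .5 => persistence."""
    lags = np.arange(2, max_lag)
    tau = [np.std(prices[lag:] - prices[:-lag]) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(tau), 1)   # slope of log-log fit
    return slope

rng = np.random.default_rng(2)
random_walk = np.cumsum(rng.normal(size=10_000))   # memoryless benchmark
h = hurst(random_walk)
print(round(h, 2))   # near 0.5 for a pure random walk
```

Values well below 0.5, as reported for the asset classes above, indicate anti-persistence relative to this benchmark.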
    • Returns (cont.) – Empirical Evidence from Traditional Asset Classes plus VIX • Volatility of returns is time varying and exhibits a tendency to cluster (also can be seen in the daily return chart on p.6; more in the section on volatility) [Charts: rolling 250 day daily volatility of SP500, 12/18/1990 - 12/18/2009; VIX, 1/2/1990 - 1/26/2010] 14
    • Returns (cont.) – Empirical Evidence from Traditional Asset Classes plus VIX • Are different levels of volatility associated with meaningfully different levels of returns? • Use VIX to assess impact across asset classes; can be refined further through more tailored volatility measures, prior states and other factors 15
    • State => VIX <= 20 (daily data)

                                       SPX       GSCI      NAREIT    JPMAGG    VIX
      Arithmetic avg return            0.0983%   0.0411%   0.1004%   0.0254%   -0.1932%
      Compounded avg return            0.0962%   0.0351%   0.0976%   0.0252%   -0.3318%
      max                              2.9%      6.8%      3.6%      1.0%      64.2%
      min                              -3.5%     -4.6%     -4.8%     -1.1%     -25.9%
      vol                              0.7%      1.1%      0.7%      0.2%      5.3%
      Skewness                         0.02      0.18      -0.33     -0.04     1.25
      Kurtosis                         4.16      5.06      7.03      4.77      13.94
      Number of days                   2,945     2,945     2,945     2,945     2,945
      Normality at 95% confidence?     No        No        No        No        No
        p-values                       0.1%      0.1%      0.1%      0.1%      0.1%
      No serial corr. at 95% conf.?    Yes       Yes       No        No        No
        p-values                       49.5%     13.0%     1.1%      0.9%      0.0%
      Cumulative Return                1595.7%   181.1%    1669.5%   109.9%    -100.0%
      % of days with positive returns  55.3%     49.3%     55.2%     52.7%     43.5% 16
    • State => 25 < VIX <= 30 (daily data)

                                       SPX       GSCI      NAREIT    JPMAGG    VIX
      Arithmetic avg return            -0.0776%  -0.0227%  -0.0296%  0.0230%   0.4789%
      Compounded avg return            -0.0872%  -0.0373%  -0.0392%  0.0226%   0.2802%
      max                              5.0%      5.9%      8.8%      0.9%      40.7%
      min                              -3.8%     -16.9%    -7.0%     -0.9%     -20.0%
      vol                              1.4%      1.7%      1.4%      0.3%      6.4%
      Skewness                         0.27      -1.76     0.20      -0.36     0.72
      Kurtosis                         3.23      19.72     9.96      3.68      6.01
      Number of days                   590       590       590       590       590
      Normality at 95% confidence?     No        No        No        No        No
        p-values                       1.9%      0.1%      0.1%      0.1%      0.1%
      No serial corr. at 95% conf.?    Yes       Yes       No        Yes       Yes
        p-values                       11.0%     12.8%     0.0%      84.1%     56.6%
      Cumulative Return                -40.2%    -19.8%    -20.7%    14.3%     421.0%
      % of days with positive returns  44.6%     48.8%     45.6%     54.6%     49.3% 17
    • State => VIX > 40 (daily data)

                                       SPX       GSCI      NAREIT    JPMAGG    VIX
      Arithmetic avg return            -0.5372%  -0.4839%  -0.7983%  0.0562%   1.5559%
      Compounded avg return            -0.5940%  -0.5300%  -1.0065%  0.0555%   1.1540%
      max                              11.6%     7.5%      18.4%     1.3%      34.5%
      min                              -9.0%     -8.1%     -19.5%    -1.0%     -24.7%
      vol                              3.4%      3.0%      6.5%      0.4%      9.2%
      Skewness                         0.41      0.12      0.49      0.12      0.66
      Kurtosis                         4.22      3.25      3.64      4.36      4.77
      Number of days                   158       158       158       158       158
      Normality at 95% confidence?     No        Yes       No        No        No
        p-values                       0.7%      50.0%     1.9%      0.9%      0.1%
      No serial corr. at 95% conf.?    Yes       Yes       Yes       No        No
        p-values                       62.0%     33.2%     30.2%     0.6%      1.9%
      Cumulative Return                -61.0%    -56.8%    -79.8%    9.2%      512.8%
      % of days with positive returns  39.2%     37.3%     34.2%     53.8%     51.3% 18
    • Returns (cont.) – Asset Class Returns in Different States [Charts: cumulative returns of SP 500 Total Return, SP GSCI, FTSE EPRA/NAREIT US Total Return and JP Morgan US Aggregate Bond Total Return in four VIX states: <=20, >25 and <=30, >30 and <=35, >40] 19
    • Returns (cont.) – Asset Class Returns in Different States • Key observations from the prior pages – Gains and losses are quite concentrated (e.g., for SPX in VIX <=20, the probability of positive days is only 55% but the magnitude of positive days is very large compared to negative days) – Normality is very rare even in individual states and serial correlation generally remains (past information is likely to be useful in predicting the future) – Very consistent behavior among asset classes, particularly at the negative extreme => diversification benefits are possible but may be highly concentrated (e.g., aggregate bond market or more specifically government bonds) 20
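Conditioning returns on volatility states, as in the tables above, reduces to masking a return series by VIX ranges. A sketch on synthetic data; the link between high VIX and poor returns is built in by hand here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic paired daily series for illustration only: a VIX level and an
# SPX-like return that is worse, by construction, on high-VIX days
vix = rng.uniform(10, 60, size=5000)
spx = rng.normal(0.0004, 0.012, size=5000) - 0.003 * (vix > 40)

def state_stats(returns, vix, lo, hi):
    """Summary statistics for days whose VIX close falls in (lo, hi]."""
    r = returns[(vix > lo) & (vix <= hi)]
    return {"days": len(r), "avg": r.mean(), "vol": r.std(),
            "pct_positive": (r > 0).mean()}

calm = state_stats(spx, vix, 0, 20)
panic = state_stats(spx, vix, 40, np.inf)
print(panic["avg"] < calm["avg"])   # True by construction in this toy data
```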
    • Returns (cont.) – Approaches to data evaluation • Frequentist school of statistics – Derive the probability distribution of a process from as large a sample of data as possible – Appropriate for a large data sample produced by a stable process – Data points are independent and identically distributed – Example: probability of positive return days for SP500 is 51.7% • Bayesian school of statistics – There’s uncertainty about the true distribution of a process – Pick your initial distributional shape and update as new relevant information arrives – Can better incorporate limited data and unstable processes but more vulnerable to incorrect starting distributional choices and more computationally intensive – Example: probability of positive return days for SP500 given that VIX > 40 is 39.2% 21
    • Returns (cont.) – Approaches to data evaluation • In the example of SP 500 for the full sample of 1/2/1990 – 1/29/2010 – Frequentist predictions are different from empirical results (especially within each state); this uncertainty can be handled with Bayesian tools • As returns are assumed to be independent and identically distributed, the probability of a positive return on any given day is the same • Probability of n consecutive positive days is .517^n

                                                     Frequentist   Bayesian
      Probability that daily return is positive         51.7%        51.7%
      ... if the last trading day was positive          51.7%        50.6%
      ... if the last 2 trading days were positive      51.7%        49.8%
      ... if the last 3 trading days were positive      51.7%        46.8%
      ... if the last 4 trading days were positive      51.7%        44.8%
      Probability of 2 consecutive positive days        26.7%        26.2%
      Probability of 3 consecutive positive days        13.8%        13.0%
      Probability of 4 consecutive positive days         7.2%         6.1%
      Probability of 5 consecutive positive days         3.7%         2.7% 22
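The unconditional, conditional and iid run probabilities contrasted above can be computed from any return series as below. The data is a synthetic stand-in; actual SPX figures will differ:

```python
import numpy as np

rng = np.random.default_rng(4)
returns = rng.normal(0.0003, 0.012, size=5238)   # synthetic stand-in for daily SPX

up = returns > 0
p_up = up.mean()                          # frequentist: one number for every day
p_up_after_up = up[1:][up[:-1]].mean()    # empirical P(up | yesterday was up)
p_run_3 = p_up ** 3                       # iid assumption: P(3 straight up days)

print(round(p_up, 3), round(p_up_after_up, 3), round(p_run_3, 3))
```

Divergence between `p_up` and `p_up_after_up` is exactly the kind of conditional structure the frequentist iid view misses.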
    • Returns (cont.) – More on Approaches to Data Evaluation • In the SP500 price series sample from 1928-2009 (excludes dividends and inflation) – While the probability that each decade return is positive is very high (almost 85%) – There are distinct periods of high and low returns, and – When decade returns start declining, such periods take years to complete, though there are only 2 such periods in this sample • Economic and sentiment factors affecting performance may take a long time to reverse • Positive returns are possible; however, they are replacing years that were better and may not be enough to offset negative years • Given the current state in this historical sample, implications for performance over the next few years are not positive – The concept of sample error (also see manager evaluation section) • Error caused by observing the limited sample rather than the true population • How likely is it that the sample one has for analysis is unrepresentative of the true population data? – 1 / square root of the number of observations [Charts: decade returns for the SP500 price index, 1928-2009, first decade calculation starting at the end of 1937; SP500 price index annual returns, 1928-2009] 23
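The 1/sqrt(n) rule of thumb above, computed for the two sample sizes used in this deck (82 annual observations for 1928-2009 vs 5,238 daily observations):

```python
import math

# Rule of thumb from the slide: sampling error ~ 1 / sqrt(number of observations)
def sampling_error(n_obs):
    return 1 / math.sqrt(n_obs)

for n in (82, 5238):   # 82 annual observations vs 5,238 daily ones
    print(n, round(sampling_error(n), 3))
```

Roughly 11% for the annual sample versus about 1.4% for the daily one, which is why conclusions drawn from a handful of decades deserve wide error bars.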
    • Returns (cont.) – How to evaluate drivers of returns? • Classical perspective – CAPM: 1 factor model using a market index • Advantages: simplicity • Disadvantages: – An extensive list of anomalies and deviations from empirical data – e.g., Fama French factors, calendar effects – Cannot be tested as the true market index is hard to specify (Roll critique) – Time variance of beta and risk premium – APT: multi factor models • Advantages: more flexibility in describing return drivers • Disadvantages: – Which factors should be included? – How many factors should be included? 24
    • Returns (cont.) – Returns and economic growth • Current practice – Invest in countries that show high rates of economic growth • Evidence – Zero to slightly negative relationship between returns and economic growth over multi year periods (3+ years or the length of a business cycle) • Investors herd and overbid • In order to take advantage of economic growth, companies will require capital, driving down returns for existing shareholders • Benefits of economic growth don’t accrue just to companies already listed on exchanges – Positive or negative relationship possible at certain stages of the economic cycle over the short to mid term; however, one must have the flexibility and tools to appropriately time entry and exit to take advantage 25
    • Returns (cont.) – Practical Issues • Jarque Bera test – measures if data follows the normal distribution; visual tools include the histogram, normal probability and quantile/quantile plots • Ljung Box Q test – measures if data exhibits serial correlation • Durbin Watson test – measures if the error terms of a regression are serially correlated • Classical regression – Y = intercept + beta * X + error – Linear – Beta is estimated via the ordinary least squares method • The beta obtained is the BLUE estimator as long as at least the following holds (there are other assumptions not covered here): – Error terms are uncorrelated with each other – Error terms have the same volatility (homoskedastic) – BLUE – best (with lower variance than other estimators), linear, unbiased (expectation of the estimator is the true value of the coefficient) estimator • Practical issues with regression approach – Most financial data has time varying volatility (see prior pages; heteroskedastic); if not adjusted for heteroskedasticity (White method), the significance of estimators is likely to be overestimated – Most financial data is serially correlated (see prior pages); if not adjusted for serial correlation (Newey West method, which also corrects for heteroskedasticity), the statistical significance of estimators will be overestimated – The greater the deviations (particularly for many alternative investment strategies), the less the output from classical regression can be relied on to assess the statistical significance of estimators – More advanced methods like generalized least squares embed these issues within the estimation process 26
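A hand-rolled sketch of OLS with Newey-West (HAC) standard errors, the adjustment recommended above. Bartlett weights are used; the data, true slope of 0.8 and lag count are illustrative, not from the deck:

```python
import numpy as np

def ols_newey_west(y, x, lags=5):
    """OLS with Newey-West (HAC) standard errors, robust to heteroskedasticity
    and serial correlation in the residuals (Bartlett-weighted lag covariances)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ beta                              # residuals
    XtX_inv = np.linalg.inv(X.T @ X)
    g = X * u[:, None]                            # per-observation scores
    S = g.T @ g                                   # lag-0 term of the HAC "meat"
    for L in range(1, lags + 1):
        w = 1 - L / (lags + 1)                    # Bartlett weight
        gamma = g[L:].T @ g[:-L]                  # lag-L score covariance
        S += w * (gamma + gamma.T)
    cov = XtX_inv @ S @ XtX_inv                   # sandwich covariance
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(5)
x = rng.normal(size=2000)
y = 0.8 * x + rng.normal(size=2000)               # true slope = 0.8
beta, se = ols_newey_west(y, x)
print(round(beta[1], 2))                          # slope estimate near 0.8
```

The point estimate is identical to plain OLS; only the standard errors, and hence the significance assessment, change.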
    • Returns (cont.) – Practical Issues • Practical issues with regression approach (cont.) – Which factors and how many • Current practice – The user selects factors for a regression » Capture what occurred vs capture what the user thinks should have occurred » No methodology to decide which and how many factors • Should reflect non-linearity and non-normality of observed returns – e.g., option based strategy factors • Can be selected via a stepwise or related algorithm from a universe of factors based on their statistical significance and contribution to the overall explanatory power of a regression • From random matrix theory and tests on financial data, most of the return variance of any security can be explained by 4-8 factors; the rest is noise • A large number of factors leads to overfitting of historical data and is likely to produce multicollinearity (high correlation among factors), reducing the forecasting power of the model and the statistical significance of factors • Ridge or orthogonal regression can be used if multicollinearity problems are severe – May use nonlinear regression to more explicitly incorporate non-linearity – Robust statistics (see optimization section) • Reduce the impact of outliers on metrics, making them more stable to small changes in environment • Separate analysis of regular and extreme events • Examples – Use median, trimmed mean or other metrics that are less sensitive to the presence of outliers – Robust regression to estimate coefficients of parameters – May use other approaches to minimize sampling error (Bayesian tools or shrinkage estimators; see optimization section) – May find spurious relationships among random time series if they are integrated (see forecasting section) 27
    • Returns (cont.) – Practical Issues • Other econometric approaches (more in forecasting section) – Key modeling approach => return today is related to the prior return • Easy to apply • Good results for trending series • No need to pick factors • May add other info – e.g., deviations from the mean • May be refined further by modeling deviations separately via GARCH models (see next section) and extended to include multi-security relationships via VAR models (see forecasting section) • May incorporate both momentum and reversal • Some assumptions may be too restrictive • Depending on the model chosen, may be too computationally intensive if there are many securities (VAR) or only applicable to a few time series (cointegration) • Or incorporate statistical learning, Bayesian type methods (more in forecasting section) 28
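The key modeling approach above (return today related to the prior return) is an AR(1). A minimal simulate-and-fit sketch; the persistence parameter of 0.4 is illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
phi = 0.4                                  # illustrative persistence parameter
r = np.zeros(3000)
for t in range(1, 3000):
    r[t] = phi * r[t - 1] + rng.normal(scale=0.01)   # AR(1) return process

# Fit the AR(1) by regressing r_t on r_{t-1}
phi_hat = np.dot(r[1:], r[:-1]) / np.dot(r[:-1], r[:-1])
forecast = phi_hat * r[-1]                 # one-step-ahead return forecast
print(round(phi_hat, 2))                   # close to the true phi
```

A positive fitted phi captures momentum, a negative one reversal, with no factor selection required.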
    • Returns (cont.) – More Empirical Evidence • A significant majority of portfolio return and its variance is explained (starting with Brinson et al (1986)) by – Decisions on which asset classes to invest in – Decisions on which sectors within asset classes to invest in – Decisions on individual securities/managers contribute the least amount (e.g., 10%) and sometimes contribute negatively – Current practice • Investors spend most of their time evaluating individual security/manager factors • Sector vs country – For developed markets, industry factors are the primary driver of returns – For developing countries, country decisions are more important in explaining returns • Value effect (tendency for value stocks to outperform growth stocks) – Concentrated among the smallest companies (~$100mil market cap or less) with little analyst coverage (~3 analysts or fewer) and low institutional ownership (up to 50% owned) 29
    • Returns (cont.) – Summary • Key empirical facts – Returns exhibit positive momentum over the short to mid term but mean reversion over mid to long term => return processes are time varying with different distribution characteristics, particularly at extremes (see risk section on extreme value theory) – Returns are significantly non-normal with negative skews and large tails as measured by kurtosis – Losses and gains are concentrated – Return volatility is time varying and has a tendency to cluster (next section) – Returns across countries are not related to economic growth => zero to slightly negative relationship – Return of a portfolio and its variance are primarily explained by decisions on asset classes and their sectors; security selection is the least important part • Implications – Incorporate non-normal distributions to model returns to capture skews and fat tails => e.g., t distribution better captures empirical features of tails than normal distribution – There’s uncertainty as to the true return distribution due to time variance of returns and different statistical properties of regimes – Because of a wide range of outcomes and regimes, the reliance on the concept of average in analysis will not appropriately incorporate complex reality – Situational awareness => attempt to identify the current environment – Potential value to incorporating time varying volatility in analysis – Regression output must be adjusted for serial correlation and changes in volatility to properly assess the statistical significance of estimators – Methods other than regression may incorporate empirical features better because of their greater flexibility 30
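The suggestion above to model returns with a t distribution can be checked by fitting one: a low fitted degrees-of-freedom parameter signals tails much fatter than the normal's. Synthetic t(4) data is used here for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Synthetic fat-tailed "returns": Student t with 4 degrees of freedom
returns = 0.01 * rng.standard_t(df=4, size=5000)

# Maximum-likelihood fit of a t distribution (df, location, scale)
df_hat, loc_hat, scale_hat = stats.t.fit(returns)
print(df_hat < 10)   # low df => heavy tails; large df => close to normal
```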
    • Volatility • From the prior section – Volatility is time varying – Volatility has a tendency to cluster – Different volatility regimes are associated with different return processes • Motivation – Can volatility be forecasted? – If yes, which methods are most appropriate? 31
    • Volatility (cont.) – Classical perspective • Use historical volatility figures (full historical sample or, more rarely, a rolling window as on p.6) on an equal weighted basis – the dominant practitioner approach – Assumes that returns are independent and identically distributed (iid) – Behavior of future volatility is the same as its past behavior – May lead to sharp changes as crashes enter and exit the data series – Silent on the length of the window • Use weighted average volatility, giving more weight to recent figures (e.g., the exponentially weighted moving average (EWMA) approach via a decay constant, lambda, used by RiskMetrics) – gaining share – Volatility is related to the prior return and volatility, which are weighted • vol(t)^2 = (1-lambda)*r(t-1)^2 + lambda*vol(t-1)^2 – Assumes that returns are iid – Behavior of future volatility is the same as its past behavior – For mathematical reasons, the same decay factor is used for all securities/asset classes – Persistence in volatility of a security (lambda) is closely related to its sensitivity to market shocks (1-lambda) – No methodology for choosing the decay constant – RiskMetrics uses .94 for daily data and .97 for monthly data – Can’t incorporate mean reversion as long term volatility is undefined 32
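The EWMA recursion quoted above as a short sketch, with lambda = 0.94 per the RiskMetrics daily convention; the two return series are synthetic:

```python
import numpy as np

def ewma_vol(returns, lam=0.94):
    """EWMA variance recursion: vol^2 = (1-lambda)*r(t-1)^2 + lambda*vol(t-1)^2."""
    var = returns.var()                   # seed the recursion with sample variance
    for r in returns:
        var = (1 - lam) * r ** 2 + lam * var
    return np.sqrt(var)

rng = np.random.default_rng(8)
calm = rng.normal(scale=0.005, size=500)
shocked = np.concatenate([calm, rng.normal(scale=0.03, size=20)])  # recent turmoil
print(ewma_vol(shocked) > ewma_vol(calm))  # recent shocks dominate the estimate
```

Because weights decay geometrically, a brief volatile stretch at the end of the sample moves the estimate far more than it would an equal-weighted average.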
    • Volatility (cont.) – Empirical evidence – Volatility can be forecasted over a limited time horizon – Has a tendency to cluster – One approach to model clustering is through Markov chains and transition probability matrices (see next page) – Mean reversion (e.g., see p.13,32) – Model should incorporate volatility changes due to market shocks and persistence in its level as separate variables 33
    • States: 1 - VIX<=20, 2 - 20<VIX<=25, 3 - 25<VIX<=30, 4 - 30<VIX<=35, 5 - 35<VIX<=40, 6 - VIX>40; based on daily data

      Transition probability matrix for VIX (next day state)
      Current State     1       2       3       4       5       6
      1               95.9%    4.1%    0.0%    0.0%    0.0%    0.0%
      2                9.9%   82.0%    7.9%    0.2%    0.0%    0.0%
      3                0.0%   16.3%   75.1%    8.5%    0.2%    0.0%
      4                0.0%    0.4%   22.1%   68.5%    8.1%    0.9%
      5                0.0%    0.0%    0.0%   22.0%   64.0%   14.0%
      6                0.0%    0.0%    0.0%    0.0%   10.1%   89.9%

      VIX State   Average Duration   Maximum Duration   % of all days in State
      1                24.3                578                 56.2%
      2                 5.6                 37                 23.1%
      3                 4.0                 16                 11.3%
      4                 3.2                 25                  4.5%
      5                 2.8                 10                  1.9%
      6                 9.9                 64                  3.0% 34
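A transition matrix like the one above can be estimated by counting day-to-day state moves. A toy two-state sketch; the state sequence is made up:

```python
import numpy as np

def transition_matrix(states, n_states):
    """Estimate a Markov transition matrix by counting day-to-day state moves;
    row i gives the probabilities of moving from state i to each state."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Toy state sequence: 0 = low-VIX regime, 1 = high-VIX regime
states = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]
P = transition_matrix(states, 2)
print(P.round(2))   # each row sums to 1
```

Large diagonal entries, as in the VIX table above, are exactly the clustering property: whatever regime holds today is the most likely regime tomorrow.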
    • Volatility (cont.) – Volatility as a random process • GARCH (generalized autoregressive conditional heteroskedasticity) models – Original version (Bollerslev (1986)) • Volatility is related to the prior return and volatility but the model has important differences compared to the weighting approach – vol(t)^2 = c + alpha*error(t-1)^2 + beta*vol(t-1)^2 • Long term or unconditional variance is defined as c/(1-(alpha+beta)) • Allows modeling sensitivity to market shocks and persistence in volatility separately • Can incorporate serial correlation in returns and return volatility clustering • Can be estimated separately for each security/asset • Symmetrical response to positive and negative volatility • Introduces some skewness and kurtosis – More advanced versions (A-GARCH, GJR-GARCH, Student t-GARCH, etc.) allow one to • Capture asymmetric responses to negative vs positive shocks • More fully include skewness and kurtosis • Model different transition probabilities (the original model implies that the probability of switching from one regime to another is constant, though the conditional probability of being in a given regime varies over time) – Greater computational and mathematical complexity as the cost of a more appropriate description of complex reality – Trade-off between capturing the time varying nature of volatility and incorporating tails • Other models to model volatility as a random process – e.g., the Heston model via stochastic calculus 35
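A GARCH(1,1) simulation following the recursion above, showing the unconditional variance c/(1-(alpha+beta)) and that even normal shocks produce excess kurtosis in the returns. The parameter values are illustrative, not estimates:

```python
import numpy as np

def garch11_path(c, alpha, beta, n, rng):
    """Simulate a GARCH(1,1): vol^2 = c + alpha*error(t-1)^2 + beta*vol(t-1)^2."""
    long_run_var = c / (1 - alpha - beta)       # unconditional variance
    var, returns = long_run_var, np.empty(n)
    for t in range(n):
        returns[t] = rng.normal(scale=np.sqrt(var))   # normal shock, current vol
        var = c + alpha * returns[t] ** 2 + beta * var
    return returns, np.sqrt(long_run_var)

def excess_kurtosis(x):
    d = x - x.mean()
    return (d ** 4).mean() / x.std() ** 4 - 3

rng = np.random.default_rng(9)
r, long_run_vol = garch11_path(c=1e-6, alpha=0.08, beta=0.90, n=20_000, rng=rng)
print(excess_kurtosis(r) > 0)   # fat tails appear even with normal shocks
```

This is the "introduces some skewness and kurtosis" point above: clustering alone fattens the tails of the unconditional return distribution.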
    • Volatility (cont.) – Different Approaches [Charts: annualized volatility of SP500 under different estimation approaches – 30-day, 60-day and 90-day rolling volatility, EWMA volatility and GARCH volatility] 36
    • Volatility (cont.) – Further applications • Serves as key input for – Risk and risk/return measures – Dependencies metrics – Return forecasting – Optimization 37
    • Volatility (cont.) - Summary • Key empirical facts – Volatility clustering – Non-linear – Mean reversion • Implications – Classical and current approaches don’t capture empirical features – More advanced models reflect complex reality better; however, there’s a tradeoff between time variance and tails 38
    • Risk Measurement
      • Motivation – How should risk be defined and what features should a risk metric have?
      • Classical perspective
        – Risk is the volatility of returns
        – Positive volatility (prices going up) is penalized equally with negative volatility (prices going down)
      • Practitioner perspective
        – Risk is the loss of capital, permanently or at least for an extended period of time
      • Current practice
        – Volatility dominates
        – VaR (Value at Risk) is also widespread, partly because of regulatory requirements
          • Informal definition: VaR(95%) is the maximum loss within a 95% confidence interval or, more appropriately, the minimum loss that will be exceeded in 5% of cases 39
    • Risk Measurement (cont.) – Coherent risk
      • Desirable properties for a risk measure (r) (Artzner et al (1999))
        – Monotonicity: If X >= 0, r(X) <= 0
          • If there are only positive returns, risk is non-positive
        – Subadditivity: r(X+Y) <= r(X) + r(Y)
          • The risk of a portfolio of 2 assets should be less than or equal to the sum of the risks of the individual assets
        – Positive homogeneity: For any positive real number c, r(cX) = c*r(X)
          • If the portfolio is increased c times, the risk becomes c times larger
        – Translational invariance: For any real number c, r(X+c) = r(X) - c
          • Cash or another risk free asset does not contribute to portfolio risk
      • Standard deviation does not satisfy #1
      • VaR and downside deviation (square root of the average sum of squared deviations from 0) do not satisfy #2 40
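The failure of VaR to satisfy subadditivity can be shown with a standard two-loan illustration (hypothetical numbers: each loan independently loses 100 with 4% probability, else 0):

```python
import numpy as np

def var_95(outcomes, probs):
    """95% VaR of a discrete loss distribution: the smallest loss L
    such that P(loss <= L) >= 95%."""
    order = np.argsort(outcomes)
    losses, p = np.asarray(outcomes)[order], np.asarray(probs)[order]
    cum = np.cumsum(p)
    return losses[np.searchsorted(cum, 0.95)]

# Each loan alone: P(no loss) = 96% >= 95%, so VaR(95%) = 0
var_single = var_95([0, 100], [0.96, 0.04])

# Both loans together: P(loss >= 100) = 7.84% > 5%, so VaR(95%) = 100
var_joint = var_95([0, 100, 200], [0.96**2, 2 * 0.96 * 0.04, 0.04**2])
```

Here the portfolio's VaR (100) exceeds the sum of the standalone VaRs (0 + 0), so holding both loans looks riskier than holding them separately — the opposite of what a coherent measure requires.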
    • Risk Measurement (cont.) – CVaR • Conditional Value at Risk (CVaR) or Expected Shortfall – Satisfies all four desirable properties – Informal definition: • CVaR(95%) is the average loss in the 5% of the worst events 41
    • Risk Measurement (cont.) – Comparison
      • Example: A return time series – 200 periods; 100 periods of .15%, 90 periods of -.1% and 10 periods of -5%
        • Mean is -.22%; standard deviation is 1.1%; skewness is -4.1 and kurtosis is 15.0
        • Downside deviation is 1.59%
        • Normal VaR(95%) is 2.48%
        • VaR(95%) is 2.55% (percentile function interpolates between 2 points)
        • CVaR(95%) is 5%
        • If losses in the tail follow an increasing pattern (as they do in practice), the disparity between CVaR and other measures will be even greater
      • By definition, standard deviation (the 2nd moment of a distribution; mean return is the 1st moment) and downside deviation measure the range of returns and cannot appropriately incorporate information about the higher moments of a distribution – skewness (3rd moment) and kurtosis (4th moment)
      • While downside deviation measures the volatility of negative returns, it will be appropriate only if losses exhibit a consistent pattern (which is rarely the case), as the metric weighs them equally
      • CVaR focuses on measuring the fat negative tail (which will drive actual losses), incorporating kurtosis
      • For analytic VaR calculation assuming normality
        – Results will generally be the lowest
        – The larger the non-normality, the greater the gap between normal VaR and other metrics that incorporate more information about a time series
        – Time series with high return and low volatility will show positive VaR (returns are expected, not losses; see alternatives page) 42
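A minimal sketch reproducing the example series above in numpy (only the mean, standard deviation and historical CVaR are computed here, since the VaR figures depend on the percentile interpolation convention used):

```python
import numpy as np

# The example series: 100 periods of +0.15%, 90 of -0.10%, 10 of -5% (in percent)
r = np.concatenate([np.full(100, 0.15), np.full(90, -0.10), np.full(10, -5.0)])

mean = r.mean()                      # -0.22%
vol = r.std(ddof=1)                  # ~1.1%
worst = np.sort(r)[: len(r) // 20]   # worst 5% of the 200 observations
cvar_95 = -worst.mean()              # average loss in the 5% tail: 5.0%
```

The ten -5% observations make up exactly the worst 5% of the sample, so CVaR(95%) picks up the full -5% loss while the standard deviation stays near 1.1%.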
    • Risk Measurement (cont.) – Empirical evidence from Traditional Asset Classes
      1/2/1990-1/31/2010, daily data
                          SPX      GSCI     NAREIT   JPMAGG   VIX
      vol                 1.2%     1.4%     1.6%     0.3%     5.9%
      Normal VaR (95%)    0.0%     0.0%     0.0%     0.0%     -0.4%
      VaR (95%)           -1.7%    -2.1%    -1.7%    -0.4%    -8.2%
      VaR (99%)           -3.1%    -3.7%    -5.2%    -0.7%    -12.4%
      CVaR (95%)          -2.7%    -3.2%    -3.8%    -0.5%    -11.2%
      CVaR (99%)          -4.6%    -5.2%    -8.0%    -0.8%    -15.8%
      VIX <= 20
      vol                 0.7%     1.1%     0.7%     0.2%     5.3%
      VaR (95%)           -1.0%    -1.8%    -1.1%    -0.3%    -8.0%
      VaR (99%)           -1.6%    -2.8%    -2.0%    -0.6%    -12.1%
      CVaR (95%)          -1.4%    -2.4%    -1.7%    -0.5%    -10.8%
      CVaR (99%)          -1.9%    -3.4%    -2.8%    -0.7%    -15.3% 43
    • Risk Measurement (cont.) – Empirical evidence from Traditional Asset Classes
                          SPX      GSCI     NAREIT   JPMAGG   VIX
      VIX > 25 & <= 30
      vol                 1.4%     1.7%     1.4%     0.3%     6.4%
      VaR (95%)           -2.3%    -2.8%    -2.1%    -0.5%    -9.5%
      VaR (99%)           -2.9%    -4.7%    -4.3%    -0.8%    -12.6%
      CVaR (95%)          -2.7%    -4.1%    -3.5%    -0.6%    -11.8%
      CVaR (99%)          -3.2%    -7.1%    -5.5%    -0.8%    -15.0%
      VIX > 40
      vol                 3.4%     3.0%     6.5%     0.4%     9.2%
      VaR (95%)           -5.9%    -5.9%    -10.0%   -0.5%    -10.9%
      VaR (99%)           -8.9%    -7.6%    -15.0%   -1.0%    -21.1%
      CVaR (95%)          -7.5%    -6.7%    -13.0%   -0.8%    -16.0%
      CVaR (99%)          -9.0%    -7.9%    -17.3%   -1.0%    -23.0% 44
    • Risk Measurement (cont.) – Empirical Evidence from Alternative Investment Strategies
      CISDM Alternative Strategies Indices, 12/31/1991 - 12/31/2009, monthly data
                        Total EW  Convert Arb  Distressed  Merger Arb  CTA     Macro   Long/Short  FOF     EMN     EM       FI Arb
      vol               2.16%     1.44%        1.83%       1.12%       2.49%   1.62%   2.20%       1.42%   0.58%   3.80%    1.40%
      Normal VaR (95%)  0.97%     0.77%        0.88%       0.79%       0.61%   0.75%   0.90%       0.60%   0.67%   0.72%    0.53%
      VaR (95%)         -2.18%    -1.00%       -1.48%      -0.97%      -3.23%  -1.15%  -2.38%      -1.65%  -0.11%  -4.45%   -0.97%
      VaR (99%)         -6.98%    -4.46%       -6.04%      -2.41%      -4.40%  -2.55%  -4.54%      -5.52%  -1.13%  -12.68%  -5.73%
      CVaR (95%)        -4.06%    -3.12%       -3.88%      -2.02%      -4.01%  -2.15%  -3.94%      -3.21%  -0.73%  -9.17%   -3.31%
      CVaR (99%)        -7.75%    -7.43%       -8.07%      -3.51%      -4.75%  -3.56%  -6.31%      -5.87%  -1.49%  -17.63%  -7.47% 45
    • Risk Measurement (cont.) – Extreme Value Theory
      • At extreme states, phenomena exhibit distinct characteristics, which can be modeled via extreme value distributions
      [Histogram of the 10% worst daily returns of the SP500 Total Return Index, 1/2/1990 - 1/31/2010; returns range from -0.10 to -0.01] 46
    • Risk Measurement (cont.) – Additional Practical Considerations
      • More on VaR and CVaR calculations
        – Historical data
          • Lets the data speak, as no distributional assumptions are required
          • However, may not be appropriate for the current environment; note, though, that the primary purpose of a loss measure is to show potential losses in a bad environment
          • Will not be appropriate if a structural change occurred – e.g., increasing correlations among alternative investment strategies and traditional asset classes over the past 10-15 years
        – Analytic solution via a formula
          • E.g., VaR(b) = mean return + volatility * z(b), where z is a random variable
            – Typically, z is assumed to follow a normal distribution
            – z may also incorporate skewness and kurtosis via the Cornish Fisher approximation; only holds for small deviations from normality
            – Other distributional assumptions can also be used
          • Must decide on the appropriate volatility input (see section on volatility and relevant methods) 47
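A sketch of the analytic VaR formula with the Cornish Fisher adjusted quantile (the 95% normal quantile -1.6449 is hardcoded to keep the example self-contained; the inputs below are illustrative, not taken from the slides):

```python
def cornish_fisher_var(mu, sigma, skew, ex_kurt, z=-1.6449):
    """Analytic VaR with a Cornish-Fisher adjusted quantile.
    z is the normal quantile for the chosen level (-1.6449 at 95%);
    skew is skewness, ex_kurt is excess kurtosis. Returns the loss as
    a positive number. Only valid for small deviations from normality."""
    z_cf = (z
            + (z**2 - 1) * skew / 6
            + (z**3 - 3 * z) * ex_kurt / 24
            - (2 * z**3 - 5 * z) * skew**2 / 36)
    return -(mu + z_cf * sigma)

normal_var = cornish_fisher_var(0.0, 0.011, 0.0, 0.0)      # collapses to 1.6449 * sigma
adjusted_var = cornish_fisher_var(0.0, 0.011, -0.8, 3.0)   # mild fat left tail -> larger VaR
```

With zero skewness and excess kurtosis the adjusted quantile reduces to the normal one; negative skew and positive excess kurtosis push the quantile further into the left tail and raise the VaR.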
    • Risk Measurement (cont.) – Additional Practical Considerations
      • 'Smooth' data (positive autocorrelation) can significantly understate the range of potential returns and distort risk measurement
        – The larger the magnitude of first order autocorrelation, the larger the impact
        – The traditional volatility measure is understated (the longer the time horizon and the less liquid the securities, the more positive the autocorrelation)
      • Why data is 'smooth'
        – Prices adjust gradually (though they will likely move quickly and overshoot in a crisis; this will be experienced if one tries to liquidate at such a time)
          • Information is priced gradually
          • Illiquidity – securities with limited trading volume
        – Manager discretion in pricing the portfolio
          • Tendency to smooth performance figures
      • Autocorrelation can also be negative, though this is more likely over very short term horizons
        – The traditional volatility measure is then overstated (may be observed over the short term for liquid assets/securities) 48
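One common way to adjust for first order autocorrelation is a first-lag unsmoothing filter; a minimal sketch, assuming the standard smoothing model r_obs[t] = (1-phi)*r_true[t] + phi*r_obs[t-1], with phi typically estimated as the lag-1 autocorrelation of the observed series:

```python
import numpy as np

def unsmooth(r_obs, phi):
    """Invert first-order smoothing: r_true[t] = (r_obs[t] - phi*r_obs[t-1]) / (1 - phi)."""
    r_obs = np.asarray(r_obs, dtype=float)
    return (r_obs[1:] - phi * r_obs[:-1]) / (1.0 - phi)

# round-trip check on a toy series: smooth with phi = 0.3, then unsmooth
rng = np.random.default_rng(0)
r_true = rng.normal(0.0, 0.02, size=50)
r_obs = np.empty_like(r_true)
r_obs[0] = r_true[0]
for t in range(1, len(r_true)):
    r_obs[t] = 0.7 * r_true[t] + 0.3 * r_obs[t - 1]

r_recovered = unsmooth(r_obs, 0.3)  # recovers r_true[1:]
```

Because the smoothing filter averages returns over time, the observed series has lower variance than the true one; unsmoothing restores the wider range that the risk metrics should be measuring.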
    • Risk Measurement (cont.) – Additional Practical Considerations
      Daily data                                            SPX      GSCI     NAREIT
      Raw data
        vol                                                 1.2%     1.4%     1.6%
        CVaR (95%)                                          -2.7%    -3.2%    -3.8%
      Data adjusted for first order autocorrelation (first lag)
        vol                                                 1.1%     1.3%     1.4%
        CVaR (95%)                                          -2.6%    -3.1%    -3.2%

      Monthly data                                          Convert Arb  Distressed  CTA      Macro
      Raw data
        vol                                                 1.44%        1.83%       2.49%    1.62%
        CVaR (95%)                                          -3.12%       -3.88%      -4.01%   -2.15%
      Data adjusted for first order autocorrelation (first lag)
        vol                                                 3.61%        3.59%       2.70%    2.36%
        CVaR (95%)                                          -7.94%       -8.56%      -4.39%   -3.57% 49
    • Risk Measurement (cont.) – Additional Practical Considerations
      • Risk factors in a portfolio
        – Most of portfolio performance and its variance can be mapped to the following risk factors
          • Market risk (e.g., MSCI World index)
          • Interest rate risk (e.g., 2 year US Treasury bond)
          • Yield curve risk (e.g., spread between 10 year and 2 year US Treasury bonds)
          • Credit risk (e.g., spread between Moody's BAA average rate and 10 year US Treasury bond)
          • Inflation risk (e.g., SP GSCI index)
          • Currency risk (e.g., US Dollar index)
          • Liquidity risk (e.g., spread between 3 month LIBOR and 3 month US Treasury bill)
          • Sentiment (e.g., VIX index)
        – Some of the above factors (e.g., VIX) are highly non-linear
        – Other factors may be added or substituted, keeping in mind the issue of overfitting
        – Most portfolios are primarily exposed to market and interest rate risks and do not vary their exposures much (more in asset allocation section) 50
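Mapping portfolio performance to risk factors is usually done by regressing portfolio returns on factor returns; a sketch on synthetic data (the three factors, their volatilities and the betas are hypothetical stand-ins for the factor list above):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500  # hypothetical daily observations

# synthetic factor returns: market, rates, credit (illustrative volatilities)
factors = rng.normal(0.0, [0.010, 0.003, 0.002], size=(n, 3))
true_betas = np.array([0.9, -2.0, 1.5])
port = factors @ true_betas + rng.normal(0.0, 0.002, size=n)  # portfolio returns

# map portfolio performance to risk factors via least squares (with intercept)
X = np.column_stack([np.ones(n), factors])
coefs, *_ = np.linalg.lstsq(X, port, rcond=None)
alpha, betas = coefs[0], coefs[1:]
```

The recovered betas show where the portfolio's variance actually comes from; in practice the residual term would be larger and the exposures time varying, which is the slide's point about concentrated, static exposures.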
    • Risk Measurement (cont.) - Summary
      • Key empirical facts
        – Losses produced are regularly larger than the magnitude captured by classical and typical practitioner risk metrics
          • Such deviations are larger when deviations from normality are larger
        – Like returns and volatility, losses are very time sensitive and non-linear
        – Most risk metrics used do not satisfy desirable properties
        – Most of portfolio performance and variance can be mapped to a few factors
        – Investors generally have highly concentrated risk exposures that do not vary significantly over time
      • Implications
        – Focus on coherent risk
        – Incorporate non-normality in the risk measurement process
        – Understand the purpose of a particular result – CVaR based on a rolling window (e.g., the last 12 months of data) is not useful if one is looking to understand the full extent of loss scenarios; time weighting of a risk measure suffers from the same limitations as the EWMA volatility approach discussed in the prior section
        – Risk is very time varying with a large range => looking at the full range of data, rather than just the average, can produce useful insights (subject to the point above)
        – As returns are highly non-linear, more selective and time varying risk factor exposures may improve portfolio performance 51
    • Risk and Return
      • Motivation – How to judge if the return generated is appropriate for the risk taken?
      • Classical perspective
        – Sharpe Ratio – the dominant approach
          • Compares return above a risk free rate to volatility
          • All the limitations discussed previously about volatility as a measure of risk apply to the Sharpe Ratio
          • Recent extensions use VaR in the denominator instead of volatility (called adjusted SR); all the previously discussed limitations of various VaR versions apply
        – Treynor Ratio
          • Compares return above a risk free rate to systematic risk as measured by beta
          • Beta does not measure actual losses; extremely time sensitive; all CAPM and APT (and classical regression) limitations apply when deriving beta
        – More advanced measures
          • Sortino Ratio
            – Compares return above a risk free rate to downside deviation
            – All the limitations discussed previously about volatility and its downside version as a measure of risk apply to the Sortino Ratio
          • Calmar Ratio
            – Compares return above a risk free rate to the worst period return
            – Limited view of losses as it utilizes just one data point 52
    • Risk and Return (cont.)
      • Post modern tools
        – Use CVaR in the denominator when comparing excess returns
          • Focuses on negative tails
          • Doesn't incorporate other moments
        – Omega Ratio (Shadwick et al (2002))
          • Probability weighted gains above a threshold value compared to probability weighted losses below the threshold
          • Fully incorporates all moments of a distribution (going beyond even the first 4 moments) as it utilizes all of the historical data
          • Requires no distributional assumptions
          • Utilizes historical data (all limitations of such an approach noted previously apply)
          • By definition, cannot distinguish between the magnitudes of potential losses for different return time series
        – Rachev Ratio
          • Omega ratio for positive and negative tails
          • Compares probability weighted extreme gains to probability weighted extreme losses at some percentile (e.g., 5%)
      • The greater the abnormality of a return time series, the greater the differences in rankings of return time series
      • Low volatilities with high returns imply that the data is excessively smooth (traditional volatility understates the range of returns for such a series) and that risk is pushed to the negative tail 53
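The Omega ratio can be computed directly from historical data with no distributional assumptions; a minimal sketch:

```python
import numpy as np

def omega_ratio(returns, threshold=0.0):
    """Omega(threshold): probability-weighted gains above the threshold
    divided by probability-weighted losses below it, using all of the data."""
    r = np.asarray(returns, dtype=float)
    gains = np.clip(r - threshold, 0.0, None).mean()
    losses = np.clip(threshold - r, 0.0, None).mean()
    return gains / losses

# a series symmetric around 0 has Omega(0) = 1; raising the threshold lowers it
sym = [-0.02, -0.01, 0.01, 0.02]
```

Because every observation enters either the gain or the loss side, the ratio reflects the whole shape of the distribution, not just its first two moments; the threshold lets the investor encode a target return.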
    • Risk and Return (cont.)
      • Example: Extreme Lottery – Which option should you choose?
        • Buy payoff
          – a loss of $0.9, 999,999 times in 1 million
          – a gain of $999,999, one time in 1 million
        • Sell payoff
          – a gain of $0.9, 999,999 times in 1 million
          – a loss of $999,999, one time in 1 million
        • Both have the same mean of $0.9 and a variance of $999,999
      • The investor
        • Is indifferent with the Sharpe Ratio
        • Prefers Buy with the CVaR-based Sharpe Ratio
        • Prefers Buy with the Sortino Ratio
        • Prefers Buy with Omega with the threshold set below the mean but prefers Sell with the threshold set above the mean 54
    • Risk and Return (cont.) In the classical world, investor picks A. However, if returns above 2% are targeted, B becomes more attractive as it’s more likely to produce such returns because of higher volatility (Charts are from Urbani (2005)) 55
    • Risk and Return (cont.)
      1/2/1990-1/31/2010, daily, raw data; risk free rate assumed to be 0%
                           SPX      GSCI     NAREIT   JPMAGG   VIX
      SR                   0.025    0.012    0.027    0.104    0.001
      SR (CVaR, 95%)       0.011    0.005    0.011    0.048    0.001
      omega ratio (0%)     1.1      1.1      1.1      1.3      1.1
      omega ratio (.02%)   1.0      1.0      1.1      1.1      1.1 56
    • Risk and Return (cont.)
      CISDM Alternative Strategies Indices, 12/31/1991 - 12/31/2009
      Monthly, raw data; risk free rate assumed to be 0%
                        Total EW  Convert Arb  Distressed  Merger Arb  CTA    Macro  Long/Short  FOF    EMN    EM     FI Arb
      SR                0.46      0.54         0.49        0.71        0.27   0.48   0.42        0.43   1.16   0.22   0.39
      Rank              6         3            4           2           10     5      9           7      1      11     8
      SR (CVaR, 95%)    0.25      0.25         0.23        0.39        0.17   0.36   0.24        0.19   0.92   0.09   0.16
      Rank              5         6            7           2           9      3      4           8      1      11     10
      omega ratio (0%)  3.4       5.0          4.0         6.7         2.1    4.7    3.1         3.2    18.5   2.0    3.7
      Rank              7         3            5           2           10     4      9           8      1      11     6
      omega ratio (1%)  1.0       0.6          0.9         0.6         0.7    0.7    0.9         0.5    0.2    0.9    1.9
      Rank              2         9            5           8           6      7      4           10     11     3      1 57
    • Risk and Return (cont.)
      • Unsmoothed data impacts rankings further as 'smooth' strategies are adjusted
      CISDM Alternative Strategies Indices, 12/31/1991 - 12/31/2009
      monthly data; unsmoothed data (first lag)
                        Convert Arb  Distressed  CTA    Macro
      SR                0.22         0.25        0.25   0.33
      SR (CVaR, 95%)    0.10         0.11        0.15   0.22 58
    • Risk and Return (cont.) - Summary • Classical risk/return ratios cannot correctly rank return time series on a consistent basis if non-normality exists • CVaR based ratio and Omega ratio are proposed that more fully incorporate useful information of a time series probability distribution 59
    • Dependencies • Motivation – How to measure the direction and strength of relationships among securities? – Applications in risk management, forecasting and optimization • Classical perspective and current practice – Correlation (formally, Pearson correlation coefficient) • For securities X and Y, corr = covariance of X,Y/ (vol of X * vol Y) • Linear measure • Calculate based on the historical sample • Incorporates limited information about distributions 60
    • Dependencies (cont.) – Anscombe's quartet: same correlation, mean and volatility (from Wikipedia) 61
    • Dependencies (cont.) – Other metrics/approaches
      • Shrinkage estimator for correlations
        – Minimizes noise in the correlations (see also optimization)
        – As with returns, outliers may provide valuable information
      • Extract key factors driving correlations (principal component analysis) and model correlations with them
        – As with returns, relationships across securities can be mapped to very few factors
        – The rest is statistical noise
      • Rank correlation coefficients (Spearman's rank or Kendall's tau)
        – Sometimes used as substitutes for Pearson's correlation to reduce the amount of calculation or make the result less sensitive to non-normality
        – However, they measure the strength of association rather than dependence => they will increase as long as the signs for both variables match, even if the individual % changes of the return variables are very different in magnitude 62
    • Dependencies (cont.) – Post modern perspective
      • "Correlation is a minefield for the unwary" (Embrechts et al (2002))
        – Correlation is not constant under transformation of variables => e.g., corr of X and Y != corr of ln(X) and ln(Y)
        – Feasible values for correlation depend on the marginal (individual) distributions of the variables => e.g., if ln(X) is standard normal and ln(Y) is N(0,4) (mean of 0 and variance of 4), then a correlation of more than 2/3 or less than -0.09 is impossible
        – Perfect positive dependence does not imply a correlation of 1 (similarly for perfect negative dependence) => in the above example, perfect positive dependence gives a correlation of 2/3 and perfect negative dependence gives -0.09
        – Zero correlation does not imply independence => e.g., if ln(X) is N(0,variance), the minimum correlation goes to 0 as the variance rises, though this minimum occurs when the variables are perfectly negatively dependent
      • Like volatility, correlation is very time sensitive (see p.18 and next page)
      • Pearson correlation may be justified only in the special case of a multivariate normal or t distribution; however, it completely defines dependence only in the multivariate normal case (at extremes, correlation is zero for such a distribution)
      • If the normal distribution is extended to the multivariate case, the probability of a correlation of 1 is 0 63
    • Dependencies (cont.) – Empirical Evidence
      Full sample
                SPX        GSCI       NAREIT     JPMAGG     VIX
      SPX       1
      GSCI      0.614909   1
      NAREIT    0.828916   0.846607   1
      JPMAGG    0.822595   0.770811   0.864026   1
      VIX       0.155275   0.061055   -0.02932   0.276768   1
      VIX <= 20
                SPX        GSCI       NAREIT     JPMAGG     VIX
      SPX       1
      GSCI      0.898321   1
      NAREIT    0.974663   0.929054   1
      JPMAGG    0.910627   0.728673   0.883676   1
      VIX       -0.46543   -0.28484   -0.44793   -0.67316   1 64
    • Dependencies (cont.) – Empirical Evidence
      VIX > 30 & <= 35
                SPX        GSCI       NAREIT     JPMAGG     VIX
      SPX       1
      GSCI      -0.28886   1
      NAREIT    0.308306   -0.53652   1
      JPMAGG    -0.71221   0.149066   0.116173   1
      VIX       -0.73975   -0.1188    0.32146    0.842166   1
      VIX > 40
                SPX        GSCI       NAREIT     JPMAGG     VIX
      SPX       1
      GSCI      0.947808   1
      NAREIT    0.9922     0.960259   1
      JPMAGG    -0.66841   -0.81168   -0.69337   1
      VIX       -0.83997   -0.70865   -0.82259   0.232618   1 65
    • Dependencies (cont.) - Empirical Evidence
      • Dependence is
        – Time varying
        – Non-linear and asymmetric
        – Converging to 1 across securities at times of stress; there is also convergence in booms but generally less than at market crashes 66
    • Dependencies (cont.) – Post Modern Perspective
      • Alternatives
        – Correlation ratio
          • A more general measure of dependency, which can capture non-linearity
          • In the special case of the normal, it is equal to the square of the Pearson correlation
          • Only measures the strength of the relationship, not its direction
        – Dynamic conditional correlation (DCC, Engle (2002)) – a GARCH-like model to dynamically model correlations; can be combined with copulas to include time variance (see below)
        – Entropy based mutual information/total correlation
          • A general measure of dependency, which can capture non-linearity
          • Entropy is a measure of the uncertainty associated with a random variable
          • Greater mathematical and computational requirements than correlation
        – Copula
          • A function to model dependencies
          • Independent of the marginal (individual) distributions of the variables, which can be modeled separately
          • Can apply probability tools and be incorporated into other analytical needs which require dependency measures
          • Can capture non-linearity (fat tails at extremes) – e.g., the t copula (however, its joint density of booms and busts is symmetric); other copulas can also capture tail dependence and allow each tail to be modeled separately
          • The normal or Gaussian copula cannot capture tail dependence and should be avoided (with perfect positive or negative dependence, the normal copula is not defined)
          • Can construct one's own copulas via own parameters or mixtures of existing copulas
          • Must determine appropriate distributions for the individual variables and their dependence
          • Most copulas apply the same structure to all securities in a portfolio across time
          • However, new research (e.g., grouped t and dynamic grouped t copulas) allows each relationship to be modeled separately and introduces time variability
          • Tradeoff between time variance/different tail structure across pairs and non-symmetric tails
          • Which data to use to estimate – historical, most recent or some time weighted mechanism? All previously discussed limitations apply 67
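The difference in tail behavior between the Gaussian and t copulas can be checked by simulation: at the same correlation, draws from a t copula produce far more joint lower-tail events (a sketch with illustrative parameters rho = 0.5 and nu = 3 degrees of freedom):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, rho, nu = 200_000, 0.5, 3

# correlated standard normals via a Cholesky factor
L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
z = rng.standard_normal((n, 2)) @ L.T

# Gaussian copula: map to uniforms via the normal CDF
u_gauss = stats.norm.cdf(z)

# t copula: scale by a shared chi-square mixing variable, map via the t CDF
w = rng.chisquare(nu, size=(n, 1))
t_draws = z / np.sqrt(w / nu)
u_t = stats.t.cdf(t_draws, df=nu)

def joint_tail_prob(u, q=0.01):
    """P(U1 < q and U2 < q): probability both variables crash together."""
    return np.mean((u[:, 0] < q) & (u[:, 1] < q))
```

The Gaussian copula's joint crash probability shrinks toward independence in the extreme tail, while the t copula keeps a sizable probability of simultaneous extremes — which matches the observed convergence of securities at times of stress.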
    • Dependencies (cont.) – Empirical Evidence
      [Scatter plots, SP500 vs NAREIT daily returns, 1/1/2008 - 12/31/2008: observed returns; 1000 randomly generated numbers via Gaussian copula and Gaussian marginal density; 1000 randomly generated numbers via Gaussian copula and empirical marginal density] 68
    • Dependencies (cont.) – Empirical Evidence
      [Scatter plots, SP500 vs NAREIT daily returns, 1/1/2008 - 12/31/2008: observed returns; 1000 randomly generated numbers via t copula and empirical marginal density; 1000 randomly generated numbers via Clayton copula and empirical marginal density] 69
    • Dependencies (cont.)– Empirical Evidence • Key observations – Assumptions of normal copula and normal distributions for each time series do not produce a realistic picture of tail results, p.68 – Student t or Clayton copula better incorporate empirically observed convergence among assets at the negative tail, p.69 70
    • Dependencies (cont.) – Empirical Evidence
      • For a slightly different perspective, if rank correlations are assumed to be much lower (.2), one can see even more clearly that the normal copula cannot capture the possibility of extreme events and tail dependence (the upper right and lower left corners are empty)
      [Scatter plots of 1000 randomly generated numbers via Gaussian copula vs t copula, each with empirical marginal density, SP500 and NAREIT 2008 daily returns with rank correlation of .2] 71
    • Dependencies (cont.) - Summary
      • Key empirical facts
        – Dependencies among securities are
          • Time varying (reliance on the average can be dangerous)
          • Non-linear and asymmetric
          • Convergent to 1 in times of stress
        – Pearson correlation suffers from numerous theoretical and empirical problems
        – Extension of the normal distribution to the multivariate case does not reflect reality
      • Implications
        – Can incorporate empirical features of markets via more complex functions such as copulas
          • Copulas allow modeling of the dependencies of distributions and realistically reflect extreme event behavior (e.g., the t copula, but never the Gaussian copula) 72
    • Risk Management • Motivation – Now that risk can be measured appropriately, how can it be managed? • Classical perspective and current practice – Returns follow a consistent process, best characterized by the normal distribution – Volatility and VaR focused – Diversification – have a large set of securities with a broad group of exposures – Stress testing (generally, one variable at a time) – Use correlation to model relationships 73
    • Risk Management (cont.) – Post Modern Perspective
      • All the limitations discussed previously on volatility, VaR and correlation apply
      • Model the return distribution of a security separately from the relationships among securities
      • Use copulas to realistically model non-linear dependence among securities, particularly in the negative tails
      • When conducting stress tests, incorporate how a change in one variable will also affect others – e.g., if the SP500 declines by 10% in one day, rates and spreads will also be affected
      • Options to minimize tail risks
        – Do not hold risky assets
        – Add more tail-safe assets to your portfolio – e.g., US Treasury bills
        – Add options that will produce a positive payout at tail events
        – Add strategies/instruments that may help preserve capital at tail events as compared to the rest of the portfolio – e.g., gold, trend following
        – Loss limits 74
    • Risk Management (cont.) – Post Modern Perspective
      • Diversification (see prior sections for supporting data/discussion – e.g., pp. 19, 64, 65)
        – The primary purpose should be to protect the portfolio at times of stress
        – Though possible at the asset level, diversification benefits are highly concentrated due to convergence to 1 for most securities at the negative tail
        – Sector and country diversification within an asset class does not provide meaningful diversification benefits when they are needed most
        – At positive tails, there is also some convergence
        – Therefore, as currently applied, diversification dilutes returns from top performers in good times and does not provide sufficient protection in crashes
        – Hundreds of securities in a portfolio may help eliminate company and industry specific risks but will leave one with just a concentrated market risk exposure
        – More broadly, keeping risk exposures relatively constant and concentrated in market and interest rate risks implies that one expects those exposures to do very well in all future environments and to outperform other risk options
      • Portfolio flexibility should be consistent with
        – Observed behavior of returns
          • If returns have multiple states of uncertain timing, duration and results, one should have sufficient flexibility to incorporate such changes in the portfolio
          • Relatively static policies make sense if one expects the environment optimal for such a policy to dominate in the future
        – Investor risk tolerance
          • The lower the tolerance for losses, the more important the flexibility to quickly incorporate any changes in the market environment 75
    • Markets
      • Motivation – What are the features of markets and their potential impact on the investment process?
      • Classical perspective
        – Rational investors
        – Efficient market hypothesis
          • Prices reflect all publicly available information
        – Periods of instability are rare
        – Market inefficiencies are not possible
        – Empirically limited support 76
    • Markets (cont.) – Post Modern Perspective
      • Humans are not wired for rational investing (see next section)
      • Markets as complex, dynamic and adaptive (or evolutionary) systems
        – Adaptive markets hypothesis (Lo (2004))
          • Investors with widely different goals and analyses who constantly adapt and adjust their expectations
      • Key Implications
        – The risk/reward relationship is not stable over time
        – There are arbitrage opportunities from time to time
        – Survival is the only objective that matters; profit and utility maximization are secondary
        – Innovation is key to survival – as risk/reward changes through time, the best way to achieve consistent targeted returns is to be able to adapt to changing conditions
        – A market with many participants and scarce resources will be efficient; a market with many resources but few participants will be inefficient
        – Periods of instability are regular
        – Different strategies/sectors etc. have time varying performance as investors adapt – e.g., as an investment strategy performs well, capital rushes in, driving down returns; as investors leave and adjust their return expectations, the strategy becomes attractive again and the cycle starts again
        – Irrational outcomes can result even if everyone is rational
        – 'Stupid' individual actors can produce a smart system outcome
        – To understand a complex adaptive system, one must study the whole, not its parts; focus on collective behavior
        – Changes to one component of the complex system may have unintended consequences for the whole
        – Small changes in starting conditions can produce drastically different results (e.g., parallels with weather patterns and their relationship to chaos theory, though markets appear to be even harder to model than chaos)
        – Can explain a lot of anomalies from a classical perspective – significantly time varying returns, tail dependence, etc.
        – Implies that reliable forecasts are very hard to produce, especially for the long term horizon (as with weather); however, under certain circumstances and with appropriate tools, it may be possible to judge the direction (trend) or ranked order, even if not the magnitude, of price changes (see forecasting section)
      • The concept of average is dangerous for such a system, as one number may be very unrepresentative of the range of potential outcomes or their probabilities 77
    • Human Decision Making and Investor Edge
      • Motivation – How do humans gather, process and analyze information, and how can an investor have an edge over others?
      • Classical perspective
        – Investors are rational actors, who logically gather and analyze all relevant information
        – Very limited support 78
    • Human Decision Making and Investor Edge (cont.) – Post Modern Perspective
      • Humans are not wired for decisions within a complex, dynamic system (e.g., investing)
        – In complex situations, humans look for simplified patterns that may not be optimal
        – Humans are sensitive to social and situational influences that may bias decisions
        – Extensive list of behavioral biases developed by evolution for the different purpose of survival; a lot of key research by Kahneman, Tversky, Thaler and others
        – Examples:
          • Anchoring or recent history bias – weighing recent events more than earlier events and extrapolating into the future
          • Herding – doing what others are doing
          • Confirmation bias – seeking and processing information that supports your argument
          • Risk aversion – humans disproportionately hate losses (e.g., the loss of $1 is felt much more strongly than the gain of $1)
          • Disposition effect – selling winners and holding on to losers
          • Overconfidence
          • Illusion of control – the belief that one can influence or control the outcome when one cannot
          • Neglect of probability – the tendency to completely disregard probability when making a decision under uncertainty
          • Outcome bias – the tendency to judge a decision by its eventual outcome instead of the quality of the decision at the time it was made
          • Status quo bias – the tendency to prefer things to stay as they are
          • Wishful thinking – the tendency to form beliefs and make decisions based on what is pleasant to imagine rather than what is rational or supported by evidence
          • Hindsight bias – the tendency to think that past events were predictable
          • Halo effect – the tendency to make specific inferences based on general impressions (e.g., identifying features of successful companies and then recommending that other companies adopt such practices if they want to be successful)
        – Intuition
          • Works best in stable environments where conditions remain largely unchanged (e.g., chess), feedback is clear and cause and effect relationships are linear
          • Fails when the system is complex and changing, especially with multiple phases
        – Fat tails in financial markets are a result of human excessive optimism or pessimism 79
    • Human Decision Making and Investor Edge (cont.) – Investor Edge • 3 possible sources – Information • Going beyond general databases in ways others can't easily replicate (e.g., research of a company's new drug reveals safety issues) • Humans will outperform machines – Analysis • One can analyze data better than others (e.g., see patterns before others can) • Machines will outperform humans – Behavior • Behave in ways that consistently correct for various human biases and incorporate the complex, dynamic behavior of markets • Machines will outperform humans • Perspective from games of chance – Win if the probability of a favorable outcome is greater than 50%, even if the ratio of the winning payout to the losing payout is 1 – If the probability of a favorable outcome is less than 50%, the ratio of the winning payout to the losing payout must be sufficiently greater than 1 to offset losses 80
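The games-of-chance arithmetic above can be sketched in a few lines (a hypothetical illustration; the function name is my own, not from the slides):

```python
def expected_edge(p, r):
    """Expected profit per unit risked, given win probability p and
    a winning-payout-to-losing-payout ratio r (losing payout = 1)."""
    return p * r - (1 - p)

# A 50/50 game with even payouts has no edge:
print(expected_edge(0.5, 1.0))  # 0.0

# A sub-50% win rate can still be favorable if r is large enough:
print(expected_edge(0.4, 2.0))  # positive (0.8 won vs 0.6 lost per bet)
```

A positive value means the game is worth playing; a negative value means it is not, no matter how the bet is sized.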
    • Forecasting • Motivation – Ideally, want to be able to forecast reliably the magnitude, sign (positive or negative) and timing of returns – How reasonable are those goals and what tools can be used? • Classical perspective and current practice – Forecasting is not possible as current prices reflect all available info – As the return process is consistent, historical figures will provide the best picture of the future – Experts • Talk to those who are perceived to be able to forecast the future – Factor models • Find key factors explaining return variance historically (e.g., P/B, P/E, dividend yield); use current factor information to estimate returns – Monte Carlo simulation (relatively rare) • Returns are assumed to be iid (independent and identically distributed) • Relationships are incorporated via correlations 81
    • Forecasting (cont.) – Post Modern Perspective • Returns don't follow a consistent process – Multiple states with various features – Positive momentum in the short to mid term; reversal in the mid to long term and very short term • Factor analysis – Is there statistical significance to the relationships being modeled? Adjust for heteroskedasticity and autocorrelation, if any – Explaining return variance via regression and predicting returns via regression are separate tasks – Regressions may evaluate a false (spurious) relationship: explanatory power for return variance is very high and statistical significance for factors is high, but there is no economic meaning • Integrated time series – A non-stationary time series is integrated of order 1 if the difference between the current value and its value one period ago is stationary across time => such a time series is random and its variance is time dependent – Unit root tests can find such a series – Financial ratios and other variables are often close to being integrated of order 1 – Generally, regressing one integrated time series on another will produce spurious predictive regressions » Meaningful regressions for such cases are possible if the integrated time series are also cointegrated (see later) – One can test data for such issues via unit root tests – Regression generally produces meaningful results only for stationary processes – Regression is not flexible enough to capture changes in the environment – Regression cannot incorporate both momentum and reversal • Experts – Worse results in a complex environment than the group (see next page) 82
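The differencing idea behind an order-1 integrated series can be illustrated simply (a minimal sketch with simulated data; the function name is my own):

```python
import random

def first_difference(series):
    """Difference between each value and its value one period ago."""
    return [b - a for a, b in zip(series, series[1:])]

# Build a random walk (integrated of order 1) by cumulating stationary
# shocks; differencing it recovers exactly those stationary shocks.
rng = random.Random(0)
shocks = [rng.gauss(0, 1) for _ in range(100)]
walk, level = [], 0.0
for s in shocks:
    level += s
    walk.append(level)

recovered = first_difference([0.0] + walk)
assert all(abs(a - b) < 1e-12 for a, b in zip(recovered, shocks))
```

The walk itself drifts without bound (time-dependent variance), while its first differences are the stationary series a regression can safely use.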
    • Forecasting (cont.) – Post Modern Perspective: Alternative Approaches • Ask the crowd, not the expert: the wisdom of crowds – Collective error = avg individual error – prediction diversity – A diverse crowd will always predict better than the average individual – Improving the ability of the avg participant and the diversity of the crowd improves the forecast – All members of the group do not need to be 'smart' to produce a better forecast than an expert However, – Diversity is likely to fail at booms and busts because of herding – Even if the crowd produces the best forecast, it's not necessarily correct

The value of experts (source: p. 40 of Michael Mauboussin's Think Twice (2009)):

Domain               Rules based,            Rules based,       Probabilistic,         Probabilistic,
                     limited range of        wide range of      limited range of       wide range of
                     outcomes                outcomes           outcomes               outcomes
Expert performance   Worse than computers    Generally better   Equal to or worse      Worse than group
                                             than computers     than group
Expert agreement     High                    Moderate           Moderate / Low         Low
Example              Credit scoring          Chess              Poker                  Stock market; economy

83
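The error decomposition above (collective error = average individual error − prediction diversity) can be verified numerically with squared errors (a sketch using hypothetical predictions):

```python
def crowd_error_decomposition(predictions, truth):
    """Return (collective error, avg individual error, diversity),
    all as mean squared quantities around the crowd average."""
    n = len(predictions)
    avg = sum(predictions) / n
    collective = (avg - truth) ** 2
    avg_individual = sum((p - truth) ** 2 for p in predictions) / n
    diversity = sum((p - avg) ** 2 for p in predictions) / n
    return collective, avg_individual, diversity

c, e, d = crowd_error_decomposition([10.0, 20.0, 30.0], truth=18.0)
assert abs(c - (e - d)) < 1e-9  # the identity holds exactly
```

Note that the crowd error falls either when members get individually better (smaller e) or when they disagree more (larger d), which is why herding, by collapsing d, degrades the crowd forecast.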
    • Forecasting (cont.) – Post Modern Perspective: Alternative Approaches • Econometric approaches – Quantile regression • Complete a regression for each quantile of the data to get a better view of non-linearity and the multi state features of a time series • Factor coefficients may be slow to adjust to changes in the environment • Limitations of regression analysis apply – Regime switching models • Incorporate different probabilities of state transitions • As those probabilities change, the model may not adjust quickly to changes in the environment – Other econometric approaches • Can incorporate and replicate important empirical features such as non-linearity of returns, momentum and reversal, volatility clustering and asymmetric impact on securities • Examples – Future returns are related to past returns and deviations from the mean (e.g., ARMA – autoregressive moving average – models); autoregressive means that the variable is regressed on its past or lagged values – GARCH based volatility modeling may be integrated for analyzing volatility and its impact on returns (ARMA-GARCH model) – Relationships across multiple time series (securities or factors) can be modeled via VAR (vector autoregressive) models; to reduce the required number of estimates, one can focus on key factors expected to drive returns via VAR related dynamic factor and state space models – Cointegration – informally, 2 time series are cointegrated if they stay close together even if, individually, they are random. Cointegrated models allow one to model the long term dynamics of a relationship and its short term dynamics separately. However, • The number of parameters to estimate can become very large for VAR models • Cointegration is difficult to identify in large amounts of data and methods for testing and estimation generally apply to a limited number of processes (e.g., 10-20) • Model assumptions may be too restrictive 84
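As a minimal illustration of the autoregressive idea (an AR(1) sketch only, not a full ARMA-GARCH implementation; all parameter values are hypothetical):

```python
import random

def simulate_ar1(phi, mu, sigma, n, seed=1):
    """Simulate an AR(1) process: x_t = mu + phi*(x_{t-1} - mu) + noise."""
    rng = random.Random(seed)
    x, path = mu, []
    for _ in range(n):
        x = mu + phi * (x - mu) + rng.gauss(0, sigma)
        path.append(x)
    return path

def ar1_one_step_forecast(last_value, phi, mu):
    """Best one-step-ahead forecast under the AR(1) model."""
    return mu + phi * (last_value - mu)

# With phi = 0.5, a value above the mean is forecast to revert
# halfway back toward it (approximately 0.03 here):
forecast = ar1_one_step_forecast(0.05, phi=0.5, mu=0.01)
```

A positive phi produces the momentum-like persistence the slide mentions; a negative phi produces reversal. Neither by itself captures both, which is why the richer models above exist.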
    • Forecasting (cont.) – Post Modern Perspective: Alternative Approaches • Statistical approaches – Simulation • The primary purpose is to understand the potential behavior of a system in a realistic set of scenarios • Must specify the return behavior (e.g., historical distribution, ARMA-GARCH model, stochastic differential equation) and the relationships across securities • If returns are path dependent (not iid), Markov Chain Monte Carlo methods will produce more realistic scenarios • Specify dependencies via copulas • Classical inputs of normality and correlation will not produce a realistic set of scenarios – Statistical learning • 'If… then' Bayesian type approach • Identify and classify prior patterns • As information arrives, make the decision offering the best chance of success based on prior relevant data • Depending on the approach chosen, can explicitly analyze probabilities of a range of potential outcomes • Update information about patterns and results as new information arrives • Examples – Neural networks – Classification trees • Can handle any feature of data, including non-linearity • No distributional assumptions required • Must make sure that the initial data to 'learn' on is relevant for the variable one is trying to model • Must be careful not to overfit historical data 85
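A minimal historical-bootstrap simulation (the simple iid version; as noted above, path-dependent returns would call for Markov Chain Monte Carlo instead — the sample returns below are hypothetical):

```python
import random

def bootstrap_terminal_wealth(returns, horizon, n_paths, seed=7):
    """Resample historical returns with replacement (iid assumption)
    and compound them to terminal wealth over the horizon."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        wealth = 1.0
        for _ in range(horizon):
            wealth *= 1.0 + rng.choice(returns)
        paths.append(wealth)
    return paths

sample = [0.02, -0.01, 0.03, -0.04, 0.01]     # hypothetical monthly returns
paths = bootstrap_terminal_wealth(sample, horizon=12, n_paths=1000)
```

The resulting distribution of terminal wealth (not a single point forecast) is what the simulation is for: it shows the range of outcomes consistent with the assumed return behavior.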
    • Forecasting (cont.) – Post Modern Perspective: Other Topics • Forecasting model evaluation – Estimate the model on a portion of the data – Test it on the remaining data – The smaller the deviation of forecasts from actual results, the better – The model can be tailored to a particular instrument/market or be general – The less it relies on restrictive assumptions and a large set of inputs, the more likely it is to handle general conditions well • Other issues in forecasting – Empirically, returns do not follow a consistent process: they have multiple states with multiple features whose duration is variable => complex dynamic system Therefore, – Generally, it is very hard to forecast the magnitude, sign and duration of returns, especially over the mid to long term • Returns have multiple states whose duration changes – It is especially difficult to accurately forecast the magnitude of returns on a consistent basis – It is easier to forecast those time series that have strong trends and high serial correlation – Because of clustering, volatility is easier to forecast than returns, but the horizon is still limited due to the complexity of markets • One can then use volatility forecasts (and any other relevant info) to estimate the average impact on returns – magnitude and sign • Duration of a volatility state is variable • Shifts between volatility states are very hard to predict accurately as there's a strong tendency to stay in the current state; however, one can identify whether a shift has occurred, and market crashes are preceded by a progression from low to high volatility regimes – Because of momentum effects in the short to mid term and reversal in the mid to long term, it is easier to forecast the direction and rank of return time series than the magnitude and duration of returns – The more long term the forecasting focus, the more useful loss limits are as a risk management tool to protect capital from sharp changes in the environment not yet captured by the model due to time lags (or errors in the model itself) – Forecasting markets is more complex than forecasting weather (see prior market section) • How would you react if someone claimed to be able to forecast reliably temperature and other weather conditions for the day one month from now? 6 months from now? 1 year from now? – Weather forecasts are generally reliable only for a short period of time • Seasonality of weather may help you determine the likely range of potential outcomes and the direction of temperature compared to the prior months and the future months; one may apply this approach to other weather related info • Many markets are not seasonal; even seasonal markets are affected by other forces which may outweigh seasonal effects 86
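The estimate-on-a-portion, test-on-the-rest procedure can be sketched as a walk-forward loop (a hypothetical naive-mean forecaster stands in for any model; names and data are my own):

```python
def walk_forward_mse(series, forecaster, split):
    """Forecast one step at a time on the held-out portion and
    return the mean squared forecast error."""
    history = list(series[:split])
    errors = []
    for actual in series[split:]:
        prediction = forecaster(history)
        errors.append((prediction - actual) ** 2)
        history.append(actual)   # expand the estimation window
    return sum(errors) / len(errors)

# Placeholder model: forecast the mean of everything seen so far.
naive_mean = lambda hist: sum(hist) / len(hist)
mse = walk_forward_mse([1.0, 2.0, 3.0, 4.0], naive_mean, split=2)
```

Competing models are then compared on this out-of-sample error, never on in-sample fit, which is exactly the point of holding data back.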
    • Forecasting (cont.) – Post Modern Perspective: Other Topics • Stock market is perceived to be an excellent forecaster [Chart: SP500 Total Return Index, 9/3/2007 - 3/29/2009; level on the y-axis, date on the x-axis] – Evidence • The stock market will achieve multiple peaks before the bear market starts and will have multiple false rebounds before the true bottom – People don't always correctly anticipate events – With perfect hindsight, the stock market will seem an excellent forecaster when in fact its record is quite mixed – However, its record may be better than that of many other forecasters • Example – the recent bear market in the chart – Since peaking in Oct. '07, there were multiple rallies of the index, with some exceeding 10%+ (March-April '08, late October, late November – January), before the final bottom in March '09 87
    • Optimization • Motivation – How should securities be mixed in a portfolio? • Ideally, the mix reflects the following empirical/practical issues – Uncertainty in the estimation of returns and related metrics due to multiple regimes for variables – Appropriate metrics for risk and risk adjusted returns that incorporate fat tails – Appropriate measure of dependency – Appropriate investor preferences – As few transactions as possible – Multi stage investment horizon typical for most investors • Classical perspective and current practice – Mean variance (MV) optimization • Relies on linear and non-linear programming tools – Ideal for finding the optimal solution from many combinations with no uncertainty => profit maximizing schedule of production in a factory – Objective function must have one clear maximum/minimum to find the optimal solution (e.g., bell shaped curve for a maximum or upside-down bell shaped curve for a minimum); objectives with complex shapes that contain relative maximums/minimums (solution is optimal for a range of outcomes but not for all) in addition to the best solution across all possible combinations cannot be solved with such techniques – Not appropriate for a very large number of combinations • Returns follow a consistent process, characterized by a normal distribution => volatility is an appropriate measure of risk • Investor preferences are defined by gradually decreasing preferences for higher wealth (quadratic utility) • One period view (no transactions can occur in sub-periods between the start and finish of the investment horizon) • Generally, historical inputs for returns, volatilities and correlations • One can establish a relatively static investment program that can produce quality average returns across time while protecting capital at stress points through a broad set of exposures 88
    • Optimization (cont.) – Classical Perspective • Practical issues with MV approach – Highly sensitive to estimation errors in inputs (small changes in inputs produce very large changes in output), leading to frequent re-balancing – Time consuming to solve for a large number of assets/scenarios – Estimation errors for inputs (inputs significantly deviate from the future reality) • Historical mean is not an appropriate forecast of the future as the return series is time varying and exhibits fat tails • Volatility is time varying and exhibits clustering • Correlation is full of statistical noise and time varying • Generally, errors in return estimates are about 10 times more important than errors in the covariance matrix, and errors in the variances are about twice as important as errors in the covariances (p. 211 of Fabozzi et al (2007)) – Model risk (model's results or application are not appropriate for a given problem) • Assumes that inputs are known with certainty • Cannot deal with functions that are non-linear • A time series with large expected return and low variance ('smooth') is always preferred, which leads to • Tendency to produce extreme solutions – securities either get a very large weight or no weight at all in the optimal portfolio ('error maximizer') • Empirical evidence – MV optimized portfolios based on historical data perform poorly in practice due to large estimation and model errors 89
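The sensitivity to inputs is easy to see in the two-asset global minimum variance weight, which has a standard closed form (the variance/covariance numbers below are hypothetical):

```python
def min_variance_weight(var1, var2, cov):
    """Weight on asset 1 in the two-asset global minimum variance
    portfolio: w1 = (var2 - cov) / (var1 + var2 - 2*cov)."""
    return (var2 - cov) / (var1 + var2 - 2.0 * cov)

# Identical assets with zero covariance split 50/50, as expected:
w_base = min_variance_weight(0.04, 0.04, 0.0)

# But when the assets are highly correlated, a tiny change in one
# variance estimate moves the weight a lot (the error-maximizing
# tendency described above):
w_a = min_variance_weight(0.0400, 0.0401, 0.0399)   # roughly 0.67
w_b = min_variance_weight(0.0400, 0.0402, 0.0399)   # roughly 0.75
```

A change in the second variance of about 0.25% of its value shifts the allocation by roughly 8 percentage points, which is why small estimation errors translate into frequent, large re-balancing.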
    • Optimization (cont.) – Classical Perspective • Attempts to improve MV optimization with no changes to the model or its assumptions (focus is on minimizing estimation error of inputs) – Invest in securities at equal weight – Use weights from the global minimum variance portfolio • Does not require any input for returns, which has a disproportionately large impact on estimation error – Adjust correlations for heteroskedasticity and autocorrelation – Robust statistics for inputs • Reduce the impact of outliers in the calculation of inputs • User must incorporate extreme events through separate analysis as inputs will not include that info – Mixture approach • Use a weighted or equal average of a metric in key states to minimize noise • Example – Calculate correlation in low return and high return environments; the input correlation is the average of the two data sets – Shrinkage estimators for inputs • An estimator can be improved (in terms of deviations from observed results) by combining raw data with other information used for estimation; any additional information will result in improvement – Minimize noise in inputs via the principal component technique, which extracts only those factors with the maximum contribution to explaining a variable's variability (very few factors will be relevant) – Constraints • Minimize extreme solutions of very large or no weights • Related to 'shrinking' correlations to reduce the impact of outliers • If significant, they (not the forecasts) will completely determine the portfolio allocation However, • May lose valuable information contained in unconstrained, general results – Diversification indicators • Measure concentration of the portfolio – Black-Litterman approach – derives expected returns • Incorporates the user's views using Bayesian techniques • Minimizes extreme solutions as the base solution is pre-specified (see below) • Assumes a normal distribution (later versions extended to non-normal distributions) • Assumes that the market index (e.g., SP500) is the optimal investment option with no views (empirical evidence suggests that most indices are not mean variance efficient) and base returns are derived from CAPM (see limitations of CAPM previously) – Sensitivity analysis for changes in inputs – Empirical evidence • Equal weight and global minimum variance portfolios generally outperform MV optimized portfolios on a risk adjusted basis (using the Sharpe ratio) • Portfolios built with other techniques also generally outperform MV optimized portfolios based on raw, sample information • Depending on the investor's focus (e.g., returns, volatility, Sharpe Ratio), more complex techniques may or may not outperform the simpler techniques of equal weight and global minimum variance 90
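A simple shrinkage sketch: pull each noisy sample mean toward a common target (the fixed intensity here is a hypothetical choice for illustration; in practice the intensity is itself estimated, e.g. Ledoit-Wolf style for covariance matrices):

```python
def shrink_means(sample_means, target, intensity):
    """Blend each raw estimate with a structured target; intensity in
    [0, 1] controls how far estimates are pulled toward the target."""
    return [(1.0 - intensity) * m + intensity * target for m in sample_means]

raw = [0.12, -0.03, 0.07]            # hypothetical noisy historical means
grand_mean = sum(raw) / len(raw)     # a common shrinkage target
shrunk = shrink_means(raw, grand_mean, intensity=0.5)
# Each estimate moves halfway toward the grand mean, damping outliers
# before they can dominate the optimizer.
```

The spread of the shrunk estimates is strictly narrower than that of the raw ones, which is precisely what blunts the optimizer's tendency to bet heavily on the largest (and likely most error-ridden) estimate.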
    • Optimization (cont.) – Post Modern Perspective • Attempts to address model risk of MV optimization – Change goals of optimization to relax the assumption of normality • As volatility and the Sharpe Ratio have significant limitations, use more empirically supported metrics to optimize on – Examples » CVaR – can be used in optimization via reliable linear programming techniques » Omega ratio for risk/reward analysis • Full scale optimization – From behavioral research, investor preferences are focused on returns, not levels of wealth, and investors are disproportionately averse to losses – Objective is to maximize the adjusted preference function over a sample of returns (possibly subject to some other constraints) – This new function is highly non-linear and can't be solved with the optimization algorithms of the MV approach; must use global search and optimization techniques – Requires return inputs and an investor preference function • The above methods incorporate higher moments (not just returns and volatility) – Skewness and kurtosis are important for returns and portfolio selection – Investors generally like positive skewness and small kurtosis – consistent, positive returns – The objective function is to maximize return and skewness and minimize volatility and kurtosis – Skewness and kurtosis will be even more sensitive to outliers than returns and variance as returns are raised to the third and fourth power – The greater the abnormality of a return series, the greater the gap between mean variance results and the above • Estimation risk may be addressed separately via any of the techniques discussed before 91
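A minimal full-scale optimization sketch for two assets: grid-search the weights, maximizing a loss-averse preference function over the entire empirical return sample (the 2.25 loss-aversion coefficient follows Kahneman-Tversky; the return data are hypothetical):

```python
def loss_averse_utility(r, lam=2.25):
    """Losses are penalized lam times more than equal-sized gains."""
    return r if r >= 0 else lam * r

def full_scale_two_asset(r1, r2, lam=2.25, steps=100):
    """Search portfolio weights on a grid, evaluating the preference
    function over every observation in the return sample."""
    best_w, best_u = 0.0, float("-inf")
    for i in range(steps + 1):
        w = i / steps
        u = sum(loss_averse_utility(w * a + (1 - w) * b, lam)
                for a, b in zip(r1, r2))
        if u > best_u:
            best_w, best_u = w, u
    return best_w

steady = [0.01, 0.01, 0.01, 0.01]        # small but consistent gains
volatile = [0.05, -0.05, 0.05, -0.05]    # same mean, painful losses
w = full_scale_two_asset(steady, volatile)   # prefers the steady asset
```

Because the objective is evaluated point by point on the sample, it needs no distributional assumption at all; the cost is that a grid or global search replaces the closed-form quadratic machinery of MV.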
    • Optimization (cont.) – Post Modern Perspective • Attempts to address model risk of MV optimization (cont.) – Estimation risk is integrated into optimization • Portfolio sampling or simulation based optimization – Method » Simulate possible end period scenarios for securities (see forecasting section on simulation) » Optimize for each scenario using the linear or quadratic programming techniques discussed before » Find average weights across the scenarios – Incorporates (implicitly) uncertainty in the range of scenarios into optimization – Averaging of weights will not guarantee that all constraints will be satisfied – Computationally intensive • Stochastic programming – Explicitly incorporates uncertainty of inputs into optimization and uses a multi stage framework – Can incorporate uncertainty over just 1 stage for some types of models – More realistically, multi stage forecasts (each stage will have multiple states to incorporate uncertainty) for the investment horizon (scenario tree) are simulated and the best weights are selected corresponding to the best path of decisions given the objective through the uncertain scenario tree – Very computationally intensive as the number of assets, stages and possible states within a stage rises (e.g., the total number of scenarios for N assets that are allowed to take just 2 values at each of T stages is 2^(NT)); some problems may be too large to solve • Dynamic optimization – Explicitly incorporates uncertainty of inputs into optimization – Start at the end of the multi stage scenario tree and work backwards, finding the optimal decision at each state of a particular stage – Very computationally intensive • Robust optimization – Separate from robust statistics » In this case, robustness refers to solving for the worst case realizations of the uncertain parameters within a pre-specified uncertainty set for the random inputs – Intuition => securities with the greatest uncertainty will be minimized in the portfolio – Explicitly incorporates uncertainty of inputs into optimization – Can be single or multi stage – Because it limits the range of uncertainty for inputs (uncertainty is pre-specified), the problem becomes more computationally feasible – May produce overly conservative portfolios – The above techniques generally outperform basic MV optimization using sample data, especially as deviations from historical estimates increase – The above techniques may or may not meaningfully outperform simpler methods that focus on estimation risk only 92
    • Optimization (cont.) – Post Modern Perspective • Depending on the complexity of the objective function's shape and the size of the problem, other optimization techniques may be employed, as linear/quadratic methods may not find the solution or may take too long – Genetic algorithms – Simulated annealing – Pattern search – Swarm optimization – Ant colony optimization 93
    • Optimization (cont.) – Post Modern Perspective • Example – 4 alternative strategies (convertible arb, distressed, CTA and long short) are optimized via mean variance and mean CVaR methods – Global minimum portfolios are found • For the mean/variance method, this approach avoids return inputs, which carry the highest estimation error • For the mean/CVaR method, historical returns are used as possible scenarios – Selected stats for the strategies for the sample (presented earlier) and the results are on the next 3 pages – Results are derived from raw data; unsmoothing the data would impact results further 94
    • Optimization (cont.) – Post Modern Perspective: Example

CISDM Alternative Strategies Indices, 12/31/1991 - 12/31/2009, monthly data

                        Convert Arb   Distressed   CTA       Long/Short
Arithmetic avg return   0.79%         0.91%        0.70%     0.96%
max                     4.71%         5.26%        7.86%     9.40%
min                     -11.49%       -10.59%      -5.43%    -9.42%
vol                     1.44%         1.83%        2.49%     2.20%
VaR (95%)               -1.00%        -1.48%       -3.23%    -2.38%
CVaR (95%)              -3.12%        -3.88%       -4.01%    -3.94%
CVaR (99%)              -7.43%        -8.07%       -4.75%    -6.31%
Skewness                -3.92         -1.94        0.41      -0.25
Kurtosis                33.46         13.38        3.06      5.75

95
    • Optimization (cont.) – Post Modern Perspective: Example

CISDM Alternative Strategy Indices, 12/31/1991 - 12/31/2009, monthly raw data

Portfolio weights:
                                  Convertible Arb   Distressed   CTA     Long Short
Minimum variance portfolio        55.3%             15.5%        26.6%   2.5%
Minimum CVaR portfolio            0.0%              32.5%        47.6%   19.9%

Portfolio statistics:
                        Min Variance   Min CVaR   % change from Min Variance
Arithmetic avg return   0.79%          0.82%      3.92%
max                     4.15%          4.83%      16.38%
min                     -6.55%         -2.62%     -60.05%
volatility              1.18%          1.45%      22.84%
VaR (95%)               -0.80%         -1.41%     76.81%
CVaR (95%)              -2.22%         -1.81%     -18.46%
CVaR (99%)              -5.96%         -2.50%     -58.03%
Skewness                -1.74          0.28       -115.94%
Kurtosis                9.38           -0.20      -102.10%

96
    • Optimization (cont.) – Post Modern Perspective: Example [Charts: histograms of monthly returns of the global minimum variance portfolio and the global minimum CVaR portfolio for CISDM Convertible Arb, Distressed, CTA and Long Short, 12/31/1991-12/31/2009] 97
    • Optimization (cont.) – Post Modern Perspective: Example – Key observations • Differences in goals produce economically different results as abnormality increases and volatility cannot capture the full extent of losses; volatility will be further 'tricked' by the 'smoothness' of returns – The mean/variance method always prefers convertible arb and distressed as they have lower volatility, despite much higher extreme losses – Slight improvement in average arithmetic returns for CVaR optimization with a large reduction in experienced losses; cumulative returns for CVaR optimization are also larger as the portfolio needs to recover from smaller losses – Focus on negative tails leads to a drastically different probability shape of the portfolio as compared to the mean variance methodology » Positive vs negative skew » Slightly negative kurtosis (short, thin tails with most returns concentrated around the mean) vs large positive kurtosis with long, fat tails 98
    • Optimization (cont.) – On Position Size from Games of Chance • Kelly criterion – Provides some guidance on optimal bet size in uncertain games where a player’s edge and payouts can be measured – Optimal bet size as % of capital = 2*p-1 • Where p is the probability of winning and winning payouts are equal to losing payouts – If r is the ratio of winning payout to losing payout => – Optimal bet size as % of capital = p – (1-p) / r – In practice, there may be uncertainty as to the edge and the magnitude of winning and losing payouts 99
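The Kelly formulas above, directly as code (a sketch; as the slide notes, real edges and payouts are uncertain, which is why practitioners often bet only a fraction of the Kelly amount):

```python
def kelly_fraction(p, r):
    """Optimal bet size as a fraction of capital: win probability p,
    ratio r of winning payout to losing payout."""
    return p - (1.0 - p) / r

# Even payouts (r = 1) reduce to the 2*p - 1 form:
assert abs(kelly_fraction(0.55, 1.0) - (2 * 0.55 - 1)) < 1e-12

# A negative fraction means there is no edge and one should not bet:
assert kelly_fraction(0.40, 1.0) < 0
```

For example, a 60% win rate with winners paying twice what losers cost gives a Kelly fraction of 0.6 − 0.4/2 = 0.4 of capital; halving that ("half Kelly") trades some growth for much lower drawdowns when p and r are only estimates.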
    • Optimization (cont.) - Summary • Classical MV optimization makes a number of unrealistic assumptions, making it a poor tool in practice • There's no single solution; each approach may or may not be appropriate for various targets/goals • Multiple methods exist that address MV weaknesses, generally resulting in stronger performance – 2 general categories • Minimize estimation error of inputs without changing key MV assumptions or optimization techniques • Change MV assumptions and apply different optimization methods • Various techniques may not be appropriate for a given size/complexity of the problem • There's a tradeoff between how appropriate the optimal solution is for a stressed environment and for other states – Results of optimization are appropriate only for the environment whose data was used as inputs in optimization – To have optimal portfolios for different states, one must run optimizations for each possible state and have the flexibility to adjust portfolios accordingly – Impossible trinity of investment management • Relatively static portfolio, low losses at stress points and high average returns across time • Can pick only 2 out of 3 – Markets are constantly evolving with a low likelihood of extended periods of stability – At times of stress, benefits of diversification are limited, and the places to invest in during a crash are likely not the same as those for periods outside the crash • Example: to have a relatively static portfolio with few transactions that can also survive a crash based on an investor's loss target, one must accept lower returns for periods outside the crash – The larger the performance gap between 'normal' and 'crash' environments, the greater the number of transactions and the potential gap in returns between a more static and a more dynamically optimized portfolio • Be aware of concentration risk with large position sizes – Position size should be consistent with the estimated investor edge 100
    • Hedging • Motivation – How should one think about the effectiveness of hedges one may put in place and the prices paid for such hedging instruments? • Classical perspective and current practice – The Black-Scholes model and its related forms are the dominant pricing tool – Key features • Large sudden movements in prices are difficult to incorporate • Large sudden moves do not occur and price movements are conditionally normal (Gaussian) • Fat tails are a special case, not a general property • Markets are complete (all risks can be hedged) • All hedging strategies lead to zero residual risk – Not supported by empirical facts 101
    • Hedging (cont.) – Post Modern Perspective • Models with jumps (see more in Cont et al (2004)) – Key features consistent with empirical facts • Large sudden price movements are a general property • Fat tails are a general property • Markets are not complete – some risks cannot be hedged • Price discontinuities or jumps can lead to concentrated, large losses • Some hedging strategies are better than others 102
    • Asset Allocation • Motivation – What is asset allocation and how might it be organized to reflect empirical observations? • The strategy/process by which portfolios are created that are expected to meet an investor's goals • Classical perspective and current practice – Establish return and volatility targets – Gather historical statistics for asset classes – returns, volatilities, correlations – Run MV optimization with historical inputs (or perhaps some revisions to return inputs) and any relevant constraints to minimize the weaknesses of the method discussed previously (e.g., no negative weights and all weights on asset classes add up to one) to produce the optimal investment program – Add some range around the optimal weights (e.g., +/- 10%) to minimize the frequency of rebalancing and to be able to take advantage of market conditions – One may also run classical Monte Carlo simulations based on historical statistics (or adjusted figures) to produce a range of possible outcomes – All limitations of the above steps discussed previously apply 103
    • Asset Allocation (cont.) – Post Modern Perspective • Building blocks were established in prior sections • Key focus should be on measuring losses through CVaR – Coherent risk measure – The negative tail is what drives losses, not the dispersion of results as measured by volatility or downside volatility – Can handle any type of return time series • Return targets should be consistent with loss targets at stress points and throughout the investment horizon • Loss/Return targets should be consistent with the loss aversion bias – The same magnitude of losses is perceived disproportionately more intensely than the same magnitude of gains • Utilize all information embedded in a return time series via the Omega Ratio to judge risk/return tradeoffs • Pick 2 out of 3 parameters – relatively static portfolio, low losses at stress points and high average returns across time • Depending on selected parameters – Incorporate dependencies via time sensitive correlations or copulas – Develop appropriate inputs via approaches discussed in the forecasting section • Examples – If a more dynamic approach is selected, use situation specific data in asset allocation based on Bayesian analysis (e.g., high volatility state vs low volatility state) – Produce volatility forecasts via GARCH type models – Backtest the quality of forecasts from econometric approaches and limit to short term forecasts for best results – Produce return inputs via simulation based on the selected period of historical information – Optimize using appropriate goals and techniques – As markets change, implement appropriate transactions based on revised inputs 104
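A minimal historical CVaR estimator consistent with the loss-measurement focus above (sign convention: raw returns, so CVaR comes out negative for losses; the function name and sample data are my own):

```python
def historical_cvar(returns, level=0.95):
    """Average of the worst (1 - level) fraction of returns."""
    worst_first = sorted(returns)
    n_tail = max(1, int(len(returns) * (1.0 - level)))
    tail = worst_first[:n_tail]
    return sum(tail) / len(tail)

monthly = [0.02, 0.01, -0.03, 0.015, -0.08, 0.005, 0.01, -0.01, 0.02, 0.0]
# With 10 observations at the 95% level, the tail is the single worst month:
cvar = historical_cvar(monthly, level=0.95)   # -0.08
```

Unlike VaR, which only reports the threshold into the tail, this averages everything beyond it, which is what makes CVaR coherent and sensitive to how bad the tail actually is.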
    • Manager Evaluation and Screening • Motivation – How should we evaluate a manager's performance to determine if the manager produced value and is likely to continue producing value in the future? • Classical perspective and current practice – Select a group of broad factors reflective of a manager's strategy – Run a regression, producing as close a fit as possible – The intercept of the regression is alpha, or value produced by the manager above factor exposures – Additionally, run regressions for each rolling window of some size (e.g., 24 months) to try to capture any time variance in the exposures of the manager's portfolio – Return and volatility focused – Risk/return perspective is focused on the Sharpe ratio – The classical perspective is that alpha is not likely to outweigh extra fees for the significant majority of managers – Most of the output from manager evaluation software does not produce much info on the statistical significance of regression results and other statistics • Limitations of the above – All limitations inherent in regressions, especially concerning using numerous factors that overfit historical data and finding spurious regressions – The Sharpe ratio is an incomplete measure of risk/return, especially for significantly non-normal distributions, as is typical for many hedge fund strategies – User specification of factors does not allow one to capture in a rigorous way what may really drive the manager's performance; this is particularly important if the manager is investing in securities inconsistent with the strategy – the above approach may produce alpha whereas the manager may just be taking risk inappropriate for the mandate – Standard factors are not highly non-linear, which is especially important for hedge funds – Long rolling windows may not capture a rapid trading style well and thus produce alpha when none exists 105
    • Manager Evaluation and Screening (cont.) – Post Modern Perspective
      • Incorporate all information in a return time series in the analysis
        – Focus on negative tails through CVaR, as they are going to drive losses
        – Use the Omega ratio to measure the risk/return tradeoff
      • Incorporate non-linear factors in the regression, such as the VIX index
      • Select factors in a rigorous way via the stepwise method or other statistical algorithms to capture what the manager actually does as opposed to what he is supposed to do
        – Factors should be as specific to a manager’s style as possible to understand whether any value is generated (e.g., a manager may outperform the S&P 500 by buying emerging market stocks even if that is not in his mandate)
      • Measure the statistical significance of results
        – Concept of sampling error
        – P-values
      • Keep the rolling window as short as possible (while retaining statistical significance of results) to capture the dynamics of the manager’s investment style, or implement more advanced techniques, such as the Kalman filter
      • Beware of ‘smooth’ returns as they understate risk
      • Can measure performance persistence via the Hurst exponent for a manager’s historical alphas
      • Evaluate all managers relative to key risk factors (see risk measurement section) to produce a consistent picture of risk exposures
      • Implement quantile regression or up/down market regressions to evaluate the consistency of performance in different market conditions 106
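A minimal sketch of the Omega ratio mentioned above, assuming discrete observed returns so that the empirical distribution stands in for the full distribution; the two return streams compared are hypothetical and chosen only to show why a mean-and-volatility view can miss tail differences.

```python
import numpy as np

def omega_ratio(returns, threshold=0.0):
    """Omega ratio: probability-weighted gains above a threshold divided by
    probability-weighted losses below it. Unlike the Sharpe ratio, it
    reflects the entire empirical return distribution, tails included."""
    excess = np.asarray(returns, dtype=float) - threshold
    gains = excess[excess > 0].sum()
    losses = -excess[excess < 0].sum()
    return np.inf if losses == 0 else gains / losses

# Two hypothetical streams with the same mean but very different tails
smooth = [0.012, 0.012, 0.012, -0.002, -0.002]
fat_tail = [0.02, 0.02, 0.02, 0.02, -0.048]
print(omega_ratio(smooth), omega_ratio(fat_tail))  # smooth is far higher
```

Both streams average 0.64% per period, so a mean-based ranking cannot separate them; the Omega ratio penalizes the fat left tail directly.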
    • Manager Evaluation and Screening (cont.) – Post Modern Perspective
      • Empirical evidence on alpha and performance persistence (Herzberg et al (2003), Kat et al (2003), Eling (2008), Boyson (2008), Droms (2006), Dash (2010))
        – For mutual funds and hedge funds
          • Most managers do not produce sufficient alpha to outweigh fees
          • Generally, no long term performance persistence, especially after appropriate adjustments for smooth returns and database biases (e.g., survivorship bias)
          • There is some short term performance persistence, which can be explained by the momentum effect in a fund’s risk exposures
          • There is some persistence in performance for small, young funds (which quickly disappears as assets increase), though the magnitude of outperformance vs larger funds may not be significant
          • There is mean reversion in performance – poor performers in prior years are likely to be strong performers in future years – and there are short term momentum effects similar to those for individual securities
          • What persists for funds is not performance but their style and risk profile – investors are generally paying fees for relatively stable risk exposures to factors they can themselves obtain much more cheaply via index funds/futures and ETFs
          • This is consistent with findings cited previously that asset allocation and, to a lesser extent, sector/country selection decisions account for virtually all of returns (and their variance), along with momentum/long term reversal effects
      • In this case, classical and post modern perspectives are consistent
        – Alpha is rare
        – Alpha is hard to find
          • If one does find it, it may exist because of inappropriate factors or methodologies chosen to evaluate a manager, as discussed previously
        – Alpha above fees is even harder to find
        – Alpha that is persistent across markets is very rare
          • In this case, it will be very difficult to separate luck from skill – in a large enough group of people, it is mathematically possible for someone to outperform an index over a long period of time 107
    • Concluding remarks
      • Returns do not follow normal distributions, even with large data samples
      • Returns have multiple states of varying duration and return patterns
      • Returns generally show positive momentum (positive autocorrelation) in the short to mid term and reversal in the long term (Hurst exponent below 0.5)
      • Returns generally show negative skewness and fat tails
      • Return volatility has a tendency to cluster
      • Returns (positive and negative) are mostly concentrated
      • Economic growth is irrelevant to multi year returns
      • Different volatility levels are associated with different return states
      • Capture more information about the return distribution in the analysis of risk and return via alternative measures such as CVaR and Omega, as compared to the traditional measures of volatility and the Sharpe ratio
      • Most of return and its variance will be explained by decisions on asset classes and, to a lesser extent, sectors/countries; security/manager selection generally contributes very little to return
      • ‘Smooth’ time series may significantly understate the true range of potential returns
      • Dependencies across securities are time varying
      • Correlation is generally a poor tool to model dependencies as it applies only in a limited number of circumstances
      • Copulas are a more flexible tool to measure dependencies
      • Behavior of securities is highly similar in stress; some behavioral convergence is also observed in bull markets
      • Protection of capital at stress points is possible, but such benefits are concentrated among a few instruments/strategies; the same applies to high performance in bull markets. A broad set of exposures may not appropriately protect capital on the downside and will likely dilute returns on the upside.
      • Multiple approaches exist to manage tail risks
        – Do not hold risky assets
        – Add more tail-safe assets to the portfolio – e.g., US Treasury bills
        – Add options that will produce a positive payout at tail events
        – Add strategies/instruments that may help preserve capital at tail events as compared to the rest of the portfolio – e.g., gold, trend following
        – Loss limits 108
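The Hurst exponent referenced in these remarks (below 0.5 suggesting long term reversal, above 0.5 persistence) can be estimated with a simple rescaled-range (R/S) sketch. This uncorrected estimator is known to be biased upward in small samples, so the code below is illustrative rather than production-grade, and the input series are hypothetical.

```python
import numpy as np

def hurst_rs(series, min_chunk=8):
    """Rescaled-range (R/S) estimate of the Hurst exponent: split the series
    into chunks of halving sizes, compute the range of cumulative deviations
    divided by the chunk's standard deviation, and regress log(R/S) on
    log(size). The slope is the estimate of H."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    sizes, rs_vals = [], []
    size = n
    while size >= min_chunk:
        rs = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())
            s = chunk.std()
            if s > 0:
                rs.append((dev.max() - dev.min()) / s)
        if rs:
            sizes.append(size)
            rs_vals.append(np.mean(rs))
        size //= 2
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return slope

# A pure trend should score near 1; i.i.d. noise near 0.5 (biased upward)
print(hurst_rs(np.arange(256.0)))
print(hurst_rs(np.random.default_rng(7).normal(size=2048)))
```

Applied to a manager's historical alphas, as suggested in the manager evaluation section, an estimate meaningfully above 0.5 would hint at persistence; values near 0.5 are consistent with noise.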
    • Concluding remarks (cont.)
      • Markets are complex, dynamic, adaptive systems
        – Best understood by studying the whole rather than the parts
        – Require flexibility
        – Cycles are a regular feature
        – Small changes in initial conditions can produce different results
        – The concept of an average can be dangerous as it hides the full range of outcomes and their probabilities
      • Humans are not wired to do well in such environments due to their tendency to look for simplified patterns and numerous other biases
      • Forecasting is very hard; the performance of ‘experts’ is not likely to be strong
      • Focus efforts on forecasting information with the highest probability of success (e.g., the magnitude of returns is hard to forecast even at short term horizons, while the sign of returns is generally easier; use prior information to improve analysis of environments); however, everything is uncertain and relationships may change due to structural forces
      • The stock market’s forecasting reputation is overrated, even if its record may be better than that of many other forecasters – there will likely be multiple peaks in a bull market and multiple false rallies in a bear market; the final peak and the true bottom will only be obvious in hindsight
      • Optimization is only appropriate for the environment implied by the inputs
      • Impossible trinity of investment management – static portfolio, low losses at stress points, and high average returns across time; one can only pick 2 out of 3
      • Complete hedging of exposures may not always be possible
      • Asset allocation incorporating the above empirical features should improve risk adjusted performance and be more consistent with an investor’s targets/preferences
      • Adjusted for risk taken, most managers do not produce enough value to justify their fees
      • Generally, it is not managers’ performance that persists but rather their risk profile and style, which remain relatively stable. As their risk exposures go in and out of favor, performance may be strong or weak. Such factor exposures can generally be obtained much more cheaply via index funds/futures and ETFs
      • Lastly, classical tools rest on a set of assumptions describing a very special case of markets and investment issues (e.g., normal distributions, rational investors) which is in conflict with empirical reality. New tools exist that provide a more realistic and general description of markets and investment issues. The cost of such new tools, however, is generally greater mathematical and computational complexity. 109
    • References
      • Artzner P., Delbaen F., Eber J. and Heath D. (1999) Coherent measures of risk. Mathematical Finance 9, 203-228.
      • Brinson G., Hood L. and Beebower G. (1986) Determinants of portfolio performance. The Financial Analysts Journal, July/August.
      • Bollerslev T. (1986) Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307-327.
      • Boyson N. (2008) Hedge fund performance persistence: a new approach. Financial Analysts Journal, November/December, vol. 64, no. 6.
      • Cont R. and Tankov P. (2004) Financial modeling with jump processes. CRC Press.
      • Dash S. (2010) Do past mutual fund winners repeat? S&P Persistence Scorecard, January.
      • Droms W. (2006) Hot hands, cold hands: does past performance predict future returns? FPA Journal.
      • Eling M. (2008) Does hedge fund performance persist? Overview and empirical evidence. Working paper.
      • Embrechts P., Lindskog F. and McNeil A. (2002) Correlation and dependence in risk management: properties and pitfalls. In M. Dempster (ed.) Risk Management: Value at Risk and Beyond. Cambridge University Press.
      • Engle R. (2002) Dynamic conditional correlation – a simple class of multivariate GARCH models. Journal of Business and Economic Statistics 20, 339-350.
      • Fabozzi F., Kolm P., Pachamanova D. and Focardi S. (2007) Robust portfolio optimization and management. John Wiley and Sons.
      • Herzberg M. and Mozes H. (2003) The persistence of hedge fund risk: evidence and implications for investors. Journal of Alternative Investments, Fall.
      • Jegadeesh N. and Titman S. (1993) Returns to buying winners and selling losers: implications for stock market efficiency. Journal of Finance 48, no. 1, 65-91.
      • Kat H. and Menexe F. (2003) Persistence in hedge fund performance: the true value of a track record. Journal of Alternative Investments, Spring.
      • Lo A. (2004) The adaptive markets hypothesis: market efficiency from an evolutionary perspective. Journal of Portfolio Management 30, 15-29.
      • Mauboussin M. (2009) Think Twice. Harvard Business Press.
      • Munenzon M. (2010) 20 Years of VIX: Fear, Greed and Implications for Traditional Asset Classes. Working paper. http://ssrn.com/abstract=1583504
      • Shadwick W. and Keating C. (2002) A universal performance measure. Journal of Performance Measurement, Spring, 59-84.
      • Urbani P. (2005) Omega risk measure. 110