Information ratio mgrevaluation_bossert


Published in: Economy & Finance, Business

  1. How "Informative" Is the Information Ratio for Evaluating Mutual Fund Managers?

     THOMAS BOSSERT, ROLAND FUSS, PHILIPP RINDLER, AND CHRISTOPH SCHNEIDER

     THE JOURNAL OF INVESTING, SPRING 2010

     THOMAS BOSSERT is managing director (portfolio management) at Union Investment Institutional GmbH in Frankfurt, Germany. thomas.bossert@union-

     ROLAND FUSS is a professor of finance and holds the Union Investment Chair of Asset Management at European Business School (EBS), International University Schloss Reichartshausen in Oestrich-Winkel, Germany.

     PHILIPP RINDLER is a research assistant at the Union Investment Chair of Asset Management at European Business School (EBS), International University Schloss Reichartshausen in Oestrich-Winkel, Germany. philipp.rindler@

     CHRISTOPH SCHNEIDER is an analyst at Morgan Stanley, Investment Banking Division in Frankfurt, Germany. christoph.schneider@

     Treynor and Black's [1973] Information Ratio (IR) is one of the most commonly used performance measures. It is the ratio of the excess portfolio return over a specified benchmark to the volatility of that excess return. Closely connected is the fundamental law of active portfolio management (Grinold [1989]), which relates a fund manager's skills to the IR. This framework gives insights into how to use the IR to construct active portfolios within predefined risk limits. In order for investors to apply the ratio to a specific portfolio choice problem, however, they need guidelines to identify superior funds.

     Grinold and Kahn [2000] state that top-quartile managers have IRs of at least 0.5, while exceptional managers achieve values above 1.0. These numbers are unqualified and should hold irrespective of asset class, country, or time period. To the best of our knowledge, IR characteristics across different asset classes and countries have not been extensively studied yet. Hence, this article addresses whether the IR is a useful and reliable performance ratio. It focuses particularly on empirically observable quartile ranges for various asset classes and countries that investors can use as guidelines to determine fund quality.

     We use return data from nearly 10,000 mutual funds for the January 1998-December 2008 period. The empirical results show that static breakpoints can be widely misleading and that a focused asset class approach is necessary. Moreover, the quality and reliability of the IR depend on certain estimation choices. First, benchmark choice strongly affects the ratio. Ideally, the benchmark should cover a large proportion of the respective universe. Second, data frequency should be as high as possible, because monthly data do not accurately represent return volatility. Third, non-normally distributed fund returns can substantially affect the use of the IR. Finally, in order to separate lucky managers from skilled ones, long-term track record can be an important measure.

     The remainder of this article is organized as follows. The next section, "The Information Ratio," discusses the IR and its role within active portfolio management. The section after that, "Data Description," presents our dataset and explains our choice of funds and benchmarks. The empirical results are presented next, in "Is the Information Ratio a Reliable Performance Measure?", which begins by testing the IR for stability over time and across different fund categories. We then discuss the robustness of the IR against the selection of different benchmarks and data frequencies. Finally, we examine the persistence of IRs over time in order to separate lucky managers from skilled ones. The final section provides conclusions and suggestions for future research.
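The ratio described in the opening paragraphs can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name, the sample returns, and the square-root-of-time annualization (per-period IR multiplied by the square root of the number of periods per year) are assumptions made for the example.

```python
import math
import statistics


def information_ratio(fund_returns, benchmark_returns, periods_per_year=52):
    """Annualized IR: mean excess return divided by excess-return volatility.

    Assumption: arithmetic means and square-root-of-time scaling, so the
    annualized IR equals the per-period IR times sqrt(periods_per_year).
    """
    if len(fund_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    excess = [f - b for f, b in zip(fund_returns, benchmark_returns)]
    mean_excess = statistics.mean(excess)
    tracking_error = statistics.stdev(excess)  # sample standard deviation
    if tracking_error == 0:
        raise ValueError("excess returns have zero volatility")
    return (mean_excess / tracking_error) * math.sqrt(periods_per_year)


# Hypothetical weekly returns in decimal form, for illustration only.
fund = [0.012, -0.004, 0.007, 0.001, -0.010, 0.006]
bench = [0.010, -0.005, 0.005, 0.002, -0.008, 0.004]
ir = information_ratio(fund, bench)
print(round(ir, 2))
```

Passing `periods_per_year=252` or `periods_per_year=12` would adapt the same sketch to daily or monthly data, which is the comparison the article draws when it discusses data frequency.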
  2. THE INFORMATION RATIO

     Treynor [1965] defines two characteristics of a "good" performance measure. First, it should provide the same value for the same performance, irrespective of market conditions. Second, it needs to incorporate the preferences and risk aversion of investors. Similarly, Hübner [2007] states that there are two factors that determine the quality of a performance measure: stability and precision. A stable measure is robust under different asset pricing models, and does not vary over time in terms of its classification. Precision means that it should be able to provide the "true" ranking of funds based on investor preferences.

     Treynor and Black [1973] define the IR as:

     $$\mathrm{IR} = \frac{r_P - r_B}{\sigma_{ER}} = \frac{ER}{\sigma_{ER}} \tag{1}$$

     where $r_P$ is the portfolio return, $r_B$ is the benchmark return, $ER$ is the excess return, and $\sigma_{ER}$ is the volatility of the excess return.

     The rationale for the IR is closely related to Jacobs and Levy's [1996] investor utility function. They explain that investors in active funds are not risk-averse, but rather regret-averse. Regret aversion means generally accepting the risk of a passive investment in this asset class but, depending on the excess returns, regretting the decision to invest in an active fund.

     Similarly, Grinold and Kahn [2000] find that investors select among different opportunities based on their personal preferences, which, for actively managed funds, "point toward high residual return and low residual risk" (p. 5). Thus, by using the IR, investors can limit the fund universe according to their personal risk preferences.

     Grinold [1989] identifies two factors that lead to high IRs. The first is the manager's ability to correctly predict residual returns in his investment universe. Referred to as the Information Coefficient (IC), it measures the correlation between actual and forecasted alpha. The second, which describes the number of independent investment decisions made per year, is called breadth. The fundamental law of active management illustrates the relationship between IR, IC, and breadth, as follows:

     $$\mathrm{IR} = \mathrm{IC} \cdot \sqrt{\mathrm{breadth}} \tag{2}$$

     For purposes of this article, the crucial point about Equation (2) is that correct forecasting of residual returns should be a key skill of any active portfolio manager. Depending on the number of independent bets a manager takes, different skill levels are required in order to achieve a "good" or "very good" IR (Wander [2003]).

     As we noted earlier, Grinold and Kahn [2000] define IR levels on the basis of cost-adjusted fund performance. A top-quartile portfolio manager would have an IR of 0.5, and an exceptional manager would achieve a 1.0 or above. Again, according to this study, this classification should hold for all asset classes and time horizons, with only slight deviations. Jacobs and Levy [1996] also found an IR of 0.5 or above to be "very good," without restrictions to asset classes.

     Goodwin [1998], on the other hand, analyzed the IR distribution for samples of funds with different investment universes, and found significantly different results across fund categories. We believe this approach is more plausible than the findings of the two other studies. Thus, we expect to find different IR ranges in our empirical analysis when evaluating funds that invest in different asset classes and countries.

     DATA DESCRIPTION

     Fund Data

     Our initial sample includes all actively managed, open-end funds listed for sale in Germany, the U.K., and the United States by Reuters 3000 Xtra as of February 2009. Closed-end funds are excluded, because investors cannot freely enter or exit them. REITs and hedge funds are also excluded, because their particular characteristics demand specific performance measures that are not within the scope of this article (see, for example, Ackermann et al. [1999] or Below and Stansell [2003]). We focus separately on equity, fixed-income, and money-market funds because of their differing risk-return characteristics. Furthermore, we exclude balanced funds, because we aim to analyze and characterize performance measures of distinct asset classes. To categorize the funds, we use the Lipper Global Classification as it is used throughout Reuters 3000 Xtra.

     In the equity class, we choose funds with a focus on the major equity markets: Europe, Germany, the U.K., and the United States. We also distinguish between large- and small-cap funds (although the limited number of small-cap equity funds in Germany resulted in the elimination of this category).

     In the fixed-income class, we choose corporate investment-grade bond funds with a focus on the British pound, the euro, and the U.S. dollar, which are the three major currencies for corporate bond issuance according
  3. to Reuters 3000 Xtra. Finally, we use these same three major currencies to select relevant funds in the money market class. Our final fund sample consisted of 9,632 funds.

     Our time frame ranges from January 1, 1998 through December 31, 2008. Our weekly return data, launch years, and base currencies for the funds come from Thomson Financial DataStream. We correct for erroneous data entries by excluding funds with extreme information ratios of above 20 or below -20. We also exclude funds with launch dates after January 1, 2007. If a fund launched in the second half of a year, we set the launch date to the next year in order to ensure a sufficient number of data points per calendar year for calculating test statistics. For funds quoted in a currency other than the corresponding benchmark currency, we convert the return data using the appropriate exchange rate. Additionally, we retrieve daily and monthly return data for large-cap U.S. equity funds in our given timeframe in order to analyze the influence of data frequency.

     However, because Reuters 3000 Xtra and Thomson Financial DataStream only list funds that are currently available on the market, the data are subject to survivorship bias. For us, this is especially relevant prior to 2007, because only the funds that survived are contained in our dataset. We posit that the estimated performance measures may be biased upward. In the "Other Influences on Performance Measures" section, we analyze the extent of and possible corrections for survivorship bias.

     Calculating the IR requires a market benchmark for comparison. Fund managers normally define their benchmarks in a prospectus. However, in light of the large number of funds and corresponding benchmarks within the same fund category, it was not possible to use each benchmark to calculate the performance measures. Instead, we use a general benchmark for each fund category.

     Initially, this may seem somewhat unfair. But we believe it is more logical to judge each fund within a certain investment universe against the same benchmark, although it does introduce a bias into the analysis. Fund managers that are actually managing their funds against the benchmark tend to exhibit lower tracking errors, and therefore higher IRs, than managers using another benchmark. The latter bear the tracking errors versus their true benchmark plus the tracking errors between the true and chosen benchmarks. Exhibit A2 in the Appendix provides an overview of the benchmarks assigned to the different fund classes.

     Descriptive Statistics

     Exhibit 1 gives the descriptive statistics for each fund category as well as the average excess return over the respective benchmark. All numbers are annualized for better comparability.

     EXHIBIT 1
     Descriptive Statistics of Fund Returns

     Fund Classification       Avg. Ann.   Avg. Ann.   Skewness    Excess      Avg. Ann.
                               Return      Std. Dev.               Kurtosis    Excess Return
     Equity Europe             -0.72%      17.73%      -0.539      2.885       -1.71%
     Equity Germany            0.18%       23.42%      -0.418      3.484       -0.60%
     Equity U.K.               1.97%       15.30%      -0.722      3.167       0.68%
     Equity U.S.               -2.57%      18.23%      -1.092      9.734       -3.22%
     Equity Small Cap Europe   1.51%       19.25%      -0.986      2.816       2.50%
     Equity Small Cap U.K.     4.09%       14.02%      -1.223      3.020       2.27%
     Equity Small Cap U.S.     -2.54%      21.68%      -1.151      9.119       -6.45%
     Corporate Bonds EUR       2.38%       2.88%       -0.666      3.914       -1.20%
     Corporate Bonds GBP       3.65%       4.39%       -0.572      2.662       -1.12%
     Corporate Bonds USD       3.10%       4.22%       -0.551      1.710       -1.58%
     Money Market EUR          2.11%       0.30%       -3.557      20.300      -0.25%
     Money Market GBP          4.97%       0.45%       4.193       26.245      0.72%
     Money Market USD          1.97%       3.12%       1.379       33.221      -0.93%

     Note: Calculations are based on weekly data for the January 1998-December 2008 period and are annualized.

     In terms of risk/return relationships, we see that money-market and corporate bond funds behave as expected. But
  4. the numbers for the equity segment are surprising. The poor performance of equities is due mainly to the impact of the 2008 financial crisis; gains from 2003 to 2007 in the U.S. equity market were completely erased in 2008.

     In terms of performance as measured by alpha, it is clear that, over the 11-year period, managers in almost all asset classes and fund categories were not able to beat the benchmark on average after costs. Note also that the money-market segment exhibits strong skewness and leptokurtosis. We will analyze the effects of non-normally distributed returns on performance measures further in the "Other Influences on Performance Measures" section.

     IS THE INFORMATION RATIO A RELIABLE PERFORMANCE MEASURE?

     The Distribution of the Information Ratio

     To analyze whether the distribution of IRs is stable over time and across different fund categories, we rank the ratios for each year and asset class, and then divide them into four quartiles. We use a Wilcoxon signed-rank test and an optional Student's t-test to test the yearly values against the overall average for statistically significant differences. We present all results in annualized form for better readability and comparability by using arithmetic mean returns according to Goodwin's [1998] method 1.

     Exhibit 2 presents the threshold values for the four quartiles, which are averages over the 11-year horizon of the dataset. Note that the IRs exhibit very different patterns for each fund category, not just in terms of value but also in terms of range. A corporate bond fund with a positive IR can usually be classified as "very good," while an Equity Europe fund would only be average. Additionally, the value range for a "good" Equity Europe fund is far narrower than for a "good" Money Market EUR fund.

     EXHIBIT 2
     Information Ratios of Different Fund Categories

     Fund Classification       IR 1st 25%    IR 2nd 25%        IR 3rd 25%        IR 4th 25%
                               "Very Good"   "Good"            "Below Avg."      "Poor"
     Equity Europe             >0.40         0.40 to 0.04      0.03 to -0.36     <-0.36
     Equity Germany            >0.07         0.07 to -0.11     -0.12 to -0.37    <-0.37
     Equity U.K.               >0.32         0.32 to -0.01     -0.02 to -0.30    <-0.30
     Equity U.S.               >0.28         0.28 to -0.40     -0.41 to -1.01    <-1.01
     Equity Small Cap Europe   >0.80         0.80 to 0.40      0.29 to -0.09     <-0.09
     Equity Small Cap U.K.     >0.59         0.59 to 0.22      0.21 to -0.12     <-0.12
     Equity Small Cap U.S.     >0.08         0.08 to -0.60     -0.61 to -1.18    <-1.18
     Corporate Bonds EUR       >-0.24        -0.24 to -0.76    -0.77 to -1.30    <-1.30
     Corporate Bonds GBP       >0.03         0.03 to -0.46     -0.47 to -0.95    <-0.95
     Corporate Bonds USD       >0.03         0.03 to -0.58     -0.59 to -1.29    <-1.29
     Money Market EUR          >4.30         4.30 to 1.36      1.35 to -0.39     <-0.39
     Money Market GBP          >4.30         4.30 to 0.31      0.30 to -1.50     <-1.50
     Money Market USD          >2.46         2.46 to 0.39      0.38 to -1.29     <-1.29

     Nevertheless, the values and ranges within the asset classes seem similar. Further testing needs to be done to confirm these results. But we find that general statements about the IR, such as those of Grinold and Kahn [2000] (discussed earlier), are not applicable for all asset classes and years because the threshold values vary considerably over time. Exhibit 3 shows detailed information about how the threshold values develop over time for the top quartiles.

     But are these strong IR fluctuations statistically significant? To test for this difference, we calculate the median IR of the top half of all Equity U.S. funds for each of the 11 years, i.e., the threshold value between the first 25% and the second 25% of the funds. We then test this value each year to see if it is statistically significantly different from the average threshold value reported in Exhibit 2.

     The results are outlined in Exhibit 4, with the threshold values in the first data row and the z-statistics in the second row. We again use the Wilcoxon signed-rank test because the IRs are not normally distributed according to the Lilliefors test, and we assume they are dependent on each other (see Hollander and Wolfe [1973]).

     The results in Exhibit 4 clearly show that the threshold values are significantly different from the 11-year average in every year. A look at the z-statistics
  5. reveals that the values are statistically different from their average. This is also highlighted by the spread in threshold values, from -0.39 in 1998 to 0.71 in 2002. Thus, a fund evaluated on the yearly threshold value could be categorized as "below average" while simultaneously being rated as "very good" on the overall average value.

     EXHIBIT 3
     Information Ratio: Threshold Values for 1st Quartile Funds (very good)

     Fund Classification       1998    1999    2000    2001    2002    2003    2004    2005    2006    2007    2008
     Equity Europe             >0.23   >1.30   >0.38   >0.15   >0.28   >-0.12  >0.46   >0.91   >0.81   >-0.09  >0.08
     Equity Germany            >0.02   >-0.16  >0.44   >0.21   >0.35   >0.12   >-0.14  >-0.02  >0.08   >-0.22  >0.11
     Equity U.K.               >-0.26  >0.79   >0.62   >0.24   >0.15   >0.65   >0.44   >0.31   >0.58   >-0.23  >0.18
     Equity U.S.               >-0.39  >0.36   >0.66   >0.51   >0.71   >0.36   >0.18   >0.55   >-0.38  >0.44   >0.08
     Equity Small Cap Europe   >0.04   >2.40   >0.60   >-0.21  >0.50   >1.40   >1.70   >1.60   >1.60   >-0.28  >-0.49
     Equity Small Cap U.K.     >-0.92  >2.50   >0.67   >0.19   >0.25   >1.30   >1.40   >0.47   >1.50   >-0.68  >-0.18
     Equity Small Cap U.S.     >0.65   >1.50   >-0.26  >-0.21  >0.06   >0.44   >-0.74  >-0.05  >-0.49  >0.49   >-0.52
     Corporate Bonds EUR       N/A     N/A     N/A     N/A     >0.08   >-0.30  >-0.95  >-0.32  >0.26   >0.63   >-1.10
     Corporate Bonds GBP       >0.56   >-0.04  >-0.64  >-0.50  >-0.19  >-0.17  >0.07   >-0.15  >-0.08  >0.30   >1.20
     Corporate Bonds USD       >-0.37  >0.26   >0.46   >-0.64  >-0.28  >-0.01  >-0.26  >-0.08  >0.00   >0.49   >0.71
     Money Market EUR          N/A     N/A     N/A     N/A     >4.60   >7.70   >7.40   >7.80   >4.20   >0.51   >0.04
     Money Market GBP          >0.33   >1.10   >1.20   >4.60   >5.90   >5.60   >3.70   >10.00  >7.90   >3.80   >3.20
     Money Market USD          >1.80   >1.50   >1.40   >5.90   >3.00   >5.20   >2.70   >0.98   >2.10   >1.30   >1.20

     EXHIBIT 4
     Test Statistics for the Difference of Threshold Values of Equity U.S. Funds
     Wilcoxon Signed-Rank Test on Differences in Mean

                    1998    1999    2000    2001    2002    2003    2004    2005    2006    2007    2008    Avg.
     Threshold      -0.39*  0.36*   0.66*   0.51*   0.71*   0.36*   0.18*   0.55*   -0.38*  0.44*   0.08*   0.28
     z-statistic    -16.5   -3.0    -15.4   -12.5   -19.8   -9.0    -3.2    -20.7   -34.3   -15.4   -4.0

     Note: *Denotes values significantly different from average at the 5% significance level. All test statistics for the Lilliefors test for normality are significant at the 5% level. The test is a generalization of the Kolmogorov-Smirnov (KS) test, which requires specification of the population mean and variance. The Lilliefors test is capable of testing samples for normality that have incompletely specified distribution characteristics (Lilliefors [1967]).

     To conclude, we believe IRs must be calculated anew each year in order to be reliable. Because the relevant thresholds can only be calculated ex post, it is not possible to use IRs when setting annual targets for fund managers. However, in the context of a multi-year planning process, long-term IRs might be applicable.

     In the next step, we studied IRs across different fund categories. Our focus was again on U.S. funds, as it seems more likely that we will find similar IRs when looking at several asset classes within one country than across different countries. The procedure is exactly the same as in the previous test on Equity U.S. funds; results are in Exhibit 5.

     Similar to the results in Exhibit 4, Exhibit 5 shows that all threshold values are significantly different from their averages. This statement is valid for 1998 and for 2008, so we consider it rather robust. IRs thus not only change over time, but also between different fund categories. And we cannot confirm the general statements about fixed threshold values found in Grinold and Kahn [2000] or Jacobs and Levy [1996].

     The results of this part of our empirical study are similar to the results of Goodwin [1998], with the addition that IRs also change over time. Exhibit 6 uses box-and-whiskers plots to graphically illustrate the different distributions of IRs over time for Equity U.S. funds.
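The point that a fund's rating depends on the year's own thresholds can be illustrated with a short Python sketch. The IR samples, the labels list, and the helper names are hypothetical; the breakpoints come from the standard library's `statistics.quantiles`, which may not match the interpolation rule the authors applied.

```python
import statistics

LABELS = ["Poor", "Below Avg.", "Good", "Very Good"]


def quartile_breakpoints(irs):
    """Return the three cut points that split a year's IRs into quartiles."""
    return statistics.quantiles(irs, n=4)  # [Q1, median, Q3]


def classify(ir, breakpoints):
    """Map one IR to a quartile label using that year's breakpoints."""
    rank = sum(ir > cut for cut in breakpoints)  # number of cut points exceeded
    return LABELS[rank]


# Hypothetical IRs of the same fund universe in two different years.
irs_by_year = {
    "year_a": [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.1, 0.2],
    "year_b": [-0.2, 0.1, 0.3, 0.5, 0.6, 0.8, 0.9, 1.1],
}

fund_ir = 0.25  # the same IR is judged against each year's own breakpoints
ratings = {
    year: classify(fund_ir, quartile_breakpoints(irs))
    for year, irs in irs_by_year.items()
}
print(ratings)
```

With these made-up samples the same IR of 0.25 falls in the top quartile in one year and in the third quartile in the other, which is the kind of reclassification the yearly thresholds in Exhibit 3 imply.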
  6. EXHIBIT 5
     Test Statistics for the Difference of Threshold Values of U.S. Funds

     Year   Row        Equity    Small Cap Equity   Fixed-Income   Money Market   Average
     1998   Wilcoxon   -0.39*    0.65*              -0.37*         1.80*          0.42
     1998   z-score    -17.53    -6.39              -5.51          -3.82          -
     2008   Wilcoxon   0.08*     -0.52*             0.71*          1.20*          0.37
     2008   z-score    -10.10    -26.66             -5.17          -4.04          -

     Note: *Denotes values significantly different from average at the 5% significance level. Similarly to the previous test, the IRs are not normally distributed according to the Lilliefors test, and all test statistics are significant at the 5% level. The second row gives the z-scores of the Wilcoxon signed-rank test.

     EXHIBIT 6
     Box Plots of Equity U.S. Fund Information Ratios
     [Figure: box plots by year; panel title "Equity US Funds".]

     The Art of Selecting a Benchmark

     In fund management companies, benchmark selection is usually the result of intense negotiations between the fund manager and the investors, because it has a major impact on the fund's alpha. Depending on the style and focus of a fund, one benchmark might be much more favorable to a fund manager than another (Goodwin [1998], Grinold and Kahn [2000]). Therefore, it is important to analyze the sensitivity of the IR toward the selected benchmark.

     Lehmann and Modest [1987] show that benchmark selection strongly influences the resulting alphas as well as their volatility. Thus far, we have used the S&P 500 throughout this article in connection with Equity U.S. funds. But we will use two additional indices to compare the resulting IRs, the price-weighted Dow Jones Industrial Average (DJIA) and the market-weighted Russell 1000 Index. Exhibit 7 presents the threshold IRs for different benchmarks using the same procedure as in the previous section.

     Note that the IRs based on the S&P 500 and the Russell 1000 are closely related, while the IRs based on the DJIA behave differently and are far more volatile. It appears that the DJIA does not cover the Equity U.S. investment universe very well. This may be because this index is based on only 30 stocks.

     We again use the Wilcoxon signed-rank test to test for significance of the difference in threshold values. Exhibit 8 gives the results from the Russell 1000 and the
  7. DJIA versus those from the S&P 500. All are significantly different from those based on the S&P 500 (at a 5% significance level). These results are in line with Goodwin [1998], who also found that benchmark selection strongly influences the resulting IRs.

     EXHIBIT 7
     The Effect of Benchmark Selection on the Information Ratio
     [Figure: threshold IRs by year; series: Dow Jones Industrial Average, S&P 500, Russell 1000.]

     EXHIBIT 8
     z-Statistics for Significant Difference of the Information Ratios

     z-Values for   1998    1999    2000    2001    2002    2003    2004    2005    2006    2007    2008
     Dow Jones      -18.1*  -9.6*   -9.3*   -22.0*  -26.6*  -26.4*  -30.9*  -32.9*  -7.5*   -8.6*   -25.3*
     Russell 1000   -9.4*   -4.2*   -3.5*   -14.2*  -8.5*   -17.1*  -25.0*  -12.7*  -31.9*  -21.1*  -33.5*

     Note: *Denotes values significantly different from average at the 5% significance level. Similarly to the previous test, the IRs are not normally distributed according to the Lilliefors test, and all test statistics are significant at the 5% level.

     The scatter plots in Exhibit 9 illustrate the rankings based on the three different IRs. We can see that the results are confirmed: There are noticeable differences between IRs based on the DJIA and those based on the S&P 500, while the changes in ranking between the Russell 1000 and the S&P 500 are quite small. Selecting an appropriate benchmark is therefore an important step during performance analyses in general.

     However, we conclude that benchmark indices that capture a larger part of the investment universe of a specific fund category are superior to those based solely on a few securities and industry sectors. Finally, the best way to judge the real risk-adjusted value-added of a fund manager is to consider his actual benchmark, as well as the true benchmarks of the other managers within his peer group. Or, as an alternative, we could use the peer group's average as a general benchmark, which might lead to more stability in the annual IR thresholds.

     In the same sense that benchmark selection is so critical, investment restrictions are also quite important. The Transfer Coefficient (TC) measures the correlation of a manager's forecasts with the actual portfolio. A manager without constraints will end up with a TC of 1, while the constrained manager can only achieve a lower result. Furthermore, a typical long-only fund may achieve a TC
  8. between 0.2 and 0.4, which would significantly impact the IR because it implies the manager is not able to fully transfer his skills into actual investment decisions.

     Assuming a (constrained) fund with a TC of 0.5, a manager must double his skill (IC), or quadruple his breadth, in order to achieve the same IR as an unconstrained manager for an equal fund (Wander [2003]).

     EXHIBIT 9
     Ranking Differences Caused by Different Benchmarks
     [Figure: two scatter plots of fund rankings; horizontal axes "Rank by Information Ratio (S&P 500)".]

     Does Data Frequency Matter?

     Ané and Labidi [2004] have shown that the return interval has a significant impact on the return distribution. Monthly and quarterly returns come close to a normal distribution, but weekly and especially daily data usually show strong leptokurtosis. Furthermore, the annualized standard deviation varies with frequency.

     Other research has also found that data frequency influences correlations (Handa et al. [1989]). But we need to determine whether data frequency also impacts the IR and, in particular, the threshold values.

     If we do find significant influence, we aim to describe these differences and to provide guidance for selecting an appropriate return interval. Thus, we calculate annualized IRs for Equity U.S. funds using daily, weekly, and monthly fund returns. Using the ranking methodology explained earlier, we create fund rankings based on three different IRs.

     Results for the year 1999 are given in Exhibit 10. We see that the rankings of IRs based on daily and weekly data do not differ significantly, but switching to monthly data changes the ranking dramatically. We conclude that monthly data are inappropriate for calculating reliable performance measures. Monthly data also allow for only 12 data points per year, which is insufficient to estimate the standard deviation.

     Other Influences on Performance Measures

     Many studies have documented that returns are generally non-normal. However, many popular performance measures are still based on mean-variance analysis. Therefore, as per Benson et al. [2008], we find that non-normal returns lead to biased results.

     Additionally, Kraus and Litzenberger [1976] found that positively skewed returns are actually favorable for investors. If we refer back to Exhibit 1, which presents descriptive statistics of fund returns for each category, it is striking that money-market funds in all currencies produce strongly skewed and leptokurtotic returns. Comparing the threshold values in Exhibit 2, the values for "top" money-market funds are uncharacteristically high. Thus, we conclude that common performance measures are not applicable to these funds, probably because of their special return distribution characteristics. Other performance measures, such as Keating and Shadwick's [2002] Omega Measure, could capture the deviation from normality.

     Another important factor is the survivorship bias that can result because the most common data providers only list currently available funds. This may lead to an upward bias in the performance measure estimates. Brown et al. [1992] found that survivorship bias can be so strong that
  9. it can lead to the erroneous conclusion that mutual fund performance is predictable. However, when we correct the sample for survivorship bias, this finding disappears.

     EXHIBIT 10
     Comparison of Rankings Based on Different Data Frequencies
     [Figure: two scatter plots of fund rankings; axes "Rank by Information Ratio (Frequency: ...)", one of them labeled "Weekly".]

     In terms of quantifying the survivorship bias for Equity U.S. funds, Grinblatt and Titman [1989] found a 0.1%-0.3% bias per year. Brown et al. [1995] estimated the bias at between 0.2% and 0.8% per year, while Elton et al. [1996] posited an average bias of 0.71%-0.77% annually. Although it is very likely that survivorship bias exists in our dataset, we were unable to quantify its proportions. However, because our study is set up similarly to the previously cited studies, we believe our results would be subject to a similar order-of-magnitude of bias.

     Costs and asynchronous pricing are two other factors that influence our results. In order to estimate the real risk-adjusted value-added of a fund manager, we need to compare his results net of fees with those of other fund managers. However, we also need another dataset to compute those figures. And fund investors may be more interested in the final result than in whether the results were due to skill or the cost structure of the product.

     Asynchronous pricing introduces an upward bias into the tracking error. For example, a fund that is tracking its benchmark perfectly, but whose NAV is not calculated with the same security prices or foreign exchange rates as the benchmark, will inevitably exhibit a tracking error that is different from zero. Again, quantifying this bias would require a dataset that draws heavily on the internal valuation information of a fund management company, which is difficult to come by.

     Eliminating survivorship bias may lead to lower real average IRs, but costs and asynchronous pricing have a negative impact. Including these two factors will lead to somewhat higher average IRs.

     Performance Persistence: Outperformance by Luck or by Skill?

     Finally, we question whether a single ratio based on one year of data is sufficient to evaluate a manager's performance. It is important to determine whether achieving a high IR in any given year was attributable to luck or actual skill.

     Here, the manager's track record can be a useful tool. The probability that good performance is attributable to skill increases when managers can position their funds among the top 25% for two or three years in a row (Bollen and Busse [2005]). However, care should be taken when trying to predict future fund returns based on past returns. Horst and Verbeek [2000] show that some studies claiming to find performance persistence are likely reporting spurious and biased results. Kahn and Rudd [1995] and Carhart [1997] also analyzed persistence of equity mutual funds, and did not find a significant relationship between past and future performance.

     We find similar results with our dataset. We categorize Equity U.S. funds launched in 1998 or before into quartiles based on their 1998 IR. For each quartile, we calculate the average IRs for each year. The results are shown in Exhibit 11.
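The quartile-tracking procedure just described (sort funds by their base-year IR, then follow each group's average IR through later years) might be sketched as follows. The fund names, IR histories, and the `persistence_table` helper are hypothetical, and the sketch assumes the number of funds divides evenly into the groups.

```python
import statistics


def persistence_table(ir_history, base_year, n_groups=4):
    """Average IR per year for groups formed by ranking funds on base_year IR.

    ir_history maps fund name -> {year: IR}. Group 1 holds the funds with
    the highest base-year IR, mirroring a "1st quartile" convention.
    """
    ranked = sorted(ir_history, key=lambda f: ir_history[f][base_year], reverse=True)
    size = len(ranked) // n_groups  # assumes the fund count divides evenly
    years = sorted(next(iter(ir_history.values())))
    table = {}
    for g in range(n_groups):
        members = ranked[g * size:(g + 1) * size]
        table[g + 1] = {
            year: statistics.mean(ir_history[f][year] for f in members)
            for year in years
        }
    return table


# Hypothetical IR histories for eight funds over three years.
history = {
    "A": {1998: 1.2, 1999: 0.1, 2000: -0.4},
    "B": {1998: 0.8, 1999: 0.3, 2000: -0.2},
    "C": {1998: 0.5, 1999: 0.2, 2000: 0.1},
    "D": {1998: 0.3, 1999: 0.4, 2000: 0.3},
    "E": {1998: 0.0, 1999: 0.1, 2000: 0.2},
    "F": {1998: -0.2, 1999: 0.3, 2000: 0.4},
    "G": {1998: -0.6, 1999: 0.2, 2000: 0.5},
    "H": {1998: -1.0, 1999: 0.4, 2000: 0.7},
}
table = persistence_table(history, base_year=1998)
for group, averages in table.items():
    print(group, averages)
```

The made-up data are deliberately mean-reverting, so the group that leads in the base year trails two years later; real data would of course have to show this on their own.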
  10. EXHIBIT 11
     Performance Persistence of Equity U.S. Funds
     [Figure: average IR by year, 1998 to 2008; series: 1st Quartile, 2nd Quartile, 3rd Quartile, 4th Quartile.]

     Note that the top-quartile funds of 1998 actually have the lowest average IR after two years. The chart suggests a mean-reverting process and shows that, on average, good performance does not persist. Based on this fact, we conclude that lucky managers without skill are not likely to remain among the best funds for multiple years in a row.

     We therefore propose track record as a second dimension to evaluate manager performance. First, we track the performance of funds from selected categories that launched in 1998 or earlier and survived until 2008 over the entire 11-year period. Then we calculate the number of years that each fund ranked in the top 25% within this period. Exhibit 12 presents our summarized results, which are as expected.

     Note that, during the 11-year period, 95.5% of all Equity U.S. funds and 93.4% of all Equity Small Cap U.S. funds were in the top 25% at least once. Hence, if a fund survives for 11 years, it is likely to be in the top quartile in some years just by luck. However, it is important to note that these results could be caused at least partly by changes in fund management.

     Taking the results from Exhibit 12 one step further, we calculate how many funds remain in the top quartile for two or three years in a row within a three-year period based on their IRs. According to Exhibit 13, such funds would be considered extraordinary, since, on average, only 2.76% of all funds have managed this achievement. We calculate the percentage of all funds that remain in the top quartile for two or three years in a row for rolling three-year periods, and the results are fairly stable and consistent.

     We interpret Exhibit 13 as follows. When looking at the first row, for the 2008-2006 period, 0.93% of all Equity U.S. funds were in the top 25% of the funds in all three years. For the 2008-2007 period, 2.33% of all Equity U.S. funds were in the top 25%, and for 2008, the number was 21.73%. These three values add to 25% with only minor rounding differences.

     We perform the same calculation using the top 50% of funds. However, we conclude that the top 50% is achieved too easily, and is therefore not an appropriate measure. Based on these results, performance persistence (the track record of a manager) is another important factor in separating luck from skill in performance measurement. Investors should thus seek a fund manager with a consistent series of performance measures, rather than what could be unrelated episodes of good performance over longer timeframes.

     Agency Problems

     The agency problem can be illustrated as follows. Consider a portfolio manager who makes just one active investment decision per year, and whose correlation between forecasted and actual returns is 0.1. According to the fundamental law of active management, this manager
  11. 11. EXHIBIT 12 Number of Top 25% Rankings Over Lifetime

Fund Classification      0       1       2       3       4       5       6 or more
Equity U.S.              4.5%    12.8%   21.4%   24.9%   17.7%   12.4%   6.3%
Equity Small Cap U.S.    6.6%    11.0%   21.7%   24.4%   18.3%   9.3%    8.7%

EXHIBIT 13 Performance Persistence of Equity U.S. Funds Over Time

                Top 25% ... Years in a Row      Top 50% ... Years in a Row
Period          1 Year    2 Years   3 Years     1 Year    2 Years   3 Years
2008 to 2006    21.73%    2.33%     0.93%       25.73%    11.58%    12.67%
2007 to 2005    20.41%    2.88%     1.72%       28.22%    7.81%     13.94%
2006 to 2004    18.77%    3.04%     3.18%       26.48%    5.51%     17.99%
2005 to 2003    16.76%    4.07%     4.15%       20.87%    9.32%     19.81%
2004 to 2002    14.80%    7.57%     2.65%       18.20%    14.23%    17.54%
2003 to 2001    19.76%    2.33%     2.87%       25.49%    6.51%     17.97%
2002 to 2000    12.10%    5.15%     1.11%       13.22%    6.87%     29.87%
2001 to 1999    11.11%    13.17%    0.72%       10.66%    29.66%    9.68%
2000 to 1998    23.35%    0.83%     0.83%       34.09%    5.79%     10.12%
Mean            17.64%    4.60%     2.76%       22.55%    10.81%    16.62%

will achieve an IR of 0.1. The empirical part of this article shows that an IR of 0.1 for an Equity U.S. fund would in most years be considered "good," and in some years even "very good," despite the fact that the manager may have done very little. We believe the IR can potentially incentivize strategies that may be unfavorable to investors. It seems that performance measures that use the tracking error as a risk measure need a second dimension that captures the active weights of the fund, such as the Active Share measure proposed by Cremers and Petajisto [2009]. This measure is easy to calculate and can quantify the active holdings of a mutual fund in relation to the corresponding benchmark.

CONCLUSION

Practical Implications

The aim of this article is to evaluate whether the IR is a useful and reliable measure of the performance of mutual fund managers. Based on empirical evidence, we find that the IR is in fact reliable and useful, but has certain limitations. Overall, our analysis reveals that two dimensions are important to adequately judge manager performance in a given year: 1) the performance in that year, and 2) the track record of the fund over the previous three years. The former can be used to establish a ranking of funds that are then adjusted either upward or downward by the latter.

In order to transform the IR into a grading system, we introduce a categorization of quartiles that define thresholds of fund qualities. IRs vary over time and also across different fund categories, so it is necessary to calculate threshold values anew for every calendar year. This makes the IR a difficult choice when setting targets for portfolio managers, because they will not know how well they must perform until the end of the year.

We found that four factors influence the quality of the IR: 1) benchmark selection, 2) data frequency,
  12. 12. EXHIBIT 14 Framework for Performance Evaluation: Year 2008
[Bar chart: IR ranges on an axis from -1.0 to 1.5 for each fund category (legible labels: Equity Europe, Equity Germany, Equity U.K., Corporate Bonds GBP, Corporate Bonds USD), divided into the bands "below average" and "poor," "good," and "very good."]

3) non-normality of fund returns, and 4) any survivorship bias inherent in the sample used to estimate the threshold values. Regarding the benchmark, we recommend selecting an index that captures a large part of the respective market. The data frequency should be high (daily or weekly). Returns should also be tested for normality, as this influences the quality of the performance measures significantly. Finally, quantifying the survivorship bias within the IR is difficult and still unclear. Thus, it is best left for future research. Note that the proposed framework is only valid for funds with symmetric return profiles.

Exhibit 14 is an example of a performance evaluation framework based on the IR that is calculated using the dataset of our empirical study. It is valid for funds of the selected categories in 2008 and can help estimate performance along the first dimension, the performance of the fund within a particular year. We make no differentiation between funds belonging to the third or fourth quartiles ("below average" or "poor" funds), because their IRs are mostly negative and therefore unreliable.

Further Research

While our results answer many of the research questions, they also open up new issues. First, the returns are not corrected for fees, so the performance is somewhat biased. Performance should be (and in actual practice is) measured using returns net of fees. In fact, a significant part of the total fees cannot be influenced by the portfolio managers, e.g., fund audit or custody fees. Second, the sample is dominated by U.S. funds simply because of the data providers we used. Third, many funds are subject to style drifts, which generally make returns harder to compare (Chan et al. [2002]). Although we selected very broad fund categories, it would be interesting to test for biases caused by style drift. Fourth, the sample is subject to survivorship bias of up to 0.8% per year. This distorts the performance measures calculated on these returns (Brown et al. [1995]). Fifth, asynchronous pricing might result in tracking error estimates with an upward bias, and therefore to IRs that are lower than the real IRs.

Beyond the limitations of our dataset, the analysis suggests ideas for additional research. For example, we would recommend comparing the results based on a generic benchmark with results based on fund-specific benchmarks as determined by the portfolio manager. Alternatively, use of the peer group average as a benchmark might lead to more stability in the wildly fluctuating IRs.

Another suggestion is to analyze the IRs of funds with more specific style definitions, such as "U.S. value stocks" or "European bank stocks." However, the number of these funds is rather small, which may render the results insignificant.

Finally, the effect of the Transfer Coefficient (TC) on a manager's active performance should be analyzed in more detail. Fund managers face certain investment restrictions that prevent the allocation of funds to the best possible portfolio. These restrictions will negatively affect the IR, although they are not influenced by the manager. According to Wander [2003], mutual funds can face TCs of 0.5 or even lower, and therefore managers would have to double performance to obtain results comparable to unconstrained portfolio managers. Future research could develop and empirically analyze ways to modify performance measures so that the impact of investment restrictions is neutralized across funds.
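The arithmetic behind the agency example and the Transfer Coefficient remark can be made concrete with a short sketch. It assumes the standard statement of the fundamental law, IR = TC x IC x sqrt(breadth), which the article refers to but does not spell out. The function name and the breadth of 12 decisions per year are hypothetical choices for illustration.

```python
import math


def expected_ir(ic, breadth, tc=1.0):
    """Expected IR under the fundamental law (assumed form): TC * IC * sqrt(breadth).

    ic      -- correlation between forecasted and actual returns
    breadth -- number of independent active decisions per year
    tc      -- transfer coefficient; 1.0 means an unconstrained portfolio
    """
    return tc * ic * math.sqrt(breadth)


# Agency example from the text: one active decision per year, IC of 0.1.
ir_one_bet = expected_ir(ic=0.1, breadth=1)

# Transfer coefficient example: same skill, hypothetical breadth of 12, TC of 0.5.
ir_unconstrained = expected_ir(ic=0.1, breadth=12)
ir_constrained = expected_ir(ic=0.1, breadth=12, tc=0.5)

# Skill a constrained manager would need to match the unconstrained IR.
ic_needed = 0.1 / 0.5

print(f"one decision per year: IR = {ir_one_bet:.2f}")
print(f"unconstrained: IR = {ir_unconstrained:.3f}")
print(f"constrained (TC 0.5): IR = {ir_constrained:.3f}")
print(f"IC needed at TC 0.5: {ic_needed:.2f}")
```

Under this form, halving the TC halves the expected IR, so a manager with a TC of 0.5 needs twice the forecasting skill to match an unconstrained peer. This is the sense in which constrained managers "would have to double performance."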
  13. 13. APPENDIX

EXHIBIT A1 Sample Size of the Fund Dataset Grouped by Fund Classification

Number of Funds in the Dataset by Year
Fund Classification        1998    2000    2002    2004    2005    2006    2007/08
Equity Europe              127     214     363     553     689     813     895
Equity Germany             54      57      65      70      73      80      84
Equity U.K.                189     267     370     514     570     658     681
Equity U.S.                970     1,341   2,117   2,832   3,203   3,648   3,953
Equity Small Cap Europe    31      64      98      132     152     184     202
Equity Small Cap U.K.      51      67      83      109     111     127     132
Equity Small Cap U.S.      529     775     1,237   1,653   1,842   2,057   2,184
Corporate Bonds EUR        0       0       49      129     151     171     185
Corporate Bonds GBP        50      86      124     167     187     211     222
Corporate Bonds USD        88      108     158     203     211     231     237
Money Market EUR           0       0       164     223     243     283     300
Money Market GBP           36      53      79      94      99      112     118
Money Market USD           202     230     320     396     410     433     439

Source: Aggregation based on Reuters 3000 Xtra and Thomson Financial DataStream.

EXHIBIT A2 Overview of Benchmark Indices

Fund Classification        Benchmark Name                     DataStream Ticker
Equity Europe              MSCI Europe                        MSEROP
Equity Germany             DAX                                DAXINDX
Equity U.K.                FTSE 100                           FTSE100
Equity U.S.                S&P 500                            S&PCOMP
Equity Small Cap Europe    MSCI Europe                        MSEROP
Equity Small Cap U.K.      FTSE All Share                     FTSEALLSH
Equity Small Cap U.S.      S&P 600 Small Cap                  S&P600I
Corporate Bonds EUR        iBoxx Liquid EUR Corporates        IBELCAL
Corporate Bonds GBP        iBoxx Liquid GBP Corporates        IB£CSAL
Corporate Bonds USD        Merrill Lynch Corporate Master     MLCORPM
Money Market EUR           EUR Interbank 3M Offered Rate      BBEUR3M
Money Market GBP           GBP Interbank 3M Offered Rate      BBGBP3M
Money Market USD           USD Interbank 3M Offered Rate      BBUSD3M

Source: Thomson Financial DataStream.
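To illustrate how a benchmark from Exhibit A2 enters the calculation, the following sketch computes an annualized IR as the mean active return divided by the tracking error. It is an illustration under assumptions rather than the authors' exact estimator: the function name, the use of the sample standard deviation, the square-root-of-time annualization, and the return figures are all hypothetical.

```python
import math
from statistics import mean, stdev


def realized_ir(fund_returns, benchmark_returns, periods_per_year=52):
    """Annualized IR: mean active return over tracking error, scaled by sqrt(periods).

    Both inputs are per-period simple returns over the same dates.
    The scaling assumes active returns are independent across periods.
    """
    if len(fund_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    active = [f - b for f, b in zip(fund_returns, benchmark_returns)]
    tracking_error = stdev(active)  # sample standard deviation of active returns
    if tracking_error == 0:
        raise ValueError("tracking error is zero; the IR is undefined")
    return mean(active) / tracking_error * math.sqrt(periods_per_year)


# Hypothetical weekly returns for a fund and its benchmark (e.g., the S&P 500
# for an Equity U.S. fund, following Exhibit A2).
fund = [0.012, -0.004, 0.007, 0.003, -0.010, 0.009]
benchmark = [0.010, -0.005, 0.008, 0.001, -0.009, 0.006]
print(round(realized_ir(fund, benchmark), 2))
```

Because the tracking error sits in the denominator, any measurement noise that inflates it (such as asynchronous pricing between NAV and benchmark) pulls the estimated IR toward zero.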
  14. 14. ENDNOTES

¹See Exhibit A1 in the Appendix for a complete overview of the fund types analyzed here.
²The reported results were not sensitive to the use of methods 2 to 4 and are omitted for brevity.
³Using the Information Ratio as the ranking criterion.
⁴Due to limited data availability, it was not possible to correct the sample for changes in fund management.
⁵The results are available from the authors upon request.

REFERENCES

Ackermann, C., R. McEnally, and D. Ravenscraft. "The Performance of Hedge Funds: Risk, Return and Incentives." Journal of Finance, Vol. 54, No. 3 (1999), pp. 833-874.

Ané, T., and C. Labidi. "Return Interval, Dependence Structure, and Multivariate Normality." Journal of Economics and Finance, Vol. 28, No. 3 (2004), pp. 285-299.

Below, S.D., and S.R. Stansell. "Do the Individual Moments of REIT Return Distributions Affect Institutional Ownership Patterns?" Journal of Asset Management, Vol. 4, No. 2 (2003), pp. 77-95.

Benson, K., P. Gray, E. Kalotay, and J. Qiu. "Portfolio Construction and Performance Measurement When Returns are Non-Normal." Australian Journal of Management, Vol. 32, No. 3 (2008), pp. 445-461.

Bollen, N.P.B., and J.A. Busse. "Short-Term Persistence in Mutual Fund Performance." Review of Financial Studies, Vol. 18, No. 2 (2005), pp. 569-597.

Brown, S.J., and W.N. Goetzmann. "Performance Persistence." Journal of Finance, Vol. 50, No. 2 (1995), pp. 679-698.

Brown, S.J., W.N. Goetzmann, R.G. Ibbotson, and S.A. Ross. "Survivorship Bias in Performance Studies." Review of Financial Studies, Vol. 5, No. 4 (1992), pp. 553-580.

Brown, S.J., W.N. Goetzmann, and S.A. Ross. "Survival." Journal of Finance, Vol. 50, No. 3 (1995), pp. 853-873.

Carhart, M.M. "On Persistence in Mutual Fund Performance." Journal of Finance, Vol. 52, No. 1 (1997), pp. 57-82.

Chan, L.K.C., H.-L. Chen, and J. Lakonishok. "On Mutual Fund Investment Styles." Review of Financial Studies, Vol. 15, No. 5 (2002), pp. 1407-1437.

Cremers, M., and A. Petajisto. "How Active is Your Fund Manager? A New Measure That Predicts Performance." Working Paper, Yale School of Management, New Haven, 2009.

Elton, E.J., M.J. Gruber, and C.R. Blake. "Survivorship Bias and Mutual Fund Performance." Review of Financial Studies, Vol. 9, No. 4 (1996), pp. 1097-1120.

Goodwin, T.H. "The Information Ratio." Financial Analysts Journal, Vol. 54, No. 4 (1998), pp. 34-43.

Grinblatt, M., and S. Titman. "Mutual Fund Performance: An Analysis of Quarterly Portfolio Holdings." Journal of Business, Vol. 62, No. 3 (1989), pp. 393-416.

Grinold, R.C. "The Fundamental Law of Active Management." Journal of Portfolio Management, Vol. 15, No. 3 (1989), pp. 30-37.

Grinold, R.C., and R.N. Kahn. Active Portfolio Management: A Quantitative Approach for Providing Superior Returns and Controlling Risk, 2nd ed. New York: McGraw-Hill, 2000.

Handa, P., S.P. Kothari, and C. Wasley. "The Relation Between the Return Interval and Betas: Implications for the Size Effect." Journal of Financial Economics, Vol. 23, No. 1 (1989), pp. 79-100.

Hollander, M., and D.A. Wolfe. Nonparametric Statistical Methods. Hoboken, NJ: John Wiley & Sons, Inc., 1973.

Horst, J.T., and M. Verbeek. "Estimating Short-Run Persistence in Mutual Fund Performance." Review of Economics and Statistics, Vol. 82, No. 4 (2000), pp. 646-655.

Hübner, G. "How Do Performance Measures Perform?" Journal of Portfolio Management, Vol. 33, No. 4 (2007), pp. 64-74.

Jacobs, B.I., and K.N. Levy. "Residual Risk: How Much is Too Much?" Journal of Portfolio Management, Vol. 21, No. 3 (1996), pp. 10-16.

Kahn, R.N., and A. Rudd. "Does Historical Performance Predict Future Performance?" Financial Analysts Journal, Vol. 51, No. 6 (1995), pp. 43-52.

Keating, C., and W.F. Shadwick. "Omega: A Universal Performance Measure." Journal of Performance Measurement, Vol. 6, No. 3 (2002), pp. 59-84.

Kraus, A., and R.H. Litzenberger. "Skewness Preference and the Valuation of Risk Assets." Journal of Finance, Vol. 31, No. 4 (1976), pp. 1085-1100.
  15. 15. Lehmann, B.N., and D.M. Modest. "Mutual Fund Performance Evaluation: A Comparison of Benchmarks and Benchmark Comparisons." Journal of Finance, Vol. 42, No. 2 (1987), pp. 233-265.

Lilliefors, H.W. "On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown." Journal of the American Statistical Association, Vol. 62, No. 318 (1967), pp. 399-402.

Treynor, J.L. "How to Rate Management of Investment Funds." Harvard Business Review, Vol. 43, No. 1 (1965), pp. 63-75.

Treynor, J.L. "Toward a Theory of Market Value of Risky Assets." Working Paper, 1961. Subsequently published in R.A. Korajczyk, Asset Pricing and Portfolio Performance: Models, Strategy and Performance Metrics. London: Risk Books, 1999.

Treynor, J.L., and F. Black. "How to Use Security Analysis to Improve Portfolio Selection." Journal of Business, Vol. 46, No. 1 (1973), pp. 66-86.

Wander, B.H. "What it Takes to Beat a Benchmark." Journal of Investing, Vol. 12, No. 3 (2003), pp. 37-42.

To order reprints of this article, please contact Dewey Palmieri or 212-224-3675.
  16. 16. © Euromoney Institutional Investor PLC. This material must be used for the customer's internal business use only and a maximum of ten (10) hard copy print-outs may be made. No further copying or transmission of this material is allowed without the express permission of Euromoney Institutional Investor PLC. Source: Journal of Investing and