Artur-eea-presentation

Understanding rankings of financial analysts
Artur Aiguzhinov (1, 2), Ana Paula Serra (1), Carlos Soares (2)
(1) CEFUP & Department of Economics, University of Porto, Portugal
(2) LIAAD-INESC Porto LA & Department of Economics, University of Porto, Portugal
February 25th, 2011
Agent-based computational economics: Computational Finance
Eastern Economics Association Conference, New York
1 of 24

Motivation (1): the value of the recommendations
Efficient Market Hypothesis (Fama, 1970)
Information gathering is costly, which creates opportunities for abnormal returns (Grossman and Stiglitz, 1980; Fama, 1991)
On average, recommendations bring value to investors, and financial analysts' forecast accuracy is valuable (Womack, 1996; Barber et al., 2001)
2 of 24

Motivation (2): rankings of analysts
StarMine® issues annual rankings of financial analysts. It ranks the analysts based on:
Recommendation performance
  For each analyst a portfolio is constructed. For each "Buy"/"Sell" recommendation the portfolio is one (two) unit(s) long/short and simultaneously one (two) unit(s) short/long the benchmark. For "Hold" recommendations, the portfolio invests one unit in the benchmark.
EPS forecast accuracy
  Single-stock Estimating Score (SES): the relative accuracy of each analyst's earnings forecast compared with that of their peers
3 of 24

Issue: Prediction of rankings of analysts
Foreknowledge of analyst forecast accuracy is valuable (Brown and Mohammad, 2003)
Is it possible to predict these rankings? (StarMine rankings are ex-post)
If yes, can we turn those predictions into profitable strategies?
Why not predict stock prices instead?
Analysts' relative performance (rankings) is likely more predictable than stock prices
Main goal
Accurately predict the rankings of financial analysts
4 of 24

Research contributions
Interdisciplinary approach (label ranking algorithm)
First paper to identify the variables that discriminate the rankings of analysts
  Analysis of the financial analysts is based on state variables concerning market conditions and stock characteristics (instead of analyst characteristics)
First paper to predict the rankings of analysts
5 of 24

Research design: an overview
1. Create (ex-post) rankings of the analysts (target rankings), following the models of Clement (1999), Brown (2001), and Creamer and Stolfo (2009)
2. Define state variables
3. Identify the most discriminative state variables
4. Predict the rankings of the analysts (naive Bayes for label ranking) and evaluate the ranking accuracy
6 of 24

Database to use
ThomsonOne I/B/E/S Detailed History:
Quarterly EPS forecasts (1989Q1-2009Q4);
ThomsonOne DataStream:
Accounting data;
Market data;
Stocks filtered out:
fewer than 3 analysts
fewer than 8 quarters of forecast history
7 of 24

Description of the data
Table: Summary of the data
Sector         # stocks   # brokers   # forecasts       # forecasts
                                      stocks/broker     brokers/stock
Energy            170        205          3.09              2.75
Industrials       298        320          2.42              2.23
IT                442        413          2.71              2.41
Materials          94        229          2.56              2.43
Total            1004       1167          2.70              2.46
8 of 24

Average forecasts
[Figure: Average forecasts per broker. x-axis: Quarters (1989 Q1 to 2009 Q4); y-axis: Mu; one series per sector: Energy, IT, Industrials, Materials]
9 of 24

Additional data analysis
[Figure: Average distribution of brokers based on issued forecasts for all stocks. x-axis: Forecasts per quarter (1 to 5); y-axis: % of analysts; one series per sector: Energy, Industrials, IT, Materials]
10 of 24

topN-lastN analysis
Table: Average topN-lastN changes of analysts per quarter, N=3
                        top_t                     last_t
Sector         top_{t-1}   last_{t-1}     top_{t-1}   last_{t-1}
Energy           3.01        2.35           2.34        2.75
Industrials      3.27        2.44           2.43        2.69
IT               3.61        2.64           2.57        3.22
Materials        2.53        2.04           1.90        2.13
11 of 24

Creating target rankings: indexing of the analysts
Based on previous research, we use the EPS mean adjusted forecast error (MAFE, see note 1) as a measure of analysts' predictive accuracy:
$$FE_{q,a,s} = \left| ActEPS_{q,s} - EPS_{q,a,s} \right| \tag{1}$$
$$\overline{FE}_{q,s} = \frac{1}{n} \sum_{a=1}^{n} FE_{q,a,s} \tag{2}$$
$$MAFE_{q,a,s} = \frac{FE_{q,a,s}}{\overline{FE}_{q,s}} \tag{3}$$
Note 1: We scaled the error measure by adding 1 so that the most accurate analyst will receive a higher rank
12 of 24
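
Equations (1) to (3) can be illustrated with a minimal Python sketch for a single stock and quarter. The function names, the analyst labels, and the EPS values are hypothetical. The convention that the smallest MAFE receives rank 1 is an assumption; the slides do not spell out the ranking direction or how ties are broken.

```python
def mafe_by_analyst(actual_eps, forecasts):
    """Compute MAFE per analyst for one stock and one quarter.

    forecasts maps a (hypothetical) analyst label to that analyst's EPS forecast.
    """
    # Eq. (1): absolute forecast error of each analyst
    errors = {a: abs(actual_eps - f) for a, f in forecasts.items()}
    # Eq. (2): mean forecast error across the n analysts covering the stock
    mean_error = sum(errors.values()) / len(errors)
    # Eq. (3): error relative to the mean error.
    # Undefined (division by zero) if every analyst is exactly right.
    return {a: e / mean_error for a, e in errors.items()}


def rank_analysts(mafe):
    """Assumption: the lowest MAFE (most accurate analyst) gets rank 1.

    Ties keep the input order; the slides do not specify tie handling.
    """
    ordered = sorted(mafe, key=mafe.get)
    return {a: r for r, a in enumerate(ordered, start=1)}


# Illustrative numbers only
mafe = mafe_by_analyst(1.00, {"A": 1.10, "B": 0.80, "C": 1.00})
ranks = rank_analysts(mafe)
```

With these made-up forecasts, analyst C has zero error and is ranked first, while B, whose error is twice the mean, is ranked last.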

Define state variables
Variables capture market conditions and stock characteristics (Jegadeesh et al., 2004; Creamer and Stolfo, 2009):
Market Conditions: Market volatility (MKT)
Consensus Analyst Variables
Lagged (consensus) forecasting error (FELAG)
Change in consensus (consensus)
Earnings Momentum: Standardized Unexpected Earnings (SUE)
Growth Indicators: Sales growth (SG)
Accounting Fundamentals: Total accruals to total assets ratio (TA)
Valuation multiples: Earnings-to-price ratio (EP)
13 of 24

Define state variables
The naive Bayes algorithm requires continuous variables to be converted into discrete ones
Discretization uses the size bins method (Dougherty et al., 1995) with 4 bins:
High
Medium High
Medium Low
Low
14 of 24
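
The discretization step can be sketched as follows. The slide names only a "size bins method", so this sketch assumes equal-frequency binning (each bin receives roughly the same number of observations); equal-width bins would be an equally plausible reading. The function name and the sample values are hypothetical.

```python
def discretize(values, labels=("Low", "Medium Low", "Medium High", "High")):
    """Assign each value to one of len(labels) equal-frequency bins.

    Assumption: bins are formed by sorting the values and cutting the sorted
    sequence into equally sized chunks. Tied values may land in different bins.
    """
    n = len(values)
    # Indices of the observations, ordered from smallest to largest value
    order = sorted(range(n), key=lambda i: values[i])
    result = [None] * n
    for position, i in enumerate(order):
        # Map the sorted position 0..n-1 onto a bin index 0..len(labels)-1
        result[i] = labels[position * len(labels) // n]
    return result


# Illustrative values of one state variable over eight quarters
bins = discretize([0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4])
```

Each of the four bins receives two of the eight observations in this example.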

Discriminative Value
First Step: Calculate the rankings similarity matrix
            Ind. Variables     Rankings        Similarity matrix
Quarter   X1  X2  X3  X4       a  b  c          1     2     3     4     5
1         c1  d2  d3  c4       1  2  3     1   1.00  0.25  1.00  0.00  0.00
2         d1  c2  a3  b4       2  3  1     2   0.25  1.00  0.25  0.75  0.75
3         a1  b2  d3  a4       1  2  3     3   1.00  0.25  1.00  0.00  0.00
4         b1  d2  d3  c4       3  2  1     4   0.00  0.75  0.00  1.00  1.00
5         c1  a2  d3  d4       3  2  1     5   0.00  0.75  0.00  1.00  1.00
15 of 24
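
The similarity matrix above can be reproduced with a short Python sketch. It assumes that similarity is Spearman's rank correlation rescaled from [-1, 1] to [0, 1] as (rho + 1) / 2. The slide does not state this formula; it is inferred because it matches every entry in the table. The function names are hypothetical.

```python
def spearman(p, q):
    """Spearman's rank correlation between two rankings of the same k labels."""
    k = len(p)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(p, q)) / (k ** 3 - k)


def similarity_matrix(rankings):
    """Pairwise similarity; assumption: similarity = (rho + 1) / 2."""
    return [[(spearman(p, q) + 1) / 2 for q in rankings] for p in rankings]


# Rankings of analysts a, b, c in quarters 1 to 5, copied from the table
rankings = [(1, 2, 3), (2, 3, 1), (1, 2, 3), (3, 2, 1), (3, 2, 1)]
matrix = similarity_matrix(rankings)
```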

Discriminative Value
Bins          Average similarity   Weight   Weighted similarity
a1 vs. b1           0.00             1            0.00
a1 vs. c1           0.50             2            1.00
a1 vs. d1           0.25             3            0.75
b1 vs. c1           0.50             1            0.50
b1 vs. d1           0.75             2            1.50
c1 vs. d1           0.50             1            0.50
Weighted average                                  0.708
Discriminative power: 1 - 0.708 = 0.292
The higher the discriminative power, the more the rankings differ from one state of the world to another
16 of 24
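
The worked example for variable X1 can be checked with the sketch below. Two details are inferred from the numbers on the slides rather than stated there: the weight of a bin pair is the distance between the bin positions (a1 to d1 is 3 steps), and the weighted similarities are averaged over the number of bin pairs (4.25 / 6 = 0.708). The function name is hypothetical.

```python
from itertools import combinations


def discriminative_power(bins, sim, order):
    """1 minus the weighted average similarity between rankings in different bins.

    bins:  bin label of the variable in each quarter
    sim:   similarity matrix between the quarterly rankings
    order: bin labels from lowest to highest
    Assumptions (inferred from the slides): weight = distance between bin
    positions, and the sum is divided by the number of bin pairs.
    """
    groups = {b: [i for i, v in enumerate(bins) if v == b] for b in order}
    total, pairs = 0.0, 0
    for (pos_a, a), (pos_b, b) in combinations(enumerate(order), 2):
        if not groups[a] or not groups[b]:
            continue  # skip bins that never occur in the data
        sims = [sim[i][j] for i in groups[a] for j in groups[b]]
        total += (pos_b - pos_a) * sum(sims) / len(sims)
        pairs += 1
    return 1 - total / pairs


# Values of X1 in quarters 1 to 5 and the similarity matrix from the slides
x1 = ["c1", "d1", "a1", "b1", "c1"]
sim = [
    [1.00, 0.25, 1.00, 0.00, 0.00],
    [0.25, 1.00, 0.25, 0.75, 0.75],
    [1.00, 0.25, 1.00, 0.00, 0.00],
    [0.00, 0.75, 0.00, 1.00, 1.00],
    [0.00, 0.75, 0.00, 1.00, 1.00],
]
power = discriminative_power(x1, sim, ["a1", "b1", "c1", "d1"])
```

Rounded to three decimals, `power` matches the 0.292 shown on the slide.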

Results: Discriminative power
Table: Discriminative power of independent variables
Sector        FELAG    SUE     consensus   EP      SG      TA      MKT
Energy        0.163    0.194   0.224       0.218   0.183   0.186   0.222
IT            0.207    0.214   0.220       0.196   0.183   0.197   0.194
Industrials   0.185    0.179   0.192       0.210   0.200   0.224   0.190
Materials     0.199    0.189   0.258       0.218   0.203   0.220   0.183
17 of 24

Predict with Label Ranking Algorithm
Naive Bayes algorithm for label ranking (Aiguzhinov et al., 2010):
nonparametric technique that relies on the similarities of the rankings
predicts rankings conditional on the values of the state variables
Alternative baseline ranking methods:
default (ranking based on the average rank of each label)
naive (previous quarter ranking)
Accuracy of the methods: Spearman’s rank correlation
Figure: Timeline of the predicted (π̂) and the target (π) rankings
18 of 24
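
The two baseline methods can be sketched in a few lines of Python. The function names and the sample history are illustrative, and the way the default baseline breaks ties between labels with equal average rank (input order) is an assumption.

```python
def default_ranking(past_rankings):
    """Default baseline: rank the labels by their average rank in the past.

    Each ranking is a tuple giving the rank of label 0, label 1, and so on.
    Assumption: labels with equal average rank keep their input order.
    """
    k = len(past_rankings[0])
    n = len(past_rankings)
    mean_rank = [sum(r[j] for r in past_rankings) / n for j in range(k)]
    order = sorted(range(k), key=lambda j: mean_rank[j])
    ranking = [0] * k
    for position, j in enumerate(order, start=1):
        ranking[j] = position
    return tuple(ranking)


def naive_ranking(past_rankings):
    """Naive baseline: repeat the ranking of the previous quarter."""
    return tuple(past_rankings[-1])


# Illustrative history of three analysts over four quarters
history = [(1, 2, 3), (2, 3, 1), (1, 2, 3), (1, 3, 2)]
default_prediction = default_ranking(history)
naive_prediction = naive_ranking(history)
```

In this history the average ranks are 1.25, 2.5, and 2.25, so the default baseline places the third analyst ahead of the second.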

Results: Label rankings
Table: Ranking accuracy of the naive Bayes ranking method and the two baselines.
                    NBr                default            naive ranking
Sector         mean    std.dev     mean    std.dev     mean    std.dev
Energy         0.037   0.060       0.060   0.087       0.064   0.060
IT            -0.005   0.045       0.088   0.075       0.043   0.036
Industrials    0.010   0.050       0.078   0.074       0.033   0.046
Materials      0.012   0.067       0.076   0.073       0.041   0.056
19 of 24

Differences (1)
[Figure: Differences in ranking accuracy of naive Bayes and the default rankings. Four panels: Energy, Industrials, IT, Materials. x-axis: Stocks; y-axis: Difference between mean accuracy of NBr and default ranking]
20 of 24

Differences (2)
[Figure: Differences in ranking accuracy of naive Bayes and the naive rankings. Four panels: Energy, Industrials, IT, Materials. x-axis: Stocks; y-axis: Difference between mean accuracy of NBr and naive ranking]
21 of 24

Conclusions
Discriminative power analysis identifies Consensus as the most discriminative variable in most of the sectors
There is room to improve the label ranking algorithm, in particular by refining the predictor state variables
22 of 24

References (1)
Aiguzhinov, Artur, Carlos Soares, and Ana Serra (2010), “A similarity-based
adaptation of naive bayes for label ranking: Application to the
metalearning problem of algorithm recommendation.” In Discovery
Science (Bernhard Pfahringer, Geoff Holmes, and Achim Hoffmann, eds.),
volume 6332 of Lecture Notes in Computer Science, 16–26, Springer
Berlin, Heidelberg.
Barber, B., R. Lehavy, M. McNichols, and B. Trueman (2001), “Can
Investors Profit from the Prophets? Security Analyst Recommendations
and Stock Returns.” The Journal of Finance, 56, 531–563.
Brown, L. (2001), “How Important is Past Analyst Earnings Forecast
Accuracy?” Financial Analysts Journal, 57, 44–49.
Brown, L.D. and E. Mohammad (2003), “The Predictive Value of Analyst
Characteristics.” Journal of Accounting, Auditing and Finance, 18.
23 of 24

References (2)
Clement, M.B. (1999), “Analyst forecast accuracy: Do ability, resources,
and portfolio complexity matter?” Journal of Accounting and Economics,
27, 285–303.
Creamer, G. and S. Stolfo (2009), “A link mining algorithm for earnings
forecast and trading.” Data Mining and Knowledge Discovery, 18,
419–445.
Dougherty, J., R. Kohavi, and M. Sahami (1995), “Supervised and
unsupervised discretization of continuous features.” In Machine
Learning-International Workshop, 194–202, Morgan Kaufmann
Publishers, Inc.
Fama, E.F. (1970), “Efficient Capital Markets: A Review of Empirical
Work.” The Journal of Finance, 25, 383–417.
Fama, E.F. (1991), “Efficient Capital Markets: II.” The Journal of Finance,
46, 1575–1617.
24 of 24

References (3)
Grossman, S.J. and J.E. Stiglitz (1980), “On the Impossibility of
Informationally Efficient Markets.” American Economic Review, 70,
393–408.
Jegadeesh, N., J. Kim, S.D. Krische, and C.M.C. Lee (2004), “Analyzing the
Analysts: When Do Recommendations Add Value?” The Journal of
Finance, 59, 1083–1124.
Vogt, M., J.W. Godden, and J. Bajorath (2007), “Bayesian interpretation of a
distance function for navigating high-dimensional descriptor spaces.”
Journal of Chemical Information and Modeling, 47, 39–46.
Womack, K.L. (1996), “Do Brokerage Analysts’ Recommendations Have
Investment Value?” The Journal of Finance, 51, 137–168.
25 of 24

Similarity-based Naive Bayes for Label Ranking: Prior probability of label ranking
Table: Demonstration of the prior probability for label ranking
                                                      Ranks
Quarter   x1       x2       x3     x4        Alex   Brown   Craig
1         High     Low      High   Medium     1      2       3
2         High     High     High   Low        2      3       1
3         Medium   Medium   High   Low        1      2       3
4         Low      Low      Low    High       1      3       2
...       ...      ...      ...    ...        ...    ...     ...
14        Medium   High     High   Medium     1      2       3
15        High     Medium   High   Low        3      1       2
Maximizing the likelihood is equivalent to minimizing the distance (i.e., maximizing the similarity) in a Euclidean space (Vogt et al., 2007)

Label ranking: formalization
Instance: $X \subseteq \{V_1, \ldots, V_m\}$
Labels: $L = \{\lambda_1, \ldots, \lambda_k\}$
Output: $Y = \Pi_L$
Training set: $T = \{x_i, y_i\}_{i \in \{1, \ldots, n\}} \subseteq X \times Y$
Learn a mapping $h : X \to Y$ such that a loss function $\ell$ is minimized:
$$\ell = \frac{\sum_{i=1}^{n} \rho(\pi_i, \hat{\pi}_i)}{n} \tag{4}$$
with $\rho$ being the Spearman correlation coefficient:
$$\rho(\pi, \hat{\pi}) = 1 - \frac{6 \sum_{j=1}^{k} (\pi_j - \hat{\pi}_j)^2}{k^3 - k} \tag{5}$$
where $\pi$ and $\hat{\pi}$ are, respectively, the target and predicted rankings for a given instance.
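
Equations (4) and (5) translate directly into code. The sketch below is illustrative; the function names are hypothetical, and rankings are assumed to be complete and free of ties, which is what the closed form in (5) requires.

```python
def spearman(target, predicted):
    """Eq. (5): Spearman correlation between two rankings of k labels."""
    k = len(target)
    squared_diff = sum((t - p) ** 2 for t, p in zip(target, predicted))
    return 1 - 6 * squared_diff / (k ** 3 - k)


def mean_accuracy(targets, predictions):
    """Eq. (4): average Spearman correlation over the n instances."""
    pairs = list(zip(targets, predictions))
    return sum(spearman(t, p) for t, p in pairs) / len(pairs)


# One perfect prediction and one fully reversed prediction cancel out
score = mean_accuracy([(1, 2, 3), (1, 2, 3)], [(1, 2, 3), (3, 2, 1)])
```

A value of 1 means the predicted ranking equals the target, and -1 means it is exactly reversed.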

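Posterior probability of label ranking
Prior probability of label ranking:
$$P_{LR}(\pi) = \frac{\sum_{i=1}^{n} \rho(\pi, \pi_i)}{n} \tag{6}$$
Conditional probability of label ranking:
$$P_{LR}(v_{a,i} \mid \pi) = \frac{\sum_{i : x_{i,a} = v_{a,i}} \rho(\pi, \pi_i)}{\left| \{ i : x_{i,a} = v_{a,i} \} \right|} \tag{7}$$
Estimated ranking:
$$\hat{\pi} = \arg\max_{\pi \in \Pi_L} P_{LR}(\pi) \prod_{a=1}^{m} P_{LR}(x_{i,a} \mid \pi) \tag{8}$$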
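
Equations (6) to (8) can be sketched as a brute-force search over all permutations of the labels. This is a minimal illustration, not the authors' implementation. Two choices are assumptions made here: the Spearman correlation is rescaled to [0, 1] as (rho + 1) / 2 so that the product in (8) never multiplies negative factors, and an attribute value never seen in training is skipped. All names and data are hypothetical.

```python
from itertools import permutations


def similarity(p, q):
    """Assumption: Spearman correlation rescaled from [-1, 1] to [0, 1]."""
    k = len(p)
    rho = 1 - 6 * sum((a - b) ** 2 for a, b in zip(p, q)) / (k ** 3 - k)
    return (rho + 1) / 2


def predict_ranking(train_x, train_rankings, query):
    """Return the permutation that maximizes the score in eq. (8)."""
    k = len(train_rankings[0])
    n = len(train_rankings)
    best, best_score = None, float("-inf")
    for candidate in permutations(range(1, k + 1)):
        sims = [similarity(candidate, r) for r in train_rankings]
        score = sum(sims) / n  # eq. (6): prior
        for a, value in enumerate(query):
            # eq. (7): average similarity over instances sharing this value
            matched = [s for s, x in zip(sims, train_x) if x[a] == value]
            if matched:  # assumption: unseen values contribute no factor
                score *= sum(matched) / len(matched)
        if score > best_score:
            best, best_score = candidate, score
    return best


# Toy data: one state variable, two regimes with opposite rankings
train_x = [("High",), ("High",), ("Low",), ("Low",)]
train_rankings = [(1, 2, 3), (1, 2, 3), (3, 2, 1), (3, 2, 1)]
prediction_high = predict_ranking(train_x, train_rankings, ("High",))
prediction_low = predict_ranking(train_x, train_rankings, ("Low",))
```

Enumerating all k! permutations is only practical for a small number of labels, which is another reason to treat this as a sketch rather than a production approach.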
