Artur-eea-presentation
1. Understanding rankings of financial analysts
Artur Aiguzhinov (1, 2)
Ana Paula Serra (1)
Carlos Soares (2)
(1) CEFUP & Department of Economics, University of Porto, Portugal
(2) LIAAD-INESC Porto LA & Department of Economics, University of Porto, Portugal
February 25th, 2011
Agent-based computational economics: Computational Finance
Eastern Economics Association Conference, New York
1 of 24
2. Motivation (1): the value of the recommendations
Efficient Market Hypothesis (Fama, 1970)
Information gathering is costly ⇒ possibilities for abnormal
returns (Grossman and Stiglitz, 1980; Fama, 1991)
On average, recommendations bring value to investors, and financial
analysts’ forecast accuracy is valuable (Womack, 1996; Barber et al.,
2001)
3. Motivation (2): rankings of analysts
StarMine® issues annual rankings of financial analysts.
Ranks the analysts based on:
Recommendation performance
For each analyst a portfolio is constructed. For each “Buy”/“Sell”
recommendation the portfolio is one (two) unit(s) long/short and
simultaneously one (two) unit(s) short/long the benchmark. For “Hold”
recommendations, the portfolio invests one unit in the benchmark.
EPS forecast accuracy
Single-stock Estimating Score (SES): relative accuracy of each analyst’s
earnings forecast compared with their peers
4. Issue: Prediction of rankings of analysts
Foreknowledge of analyst forecast accuracy is valuable (Brown and
Mohammad, 2003)
Is it possible to predict these rankings? (StarMine rankings are ex-post)
If yes, can we turn those predictions into profitable strategies?
Why not predict stock prices instead?
Analysts’ relative performance (rankings) is more predictable than
stock prices
Main goal
Accurately predict the rankings of financial analysts
5. Research contributions
Interdisciplinary approach (label ranking algorithm)
First paper to identify the variables that discriminate the rankings of
analysts
Analysis of financial analysts based on state variables describing market
conditions and stock characteristics (instead of analyst characteristics)
First paper to predict the rankings of analysts
6. Research design: an overview
1. Create (ex-post) rankings of the analysts (target rankings):
to establish the rankings we follow the models of Clement (1999), Brown
(2001), and Creamer and Stolfo (2009)
2. Define state variables
3. Identify the most discriminative state variables
4. Predict rankings of the analysts (naive Bayes for label ranking) and
evaluate the ranking accuracy
7. Database to use
ThomsonOne I/B/E/S Detailed History:
Quarterly EPS forecasts (1989Q1-2009Q4);
ThomsonOne DataStream:
Accounting data;
Market data;
Filtered out stocks with:
fewer than 3 analysts
fewer than 8 quarters of forecast history
8. Description of the data
Table: Summary of the data
Sector        # stocks   # brokers     # forecasts     # forecasts
                                     stocks/broker   brokers/stock
Energy             170         205            3.09            2.75
Industrials        298         320            2.42            2.23
IT                 442         413            2.71            2.41
Materials           94         229            2.56            2.43
Total             1004        1167            2.70            2.46
9. Average forecasts
Figure: Average forecasts per broker (x-axis: Quarters, 1989Q1 to 2009Q4; y-axis: Mu; one series per sector: Energy, IT, Industrials, Materials)
10. Additional data analysis
Figure: Average distribution of brokers based on issued forecasts for all stocks (x-axis: Forecasts per quarter, 1 to 5; y-axis: % of analysts; one series per sector: Energy, Industrials, IT, Materials)
11. topN-lastN analysis
Table: Average topN-lastN changes of analysts per quarter, N=3
                       top_t                     last_t
              top_{t-1}   last_{t-1}    top_{t-1}   last_{t-1}
Energy             3.01         2.35         2.34         2.75
Industrials        3.27         2.44         2.43         2.69
IT                 3.61         2.64         2.57         3.22
Materials          2.53         2.04         1.90         2.13
12. Creating target rankings: indexing of the analysts
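Based on previous research, we use the EPS mean adjusted forecast error
(MAFE¹) as a measure of analysts’ predictive accuracy:
\[ FE_{q,a,s} = \left| ActEPS_{q,s} - EPS_{q,a,s} \right| \tag{1} \]
\[ \overline{FE}_{q,s} = \frac{1}{n} \sum_{a=1}^{n} FE_{q,a,s} \tag{2} \]
\[ MAFE_{q,a,s} = \frac{FE_{q,a,s}}{\overline{FE}_{q,s}} \tag{3} \]
where q, a, and s index the quarter, the analyst, and the stock, and n is the number of analysts covering stock s in quarter q.
¹ We scaled the error measure by adding 1 so that the most accurate analyst will
receive a higher rank.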
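Equations (1) to (3) can be sketched for a single stock-quarter as below. This is a minimal illustration, not the authors' implementation: the function names `mafe_scores` and `rank_analysts`, the EPS numbers, the convention that rank 1 goes to the lowest MAFE, and the handling of a zero mean error are all assumptions. The "adding 1" scaling from the footnote is not applied here.

```python
def mafe_scores(actual_eps, forecasts):
    """Return MAFE per analyst for one stock-quarter.

    forecasts maps an analyst name to that analyst's EPS forecast.
    """
    # Equation (1): absolute forecast error of each analyst.
    errors = {a: abs(actual_eps - f) for a, f in forecasts.items()}
    # Equation (2): mean error across the n analysts.
    mean_error = sum(errors.values()) / len(errors)
    if mean_error == 0:
        # Assumption: every forecast was exact, so all analysts tie at 0.
        return {a: 0.0 for a in errors}
    # Equation (3): each error relative to the mean error.
    return {a: e / mean_error for a, e in errors.items()}


def rank_analysts(scores):
    """Assumption: rank 1 goes to the lowest MAFE; ties keep input order."""
    ordered = sorted(scores, key=scores.get)
    return {a: r for r, a in enumerate(ordered, start=1)}


# Hypothetical numbers: actual EPS of 1.00 and three forecasts.
scores = mafe_scores(1.00, {"Alex": 1.10, "Brown": 0.70, "Craig": 1.00})
ranks = rank_analysts(scores)
```

With these made-up inputs the absolute errors are about 0.10, 0.30, and 0.00, so the MAFE values are about 0.75, 2.25, and 0, and Craig is ranked first.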
13. Define state variables
Variables capture market conditions and stock characteristics (Jegadeesh
et al., 2004; Creamer and Stolfo, 2009):
Market Conditions: Market volatility (MKT)
Consensus Analyst Variables
Lagged (consensus) forecasting error (FELAG)
Change in consensus (consensus)
Earnings Momentum: Standardized Unexpected Earnings (SUE)
Growth Indicators: Sales growth (SG)
Accounting Fundamentals: Total accruals to total assets ratio (TA)
Valuation multiples: Earnings-to-price ratio (EP)
14. Define state variables
The naive Bayes algorithm requires that continuous variables be converted into
discrete ones
Discretization using the size bins method (Dougherty et al., 1995)
4 bins
High
Medium High
Medium Low
Low
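The four-bin discretization can be sketched as follows. The slide does not say whether the bins have equal width or equal frequency; this sketch assumes equal frequency (quartiles by rank), and the function name `discretize_quartiles` is hypothetical.

```python
LABELS = ["Low", "Medium Low", "Medium High", "High"]


def discretize_quartiles(values):
    """Assign each value to one of four equal-frequency bins.

    Assumption: bins are formed by rank order, so equal values that
    straddle a bin boundary may end up in different bins.
    """
    n = len(values)
    # Indices of the observations, sorted from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for position, index in enumerate(order):
        # position * 4 // n maps sorted positions 0..n-1 onto bins 0..3.
        labels[index] = LABELS[position * 4 // n]
    return labels


# Hypothetical values of one state variable over eight quarters.
bins = discretize_quartiles([0.1, 0.9, 0.4, 0.6, 0.2, 0.8, 0.3, 0.7])
```

An equal-width variant would instead split the range between the minimum and maximum into four intervals of the same length.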
16. Discriminative Value
Bins          Average similarity   Weight   Weighted average
a1 vs. b1                   0.00        1               0.00
a1 vs. c1                   0.50        2               1.00
a1 vs. d1                   0.25        3               0.75
b1 vs. c1                   0.50        1               0.50
b1 vs. d1                   0.75        2               1.50
c1 vs. d1                   0.50        1               0.50
Mean of the weighted averages: 4.25 / 6 = 0.708
Discriminative power: 1 − 0.708 = 0.292
The higher the discriminative power, the more the rankings differ between
one state of the world and another.
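The arithmetic in the table can be reproduced with the short sketch below. The weighting rule is inferred from the numbers in the table rather than stated on the slide: each pair's weight is assumed to be the distance between the two bins (a to b is 1, a to d is 3), and the weighted values are averaged over the six pairs. The function name `discriminative_power` is hypothetical.

```python
from itertools import combinations


def discriminative_power(pair_similarity, bins):
    """Return 1 minus the mean distance-weighted similarity between bins.

    pair_similarity maps (bin_i, bin_j) to the average ranking similarity.
    Assumption: the weight of a pair is the gap between bin positions.
    """
    pairs = list(combinations(range(len(bins)), 2))
    weighted = [
        pair_similarity[(bins[i], bins[j])] * (j - i)  # similarity x weight
        for i, j in pairs
    ]
    mean_weighted = sum(weighted) / len(pairs)
    return 1.0 - mean_weighted


# Values copied from the table above.
bins = ["a1", "b1", "c1", "d1"]
similarity = {
    ("a1", "b1"): 0.00, ("a1", "c1"): 0.50, ("a1", "d1"): 0.25,
    ("b1", "c1"): 0.50, ("b1", "d1"): 0.75, ("c1", "d1"): 0.50,
}
power = discriminative_power(similarity, bins)
```

Rounded to three decimals, `power` matches the 0.292 shown on the slide.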
17. Results: Discriminative power
Table: Discriminative power of independent variables
Sector        FELAG    SUE      consensus   EP       SG       TA       MKT
Energy        0.163    0.194    0.224       0.218    0.183    0.186    0.222
IT            0.207    0.214    0.220       0.196    0.183    0.197    0.194
Industrials   0.185    0.179    0.192       0.210    0.200    0.224    0.190
Materials     0.199    0.189    0.258       0.218    0.203    0.220    0.183
18. Predict with Label Ranking Algorithm
Naive Bayes algorithm for label ranking (Aiguzhinov et al., 2010):
nonparametric technique that relies on the similarities between rankings
predicts rankings conditional on the values of the state variables
Alternative baseline ranking methods:
default (ranking based on the average rank of each label)
naive (previous quarter ranking)
Accuracy of the methods: Spearman’s rank correlation
Figure: Timeline of the predicted (π̂) and the target (π) rankings
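The two baseline methods listed on this slide can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, every quarter is assumed to rank the same set of analysts, and ties in the average rank are broken by input order. The sample history reuses the first four quarters of the demonstration table in the appendix.

```python
def default_ranking(history):
    """Default baseline: rank analysts by their average past rank.

    history is a list of dicts (oldest quarter first), each mapping an
    analyst to a rank. Assumption: all quarters cover the same analysts.
    """
    analysts = list(history[0])
    average = {a: sum(q[a] for q in history) / len(history) for a in analysts}
    ordered = sorted(average, key=average.get)  # lowest average rank first
    return {a: r for r, a in enumerate(ordered, start=1)}


def naive_ranking(history):
    """Naive baseline: repeat the ranking of the previous quarter."""
    return dict(history[-1])


history = [
    {"Alex": 1, "Brown": 2, "Craig": 3},
    {"Alex": 2, "Brown": 3, "Craig": 1},
    {"Alex": 1, "Brown": 2, "Craig": 3},
    {"Alex": 1, "Brown": 3, "Craig": 2},
]
default_prediction = default_ranking(history)
naive_prediction = naive_ranking(history)
```

Here the average ranks are 1.25 (Alex), 2.5 (Brown), and 2.25 (Craig), so the default baseline places Craig ahead of Brown, while the naive baseline simply copies quarter 4.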
19. Results: Label rankings
Table: Ranking accuracy of the naive Bayes ranking method and the two baselines.
                    NBr               default          naive ranking
Sector          mean   std.dev     mean   std.dev     mean   std.dev
Energy         0.037     0.060    0.060     0.087    0.064     0.060
IT            -0.005     0.045    0.088     0.075    0.043     0.036
Industrials    0.010     0.050    0.078     0.074    0.033     0.046
Materials      0.012     0.067    0.076     0.073    0.041     0.056
20. Differences (1)
Figure: Differences in ranking accuracy of naive Bayes and the default rankings
(one panel per sector: Energy, Industrials, IT, Materials; x-axis: Stocks;
y-axis: Difference between mean accuracy of NBr and default ranking)
21. Differences (2)
Figure: Differences in ranking accuracy of naive Bayes and the naive rankings
(one panel per sector: Energy, Industrials, IT, Materials; x-axis: Stocks;
y-axis: Difference between mean accuracy of NBr and naive ranking)
22. Conclusions
Discriminative power analysis identifies consensus as the most
discriminative variable in three of the four sectors (Energy, IT, and Materials)
There is room for improving the label ranking algorithm, in particular by
refining the predictor state variables
23. References (1)
Aiguzhinov, Artur, Carlos Soares, and Ana Serra (2010), “A similarity-based
adaptation of naive bayes for label ranking: Application to the
metalearning problem of algorithm recommendation.” In Discovery
Science (Bernhard Pfahringer, Geoff Holmes, and Achim Hoffmann, eds.),
volume 6332 of Lecture Notes in Computer Science, 16–26, Springer
Berlin, Heidelberg.
Barber, B., R. Lehavy, M. McNichols, and B. Trueman (2001), “Can
Investors Profit from the Prophets? Security Analyst Recommendations
and Stock Returns.” The Journal of Finance, 56, 531–563.
Brown, L. (2001), “How Important is Past Analyst Earnings Forecast
Accuracy?” Financial Analysts Journal, 57, 44–49.
Brown, L.D. and E. Mohammad (2003), “The Predictive Value of Analyst
Characteristics.” Journal of Accounting, Auditing and Finance, 18.
24. References (2)
Clement, M.B. (1999), “Analyst forecast accuracy: Do ability, resources,
and portfolio complexity matter?” Journal of Accounting and Economics,
27, 285–303.
Creamer, G. and S. Stolfo (2009), “A link mining algorithm for earnings
forecast and trading.” Data Mining and Knowledge Discovery, 18,
419–445.
Dougherty, J., R. Kohavi, and M. Sahami (1995), “Supervised and
unsupervised discretization of continuous features.” In Machine
Learning-International Workshop, 194–202, Morgan Kaufmann
Publishers, Inc.
Fama, E.F. (1970), “Efficient Capital Markets: A Review of Empirical
Work.” The Journal of Finance, 25, 383–417.
Fama, E.F. (1991), “Efficient Capital Markets: II.” The Journal of Finance,
46, 1575–1617.
25. References (3)
Grossman, S.J. and J.E. Stiglitz (1980), “On the Impossibility of
Informationally Efficient Prices.” American Economic Review, 70,
393–408.
Jegadeesh, N., J. Kim, S.D. Krische, and C.M.C. Lee (2004), “Analyzing the
Analysts: When Do Recommendations Add Value?” The Journal of
Finance, 59, 1083–1124.
Vogt, M., J.W. Godden, and J. Bajorath (2007), “Bayesian interpretation of a
distance function for navigating high-dimensional descriptor spaces.”
Journal of Chemical Information and Modeling, 47, 39–46.
Womack, K.L. (1996), “Do Brokerage Analysts’ Recommendations Have
Investment Value?” The Journal of Finance, 51, 137–168.
26. Similarity-based Naive Bayes for Label Ranking: Prior
probability of label ranking
Table: Demonstration of the prior probability for label ranking
Quarter   x1       x2       x3     x4       Ranks
                                            Alex   Brown   Craig
1         High     Low      High   Medium   1      2       3
2         High     High     High   Low      2      3       1
3         Medium   Medium   High   Low      1      2       3
4         Low      Low      Low    High     1      3       2
...       ...      ...      ...    ...      ...    ...     ...
14        Medium   High     High   Medium   1      2       3
15        High     Medium   High   Low      3      1       2
Maximizing the likelihood is equivalent to minimizing the distance (i.e.,
maximizing the similarity) in a Euclidean space (Vogt et al., 2007)
27. Label ranking: formalization
Instance: $X \subseteq \{V_1, \ldots, V_m\}$
Labels: $L = \{\lambda_1, \ldots, \lambda_k\}$
Output: $Y = \Pi_L$
Training set: $T = \{x_i, y_i\}_{i \in \{1, \ldots, n\}} \subseteq X \times Y$
Learn a mapping $h : X \to Y$ such that a loss function $\ell$ is minimized:
\[ \ell = \frac{\sum_{i=1}^{n} \rho(\pi_i, \hat{\pi}_i)}{n} \tag{4} \]
with $\rho$ being the Spearman correlation coefficient:
\[ \rho(\pi, \hat{\pi}) = 1 - \frac{6 \sum_{j=1}^{k} (\pi_j - \hat{\pi}_j)^2}{k^3 - k} \tag{5} \]
where $\pi$ and $\hat{\pi}$ are, respectively, the target and predicted rankings for a
given instance.
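Equations (4) and (5) translate directly into code. The sketch below assumes that rankings are tuples of ranks 1..k without ties; the function names `spearman` and `mean_accuracy` are hypothetical.

```python
def spearman(pi, pi_hat):
    """Equation (5): Spearman correlation between two rankings of k labels."""
    k = len(pi)
    if k != len(pi_hat) or k < 2:
        raise ValueError("rankings must have the same length, at least 2")
    squared_diff = sum((a - b) ** 2 for a, b in zip(pi, pi_hat))
    return 1 - 6 * squared_diff / (k ** 3 - k)


def mean_accuracy(targets, predictions):
    """Equation (4): average Spearman correlation over n instances."""
    pairs = list(zip(targets, predictions))
    return sum(spearman(t, p) for t, p in pairs) / len(pairs)


# Identical rankings score 1, fully reversed rankings score -1.
same = spearman((1, 2, 3), (1, 2, 3))
reversed_score = spearman((1, 2, 3), (3, 2, 1))
```

Because the coefficient lies between −1 and 1, a mean accuracy near 0, as in the results table, indicates predictions that are close to uncorrelated with the target rankings.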
28. Posterior probability of label ranking
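Prior probability of label ranking:
\[ P_{LR}(\pi) = \frac{\sum_{i=1}^{n} \rho(\pi, \pi_i)}{n} \tag{6} \]
Conditional probability of label ranking:
\[ P_{LR}(v_{a,i} \mid \pi) = \frac{\sum_{i : x_{i,a} = v_{a,i}} \rho(\pi, \pi_i)}{|\{i : x_{i,a} = v_{a,i}\}|} \tag{7} \]
Estimated ranking:
\[ \hat{\pi} = \arg\max_{\pi \in \Pi_L} P_{LR}(\pi) \prod_{a=1}^{m} P_{LR}(x_{i,a} \mid \pi) \tag{8} \]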
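A brute-force sketch of equations (6) to (8) is given below. It is not the authors' implementation and makes three assumptions: the similarity is rescaled to (ρ + 1) / 2 so that every factor stays non-negative, an attribute value never seen in training is skipped (treated as a neutral factor), and all k! candidate rankings are enumerated, which is only practical for small k. The names `similarity` and `predict_ranking` are hypothetical.

```python
from itertools import permutations


def similarity(pi, pi_other):
    """Spearman correlation rescaled to [0, 1] (rescaling is an assumption)."""
    k = len(pi)
    squared_diff = sum((a - b) ** 2 for a, b in zip(pi, pi_other))
    rho = 1 - 6 * squared_diff / (k ** 3 - k)
    return (rho + 1) / 2


def predict_ranking(X, Y, x_new):
    """Return the ranking that maximizes prior times conditionals.

    X: list of attribute tuples, Y: list of rankings (tuples of ranks 1..k),
    x_new: attribute tuple of the instance to predict.
    """
    k, n = len(Y[0]), len(Y)
    best, best_score = None, -1.0
    for pi in permutations(range(1, k + 1)):
        # Equation (6): mean similarity to all training rankings.
        score = sum(similarity(pi, y) for y in Y) / n
        for a, value in enumerate(x_new):
            # Equation (7): mean similarity over instances sharing this value.
            matches = [y for x, y in zip(X, Y) if x[a] == value]
            if matches:  # assumption: unseen values contribute no factor
                score *= sum(similarity(pi, y) for y in matches) / len(matches)
        # Equation (8): keep the candidate with the highest score.
        if score > best_score:
            best, best_score = pi, score
    return best


# Toy data: one state variable that fully determines the ranking.
X = [("High",), ("High",), ("Low",), ("Low",)]
Y = [(1, 2, 3), (1, 2, 3), (3, 2, 1), (3, 2, 1)]
prediction = predict_ranking(X, Y, ("High",))
```

On this toy data every candidate has the same prior, so the conditional term decides the outcome and the prediction for "High" is the ranking observed with "High" in training.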