This document summarizes a presentation on statistical clustering, hierarchical PCA, and their applications to portfolio management. It introduces PCA and how the first principal component/eigenportfolio can represent the market portfolio. It then describes hierarchical PCA, which partitions assets into clusters and allows for different correlations between and within clusters. The document provides examples analyzing global stock markets with hierarchical PCA. It also describes an algorithm for statistically generating clusters rather than using predefined classifications. Finally, it discusses applications of statistical clustering and hierarchical PCA models to portfolio optimization and mean-variance analysis.
Statistical Clustering and Portfolio Management
1. Statistical Clustering, Hierarchical PCA and Portfolio Management
Marco Avellaneda & Juan A. Serur
Courant Institute of Mathematical Sciences, New York University
9th Annual Big Data Finance Conference
Dec 2021
2. Introduction
Quantitative factor analysis has become increasingly important in portfolio management.
Some prominent works
Sharpe → CAPM
$$\tilde{r}_i - r_F = \beta_{i,M} (\tilde{r}_M - r_F) \tag{1}$$
Ross → APT
$$\tilde{r}_i - r_F = \sum_{j=1}^{N} \beta_{i,j} (\tilde{r}_j - r_F) \tag{2}$$
Fama-French (explicit factors: value, size, profitability, etc.)
3. Types of factor models
Models based on explicit factors such as momentum, value, size, quality, etc.
CAPM
Fama-French Three-Factor Model
Fama-French Five-Factor Model
Models based on mathematical factors like statistical features extracted from assets’ returns
using
Principal Component Analysis
Maximum Likelihood
Among others...
Some advantages of mathematical factor models:
Don’t make assumptions on the drivers of price movements
Rely on market data without additional information
Some challenges remain, such as choosing the number of factors K. Much of the work in this area is
related to Random Matrix Theory.
4. Principal Component Analysis (PCA)
In a universe consisting of N stocks and T observations, we consider the N × N empirical
correlation matrix,
$$C = \frac{1}{T} R^{\top} R \tag{3}$$
where R is the T × N matrix of standardized returns.
PCA calculates the eigenvalues and eigenvectors of the correlation matrix, ranked in decreasing
order by eigenvalue. Accordingly, the first eigenvector solves the variational problem
$$V^{(1)} = \operatorname{argmax} \{ V^{\top} C V : \|V\|_2 = 1 \} \tag{4}$$
where $\|\cdot\|_2$ denotes the Euclidean norm.
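The construction in (3) and (4) can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not the presenters' code: the function name `pca_correlation`, the one-factor simulation, and the use of population (ddof = 0) standard deviations for standardization are all assumptions made for the example.

```python
import numpy as np

def pca_correlation(returns):
    """Return (C, eigenvalues, eigenvectors), sorted by decreasing eigenvalue.

    `returns` is a T x N array (rows = observations, columns = stocks).
    """
    T = returns.shape[0]
    # Standardize each column to zero mean and unit variance (ddof=0),
    # so that R.T @ R / T has ones on its diagonal.
    R = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    C = R.T @ R / T                       # N x N empirical correlation matrix, eq. (3)
    eigvals, eigvecs = np.linalg.eigh(C)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # re-rank in decreasing order
    return C, eigvals[order], eigvecs[:, order]

# Hypothetical data: 8 stocks driven by one common factor plus noise.
rng = np.random.default_rng(0)
market = rng.normal(size=(500, 1))
returns = 0.6 * market + rng.normal(size=(500, 8))

C, eigvals, eigvecs = pca_correlation(returns)
v1 = eigvecs[:, 0]  # first eigenvector: unit-norm maximizer of V' C V, eq. (4)
```

Because C is symmetric, `np.linalg.eigh` is the natural choice here; the eigenvalues sum to N, so `eigvals / eigvals.sum()` gives the share of variance explained by each component.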
5. First Eigenportfolio ≈ Market Portfolio
The first eigenvector is key to describing the statistics of the system
It describes the system both statistically (as the direction of maximum variance) and financially
Define the principal eigenportfolio as the portfolio with weights
$$\theta_i = c \, \frac{V_i^{(1)}}{\sigma_i}, \quad \text{where } c = \frac{1}{\sum_{j=1}^{N} V_j^{(1)} / \sigma_j} \tag{5}$$
If F is the return of the principal eigenportfolio, then the first-order optimality condition
for a portfolio that maximizes the Sharpe Ratio over all competing portfolios investing in
the same N stocks can be represented as
$$r - E(r) = \beta_r (F - E(F)) + \epsilon_r \tag{6}$$
The Principal Eigenportfolio is connected to the concept of the Market Portfolio (Modern
Portfolio Theory).
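As a rough sketch of (5), the weights of the principal eigenportfolio can be computed as follows. The example assumes that σ_i is the sample standard deviation of stock i's returns, and it fixes the sign of the eigenvector so that its entries sum to a positive number; the function name `principal_eigenportfolio` and the simulated returns are hypothetical.

```python
import numpy as np

def principal_eigenportfolio(returns):
    """Weights theta_i = c * V_i / sigma_i of eq. (5), normalized to sum to one."""
    sigma = returns.std(axis=0)             # assumed: sigma_i = sample volatility of stock i
    C = np.corrcoef(returns, rowvar=False)  # N x N correlation matrix
    eigvals, eigvecs = np.linalg.eigh(C)    # ascending eigenvalues
    v1 = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    if v1.sum() < 0:                        # eigenvectors are defined up to sign
        v1 = -v1
    raw = v1 / sigma
    return raw / raw.sum()                  # dividing by the sum applies the constant c

# Hypothetical data: 6 stocks with a common factor and different volatilities.
rng = np.random.default_rng(0)
market = rng.normal(size=(500, 1))
vols = np.array([0.010, 0.020, 0.015, 0.030, 0.012, 0.025])
returns = (0.7 * market + rng.normal(size=(500, 6))) * vols

theta = principal_eigenportfolio(returns)
F = returns @ theta  # return series of the principal eigenportfolio
```

Dividing by σ_i tilts the portfolio toward lower-volatility stocks, so each stock contributes on a comparable risk scale.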
6. The Principal Eigenportfolio is connected to the concept of the Market Portfolio (Modern
Portfolio Theory). If the entries $V_i^{(1)}$ of the principal eigenvector are positive for all i,
the principal eigenportfolio has positive weights. For a correlation matrix with positive entries,
this positivity follows from the Perron-Frobenius Theorem.
Perron-Frobenius Theorem
Any square matrix $A_{i,j} > 0$ with positive entries has a unique eigenvector with positive
entries (up to multiplication by a positive scalar), and the corresponding eigenvalue has
multiplicity one and is strictly greater than the absolute value of any other eigenvalue.
That is,
$$\exists\, \lambda^* > 0,\ V^* > 0,\ \|V^*\|_2 = 1 \ \text{ s.t. } \ A V^* = \lambda^* V^* \quad (V^* \to \text{right eigenvector})$$
$$|\lambda| \le \lambda^* \ \text{ for every eigenvalue } \lambda \text{ of } A$$
$V^*$ is unique up to scaling
The Perron-Frobenius eigenvalue is given by the Collatz-Wielandt formula
$$\lambda^* = \max_{x \ge 0,\, x \ne 0} \ \min_{x_i \ne 0} \frac{[Ax]_i}{x_i} = \min_{x > 0} \ \max_{i} \frac{[Ax]_i}{x_i}$$
This result can be extended to non-negative matrices
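A small numerical check can make the theorem concrete. The sketch below uses plain power iteration (one standard way to find the Perron-Frobenius pair, not something prescribed by the slides) on a hypothetical 3 × 3 positive correlation matrix, and then evaluates the Collatz-Wielandt ratios for an arbitrary positive test vector.

```python
import numpy as np

def perron_vector(A, n_iter=1000, tol=1e-12):
    """Power iteration for the dominant eigenpair of a positive matrix A."""
    x = np.ones(A.shape[0]) / np.sqrt(A.shape[0])  # positive, unit-norm start
    for _ in range(n_iter):
        y = A @ x
        y /= np.linalg.norm(y)
        converged = np.linalg.norm(y - x) < tol
        x = y
        if converged:
            break
    lam = x @ A @ x  # equals the eigenvalue once x is a unit-norm eigenvector
    return lam, x

# Hypothetical correlation matrix with strictly positive entries.
A = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
lam_star, v_star = perron_vector(A)

# Collatz-Wielandt: for any x > 0, min_i [Ax]_i/x_i <= lambda* <= max_i [Ax]_i/x_i.
x = np.array([1.0, 2.0, 3.0])
ratios = (A @ x) / x
lower, upper = ratios.min(), ratios.max()
```

Starting from a positive vector keeps every iterate positive, since a positive matrix maps positive vectors to positive vectors.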
7. PCA in Stock Markets
Difficult to find a financial explanation in higher-order eigenportfolios
The first eigenvector has all the entries positive (good proxy of the market
portfolio)
8. Hierarchical PCA
Partition the stock universe into clusters
Strong beliefs on the “intra-cluster data”
Weak beliefs on the “inter-cluster data”
Define b “benchmark portfolios” associated with clusters
Set I(i) = K if stock i is in cluster K
A correlation matrix Ĉ which incorporates the modeler’s beliefs in a parsimonious fashion
is given by:
$$\hat{C}_{i,j} = \begin{cases} C_{i,j} & \text{if } I(i) = I(j) \\ \beta_i \beta_j \hat{\rho}_{I(i),I(j)} & \text{otherwise.} \end{cases} \tag{7}$$
From the orthogonality of the eigenportfolios in the same sector, we can derive a simple
formula for the regression coefficients:
$$\beta_i = \operatorname{Corr}(X_i, F^{I(i)}) = \sqrt{\lambda_{1,I(i)}} \, V_i^{I(i)} \tag{8}$$
9. If the benchmark portfolio for a cluster is its 1st eigenportfolio, then
$$F^{k} = \frac{1}{\sqrt{\lambda_{1,k}}} \sum_{i : I(i) = k} V_i^{k} X_i \tag{9}$$
where $X_i$ are the standardized returns, $V^{k}$ is the 1st eigenvector of the correlation
matrix of cluster k (with entries $V_i^{k}$), and $\lambda_{1,k}$ is its 1st eigenvalue.
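Putting (7), (8) and (9) together, a modified correlation matrix can be assembled roughly as follows. This is a sketch under stated assumptions rather than the authors' implementation: the inter-cluster correlation ρ̂ is taken to be the empirical correlation between the cluster benchmark factors, and the function name `hpca_correlation`, the integer cluster labels, and the simulated data are hypothetical.

```python
import numpy as np

def hpca_correlation(C, labels):
    """Build the modified matrix of eq. (7) from a correlation matrix and cluster labels."""
    N = C.shape[0]
    clusters = np.unique(labels)            # sorted cluster identifiers
    beta = np.zeros(N)
    W = np.zeros((N, len(clusters)))        # column c maps standardized returns to factor F^k
    for c, k in enumerate(clusters):
        idx = np.flatnonzero(labels == k)
        lam, vec = np.linalg.eigh(C[np.ix_(idx, idx)])  # PCA within cluster k
        lam1, v1 = lam[-1], vec[:, -1]
        if v1.sum() < 0:                    # fix the sign of the eigenvector
            v1 = -v1
        beta[idx] = np.sqrt(lam1) * v1      # eq. (8)
        W[idx, c] = v1 / np.sqrt(lam1)      # eq. (9): F^k = sum_i W[i, c] * X_i
    # Assumption: rho_hat is the correlation between cluster factors.
    # Each factor has unit variance, so W' C W is already a correlation matrix.
    rho = W.T @ C @ W
    pos = np.searchsorted(clusters, labels)  # cluster position of each stock
    C_hat = np.outer(beta, beta) * rho[np.ix_(pos, pos)]
    same = labels[:, None] == labels[None, :]
    C_hat[same] = C[same]                    # keep the original intra-cluster blocks
    return C_hat, beta, rho

# Hypothetical data: 6 stocks in 2 clusters, with market and sector factors.
rng = np.random.default_rng(1)
labels = np.array([0, 0, 0, 1, 1, 1])
market = rng.normal(size=(1000, 1))
sector = rng.normal(size=(1000, 2))
returns = 0.5 * market + 0.7 * sector[:, labels] + rng.normal(size=(1000, 6))
C = np.corrcoef(returns, rowvar=False)

C_hat, beta, rho = hpca_correlation(C, labels)
```

Only the inter-cluster blocks are replaced, so the strong beliefs about intra-cluster data are left untouched while the inter-cluster structure is reduced to one number per pair of clusters.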
Figure: block structure of the clusters' modified matrix, with $C_{i,j}$ in the diagonal (intra-cluster) blocks and $\beta_i \beta_j \rho_{I(i),I(j)}$ in the off-diagonal (inter-cluster) blocks.
10. The HPCA matrix presents a clearer, more distinct block structure
Darker areas → lower correlation (inter-cluster)
Lighter areas → higher correlation (intra-cluster)
Figure: Original (left) and modified (right) correlation matrices estimated from the returns of
the S&P 500 constituents with GICS clusters, from 2010 to 2019.
11. Examples: Global Stocks
Analysis of four major global equity markets: United States, Europe, Emerging markets
and China.
Sector (GICS)             USA   Europe   Emerging Mkts.   China
Communication              24       42               59      10
Consumer Discretionary     64       66              114      84
Consumer Staples           32       43               92      23
Energy                     28       22               56       5
Financials                 64      109              277      19
Health Care                60       54               54      50
Industrials                69      115               89      97
Information Technology     69       33              121      88
Materials                  28       49              121      81
Real Estate                31       33               44      22
Utilities                  28       30               46      19
Total                     497      596             1049     498
Table: Number of companies considered in the study, by GICS sector and region.
12. US Stocks
The cumulative explained variance curve of PCA rises faster than that of HPCA
For a given number of components, HPCA shows lower concentration of explained variance
HPCA is a less greedy algorithm
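The curves compared on this slide are cumulative sums of eigenvalues divided by their total. The sketch below shows the computation on two hypothetical 4 × 4 matrices, one with uniform correlation and one with weaker correlation between two blocks; these toy matrices are not the study's PCA and HPCA matrices, and only illustrate how weaker inter-cluster correlation lowers the share captured by the first component.

```python
import numpy as np

def cumulative_explained_variance(C):
    """Cumulative share of total variance explained by the leading components of C."""
    eigvals = np.linalg.eigvalsh(C)[::-1]  # decreasing order
    return np.cumsum(eigvals) / eigvals.sum()

# Hypothetical matrix with uniform correlation 0.5 between all pairs.
uniform = np.full((4, 4), 0.5)
np.fill_diagonal(uniform, 1.0)

# Hypothetical block matrix: 0.5 within each pair, 0.1 across the two pairs.
blocks = np.array([[1.0, 0.5, 0.1, 0.1],
                   [0.5, 1.0, 0.1, 0.1],
                   [0.1, 0.1, 1.0, 0.5],
                   [0.1, 0.1, 0.5, 1.0]])

curve_uniform = cumulative_explained_variance(uniform)  # first component explains 62.5%
curve_blocks = cumulative_explained_variance(blocks)    # first component explains 42.5%
```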
13. US Stocks
Clear higher-order portfolios
Single-cluster portfolios
Multi-cluster portfolios (i.e., portfolios of portfolios)
14. China Stocks
Compared to the other markets, the spread between the PCA and HPCA curves is greater
Lower level of diversity
15. China Stocks
Clear higher-order eigenportfolios
Almost all are multi-cluster portfolios (i.e., portfolios of portfolios)
16. European Stocks
Similar behavior to the case of the US
The analysis with European countries (instead of GICS) delivered similar behavior
17. European Stocks
Mix between single- and multi-cluster portfolios
18. Emerging Markets
Behavior is more similar to the case of China than the case of the US
The analysis with Emerging countries (instead of GICS) delivered similar behavior
19. Emerging Markets
Most of the portfolios are multi-cluster
20. Statistically Generated Clusters
Stocks belonging to the same GICS sector or country share common factors that capture,
to some extent, their joint dynamics
They are very easy to interpret
However, they have some shortcomings
Stock markets and their components change almost continuously
For risk and portfolio management, practitioners seek a trade-off between stable and
adaptive clusters
- It goes against the diversification of a seemingly diversified strategy
- Trading strategies such as sector/country rotation may be affected
21. Description of the Algorithm
Given M eigenvectors of PCA, we construct $2^M$ clusters
Assign each stock i a {+1, −1} M-vector whose m-th entry is the sign of stock i's entry in the
m-th eigenvector (each sign vector represents a quadrant)
The new space is divided into $2^M$ quadrants
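The sign-based assignment described above can be sketched as follows. The example is an illustration under assumptions not stated on the slide: zero entries are treated as +1, each sign pattern is encoded as an integer label in [0, 2^M), and the function name `sign_clusters` and the simulated returns are hypothetical.

```python
import numpy as np

def sign_clusters(C, M):
    """Cluster stocks by the sign pattern of their entries in the top M eigenvectors."""
    eigvals, eigvecs = np.linalg.eigh(C)   # ascending eigenvalues
    top = eigvecs[:, ::-1][:, :M]          # N x M matrix of the top M eigenvectors
    signs = np.where(top >= 0, 1, -1)      # {+1, -1} M-vector per stock (0 treated as +1)
    bits = (signs > 0).astype(int)
    labels = bits @ (1 << np.arange(M))    # encode each sign pattern as an integer
    return signs, labels

# Hypothetical data: 6 stocks with a market factor and two sector factors.
rng = np.random.default_rng(2)
sector_of = np.array([0, 0, 0, 1, 1, 1])
market = rng.normal(size=(1000, 1))
sector = rng.normal(size=(1000, 2))
returns = 0.5 * market + 0.7 * sector[:, sector_of] + rng.normal(size=(1000, 6))
C = np.corrcoef(returns, rowvar=False)

signs, labels = sign_clusters(C, M=2)      # at most 2**2 = 4 clusters
```

Eigenvectors are only defined up to sign, so flipping one of them renames the clusters but leaves the partition of the stocks unchanged. Not every one of the 2^M sign patterns needs to be occupied.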
22. Results
As in the US case, the cumulative explained variance curve of PCA rises faster
The difference here is even larger (here it is for more than 2600 stocks!)
24. Application to Portfolio Management
First eigenportfolio and the market portfolio (SPY)
First eigenportfolio does a good job of tracking SPY!
25. Statistical Clustering Factor Model
The model correlation matrix using K components of HPCA reads
$$C = \hat{O} \hat{\Lambda} \hat{O}^{\top} + \zeta^2 \tag{10}$$
$$\zeta_j^2 = \sum_{i=K+1}^{N} \lambda^{(i)} (O_j^{(i)})^2 \tag{11}$$
where $\zeta^2$ is the (uncorrelated) idiosyncratic risk. The factor expected returns are
$$E(r_j) = \sum_{i=1}^{K} \beta_j^{(i)} F^{(i)} + \epsilon_j \tag{12}$$
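Equations (10) and (11) amount to keeping the top K eigenpairs and pushing the remaining variance onto the diagonal. The sketch below applies this to a generic correlation matrix; in the presentation the input is the HPCA matrix, and the function name `factor_model_correlation` and the simulated data are assumptions made for the example.

```python
import numpy as np

def factor_model_correlation(C, K):
    """K-factor model matrix of eq. (10) with diagonal idiosyncratic risk of eq. (11)."""
    lam, O = np.linalg.eigh(C)
    lam, O = lam[::-1], O[:, ::-1]                   # decreasing eigenvalue order
    systematic = (O[:, :K] * lam[:K]) @ O[:, :K].T   # O_K Lambda_K O_K'
    zeta2 = (O[:, K:] ** 2) @ lam[K:]                # zeta_j^2 = sum_{i>K} lam_i * O_ji^2
    return systematic + np.diag(zeta2), zeta2

# Hypothetical data: 6 stocks with one common factor.
rng = np.random.default_rng(3)
market = rng.normal(size=(800, 1))
returns = 0.6 * market + rng.normal(size=(800, 6))
C = np.corrcoef(returns, rowvar=False)

model, zeta2 = factor_model_correlation(C, K=2)
```

By construction the model matrix has the same diagonal as C, because the variance dropped from the systematic part is added back as idiosyncratic risk.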
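The number K of factors is chosen using the effective rank (eRank). Let the SVD of the T × N
matrix R of standardized log-returns be
$$R = U D V \tag{13}$$
where D is the diagonal matrix of singular values $\sigma_1, \ldots, \sigma_Q$. The associated
probability distribution is
$$P_j = \frac{\sigma_j}{\|\sigma\|_1} \quad \text{for } j = 1, \ldots, Q, \quad Q = N \wedge T \tag{14}$$
where $\|\cdot\|_1$ is the $L^1$ norm. Then, the effective rank is defined as
$$\operatorname{eRank}(R) = \exp\{H(P_1, P_2, \ldots, P_Q)\} \tag{15}$$
where $H(P_1, \ldots, P_Q) = -\sum_{j=1}^{Q} P_j \log P_j$ is the Shannon entropy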
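The effective rank in (13) to (15) takes only a few lines to compute. In this minimal sketch the function name `erank` is hypothetical, the natural logarithm is used for the entropy, and zero singular values are dropped under the usual convention that 0 log 0 = 0.

```python
import numpy as np

def erank(R):
    """Effective rank: exponential of the Shannon entropy of normalized singular values."""
    s = np.linalg.svd(R, compute_uv=False)  # singular values sigma_1..sigma_Q, Q = min(T, N)
    p = s / s.sum()                         # eq. (14): P_j = sigma_j / ||sigma||_1
    p = p[p > 0]                            # convention: 0 * log(0) = 0
    H = -(p * np.log(p)).sum()              # Shannon entropy (natural log)
    return float(np.exp(H))                 # eq. (15)

# Two sanity checks on matrices whose effective rank is known.
full_rank = erank(np.eye(4))                                    # equal singular values
rank_one = erank(np.outer(np.arange(1.0, 6.0), np.ones(3)))     # one nonzero singular value
```

The effective rank lies between 1 and Q. It equals Q when all singular values are equal and approaches 1 when a single singular value dominates, which is why it serves as a continuous proxy for the number of meaningful factors.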
26. Mean-variance optimization:
- Monthly rebalancing using a 6-month estimation window
- The statistical-cluster-based HPCA factor model (blue) outperforms the GICS-based HPCA factor
model (red) and the shrinkage estimator (salmon)
See [Fabozzi, Focardi and Kolm, 2010], [Fabozzi, Kolm and Focardi, 2006], [?].
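The slide does not spell out the optimization problem, so the following is only a generic illustration of one mean-variance step: fully invested weights proportional to Σ⁻¹μ, with no constraints and a zero risk-free rate. The function name `mean_variance_weights` and the numbers for μ and Σ are hypothetical; in a backtest such as the one described, Σ would be re-estimated at each monthly rebalance from the trailing 6-month window.

```python
import numpy as np

def mean_variance_weights(mu, Sigma):
    """Fully invested weights proportional to inv(Sigma) @ mu (assumed objective)."""
    raw = np.linalg.solve(Sigma, mu)  # solve Sigma w = mu without forming the inverse
    return raw / raw.sum()            # rescale so the weights sum to one

# Hypothetical expected returns and covariance matrix for three assets.
mu = np.array([0.05, 0.07, 0.06])
Sigma = np.array([[0.0400, 0.0100, 0.0000],
                  [0.0100, 0.0900, 0.0200],
                  [0.0000, 0.0200, 0.0625]])

w = mean_variance_weights(mu, Sigma)
```

The quality of the covariance estimate drives the result, which is where a factor-model estimate of Σ can replace the raw sample covariance.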
27. References
Avellaneda, M. (2019) Hierarchical PCA and Applications to Portfolio Management.
NYU Courant Working Paper.
Fabozzi, F.J., Focardi, S.M. and Kolm, P.N. (2010) Quantitative Equity Investing:
Techniques and Strategies. Hoboken, NJ: John Wiley & Sons, Inc.
Fabozzi, F.J., Kolm, P.N. and Focardi, S.M. (2006) Robust Financial Modeling of
the Equity Market: From CAPM to Cointegration. Hoboken, NJ: John Wiley &
Sons, Inc.
Kakushadze, Z. (2015) Heterotic Risk Models. Wilmott Magazine 2015(80): 40-55.
Kakushadze, Z. and Yu, W. (2017) Statistical Risk Models. Journal of Investment
Strategies 6(2): 1-40.
Lopez de Prado, M. (2016) Building Diversified Portfolios that Outperform
Out-of-Sample. Journal of Portfolio Management 42(4): 59-69.
Lopez de Prado, M. (2019) Ten Applications of Financial Machine Learning.
Available at SSRN.