In this talk, Metodi Nikolov, a Quantitative Researcher, reviews, without being exhaustive, the use of factor models in finance, from the simplest single-factor linear regression models through latent variables and beyond. The focus is not solely on stocks but also on exploring other data types. The hope is to give listeners an appreciation for the different ways the models can be applied.
[Data Meetup] Data Science in Finance - Factor Models in Finance
1. Factor Models in Finance - An Overview
Metodi Nikolov
Data Science in Finance – MeetUp
2019
5. Goals for today
1. Show you which models and tools from Data Science are used in Finance, with emphasis on Factor Models
2. Will not aim for extensive reviews
3. Will not aim to exhaust all models/tools
4. Will provide you with search keywords
7. Time-Series Factor Models Cross-Sectional Factor Models Other Factor Model types Model fitting
Time-Series Factor Models
11. Capital Asset Pricing Model
$r - r_f = \alpha + \beta (R_m - r_f) + \varepsilon$
• Several concepts that are quite important to finance people: the market (and its return), the risk-free rate, excess return, alpha, beta
• William Sharpe, John Lintner, Jack Treynor, Jan Mossin and Fischer Black, building on work by Harry Markowitz
• CAPM as published takes expectations and assumes $\alpha = 0$. Unfortunately, there is little evidence that this is so (or that the model works as a whole).
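As a rough illustration of how alpha and beta are estimated in practice, here is a minimal sketch that regresses a stock's excess return on the market's excess return. All data are synthetic, and the parameter values (`true_alpha`, `true_beta`, the return volatilities) are assumptions made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly data (illustrative assumptions, not real market data).
n_obs = 240
rf = np.full(n_obs, 0.002)                  # risk-free rate per period
rm = rf + rng.normal(0.006, 0.04, n_obs)    # market return
true_alpha, true_beta = 0.001, 1.3
r = rf + true_alpha + true_beta * (rm - rf) + rng.normal(0.0, 0.01, n_obs)

# Regress the stock's excess return on the market's excess return.
y = r - rf
X = np.column_stack([np.ones(n_obs), rm - rf])  # intercept column gives alpha
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"alpha = {alpha_hat:.4f}, beta = {beta_hat:.3f}")
```

With real data, `r`, `rm` and `rf` would be replaced by observed return series over the same dates.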
16. Fama-French Three Factor Model
$r - r_f = \alpha + \beta (R_m - r_f) + \beta_s \mathrm{SMB} + \beta_v \mathrm{HML} + \varepsilon$
1. Three-factor model (so writing down an estimate is more involved)
2. SMB (Small minus Big) represents the size of the company (so how big you are affects your stock's return)
3. HML (High minus Low) represents whether the stock is growth or value (via the book-to-price ratio)
Kenneth French publishes data about the factors: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
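With several factors the estimate is no longer a one-line ratio, but it is still an ordinary multiple regression. The sketch below assumes the three factor series are already available as columns of an array; here they are synthetic placeholders, and `fit_factor_model` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def fit_factor_model(excess_returns, factors):
    """OLS fit of excess returns on factor returns; returns (alpha, betas)."""
    X = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    return coef[0], coef[1:]

rng = np.random.default_rng(1)
n_obs = 360

# Columns: market excess return, SMB, HML (synthetic stand-ins for real data).
factors = rng.normal([0.005, 0.002, 0.003], [0.045, 0.03, 0.03], size=(n_obs, 3))
true_betas = np.array([1.1, 0.4, -0.2])
excess = 0.0005 + factors @ true_betas + rng.normal(0.0, 0.015, n_obs)

alpha_hat, betas_hat = fit_factor_model(excess, factors)
print("alpha:", alpha_hat, "betas (market, SMB, HML):", betas_hat)
```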
22. Fama-French factor construction
1. Take a big enough sample of stocks from the market in question
2. Sort them by market capitalization
3. Form two portfolios: one from the stocks in the bottom 10% (call them Small), one from the stocks in the top 10% (call them Big)
4. The excess return of portfolio Small over portfolio Big is the return of the SMB factor
Repeat the same for HML (just sort by book-to-price ratio). The 1992 paper by Fama and French is called "The Cross-Section of Expected Stock Returns".
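The construction steps above can be sketched for a single period as follows. The ten stocks and their values are made up, the portfolios are equal-weighted (an assumption, since the slides do not say how the portfolios are weighted), and `long_short_factor` is a hypothetical helper name.

```python
import numpy as np

def long_short_factor(sort_key, returns, quantile=0.10, low_minus_high=True):
    """One-period factor return from equal-weighted extreme portfolios.

    Stocks at or below the `quantile` cut of `sort_key` form the low portfolio,
    stocks at or above the `1 - quantile` cut form the high portfolio.
    """
    lo_cut, hi_cut = np.quantile(sort_key, [quantile, 1.0 - quantile])
    low = returns[sort_key <= lo_cut].mean()
    high = returns[sort_key >= hi_cut].mean()
    return low - high if low_minus_high else high - low

# Made-up cross-section of ten stocks for one period.
market_cap = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
book_to_price = np.array([0.9, 0.2, 0.5, 1.5, 0.3, 0.7, 0.1, 1.1, 0.4, 0.6])
returns = np.array([0.05, 0.04, 0.03, 0.02, 0.02, 0.01, 0.01, 0.0, 0.0, -0.01])

smb = long_short_factor(market_cap, returns, low_minus_high=True)      # Small minus Big
hml = long_short_factor(book_to_price, returns, low_minus_high=False)  # High minus Low
print("SMB:", smb, "HML:", hml)
```

Repeating this for every period gives the factor return series used on the right-hand side of the regression.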
23. Cross-Sectional Factor Models
27. Cross-Sectional Models
• So far, we have assumed that we observe the factors and estimate the loadings.
• Cross-sectional models turn this on its head: we assume we observe the loadings and estimate the factors
• Loadings are a stock's fundamental data: how big the company is, how much debt it has, where it operates, etc.
• What we estimate are the common returns driving all stock returns.
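To make the reversal of roles concrete, the following sketch runs one regression per period across stocks: the loadings matrix `B` is treated as known and the factor returns for each period are the estimated coefficients. The data, dimensions and plain (unweighted) least squares are all assumptions for illustration; commercial models typically add weighting and further steps.

```python
import numpy as np

rng = np.random.default_rng(2)
n_stocks, n_factors, n_periods = 500, 3, 12

# Known loadings (exposures), e.g. standardized size, leverage, a region score.
B = rng.normal(size=(n_stocks, n_factors))

# Unobserved factor returns that generate the synthetic stock returns.
true_f = rng.normal(0.0, 0.02, size=(n_periods, n_factors))
R = true_f @ B.T + rng.normal(0.0, 0.05, size=(n_periods, n_stocks))

# One cross-sectional OLS per period: r_t = B f_t + eps_t.
# lstsq handles all periods at once when each period is a column of the right-hand side.
f_hat = np.linalg.lstsq(B, R.T, rcond=None)[0].T   # shape (n_periods, n_factors)

print("largest estimation error:", np.max(np.abs(f_hat - true_f)))
```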
33. Cross-Sectional Models
• Used for more than one thing:
  • risk model
  • alpha model
• The fitting of the model is usually done in steps
• Usually done for stocks (and selling the models is a business in and of itself), but hardly exclusive to them
• Depending on the selection of stocks, a model could be general or particular: Global Equities vs Regional vs Country
34. Other Factor Model types
38. Constrained Factor Model
Mutual fund managers have a mandate to invest in different asset classes (US Equities, Intl Equities, Bonds) within certain ranges. We would like to know how a fund is actually allocated across them.
The solution is usually called Returns-Based Style Analysis. In effect, this is a constrained factor model:
$R = \alpha + \sum_i X_i \beta_i + \varepsilon$
$\beta_i \ge 0, \quad \sum_i \beta_i = 1$
For the analysis to be interpretable, the factors ($X$) should be investable and representative of the respective market. A lot of work goes into their selection.
Because of the constraints there is no closed-form solution, so an optimizer becomes a necessity.
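One possible way to solve the constrained problem is projected gradient descent: take a least-squares gradient step, then project the weights back onto the set of non-negative weights that sum to one. This is only a sketch under assumed synthetic data; `project_to_simplex` and `style_weights` are hypothetical names, and a general-purpose constrained optimizer would do the same job.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = k[u - (css - 1.0) / k > 0][-1]
    theta = (css[rho - 1] - 1.0) / rho
    return np.maximum(v - theta, 0.0)

def style_weights(fund_returns, index_returns, n_iter=5000):
    """Least squares with beta >= 0 and sum(beta) = 1, by projected gradient."""
    # Demeaning removes the unconstrained intercept alpha from the problem.
    y = fund_returns - fund_returns.mean()
    X = index_returns - index_returns.mean(axis=0)
    step = 1.0 / np.linalg.eigvalsh(X.T @ X)[-1]   # 1 / largest eigenvalue
    beta = np.full(X.shape[1], 1.0 / X.shape[1])   # start from equal weights
    for _ in range(n_iter):
        beta = project_to_simplex(beta - step * (X.T @ (X @ beta - y)))
    alpha = fund_returns.mean() - index_returns.mean(axis=0) @ beta
    return alpha, beta

rng = np.random.default_rng(3)
n_obs = 250
# Synthetic index returns standing in for US Equities, Intl Equities, Bonds.
indices = rng.normal([0.006, 0.005, 0.002], 0.02, size=(n_obs, 3))
true_w = np.array([0.6, 0.3, 0.1])
fund = 0.0002 + indices @ true_w + rng.normal(0.0, 0.002, n_obs)

alpha_hat, w_hat = style_weights(fund, indices)
print("estimated style weights:", w_hat)
```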
44. Principal Component Analysis
• A dimensionality reduction technique
• Works best when the variables are correlated, e.g. the points of a yield curve (10-20 variables per curve)
• Three principal components are usually sufficient to account for over 90% of the variability.
• These three are usually interpreted as Shift/Slope/Shape
• A nice "gotcha" is which dependence matrix to use: correlation vs covariance
If you would like to play around, the US government provides some data at this address: https://www.treasury.gov/resource-center/data-chart-center/interest-rates/
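The sketch below runs PCA on synthetic yield-curve changes and compares the share of variance explained when the eigen-decomposition is applied to the covariance matrix versus the correlation matrix. The maturities, the three shock shapes and all volatilities are assumptions chosen only to mimic a shift/slope/shape structure; real Treasury data would replace `dy`.

```python
import numpy as np

rng = np.random.default_rng(4)
maturities = np.array([0.25, 0.5, 1, 2, 3, 5, 7, 10, 20, 30])
n_days = 1000

# Synthetic daily yield changes driven by three shocks plus small noise.
x = maturities / maturities.max()
shapes = np.vstack([np.ones_like(x), 1 - 2 * x, 4 * x * (1 - x)])
shocks = rng.normal(0.0, [0.05, 0.02, 0.01], size=(n_days, 3))
dy = shocks @ shapes + rng.normal(0.0, 0.003, size=(n_days, len(x)))

def explained_variance(matrix):
    """Eigenvalues of a symmetric matrix, largest first, as shares of the total."""
    eigvals = np.linalg.eigvalsh(matrix)[::-1]
    return eigvals / eigvals.sum()

ev_cov = explained_variance(np.cov(dy, rowvar=False))
ev_corr = explained_variance(np.corrcoef(dy, rowvar=False))

print("first three components, covariance: ", ev_cov[:3].sum())
print("first three components, correlation:", ev_corr[:3].sum())
```

The covariance version lets high-volatility maturities dominate, while the correlation version treats every maturity equally, so the two can rank and shape the components differently.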
49. Latent Factor Model
• PCA finds a rotation of the space. The number of factors is selected after the fact.
• We can instead start with a number and find that many latent factors: Statistical Factor Analysis or Latent Factor Model
• The resulting factors are difficult to interpret
• Different rotations of the factors could lead to different results
• Applied to yield curves, this usually leads to a split between terms: short term/mid term/long term
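The rotation issue can be seen numerically: in a latent factor model the implied covariance is the loadings times their transpose plus a diagonal of idiosyncratic variances, and any orthogonal rotation of the loadings leaves that covariance unchanged. The sketch below uses arbitrary random loadings as an assumption; it demonstrates the indeterminacy and is not a fitting procedure.

```python
import numpy as np

rng = np.random.default_rng(5)
n_vars, n_factors = 8, 3

L = rng.normal(size=(n_vars, n_factors))        # factor loadings
psi = np.diag(rng.uniform(0.1, 0.5, n_vars))    # idiosyncratic variances

# Any orthogonal matrix Q gives an alternative set of loadings L @ Q.
Q, _ = np.linalg.qr(rng.normal(size=(n_factors, n_factors)))
L_rot = L @ Q

cov_original = L @ L.T + psi
cov_rotated = L_rot @ L_rot.T + psi

print("same implied covariance:", np.allclose(cov_original, cov_rotated))
print("same loadings:", np.allclose(L, L_rot))
```

Since the data cannot distinguish `L` from `L_rot`, the choice of rotation is a modelling decision, which is one reason the factors are hard to interpret.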
51. Models that contain a factor model
• Black-Litterman Model: allows combining market expectations of returns (through current prices) with an individual investor's expectations
• Stambaugh Model: "Analyzing Investments Whose Histories Differ in Length", calculates first and second moments for series that differ in their starting point
52. Model fitting
58. MLE, OLS and WLS
• OLS: form the sum of squares of the residuals ($\varepsilon_i$) and find the $\beta_k$ that make that sum minimal
• nothing is being said about the distribution of the $\varepsilon_i$
• MLE: the $\varepsilon_i$ are i.i.d. Gaussian random variables
• Both methods lead to the same solution, but they are by no means one and the same.
• Calculation of any p-value implicitly assumes a probability model (Gaussian, if nothing else has been stated)
• WLS: used when the residuals are independent but not identically distributed (have different variances)
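A small sketch of the difference between OLS and WLS, using the normal equations directly. The data are synthetic, with two noise regimes (calm and volatile), and the per-observation standard deviations `sigma` are assumed to be known, which is rarely the case in practice.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Independent residuals with two different variances (assumed known here).
sigma = np.where(rng.random(n) < 0.5, 0.1, 2.0)
y = 0.5 + 2.0 * x + rng.normal(0.0, sigma)

# OLS: minimize the plain sum of squared residuals.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# WLS: minimize sum of w_i * residual_i^2 with w_i = 1 / sigma_i^2,
# so noisy observations count for less.
w = 1.0 / sigma**2
beta_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

print("OLS (intercept, slope):", beta_ols)
print("WLS (intercept, slope):", beta_wls)
```

Both estimates target the same coefficients; the weighting mainly reduces the variance of the estimate when the residual variances really do differ.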
62. L1, LASSO, Ridge, Elastic Net
• L1: rather than minimize the sum of squares, minimize the sum of absolute values (L1 norm of the residuals)
• LASSO: minimize the sum of squared residuals, subject to the L1 norm of the betas not exceeding some constant
• Ridge: minimize the sum of squared residuals, subject to the L2 norm of the betas not exceeding some constant
• Elastic Net: combines the LASSO and Ridge constraints
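The sketch below implements Ridge and LASSO in their penalized (Lagrangian) form, which corresponds to the constrained form above for a suitable choice of penalty. Ridge has a closed-form solution; for LASSO a simple coordinate descent with soft-thresholding is used. The data, the penalty values and the function names `ridge` and `lasso` are assumptions for illustration only.

```python
import numpy as np

def ridge(X, y, lam):
    """Minimize ||y - Xb||^2 + lam * ||b||_2^2 (closed form)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def lasso(X, y, lam, n_sweeps=200):
    """Minimize 0.5 * ||y - Xb||^2 + lam * ||b||_1 by coordinate descent."""
    beta = np.zeros(X.shape[1])
    col_sq = (X**2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            # Residual with coordinate j's current contribution added back.
            partial_resid = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_resid
            # Soft-thresholding sets small coefficients exactly to zero.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(7)
n, p = 200, 6
X = rng.normal(size=(n, p))
true_beta = np.array([1.5, 0.0, 0.0, -2.0, 0.0, 0.0])
y = X @ true_beta + rng.normal(0.0, 0.3, n)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_ridge = ridge(X, y, lam=10.0)
beta_lasso = lasso(X, y, lam=50.0)

print("OLS:  ", np.round(beta_ols, 3))
print("Ridge:", np.round(beta_ridge, 3))
print("LASSO:", np.round(beta_lasso, 3))
```

Ridge shrinks all coefficients towards zero, while LASSO can remove factors entirely, which is why it is often used for factor selection.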
66. Robust and Bayesian methods
• Robust methodologies are applicable when outliers are suspected
• M-estimators are one possibility; L1 also has some robust properties
• Alternatively, heavy-tailed distributions for the residuals can be used (Student's t and beyond)
• Such assumptions are often easier to apply in a fully Bayesian setting
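As one example of an M-estimator, the sketch below fits a Huber regression by iteratively reweighted least squares and compares it with OLS on data contaminated by outliers. The tuning constant 1.345, the MAD-based scale estimate and the contamination scheme are conventional or assumed choices, not something taken from the talk.

```python
import numpy as np

def huber_irls(X, y, delta=1.345, n_iter=50):
    """Huber M-estimator via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start from OLS
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust scale from the median absolute deviation (MAD).
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        u = np.abs(resid) / max(scale, 1e-12)
        # Huber weights: 1 for small residuals, delta / u for large ones.
        w = delta / np.maximum(u, delta)
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta

rng = np.random.default_rng(8)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)

# Contaminate the 20 observations with the largest x by a large positive shift.
outliers = np.argsort(x)[-20:]
y[outliers] += 15.0

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_huber = huber_irls(X, y)

print("OLS slope:  ", beta_ols[1])
print("Huber slope:", beta_huber[1])
```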