RIC-NN: A Robust Transferable Deep Learning Framework for Cross-sectional Investment Strategy

The views expressed here are our own and do not necessarily reflect the views of Nomura Asset Management.
Any errors and inadequacies are our own.
IEEE Data Science and Advanced Analytics (DSAA) 2020
October 9th, 2020
Kei Nakagawa1, Masaya Abe1 and Junpei Komiyama2
1. Nomura Asset Management Co., Ltd.
2. New York University

Agenda
1. Introduction and Motivation
2. Data and Methodology
3. Experimental Results
4. Conclusion


Motivation – Stock Return Prediction
・ Many studies apply machine learning to stock return prediction.
・ Those studies are mostly based on time series prediction and are not practical.
・ Stock returns follow a "random walk", and predicting them with time series prediction has long been difficult.
Therefore, we propose a practical framework for stock return forecasting by cross-sectional prediction with deep learning.

Motivation – Stock Return Prediction
[Figure: the investment universe (Stock A, Stock B, ..., 1 ≤ n ≤ N) laid out against time (..., T-1, T, T+1). Time series: buy a stock with an absolutely high return. Cross section (orthogonal to the time series): predict in cross-section and buy stocks with relatively high scores based on certain criteria.]

Cross-sectional Investment Strategy
Cross-sectional Prediction
・ The criteria for describing the score are called factors in finance.
・ More than 300 factors had been reported by 2012 [Harvey et al. 2017].
・ In practice, we predict future relative stock returns (scores) by combining various factors.

・ Cross-sectional Investment Strategy in practice
[Figure: factor candidates (ROE, 1 Month Return, ...) are classified by humans into Value, Growth, Quality and Momentum. Linear regression combines them into a score that measures relative goodness. The score is compared with relative stock returns using the rank IC (Spearman correlation).]
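The rank IC named on this slide is the Spearman correlation between predicted scores and realized relative returns. The following is a minimal sketch of that calculation, assuming a NumPy-only implementation; the helper names `average_ranks` and `rank_ic` and the toy numbers are illustrative and do not come from the slides.

```python
import numpy as np

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks within each group of tied values.
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def rank_ic(scores, returns):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return float(np.corrcoef(average_ranks(scores), average_ranks(returns))[0, 1])

# Hypothetical scores for five stocks and their next-month returns.
scores = [0.9, 0.1, 0.5, 0.7, 0.3]
next_returns = [0.04, -0.02, 0.01, 0.00, -0.01]
print(rank_ic(scores, next_returns))
```

Only the ordering of the stocks matters here, which matches the cross-sectional goal of buying relatively high-scoring stocks rather than forecasting return levels.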

・ Challenges in cross-sectional investment strategy
(1) Traditionally, the relationship between factors and returns is assumed to be linear, but it is actually non-linear [Levin 1996].
(2) A few studies use deep learning for cross-sectional investment strategy, but they suffer from overfitting.
(3) In some markets, not enough data can be gathered for deep learning.

RIC-NN: Our methodology
[Figure: overview of the three components.
(1) Multi-factor Deep Learning Approach: stock factors are the input and the score is the output.
(2) Weight Initialization and Stopping: training at time step t is initialized from the model of time step t-1 and is stopped by rank IC instead of by epoch.
(3) Deep Transfer Learning: the model is transferred from the source domain (North America) to the target domain (Asia Pacific).]

RIC-NN: Multi-factor Deep Learning Approach
・ Deep learning for cross-sectional investment strategy
Capture the non-linear relationship between factors and scores
[Figure: factor candidates (ROE, 1 Month Return, ...) are fed into a deep learning model that calculates the relative goodness (score).]

RIC-NN: Weight Initialization and Stopping
・ DL for stock return prediction easily overfits to training data.
✓ Use early stopping to control the fitness to the past data.
・ Epoch-based stopping: risk of overfitting or underfitting because the training speed varies.
・ Use the rank IC (Spearman correlation) as the measure of fitness.
-> intuitive and controllable.
Our Proposed (RIC-NN)
✓ Training at time step t:
Initialization: Use the model of time step t-1 when rank IC is 0.16.
Stopping: the rank IC reaches 0.20.
c.f.: fitness of a good portfolio to future return is around 0.10
[Figure: loss and rank IC against epoch at time steps t-1 and t. The model at time step t starts from the time step t-1 model saved at rank IC 0.16, and training stops when the rank IC reaches 0.20.]
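The stopping and initialization rule on this slide can be sketched as a training loop that checks the in-sample rank IC after every epoch. This is only an illustration under stated assumptions: a linear model trained by gradient descent stands in for the neural network, the data are synthetic, and `train_with_rank_ic_stopping` is a hypothetical name. Only the thresholds 0.16 and 0.20 come from the slide.

```python
import numpy as np

def rank_ic(pred, target):
    """Spearman correlation via ranks (ties are not averaged in this sketch)."""
    rank = lambda a: np.argsort(np.argsort(a))
    return float(np.corrcoef(rank(pred), rank(target))[0, 1])

def train_with_rank_ic_stopping(X, y, w_init, init_ic=0.16, stop_ic=0.20,
                                lr=0.01, max_epochs=1000):
    """Gradient descent on MSE for a linear stand-in model f(x) = x @ w.

    Returns (w_stop, w_snapshot, stop_epoch, snapshot_epoch). w_snapshot is the
    weight vector saved the first time the in-sample rank IC reaches init_ic;
    it would seed training at the next time step.
    """
    w = np.array(w_init, dtype=float)
    w_snapshot, snapshot_epoch = None, None
    for epoch in range(1, max_epochs + 1):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
        ic = rank_ic(X @ w, y)
        if w_snapshot is None and ic >= init_ic:
            w_snapshot, snapshot_epoch = w.copy(), epoch
        if ic >= stop_ic:
            return w, w_snapshot, epoch, snapshot_epoch
    return w, w_snapshot, max_epochs, snapshot_epoch  # threshold not reached

# Synthetic data (assumption): only the first of two factors carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2))
y = X[:, 0] + rng.normal(size=2000)
w0 = np.array([0.0, 5.0])  # deliberately poor starting point
w_stop, w_snap, stop_epoch, snap_epoch = train_with_rank_ic_stopping(X, y, w0)
```

At the next time step, one would pass `w_snap` as `w_init`, so that training resumes from a model that is slightly less fitted than the one used for prediction.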

RIC-NN: Deep Transfer Learning
・ Augment the model using the knowledge of a larger market.
✓ Use transfer learning
[Figure: RIC-NN trained on the North America stock market (source domain) is transferred to RIC-NN for the Asia Pacific stock market (target domain). In both models, stock factors are the input and the score is the output.]
We want to capture the asymmetric structure between the two markets.


Features (Various Factors)
・ We use the 20 factors that are often used in practice.
✓ Calculated for the constituents of each regional index
- MSCI North America Index (NA)
- MSCI Pacific Index (PA)

No.  Feature (Factor)
1    Book-to-market Ratio
2    Earnings-to-price Ratio
3    Dividend Yield
4    Sales-to-price Ratio
5    Cash flow-to-price Ratio
6    Return on Equity
7    Return on Asset
8    Return on Invested Capital
9    Accruals
10   Total Asset Growth Rate
11   Current Ratio
12   Equity Ratio
13   Total Asset Turnover Rate
14   CAPEX Growth Rate
15   EPS Revision (1 month)
16   EPS Revision (3 months)
17   Past Stock Return (1 month)
18   Past Stock Return (12 months)
19   Volatility
20   Skewness

※ Monthly data
Data sources: Compustat, WorldScope, Thomson Reuters, I/B/E/S and EXSHARE.

Features (Various Factors)
・ Cumulative returns of the 20 factors in NA (left-side) and PA (right-side)
・ The return of each factor varies widely over time.

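Problem Formulation
・ We define the problem as a regression problem to minimize MSE.
✓ Approximate function $f(\cdot)$ with the parameter $\boldsymbol{\theta}_{T+1}$ that maps $\boldsymbol{v}_{i,T}$ to $r_{i,T+1}$:
$$f(\boldsymbol{v}_{i,T}; \boldsymbol{\theta}_{T+1}) \rightarrow r_{i,T+1}$$
$\boldsymbol{v}_{i,T}$: Augmented factors
$\boldsymbol{\theta}_{T+1}$: NN weights
$r_{i,T+1}$: Relative stock return
✓ Train the models using the data of the latest 120 time steps (the past 10 years):
$$\mathrm{MSE}_T = \frac{1}{K} \sum_{t'=T-N}^{T-1} \sum_{i \in U_{t'}} \left( r_{i,t'+1} - f(\boldsymbol{v}_{i,t'}; \boldsymbol{\theta}_{T+1}) \right)^2$$
$$N = 120 \ \text{(10 years)}, \qquad K = \sum_{t'=T-N}^{T-1} |U_{t'}|$$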
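The pooled MSE on this slide averages squared errors over every stock in every one of the latest N monthly universes. A small sketch of that bookkeeping follows, assuming the monthly data are held in dictionaries keyed by an integer month index. The function name `pooled_mse`, the data layout and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def pooled_mse(predict, features_by_month, returns_by_month, T, N=120):
    """MSE over months t' = T-N .. T-1, pooling all stocks in each universe.

    features_by_month[t] : array (n_stocks_t, n_features), the augmented factors
    returns_by_month[t]  : array (n_stocks_t,), the scaled relative returns
    Rows of features_by_month[t] and returns_by_month[t + 1] are assumed to
    refer to the same stocks in the same order.
    """
    squared_error_sum, K = 0.0, 0
    for t in range(T - N, T):
        err = returns_by_month[t + 1] - predict(features_by_month[t])
        squared_error_sum += float(np.sum(err ** 2))
        K += len(err)  # K accumulates the universe sizes |U_t'|
    return squared_error_sum / K

# Toy example with two months, one feature, and a trivial predictor.
features = {0: np.array([[1.0], [2.0]]), 1: np.array([[3.0]])}
returns = {1: np.array([1.0, 1.0]), 2: np.array([2.0])}
mse = pooled_mse(lambda V: V[:, 0], features, returns, T=2, N=2)
```

Dividing by K rather than by N means that months with more index constituents contribute more terms, as in the formula.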

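Problem Formulation
・ Given a stock $i$ at month $T$ ($i \in U_T$: the regional index constituents at $T$)
✓ 20 factors: $\boldsymbol{x}_{i,T} \in \mathbb{R}^{20}$
✓ Features: 20 factors and preprocessed factors
Pre-processing & Feature augmentation
$$\boldsymbol{v}_{i,T} = (\boldsymbol{x}_{i,T}, \boldsymbol{x}_{i,T-3}, \ldots, \boldsymbol{x}_{i,T-12}, \boldsymbol{x}_{i,T} /_R\, \boldsymbol{x}_{i,T-3}, \ldots, \boldsymbol{x}_{i,T} /_R\, \boldsymbol{x}_{i,T-12}) \in [0,1]^{180}$$
$$x /_R\, y := 2(x - y)/(|x| + |y|)$$
・ Most factors are updated quarterly, so the lagged terms are taken at quarterly intervals.
・ The $/_R$ terms capture the difference between the present value and the value at each past quarter.
✓ Output variable: scaled one-month-ahead stock return (Score)
Scale to the range [0,1]: $r_{i,T+1} \in [0,1]$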
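The 180-dimensional input on this slide combines five quarterly snapshots of the 20 factors (100 values) with four relative differences against the current snapshot (80 values). The sketch below illustrates that construction under assumptions: the slides state that the features lie in [0,1] but not how they are scaled, so the cross-sectional rank scaling in `rank_scale` is a guess, and all function names are hypothetical.

```python
import numpy as np

def relative_diff(x, y):
    """x /_R y := 2(x - y) / (|x| + |y|), set to 0 where both are 0."""
    denom = np.abs(x) + np.abs(y)
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, 2.0 * (x - y) / safe, 0.0)

def rank_scale(a):
    """Scale each column to [0, 1] by cross-sectional rank (an assumption)."""
    ranks = np.argsort(np.argsort(a, axis=0), axis=0)
    return ranks / max(a.shape[0] - 1, 1)

def augment(x_by_lag):
    """Build the augmented features from {lag: array (n_stocks, 20)}."""
    lags = (0, 3, 6, 9, 12)
    levels = [x_by_lag[lag] for lag in lags]             # 5 x 20 columns
    diffs = [relative_diff(x_by_lag[0], x_by_lag[lag])   # 4 x 20 columns
             for lag in lags[1:]]
    return rank_scale(np.hstack(levels + diffs))         # (n_stocks, 180)

# Synthetic factor values for 50 stocks (illustrative only).
rng = np.random.default_rng(0)
x_by_lag = {lag: rng.normal(size=(50, 20)) for lag in (0, 3, 6, 9, 12)}
v = augment(x_by_lag)
```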

Compared Models
Our Proposed (RIC-NN)
・ Architecture of RIC-NN is quite standard
✓ Fully-connected feedforward neural networks
✓ 6 Hidden layers: { 150 – 150 – 100 – 100 – 50 – 50 }
Dropout rates: (50% – 50% – 30% – 30% – 10% – 10%)
✓ Activation function: ReLU function
✓ RIC-NN (Transfer Learning: TF)
We use the weights of the first four layers that are trained in the source region as the initial weights of the target region.
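The layer sizes, dropout rates and four-layer transfer described on this slide can be written down compactly. The sketch below uses plain NumPy rather than the deep learning library used for the experiments, and several details are assumptions not stated on the slide: He initialization, a linear single-unit output, inverted dropout, and reading "first four layers" as the first four weight matrices. The function names are hypothetical.

```python
import numpy as np

HIDDEN = (150, 150, 100, 100, 50, 50)
DROPOUT = (0.5, 0.5, 0.3, 0.3, 0.1, 0.1)

def init_params(n_inputs=180, seed=0):
    """He-initialized weights for 6 hidden layers plus a 1-unit output layer."""
    rng = np.random.default_rng(seed)
    sizes = (n_inputs,) + HIDDEN + (1,)
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, v, train=False, rng=None):
    """Forward pass: ReLU hidden layers with inverted dropout, linear output."""
    if train and rng is None:
        rng = np.random.default_rng()
    h = v
    for (W, b), rate in zip(params[:-1], DROPOUT):
        h = np.maximum(h @ W + b, 0.0)
        if train:
            keep = rng.random(h.shape) >= rate
            h = h * keep / (1.0 - rate)
    W_out, b_out = params[-1]
    return (h @ W_out + b_out).ravel()

def transfer(source_params, target_params, n_layers=4):
    """Start the target model from the source model's first n_layers layers."""
    copied = [(W.copy(), b.copy()) for W, b in source_params[:n_layers]]
    return copied + list(target_params[n_layers:])

source = init_params(seed=1)   # stands in for a model trained on the source region
target = init_params(seed=2)   # freshly initialized target-region model
merged = transfer(source, target)
```

After the transfer, all layers of `merged` would still be trained on the target region; the copied weights serve only as the starting point.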

Compared Models
・ Other off-the-shelf machine learning models
✓ LASSO Regression (LR)
- scikit-learn: "sklearn.linear_model.Lasso"
✓ Random Forest (RF)
- scikit-learn: "sklearn.ensemble.RandomForestRegressor"
✓ Gradient Boosting Tree (GB)
- xgboost: "XGBRegressor"
✓ Epoch-based Neural Network (NN(Epoch))
- TensorFlow

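Prediction Period
✓ 14 years (168 months, from January 2005 to December 2018)
✓ The model is updated by sliding the window one month ahead, and a forecast is made every month.
[Figure: rolling training windows. For each month $T$, the weights $\boldsymbol{\theta}_T^* = \arg\min_{\boldsymbol{\theta}} \mathrm{MSE}_T$ are trained on the latest 120 sets of features $\boldsymbol{v}_{i,T}$ and scores, and $f(\boldsymbol{v}_{i,T}; \boldsymbol{\theta}_T^*)$ gives the forecast.
- Train on December 1994 to November 2004; features of December 2004 predict January 2005.
- Train on January 1995 to December 2004; features of January 2005 predict February 2005.
- ...
- Train on November 2008 to October 2018; features of November 2018 predict December 2018.]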
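The sliding schedule on this slide can be enumerated with simple month arithmetic. The sketch below reproduces the window boundaries shown in the figure; the function names and the tuple layout are illustrative choices.

```python
def month_index(year, month):
    """Map (year, month) to a running month count."""
    return year * 12 + (month - 1)

def to_year_month(index):
    """Inverse of month_index."""
    return index // 12, index % 12 + 1

def rolling_windows(first_target=(2005, 1), last_target=(2018, 12), n_train=120):
    """Yield (train_start, train_end, feature_month, target_month) per forecast."""
    for target in range(month_index(*first_target), month_index(*last_target) + 1):
        T = target - 1                      # month whose features feed the forecast
        yield (to_year_month(T - n_train),  # first training feature month
               to_year_month(T - 1),        # last training feature month
               to_year_month(T),
               to_year_month(target))

windows = list(rolling_windows())
print(len(windows), windows[0], windows[-1])
```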


Performance Measure
・ Simple portfolio strategies
✓ We make quintile portfolios.
[Figure: the investment universe sorted by relative goodness into quintiles 1 to 5.]
✓ Long Portfolio Strategy
- Buy (Long) the top 1/5 score stocks with equal weighting
- Benchmark: the average return of all stocks
→ Relative performance evaluation
✓ Long-Short Portfolio Strategy
- Buy the top 1/5 score stocks with equal weighting
- Sell the bottom 1/5 score stocks with equal weighting
→ Absolute performance evaluation (No benchmark)
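For a single month, the two strategies reduce to averaging realized returns over the top and bottom score quintiles. A minimal sketch follows, assuming equal weighting as stated on the slide; the function name `quintile_returns` and the toy data are illustrative.

```python
import numpy as np

def quintile_returns(scores, next_returns):
    """Return (long_alpha, long_short) for one month, equal-weighted."""
    scores = np.asarray(scores, dtype=float)
    next_returns = np.asarray(next_returns, dtype=float)
    n_q = len(scores) // 5                    # stocks per quintile
    if n_q == 0:
        raise ValueError("need at least 5 stocks to form quintiles")
    order = np.argsort(scores)                # ascending by score
    top = next_returns[order[-n_q:]].mean()
    bottom = next_returns[order[:n_q]].mean()
    benchmark = next_returns.mean()           # average return of all stocks
    return top - benchmark, top - bottom

# Ten hypothetical stocks: scores 1..10 and their realized next-month returns.
scores = np.arange(1, 11)
rets = [-0.05, -0.03, -0.02, -0.01, 0.00, 0.01, 0.02, 0.03, 0.04, 0.06]
long_alpha, long_short = quintile_returns(scores, rets)
```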

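Performance Measure
・ We use the following measures for the Long (resp. Long-Short) portfolio.
✓ Alpha (Return):
$$\text{Alpha (Return)} := \prod_{t=1}^{T} (1 + r_t)^{12/T} - 1$$
✓ TE (Risk):
$$\text{TE (Risk)} := \sqrt{\frac{12}{T-1} \sum_{t=1}^{T} (r_t - \mu_r)^2}$$
✓ IR (R/R):
$$\text{IR (R/R)} := \text{Alpha (Return)} / \text{TE (Risk)}$$
✓ MaxDD:
$$\text{MaxDD} := \min_{k \in [1,T]} \left( 0, \frac{W_k^{Port}}{\max_{j \in [1,k]} W_j^{Port}} - 1 \right)$$
$r_t$: portfolio return – benchmark return
$\mu_r$: Average of $r_t$
$W_k^{Port}$: Cumulative return of the portfolio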
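The four measures on this slide are straightforward to compute from a monthly return series. The sketch below follows the formulas as written, with one assumption: the cumulative return W is taken to be the compounded wealth, the running product of (1 + r). Function names and the sample series are illustrative.

```python
import numpy as np

def annualized_return(r):
    """Alpha (Return): compounded monthly returns scaled to one year."""
    r = np.asarray(r, dtype=float)
    return float(np.prod(1.0 + r) ** (12.0 / len(r)) - 1.0)

def annualized_risk(r):
    """TE (Risk): annualized sample standard deviation of monthly returns."""
    r = np.asarray(r, dtype=float)
    return float(np.sqrt(12.0 / (len(r) - 1) * np.sum((r - r.mean()) ** 2)))

def information_ratio(r):
    """IR (R/R): annualized return divided by annualized risk."""
    return annualized_return(r) / annualized_risk(r)

def max_drawdown(portfolio_returns):
    """MaxDD: largest decline of cumulative wealth from its running maximum."""
    wealth = np.cumprod(1.0 + np.asarray(portfolio_returns, dtype=float))
    running_max = np.maximum.accumulate(wealth)
    return float(min(0.0, np.min(wealth / running_max - 1.0)))

# Hypothetical three-month return series.
r = [0.10, -0.50, 0.10]
print(annualized_return(r), annualized_risk(r), information_ratio(r), max_drawdown(r))
```

For the Long portfolio, `r` would be the portfolio return minus the benchmark return when computing Alpha, TE and IR.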

Experimental Results (1/3)
(LR is the linear model; RF, GB, NN(Epoch) and RIC-NN are nonlinear.)

Long, MSCI North America
             LR       RF       GB       NN(Epoch)  RIC-NN   RIC-NN (TF from PA)
Alpha [%]    0.62     0.79     0.87     0.82       1.23     1.20
TE [%]       5.40     5.14     4.36     4.48       4.14     4.43
IR           0.11     0.15     0.20     0.18       0.30     0.27
MaxDD [%]    -21.84   -24.57   -18.10   -17.41     -14.37   -20.57

Long, MSCI Pacific
             LR       RF       GB       NN(Epoch)  RIC-NN   RIC-NN (TF from NA)
Alpha [%]    5.35     3.79     5.59     4.34       5.25     5.78
TE [%]       5.17     5.75     5.00     4.18       4.20     3.95
IR           1.04     0.66     1.12     1.04       1.25     1.46
MaxDD [%]    -11.53   -11.43   -8.86    -9.37      -7.51    -3.37

Long-Short, MSCI North America
             LR       RF       GB       NN(Epoch)  RIC-NN   RIC-NN (TF from PA)
Return [%]   2.24     1.71     2.29     2.10       3.86     2.16
Risk [%]     10.90    11.51    8.62     9.47       7.85     9.52
R/R          0.21     0.15     0.27     0.22       0.49     0.23
MaxDD [%]    -34.73   -42.21   -34.35   -34.49     -21.26   -39.35

Long-Short, MSCI Pacific
             LR       RF       GB       NN(Epoch)  RIC-NN   RIC-NN (TF from NA)
Return [%]   10.27    7.78     10.79    8.52       9.81     10.95
Risk [%]     9.23     9.65     8.68     7.78       7.83     7.14
R/R          1.11     0.81     1.24     1.10       1.25     1.53
MaxDD [%]    -18.07   -18.66   -13.92   -19.74     -11.06   -8.89

・ NA: RIC-NN without transfer learning performed best.
・ PA: RIC-NN with transfer learning performed best.
・ NA as a source domain enhances the performance of PA, but not vice versa.

・ While NN at epoch 50 performs better in NA, NN at epoch 60 performs better in PA.
・ NN(Epoch) is very sensitive to the choice of the epoch.
・ RIC-NN outperforms epoch-based stopping: rank IC controls the fitness of the stock prediction models consistently.

Long, MSCI North America
             RIC-NN   NN(Epoch) by number of epochs
                      40       50       56*      60       80
Alpha [%]    1.23     0.18     1.48     0.82     1.25     0.70
TE [%]       4.14     4.52     4.35     4.48     4.49     4.14
IR           0.30     0.04     0.34     0.18     0.28     0.17
MaxDD [%]    -14.37   -22.67   -13.48   -17.41   -20.98   -15.94

Long, MSCI Pacific
             RIC-NN   NN(Epoch) by number of epochs
                      40       46*      50       60       80
Alpha [%]    5.25     4.13     4.34     4.28     4.52     2.99
TE [%]       4.20     4.36     4.18     4.73     4.34     4.06
IR           1.25     0.95     1.04     0.90     1.04     0.74
MaxDD [%]    -7.51    -8.08    -9.37    -7.16    -7.45    -7.52

Long-Short, MSCI North America
             RIC-NN   NN(Epoch) by number of epochs
                      40       50       56*      60       80
Return [%]   3.86     0.67     3.24     2.10     2.02     3.10
Risk [%]     7.85     9.08     10.06    9.47     9.05     7.73
Return/Risk  0.49     0.07     0.32     0.22     0.22     0.40
MaxDD [%]    -21.26   -40.09   -26.20   -34.49   -31.62   -23.47

Long-Short, MSCI Pacific
             RIC-NN   NN(Epoch) by number of epochs
                      40       46*      50       60       80
Return [%]   9.81     8.89     8.52     8.97     9.78     6.15
Risk [%]     7.83     7.63     7.78     8.05     7.73     7.18
Return/Risk  1.25     1.16     1.10     1.11     1.26     0.86
MaxDD [%]    -11.06   -13.07   -19.74   -12.17   -14.70   -13.44

*These epochs are chosen so that the rank IC reaches 0.20 during the training of the first time step.

・ We compare the performance of RIC-NN with major funds where the investments involve decision-making by human experts.
・ We select these funds by querying Bloomberg fund screening search with the following conditions:
Fund Asset Class Focus: Equity Asset Class
Fund Geographical Focus: North America Region (resp. Asian Pacific Region)
Fund Type: Open-End-Funds
Currency Base: US dollar
Market Cap Focus: Large-cap and Mid-cap focus
・ We select the top 5 funds in terms of total assets and calculate the average total return series of these funds, including the trust fees.

Long, MSCI North America
             5 Funds (average)   RIC-NN   RIC-NN (After Cost Deduction)
Return [%]   5.90                9.09     7.79
Risk [%]     14.91               17.78    17.78
R/R          0.40                0.51     0.44

Long, MSCI Pacific
             5 Funds (average)   RIC-NN   RIC-NN (After Cost Deduction)
Return [%]   7.88                12.08    9.44
Risk [%]     17.58               17.23    17.23
R/R          0.45                0.70     0.55

・ RIC-NN after transaction costs outperformed the top 5 funds average.

Related Work
・ Machine learning methods in finance
✓ Survey: Bahrammirzaee (2010), Cavalcante et al. (2016)
→ Most studies are time series prediction
・ Neural Networks for cross-sectional investment strategy
✓ Classical Neural Networks (Epoch): Levin (1996)
✓ Deep Neural Networks (Epoch): Abe and Nakayama (2018), Nakagawa et al. (2018, 2019)
However, as we confirmed, epoch-based networks are very sensitive to the choice of the epoch.

Conclusion
・ We have proposed a new cross-sectional stock return prediction framework called RIC-NN by introducing three practical ideas:
(1) A nonlinear multi-factor approach is better than a linear approach.
(2) Rank IC-based stopping outperforms epoch-based stopping.
(3) Multi-region transfer learning works well.
・ RIC-NN gives better portfolio returns and better control of the fitness of the model to the past dataset.
