MACHINE LEARNING, FINANCIAL ENGINEERING, QUANTITATIVE INVESTING
MAR 2018
STEVEN WANG
Quick Take
Machine Learning
Financial Engineering
Quantitative Investing
Takeaway
P-Measure: Machine Learning (ML), Quantitative Investing (QI)
Q-Measure: Financial Engineering (FE)
Machine Learning
What is Machine Learning?
What is the Learning Problem?
Is Learning Feasible?
How to Learn Well?
1.1 What is Machine Learning? - Overview
1.1 What is Machine Learning? - Type
(Diagrams: an agent-environment loop with state, action and reward, illustrating reinforcement learning; a neural network with an input layer, two hidden layers and an output layer, illustrating deep learning.)
Supervised Learning, Unsupervised Learning, Semi-supervised Learning, Reinforcement Learning, Deep Learning
1.2 What is Learning Problem?
Unknown Target Function c: X → Y (the ideal loan approval formula)
Training Examples (x⁽¹⁾, y⁽¹⁾), (x⁽²⁾, y⁽²⁾), …, (x⁽ⁿ⁾, y⁽ⁿ⁾) (historical records of applicants)
Hypothesis Set H = {h₁, h₂, …, h_M} (a set of candidate formulas)
Learning Algorithm A
Final Hypothesis g ≈ c (the learned loan approval formula)
1.3 Is Learning Feasible? - No Free Lunch Theorem
Feature x | Label y | Model A | Model B | Model C
Training data: [0, 0] | 0 | 0 | 0 | 0
Training data: [1, 1] | 1 | 1 | 1 | 1
Test data: [1, 0] | ? | 1 | 0 | 1
Test data: [0, 1] | ? | 0 | 0 | 1
Model A = random guess
Model B = support vector machine
Model C = deep neural network
Is Model C > Model B > Model A?
c(x) = x1 ⋁ x2 : Model C wins
o The 3rd data x = [1, 0], so y = 1 ⋁ 0 = 1
o The 4th data x = [0, 1], so y = 0 ⋁ 1 = 1
c(x) = x1 ⋀ x2 : Model B wins
o The 3rd data x = [1, 0], so y = 1 ⋀ 0 = 0
o The 4th data x = [0, 1], so y = 0 ⋀ 1 = 0
c(x) = x1: Model A wins
o The 3rd data x = [1, 0], so y = 1
o The 4th data x = [0, 1], so y = 0
Model A is as good as Model C! Is there anything left to learn?
Averaged over all targets consistent with the training data, every model has the same expected performance.
Because c is unknown, performance on the training data is not indicative of performance on the test data, yet performance on test data is all that matters in learning.
Can we really learn anything? Learning seems doomed, but …
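To make the averaging argument concrete, here is a minimal Python sketch; the model predictions come from the table above, and the enumeration of consistent targets is the only added machinery:

import itertools

# Training data fixes c on [0,0] and [1,1], so c is free on the two
# test points, giving 2^2 = 4 targets consistent with the training data.
test_points = [(1, 0), (0, 1)]
models = {  # each model's predictions on the two test points (from the table)
    "A (random guess)": {(1, 0): 1, (0, 1): 0},
    "B (SVM)":          {(1, 0): 0, (0, 1): 0},
    "C (deep net)":     {(1, 0): 1, (0, 1): 1},
}
for name, pred in models.items():
    errs = [sum(pred[x] != c_label for x, c_label in zip(test_points, labels))
            for labels in itertools.product([0, 1], repeat=2)]
    print(name, "average test errors over all targets:", sum(errs) / len(errs))
# Every model averages 1.0 error out of 2: averaged over all consistent
# targets, no model beats any other.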
1.3 Is Learning Feasible? - No Free Lunch Proof
It is meaningless to discuss the superiority of an algorithm without reference to a specific problem.

Define:
A = algorithm
x_in = in-sample data
x_out = out-of-sample data (N points)
c = unknown target function
h = hypothesis function

Averaged over all possible targets c, the expected out-of-sample error under algorithm A is the same:

\[ E[A \mid x_{in}, c] = \sum_c \sum_h \sum_{x_{out}} \underbrace{P(x_{out})}_{\text{PDF of } x_{out}} \cdot \underbrace{I[h(x_{out}) \neq c(x_{out})]}_{\text{error of } h \text{ on } x_{out}} \cdot \underbrace{P(h \mid x_{in}, A)}_{\text{PDF of } h \text{ given } A \text{ and } x_{in}} \]
\[ = \sum_{x_{out}} P(x_{out}) \sum_h P(h \mid x_{in}, A) \sum_c I[h(x_{out}) \neq c(x_{out})] = \sum_{x_{out}} P(x_{out}) \sum_h P(h \mid x_{in}, A) \cdot \tfrac{1}{2}\, 2^N \]
\[ = 2^{N-1} \sum_{x_{out}} P(x_{out}) \sum_h P(h \mid x_{in}, A) = 2^{N-1} \sum_{x_{out}} P(x_{out}) \]

The error is independent of the algorithm A:
E[random guess | x_in, c] = E[state-of-the-art | x_in, c]
1.3 Is Learning Feasible? - Add Probability Distribution
Same flowchart as in 1.2, with one addition: the inputs x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾ of the training examples are now drawn from an Input Distribution P(X).
Unknown Target Function c: X → Y
Training Examples (x⁽¹⁾, y⁽¹⁾), (x⁽²⁾, y⁽²⁾), …, (x⁽ⁿ⁾, y⁽ⁿ⁾)
Hypothesis Set H = {h₁, h₂, …, h_M}
Learning Algorithm A
Final Hypothesis g ≈ c
1.3 Is Learning Feasible? - Logic Chain of Proof
Goal: prove that learning is feasible, i.e. that the learned g is close to the target function c.

The target function c is UNKNOWN, so the true error Etrue(g) is IMPOSSIBLE to compute directly. Instead:
1. Show Etrain(g) ≈ Etrue(g) — "God's gift" (generalization).
2. Show Etrain(g) is small — your capability (optimization).
3. Given 1 and 2, Etrue(g) is small, hence g ≈ c, hence learning is feasible.
1.3 Is Learning Feasible? - From Unknown to Known
Can we infer u from v?
Not with certainty: everyone in the sample might support Clinton while Trump eventually wins!
That outcome is POSSIBLE but not PROBABLE. When the sample size is big enough, "v ≈ u" is probably approximately correct (PAC).

Hoeffding's inequality: P(|v − u| > ε) ≤ 2e^{−2ε²n}

• u does not appear on the right-hand side of the formula.
• It is a link from the unknown u to the known v.

u is deterministic but unknown; v is stochastic but known. (Diagram: a population with true proportion u, and a 5-ball sample with v = 2/5.)
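A minimal Python sketch, with illustrative values of u, n and ε, checking the bound empirically:

import numpy as np

rng = np.random.default_rng(0)
u, n, eps, trials = 0.6, 1000, 0.05, 100_000

v = rng.binomial(n, u, size=trials) / n        # sample proportions
empirical = np.mean(np.abs(v - u) > eps)       # estimate of P(|v - u| > eps)
bound = 2 * np.exp(-2 * eps**2 * n)
print(f"empirical tail = {empirical:.4f}, Hoeffding bound = {bound:.4f}")
# The empirical tail stays below the bound, and both shrink as n grows.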
1.3 Is Learning Feasible? - From Polling to Learning
Aspect | Polling | Learning
Label | Support Trump / Support Clinton | Correct / incorrect classification
Aim | Get the vote percentage for Trump | Learn the target function c(x) = y
Data | US citizens | Examples
Data distribution | Every citizen is i.i.d. | Every example is i.i.d.
In-sample | Sample | Training set
In-sample statistic | v = vote percentage for Trump in the sample | Training error Etrain(h) = (1/n) Σᵢ I{h(x⁽ⁱ⁾) ≠ c(x⁽ⁱ⁾)}
Out-of-sample statistic | u = vote percentage for Trump in the population | True error Etrue(h) = P(h(x) ≠ c(x))
Bound | P(|v − u| > ε) ≤ 2e^{−2ε²n} | P(|Etrain(h) − Etrue(h)| > ε) ≤ 2e^{−2ε²n}

By analogy with polling, the learning bound simplifies to P(bad h) ≤ 2e^{−2ε²n}.

Are we done? No! This is verification, not learning.
1.3 Is Learning Feasible? - From One to Many
Flowchart (verification, not learning): a FIXED hypothesis function h is chosen first; then training examples (x⁽¹⁾, y⁽¹⁾), …, (x⁽ⁿ⁾, y⁽ⁿ⁾) with inputs drawn from P(X) are used to verify h ≈ c or h ≉ c against the unknown target c: X → Y.

The entire flowchart assumed a FIXED h, and only then came the data. For real learning we must choose g from a hypothesis set {h₁, h₂, …, h_M} instead of fixing a single h. By the union bound:

\[ P(|E_{train}(g) - E_{true}(g)| > \varepsilon) = P(\text{bad } g) \le P(\text{bad } h_1 \text{ or } \cdots \text{ or bad } h_M) \le \sum_{i=1}^{M} P(\text{bad } h_i) \le 2Me^{-2\varepsilon^2 n} \]

From h to g: P(bad h) ≤ 2e^{−2ε²n} becomes P(bad g) ≤ 2Me^{−2ε²n}.

Are we done? No! M can be huge, even infinite.
1.3 Is Learning Feasible? - From Finite to Infinite
When M → ∞:
P(bad g) ≤ 2Me^{−2ε²n} = 2M / e^{2ε²n} = a very large number

Congratulations! Even a primary-school student knows P(bad g) ≤ 1, so the bound is vacuous. What went wrong?
1.3 Is Learning Feasible? - From Infinite to Finite
Three hypotheses h₁, h₂, h₃ that classify every training point identically are effectively equivalent, so the union bound badly over-counts. Counting effective hypotheses instead of all hypotheses leads to the notions of dichotomy, growth function, shattering, break point, and VC dimension d_vc, and replaces M with a polynomial factor:

\[ P(|E_{train}(g) - E_{true}(g)| > \varepsilon) \le 4\left((2n)^{d_{vc}} + 1\right) e^{-\frac{1}{8}\varepsilon^2 n} \]
1.3 Is Learning Feasible? - Learning is Feasible
\[ P(|E_{train}(g) - E_{true}(g)| > \varepsilon) \le 4\left((2n)^{d_{vc}} + 1\right) e^{-\frac{1}{8}\varepsilon^2 n} \]

We reach this conclusion without knowing
• the algorithm A
• the input distribution P(X)
• the target function c

We just need
• training examples D
• a hypothesis set H
to find a final hypothesis g that learns c.

Learning is feasible when the VC dimension is finite.
(Flowchart as in 1.2, augmented with the input distribution P(X): unknown target function c: X → Y; training examples (x⁽¹⁾, y⁽¹⁾), …, (x⁽ⁿ⁾, y⁽ⁿ⁾) with inputs drawn from P(X); learning algorithm A; hypothesis set H = {h₁, …, h_M}; final hypothesis g ≈ c.)
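A minimal Python sketch evaluating the bound for an assumed d_vc = 3 (e.g. linear classifiers in the plane) and ε = 0.1, showing how it decays with n:

import numpy as np

# VC generalization bound: 4*((2n)^dvc + 1)*exp(-eps^2 * n / 8)
def vc_bound(n, dvc, eps):
    return 4 * ((2 * n) ** dvc + 1) * np.exp(-(eps ** 2) * n / 8)

for n in [1_000, 10_000, 100_000, 1_000_000]:
    print(n, vc_bound(n, dvc=3, eps=0.1))
# The bound is vacuous (> 1) for small n but decays to 0 as n grows,
# which is the sense in which learning is feasible when dvc is finite.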
1.4 How to Learn Well? - Over-learn vs Under-learn
Exercise: both are leaves. Exam: which one is a leaf?
When you learn too much: "This is not a leaf, since leaves must be serrated."
When you learn too little: "This is a leaf, since leaves are green."
1.4 How to Learn Well? - Overfit vs Underfit
Financial Engineering
Overview
Data, Parameter
Curve Construction
Model Calibration
Instrument Valuation
Risk Measurement
2.1 Overview
(Flowchart: market data Θmkt(t) feeds Curve Construction via bootstrapping, producing curves P(t, ·); curves feed Model Calibration, producing model parameters Θmdl(t); these, together with estimated parameters Θprm(t) and numerical parameters Θnum(t), feed Instrument Valuation, whose evaluation yields the value V(t). Perturbation and extraction produce the sensitivities ∂V/∂Θmkt and ∂V/∂Θmdl, which feed Risk Measurement: P&L and VaR. Legend: data, parameter, variable, computation.)
2.2 Data, Parameter
Θmkt(t) — daily observable market data:
• Deposit rates, futures rates and swap rates (yield curve construction)
• Cap and swaption implied volatilities (IR volatility calibration)
• FX swap points and volatilities (FX volatility calibration)
• CDS spread curve (hazard rate calibration)

Θprm(t) — indirectly observed, estimated from historical data, or treated as "exotic" constants:
• Libor fixing (historical data point)
• Correlation between CMS and FX (historical time series)
• Short-rate mean-reversion speed (κ = 0.01)

Θnum(t) — parameters that control the numerical schemes:
• Number of Monte Carlo paths (N = 50,000)
• Number of node points in the finite difference grid (N = 1,000)
• Error tolerance in optimization (ε = 10⁻⁵)
2.3 Curve Construction
(Diagram: multi-curve bootstrapping dependency graph. The USD discount curve is built from USD OIS; the USD benchmark and index curves from deposits, ED futures, FRAs, swaps and USD IR basis swaps; the foreign-currency (CUR) FX discount curve from FX swap points and cross-currency (CRX) basis swaps against USD; the CUR discount curve from CUR OIS; and the CUR benchmark and index curves from CUR deposits, ED futures, FRAs, swaps and CUR IR basis swaps. The numbers 1-4 on the arrows give the bootstrapping order.)
2.4 Model Calibration
EQ: N225 option implied volatilities
Expiry|Strike 15705.69 16578.23 17450.77 18323.31 19195.85
1M 28.63 26.00 24.34 23.16 23.16
3M 27.06 25.60 24.70 23.94 23.49
6M 26.28 25.47 24.92 24.34 23.97
12M 26.07 25.66 25.33 24.96 24.74
24M 26.54 26.40 26.16 25.94 25.72
60M 29.00 28.87 28.73 28.66 28.60
IR: USD ATM swaption implied volatilities
Maturity|Expiry 1M 3M 6M 1Y 2Y 3Y 4Y 5Y 7Y 10Y 15Y 20Y 25Y 30Y
1Y 59.80 56.15 56.27 65.12 66.75 55.32 44.80 36.16 28.18 22.39 19.98 18.09 17.52 17.17
2Y 53.00 46.22 50.38 59.33 56.22 46.80 39.23 33.31 27.06 22.04 20.26 18.72 18.19 18.05
3Y 53.00 43.60 47.48 57.00 48.87 41.21 35.61 31.16 26.06 21.80 20.19 18.63 18.15 18.14
4Y 52.70 50.04 48.35 50.06 43.32 37.41 33.03 29.58 25.17 21.60 20.07 18.46 18.00 18.12
5Y 50.80 48.45 48.02 46.04 40.06 34.93 31.15 28.33 24.50 21.47 19.84 18.04 17.65 17.94
7Y 41.50 43.49 41.98 39.47 35.22 31.46 28.38 26.08 23.38 21.18 19.73 18.11 18.25 19.32
10Y 10.00 33.70 32.49 32.59 32.36 30.12 27.94 26.01 24.66 22.56 20.74 19.55 18.23 19.28
15Y 30.60 26.74 27.17 27.46 26.07 24.78 23.60 22.70 21.20 19.45 17.94 17.32 19.25 21.27
20Y 25.50 25.24 25.69 25.90 24.73 23.70 22.70 21.89 20.60 19.09 18.09 17.63 19.57 22.18
25Y 24.80 24.65 24.68 24.77 23.78 22.95 22.11 21.39 20.27 19.02 18.24 17.51 19.88 22.92
30Y 24.60 24.52 24.11 24.04 23.11 22.40 21.65 21.01 20.05 19.04 18.22 17.53 20.27 23.39
CM: coffee option implied volatilities
Expiry|Strike 272.57 287.71 295.28 302.85 310.42 317.99 333.14
1M 32.33 31.18 30.63 30.60 31.40 32.27 33.32
2M 32.13 32.18 32.38 32.71 33.11 33.47 33.92
3M 35.17 35.67 36.10 36.52 36.93 37.35 37.81
6M 34.63 35.10 35.55 36.00 36.48 36.93 37.41
1Y 31.87 32.07 32.24 32.45 32.69 33.00 33.29
18M 29.31 29.68 29.95 30.29 30.66 31.10 31.60
2Y 28.75 29.07 29.31 29.66 30.03 30.49 31.09
FX: EURUSD option volatility quotes (ATM, risk reversals, butterflies)
Expiry|Convention ATM 25RR 10RR 25BF 10BF
O/N 6.44 -0.56 -1.01 0.14 0.48
1W 8.55 -0.65 -1.17 0.15 0.50
2W 8.65 -0.75 -1.35 0.14 0.47
1M 8.78 -1.00 -1.79 0.11 0.40
2M 8.70 -1.10 -1.98 0.17 0.59
3M 8.75 -1.25 -2.25 0.18 0.62
6M 9.00 -1.50 -2.74 0.28 0.98
9M 9.19 -1.60 -2.91 0.30 1.03
1Y 9.30 -1.65 -3.00 0.29 0.99
2Y 9.78 -1.70 -3.18 0.32 1.15
Calibration: each set of market quotes above calibrates a model's parameters —
SABR (α, β, ρ, ν); Schwartz (κ, σ, θ); Hull-White (κ, σ); Heston (κ, v₀, η, ρ, θ).
2.5 Instrument Valuation - Fundamentals
No Arbitrage → Numeraire → Change of Measure → Pricing Formula:

\[ V(0) = N(0)\, E^N\!\left[\frac{V(T)}{N(T)}\right] \]

Numeraire | Probability Measure
Bank Account | Risk-neutral Measure
Zero-Coupon Bond | Forward Measure
Annuity | Swap Measure

No arbitrage: given two assets A and B with payoffs f and g at T, if f = g then A = B.

Change of measure: dP/dQ = (Q(0)/P(0)) · (P(T)/Q(T))
2.5 Instrument Valuation - Fundamentals (No-Arbitrage Principle)
Given two assets A and B with payoffs f and g at T, by the no-arbitrage principle, if f = g then A = B:
• If A > B: at t = 0 buy B and sell A with profit A − B > 0; at T sell B and buy A with profit g − f = 0.
• If A < B: at t = 0 sell B and buy A with profit B − A > 0; at T buy B and sell A with profit f − g = 0.

Use the no-arbitrage principle to price any financial instrument B at time 0 in a one-step binomial model: Construct → Express → Equal → Link → Solve.

Construct a portfolio A of x shares of stock and a short position in y bonds (bond price C, simple rate r): A₀ = xS₀ − yC.

Express the time-T values:
A_T = x·uS₀ − y(1 + rT)C if the stock goes up to uS₀; x·dS₀ − y(1 + rT)C if it goes down to dS₀.
B_T = h(uS₀) if up; h(dS₀) if down.

Equal: set A_T = B_T in both states:
x·uS₀ − y(1 + rT)C = h(uS₀)
x·dS₀ − y(1 + rT)C = h(dS₀)

Solve:
\[ x = \frac{h(uS_0) - h(dS_0)}{(u - d)S_0}, \qquad y = \frac{1}{(1 + rT)C} \cdot \frac{d \cdot h(uS_0) - u \cdot h(dS_0)}{u - d} \]

Link: by no arbitrage,
\[ B_0 = A_0 = \underbrace{\frac{1}{1 + rT}}_{\mathbf{discount}} \cdot \underbrace{\left[p_u \cdot h(uS_0) + p_d \cdot h(dS_0)\right]}_{\mathbf{expected\ payoff}} \]

Present Value = Discount Factor × Expected Payoff
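A minimal Python sketch of this one-step replication with illustrative numbers (the bond price C is taken as 1, so y is a cash amount):

# Illustrative assumptions: up/down factors, simple rate, call payoff
S0, u, d = 100.0, 1.2, 0.8
r, T, K = 0.05, 1.0, 100.0
h = lambda S: max(S - K, 0.0)          # payoff of the call we want to price

# Solve x (shares) and y (cash) so the portfolio matches the payoff at T
x = (h(u * S0) - h(d * S0)) / ((u - d) * S0)
y = (d * h(u * S0) - u * h(d * S0)) / ((u - d) * (1 + r * T))
A0 = x * S0 - y                         # cost of the replicating portfolio

# Equivalent risk-neutral form: discount factor times expected payoff
pu = (1 + r * T - d) / (u - d)
B0 = (pu * h(u * S0) + (1 - pu) * h(d * S0)) / (1 + r * T)
print(A0, B0)                           # the two prices agree by no-arbitrage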
2.5 Instrument Valuation - Fundamentals (Numeraire and Probability Measure)
Numeraire: a numeraire is a unit of account; it can be money, a tradeable asset, or even apples.

Probability measure: a probability measure assigns probabilities to events.
Fair coin: P(H) = P(T) = 0.5. Biased coin: Q(H) = 0.8, Q(T) = 0.2.

Write time-0 prices with state prices φ_k over K states:
\[ A(0) = \sum_{k=1}^{K} \varphi_k A_k(T), \qquad B(0) = \sum_{k=1}^{K} \varphi_k B_k(T) \]

Then
\[ \frac{A(0)}{B(0)} = \frac{\sum_{k=1}^{K} \varphi_k A_k(T)}{\sum_{k=1}^{K} \varphi_k B_k(T)} = \sum_{k=1}^{K} \frac{\varphi_k B_k(T)}{\sum_{j=1}^{K} \varphi_j B_j(T)} \cdot \frac{A_k(T)}{B_k(T)} = \sum_{k=1}^{K} \pi_k \cdot \frac{A_k(T)}{B_k(T)} = E^B\!\left[\frac{A(T)}{B(T)}\right] \]

so
\[ A(0) = B(0)\, E^B\!\left[\frac{A(T)}{B(T)}\right] \]
where B is the numeraire and E^B is the expectation under the probability measure induced by B (the weights π_k are non-negative and sum to 1).

Numeraire | Probability Measure | Instrument
Bank Account | Risk-neutral Measure | FX, Equity, Commodity Option
Zero-Coupon Bond | Forward Measure | Cap, Floor
Annuity | Swap Measure | Swaption
2.5 Instrument Valuation - Fundamentals (Change of Probability Measure)
Question 1: what is the relationship between two probability measures?

Fair coin: p₁ = p₂ = 0.5. Biased coin: q₁ = 0.8, q₂ = 0.2. Then
\[ E^P[X] = p_1 x_1 + p_2 x_2 = q_1 \frac{p_1}{q_1} x_1 + q_2 \frac{p_2}{q_2} x_2 = E^Q[Z \cdot X] \]
Z is the Radon-Nikodym derivative, denoted Z = dP/dQ.

Changing the numeraire corresponds to changing the probability measure, which raises three questions:
1. What is the relationship between two probability measures?
2. What is the relationship between two numeraires?
3. Why change measure?
2.5 Instrument Valuation - Fundamentals (Change of Probability Measure)
Question 2: what is the relationship between two numeraires? Question 3: why change measure?

Start from the two key identities:
E1: A(0)/B(0) = E^B[A(T)/B(T)] for any numeraire B
E2: E^P[X] = E^Q[(dP/dQ) · X]

For two numeraires P and Q (with induced measures E^P and E^Q), apply E1 with numeraire Q, then with numeraire P, then E2:
\[ E^Q\!\left[\frac{A(T)}{Q(T)}\right] \cdot \frac{Q(0)}{P(0)} \overset{\mathbf{E1},\, Q}{=} \frac{A(0)}{Q(0)} \cdot \frac{Q(0)}{P(0)} = \frac{A(0)}{P(0)} \overset{\mathbf{E1},\, P}{=} E^P\!\left[\frac{A(T)}{P(T)}\right] \overset{\mathbf{E2}}{=} E^Q\!\left[\frac{dP}{dQ} \cdot \frac{A(T)}{P(T)}\right] \]

Matching the two Q-expectations for every asset A gives the Radon-Nikodym derivative between the two numeraire-induced measures:
\[ \frac{dP}{dQ} = \frac{Q(0)}{P(0)} \cdot \frac{P(T)}{Q(T)} \]

Why change measure? Because the right numeraire simplifies the pricing formula:

| Risk-Neutral Measure E^Q | Forward Measure E^T
Numeraire | Bank account β(t) | Zero-coupon bond P(t, T)
Property | β(0) = 1 | P(T, T) = 1
Martingale formula | V(0)/β(0) = E^Q[V(T)/β(T)] | V(0)/P(0, T) = E^T[V(T)/P(T, T)]
Simplified formula | V(0) = E^Q[V(T)/β(T)] | V(0) = P(0, T) · E^T[V(T)]
2.5 Instrument Valuation - Pricing Methods
\[ V(t) = N(t)\, E_t^N\!\left[\frac{V(T)}{N(T)}\right] \]

Closed-Form / Numerical Integration:
1. Find the PDF of V(T)/N(T) under measure N.
2. Represent the expectation as an integral.
3. Simplify to closed form if possible; otherwise leave it as a numerical integration.

PDE Finite Difference Method:
1. Change measure N to the risk-neutral measure.
2. Use the Feynman-Kac theorem to derive the PDE of V.
3. Fix the solution domain, construct a grid, set terminal and boundary conditions, discretize derivatives in the spatial and time dimensions, and adopt a finite difference scheme.

Monte Carlo Method:
1. By the Law of Large Numbers, approximate E[V/N] by the average of Vᵢ/Nᵢ over simulated paths.
2. Adopt variance reduction techniques to enhance Monte Carlo efficiency.
2.5 Instrument Valuation - Closed-Form, Numerical Integration
Black-Scholes model:
\[ \frac{dS(t)}{S(t)} = (r - q)\,dt + \sigma\,dB(t) \]

Closed form (ω = +1 for a call, −1 for a put):
\[ V = \omega\left[e^{-qT} S_0 \Phi(\omega d_+) - e^{-rT} K \Phi(\omega d_-)\right], \qquad d_\pm = \frac{1}{\sigma\sqrt{T}} \ln\frac{S_0 e^{(r-q)T}}{K} \pm \frac{\sigma\sqrt{T}}{2} \]

Heston model:
\[ \frac{dS(t)}{S(t)} = (r - q)\,dt + \sqrt{v(t)}\,dB_1(t), \qquad dv(t) = \kappa(\theta - v(t))\,dt + \eta\sqrt{v(t)}\,dB_2(t) \]

Numerical integration:
\[ V = \omega\left[e^{-qT} S_0 P_1(\omega) - e^{-rT} K P_2(\omega)\right], \qquad P_j(\omega) = \frac{1 - \omega}{2} + \omega P_j(S_0, v_0, T, K) \]
\[ P_j(x, v, T, y) = \frac{1}{2} + \frac{1}{\pi} \int_0^\infty \mathrm{Re}\!\left[\frac{e^{C_j(T,\phi) - D_j(T,\phi)v + i\phi \ln(x/y)}}{i\phi}\right] d\phi \]
\[ D_j(T,\phi) = \frac{(b_j - \rho\eta\phi i + d_j) - (b_j - \rho\eta\phi i - d_j)\, g_j e^{d_j T}}{\eta^2 \left(1 - g_j e^{d_j T}\right)} \]
\[ C_j(T,\phi) = (r - q)T\phi i + \frac{\kappa\theta}{\eta^2}\left[(b_j - \rho\eta\phi i + d_j)T - 2\ln\frac{1 - g_j e^{d_j T}}{1 - g_j}\right] \]
\[ d_j = \sqrt{(b_j - \rho\eta\phi i)^2 - \eta^2(2u_j\phi i - \phi^2)}, \qquad g_j = \frac{b_j - \rho\eta\phi i + d_j}{b_j - \rho\eta\phi i - d_j} \]
with b₁ = κ − ρη, b₂ = κ, u₁ = 0.5, u₂ = −0.5.

Techniques used:
• Itô's formula
• Girsanov's theorem
• Moment matching
• Drift interpolation
• Parameter averaging
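A minimal Python sketch of the Black-Scholes closed form above (scipy's normal CDF plays the role of Φ; all parameter values are illustrative):

import numpy as np
from scipy.stats import norm

def black_scholes(S0, K, T, r, q, sigma, omega=1):
    # omega = +1 for a call, -1 for a put
    F = S0 * np.exp((r - q) * T)                       # forward price
    d_plus = np.log(F / K) / (sigma * np.sqrt(T)) + sigma * np.sqrt(T) / 2
    d_minus = d_plus - sigma * np.sqrt(T)
    return omega * (np.exp(-q * T) * S0 * norm.cdf(omega * d_plus)
                    - np.exp(-r * T) * K * norm.cdf(omega * d_minus))

print(black_scholes(S0=100, K=100, T=1.0, r=0.02, q=0.0, sigma=0.2))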
2.5 Instrument Valuation - PDE Finite Difference Method (SDE to PDE)
Given the SDE of x(t) and the payoff function of a derivative V at maturity T:
\[ dx(t) = \mu(t, x)\,dt + \sigma(t, x)\,dB(t), \qquad V(x(T), T) = h(x(T)) \]

By the Feynman-Kac theorem, V(x, t) satisfies the PDE
\[ \frac{\partial V}{\partial t} + \mu(t, x)\frac{\partial V}{\partial x} + \frac{1}{2}\sigma^2(t, x)\frac{\partial^2 V}{\partial x^2} - rV = 0 \]

and by the no-arbitrage principle
\[ V(x, t) = e^{-r(T - t)}\, E_t[h(x(T))] \]
2.5 Instrument Valuation - PDE Finite Difference Method (Grid Construction)
(Diagram: a rectangular grid in (t, x) with the terminal condition at t_n = T, boundary conditions along x₀ and x_{m+1}, and interior points in between.)

\[ x \in \{x_j\}_{j=0}^{m+1}, \quad x_j = x_{min} + j\Delta x, \quad \Delta x = \frac{x_{max} - x_{min}}{m + 1} \]
\[ t \in \{t_i\}_{i=0}^{n}, \quad t_i = i\Delta t, \quad \Delta t = \frac{T}{n} \]
2.5 Instrument Valuation - PDE Finite Difference Method (Discretization and Scheme)
(Diagram: finite difference stencils for the three schemes — fully explicit (θ = 0), fully implicit (θ = 1), Crank-Nicolson (θ = ½).)

Order | Spatial Dimension | Time Dimension
1st | ∂V_j(t)/∂x ≈ [V_{j+1}(t) − V_{j−1}(t)] / (2Δx) | ∂V/∂t ≈ [V(t_{i+1}) − V(t_i)] / Δt
2nd | ∂²V_j(t)/∂x² ≈ [V_{j+1}(t) − 2V_j(t) + V_{j−1}(t)] / Δx² |

Use central differences for ∂V_j/∂x and ∂²V_j/∂x² at x_j, and discretize ∂V_j/∂t at the weighted time point t^θ_{i,i+1} = θt_i + (1 − θ)t_{i+1}.
2.5 Instrument Valuation - PDE Finite Difference Method (Representation)
The difference equation at (t^θ_{i,i+1}, x_j) is
\[ \frac{V_j(t_{i+1}) - V_j(t_i)}{\Delta t} = -\mu(t^{\theta}_{i,i+1}, x_j)\,\frac{V_{j+1}(t^{\theta}_{i,i+1}) - V_{j-1}(t^{\theta}_{i,i+1})}{2\Delta x} - \frac{\sigma^2(t^{\theta}_{i,i+1}, x_j)}{2}\,\frac{V_{j+1}(t^{\theta}_{i,i+1}) - 2V_j(t^{\theta}_{i,i+1}) + V_{j-1}(t^{\theta}_{i,i+1})}{\Delta x^2} + r(t^{\theta}_{i,i+1}, x_j)\,V_j(t^{\theta}_{i,i+1}) \]

Writing the algebraic form in matrix form:
\[ \left[\mathbf{I} - \theta\Delta t\,\mathbf{A}(t^{\theta}_{i,i+1})\right]\mathbf{V}(t_i) = \left[\mathbf{I} + (1 - \theta)\Delta t\,\mathbf{A}(t^{\theta}_{i,i+1})\right]\mathbf{V}(t_{i+1}) + \theta\boldsymbol{\Omega}(t_i) + (1 - \theta)\boldsymbol{\Omega}(t_{i+1}) \]
where I is the identity matrix, A is a tri-diagonal matrix, and Ω is the boundary value vector.
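A minimal Python sketch of this matrix recursion for the Black-Scholes PDE (μ = rx, σ² = vol²x², constant r) with Dirichlet boundaries, pricing a European call; all parameters are illustrative and a dense solve stands in for a tri-diagonal solver. θ = 0.5 gives Crank-Nicolson:

import numpy as np

r, vol, K, T = 0.02, 0.2, 100.0, 1.0
xmax, m, n, theta = 300.0, 299, 100, 0.5
dx, dt = xmax / (m + 1), T / n

x = np.linspace(0.0, xmax, m + 2)                 # nodes x_0 .. x_{m+1}
V = np.maximum(x - K, 0.0)                        # terminal condition at t_n = T

j = np.arange(1, m + 1)                           # interior nodes
a = 0.5 * vol**2 * x[j]**2 / dx**2                # diffusion coefficient
b = r * x[j] / (2 * dx)                           # convection coefficient
lower, diag, upper = a - b, -2 * a - r, a + b     # tri-diagonal rows of A

A = np.zeros((m, m))
A[np.arange(m), np.arange(m)] = diag
A[np.arange(1, m), np.arange(m - 1)] = lower[1:]
A[np.arange(m - 1), np.arange(1, m)] = upper[:-1]
I = np.eye(m)

for i in range(n - 1, -1, -1):                    # march backwards in time
    bnd_i = xmax - K * np.exp(-r * (T - i * dt))          # upper boundary at t_i
    bnd_ip1 = xmax - K * np.exp(-r * (T - (i + 1) * dt))  # and at t_{i+1}
    rhs = (I + (1 - theta) * dt * A) @ V[1:-1]
    rhs[-1] += dt * upper[-1] * (theta * bnd_i + (1 - theta) * bnd_ip1)  # Omega terms
    V[1:-1] = np.linalg.solve(I - theta * dt * A, rhs)
    V[0], V[-1] = 0.0, bnd_i                      # boundary values

print(np.interp(100.0, x, V))                     # ~ Black-Scholes price at S = 100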
2.5 Instrument Valuation - Monte Carlo Method (Fundamentals)
Consider a derivative V with time-T payout V(T) = g(T). By the no-arbitrage principle,
\[ V(t) = N(t)\, E_t^N\!\left[\frac{g(T)}{N(T)}\right] \]

Law of Large Numbers: let Y₁, Y₂, …, Y_n be a sequence of independent identically distributed (i.i.d.) random variables with finite expectation μ. The sample mean converges to the population mean:
\[ \bar{Y}(n) = \frac{1}{n}\sum_{i=1}^{n} Y_i \;\xrightarrow{\;n \to \infty\;}\; \mu \]

so
\[ V(t) \approx \bar{V}(t) = N(t)\,\frac{1}{n}\sum_{i=1}^{n}\frac{g_i}{N_i} \]

Central Limit Theorem: for i.i.d. Y₁, …, Y_n with finite expectation μ and standard deviation σ, as n → ∞,
\[ \frac{\bar{Y}(n) - \mu}{s(n)/\sqrt{n}} \sim N(0, 1), \qquad s^2(n) = \frac{1}{n - 1}\sum_{i=1}^{n}\left[Y_i - \bar{Y}(n)\right]^2 \]

which gives the confidence interval
\[ V(t) \in \left[\bar{V}(t) - z_{\alpha/2}\,\frac{s(n)}{\sqrt{n}},\;\; \bar{V}(t) + z_{\alpha/2}\,\frac{s(n)}{\sqrt{n}}\right] \]

The standard error s(n)/√n falls when the variance falls or the sample size grows.
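A minimal Python sketch pricing a European call under GBM, with the confidence interval above; parameters are illustrative:

import numpy as np

rng = np.random.default_rng(1)
S0, K, T, r, q, sigma, n = 100.0, 100.0, 1.0, 0.02, 0.0, 0.2, 200_000

Z = rng.standard_normal(n)
ST = S0 * np.exp((r - q - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
Y = np.exp(-r * T) * np.maximum(ST - K, 0.0)    # discounted payoff per path

price = Y.mean()                                 # sample mean ~ V(0) by the LLN
se = Y.std(ddof=1) / np.sqrt(n)                  # standard error s(n)/sqrt(n)
z = 1.96                                         # z_{alpha/2} for a 95% interval
print(f"price = {price:.4f}, 95% CI = [{price - z*se:.4f}, {price + z*se:.4f}]")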
2.5 Instrument Valuation - Monte Carlo Method (Variance Reduction)
Every technique keeps the estimator unbiased, E[Y_new] = E[Y], while shrinking Var(Y_new) below Var(Y).

Antithetic variate: pair each draw with its antithesis, Yᵢᵃᵛ = (Yᵢ¹ + Yᵢ²)/2. Then
\[ E[Y^{av}] = E\!\left[\frac{Y^1 + Y^2}{2}\right] = E[Y], \qquad \mathrm{Var}(\bar{Y}^{av}) = \frac{\mathrm{Var}(Y)}{2n}(1 + \rho) \]
which beats 2n independent draws whenever the pair correlation ρ < 0.

Control variate: Y_new = Y + c(Y_cv − μ_cv) with a control Y_cv of known mean μ_cv. Then E[Y_new] = E[Y] + cE[Y_cv − μ_cv] = E[Y], and with the optimal c
\[ \mathrm{Var}(Y_{new}) = \min_c \mathrm{Var}\!\left(Y + c(Y_{cv} - \mu_{cv})\right) = \left(1 - \rho^2_{Y, Y_{cv}}\right)\mathrm{Var}(Y) \le \mathrm{Var}(Y) \]

Conditioning: Y_new = E[Y | Z]. Then E[Y_new] = E[E[Y | Z]] = E[Y] and, by the variance decomposition,
\[ \mathrm{Var}(Y) = E[\mathrm{Var}(Y \mid Z)] + \mathrm{Var}(E[Y \mid Z]) \ge \mathrm{Var}(E[Y \mid Z]) = \mathrm{Var}(Y_{new}) \]

Stratified sampling: split the draws into strata by Z and choose the allocation N_j per stratum. Then E[Y_new] = E[E[Y | Z]] = E[Y] and
\[ \mathrm{Var}(Y_{new}) = \min_{N_j} E[\mathrm{Var}(Y \mid Z)] \le E[\mathrm{Var}(Y \mid Z)] \le \mathrm{Var}(Y) \]

Importance sampling: sample from a density g instead of f:
\[ E_f[V(X)] = \int V(x) f(x)\,dx = \int \frac{V(x) f(x)}{g(x)}\, g(x)\,dx = E_g\!\left[\frac{V(X) f(X)}{g(X)}\right] \]
\[ \mathrm{Var}_f[V(X)] - \mathrm{Var}_g\!\left[\frac{V(X) f(X)}{g(X)}\right] = \int V^2(x) f(x)\left(1 - \frac{f(x)}{g(x)}\right)dx > 0 \]
when g(x) > f(x) where V²(x)f(x) is large, and g(x) < f(x) where V²(x)f(x) is small.
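A minimal Python sketch of the antithetic variate technique on the same European call, comparing standard errors at an equal total path count:

import numpy as np

rng = np.random.default_rng(2)
S0, K, T, r, sigma, n = 100.0, 100.0, 1.0, 0.02, 0.2, 100_000

def payoff(Z):
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
    return np.exp(-r * T) * np.maximum(ST - K, 0.0)

Z = rng.standard_normal(n)
Y_plain = payoff(rng.standard_normal(2 * n))     # 2n independent paths
Y_av = 0.5 * (payoff(Z) + payoff(-Z))            # n antithetic pairs (2n paths)

print("plain     :", Y_plain.mean(), Y_plain.std(ddof=1) / np.sqrt(2 * n))
print("antithetic:", Y_av.mean(), Y_av.std(ddof=1) / np.sqrt(n))
# Same cost, smaller antithetic standard error: the payoffs of Z and -Z
# are negatively correlated (rho < 0).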
2.6 Risk Measurement - Sensitivity
Bump and revaluation:
\[ \frac{\partial V}{\partial \Theta_k} \approx \frac{V(\Theta_1, \ldots, \Theta_k + \Delta, \ldots, \Theta_K) - V(\Theta_1, \ldots, \Theta_k, \ldots, \Theta_K)}{\Delta} \]

Pathwise differentiation (e.g. the delta of a European option in Monte Carlo):
\[ \frac{\partial V}{\partial S_0} = E\!\left[\frac{\partial g(S_T)}{\partial S_T} \cdot \frac{\partial S_T}{\partial S_0}\right] = E\!\left[\frac{\partial (S_T - K)^+}{\partial S_T} \cdot \frac{\partial S_T}{\partial S_0}\right] = E\!\left[\mathbf{1}\{S_T > K\} \cdot \frac{\partial S_T}{\partial S_0}\right] \]

Likelihood ratio (e.g. the delta of a digital option in Monte Carlo, where the payoff is not differentiable):
\[ \frac{\partial V}{\partial S_0} = \frac{\partial}{\partial S_0}\int g(S_T)\, f(S_T; S_0)\,dS_T = \int g(S_T)\,\frac{\partial f(S_T; S_0)}{\partial S_0}\,dS_T = \int g(S_T)\,\frac{f_{S_0}(S_T; S_0)}{f(S_T; S_0)}\, f(S_T; S_0)\,dS_T = E\!\left[g(S_T)\,\frac{f_{S_0}(S_T; S_0)}{f(S_T; S_0)}\right] \]
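A minimal Python sketch comparing bump-and-revaluation against the pathwise delta for a European call under GBM (common random numbers are used for the bump; parameters are illustrative):

import numpy as np

rng = np.random.default_rng(3)
S0, K, T, r, sigma, n, h = 100.0, 100.0, 1.0, 0.02, 0.2, 200_000, 0.01

Z = rng.standard_normal(n)
growth = np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
disc = np.exp(-r * T)

price = lambda s0: disc * np.maximum(s0 * growth - K, 0.0).mean()
delta_bump = (price(S0 + h) - price(S0)) / h     # bump and revaluation

# Pathwise: dV/dS0 = E[1{ST > K} * dST/dS0], with dST/dS0 = ST/S0 under GBM
ST = S0 * growth
delta_pathwise = (disc * (ST > K) * ST / S0).mean()
print(delta_bump, delta_pathwise)                # both close to the BS delta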
2.6 Risk Measurement - Value-at-Risk
Window: 1 year (≈ 250 daily scenarios)
Holding period: 10 days
Confidence level: 99%

(Diagram: historical-simulation VaR pipeline. For each risk factor RF₁ … RF_n, collect the historical time series over the window; compute the 250 perturbations Δ_k^m; apply them to today's risk factors to generate 250 historical scenarios; revalue the portfolio under each scenario to get the simulated PVs PV₁ … PV₂₅₀ and simulated P&Ls P&L₁ … P&L₂₅₀.)

(Diagram: histogram of the simulated P&L distribution around PnL₀. The (100 − α)% VaR is the loss at the α% quantile: the portfolio loss exceeds the VaR with probability α%.)
Quantitative Investing
Overview
Quant Platform
Data Preprocessing
Stock Selection
Portfolio Construction
3.1 Overview
Data Preprocessing:
• Data Collection
• Outlier Handling: MAD, 3σ, Percentile
• Standardization: Raw, Ranked

Stock Selection:
• Traditional Approach: Single-Factor Test (IC, Stratified Backtesting); Multi-Factor Test (Correlation, Factor Synthesis); Multi-Factor Linear Regression
• Machine Learning Approach: Import Package → Parameter Setting → Data Labeling → Data Splitting → Model Setting → Model Training → Model Assessment → Strategy Implementation → Strategy Assessment

Portfolio Construction:
• Optimization: EW, MVO, GMV, MDP, RP, RB, EMV, BL
• Constraints: Industry, Factor Exposure, Stock
3.2 Quant Platform
https://www.joinquant.com/
https://uqer.io/home/
https://www.ricequant.com/
https://www.quantopian.com
3.3 Data Preprocessing - Data Collection
date = '2018-1-4'
# RiceQuant research API: all listed common stocks on the given date
stocks = all_instruments(type="CS", date=date).order_book_id.tolist()
data = get_fundamentals(
    query(
        fundamentals.eod_derivative_indicator.pb_ratio,
        fundamentals.eod_derivative_indicator.market_cap
    ).filter(
        fundamentals.income_statement.stockcode.in_(stocks)
    ), date, '1d').major_xs(date).dropna()
data['BP'] = 1/data['pb_ratio']          # book-to-price = 1 / price-to-book
data.head(3).append(data.tail(3))        # preview the first and last three rows
3.3 Data Preprocessing - Outlier Handling
import numpy as np

# MAD: clip at median ± n * median absolute deviation
def filter_extreme_MAD(series, n):
    median = series.quantile(0.5)
    mad = (series - median).abs().quantile(0.5)
    return np.clip(series, median - n*mad, median + n*mad)

# 3 Sigma: clip at mean ± n * standard deviation
def filter_extreme_3sigma(series, n=3):
    mean = series.mean()
    std = series.std()
    return np.clip(series, mean - n*std, mean + n*std)

# Percentile: clip at the lower and upper percentiles
def filter_extreme_percentile(series, min_pct=0.025, max_pct=0.975):
    q = series.quantile([min_pct, max_pct])
    return np.clip(series, q.iloc[0], q.iloc[1])
3.3 Data Preprocessing - Standardization
def standardize_series(series):
    return (series - series.mean()) / series.std()

new = filter_extreme_3sigma(data['BP'])
ax = standardize_series(new).plot.kde(label='Standardized Raw Factor')
ax.legend();
ax = standardize_series(new.rank()).plot.kde(label='Standardized Ranked Factor')
ax.legend();

Standardized raw factor: zᵢ = (Xᵢ − μ) / σ
Standardized ranked factor: zᵢ = (Yᵢ − μ_Y) / σ_Y, where Y = Rank(X)
3.4 Stock Selection - Traditional Approach
Multi-factor model: the basic premise is that similar assets display similar returns. The excess return of stock i is
\[ r_i = \beta_{i1} f_1 + \cdots + \beta_{iK} f_K + \varepsilon_i = \sum_{k=1}^{K} \beta_{ik} f_k + \varepsilon_i \]
where β_ik is the factor exposure, f_k the factor premium, and ε_i the specific return.

Estimate the factor premium. Consider the following stocks and factors:
Stock 1: Apple; Stock 2: Facebook; Stock 3: Google
Factor 1: PE (price-to-earnings); Factor 2: DY (dividend yield)

For each t:
1. Collect the factor exposures β_i1(t−1) and β_i2(t−1).
2. Collect the stock prices at t−1 and t, and compute the excess returns r_i(t).
3. Perform a cross-sectional regression to get the factor premiums f₁(t) and f₂(t):
r₁(t) = β₁₁(t−1)f₁(t) + β₁₂(t−1)f₂(t)
r₂(t) = β₂₁(t−1)f₁(t) + β₂₂(t−1)f₂(t)
r₃(t) = β₃₁(t−1)f₁(t) + β₃₂(t−1)f₂(t)

Collecting the time series f₁(t), …, f_K(t) for t = 1, …, T, fit a time-series model (AR(p), MA(q), ARMA(p, q), …) to predict the factor premiums f₁(T+1), …, f_K(T+1). A sketch of one cross-sectional step follows.
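A minimal Python sketch of one cross-sectional regression step on synthetic data (the "true" premiums are assumptions used only to generate the data):

import numpy as np

rng = np.random.default_rng(5)
N, K = 100, 2                                   # 100 stocks, 2 factors (PE, DY)

B = rng.standard_normal((N, K))                 # exposures beta_ik(t-1)
f_true = np.array([0.5, -0.2])                  # unobserved premiums f(t)
r = B @ f_true + 0.1 * rng.standard_normal(N)   # excess returns r_i(t)

f_hat, *_ = np.linalg.lstsq(B, r, rcond=None)   # OLS estimate of f(t)
print(f_hat)                                    # close to [0.5, -0.2]
# Repeating this for t = 1..T gives the premium time series to which an
# AR/MA/ARMA model is fitted to predict f(T+1).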
3.4 Stock Selection - Traditional Approach
Given the predicted factor premiums and next-period exposures, compute the expected excess returns and rank.

Factor exposure matrix (N stocks × K factors):
\[ \mathbf{B}(T+1) = \begin{pmatrix} \beta_{1,1}(T+1) & \cdots & \beta_{1,K}(T+1) \\ \vdots & \ddots & \vdots \\ \beta_{N,1}(T+1) & \cdots & \beta_{N,K}(T+1) \end{pmatrix} \]

Factor premium vector: f(T+1) = [f₁(T+1), f₂(T+1), …, f_K(T+1)]ᵀ

Excess returns:
\[ r_i(T+1) = \sum_{k=1}^{K} \beta_{i,k}(T+1)\, f_k(T+1), \qquad \mathbf{r}(T+1) = \mathbf{B}(T+1)\,\mathbf{f}(T+1) \]

Rank the stocks by r_i(T+1) and select the top ones.
3.4 Stock Selection - Machine Learning Approach
Import Package → Parameter Setting → Data Labeling → Data Splitting → Model Setting → Model Training → Model Assessment → Strategy Implementation → Strategy Assessment
3.4 Stock Selection - Machine Learning Approach (Import Package)
import numpy as np # Matrix Computation
import pandas as pd # Handle Dataframe
# Show plot in this notebook
%matplotlib inline
import matplotlib.pyplot as plt # Plotting
from sklearn.model_selection import train_test_split # Split training and test set
from sklearn.model_selection import GridSearchCV # Select hyper-parameter by cross-validation error
from sklearn.model_selection import KFold # CV model for binary or balanced classes
from sklearn.model_selection import StratifiedKFold # CV model for multi-class or imbalanced classes
from sklearn import metrics as me
# Machine Learning Model
from sklearn.svm import SVC # Support Vector Machine
from sklearn.ensemble import RandomForestClassifier as RFC # Random Forest
from sklearn.ensemble import GradientBoostingClassifier as GBC # Gradient Boosted Tree
3.4 Stock Selection - Machine Learning Approach (Parameter Setting)
class PARA:
    method = 'SVM' # Specify the method; can also be 'RF' or 'GBT'
    month_train = range(1, 84+1) # In-sample: 84 monthly files = 84 training months
    month_test = range(85, 120+1) # Out-of-sample: 36 monthly files = 36 test months
    percent_select = [0.5, 0.5] # 50% positive examples, 50% negative examples
    cv = 10 # 10-fold cross-validation
    seed = 1 # Random seed, for reproducible results
para = PARA()
3.4 Stock Selection - Machine Learning Approach (Data Labeling)
def label_data( data ):
    data['Label'] = np.nan # Initialization
    data = data.sort_values( by='Return', ascending=False ) # Sort excess return in descending order
    n_stock = np.multiply( para.percent_select, data.shape[0] ) # Number of stocks in the pos and neg classes
    n_stock = np.around(n_stock).astype(int) # Round the number of stocks to an integer
    data.iloc[0:n_stock[0], -1] = 1 # Assign 1 to the stocks with the best performance
    data.iloc[-n_stock[1]:, -1] = 0 # Assign 0 to the stocks with the worst performance
    data = data.dropna(axis=0) # Delete examples with NaN values
    return data

Data format: each monthly csv file is an m × n matrix; the first row contains the column names and the next m−1 rows contain the data of m−1 stocks.
• Columns 1-3: basic information
• Column 4: excess return of the next month
• Columns 5-n: factor exposures
3.4 Stock Selection - Machine Learning Approach (Data Splitting)
for i in para.month_train: # load csv month by month
    file_name = str(i) + '.csv'
    data_curr_month = pd.read_csv( file_name, header=0 )
    para.n_stock = data_curr_month.shape[0]
    data_curr_month = data_curr_month.dropna(axis=0) # remove NaN
    data_curr_month = label_data( data_curr_month ) # label data
    # merge into a single dataframe
    if i == para.month_train[0]: # first month
        data_train = data_curr_month
    else:
        data_train = data_train.append(data_curr_month)

X = data_train.loc[:, 'EP':'BIAS']
y = data_train.loc[:, 'Label']
X_train, X_cv, y_train, y_cv = train_test_split( X, y, test_size=1.0/para.cv, random_state=para.seed )
3.4 Stock Selection - Machine Learning Approach (Model Setting)
if para.method == 'SVM': # Support Vector Machine
    model = SVC( kernel = 'linear', C = 1 )
elif para.method == 'RF': # Random Forest
    model = RFC( n_estimators=200, max_depth=6, random_state=para.seed )
elif para.method == 'GBT': # Gradient Boosted Tree
    model = GBC( n_estimators=200, max_depth=6, random_state=para.seed )
3.4 Stock Selection - Machine Learning Approach (Model Training)
model.fit( X_train, y_train )
y_pred_train = model.predict( X_train )
# decision_function exists for SVC; for RF/GBT use model.predict_proba(...)[:, 1] as the score
y_score_train = model.decision_function( X_train )
y_pred_cv = model.predict( X_cv )
y_score_cv = model.decision_function( X_cv )
print( 'Training set, accuracy = %.2f' %me.accuracy_score( y_train, y_pred_train ) )
print( 'Training set, AUC = %.2f' %me.roc_auc_score( y_train, y_score_train ) )
print( 'Validation set, accuracy = %.2f' %me.accuracy_score( y_cv, y_pred_cv ) )
print( 'Validation set, AUC = %.2f' %me.roc_auc_score( y_cv, y_score_cv ) )
kernel = ['linear', 'rbf']
C = [0.01, 0.1, 1, 10]
param_grid = dict(kernel=kernel, C=C)
kfold = StratifiedKFold( n_splits=PARA.cv, shuffle=True, random_state=PARA.seed )
grid_search = GridSearchCV( model, param_grid, n_jobs=-1, cv=kfold, verbose=1 )
grid_result = grid_search.fit( X, y )
(Above: training with default parameters, then hyperparameter tuning with cross-validated grid search.)
3.4 Stock Selection - Machine Learning Approach (Model Training)
best_model = grid_result.best_estimator_
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))
# plot results
scores = np.array(means).reshape(len(kernel), len(C))
for i, value in enumerate(kernel):
    plt.plot(C, scores[i], label='kernel: ' + str(value))
plt.legend()
plt.xlabel('C')
plt.ylabel('mean CV score')
plt.savefig('kernel_vs_C.png')
3.4 Stock Selection - Machine Learning Approach (Model Assessment)
# y_true_test, y_pred_test and y_score_test hold the monthly test returns,
# predictions and scores collected over the test months (one column per month)
for i in para.month_test: # Print accuracy and AUC for each test month
    y_true_i_month = pd.DataFrame( {'Return':y_true_test.iloc[:,i-1]} )
    y_pred_i_month = y_pred_test.iloc[:,i-1]
    y_score_i_month = y_score_test.iloc[:,i-1]
    y_true_i_month = y_true_i_month.dropna(axis=0) # remove NaN
    y_i_month = label_data( y_true_i_month )['Label']
    y_pred_i_month = y_pred_i_month[ y_i_month.index ].values
    y_score_i_month = y_score_i_month[ y_i_month.index ].values
    print( 'test set, month %d, accuracy = %.2f' %(i, me.accuracy_score( y_i_month, y_pred_i_month ) ) )
    print( 'test set, month %d, AUC = %.2f' %(i, me.roc_auc_score( y_i_month, y_score_i_month ) ) )
…
3.4 Stock Selection - Machine Learning Approach (Strategy Implementation)
n_stock = 15
strategy = pd.DataFrame( {'Return':[0]*para.month_test[-1], 'Value':[1]*para.month_test[-1]} )
for i in para.month_test:
    y_true_i_month = y_true_test.iloc[:,i-1]
    y_score_i_month = y_score_test.iloc[:,i-1]
    y_score_i_month = y_score_i_month.sort_values(ascending=False) # Sort the scores (probabilities) in descending order
    i_index = y_score_i_month[0:n_stock].index # Find the index of the top 15 stocks
    strategy.loc[i-1,'Return'] = np.mean(y_true_i_month[i_index])/100 # Compute the mean return of the 15 stocks
strategy['Value'] = (strategy['Return']+1).cumprod() # Compound the monthly mean returns to get the cumulative value
3.4 Stock Selection - Machine Learning Approach (Strategy Assessment)
strategy_value = strategy.reindex(index=para.month_test, columns=['Value'])
strategy_return = strategy.reindex(index=para.month_test, columns=['Return'])
plt.plot( para.month_test, strategy_value, 'r-' )
plt.show()
excess_return = np.mean(strategy_return) * 12
excess_vol = np.std(strategy_return) * np.sqrt(12)
IR = excess_return / excess_vol
print( 'annual excess return = %.2f' %excess_return )
print( 'annual excess volatility = %.2f' %excess_vol )
print( 'information ratio = %.2f' %IR)
3.5 Portfolio Construction
(Decision diagram: if expected returns and variances are known, use MVO. Otherwise, with a view on return use BL; with a view on risk use RB. The remaining methods arise as special cases, linked in the diagram by the labels "same expected return" (MVO → GMV), "same Sharpe ratio" (MVO → MDP), "same risk budget" (RB → RP), "zero correlation" (RP → EMV), "zero average correlation" (MDP → EMV), and "same volatility" (EMV → EW, GMV → EW). A sketch of three of the weighting schemes follows the legend.)

• MVO: Mean-Variance Optimization
• MDP: Most Diversified Portfolio
• GMV: Global Minimum Variance
• EMV: Equal Marginal Volatility
• EW: Equal Weight
• RB: Risk Budgeting
• RP: Risk Parity
• BL: Black-Litterman
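A minimal Python sketch of three of the weighting schemes (EW, EMV, GMV) from an assumed covariance matrix:

import numpy as np

vols = np.array([0.10, 0.20, 0.30])             # illustrative volatilities
corr = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])              # illustrative correlations
Sigma = np.outer(vols, vols) * corr

w_ew = np.ones(3) / 3                           # Equal Weight
w_emv = (1 / vols) / (1 / vols).sum()           # Equal Marginal Volatility (inverse vol)
inv = np.linalg.solve(Sigma, np.ones(3))
w_gmv = inv / inv.sum()                         # Global Minimum Variance

for name, w in [("EW", w_ew), ("EMV", w_emv), ("GMV", w_gmv)]:
    print(name, np.round(w, 3), "vol =", round(float(np.sqrt(w @ Sigma @ w)), 4))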
Takeaway
P-Measure: Machine Learning, Quantitative Investing | Q-Measure: Financial Engineering
Model the future | Extrapolate the present
Real probability | Risk-neutral probability
Discrete process | Continuous process
Statistics | Itô's calculus
Estimation | Calibration
Buy-side | Sell-side
THANK YOU
STEVEN WANG SHENGYUAN
Wechat Account: MeanMachine1031

More Related Content

What's hot

Probability
ProbabilityProbability
Probability
Anjali Devi J S
 
Binomial distribution
Binomial distributionBinomial distribution
Binomial distribution
Anjali Devi J S
 
Mixed-integer and Disjunctive Programming - Ignacio E. Grossmann
Mixed-integer and Disjunctive Programming - Ignacio E. GrossmannMixed-integer and Disjunctive Programming - Ignacio E. Grossmann
Mixed-integer and Disjunctive Programming - Ignacio E. Grossmann
CAChemE
 
Probability Assignment Help
Probability Assignment HelpProbability Assignment Help
Probability Assignment Help
Statistics Assignment Help
 
When Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying ViewWhen Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying View
Mohamed Farouk
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Rikiya Takahashi
 
random variable and distribution
random variable and distributionrandom variable and distribution
random variable and distribution
lovemucheca
 
Probability Assignment Help
Probability Assignment HelpProbability Assignment Help
Probability Assignment Help
Statistics Assignment Help
 
02.03 Artificial Intelligence: Search by Optimization
02.03 Artificial Intelligence: Search by Optimization02.03 Artificial Intelligence: Search by Optimization
02.03 Artificial Intelligence: Search by Optimization
Andres Mendez-Vazquez
 
First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...
First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...
First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...
Edward D. Weinberger
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlp
ananth
 
Bt0080 fundamentals of algorithms1
Bt0080 fundamentals of algorithms1Bt0080 fundamentals of algorithms1
Bt0080 fundamentals of algorithms1
Techglyphs
 
Chapter4
Chapter4Chapter4
Chapter4
Vu Vo
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions
Anthony J. Evans
 
2. Mathematics
2. Mathematics2. Mathematics
2. Mathematics
Matteo Bedini
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
Arthur Charpentier
 

What's hot (19)

Chapter 13
Chapter 13Chapter 13
Chapter 13
 
Probability
ProbabilityProbability
Probability
 
Binomial distribution
Binomial distributionBinomial distribution
Binomial distribution
 
Numerical approximation
Numerical approximationNumerical approximation
Numerical approximation
 
Mixed-integer and Disjunctive Programming - Ignacio E. Grossmann
Mixed-integer and Disjunctive Programming - Ignacio E. GrossmannMixed-integer and Disjunctive Programming - Ignacio E. Grossmann
Mixed-integer and Disjunctive Programming - Ignacio E. Grossmann
 
2주차
2주차2주차
2주차
 
Probability Assignment Help
Probability Assignment HelpProbability Assignment Help
Probability Assignment Help
 
When Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying ViewWhen Classifier Selection meets Information Theory: A Unifying View
When Classifier Selection meets Information Theory: A Unifying View
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game Theory
 
random variable and distribution
random variable and distributionrandom variable and distribution
random variable and distribution
 
Probability Assignment Help
Probability Assignment HelpProbability Assignment Help
Probability Assignment Help
 
02.03 Artificial Intelligence: Search by Optimization
02.03 Artificial Intelligence: Search by Optimization02.03 Artificial Intelligence: Search by Optimization
02.03 Artificial Intelligence: Search by Optimization
 
First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...
First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...
First part of my NYU Tandon course "Numerical and Simulation Techniques in Fi...
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlp
 
Bt0080 fundamentals of algorithms1
Bt0080 fundamentals of algorithms1Bt0080 fundamentals of algorithms1
Bt0080 fundamentals of algorithms1
 
Chapter4
Chapter4Chapter4
Chapter4
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions
 
2. Mathematics
2. Mathematics2. Mathematics
2. Mathematics
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
 

Similar to Machine Learning, Financial Engineering and Quantitative Investing

Deep Learning: Introduction & Chapter 5 Machine Learning Basics
Deep Learning: Introduction & Chapter 5 Machine Learning BasicsDeep Learning: Introduction & Chapter 5 Machine Learning Basics
Deep Learning: Introduction & Chapter 5 Machine Learning Basics
Jason Tsai
 
ML unit-1.pptx
ML unit-1.pptxML unit-1.pptx
ML unit-1.pptx
SwarnaKumariChinni
 
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof..."Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
Paris Women in Machine Learning and Data Science
 
02 math essentials
02 math essentials02 math essentials
02 math essentials
Poongodi Mano
 
L1 intro2 supervised_learning
L1 intro2 supervised_learningL1 intro2 supervised_learning
L1 intro2 supervised_learning
Yogendra Singh
 
Introduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep LearningIntroduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep Learning
Terry Taewoong Um
 
Can machine think like human being : A Godelian perspective
Can machine think like human being : A Godelian perspective Can machine think like human being : A Godelian perspective
Can machine think like human being : A Godelian perspective
Jaynarayan Tudu
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Feynman Liang
 
Lecture 7
Lecture 7Lecture 7
Lecture 7butest
 
Lecture 7
Lecture 7Lecture 7
Lecture 7butest
 
AML_030607.ppt
AML_030607.pptAML_030607.ppt
AML_030607.pptbutest
 
Introduction
IntroductionIntroduction
Introductionbutest
 
Coursera 1week
Coursera  1weekCoursera  1week
Coursera 1week
csl9496
 
Kk20503 1 introduction
Kk20503 1 introductionKk20503 1 introduction
Kk20503 1 introductionLow Ying Hao
 
Machine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
Machine learning by Dr. Vivek Vijay and Dr. Sandeep YadavMachine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
Machine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
Agile Testing Alliance
 
Predictive Testing
Predictive TestingPredictive Testing
Predictive Testing
Herminio Vazquez
 
Evolutionary deep learning: computer vision.
Evolutionary deep learning: computer vision.Evolutionary deep learning: computer vision.
Evolutionary deep learning: computer vision.
Olivier Teytaud
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep models
Young-Geun Choi
 
Regression
RegressionRegression
Regression
Ncib Lotfi
 

Similar to Machine Learning, Financial Engineering and Quantitative Investing (20)

Deep Learning: Introduction & Chapter 5 Machine Learning Basics
Deep Learning: Introduction & Chapter 5 Machine Learning BasicsDeep Learning: Introduction & Chapter 5 Machine Learning Basics
Deep Learning: Introduction & Chapter 5 Machine Learning Basics
 
ML unit-1.pptx
ML unit-1.pptxML unit-1.pptx
ML unit-1.pptx
 
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof..."Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
"Let us talk about output features! by Florence d’Alché-Buc, LTCI & Full Prof...
 
02 math essentials
02 math essentials02 math essentials
02 math essentials
 
L1 intro2 supervised_learning
L1 intro2 supervised_learningL1 intro2 supervised_learning
L1 intro2 supervised_learning
 
Introduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep LearningIntroduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep Learning
 
Can machine think like human being : A Godelian perspective
Can machine think like human being : A Godelian perspective Can machine think like human being : A Godelian perspective
Can machine think like human being : A Godelian perspective
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
 
Lecture 7
Lecture 7Lecture 7
Lecture 7
 
Lecture 7
Lecture 7Lecture 7
Lecture 7
 
AML_030607.ppt
AML_030607.pptAML_030607.ppt
AML_030607.ppt
 
Introduction
IntroductionIntroduction
Introduction
 
Coursera 1week
Coursera  1weekCoursera  1week
Coursera 1week
 
Kk20503 1 introduction
Kk20503 1 introductionKk20503 1 introduction
Kk20503 1 introduction
 
Machine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
Machine learning by Dr. Vivek Vijay and Dr. Sandeep YadavMachine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
Machine learning by Dr. Vivek Vijay and Dr. Sandeep Yadav
 
Predictive Testing
Predictive TestingPredictive Testing
Predictive Testing
 
151028_abajpai1
151028_abajpai1151028_abajpai1
151028_abajpai1
 
Evolutionary deep learning: computer vision.
Evolutionary deep learning: computer vision.Evolutionary deep learning: computer vision.
Evolutionary deep learning: computer vision.
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep models
 
Regression
RegressionRegression
Regression
 

Recently uploaded

Transkredit Finance Company Products Presentation (1).pptx
Transkredit Finance Company Products Presentation (1).pptxTranskredit Finance Company Products Presentation (1).pptx
Transkredit Finance Company Products Presentation (1).pptx
jenomjaneh
 
一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理
一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理
一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理
obyzuk
 
Instant Issue Debit Cards - High School Spirit
Instant Issue Debit Cards - High School SpiritInstant Issue Debit Cards - High School Spirit
Instant Issue Debit Cards - High School Spirit
egoetzinger
 
Seminar: Gender Board Diversity through Ownership Networks
Seminar: Gender Board Diversity through Ownership NetworksSeminar: Gender Board Diversity through Ownership Networks
Seminar: Gender Board Diversity through Ownership Networks
GRAPE
 
Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024
Commercial Bank of Ceylon PLC
 
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdfUS Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
pchutichetpong
 
how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.
DOT TECH
 
Instant Issue Debit Cards
Instant Issue Debit CardsInstant Issue Debit Cards
Instant Issue Debit Cards
egoetzinger
 
how to sell pi coins effectively (from 50 - 100k pi)
how to sell pi coins effectively (from 50 - 100k  pi)how to sell pi coins effectively (from 50 - 100k  pi)
how to sell pi coins effectively (from 50 - 100k pi)
DOT TECH
 
when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.
DOT TECH
 
The Role of Non-Banking Financial Companies (NBFCs)
The Role of Non-Banking Financial Companies (NBFCs)The Role of Non-Banking Financial Companies (NBFCs)
The Role of Non-Banking Financial Companies (NBFCs)
nickysharmasucks
 
SWAIAP Fraud Risk Mitigation Prof Oyedokun.pptx
SWAIAP Fraud Risk Mitigation   Prof Oyedokun.pptxSWAIAP Fraud Risk Mitigation   Prof Oyedokun.pptx
SWAIAP Fraud Risk Mitigation Prof Oyedokun.pptx
Godwin Emmanuel Oyedokun MBA MSc ACA ACIB FCTI FCFIP CFE
 
一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理
bbeucd
 
what is a pi whale and how to access one.
what is a pi whale and how to access one.what is a pi whale and how to access one.
what is a pi whale and how to access one.
DOT TECH
 
The European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population agingThe European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population aging
GRAPE
 
Scope Of Macroeconomics introduction and basic theories
Scope Of Macroeconomics introduction and basic theoriesScope Of Macroeconomics introduction and basic theories
Scope Of Macroeconomics introduction and basic theories
nomankalyar153
 
The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...
The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...
The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...
muslimdavidovich670
 
how to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchangehow to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchange
DOT TECH
 
how to sell pi coins in South Korea profitably.
how to sell pi coins in South Korea profitably.how to sell pi coins in South Korea profitably.
how to sell pi coins in South Korea profitably.
DOT TECH
 
The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...
The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...
The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...
beulahfernandes8
 

Recently uploaded (20)

Transkredit Finance Company Products Presentation (1).pptx
Transkredit Finance Company Products Presentation (1).pptxTranskredit Finance Company Products Presentation (1).pptx
Transkredit Finance Company Products Presentation (1).pptx
 
一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理
一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理
一比一原版(GWU,GW毕业证)加利福尼亚大学|尔湾分校毕业证如何办理
 
Instant Issue Debit Cards - High School Spirit
Instant Issue Debit Cards - High School SpiritInstant Issue Debit Cards - High School Spirit
Instant Issue Debit Cards - High School Spirit
 
Seminar: Gender Board Diversity through Ownership Networks
Seminar: Gender Board Diversity through Ownership NetworksSeminar: Gender Board Diversity through Ownership Networks
Seminar: Gender Board Diversity through Ownership Networks
 
Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024
 
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdfUS Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
 
how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.how to sell pi coins at high rate quickly.
how to sell pi coins at high rate quickly.
 
Instant Issue Debit Cards
Instant Issue Debit CardsInstant Issue Debit Cards
Instant Issue Debit Cards
 
how to sell pi coins effectively (from 50 - 100k pi)
how to sell pi coins effectively (from 50 - 100k  pi)how to sell pi coins effectively (from 50 - 100k  pi)
how to sell pi coins effectively (from 50 - 100k pi)
 
when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.
 
The Role of Non-Banking Financial Companies (NBFCs)
The Role of Non-Banking Financial Companies (NBFCs)The Role of Non-Banking Financial Companies (NBFCs)
The Role of Non-Banking Financial Companies (NBFCs)
 
SWAIAP Fraud Risk Mitigation Prof Oyedokun.pptx
SWAIAP Fraud Risk Mitigation   Prof Oyedokun.pptxSWAIAP Fraud Risk Mitigation   Prof Oyedokun.pptx
SWAIAP Fraud Risk Mitigation Prof Oyedokun.pptx
 
一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB毕业证)圣芭芭拉分校毕业证如何办理
 
what is a pi whale and how to access one.
what is a pi whale and how to access one.what is a pi whale and how to access one.
what is a pi whale and how to access one.
 
The European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population agingThe European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population aging
 
Scope Of Macroeconomics introduction and basic theories
Scope Of Macroeconomics introduction and basic theoriesScope Of Macroeconomics introduction and basic theories
Scope Of Macroeconomics introduction and basic theories
 
The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...
The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...
The WhatsPump Pseudonym Problem and the Hilarious Downfall of Artificial Enga...
 
how to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchangehow to sell pi coins on Bitmart crypto exchange
how to sell pi coins on Bitmart crypto exchange
 
how to sell pi coins in South Korea profitably.
how to sell pi coins in South Korea profitably.how to sell pi coins in South Korea profitably.
how to sell pi coins in South Korea profitably.
 
The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...
The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...
The Evolution of Non-Banking Financial Companies (NBFCs) in India: Challenges...
 

Machine Learning, Financial Engineering and Quantitative Investing

  • 1. MACHINE LEARNING FINANCIAL ENGINEERING MAR 2018 STEVEN WANG QUANTITATIVE INVESTING
  • 2. Quick Take Machine Learning Financial Engineering Quantitative Investing Takeaway P-, Q- easure ntitative vesting Q-Measure Financial Engineering QT , Q- asure titative esting Q-Measure Financial Engineering ML arning? g Problem? ible? FE QI
  • 3. Mean Machine Machine Learning What is Machine Learning What is Learning Problem Is Learning Feasible How to Learn Well
  • 4. Mean Machine 1.1 What is Machine Learning? - Overview
  • 5. Mean Machine 1.1 What is Machine Learning? - Type Agent Environment ActionReward State Input Layer Output Layer Hidden Layer 1 Hidden Layer 2 Supervised Learning Unsupervised Learning Semi-supervised Learning Reinforcement Learning Deep Learning
  • 6. Mean Machine 1.2 What is Learning Problem? Unknown Target Function c: X  Y Training Examples (x(1), y(1)), (x(2), y(2)),…, (x(n), y(n)) Learning Algorithm A Hypothesis Set H = {h1, h2,…, hM} Final Hypothesis g  c ideal loan approval formula historical records of applicants a set of candidate formulas learned loan approval formula
  • 7. Mean Machine 1.3 Is Learning Feasible? - No Free Lunch Theorem Feature x Label y Model A Model B Model C Training Data [0, 0] 0 0 0 0 [1, 1] 1 1 1 1 Test Data [1, 0] ? 1 0 1 [0, 1] ? 0 0 1 Model A = random guess Model B = support vector machine Model C = deep neural network Is Model C > Model B > Model A? c(x) = x1 ⋁ x2 : Model C wins o The 3rd data x = [1, 0], so y = 1 ⋁ 0 = 1 o The 4th data x = [0, 1], so y = 0 ⋁ 1 = 1 c(x) = x1 ⋀ x2 : Model B wins o The 3rd data x = [1, 0], so y = 1 ⋀ 0 = 0 o The 4th data x = [0, 1], so y = 0 ⋀ 1 = 0 c(x) = x1: Model A wins o The 3rd data x = [1, 0], so y = 1 o The 4th data x = [0, 1], so y = 0 Model A is as good as Model C! Anything needs to learn? All Models are expected to have equivalent performance! The c is an unknown function, the performance on training data is not indicative of the performance on test data. The performance on test data is all that matters in learning! Can we really learn something? Learning seems to be doomed but …
  • 8. Mean Machine 1.3 Is Learning Feasible? - No Free Lunch Proof It is meaningless to discuss the superiority of algorithm given no specific problems. Define A = algorithm xin = in-sample data xout = out-of-sample data (N) c = unknown target function h = hypothesis function Consider all cases of c, the expected out-of-sample error under algorithm A is the same. E A xin, c = ෍ c ෍ h ෍ xout P xout PDF of xout ∙ I h xout ≠ c xout error of h on xout ∙ P h xin, A PDF of h given A and xin = ෍ xout P xout ∙ ෍ h P h xin, A ∙ ෍ c I h xout ≠ c xout = ෍ xout P xout ∙ ෍ h P h xin, A ∙ 1 2 2N = 2N−1 ෍ xout P xout ∙ ෍ h P h xin, A = 2N−1 ෍ xout P xout The error is independent of algorithm A! E[random guess|xin, c] = E[state-of-the-art|xin, c]
  • 9. Mean Machine 1.3 Is Learning Feasible? - Add Probability Distribution Unknown Target Function c: X  Y Training Examples (x(1), y(1)), (x(2), y(2)),…, (x(n), y(n)) Learning Algorithm A Hypothesis Set H = {h1, h2,…, hM} Final Hypothesis g  c Input Distribution P(X) x x(1), x(2), …, x(n)
  • 10. Mean Machine 1.3 Is Learning Feasible? - Logic Chain of Proof learning feasibleProve target function c Learn g closest to cFind Etrue(g) smallShow Etrain(g) ≈ Etrue(g) Etrain(g) small GivenEtrue(g) small Showg ≈ c Showlearning feasible Show God’s Gift: Etrain(g) ≈ Etrue(g) Your Capability: Etrain(g) is small Target function c is UNKNOWN True error Etrue(g) is IMPOSSIBLE to compute
  • 11. Mean Machine 1.3 Is Learning Feasible? - From Unknown to Known Can we infer u from v? No, people from sample might all support Clinton but Trump eventually win! The above statement is POSSIBLE but not PROBABLE. When sample size is big enough, “v ≈ u” is probably approximately correct (PAC) P( v − u > ε) ≤ 2e−2ε2nHoeffding’s inequality • No u appears to be at RHS of above formula • A link from unknown u to known v u is deterministic unknown, v is stochastic known.Population Sampleu v = 2/5
  • 12. Mean Machine 1.3 Is Learning Feasible? - From Polling to Learning Polling Learning Label Support Trump Support Clinton Correct classification Incorrect classification Aim Get vote percentage for Trump Learn target function c(x) = y Data US citizens Examples Data Distribution Every citizen is i.i.d Every example is i.i.d In-Sample Sample Training set In-Sample Statistics v = vote percentage for Trump in-sample training error Etrain h = 1 n σi=1 n I{h 𝐱 i ≠ c 𝐱 i } Out-of-Sample Statistics u = vote percentage for Trump out-of-sample true error Etrue h = P(h 𝐱 ≠ c(𝐱)) P( v − u > ε) ≤ 2e−2ε2n P( Etrain(h) − Etrue(h) > ε) ≤ 2e−2ε2n Polling Learning simplify P(bad h) ≤ 2e−2ε2n analogy Are we done? No! This is verification, not learning
  • 13. Mean Machine 1.3 Is Learning Feasible? - From One to Many Unknown Target Function c: X  Y Verify Training Examples (x(1), y(1)), (x(2), y(2)),…, (x(n), y(n)) A Fixed Hypothesis function h Verification h  c or h ≠ c Input Distribution P(X) x x(1), x(2), …, x(n) The entire flowchart assumed a FIXED h and then came the data. In order to be real learning, we have to choose g among a hypothesis set {h1, h2, …, hM} instead of fixing a single h P Etrain g − Etrue g > ε = P bad g ≤ P bad h1 or bad h2 or ⋯ or bad hM ≤ P bad h1 + P bad h2 + ⋯ + P bad hM ≤ 2e−2ε2n + 2e−2ε2n + ⋯ + 2e−2ε2n = 2Me−2ε2n P(bad h) ≤ 2e−2ε2n P(bad g) ≤ 2Me−2ε2n From h to g Are we done? No! M can be very huge, infinite-huge
  • 14. Mean Machine 1.3 Is Learning Feasible? - From Finite to Infinite
    When M → ∞:  P(bad g) ≤ 2Me^(−2ε²n) → ∞, a very large number.
    Congratulations! Even a primary-school student knows P(bad g) ≤ 1, so the bound is vacuous. What went wrong?
  • 15. Mean Machine 1.3 Is Learning Feasible? - From Infinite to Finite
    [Diagram: three separating lines h1, h2, h3 that classify the same data points identically.]
    Hypotheses h1, h2 and h3 are effectively equivalent: what matters is not the number of hypotheses but the number of distinct labelings they can produce. Key notions: dichotomy, growth function, shattering, break point, VC dimension. Replacing M by the growth function, which is bounded by (2n)^dvc + 1, yields the VC bound:
    P(|Etrain(g) − Etrue(g)| > ε) ≤ 4·((2n)^dvc + 1)·e^(−ε²n/8)
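    To see why a finite dvc rescues learning, plug numbers into the bound. A minimal sketch (dvc = 3 and ε = 0.1 are illustrative values, not from the deck): the polynomial factor eventually loses to the decaying exponential.

    import numpy as np

    def vc_bound(n, dvc, eps):
        return 4 * ((2 * n) ** dvc + 1) * np.exp(-eps**2 * n / 8)

    for n in [1_000, 10_000, 30_000]:
        print(n, vc_bound(n, dvc=3, eps=0.1))
    # polynomial times decaying exponential goes to 0 as n grows,
    # so the bound becomes meaningful whenever dvc is finite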
  • 16. Mean Machine 1.3 Is Learning Feasible? - Learning is Feasible
    P(|Etrain(g) − Etrue(g)| > ε) ≤ 4·((2n)^dvc + 1)·e^(−ε²n/8)
    We reach this conclusion without knowing
    • the algorithm A
    • the input distribution P(X)
    • the target function c
    We just need
    • training examples D
    • a hypothesis set H
    to find a final hypothesis g that learns c. Learning is feasible when the VC dimension is finite.
    [Diagram: the full learning setup, with unknown target c: X → Y, training examples drawn from P(X), and algorithm A selecting g ≈ c from H = {h1, h2, ..., hM}.]
  • 17. Mean Machine 1.4 How to Learn Well? - Over-learn vs Under-learn
    Exercise: both pictures are leaves. Exam: which one is a leaf?
    When you learn too much: "This is not a leaf, since leaves must be serrated."
    When you learn too little: "This is a leaf, since leaves are green."
  • 18. Mean Machine 1.4 How to Learn Well? - Overfit vs Underfit
  • 20. Mean Machine 2.1 Overview
    [Diagram: the financial-engineering pipeline. Market data Θmkt(t) → bootstrapping → curves P(t, ·) → calibration → model parameters Θmdl(t) → instrument valuation/evaluation → V(t), with model parameters Θprm(t) and numerical parameters Θnum(t) as further inputs. Risk measurement: perturbation and extraction give sensitivities ∂V/∂Θmkt and ∂V/∂Θmdl, which feed P&L and VaR. Legend: Data, Parameter, Variable, Computation.]
  • 21. Mean Machine 2.2 Data, Parameter
    Θmkt(t): daily observable market data
    • Deposit rates, futures rates and swap rates (yield curve construction)
    • Cap and swaption implied volatilities (IR volatility calibration)
    • FX swap points and volatilities (FX volatility calibration)
    • CDS spread curve (hazard rate calibration)
    Θprm(t): indirectly observed, estimated from historical data, or treated as "exotic" constants
    • Libor fixing (historical data point)
    • Correlation between CMS and FX (historical time series)
    • Short rate mean reversion speed (κ = 0.01)
    Θnum(t): parameters that control the numerical schemes
    • Number of Monte Carlo paths (N = 50,000)
    • Number of node points in the finite difference grid (N = 1,000)
    • Error tolerance in optimization (ε = 10⁻⁵)
  • 22. Mean Machine 2.3 Curve Construction
    [Diagram: multi-curve bootstrapping order.
    1. USD discount curve: bootstrapped from OIS quotes.
    2. USD benchmark and index curves: deposits, ED futures, FRAs and swaps, linked by IR basis swaps.
    3. Foreign-currency (CUR) discount curves: FX swap points and cross-currency basis swaps against USD.
    4. CUR benchmark and index curves: deposits, ED futures, FRAs and swaps, linked by IR basis swaps.]
  • 23. Mean Machine 2.4 Model Calibration
    Calibration targets and models by asset class:
    • EQ: N225 options → SABR (α, β, ρ, ν)
    • CM: coffee options → Schwartz (κ, σ, θ)
    • IR: USD ATM swaptions → Hull-White (κ, σ)
    • FX: EURUSD options → Heston (κ, v0, η, ρ, θ)

    N225 option implied volatilities (Expiry \ Strike):
           15705.69  16578.23  17450.77  18323.31  19195.85
    1M     28.63     26.00     24.34     23.16     23.16
    3M     27.06     25.60     24.70     23.94     23.49
    6M     26.28     25.47     24.92     24.34     23.97
    12M    26.07     25.66     25.33     24.96     24.74
    24M    26.54     26.40     26.16     25.94     25.72
    60M    29.00     28.87     28.73     28.66     28.60

    USD ATM swaption volatilities (Maturity \ Expiry):
          1M    3M    6M    1Y    2Y    3Y    4Y    5Y    7Y    10Y   15Y   20Y   25Y   30Y
    1Y    59.80 56.15 56.27 65.12 66.75 55.32 44.80 36.16 28.18 22.39 19.98 18.09 17.52 17.17
    2Y    53.00 46.22 50.38 59.33 56.22 46.80 39.23 33.31 27.06 22.04 20.26 18.72 18.19 18.05
    3Y    53.00 43.60 47.48 57.00 48.87 41.21 35.61 31.16 26.06 21.80 20.19 18.63 18.15 18.14
    4Y    52.70 50.04 48.35 50.06 43.32 37.41 33.03 29.58 25.17 21.60 20.07 18.46 18.00 18.12
    5Y    50.80 48.45 48.02 46.04 40.06 34.93 31.15 28.33 24.50 21.47 19.84 18.04 17.65 17.94
    7Y    41.50 43.49 41.98 39.47 35.22 31.46 28.38 26.08 23.38 21.18 19.73 18.11 18.25 19.32
    10Y   10.00 33.70 32.49 32.59 32.36 30.12 27.94 26.01 24.66 22.56 20.74 19.55 18.23 19.28
    15Y   30.60 26.74 27.17 27.46 26.07 24.78 23.60 22.70 21.20 19.45 17.94 17.32 19.25 21.27
    20Y   25.50 25.24 25.69 25.90 24.73 23.70 22.70 21.89 20.60 19.09 18.09 17.63 19.57 22.18
    25Y   24.80 24.65 24.68 24.77 23.78 22.95 22.11 21.39 20.27 19.02 18.24 17.51 19.88 22.92
    30Y   24.60 24.52 24.11 24.04 23.11 22.40 21.65 21.01 20.05 19.04 18.22 17.53 20.27 23.39

    Coffee option implied volatilities (Expiry \ Strike):
          272.57  287.71  295.28  302.85  310.42  317.99  333.14
    1M    32.33   31.18   30.63   30.60   31.40   32.27   33.32
    2M    32.13   32.18   32.38   32.71   33.11   33.47   33.92
    3M    35.17   35.67   36.10   36.52   36.93   37.35   37.81
    6M    34.63   35.10   35.55   36.00   36.48   36.93   37.41
    1Y    31.87   32.07   32.24   32.45   32.69   33.00   33.29
    18M   29.31   29.68   29.95   30.29   30.66   31.10   31.60
    2Y    28.75   29.07   29.31   29.66   30.03   30.49   31.09

    EURUSD option volatilities (Expiry \ Convention):
          ATM   25RR   10RR   25BF  10BF
    O/N   6.44  -0.56  -1.01  0.14  0.48
    1W    8.55  -0.65  -1.17  0.15  0.50
    2W    8.65  -0.75  -1.35  0.14  0.47
    1M    8.78  -1.00  -1.79  0.11  0.40
    2M    8.70  -1.10  -1.98  0.17  0.59
    3M    8.75  -1.25  -2.25  0.18  0.62
    6M    9.00  -1.50  -2.74  0.28  0.98
    9M    9.19  -1.60  -2.91  0.30  1.03
    1Y    9.30  -1.65  -3.00  0.29  0.99
    2Y    9.78  -1.70  -3.18  0.32  1.15
  • 24. Mean Machine 2.5 Instrument Valuation - Fundamentals
    Building blocks: no arbitrage → numeraire → change of measure → pricing formula
    V(0) = N(0) × E^N[V(T)/N(T)]
    Numeraire           Probability Measure
    Bank Account        Risk-neutral Measure
    Zero-Coupon Bond    Forward Measure
    Annuity             Swap Measure
    No arbitrage: given two assets A and B with payoffs f and g at T, if f = g then A = B.
    Radon-Nikodym derivative: dP/dQ = (Q(0)/P(0)) · (P(T)/Q(T))
  • 25. Mean Machine 2.5 Instrument Valuation - Fundamentals (No-Arbitrage Principle)
    Given two assets A and B with payoffs f and g at T, by the no-arbitrage principle, if f = g then A = B.
    • If A > B: at t = 0 buy B, sell A for profit A − B > 0; at T sell B, buy A with net payoff g − f = 0
    • If A < B: at t = 0 sell B, buy A for profit B − A > 0; at T buy B, sell A with net payoff f − g = 0
    Use the no-arbitrage principle to price any financial instrument B at time 0 (one-step binomial model):
    Construct: A0 = x·S0 − y·C, a portfolio of x shares of stock and a short position in y units of cash C
    Express:   AT = x·u·S0 − y(1 + rT)·C if the stock goes up to u·S0;  AT = x·d·S0 − y(1 + rT)·C if it goes down to d·S0
    Equal:     BT = h(u·S0) in the up state, BT = h(d·S0) in the down state
    Link:      set AT = BT in both states:
               x·u·S0 − y(1 + rT)·C = h(u·S0)
               x·d·S0 − y(1 + rT)·C = h(d·S0)
    Solve:     x = [h(u·S0) − h(d·S0)] / [(u − d)·S0],   y = [d·h(u·S0) − u·h(d·S0)] / [(1 + rT)·C·(u − d)]
    Then B0 = A0 = (1/(1 + rT)) · [pu·h(u·S0) + pd·h(d·S0)], with pu = (1 + rT − d)/(u − d) and pd = 1 − pu.
    Present Value = Discount Factor × Expected Payoff
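    A minimal sketch of the replication above (the parameter values S0 = 100, u = 1.2, d = 0.8, r = 0.05, T = 1, K = 100 are illustrative, not from the slide): build the portfolio (x, y) and check that its cost equals the discounted risk-neutral expected payoff.

    def one_step_binomial(S0, u, d, r, T, h, C=1.0):
        x = (h(u * S0) - h(d * S0)) / ((u - d) * S0)                        # shares of stock
        y = (d * h(u * S0) - u * h(d * S0)) / ((1 + r * T) * C * (u - d))   # cash units
        replication_price = x * S0 - y * C
        pu = (1 + r * T - d) / (u - d)                                      # risk-neutral up probability
        rn_price = (pu * h(u * S0) + (1 - pu) * h(d * S0)) / (1 + r * T)
        return replication_price, rn_price

    payoff = lambda S: max(S - 100.0, 0.0)   # call struck at 100
    print(one_step_binomial(S0=100, u=1.2, d=0.8, r=0.05, T=1.0, h=payoff))
    # both numbers agree: Present Value = Discount Factor x Expected Payoff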
  • 26. Mean Machine 2.5 Instrument Valuation - Fundamentals (Numeraire and Probability Measure)
    Numeraire: a numeraire is a unit; it can be money, a tradeable asset, or even an apple.
    Probability measure: a set of probabilities for events, e.g. fair coin P(H) = P(T) = 0.5 versus biased coin Q(H) = 0.8, Q(T) = 0.2.
    Write the time-0 prices of A and B in terms of the K state payoffs, with φk the price of the state-k claim:
    A(0) = φ1·A1(T) + φ2·A2(T) + ... + φK·AK(T) = Σk φk·Ak(T)
    B(0) = φ1·B1(T) + φ2·B2(T) + ... + φK·BK(T) = Σk φk·Bk(T)
    Divide and regroup, letting b = Σk φk·Bk(T):
    A(0)/B(0) = [Σk φk·Ak(T)] / [Σk φk·Bk(T)]
              = Σk (φk·Bk(T)/b) · (Ak(T)/Bk(T))
              = Σk πk · (Ak(T)/Bk(T)),   where πk = φk·Bk(T) / Σj φj·Bj(T)
              = E^B[A(T)/B(T)]
    Hence A(0) = B(0) × E^B[A(T)/B(T)]: B is a numeraire, and E^B is the expectation under the probability measure induced by B (the πk are nonnegative and sum to 1).
    Numeraire           Probability Measure      Instrument
    Bank Account        Risk-neutral Measure     FX, Equity, Commodity Option
    Zero-Coupon Bond    Forward Measure          Cap, Floor
    Annuity             Swap Measure             Swaption
  • 27. Mean Machine 2.5 Instrument Valuation - Fundamentals (Change of Probability Measure)
    Three questions: (1) what is the relationship between two probability measures, (2) what is the relationship between two numeraires, and (3) why change measure at all? Since a numeraire corresponds to a probability measure, changing numeraire corresponds to changing probability measure.
    Question 1: relationship between two probability measures. With fair coin p1 = p2 = 0.5 and biased coin q1 = 0.8, q2 = 0.2:
    E^P[X] = p1·x1 + p2·x2 = q1·(p1/q1)·x1 + q2·(p2/q2)·x2 = E^Q[Z·X]
    Z is the Radon-Nikodym derivative, denoted Z = dP/dQ.
  • 28. Mean Machine 2.5 Instrument Valuation - Fundamentals (Change of Probability Measure)
    Question 2: relationship between two numeraires. Start from the two identities
    E1:  A(0)/B(0) = E^B[A(T)/B(T)]        E2:  E^P[X] = E^Q[(dP/dQ)·X]
    Take numeraires P and Q. Applying E1 with B = Q, then with B = P, then E2:
    E^Q[(A(T)/Q(T)) · (Q(0)/P(0))] = (A(0)/Q(0)) · (Q(0)/P(0)) = A(0)/P(0) = E^P[A(T)/P(T)] = E^Q[(dP/dQ) · (A(T)/P(T))]
    Matching the two Q-expectations for every asset A gives
    dP/dQ = (Q(0)/P(0)) · (P(T)/Q(T))
    Question 3: why change measure? Because it simplifies the pricing formula:
    Measure               Risk-Neutral E^Q                Forward E^T
    Numeraire             bank account β(t)               zero-coupon bond P(t, T)
    Property              β(0) = 1                        P(T, T) = 1
    Martingale formula    V(0)/β(0) = E^Q[V(T)/β(T)]      V(0)/P(0, T) = E^T[V(T)/P(T, T)]
    Simplified formula    V(0) = E^Q[V(T)/β(T)]           V(0) = P(0, T) · E^T[V(T)]
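    A minimal numeric sketch of E1/E2 (the coin probabilities come from the slide; the payoffs x1 = 10, x2 = 2 are illustrative): verify E^P[X] = E^Q[Z·X] with Z = dP/dQ.

    p = [0.5, 0.5]     # fair coin, measure P
    q = [0.8, 0.2]     # biased coin, measure Q
    x = [10.0, 2.0]    # payoff of X in each state

    E_P = sum(pi * xi for pi, xi in zip(p, x))
    Z = [pi / qi for pi, qi in zip(p, q)]      # Radon-Nikodym derivative dP/dQ, state by state
    E_Q_ZX = sum(qi * zi * xi for qi, zi, xi in zip(q, Z, x))
    print(E_P, E_Q_ZX)   # both 6.0: changing measure reweights the states, not the answer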
  • 29. Mean Machine 2.5 Instrument Valuation - Pricing Methods
    V(t) = N(t) × Et^N[V(T)/N(T)]
    Closed-Form, Numerical Integration:
    1. Find the PDF of V(T)/N(T) under measure N.
    2. Represent the expectation as an integral.
    3. Simplify to closed form if possible; otherwise leave it as a numerical integration.
    PDE Finite Difference Method:
    1. Change measure N to the risk-neutral measure.
    2. Use the Feynman-Kac theorem to derive the PDE of V.
    3. Fix the solution domain, construct the grid, set terminal and boundary conditions, discretize the derivatives in the spatial and time dimensions, and adopt a finite difference scheme.
    Monte Carlo Method:
    1. By the law of large numbers, approximate E[V/N] by the average of Vi/Ni over simulated paths.
    2. Adopt variance reduction techniques to enhance Monte Carlo efficiency.
  • 30. Mean Machine 2.5 Instrument Valuation - Closed-Form, Numerical Integration
    Black-Scholes Model:  dS(t)/S(t) = (r − q)dt + σdB(t)
    Closed-Form (ω = +1 for a call, −1 for a put):
    V = ω·[e^(−qT)·S0·Φ(ωd+) − e^(−rT)·K·Φ(ωd−)]
    • d± = (1/(σ√T))·ln(S0·e^((r−q)T)/K) ± σ√T/2
    Heston Model:  dS(t)/S(t) = (r − q)dt + √v(t)·dB1(t),  dv(t) = κ(θ − v(t))dt + η√v(t)·dB2(t),  dB1·dB2 = ρdt
    Numerical Integration:
    V = ω·[e^(−qT)·S0·P1(ω) − e^(−rT)·K·P2(ω)]
    • Pj(ω) = ½(1 − ω) + ω·Pj(S0, v0, T, K)
    • Pj(x, v, T, y) = ½ + (1/π)·∫0^∞ Re[ e^(Cj(T,ϕ) + Dj(T,ϕ)·v + iϕ·ln(x/y)) / (iϕ) ] dϕ
    • Cj(T, ϕ) = (r − q)·Tϕi + (κθ/η²)·[ (bj − ρηϕi + dj)·T − 2·ln((1 − gj·e^(djT))/(1 − gj)) ]
    • Dj(T, ϕ) = [bj − ρηϕi + dj − (bj − ρηϕi − dj)·gj·e^(djT)] / [η²·(1 − gj·e^(djT))]
    • dj = √[(bj − ρηϕi)² − η²·(2ujϕi − ϕ²)],   gj = (bj − ρηϕi + dj)/(bj − ρηϕi − dj)
    • b1 = κ − ρη,  b2 = κ,  u1 = ½,  u2 = −½
    Techniques used: Itô's formula, Girsanov's theorem, moment matching, drift interpolation, parameter averaging.
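    A minimal sketch of the Black-Scholes closed form above (the sample inputs are illustrative; ω = +1 prices a call, −1 a put):

    from math import log, sqrt, exp
    from statistics import NormalDist

    def black_scholes(S0, K, T, r, q, sigma, omega=1):
        Phi = NormalDist().cdf
        d_plus = log(S0 * exp((r - q) * T) / K) / (sigma * sqrt(T)) + sigma * sqrt(T) / 2
        d_minus = d_plus - sigma * sqrt(T)
        return omega * (exp(-q * T) * S0 * Phi(omega * d_plus)
                        - exp(-r * T) * K * Phi(omega * d_minus))

    print(black_scholes(S0=100, K=100, T=1.0, r=0.05, q=0.0, sigma=0.2))   # about 10.45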
  • 31. Mean Machine 2.5 Instrument Valuation - PDE Finite Difference Method (SDE to PDE)
    Given the SDE of x(t) and the payoff function V of a derivative at maturity T:
    dx(t) = μ(t, x)dt + σ(t, x)dB(t),   V(x(T), T) = h(x(T))
    By the no-arbitrage principle, V(x, t) = e^(−r(T−t)) · Et[h(x(T))], and by the Feynman-Kac theorem V(x, t) satisfies the PDE:
    ∂V/∂t + μ(t, x)·∂V/∂x + ½σ²(t, x)·∂²V/∂x² − rV = 0
  • 32. Mean Machine 2.5 Instrument Valuation - PDE Finite Difference Method (Grid Construction)
    [Diagram: rectangular (t, x) grid with the terminal condition at tn = T, boundary conditions at x0 and xm+1, and interior points (ti, xj).]
    x ∈ {xj, j = 0, ..., m+1},  xj = xmin + jΔx,  Δx = (xmax − xmin)/(m + 1)
    t ∈ {ti, i = 0, ..., n},    ti = iΔt,         Δt = T/n
  • 33. Mean Machine 2.5 Instrument Valuation - PDE Finite Difference Method (Discretization and Scheme)
    Use central differences for ∂Vj/∂x and ∂²Vj/∂x² at xj, and discretize ∂V/∂t at the weighted time tθ(i,i+1) = θ·ti + (1 − θ)·ti+1:
    1st order (space):  ∂Vj(t)/∂x ≈ [Vj+1(t) − Vj−1(t)] / (2Δx)
    2nd order (space):  ∂²Vj(t)/∂x² ≈ [Vj+1(t) − 2Vj(t) + Vj−1(t)] / (Δx)²
    1st order (time):   ∂V/∂t ≈ [V(ti+1) − V(ti)] / Δt
    Scheme: θ = 0 fully explicit, θ = 1 fully implicit, θ = ½ Crank-Nicolson.
  • 34. Mean Machine 2.5 Instrument Valuation - PDE Finite Difference Method (Representation)
    The difference equation at (tθ(i,i+1), xj) is
    [Vj(ti+1) − Vj(ti)]/Δt = −μ(tθ, xj)·[Vj+1(tθ) − Vj−1(tθ)]/(2Δx) − ½σ²(tθ, xj)·[Vj+1(tθ) − 2Vj(tθ) + Vj−1(tθ)]/(Δx)² + r(tθ, xj)·Vj(tθ)
    Writing the algebraic form in matrix form, with I the identity matrix, A a tri-diagonal matrix, and Ω the boundary value vector:
    [I − θΔt·A(tθ)]·V(ti) = [I + (1 − θ)Δt·A(tθ)]·V(ti+1) + θ·Ω(ti) + (1 − θ)·Ω(ti+1)
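    A minimal sketch of one backward time step of the matrix equation above (assumptions: constant coefficients μ, σ, r; boundary values already folded into the Ω vectors; a dense solve instead of a production tridiagonal solver; all names are illustrative):

    import numpy as np

    def theta_step(V_next, mu, sigma, r, dx, dt, theta, omega_i, omega_ip1):
        # Solve [I - theta*dt*A] V(ti) = [I + (1-theta)*dt*A] V(ti+1) + theta*Om(ti) + (1-theta)*Om(ti+1)
        m = len(V_next)
        lower = -mu / (2 * dx) + sigma**2 / (2 * dx**2)   # coefficient of V(j-1)
        diag = -sigma**2 / dx**2 - r                      # coefficient of V(j)
        upper = mu / (2 * dx) + sigma**2 / (2 * dx**2)    # coefficient of V(j+1)
        A = (np.diag([lower] * (m - 1), -1)
             + np.diag([diag] * m)
             + np.diag([upper] * (m - 1), 1))             # tri-diagonal generator A
        I = np.eye(m)
        rhs = (I + (1 - theta) * dt * A) @ V_next + theta * omega_i + (1 - theta) * omega_ip1
        return np.linalg.solve(I - theta * dt * A, rhs)   # dense solve, for clarity only

    Stepping backward from the terminal condition V(tn) = h(x) to t0 repeats this solve n times; θ = ½ gives the Crank-Nicolson scheme of the previous slide.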
  • 35. Mean Machine 2.5 Instrument Valuation - Monte Carlo Method (Fundamentals)
    Consider a derivative V with time-T payout V(T) = g(T). By the no-arbitrage principle:
    V(t) = N(t) × Et^N[g(T)/N(T)]
    Law of Large Numbers: let Y1, Y2, ..., Yn be a sequence of i.i.d. random variables with finite expectation μ. The sample mean Ȳ(n) = (1/n)·Σi Yi converges to the population mean μ as n → ∞. Hence
    V(t) ≈ N(t) × (1/n)·Σi gi/Ni
    Central Limit Theorem: for i.i.d. Yi with finite expectation μ and standard deviation σ, as n → ∞
    [Ȳ(n) − μ] / (s(n)/√n) ~ N(0, 1),   where s²(n) = (1/(n−1))·Σi [Yi − Ȳ(n)]²
    Confidence interval:  V(t) ∈ [V̄(t) − z(α/2)·s(n)/√n,  V̄(t) + z(α/2)·s(n)/√n]
    The standard error s(n)/√n shrinks either by reducing the variance or by increasing the sample size.
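    A minimal sketch (assuming Black-Scholes dynamics and the same illustrative inputs as the earlier closed-form example; this is not the author's production pricer): a risk-neutral Monte Carlo call price with a 95% confidence interval.

    import numpy as np

    def mc_call(S0, K, T, r, q, sigma, n, seed=0):
        rng = np.random.default_rng(seed)
        Z = rng.standard_normal(n)
        ST = S0 * np.exp((r - q - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
        Y = np.exp(-r * T) * np.maximum(ST - K, 0.0)   # discounted payoff per path
        se = Y.std(ddof=1) / np.sqrt(n)                # standard error s(n)/sqrt(n)
        return Y.mean(), (Y.mean() - 1.96 * se, Y.mean() + 1.96 * se)

    price, ci = mc_call(S0=100, K=100, T=1.0, r=0.05, q=0.0, sigma=0.2, n=200_000)
    print(price, ci)   # close to the closed-form value of about 10.45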
  • 36. Mean Machine 2.5 Instrument Valuation - Monte Carlo Method (Variance Reduction)
    Every technique keeps the estimator unbiased, E[Ynew] = E[Y], while reducing its variance.
    Antithetic Variate: Yi(av) = (Yi(1) + Yi(2))/2, pairing each path with its mirror path.
      E[Ynew] = E[(Y(1) + Y(2))/2] = E[Y]
      Var[Ynew] = (1 + ρ)·Var[Y]/(2n) < Var[Y]/(2n) when the pair correlation ρ < 0
    Control Variate: Ynew = Y + c·(Ycv − μcv), where Ycv has a known mean μcv.
      E[Ynew] = E[Y] + c·E[Ycv − μcv] = E[Y]
      min over c: Var[Ynew] = (1 − ρ²(Y, Ycv))·Var[Y] ≤ Var[Y]
    Conditioning: Ynew = E[Y|Z].
      E[Ynew] = E[E[Y|Z]] = E[Y]
      Var[Y] = E[Var(Y|Z)] + Var(E[Y|Z]) ≥ Var(E[Y|Z]) = Var[Ynew]
    Stratified Sampling: partition by Z and draw Nj samples in stratum j.
      E[Ynew] = E[E[Y|Z]] = E[Y]
      min over Nj: Var[Ynew] ≤ E[Var(Y|Z)] ≤ Var[Y]
    Importance Sampling: sample from g instead of f.
      Ef[V(X)] = ∫V(x)·f(x)dx = ∫V(x)·(f(x)/g(x))·g(x)dx = Eg[V(X)·f(X)/g(X)]
      Varf[V(X)] − Varg[V(X)·f(X)/g(X)] = ∫V²(x)·f(x)·(1 − f(x)/g(x))dx > 0
      when g(x) > f(x) where V²(x)f(x) is large, and g(x) < f(x) where V²(x)f(x) is small.
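    A minimal sketch of the antithetic-variate technique (reusing the same illustrative Black-Scholes call): pair each normal draw Z with −Z and average the two payoffs.

    import numpy as np

    def mc_call_antithetic(S0, K, T, r, q, sigma, n_pairs, seed=0):
        rng = np.random.default_rng(seed)
        Z = rng.standard_normal(n_pairs)
        drift = (r - q - 0.5 * sigma**2) * T
        payoff = lambda z: np.exp(-r * T) * np.maximum(
            S0 * np.exp(drift + sigma * np.sqrt(T) * z) - K, 0.0)
        Y = 0.5 * (payoff(Z) + payoff(-Z))   # antithetic pair average, pair correlation < 0
        se = Y.std(ddof=1) / np.sqrt(n_pairs)
        return Y.mean(), se

    print(mc_call_antithetic(100, 100, 1.0, 0.05, 0.0, 0.2, n_pairs=100_000))
    # same expectation as the plain estimator, with a smaller standard error per draw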
  • 37. Mean Machine 2.6 Risk Measurement - Sensitivity
    Bump and Revaluation:
    ∂V/∂Θk ≈ [V(Θ1, ..., Θk + Δ, ..., ΘK) − V(Θ1, ..., Θk, ..., ΘK)] / Δ
    Pathwise Differentiation (e.g. delta of a European option in Monte Carlo):
    ∂V/∂S0 = E[∂g(ST)/∂ST · ∂ST/∂S0] = E[∂(ST − K)⁺/∂ST · ∂ST/∂S0] = E[1{ST > K} · ∂ST/∂S0]
    Likelihood Ratio (e.g. delta of a digital option in Monte Carlo):
    ∂V/∂S0 = ∂/∂S0 ∫ g(ST)·f(ST; S0)dST = ∫ g(ST)·[∂f(ST; S0)/∂S0]dST = ∫ g(ST)·[fS0(ST; S0)/f(ST; S0)]·f(ST; S0)dST = E[g(ST)·fS0(ST; S0)/f(ST; S0)]
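    A minimal sketch comparing the first two estimators (reusing the illustrative Black-Scholes call and common random numbers so the estimates are comparable; under GBM, ∂ST/∂S0 = ST/S0):

    import numpy as np

    S0, K, T, r, q, sigma, n = 100.0, 100.0, 1.0, 0.05, 0.0, 0.2, 200_000
    rng = np.random.default_rng(0)
    Z = rng.standard_normal(n)

    def terminal(S):   # same draws for base and bumped valuations
        return S * np.exp((r - q - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

    disc = np.exp(-r * T)
    bump = 0.01 * S0
    V_up = disc * np.maximum(terminal(S0 + bump) - K, 0.0).mean()
    V_base = disc * np.maximum(terminal(S0) - K, 0.0).mean()
    delta_bump = (V_up - V_base) / bump

    ST = terminal(S0)
    delta_pathwise = (disc * (ST > K) * ST / S0).mean()   # E[1{ST > K} * dST/dS0]
    print(delta_bump, delta_pathwise)                     # both near the BS delta of about 0.64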
  • 38. Mean Machine 2.6 Risk Measurement - Value-at-Risk
    Window: 1 year. Holding period: 10 days. Confidence level: 99%.
    [Diagram: historical-simulation pipeline. Risk factors RF1, ..., RFn → historical time series (about 260 daily observations) → perturbations Δ → 250 historical scenarios S → simulated portfolio PVs PV1, ..., PV250 → simulated P&Ls P&L1, ..., P&L250 → VaR read off the P&L distribution.]
    Portfolio VaR at confidence level α% is the loss exceeded with probability (100 − α)%, i.e. the (100 − α)% quantile of the simulated P&L distribution.
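    A minimal sketch of the last step (pnl stands in for the 250 simulated scenario P&Ls of the diagram; random numbers replace a real portfolio):

    import numpy as np

    rng = np.random.default_rng(0)
    pnl = rng.normal(loc=0.0, scale=1e6, size=250)   # stand-in scenario P&Ls

    confidence = 0.99
    var_99 = -np.percentile(pnl, 100 * (1 - confidence))   # loss exceeded 1% of the time
    print(f"99% VaR = {var_99:,.0f}")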
  • 40. Mean Machine 3.1 Overview
    Pipeline: Data Preprocessing → Stock Selection → Portfolio Construction
    Data Preprocessing:
    • Data Collection
    • Outlier Handling: MAD, 3σ, Percentile
    • Standardization: Raw, Ranked
    Stock Selection (Traditional Approach):
    • Single-Factor Test: IC, Stratified Backtesting
    • Multi-Factor Test: Correlation, Factor Synthesis
    • Multi-Factor Linear Regression
    Stock Selection (Machine Learning Approach):
    • Import Package → Parameter Setting → Data Labeling → Data Splitting → Model Setting → Model Training → Model Assessment → Strategy Implementation → Strategy Assessment
    Portfolio Construction:
    • Optimization: EW, MVO, GMV, MDP, RP, RB, EMV, BL
    • Constraints: Industry, Factor Exposure, Stock
  • 42. Mean Machine 3.3 Data Preprocessing - Data Collection
    # Research-platform API (RiceQuant): pull price-to-book and market cap
    # for all listed stocks on one date, then build the book-to-price factor
    date = '2018-1-4'
    stocks = all_instruments(type="CS", date=date).order_book_id.tolist()
    data = get_fundamentals(
        query(
            fundamentals.eod_derivative_indicator.pb_ratio,
            fundamentals.eod_derivative_indicator.market_cap
        ).filter(
            fundamentals.income_statement.stockcode.in_(stocks)
        ), date, '1d').major_xs(date).dropna()
    data['BP'] = 1 / data['pb_ratio']        # book-to-price factor
    data.head(3).append(data.tail(3))        # peek at the first and last rows
  • 43. Mean Machine 3.3 Data Preprocessing - Outlier Handling
    import numpy as np   # needed for np.clip below

    # MAD: clip at median ± n × median absolute deviation
    def filter_extreme_MAD(series, n):
        median = series.quantile(0.5)
        mad = (series - median).abs().quantile(0.5)
        return np.clip(series, median - n * mad, median + n * mad)

    # 3 Sigma: clip at mean ± n × standard deviation
    def filter_extreme_3sigma(series, n=3):
        mean, std = series.mean(), series.std()
        return np.clip(series, mean - n * std, mean + n * std)

    # Percentile: clip at the given lower/upper quantiles
    def filter_extreme_percentile(series, low=0.025, high=0.975):
        q = series.quantile([low, high])
        return np.clip(series, q.iloc[0], q.iloc[1])
  • 44. Mean Machine 3.3 Data Preprocessing - Standardization
    Standardized raw factor:    zi = (Xi − μ)/σ
    Standardized ranked factor: zi = (Yi − μ)/σ, where Y = Rank(X)

    def standardize_series(series):
        return (series - series.mean()) / series.std()

    new = filter_extreme_3sigma(data['BP'])
    ax = standardize_series(new).plot.kde(label='Standardized Raw Factor')
    ax.legend()
    ax = standardize_series(new.rank()).plot.kde(label='Standardized Ranked Factor')
    ax.legend()
  • 45. Mean Machine 3.4 Stock Selection - Traditional Approach
    Multi-Factor Model: the basic premise is that similar assets display similar returns.
    ri = βi1·f1 + ... + βiK·fK + εi = Σk βik·fk + εi
    where ri is the excess return, βik the factor exposure, fk the factor premium, and εi the specific return.
    Estimating factor premiums: consider three stocks (Apple, Facebook, Google) and two factors (PE: price-to-earnings; DY: dividend yield). For each t:
    1. Collect the factor exposures βi1(t−1) and βi2(t−1).
    2. Collect the stock prices at t−1 and t, and compute the excess returns ri(t).
    3. Perform a cross-sectional regression to get the factor premiums f1(t) and f2(t):
       r1(t) = β11(t−1)·f1(t) + β12(t−1)·f2(t)
       r2(t) = β21(t−1)·f1(t) + β22(t−1)·f2(t)
       r3(t) = β31(t−1)·f1(t) + β32(t−1)·f2(t)
    Collect the time series of premiums f(1), f(2), ..., f(T) from the regressions, then predict the factor premium f(T+1) with a time-series model: AR(p), MA(q), ARMA(p, q), ... A sketch of the cross-sectional step follows below.
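    A minimal sketch of one cross-sectional regression (the exposures and returns are made up for illustration; in practice they come from the data pipeline above):

    import numpy as np

    B = np.array([[0.8, 1.2],     # Apple:    [PE exposure, DY exposure] at t-1
                  [1.1, 0.5],     # Facebook
                  [0.9, 0.7]])    # Google
    r = np.array([0.02, 0.01, 0.015])   # excess returns at t

    f, *_ = np.linalg.lstsq(B, r, rcond=None)   # factor premiums f1(t), f2(t)
    print(f)   # one data point of the factor-premium time series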
  • 46. Mean Machine 3.4 Stock Selection - Traditional Approach
    Predict next month's excess returns, then rank and select:
    B(T+1) = the N × K matrix of factor exposures βi,k(T+1)
    f(T+1) = [f1(T+1), f2(T+1), ..., fK(T+1)]ᵀ, the predicted factor premiums
    ri(T+1) = Σk βi,k(T+1) · fk(T+1),   i.e.  r(T+1) = B(T+1)·f(T+1)
    Rank the predicted excess returns r(T+1) = [r1(T+1), ..., rN(T+1)]ᵀ and select the top stocks.
  • 47. Mean Machine 3.4 Stock Selection - Machine Learning Approach
    Import Package → Parameter Setting → Data Labeling → Data Splitting → Model Setting → Model Training → Model Assessment → Strategy Implementation → Strategy Assessment
  • 48. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Import Package)
    import numpy as np                  # matrix computation
    import pandas as pd                 # dataframes
    %matplotlib inline                  # show plots in this notebook
    import matplotlib.pyplot as plt     # plotting
    from sklearn.model_selection import train_test_split   # split training and validation sets
    from sklearn.model_selection import GridSearchCV       # select hyper-parameters by cross-validation error
    from sklearn.model_selection import KFold               # CV splitter for binary or balanced classes
    from sklearn.model_selection import StratifiedKFold     # CV splitter for multi-class or imbalanced classes
    from sklearn import metrics as me   # evaluation metrics
    # machine learning models
    from sklearn.svm import SVC                                      # support vector machine
    from sklearn.ensemble import RandomForestClassifier as RFC      # random forest
    from sklearn.ensemble import GradientBoostingClassifier as GBC  # gradient boosted tree
  • 49. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Parameter Setting)
    class PARA:
        method = 'SVM'                    # model choice: 'SVM', 'RF' or 'GBT'
        month_train = range(1, 84 + 1)    # in-sample: 84 monthly data files = 84 training months
        month_test = range(85, 120 + 1)   # out-of-sample: 36 monthly data files = 36 test months
        percent_select = [0.5, 0.5]       # 50% positive examples, 50% negative examples
        cv = 10                           # 10-fold cross-validation
        seed = 1                          # random seed, for reproducible results
    para = PARA()
  • 50. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Data Labeling)
    Data format: an m × n csv file per month; the first row contains the column labels, the next m−1 rows the information of m−1 stocks.
    • Columns 1-3: basic information
    • Column 4: excess return of the next month
    • Columns 5-n: factor exposures

    def label_data(data):
        data['Label'] = np.nan                                      # initialization
        data = data.sort_values(by='Return', ascending=False)       # sort excess return in descending order
        n_stock = np.multiply(para.percent_select, data.shape[0])   # number of stocks for pos and neg class
        n_stock = np.around(n_stock).astype(int)                    # round number of stocks to integer
        data.iloc[0:n_stock[0], -1] = 1                             # assign 1 to the best-performing stocks
        data.iloc[-n_stock[1]:, -1] = 0                             # assign 0 to the worst-performing stocks
        data = data.dropna(axis=0)                                  # drop middle stocks left with NaN labels
        return data
  • 51. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Data Splitting)
    for i in para.month_train:                                # load csv files month by month
        file_name = str(i) + '.csv'
        data_curr_month = pd.read_csv(file_name, header=0)
        para.n_stock = data_curr_month.shape[0]
        data_curr_month = data_curr_month.dropna(axis=0)      # remove NaN rows
        data_curr_month = label_data(data_curr_month)         # label data
        if i == para.month_train[0]:                          # first month
            data_train = data_curr_month
        else:                                                 # merge into a single dataframe
            data_train = data_train.append(data_curr_month)

    X = data_train.loc[:, 'EP':'BIAS']
    y = data_train.loc[:, 'Label']
    X_train, X_cv, y_train, y_cv = train_test_split(
        X, y, test_size=1.0/para.cv, random_state=para.seed)
  • 52. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Model Setting)
    if para.method == 'SVM':      # support vector machine
        model = SVC(kernel='linear', C=1)
    elif para.method == 'RF':     # random forest
        model = RFC(n_estimators=200, max_depth=6, random_state=para.seed)
    elif para.method == 'GBT':    # gradient boosted tree
        model = GBC(n_estimators=200, max_depth=6, random_state=para.seed)
  • 53. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Model Training)
    Default parameters:
    model.fit(X_train, y_train)
    y_pred_train = model.predict(X_train)
    y_score_train = model.decision_function(X_train)
    y_pred_cv = model.predict(X_cv)
    y_score_cv = model.decision_function(X_cv)
    print('Training set, accuracy = %.2f' % me.accuracy_score(y_train, y_pred_train))
    print('Training set, AUC = %.2f' % me.roc_auc_score(y_train, y_score_train))
    print('Validation set, accuracy = %.2f' % me.accuracy_score(y_cv, y_pred_cv))
    print('Validation set, AUC = %.2f' % me.roc_auc_score(y_cv, y_score_cv))

    Hyperparameter tuning:
    kernel = ['linear', 'rbf']
    C = [0.01, 0.1, 1, 10]
    param_grid = dict(kernel=kernel, C=C)
    kfold = StratifiedKFold(n_splits=PARA.cv, shuffle=True, random_state=PARA.seed)
    grid_search = GridSearchCV(model, param_grid, n_jobs=-1, cv=kfold, verbose=1)
    grid_result = grid_search.fit(X, y)
  • 54. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Model Training)
    best_model = grid_result.best_estimator_

    # summarize results
    print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
    means = grid_result.cv_results_['mean_test_score']
    stds = grid_result.cv_results_['std_test_score']
    params = grid_result.cv_results_['params']
    for mean, stdev, param in zip(means, stds, params):
        print("%f (%f) with: %r" % (mean, stdev, param))

    # plot CV score against C, one curve per kernel
    scores = np.array(means).reshape(len(kernel), len(C))
    for i, value in enumerate(kernel):
        plt.plot(C, scores[i], label='kernel: ' + str(value))
    plt.legend()
    plt.xlabel('C')
    plt.ylabel('CV score')
    plt.savefig('kernel_vs_C.png')
  • 55. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Model Assessment)
    for i in para.month_test:   # print accuracy and AUC for each test month
        y_true_i_month = pd.DataFrame({'Return': y_true_test.iloc[:, i-1]})
        y_pred_i_month = y_pred_test.iloc[:, i-1]
        y_score_i_month = y_score_test.iloc[:, i-1]
        y_true_i_month = y_true_i_month.dropna(axis=0)       # remove NaN
        y_i_month = label_data(y_true_i_month)['Label']
        y_pred_i_month = y_pred_i_month[y_i_month.index].values
        y_score_i_month = y_score_i_month[y_i_month.index].values
        print('test set, month %d, accuracy = %.2f' % (i, me.accuracy_score(y_i_month, y_pred_i_month)))
        print('test set, month %d, AUC = %.2f' % (i, me.roc_auc_score(y_i_month, y_score_i_month)))
    ...
  • 56. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Strategy Implementation)
    n_stock = 15
    strategy = pd.DataFrame({'Return': [0]*para.month_test[-1],
                             'Value': [1]*para.month_test[-1]})
    for i in para.month_test:
        y_true_i_month = y_true_test.iloc[:, i-1]
        y_score_i_month = y_score_test.iloc[:, i-1]
        y_score_i_month = y_score_i_month.sort_values(ascending=False)     # sort scores in descending order
        i_index = y_score_i_month[0:n_stock].index                         # indices of the top 15 stocks
        strategy.loc[i-1, 'Return'] = np.mean(y_true_i_month[i_index])/100 # mean return of the 15 stocks
    strategy['Value'] = (strategy['Return'] + 1).cumprod()                 # compound the monthly returns into total value
  • 57. Mean Machine 3.4 Stock Selection - Machine Learning Approach (Strategy Assessment)
    strategy_value = strategy.reindex(index=para.month_test, columns=['Value'])
    strategy_return = strategy.reindex(index=para.month_test, columns=['Return'])
    plt.plot(para.month_test, strategy_value, 'r-')
    plt.show()

    excess_return = np.mean(strategy_return) * 12        # annualize the monthly mean
    excess_vol = np.std(strategy_return) * np.sqrt(12)   # annualize the monthly volatility
    IR = excess_return / excess_vol                      # information ratio
    print('annual excess return = %.2f' % excess_return)
    print('annual excess volatility = %.2f' % excess_vol)
    print('information ratio = %.2f' % IR)
  • 58. Mean Machine 3.5 Portfolio Construction
    [Decision tree: Are expected returns and variances known? Yes → MVO. No → any view on return? Yes → BL. No → any view on risk? Yes → RB. No → RP.
    Special-case reductions: MVO → GMV (same expected return), MVO → MDP (same Sharpe ratio), RB → RP (same risk budget), RP → EMV (zero correlation), EMV → EW (same volatility).]
    • MVO: Mean-Variance Optimization
    • MDP: Most Diversified Portfolio
    • GMV: Global Minimum Variance
    • EMV: Equal Marginal Volatility
    • EW: Equal Weight
    • RB: Risk Budgeting
    • RP: Risk Parity
    • BL: Black-Litterman
  • 60. THANK YOU STEVEN WANG SHENGYUAN Wechat Account: MeanMachine1031