2. Luc_Faucheux_2020
Recombining Binomial trees
¨ Fast, memory efficient, numerically stable and well understood (number of nodes ~ N^2)
¨ Can be used to run Monte Carlo simulations on the trees
¨ Arbitrage is respected ONLY on average on a slice
3. Luc_Faucheux_2020
Recombining Binomial trees
¨ Cropping
¨ Mean reversion
¨ Storing curve in memory versus recalculating on the fly (storing discounts versus calling
exp())
¨ Single volatility models for callable as an example
4. Luc_Faucheux_2020
Non-recombining binomial tree
¨ Respects the arbitrage-free condition at EVERY node in the tree
¨ Simplest to implement
¨ Very close to Monte Carlo simulations
¨ Very expensive in CPU and memory (number of nodes ~ 2^N)
5. Luc_Faucheux_2020
Monte Carlo simulation
¨ Bundling
¨ Regression
¨ Choice of regression factors
¨ Numerically noisy (accuracy ~ N^(-1/2))
¨ Very CPU intensive
¨ But very intuitive, very flexible, and as we saw in the Skew module, you can run a Monte
Carlo on a simpler model, or on an entire portfolio once you have created a portfolio map
7. Luc_Faucheux_2020
The glorious life of a valiant forward
¨ f(t,t1,t2) is the forward rate between the time t1 and t2 on the curve observed at time t
¨ t, t1 and t2 are by convention absolute times
¨ f(t,t1,t2) evolves from (t) to (t+1) into f(t+1,t1,t2) with instantaneous volatility $\sigma(t, t_1, t_2)$
¨ f(t,t1,t2) “dies” as the anchor overnight rate on the curve observed at time t2
¨ “Rolling forward” convention as opposed to “constant forward”
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,6,7)
f(2,2,3) f(2,6,7)
f(3,3,4) f(3,6,7)
f(4,4,5) f(4,6,7)
f(5,5,6) f(5,6,7)
f(6,6,7)
f(7,7,8)
f(8,8,9)
f(9,9,10)
f(10,10,11)
f(11,11,12)
f(12,12,13)
8. Luc_Faucheux_2020
The glorious life of a valiant forward
¨ Each line can be viewed as the new curve at time t
¨ Today (t=0) curve is defined by the successive forwards f(0,0,1), f(0,1,2)…..
¨ At time t the curve will then be defined by the successive forwards f(t,t,t+1), f(t,t+1,t+2),…
¨ Similar to our HJM spreadsheet, but sliding one step back down the curve each time
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,6,7)
f(2,2,3) f(2,6,7)
f(3,3,4) f(3,6,7)
f(4,4,5) f(4,6,7)
f(5,5,6) f(5,6,7)
f(6,6,7)
f(7,7,8)
f(8,8,9)
f(9,9,10)
f(10,10,11)
f(11,11,12)
f(12,12,13)
9. Luc_Faucheux_2020
The glorious life of a valiant forward
¨ In practice, $\sigma(t, t_1, t_2)$ tends to 0 when (t=t1), and has a maximum in the “belly” of the
curve
¨ In reality, $\sigma(t, t_1, t_2)$ is also dependent on the actual forward f(t,t1,t2) as well as previous
instantaneous volatilities (GARCH for example) and previous forwards
¨ A common assumption is for the volatility $\sigma(t, t_1, t_2)$ to be stationary for the same class of
forwards. A class of forward is defined as all forwards of equal maturity T: (t2-t1=T)
¨ $\sigma(t, t_1, t_2) = \hat{\sigma}(t_1 - t)$
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,6,7) f(1,11,12)
f(2,2,3) f(2,6,7) f(2,11,12)
f(3,3,4) f(3,6,7) f(3,11,12)
f(4,4,5) f(4,6,7) f(4,11,12)
f(5,5,6) f(5,6,7) f(5,11,12)
f(6,6,7) f(6,11,12)
f(7,7,8) f(7,11,12)
f(8,8,9) f(8,11,12)
f(9,9,10) f(9,11,12)
f(10,10,11) f(10,11,12)
f(11,11,12)
f(12,12,13)
10. Luc_Faucheux_2020
Regular Eurodollar options or caplet
¨ Average variance for the forward over its life; the option expires at the same time as the
forward
¨ $\sigma^2 \cdot t_1 = \int_{t=0}^{t=t_1} \sigma^2(t, t_1, t_2)\,dt = \int_{t=0}^{t=t_1} \hat{\sigma}^2(t_1 - t)\,dt$
¨ Pricing options at different strikes K, and expressing those option prices in a
common model (say Lognormal or Normal), will return the skew and smile expressed within
that model
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,11,12)
f(2,2,3) f(2,11,12)
f(3,3,4) f(3,11,12)
f(4,4,5) f(4,11,12)
f(5,5,6) f(5,11,12)
f(6,6,7) f(6,11,12)
f(7,7,8) f(7,11,12)
f(8,8,9) f(8,11,12)
f(9,9,10) f(9,11,12)
f(10,10,11) f(10,11,12)
f(11,11,12)
f(12,12,13)
11. Luc_Faucheux_2020
Mid-curve Eurodollar options or forward caplets
¨ Average variance for the forward over the life of the option; the option expires BEFORE the forward, at a
time Texp
¨ $\sigma^2 \cdot t_{exp} = \int_{t=0}^{t=t_{exp}} \sigma^2(t, t_1, t_2)\,dt = \int_{t=0}^{t=t_{exp}} \hat{\sigma}^2(t_1 - t)\,dt$
¨ Pricing options at different strikes K, and expressing those option prices in a
common model (say Lognormal or Normal), will return the skew and smile expressed within
that model
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,11,12)
f(2,2,3) f(2,11,12)
f(3,3,4) f(3,11,12)
f(4,4,5) f(4,11,12)
f(5,5,6) f(5,11,12)
f(6,6,7) f(6,11,12)
f(7,7,8) Texpiry f(7,11,12)
f(8,8,9)
f(9,9,10)
f(10,10,11)
f(11,11,12)
f(12,12,13)
12. Luc_Faucheux_2020
A swap is a weighted basket of forwards
¨ Consider a swap with swap rate R (at-the-money swap rate)
– Nfloat periods on the Float side with forecasted forward f(i)
– indexed by i, with
– daycount fraction DCF(i),
– discount D(i)
– Notional N(i)
– Nfixed periods on the Fixed side,
– indexed by j, with
– daycount fraction DCF(j),
– discount D(j)
– Notional N(j)
$\sum_i DCF(i) \cdot D(i) \cdot N(i) \cdot f(i) = \sum_j DCF(j) \cdot D(j) \cdot N(j) \cdot R$
13. Luc_Faucheux_2020
A swap rate is a weighted basket of forward rates
¨ At-the-money swap rate equation: $\sum_i DCF(i) \cdot D(i) \cdot N(i) \cdot f(i) = \sum_j DCF(j) \cdot D(j) \cdot N(j) \cdot R$
¨ The above equation is valid at all times before the swap start, with forwards and discount factors
calculated on the then-current discount curve in the usual way, provided period i on the
float side starts at time ts(i) and ends at time te(i), and the forward is “aligned” with the
period (no swap in arrears or CMS-like structure)
¨ $R(t) = \frac{\sum_i DCF(i) \cdot D(i) \cdot N(i) \cdot f(t, ts(i), te(i))}{\sum_j DCF(j) \cdot D(j) \cdot N(j)}$
¨ “Frozen numeraire” approximation: expand the above equation to first order in the forward rates while
keeping the discount factors constant
¨ $dR(t) = \frac{\sum_i DCF(i) \cdot D(i) \cdot N(i) \cdot df(t, ts(i), te(i))}{\sum_j DCF(j) \cdot D(j) \cdot N(j)}$
¨ Taking the square of the above yields the instantaneous volatility of the swap rate
¨ $\Lambda_R^2 \cdot dt = \langle dR^2 \rangle = \frac{\sum_{i_1} \sum_{i_2} DCF(i_1) \cdot D(i_1) \cdot N(i_1) \cdot DCF(i_2) \cdot D(i_2) \cdot N(i_2) \cdot \langle df(i_1) \cdot df(i_2) \rangle}{\sum_{j_1} \sum_{j_2} DCF(j_1) \cdot D(j_1) \cdot N(j_1) \cdot DCF(j_2) \cdot D(j_2) \cdot N(j_2)}$
14. Luc_Faucheux_2020
A swap rate is a weighted basket of forward rates
¨ Instantaneous volatility of the swap rate
¨ $\Lambda_R^2 \cdot dt = \langle dR^2 \rangle = \frac{\sum_{i_1} \sum_{i_2} DCF(i_1) \cdot D(i_1) \cdot N(i_1) \cdot DCF(i_2) \cdot D(i_2) \cdot N(i_2) \cdot \langle df(i_1) \cdot df(i_2) \rangle}{\sum_{j_1} \sum_{j_2} DCF(j_1) \cdot D(j_1) \cdot N(j_1) \cdot DCF(j_2) \cdot D(j_2) \cdot N(j_2)}$
¨ Where $df(i_1) = df(t, ts(i_1), te(i_1))$ and $df(i_2) = df(t, ts(i_2), te(i_2))$
¨ In abbreviated notation
¨ $\langle df(i_1) \cdot df(i_2) \rangle = \sigma(i_1) \cdot \sigma(i_2) \cdot \rho(i_1, i_2) \cdot dt$
¨ So to calculate the instantaneous volatility of the swap rate you need the instantaneous
volatility of each forward BUT ALSO the instantaneous correlation matrix between the
forwards constituting the weighted basket.
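The double sum in the swap-rate variance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and argument layout are hypothetical, the float weights DCF(i).D(i).N(i) and the fixed-leg sum are assumed to be precomputed, and the forward volatilities are taken as absolute (normal) volatilities so that the covariance is sigma(i1).sigma(i2).rho(i1,i2).

```python
import math

def swap_rate_inst_vol(float_weights, fixed_annuity, sigmas, rho):
    """Frozen-numeraire instantaneous volatility of the swap rate (illustrative sketch).

    float_weights[i] : DCF(i) * D(i) * N(i) for float period i (assumed precomputed)
    fixed_annuity    : sum over fixed periods j of DCF(j) * D(j) * N(j)
    sigmas[i]        : instantaneous volatility of forward i
    rho[i1][i2]      : instantaneous correlation between forwards i1 and i2
    """
    n = len(float_weights)
    variance = 0.0
    for i1 in range(n):
        for i2 in range(n):
            variance += (float_weights[i1] * float_weights[i2]
                         * sigmas[i1] * sigmas[i2] * rho[i1][i2])
    # The denominator double sum over (j1, j2) is the square of the fixed-leg sum.
    return math.sqrt(variance) / fixed_annuity
```

With perfectly correlated forwards of equal volatility and equal weights, the swap rate volatility equals the forward volatility; lowering the correlation lowers it, which is why the correlation matrix matters.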
15. Luc_Faucheux_2020
A swap evolving to the first set
¨ Example above : a 5x12 swap evolving on the volatility surface up until the first set
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,5,6) f(1,6,7) f(1,7,8) f(1,8,9) f(1,9,10) f(1,10,11) f(1,11,12)
f(2,2,3) f(2,5,6) f(2,6,7) f(2,7,8) f(2,8,9) f(2,9,10) f(2,10,11) f(2,11,12)
f(3,3,4) f(3,5,6) f(3,6,7) f(3,7,8) f(3,8,9) f(3,9,10) f(3,10,11) f(3,11,12)
f(4,4,5) f(4,5,6) f(4,6,7) f(4,7,8) f(4,8,9) f(4,9,10) f(4,10,11) f(4,11,12)
f(5,5,6) f(5,6,7) f(5,7,8) f(5,8,9) f(5,9,10) f(5,10,11) f(5,11,12)
f(6,6,7)
f(7,7,8)
f(8,8,9)
f(9,9,10)
f(10,10,11)
f(11,11,12)
f(12,12,13)
17. Luc_Faucheux_2020
A swaption is a mid-curve on the basket of forwards
¨ Example above : a “5y7y” swaption, or a 5y option on a 7y swap, equating the year to the
time units
¨ Option expires at time t5, underlying is a swap starting at time t5 and ending at time t12
¨ Note that only the first forward gets to experience the “whole life” volatility, all the other
forwards essentially will experience the “mid-curve” or truncated volatility up to the
swaption expiry
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,5,6) f(1,6,7) f(1,7,8) f(1,8,9) f(1,9,10) f(1,10,11) f(1,11,12)
f(2,2,3) f(2,5,6) f(2,6,7) f(2,7,8) f(2,8,9) f(2,9,10) f(2,10,11) f(2,11,12)
f(3,3,4) f(3,5,6) f(3,6,7) f(3,7,8) f(3,8,9) f(3,9,10) f(3,10,11) f(3,11,12)
f(4,4,5) f(4,5,6) f(4,6,7) f(4,7,8) f(4,8,9) f(4,9,10) f(4,10,11) f(4,11,12)
f(5,5,6) f(5,6,7) f(5,7,8) f(5,8,9) f(5,9,10) f(5,10,11) f(5,11,12)
f(6,6,7)
f(7,7,8)
f(8,8,9)
f(9,9,10)
f(10,10,11)
f(11,11,12)
f(12,12,13)
18. Luc_Faucheux_2020
Market practice assumption
¨ Volatility is assumed to be stationary: $\sigma(t, t_1, t_2) = \hat{\sigma}(t_1 - t, t_2)$
¨ Correlation is also stationary: $\rho(i_1, i_2) = \rho(|ts(i_1) - ts(i_2)|)$
¨ Parametrization of the volatility and correlation (Rebonato, ..)
¨ $\rho(i_1, i_2) = \rho(|ts(i_1) - ts(i_2)|) = L_\rho + (1 - L_\rho) \cdot \exp\{-|ts(i_1) - ts(i_2)| / T_\rho\}$
– $L_\rho$ is the long term (asymptotic) level of correlation
– $T_\rho$ is the characteristic time scale (half-life) for the correlation to decrease from 100%
perpendicular to the diagonal
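As a small illustration of this exponential correlation parametrization, the sketch below builds the correlation matrix for a list of forward start times. The function names and the parameter names `level` and `time_scale` (standing for the long-term level and the characteristic time scale) are hypothetical choices for this example.

```python
import math

def correlation(ts1, ts2, level, time_scale):
    """Stationary correlation: depends only on the distance between start times."""
    return level + (1.0 - level) * math.exp(-abs(ts1 - ts2) / time_scale)

def correlation_matrix(start_times, level, time_scale):
    """Correlation matrix between forwards identified by their start times."""
    return [[correlation(a, b, level, time_scale) for b in start_times]
            for a in start_times]
```

The matrix has 1 on the diagonal, is symmetric, and decays toward the long-term level as the distance between start times grows.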
22. Luc_Faucheux_2020
Parametrization of the volatility
¨ Usually assume a bell shape, with a maximum and a long term asymptote, with simple
analytical expression in order to integrate easily, and have numerically stable calibration
(sensitivity of the parameters to the market input are well behaved)
¨ We choose the following
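– $\hat{\sigma}(0) = \hat{\sigma}_{min}$
– $\hat{\sigma}(\infty) = \hat{\sigma}_{\infty}$
– $\hat{\sigma}(T_{max}) = \hat{\sigma}_{max}$
– If $t < T_{max}$, we use $\hat{\sigma}^2(t) = \hat{\sigma}_{min}^2 + (\hat{\sigma}_{max}^2 - \hat{\sigma}_{min}^2) \cdot (t / T_{max})$
– If $t \geq T_{max}$, we use $\hat{\sigma}^2(t) = \hat{\sigma}_{max}^2 + (\hat{\sigma}_{\infty}^2 - \hat{\sigma}_{max}^2) \cdot [1 - \exp(-(t - T_{max}) / T_{\infty})]$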
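The piecewise variance curve above is easy to evaluate and integrate. The sketch below is a hypothetical illustration: the function names, the parameter tuple layout and the use of a midpoint rule (instead of the closed-form integral) are choices made for this example only. It computes the integrated variance of a forward starting at t1 for an option expiring at t_expiry, which covers both the regular caplet (t_expiry = t1) and the mid-curve case (t_expiry < t1).

```python
import math

def sigma_hat_sq(t, s_min, s_max, s_inf, t_max, t_decay):
    """Stationary instantaneous variance as a function of time to the forward start."""
    if t < t_max:
        # Linear in variance between the short end and the maximum.
        return s_min ** 2 + (s_max ** 2 - s_min ** 2) * (t / t_max)
    # Exponential relaxation in variance from the maximum to the long-term level.
    return s_max ** 2 + (s_inf ** 2 - s_max ** 2) * (1.0 - math.exp(-(t - t_max) / t_decay))

def integrated_variance(t1, t_expiry, params, steps=2000):
    """Midpoint-rule integral of sigma_hat^2(t1 - t) dt for t in [0, t_expiry]."""
    dt = t_expiry / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        total += sigma_hat_sq(t1 - t, *params) * dt
    return total
```

For example, with `params = (0.5, 1.0, 0.7, 2.0, 3.0)`, calling `integrated_variance(2.0, 2.0, params)` gives the caplet total variance and `integrated_variance(2.0, 1.0, params)` the mid-curve one; dividing by the expiry gives the average variance.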
32. Luc_Faucheux_2020
A caplet and two distinct swaptions on the grid
¨ A t5-t12 swaption (orange)
¨ A t7 caplet (mauve)
¨ A t9-t12 swaption (dark orange)
¨ On the typical swaption grid, the t5-t12 and the t9-t12 are different distinct points with no
overlap
¨ The caplet is also one point. Even though those 3 structures span overlapping areas of the
forward volatility surface of different sizes, they all get condensed to one point on the
swaption grid
f(0,0,1) f(0,1,2) f(0,2,3) f(0,3,4) f(0,4,5) f(0,5,6) f(0,6,7) f(0,7,8) f(0,8,9) f(0,9,10) f(0,10,11) f(0,11,12)
f(1,1,2) f(1,5,6) f(1,6,7) f(1,7,8) f(1,8,9) f(1,9,10) f(1,10,11) f(1,11,12)
f(2,2,3) f(2,5,6) f(2,6,7) f(2,7,8) f(2,8,9) f(2,9,10) f(2,10,11) f(2,11,12)
f(3,3,4) f(3,5,6) f(3,6,7) f(3,7,8) f(3,8,9) f(3,9,10) f(3,10,11) f(3,11,12)
f(4,4,5) f(4,5,6) f(4,6,7) f(4,7,8) f(4,8,9) f(4,9,10) f(4,10,11) f(4,11,12)
f(5,5,6) f(5,6,7) f(5,7,8) f(5,8,9) f(5,9,10) f(5,10,11) f(5,11,12)
f(6,6,7) f(6,7,8) f(6,9,10) f(6,10,11) f(6,11,12)
f(7,7,8) f(7,9,10) f(7,10,11) f(7,11,12)
f(8,8,9) f(8,9,10) f(8,10,11) f(8,11,12)
f(9,9,10) f(9,10,11) f(9,11,12)
f(10,10,11)
f(11,11,12)
f(12,12,13)
33. Luc_Faucheux_2020
You have priced too much, and not enough
¨ The options market is incredibly hard to calibrate
¨ Some points have lots of overlapping market info
¨ Large gaps in between, requiring either
– Interpolation on the market inputs (on the grid) with very little theoretical justification
– Interpolation on the model parameters, running the risk of being “off market”
36. Luc_Faucheux_2020
An illustrated example : Black Derman Toy
¨ Work through a tree implementation of the BDT model (short rate model)
¨ One of the most widely used models for fixed income securities
¨ Illustrate some of the relevant issues around numerical implementation
¨ Work through some of the math to illustrate some pitfalls when switching from:
– The numerical tree implementation
– To the dynamics (Stochastic Differential Equation or SDE)
– To the PDE for the PDF (Partial Differential Equation for the Probability Density Function)
– To the grid implementation
37. Luc_Faucheux_2020
The BDT model tree implementation
[Figure: recombining binomial tree over times t0 to t4, with nodes labelled (k,i) for period k = 0..4 and position i = 0..k]
38. Luc_Faucheux_2020
The BDT model tree implementation
[Figure: node X(k,i) at time tk branching to X(k+1,i+1) and X(k+1,i) at time tk+1]
¨ We index the stochastic variable 𝑋(𝑘, 𝑖), where 𝑘 is the period index and 𝑖 is the position
index
¨ We choose a time step to be 𝛿𝑡 = 𝜏
¨ At first we assume this time step to be constant
¨ The spacing for a given period is X(k,i+1) − X(k,i) = 𝛿𝑋 for every index 𝑖
¨ For a given period, all nodes in the tree are equally spaced
¨ We start with the assumption that the spacing 𝛿𝑋 is a constant
39. Luc_Faucheux_2020
Building a BDT tree I
¨ For sake of simplicity, we go back to time spacing of unit 1, and neglect any holidays and
daycount fraction details like we did in the curve module
¨ We have the forward yield curve as input $f(t_k, t_{k-1})$
¨ We can compute the spot discount factors and the forward discount factors
¨ Forward discount $d_k = d(t_k)$ which here we chose to be $d(t_k) = \frac{1}{1 + f(t_k, t_{k-1})}$
¨ The spot discount curve is such that $D(0) = 1$, $d_0 = d(0) = 1$
and $D(t_{k+1}) = D(t_k) \cdot d(t_{k+1})$
Period    0-1       1-2       2-3       3-4       4-5       5-6       6-7
Y0_fwd    1         1.5       2         2.5       3         3.25      3.5
D0_spot   0.990099  0.975467  0.95634   0.933015  0.90584   0.877327  0.847658
D0_fwd    0.990099  0.985222  0.980392  0.97561   0.970874  0.968523  0.966184
VOL_spot  -         0.2       0.3       0.3       0.4       0.5       0.7
STDEV     -         0.08      0.36      0.54      1.28      2.5       5.88
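The D0_fwd and D0_spot rows can be reproduced from the Y0_fwd row with a few lines of Python. This is a minimal sketch, assuming unit daycount fractions and forwards quoted in percent as in the table; the function names are hypothetical.

```python
def forward_discounts(fwd_pct):
    """One-period forward discount factors from forward yields quoted in percent."""
    return [1.0 / (1.0 + f / 100.0) for f in fwd_pct]

def spot_discounts(fwd_disc):
    """Spot discount factors as the running product of forward discounts, D(0) = 1."""
    spot, running = [], 1.0
    for d in fwd_disc:
        running *= d
        spot.append(running)
    return spot

y0_fwd = [1, 1.5, 2, 2.5, 3, 3.25, 3.5]
d0_fwd = forward_discounts(y0_fwd)
d0_spot = spot_discounts(d0_fwd)
```

Rounded to six decimals, `d0_fwd` and `d0_spot` should agree with the table rows.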
40. Luc_Faucheux_2020
Building a BDT tree II
¨ Forward discount $d_k = d(t_k)$ which here we chose to be $d(t_k) = \frac{1}{1 + f(t_k, t_{k-1})}$
¨ Note that this is more a choice on how we define the forward yield curve.
¨ As we kept repeating in the curve module, the only thing that matters are the discount
factors. Yields are not unique and depend on conventions, holidays, daycount fraction and
assumptions to compute them. Discount factors ARE unique
¨ We also have as input to the models the volatility curve for the variable we are using to
build the BDT tree. In our case we will use the forward.
¨ Note that you could use another variable
¨ The goal is to calibrate the model inputs to the market on observable prices
¨ So if you build the tree on a given variable, you will get a different value for the volatility
inputs after calibration but you will recover the market prices
41. Luc_Faucheux_2020
Building a BDT tree III
Period    0-1       1-2       2-3       3-4       4-5       5-6       6-7
Y0_fwd    1         1.5       2         2.5       3         3.25      3.5
D0_spot   0.990099  0.975467  0.95634   0.933015  0.90584   0.877327  0.847658
D0_fwd    0.990099  0.985222  0.980392  0.97561   0.970874  0.968523  0.966184
VOL_spot  -         0.2       0.3       0.3       0.4       0.5       0.7
STDEV     -         0.08      0.36      0.54      1.28      2.5       5.88
¨ In our case we chose the forward to be lognormally distributed, and so this volatility will be
the annualized lognormal yield volatility in % per year
¨ Again, the point is that because there are going to be so many assumptions and numerical
computations that will be different for each implementation of a tree, you cannot really
compare the model inputs between two models, just like you cannot really compare a
normal volatility to a lognormal one
¨ The only thing you can compare are the prices produced by the model
42. Luc_Faucheux_2020
Building a BDT tree IV
¨ We create a grid of forward yields 𝑋(𝑘, 𝑖)
¨ We chose for the “bottom” value $X(k, 0) = \frac{f(t_k, t_{k-1})}{(1 + V(k)/2)^k}$, where $V(k)$ is the volatility input
¨ The idea is to start with a distribution of forwards centered around the input one $f(t_k, t_{k-1})$
and with a standard deviation that will be close to the volatility input
¨ Each successive value for a given period going up the nodes in the tree is given by
¨ $X(k, i+1) = X(k, i) \cdot [1 + V(k)]^k$
Y_fwd 1 1.498501 1.994013 2.488784 2.97612 3.209678 3.427392
1.501499 2.005996 2.51125 3.024024 3.290726 3.573885
2.01805 2.533919 3.072699 3.373821 3.72664
2.556793 3.122158 3.459014 3.885924
3.172413 3.546359 4.052015
3.635909 4.225206
4.405799
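The Y_fwd grid above can be rebuilt with the sketch below. It rests on assumptions inferred from the tables rather than stated on the slide: column k of the grid uses the k-th entries of Y0_fwd and VOL_spot, both quoted in percent, and the missing volatility for the first column is set to 0 (it has no effect since the exponent is 0). The function name is hypothetical.

```python
def build_forward_grid(fwd_pct, vol_pct):
    """Initial (uncalibrated) grid of forwards X(k, i), one list per period k."""
    grid = []
    for k, (f, v_pct) in enumerate(zip(fwd_pct, vol_pct)):
        v = v_pct / 100.0                      # volatility input as a decimal
        bottom = f / (1.0 + v / 2.0) ** k      # X(k, 0)
        # X(k, i+1) = X(k, i) * (1 + V(k))^k, so X(k, i) = X(k, 0) * (1 + V(k))^(k*i)
        grid.append([bottom * (1.0 + v) ** (k * i) for i in range(k + 1)])
    return grid

y0_fwd = [1, 1.5, 2, 2.5, 3, 3.25, 3.5]
vol_spot = [0.0, 0.2, 0.3, 0.3, 0.4, 0.5, 0.7]   # first entry is an assumption
y_fwd = build_forward_grid(y0_fwd, vol_spot)
```

Each `y_fwd[k]` lists the nodes of period k from bottom to top, which corresponds to reading one column of the table from top to bottom.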
43. Luc_Faucheux_2020
Building a BDT tree V
¨ Note that the goal is to start with a ”reasonable” state for the tree
¨ The calibration process will automatically adjust to the solution we need
¨ Note that the BDT model is somewhat different from the other models out there
¨ BDT does NOT start with an equation; BDT essentially starts from a numerical
implementation and just “lets the calibration take care of things”
¨ This makes the underlying dynamics a somewhat complicated beast to express in the usual
SDE and PDE
¨ We will show in a simple case how to do this, if only to point out the pitfalls and the gaps
between the SDE world and the discrete numerical implementation world
44. Luc_Faucheux_2020
Building a BDT tree VI
¨ This pragmatic approach (which, again, does not sound smart at first because you are just using
Excel and not writing complicated equations) turns out to be the smart one in the end.
¨ This is also what makes BDT (and the tweaks like BK,..) one of the most commonly used
models in Finance: it is intuitive, it is robust, it is fast
¨ Note that BDT also has drawbacks: the calibration process can at times be unstable, and
because of its discrete nature, it is also numerically “noisy”. When computing the Greeks
using a BDT, depending on the payoff of the option that you are pricing, you will get “jumps”
in value when you are crossing a node in your valuation
¨ Note that the number of nodes in the BDT tree increases linearly with the number of time
steps for a given period, and thus the total number of nodes scales as the square of the
number of time steps, which makes reducing the noise by increasing the number of nodes and
time steps a computationally expensive proposition
45. Luc_Faucheux_2020
Building a BDT tree VII
¨ From each forward 𝑋(𝑘, 𝑖) at each node we compute the Discount factor
¨ d(k,i) = 1/(1 + X(k,i))
¨ We now have built a tree where at each node we have the forward and the discount factor
D_fwd 0.990099 0.985236 0.98045 0.975717 0.971099 0.968901 0.966862
0.985207 0.980335 0.975503 0.970647 0.968141 0.965494
0.980219 0.975287 0.970189 0.967363 0.964072
0.975069 0.969724 0.966566 0.962594
0.969251 0.965751 0.961058
0.964917 0.959461
0.957801
46. Luc_Faucheux_2020
Building a BDT tree VIII
¨ We also attach to each node the usual binomial probability in order to recover the Gaussian
distribution
¨ The probability at each node is the probability to end up at that given node
¨ We assume here for simplicity equal probability for “up” and “down” to be 0.5
¨ Note that there are tons of numerical schemes out there to change the probabilities and
thus affect the drift inside the tree, but this does not change the fundamentals of the model
¨ Note also that now that we have a tree, Monte Carlo simulations are sometimes run “on the
tree” in order to price path-dependent structures
PROBA 1 0.5 0.25 0.125 0.0625 0.03125 0.015625
0.5 0.5 0.375 0.25 0.15625 0.09375
0.25 0.375 0.375 0.3125 0.234375
0.125 0.25 0.3125 0.3125
0.0625 0.15625 0.234375
0.03125 0.09375
0.015625
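With equal up and down probabilities of 0.5, the probability of ending up at node (k,i) is the binomial weight C(k,i)/2^k. A minimal sketch reproducing the PROBA table (the function name is hypothetical):

```python
from math import comb

def node_probabilities(n_periods):
    """Probability of reaching node (k, i) with up/down probability 0.5 each."""
    return [[comb(k, i) / 2 ** k for i in range(k + 1)] for k in range(n_periods)]

proba = node_probabilities(7)
```

Each `proba[k]` sums to 1 and corresponds to one column of the table.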
48. Luc_Faucheux_2020
Calibrating a BDT tree I
¨ We now have built this tree, but we need to calibrate it to market
¨ This is usually a multi step process usually performed in a sequential manner (but not
always)
¨ The first step is usually ensuring that we recover the discount curve, this will ensure that we
will be pricing bonds and swaps and zero coupon bonds to market
¨ This is where the numerical implementation of a solver for example, is crucial, in particular
for the stability of the solution
49. Luc_Faucheux_2020
Calibrating a BDT tree II
¨ We calculate the expectation for the discount (remember that it should be probability
weighted)
¨ This is obviously not equal to our input discount curve
¨ We need to “shift” the forwards in order to solve for the discount
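One simple reading of this step can be sketched as follows. The assumptions are loud ones: the shift is taken to be additive and common to all nodes of a slice, the target is the input forward discount for that slice, node probabilities are the equal-probability binomial weights, and the solver is a plain bisection. The actual solver and shift convention used in the course spreadsheet are not specified here, and the function names are hypothetical.

```python
from math import comb

def expected_discount(slice_fwd_pct, shift_pct=0.0):
    """Probability-weighted one-period discount on a slice of forwards (in percent)."""
    k = len(slice_fwd_pct) - 1
    return sum(comb(k, i) / 2 ** k / (1.0 + (x + shift_pct) / 100.0)
               for i, x in enumerate(slice_fwd_pct))

def solve_shift(slice_fwd_pct, target_discount, lo=-0.9, hi=0.9, tol=1e-12):
    """Bisection on the additive shift (in percent) so the slice recovers the target."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The expected discount decreases when the forwards are shifted up.
        if expected_discount(slice_fwd_pct, mid) > target_discount:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For the third slice of the initial grid, `solve_shift([1.994013, 2.005996, 2.01805], 1 / 1.02)` returns a small negative shift: the unshifted slice discounts slightly too much on average. Because slices are solved one after the other, any residual error at one level feeds into the next.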
51. Luc_Faucheux_2020
Calibrating a BDT tree IV
¨ This is where we get into the wonderful world of numerical solvers
¨ Note that we are “bootstrapping” the factors along the tree
¨ So a little error at one level needs to be corrected at the next level, potentially creating
instabilities
55. Luc_Faucheux_2020
Pricing a cap
¨ From each forward X(k,i) at each node we compute the discount factor
¨ d(k,i) = 1/(1 + X(k,i)), (assuming the usual convention of unity for daycount fraction)
¨ At each node the payoff is Cap(k,i) = MAX(X(k,i) − K, 0)
¨ The discounting back is given by:
¨ $Cap(k, i) = \frac{1}{2} \cdot d(k, i) \cdot [Cap(k+1, i+1) + Cap(k+1, i)] + MAX(X(k, i) - K, 0) \cdot d$
¨ If the cap is paid at the end of the period, d = d(k,i), otherwise d = 1
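The backward induction described above can be sketched as follows. The assumptions: the grid holds forwards in percent with k+1 nodes in period k, daycount fractions are 1, up and down probabilities are 0.5, and the result is expressed in percent of notional. The function name and argument names are hypothetical.

```python
def price_cap(grid_pct, strike_pct, paid_at_end=True):
    """Cap value at the root by backward induction on a recombining tree."""
    # Nothing is left to receive beyond the last slice.
    values = [0.0] * (len(grid_pct[-1]) + 1)
    for k in range(len(grid_pct) - 1, -1, -1):
        new_values = []
        for i, x in enumerate(grid_pct[k]):
            d = 1.0 / (1.0 + x / 100.0)           # node discount d(k, i)
            d_pay = d if paid_at_end else 1.0      # discount the payoff only if paid at period end
            continuation = 0.5 * d * (values[i + 1] + values[i])
            new_values.append(continuation + max(x - strike_pct, 0.0) * d_pay)
        values = new_values
    return values[0]
```

For a one-node tree, `price_cap([[1.0]], 0.5)` is 0.5 discounted over one period at 1%, and `price_cap([[1.0]], 0.5, paid_at_end=False)` is 0.5.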
[Figure: node X(k,i) at time tk branching to X(k+1,i+1) and X(k+1,i) at time tk+1]
56. Luc_Faucheux_2020
Pricing a swap in BDT
¨ In the ”swap” world you do not need to build a tree, you only need a discount curve and use
simple formulas like SUMPRODUCT
57. Luc_Faucheux_2020
Pricing a swap in BDT - II
¨ We saw this when looking at the convexity
¨ This is because the right way (the right measure) to look at cashflows is the discount curve.
¨ A swap fixed leg is the sum of fixed payment times discount factors
¨ Same for the Float leg (a little more complicated)
¨ But in fact a swap PV is a linear sum of discount factors
¨ As such it exhibits ZERO convexity against the discount curve
¨ And so to price a swap you do not need a volatility curve, nor an option model, but if you
are pricing a derivative on a swap (like a swaption), you had better make sure that your option
model does recover the same price for the underlier
¨ This sounds obvious, but you would be surprised how many options models out there are
not repricing the underlier correctly
¨ Let’s look back at the “yield curve” (zero vol) swap valuation
58. Luc_Faucheux_2020
A swap is a weighted basket of forwards: AT-THE-MONEY
¨ Consider a swap with swap rate R (at-the-money swap rate)
– Nfloat periods on the Float side with forecasted forward f(i)
– indexed by i, with
– daycount fraction DCF(i),
– discount D(i)
– Notional N(i)
– Nfixed periods on the Fixed side,
– indexed by j, with
– daycount fraction DCF(j),
– discount D(j)
– Notional N(j)
$PV(FLOAT) = \sum_i DCF(i) \cdot D(i) \cdot N(i) \cdot f(i) = \sum_j DCF(j) \cdot D(j) \cdot N(j) \cdot R = PV(FIXED)$
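The at-the-money condition PV(FLOAT) = PV(FIXED) gives the swap rate as a ratio of two SUMPRODUCT-style sums. A minimal sketch, with a hypothetical function name and argument layout, and with forwards expressed as decimals:

```python
def par_swap_rate(fwd, disc_float, dcf_float, notional_float,
                  disc_fixed, dcf_fixed, notional_fixed):
    """Swap rate R such that the float and fixed leg PVs are equal."""
    pv_float = sum(t * d * n * f for t, d, n, f
                   in zip(dcf_float, disc_float, notional_float, fwd))
    annuity = sum(t * d * n for t, d, n
                  in zip(dcf_fixed, disc_fixed, notional_fixed))
    return pv_float / annuity
```

When both legs share the same periods with unit daycount fractions and notionals, and the forwards are those implied by the discount curve, the float leg telescopes to 1 − D(N), so the rate reduces to (1 − D(N)) divided by the sum of the discount factors.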
59. Luc_Faucheux_2020
Standard Swap periods
¨ On the fixed side, coupon payment at the end of the period
– Period start date (psj)
– Adjusted period start date (psj_adj)
– Period end date (pej)
– Adjusted period end date (pej_adj)
– Payment date (pmj)
– PV of a period PV(j) = DCF(j).D(j).N(j).R = DCF(psj_adj, pej_adj).D(pmj).N(j).R
¨ On the float side, floating rate sets at the beginning of the period, and pays at the end (Libor in advance or
standard Libor swap, as opposed to Libor in arrears)
– PV of a period (swaplet) PV(i) = DCF(i).D(i).N(i).f(i)
– DCF(i) = DCF(psi_adj, pei_adj) and D(i) = D(pmi)
– $D(pei) = D(psi) \cdot \frac{1}{1 + DCF(psi, pei) \cdot f(i)}$
or
– $DCF(psi, pei) \cdot f(i) = \frac{D(psi)}{D(pei)} - 1$
60. Luc_Faucheux_2020
Zero coupon bonds on today’s curve with zero volatility
¨ Zero-coupon bonds P(t1,t2) = D(t1,t2) = D(t2) / D(t1)
¨ “P(t1,t2) is the price at time t1 of a zero-coupon bond maturing at time t2”
¨ “P(t1,t2) is the price at time t1 of a risk-free zero-coupon bond with principal $1 maturing at time t2”
¨ IT SHOULD REALLY SAY : “Using today’s discount curve at time t0, P(t1,t2) is the price of a risk-free zero-
coupon bond with principal $1 maturing at time t2, and the value of that price has been forward
discounted to time t1, again using today’s discount curve”
¨ People love the zero coupon bonds, in many cases they make those the stochastic drivers of the rates
model (HJM for example)
61. Luc_Faucheux_2020
Expected Values in a non deterministic world
¨ Simply compounded spot interest rate: L(t1,t2)
¨ $L(t_1, t_2) = \frac{1 - P(t_1, t_2)}{DCF(t_1, t_2) \cdot P(t_1, t_2)}$ or more simply $P(t_1, t_2) = \frac{1}{1 + DCF(t_1, t_2) \cdot L(t_1, t_2)}$
¨ Related to how to roll the curve forward at zero volatility,
¨ Method 2
– Compute the discount factors curve
– Divide all discount factors by the overnight d(t0,t1)=d(t1)
– Use new discount factor curve starting at t1
¨ So at zero volatility, when t goes from t0 to t1, the price of a zero-coupon bond P(t1,t2) is unchanged:
P(tc,t1,t2) = P(0,t1,t2) where tc is the “curve” time in the future.
¨ NOW, if the volatility is non zero, P(1,t1,t2) ≠ P(0,t1,t2)
¨ It is only true ON AVERAGE: <P(1,t1,t2)> = P(0,t1,t2) or EXP{P(1,t1,t2)} = P(0,t1,t2) where EXP is
the Expected value (average).
¨ This is called the rolling numeraire or “bank account” numeraire: if you deposit P(0,0,t2) today to get $1
at time t2, ON AVERAGE you should also be able to invest P(0,0,t2) until time t1, then deposit it until
time t2 and still get $1
62. Luc_Faucheux_2020
Fixed Leg of a swap
¨ A fixed leg of a swap is a series of fixed cash flows.
¨ No matter how the curve moves, ON AVERAGE the price of zero coupon bonds is conserved
¨ <P(tc,t1,t2)> = P(0,t1,t2) and P(tc,t1,t2) = D(tc,t2) / D(tc,t1)
¨ In particular when t2=t1+1, P(tc,t1,t1+1) = D(tc,t1+1) / D(tc,t1) = d(tc,t1)
¨ So <P(tc,t1,t1+1)> = <D(tc,t1+1) / D(tc,t1)> = <d(tc,t1)> = P(0,t1,t1+1) = d(0,t1)
¨ Also by recurrence <D(1,t1)> * d(0,1) = D(0,t1) at time t=0
¨ At time t=1, d(1,2) is fixed and has zero volatility (will drop when t goes from 1 to 2)
¨ So <D(2,t1)> * d(1,2) = D(1,t1) at time t=1
¨ So at every point in the future, if you invest then a unit of currency and ”roll” it forward (bank
numeraire), the expected gain is today’s gain if you had entered into the same contract.
¨ Still another way to say, if you invest one unit of currency for a given length of time t, it is equivalent to
investing overnight and rolling the proceeds everyday (the arbitrage free framework does not take credit
into consideration)
63. Luc_Faucheux_2020
Floating leg of a swap
¨ A floating swaplet pays DCF(i).N(i).f(i) and its PV is PV(i) = DCF(i).D(i).N(i).f(i)
¨ Where $DCF(psi, pei) \cdot f(i) = \frac{D(psi)}{D(pei)} - 1$
¨ We know from the fixed rate leg that <D(i)> = D(i), but what about <D(i).f(i)> ?
¨ Note, to be exact <D(i)> = D(i) should really read as the product ∏ EXP{tc, d(tc,tc+1)} over the successive curve times tc, where
EXP{tc, d(tc,tc+1)} is the expected value of the overnight discount between the time (tc) and (tc+1),
observed up until time tc (because it drops off the curve after tc, and before tc, no matter where you
observe it, its expected value is equal to today’s value)
¨ EXP{tc, d(tc,tc+1)} = EXP{t<tc, d(tc,tc+1)} = d(tc+1)
¨ Back to <D(i).f(i)>, there is a little trick
64. Luc_Faucheux_2020
Floating leg of a swap
¨ Because the forward f(i) sets at the beginning of the period, once we reach the period start, everything
is known about the payment, and it becomes a fixed cashflow.
¨ $PV(i) = DCF(i) \cdot D(i) \cdot N(i) \cdot f(i) = DCF(i) \cdot N(i) \cdot EXP\{psi, D(psi)\} \cdot EXP\{psi, D(psi, pei) \cdot f(i)\}$
¨ $EXP\{psi, D(psi)\} = D(psi)$
¨ $DCF(i) \cdot EXP\{psi, D(psi, pei) \cdot f(i)\} = EXP\{psi, \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)\}$
¨ Now, magic trick, $\frac{x}{1+x} = \frac{x+1-1}{1+x} = \frac{1+x-1}{1+x} = 1 - \frac{1}{1+x}$
¨ So, $EXP\{psi, \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)\} = EXP\{psi, 1 - \frac{1}{1 + DCF(i) \cdot f(i)}\} = EXP\{psi, 1 - P(psi, psi, pei)\}$
¨ And because the price of zero coupon bond is respected:
¨ $EXP\{psi, \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)\} = 1 - P(0, psi, pei) = 1 - \frac{1}{1 + DCF(i) \cdot f(i)} = \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)$
65. Luc_Faucheux_2020
Quick summary
¨ In the rolling numeraire measure,
– PV of fixed cashflows are conserved (Expected value of a fixed cashflow as the curve evolves in a stochastic manner
over time will converge to the fixed amount at the payment date)
– Prices of zero coupon bonds are conserved
– Prices of bonds are conserved (the price of a bond will change over time, but ON AVERAGE the price you should be
willing to pay for this bond is the price you can compute today using today’s curve, because the price of a bond
exhibits no convexity with respect to the discount curve changing, AS LONG as the discount curve changes in a manner
that respects the arbitrage-free condition, that is that a contract where you invest X today to get Y at time T is the same
(equivalent, on average) as any contract where you invest X today, get the proceeds at some point in time in the
future, then reinvest them until T.)
– This is either painfully obvious or really deep, depending on how you look at it
– The arbitrage free assumption does NOT know about credit
– The arbitrage free assumption does NOT know about individual utility function (also called time-indifferent, it assumes
that market participants are indifferent about receiving X today versus Y at time T, where the ratio Y/X is the price of
today zero coupon bond maturing at time T, and that price will be conserved over time)
– The Floating leg of a swap ALSO exhibits zero convexity against the discount factor curve, because it can be expressed as
a linear function of discount factors, thanks to the amazing trick x=x+1-1
– It is surprising: of ALL the possible swap structures we could come up with, there is almost zero probability of picking one
that does not need a volatility curve, and yet 99.9% of the swaps being traded are such that they are NOT convex in
terms of the discount factors. Note that this breaks down once you have stochastic funding (forecasting and
discounting on different curves)
¨ Other markets (equity, commodities,..) do NOT have such a strong underlying constraint that needs to be
respected. In HJM for example, we will show that respecting the arbitrage-free condition leaves no freedom in the choice
of the drift: once the volatility is known, everything else is.
67. Luc_Faucheux_2020
Pricing a swap in BDT - IV
¨ NOTE : swap payments are "in arrears" at the end of the period, and so you have to
discount for that period
¨ The index is set "in advance" and paid "in arrears"
¨ If you were dealing with a swap where the index is set at the same time as the payment,
the value would be different
¨ In fact it would exhibit convexity ("in arrears" swap)
¨ You would need a BDT tree or an option model in order to value it
¨ But numerically it is quite easy to do: you just do not discount for that period
69. Luc_Faucheux_2020
Pricing a swap in BDT - VI
¨ We recover the same swap rate
¨ Please note that we are really discounting cash flows, never a rate or a yield
¨ Note that in practice to ”take into account” all the numerical imprecisions and the many
steps, the solver is sometimes done on the Swap Rate (Market observable)
¨ The Swap Rate is really only the value of the Fixed coupon so that the discounted value of all
fixed payments is equal to the discounted value of all the floating payments
¨ Note that if we had priced the swap “in arrears” we would get a different swap rate than the
one computed from the yield curve (even taking into account the timing difference)
¨ The difference between those two rates is the convexity adjustment
¨ In the SUMPRODUCT formula (deterministic world, or zero volatility), we just need to offset
the discounts by one period. Float fixes at the beginning of the period and pays at the
beginning, fixed also pays at the beginning of the period. This timing difference between the
fixing of the float and the payment is what creates the convexity.
¨ Because we cannot do the little trick: $\frac{x}{1+x} = \frac{x+1-1}{1+x} = \frac{1+x-1}{1+x} = 1 - \frac{1}{1+x}$
70. Luc_Faucheux_2020
Pricing a swap in BDT - VI-b
¨ Another way is also to parametrize the shifts in order to allow for some tolerance, and
globally minimize for the Market observables (Swap rates), and be careful over the overlap
of swap durations
¨ This is sometimes done in practice in order to “correct” for a number of numerical
imprecisions along the way (missing holidays, imperfect daycount fraction, small residuals in
the minimization steps,..)
71. Luc_Faucheux_2020
Legs of a swap in arrears
¨ A fixed leg of a swap is a series of fixed cash flows that pays
¨ DCF(i).N(i).SR
¨ And its PV is:
¨ DCF(i).N(i).SR.D(i−1)
¨ A floating swaplet “in arrears” pays:
¨ DCF(i).N(i).f(i)
¨ and its PV is:
¨ PV(i) = DCF(i).D(i−1).N(i).f(i)
¨ Where $DCF(psi, pei) \cdot f(i) = \frac{D(psi)}{D(pei)} - 1$
72. Luc_Faucheux_2020
Floating leg of a regular swap
¨ Because the forward f(i) sets at the beginning of the period, once we reach the period start, everything
is known about the payment, and it becomes a fixed cashflow.
¨ $PV(i) = DCF(i) \cdot D(i) \cdot N(i) \cdot f(i) = DCF(i) \cdot N(i) \cdot EXP\{psi, D(psi)\} \cdot EXP\{psi, D(psi, pei) \cdot f(i)\}$
¨ $EXP\{psi, D(psi)\} = D(psi)$
¨ $DCF(i) \cdot EXP\{psi, D(psi, pei) \cdot f(i)\} = EXP\{psi, \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)\}$
¨ Now, magic trick, $\frac{x}{1+x} = \frac{x+1-1}{1+x} = \frac{1+x-1}{1+x} = 1 - \frac{1}{1+x}$
¨ So, $EXP\{psi, \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)\} = EXP\{psi, 1 - \frac{1}{1 + DCF(i) \cdot f(i)}\} = EXP\{psi, 1 - P(psi, psi, pei)\}$
¨ And because the price of zero coupon bond is respected:
¨ $EXP\{psi, \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)\} = 1 - P(0, psi, pei) = 1 - \frac{1}{1 + DCF(i) \cdot f(i)} = \frac{DCF(i)}{1 + DCF(i) \cdot f(i)} \cdot f(i)$
72
73. Luc_Faucheux_2020
Floating leg of an “in arrears” swap
¨ Because the forward f(i) sets at the beginning of the period, once we reach the period start, everything
is known about the payment, and it becomes a fixed cashflow.
¨ 𝑃𝑉 𝑖 = 𝐷𝐶𝐹 𝑖 . 𝐷 𝑖 − 1 . 𝑁 𝑖 . 𝑓 𝑖 = 𝐷𝐶𝐹 𝑖 . 𝑁 𝑖 . 𝐸𝑋𝑃 𝑝𝑠𝑖, 𝐷 𝑝𝑠𝑖 . 𝐸𝑋𝑃{𝑝𝑠𝑖, 1. 𝑓 𝑖 }
¨ 𝐸𝑋𝑃 𝑝𝑠𝑖, 𝐷 𝑝𝑠𝑖 = 𝐷(𝑝𝑠𝑖)
¨ 𝐷𝐶𝐹 𝑖 . 𝐸𝑋𝑃 𝑝𝑠𝑖, 1. 𝑓 𝑖 = 𝐸𝑋𝑃{𝑝𝑠𝑖, 𝐷𝐶𝐹 𝑖 . 𝑓 𝑖 }
¨ Where $\mathrm{DCF}(pe_i, ps_i)\, f(i) = \left[1 - \frac{D(pe_i)}{D(ps_i)}\right]$
¨ Now, magic trick: $\frac{x}{1+x} = \frac{1+x-1}{1+x} = 1 - \frac{1}{1+x}$, but that gets us nowhere
¨ So, $\mathrm{EXP}\{ps_i, \mathrm{DCF}(i)\, f(i)\} = \mathrm{EXP}\{1 - \frac{D(pe_i)}{D(ps_i)}\}$
¨ This has not only a timing difference but a ratio, and we all know that $\mathrm{EXP}(\frac{1}{x}) \neq \frac{1}{\mathrm{EXP}(x)}$
¨ The resulting correction can be estimated through a Taylor expansion (or argued from Jensen's inequality)
¨ This is the famous Swap in arrears convexity trade of May 1995 (story time)
73
75. Luc_Faucheux_2020
Pricing a swaption - II
¨ Careful that the only thing you should discount are cashflows, not rates
¨ You could look at the rates but then at each node you would need to weigh by the actual
duration of that swap on that node to express it back into a cash flow
¨ Note that the Black’77 model assumes a single duration for all nodes, and so putting in the
same input volatility in your BDT model and the Black’77 model will NOT recover the same
option price
¨ The average of a product is not the product of the averages (or integrals)
¨ If you want to calibrate BDT to the implied volatilities from Black77 (or match the option
prices that you see in the market), you would need to input a different volatility.
¨ Just because it is called the same thing (volatility) and has the same units does not mean
that you can equate the two
75
76. Luc_Faucheux_2020
Pricing a callable
¨ We can now even price a callable swap
¨ This is the main reason to build a tree, as opposed to closed form models
¨ Note also that once you have the tree calibrated, it is very flexible in terms of any payoff or
structure that you want to price
¨ A closed form solution would be hard to modify
¨ In practice, the most important step is the calibration step. Think about the yield curve
moving in real time every second, the volatility curve moving maybe on a time scale of
minutes, and so you need to build a calibration routine that can be done quickly and still be
stable
¨ Sometimes tree models are calibrated carefully overnight and kept constant during the day,
and trading desks adjust the price based on an estimate of the Greeks and the market
moves
76
78. Luc_Faucheux_2020
Pricing a callable - III
¨ Specific callable swap in the example is a 5y3ync1
¨ Swap starts in 5 years
¨ Swap ends in 8 years
¨ Call option at par (no fees either way) at year 6 and year 7
78
79. Luc_Faucheux_2020
Pricing a callable swap -IV
¨ From each forward 𝑋(𝑘, 𝑖) at each node we compute the discount factor
¨ 𝑑 𝑘, 𝑖 = 1/(1 + 𝑋 𝑘, 𝑖 ), (assuming the usual convention of unity for daycount fraction)
¨ At each node the payoff of the regular swap is SWAP 𝑘, 𝑖 = 𝑋 𝑘, 𝑖 − 𝐾 . 𝑑(𝑘, 𝑖)
¨ The discounting back is given by:
¨ $\mathrm{SWAP}(k,i) = \frac{1}{2}\, d(k,i)\, \{\mathrm{SWAP}(k+1,i+1) + \mathrm{SWAP}(k+1,i)\} + \{X(k,i) - K\}\, d(k,i)$
79
[Figure: node X(k,i) at time tk branching to X(k+1,i+1) and X(k+1,i) at time tk+1]
80. Luc_Faucheux_2020
Pricing a callable swap -V
¨ At each node the payoff of the regular swap is SWAP 𝑘, 𝑖 = 𝑋 𝑘, 𝑖 − 𝐾 . 𝑑(𝑘, 𝑖)
¨ The discounting back is given by:
¨ $\mathrm{SWAP}(k,i) = \frac{1}{2}\, d(k,i)\, \{\mathrm{SWAP}(k+1,i+1) + \mathrm{SWAP}(k+1,i)\} + \{X(k,i) - K\}\, d(k,i)$
¨ IF on an option node, the new payoff then becomes 𝑀𝐴𝑋(SWAP 𝑘, 𝑖 , 0)
¨ You can easily add a fee to cancel (>0 or <0), for example if there is a fee to cancel expressed
as paid over the cancel period, the payoff becomes: 𝑀𝐴𝑋(SWAP 𝑘, 𝑖 , −fee)
80
[Figure: node X(k,i) at time tk branching to X(k+1,i+1) and X(k+1,i) at time tk+1]
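The backward induction with the option check can be sketched in a few lines of Python. This is a minimal illustration, not the Excel implementation used in the deck: the function name `price_swap_on_tree`, the arguments `call_steps` and `fee`, and the toy lattice are all hypothetical, the rates X(k,i) are assumed to be already calibrated, and the daycount fraction is taken to be 1.

```python
def price_swap_on_tree(X, K, call_steps=frozenset(), fee=0.0):
    """Discount a payer swap back through a recombining tree.

    X[k][i] is the one-period rate at time slice k, position i (assumed
    already calibrated). On slices listed in call_steps the holder may
    cancel, so the node value is floored at -fee.
    """
    values = [0.0] * (len(X[-1]) + 1)       # nothing is owed after the last slice
    for k in range(len(X) - 1, -1, -1):
        new_values = []
        for i, x in enumerate(X[k]):
            d = 1.0 / (1.0 + x)             # d(k,i) with unit daycount fraction
            v = 0.5 * d * (values[i + 1] + values[i]) + (x - K) * d
            if k in call_steps:
                v = max(v, -fee)            # option node: MAX(SWAP, -fee)
            new_values.append(v)
        values = new_values
    return values[0]


# Hypothetical, uncalibrated 4-slice lattice of rates
toy_tree = [[0.03 + 0.005 * (i - k / 2.0) for i in range(k + 1)] for k in range(4)]
plain = price_swap_on_tree(toy_tree, K=0.03)
callable_ = price_swap_on_tree(toy_tree, K=0.03, call_steps={2, 3})
print(plain, callable_)
```

Flooring the value at the option nodes can only raise the price, so the cancellable swap is worth at least as much as the plain one to the party holding the option.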
82. Luc_Faucheux_2020
Pricing a callable swap -VII
¨ Because of the optionality, the payer of a fixed rate callable swap is willing to pay an above
market rate (higher than the swap rate) to pay for the option
¨ We can use again a numerical solver (GoalSeek in Excel)
¨ Note: a great numerical solver is hard to find, and is used all the time in Finance
¨ Solve for 0 PV on the regular swap -> SR = 3.414 in our example
¨ Solve for 0 PV on the fixed callable ->SR = 3.448 in our example
¨ The callable swap offers a pick-up of roughly 3.5 basis points running above the Swap rate
¨ The more optionality, the higher the pick-up
¨ The higher the volatility, the more valuable the option, the higher the pick-up
¨ If you calibrate your model to European swaptions, do you recover the price of callables
observed in the market ? The answer is no (for many reasons, see the Structured ppt), but
people try to adjust for that by introducing mean reversion in the BDT tree. Let’s see what
that means and how that works.
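The "solve for 0 PV" step can be mimicked with a plain bisection instead of GoalSeek. The sketch below is only illustrative: `swap_pv`, `solve_par_rate` and the toy lattice are hypothetical names and inputs, the tree is not calibrated to any market, and the numbers it prints are not the 3.414 / 3.448 of the example above.

```python
def swap_pv(X, K, call_steps=frozenset()):
    """PV of a payer swap on a recombining tree, cancellable at par on call_steps."""
    values = [0.0] * (len(X[-1]) + 1)
    for k in range(len(X) - 1, -1, -1):
        new_values = []
        for i, x in enumerate(X[k]):
            d = 1.0 / (1.0 + x)                       # unit daycount fraction
            v = 0.5 * d * (values[i + 1] + values[i]) + (x - K) * d
            new_values.append(max(v, 0.0) if k in call_steps else v)
        values = new_values
    return values[0]


def solve_par_rate(pv, lo=0.0, hi=1.0, n_iter=60):
    """Bisection on a PV that decreases with the fixed rate."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if pv(mid) > 0.0:
            lo = mid          # PV still positive: the fixed rate can go higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical, uncalibrated lattice of one-period rates
tree = [[0.03 + 0.01 * (i - k / 2.0) for i in range(k + 1)] for k in range(4)]
par_plain = solve_par_rate(lambda K: swap_pv(tree, K))
par_callable = solve_par_rate(lambda K: swap_pv(tree, K, call_steps={2, 3}))
print(par_plain, par_callable)
```

On this toy tree the fixed rate that zeroes the callable is above the plain par rate, which is the pick-up described above.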
82
83. Luc_Faucheux_2020
Pricing a callable swap -VIII
¨ Most of the time in practice, trading desks will keep an adjustment grid between the
European and the Callable and adjust the volatility curve by that adjustment to price a
callable.
¨ This is quite unsatisfactory for a lot of reasons
¨ It is not consistent
¨ Where do you hedge the callable? On the shifted grid or the original one?
¨ The adjustment creates a PL difference, do you hold that as a reserve, do you release it?
¨ The PL difference will show some market directionality. Do you hedge that market
directionality?
¨ If you hedge a Callable with European options, do you price and hedge those Europeans on
the shifted grid or the original one?
¨ Different callables at different coupon will have different adjustments
83
84. Luc_Faucheux_2020
Pricing a callable swap -IX
¨ We have now all the building blocks of a full-fledged Fixed-Income front-to-back
infrastructure
¨ To do the risk management, bump the yield curve and the vol curve and reprice all the
trades AFTER recalibrating the BDT tree
¨ You can price pretty much any payoff that you want that is a function of the variables in the
tree
¨ You can also do a “Monte-Carlo on the tree” for path-dependent options, you are just going
forward in the tree, as opposed to discounting backwards
84
88. Luc_Faucheux_2020
The BDT dynamics I
¨ The following question comes up:
¨ You have built something that reprices market observables
¨ Congratulations but what dynamics did you use ?
¨ When we use Black-Scholes, we can write the SDE for the underlier, and the PDE
¨ Can we do the same thing for BDT ?
¨ Answer is sort of yes unless you do numerical things to the BDT tree that are too drastic
¨ We are now giving some insights on this dynamics, if only to point out the difficulties in
bridging the gap between the equations and the discrete numerical implementation of
those
88
89. Luc_Faucheux_2020
The BDT dynamics II
89
[Figure: recombining binomial lattice over time slices t0 to t4, with nodes labelled (k,i) from (0,0) up to (4,0), (4,1), (4,2), (4,3), (4,4)]
91. Luc_Faucheux_2020
The BDT dynamics IV
91
[Figure: node X(k,i) at time tk branching to X(k+1,i+1) and X(k+1,i) at time tk+1]
¨ We index the stochastic variable 𝑋(𝑘, 𝑖), where 𝑘 is the period index and 𝑖 is the position
index
¨ We choose a time step to be 𝛿𝑡 = 𝜏
¨ At first we assume this time step to be constant
¨ The spacing for a given period is 𝑋 𝑘, 𝑖 + 1 − 𝑋 𝑘, 𝑖 = 𝛿𝑋 for all index 𝑖
¨ For a given period, all nodes in the tree are equally spaced
¨ We start with the assumption that the spacing 𝛿𝑋 is a constant
92. Luc_Faucheux_2020
The BDT dynamics V
¨ To the first order with no drift
¨ 𝑋 𝑘 + 1, 𝑖 = 𝑋 𝑘, 𝑖 − 𝛿𝑋/2
¨ 𝑋 𝑘 + 1, 𝑖 + 1 = 𝑋 𝑘, 𝑖 + 𝛿𝑋/2
¨ Probabilities Π 𝑘, 𝑖 to be at the node 𝑘, 𝑖
¨ Using (1/2) probabilities up and down
¨ $\Pi(k+1,i) = \frac{1}{2}\{\Pi(k,i-1) + \Pi(k,i)\}$
¨ This is also referred to as the master equation.
¨ Easier to work in the $\Pi(t,X)$ space to avoid the drift of the $(i)$ indices (zero drift means that
the middle for $X$ would be the starting point, but that is roughly the node $(k, i = \frac{k}{2})$)
¨ ALWAYS pay attention to the boundary conditions (borders), so for example
¨ Π 𝑘, 𝑖 = 0 IF 𝑖 > 𝑘 OR (𝑖 < 0)
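A minimal sketch of the master equation with its boundary conditions, assuming a single root node with probability 1. The function name `propagate` is hypothetical; the check against the binomial coefficients uses the standard library only.

```python
from math import comb


def propagate(n_steps):
    """Apply Pi(k+1,i) = 0.5 * (Pi(k,i-1) + Pi(k,i)) starting from one root node."""
    probs = [1.0]
    for k in range(n_steps):
        nxt = []
        for i in range(k + 2):
            left = probs[i - 1] if i - 1 >= 0 else 0.0   # boundary: Pi(k,i) = 0 if i < 0
            right = probs[i] if i <= k else 0.0          # boundary: Pi(k,i) = 0 if i > k
            nxt.append(0.5 * (left + right))
        probs = nxt
    return probs


probs = propagate(10)
print(sum(probs))                                        # total probability on the slice
print([round(p, 4) for p in probs])
print(all(abs(p - comb(10, i) / 2**10) < 1e-15 for i, p in enumerate(probs)))
```

Forgetting either boundary condition shows up immediately as a slice whose probabilities no longer sum to one, which is why the borders deserve the attention.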
92
93. Luc_Faucheux_2020
The BDT dynamics VI
¨ $\langle \Delta X \rangle = \frac{1}{2}\{X(k+1,i+1) + X(k+1,i)\} - X(k,i) = 0$
¨ So no drift in this simple example
¨ When writing $dX = \mu\, dt + \sigma\, dW$, we would have $\mu = 0$
¨ $\langle \Delta X^2 \rangle = \frac{1}{2}\{(X(k+1,i+1) - X(k,i))^2 + (X(k+1,i) - X(k,i))^2\}$
¨ $\langle \Delta X^2 \rangle = \frac{1}{2}\{(\delta X/2)^2 + (\delta X/2)^2\} = \frac{\delta X^2}{4}$
¨ When writing $dX = \mu\, dt + \sigma\, dW$, we would have $\sigma^2\, \delta t = \frac{\delta X^2}{4}$
93
[Figure: node X(k,i) branching to X(k+1,i+1) and X(k+1,i), each at a distance δX/2]
94. Luc_Faucheux_2020
The BDT dynamics VII
¨ $\sigma^2\, \delta t = \frac{\delta X^2}{4}$
¨ Again we see the scaling argument: if we want $\sigma$ to be well defined (does not go to 0,
does not go to infinity) as we reduce the size of the time step and the spacing in order
to reach the continuous limit from the discrete implementation, we need
¨ $\frac{\delta X^2}{4\, \delta t} \to \sigma^2 = \text{constant}$ when $\delta X \to 0$ and $\delta t \to 0$
¨ So the discrete tree implemented converges to the continuous dynamics described by the
SDE (Stochastic Differential Equation): $dX = \mu\, dt + \sigma\, dW$
¨ With $\mu = 0$
¨ And $\sigma^2 = \lim \frac{\delta X^2}{4\, \delta t}$
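The scaling argument can be checked with a few lines of arithmetic. In this sketch (function and variable names are hypothetical) the variance at a fixed horizon T is the sum of n independent ±δX/2 moves; it stays at σ²T only when δX shrinks like the square root of δt.

```python
import math


def terminal_variance(sigma, T, n_steps, spacing):
    """Variance at horizon T of n_steps independent +/- dX/2 moves (prob 1/2 each)."""
    dt = T / n_steps
    dX = spacing(sigma, dt)
    return n_steps * (dX / 2.0) ** 2


def diffusive(sigma, dt):
    return 2.0 * sigma * math.sqrt(dt)     # dX^2 / (4 dt) = sigma^2


def linear(sigma, dt):
    return 2.0 * sigma * dt                # wrong scaling, for comparison only


for n in (10, 100, 1000):
    print(n, terminal_variance(0.01, 1.0, n, diffusive),
          terminal_variance(0.01, 1.0, n, linear))
```

With the diffusive spacing the variance is independent of the number of steps; with a spacing linear in δt it collapses to zero as the tree is refined.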
94
95. Luc_Faucheux_2020
The BDT dynamics VIII
¨ From the SDE 𝑑𝑋 = 𝜇. 𝑑𝑡 + 𝜎. 𝑑𝑊 we should be able to recover the PDE for the PDF
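¨ $\frac{\partial \Pi(X,t)}{\partial t} = -\frac{\partial}{\partial X}\left[\mu\, \Pi(X,t) - \frac{\partial}{\partial X}\left(\frac{\sigma^2}{2}\, \Pi(X,t)\right)\right] = \frac{\sigma^2}{2}\, \frac{\partial^2 \Pi(X,t)}{\partial X^2}$
¨ $\frac{\partial \Pi(X,t)}{\partial t} = \frac{\sigma^2}{2}\, \frac{\partial^2 \Pi(X,t)}{\partial X^2}$
¨ Let’s rederive this directly from the tree and the master equation
¨ $\Pi(k+1,i) = \frac{1}{2}\{\Pi(k,i-1) + \Pi(k,i)\}$
¨ Or more importantly
¨ $\Pi(X(k+1,i)) = \frac{1}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\}$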
95
96. Luc_Faucheux_2020
The BDT dynamics IX
¨ We have $\Pi(X(k+1,i)) = \frac{1}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\}$
¨ With:
¨ 𝑋 𝑘 + 1, 𝑖 = 𝑋 𝑘, 𝑖 − 𝛿𝑋/2
¨ 𝑋 𝑘 + 1, 𝑖 + 1 = 𝑋 𝑘, 𝑖 + 𝛿𝑋/2
¨ Taylor expanding in time and space the master equation around 𝑋(𝑘 + 1, 𝑖) which is the
middle point (easier because we will perform symmetric calculations)
96
[Figure: node X(k+1,i) reached from X(k,i) and X(k,i-1), each at a distance δX/2]
101. Luc_Faucheux_2020
The BDT mean reversion I
¨ Everything we did so far assumed constant time step and spacing between nodes.
¨ We keep a constant time step
¨ Because of the algorithm construction of the BDT, there is a constant shift (up or down) in
order to match the discount curve.
¨ This is just a drift on top of the diffusive process.
¨ More subtle is the mean reversion effect. Mean reversion is a crucial concept in finance. It
indicates that things have a tendency when diffusing away from the equilibrium to ”revert
back to the mean” or mean revert, hence the term “mean reversion”
¨ What is the intuition for it in the BDT model?
¨ Essentially at a given time indexed by k, there are k nodes. The spacing between the nodes
is given by the volatility input. What happens when this spacing is time-dependent,
meaning its value changes from (k) to (k+1)? We could end up in a situation like this one:
101
102. Luc_Faucheux_2020
The BDT mean reversion II
¨ Decreasing volatility input from (k) to (k+1). The tree gets compressed. Positive mean
reversion (usual one)
¨ Note: in this case it is obvious that a BDT binomial model breaks the arbitrage locally, even
though on average over the time slices the arbitrage is still respected
¨ The further away from the center the greater the pull-back : typical of mean reversion
102
103. Luc_Faucheux_2020
The BDT mean reversion III
¨ Increasing volatility input from (k) to (k+1). The tree explodes. Negative mean reversion
(unusual one)
¨ Note: in this case it is obvious that a BDT binomial model breaks the arbitrage locally, even
though on average over the time slices the arbitrage is still respected
¨ The further away from the center the greater the push-out : typical of negative mean
reversion
103
104. Luc_Faucheux_2020
The BDT mean reversion IV
¨ It should not come as a surprise, because in some ways, in BDT for a given “slice” at a given
point in time k, you just put k nodes equally spaced and then you just shift those up and
down in order to recover the average discount factor, and then you connect those dots back
to the previous time slice
¨ Local versus average arbitrage free
¨ BDT is said to be arbitrage free because when discounting back on the tree you recover the
price of today’s discount curve. Note that it is only ON AVERAGE that the arbitrage is being
respected. When there is mean reversion, locally at any point in the BDT tree the ”local”
arbitrage is NOT being respected, and so the local dynamics of BDT (if we can figure it out)
will NOT be arbitrage free (except in very limited cases, for example constant volatility).
¨ Another way to say it is that if you build a BDT tree out of any node in an existing BDT tree,
this will NOT be a BDT tree, and will not respect arbitrage
¨ Let’s try to estimate how this impacts the dynamic equation.
104
105. Luc_Faucheux_2020
The BDT mean reversion V
¨ For k large enough the tree is centered around $X(k, \frac{k}{2}) = X_m(k)$, with $m$ for middle
¨ For each node $X(k,i) = X_m(k) + (i - \frac{k}{2})\, \delta X(k)$
¨ $X(k+1,i) = X_m(k+1) + (i - \frac{k+1}{2})\, \delta X(k+1)$
¨ $X(k+1,i+1) = X_m(k+1) + (i + 1 - \frac{k+1}{2})\, \delta X(k+1)$
105
[Figure: node X(k,i) branching to X(k+1,i+1) and X(k+1,i), which are separated by δX(k+1)]
106. Luc_Faucheux_2020
The BDT mean reversion VI
¨ Setting the drift of the overall tree to 0, $X_m(k+1) = X_m(k)$
¨ Note that we can always add later a drift that is constant for all nodes; we are trying to isolate the
effect of the changing spacing between two time slices
¨ $X(k,i) = X_m(k) + (i - \frac{k}{2})\, \delta X(k)$
¨ $X(k+1,i) = X_m(k+1) + (i - \frac{k+1}{2})\, \delta X(k+1)$
¨ $X(k+1,i+1) = X_m(k+1) + (i + 1 - \frac{k+1}{2})\, \delta X(k+1)$
¨ $\delta X(k+1) = \delta X(k) + \frac{\partial \delta X(t)}{\partial t}\, \delta t + \mathcal{O}(\delta t^2)$
¨ $\langle \Delta X \rangle = \frac{1}{2}\{X(k+1,i+1) + X(k+1,i)\} - X(k,i)$
106
108. Luc_Faucheux_2020
The BDT mean reversion VIII
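¨ Locally there is a drift term equal to $(i - \frac{k}{2})\, \frac{\partial \delta X(t)}{\partial t}\, \delta t$
¨ In the middle, where $i = \frac{k}{2}$, this drift term is zero
¨ The farther we get away from the center, the greater this drift term (not equal for all nodes)
¨ The greater the derivative of the spacing with respect to time, $\frac{\partial \delta X(t)}{\partial t}$, the greater the drift
¨ $\langle X \rangle = X(k,i) + (i - \frac{k}{2})\, \frac{\partial \delta X(t)}{\partial t}\, \delta t$
¨ $\langle \Delta X \rangle = (i - \frac{k}{2})\, \frac{\partial \delta X(t)}{\partial t}\, \delta t$
¨ $X(k,i) = X_m(k) + (i - \frac{k}{2})\, \delta X(k)$
¨ So we have $\langle \Delta X \rangle = \left(\frac{X(k,i) - X_m(k)}{\delta X(t)}\right) \frac{\partial \delta X(t)}{\partial t}\, \delta t$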
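The local drift created by a changing spacing can be verified numerically. The sketch below is an illustration under stated assumptions: the centre of the tree is held flat at 0, the spacing schedule `shrinking` is made up, and the function names are hypothetical.

```python
def node(k, i, spacing):
    """X(k,i) = (i - k/2) * dX(k), i.e. the tree centre X_m is set to 0."""
    return (i - k / 2.0) * spacing(k)


def local_drift(k, i, spacing):
    """<dX> at node (k,i) with 1/2-1/2 branching to (k+1,i) and (k+1,i+1)."""
    mean_next = 0.5 * (node(k + 1, i + 1, spacing) + node(k + 1, i, spacing))
    return mean_next - node(k, i, spacing)


def shrinking(k):
    return 0.01 * 0.95 ** k            # hypothetical decreasing volatility input


k = 10
for i in (0, 5, 10):
    predicted = (i - k / 2.0) * (shrinking(k + 1) - shrinking(k))
    print(i, local_drift(k, i, shrinking), predicted)
```

With a decreasing spacing the top node drifts down and the bottom node drifts up, while the middle node does not move: the tree pulls back toward its centre.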
108
109. Luc_Faucheux_2020
The BDT mean reversion IX
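¨ $\langle \Delta X \rangle = \frac{X(k,i) - X_m(k)}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t = \{X(k,i) - X_m(k)\}\, \frac{1}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t$
¨ For convenience sake, either we set $X_m(k) = 0$, or we define $Y(k,i) = X(k,i) - X_m(k)$
and work in the tree in $Y$.
¨ If we set $X_m(k) = 0$
¨ $\langle \Delta X \rangle = \frac{X(k,i)}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t = X(k,i)\, \frac{1}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t$
¨ So when writing $dX = \mu\, dt + \sigma\, dW$ we would have
¨ $\mu = X\, \frac{1}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}$
¨ We had previously $\sigma^2\, \delta t = \frac{\delta X^2}{4}$, so it looks like $\mu = X\, \frac{1}{\sigma}\, \frac{\partial \sigma}{\partial t}$, where $\sigma = \sigma(t)$
¨ So it looks like $dX = X\, \frac{\sigma'}{\sigma}\, dt + \sigma\, dW$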
109
110. Luc_Faucheux_2020
The BDT mean reversion X
¨ We still need to check that the 𝜎 term in front of the Brownian driver is still the correct one
that we obtain in the case of constant 𝜎
¨ To do that we need to estimate:
¨ $\langle \Delta X^2 \rangle = \frac{1}{2}\{(X(k+1,i+1) - \langle X \rangle)^2 + (X(k+1,i) - \langle X \rangle)^2\}$
¨ Where $\langle X \rangle$ is the mid point between $X(k+1,i+1)$ and $X(k+1,i)$
¨ $\langle X \rangle = \frac{1}{2}\{X(k+1,i+1) + X(k+1,i)\}$
¨ And so $\langle \Delta X^2 \rangle = \frac{1}{4}\{X(k+1,i+1) - X(k+1,i)\}^2 = \frac{1}{4}\{\delta X(k+1)\}^2$
¨ This is the same as before (the intuition is that the drift does not pollute the volatility,
because the drift is of order 1 in time, and so any drift when computing the second moment
will appear as a term of order 2 in time, which will be neglected when compared to the
stochastic term, which is only of order 1 in time)
110
111. Luc_Faucheux_2020
The BDT mean reversion XI
¨ The numerical implementation of the BDT tree, by equally spacing k nodes at time k, and
then forcing those nodes to connect in a recombining binomial manner, “artificially” creates
a mean reversion drift
¨ The BDT dynamics is $dX = X\, \frac{\sigma'}{\sigma}\, dt + \sigma\, dW$
¨ At constant volatility, the drift term disappears
¨ I can bet some large amount of money that if we had started from the actual dynamics and
been asked to build a tree that follows it, we would have struggled quite a lot
¨ We went through this derivation to illustrate the potential pitfalls of going in between
numerical tree implementation and continuous dynamics
¨ Note that usually BDT are built with a short term rate in lognormal space, $X = \ln(r)$
¨ The dynamics then become $d(\ln r) = \ln(r)\, \frac{\sigma'}{\sigma}\, dt + \sigma\, dW$
¨ Note that the meaning of 𝜎 is now different, it is now the lognormal volatility
111
112. Luc_Faucheux_2020
The BDT mean reversion XII
¨ Note that as usual following the Ito convention (Ito lemma) in Ito calculus
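¨ $d \ln(r) = \frac{dr}{r} - \frac{\sigma^2}{2}\, dt$
¨ Let’s go back to $dX = X\, \frac{\sigma'}{\sigma}\, dt + \sigma\, dW$
¨ Also normal models are much more in favor now that rates have proven that they can go
negative.
¨ The associated PDE for the PDF is given by
¨ $\frac{\partial \Pi(X,t)}{\partial t} = -\frac{\partial}{\partial X}\left[\mu\, \Pi(X,t) - \frac{\partial}{\partial X}\left(\frac{\sigma^2}{2}\, \Pi(X,t)\right)\right]$
¨ $\frac{\partial \Pi(X,t)}{\partial t} = -\frac{\partial}{\partial X}\left[X\, \frac{\sigma'}{\sigma}\, \Pi(X,t) - \frac{\partial}{\partial X}\left(\frac{\sigma^2}{2}\, \Pi(X,t)\right)\right]$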
¨ Note that 𝜎 is a function of the time t and not the stochastic variable, so we are safe from
any Ito versus Stratonovitch controversy for now
112
113. Luc_Faucheux_2020
The BDT mean reversion XIII
¨ In the most general case with added drift (the multiplicative factors K that we used in the
Excel implementation of the BDT)
¨ $dX = \{K(t) + X\, \frac{\sigma'(t)}{\sigma(t)}\}\, dt + \sigma(t)\, dW$
¨ We can also split the drift term in order to make obvious the reversion to the mean $X_m(t)$
¨ $dX = \{K(t) + (X - X_m(t))\, \frac{\sigma'(t)}{\sigma(t)}\}\, dt + \sigma(t)\, dW$
¨ Or going into Lognormal space
¨ $d(\ln r) = \{K(t) + (\ln(r) - \ln(r_m(t)))\, \frac{\sigma'(t)}{\sigma(t)}\}\, dt + \sigma(t)\, dW$
¨ The usual textbook description of the BDT dynamics just goes with
¨ $d \ln r = \{\theta(t) + \ln(r)\, \frac{\sigma'}{\sigma}\}\, dt + \sigma\, dW$
¨ With the drift 𝜃 𝑡 taking care of the arbitrage free constraint to recover the discounts
113
114. Luc_Faucheux_2020
The BDT mean reversion XIV
¨ Always be careful when dealing with numerical implementation of SDE
¨ In the example above, when having constant volatility, the dynamics is the one sometimes
called the Ho-Lee:
¨ 𝑑𝑋 = 𝐾 𝑡 . 𝑑𝑡 + 𝜎. 𝑑𝑊
¨ If $X$ is the short-rate, we have $K(t) = \frac{\partial f(0,t)}{\partial t} + \sigma^2\, t$
¨ Where 𝑓(0, 𝑡) is the initial forward rate curve
¨ When 𝜎 becomes a function of time 𝜎(𝑡), we would be tempted to keep the tree that we
built for Ho-Lee and just change the spacing at each time step in order to fit the volatility.
¨ What we showed is that doing this “blindly” introduces mean reversion and changes the
dynamics (SDE) to:
¨ $dX = \{K(t) + X\, \frac{\sigma'(t)}{\sigma(t)}\}\, dt + \sigma(t)\, dW$
114
116. Luc_Faucheux_2020
A quick aside on probabilities
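¨ Probability to go up $P_u$, probability to go down $P_d$, with $P_u + P_d = 1$
¨ If both probabilities are equal to 1/2
¨ $\langle \Delta X \rangle = \frac{1}{2}\{X + \frac{\delta X}{2} + X - \frac{\delta X}{2}\} - X = 0$
¨ $\langle \Delta X^2 \rangle = \frac{1}{2}\{(\frac{\delta X}{2})^2 + (\frac{\delta X}{2})^2\} = \frac{\delta X^2}{4}$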
116
[Figure: node X branching over a time step δt to X + δX/2 with probability P_u and to X - δX/2 with probability P_d]
117. Luc_Faucheux_2020
A quick aside on probabilities II
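¨ $\langle \Delta X \rangle = 0$
¨ $\langle \Delta X^2 \rangle = \frac{\delta X^2}{4}$
¨ We match the process $dX = 0\, dt + \sigma\, dW$
¨ $\sigma^2\, \delta t = \frac{\delta X^2}{4}$
¨ $\delta X = 2 \sigma\, \delta t^{1/2}$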
117
118. Luc_Faucheux_2020
Adding drift the easy way (?)
¨ We keep the probabilities equal to (1/2)
¨ The only difference is the addition of the drift term
118
[Figure: node X branching over a time step δt to X + δX/2 + a.δt with probability P_u and to X - δX/2 + a.δt with probability P_d]
119. Luc_Faucheux_2020
Adding drift the easy way (?) II
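¨ $\langle \Delta X \rangle = \frac{1}{2}\{X + \frac{\delta X}{2} + a\, \delta t + X - \frac{\delta X}{2} + a\, \delta t\} - X = a\, \delta t$
¨ $\langle \Delta X^2 \rangle = \frac{1}{2}\{(\frac{\delta X}{2} + a\, \delta t)^2 + (-\frac{\delta X}{2} + a\, \delta t)^2\}$
¨ $\langle \Delta X^2 \rangle = (a\, \delta t)^2 + \frac{\delta X^2}{4}$
¨ We only keep the terms of order 1 in $\delta t$ (small time steps limit)
¨ $\langle \Delta X^2 \rangle = \frac{\delta X^2}{4}$
¨ Note that locally, $\langle \Delta X^2 \rangle = \frac{1}{2}\{(\frac{\delta X}{2})^2 + (\frac{\delta X}{2})^2\} = \frac{\delta X^2}{4}$ exactly
¨ We match the process $dX = a\, dt + \sigma\, dW$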
¨ We match the process 𝑑𝑋 = 𝑎. 𝑑𝑡 + 𝜎. 𝑑𝑊
119
120. Luc_Faucheux_2020
Changing the drift by changing the probabilities I
¨ We do not change the values in the space axis (the values for X have been picked already)
¨ We change the probabilities $P_u$ and $P_d$
¨ This is quite common when building trees, as the node spacing has already been calibrated.
¨ This actually is a little more tricky to deal with (in particular when you change the
probabilities, do you still recover a Gaussian for example? This is linked to the Girsanov
theorem, just to tell you that this is not as simple as it seems)
120
[Figure: node X branching over a time step δt to X + δX/2 with probability P_u and to X - δX/2 with probability P_d]
121. Luc_Faucheux_2020
Changing the drift by changing the probabilities II
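¨ We define the probabilities $P_u = \frac{1}{2} + \varepsilon$ and $P_d = \frac{1}{2} - \varepsilon$
¨ $\langle \Delta X \rangle = (\frac{1}{2} + \varepsilon)\{X + \frac{\delta X}{2}\} + (\frac{1}{2} - \varepsilon)\{X - \frac{\delta X}{2}\} - X = \varepsilon\, \delta X$
¨ Now, $\delta X$ scales as $\delta t^{1/2}$, so we know right off the bat that, in order for the drift term to be
linear in time (in order to model $dX = a\, dt + \sigma\, dW$), we are going to need that
number $\varepsilon$ to scale something like $\{a\, (\delta t)^{1/2}\}$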
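¨ $\langle \Delta X^2 \rangle = (\frac{1}{2} + \varepsilon)(\frac{\delta X}{2})^2 + (\frac{1}{2} - \varepsilon)(-\frac{\delta X}{2})^2 = \frac{\delta X^2}{4}$
¨ Measured around the mean, $\langle \Delta X^2 \rangle - \langle \Delta X \rangle^2 = \frac{\delta X^2}{4}(1 - 4 \varepsilon^2)$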
121
122. Luc_Faucheux_2020
Changing the drift by changing the probabilities III
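¨ $\langle \Delta X \rangle = \varepsilon\, \delta X = a\, \delta t$
¨ $\langle \Delta X^2 \rangle - \langle \Delta X \rangle^2 = \frac{\delta X^2}{4}(1 - 4 \varepsilon^2) = \sigma^2\, \delta t$ to first order in $\delta t$
¨ So $\varepsilon = \frac{a\, \delta t}{\delta X}$ and $\delta X = 2 \sigma\, \delta t^{1/2}$, so we can also write $\varepsilon = \frac{a}{2 \sigma}\, \delta t^{1/2}$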
¨ So changing the probability does not affect the variance in the limit of small time steps
(continuous limit).
¨ Not as straightforward as shifting the values in space and keeping the probabilities equal to
(1/2)
¨ Changing the probabilities to affect the drift is NOT trivial (in particular we have to convince
ourselves that we did not change the shape of the distribution, just the first moment)
¨ Girsanov theorem, Radon-Nikodym derivative if you want to use big words
122
123. Luc_Faucheux_2020
Changing the drift by changing the probabilities IV
¨ So we match the process: 𝑑𝑋 = 𝑎. 𝑑𝑡 + 𝜎. 𝑑𝑊
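¨ $\delta X = 2 \sigma\, \delta t^{1/2}$
¨ $\varepsilon = \frac{a}{2 \sigma}\, \delta t^{1/2}$
¨ WAIT A MINUTE you should say, $\varepsilon$ is only a number, not a complicated formula:
¨ $\varepsilon = \frac{a}{2 \sigma}\, \delta t^{1/2}$
¨ $a$ is a drift and so scales as $[\frac{X}{t}]$
¨ $\sigma$ is a volatility and so scales as $[\frac{X}{t^{1/2}}]$, or easier, $\sigma^2 [t]$ scales as $[X^2]$
¨ SO.. $\varepsilon$ scales as $[\frac{X}{t}] \cdot [\frac{t^{1/2}}{X}] \cdot [t^{1/2}]$, which is dimensionless and a number indeed !!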
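A short numerical check of the tilted probabilities, as a sketch only: the function name `tilted_moments` and the parameter values are hypothetical. It computes ε from a, σ and δt, then the first two moments of one step.

```python
import math


def tilted_moments(a, sigma, dt):
    """Mean and variance of one step with P_u = 1/2 + eps, P_d = 1/2 - eps."""
    dX = 2.0 * sigma * math.sqrt(dt)
    eps = a * math.sqrt(dt) / (2.0 * sigma)       # dimensionless
    p_up, p_down = 0.5 + eps, 0.5 - eps
    if not (0.0 <= p_up <= 1.0 and 0.0 <= p_down <= 1.0):
        raise ValueError("time step too large: probabilities leave [0, 1]")
    mean = p_up * (dX / 2.0) + p_down * (-dX / 2.0)
    second = p_up * (dX / 2.0) ** 2 + p_down * (dX / 2.0) ** 2
    return eps, mean, second - mean ** 2


a, sigma = 0.02, 0.01
for dt in (1e-2, 1e-3, 1e-4):
    eps, mean, var = tilted_moments(a, sigma, dt)
    print(dt, eps, mean / dt, var / dt)
```

The ratio mean/δt stays at a, while var/δt approaches σ² as δt shrinks; the guard on the probabilities is the practical constraint that ε must stay below 1/2.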
123
125. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree
¨ Market practice is to do the following:
– Build the tree (if needed to save memory, build also some “ghost nodes”)
– Start with a given spacing in rates
– Adjust the multiplication factor to match the discount curve
– Adjust the spacing in order to match at-the-money caplet prices (this take into account the
numerical noise aspect, and also the bias in the BDT discounting, which needs to be taken into
account by adjusting the vol)
– Recalibrate the multiplication factors to match the discount curve
– Iterate a couple of times if needed
– NOW comes the kicker: you will NOT recover the price of at-the-money European swaptions,
NOR will you recover the price of callable options (Formosa), there are a lot of reasons why
– The most used “trick” is to add another mean reversion in the BDT tree on top of the “natural”
mean reversion: $\{\frac{\sigma'}{\sigma}\}$
125
126. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree-b
¨ This is in practice a very important point:
¨ Models do get calibrated to a finite subset of instruments.
¨ For reasons that we went over in the Structured Powerpoint, a simple model will NEVER be
able to capture the right price (and as a result risk and hedges) for caps and callables, or
European swaptions
¨ A true model that should capture the callable prices should be able to incorporate the
correlation between the curve steepness as stochastic driver, and the first factor in the
volatility surface
¨ Such a model is quite a task to build (no one in the market currently has one)
¨ But as we also showed in the Skew Powerpoint, it is quite legit to use a simpler model, run it
through a hedging scenario and capture the residual as the difference in pricing
¨ So you could say, why even bother with mean reversion? Take your model as is, and then
run it through your scenario hedging
126
127. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree-c
¨ The answer to that is, yes you could do that, and that would be fine
¨ Introducing mean reversion to the current model brings it closer to the market price.
¨ So it might not be the actual dynamics, but at least you are starting closer to where you
want to end, so any Taylor expansion / approximation will be smaller and more justified
¨ Also, even though we are a big proponent of the Scenario hedging, this is a class after all, so
playing with the mean reversion is a nice way to introduce some of the concepts around tree
pricing
127
128. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree II
¨ Let’s go through the derivation of that “trick” (not the only one but a rather popular one)
¨ More to illustrate the tree dynamics, and how a change in the algorithm affects the
underlying dynamics, and how one needs to be careful about the small details
¨ In textbooks, sometimes you read that you need a trinomial tree in order to have an extra
degree of freedom in order to accommodate mean reversion.
¨ I have never fully understood why that is the case, and what a trinomial tree has that a two-
step binomial tree cannot exhibit
¨ We will also present a trinomial tree implementation as it is a neat example of something
crucial when dealing with trees (or sometimes trees on a grid) : stability analysis and
constraints on the discrete parameters
128
129. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree III
¨ $X(k,i) = X_m(k) + (i - \frac{k}{2})\, \delta X(k)$
¨ $X(k, \frac{k}{2}) = X_m(k)$
¨ Once the volatility structure $\delta X(k)$ is given, there is no choice on the mean reversion
parameter, which is given by $\{\frac{\sigma'}{\sigma}\}$
¨ $\Pi(X(k+1,i)) = \frac{1}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\}$
129
[Figure: node X(k,i) branching to X(k+1,i+1) and X(k+1,i), each at a distance δX/2]
130. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree IV
¨ Π 𝑋(𝑘, 𝑖) follows the binomial distribution, in the continuous limit we recover the Gaussian
distribution and the diffusion equation.
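¨ $\Pi(X(k+1,i)) = \frac{1}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\}$
¨ $\Pi(X(k,i)) = 2^{-k} \binom{k}{i} = \frac{1}{2^k}\, \frac{k!}{i!\, (k-i)!}$
¨ $\Pi(X(k,i-1)) = 2^{-k} \binom{k}{i-1} = \frac{1}{2^k}\, \frac{k!}{(i-1)!\, (k-i+1)!}$
¨ We have: $\binom{k}{i-1} + \binom{k}{i} = \binom{k+1}{i}$
¨ So by recurrence we also have $\Pi(X(k+1,i)) = 2^{-(k+1)} \binom{k+1}{i} = \frac{1}{2^{k+1}}\, \frac{(k+1)!}{i!\, (k+1-i)!}$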
130
131. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree V
¨ One introduces mean reversion in an algorithmic fashion in the following way:
¨ Usual tree:
¨ $X(k,i) \to X(k+1,i)$ with probability $P_d = 1/2$
¨ $X(k,i) \to X(k+1,i+1)$ with probability $P_u = 1/2$
¨ We introduce $\mu$ (note that a time dependent $\mu(t)$ is completely analogous, but for the sake of
simplicity we will drop the time dependence for now)
¨ $X(k,i) \to X(k+1,i)$ with probability $P_d = \frac{1}{2}(1 - \mu)$
¨ $X(k,i) \to X(k+1,i+1)$ with probability $P_u = \frac{1}{2}(1 - \mu)$
¨ $X(k,i) \to X(k+1,j)$ with probability $P_{(k,i) \to (k+1,j)} = \mu\, \Pi(X(k+1,j))$
131
132. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree VI
¨ The functional $P_{(k,i) \to (k+1,j)} = \mu\, \Pi(X(k+1,j))$ is chosen so that the probabilities on the
tree are intact
¨ We do not change the variance, nor the shape of the distribution for a given time section in
the BDT tree
¨ We only change the probability of jumping from one node back to another across time
sections
¨ This will work both when jumping forward in time in the BDT tree (when computing things
like Asian options that are path dependent, or for example the Arrow-Debreu prices of
securities), and when jumping backward in the BDT tree, when discounting back
American options or pricing callable options.
132
133. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree VI-a
¨ Note that we do not lose any probability density
¨ $X(k,i) \to X(k+1,i)$ with probability $P_d = \frac{1}{2}(1 - \mu)$
¨ $X(k,i) \to X(k+1,i+1)$ with probability $P_u = \frac{1}{2}(1 - \mu)$
¨ $X(k,i) \to X(k+1,j)$ with probability $P_{(k,i) \to (k+1,j)} = \mu\, \Pi(X(k+1,j))$
¨ So out of the node $X(k,i)$ we will transfer to the next time slice $k+1$ a total:
¨ $P = \frac{1}{2}(1 - \mu) + \frac{1}{2}(1 - \mu) + \mu \sum_{j=0}^{j=k+1} \Pi(X(k+1,j))$
¨ And since $\sum_{j=0}^{j=k+1} \Pi(X(k+1,j)) = 1$ we have $P = 1 = 100\%$
¨ So 100% of the probability density in the node 𝑋 𝑘, 𝑖 gets transferred to the next time slice,
we are not losing any probability density (note that we assumed that the sum of the next
tree level probabilities add up to one, so a little self consistent)
133
134. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree VII
¨ The old master equation was: $\Pi(X(k+1,i)) = \frac{1}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\}$
¨ The new master equation is a little more complicated at first:
¨ So just to be rigorous, we will note $\Pi'(X(k+1,i))$ the new probability distribution
¨ $\Pi(X(k+1,i))$ is the baseline or the $\mu = 0$ solution
¨ $\Pi(X(k+1,i))$ is the usual binomial distribution
¨ $\Pi(X(k+1,i)) = 2^{-(k+1)} \binom{k+1}{i} = \frac{1}{2^{k+1}}\, \frac{(k+1)!}{i!\, (k+1-i)!}$
134
136. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree VII-c
¨ We now go from 𝑘 → (𝑘 + 1)
¨ We assume that $\Pi'(X(k,j)) = \Pi(X(k,j))$
¨ The master equation is
¨ $\Pi'(X(k+1,i)) = \frac{1-\mu}{2}\{\Pi'(X(k,i-1)) + \Pi'(X(k,i))\} + \mu \sum_j \Pi'(X(k,j))\, \Pi(X(k+1,i))$
¨ $\Pi'(X(k+1,i)) = \frac{1-\mu}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\} + \mu \sum_j \Pi(X(k,j))\, \Pi(X(k+1,i))$
¨ Since $\sum_j \Pi(X(k,j)) = 1$
¨ $\Pi'(X(k+1,i)) = \frac{1-\mu}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\} + \mu\, \Pi(X(k+1,i))$
136
137. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree VII-d
¨ $\Pi'(X(k+1,i)) = \frac{1-\mu}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\} + \mu\, \Pi(X(k+1,i))$
¨ Now note that if $\Pi'(X(k+1,i)) = \Pi(X(k+1,i))$
¨ The Master Equation now would become $\Pi'(X(k+1,i)) = \frac{1-\mu}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\} + \mu\, \Pi'(X(k+1,i))$
¨ Or: $\Pi'(X(k+1,i))\, (1 - \mu) = \frac{1}{2}(1 - \mu)\{\Pi(X(k,i-1)) + \Pi(X(k,i))\}$
¨ Dividing both sides by the $(1 - \mu)$ term we get:
¨ $\Pi'(X(k+1,i)) = \frac{1}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\} = \Pi(X(k+1,i))$
¨ This is the same master equation, so probabilities are conserved, and the distribution is also
conserved !! In particular, we also have: $\sum_{j=0}^{j=k+1} \Pi(X(k+1,j)) = 1$
137
138. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree VII-e
¨ Without guessing so much:
¨ $\Pi'(X(k+1,i)) = \frac{1-\mu}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\} + \mu\, \Pi(X(k+1,i))$
¨ And the binomial Master Equation ensures that
¨ $\frac{1-\mu}{2}\{\Pi(X(k,i-1)) + \Pi(X(k,i))\} = (1 - \mu)\, \Pi(X(k+1,i))$
¨ And so:
¨ $\Pi'(X(k+1,i)) = (1 - \mu)\, \Pi(X(k+1,i)) + \mu\, \Pi(X(k+1,i)) = \Pi(X(k+1,i))$
¨ Sounds pretty obvious, but our little trick did not change the probability distribution, and so
it will not affect the price of caplets, and swaps, or any derivatives that can be discounted
without any path dependency
¨ Swaptions and callables will change in pricing as it will have to be discounted back along the
tree (say it another way, some of the State Prices will be affected)
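The conservation of the slice distribution can also be checked by brute force. The sketch below (hypothetical function names, arbitrary μ = 0.2) pushes probability forward with the modified transition rule and compares each slice with the plain binomial.

```python
from math import comb


def binomial_slice(k):
    """Baseline distribution Pi(X(k,i)) = C(k,i) / 2^k."""
    return [comb(k, i) / 2**k for i in range(k + 1)]


def step_with_jumps(probs, mu):
    """One step of the modified master equation from slice k to slice k+1."""
    k = len(probs) - 1
    target = binomial_slice(k + 1)
    nxt = [0.0] * (k + 2)
    for i, p in enumerate(probs):
        nxt[i] += 0.5 * (1.0 - mu) * p          # down branch, P_d
        nxt[i + 1] += 0.5 * (1.0 - mu) * p      # up branch, P_u
        for j, pj in enumerate(target):         # jump to any node of slice k+1
            nxt[j] += mu * pj * p
    return nxt


def evolve(n_steps, mu):
    probs = [1.0]
    for _ in range(n_steps):
        probs = step_with_jumps(probs, mu)
    return probs


modified = evolve(8, mu=0.2)
baseline = binomial_slice(8)
print(max(abs(p - q) for p, q in zip(modified, baseline)))
```

The maximum difference is at the level of floating point noise: only the transitions between slices change, not the distribution on each slice.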
138
139. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree VIII
¨ So numerically we can observe that this will change the price of callable and swaptions
¨ Will it affect the price of caplets?
¨ Can we write down the new dynamics for the BDT with this new parameter?
¨ Can we even write an equation for the continuous process (SDE) ?
¨ Let’s do the same exercise we did in order to get the new dynamics (if we can)
139
140. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree IX
¨ For k large enough the tree is centered around $X(k, \frac{k}{2}) = X_m(k)$, with $m$ for middle
¨ For each node $X(k,i) = X_m(k) + (i - \frac{k}{2})\, \delta X(k)$
¨ $X(k+1,i) = X_m(k+1) + (i - \frac{k+1}{2})\, \delta X(k+1)$
¨ $X(k+1,i+1) = X_m(k+1) + (i + 1 - \frac{k+1}{2})\, \delta X(k+1)$
140
[Figure: node X(k,i) branching to X(k+1,i+1) and X(k+1,i), which are separated by δX(k+1)]
141. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree X
¨ The big difference is that NOW the probabilities have changed to :
¨ $X(k,i) \to X(k+1,i)$ with probability $P_d = \frac{1}{2}(1 - \mu)$
¨ $X(k,i) \to X(k+1,i+1)$ with probability $P_u = \frac{1}{2}(1 - \mu)$
¨ $X(k,i) \to X(k+1,j)$ with probability $P_{(k,i) \to (k+1,j)} = \mu\, \Pi(X(k+1,j))$
¨ Where before we had:
¨ $\langle X \rangle = \langle X_{k+1} | X(k,i) \rangle = \frac{1}{2}\{X(k+1,i+1) + X(k+1,i)\}$
¨ We now have to sum over all the possible nodes in the tree at time $(k+1)$
¨ $\langle X_{k+1} | X(k,i) \rangle = \sum_{j=0}^{j=k+1} X(k+1,j)\, P_{(k,i) \to (k+1,j)}$
141
142. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree XI
¨ We can make things easier for us and work in the limit of (𝜇 → 0)
¨ We know that when (𝜇 = 0) we have:
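¨ $\langle \Delta X(\mu = 0) \rangle = \frac{X(k,i) - X_m(k)}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t = \{X(k,i) - X_m(k)\}\, \frac{1}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t$
¨ For convenience sake, either we set $X_m(k) = 0$, or we define $Y(k,i) = X(k,i) - X_m(k)$
and work in the tree in $Y$.
¨ If we set $X_m(k) = 0$ (we can always add a constant tree drift after)
¨ $\langle \Delta X(\mu = 0) \rangle = \frac{X(k,i)}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t = X(k,i)\, \frac{1}{\delta X(t)}\, \frac{\partial \delta X(t)}{\partial t}\, \delta t$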
144. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree XIII
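¨ We now have:
¨ $\langle \Delta X \rangle = \langle \Delta X(\mu = 0) \rangle \cdot (1 - \mu) + \mu \cdot \delta\langle \Delta X \rangle$
¨ $\delta\langle \Delta X \rangle = \left[ \sum_{j=0}^{j=k+1} X(k+1,j) \, \Pi(X(k+1,j)) - X(k,i) \right]$
¨ $\delta\langle \Delta X \rangle = \left[ \bar{X}(k+1) - X(k,i) \right]$
¨ Remember that:
¨ $\bar{X}(k+1) = \sum_{j=0}^{j=k+1} X(k+1,j) \, \Pi(X(k+1,j))$
¨ In the limit of a dense tree $\bar{X}(k) = X(k, \tfrac{k}{2})$
¨ And for each node $X(k,i) = \bar{X}(k) + \left(i - \tfrac{k}{2}\right) \delta X(k)$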
145. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree XIV
¨ $\langle \Delta X \rangle = \langle \Delta X(\mu = 0) \rangle \cdot (1 - \mu) + \mu \cdot \left[ \bar{X}(k+1) - X(k,i) \right]$
¨ Again we take the liberty of setting $\bar{X}(k+1) = \bar{X}(k) = 0$
¨ (as we can always impose a constant drift later)
¨ And so: $\langle \Delta X \rangle = X(k,i) \cdot \frac{1}{\delta X(t)} \cdot \frac{\partial \delta X(t)}{\partial t} \cdot \delta t \cdot (1 - \mu) - \mu \cdot X(k,i)$
¨ This looks almost like what we are after
¨ We need to check that the units are correct and consistent between the two terms; in
particular we look to be missing a $\delta t$ factor in the second term
¨ Also we worked in the limit $(\mu \to 0)$ but we know that the BDT construction is algorithmic in
nature, so we are missing the cross-terms and higher order terms; we will need to be
conscious of that
¨ On the other hand we are trying to get the SDE, which is the continuous limit and not that
helpful in practice, because we are using the BDT tree anyway and are not looking for closed-form solutions
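The expression for $\langle \Delta X \rangle$ on this slide can be checked numerically against the tree itself. The sketch below is an assumption-laden illustration, not the author's code: the function names are hypothetical, the slice centre is 0 at both steps, and the continuous factor $\frac{1}{\delta X} \frac{\partial \delta X}{\partial t} \delta t$ is replaced by its one-step discrete analogue $(\delta X(k+1) - \delta X(k)) / \delta X(k)$.

```python
from math import comb


def node_value(k, i, dx):
    """X(k, i) = (i - k/2) * dX(k), with the slice centre X_bar(k) set to 0."""
    return (i - k / 2.0) * dx


def tree_drift(k, i, mu, dx_k, dx_next):
    """<Delta X> computed directly from the modified transition probabilities."""
    weights = [comb(k + 1, j) / 2 ** (k + 1) for j in range(k + 2)]
    probs = [mu * w for w in weights]          # redistributed share mu * Pi
    probs[i] += 0.5 * (1.0 - mu)               # down branch
    probs[i + 1] += 0.5 * (1.0 - mu)           # up branch
    mean_next = sum(p * node_value(k + 1, j, dx_next) for j, p in enumerate(probs))
    return mean_next - node_value(k, i, dx_k)


def formula_drift(k, i, mu, dx_k, dx_next):
    """(1 - mu) * X * (dX(k+1) - dX(k)) / dX(k) - mu * X, with X = X(k, i)."""
    x = node_value(k, i, dx_k)
    return (1.0 - mu) * x * (dx_next - dx_k) / dx_k - mu * x
```

For example, with `k=6`, `i=5`, `mu=0.1`, `dx_k=0.04` and `dx_next=0.05`, both functions agree, and the `- mu * X` term indeed carries no time-step factor, which is the units question raised above.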
146. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree XV
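¨ $\langle \Delta X \rangle = X(k,i) \cdot \frac{1}{\delta X(t)} \cdot \frac{\partial \delta X(t)}{\partial t} \cdot \delta t \cdot (1 - \mu) - \mu \cdot X(k,i)$
¨ We have $\sigma^2 \delta t = \frac{\delta X^2}{4}$, that is $\delta X = 2\sigma\sqrt{\delta t}$ with $\delta t$ fixed, so $\frac{1}{\delta X(t)} \cdot \frac{\partial \delta X(t)}{\partial t} = \frac{1}{\sigma} \cdot \frac{\partial \sigma}{\partial t} = \frac{\sigma'}{\sigma}$
¨ For $\mu = 0$, the BDT dynamics is $dX = X \cdot \frac{\sigma'}{\sigma} \cdot dt + \sigma \, dW$
¨ In the continuous limit the binomial distribution $\Pi(X(k,i)) = 2^{-k} \binom{k}{i} = \frac{1}{2^k} \cdot \frac{k!}{i! \, (k-i)!}$
converges to the Gaussian $h(X,t) = \frac{1}{\sqrt{2\pi\sigma^2 t}} \exp\left( \frac{-X^2}{2\sigma^2 t} \right)$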
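The convergence of the binomial weights to the Gaussian can be observed numerically. The following is a minimal sketch, assuming a constant $\sigma$, a slice centre at 0, the spacing $\delta X = 2\sigma\sqrt{\delta t}$ and $t = k \, \delta t$; the function names and default parameter values are hypothetical. The binomial weight is divided by $\delta X$ so that it can be compared with a density.

```python
from math import comb, exp, pi, sqrt


def binomial_density(k, i, sigma, dt):
    """Binomial weight of node (k, i) divided by the node spacing."""
    dx = 2.0 * sigma * sqrt(dt)
    return comb(k, i) / 2**k / dx


def gaussian_density(x, t, sigma):
    """h(X, t) = exp(-X^2 / (2 sigma^2 t)) / sqrt(2 pi sigma^2 t)."""
    return exp(-x * x / (2.0 * sigma**2 * t)) / sqrt(2.0 * pi * sigma**2 * t)


def max_relative_gap(k, sigma=0.2, dt=0.01, width=2.0):
    """Largest relative gap within `width` standard deviations of the centre."""
    dx = 2.0 * sigma * sqrt(dt)
    t = k * dt
    gap = 0.0
    for i in range(k + 1):
        x = (i - k / 2.0) * dx
        if abs(x) <= width * sigma * sqrt(t):
            g = gaussian_density(x, t, sigma)
            gap = max(gap, abs(binomial_density(k, i, sigma, dt) - g) / g)
    return gap
```

Calling `max_relative_gap(100)` and then `max_relative_gap(1000)` should show the gap shrinking as the tree gets denser.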
147. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree XVI
¨ We still need to calculate the standard deviation
¨ We could say the following: our trick did not change the probability distribution and it did
not change the spacing.
¨ And so the variance of the distribution for a given “slice” (all nodes at the same time) has
not changed
¨ This is not completely correct
¨ Because for the dynamics what we care about is the LOCAL variance (for the process)
¨ This is quite an important, and sometimes confusing, concept in tree implementations, where people
sometimes confuse local and average properties
¨ For example a BDT tree respects the arbitrage free constraints ON AVERAGE (averaging over
a slice)
¨ But the BDT tree will NOT respect the arbitrage free constraints locally (as a matter of fact
the very existence of mean reversion in the SDE proves it)
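The difference between slice (average) and local quantities can be made concrete with a small sketch. It assumes constant spacing, a slice centre at 0, and the modified transition probabilities $\tfrac{1}{2}(1-\mu)$ on the two branches plus $\mu \, \Pi$ spread over the next slice; the function names are hypothetical.

```python
from math import comb


def slice_weights(k):
    """Binomial weights of the slice at step k."""
    return [comb(k, j) / 2**k for j in range(k + 1)]


def step_slice(weights, mu):
    """Push the whole slice distribution from step k to step k + 1."""
    k = len(weights) - 1
    total = sum(weights)
    nxt = [mu * total * t for t in slice_weights(k + 1)]
    for i, w in enumerate(weights):
        nxt[i] += 0.5 * (1.0 - mu) * w
        nxt[i + 1] += 0.5 * (1.0 - mu) * w
    return nxt


def local_variance(k, i, mu, dx):
    """Variance of X(k+1, .) conditional on starting from node (k, i)."""
    xs = [(j - (k + 1) / 2.0) * dx for j in range(k + 2)]
    probs = [mu * t for t in slice_weights(k + 1)]
    probs[i] += 0.5 * (1.0 - mu)
    probs[i + 1] += 0.5 * (1.0 - mu)
    mean = sum(p * x for p, x in zip(probs, xs))
    return sum(p * (x - mean) ** 2 for p, x in zip(probs, xs))
```

In this sketch `step_slice` maps the binomial slice at step `k` onto the binomial slice at step `k + 1` for any `mu`, so the slice variance is untouched, while `local_variance` equals `(dx / 2) ** 2` only when `mu = 0`.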
148. Luc_Faucheux_2020
Adding some more mean reversion to the BDT tree XVII
¨ Why is the binomial distribution converging to a Gaussian?
¨ Taylor expansion of the Master equation (Bachelier thesis)
¨ Stirling approximation (Bachelier thesis once again)
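As a hedged illustration of the Stirling route, here is the calculation for the central node only, assuming $k$ even, a constant $\sigma$, the spacing $\delta X = 2\sigma\sqrt{\delta t}$ and $t = k \, \delta t$. It is a sketch of the leading-order term, not a full proof of convergence.

```latex
k! \approx \sqrt{2\pi k}\,\left(\frac{k}{e}\right)^{k}
\quad\Longrightarrow\quad
\Pi\!\left(X(k,\tfrac{k}{2})\right)
= 2^{-k}\,\frac{k!}{\left[(k/2)!\right]^{2}}
\approx 2^{-k}\,\frac{\sqrt{2\pi k}\,(k/e)^{k}}{\pi k\,\left(k/(2e)\right)^{k}}
= \sqrt{\frac{2}{\pi k}}

\frac{1}{\delta X}\,\Pi\!\left(X(k,\tfrac{k}{2})\right)
\approx \frac{1}{2\sigma\sqrt{\delta t}}\,\sqrt{\frac{2}{\pi k}}
= \frac{1}{\sqrt{2\pi\sigma^{2}\,k\,\delta t}}
= h(0,t), \qquad t = k\,\delta t
```

The binomial weight per unit of spacing at the centre of the slice therefore matches the peak of the Gaussian $h(X,t)$ at $X = 0$.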
150. Luc_Faucheux_2020
Trinomial trees – an example of stability constraint I
¨ Again, I am not quite sure why a trinomial step is so different from a two-step binomial obtained by cutting the time
step in half (remember that $\delta X = 2\sigma\sqrt{\delta t}$)
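[Figure: two panels. One shows a single trinomial step from $X(k,i)$ to $X(k+1,i+1)$, $X(k+1,i)$ and $X(k+1,i-1)$; the other shows two binomial steps from $X(k,i)$ to $X(k+2,i+1)$, $X(k+2,i)$ and $X(k+2,i-1)$. In both, adjacent end nodes are $\delta X/2$ apart]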
151. Luc_Faucheux_2020
Trinomial trees – an example of stability constraint II
¨ We saw in the binomial tree that, in order to match the process $dX = 0 \cdot dt + \sigma \, dW$, we need:
¨ $\sigma^2 \delta t = \frac{\delta X^2}{4}$
¨ $\delta X = 2\sigma\sqrt{\delta t}$
¨ So we just have to be a little careful: if we divide the time step in half, we should only scale
the space step (vertical distance or spacing between nodes) by the square root of that factor, i.e. by $1/\sqrt{2}$.
¨ And now adding mean reversion in that binomial tree is also a little tricky
¨ But again, barring the fact that trinomial trees are maybe easier to use (and also easier to
put on a grid from the usual PDE numerical methods like Explicit / Implicit / Crank-Nicolson),
I am still not convinced that trinomial trees present a true qualitative difference
from binomial trees.
¨ But again I have been known to be wrong before, and would love to be challenged on that
issue
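The spacing rule $\delta X = 2\sigma\sqrt{\delta t}$ discussed on this slide can be illustrated with a short sketch. The function names are hypothetical, and the drift is taken to be zero with equal up and down probabilities of 1/2 on each binomial step.

```python
from math import isclose, sqrt


def binomial_spacing(sigma, dt):
    """Node spacing dX = 2 * sigma * sqrt(dt) that matches the variance sigma^2 * dt."""
    return 2.0 * sigma * sqrt(dt)


def two_half_steps(sigma, dt):
    """Moves and probabilities after two binomial steps of length dt / 2 each."""
    # Each half step moves X by +/- (half-step spacing) / 2 with probability 1/2.
    half_move = 0.5 * binomial_spacing(sigma, dt / 2.0)
    moves = [-2.0 * half_move, 0.0, 2.0 * half_move]
    probs = [0.25, 0.5, 0.25]
    return moves, probs
```

Halving `dt` shrinks the spacing by $1/\sqrt{2}$, not by 1/2, and the three-point distribution returned by `two_half_steps` still has variance $\sigma^2 \delta t$. It has the same shape as a trinomial step whose middle probability is 1/2.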
152. Luc_Faucheux_2020
Trinomial trees – an example of stability constraint III
¨ We are going to model the process $dX = \alpha \, dt + \sigma \, dW$
¨ We will then extend to something like $\alpha = \alpha(X,t) = \theta(t) - k(t) \cdot X$ with mean reversion
¨ The trinomial trees usually are “on a grid”, meaning that the spacing is kept constant and the
probabilities are adjusted to reflect the drift, the mean reversion if any, and the volatility
¨ This is useful as the same grid can be used for the PDE solving (implicit and explicit,
Crank-Nicolson, ...) for the diffusion equation
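[Figure: one trinomial step over $\delta t$ from $X(k,i)$ to $X(k+1,i+1)$, $X(k+1,i)$ and $X(k+1,i-1)$, each branch labelled with its probability; adjacent end nodes are $\delta X/2$ apart]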
153. Luc_Faucheux_2020
Trinomial trees – an example of stability constraint IV
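¨ We have the 3 unknowns $P_d$, $P_m$, $P_u$ (probabilities of the down, middle and up moves)
¨ We have the 3 equations:
¨ $P_d + P_m + P_u = 1$
¨ $\langle \Delta X \rangle = P_d \cdot \left( -\frac{\delta X}{2} \right) + P_m \cdot 0 + P_u \cdot \left( \frac{\delta X}{2} \right) = \alpha \, \delta t$
¨ $\langle \Delta X^2 \rangle = P_d \cdot \left( \frac{\delta X}{2} \right)^2 + P_m \cdot 0 + P_u \cdot \left( \frac{\delta X}{2} \right)^2 = (\alpha \, \delta t)^2 + \sigma^2 \delta t$
¨ We have 3 equations, we have 3 unknowns, we have a fighting chance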
155. Luc_Faucheux_2020
Trinomial trees – an example of stability constraint VI
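¨ The solution is then (with $P_d$, $P_m$, $P_u$ the probabilities of the down, middle and up moves):
¨ $P_m = 1 - \left\{ (\alpha \, \delta t)^2 + \sigma^2 \delta t \right\} \cdot \left( \frac{\delta X}{2} \right)^{-2}$
¨ $P_u = \frac{1}{2} \left[ \frac{2 \alpha \, \delta t}{\delta X} + \left\{ (\alpha \, \delta t)^2 + \sigma^2 \delta t \right\} \cdot \left( \frac{\delta X}{2} \right)^{-2} \right]$
¨ $P_d = \frac{1}{2} \left[ -\frac{2 \alpha \, \delta t}{\delta X} + \left\{ (\alpha \, \delta t)^2 + \sigma^2 \delta t \right\} \cdot \left( \frac{\delta X}{2} \right)^{-2} \right]$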
¨ Note that it would seem sensible that all probabilities should be in [0,1]
¨ A negative probability in the tree is not necessarily a sign that something is wrong, but it is a
constraint that we would like to enforce (related to stability analysis)
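The closed-form solution on this slide is easy to check against the three moment equations. The sketch below is illustrative only: the function name and the down / middle / up ordering of the return value are assumptions, and the moves are $-\delta X/2$, $0$, $+\delta X/2$.

```python
def trinomial_probabilities(alpha, sigma, dt, dx):
    """Return (p_down, p_mid, p_up) for moves of -dx/2, 0 and +dx/2."""
    second_moment = (alpha * dt) ** 2 + sigma**2 * dt
    scaled = second_moment / (dx / 2.0) ** 2   # {...} * (dX/2)^-2
    tilt = 2.0 * alpha * dt / dx               # drift term 2 * alpha * dt / dX
    p_up = 0.5 * (tilt + scaled)
    p_down = 0.5 * (-tilt + scaled)
    p_mid = 1.0 - scaled
    return p_down, p_mid, p_up


if __name__ == "__main__":
    alpha, sigma, dt, dx = 0.05, 0.2, 0.01, 0.06
    p_down, p_mid, p_up = trinomial_probabilities(alpha, sigma, dt, dx)
    print("sum of probabilities:", p_down + p_mid + p_up)
    print("first moment:", (p_up - p_down) * dx / 2.0, "target:", alpha * dt)
    print("second moment:", (p_up + p_down) * (dx / 2.0) ** 2,
          "target:", (alpha * dt) ** 2 + sigma**2 * dt)
```

With these hypothetical inputs all three probabilities lie in [0,1]; shrinking `dx` while keeping `dt` fixed eventually pushes `p_mid` below zero.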
156. Luc_Faucheux_2020
Trinomial trees – an example of stability constraint VII
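¨ In the simple case where $\alpha = 0$ (with $P_d$, $P_m$, $P_u$ the probabilities of the down, middle and up moves):
¨ $P_m = 1 - \sigma^2 \delta t \cdot \left( \frac{\delta X}{2} \right)^{-2}$
¨ $P_u = P_d = \frac{1}{2} \cdot \sigma^2 \delta t \cdot \left( \frac{\delta X}{2} \right)^{-2}$
¨ And we have: $P_m > 0$ implies $1 - \sigma^2 \delta t \cdot \left( \frac{\delta X}{2} \right)^{-2} > 0$, or $\delta X > 2\sigma\sqrt{\delta t}$
¨ So if we choose a spacing $\delta X$ smaller than $2\sigma\sqrt{\delta t}$ we will create a negative probability for
the middle jump $P_m$ (meaning the spacing is too small, and so the probabilities will
push more weight onto the wings and deplete the center in order to still match the variance, or
second moment)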
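The threshold on the spacing can be seen directly by evaluating the middle probability for a few values of $\delta X$. This is a minimal sketch for the driftless case $\alpha = 0$; the function names and the parameter values in the usage note are hypothetical.

```python
from math import sqrt


def middle_probability(sigma, dt, dx):
    """Middle-branch probability 1 - sigma^2 * dt * (dX/2)^-2 for alpha = 0."""
    return 1.0 - sigma**2 * dt / (dx / 2.0) ** 2


def critical_spacing(sigma, dt):
    """Smallest spacing 2 * sigma * sqrt(dt) that keeps the middle probability >= 0."""
    return 2.0 * sigma * sqrt(dt)
```

With `sigma = 0.2` and `dt = 0.01` the critical spacing is about 0.04: `middle_probability(0.2, 0.01, 0.05)` is positive, while `middle_probability(0.2, 0.01, 0.03)` is negative.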
157. Luc_Faucheux_2020
Trinomial trees – an example of stability constraint VIII
¨ Note that this does not mean that the probability density at the node 𝑋(𝑘 + 1, 𝑖) will be
negative, but it will be depleted relative to the wings 𝑋(𝑘 + 1, 𝑖 + 1) and 𝑋(𝑘 + 1, 𝑖 − 1)
¨ If the spacing is really too small we can run into
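¨ $P_u = P_d = \frac{1}{2} \cdot \sigma^2 \delta t \cdot \left( \frac{\delta X}{2} \right)^{-2} > 1$ when $\delta X < \sigma\sqrt{2 \, \delta t}$ (up and down probabilities)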
¨ Again, mathematically there is nothing wrong there yet, it is just a little weird to have
negative jump probabilities. It is also a little hard to see what happens with more than one level
¨ But one can see how that effect will propagate along the tree and create a situation where
only the 2 nodes on the wings have infinite positive probability and every other node in
between has infinite negative probability. This will also violate the added constraint
that the probability distribution has to follow the heat equation and converge to a
Gaussian.
¨ This is essentially what stability analysis is all about: the propagation of small errors. Do
they grow exponentially and diverge, or are they “under control”?
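One way to see what happens over more than one level is to push a unit of probability mass through several trinomial steps and look at the node weights. The sketch below assumes $\alpha = 0$ and a fixed grid with nodes $\delta X/2$ apart; the function name and the parameter values in the usage note are hypothetical.

```python
def propagate(sigma, dt, dx, steps):
    """Node weights after `steps` driftless trinomial steps from a single node."""
    scaled = sigma**2 * dt / (dx / 2.0) ** 2
    p_side, p_mid = 0.5 * scaled, 1.0 - scaled   # up/down and middle probabilities
    weights = [1.0]
    for _ in range(steps):
        nxt = [0.0] * (len(weights) + 2)
        for j, w in enumerate(weights):
            nxt[j] += w * p_side       # down move
            nxt[j + 1] += w * p_mid    # stay on the same level
            nxt[j + 2] += w * p_side   # up move
        weights = nxt
    return weights
```

With `sigma = 0.2` and `dt = 0.01`, a spacing of `dx = 0.05` keeps every weight between 0 and 1 after 5 steps. With `dx = 0.02` the weights still sum to one, but some are negative and the largest magnitude is already in the thousands, which is the kind of exponential growth that stability analysis is meant to rule out.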