Prediction of Financial Processes
1. 4th International Summer School
Achievements and Applications of Contemporary
Informatics, Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 5-16, 2009
Prediction of Financial Processes
Parameter Estimation in
Stochastic Differential Equations
by Continuous Optimization
Gerhard-Wilhelm Weber *
Vefa Gafarova, Nüket Erbil, Cem Ali Gökçen, Azer Kerimov
Institute of Applied Mathematics
Middle East Technical University, Ankara, Turkey
* Faculty of Economics, Management and Law, University of Siegen, Germany
Center for Research on Optimization and Control, University of Aveiro, Portugal
Pakize Taylan, Dept. of Mathematics, Dicle University, Diyarbakır, Turkey
2. Outline
• Stochastic Differential Equations
• Parameter Estimation
• Various Statistical Models
• C-MARS
• Accuracy vs. Stability
• Tikhonov Regularization
• Conic Quadratic Programming
• Nonlinear Regression
• Portfolio Optimization
• Outlook and Conclusion
4. Stochastic Differential Equations
$$dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t$$
drift term $a(X_t, t)$ and diffusion term $b(X_t, t)$
$W_t \sim N(0, t)$ $(t \in [0, T])$: Wiener process
5. Stochastic Differential Equations
$$dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t$$
drift term $a(X_t, t)$ and diffusion term $b(X_t, t)$
Ex.: price, wealth, interest rate, volatility processes
$W_t \sim N(0, t)$ $(t \in [0, T])$: Wiener process
6. Regression
Input vector $X = (X_1, X_2, \ldots, X_m)^T$ and output variable $Y$;
linear regression:
$$Y = E(Y \mid X_1, \ldots, X_m) + \varepsilon = \beta_0 + \sum_{j=1}^{m} X_j \beta_j + \varepsilon,$$
$\beta = (\beta_0, \beta_1, \ldots, \beta_m)^T$ which minimizes
$$RSS(\beta) := \sum_{i=1}^{N} \left( y_i - x_i^T \beta \right)^2,$$
$$\hat{\beta} = (X^T X)^{-1} X^T y,$$
$$\mathrm{Cov}(\hat{\beta}) = (X^T X)^{-1} \sigma^2.$$
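The least-squares estimate above can be sketched in a few lines of plain Python. This is only an illustration: the helper names `solve`, `ols` and `rss` and the toy data are assumptions, and the normal equations are solved by simple Gaussian elimination rather than a production linear-algebra routine.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """beta_hat = (X^T X)^{-1} X^T y, with an intercept column prepended to X."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[k] for x in X) for k in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def rss(rows, y, beta):
    """Residual sum of squares RSS(beta)."""
    return sum(
        (yi - beta[0] - sum(b * x for b, x in zip(beta[1:], r))) ** 2
        for r, yi in zip(rows, y)
    )


# Toy data (assumed): y = 1 + 2*x1 - 0.5*x2 without noise.
rows = [(0.0, 1.0), (1.0, 0.0), (2.0, 3.0), (3.0, 1.0), (4.0, 2.0)]
y = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in rows]
beta_hat = ols(rows, y)
```

On noise-free data the estimate reproduces the generating coefficients up to rounding, and `rss(rows, y, beta_hat)` is essentially zero.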
7. Generalized Additive Models
$$E\left(Y_i \mid x_{i1}, x_{i2}, \ldots, x_{im}\right) = \beta_0 + \sum_{j=1}^{m} f_j(x_{ij})$$
$f_j$ are estimated by a smoothing on a single coordinate.
Standard convention: $E\left(f_j(x_{ij})\right) = 0$.
• Backfitting algorithm (Gauss-Seidel)
$$r_{ij} = y_i - \hat{\beta}_0 - \sum_{k \neq j} \hat{f}_k(x_{ik}),$$
it “cycles” and iterates.
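A minimal sketch of the backfitting cycle follows. The smoother is an assumption: a centred least-squares line stands in for whatever scatterplot smoother is actually used, and the names `linear_smoother` and `backfit` are illustrative only.

```python
def linear_smoother(x, r):
    """Least-squares line through (x, r), centred so the fitted values have mean zero."""
    n = len(x)
    mx = sum(x) / n
    mr = sum(r) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (ri - mr) for xi, ri in zip(x, r)) / sxx
    return [slope * (xi - mx) for xi in x]


def backfit(columns, y, smoother=linear_smoother, cycles=100):
    """Gauss-Seidel backfitting: returns beta0 and the fitted values f_j(x_ij)."""
    n = len(y)
    beta0 = sum(y) / n
    f = [[0.0] * n for _ in columns]
    for _ in range(cycles):
        for j, xj in enumerate(columns):
            # partial residual r_ij = y_i - beta0 - sum_{k != j} f_k(x_ik)
            r = [
                y[i] - beta0 - sum(f[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            f[j] = smoother(xj, r)
    return beta0, f


# Toy data (assumed): an exactly additive response in two coordinates.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 5.0, 3.0, 4.0]
y = [2.0 + 3.0 * a - b for a, b in zip(x1, x2)]
beta0, f = backfit([x1, x2], y)
fitted = [beta0 + f[0][i] + f[1][i] for i in range(len(y))]
```

Each pass smooths the partial residuals of one coordinate while the other components are held fixed, which is the Gauss-Seidel structure named on the slide.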
8. Generalized Additive Models
• Given data $(y_i, x_i)$ $(i = 1, 2, \ldots, N)$,
• penalized residual sum of squares
$$PRSS(\beta_0, f_1, \ldots, f_m) := \sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{m} f_j(x_{ij}) \right)^2 + \sum_{j=1}^{m} \mu_j \int_a^b \left[ f_j''(t_j) \right]^2 dt_j,$$
$\mu_j \geq 0$.
• New estimation methods for additive model with CQP:
9. Generalized Additive Models
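$$\min_{t,\, \beta_0,\, f} \; t$$
subject to
$$\sum_{i=1}^{N} \left( y_i - \beta_0 - \sum_{j=1}^{m} f_j(x_{ij}) \right)^2 \leq t^2, \quad t \geq 0,$$
$$\int \left[ f_j''(t_j) \right]^2 dt_j \leq M_j \quad (j = 1, 2, \ldots, m),$$
splines: $f_j(x) = \sum_{l=1}^{d_j} \theta_l^j h_l^j(x)$.
By discretizing, we get
$$\min_{t,\, \beta_0,\, f} \; t$$
subject to
$$\left\| W(\beta_0, \theta) \right\|_2^2 \leq t^2, \quad t \geq 0,$$
$$\left\| V_j(\beta_0, \theta) \right\|_2^2 \leq M_j \quad (j = 1, \ldots, m).$$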
13. MARS
[Figure: two scatter plots of y against x, each with a knot τ marked on the x-axis; regression with the basis functions $c^-(x, \tau) = [-(x - \tau)]_+$ and $c^+(x, \tau) = [+(x - \tau)]_+$]
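The two truncated linear basis functions can be written down directly. The sketch below is illustrative only; the function names and the coefficients used in `piecewise_linear` are assumptions.

```python
def hinge_plus(x, tau):
    """c+(x, tau) = [+(x - tau)]_+ : zero left of the knot, linear to the right."""
    return max(0.0, x - tau)


def hinge_minus(x, tau):
    """c-(x, tau) = [-(x - tau)]_+ : linear left of the knot, zero to the right."""
    return max(0.0, tau - x)


def piecewise_linear(x, tau, b0, b_minus, b_plus):
    """A one-knot model b0 + b_minus * c-(x, tau) + b_plus * c+(x, tau)."""
    return b0 + b_minus * hinge_minus(x, tau) + b_plus * hinge_plus(x, tau)


# Evaluate a hypothetical fit with knot tau = 1 on both sides of the knot.
left_value = piecewise_linear(0.0, 1.0, 2.0, 0.5, 3.0)   # uses only c-
right_value = piecewise_linear(3.0, 1.0, 2.0, 0.5, 3.0)  # uses only c+
```

Exactly one of the two functions is nonzero at any point other than the knot, so the pair gives a continuous fit with a different slope on each side of τ.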
14. C-MARS
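$$PRSS := \sum_{i=1}^{N} \left( y_i - f(x_i) \right)^2 + \sum_{m=1}^{M_{\max}} \mu_m \sum_{\substack{|\alpha| = 1 \\ \alpha = (\alpha_1, \alpha_2)}}^{2} \; \sum_{\substack{r < s \\ r, s \in V(m)}} \int \theta_m^2 \left[ D_{r,s}^{\alpha} \psi_m(t^m) \right]^2 dt^m$$
Tradeoff between accuracy and complexity.
$$V(m) := \left\{ \kappa_j^m \mid j = 1, 2, \ldots, K_m \right\}$$
$$D_{r,s}^{\alpha} \psi_m(t^m) := \frac{\partial^{|\alpha|} \psi_m}{\partial^{\alpha_1} t_r^m \, \partial^{\alpha_2} t_s^m}(t^m)$$
$$t^m := (t_{m_1}, t_{m_2}, \ldots, t_{m_{K_m}})^T$$
$\alpha = (\alpha_1, \alpha_2)$, $|\alpha| := \alpha_1 + \alpha_2$, where $\alpha_1, \alpha_2 \in \{0, 1\}$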
15. C-MARS
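Tikhonov regularization:
$$PRSS = \left\| y - \psi(d)\,\theta \right\|_2^2 + \mu \left\| L\theta \right\|_2^2$$
(the two competing terms are $\left\| y - \psi(d)\,\theta \right\|_2$ and $\left\| L\theta \right\|_2$)
Conic quadratic programming:
$$\min_{t,\, \theta} \; t,$$
subject to
$$\left\| \psi(d)\,\theta - y \right\|_2 \leq t,$$
$$\left\| L\theta \right\|_2 \leq M$$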
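The Tikhonov form can be illustrated through its normal equations $(\psi^T \psi + \mu L^T L)\,\theta = \psi^T y$. The sketch below is an assumption-laden toy: a small polynomial design matrix stands in for $\psi(d)$, $L$ is taken as the identity, and the helper names are hypothetical.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x


def tikhonov(psi, y, L, mu):
    """Minimise ||y - psi theta||^2 + mu ||L theta||^2 via the normal equations."""
    p = len(psi[0])
    lhs = [
        [sum(r[i] * r[k] for r in psi) + mu * sum(r[i] * r[k] for r in L) for k in range(p)]
        for i in range(p)
    ]
    rhs = [sum(r[i] * yi for r, yi in zip(psi, y)) for i in range(p)]
    return solve(lhs, rhs)


def norm_of(matrix, theta, shift=None):
    """Euclidean norm of (matrix theta - shift); shift defaults to zero."""
    shift = shift or [0.0] * len(matrix)
    return math.sqrt(sum((sum(a * t for a, t in zip(row, theta)) - s) ** 2 for row, s in zip(matrix, shift)))


# Toy problem (assumed): quadratic design, identity penalty matrix.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
psi = [[1.0, x, x * x] for x in xs]
y = [1.0, 1.8, 3.1, 3.9, 5.2, 5.8, 7.1]
L = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
theta_small = tikhonov(psi, y, L, mu=0.01)
theta_large = tikhonov(psi, y, L, mu=10.0)
```

Increasing μ trades accuracy for stability: `norm_of(L, theta_large)` is smaller than `norm_of(L, theta_small)`, while the residual norm `norm_of(psi, theta, y)` grows.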
18. Stochastic Differential Equations Revisited
$$dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t$$
drift term $a(X_t, t)$ and diffusion term $b(X_t, t)$
Ex.: price, wealth, interest rate, volatility processes
$W_t \sim N(0, t)$ $(t \in [0, T])$: Wiener process
19. Stochastic Differential Equations
$$dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t$$
drift term $a(X_t, t)$ and diffusion term $b(X_t, t)$
Ex.: bioinformatics, biotechnology (fermentation, population dynamics)
Universiti Teknologi Malaysia
$W_t \sim N(0, t)$ $(t \in [0, T])$: Wiener process
21. Stochastic Differential Equations
Milstein scheme:
$$\hat{X}_{j+1} = \hat{X}_j + a(\hat{X}_j, t_j)(t_{j+1} - t_j) + b(\hat{X}_j, t_j)(W_{j+1} - W_j) + \frac{1}{2}(b'b)(\hat{X}_j, t_j)\left( (W_{j+1} - W_j)^2 - (t_{j+1} - t_j) \right)$$
and, based on our finitely many data:
$$\dot{X}_j = a(X_j, t_j) + b(X_j, t_j)\frac{\Delta W_j}{h_j} + \frac{1}{2}(b'b)(X_j, t_j)\left( \frac{(\Delta W_j)^2}{h_j} - 1 \right).$$
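The Milstein recursion can be sketched as follows. Geometric Brownian motion, $a = \mu x$ and $b = \sigma x$, is used purely as an assumed test case because its exact solution is known; the function name `milstein` and all parameter values are illustrative.

```python
import math
import random


def milstein(a, b, db, x0, times, dW):
    """Milstein scheme; db is the derivative of b with respect to x, so (b'b) = db * b."""
    x = [x0]
    for j in range(len(times) - 1):
        h = times[j + 1] - times[j]
        xj, tj = x[-1], times[j]
        x.append(
            xj
            + a(xj, tj) * h
            + b(xj, tj) * dW[j]
            + 0.5 * db(xj, tj) * b(xj, tj) * (dW[j] ** 2 - h)
        )
    return x


# Assumed test case: dX = mu X dt + sigma X dW on [0, T].
rng = random.Random(0)
mu, sigma, x0, T, n = 0.05, 0.2, 1.0, 1.0, 1000
times = [T * j / n for j in range(n + 1)]
dW = [rng.gauss(0.0, math.sqrt(T / n)) for _ in range(n)]  # Var(dW_j) = h_j

path = milstein(
    lambda x, t: mu * x,      # drift a
    lambda x, t: sigma * x,   # diffusion b
    lambda x, t: sigma,       # db/dx
    x0, times, dW,
)

# Exact solution of geometric Brownian motion along the same Brownian path.
exact = x0 * math.exp((mu - 0.5 * sigma ** 2) * T + sigma * sum(dW))
```

Comparing `path[-1]` with `exact` on the same increments gives a quick check of the pathwise accuracy of the scheme for this test case.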
22. Stochastic Differential Equations
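• step length $h_j = t_{j+1} - t_j := \Delta t_j$
$$\dot{X}_j := \begin{cases} \dfrac{X_{j+1} - X_j}{h_j}, & \text{if } j = 1, 2, \ldots, N - 1 \\[2ex] \dfrac{X_N - X_{N-1}}{h_N}, & \text{if } j = N \end{cases}$$
• $W_t \sim N(0, t)$, $\Delta W_j$ (independent), $\mathrm{Var}(\Delta W_j) = \Delta t_j$
• $\Delta W_j = Z_j \sqrt{\Delta t_j}$, $Z_j \sim N(0, 1)$
$$\dot{X}_j = a(X_j, t_j) + b(X_j, t_j)\frac{Z_j}{\sqrt{h_j}} + \frac{1}{2}(b'b)(X_j, t_j)\left( Z_j^2 - 1 \right)$$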
23. Stochastic Differential Equations
• Simpler form:
$$\dot{X}_j = G_j + H_j c_j + (H_j' H_j)\, d_j,$$
where
$$G_j := a(X_j, t_j), \quad H_j := b(X_j, t_j), \quad c_j := \frac{Z_j}{\sqrt{h_j}}, \quad d_j := \frac{1}{2}\left( Z_j^2 - 1 \right).$$
• Our problem:
$$\min_{y} \; \sum_{j=1}^{N} \left\| \dot{X}_j - \left( G_j + H_j c_j + (H_j' H_j)\, d_j \right) \right\|_2^2$$
$y$ is a vector which comprises a subset of all the parameters.
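The data-side quantities of this problem, the difference quotients $\dot{X}_j$ and the regressors $c_j$ and $d_j$, can be computed as in the sketch below. Two assumptions are made: the last step length is taken as $h_N := t_N - t_{N-1}$, which the slides do not define, and the standard normal draws $Z_j$ are given as input. The function names and the toy values are hypothetical.

```python
import math


def difference_quotients(x, t):
    """X_dot_j = (X_{j+1} - X_j) / h_j for j < N; the last entry reuses the final interval."""
    n = len(x)
    h = [t[j + 1] - t[j] for j in range(n - 1)]
    h.append(h[-1])  # assumption: h_N := t_N - t_{N-1}
    xdot = [(x[j + 1] - x[j]) / h[j] for j in range(n - 1)]
    xdot.append((x[-1] - x[-2]) / h[-1])
    return xdot, h


def noise_regressors(z, h):
    """c_j = Z_j / sqrt(h_j) and d_j = (Z_j^2 - 1) / 2."""
    c = [zj / math.sqrt(hj) for zj, hj in zip(z, h)]
    d = [0.5 * (zj ** 2 - 1.0) for zj in z]
    return c, d


# Toy observations (assumed) on an irregular time grid.
x = [1.0, 1.2, 1.1, 1.5]
t = [0.0, 0.5, 1.0, 2.0]
z = [1.0, -1.0, 0.0, 2.0]
xdot, h = difference_quotients(x, t)
c, d = noise_regressors(z, h)
```

With these arrays in hand, the objective is an ordinary sum of squared residuals in the unknown functions $G_j$ and $H_j$.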
24. Stochastic Differential Equations
$$G_j = a(X_j, t_j) = \alpha_0 + \sum_{p=1}^{2} f_p(U_{j,p}) = \alpha_0 + \sum_{p=1}^{2} \sum_{l=1}^{d_p^g} \alpha_p^l B_p^l(U_{j,p})$$
$$H_j c_j = b(X_j, t_j)\, c_j = \beta_0 + \sum_{r=1}^{2} g_r(U_{j,r}) = \beta_0 + \sum_{r=1}^{2} \sum_{m=1}^{d_r^h} \beta_r^m C_r^m(U_{j,r})$$
$$F_j d_j = b'b(X_j, t_j)\, d_j = \varphi_0 + \sum_{s=1}^{2} h_s(U_{j,s}) = \varphi_0 + \sum_{s=1}^{2} \sum_{n=1}^{d_s^f} \varphi_s^n D_s^n(U_{j,s})$$
where
$$U_j = (U_{j,1}, U_{j,2}) := (X_j, t_j);$$
• $k$th order base spline $B_{\eta,k}$: a polynomial of degree $k - 1$, with knots, say $x_\eta$,
$$B_{\eta,1}(x) = \begin{cases} 1, & x_\eta \leq x < x_{\eta+1} \\ 0, & \text{otherwise} \end{cases}$$
$$B_{\eta,k}(x) = \frac{x - x_\eta}{x_{\eta+k-1} - x_\eta} B_{\eta,k-1}(x) + \frac{x_{\eta+k} - x}{x_{\eta+k} - x_{\eta+1}} B_{\eta+1,k-1}(x)$$
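The recursion for the base splines translates almost literally into code. In the sketch below the knot vector is a hypothetical uniform grid, indices are zero-based, and terms with a zero denominator are set to zero, which is a common convention for repeated knots but an assumption here.

```python
def bspline(eta, k, x, knots):
    """Value of the k-th order base spline B_{eta,k}(x) for the given knot list."""
    if k == 1:
        return 1.0 if knots[eta] <= x < knots[eta + 1] else 0.0
    left_den = knots[eta + k - 1] - knots[eta]
    right_den = knots[eta + k] - knots[eta + 1]
    left = 0.0
    right = 0.0
    if left_den > 0:  # assumed convention: drop terms with zero denominator
        left = (x - knots[eta]) / left_den * bspline(eta, k - 1, x, knots)
    if right_den > 0:
        right = (knots[eta + k] - x) / right_den * bspline(eta + 1, k - 1, x, knots)
    return left + right


# Hypothetical uniform knots; order k = 3 gives piecewise quadratic splines.
knots = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
k = 3
values = [bspline(eta, k, 2.5, knots) for eta in range(len(knots) - k)]
```

Away from the boundary knots the base splines of a given order sum to one, which gives a simple sanity check on `values`.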
25. Stochastic Differential Equations
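• penalized sum of squares PRSS
$$PRSS(\theta, f, g, h) := \sum_{j=1}^{N} \left\{ \dot{X}_j - \left( G_j + H_j c_j + F_j d_j \right) \right\}^2 + \sum_{p=1}^{2} \lambda_p \int \left[ f_p''(U_p) \right]^2 dU_p + \sum_{r=1}^{2} \mu_r \int \left[ g_r''(U_r) \right]^2 dU_r + \sum_{s=1}^{2} \varphi_s \int \left[ h_s''(U_s) \right]^2 dU_s$$
• $\lambda_p, \mu_r, \varphi_s \geq 0$ (smoothing parameters), $\int = \int_{a_\kappa}^{b_\kappa}$ $(\kappa = p, r, s)$
• large values of $\lambda_p, \mu_r, \varphi_s$ yield smoother curves, smaller ones allow more fluctuation
$$\sum_{j=1}^{N} \left\{ \dot{X}_j - \left( G_j + H_j c_j + F_j d_j \right) \right\}^2 = \sum_{j=1}^{N} \left\{ \dot{X}_j - \left( \alpha_0 + \sum_{p=1}^{2} \sum_{l=1}^{d_p^g} \alpha_p^l B_p^l(U_{j,p}) + \beta_0 + \sum_{r=1}^{2} \sum_{m=1}^{d_r^h} \beta_r^m C_r^m(U_{j,r}) + \varphi_0 + \sum_{s=1}^{2} \sum_{n=1}^{d_s^f} \varphi_s^n D_s^n(U_{j,s}) \right) \right\}^2$$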
26. Stochastic Differential Equations
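$$\theta = (\alpha^T, \beta^T, \varphi^T)^T,$$
$$\alpha = (\alpha_0, \alpha_1^T, \alpha_2^T)^T, \quad \alpha_p = (\alpha_p^1, \alpha_p^2, \ldots, \alpha_p^{d_p^g})^T \quad (p = 1, 2),$$
$$\beta = (\beta_0, \beta_1^T, \beta_2^T)^T, \quad \beta_r = (\beta_r^1, \beta_r^2, \ldots, \beta_r^{d_r^h})^T \quad (r = 1, 2),$$
$$\varphi = (\varphi_0, \varphi_1^T, \varphi_2^T)^T, \quad \varphi_s = (\varphi_s^1, \varphi_s^2, \ldots, \varphi_s^{d_s^f})^T \quad (s = 1, 2).$$
• Then,
$$\sum_{j=1}^{N} \left\{ \dot{X}_j - A_j \theta \right\}^2 = \left\| \dot{X} - A\theta \right\|_2^2, \quad A = (A_1^T, A_2^T, \ldots, A_N^T)^T, \quad \dot{X} = (\dot{X}_1, \dot{X}_2, \ldots, \dot{X}_N)^T.$$
• Furthermore,
$$\int_a^b \left[ f_p''(U_p) \right]^2 dU_p \cong \sum_{j=1}^{N-1} \left[ f_p''(U_{jp}) \right]^2 (U_{j+1,p} - U_{jp}) = \sum_{j=1}^{N-1} \left[ \sum_{l=1}^{d_p^g} \alpha_p^l B_p^{l\,\prime\prime}(U_{jp})\, u_j \right]^2.$$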
27. Appendix Stochastic Differential Equations
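$$\int_a^b \left[ f_p''(U_p) \right]^2 dU_p \cong \sum_{j=1}^{N-1} \left[ B_j^{p\,\prime\prime} u_j \alpha_p \right]^2 = \left\| A_p^B \alpha_p \right\|_2^2 \quad (p = 1, 2)$$
$$A_p^B := \left( B_1^{p\,\prime\prime T} u_1, B_2^{p\,\prime\prime T} u_2, \ldots, B_{N-1}^{p\,\prime\prime T} u_{N-1} \right)^T$$
$$u_j := \sqrt{U_{j+1,p} - U_{j,p}} \quad (j = 1, 2, \ldots, N - 1).$$
$$\int_a^b \left[ g_r''(U_r) \right]^2 dU_r \cong \sum_{j=1}^{N-1} \left[ C_j^{r\,\prime\prime} v_j \beta_r \right]^2 = \left\| A_r^C \beta_r \right\|_2^2 \quad (r = 1, 2)$$
$$A_r^C := \left( C_1^{r\,\prime\prime T} v_1, C_2^{r\,\prime\prime T} v_2, \ldots, C_{N-1}^{r\,\prime\prime T} v_{N-1} \right)^T$$
$$v_j := \sqrt{U_{j+1,r} - U_{j,r}} \quad (j = 1, 2, \ldots, N - 1).$$
$$\int_a^b \left[ h_s''(U_s) \right]^2 dU_s \cong \sum_{j=1}^{N-1} \left[ D_j^{s\,\prime\prime} w_j \varphi_s \right]^2 = \left\| A_s^D \varphi_s \right\|_2^2 \quad (s = 1, 2)$$
$$A_s^D := \left( D_1^{s\,\prime\prime T} w_1, D_2^{s\,\prime\prime T} w_2, \ldots, D_{N-1}^{s\,\prime\prime T} w_{N-1} \right)^T$$
$$w_j := \sqrt{U_{j+1,s} - U_{j,s}} \quad (j = 1, 2, \ldots, N - 1).$$
The square roots in $u_j$, $v_j$, $w_j$ are what make each squared summand equal to the squared second derivative times the step $U_{j+1} - U_j$.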
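The discretised roughness penalty can be checked numerically. The sketch below assumes the weights are the square roots of the grid steps, so that each squared row reproduces one term of the Riemann sum, and it uses a hypothetical two-function basis $U^2, U^3$ (second derivatives $2$ and $6U$) in place of the base splines so that the exact integral is known.

```python
import math


def penalty_matrix(second_derivs, grid):
    """Rows B_j'' * u_j with u_j = sqrt(U_{j+1} - U_j), for j = 1, ..., N-1."""
    rows = []
    for j in range(len(grid) - 1):
        u = math.sqrt(grid[j + 1] - grid[j])
        rows.append([u * bpp(grid[j]) for bpp in second_derivs])
    return rows


def squared_norm(A, alpha):
    """||A alpha||_2^2."""
    return sum(sum(a * c for a, c in zip(row, alpha)) ** 2 for row in A)


# Hypothetical basis U^2 and U^3 on [0, 1]; f = alpha_1 U^2 + alpha_2 U^3.
second_derivs = [lambda u: 2.0, lambda u: 6.0 * u]
grid = [j / 1000 for j in range(1001)]
alpha = [1.0, -0.5]

approx = squared_norm(penalty_matrix(second_derivs, grid), alpha)
# Exact value of the integral of (2 a + 6 b U)^2 over [0, 1].
exact = 4 * alpha[0] ** 2 + 12 * alpha[0] * alpha[1] + 12 * alpha[1] ** 2
```

On this grid `approx` agrees with `exact` up to the error of a left Riemann sum, which shrinks as the grid is refined.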
28. Stochastic Differential Equations
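$$PRSS(\theta, f, g, h) = \left\| \dot{X} - A\theta \right\|_2^2 + \sum_{p=1}^{2} \lambda_p \left\| A_p^B \alpha_p \right\|_2^2 + \sum_{r=1}^{2} \mu_r \left\| A_r^C \beta_r \right\|_2^2 + \sum_{s=1}^{2} \varphi_s \left\| A_s^D \varphi_s \right\|_2^2$$
Let us assume that $\lambda_p = \mu_r = \varphi_s =: \mu = \delta^2$:
• $$PRSS(\theta, f, g, h) \approx \left\| \dot{X} - A\theta \right\|_2^2 + \delta^2 \left\| L\theta \right\|_2^2,$$
where $L$ is a $6(N - 1) \times m$ matrix:
$$L := \begin{pmatrix}
0 & A_1^B & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & A_2^B & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & A_1^C & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & A_2^C & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & A_1^D & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & A_2^D
\end{pmatrix}, \quad \theta = (\alpha^T, \beta^T, \varphi^T)^T.$$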
29. Stochastic Differential Equations
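Tikhonov regularization:
$$\min_{\theta} \; \left\| \dot{X} - A\theta \right\|_2^2 + \mu \left\| L\theta \right\|_2^2$$
Conic quadratic programming:
$$\min_{t,\, \theta} \; t,$$
subject to
$$\left\| A\theta - \dot{X} \right\|_2 \leq t,$$
$$\left\| L\theta \right\|_2 \leq M$$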
30. Stochastic Differential Equations
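Primal problem:
$$\min_{t,\, \theta} \; t$$
subject to
$$\chi := \begin{pmatrix} 0_N & A \\ 1 & 0_m^T \end{pmatrix} \begin{pmatrix} t \\ \theta \end{pmatrix} + \begin{pmatrix} -\dot{X} \\ 0 \end{pmatrix},$$
$$\eta := \begin{pmatrix} 0_{6(N-1)} & L \\ 0 & 0_m^T \end{pmatrix} \begin{pmatrix} t \\ \theta \end{pmatrix} + \begin{pmatrix} 0_{6(N-1)} \\ M \end{pmatrix},$$
$$\chi \in L^{N+1}, \quad \eta \in L^{6(N-1)+1}$$
$$L^{N+1} := \left\{ x = (x_1, x_2, \ldots, x_{N+1})^T \in \mathbb{R}^{N+1} \;\middle|\; x_{N+1} \geq \sqrt{x_1^2 + x_2^2 + \cdots + x_N^2} \right\}$$
Dual problem:
$$\max \; (\dot{X}^T, 0)\, \kappa_1 + \left( 0_{6(N-1)}^T, -M \right) \kappa_2$$
subject to
$$\begin{pmatrix} 0_N^T & 1 \\ A^T & 0_m \end{pmatrix} \kappa_1 + \begin{pmatrix} 0_{6(N-1)}^T & 0 \\ L^T & 0_m \end{pmatrix} \kappa_2 = \begin{pmatrix} 1 \\ 0_m \end{pmatrix},$$
$$\kappa_1 \in L^{N+1}, \quad \kappa_2 \in L^{6(N-1)+1}$$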
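Membership in the second-order (Lorentz) cone is easy to test numerically. The sketch below assumes the usual definition, last component at least the Euclidean norm of the others, and forms the vector $\chi = (A\theta - \dot{X},\, t)$ for a hypothetical toy instance; the function names and the tolerance are illustrative.

```python
import math


def in_lorentz_cone(x, tol=1e-12):
    """True if x_{N+1} >= sqrt(x_1^2 + ... + x_N^2), up to a small tolerance."""
    return x[-1] + tol >= math.sqrt(sum(v * v for v in x[:-1]))


def chi(A, theta, t, xdot):
    """chi = (A theta - X_dot, t): the residual vector with t appended."""
    residual = [sum(a * th for a, th in zip(row, theta)) - xd for row, xd in zip(A, xdot)]
    return residual + [t]


# Toy instance (assumed): the residual is (0, 1), so its norm is 1.
A = [[1.0, 0.0], [0.0, 1.0]]
theta = [1.0, 2.0]
xdot = [1.0, 1.0]
feasible = in_lorentz_cone(chi(A, theta, 1.0, xdot))    # t = 1 covers the norm
infeasible = in_lorentz_cone(chi(A, theta, 0.5, xdot))  # t = 0.5 does not
```

The constraint $\chi \in L^{N+1}$ is therefore just another way of writing $\| A\theta - \dot{X} \|_2 \leq t$.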
31. Stochastic Differential Equations
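$(t, \theta, \chi, \eta, \kappa_1, \kappa_2)$ is a primal dual optimal solution if and only if
$$\chi := \begin{pmatrix} 0_N & A \\ 1 & 0_m^T \end{pmatrix} \begin{pmatrix} t \\ \theta \end{pmatrix} + \begin{pmatrix} -\dot{X} \\ 0 \end{pmatrix},$$
$$\eta := \begin{pmatrix} 0_{6(N-1)} & L \\ 0 & 0_m^T \end{pmatrix} \begin{pmatrix} t \\ \theta \end{pmatrix} + \begin{pmatrix} 0_{6(N-1)} \\ M \end{pmatrix},$$
$$\begin{pmatrix} 0_N^T & 1 \\ A^T & 0_m \end{pmatrix} \kappa_1 + \begin{pmatrix} 0_{6(N-1)}^T & 0 \\ L^T & 0_m \end{pmatrix} \kappa_2 = \begin{pmatrix} 1 \\ 0_m \end{pmatrix},$$
$$\kappa_1^T \chi = 0, \quad \kappa_2^T \eta = 0,$$
$$\kappa_1 \in L^{N+1}, \quad \kappa_2 \in L^{6(N-1)+1},$$
$$\chi \in L^{N+1}, \quad \eta \in L^{6(N-1)+1}.$$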
32. Stochastic Differential Equations
Ex.:
$$dV_t = \left( \theta_t^T (\mu - r_t) + r_t \right) V_t\, dt - c_t\, dt + \theta_t^T \sigma V_t\, dW_t,$$
$$dr_t = \alpha \cdot (R - r_t)\, dt + \sigma_t \cdot r_t^{\tau} \cdot dW_t,$$
$$dX_t = \mu(t, X_t, Z_t)\, dt + \sigma(t, X_t, Z_t)\, dW_t.$$
nonlinear regression
33. Nonlinear Regression
$$\min \; f(\beta) = \sum_{j=1}^{N} \left( d_j - g(x_j, \beta) \right)^2 =: \sum_{j=1}^{N} f_j^2(\beta)$$
$$F(\beta) := \left( f_1(\beta), \ldots, f_N(\beta) \right)^T$$
$$\min \; f(\beta) = F^T(\beta)\, F(\beta)$$
34. Nonlinear Regression
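$$\beta_{k+1} := \beta_k + q_k$$
• Gauss-Newton method:
$$\nabla F(\beta) \nabla^T F(\beta)\, q = -\nabla F(\beta) F(\beta)$$
• Levenberg-Marquardt method ($\lambda \geq 0$):
$$\left( \nabla F(\beta) \nabla^T F(\beta) + \lambda I_p \right) q = -\nabla F(\beta) F(\beta)$$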
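A minimal sketch of the Levenberg-Marquardt iteration for a two-parameter model is given below. The model $g(x, \beta) = \beta_1 e^{\beta_2 x}$, the noise-free data, the starting point and the simple rule for adapting $\lambda$ are all assumptions chosen for illustration; with $\lambda = 0$ the step reduces to Gauss-Newton.

```python
import math


def residuals(beta, xs, ds):
    """f_j(beta) = d_j - g(x_j, beta) for the assumed model g = beta_1 * exp(beta_2 * x)."""
    return [d - beta[0] * math.exp(beta[1] * x) for x, d in zip(xs, ds)]


def jacobian(beta, xs):
    """Row j holds the gradient of f_j, so J^T J equals grad F grad^T F."""
    return [[-math.exp(beta[1] * x), -beta[0] * x * math.exp(beta[1] * x)] for x in xs]


def lm_step(beta, xs, ds, lam):
    """Solve (J^T J + lam I) q = -J^T F for the 2x2 case by Cramer's rule."""
    F = residuals(beta, xs, ds)
    J = jacobian(beta, xs)
    a11 = sum(r[0] * r[0] for r in J) + lam
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J) + lam
    g1 = -sum(r[0] * f for r, f in zip(J, F))
    g2 = -sum(r[1] * f for r, f in zip(J, F))
    det = a11 * a22 - a12 * a12
    return [(a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det]


def levenberg_marquardt(beta, xs, ds, lam=1e-2, iterations=100):
    """beta_{k+1} := beta_k + q_k, accepting a step only if the cost decreases."""
    cost = sum(f * f for f in residuals(beta, xs, ds))
    for _ in range(iterations):
        q = lm_step(beta, xs, ds, lam)
        trial = [b + s for b, s in zip(beta, q)]
        trial_cost = sum(f * f for f in residuals(trial, xs, ds))
        if trial_cost < cost:
            beta, cost, lam = trial, trial_cost, lam / 10  # closer to Gauss-Newton
        else:
            lam *= 10  # more damping
    return beta


# Noise-free toy data (assumed) generated with beta = (2, 0.5).
xs = [0.25 * j for j in range(9)]
ds = [2.0 * math.exp(0.5 * x) for x in xs]
beta_fit = levenberg_marquardt([1.0, 0.3], xs, ds)
```

The damping term $\lambda I_p$ keeps the linear system well conditioned when $\nabla F \nabla^T F$ is close to singular, at the price of shorter steps.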
36. Nonlinear Regression
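alternative solution
$$\min_{t,\, q} \; t,$$
subject to
$$\left\| \left( \nabla F(\beta) \nabla^T F(\beta) + \lambda I_p \right) q - \left( -\nabla F(\beta) F(\beta) \right) \right\|_2 \leq t, \quad t \geq 0,$$
$$\left\| Lq \right\|_2 \leq M$$
conic quadratic programming
interior point methods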
37. Nonlinear Regression
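alternative solution
$$\min_{t,\, q} \; t,$$
subject to
$$\left\| \left( \nabla F(\beta) \nabla^T F(\beta) + \lambda I_p \right) q - \left( -\nabla F(\beta) F(\beta) \right) \right\|_2 \leq t, \quad t \geq 0,$$
$$\left\| Lq \right\|_2 \leq M$$
trust region:
$$\min_{q} \; Q(q) := f(\beta) + q^T \nabla F(\beta) F(\beta) + \frac{1}{2} q^T \nabla F(\beta) \nabla^T F(\beta)\, q$$
subject to
$$\left\| q \right\|_2 \leq \Delta$$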
38. Portfolio Optimization
max utility ! or
min costs !
martingale method:
Optimization Problem
Representation Problem
or stochastic control
39. Portfolio Optimization
max utility ! or
min costs !
martingale method:
Parameter Estimation
Optimization Problem
Representation Problem
or stochastic control
40. Portfolio Optimization
max utility ! or
min costs !
martingale method:
Optimization Problem
Representation Problem
Parameter Estimation
or stochastic control