Presentation

Optimal Weighted Distributions
and
Applications to Financial Time Series
Alessandro Zanatta
Advisor: Prof. Marco Maggis
Co-advisor: Prof. Giacomo Aletti
Università degli Studi di Milano
Degree Course in Mathematics
Academic Year 2013/2014

Introduction
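Given a probability space $(\Omega, \mathcal{F}, P)$, a vector of observations $\{Y^i\}_{i=1}^{n}$ with fixed
$n \in \mathbb{N}$ and a process $\{X_t\}_{t \in [0,T]}$ with unknown distribution at maturity $T$, we
define
Weighted Sample Distribution ⇒
$$F(u) = \sum_{i=1}^{n} w_i \mathbf{1}_{\{Y^i \le u\}}$$
where $w = (w_1, \dots, w_n) \in \Delta^n$ with
$$\Delta^n = \Big\{ w \in \mathbb{R}^n : w_i \ge 0 \text{ for any } i = 1, \dots, n \text{ and } \sum_{i=1}^{n} w_i = 1 \Big\}$$
the n-simplex in $\mathbb{R}^n$
and
Target Distribution ⇒
$$F_X(u) = P(X_T(\omega) \le u)$$
for all $\omega \in \Omega$ and for all $u \in \mathbb{R}$.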
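The weighted sample distribution defined above can be evaluated directly from the observations and the weights. The following is a minimal sketch, not taken from the slides: the function name `weighted_sample_distribution` and the example data are hypothetical, and the simplex check uses a floating-point tolerance.

```python
import numpy as np

def weighted_sample_distribution(observations, weights):
    """Return F(u) = sum_i w_i * 1{Y^i <= u} as a callable (hypothetical helper)."""
    y = np.asarray(observations, dtype=float)
    w = np.asarray(weights, dtype=float)
    if y.shape != w.shape:
        raise ValueError("one weight per observation is required")
    # The weights must lie on the simplex: non-negative and summing to one.
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")

    def F(u):
        u = np.asarray(u, dtype=float)
        # indicators[..., i] is True when Y^i <= u
        indicators = u[..., None] >= y
        return (indicators * w).sum(axis=-1)

    return F

# Illustrative data only.
F = weighted_sample_distribution([1.0, 2.0, 3.0], [0.2, 0.3, 0.5])
values = F([0.5, 1.0, 2.5, 3.0])
```

With all weights equal to $1/n$ this reduces to the usual empirical distribution function.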

Optimization Problem
Problem 1
Given a vector of observations $\{Y^i\}_{i=1}^{n}$, $n \in \mathbb{N}$, we want to minimize the following
distance
$$d(F, F_X) = E_P\left[ \int_{\mathbb{R}} \big( F(u) - F_X(u) \big)^2 \, du \right]$$
with respect to the weights $(w_1, \dots, w_n) \in \Delta^n$, where $F$ is a weighted sample
distribution and $F_X$ is a target distribution.
Using the optimal weights $w_1^*, \dots, w_n^*$ obtained by solving the minimization problem, we
can compute the
Optimal Weighted Distribution ⇒
$$F^*(u) = \sum_{i=1}^{n} w_i^* \mathbf{1}_{\{Y^i \le u\}}$$
for all $u \in \mathbb{R}$.
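To make the distance concrete, the sketch below evaluates the inner integral $\int (F(u) - F_X(u))^2\,du$ for one realization of the sample; the expectation $E_P$ in Problem 1 would average this quantity over many samples. The standard normal target, the integration grid and the function names are assumptions made only for illustration.

```python
import math
import numpy as np

def normal_cdf(u):
    """Standard normal CDF, used here as a hypothetical target F_X."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def squared_l2_distance(observations, weights, target_cdf, grid):
    """Trapezoidal approximation of int (F(u) - F_X(u))^2 du for one sample."""
    y = np.asarray(observations, dtype=float)
    w = np.asarray(weights, dtype=float)
    F = ((grid[:, None] >= y) * w).sum(axis=1)        # weighted sample distribution
    FX = np.array([target_cdf(u) for u in grid])      # target distribution
    diff2 = (F - FX) ** 2
    return float(np.sum(0.5 * (diff2[1:] + diff2[:-1]) * np.diff(grid)))

rng = np.random.default_rng(0)
sample = rng.standard_normal(200)                     # i.i.d. draws from the target
grid = np.linspace(-6.0, 6.0, 2401)

uniform = np.full(sample.size, 1.0 / sample.size)     # w_k = 1/n
single = np.zeros(sample.size)
single[0] = 1.0                                       # all mass on one observation

d_uniform = squared_l2_distance(sample, uniform, normal_cdf, grid)
d_single = squared_l2_distance(sample, single, normal_cdf, grid)
```

In this setting the equal weights give a much smaller distance than concentrating all the mass on a single observation.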

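Let us fix the following objects
$$A_k = \int_{\mathbb{R}} \Big[ F_{Y^k}(u) \big(1 - F_{Y^k}(u)\big) + \big(F_X(u) - F_{Y^k}(u)\big)^2 \Big] \, du$$
$$B_{k,j} = \int_{\mathbb{R}} \Big[ P\big(Y^k \le u, Y^j \le u\big) - F_{Y^k}(u) F_{Y^j}(u) + F_X(u)^2 - F_X(u) F_{Y^k}(u) - F_X(u) F_{Y^j}(u) + F_{Y^k}(u) F_{Y^j}(u) \Big] \, du$$
Remark
$B_{k,j}$ is symmetric
if $j = k$ then $B_{k,k} = A_k$
if $\{Y^i\}_{i=1}^{n}$ is an i.i.d. sample for $X_T$ then
$$A_k = \int_{\mathbb{R}} F_X(u) \big(1 - F_X(u)\big) \, du \quad \text{and} \quad B_{k,j} = 0 \text{ for } j \ne k$$
for all $k, j = 1, \dots, n$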
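The i.i.d. case in the remark can be checked term by term. The short derivation below is a sketch that uses only the definitions of $A_k$ and $B_{k,j}$, together with the two facts that hold for an i.i.d. sample of $X_T$: $F_{Y^k} = F_X$ for every $k$, and $P(Y^k \le u, Y^j \le u) = F_X(u)^2$ for $j \ne k$.

```latex
\begin{align*}
A_k &= \int_{\mathbb{R}} \Big[ F_X(u)\big(1 - F_X(u)\big)
        + \big(F_X(u) - F_X(u)\big)^2 \Big]\,du
     = \int_{\mathbb{R}} F_X(u)\big(1 - F_X(u)\big)\,du, \\
B_{k,j} &= \int_{\mathbb{R}} \Big[ F_X(u)^2 - F_X(u)^2 + F_X(u)^2
        - F_X(u)^2 - F_X(u)^2 + F_X(u)^2 \Big]\,du = 0,
        \qquad j \ne k.
\end{align*}
```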

The $k$-th optimal weight that solves Problem 1 has the following form
$$w_k = -\frac{1}{2A_k} \sum_{\substack{j=1 \\ j \ne k}}^{n} w_j B_{k,j} + \frac{1}{A_k \sum_{i=1}^{n} \frac{1}{A_i}} + \frac{\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \frac{w_j B_{i,j}}{A_i}}{2A_k \sum_{i=1}^{n} \frac{1}{A_i}},$$
for all $k = 1, \dots, n$.
Remark
the $k$-th optimal weight has an implicit form: it depends on the other weights $w_j$, $j \ne k$
we make no assumptions about the vector of observations $\{Y^i\}_{i=1}^{n}$
If we assume that the observations $Y^i$, $i = 1, \dots, n$, are independent and identically
distributed (i.i.d.), the $k$-th optimal weight that solves Problem 1 has the following
form
$$w_k = \frac{1}{n}, \quad \forall k = 1, \dots, n$$
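The i.i.d. result follows from the general expression in one step. As a sketch: for an i.i.d. sample of $X_T$ all the off-diagonal terms $B_{k,j}$, $j \ne k$, vanish and all the $A_i$ share a common value, written $A$ here (a symbol introduced only for this check), so that only the middle term of the general formula survives.

```latex
\[
w_k \;=\; \frac{1}{A_k \sum_{i=1}^{n} \frac{1}{A_i}}
    \;=\; \frac{1}{A \cdot \frac{n}{A}}
    \;=\; \frac{1}{n},
\qquad k = 1, \dots, n .
\]
```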

Populations
a population is a complete set of items that share at least one property in
common that is the subject of a statistical analysis
a statistical sample is a subset drawn from the population to represent the
population in a statistical analysis
a subset of a population is called a subpopulation if its members share one or more
additional properties
populations consisting of subpopulations can be modeled by mixture models,
which combine the distributions within subpopulations into an overall population
distribution

The Two Populations Problem
Consider two populations $\Omega_1$ and $\Omega_2$. We assume that
$n_k$ is the number of observations in population $\Omega_k$, $k = 1, 2$, with $n_1 + n_2 = n$
the observations within the same population form an i.i.d. sample
Problem 2
Given two populations $\Omega_1 = \{W^i\}_{i=1}^{n_1}$ and $\Omega_2 = \{Z^j\}_{j=1}^{n_2}$, $n_1, n_2 \in \mathbb{N}$, we want to
minimize the following distance
$$d(F, F_X) = E_P\left[ \int_{\mathbb{R}} \left( \sum_{i=1}^{n_1} w_1 \mathbf{1}_{\{W^i \le u\}} + \sum_{j=1}^{n_2} w_2 \mathbf{1}_{\{Z^j \le u\}} - F_X(u) \right)^2 du \right]$$
with respect to the weights $(n_1 w_1, n_2 w_2) \in \Delta^2$, where $F_X$ is a target distribution.
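Because every observation in the same population carries the same weight, the weighted distribution in Problem 2 is a convex combination of the two empirical distributions, with coefficients $n_1 w_1$ and $n_2 w_2$. The sketch below illustrates this; the function names, the parameter `alpha` (standing for $n_1 w_1$) and the sample values are hypothetical.

```python
import numpy as np

def ecdf(sample, u):
    """Empirical distribution function of `sample` evaluated at the points `u`."""
    s = np.asarray(sample, dtype=float)
    u = np.asarray(u, dtype=float)
    return (u[..., None] >= s).mean(axis=-1)

def two_population_distribution(w_sample, z_sample, alpha, u):
    """Weighted distribution of Problem 2, with alpha = n1*w1 and 1 - alpha = n2*w2."""
    n1, n2 = len(w_sample), len(z_sample)
    w1 = alpha / n1                  # weight of each observation in population 1
    w2 = (1.0 - alpha) / n2          # weight of each observation in population 2
    u = np.asarray(u, dtype=float)
    hits_w = (u[..., None] >= np.asarray(w_sample, dtype=float)).sum(axis=-1)
    hits_z = (u[..., None] >= np.asarray(z_sample, dtype=float)).sum(axis=-1)
    return w1 * hits_w + w2 * hits_z

# Illustrative data only.
W = [0.1, 0.4, 0.9]
Z = [0.5, 2.0]
points = np.array([0.0, 0.45, 1.0, 3.0])
F_values = two_population_distribution(W, Z, alpha=0.7, u=points)
mixture = 0.7 * ecdf(W, points) + 0.3 * ecdf(Z, points)
```

Both `F_values` and `mixture` describe the same function, so the two-population problem has a single free parameter.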

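Let us fix the following objects
$$a_i = \int_{\mathbb{R}} \big(F_i(u) - F_X(u)\big)^2 \, du \qquad v_i = \int_{\mathbb{R}} F_i(u) \big(1 - F_i(u)\big) \, du$$
$$b_{i,j} = \int_{\mathbb{R}} \big(F_i(u) - F_X(u)\big) \big(F_j(u) - F_X(u)\big) \, du$$
$$c_{i,j} = \int_{\mathbb{R}} \mathrm{Cov}_P\left( \sum_{i=1}^{n_1} \mathbf{1}_{\{W^i \le u\}}, \; \sum_{j=1}^{n_2} \mathbf{1}_{\{Z^j \le u\}} \right) du$$
where $F_i$ is the distribution of the population $\Omega_i$, $i = 1, 2$.
Remark
$a_i$ is the squared norm $\|F_i - F_X\|_{L^2}^2$
$b_{i,j}$ and $c_{i,j}$ are both symmetric
$a_i$ and $b_{i,j}$ depend on the target distribution $F_X$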
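The quantities $a_i$, $v_i$ and $b_{1,2}$ are one-dimensional integrals, so they can be approximated on a grid once the three distribution functions are known. The sketch below assumes the distributions are supplied as arrays of values on a common grid; the uniform distributions used in the example are hypothetical and chosen because the integrals are known in closed form. The covariance term $c_{i,j}$ is not computed here.

```python
import numpy as np

def integrate(values, grid):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def population_terms(F1, F2, FX, grid):
    """Approximate a_1, a_2, v_1, v_2 and b_{1,2} from CDF values on `grid`."""
    return {
        "a1": integrate((F1 - FX) ** 2, grid),
        "a2": integrate((F2 - FX) ** 2, grid),
        "v1": integrate(F1 * (1.0 - F1), grid),
        "v2": integrate(F2 * (1.0 - F2), grid),
        "b12": integrate((F1 - FX) * (F2 - FX), grid),
    }

# Hypothetical example: F1 uniform on [0, 1], F2 uniform on [1, 2], target equal to F1.
grid = np.linspace(0.0, 2.0, 2001)
F1 = np.clip(grid, 0.0, 1.0)
F2 = np.clip(grid - 1.0, 0.0, 1.0)
terms = population_terms(F1, F2, FX=F1, grid=grid)
```

Here $a_1 = b_{1,2} = 0$ because the target coincides with $F_1$, while $v_1 = v_2 = 1/6$ and $a_2 = 2/3$.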

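The optimal weights that solve Problem 2 have the following form
$$n_1 w_1^* = \frac{a_2 + \frac{v_2}{n_2} - b_{1,2} - \frac{c_{1,2}}{n_1 n_2}}{\left(a_1 + \frac{v_1}{n_1} - b_{1,2} - \frac{c_{1,2}}{n_1 n_2}\right) + \left(a_2 + \frac{v_2}{n_2} - b_{1,2} - \frac{c_{1,2}}{n_1 n_2}\right)}$$
$$n_2 w_2^* = \frac{a_1 + \frac{v_1}{n_1} - b_{1,2} - \frac{c_{1,2}}{n_1 n_2}}{\left(a_1 + \frac{v_1}{n_1} - b_{1,2} - \frac{c_{1,2}}{n_1 n_2}\right) + \left(a_2 + \frac{v_2}{n_2} - b_{1,2} - \frac{c_{1,2}}{n_1 n_2}\right)}$$
Remark
the two optimal weights have an explicit form
the weight $n_1 w_1^*$ depends on $a_2$ and vice versa
the two optimal weights depend on the target distribution $F_X$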
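The explicit formulas translate directly into a few lines of code. The sketch below is only an illustration: the function name and the example numbers are hypothetical, and no projection onto $[0, 1]$ is applied if the formula returns a value outside that interval.

```python
def two_population_weights(a1, a2, v1, v2, b12, c12, n1, n2):
    """Return (n1*w1_star, n2*w2_star) from the explicit formulas for Problem 2."""
    cross = b12 + c12 / (n1 * n2)       # shared cross term
    num1 = a2 + v2 / n2 - cross         # numerator of n1*w1_star
    num2 = a1 + v1 / n1 - cross         # numerator of n2*w2_star
    total = num1 + num2
    if total == 0:
        raise ValueError("degenerate case: the denominator vanishes")
    return num1 / total, num2 / total

# Hypothetical check: both populations already follow the target
# (a1 = a2 = b12 = 0), they are uncorrelated (c12 = 0) and v1 = v2.
alpha, beta = two_population_weights(
    a1=0.0, a2=0.0, v1=1.0, v2=1.0, b12=0.0, c12=0.0, n1=100, n2=300
)
per_observation = (alpha / 100, beta / 300)
```

In this example the population weights are $0.25$ and $0.75$, so every single observation receives the weight $1/400 = 1/n$, in agreement with the i.i.d. result of Problem 1.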

The N Populations Problem
Consider $N$ populations $\Omega_1, \dots, \Omega_N$. We assume that
$n_k$ is the number of observations in population $\Omega_k$, $k = 1, \dots, N$, with
$n_1 + \dots + n_N = n$
the observations within the same population form an i.i.d. sample
Problem 3
Given $N$ populations $\Omega_1 = \{Y^{n_1}\}, \dots, \Omega_N = \{Y^{n_N}\}$, $n_1, \dots, n_N \in \mathbb{N}$, we want to
minimize the following distance
$$d(F, F_X) = E_P\left[ \int_{\mathbb{R}} \left( \sum_{i=1}^{N} \sum_{j=1}^{n_i} w_i \mathbf{1}_{\{Y^{n_i} \le u\}} - F_X(u) \right)^2 du \right]$$
with respect to the weights $(n_1 w_1, \dots, n_N w_N) \in \Delta^N$, where $F_X$ is a target distribution.

The N Populations Problem
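The $i$-th optimal weight that solves Problem 3 has the following form
$$n_i w_i^* = -\frac{1}{2\left(a_i + \frac{v_i}{n_i}\right)} \sum_{\substack{k=1 \\ k \ne i}}^{N} n_k w_k \left(b_{i,k} + \frac{c_{i,k}}{n_i n_k}\right) + \frac{1}{n_i \left(a_i + \frac{v_i}{n_i}\right) \sum_{j=1}^{N} \frac{1}{n_j^2 \left(a_j + \frac{v_j}{n_j}\right)}} + \frac{\sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} \frac{n_k w_k \left(b_{j,k} + \frac{c_{j,k}}{n_j n_k}\right)}{a_j + \frac{v_j}{n_j}}}{2 n_i \left(a_i + \frac{v_i}{n_i}\right) \sum_{j=1}^{N} \frac{1}{n_j^2 \left(a_j + \frac{v_j}{n_j}\right)}}$$
for all $i = 1, \dots, N$.
Remark
the $i$-th optimal weight has an implicit form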

Financial Time Series
a time series is a collection of numerical observations described through random
variables and indexed according to their order
when we think of a time series, we usually think of a collection of values
$\{X_t\}_{t=1}^{n}$ in which the index $t$ indicates the time at which the datum $X_t$ is
observed
the study of time series has diverse applications ranging from biology to finance
a key feature that distinguishes financial time series from other types of time
series is the presence of an element of uncertainty and randomness

The Algorithm
Given two populations $\Omega_1$ and $\Omega_2$ with distribution functions $F_1$ and $F_2$, consider
the optimal weights deduced from Problem 2 with the extra assumption
$c_{i,j} = 0$
$$w_1 = \frac{a_2 + \frac{v_2}{n_2} - b_{1,2}}{\left(a_1 + \frac{v_1}{n_1} - b_{1,2}\right) + \left(a_2 + \frac{v_2}{n_2} - b_{1,2}\right)}$$
$$w_2 = \frac{a_1 + \frac{v_1}{n_1} - b_{1,2}}{\left(a_1 + \frac{v_1}{n_1} - b_{1,2}\right) + \left(a_2 + \frac{v_2}{n_2} - b_{1,2}\right)}$$
the optimal weights as functions of the Target distribution $F_X$
an initial guess $F_X^0 \in \mathcal{X}$ for the Target distribution $F_X$, where $\mathcal{X}$ is the convex
hull formed by the distributions $F_1$ and $F_2$
Iterative Algorithm
$$F_X^{i+1} = w_1\big(F_X^i\big) F_1 + w_2\big(F_X^i\big) F_2, \quad \text{for } i = 0, 1, 2, \dots$$
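The iteration can be sketched numerically as follows. Everything specific in this example is an assumption made for illustration: $F_1$ and $F_2$ are taken to be centred normal distributions with standard deviations 1 and 2, the sample sizes are $n_1 = 500$ and $n_2 = 50$, the integrals are approximated by the trapezoidal rule on a finite grid, and the helper names are hypothetical. The weights use the formulas with $c_{i,j} = 0$.

```python
import math
import numpy as np

def normal_cdf(grid, sigma):
    """CDF of a centred normal distribution on `grid` (hypothetical F_1, F_2)."""
    return np.array([0.5 * (1.0 + math.erf(u / (sigma * math.sqrt(2.0)))) for u in grid])

def integrate(values, grid):
    """Trapezoidal rule."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def iterate_target(F1, F2, n1, n2, grid, alpha0=0.5, steps=200):
    """Iterate F_X <- w1(F_X) F1 + w2(F_X) F2 from F_X^0 = alpha0 F1 + (1 - alpha0) F2."""
    v1 = integrate(F1 * (1.0 - F1), grid)
    v2 = integrate(F2 * (1.0 - F2), grid)
    FX = alpha0 * F1 + (1.0 - alpha0) * F2        # initial guess in the convex hull
    w1 = alpha0
    for _ in range(steps):
        a1 = integrate((F1 - FX) ** 2, grid)
        a2 = integrate((F2 - FX) ** 2, grid)
        b12 = integrate((F1 - FX) * (F2 - FX), grid)
        num1 = a2 + v2 / n2 - b12                 # numerator of w1 (c = 0)
        num2 = a1 + v1 / n1 - b12                 # numerator of w2 (c = 0)
        w1 = num1 / (num1 + num2)
        FX = w1 * F1 + (1.0 - w1) * F2            # next iterate
    return w1, 1.0 - w1, FX

grid = np.linspace(-25.0, 25.0, 5001)
F1 = normal_cdf(grid, 1.0)
F2 = normal_cdf(grid, 2.0)
w1, w2, FX = iterate_target(F1, F2, n1=500, n2=50, grid=grid)
```

In this sketch the weight `w1` settles at the same value for different starting coefficients `alpha0`.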

Properties
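Given the operator $T : \mathcal{X} \to \mathcal{X}$ such that
$$T[\cdot] = w_1(\cdot) F_1 + w_2(\cdot) F_2$$
a fixed point for the operator $T[\cdot]$ is a distribution $G \in \mathcal{X}$ such that $T[G] = G$
$G \in \mathcal{X} \Rightarrow G = \alpha F_1 + (1 - \alpha) F_2$ for $\alpha \in [0, 1]$
then we want to find $\alpha \in [0, 1]$ such that
$$\alpha F_1 + (1 - \alpha) F_2 = T[\alpha F_1 + (1 - \alpha) F_2].$$
Lemma
Given the previous considerations we have that
$$\alpha = \frac{v_2 / n_2}{v_1 / n_1 + v_2 / n_2} \quad \text{and} \quad 1 - \alpha = \frac{v_1 / n_1}{v_1 / n_1 + v_2 / n_2}$$
where
$$v_1 = \int_{\mathbb{R}} F_1(u) \big(1 - F_1(u)\big) \, du \quad \text{and} \quad v_2 = \int_{\mathbb{R}} F_2(u) \big(1 - F_2(u)\big) \, du$$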
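The Lemma can be recovered from the weight formulas with $c_{i,j} = 0$ by a direct substitution. The derivation below is a sketch; the symbol $\delta$ is introduced only here, and it assumes $F_1 \ne F_2$ and $v_1/n_1 + v_2/n_2 > 0$.

```latex
\begin{align*}
&\text{Let } F_X = \alpha F_1 + (1-\alpha) F_2
  \text{ and } \delta = \int_{\mathbb{R}} \big(F_1(u) - F_2(u)\big)^2\,du. \\
&F_1 - F_X = (1-\alpha)(F_1 - F_2), \qquad F_2 - F_X = -\alpha\,(F_1 - F_2), \\
&a_1 = (1-\alpha)^2 \delta, \qquad a_2 = \alpha^2 \delta, \qquad
  b_{1,2} = -\alpha(1-\alpha)\,\delta, \\
&a_2 - b_{1,2} = \alpha\,\delta, \qquad a_1 - b_{1,2} = (1-\alpha)\,\delta, \\
&w_1(F_X) = \frac{\alpha\,\delta + \frac{v_2}{n_2}}
                 {\delta + \frac{v_1}{n_1} + \frac{v_2}{n_2}} . \\
&\text{The fixed-point condition } w_1(F_X) = \alpha \text{ gives }
  \alpha \left( \frac{v_1}{n_1} + \frac{v_2}{n_2} \right) = \frac{v_2}{n_2}
  \;\Longrightarrow\;
  \alpha = \frac{v_2/n_2}{v_1/n_1 + v_2/n_2} .
\end{align*}
```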

Numerical Applications
Figure: Google time series (black: population $\Omega_1$; green: stress population $\Omega_2$)
Figure: log-return time series

Figure: Target distribution (red)
optimal weights: $w_1^* = 0.84686$, $w_2^* = 0.15314$
$\alpha = 0.84686$, $1 - \alpha = 0.15314$

Figure: Different initial guesses and the corresponding Target distributions

Possible developments
numerical extension to the case of N populations
minimal hypothesis
change the distance
Thanks for your attention