1
Recursive
Bayesian Estimation
SOLO HERMELIN
Updated: 22.02.09
11.01.14
http://www.solohermelin.com
2
SOLO
Table of Content Recursive Bayesian Estimation
Review of Probability
Conditional Probability
Total Probability Theorem
Conditional Probability - Bayes Formula
Statistical Independent Events
Expected Value or Mathematical Expectation
Variance and Central Moments
Characteristic Function and Moment-Generating Function
Probability Distribution and Probability Density Functions (Examples)
Normal (Gaussian) Distribution
Existence Theorems 1 & 2
Monte Carlo Method
Estimation of the Mean and Variance of a Random Variable
Generating Discrete Random Variables
Existence Theorem 3
Markov Processes
Functions of one Random Variable
The Laws of Large Numbers
Central Limit Theorem
Problem Definition
Stochastic Processes
3
SOLO
Table of Content (continue -1)
Recursive Bayesian Estimation
Bayesian Estimation Introduction
Linear Gaussian Markov Systems
Closed-Form Solutions of Estimation
Kalman Filter
Extended Kalman Filter
General Bayesian Nonlinear Filters
Additive Gaussian Nonlinear Filter
Gauss – Hermite Quadrature Approximation
Unscented Kalman Filter
Monte Carlo Kalman Filter (MCKF)
Non-Additive Non-Gaussian Nonlinear Filter
Nonlinear Estimation Using Particle Filters
Importance Sampling (IS)
Sequential Importance Sampling (SIS)
Sequential Importance Resampling (SIR)
Monte Carlo Particle Filter (MCPF)
Bayesian Maximum Likelihood Estimate (Maximum Aposteriori – MAP Estimate)
4
SOLO
Table of Content (continue -2)
Recursive Bayesian Estimation
References
Nonlinear Filters based on the Fokker-Planck Equation
5
SOLO Recursive Bayesian Estimation
[Figure: hidden states x0 → x1 → x2 → … → x(k−1) → xk propagated through f(x(k−1), w(k−1)); measurements z1, z2, …, z(k−1), zk = h(xk, vk); measurement sets Z1:k−1, Z1:k]
A discrete nonlinear system is defined by
xk = f(k−1, x(k−1), w(k−1))   State vector dynamics
zk = h(k, xk, vk)             Measurements
w(k−1), vk   State and Measurement Noise Vectors, respectively
Problem Definition:
Estimate the hidden States xk of a Non-linear Dynamic Stochastic System from Noisy Measurements zk.
Since this is a probabilistic problem, we start with a reminder of Probability Theory.
Table of Content
6
SOLO
Probability Axiomatic Definition
Pr(A) is the probability of the event A if:
(1) Pr(A) ≥ 0
(2) Pr(S) = 1, where S = A1 ∪ A2 ∪ … ∪ An and Ai ∩ Aj = Ø ∀ i ≠ j
(3) If A = A1 ∪ A2 ∪ … ∪ An and Ai ∩ Aj = Ø ∀ i ≠ j, then Pr(A) = Pr(A1) + Pr(A2) + … + Pr(An)
Probability Geometric Definition
Assume that the probability of an event in a geometric region A is defined as the ratio between the surface of A and the surface of S:
Pr(A) = Surface(A) / Surface(S)
This definition satisfies the same axioms:
(1) Pr(A) ≥ 0
(2) Pr(S) = 1
(3) If A = A1 ∪ A2 ∪ … ∪ An and Ai ∩ Aj = Ø ∀ i ≠ j, then Pr(A) = Pr(A1) + Pr(A2) + … + Pr(An)
[Figure: region A inside the sample space S]
Review of Probability
A more detailed explanation
of the subject is given in the
“Probability” Presentation
7
SOLO
From those definitions we can prove the following:
(1') Pr(Ø) = 0
Proof: S = S ∪ Ø and S ∩ Ø = Ø ⇒ (by axiom 3) Pr(S) = Pr(S) + Pr(Ø) ⇒ Pr(Ø) = 0
(2') Pr(Ā) = 1 − Pr(A)
Proof: S = A ∪ Ā and A ∩ Ā = Ø ⇒ (by axioms 2 & 3) Pr(S) = 1 = Pr(A) + Pr(Ā) ⇒ Pr(Ā) = 1 − Pr(A)
(3') 0 ≤ Pr(A) ≤ 1
Proof: Pr(A) = 1 − Pr(Ā) ≤ 1 (by (2') and Pr(Ā) ≥ 0), and Pr(A) ≥ 0 (by axiom 1)
(Axioms: (1) Pr(A) ≥ 0; (2) Pr(S) = 1; (3) additivity over disjoint events.)
(4') If A ⊂ B ⇒ Pr(A) ≤ Pr(B)
Proof: B = A ∪ (B − A) and A ∩ (B − A) = Ø ⇒ (by axiom 3) Pr(B) = Pr(A) + Pr(B − A) ≥ Pr(A)
(5') Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
Proof: A ∪ B = A ∪ (B − A ∩ B) with A ∩ (B − A ∩ B) = Ø, and B = (A ∩ B) ∪ (B − A ∩ B) with (A ∩ B) ∩ (B − A ∩ B) = Ø
⇒ (by axiom 3) Pr(A ∪ B) = Pr(A) + Pr(B − A ∩ B) and Pr(B) = Pr(A ∩ B) + Pr(B − A ∩ B)
⇒ Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B)
Table of Content
Review of Probability
8
SOLO
Conditional Probability
Given two events A and B decomposed in elementary events:
A = Aα1 ∪ Aα2 ∪ … ∪ Aαn = ∪_{i=1}^{n} Aαi,   Aαi ∩ Aαj = Ø ∀ i ≠ j
B = Aβ1 ∪ Aβ2 ∪ … ∪ Aβm = ∪_{k=1}^{m} Aβk,   Aβk ∩ Aβl = Ø ∀ k ≠ l
A ∩ B = Aαβ1 ∪ Aαβ2 ∪ … ∪ Aαβr,   Aαβi ∩ Aαβj = Ø ∀ i ≠ j
Pr(A) = Pr(Aα1) + Pr(Aα2) + … + Pr(Aαn)   Pr(B) = Pr(Aβ1) + Pr(Aβ2) + … + Pr(Aβm)
Pr(A ∩ B) = Pr(Aαβ1) + Pr(Aαβ2) + … + Pr(Aαβr),   r ≤ m, n
We want to find the probability of the event A under the condition that the event B has occurred, denoted P(A|B):
Pr(A|B) = [Pr(Aαβ1) + Pr(Aαβ2) + … + Pr(Aαβr)] / [Pr(Aβ1) + Pr(Aβ2) + … + Pr(Aβm)] = Pr(A ∩ B) / Pr(B)
[Figure: sample space S decomposed into the elementary events Aαi, Aβk and their intersections Aαβi]
Review of Probability
9
SOLO
Conditional Probability
If the events A and B are statistically independent, then the fact that B occurred will not affect the probability of A to occur:
Pr(A|B) = Pr(A ∩ B) / Pr(B)    Pr(B|A) = Pr(A ∩ B) / Pr(A)
Pr(A|B) = Pr(A)  ⇒  Pr(A ∩ B) = Pr(A|B)·Pr(B) = Pr(B|A)·Pr(A) = Pr(A)·Pr(B)
Definition:
n events Ai, i = 1,2,…,n are statistically independent if:
Pr(∩_{i=1}^{r} Ai) = ∏_{i=1}^{r} Pr(Ai)   ∀ r = 2,…,n
Table of Content
Review of Probability
10
SOLO
Conditional Probability - Bayes Formula
Using the relations:
Pr(Aβl ∩ B) = Pr(B|Aβl)·Pr(Aβl) = Pr(Aβl|B)·Pr(B)
B = ∪_{k=1}^{m} (Aβk ∩ B),   (Aβk ∩ B) ∩ (Aβl ∩ B) = Ø ∀ k ≠ l
Pr(B) = Σ_{k=1}^{m} Pr(Aβk ∩ B)
we obtain:
Pr(Aβl|B) = Pr(B|Aβl)·Pr(Aβl) / Pr(B) = Pr(B|Aβl)·Pr(Aβl) / Σ_{k=1}^{m} Pr(B|Aβk)·Pr(Aβk)
Bayes Formula
Thomas Bayes
1702 - 1761
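As a small numerical illustration of the formula above, the sketch below computes the posterior Pr(Aβl | B) from an assumed prior and likelihood (the numbers are made up for the example):

```python
# Bayes formula: posterior_l = P(B|A_l) * P(A_l) / sum_k P(B|A_k) * P(A_k)
prior = [0.5, 0.3, 0.2]          # P(A_l), must sum to 1 (assumed values)
likelihood = [0.9, 0.5, 0.1]     # P(B | A_l) (assumed values)

evidence = sum(l * p for l, p in zip(likelihood, prior))            # P(B), total probability
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]   # P(A_l | B)

print(evidence)    # P(B)
print(posterior)   # posterior probabilities, they sum to 1
```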
Table of Content
Review of Probability
11
SOLO
Total Probability Theorem
Table of Content
If A1 ∪ A2 ∪ … ∪ An = S and Ai ∩ Aj = Ø ∀ i ≠ j,
we say that the sample space S is decomposed into exhaustive and incompatible (exclusive) sets.
The Total Probability Theorem states that for any event B, its probability can be decomposed in terms of conditional probabilities as follows:
Pr(B) = Σ_{i=1}^{n} Pr(B, Ai) = Σ_{i=1}^{n} Pr(B|Ai) Pr(Ai)
Using the relations:
Pr(Al ∩ B) = Pr(B|Al)·Pr(Al) = Pr(Al|B)·Pr(B)
B = ∪_{k=1}^{n} (Ak ∩ B),   (Ak ∩ B) ∩ (Al ∩ B) = Ø ∀ k ≠ l   (for any event B)
Pr(B) = Σ_{k=1}^{n} Pr(Ak ∩ B)
we obtain the result.
Review of Probability
12
SOLO
Statistical Independent Events
From the Theorem of Addition:
Pr(∪_{i=1}^{n} Ai) = Σ_{i} Pr(Ai) − Σ_{i≠j} Pr(Ai ∩ Aj) + Σ_{i≠j≠k} Pr(Ai ∩ Aj ∩ Ak) − ⋯ + (−1)^{n−1} Pr(A1 ∩ A2 ∩ ⋯ ∩ An)
If the Ai are statistically independent:
Pr(∪_{i=1}^{n} Ai) = Σ_{i} Pr(Ai) − Σ_{i≠j} Pr(Ai) Pr(Aj) + Σ_{i≠j≠k} Pr(Ai) Pr(Aj) Pr(Ak) − ⋯ + (−1)^{n−1} ∏_{i=1}^{n} Pr(Ai)
Therefore
1 − Pr(∪_{i=1}^{n} Ai) = ∏_{i=1}^{n} [1 − Pr(Ai)]     (Ai statistically independent)
Pr(∪_{i=1}^{n} Ai) = 1 − ∏_{i=1}^{n} [1 − Pr(Ai)]
Since (∪_{i=1}^{n} Ai) ∪ (∩_{i=1}^{n} Āi) = S and (∪_{i=1}^{n} Ai) ∩ (∩_{i=1}^{n} Āi) = Ø   (De Morgan),
Pr(∩_{i=1}^{n} Āi) = 1 − Pr(∪_{i=1}^{n} Ai) = ∏_{i=1}^{n} [1 − Pr(Ai)] = ∏_{i=1}^{n} Pr(Āi)     (Ai statistically independent)
If the n events Ai, i = 1,2,…,n are statistically independent, then the complementary events Āi are also statistically independent.
(Recall the definition: Pr(∩_{i=1}^{r} Ai) = ∏_{i=1}^{r} Pr(Ai)  ∀ r = 2,…,n.)
Table of Content
Review of Probability
13
SOLO Review of Probability
Expected Value or Mathematical Expectation
Given a Probability Density Function p (x) we define the Expected Value
For a Continuous Random Variable:  E(x) := ∫_{−∞}^{+∞} x pX(x) dx
For a Discrete Random Variable:  E(x) := Σ_k xk pX(xk)
For a general function g(x) of the Random Variable x:  E[g(x)] := ∫_{−∞}^{+∞} g(x) pX(x) dx
Since ∫_{−∞}^{+∞} pX(x) dx = 1:
E(x) = ∫_{−∞}^{+∞} x pX(x) dx / ∫_{−∞}^{+∞} pX(x) dx
The Expected Value is the centroid of the area enclosed between the Probability Density Function and the x axis.
[Figure: density pX(x) with its centroid E(x) marked on the x axis]
Table of Content
14
SOLO Review of Probability
Variance
Given a Probability Density Function p(x) we define the Variance
Var(x) := E{[x − E(x)]²} = E[x² − 2 x E(x) + E(x)²] = E(x²) − [E(x)]²
Moment about the origin
Given a Probability Density Function p(x) we define the Moment of order k about the origin
µ'_k(x) := E{x^k}
Central Moment
Given a Probability Density Function p(x) we define the Central Moment of order k about the Mean E(x)
µ_k(x) := E{[x − E(x)]^k} = Σ_{j=0}^{k} (k choose j) (−1)^{k−j} µ'_j(x) [E(x)]^{k−j}
Table of Content
15
SOLO Review of Probability
Moments
Normal Distribution:  pX(x; σ) = exp[−x²/(2σ²)] / (√(2π) σ)
E[x^n] = 1·3·⋯·(n−1)·σ^n   for n even
E[x^n] = 0                 for n odd
and, for the absolute moments,
E[|x|^n] = 1·3·⋯·(n−1)·σ^n              for n = 2k
E[|x|^n] = √(2/π)·2^k·k!·σ^{2k+1}       for n = 2k+1
Proof:
Start from ∫_{−∞}^{+∞} exp(−a x²) dx = √(π/a), a > 0, and differentiate k times with respect to a:
∫_{−∞}^{+∞} x^{2k} exp(−a x²) dx = [1·3·⋯·(2k−1) / (2a)^k] √(π/a),   a > 0
Substitute a = 1/(2σ²) to obtain E[x^{2k}] = 1·3·⋯·(2k−1)·σ^{2k}.
For the odd absolute moments, with the substitution y = x²/(2σ²):
E[|x|^{2k+1}] = 2 ∫_0^{∞} x^{2k+1} exp[−x²/(2σ²)] dx / (√(2π) σ) = 2^{k+1} σ^{2k+1} ∫_0^{∞} y^k exp(−y) dy / √(2π) = √(2/π)·2^k·k!·σ^{2k+1}
Now let us compute:
E[x⁴] = 3 σ⁴ = 3 (E[x²])²
Chi-square
16
SOLO Review of Probability
Functions of one Random Variable
Let y = g(x) be a given function of the random variable x defined on the domain Ω, with probability density pX(x). We want to find pY(y).
Fundamental Theorem
Assume x1, x2, …, xn are all the solutions of the equation
y = g(x1) = g(x2) = … = g(xn)
Then
pY(y) = pX(x1)/|g'(x1)| + pX(x2)/|g'(x2)| + … + pX(xn)/|g'(xn)|,   where g'(x) := dg(x)/dx
Proof
pY(y)|dy| := Pr{y ≤ Y ≤ y + dy} = Σ_{i=1}^{n} Pr{xi ≤ x ≤ xi ± dxi} = Σ_{i=1}^{n} pX(xi)|dxi| = Σ_{i=1}^{n} pX(xi) |dy| / |g'(xi)|
q.e.d.
17
SOLO Review of Probability
Functions of one Random Variable (continue – 1)
Example 1:  y = a x + b   ⇒   pY(y) = (1/|a|) pX((y − b)/a)
Example 2:  y = a/x       ⇒   pY(y) = (a/y²) pX(a/y)
Example 3:  y = a x²      ⇒   pY(y) = [1/(2a√(y/a))] [pX(√(y/a)) + pX(−√(y/a))] U(y)
Example 4:  y = |x|       ⇒   pY(y) = [pX(y) + pX(−y)] U(y)
(U(y) denotes the unit step function.)
Table of Content
18
SOLO Review of Probability
Characteristic Function and Moment-Generating Function
Given a Probability Density Function pX(x) we define the Characteristic Function or Moment-Generating Function
ΦX(ω) := E[exp(jωx)] = ∫_{−∞}^{+∞} exp(jωx) pX(x) dx    (x continuous)
        = Σ_x exp(jωx) pX(x)                             (x discrete)
This is in fact the complex conjugate of the Fourier Transform of the Probability Density Function. This function is always defined since the sufficient condition for the existence of a Fourier Transform,
∫_{−∞}^{+∞} |pX(x)| dx = ∫_{−∞}^{+∞} pX(x) dx = 1 < ∞     (pX(x) ≥ 0),
is always fulfilled.
Given the Characteristic Function we can find the Probability Density Function pX(x) using the Inverse Fourier Transform:
pX(x) = (1/2π) ∫_{−∞}^{+∞} exp(−jωx) ΦX(ω) dω
19
SOLO Review of Probability
Properties of Moment-Generating Function
ΦX(ω) = ∫_{−∞}^{+∞} exp(jωx) pX(x) dx
ΦX(ω)|_{ω=0} = ∫_{−∞}^{+∞} pX(x) dx = 1
dΦX(ω)/dω = ∫_{−∞}^{+∞} (jx) exp(jωx) pX(x) dx   ⇒   dΦX/dω|_{ω=0} = j E(x)
d²ΦX(ω)/dω² = ∫_{−∞}^{+∞} (jx)² exp(jωx) pX(x) dx   ⇒   d²ΦX/dω²|_{ω=0} = j² E(x²)
⋮
d^nΦX(ω)/dω^n = ∫_{−∞}^{+∞} (jx)^n exp(jωx) pX(x) dx   ⇒   d^nΦX/dω^n|_{ω=0} = j^n E(x^n)
This is the reason why ΦX(ω) is also called the Moment-Generating Function.
20
SOLO Review of Probability
Properties of Moment-Generating Function
Develop ΦX(ω) = ∫_{−∞}^{+∞} exp(jωx) pX(x) dx in a Taylor series about ω = 0:
ΦX(ω) = ΦX(0) + (dΦX/dω)|_{ω=0} ω + (1/2!)(d²ΦX/dω²)|_{ω=0} ω² + ⋯ + (1/n!)(d^nΦX/dω^n)|_{ω=0} ω^n + ⋯
      = 1 + j E(x) ω/1! + j² E(x²) ω²/2! + ⋯ + j^n E(x^n) ω^n/n! + ⋯
21
SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(2) Poisson’s Distribution ( ) ( )0
0
exp
!
, k
k
k
nkp
k
−≈
(1) Binomial (Bernoulli) ( )
( )
( ) ( ) knkknk
pp
k
n
pp
knk
n
nkp
−−
−





=−
−
= 11
!!
!
,
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 k
( )nkP ,
(3) Normal (Gaussian)
( ) ( ) ( )[ ]
σπ
σµ
σµ
2
2/exp
,;
22
−−
=
x
xp
(4) Laplacian Distribution ( )







 −
−=
b
x
b
bxp
µ
µ exp
2
1
,;
22
SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(5) Gamma Distribution:  p(x; k, θ) = x^{k−1} exp(−x/θ) / (Γ(k) θ^k)  for x ≥ 0;  0 for x < 0
(6) Beta Distribution:  p(x; α, β) = x^{α−1} (1−x)^{β−1} / ∫_0^1 u^{α−1} (1−u)^{β−1} du = [Γ(α+β)/(Γ(α)Γ(β))] x^{α−1} (1−x)^{β−1}
(7) Cauchy Distribution:  p(x; x0, γ) = (1/π) γ / [(x − x0)² + γ²]
23
SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
SOLO
(8) Exponential Distribution:  p(x; λ) = λ exp(−λx)  for x ≥ 0;  0 for x < 0
(9) Chi-square Distribution:  p(x; k) = (1/2)^{k/2} x^{k/2−1} exp(−x/2) / Γ(k/2)  for x ≥ 0;  0 for x < 0
Γ is the gamma function:  Γ(a) = ∫_0^{∞} t^{a−1} exp(−t) dt
(10) Student's t-Distribution:  p(x; ν) = [Γ((ν+1)/2) / (√(νπ) Γ(ν/2))] [1 + x²/ν]^{−(ν+1)/2}
24
SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
SOLO
(11) Uniform Distribution (Continuous):  p(x; a, b) = 1/(b − a)  for a ≤ x ≤ b;  0 for x < a or x > b
(12) Rayleigh Distribution:  p(x; σ) = (x/σ²) exp(−x²/(2σ²))
(13) Rice Distribution:  p(x; v, σ) = (x/σ²) exp(−(x² + v²)/(2σ²)) I0(x v/σ²)
25
SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(14) Weibull Distribution
SOLO
p(x; γ, µ, α) = (γ/α) ((x − µ)/α)^{γ−1} exp[−((x − µ)/α)^γ]  for x ≥ µ, with α, γ > 0;  0 for x < µ
Table of Content
26
SOLO Review of Probability
Normal (Gaussian) Distribution
Karl Friedrich Gauss
1777-1855
Probability Density Function:
p(x; µ, σ) = exp[−(x − µ)²/(2σ²)] / (√(2π) σ) =: N(x; µ, σ)
Cumulative Distribution Function:
P(x; µ, σ) = (1/(√(2π) σ)) ∫_{−∞}^{x} exp[−(u − µ)²/(2σ²)] du
Mean Value:  E(x) = µ
Variance:  Var(x) = σ²
Moment Generating Function:
Φ(ω) = E[exp(jωx)] = (1/(√(2π) σ)) ∫_{−∞}^{+∞} exp(jωu) exp[−(u − µ)²/(2σ²)] du = exp(jµω − σ²ω²/2)
27
SOLO Review of Probability
Moments
Normal Distribution:  pX(x; 0, σ) = exp[−x²/(2σ²)] / (√(2π) σ) =: N(x; 0, σ)
E[x^n] = 1·3·⋯·(n−1)·σ^n   for n even
E[x^n] = 0                 for n odd
and, for the absolute moments,
E[|x|^n] = 1·3·⋯·(n−1)·σ^n              for n = 2k
E[|x|^n] = √(2/π)·2^k·k!·σ^{2k+1}       for n = 2k+1
Proof:
Start from ∫_{−∞}^{+∞} exp(−a x²) dx = √(π/a), a > 0, and differentiate k times with respect to a:
∫_{−∞}^{+∞} x^{2k} exp(−a x²) dx = [1·3·⋯·(2k−1) / (2a)^k] √(π/a),   a > 0
Substitute a = 1/(2σ²) to obtain E[x^{2k}] = 1·3·⋯·(2k−1)·σ^{2k}.
For the odd absolute moments, with the substitution y = x²/(2σ²):
E[|x|^{2k+1}] = 2 ∫_0^{∞} x^{2k+1} exp[−x²/(2σ²)] dx / (√(2π) σ) = √(2/π)·2^k·k!·σ^{2k+1}
Now let us compute:
E[x⁴] = 3 σ⁴ = 3 (E[x²])²
Chi-square
28
SOLO Review of Probability
Normal (Gaussian) Distribution (continue – 1)
Karl Friedrich Gauss
1777-1855
A Vector-Valued Gaussian Random Variable has the Probability Density Function
p(x; x̄, P) = |2πP|^{−1/2} exp[−½ (x − x̄)ᵀ P^{−1} (x − x̄)] =: N(x; x̄, P)
where
x̄ = E{x}   Mean Value
P = E{(x − x̄)(x − x̄)ᵀ}   Covariance Matrix
If P is diagonal, P = diag[σ1², σ2², …, σk²], then the components of the random vector x are uncorrelated, and
p(x; x̄, P) = |2πP|^{−1/2} exp[−½ Σ_{i=1}^{k} (xi − x̄i)²/σi²] = ∏_{i=1}^{k} exp[−(xi − x̄i)²/(2σi²)] / (√(2π) σi)
therefore the components of the random vector are also independent.
29
SOLO Review of Probability
The Laws of Large Numbers
The Law of Large Numbers is a fundamental concept in statistics and probability that
describes how the average of randomly selected sample of a large population is likely
to be close to the average of the whole population. There are two laws of large numbers
the Weak Law and the Strong Law.
The Weak Law of Large Numbers
The Weak Law of Large Numbers states that if X1, X2, …, Xn, … is an infinite sequence of random variables that have the same expected value µ and variance σ², and are uncorrelated (i.e., the correlation between any two of them is zero), then
X̄n := (X1 + … + Xn)/n
converges in probability (a weak convergence sense) to µ. We have
Pr{|X̄n − µ| < ε} → 1  as n → ∞     (convergence in probability)
The Strong Law of Large Numbers
The Strong Law of Large Numbers states that if X1, X2, …, Xn, … is an infinite sequence of random variables that have the same expected value µ and variance σ², are uncorrelated (i.e., the correlation between any two of them is zero), and E(|Xi|) < ∞, then
Pr{lim_{n→∞} X̄n = µ} = 1,  i.e. X̄n converges almost surely to µ.
30
SOLO Review of Probability
The Law of Large Numbers
Differences between the Weak Law and the Strong Law
The Weak Law states that, for a specified large n, (X1 + ... + Xn) / n is likely to be near μ.
Thus, it leaves open the possibility that | (X1 + ... + Xn) / n − μ | > ε happens an infinite
number of times, although it happens at infrequent intervals.
The Strong Law shows that this almost surely will not occur.
In particular, it implies that with probability 1, we have for any positive value ε, the
inequality | (X1 + ... + Xn) / n − μ | > ε is true only a finite number of times (as opposed to
an infinite, but infrequent, number of times).
Almost sure convergence is also called strong convergence of random variables.
This version is called the strong law because random variables which converge
strongly (almost surely) are guaranteed to converge weakly (in probability). The
strong law implies the weak law.
31
SOLO Review of Probability
The Law of Large Numbers
Proof of the Weak Law of Large Numbers
Given:  E(Xi) = µ ∀ i,   Var(Xi) = σ² ∀ i,   E[(Xi − µ)(Xj − µ)] = 0 ∀ i ≠ j,
we have:
E(X̄n) = E[(X1 + … + Xn)/n] = nµ/n = µ
Var(X̄n) = E{[X̄n − E(X̄n)]²} = E{[(X1 − µ) + … + (Xn − µ)]²/n²} = nσ²/n² = σ²/n
(the cross terms vanish since E[(Xi − µ)(Xj − µ)] = 0 for i ≠ j)
Using Chebyshev's inequality on X̄n we obtain:
Pr{|X̄n − µ| ≥ ε} ≤ (σ²/n)/ε²     (Chebyshev's inequality)
Using this equation we obtain:
Pr{|X̄n − µ| ≤ ε} = 1 − Pr{|X̄n − µ| > ε} ≥ 1 − Pr{|X̄n − µ| ≥ ε} ≥ 1 − σ²/(n ε²)
As n approaches infinity, the expression approaches 1.
q.e.d.
Monte Carlo Integration
Table of Content
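A minimal Monte Carlo check of the statement above (a sketch assuming NumPy; the Gaussian population is an arbitrary choice): the probability of a large deviation of the sample mean shrinks with n and is bounded by Chebyshev's inequality.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, eps = 2.0, 1.0, 0.1

for n in (10, 100, 1000, 10000):
    # many independent realizations of the sample mean X̄_n
    xbar = rng.normal(mu, sigma, size=(5000, n)).mean(axis=1)
    prob_exceed = np.mean(np.abs(xbar - mu) >= eps)
    chebyshev_bound = min(sigma**2 / (n * eps**2), 1.0)
    print(n, prob_exceed, chebyshev_bound)   # empirical probability stays below the bound
```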
32
SOLO Review of Probability
Central Limit Theorem
The first version of this theorem was postulated by the French-born English mathematician Abraham de Moivre in 1733, using the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. This was published in 1756 in "The Doctrine of Chances", 3rd Ed.
Pierre-Simon Laplace
(1749-1827)
Abraham de Moivre
(1667-1754)
This finding was forgotten until 1812, when the French mathematician Pierre-Simon Laplace recovered it in his work "Théorie Analytique des Probabilités", in which he approximated the binomial distribution with the normal distribution.
This is known as the De Moivre – Laplace Theorem.
De Moivre – Laplace
Theorem
The present form of the Central Limit Theorem was given by the
Russian mathematician Alexandr Lyapunov in 1901.
Alexandr Mikhailovich
Lyapunov
(1857-1918)
33
SOLO Review of Probability
Central Limit Theorem (continue – 1)
Let X1, X2, …, Xm be a sequence of independent random variables with the same probability distribution function pX(x). Define the statistical mean:
X̄m = (X1 + X2 + … + Xm)/m
E(X̄m) = [E(X1) + E(X2) + … + E(Xm)]/m = µ
σ²_{X̄m} = Var(X̄m) = E{[X̄m − E(X̄m)]²} = E{[(X1 − µ) + (X2 − µ) + … + (Xm − µ)]²/m²} = mσ²/m² = σ²/m
Define also the new random variable
Y := [X̄m − E(X̄m)]/σ_{X̄m} = [(X1 − µ) + (X2 − µ) + … + (Xm − µ)]/(σ√m)
We have:
The probability distribution of Y tends to become Gaussian (normal) as m tends to infinity, regardless of the probability distribution of the random variable, as long as the mean µ and the variance σ² are finite.
34
SOLO Review of Probability
Central Limit Theorem (continue – 2)
Y := [X̄m − E(X̄m)]/σ_{X̄m} = [(X1 − µ) + (X2 − µ) + … + (Xm − µ)]/(σ√m)
Proof
The Characteristic Function of Y:
ΦY(ω) = E[exp(jωY)] = E{exp[jω ((X1 − µ) + (X2 − µ) + … + (Xm − µ))/(σ√m)]}
      = ∏_{i=1}^{m} E{exp[jω (Xi − µ)/(σ√m)]} = [Φ_{(Xi−µ)/σ}(ω/√m)]^m
Develop Φ_{(Xi−µ)/σ}(ω/√m) in a Taylor series:
Φ_{(Xi−µ)/σ}(ω/√m) = 1 + (jω/√m) E{(Xi − µ)/σ}/1! + (jω/√m)² E{[(Xi − µ)/σ]²}/2! + (jω/√m)³ E{[(Xi − µ)/σ]³}/3! + ⋯
                   = 1 − ω²/(2m) + o(ω²/m),     with lim_{m→∞} o(ω²/m)/(ω²/m) = 0
(using E{(Xi − µ)/σ} = 0 and E{[(Xi − µ)/σ]²} = 1)
35
SOLO Review of Probability
Central Limit Theorem (continue – 3)
Proof (continue – 1)
The Characteristic Function
ΦY(ω) = [Φ_{(Xi−µ)/σ}(ω/√m)]^m = [1 − ω²/(2m) + o(ω²/m)]^m
Therefore
ΦY(ω) → exp(−ω²/2)  as m → ∞
which is the Characteristic Function of the Normal Distribution, and hence
pY(y) = (1/2π) ∫_{−∞}^{+∞} exp(−jωy) ΦY(ω) dω → (1/2π) ∫_{−∞}^{+∞} exp(−jωy) exp(−ω²/2) dω = exp(−y²/2)/√(2π)
The probability distribution of Y tends to become Gaussian (normal) as m tends to infinity
(Convergence in Distribution).
Characteristic Function
of Normal Distribution
Convergence
Concepts
Monte Carlo
Integration
Table of Content
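A short simulation of the theorem (a sketch; the exponential distribution is an arbitrary non-Gaussian choice): the normalized variable Y behaves like a standard normal for large m.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 1.0, 1.0          # mean and std of Exp(1)
m, runs = 50, 100000

x = rng.exponential(scale=1.0, size=(runs, m))        # non-Gaussian samples X_1..X_m
y = (x.mean(axis=1) - mu) / (sigma / np.sqrt(m))      # Y = (X̄_m - mu) / (sigma/sqrt(m))

print(y.mean(), y.var())              # ≈ 0 and ≈ 1
print(np.mean(np.abs(y) < 1.96))      # ≈ 0.95, as for a standard normal
```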
38
SOLO Review of Probability
Existence Theorems
Existence Theorem 1
Given a function G(x) such that
G(−∞) = 0,   G(+∞) = lim_{x→∞} G(x) = 1
0 ≤ G(x1) ≤ G(x2)  if x1 < x2     (G(x) is monotonic non-decreasing)
lim_{xn→x, xn≥x} G(xn) = G(x⁺) = G(x)     (G(x) is continuous from the right)
We can find an experiment X and a random variable x, defined on X, such that its distribution function P(x) equals the given function G(x).
Proof of Existence Theorem 1
Assume that the outcome of the experiment X is any real number −∞ < x < +∞.
We consider as events all intervals, and the intersections or unions of intervals, on the real axis.
To specify the probability of those events we define P(x1) = Prob{x ≤ x1} = G(x1).
From our definition of G(x) it follows that P(x) is a distribution function.
Existence Theorem 2 Existence Theorem 3
39
SOLO Review of Probability
Existence Theorems
Existence Theorem 2
If a function F (x,y) is such that
F(−∞, y) = 0,   F(x, −∞) = 0,   F(+∞, +∞) = 1
F(x2, y2) − F(x1, y2) − F(x2, y1) + F(x1, y1) ≥ 0
for every x1 < x2, y1 < y2, then two random variables x and y can be found such that
F (x,y) is their joint distribution function.
Proof of Existence Theorem 2
Assume that the outcome of the experiment X is any real number -∞ <x < +∞.
Assume that the outcome of the experiment Y is any real number -∞ <y < +∞.
We consider as events all intervals, the intersection or union of intervals on the
real axes x and y.
To specify the probability of those events we define P (x,y)=Prob { x ≤ x1, y ≤ y1, }= F (x1,y1).
From our definition of F (x,y) it follows that P (x,y) is a joint distribution function.
The proof is similar to that in the Existence Theorem 1
40
SOLO Review of Probability
Monte Carlo Method
Monte Carlo methods are a class of computational algorithms that
rely on repeated random sampling to compute their results. Monte
Carlo methods are often used when simulating physical and
mathematical systems. Because of their reliance on repeated
computation and random or pseudo-random numbers, Monte Carlo
methods are most suited to calculation by a computer. Monte Carlo
methods tend to be used when it is infeasible or impossible to
compute an exact result with a deterministic algorithm.
The term Monte Carlo method was coined in the 1940s by physicists Stanislaw Ulam,
Enrico Fermi, John von Neumann, and Nicholas Metropolis, working on nuclear
weapon projects in the Los Alamos National Laboratory (reference to the Monte Carlo
Casino in Monaco where Ulam's uncle would borrow money to gamble)
Stanislaw Ulam
1909 - 1984
Enrico Fermi
1901 - 1954
John von Neumann
1903 - 1957
Monte Carlo Casino
Nicholas Constantine Metropolis
(1915 –1999)
41
SOLO Review of Probability
Monte Carlo Approximation
Monte Carlo runs generate a set of random samples that approximate the distribution p(x):
x^(L) ~ p(x),   L = 1, …, P     (x^(L) are generated (drawn) samples from the distribution p(x))
So, with P samples, expectations with respect to the distribution are approximated by
∫ f(x) p(x) dx ≈ (1/P) Σ_{L=1}^{P} f(x^(L))
and, in the usual way for Monte Carlo, this can give all the moments etc. of the distribution up to some degree of approximation:
µ1 = E{x} = ∫ x p(x) dx ≈ (1/P) Σ_{L=1}^{P} x^(L)
µn = E{(x − µ1)^n} = ∫ (x − µ1)^n p(x) dx ≈ (1/P) Σ_{L=1}^{P} (x^(L) − µ1)^n
Table of Content
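A minimal sketch of the approximation above (assuming NumPy; the Gamma density is just an example target from which samples can be drawn directly):

```python
import numpy as np

rng = np.random.default_rng(2)
P = 100000
x = rng.gamma(shape=3.0, scale=2.0, size=P)     # x^(L) ~ p(x), L = 1..P

mu1 = x.mean()                                   # (1/P) sum x^(L)  ≈ E{x} = 6
mu2 = np.mean((x - mu1) ** 2)                    # second central moment ≈ 12
ef  = np.mean(np.cos(x))                         # ≈ ∫ cos(x) p(x) dx
print(mu1, mu2, ef)
```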
42
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (Unknown Statistics)
A random variable, x, may take on any values in the range −∞ to +∞.
Based on a sample of k values, xi, i = 1,2,…,k, we wish to compute the sample mean, m̂k, and sample variance, σ̂k², as estimates of the population mean, m, and variance, σ².
E{xi} = E{xj} = m ∀ i, j
Define the estimate of the population mean:
m̂k := (1/k) Σ_{i=1}^{k} xi
E{m̂k} = (1/k) Σ_{i=1}^{k} E{xi} = m     (Unbiased)
Compute, using E{xi²} = E{xj²} = σ² + m² ∀ i, j and E{xi xj} = E{xi} E{xj} = m² ∀ i ≠ j (independent samples):
E{(1/k) Σ_{i=1}^{k} (xi − m̂k)²} = E{(1/k) Σ_{i=1}^{k} xi² − m̂k²} = (σ² + m²) − (σ²/k + m²) = [(k−1)/k] σ²     (Biased)
Monte Carlo simulations assume independent and identically distributed (i.i.d.) samples.
43
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 1)
We found, for i.i.d. samples with E{xi} = m and Var{xi} = σ²:
m̂k := (1/k) Σ_{i=1}^{k} xi,   E{m̂k} = m     (Unbiased)
E{(1/k) Σ_{i=1}^{k} (xi − m̂k)²} = [(k−1)/k] σ²     (Biased)
Therefore, the unbiased estimate of the sample variance of the population is defined as:
σ̂k² := (1/(k−1)) Σ_{i=1}^{k} (xi − m̂k)²,   since   E{σ̂k²} = E{(1/(k−1)) Σ_{i=1}^{k} (xi − m̂k)²} = σ²     (Unbiased)
Monte Carlo simulations assume independent and identically distributed (i.i.d.) samples.
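The distinction above is the familiar 1/k versus 1/(k−1) normalization; a quick numerical check (a sketch assuming NumPy, whose ddof argument selects between the two):

```python
import numpy as np

rng = np.random.default_rng(3)
k, runs, sigma2 = 10, 200000, 4.0

x = rng.normal(0.0, np.sqrt(sigma2), size=(runs, k))
biased   = x.var(axis=1, ddof=0)    # (1/k)     * sum (x_i - m̂_k)^2
unbiased = x.var(axis=1, ddof=1)    # (1/(k-1)) * sum (x_i - m̂_k)^2

print(biased.mean())     # ≈ (k-1)/k * sigma2 = 3.6
print(unbiased.mean())   # ≈ sigma2 = 4.0
```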
44
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 2)
A random variable, x, may take on any values in the range −∞ to +∞.
Based on a sample of k values, xi, i = 1,2,…,k, we computed the sample mean, m̂k, and sample variance, σ̂k², as estimates of the population mean, m, and variance, σ²:
E{m̂k} = E{(1/k) Σ_{i=1}^{k} xi} = m
E{σ̂k²} = E{(1/(k−1)) Σ_{i=1}^{k} (xi − m̂k)²} = σ²
Monte Carlo simulations assume independent and identically distributed (i.i.d.) samples.
45
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 3)
We found:
E{m̂k} = E{(1/k) Σ_{i=1}^{k} xi} = m,    E{σ̂k²} = E{(1/(k−1)) Σ_{i=1}^{k} (xi − m̂k)²} = σ²
Let us compute:
σ²_{m̂k} := E{(m̂k − m)²} = E{[(1/k) Σ_{i=1}^{k} (xi − m)]²}
          = (1/k²) [Σ_{i=1}^{k} E{(xi − m)²} + Σ_{i=1}^{k} Σ_{j≠i} E{(xi − m)(xj − m)}] = (1/k²)·k σ² = σ²/k
(the cross terms vanish because E{(xi − m)(xj − m)} = 0 for i ≠ j)
σ²_{m̂k} := E{(m̂k − m)²} = σ²/k
46
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 4)
Let us compute σ²_{σ̂k²} := E{(σ̂k² − σ²)²}, with σ̂k² = (1/(k−1)) Σ_{i=1}^{k} (xi − m̂k)².
Write (xi − m̂k) = (xi − m) − (m̂k − m), so that
σ̂k² = (1/(k−1)) Σ_{i=1}^{k} [(xi − m)² − 2 (xi − m)(m̂k − m) + (m̂k − m)²]
and
E{σ̂k⁴} = E{[(1/(k−1)) Σ_{i=1}^{k} ((xi − m)² − 2 (xi − m)(m̂k − m) + (m̂k − m)²)]²}
The expansion involves the fourth central moment µ4 = E{(xi − m)⁴}, terms in σ⁴, and the moments E{(m̂k − m)²} = σ²/k and E{(m̂k − m)⁴}; all odd moments of (xi − m) vanish for i ≠ j.
47
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 4)
Since (xi − m), (xj − m) and (m̂k − m) are all (approximately) independent for i ≠ j, carrying out the expansion term by term and keeping the leading terms in 1/k gives
σ²_{σ̂k²} = E{(σ̂k² − σ²)²} ≈ (µ4 − σ⁴)/k,     where µ4 := E{(xi − m)⁴}
48
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 5)
We found:
E{m̂k} = E{(1/k) Σ_{i=1}^{k} xi} = m
E{σ̂k²} = E{(1/(k−1)) Σ_{i=1}^{k} (xi − m̂k)²} = σ²
σ²_{m̂k} := E{(m̂k − m)²} = σ²/k
σ²_{σ̂k²} := E{(σ̂k² − σ²)²} ≈ (µ4 − σ⁴)/k,     µ4 := E{(xi − m)⁴}
µ4 is the fourth central moment of the random variable xi. Define the Kurtosis of the random variable xi:
λ := µ4/σ⁴
so that
σ²_{σ̂k²} = E{(σ̂k² − σ²)²} ≈ (λ − 1) σ⁴ / k
49
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 6)
For high values of k, according to the Central Limit Theorem, the estimates m̂k of the mean and σ̂k² of the variance are approximately Gaussian Random Variables:
(m̂k − m) ~ N(0; σ²/k)   &   (σ̂k² − σ²) ~ N(0; (λ − 1) σ⁴/k)
We want to find a region around σ̂k² that will contain σ² with a predefined probability φ, as a function of the number of iterations k:
Prob[0 ≤ |σ̂k² − σ²| ≤ nσ σ_{σ̂k²}] = φ
Since σ̂k² is approximately a Gaussian Random Variable, nσ is given by solving:
(1/√(2π)) ∫_{−nσ}^{+nσ} exp(−ζ²/2) dζ = φ
Cumulative Probability within nσ Standard Deviations of the Mean for a Gaussian Random Variable:
nσ = 1.000 → φ = 0.6827;  nσ = 1.645 → φ = 0.9000;  nσ = 1.960 → φ = 0.9500;  nσ = 2.576 → φ = 0.9900
With σ_{σ̂k²} = √((λ−1)/k) σ²:
σ̂k² − nσ √((λ−1)/k) σ² ≤ σ² ≤ σ̂k² + nσ √((λ−1)/k) σ²
50
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 7)
Prob[0 ≤ |σ̂k² − σ²| ≤ nσ σ_{σ̂k²}] = φ,   with σ²_{σ̂k²} = (λ − 1) σ⁴/k
σ̂k² − nσ √((λ−1)/k) σ² ≤ σ² ≤ σ̂k² + nσ √((λ−1)/k) σ²
[1 − nσ √((λ−1)/k)] σ² ≤ σ̂k² ≤ [1 + nσ √((λ−1)/k)] σ²
σ̂k² / [1 + nσ √((λ−1)/k)] ≤ σ² ≤ σ̂k² / [1 − nσ √((λ−1)/k)]
σ²_min := σ̂k² / [1 + nσ √((λ−1)/k)]  ≤  σ²  ≤  σ̂k² / [1 − nσ √((λ−1)/k)] =: σ²_max
51
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 8)
52
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 9)
53
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue - 10)
σ_max := σ̂_{k0} / √(1 − nσ √((λ−1)/k))   &   σ_min := σ̂_{k0} / √(1 + nσ √((λ−1)/k))
Monte-Carlo Procedure
1  Choose the Confidence Level φ and find the corresponding nσ using the normal (Gaussian) distribution:
   nσ = 1.000 → φ = 0.6827;  1.645 → 0.9000;  1.960 → 0.9500;  2.576 → 0.9900
2  Run a few samples k0 > 20 and estimate λ according to
   λ̂_{k0} := [(1/k0) Σ_{i=1}^{k0} (xi − m̂_{k0})⁴] / [(1/k0) Σ_{i=1}^{k0} (xi − m̂_{k0})²]²,     m̂_{k0} := (1/k0) Σ_{i=1}^{k0} xi
3  Compute σ_min and σ_max as functions of k
4  Find k for which
   Prob[0 ≤ |σ̂k² − σ²| ≤ nσ σ_{σ̂k²}] = φ
5  Run k − k0 simulations
54
SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 11)
Monte-Carlo Procedure — Example:
1  Choose the Confidence Level φ = 95%, which gives the corresponding nσ = 1.96.
   (nσ = 1.000 → φ = 0.6827;  1.645 → 0.9000;  1.960 → 0.9500;  2.576 → 0.9900)
2  Assume a Gaussian distribution, so the kurtosis is λ = 3.
3  Find k for which
   Prob[0 ≤ |σ̂k² − σ²| ≤ nσ √((λ−1)/k) σ²] = φ,   i.e.   Prob[0 ≤ |σ̂k² − σ²| ≤ 1.96 √(2/k) σ²] = 0.95
   Assume also that we require |σ̂k² − σ²| ≤ 0.1 σ² with probability φ = 95%:
   1.96 √(2/k) = 0.1   ⇒   k ≈ 800
4  Run k > 800 simulations.
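A sketch of the procedure of the last two slides in Python (assumptions: the pilot run uses a standard normal population, and nσ = 1.96 for φ = 95%):

```python
import numpy as np

rng = np.random.default_rng(4)
n_sigma, rel_err = 1.96, 0.1           # confidence 95%; require |σ̂²-σ²| ≤ 0.1 σ²

# Step 2: pilot run of k0 samples to estimate the kurtosis λ
k0 = 50
pilot = rng.normal(size=k0)            # assumed Gaussian population
m_hat = pilot.mean()
lam = np.mean((pilot - m_hat) ** 4) / np.mean((pilot - m_hat) ** 2) ** 2

# Steps 3-4: choose k so that n_sigma * sqrt((λ-1)/k) = rel_err
k = int(np.ceil((n_sigma / rel_err) ** 2 * (lam - 1.0)))
print(lam, k)                          # λ ≈ 3 for a Gaussian population → k ≈ 800
```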
55
SOLO Review of Probability
Generating Discrete Random Variables
Pseudo-Random Number Generators
• First attempts to generate “random numbers”:
- Draw balls out of a stirred urn
- Roll dice
• 1927: L.H.C. Tippett published a table of 40,000 digits taken “at random” from
census reports.
• 1939: M.G. Kendall and B. Babington-Smith create a mechanical machine to
generate random numbers. They published a table of 100,000 digits.
• 1946: J. Von Neumann proposed the “middle square method”.
• 1948: D.H. Lehmer introduced the “linear congruential method”.
• 1955: RAND Corporation published a table of 1,000,000 random digits obtained
from electronic noise.
• 1965: M.D. MacLaren and G. Marsaglia proposed to combine two congruential
generators.
• 1989: R.S. Wikramaratna proposed the additive congruential method.
56
SOLO Review of Probability
Generating Discrete Random Variables
Pseudo-Random Number Generators
A Random Number represents the value of a random variable uniformly distributed on (0,1).
Pseudo-Random Numbers constitute a sequence of values which, although deterministically generated, have all the appearances of being independent and uniformly distributed on (0,1).
One approach
1. Define x0 = integer initial condition or seed
2. Using integers a and m recursively compute
   xn = a·x(n−1) modulo m     (i.e. a·x(n−1) = k·m + xn for an integer k, with 0 ≤ xn < m)
Therefore xn takes the values 0,1,…,m−1, and the quantity un = xn/m, called a pseudo-random number, is an approximation to the value of a uniform (0,1) random variable.
In general the integers a and m should be chosen to satisfy three criteria:
1. For any initial seed, the resultant sequence has the "appearance" of being a sequence of independent uniform (0,1) random variables.
2. For any initial seed, the number of variables that can be generated before repetition begins is large.
3. The values can be computed efficiently on a digital computer.
Multiplicative congruential method
Return to
Monte Carlo Approximation
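A minimal sketch of the multiplicative congruential generator, using the parameters quoted for a 32-bit machine on the next slide (a = 7⁵, m = 2³¹ − 1):

```python
def lcg(seed, a=16807, m=2**31 - 1):
    """Multiplicative congruential generator: x_n = a * x_{n-1} mod m."""
    x = seed
    while True:
        x = (a * x) % m
        yield x / m           # u_n = x_n / m approximates Uniform(0,1)

gen = lcg(seed=12345)
print([round(next(gen), 4) for _ in range(5)])
```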
57
SOLO Review of Probability
Generating Discrete Random Variables
Pseudo-Random Number Generators (continue – 1)
A guideline is to choose m to be a large prime number compared to the computer word size.
Examples:
32-bit word computer:   m = 2³¹ − 1 = 2,147,483,647,   a = 7⁵ = 16,807
36-bit word computer:   m = 2³⁵ − 1,                    a = 5⁵ = 3,125
Another generator of pseudo-random numbers uses recursions of the type:
xn = (a·x(n−1) + c) modulo m     (i.e. a·x(n−1) + c = k·m + xn for an integer k, with 0 ≤ xn < m)
Mixed congruential method
58
SOLO Review of Probability
Generating Discrete Random Variables
Histograms
Return to Table of Content
A histogram is a graphical display of tabulated frequencies, shown as bars. It shows what
proportion of cases fall into each of several categories: it is a form of data binning. The categories
are usually specified as non-overlapping intervals of some variable. The categories (bars) must be
adjacent. The intervals (or bands, or bins) are generally of the same size.
Histograms are used to plot density of data, and often for density estimation: estimating the
probability density function of the underlying variable. The total area of a histogram always
equals 1. If the length of the intervals on the x-axis are all 1, then a histogram is identical to a
relative frequency plot.
Mathematical Definition
In a more general mathematical sense, a histogram is a mapping mi that counts the number of observations that fall into various disjoint categories (known as bins), whereas the graph of a histogram is merely one way to represent a histogram. Thus, if we let n be the total number of observations and k be the total number of bins, the histogram mi meets the condition
n = Σ_{i=1}^{k} mi
A cumulative histogram is a mapping that counts the cumulative number of observations in all of the bins up to the specified bin. That is, the cumulative histogram Mi of a histogram mi is defined as:
Mi = Σ_{j=1}^{i} mj
[Figure: an ordinary and a cumulative histogram of the same data; the data shown is a random sample of 10,000 points from a normal distribution with a mean of 0 and a standard deviation of 1.]
59
SOLO Review of Probability
Generating Discrete Random Variables
The Inverse Transform Method
Suppose we want to generate a discrete random variable X
having probability density function:
p(x) = Σ_j pj δ(x − xj),   Σ_j pj = 1,   j = 0, 1, …
To accomplish this, generate a random number U that is uniformly distributed over (0,1) and set:
X = x0   if U < p0
X = x1   if p0 ≤ U < p0 + p1
⋮
X = xj   if Σ_{i=0}^{j−1} pi ≤ U < Σ_{i=0}^{j} pi
⋮
Since, for any a and b such that 0 < a < b < 1, and U uniformly distributed, P(a ≤ U < b) = b − a, we have:
P(X = xj) = P(Σ_{i=0}^{j−1} pi ≤ U < Σ_{i=0}^{j} pi) = pj
and so X has the desired distribution.
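A small sketch of the discrete inverse-transform rule above (the values and probabilities are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(5)
values = np.array([0, 1, 2, 3])
probs  = np.array([0.1, 0.4, 0.3, 0.2])
cdf = np.cumsum(probs)

def draw(n):
    u = rng.random(n)                              # U ~ Uniform(0,1)
    idx = np.searchsorted(cdf, u, side='right')    # smallest j with U < p_0 + ... + p_j
    return values[idx]

x = draw(100000)
print([np.mean(x == v) for v in values])           # relative frequencies ≈ probs
```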
60
SOLO Review of Probability
Generating Discrete Random Variables
The Inverse Transform Method (continue – 1)
Suppose we want to generate a discrete random variable X
having probability density function:  p(x) = Σ_j pj δ(x − xj),  Σ_j pj = 1
Draw X, N times,
from p (x)
Histogram of the
Results
61
SOLO Review of Probability
Generating Discrete Random Variables
The Inverse Transform Method (continue – 2)
Generating a Poisson Random Variable:
pi = P(X = i) = e^{−λ} λ^i / i!,   i = 0, 1, …,   Σ_i pi = 1
pi+1 / pi = [e^{−λ} λ^{i+1}/(i+1)!] / [e^{−λ} λ^i/i!] = λ/(i+1)
(this recursion gives the successive probabilities, and hence the cumulative sums needed by the inverse-transform rule)
Draw X , N times, from
Poisson Distribution
Histogram of the Results
62
SOLO Review of Probability
Generating Discrete Random Variables
The Inverse Transform Method (continue – 3)
Generating a Binomial Random Variable:
pi = P(X = i) = [n!/(i!(n−i)!)] p^i (1−p)^{n−i},   i = 0, 1, …, n,   Σ_i pi = 1
pi+1 / pi = {n!/[(i+1)!(n−i−1)!] p^{i+1} (1−p)^{n−i−1}} / {n!/[i!(n−i)!] p^i (1−p)^{n−i}} = [(n−i)/(i+1)] · p/(1−p)
[Figure: P(k, n) for k = 0, 1, …, 14]
Histogram of the Results
63
SOLO Review of Probability
Generating Discrete Random Variables
The Acceptance-Rejection Technique
Suppose we have an efficient method for simulating a random variable having a probability density function { qj, j ≥ 0 }. We want to use this to obtain a random variable that has the probability density function { pj, j ≥ 0 }.
Let c be a constant such that:  pj/qj ≤ c  for all j such that qj ≠ 0.
If such a c exists, it must satisfy:  pj ≤ c qj  ⇒  1 = Σ_j pj ≤ c Σ_j qj = c  ⇒  c ≥ 1
Rejection Method
Step 1: Simulate the value of Y, having probability density function qj.
Step 2: Generate a random number U (that is uniformly distributed
over (0,1) ).
Step 3: If U < pY/c qY, set X = Y and stop. Otherwise return to Step 1.
64
SOLO Review of Probability
Generating Discrete Random Variables
The Acceptance-Rejection Technique (continue – 1)
Theorem
The random variable X obtained by the rejection method has probability density function P{X = i} = pi.
Proof
P{X = i} = P{Y = i | Acceptance} = P{Y = i, Acceptance} / P{Acceptance} = P{Acceptance | Y = i} P{Y = i} / P{Acceptance}     (Bayes)
P{Acceptance | Y = i} P{Y = i} = P{U ≤ pi/(c qi)} qi = [pi/(c qi)] qi = pi/c
(by independence, with U uniformly distributed on (0,1))
Summing over all i yields
1 = Σ_i P{X = i} = Σ_i pi / (c P{Acceptance}) = 1 / (c P{Acceptance})
c P{Acceptance} = 1   ⇒   P{Acceptance} = 1/c ≤ 1
and therefore P{X = i} = pi.
q.e.d.
65
SOLO Review of Probability
Generating Discrete Random Variables
The Acceptance-Rejection Technique (continue – 2)
Example
Generate a truncated Gaussian using the Accept-Reject method. Consider the case with
p(x) ≈ e^{−x²/2}/√(2π)   for x ∈ [−4, 4];   0 otherwise
Consider the Uniform proposal function
q(x) ≈ 1/8   for x ∈ [−4, 4];   0 otherwise
In the Figure we can see the results of the Accept-Reject method using N = 10,000 samples.
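A sketch of the Accept-Reject example above in Python (uniform proposal on [−4, 4]; the bound c is taken as the density ratio at x = 0, which is where p/q peaks):

```python
import numpy as np

rng = np.random.default_rng(6)
p = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # target density on [-4, 4]
q = 1.0 / 8.0                                           # uniform proposal density on [-4, 4]
c = p(0.0) / q                                          # bound on p(x)/q(x)

def sample(n):
    out = []
    while len(out) < n:
        y = rng.uniform(-4, 4)           # Step 1: draw Y ~ q
        u = rng.random()                 # Step 2: draw U ~ Uniform(0,1)
        if u < p(y) / (c * q):           # Step 3: accept with probability p(Y)/(c q(Y))
            out.append(y)
    return np.array(out)

x = sample(10000)
print(x.mean(), x.std())                 # ≈ 0 and ≈ 1 for the (truncated) Gaussian
print(1.0 / c)                           # acceptance probability ≈ 0.31
```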
66
SOLO Review of Probability
Generating Continuous Random Variables
The Inverse Transform Algorithm
Let U be a uniform (0,1) random variable. For any continuous distribution function F the random variable X defined by
X = F⁻¹(U)
has distribution F. [F⁻¹(u) is defined to be that value of x such that F(x) = u.]
Proof
Let PX(x) denote the Probability Distribution Function of X = F⁻¹(U):
PX(x) = P{X ≤ x} = P{F⁻¹(U) ≤ x}
Since F is a distribution function, F(x) is a monotonic increasing function of x, and the inequality "a ≤ b" is equivalent to the inequality "F(a) ≤ F(b)"; therefore
PX(x) = P{F[F⁻¹(U)] ≤ F(x)} = P{U ≤ F(x)} = F(x)
(the last step holds because U is uniform on (0,1) and 0 ≤ F(x) ≤ 1)
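A sketch applying the algorithm above to the exponential distribution, whose inverse CDF is available in closed form (F(x) = 1 − e^{−λx} ⇒ F⁻¹(u) = −ln(1 − u)/λ):

```python
import numpy as np

rng = np.random.default_rng(7)
lam = 2.0

u = rng.random(100000)               # U ~ Uniform(0,1)
x = -np.log(1.0 - u) / lam           # X = F^{-1}(U) ~ Exponential(lam)

print(x.mean(), x.var())             # ≈ 1/lam = 0.5 and ≈ 1/lam² = 0.25
```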
67
SOLO Review of Probability
Importance Sampling
Let Y = (Y1,…,Ym) be a vector of random variables having a joint probability density function f(y1,…,ym), and suppose that we are interested in estimating
θ = Ef[h(Y1,…,Ym)] = ∫ h(y1,…,ym) f(y1,…,ym) dy1 ⋯ dym
Suppose that a direct generation of the random vector Y so as to compute h(Y) is inefficient or impossible because
(a) it is difficult to generate the random vector Y, or
(b) the variance of h(Y) is large, or
(c) both of the above.
Suppose that W = (W1,…,Wm) is another random vector, which takes values in the same domain as Y, and has a joint density function g(w1,…,wm) from which samples can easily be generated. The estimate θ can be expressed as:
θ = Ef[h(Y1,…,Ym)] = ∫ [h(w1,…,wm) f(w1,…,wm) / g(w1,…,wm)] g(w1,…,wm) dw1 ⋯ dwm = Eg[h(W) f(W)/g(W)]
Therefore, we can estimate θ by generating values of the random vector W, and then using as the estimator the resulting average of the values h(W) f(W)/g(W).
Return to Particle Filters
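A one-dimensional sketch of the identity above (assumptions: f is a standard normal density, h(y) = 1{y > 3}, and the proposal g is a normal shifted into the tail, so the rare-event probability is estimated far more efficiently than by direct sampling):

```python
import numpy as np

rng = np.random.default_rng(8)
N = 100000
phi = lambda y, m: np.exp(-(y - m)**2 / 2) / np.sqrt(2 * np.pi)   # N(m,1) density

h = lambda y: (y > 3.0).astype(float)        # estimate theta = P(Y > 3), Y ~ N(0,1)

# direct Monte Carlo under f = N(0,1)
direct = h(rng.normal(0.0, 1.0, N)).mean()

# importance sampling: draw W ~ g = N(3,1), average h(W) f(W)/g(W)
w = rng.normal(3.0, 1.0, N)
is_est = np.mean(h(w) * phi(w, 0.0) / phi(w, 3.0))

print(direct, is_est)    # true value ≈ 1.35e-3; the IS estimate has much lower variance
```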
68
SOLO Review of Probability
Monte Carlo Integration
The Monte Carlo Method can be used to numerically evaluate multidimensional integrals
I = ∫ g(x1,…,xm) dx1 ⋯ dxm = ∫ g(x) dx
To use Monte Carlo we factorize
g(x) = f(x)·p(x)
in such a way that p(x) is interpreted as a Probability Density Function:
p(x) ≥ 0   &   ∫ p(x) dx = 1
We assume that we can draw NS samples x^i, i = 1,…,NS from p(x):
x^i ~ p(x),   i = 1,…,NS
Using Monte Carlo we can approximate
p(x) ≈ Σ_{i=1}^{NS} δ(x − x^i)/NS
I = ∫ f(x)·p(x) dx ≈ I_NS = ∫ f(x)·Σ_{i=1}^{NS} δ(x − x^i)/NS dx = (1/NS) Σ_{i=1}^{NS} f(x^i)
69
SOLO Review of Probability
Monte Carlo Integration
we draw NS samples x^i, i = 1,…,NS from p(x):  x^i ~ p(x)
I = ∫ f(x)·p(x) dx ≈ I_NS = (1/NS) Σ_{i=1}^{NS} f(x^i)
If the samples x^i are independent, then I_NS is an unbiased estimate of I.
According to the Law of Large Numbers, I_NS will almost surely converge to I:
I_NS → I  almost surely, as NS → ∞
If the variance of f(x) is finite, i.e.
σf² := ∫ [f(x) − I]² p(x) dx < ∞
then the Central Limit Theorem holds and the estimation error converges in distribution to a Normal Distribution:
lim_{NS→∞} √NS (I_NS − I) ~ N(0, σf²)
The error of the MC estimate, e = I_NS − I, is of the order of O(NS^{−1/2}), meaning that the rate of convergence of the estimate is independent of the dimension of the integrand.
Numerical Integration of p(xk | xk−1) and p(zk | xk)
Return to Particle Filters
70
SOLO Review of Probability
Existence Theorems
Existence Theorem 3
Given a function S (ω)= S (-ω) or, equivalently, a positive-defined function R (τ),
(R (τ) = R (-τ), and R (0)=max R (τ), for all τ ), we can find a stochastic process x (t)
having S (ω) as its power spectrum or R (τ) as its autocorrelation.
Proof of Existence Theorem 3
Define
a² := (1/π) ∫_{−∞}^{+∞} S(ω) dω   &   f(ω) := S(ω)/(π a²) = f(−ω)
Since f(ω) ≥ 0 and ∫_{−∞}^{+∞} f(ω) dω = 1, according to Existence Theorem 1 we can find a random variable ω with the even density function f(ω), and probability distribution function
P(ω) := ∫_{−∞}^{ω} f(τ) dτ
We now form the process x(t) := a cos(ωt + θ), where θ is a random variable uniformly distributed in the interval (−π, +π) and independent of ω.
71
SOLO Review of Probability
Existence Theorems
Existence Theorem 3
Proof of Existence Theorem 3 (continue – 1)
Since θ is uniformly distributed in the interval (−π, +π) and independent of ω, we have
E{x(t)} = a E{cos(ωt + θ)} = a E{cos ωt} E{cos θ} − a E{sin ωt} E{sin θ} = 0     (since E{cos θ} = E{sin θ} = 0)
Indeed
E{e^{jνθ}} = (1/2π) ∫_{−π}^{+π} e^{jνθ} dθ = (e^{jνπ} − e^{−jνπ})/(2π j ν) = sin(νπ)/(νπ)
or E{e^{jνθ}} = E{cos νθ} + j E{sin νθ} = sin(νπ)/(νπ),   which vanishes for ν = 1 and ν = 2.
E{x(t+τ) x(t)} = a² E{cos[ω(t+τ) + θ] cos(ωt + θ)}
              = (a²/2) E{cos ωτ} + (a²/2) E{cos[ω(2t+τ) + 2θ]}
              = (a²/2) E{cos ωτ} + (a²/2) [E{cos ω(2t+τ)} E{cos 2θ} − E{sin ω(2t+τ)} E{sin 2θ}]
              = (a²/2) E{cos ωτ}     (since E{cos 2θ} = E{sin 2θ} = 0 and θ is independent of ω)
72
SOLO Review of Probability
Existence Theorems
Existence Theorem 3
Proof of Existence Theorem 3 (continue – 2)
We have x(t) := a cos(ωt + θ), with
E{x(t)} = 0
E{x(t+τ) x(t)} = (a²/2) E{cos ωτ} = (a²/2) ∫_{−∞}^{+∞} cos(ωτ) f(ω) dω = Rx(τ)     (definition of f(ω))
Because of those two properties x(t) is wide-sense stationary, with a power spectrum given by
Sx(ω) = ∫_{−∞}^{+∞} Rx(τ) [cos ωτ − j sin ωτ] dτ = ∫_{−∞}^{+∞} Rx(τ) cos(ωτ) dτ           (Fourier; Rx(τ) = Rx(−τ))
Rx(τ) = (1/2π) ∫_{−∞}^{+∞} Sx(ω) [cos ωτ + j sin ωτ] dω = (1/2π) ∫_{−∞}^{+∞} Sx(ω) cos(ωτ) dω     (Inverse Fourier; Sx(ω) = Sx(−ω))
Comparing Rx(τ) = (a²/2) ∫ cos(ωτ) f(ω) dω with the Inverse Fourier expression and using the definition of f(ω):
Sx(ω) = π a² f(ω) = S(ω)
q.e.d.
73
SOLO
Markov Processes
A Markov Process is defined by:
Andrei Andreevich
Markov
1856 - 1922
p(x(t) ∈ Ω, t | x(τ), τ ≤ t1) = p(x(t) ∈ Ω, t | x(t1), t1),   ∀ t > t1
i.e., for the Random Process, the past up to any time t1 is fully defined by the process at t1.
Examples of Markov Processes:
1. Continuous Dynamic System
   ẋ(t) = f(t, x, u, w)
   z(t) = h(t, x, u, v)
2. Discrete Dynamic System
   x(tk) = f(tk−1, xk−1, uk−1, wk−1)
   z(tk) = h(tk, xk, uk, vk)
x - state space vector (n x 1)
u - input vector (m x 1)
z - measurement vector (p x 1)
w - white input noise vector (n x 1)
v - white measurement noise vector (p x 1)
Recursive Bayesian Estimation
74
Recursive Bayesian EstimationSOLO
Markov Process: the present discrete state probability depends only on the previous state:
p(xk | xk−1, xk−2, …, x0) = p(xk | xk−1)
Using this property we obtain:
p(xk, xk−1, …, x0) = p(xk | xk−1, …, x0) p(xk−1, …, x0)
                   = p(xk | xk−1) p(xk−1 | xk−2, …, x0) p(xk−2, …, x0)
                   = p(x0) ∏_{i=1}^{k} p(xi | xi−1)
The Markov Process is defined if we know p(x0) and p(xi | xi−1) for each i.
Table of Content
75
Recursive Bayesian EstimationSOLO
In a Markovian system the probability of the current true state depends only on the previous state, and is independent of the other earlier states:
p(xk | xk−1, xk−2, …, x0) = p(xk | xk−1)
Similarly the measurement at the k-th time-step depends only upon the current true state, so it is conditionally independent of all other earlier states, given the current state:
p(zk | xk, xk−1, …, x0) = p(zk | xk)
p(zk, xk) = p(zk | xk) p(xk) = p(xk | zk) p(zk)
From the definition of the Markovian system (see Figure), p(xk | xk−1) is defined by f and the statistics of x and w, and p(zk | xk) is defined by h and the statistics of x and v.
[Figure: Hidden States x0 → x1 → … → xk−1 → xk propagated through f(xk−1, uk−1, wk−1); Measurements zk = h(xk, vk); measurement sets Z1:k−1, Z1:k]
76
Recursive Bayesian EstimationSOLO
Analytic Computations of p(xk | xk−1) and p(zk | xk)
xk = fk−1(xk−1, uk−1, wk−1),   given pw(wk−1), px0(x0)
zk = hk(xk, vk),   given pv(vk)
Suppose that we can obtain all w^j_{k−1}, j = 1,…,N_{xk}, for which xk = fk−1(xk−1, uk−1, w^j_{k−1}), i.e. w^j_{k−1} = f⁻¹_{k−1}(xk; xk−1, uk−1); then, by the fundamental theorem on functions of a random variable,
p(xk | xk−1) dxk = Pr{xk ≤ Xk ≤ xk + dxk | xk−1} = Σ_{j=1}^{N_{xk}} pw(w^j_{k−1}) |dw^j_{k−1}| = Σ_{j=1}^{N_{xk}} pw(w^j_{k−1}) dxk / |∇w fk−1(xk−1, uk−1, w^j_{k−1})|
p(xk | xk−1) = Σ_{j=1}^{N_{xk}} pw(w^j_{k−1}) / |∇w fk−1(xk−1, uk−1, w^j_{k−1})|
In the same way, suppose that we can obtain all v^j_k, j = 1,…,N_{zk}, for which zk = hk(xk, v^j_k), i.e. v^j_k = h⁻¹_k(zk; xk); then
p(zk | xk) = Σ_{j=1}^{N_{zk}} pv(v^j_k) / |∇v hk(xk, v^j_k)|
This is a Conceptual, Not a Practical, Procedure.
77
Recursive Bayesian EstimationSOLO
Analytic Computations of p(xk | xk−1) and p(zk | xk) (continue – 1)
For additive noise
xk = fk−1(xk−1, uk−1) + wk−1,   given pw(wk−1), px0(x0)
zk = hk(xk) + vk,               given pv(vk)
we have
wk−1 = xk − fk−1(xk−1, uk−1)
vk = zk − hk(xk)
therefore
p(xk | xk−1) = pw[xk − fk−1(xk−1, uk−1)]
and
p(zk | xk) = pv[zk − hk(xk)]
[Figure: xk−1 → f(xk−1, uk−1) + wk−1 → xk → h(xk) + vk → zk]
78
SOLO
Numerical Computations of p(xk | xk−1) and p(zk | xk)
xk = fk−1(xk−1, wk−1)
zk = hk(xk, vk)
wk−1 & vk are system and measurement white-noise sequences, independent of past and current states and of each other, and having known PDFs p(wk−1) & p(vk).
We want to compute p(xk | Z1:k) recursively, assuming knowledge of p(xk−1 | Z1:k−1), in two stages, prediction (before) and update (after measurement):
1  Prediction (before measurement):
   p(xk | Z1:k−1) = ∫ p(xk | xk−1) p(xk−1 | Z1:k−1) dxk−1
2  Update (after measurement):
   p(xk | Z1:k) = p(xk | zk, Z1:k−1) = p(zk | xk) p(xk | Z1:k−1) / p(zk | Z1:k−1)
                = p(zk | xk) p(xk | Z1:k−1) / ∫ p(zk | xk) p(xk | Z1:k−1) dxk        (Bayes: p(a|b) = p(b|a) p(a)/p(b))
We need to evaluate the following integrals:
p(xk | xk−1) = ∫ δ(xk − f(xk−1, wk−1)) p(wk−1) dwk−1
p(zk | xk) = ∫ δ(zk − h(xk, vk)) p(vk) dvk
Analytic solutions for those integral equations do not exist in the general case. We use the numeric Monte Carlo Method to evaluate the integrals:
Generate (Draw):  w^i_{k−1} ~ p(wk−1)  &  v^i_k ~ p(vk),   i = 1,…,NS
p(xk | xk−1) ≈ Σ_{i=1}^{NS} δ(xk − f(xk−1, w^i_{k−1})) / NS
p(zk | xk) ≈ Σ_{i=1}^{NS} δ(zk − h(xk, v^i_k)) / NS
or, with x^i_k = f(xk−1, w^i_{k−1}) and z^i_k = h(xk, v^i_k):
p(xk | xk−1) ≈ Σ_{i=1}^{NS} δ(xk − x^i_k) / NS,    p(zk | xk) ≈ Σ_{i=1}^{NS} δ(zk − z^i_k) / NS
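The two-stage recursion above can be evaluated directly on a grid when the state is scalar. The sketch below is a point-mass (grid) approximation for an assumed random-walk model xk = xk−1 + wk−1 and measurement zk = xk + vk with Gaussian noises; it is illustrative only, not the filter developed later in the presentation.

```python
import numpy as np

def gauss(x, m, s):
    return np.exp(-(x - m)**2 / (2 * s**2)) / (np.sqrt(2 * np.pi) * s)

grid = np.linspace(-10, 10, 401)
dx = grid[1] - grid[0]
sig_w, sig_v = 0.5, 1.0

# p(x_k | x_{k-1}) on the grid for the assumed additive-Gaussian model
trans = gauss(grid[:, None], grid[None, :], sig_w)      # rows: x_k, cols: x_{k-1}

post = gauss(grid, 0.0, 2.0)                            # p(x_0)
post /= post.sum() * dx

for z in [1.0, 1.5, 2.2]:                               # made-up measurement sequence
    pred = trans @ post * dx                            # prediction: ∫ p(x_k|x_{k-1}) p(x_{k-1}|Z) dx_{k-1}
    like = gauss(z, grid, sig_v)                        # p(z_k | x_k)
    post = like * pred
    post /= post.sum() * dx                             # update: Bayes normalization
    print(z, (grid * post * dx).sum())                  # posterior mean estimate
```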
79
Recursive Bayesian EstimationSOLO
Monte Carlo Computations of p(xk | xk−1) and p(zk | xk)
xk = fk−1(xk−1, uk−1, wk−1),   given pw(wk−1), px0(x0);     zk = hk(xk, vk),   given pv(vk)
0  Initialization: Generate (Draw)  x^i_0 ~ px0(x0),   i = 1,…,NS
For k ∈ {1, …, ∞}:
1  At stage k−1, Generate (Draw) NS samples  w^i_{k−1} ~ pw(wk−1),   i = 1,…,NS
2  State Update:  x^i_k = fk−1(x^i_{k−1}, uk−1, w^i_{k−1}),   i = 1,…,NS
   Compute Histograms of xk | xk−1 to obtain  p(xk | xk−1) ≈ Σ_{i=1}^{NS} δ(xk − x^i_k) / NS
3  Generate (Draw) Measurement Noise  v^i_k ~ pv(vk),   i = 1,…,NS
4  Measurement Update:  z^i_k = hk(x^i_k, v^i_k),   i = 1,…,NS
   Compute Histograms of zk | xk to obtain  p(zk | xk) ≈ Σ_{i=1}^{NS} δ(zk − z^i_k) / NS
k := k+1 & return to 1
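A minimal sketch of steps 1–2 above: drawing process-noise samples, pushing them through an assumed nonlinear f, and reading p(xk | xk−1) off a histogram. The model f(x, w) = 0.5x + 25x/(1 + x²) + w is just an arbitrary nonlinear example, not one taken from this presentation.

```python
import numpy as np

rng = np.random.default_rng(9)
Ns = 100000

f = lambda x, w: 0.5 * x + 25.0 * x / (1.0 + x**2) + w   # assumed state equation
x_prev = 0.5                                             # conditioning value x_{k-1}

w = rng.normal(0.0, 1.0, Ns)        # step 1: w^i_{k-1} ~ p_w
x_k = f(x_prev, w)                  # step 2: x^i_k = f(x_{k-1}, w^i_{k-1})

# the histogram of the samples approximates p(x_k | x_{k-1} = 0.5)
dens, edges = np.histogram(x_k, bins=60, density=True)
print(x_k.mean(), x_k.std())        # ≈ f(x_prev, 0) and ≈ σ_w for this additive-noise case
```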
SOLO
Stochastic Processes deal with systems corrupted by noise. A description of those processes is
given in “Stochastic Processes” Presentation. Here we give only one aspect of those processes.
( ) ( ) ( ) [ ]fttttwddttxftxd ,, 0∈+=
A continuous dynamic system is described by:
Stochastic Processes
( )tx - n- dimensional state vector
( )twd - n- dimensional process noise vector
Assuming system measurements at discrete time tk given by:
( ) ( )( ) [ ]fkkkkk tttvttxhtz ,,, 0∈=
kv - m- dimensional measurement noise vector at tk
We are interested in the probability of the state at time t given the set of discrete
measurements until (included) time tk < t.
x
( )kZtxp |,
{ }kk zzzZ ,,, 21 = - set of all measurements up to and including time tk.
The time evolution of the probability density function is described by the
Fokker–Planck equation.
A solution to the one-dimensional
Fokker–Planck equation, with both the
drift and the diffusion term. The initial
condition is a Dirac delta function in
x = 1, and the distribution drifts
towards x = 0.
The Fokker–Planck equation describes the time evolution of
the probability density function of the position of a particle, and
can be generalized to other observables as well. It is named after
Adriaan Fokker and Max Planck and is also known as the
Kolmogorov forward equation. The first use of the Fokker–
Planck equation was the statistical description of Brownian
motion of a particle in a fluid.
In one spatial dimension x, the Fokker–Planck equation for a
process with drift D1(x,t) and diffusion D2(x,t) is
More generally, the time-dependent probability distribution
may depend on a set of N macrovariables xi. The general
form of the Fokker–Planck equation is then
where D1
is the drift vector and D2
the diffusion tensor; the latter results from the presence of the
stochastic force.
Fokker – Planck Equation
Adriaan Fokker
1887 - 1972
Max Planck
1858 - 1947
SOLO
Adriaan Fokker
„Die mittlere Energie rotierender
elektrischer Dipole im Strahlungsfeld"
Annalen der Physik 43, (1914) 810-
820
Max Plank, „Ueber einen Satz der
statistichen Dynamik und eine
Erweiterung in der Quantumtheorie“,
Sitzungberichte der Preussischen
Akadademie der Wissenschaften
(1917) p. 324-341
Stochastic Processes
( ) ( ) ( )[ ] ( ) ( )[ ]txftxD
x
txftxD
x
txf
t
,,,,, 22
2
1
∂
∂
+
∂
∂
−=
∂
∂
( )[ ] ( )[ ]∑∑∑ = == ∂∂
∂
+
∂
∂
−=
∂
∂ N
i
N
j
Nji
ji
N
i
Ni
i
ftxxD
xx
ftxxD
x
f
t 1 1
1
2
2
1
1
1
,,,,,, 
Fokker – Planck Equation (continue – 1)
SOLO
The Fokker–Planck equation can be used for computing the probability densities of stochastic differential equations.
Consider the Itô stochastic differential equation
d X_t = μ(X_t, t) dt + σ(X_t, t) dW_t
where X_t is the state and W_t is a standard M-dimensional Wiener process. If the initial probability distribution is X_0 ~ f(x, 0), then the probability distribution f(x, t) of the state is given by the Fokker–Planck equation with the drift and diffusion terms D_1(x,t) = μ(x,t), D_2(x,t) = σ²(x,t)/2:

∂f(x,t)/∂t = −∂/∂x [D_1(x,t) f(x,t)] + ∂²/∂x² [D_2(x,t) f(x,t)]

Similarly, a Fokker–Planck equation can be derived for Stratonovich stochastic differential equations. In this case, noise-induced drift terms appear if the noise strength is state-dependent.
Fokker – Planck Equation (continue – 2)
Derivation of the Fokker–Planck Equation
SOLO
Start with
p(x_k, x_{k-1}) = p(x_k | x_{k-1}) p(x_{k-1})
and
p(x_k) = ∫ p(x_k, x_{k-1}) d x_{k-1} = ∫ p(x_k | x_{k-1}) p(x_{k-1}) d x_{k-1}
Define  t = t_k,  t − Δt = t_{k-1},  x(t) = x_k,  x(t − Δt) = x_{k-1}:
p_{x(t)}[x(t)] = ∫ p[x(t) | x(t−Δt)] p[x(t−Δt)] d x(t−Δt)
Let us use the Characteristic Function of p[x(t) | x(t−Δt)]:
Φ_{x(t)|x(t−Δt)}(s) := ∫ exp{−s [x(t) − x(t−Δt)]} p[x(t) | x(t−Δt)] d x(t)
The inverse transform is
p[x(t) | x(t−Δt)] = (1/2πj) ∫_{−j∞}^{+j∞} exp{s [x(t) − x(t−Δt)]} Φ_{x(t)|x(t−Δt)}(s) d s
Using the Chapman–Kolmogorov Equation we obtain:
p_{x(t)}[x(t)] = ∫ { (1/2πj) ∫_{−j∞}^{+j∞} exp{s [x(t) − x(t−Δt)]} Φ_{x(t)|x(t−Δt)}(s) d s } p[x(t−Δt)] d x(t−Δt)
Stochastic Processes
Fokker – Planck Equation (continue – 3)
Derivation of the Fokker–Planck Equation (continue – 1)
SOLO
The Characteristic Function can be expressed in terms of the moments about x(t−Δt) as:
Φ_{x(t)|x(t−Δt)}(s) = 1 + Σ_{i=1}^{∞} ((−s)^i / i!) E{[x(t) − x(t−Δt)]^i | x(t−Δt)}
Therefore
p_{x(t)}[x(t)] = (1/2πj) ∫∫ exp{s [x(t) − x(t−Δt)]} [ 1 + Σ_{i=1}^{∞} ((−s)^i / i!) E{[x(t) − x(t−Δt)]^i | x(t−Δt)} ] p[x(t−Δt)] d s d x(t−Δt)
Use the fact that
(1/2πj) ∫_{−j∞}^{+j∞} s^i exp{−s [x(t) − x(t−Δt)]} d s = (−1)^i ∂^i δ[x(t) − x(t−Δt)] / ∂x(t)^i,   i = 0, 1, 2, …
where δ[u] is the Dirac delta function:
δ[u] = (1/2πj) ∫_{−j∞}^{+j∞} exp{s u} d s,    ∫ F(u) δ(u) d u = F(0)  for any F continuous at u = 0
Stochastic Processes
Fokker – Planck Equation (continue – 4)
Derivation of the Fokker–Planck Equation (continue – 2)
SOLO
Useful results related to integrals involving the Delta (Dirac) function:
∫ f(u) δ(u − a) d u = f(a)
∫ f(u) (d/du) δ(u − a) d u = − d f(u)/d u |_{u=a}
∫ f(u) (d^i/du^i) δ(u − a) d u = (−1)^i d^i f(u)/d u^i |_{u=a}
(all follow from the transform representation of δ and integration by parts)
Stochastic Processes
Fokker – Planck Equation (continue – 5)
Derivation of the Fokker–Planck Equation (continue – 3)
SOLO
The i = 0 term gives
∫ δ[x(t) − x(t−Δt)] p[x(t−Δt)] d x(t−Δt) = p_{x(t−Δt)}[x(t)]
and, using the delta-derivative identity above, the remaining terms give
Σ_{i=1}^{∞} ((−1)^i / i!) (∂^i/∂x(t)^i) { E{[x(t) − x(t−Δt)]^i | x(t−Δt)} p_{x(t−Δt)}[x(t)] }
Therefore
p_{x(t)}[x(t)] = p_{x(t−Δt)}[x(t)] + Σ_{i=1}^{∞} ((−1)^i / i!) (∂^i/∂x(t)^i) { E{[x(t) − x(t−Δt)]^i | x(t−Δt)} p_{x(t−Δt)}[x(t)] }
Rearranging, dividing by Δt, and taking the limit Δt → 0, we obtain:
lim_{Δt→0} ( p_{x(t)}[x(t)] − p_{x(t−Δt)}[x(t)] ) / Δt = Σ_{i=1}^{∞} ((−1)^i / i!) (∂^i/∂x(t)^i) { lim_{Δt→0} (1/Δt) E{[x(t) − x(t−Δt)]^i | x(t−Δt)} p_{x(t−Δt)}[x(t)] }
Stochastic Processes
Fokker – Planck Equation (continue – 6)
Derivation of the Fokker–Planck Equation (continue – 4)
SOLO
Define:
m_i[x(t^−), t] := lim_{Δt→0} E{[x(t) − x(t−Δt)]^i | x(t−Δt)} / Δt,    x(t^−) := lim_{Δt→0} x(t−Δt)
Therefore
∂ p_{x(t)}[x(t)] / ∂t = Σ_{i=1}^{∞} ((−1)^i / i!) (∂^i/∂x(t)^i) { m_i[x(t^−), t] p_{x(t)}[x(t)] }
This equation is called the Stochastic Equation or Kinetic Equation.
It is a partial differential equation that we must solve, with the initial condition:
p_{x(t)}[x(t)] |_{t=0} = p_{x_0}[x(0)]
Stochastic Processes
Fokker – Planck Equation (continue – 7)
Derivation of the Fokker–Planck Equation (continue – 5)
SOLO
We want to find p_{x(t)}[x(t)] where x(t) is the solution of
d x(t)/d t = f[x(t), t] + n_g(t),   t ∈ [t_0, t_f]
where n_g(t) is a Wiener (Gauss) process:
n̂_g := E{n_g(t)} = 0,    E{[n_g(t) − n̂_g][n_g(τ) − n̂_g]} = Q(t) δ(t − τ)
Then
m_1[x(t^−), t] := lim_{Δt→0} E{[x(t) − x(t−Δt)] | x(t−Δt)} / Δt = E{ d x(t)/d t | x(t) } = f[x(t), t] + E{n_g} = f[x(t), t]
m_2[x(t^−), t] := lim_{Δt→0} E{[x(t) − x(t−Δt)]² | x(t−Δt)} / Δt = E{n_g²(t)} = Q(t)
m_i[x(t^−), t] = 0   for i > 2
Therefore we obtain:
∂ p_x[x(t), t] / ∂t = − ∂{ f[x(t), t] p_x[x(t), t] } / ∂x(t) + (1/2) Q(t) ∂² p_x[x(t), t] / ∂x(t)²
Stochastic Processes
Fokker–Planck Equation
Return to Daum
89
Recursive Bayesian EstimationSOLO
Given a nonlinear discrete stochastic Markovian system we want to use k discrete
measurements Z1:k={z1,z2,…,zk} to estimate the hidden state xk. For this we want to
compute the probability of xk given all the measurements Z1:k={z1,z2,…,zk} .
If we know p ( xk| Z1:k ) then xk is estimated using:
x̂_{k|k} := E{x_k | Z_{1:k}} = ∫ x_k p(x_k | Z_{1:k}) d x_k
P_{k|k} = E{(x_k − x̂_{k|k})(x_k − x̂_{k|k})^T | Z_{1:k}} = ∫ (x_k − x̂_{k|k})(x_k − x̂_{k|k})^T p(x_k | Z_{1:k}) d x_k
or, more generally, we can compute all moments of the probability distribution p(x_k | Z_{1:k}):
E{g(x_k) | Z_{1:k}} = ∫ g(x_k) p(x_k | Z_{1:k}) d x_k
Bayesian Estimation Introduction
Problem: Estimate the hidden States of a Non-linear Dynamic Stochastic System from Noisy Measurements.
[Figure: hidden Markov model: hidden states x_0, x_1, …, x_{k-1}, x_k evolve via f(x_{k-1}, w_{k-1}); measurements z_1, z_2, …, z_k are generated via h(x_k, v_k); measurement sets Z_{1:k-1}, Z_{1:k}]
The knowledge of p(x_k | Z_{1:k}) also allows the computation of the Maximum a Posteriori (MAP) estimate using:
x̂_k^{MAP} = arg max_{x_k} p(x_k | Z_{1:k})
90
Recursive Bayesian EstimationSOLO
To find the expression for p(x_k | Z_{1:k}) we use the theorem of joint probability (Bayes Rule):
p(x_k | Z_{1:k}) = p(x_k, Z_{1:k}) / p(Z_{1:k})
Since Z_{1:k} = {z_k, Z_{1:k-1}}:
p(x_k | Z_{1:k}) = p(x_k, z_k, Z_{1:k-1}) / p(z_k, Z_{1:k-1})
The numerator of this expression is
p(x_k, z_k, Z_{1:k-1}) = p(z_k | x_k, Z_{1:k-1}) p(x_k, Z_{1:k-1}) = p(z_k | x_k, Z_{1:k-1}) p(x_k | Z_{1:k-1}) p(Z_{1:k-1})     (Bayes Rule)
Since the knowledge of x_k supersedes the need for Z_{1:k-1} = {z_1, z_2, …, z_{k-1}}:
p(z_k | x_k, Z_{1:k-1}) ≡ p(z_k | x_k)
and the denominator is
p(z_k, Z_{1:k-1}) = p(z_k | Z_{1:k-1}) p(Z_{1:k-1})     (Bayes Rule)
Therefore:
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) p(Z_{1:k-1}) / [ p(z_k | Z_{1:k-1}) p(Z_{1:k-1}) ]
Bayesian Estimation Introduction
91
Recursive Bayesian EstimationSOLO
The final result is:
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) / p(z_k | Z_{1:k-1})
Since p(x_k | Z_{1:k}) is a probability distribution it must satisfy:
∫ p(x_k | Z_{1:k}) d x_k = 1
Therefore:
1 = ∫ p(x_k | Z_{1:k}) d x_k = ∫ p(z_k | x_k) p(x_k | Z_{1:k-1}) d x_k / p(z_k | Z_{1:k-1})
and:
p(z_k | Z_{1:k-1}) = ∫ p(z_k | x_k) p(x_k | Z_{1:k-1}) d x_k
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) / ∫ p(z_k | x_k) p(x_k | Z_{1:k-1}) d x_k
This is a recursive relation that needs the value of p(x_k | Z_{1:k-1}), assuming that p(z_k | x_k) is obtained from the Markovian system definition (z_k = h(x_k, v_k)).
Bayesian Estimation Introduction
92
Recursive Bayesian EstimationSOLO
The Correction Step is:
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) / p(z_k | Z_{1:k-1})
or:
posterior = (likelihood · prior) / evidence
likelihood: given by the observation model, p(z_k | x_k)
prior: given by the prediction equation, p(x_k | Z_{1:k-1})
evidence: the normalizing constant in the denominator,
p(z_k | Z_{1:k-1}) = ∫ p(z_k | x_k) p(x_k | Z_{1:k-1}) d x_k
Bayesian Estimation Introduction
93
Recursive Bayesian EstimationSOLO
Using (Bayes Rule):
p(x_k, x_{k-1} | Z_{1:k-1}) = p(x_k | x_{k-1}, Z_{1:k-1}) p(x_{k-1} | Z_{1:k-1})
Since for a Markov Process the knowledge of x_{k-1} supersedes the need for Z_{1:k-1} = {z_1, z_2, …, z_{k-1}}:
p(x_k | x_{k-1}, Z_{1:k-1}) = p(x_k | x_{k-1})
We obtain the Chapman – Kolmogorov Equation:
p(x_k | Z_{1:k-1}) = ∫ p(x_k, x_{k-1} | Z_{1:k-1}) d x_{k-1} = ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{1:k-1}) d x_{k-1}
Sydney Chapman 1888 – 1970
Andrey Nikolaevich Kolmogorov 1903 – 1987
Bayesian Estimation Introduction
94
Recursive Bayesian EstimationSOLO
0  Initialize with p(x_0);  state prediction  x̂_{k|k-1} = f_{k-1}(x̂_{k-1|k-1})
At stage k:
1  Prediction phase (before z_k measurement): using p(x_{k-1} | Z_{1:k-1}) from time-step k-1 and p(x_k | x_{k-1}) of the Markov system, compute
p(x_k | Z_{1:k-1}) = ∫ p(x_k, x_{k-1} | Z_{1:k-1}) d x_{k-1} = ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{1:k-1}) d x_{k-1}
2  Correction Step (after z_k measurement): using p(x_k | Z_{1:k-1}) from the Prediction phase and p(z_k | x_k) of the Markov system, compute
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) / ∫ p(z_k | x_k) p(x_k | Z_{1:k-1}) d x_k
3  Filtering:
x̂_{k|k} = E{x_k | Z_{1:k}} = ∫ x_k p(x_k | Z_{1:k}) d x_k
P_{k|k} = E{(x_k − x̂_{k|k})(x_k − x̂_{k|k})^T | Z_{1:k}} = ∫ (x_k − x̂_{k|k})(x_k − x̂_{k|k})^T p(x_k | Z_{1:k}) d x_k
k := k+1
Bayesian Estimation Introduction - Summary
95
Recursive Bayesian EstimationSOLO
1  Prediction phase (before z_k measurement):
p(x_k | Z_{1:k-1}) = ∫ p(x_k, x_{k-1} | Z_{1:k-1}) d x_{k-1} = ∫ p(x_k | x_{k-1}) p(x_{k-1} | Z_{1:k-1}) d x_{k-1}
2  Correction Step (after z_k measurement):
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) / ∫ p(z_k | x_k) p(x_k | Z_{1:k-1}) d x_k
Bayesian Estimation Introduction - Summary
This is a Conceptual Solution because the Integrals are Often Not Tractable.
An optimal solution is possible for some restricted cases (a grid-based sketch is given below):
• Linear Systems with Gaussian Noises (system and measurements)
• Grid-Based Filters
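A minimal grid-based sketch of the prediction/correction recursion above for a scalar state; the transition model, measurement model and noise levels are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)
grid = np.linspace(-10, 10, 401)          # discretized state space x_k
dx = grid[1] - grid[0]

def gauss(u, s):
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2 * np.pi))

f = lambda x: 0.9 * x                     # assumed state transition
h = lambda x: x                           # assumed measurement model
q, r = 1.0, 0.5                           # assumed noise standard deviations

# p(x_k | x_{k-1}) tabulated on the grid, and initial prior p(x_0)
trans = gauss(grid[:, None] - f(grid[None, :]), q)   # rows: x_k, cols: x_{k-1}
post = gauss(grid - 0.0, 2.0)

x_true = 3.0
for k in range(1, 6):
    x_true = f(x_true) + rng.normal(0, q)
    z = h(x_true) + rng.normal(0, r)

    prior = trans @ post * dx                         # Chapman-Kolmogorov (prediction)
    like = gauss(z - h(grid), r)                      # p(z_k | x_k)
    post = like * prior
    post /= np.trapz(post, grid)                      # divide by the evidence

    print(f"k={k}: truth={x_true:+.2f}, posterior mean={np.trapz(grid * post, grid):+.2f}")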
Table of Content
96
SOLO
Linear Gaussian Systems
A Linear Combination of Independent Gaussian random variables is also a Gaussian random variable:   S_m := a_1 X_1 + a_2 X_2 + … + a_m X_m
Gaussian distribution:
p_{X_i}(X_i; μ_i, σ_i) = (1/√(2π σ_i²)) exp[ −(X_i − μ_i)² / (2σ_i²) ]
Define the Moment-Generating (Characteristic) Function:
Φ_{X_i}(ω) := ∫ exp(jωX_i) p_{X_i}(X_i) d X_i = exp( jωμ_i − ω²σ_i²/2 )
Proof: define Y_i := a_i X_i  →  p_{Y_i}(Y_i) = (1/|a_i|) p_{X_i}(Y_i/a_i)
Φ_{Y_i}(ω) := ∫ exp(jωY_i) p_{Y_i}(Y_i) d Y_i = ∫ exp(jω a_i X_i) p_{X_i}(X_i) d X_i = exp( jω a_i μ_i − ω² a_i² σ_i²/2 )
Since Y_1, …, Y_m are independent:
Φ_{S_m}(ω) = E{exp[jω(Y_1 + … + Y_m)]} = Φ_{Y_1}(ω) · Φ_{Y_2}(ω) ⋯ Φ_{Y_m}(ω)
           = exp[ jω(a_1μ_1 + a_2μ_2 + … + a_mμ_m) − ω²(a_1²σ_1² + a_2²σ_2² + … + a_m²σ_m²)/2 ]
Review of Probability
97
SOLO
Linear Gaussian Systems (continue – 1)
A Linear Combination of Independent Gaussian random variables is also a Gaussian random variable:   S_m := a_1 X_1 + a_2 X_2 + … + a_m X_m
Proof (continue – 1): we found
Φ_{S_m}(ω) = exp[ jω(a_1μ_1 + … + a_mμ_m) − ω²(a_1²σ_1² + … + a_m²σ_m²)/2 ]
Therefore the Linear Combination of Independent Gaussian Random Variables is a Gaussian Random Variable with
μ_{S_m} = a_1μ_1 + a_2μ_2 + … + a_mμ_m
σ²_{S_m} = a_1²σ_1² + a_2²σ_2² + … + a_m²σ_m²
Therefore the S_m probability distribution is:
p_{S_m}(S_m; μ_{S_m}, σ_{S_m}) = (1/√(2π σ²_{S_m})) exp[ −(S_m − μ_{S_m})² / (2σ²_{S_m}) ]
q.e.d.
Review of Probability
98
Recursive Bayesian EstimationSOLO
Linear Gaussian Markov Systems (continue – 2)
A Linear Gaussian Markov System is defined as
x_k = f(k−1, x_{k-1}, u_{k-1}, w_{k-1})   →   x_k = Φ_{k-1} x_{k-1} + G_{k-1} u_{k-1} + Γ_{k-1} w_{k-1}
z_k = h(k, x_k, u_k, v_k)                 →   z_k = H_k x_k + v_k
w_{k-1} and v_k are white noises, zero mean, Gaussian, independent:
e_x(k) := x(k) − E{x(k)},    E{e_x(k) e_x^T(k)} = P_x(k)
e_w(k) := w(k) − E{w(k)} = w(k),    E{e_w(k) e_w^T(l)} = Q(k) δ_{k,l}
e_v(k) := v(k) − E{v(k)} = v(k),    E{e_v(k) e_v^T(l)} = R(k) δ_{k,l}
E{e_w(k) e_v^T(l)} = 0,     δ_{k,l} = 1 if k = l, 0 if k ≠ l
p_w(w) = N(w; 0, Q) = (1/((2π)^{n/2} |Q|^{1/2})) exp( −½ w^T Q^{-1} w )
p_v(v) = N(v; 0, R) = (1/((2π)^{p/2} |R|^{1/2})) exp( −½ v^T R^{-1} v )
p_{x_0}(x_{t=0}) = N(x_0; x̄_0, P_{0|0}) = (1/((2π)^{n/2} |P_{0|0}|^{1/2})) exp[ −½ (x_0 − x̄_0)^T P_{0|0}^{-1} (x_0 − x̄_0) ]
99
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 3)
x_k = Φ_{k-1} x_{k-1} + G_{k-1} u_{k-1} + Γ_{k-1} w_{k-1}
Prediction phase (before z_k measurement)
The expectation is
x̂_{k|k-1} := E{x_k | Z_{1:k-1}} = Φ_{k-1} E{x_{k-1} | Z_{1:k-1}} + G_{k-1} u_{k-1} + Γ_{k-1} E{w_{k-1} | Z_{1:k-1}}   (the last term is 0)
or    x̂_{k|k-1} = Φ_{k-1} x̂_{k-1|k-1} + G_{k-1} u_{k-1}
P_{k|k-1} := E{ [x_k − x̂_{k|k-1}][x_k − x̂_{k|k-1}]^T | Z_{1:k-1} }
           = E{ [Φ_{k-1}(x_{k-1} − x̂_{k-1|k-1}) + Γ_{k-1} w_{k-1}][Φ_{k-1}(x_{k-1} − x̂_{k-1|k-1}) + Γ_{k-1} w_{k-1}]^T | Z_{1:k-1} }
           = Φ_{k-1} P_{k-1|k-1} Φ_{k-1}^T + Γ_{k-1} Q_{k-1} Γ_{k-1}^T     (the cross-terms vanish since E{(x_{k-1} − x̂_{k-1|k-1}) w_{k-1}^T} = 0)
Since x_k = Φ_{k-1} x_{k-1} + G_{k-1} u_{k-1} + Γ_{k-1} w_{k-1} is a Linear Combination of Independent Gaussian Random Variables:
p(x_k | Z_{1:k-1}) = N(x_k; x̂_{k|k-1}, P_{k|k-1})
100
SOLO
For the particular vector measurement equation  z_k = H_k x_k + v_k, where the measurement noise is Gaussian (normal) with zero mean, p_v(v) = N(v; 0, R_k), and independent of x_k, the conditional probability p_{z|x}(z | x) can be written, using Bayes rule, as:
p_{z|x}(z | x) = p_{x,z}(x, z) / p_x(x)
The measurement noise can be related to x and z by the function  v = z − H x = f(x, z), whose Jacobian with respect to z is
J = ∂f/∂z = I_{p×p}
so that the joint probability of x and z is given by
p_{x,z}(x, z) = p_{x,v}(x, v) / |J J^T|^{1/2} = p_{x,v}(x, v)
Since the measurement noise v is independent of x:
p_{x,z}(x, z) = p_{x,v}(x, v) = p_x(x) · p_v(v)
Recursive Bayesian Estimation
Linear Gaussian Markov Systems (continue – 4)
Correction Step (after z_k measurement) - 1st Way
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) / p(z_k | Z_{1:k-1})
101
Consider a Gaussian vector x_k, with p_x(x_k) = N(x_k; x̂_{k|k-1}, P_{k|k-1}), and the measurement z_k = H_k x_k + v_k, where the Gaussian noise v, p_v(v) = N(v; 0, R_k), is independent of x_k. Then
p_z(z_k) = ∫ p_{x,z}(x_k, z_k) d x_k = ∫ p_{z|x}(z_k | x_k) p_x(x_k) d x_k
p_z(z_k) is Gaussian with
E(z_k) = E(H_k x_k + v_k) = H_k E(x_k) + E(v_k) = H_k x̂_{k|k-1}
cov(z_k) = E{[z_k − E(z_k)][z_k − E(z_k)]^T} = E{[H_k(x_k − x̂_{k|k-1}) + v_k][H_k(x_k − x̂_{k|k-1}) + v_k]^T} = H_k P_{k|k-1} H_k^T + R_k
(the cross-terms vanish since v_k is independent of x_k), so that
p_z(z) = (1/((2π)^{p/2} |H P H^T + R|^{1/2})) exp{ −½ [z − H x̂]^T [H P H^T + R]^{-1} [z − H x̂] }
p_{x|Z}(x_k | Z_{1:k-1}) = (1/((2π)^{n/2} |P_{k|k-1}|^{1/2})) exp{ −½ (x_k − x̂_{k|k-1})^T P_{k|k-1}^{-1} (x_k − x̂_{k|k-1}) }
p_{z|x}(z_k | x_k) = p_v(z_k − H_k x_k) = (1/((2π)^{p/2} |R_k|^{1/2})) exp{ −½ (z_k − H_k x_k)^T R_k^{-1} (z_k − H_k x_k) }
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 5)
Correction Step (after z_k measurement) 1st Way (continue – 1)
102
from which
p(x_k | Z_{1:k}) = p(z_k | x_k) p(x_k | Z_{1:k-1}) / p(z_k | Z_{1:k-1})
= [ |H_k P_{k|k-1} H_k^T + R_k|^{1/2} / ((2π)^{n/2} |R_k|^{1/2} |P_{k|k-1}|^{1/2}) ] ·
  exp{ −½ [ (z_k − H_k x_k)^T R_k^{-1} (z_k − H_k x_k) + (x_k − x̂_{k|k-1})^T P_{k|k-1}^{-1} (x_k − x̂_{k|k-1}) − (z_k − H_k x̂_{k|k-1})^T [H_k P_{k|k-1} H_k^T + R_k]^{-1} (z_k − H_k x̂_{k|k-1}) ] }
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 6)
Correction Step (after z_k measurement) 1st Way (continue – 2)
103
Consider the exponent
q := (z_k − H_k x_k)^T R_k^{-1} (z_k − H_k x_k) + (x_k − x̂_{k|k-1})^T P_{k|k-1}^{-1} (x_k − x̂_{k|k-1}) − (z_k − H_k x̂_{k|k-1})^T [R_k + H_k P_{k|k-1} H_k^T]^{-1} (z_k − H_k x̂_{k|k-1})
Using the Matrix Inverse Lemma:
[R_k + H_k P_{k|k-1} H_k^T]^{-1} = R_k^{-1} − R_k^{-1} H_k [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1} H_k^T R_k^{-1}
Define:
P_{k|k} := [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1} = P_{k|k-1} − P_{k|k-1} H_k^T [R_k + H_k P_{k|k-1} H_k^T]^{-1} H_k P_{k|k-1}     (Matrix Inverse Lemma)
Completing the square in x_k, the exponent becomes
q = [x_k − x̂_{k|k-1} − P_{k|k} H_k^T R_k^{-1} (z_k − H_k x̂_{k|k-1})]^T P_{k|k}^{-1} [x_k − x̂_{k|k-1} − P_{k|k} H_k^T R_k^{-1} (z_k − H_k x̂_{k|k-1})]
so that
p_{x|z}(x_k | Z_{1:k}) = (1/((2π)^{n/2} |P_{k|k}|^{1/2})) exp{ −½ [x_k − x̂_{k|k-1} − P_{k|k} H_k^T R_k^{-1} (z_k − H_k x̂_{k|k-1})]^T P_{k|k}^{-1} [x_k − x̂_{k|k-1} − P_{k|k} H_k^T R_k^{-1} (z_k − H_k x̂_{k|k-1})] }
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 7)
Correction Step (after z_k measurement) 1st Way (continue – 3)
104
The maximum of p_{x|z}(x_k | Z_{1:k}) over x_k is attained at
x̂_{k|k} := x_k* = x̂_{k|k-1} + P_{k|k} H_k^T R_k^{-1} (z_k − H_k x̂_{k|k-1}) = E{x_k | Z_{1:k}}
where:
P_{k|k} := [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1} = E{(x_k − x̂_{k|k})(x_k − x̂_{k|k})^T | Z_{1:k}}
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 8)
Correction Step (after z_k measurement) 1st Way (continue – 4)
105
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 9)
Summary 1st Way – Kalman Filter
Initial Conditions:    x̂_{0|0} = E{x_0},    P_{0|0} := E{(x_0 − x̂_{0|0})(x_0 − x̂_{0|0})^T}
Prediction phase (before z_k measurement):
x̂_{k|k-1} = Φ_{k-1} x̂_{k-1|k-1} + G_{k-1} u_{k-1}
P_{k|k-1} = Φ_{k-1} P_{k-1|k-1} Φ_{k-1}^T + Γ_{k-1} Q_{k-1} Γ_{k-1}^T
ẑ_{k|k-1} = E{z_k | Z_{1:k-1}} = E{H_k x_k + v_k | Z_{1:k-1}} = H_k x̂_{k|k-1}
Correction Step (after z_k measurement):
P_{k|k} := [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1}
K_k := P_{k|k} H_k^T R_k^{-1}
x̂_{k|k} = E{x_k | Z_{1:k}} = x̂_{k|k-1} + K_k (z_k − H_k x̂_{k|k-1}) = x̂_{k|k-1} + K_k (z_k − ẑ_{k|k-1})
106
Recursive Bayesian EstimationSOLO
Linear Gaussian Markov Systems (continue – 10)
z_k = H_k x_k + v_k,    p_v(v) = N(v; 0, R) = (1/((2π)^{p/2} |R|^{1/2})) exp( −½ v^T R^{-1} v )
p_z(z_k) = (1/((2π)^{p/2} |H_k P_{k|k-1} H_k^T + R_k|^{1/2})) exp{ −½ [z_k − H_k x̂_{k|k-1}]^T [H_k P_{k|k-1} H_k^T + R_k]^{-1} [z_k − H_k x̂_{k|k-1}] }
from which    ẑ_{k|k-1} = E{z_k | Z_{1:k-1}} = H_k x̂_{k|k-1}
P^{zz}_{k|k-1} = E{(z_k − ẑ_{k|k-1})(z_k − ẑ_{k|k-1})^T | Z_{1:k-1}} = H_k P_{k|k-1} H_k^T + R_k =: S_k
We also have
P^{xz}_{k|k-1} = E{(x_k − x̂_{k|k-1})(z_k − ẑ_{k|k-1})^T | Z_{1:k-1}} = E{(x_k − x̂_{k|k-1})[H_k(x_k − x̂_{k|k-1}) + v_k]^T | Z_{1:k-1}} = P_{k|k-1} H_k^T
Correction Step (after z_k measurement) 2nd Way
Define the innovation:    i_k := z_k − ẑ_{k|k-1} = z_k − H_k x̂_{k|k-1}
107
Recursive Bayesian Estimation
SOLO
Joint and Conditional Gaussian Random Variables
2nd Way (continue – 1)
Define  y_k = [x_k; z_k], assumed to be Gaussian distributed. Then
E{y_k | Z_{1:k-1}} = [ E{x_k | Z_{1:k-1}}; E{z_k | Z_{1:k-1}} ] = [ x̂_{k|k-1}; ẑ_{k|k-1} ]
P^{yy}_{k|k-1} = E{ [x_k − x̂_{k|k-1}; z_k − ẑ_{k|k-1}] [x_k − x̂_{k|k-1}; z_k − ẑ_{k|k-1}]^T | Z_{1:k-1} } = [ P^{xx}_{k|k-1}  P^{xz}_{k|k-1} ; P^{zx}_{k|k-1}  P^{zz}_{k|k-1} ]
where:
P^{xx}_{k|k-1} = E{(x_k − x̂_{k|k-1})(x_k − x̂_{k|k-1})^T | Z_{1:k-1}} = P_{k|k-1}
P^{zz}_{k|k-1} = E{(z_k − ẑ_{k|k-1})(z_k − ẑ_{k|k-1})^T | Z_{1:k-1}} = H_k P_{k|k-1} H_k^T + R_k =: S_k
P^{xz}_{k|k-1} = E{(x_k − x̂_{k|k-1})(z_k − ẑ_{k|k-1})^T | Z_{1:k-1}} = P_{k|k-1} H_k^T
Linear Gaussian Markov Systems (continue – 11)
108
2nd Way (continue – 2): the joint and conditional probability density functions (pdf) of x_k and z_k are
p_{x,z}(x_k, z_k | Z_{1:k-1}) = (1/((2π)^{(n+p)/2} |P^{yy}_{k|k-1}|^{1/2})) exp{ −½ (y_k − ŷ_{k|k-1})^T (P^{yy}_{k|k-1})^{-1} (y_k − ŷ_{k|k-1}) }
p_z(z_k | Z_{1:k-1}) = (1/((2π)^{p/2} |P^{zz}_{k|k-1}|^{1/2})) exp{ −½ (z_k − ẑ_{k|k-1})^T (P^{zz}_{k|k-1})^{-1} (z_k − ẑ_{k|k-1}) }
p_{x|z}(x_k | z_k, Z_{1:k-1}) = p_{x,z}(x_k, z_k | Z_{1:k-1}) / p_z(z_k | Z_{1:k-1})
= ( |P^{zz}_{k|k-1}|^{1/2} / ((2π)^{n/2} |P^{yy}_{k|k-1}|^{1/2}) ) exp{ −½ [ (y_k − ŷ_{k|k-1})^T (P^{yy}_{k|k-1})^{-1} (y_k − ŷ_{k|k-1}) − (z_k − ẑ_{k|k-1})^T (P^{zz}_{k|k-1})^{-1} (z_k − ẑ_{k|k-1}) ] }
Linear Gaussian Markov Systems (continue – 12)
109
2nd Way (continue – 3): define   ξ_k := x_k − x̂_{k|k-1},   ζ_k := z_k − ẑ_{k|k-1}   and
[ T^{xx}  T^{xz} ; T^{zx}  T^{zz} ]_{k|k-1} := [ P^{xx}  P^{xz} ; P^{zx}  P^{zz} ]^{-1}_{k|k-1}
Then the exponent is
q := (y_k − ŷ_{k|k-1})^T (P^{yy}_{k|k-1})^{-1} (y_k − ŷ_{k|k-1}) − ζ_k^T (P^{zz}_{k|k-1})^{-1} ζ_k
   = ξ_k^T T^{xx}_{k|k-1} ξ_k + ξ_k^T T^{xz}_{k|k-1} ζ_k + ζ_k^T T^{zx}_{k|k-1} ξ_k + ζ_k^T T^{zz}_{k|k-1} ζ_k − ζ_k^T (P^{zz}_{k|k-1})^{-1} ζ_k
Linear Gaussian Markov Systems (continue – 13)
110
2nd Way (continue – 4): using the Inverse Matrix Lemma for the partitioned matrix
[ A  B ; D  C ]^{-1} = [ (A − B C^{-1} D)^{-1}   −(A − B C^{-1} D)^{-1} B C^{-1} ; −C^{-1} D (A − B C^{-1} D)^{-1}   C^{-1} + C^{-1} D (A − B C^{-1} D)^{-1} B C^{-1} ]
in  [ T^{xx}  T^{xz} ; T^{zx}  T^{zz} ]_{k|k-1} = [ P^{xx}  P^{xz} ; P^{zx}  P^{zz} ]^{-1}_{k|k-1}  gives
T^{xx}_{k|k-1} = [ P^{xx}_{k|k-1} − P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1} P^{zx}_{k|k-1} ]^{-1}
T^{xz}_{k|k-1} = −T^{xx}_{k|k-1} P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1}
(P^{zz}_{k|k-1})^{-1} = T^{zz}_{k|k-1} − T^{zx}_{k|k-1} (T^{xx}_{k|k-1})^{-1} T^{xz}_{k|k-1}
Substituting and completing the square (the remaining ζ_k terms cancel):
q = [ξ_k + (T^{xx}_{k|k-1})^{-1} T^{xz}_{k|k-1} ζ_k]^T T^{xx}_{k|k-1} [ξ_k + (T^{xx}_{k|k-1})^{-1} T^{xz}_{k|k-1} ζ_k]
Linear Gaussian Markov Systems (continue – 14)
111
2nd Way (continue – 5): from the relations above   (T^{xx}_{k|k-1})^{-1} T^{xz}_{k|k-1} = −P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1},   so with ξ_k, ζ_k as defined:
ξ_k + (T^{xx}_{k|k-1})^{-1} T^{xz}_{k|k-1} ζ_k = x_k − x̂_{k|k-1} − K_k (z_k − ẑ_{k|k-1}),       K_k := P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1}
Hence
p_{x|z}(x_k | z_k) = ( |P^{zz}_{k|k-1}|^{1/2} / ((2π)^{n/2} |P^{yy}_{k|k-1}|^{1/2}) ) exp{ −½ [x_k − x̂_{k|k-1} − K_k (z_k − ẑ_{k|k-1})]^T T^{xx}_{k|k-1} [x_k − x̂_{k|k-1} − K_k (z_k − ẑ_{k|k-1})] }
Linear Gaussian Markov Systems (continue – 15)
112
Recursive Bayesian Estimation
SOLO
Joint and Conditional Gaussian Random Variables
2nd Way (continue – 6)
From this we can see that
x̂_{k|k} = E{x_k | z_k} = x̂_{k|k-1} + K_k (z_k − ẑ_{k|k-1}),       K_k := P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1}
P^{xx}_{k|k} = E{(x_k − x̂_{k|k})(x_k − x̂_{k|k})^T | Z_{1:k}} = (T^{xx}_{k|k-1})^{-1} = P^{xx}_{k|k-1} − P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1} P^{zx}_{k|k-1} = P_{k|k-1} − K_k P^{zz}_{k|k-1} K_k^T
with
P^{xx}_{k|k-1} = P_{k|k-1},    P^{zz}_{k|k-1} = H_k P_{k|k-1} H_k^T + R_k =: S_k,    P^{xz}_{k|k-1} = P_{k|k-1} H_k^T
Linear Gaussian Markov Systems (continue – 16)
113
Substituting these expressions:
P_{k|k} = P_{k|k-1} − P_{k|k-1} H_k^T [R_k + H_k P_{k|k-1} H_k^T]^{-1} H_k P_{k|k-1} = [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1}
K_k = P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1} = P_{k|k-1} H_k^T [R_k + H_k P_{k|k-1} H_k^T]^{-1} = P_{k|k-1} H_k^T S_k^{-1}
or
P_{k|k} = P_{k|k-1} − K_k S_k K_k^T
Linear Gaussian Markov Systems (continue – 17)
114
Relation Between 1st and 2nd Ways
We found that the optimal K_k is (2nd Way):
K_k = P_{k|k-1} H_k^T [R_k + H_k P_{k|k-1} H_k^T]^{-1}
If R_k^{-1} and P_{k|k-1}^{-1} exist, the Matrix Inverse Lemma gives
[R_k + H_k P_{k|k-1} H_k^T]^{-1} = R_k^{-1} − R_k^{-1} H_k [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1} H_k^T R_k^{-1}
and carrying out the multiplication:
K_k = [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1} H_k^T R_k^{-1} = P_{k|k} H_k^T R_k^{-1}
which is the gain of the 1st Way, so    1st Way = 2nd Way
Linear Gaussian Markov Systems (continue – 18)
115
Innovation
The innovation is the quantity:    i_k := z_k − H_k x̂_{k|k-1} = z_k − ẑ_{k|k-1}
We found that:
E{i_k | Z_{1:k-1}} = E{(z_k − ẑ_{k|k-1}) | Z_{1:k-1}} = E{z_k | Z_{1:k-1}} − ẑ_{k|k-1} = 0
E{(z_k − ẑ_{k|k-1})(z_k − ẑ_{k|k-1})^T | Z_{1:k-1}} = E{i_k i_k^T | Z_{1:k-1}} = H_k P_{k|k-1} H_k^T + R_k =: S_k
Using the smoothing property of the expectation,
E_{X,Y}{X} = E_Y{ E_{X|Y}{X | Y} }     (since ∫∫ x p_{X|Y}(x | y) p_Y(y) d x d y = ∫ x p_X(x) d x = E{X})
we have:    E{i_k i_j^T} = E{ E{i_k i_j^T | Z_{1:k-1}} }
Assuming, without loss of generality, that k−1 ≥ j, the innovation i_j is a function of Z_{1:j} ⊆ Z_{1:k-1}, so it can be taken outside the inner expectation:
E{i_k i_j^T} = E{ E{i_k | Z_{1:k-1}} i_j^T } = 0     (k ≠ j)
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 19)
116
Innovation (continue – 1)
The innovation is the quantity:    i_k := z_k − H_k x̂_{k|k-1} = z_k − ẑ_{k|k-1}
We found that:
E{i_k | Z_{1:k-1}} = 0,     E{i_k i_k^T | Z_{1:k-1}} = H_k P_{k|k-1} H_k^T + R_k =: S_k,     E{i_k i_j^T} = 0 for k ≠ j,   i.e.   E{i_k i_j^T} = S_k δ_{kj}
Thus the innovation sequence is zero mean and white for the Kalman (Optimal) Filter.
The uncorrelatedness property of the innovations implies that, since they are Gaussian, the innovations are independent of each other and thus the innovation sequence is Strictly White.
Without the Gaussian assumption, the innovation sequence is Wide-Sense White.
Recursive Bayesian Estimation
SOLO
Linear Gaussian Markov Systems (continue – 20)
Table of Content
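A small sketch of the consistency test implied above: for a correctly tuned Kalman filter the normalized innovations should be zero-mean with unit variance and nearly uncorrelated at non-zero lags; the scalar model parameters below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(2)
phi, hm, q, r = 0.95, 1.0, 0.1, 0.5        # assumed scalar model parameters

x_true, x_hat, P = 0.0, 0.0, 1.0
innovations, S_list = [], []
for _ in range(2000):
    x_true = phi * x_true + rng.normal(0, np.sqrt(q))
    z = hm * x_true + rng.normal(0, np.sqrt(r))
    # prediction
    x_hat, P = phi * x_hat, phi * P * phi + q
    # innovation i_k and its covariance S_k
    i_k = z - hm * x_hat
    S = hm * P * hm + r
    K = P * hm / S
    x_hat, P = x_hat + K * i_k, P - K * S * K
    innovations.append(i_k); S_list.append(S)

i = np.array(innovations)
norm = i / np.sqrt(np.array(S_list))        # normalized innovations, ideally N(0, 1)
lag1 = np.corrcoef(norm[:-1], norm[1:])[0, 1]
print("mean:", norm.mean(), "var:", norm.var(), "lag-1 corr:", lag1)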
117
Recursive Bayesian EstimationSOLO
Closed-Form Solutions of Estimation
Closed-Form solutions for the Optimal Recursive Bayesian Estimation can be derived only for special cases.
The most important case:
• Dynamic and measurement models are linear:
x_k = f(k−1, x_{k-1}, u_{k-1}, w_{k-1})   →   x_k = Φ_{k-1} x_{k-1} + G_{k-1} u_{k-1} + Γ_{k-1} w_{k-1}
z_k = h(k, x_k, u_k, v_k)                 →   z_k = H_k x_k + v_k
• Random noises are Gaussian:
p_w(w) = N(w; 0, Q) = (1/((2π)^{n/2} |Q|^{1/2})) exp( −½ w^T Q^{-1} w )
p_v(v) = N(v; 0, R) = (1/((2π)^{p/2} |R|^{1/2})) exp( −½ v^T R^{-1} v )
• Solution: KALMAN FILTER
• In other non-linear/non-Gaussian cases: USE APPROXIMATIONS
118
Recursive Bayesian Estimation
SOLO
Closed-Form Solutions of Estimation (continue – 1)
• Dynamic and measurement models are linear:
x_k = Φ_{k-1} x_{k-1} + G_{k-1} u_{k-1} + Γ_{k-1} w_{k-1}
z_k = H_k x_k + v_k
e_x(k) := x(k) − E{x(k)},    E{e_x(k) e_x^T(k)} = P_x(k)
e_w(k) := w(k) − E{w(k)},    E{e_w(k) e_w^T(l)} = Q(k) δ_{k,l}
e_v(k) := v(k) − E{v(k)},    E{e_v(k) e_v^T(l)} = R(k) δ_{k,l}
E{e_w(k) e_v^T(l)} = 0,     δ_{k,l} = 1 if k = l, 0 if k ≠ l
• The Optimal Estimator is the Kalman Filter, developed by R. E. Kalman in 1960.    Rudolf E. Kalman (1920 – )
• The Kalman Filter is an Optimal Estimator in the Minimum Mean Square Error (MMSE) sense if:
  - the state and measurement models are linear
  - the random elements are Gaussian
• Under those conditions, the covariance matrix is:
  - independent of the state (it can be calculated off-line)
  - equal to the Cramér – Rao lower bound
Table of Content
119
Kalman Filter
State Estimation in a Linear System (one cycle)
SOLO
0  Initialization:    x̂_0 = E{x_0},    P_{0|0} = E{(x_0 − x̂_0)(x_0 − x̂_0)^T};    k := k+1
1  State vector prediction:    x̂_{k|k-1} = Φ_{k-1} x̂_{k-1|k-1} + G_{k-1} u_{k-1}
2  Covariance matrix extrapolation:    P_{k|k-1} = Φ_{k-1} P_{k-1|k-1} Φ_{k-1}^T + Q_{k-1}
3  Innovation Covariance:    S_k = H_k P_{k|k-1} H_k^T + R_k
4  Gain Matrix Computation:    K_k = P_{k|k-1} H_k^T S_k^{-1}
5  Measurement & Innovation:    i_k = z_k − H_k x̂_{k|k-1} = z_k − ẑ_{k|k-1}
6  Filtering:    x̂_{k|k} = x̂_{k|k-1} + K_k i_k
7  Covariance matrix updating:
P_{k|k} = P_{k|k-1} − P_{k|k-1} H_k^T S_k^{-1} H_k P_{k|k-1}
        = P_{k|k-1} − K_k S_k K_k^T
        = (I − K_k H_k) P_{k|k-1}
        = (I − K_k H_k) P_{k|k-1} (I − K_k H_k)^T + K_k R_k K_k^T
120
Kalman Filter
State Estimation in a Linear System (one cycle)
SOLO
[Figure: tracking-system block diagram: Input Data → Sensor Data Processing and Measurement Formation → Observation-to-Track Association → Track Maintenance (Initialization, Confirmation and Deletion) → Filtering and Prediction → Gating Computations]
Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
Rudolf E. Kalman (1920 – )
121
SOLO
General Bayesian Nonlinear Filters
• Additive Gaussian Noise:
  - Extended Kalman Filter (EKF)
  - Gauss Hermite Kalman Filter (GHKF)
  - Unscented Kalman Filter (UKF)
  - Monte Carlo Kalman Filter (MCKF)
• Non-Additive Non-Gaussian Noise (Particle Filters):
  - Non-Resampling Particle Filters: Gaussian Particle Filter (GPF), Gauss Hermite Particle Filter (GHPF), Unscented Particle Filter (UPF), Monte Carlo Particle Filter (MCPF)
  - Resampling Particle Filters: Sequential Importance Sampling Particle Filter (SIS PF), Bootstrap Particle Filter (BPF)
Recursive Bayesian Estimation
Table of Content
122
Extended Kalman Filter
SOLO
In the extended Kalman filter (EKF) the state transition and observation models need not be linear functions of the state but may instead be (differentiable) functions.
State vector dynamics:    x(k+1) = f[x(k), u(k), k] + w(k)
Measurements:             z(k+1) = h[x(k+1), u(k+1), k+1] + v(k+1)
e_x(k) := x(k) − E{x(k)},    E{e_x(k) e_x^T(k)} = P_x(k)
e_w(k) := w(k) − E{w(k)},    E{e_w(k) e_w^T(l)} = Q(k) δ_{k,l}
E{e_w(k) e_v^T(l)} = 0  ∀ k, l;     δ_{k,l} = 1 if k = l, 0 if k ≠ l
The function f can be used to compute the predicted state from the previous estimate and similarly the function h can be used to compute the predicted measurement from the predicted state. However, f and h cannot be applied to the covariance directly. Instead a matrix of partial derivatives (the Jacobian) is computed.
Taylor's Expansion:
e_x(k+1) = f[x(k), u(k), k] − f[E{x(k)}, u(k), k] + w(k) = (∂f/∂x)|_{E{x(k)}} e_x(k) + ½ e_x^T(k) (∂²f/∂x²)|_{E{x(k)}} e_x(k) + … + w(k)
e_z(k+1) = h[x(k+1), u(k+1), k+1] − h[E{x(k+1)}, u(k+1), k+1] + v(k+1) = (∂h/∂x)|_{E{x(k+1)}} e_x(k+1) + ½ e_x^T(k+1) (∂²h/∂x²)|_{E{x(k+1)}} e_x(k+1) + … + v(k+1)
where ∂f/∂x, ∂h/∂x are the Jacobians and ∂²f/∂x², ∂²h/∂x² the Hessians of f and h.
123
Extended Kalman Filter
State Estimation (one cycle)
SOLO
0  Initialization (k = 0):    x̂_0 = E{x_0},    P_{0|0} = E{(x_0 − x̂_0)(x_0 − x̂_0)^T}
1  State vector prediction:    x̂_{k|k-1} = f(k−1, x̂_{k-1|k-1}, u_{k-1})
2  Jacobians Computation:    Φ_{k-1} = (∂f/∂x)|_{x̂_{k-1|k-1}},     H_k = (∂h/∂x)|_{x̂_{k|k-1}}
3  Covariance matrix extrapolation:    P_{k|k-1} = Φ_{k-1} P_{k-1|k-1} Φ_{k-1}^T + Q_{k-1}
4  Innovation Covariance:    S_k = H_k P_{k|k-1} H_k^T + R_k
5  Gain Matrix Computation:    K_k = P_{k|k-1} H_k^T S_k^{-1}
6  Measurement & Innovation:    i_k = z_k − ẑ_{k|k-1},    ẑ_{k|k-1} = h(k, x̂_{k|k-1})
7  Filtering:    x̂_{k|k} = x̂_{k|k-1} + K_k i_k
8  Covariance matrix updating:
P_{k|k} = P_{k|k-1} − P_{k|k-1} H_k^T S_k^{-1} H_k P_{k|k-1} = P_{k|k-1} − K_k S_k K_k^T = (I − K_k H_k) P_{k|k-1} = (I − K_k H_k) P_{k|k-1} (I − K_k H_k)^T + K_k R_k K_k^T
k := k+1 and return to 1
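A minimal EKF cycle following the steps above, with Jacobians obtained by finite differences (the slides leave the differentiation method open, so the numerical Jacobian is an assumption); the small nonlinear example model is also illustrative.

import numpy as np

def num_jacobian(fun, x, eps=1e-6):
    """Finite-difference Jacobian of fun at x (used for Phi_{k-1} and H_k)."""
    fx = fun(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        xp = x.copy(); xp[j] += eps
        J[:, j] = (fun(xp) - fx) / eps
    return J

def ekf_cycle(x, P, z, f, h, Q, R):
    Phi = num_jacobian(f, x)                     # 2 Jacobian of f at x_{k-1|k-1}
    x_pred = f(x)                                # 1 state prediction
    H = num_jacobian(h, x_pred)                  # 2 Jacobian of h at x_{k|k-1}
    P_pred = Phi @ P @ Phi.T + Q                 # 3 covariance extrapolation
    S = H @ P_pred @ H.T + R                     # 4 innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)          # 5 gain
    i = z - h(x_pred)                            # 6 innovation (nonlinear h)
    return x_pred + K @ i, P_pred - K @ S @ K.T  # 7, 8 filtering and update

# illustrative nonlinear range-only example (assumed)
f = lambda x: np.array([x[0] + 0.1 * x[1], x[1]])
h = lambda x: np.array([np.hypot(x[0], 10.0)])
x, P = np.array([1.0, 1.0]), np.eye(2)
x, P = ekf_cycle(x, P, np.array([10.2]), f, h, 0.01 * np.eye(2), np.array([[0.1]]))
print(x)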
124
Extended Kalman Filter
State Estimation (one cycle)
125
SOLO
Criticism of the Extended Kalman Filter
Unlike its linear counterpart, the extended Kalman filter is not an optimal estimator.
In addition, if the initial estimate of the state is wrong, or if the process is modeled
incorrectly, the filter may quickly diverge, owing to its linearization. Another problem
with the extended Kalman filter is that the estimated covariance matrix tends to
underestimate the true covariance matrix and therefore risks becoming inconsistent
in the statistical sense without the addition of "stabilizing noise".
Having stated this, the Extended Kalman filter can give reasonable performance, and
is arguably the de facto standard in navigation systems and GPS.
Extended Kalman Filter
Table of Content
126
SOLO
Additive Gaussian Nonlinear Filter
Consider the case of a Markovian process where the noise is additive and Gaussian:
x_k = f(x_{k-1}) + w_{k-1}
z_k = h(x_k) + v_k
where w_k and v_k are independent white Gaussian noises, with zero mean and covariances Q_k and R_k, respectively:
p_w(w_k) = N(w_k; 0, Q_k) = (1/((2π)^{n/2} |Q_k|^{1/2})) exp( −½ w_k^T Q_k^{-1} w_k )
p_v(v_k) = N(v_k; 0, R_k) = (1/((2π)^{p/2} |R_k|^{1/2})) exp( −½ v_k^T R_k^{-1} v_k )
Recursive Bayesian Estimation
Therefore, since f(x_{k-1}) is a deterministic function, by adding the Gaussian noise w_{k-1} we obtain that x_k, conditioned on x_{k-1}, is also a Gaussian random variable:
p(x_k | x_{k-1}, Z_{1:k-1}) = N(x_k; f(x_{k-1}), Q_{k-1})
127
SOLO
Additive Gaussian Nonlinear Filter (continue – 1)
Recursive Bayesian Estimation
p(x_k | x_{k-1}, Z_{1:k-1}) = N(x_k; f(x_{k-1}), Q_{k-1})
Using:
p(x_k, x_{k-1} | Z_{1:k-1}) = p(x_k | x_{k-1}, Z_{1:k-1}) p(x_{k-1} | Z_{1:k-1})       (Bayes)
p(x_k | Z_{1:k-1}) = ∫ p(x_k, x_{k-1} | Z_{1:k-1}) d x_{k-1} = ∫ p(x_k | x_{k-1}, Z_{1:k-1}) p(x_{k-1} | Z_{1:k-1}) d x_{k-1}
we obtain:
p(x_k | Z_{1:k-1}) = ∫ N(x_k; f(x_{k-1}), Q_{k-1}) p(x_{k-1} | Z_{1:k-1}) d x_{k-1}
x̂_{k|k-1} := E{x_k | Z_{1:k-1}} = ∫ x_k p(x_k | Z_{1:k-1}) d x_k = ∫ [ ∫ x_k N(x_k; f(x_{k-1}), Q_{k-1}) d x_k ] p(x_{k-1} | Z_{1:k-1}) d x_{k-1} = ∫ f(x_{k-1}) p(x_{k-1} | Z_{1:k-1}) d x_{k-1}
Assume that x_{k-1} is Gaussian with mean x̂_{k-1|k-1} and covariance P_{k-1|k-1}; then
p(x_{k-1} | Z_{1:k-1}) = N(x_{k-1}; x̂_{k-1|k-1}, P^{xx}_{k-1|k-1})
x̂_{k|k-1} = E{x_k | Z_{1:k-1}} = ∫ f(x_{k-1}) N(x_{k-1}; x̂_{k-1|k-1}, P^{xx}_{k-1|k-1}) d x_{k-1}
128
SOLO
Additive Gaussian Nonlinear Filter (continue – 2)
P^{xx}_{k|k-1} = E{(x_k − x̂_{k|k-1})(x_k − x̂_{k|k-1})^T | Z_{1:k-1}}
             = E{[f(x_{k-1}) + w_{k-1} − x̂_{k|k-1}][f(x_{k-1}) + w_{k-1} − x̂_{k|k-1}]^T | Z_{1:k-1}}
             = ∫ f(x_{k-1}) f^T(x_{k-1}) N(x_{k-1}; x̂_{k-1|k-1}, P^{xx}_{k-1|k-1}) d x_{k-1} + Q_{k-1} − x̂_{k|k-1} x̂_{k|k-1}^T
Let us now compute ẑ_{k|k-1} = E{z_k | Z_{1:k-1}}, using the Gaussian approximation of p(x_k | Z_{1:k-1}):
p(x_k | Z_{1:k-1}) ≈ N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1})
Since x_k and v_k are independent:
ẑ_{k|k-1} = E{z_k | Z_{1:k-1}} = ∫ [h(x_k) + E{v_k}] N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1}) d x_k = ∫ h(x_k) N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1}) d x_k
129
SOLO
Additive Gaussian Nonlinear Filter (continue – 3)
P^{zz}_{k|k-1} = E{(z_k − ẑ_{k|k-1})(z_k − ẑ_{k|k-1})^T | Z_{1:k-1}} = ∫ h(x_k) h^T(x_k) N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1}) d x_k + R_k − ẑ_{k|k-1} ẑ_{k|k-1}^T
In the same way
P^{xz}_{k|k-1} = E{(x_k − x̂_{k|k-1})(z_k − ẑ_{k|k-1})^T | Z_{1:k-1}} = ∫ x_k h^T(x_k) N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1}) d x_k − x̂_{k|k-1} ẑ_{k|k-1}^T
130
SOLO
Additive Gaussian Nonlinear Filter (continue – 4)
Recursive Bayesian Estimation
Summary
0  Initialization:    x̂_0 = E{x_0},    P_{0|0} = E{(x_0 − x̂_0)(x_0 − x̂_0)^T}
For k ∈ {1,…,∞}
1  State Prediction and its Covariance:
x̂_{k|k-1} = E{x_k | Z_{1:k-1}} = ∫ f(x_{k-1}) N(x_{k-1}; x̂_{k-1|k-1}, P^{xx}_{k-1|k-1}) d x_{k-1}
P^{xx}_{k|k-1} = ∫ f(x_{k-1}) f^T(x_{k-1}) N(x_{k-1}; x̂_{k-1|k-1}, P^{xx}_{k-1|k-1}) d x_{k-1} + Q_{k-1} − x̂_{k|k-1} x̂_{k|k-1}^T
2  Measurement Prediction and Covariances:
ẑ_{k|k-1} = ∫ h(x_k) N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1}) d x_k
P^{zz}_{k|k-1} = ∫ h(x_k) h^T(x_k) N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1}) d x_k + R_k − ẑ_{k|k-1} ẑ_{k|k-1}^T
P^{xz}_{k|k-1} = ∫ x_k h^T(x_k) N(x_k; x̂_{k|k-1}, P^{xx}_{k|k-1}) d x_k − x̂_{k|k-1} ẑ_{k|k-1}^T
[Figure: hidden Markov model with additive noise: states evolve via f(x_{k-1}) + w_{k-1}, measurements via h(x_k) + v_k]
131
SOLO
Additive Gaussian Nonlinear Filter (continue – 5)
Recursive Bayesian Estimation
Summary (continue – 1)
We showed that the Kalman Filter that uses these computations is given by:
3  Kalman Gain Computation:    K_k = P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1}
4  Update State and its Covariance:
x̂_{k|k} = E{x_k | z_k} = x̂_{k|k-1} + K_k (z_k − ẑ_{k|k-1})
P^{xx}_{k|k} = E{(x_k − x̂_{k|k})(x_k − x̂_{k|k})^T | Z_{1:k}} = P^{xx}_{k|k-1} − P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1} P^{zx}_{k|k-1} = P^{xx}_{k|k-1} − K_k P^{zz}_{k|k-1} K_k^T
k := k+1 & return to 1
132
SOLO
Additive Gaussian Nonlinear Filter (continue – 6)
Recursive Bayesian Estimation
To obtain the Kalman Filter, we must approximate integrals of the type:
I = ∫ g(x) N(x; x̂, P^{xx}) d x
Three approximations are presented:
(1) Gauss – Hermite Quadrature Approximation
(2) Unscented Transformation Approximation
(3) Monte Carlo Approximation
Table of Content
133
SOLO
Additive Gaussian Nonlinear Filter (continue – 7)
Recursive Bayesian Estimation
Gauss – Hermite Quadrature Approximation
To obtain the Kalman Filter, we must approximate integrals of the type:
I = ∫ g(x) N(x; x̂, P^{xx}) d x = (1/((2π)^{n/2} |P^{xx}|^{1/2})) ∫ g(x) exp[ −½ (x − x̂)^T (P^{xx})^{-1} (x − x̂) ] d x
Let P^{xx} = S S^T be a Cholesky decomposition, and define  z := (1/√2) S^{-1} (x − x̂). Then
I = (1/π^{n/2}) ∫ g(x̂ + √2 S z) exp( −z^T z ) d z
This integral can be approximated using the Gauss – Hermite quadrature rule (applied along each dimension):
∫ e^{−z²} f(z) d z ≈ Σ_{i=1}^{M} w_i f(z_i)
where the quadrature points z_i and weights w_i are defined as follows.
Carl Friedrich Gauss 1777-1855,  Charles Hermite 1822-1901,  Andre – Louis Cholesky 1875 - 1918
134
SOLO
Additive Gaussian Nonlinear Filter (continue – 8)
Gauss – Hermite Quadrature Approximation (continue – 1)
A set of orthonormal Hermite polynomials is generated from the recurrence relationship:
H_{-1}(z) = 0,    H_0(z) = 1/π^{1/4},    H_{j+1}(z) = z √(2/(j+1)) H_j(z) − √(j/(j+1)) H_{j-1}(z)
or, in matrix form, with h(z) := [H_0(z), H_1(z), …, H_{M-1}(z)]^T, e_M := [0, …, 0, 1]^T and β_j := √(j/2), j = 1, 2, …, M:
z h(z) = J_M h(z) + β_M H_M(z) e_M
where J_M is the M×M symmetric tridiagonal matrix with zero diagonal and off-diagonal elements β_1, β_2, …, β_{M-1}
(this follows from  z H_j(z) = β_{j+1} H_{j+1}(z) + β_j H_{j-1}(z)).
135
SOLO
Additive Gaussian Nonlinear Filter (continue – 9)
Gauss – Hermite Quadrature Approximation (continue – 2)
∫ e^{−z²} f(z) d z ≈ Σ_{i=1}^{M} w_i f(z_i)
Let us evaluate the matrix relation at the M roots z_i for which H_M(z_i) = 0, i = 1, …, M:
z_i h(z_i) = J_M h(z_i),    i = 1, …, M
From this equation we can see that the z_i and the h(z_i) = [H_0(z_i), H_1(z_i), …, H_{M-1}(z_i)]^T are the eigenvalues and eigenvectors, respectively, of the symmetric matrix J_M.
Because of the symmetry of J_M the eigenvectors are orthogonal and can be normalized. Define:
v^i_j := H_j(z_i)/W_i,     W_i² := Σ_{j=0}^{M-1} H_j²(z_i),     i = 1, …, M
We have:
(v^i)^T v^l = Σ_j H_j(z_i) H_j(z_l) / (W_i W_l) = h^T(z_i) h(z_l) / (W_i W_l) = δ_{il}
Table of Content
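A small sketch of the construction above: the quadrature points z_i are the eigenvalues of the symmetric tridiagonal matrix J_M, and the weights follow from the first components of its normalized eigenvectors (the Golub–Welsch relation, with the scaling √π = ∫ e^{−z²} dz); the weight formula is a standard result assumed here rather than stated on the slides.

import numpy as np

def gauss_hermite(M):
    """Points/weights for the rule  integral e^{-z^2} f(z) dz ~ sum_i w_i f(z_i)."""
    beta = np.sqrt(np.arange(1, M) / 2.0)        # off-diagonal elements beta_j = sqrt(j/2)
    J = np.diag(beta, 1) + np.diag(beta, -1)     # symmetric tridiagonal J_M, zero diagonal
    z, V = np.linalg.eigh(J)                     # eigenvalues -> quadrature points z_i
    w = np.sqrt(np.pi) * V[0, :] ** 2            # weights from first eigenvector components
    return z, w

z, w = gauss_hermite(5)
# sanity check against numpy's reference rule and a known integral
z_ref, w_ref = np.polynomial.hermite.hermgauss(5)
print(np.allclose(np.sort(z), z_ref), np.allclose(w[np.argsort(z)], w_ref))
print("integral of e^{-z^2} z^2 dz ~", np.sum(w * z**2), " exact:", np.sqrt(np.pi) / 2)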
136
Unscented Kalman FilterSOLO
When the state transition and observation models – that is, the predict and update functions f and h (see above) – are highly non-linear, the Extended Kalman Filter can give particularly poor performance [JU97]. This is because only the mean is propagated through the non-linearity. The Unscented Kalman Filter (UKF) [JU97] uses a deterministic sampling technique known as the unscented transformation to pick a minimal set of sample points (called "sigma points") around the mean. These "sigma points" are then propagated through the non-linear functions and the covariance of the estimate is then recovered. The result is a filter which more accurately captures the true mean and covariance. (This can be verified using Monte Carlo sampling or through a Taylor series expansion of the posterior statistics.) In addition, this technique removes the requirement to analytically calculate Jacobians, which for complex functions can be a difficult task in itself.
State vector dynamics:    x_k = f(k−1, x_{k-1}, u_{k-1}) + w_{k-1}
Measurements:             z_k = h(k, x_k) + v_k
e_x(k) := x(k) − E{x(k)},    E{e_x(k) e_x^T(k)} = P_x(k)
e_w(k) := w(k) − E{w(k)},    E{e_w(k) e_w^T(l)} = Q(k) δ_{k,l}
E{e_w(k) e_v^T(l)} = 0  ∀ k, l
The Unscented Algorithm, using  x̂(k) = E{x(k)}  and  P_x(k) = E{e_x(k) e_x^T(k)},  determines  ẑ(k) = E{z(k)}  and  P_z(k) = E{e_z(k) e_z^T(k)}.
137
Unscented Kalman FilterSOLO
Propagating Means and Covariances Through Nonlinear Transformations
Consider a nonlinear function  y = f(x).
Assume x is a random variable with a probability density function p_X(x) (known or unknown) with mean and covariance
x̂ = E{x},     P^{xx} = E{(x − x̂)(x − x̂)^T}
and write  x = x̂ + δx,  with  E{δx} = 0,  E{δx δx^T} = P^{xx}.
Develop the nonlinear function f in a Taylor series around x̂:
f(x̂ + δx) = Σ_{n=0}^{∞} (1/n!) [ (δx · ∇)^n f ]|_{x̂},     where   (δx · ∇) := Σ_{j=1}^{n_x} δx_j ∂/∂x_j
Define also the operator   D^n_{δx} f := [ (δx · ∇)^n f ]|_{x̂}
Let us compute
ŷ = E{y} = E{f(x̂ + δx)} = Σ_{n=0}^{∞} (1/n!) E{ D^n_{δx} f }
138
Unscented Kalman Filter
SOLO
Propagating Means and Covariances Through Nonlinear Transformations (continue – 1)
ŷ = E{f(x̂ + δx)} = f(x̂) + E{D_{δx} f} + ½ E{D²_{δx} f} + (1/3!) E{D³_{δx} f} + (1/4!) E{D⁴_{δx} f} + …
Since all the differentials of f are computed around the mean x̂ (non-random):
E{D_{δx} f} = E{δx}^T ∇f|_{x̂} = 0
E{D²_{δx} f} = E{ (δx · ∇)(δx · ∇) f }|_{x̂} = [ ∇^T E{δx δx^T} ∇ ] f|_{x̂} = [ ∇^T P^{xx} ∇ ] f|_{x̂}
Therefore
ŷ = E{f(x̂ + δx)} = f(x̂) + ½ [ ∇^T P^{xx} ∇ ] f|_{x̂} + (1/3!) E{D³_{δx} f} + (1/4!) E{D⁴_{δx} f} + …
139
Simon J. Julier,  Jeffrey K. Uhlmann
Unscented Kalman Filter
SOLO
Propagating Means and Covariances Through Nonlinear Transformations (continue – 2)
Consider a nonlinear function  y = f(x),  with  x̂ = E{x},  P^{xx} = E{(x − x̂)(x − x̂)^T}.
The Unscented Transformation (UT), proposed by Julier and Uhlmann, uses a set of "sigma points" to provide an approximation of the probabilistic properties through the nonlinear function.
A set of "sigma points" S consists of p+1 vectors and their associated weights,  S = { i = 0, 1, …, p :  x^{(i)}, W^{(i)} }.
(1) Compute the transformation of the "sigma points" through the nonlinear transformation f:
y^{(i)} = f(x^{(i)}),    i = 0, 1, …, p
(2) Compute the approximation of the mean:
ŷ ≈ Σ_{i=0}^{p} W^{(i)} y^{(i)}
The estimation is unbiased if
E{ Σ_{i=0}^{p} W^{(i)} y^{(i)} } = Σ_{i=0}^{p} W^{(i)} E{y^{(i)}} = ŷ,     which requires     Σ_{i=0}^{p} W^{(i)} = 1
(3) The approximation of the output covariance is given by
P^{yy} ≈ Σ_{i=0}^{p} W^{(i)} (y^{(i)} − ŷ)(y^{(i)} − ŷ)^T
140
Unscented Kalman FilterSOLO
Propagating Means and Covariances Through Nonlinear Transformations (continue – 3)
Unscented Transformation (UT) (continue – 1)
Consider a nonlinear function  y = f(x).
One set of points that satisfies the above conditions consists of a symmetric set of p = 2 n_x points that lie on the √(n_x/(1 − W^{(0)}))-th covariance contour of P^{xx}:
x^{(0)} = x̂,                                                W^{(0)}
x^{(i)} = x̂ + ( √( n_x P^{xx} / (1 − W^{(0)}) ) )_i,         W^{(i)} = (1 − W^{(0)}) / (2 n_x),       i = 1, …, n_x
x^{(i+n_x)} = x̂ − ( √( n_x P^{xx} / (1 − W^{(0)}) ) )_i,     W^{(i+n_x)} = (1 − W^{(0)}) / (2 n_x),    i = 1, …, n_x
where ( √( n_x P^{xx} / (1 − W^{(0)}) ) )_i is the i-th row or column of the matrix square root of n_x P^{xx}/(1 − W^{(0)}) (the original covariance matrix P^{xx} multiplied by the number of dimensions of x, n_x/(1 − W^{(0)})). This implies:
Σ_{i=1}^{n_x} ( √( n_x P^{xx} / (1 − W^{(0)}) ) )_i ( √( n_x P^{xx} / (1 − W^{(0)}) ) )_i^T = n_x P^{xx} / (1 − W^{(0)})
141
Unscented Kalman Filter
SOLO
Propagating Means and Covariances Through Nonlinear Transformations (continue – 4)
Unscented Transformation (UT) (continue – 2)
Unscented Algorithm: propagate the sigma points  x^{(i)} = x̂ ± δx^{(i)},  δx^{(i)} = ( √( n_x P^{xx} / (1 − W^{(0)}) ) )_i,  through f and expand each in a Taylor series about x̂:
y^{(0)} = f(x̂)
y^{(i)} = f(x̂ ± δx^{(i)}) = f(x̂) ± D_{δx^{(i)}} f + ½ D²_{δx^{(i)}} f ± (1/3!) D³_{δx^{(i)}} f + …,     i = 1, …, 2 n_x
The UT mean is then
ŷ_UT = Σ_{i=0}^{2 n_x} W^{(i)} y^{(i)} = W^{(0)} f(x̂) + ((1 − W^{(0)})/(2 n_x)) Σ_{i=1}^{2 n_x} [ f(x̂) + D_{δx^{(i)}} f + ½ D²_{δx^{(i)}} f + … ]
Since D^n_{δx^{(i)}} f is an odd function of δx^{(i)} for odd n, the odd-order terms cancel over the symmetric set, and
ŷ_UT = f(x̂) + ((1 − W^{(0)})/(2 n_x)) Σ_{i=1}^{2 n_x} [ ½ D²_{δx^{(i)}} f + (1/4!) D⁴_{δx^{(i)}} f + (1/6!) D⁶_{δx^{(i)}} f + … ]
142
Unscented Kalman Filter
SOLO
Propagating Means and Covariances Through Nonlinear Transformations (continue – 5)
Unscented Transformation (UT) (continue – 3)
Using  Σ_{i=1}^{2 n_x} δx^{(i)} δx^{(i)T} = 2 n_x P^{xx} / (1 − W^{(0)}),  the second-order term gives
((1 − W^{(0)})/(2 n_x)) Σ_{i=1}^{2 n_x} ½ D²_{δx^{(i)}} f|_{x̂} = ½ [ ∇^T P^{xx} ∇ ] f|_{x̂}
so that
ŷ_UT = f(x̂) + ½ [ ∇^T P^{xx} ∇ ] f|_{x̂} + ((1 − W^{(0)})/(2 n_x)) Σ_{i=1}^{2 n_x} [ (1/4!) D⁴_{δx^{(i)}} f + (1/6!) D⁶_{δx^{(i)}} f + … ]
We found for the true mean
ŷ = E{f(x̂ + δx)} = f(x̂) + ½ [ ∇^T P^{xx} ∇ ] f|_{x̂} + (1/3!) E{D³_{δx} f} + (1/4!) E{D⁴_{δx} f} + …
We can see that the two expressions agree exactly to the third order.
143
Unscented Kalman Filter
SOLO
Propagating Means and Covariances Through Nonlinear Transformations (continue – 6)
Unscented Transformation (UT) (continue – 4)
Accuracy of the Covariance: carrying out the same Taylor-series bookkeeping for the true covariance gives
P^{yy} = E{(y − ŷ)(y − ŷ)^T} = A P^{xx} A^T − ¼ ( [∇^T P^{xx} ∇] f ) ( [∇^T P^{xx} ∇] f )^T |_{x̂} + (fourth- and higher-order moment terms)
where A := ∇^T f|_{x̂} is the Jacobian of f at x̂.
144
Unscented Kalman Filter
SOLO
Propagating Means and Covariances Through Nonlinear Transformations (continue – 7)
Unscented Transformation (UT) (continue – 5)
Accuracy of the Covariance (continue): the same expansion applied to the UT covariance  P^{yy}_UT = Σ_{i=0}^{2 n_x} W^{(i)} (y^{(i)} − ŷ_UT)(y^{(i)} − ŷ_UT)^T  gives
P^{yy}_UT = A P^{xx} A^T − ¼ ( [∇^T P^{xx} ∇] f ) ( [∇^T P^{xx} ∇] f )^T |_{x̂} + (fourth- and higher-order terms)
so the true and UT covariances agree in their first two terms and differ only in fourth- and higher-order terms, which depend on the higher moments of δx.
Unscented Kalman Filter
SOLO
146
Unscented Kalman Filter
SOLO
[Figure: the Unscented Transformation: sigma points χ_i = { x̂, x̂ ± (√(α n_x P^x))_i } are propagated through the nonlinearity f to ψ_i = f(χ_i); the weighted sample mean ẑ = Σ_i β_i ψ_i and the weighted sample covariance P^z = Σ_i β_i (ψ_i − ẑ)(ψ_i − ẑ)^T are then recovered]
Table of Content
147
Unscented Kalman Filter
SOLO
UKF Summary
System Definition:
x_k = f(k−1, x_{k-1}, u_{k-1}) + w_{k-1},     E{w_k} = 0,  E{w_k w_l^T} = Q_k δ_{k,l}
z_k = h(k, x_k) + v_k,                         E{v_k} = 0,  E{v_k v_l^T} = R_k δ_{k,l}
0  Initialization of UKF:
x̂_0 = E{x_0},    P_{0|0} = E{(x_0 − x̂_0)(x_0 − x̂_0)^T}
x^a := [x^T  w^T  v^T]^T,    x̂^a_0 = E{x^a_0} = [x̂_0^T  0  0]^T,    P^a_{0|0} = E{(x^a_0 − x̂^a_0)(x^a_0 − x̂^a_0)^T} = diag(P_{0|0}, Q, R)
For k ∈ {1,…,∞}
1  Calculate the Sigma Points (γ = √(L + λ)):
x^{(0)}_{k-1|k-1} = x̂_{k-1|k-1}
x^{(i)}_{k-1|k-1} = x̂_{k-1|k-1} + γ (√P_{k-1|k-1})_i,          i = 1, …, L
x^{(i+L)}_{k-1|k-1} = x̂_{k-1|k-1} − γ (√P_{k-1|k-1})_i,        i = 1, …, L
2  State Prediction and its Covariance:
x^{(i)}_{k|k-1} = f(k−1, x^{(i)}_{k-1|k-1}, u_{k-1}),    i = 0, 1, …, 2L
x̂_{k|k-1} = Σ_{i=0}^{2L} W^{(m)}_i x^{(i)}_{k|k-1},         W^{(m)}_0 = λ/(L+λ),   W^{(m)}_i = 1/(2(L+λ)),  i = 1, …, 2L
P_{k|k-1} = Σ_{i=0}^{2L} W^{(c)}_i (x^{(i)}_{k|k-1} − x̂_{k|k-1})(x^{(i)}_{k|k-1} − x̂_{k|k-1})^T,     W^{(c)}_0 = λ/(L+λ) + (1 − α² + β),   W^{(c)}_i = 1/(2(L+λ)),  i = 1, …, 2L
Unscented Kalman Filter
SOLO
UKF Summary (continue – 1)
3  Measurement Prediction:
z^{(i)}_{k|k-1} = h(k, x^{(i)}_{k|k-1}),    i = 0, 1, …, 2L
ẑ_{k|k-1} = Σ_{i=0}^{2L} W^{(m)}_i z^{(i)}_{k|k-1}
4  Innovation and its Covariance:
i_k = z_k − ẑ_{k|k-1}
S_k = P^{zz}_{k|k-1} = Σ_{i=0}^{2L} W^{(c)}_i (z^{(i)}_{k|k-1} − ẑ_{k|k-1})(z^{(i)}_{k|k-1} − ẑ_{k|k-1})^T
5  Kalman Gain Computation:
P^{xz}_{k|k-1} = Σ_{i=0}^{2L} W^{(c)}_i (x^{(i)}_{k|k-1} − x̂_{k|k-1})(z^{(i)}_{k|k-1} − ẑ_{k|k-1})^T
K_k = P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1}
6  Update State and its Covariance:
x̂_{k|k} = x̂_{k|k-1} + K_k i_k
P_{k|k} = P_{k|k-1} − K_k S_k K_k^T
k := k+1 & return to 1
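A compact sketch of one UKF cycle following the summary above. It uses the non-augmented form and adds Q and R explicitly, which is a common simplification and an assumption relative to the augmented-state initialization on the slide; the scaling parameters α, β, κ and the small example model are illustrative choices.

import numpy as np

def sigma_points(x, P, alpha=1.0, beta=2.0, kappa=0.0):
    L = x.size
    lam = alpha**2 * (L + kappa) - L
    S = np.linalg.cholesky((L + lam) * P)             # matrix square root of (L+lam)P
    pts = np.vstack([x, x + S.T, x - S.T])            # 2L+1 sigma points
    Wm = np.full(2 * L + 1, 1.0 / (2 * (L + lam))); Wc = Wm.copy()
    Wm[0] = lam / (L + lam); Wc[0] = Wm[0] + (1 - alpha**2 + beta)
    return pts, Wm, Wc

def ukf_cycle(x, P, z, f, h, Q, R):
    X, Wm, Wc = sigma_points(x, P)
    Xp = np.array([f(s) for s in X])                  # propagate sigma points through f
    x_pred = Wm @ Xp
    P_pred = (Xp - x_pred).T @ np.diag(Wc) @ (Xp - x_pred) + Q
    Zp = np.array([h(s) for s in Xp])                 # propagate through h
    z_pred = Wm @ Zp
    S = (Zp - z_pred).T @ np.diag(Wc) @ (Zp - z_pred) + R
    Pxz = (Xp - x_pred).T @ np.diag(Wc) @ (Zp - z_pred)
    K = Pxz @ np.linalg.inv(S)
    return x_pred + K @ (z - z_pred), P_pred - K @ S @ K.T

# illustrative nonlinear example (assumed)
f = lambda x: np.array([x[0] + 0.1 * x[1], 0.99 * x[1]])
h = lambda x: np.array([np.hypot(x[0], 5.0)])
x, P = np.array([1.0, 0.5]), np.eye(2)
x, P = ukf_cycle(x, P, np.array([5.2]), f, h, 0.01 * np.eye(2), np.array([[0.05]]))
print(x)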
149
Unscented Kalman Filter
State Estimation (one cycle)
SOLO
Simon J. Julier,  Jeffrey K. Uhlmann
150
Unscented Kalman Filter
SOLO
Table of Content
151
Numerical Integration Using a Monte Carlo ApproximationSOLO
A Monte Carlo Approximation of the Expected Value Integrals uses Discrete
Approximation to the Gaussian PDF ( )xx
Pxx ,ˆ;N
( )xx
Pxx ,ˆ;N can be approximated by:
( ) ( ) ( ) ( )∑∑ ==
−=−≈=
ss N
i
i
s
N
i
iixx
xx
N
xxwPxxx
11
1
,ˆ; δδNp
We can see that for any x we have
( ) ( )∫∑∫∑
∞−
≤
∞− =
≈=−
x
xx
xx
i
i
x N
i
ii
dPxwdxw
i
s
ττττδ ,ˆ;
1
N
The weight w_i is not the probability of the point x_i. The probability density near x_i
is given by the density of the points in the region around x_i, which can be obtained by a
normalized histogram of all x_i.

Draw Ns samples from N(x; x̂, P_xx), where { x_i, i = 1,2,…,Ns } are a set of support
points (random samples, or particles) with weights { w_i = 1/Ns, i = 1,2,…,Ns }.
Monte Carlo Kalman Filter (MCKF)
152
Numerical Integration Using a Monte Carlo Approximation
SOLO
The Expected Value for any function g (x) can be estimated from:
   E_{p(x)}{g(x)} = ∫ g(x) p(x) dx ≈ ∫ g(x) Σ_{i=1}^{Ns} w_i δ(x − x_i) dx = Σ_{i=1}^{Ns} w_i g(x_i) = (1/Ns) Σ_{i=1}^{Ns} g(x_i)
which is the sample mean.
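A quick numerical illustration of this sample-mean approximation (a minimal sketch; the target g(x) = x² and the numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# estimate E[g(x)] for x ~ N(x_hat, P) with g(x) = x**2
x_hat, P, Ns = 1.0, 0.5, 10_000
samples = rng.normal(x_hat, np.sqrt(P), size=Ns)   # Ns support points, weights w_i = 1/Ns
g_mean = np.mean(samples**2)                        # sample mean ≈ E[g(x)] = P + x_hat**2 = 1.5
```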
Given the System

   x_k = f(k−1, x_{k−1}, u_{k−1}) + w_{k−1},   E{w_k} = 0,  E{w_k w_l^T} = Q_k δ_{kl}
   z_k = h(k, x_k) + v_k,                      E{v_k} = 0,  E{v_k v_l^T} = R_k δ_{kl}

Assuming that we computed the Mean and Covariance x̂_{k−1|k−1}, P_{k−1|k−1} at stage k−1,
let us use the Monte Carlo Approximation to compute the predicted Mean and Covariance
x̂_{k|k−1}, P_{k|k−1} at stage k:

   x̂_{k|k−1} = E_{p(x_k|Z_{1:k−1})}{x_k} ≈ (1/Ns) Σ_{i=1}^{Ns} f(k−1, x_{k−1|k−1}^i, u_{k−1})

   P_{k|k−1}^{xx} = E_{p(x_k|Z_{1:k−1})}{(x_k − x̂_{k|k−1})(x_k − x̂_{k|k−1})^T}
Monte Carlo Kalman Filter (MCKF) (continue – 1)
Draw Ns samples   x_{k−1|k−1}^i ~ p(x_{k−1}|Z_{1:k−1}) = N(x_{k−1}; x̂_{k−1|k−1}, P_{k−1|k−1}),   i = 1,…,Ns

("~" means Generate (Draw) samples from the stated distribution.)
153
Numerical Integration Using a Monte Carlo Approximation
SOLO
   P_{k|k−1}^{xx} = E{(x_k − x̂_{k|k−1})(x_k − x̂_{k|k−1})^T}
                  = Q_{k−1} + E{ [f(k−1, x_{k−1}, u_{k−1}) − x̂_{k|k−1}] [f(k−1, x_{k−1}, u_{k−1}) − x̂_{k|k−1}]^T }

Approximating the expectation with the Monte Carlo samples:

   P_{k|k−1}^{xx} ≈ Q_{k−1} + (1/Ns) Σ_{i=1}^{Ns} f(k−1, x_{k−1|k−1}^i, u_{k−1}) f(k−1, x_{k−1|k−1}^i, u_{k−1})^T
                    − [ (1/Ns) Σ_{i=1}^{Ns} f(k−1, x_{k−1|k−1}^i, u_{k−1}) ] [ (1/Ns) Σ_{i=1}^{Ns} f(k−1, x_{k−1|k−1}^i, u_{k−1}) ]^T
Using the Monte Carlo Approximation we obtain:
   ẑ_{k|k−1} = E_{p(x_k|Z_{1:k−1})}{z_k} ≈ (1/Ns) Σ_{i=1}^{Ns} h(k, x_{k|k−1}^i)

   P_{k|k−1}^{zz} ≈ R_k + (1/Ns) Σ_{i=1}^{Ns} h(k, x_{k|k−1}^i) h(k, x_{k|k−1}^i)^T
                    − [ (1/Ns) Σ_{i=1}^{Ns} h(k, x_{k|k−1}^i) ] [ (1/Ns) Σ_{i=1}^{Ns} h(k, x_{k|k−1}^i) ]^T
Monte Carlo Kalman Filter (MCKF) (continue – 2)
Now we approximate the predictive PDF p(x_k|Z_{1:k−1}) as N(x_k; x̂_{k|k−1}, P_{k|k−1})
and draw new Ns samples (not necessarily the same number as before):

   x_{k|k−1}^i ~ p(x_k|Z_{1:k−1}) = N(x_k; x̂_{k|k−1}, P_{k|k−1}),   i = 1,…,Ns
154
Numerical Integration Using a Monte Carlo Approximation
SOLO
In the same way we obtain:
   P_{k|k−1}^{xz} ≈ (1/Ns) Σ_{i=1}^{Ns} x_{k|k−1}^i h(k, x_{k|k−1}^i)^T
                    − [ (1/Ns) Σ_{i=1}^{Ns} x_{k|k−1}^i ] [ (1/Ns) Σ_{i=1}^{Ns} h(k, x_{k|k−1}^i) ]^T
Monte Carlo Kalman Filter (MCKF) (continue – 3)
The Kalman Filter Equations are:
   K_k = P_{k|k−1}^{xz} (P_{k|k−1}^{zz})^{-1}
   x̂_{k|k} = x̂_{k|k−1} + K_k (z_k − ẑ_{k|k−1})
   P_{k|k}^{xx} = P_{k|k−1}^{xx} − K_k P_{k|k−1}^{zz} K_k^T
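A minimal sketch of one MCKF cycle built from the equations above, under the simplifying assumption of additive noise (Q and R added to the sample covariances rather than carried in an augmented state); names such as mckf_step, f, h are illustrative.

```python
import numpy as np

def mckf_step(x_hat, P, z, f, h, Q, R, Ns=1000, rng=np.random.default_rng()):
    """One Monte Carlo Kalman Filter cycle (additive-noise sketch)."""
    # draw Ns particles from N(x_hat, P) and propagate through the dynamics
    X = rng.multivariate_normal(x_hat, P, size=Ns)
    Xp = np.array([f(x) for x in X])
    x_pred = Xp.mean(axis=0)
    P_pred = Q + (Xp - x_pred).T @ (Xp - x_pred) / Ns

    # redraw from the Gaussian approximation of the predictive PDF
    Xp = rng.multivariate_normal(x_pred, P_pred, size=Ns)
    Zp = np.array([h(x) for x in Xp])
    z_pred = Zp.mean(axis=0)
    Pzz = R + (Zp - z_pred).T @ (Zp - z_pred) / Ns
    Pxz = (Xp - x_pred).T @ (Zp - z_pred) / Ns

    # Kalman update
    K = Pxz @ np.linalg.inv(Pzz)
    return x_pred + K @ (z - z_pred), P_pred - K @ Pzz @ K.T
```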
155
Monte Carlo Kalman Filter (MCKF)
SOLO
MCKF Summary
0  Initialization of MCKF
      x̂_0 = E{x_0},   P_{0|0} = E{(x_0 − x̂_0)(x_0 − x̂_0)^T}
      Augment the state space to include the process and measurement noises:
         x^a := [x^T  w^T  v^T]^T
         x̂_0^a = E{x^a} = [x̂_0^T  0  0]^T,   P_{0|0}^a = E{(x^a − x̂_0^a)(x^a − x̂_0^a)^T} = diag(P_{0|0}, Q, R)

System Definition:
      x_k = f(k−1, x_{k−1}, u_{k−1}) + w_{k−1},   w_{k−1} ~ N(0; Q_{k−1}),   x_0 ~ N(x̂_0; P_{0|0})
      z_k = h(k, x_k) + v_k,                      v_k ~ N(0; R_k)

For k ∈ {1, …, ∞}

1  Assuming for k−1 a Gaussian distribution with Mean and Covariance x̂_{k−1|k−1}^a, P_{k−1|k−1}^a
   Generate (Draw) Ns samples
      x_{k−1|k−1}^{a,i} ~ N(x^a; x̂_{k−1|k−1}^a, P_{k−1|k−1}^a),   i = 1,…,Ns

2  State Prediction and its Covariance
      x_{k|k−1}^{a,i} = f(k−1, x_{k−1|k−1}^{a,i}, u_{k−1}),   i = 1,…,Ns
      x̂_{k|k−1}^a = (1/Ns) Σ_{i=1}^{Ns} x_{k|k−1}^{a,i}
      P_{k|k−1}^a = (1/Ns) Σ_{i=1}^{Ns} (x_{k|k−1}^{a,i} − x̂_{k|k−1}^a)(x_{k|k−1}^{a,i} − x̂_{k|k−1}^a)^T

3  Assuming a Gaussian distribution with Mean and Covariance x̂_{k|k−1}, P_{k|k−1}
   Generate (Draw) new Ns samples
      x_{k|k−1}^{a,j} ~ N(x^a; x̂_{k|k−1}^a, P_{k|k−1}^a),   j = 1,…,Ns
156
Monte Carlo Kalman Filter (MCKF)
SOLO
MCKF Summary (continue – 1)
4  Measure Prediction
      z_{k|k−1}^j = h(k, x_{k|k−1}^{a,j}),   j = 1,…,Ns
      ẑ_{k|k−1} = (1/Ns) Σ_{j=1}^{Ns} z_{k|k−1}^j

5  Predicted Covariances Computations
      S_k = P_{k|k−1}^{zz} = (1/Ns) Σ_{j=1}^{Ns} (z_{k|k−1}^j − ẑ_{k|k−1})(z_{k|k−1}^j − ẑ_{k|k−1})^T
      P_{k|k−1}^{xz} = (1/Ns) Σ_{j=1}^{Ns} (x_{k|k−1}^{a,j} − x̂_{k|k−1}^a)(z_{k|k−1}^j − ẑ_{k|k−1})^T

6  Kalman Gain Computations
      K_k^a = P_{k|k−1}^{xz} (P_{k|k−1}^{zz})^{-1}

7  Measurement & Innovation Computation
      i_k = z_k − ẑ_{k|k−1}

8  Kalman Filter
      x̂_{k|k}^a = x̂_{k|k−1}^a + K_k^a i_k
      P_{k|k}^a = P_{k|k−1}^a − K_k^a S_k (K_k^a)^T

k := k+1 & return to 1
157
[Block diagram: Input Data → Sensor Data Processing and Measurement Formation → Observation-to-Track Association → Track Maintenance (Initialization, Confirmation and Deletion) → Filtering and Prediction → Gating Computations]

Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
SOLO
Monte Carlo Kalman Filter (MCKF)
Table of Content
158
Nonlinear Estimation Using Particle Filters
SOLO
We assumed that p(x_k|Z_{1:k}) is a Gaussian PDF. If the true PDF is not Gaussian
(multimodal, heavily skewed, or non-standard – not represented by any standard PDF),
a Gaussian distribution can never describe it well.
Non-Additive Non-Gaussian Nonlinear Filter
   x_k = f(x_{k−1}, w_{k−1})
   z_k = h(x_k, v_k)

w_{k−1} & v_k are system and measurement white-noise sequences, independent of past and
current states and of each other, with known P.D.F.s p(w_{k−1}) & p(v_k).

We want to compute p(x_k|Z_{1:k}) recursively, assuming knowledge of p(x_{k−1}|Z_{1:k−1}),
in two stages: prediction (before) and update (after measurement).

Prediction (before measurement)
Use the Chapman – Kolmogorov Equation to obtain:

   p(x_k|Z_{1:k−1}) = ∫ p(x_k|x_{k−1}) p(x_{k−1}|Z_{1:k−1}) dx_{k−1}

where:
   p(x_k|x_{k−1}) = ∫ p(x_k|x_{k−1}, w_{k−1}) p(w_{k−1}|x_{k−1}) dw_{k−1}

By assumption  p(w_{k−1}|x_{k−1}) = p(w_{k−1}).

Since by knowing x_{k−1} & w_{k−1}, x_k is deterministically given by the system equation,
we have

   p(x_k|x_{k−1}, w_{k−1}) = δ(x_k − f(x_{k−1}, w_{k−1})) = { 1 if x_k = f(x_{k−1}, w_{k−1}); 0 otherwise }

Therefore:
   p(x_k|x_{k−1}) = ∫ δ(x_k − f(x_{k−1}, w_{k−1})) p(w_{k−1}) dw_{k−1}
159
Nonlinear Estimation Using Particle Filters
SOLO Non-Additive Non-Gaussian Nonlinear Filter
   x_k = f(x_{k−1}, w_{k−1}),   z_k = h(x_k, v_k)

w_{k−1} & v_k are system and measurement white-noise sequences, independent of past and
current states and of each other, with known P.D.F.s p(w_{k−1}) & p(v_k).

We want to compute p(x_k|Z_{1:k}) recursively, assuming knowledge of p(x_{k−1}|Z_{1:k−1}),
in two stages: prediction (before) and update (after measurement).

1  Prediction (before measurement)
   p(x_k|Z_{1:k−1}) = ∫ p(x_k|x_{k−1}) p(x_{k−1}|Z_{1:k−1}) dx_{k−1}
   p(x_k|x_{k−1}) = ∫ δ(x_k − f(x_{k−1}, w_{k−1})) p(w_{k−1}) dw_{k−1}

2  Update (after measurement)
   p(x_k|Z_{1:k}) = p(x_k|z_k, Z_{1:k−1})
                  = p(z_k|x_k) p(x_k|Z_{1:k−1}) / p(z_k|Z_{1:k−1})          (Bayes)
                  = p(z_k|x_k) p(x_k|Z_{1:k−1}) / ∫ p(z_k|x_k) p(x_k|Z_{1:k−1}) dx_k

where:
   p(z_k|x_k) = ∫ p(z_k|x_k, v_k) p(v_k|x_k) dv_k

By assumption  p(v_k|x_k) = p(v_k).

Since by knowing x_k & v_k, z_k is deterministically given by the measurement equation:

   p(z_k|x_k, v_k) = δ(z_k − h(x_k, v_k)) = { 1 if z_k = h(x_k, v_k); 0 otherwise }

Therefore:
   p(z_k|x_k) = ∫ δ(z_k − h(x_k, v_k)) p(v_k) dv_k
160
Nonlinear Estimation Using Particle Filters
SOLO Non-Additive Non-Gaussian Nonlinear Filter
   x_k = f(x_{k−1}, w_{k−1}),   z_k = h(x_k, v_k)

w_{k−1} & v_k are system and measurement white-noise sequences, independent of past and
current states and of each other, with known P.D.F.s p(w_{k−1}) & p(v_k).

We want to compute p(x_k|Z_{1:k}) recursively, assuming knowledge of p(x_{k−1}|Z_{1:k−1}),
in two stages: prediction (before) and update (after measurement).

1  Prediction (before measurement)
   p(x_k|Z_{1:k−1}) = ∫ p(x_k|x_{k−1}) p(x_{k−1}|Z_{1:k−1}) dx_{k−1}
   p(x_k|x_{k−1}) = ∫ δ(x_k − f(x_{k−1}, w_{k−1})) p(w_{k−1}) dw_{k−1}

2  Update (after measurement)
   p(x_k|Z_{1:k}) = p(z_k|x_k) p(x_k|Z_{1:k−1}) / ∫ p(z_k|x_k) p(x_k|Z_{1:k−1}) dx_k
   p(z_k|x_k) = ∫ δ(z_k − h(x_k, v_k)) p(v_k) dv_k

We need to evaluate these integrals. Analytic solutions for those integral equations do
not exist in the general case, so we use the numeric Monte Carlo Method to evaluate them.

Generate (Draw):   w_{k−1}^i ~ p(w_{k−1})  &  v_k^i ~ p(v_k),   i = 1,…,Ns

   p(x_k|x_{k−1}) ≈ (1/Ns) Σ_{i=1}^{Ns} δ(x_k − f(x_{k−1}, w_{k−1}^i))
   p(z_k|x_k)     ≈ (1/Ns) Σ_{i=1}^{Ns} δ(z_k − h(x_k, v_k^i))

or, with  x_k^i = f(x_{k−1}^i, w_{k−1}^i)  and  z_k^i = h(x_k^i, v_k^i):

   p(x_k|x_{k−1}) ≈ (1/Ns) Σ_{i=1}^{Ns} δ(x_k − x_k^i)
   p(z_k|x_k)     ≈ (1/Ns) Σ_{i=1}^{Ns} δ(z_k − z_k^i)
161
SOLO
Monte Carlo Computations of p(x_k|x_{k−1}) and p(z_k|x_k)

   x_k = f(k−1, x_{k−1}, u_{k−1}, w_{k−1}),   given p_w(w_{k−1}), p_{x0}(x_0)
   z_k = h(k, x_k, v_k),                      given p_v(v_k)

0  Initialization
   Generate (Draw)  x_0^i ~ p_{x0}(x_0),   i = 1,…,Ns

For k ∈ {1, …, ∞}

1  At stage k−1
   Generate (Draw) Ns samples  w_{k−1}^i ~ p_w(w_{k−1}),   i = 1,…,Ns

2  State Update
   x_k^i = f(x_{k−1}^i, u_{k−1}, w_{k−1}^i),   i = 1,…,Ns
   p(x_k|x_{k−1}) ≈ (1/Ns) Σ_{i=1}^{Ns} δ(x_k − x_k^i)

3  Generate (Draw) Measurement Noise   v_k^i ~ p_v(v_k),   i = 1,…,Ns

4  Measurement z_k, Update
   z_k^i = h(x_k^i, v_k^i),   i = 1,…,Ns
   p(z_k|x_k) ≈ (1/Ns) Σ_{i=1}^{Ns} δ(z_k − z_k^i)

k := k+1 & return to 1
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter
162
Nonlinear Estimation Using Particle Filters
SOLO Non-Additive Non-Gaussian Nonlinear Filter
   x_k = f(x_{k−1}, w_{k−1}),   z_k = h(x_k, v_k)

w_{k−1} & v_k are system and measurement white-noise sequences, independent of past and
current states and of each other, with known P.D.F.s p(w_{k−1}) & p(v_k).

We want to compute p(x_k|Z_{1:k}) recursively, assuming knowledge of p(x_{k−1}|Z_{1:k−1}),
in two stages: prediction (before) and update (after measurement).

1  Prediction (before measurement)
   p(x_k|Z_{1:k−1}) = ∫ p(x_k|x_{k−1}) p(x_{k−1}|Z_{1:k−1}) dx_{k−1}

2  Update (after measurement)
   p(x_k|Z_{1:k}) = p(z_k|x_k) p(x_k|Z_{1:k−1}) / ∫ p(z_k|x_k) p(x_k|Z_{1:k−1}) dx_k

We use the numeric Monte Carlo Method to evaluate the integrals.

Generate (Draw):   w_{k−1}^i ~ p(w_{k−1})  &  v_k^i ~ p(v_k),   i = 1,…,Ns

   x_k^i = f(x_{k−1}^i, w_{k−1}^i)  →  p(x_k|x_{k−1}) ≈ (1/Ns) Σ_{i=1}^{Ns} δ(x_k − x_k^i)
   z_k^i = h(x_k^i, v_k^i)          →  p(z_k|x_k)     ≈ (1/Ns) Σ_{i=1}^{Ns} δ(z_k − z_k^i)

Substituting the particle approximation into the prediction integral:

   p(x_k|Z_{1:k−1}) ≈ (1/Ns) Σ_{i=1}^{Ns} ∫ δ(x_k − x_k^i) p(x_{k−1}|Z_{1:k−1}) dx_{k−1}
                    = (1/Ns) Σ_{i=1}^{Ns} δ(x_k − x_{k|k−1}^i)

Since we use Ns points to describe the probabilities, we call those points Particles.
Table of Content
163
Nonlinear Estimation Using Particle Filters
SOLO
We assumed that p(x_k|Z_{1:k}) is a Gaussian PDF. If the true PDF is not Gaussian
(multimodal, heavily skewed, or non-standard – not represented by any standard PDF),
a Gaussian distribution can never describe it well. In such cases approximate
Grid-Based Filters and Particle Filters will yield an improvement, at the cost of
heavy computation demand.
To overcome this difficulty we use the Principle of Importance Sampling.
Suppose that p(x_k|Z_{1:k}) is a PDF from which it is difficult to draw samples.
Also suppose that q(x_k|Z_{1:k}) is another PDF from which samples can be easily drawn
(referred to as the Importance Density), for example a Gaussian PDF.
Now assume that we can find at each sample the scale factor w(x_k) between the
two densities:

   w(x_k) := p(x_k|Z_{1:k}) / q(x_k|Z_{1:k}) > 0

Using this we can write:

   E_{p(x_k|Z_{1:k})}{g(x_k)} = ∫ g(x_k) p(x_k|Z_{1:k}) dx_k
        = ∫ g(x_k) [p(x_k|Z_{1:k}) / q(x_k|Z_{1:k})] q(x_k|Z_{1:k}) dx_k / ∫ [p(x_k|Z_{1:k}) / q(x_k|Z_{1:k})] q(x_k|Z_{1:k}) dx_k
        = ∫ g(x_k) w(x_k) q(x_k|Z_{1:k}) dx_k / ∫ w(x_k) q(x_k|Z_{1:k}) dx_k

Non-Additive Non-Gaussian Nonlinear Filter:   x_k = f(x_{k−1}, w_{k−1}),  z_k = h(x_k, v_k)
Importance Sampling (IS)
164
SOLO
   E_{p(x_k|Z_{1:k})}{g(x_k)} = ∫ g(x_k) w(x_k) q(x_k|Z_{1:k}) dx_k / ∫ w(x_k) q(x_k|Z_{1:k}) dx_k

Generate (draw) Ns particle samples { x_k^i, i = 1,…,Ns } from q(x_k|Z_{1:k}):

   x_k^i ~ q(x_k|Z_{1:k}),   i = 1,…,Ns

and estimate g(x_k) using a Monte Carlo approximation:

   E_{p(x_k|Z_{1:k})}{g(x_k)} ≈ (1/Ns) Σ_{i=1}^{Ns} g(x_k^i) w(x_k^i) / [ (1/Ns) Σ_{i=1}^{Ns} w(x_k^i) ]
                              = (1/Ns) Σ_{i=1}^{Ns} g(x_k^i) w̃(x_k^i)

where the normalized weights are

   w̃(x_k^i) := w(x_k^i) / [ (1/Ns) Σ_{j=1}^{Ns} w(x_k^j) ]

Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter:   x_k = f(x_{k−1}, w_{k−1}),  z_k = h(x_k, v_k)
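A small numerical sketch of this self-normalized importance-sampling estimate; the bimodal target p(x) and the Gaussian importance density q(x) below are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def gauss_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

# hypothetical bimodal target p(x) and Gaussian importance density q(x)
p = lambda x: 0.5 * gauss_pdf(x, -2.0, 0.5) + 0.5 * gauss_pdf(x, 3.0, 1.0)
q_mean, q_var, Ns = 0.0, 9.0, 50_000

x = rng.normal(q_mean, np.sqrt(q_var), size=Ns)    # draw particles from q
w = p(x) / gauss_pdf(x, q_mean, q_var)             # unnormalized weights w(x_i) = p/q
w_tilde = w / w.sum()                              # normalized weights

g_mean = np.sum(w_tilde * x)                       # E_p[g(x)] with g(x) = x, close to 0.5
```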
Importance Sampling (IS)
Table of Content
165
SOLO
It would be useful if the importance density could be generated recursively (sequentially).
   w(x_k) = p(x_k|Z_{1:k}) / q(x_k|Z_{1:k})
          = p(z_k|x_k) p(x_k|Z_{1:k−1}) / [ p(z_k|Z_{1:k−1}) q(x_k|Z_{1:k}) ]          (Bayes)
          = c · p(z_k|x_k) p(x_k|Z_{1:k−1}) / q(x_k|Z_{1:k}),        c := 1 / p(z_k|Z_{1:k−1})

Using:
   p(x_k, x_{k−1}|Z_{1:k−1}) = p(x_k|x_{k−1}, Z_{1:k−1}) p(x_{k−1}|Z_{1:k−1})          (Bayes)

we obtain:
   p(x_k|Z_{1:k−1}) = ∫ p(x_k, x_{k−1}|Z_{1:k−1}) dx_{k−1} = ∫ p(x_k|x_{k−1}, Z_{1:k−1}) p(x_{k−1}|Z_{1:k−1}) dx_{k−1}

In the same way:
   q(x_k|Z_{1:k}) = ∫ q(x_k, x_{k−1}|Z_{1:k−1}) dx_{k−1} = ∫ q(x_k|x_{k−1}, Z_{1:k−1}) q(x_{k−1}|Z_{1:k−1}) dx_{k−1}

Therefore:
   w(x_k) = c · p(z_k|x_k) ∫ p(x_k|x_{k−1}, Z_{1:k−1}) p(x_{k−1}|Z_{1:k−1}) dx_{k−1}
            / ∫ q(x_k|x_{k−1}, Z_{1:k−1}) q(x_{k−1}|Z_{1:k−1}) dx_{k−1}
Sequential Importance Sampling (SIS)
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
166
SOLO
It would be useful if the importance density could be generated recursively.
( ) ( ) ( )
( )
( ) ( ) ( )
( ) ( )∫
∫
−−−−−
−−−−−−
==
11:111:11
11:111:11
:1
1:1
|,|
|,||
|
||
kkkkkk
kkkkkkkk
kk
kkkk
k
xdZxqZxxq
xdZxpZxxpxzpc
Zxq
Zxpxzpc
xw
Suppose that at k−1 we have Ns particle samples and their probabilities
{ x_{k−1|k−1}^i, w_{k−1}^i, i = 1,…,Ns }, which constitute a random measure that characterizes
the posterior PDF for time up to t_{k−1}. Then

   p(x_{k−1}|Z_{1:k−1}) ≈ Σ_{i=1}^{Ns} w_{k−1}^i δ(x_{k−1} − x_{k−1|k−1}^i)
   q(x_{k−1}|Z_{1:k−1}) ≈ Σ_{i=1}^{Ns} w_{k−1}^i δ(x_{k−1} − x_{k−1|k−1}^i)

Substituting these into the expression for w(x_k):

   w(x_k) = c · p(z_k|x_k) ∫ p(x_k|x_{k−1}, Z_{1:k−1}) Σ_{i=1}^{Ns} w_{k−1}^i δ(x_{k−1} − x_{k−1|k−1}^i) dx_{k−1}
            / ∫ q(x_k|x_{k−1}, Z_{1:k−1}) Σ_{i=1}^{Ns} w_{k−1}^i δ(x_{k−1} − x_{k−1|k−1}^i) dx_{k−1}
Sequential Importance Sampling (SIS) (continue – 1)
We obtained:
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
167
SOLO
   w(x_k) = p(x_k|Z_{1:k}) / q(x_k|Z_{1:k}) = c · p(z_k|x_k) p(x_k|Z_{1:k−1}) / q(x_k|Z_{1:k})          (Bayes)

Evaluating w(x_k) at a particle x_{k|k−1}^i drawn from q(x_k|x_{k−1|k−1}^i, Z_{1:k−1}), the δ-function
sums in the numerator and denominator collapse onto the i-th particle:

   w(x_{k|k−1}^i) = c · p(z_k|x_{k|k−1}^i) p(x_{k|k−1}^i|x_{k−1|k−1}^i, Z_{1:k−1}) p(x_{k−1|k−1}^i|Z_{1:k−1})
                    / [ q(x_{k|k−1}^i|x_{k−1|k−1}^i, Z_{1:k−1}) q(x_{k−1|k−1}^i|Z_{1:k−1}) ]

Define

   w_k^i := w(x_{k|k−1}^i) = p(x_{k|k−1}^i|Z_{1:k}) / q(x_{k|k−1}^i|Z_{1:k})
   w_{k−1}^i := w(x_{k−1|k−1}^i) = p(x_{k−1|k−1}^i|Z_{1:k−1}) / q(x_{k−1|k−1}^i|Z_{1:k−1})

Since w_{k−1}^i is exactly the last ratio above, we obtain the weight recursion

   w_k^i = c · w_{k−1}^i · p(z_k|x_{k|k−1}^i) p(x_{k|k−1}^i|x_{k−1|k−1}^i) / q(x_{k|k−1}^i|x_{k−1|k−1}^i)
Sequential Importance Sampling (SIS) (continue – 2)
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
168
SOLO
Sequential Importance Sampling (SIS) (continue – 3)
   w̃_k^i = w̃_{k−1}^i · p(z_k|x_k^i) p(x_k^i|x_{k−1}^i) / q(x_k^i|x_{k−1}^i, z_k),      w_k^i = w̃_k^i / Σ_{j=1}^{Ns} w̃_k^j

   p(x_k|Z_{1:k}) ≈ Σ_{i=1}^{Ns} w_k^i δ(x_k − x_k^i)

SIS Algorithm ("Run This"):

0  Initialization
   Generate (Draw)  x_0^i ~ p_{x0}(x_0),   i = 1,…,Ns

For k ∈ {1, …, ∞}

1  At stage k−1
   Generate (Draw) Ns samples  w_{k−1}^i ~ p_w(w_{k−1})

2  State Update
   x_k^i = f(x_{k−1}^i, u_{k−1}, w_{k−1}^i),   i = 1,…,Ns

3  Start with the approximation  p(x_k|x_{k−1}) ≈ (1/Ns) Σ_{i=1}^{Ns} δ(x_k − x_k^i)
   Generate (Draw) Ns samples  v_k^i ~ p_v(v_k)
   Compute  z_k^i = h(x_k^i, v_k^i)
   Approximate  p(z_k|x_k^i) ≈ (1/Ns) Σ_{i=1}^{Ns} δ(z_k − z_k^i)

4  After measurement z_k we compute the weights w̃_k^i and w_k^i as above, and

   p(x_k|Z_{1:k}) ≈ { x_k^i, w̃_k^i }

k := k+1 & return to 1
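A minimal sketch of one SIS step with the bootstrap choice q(x_k|x_{k−1}, z_k) = p(x_k|x_{k−1}), for which the weight update reduces to multiplication by the likelihood. The names sis_bootstrap_step, w_sampler and likelihood are illustrative, not part of the original algorithm statement.

```python
import numpy as np

def sis_bootstrap_step(particles, weights, z, f, likelihood, w_sampler, rng):
    """One SIS step with bootstrap proposal (sketch of steps 1-4 above).
    w_sampler(rng, n) draws n process-noise samples; likelihood(z, x) = p(z_k|x_k)."""
    Ns = len(particles)
    # 1-2. propagate each particle through the dynamics with fresh process noise
    noise = w_sampler(rng, Ns)
    particles = np.array([f(x, w) for x, w in zip(particles, noise)])
    # 3-4. weight update: with q = p(x_k|x_{k-1}) the ratio reduces to the likelihood
    weights = weights * np.array([likelihood(z, x) for x in particles])
    weights /= weights.sum()                     # normalize
    return particles, weights

# filtered estimate: x_hat = np.sum(weights[:, None] * particles, axis=0)
```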
169
SOLO
The resulting sequential importance sampling (SIS) algorithm is a Monte Carlo method
that forms the basis for most sequential MC Filters.
Sequential Importance Sampling (SIS) (continue – 4)
This sequential Monte Carlo method is known variously as:
• Bootstrap Filtering
• Condensation Algorithm
• Particle Filtering
• Interacting Particle Approximation
• Survival of the Fittest
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
170
SOLO
Degeneracy Problem
Sequential Importance Sampling (SIS) (continue – 5)
A common problem with the SIS particle filter is the degeneracy phenomenon: after
a few iterations, all but one particle will have negligible weights.
It can be shown that the variance of the importance weights w_k^i of the SIS algorithm
can only increase over time, and that leads to the degeneracy problem. A suitable measure
of degeneracy is given by:

   N̂_eff = 1 / Σ_{i=1}^{N} (w_k^i)²,      where  Σ_{i=1}^{N} w_k^i = 1

To see this let us look at the following two cases:

1  w_k^i = 1/N, i = 1,…,N   ⇒   N̂_eff = 1 / Σ_{i=1}^{N} (1/N)² = N

2  w_k^i = 1 for i = j, 0 for i ≠ j   ⇒   N̂_eff = 1 / Σ_{i=1}^{N} (w_k^i)² = 1
Hence, small Neff indicates a severe degeneracy and vice versa.
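The degeneracy measure is a one-liner in practice (a minimal sketch):

```python
import numpy as np

def effective_sample_size(weights):
    """N_eff estimate from normalized weights; resample when it falls below a threshold."""
    w = np.asarray(weights)
    return 1.0 / np.sum(w ** 2)

# uniform weights give N_eff = N, a single dominant weight gives N_eff = 1
print(effective_sample_size(np.full(100, 1 / 100)))       # -> 100.0
print(effective_sample_size(np.r_[1.0, np.zeros(99)]))    # -> 1.0
```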
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
Table of Content
171
SOLO
The Bootstrap (Resampling)
• Popularized by Brad Efron (1979)
• The Bootstrap is a name generically applied to statistical
resampling schemes that allow uncertainty in the data to
be assessed from the data themselves, in other words
"pulling yourself up by your bootstraps".
The disadvantage of bootstrapping is that while (under some conditions) it is
asymptotically consistent, it does not provide general finite-sample
guarantees, and has a tendency to be overly optimistic. The apparent
simplicity may conceal the fact that important assumptions are being made
when undertaking the bootstrap analysis (e.g. independence of samples)
where these would be more formally stated in other approaches.
The advantage of bootstrapping over analytical methods is its great simplicity - it is
straightforward to apply the bootstrap to derive estimates of standard errors and
confidence intervals for complex estimators of complex parameters of the
distribution, such as percentile points, proportions, odds ratio, and correlation
coefficients.
Sequential Importance Resampling (SIR)
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
Bradley Efron
1938
Stanford U.
172
SOLO
Resampling
Sequential Importance Resampling (SIR) (continue – 1)
Whenever a significant degeneracy is observed (i.e., when N_eff falls below some
threshold N_thr) during the sampling, where we obtained

   p(x_k|Z_{1:k}) ≈ Σ_{i=1}^{N} w_k^i δ(x_k − x_k^i)

we need to resample and replace the weighted representation { x_k^i, w_k^i, i = 1,…,N }
with a random measure { x_k^{i*}, 1/N, i = 1,…,N }.

This is done by first computing the Cumulative Distribution Function (C.D.F.) of the
sampled weights w_k^i:

   Initialize the C.D.F.:  c_1 = w_k^1
   For i = 2:N
      Compute the C.D.F.:  c_i = c_{i−1} + w_k^i
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
173
SOLO
Resampling (continue – 1)
Sequential Importance Resampling (SIR) (continue – 2)
Using the Inverse Transform method, we generate N independent and
identically distributed (i.i.d.) variables from the uniform distribution u, sort them in
ascending order, and compare them with the Cumulative Distribution Function (C.D.F.)
of the normalized weights.
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
174
SOLO
Resampling Algorithm (continue – 2)
Sequential Importance Resampling (SIR) (continue – 3)
0  Initialize the C.D.F.:  c_1 = w_k^1
   For i = 2:N,  compute the C.D.F.:  c_i = c_{i−1} + w_k^i
   Start at the bottom of the C.D.F.:  i = 1
   Draw from the uniform distribution:  u_1 ~ U[0, N^{-1}]

1  For j = 1:N

2     Move along the C.D.F.:  u_j = u_1 + (j − 1) N^{-1}

3     WHILE u_j > c_i
          i := i + 1
      END WHILE

4     Assign sample:  x_k^{j*} = x_k^i
      Assign weight:  w_k^j = N^{-1}
      Assign parent:  i^j = i

5  END For
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
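A sketch of the C.D.F.-based (systematic) resampling above, with the inner WHILE loop expressed through np.searchsorted; the function name systematic_resample is illustrative.

```python
import numpy as np

def systematic_resample(particles, weights, rng=np.random.default_rng()):
    """Systematic resampling: one uniform draw u_1 ~ U[0, 1/N], then strides of 1/N
    along the cumulative weights, as in the C.D.F. algorithm above."""
    N = len(weights)
    c = np.cumsum(weights)           # C.D.F. of the normalized weights
    c[-1] = 1.0                      # guard against round-off
    u = rng.uniform(0.0, 1.0 / N) + np.arange(N) / N
    idx = np.searchsorted(c, u)      # the WHILE u_j > c_i loop, vectorized
    return particles[idx], np.full(N, 1.0 / N), idx   # samples, weights 1/N, parent indices
```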
175
SOLO
Resampling
Sequential Importance Resampling (SIR) (continue – 4)
SIR loop ("Run This"):

0  Start with the approximation
      p(x_k|Z_{1:k−1}) ≈ { x_k^i, 1/N },   i.e.  p(x_k|Z_{1:k−1}) ≈ (1/N) Σ_{i=1}^{N} δ(x_k − x_k^i)

1  After measurement z_k compute the weights
      w̃_k^i = w̃_{k−1}^i · p(z_k|x_k^i) p(x_k^i|x_{k−1}^i) / q(x_k^i|x_{k−1}^i, z_k),     w_k^i = w̃_k^i / Σ_j w̃_k^j
   to obtain
      p(x_k|Z_{1:k}) ≈ { x_k^i, w̃_k^i },      p(x_k|Z_{1:k}) ≈ Σ_{i=1}^{N} w_k^i δ(x_k − x_k^i)

2  If  N_eff = 1 / Σ_{i=1}^{N} (w_k^i)²  <  N_thr,  Resample
   to obtain  p(x_k|Z_{1:k}) ≈ { x_k^{i*}, 1/N }

3  Prediction
      x_{k+1}^i = f(x_k^{i*}, u_k, n_k^i)
   to obtain  p(x_{k+1}|Z_{1:k}) ≈ { x_{k+1}^i, 1/N }

k := k+1 & return to 1
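Putting the pieces together, a hedged sketch of one full SIR cycle (bootstrap proposal, weight update, N_eff test, systematic resampling); propagate and likelihood are assumed user-supplied callables, and the default threshold N/2 is an arbitrary illustrative choice.

```python
import numpy as np

def sir_step(particles, weights, z, propagate, likelihood, rng, n_thr=None):
    """One SIR cycle (sketch of steps 0-3 above).
    propagate(x, rng) draws x_k ~ p(x_k|x_{k-1}); likelihood(z, x) = p(z_k|x_k)."""
    N = len(particles)
    n_thr = N // 2 if n_thr is None else n_thr

    particles = np.array([propagate(x, rng) for x in particles])   # prediction
    weights = weights * np.array([likelihood(z, x) for x in particles])
    weights /= weights.sum()

    n_eff = 1.0 / np.sum(weights ** 2)
    if n_eff < n_thr:                                               # resample
        c = np.cumsum(weights)
        u = rng.uniform(0, 1.0 / N) + np.arange(N) / N
        particles = particles[np.searchsorted(c, u)]
        weights = np.full(N, 1.0 / N)
    return particles, weights
```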
176
SOLO
Resampling
Sequential Importance Resampling (SIR) (continue – 5)
Although the resampling step reduces the effect of degeneracy, it introduces other
practical problems:

1  It limits the possibility of parallel implementation.

2  The particles that have high w_k^i are statistically selected many times. This leads to
   loss of diversity among the particles (sample impoverishment).

Several other techniques for generating samples from an unknown P.D.F., besides
Importance Sampling, have been presented in the literature. If the P.D.F. is stationary,
Markov Chain Monte Carlo (MCMC) methods have been proposed:
• Metropolis – Hastings (MH)
• Gibbs sampler (a special case of MH)
(see Probability Presentation)
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
177
SOLO
Selection of Importance Density
Sequential Importance Resampling (SIR) (continue – 6)
The choice of the Importance Density q (xk|xk-1,zk) is one of the most critical issues in
the design of the Particle Filter.
The Optimal Choice
The Optimal Importance Density q (xk|xk-1,zk), that minimizes the variance of importance
weights, conditioned upon xk-1
i
and zk has been shown to be:
   q(x_k|x_{k−1}^i, z_k)_opt = p(x_k|x_{k−1}^i, z_k)
                             = p(z_k|x_k, x_{k−1}^i) p(x_k|x_{k−1}^i) / p(z_k|x_{k−1}^i)        (Bayes)

Substitution of this into:

   w_k^i = w_{k−1}^i · p(z_k|x_k^i) p(x_k^i|x_{k−1}^i) / q(x_k^i|x_{k−1}^i, z_k)

gives:

   w_k^i = w_{k−1}^i · p(z_k|x_{k−1}^i)

From this equation we can see that the importance weights at time k can be computed
(and, if necessary, resampling can be performed) before the particles are propagated to time k.

In order to use the optimal importance function we must:

1  sample from p(x_k|x_{k−1}, z_k);

2  evaluate:   p(z_k|x_{k−1}^i) = ∫ p(z_k|x_k) p(x_k|x_{k−1}^i) dx_k
In the general case either of these two tasks can be difficult.
Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter ( )
( )kkk
kkk
vxhz
wxfx
,
, 11
=
= −−
178
Sequential Importance Resampling Particle Filter (SIRPF)
SOLO
SIRPF Summary
Initialization of SIRPF
      Draw  x̂_0 ~ p_{x0}(x_0)

System Definition
      x_k = f(k−1, x_{k−1}, u_{k−1}, w_{k−1}),   w_{k−1} ~ p_w(w),   x_0 ~ p_{x0}(x_0)
      z_k = h(k, x_k, v_k),                      v_k ~ p_v(v)

For k ∈ {1, …, ∞}

1  Assuming for k−1 a Gaussian distribution with Mean and Covariance x̂_{k−1|k−1}, P_{k−1|k−1}
   Generate Ns samples
      x_{k−1|k−1}^i = N(x_{k−1}; x̂_{k−1|k−1}, P_{k−1|k−1}),   i = 1,…,Ns

2  State Prediction and its Covariance
      x_{k|k−1}^i = f(k−1, x_{k−1|k−1}^i, u_{k−1}),   i = 1,…,Ns

3  Assuming a Gaussian distribution with Mean and Covariance x̂_{k|k−1}, P_{k|k−1}
   Generate new Ns samples
      x_{k|k−1}^j = N(x_k; x̂_{k|k−1}, P_{k|k−1}),   j = 1,…,Ns
Table of Content
179
Monte Carlo Particle Filter (MCPF)
SOLO
MCPF Summary
Initialization of MCPF
      x̂_0 = E{x_0},   P_{0|0} = E{(x_0 − x̂_0)(x_0 − x̂_0)^T}
      Augmented state  x^a := [x^T  w^T  v^T]^T
      x̂_0^a = E{x^a} = [x̂_0^T  0  0]^T,   P_{0|0}^a = E{(x^a − x̂_0^a)(x^a − x̂_0^a)^T} = diag(P_{0|0}, Q, R)

System Definition
      x_k = f(k−1, x_{k−1}, u_{k−1}) + w_{k−1},   E{w_k} = 0,  E{w_k w_l^T} = Q_k δ_{kl}
      z_k = h(k, x_k) + v_k,                      E{v_k} = 0,  E{v_k v_l^T} = R_k δ_{kl}

For k ∈ {1, …, ∞}

1  Assuming for k−1 a Gaussian distribution with Mean and Covariance x̂_{k−1|k−1}, P_{k−1|k−1}
   Generate Ns samples
      x_{k−1|k−1}^i = N(x_{k−1}; x̂_{k−1|k−1}, P_{k−1|k−1}),   i = 1,…,Ns

2  State Prediction and its Covariance
      x_{k|k−1}^i = f(k−1, x_{k−1|k−1}^i, u_{k−1}),   i = 1,…,Ns
      x̂_{k|k−1} = (1/Ns) Σ_{i=1}^{Ns} x_{k|k−1}^i
      P_{k|k−1} = (1/Ns) Σ_{i=1}^{Ns} (x_{k|k−1}^i − x̂_{k|k−1})(x_{k|k−1}^i − x̂_{k|k−1})^T

3  Assuming a Gaussian distribution with Mean and Covariance x̂_{k|k−1}, P_{k|k−1}
   Generate new Ns samples
      x_{k|k−1}^j = N(x_k; x̂_{k|k−1}, P_{k|k−1}),   j = 1,…,Ns
180
Monte Carlo Particle Filter (MCPF)
SOLO
MCPF Summary (continue – 1)
4  Measure Prediction
      z_{k|k−1}^j = h(k, x_{k|k−1}^j),   j = 1,…,Ns
      ẑ_{k|k−1} = (1/Ns) Σ_{j=1}^{Ns} z_{k|k−1}^j

5  Predicted Covariances Computations
      S_k = P_{k|k−1}^{zz} = (1/Ns) Σ_{j=1}^{Ns} (z_{k|k−1}^j − ẑ_{k|k−1})(z_{k|k−1}^j − ẑ_{k|k−1})^T
      P_{k|k−1}^{xz} = (1/Ns) Σ_{j=1}^{Ns} (x_{k|k−1}^j − x̂_{k|k−1})(z_{k|k−1}^j − ẑ_{k|k−1})^T

6  Innovation and its Covariance
      i_k = z_k − ẑ_{k|k−1}

7  Kalman Gain Computations
      K_k = P_{k|k−1}^{xz} (P_{k|k−1}^{zz})^{-1}

8  Kalman Filter
      μ_{k|k}^x = x̂_{k|k−1} + K_k i_k
      Σ_{k|k}^{xx} = P_{k|k−1} − K_k S_k K_k^T

9  Importance Sampling using the Gaussian Mean and Covariance μ_{k|k}^x, Σ_{k|k}^{xx}
   Generate new Ns samples
      x_{k|k}^m = N(x; μ_{k|k}^x, Σ_{k|k}^{xx}),   m = 1,…,Ns

10 Weight Update
      w̃_k^m = p(z_k|x_{k|k}^m) N(x_{k|k}^m; x̂_{k|k−1}, P_{k|k−1}) / N(x_{k|k}^m; μ_{k|k}^x, Σ_{k|k}^{xx}),   m = 1,…,Ns
      w_k^m = w̃_k^m / Σ_{l=1}^{Ns} w̃_k^l

11 Update State and its Covariance
      x̂_{k|k} = Σ_{m=1}^{Ns} w_k^m x_{k|k}^m
      P_{k|k} = Σ_{m=1}^{Ns} w_k^m (x_{k|k}^m − x̂_{k|k})(x_{k|k}^m − x̂_{k|k})^T

k := k+1 & return to 1
181
[Block diagram: Input Data → Sensor Data Processing and Measurement Formation → Observation-to-Track Association → Track Maintenance (Initialization, Confirmation and Deletion) → Filtering and Prediction → Gating Computations]

Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
SOLO
Monte Carlo Particle Filter (MCPF)
Table of Content
182
Estimators
z = H x + v
SOLO
Maximum Likelihood Estimate (MLE)
For the particular vector measurement equation

   z = H x + v

where the measurement noise v is gaussian (normal) with zero mean, v ~ N(0, R),
and independent of x, the conditional probability p_{z|x}(z|x) can be written,
using Bayes rule, as:

   p_{z|x}(z|x) = p_{x,z}(x, z) / p_x(x)

The measurement noise v can be related to x and z by the function:

   v = z − H x = f(x, z)

   p_{x,z}(x, z) = p_{x,v}(x, v) / |J J^T|^{1/2}

where the Jacobian of the transformation is

   J = ∂f/∂z = [ ∂f_i/∂z_j ] = I_{p×p}      ⇒  |J J^T|^{1/2} = 1

Since the measurement noise v is independent of x, the joint probability of x and z is given by:

   p_{x,z}(x, z) = p_{x,v}(x, v) = p_x(x) · p_v(v)
183
Estimators    SOLO
Maximum Likelihood Estimate (continue – 1)
   p_{x,z}(x, z) = p_x(x) · p_v(v)

   p_{z|x}(z|x) = p_{x,z}(x, z) / p_x(x) = p_v(v),      v = z − H x      (gaussian, with zero mean)

   p_{z|x}(z|x) = 1/[(2π)^{p/2} |R|^{1/2}] exp{ −½ (z − H x)^T R^{-1} (z − H x) }

   max_x p_{z|x}(z|x)   ⇔   min_x (z − H x)^T R^{-1} (z − H x)       (Weighted Least Squares, WLS(R))

   ∂/∂x [ (z − H x)^T R^{-1} (z − H x) ] = −2 H^T R^{-1} (z − H x) = 0

   H^T R^{-1} z − H^T R^{-1} H x* = 0

   x̂ := x* = (H^T R^{-1} H)^{-1} H^T R^{-1} z

   ∂²/∂x² [ (z − H x)^T R^{-1} (z − H x) ] = 2 H^T R^{-1} H

This is a positive definite matrix, therefore the solution minimizes
(z − H x)^T R^{-1} (z − H x) and maximizes p_{z|x}(z|x).

   L(z, x) := p_{z|x}(z|x)

is called the Likelihood Function and is a measure of how likely the parameter x is,
given the observation z.
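The closed-form estimate is straightforward to compute; a minimal sketch with hypothetical numbers:

```python
import numpy as np

def mle_linear_gaussian(H, R, z):
    """Weighted least-squares / ML estimate x* = (H^T R^-1 H)^-1 H^T R^-1 z."""
    Rinv = np.linalg.inv(R)
    A = H.T @ Rinv @ H            # positive definite when H has full column rank
    return np.linalg.solve(A, H.T @ Rinv @ z)

# example: two noisy measurements of a scalar x; the noisier one gets less weight
H = np.array([[1.0], [1.0]])
R = np.diag([1.0, 4.0])
z = np.array([2.0, 3.0])
print(mle_linear_gaussian(H, R, z))   # -> [2.2]
```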
Table of Content
184
Estimators    SOLO

Bayesian Maximum Likelihood Estimate (Maximum Aposteriori – MAP Estimate)

Consider a gaussian vector x, where x ~ N[x̄(−), P(−)], and the measurement
z = H x + v, where the gaussian noise v ~ N(0, R) is independent of x.

   p_x(x) = 1/[(2π)^{n/2} |P(−)|^{1/2}] exp{ −½ (x − x̄(−))^T P(−)^{-1} (x − x̄(−)) }

   p_{z|x}(z|x) = p_v(z − H x) = 1/[(2π)^{p/2} |R|^{1/2}] exp{ −½ (z − H x)^T R^{-1} (z − H x) }

   p_z(z) = ∫ p_{x,z}(x, z) dx = ∫ p_{z|x}(z|x) p_x(x) dx

p_z(z) is gaussian with

   E(z) = E(H x + v) = H E(x) + E(v) = H x̄(−)

   cov(z) = E{[z − E(z)][z − E(z)]^T} = E{[H(x − x̄(−)) + v][H(x − x̄(−)) + v]^T}
          = H E{(x − x̄(−))(x − x̄(−))^T} H^T + E{v v^T}          (the cross terms vanish)
          = H P(−) H^T + R

   p_z(z) = 1/[(2π)^{p/2} |H P(−) H^T + R|^{1/2}] exp{ −½ [z − H x̄(−)]^T [H P(−) H^T + R]^{-1} [z − H x̄(−)] }

185
Estimators    SOLO

Bayesian Maximum Likelihood Estimate (Maximum Aposteriori Estimate) (continue – 1)

from which

   p_{x|z}(x|z) = p_{z|x}(z|x) p_x(x) / p_z(z)
                = |H P(−) H^T + R|^{1/2} / [(2π)^{n/2} |P(−)|^{1/2} |R|^{1/2}] ·
                  exp{ −½ [ (z − H x)^T R^{-1} (z − H x) + (x − x̄(−))^T P(−)^{-1} (x − x̄(−))
                            − (z − H x̄(−))^T [H P(−) H^T + R]^{-1} (z − H x̄(−)) ] }

186
Estimators    SOLO

Bayesian Maximum Likelihood Estimate (Maximum Aposteriori Estimate) (continue – 2)

Writing  z − H x = [z − H x̄(−)] − H [x − x̄(−)]  and expanding, the exponent becomes

   [x − x̄(−)]^T [P(−)^{-1} + H^T R^{-1} H] [x − x̄(−)]
   − [x − x̄(−)]^T H^T R^{-1} [z − H x̄(−)] − [z − H x̄(−)]^T R^{-1} H [x − x̄(−)]
   + [z − H x̄(−)]^T { R^{-1} − [H P(−) H^T + R]^{-1} } [z − H x̄(−)]

Using the matrix identity

   R^{-1} − [H P(−) H^T + R]^{-1} = R^{-1} H [P(−)^{-1} + H^T R^{-1} H]^{-1} H^T R^{-1}

and defining

   P(+) := [P(−)^{-1} + H^T R^{-1} H]^{-1}

the exponent regroups into a perfect square:

   { [x − x̄(−)] − P(+) H^T R^{-1} [z − H x̄(−)] }^T P(+)^{-1} { [x − x̄(−)] − P(+) H^T R^{-1} [z − H x̄(−)] }

187
Estimators    SOLO

Bayesian Maximum Likelihood Estimate (Maximum Aposteriori Estimate) (continue – 3)

then

   p_{x|z}(x|z) = 1/[(2π)^{n/2} |P(+)|^{1/2}] ·
                  exp{ −½ [x − x̄(−) − P(+) H^T R^{-1}(z − H x̄(−))]^T P(+)^{-1} [x − x̄(−) − P(+) H^T R^{-1}(z − H x̄(−))] }

where:

   P(+) := [P(−)^{-1} + H^T R^{-1} H]^{-1}

Maximizing the aposteriori density, max_x p_{x|z}(x|z), gives the MAP estimate

   x* = x̂(+) := x̄(−) + P(+) H^T R^{-1} [z − H x̄(−)]
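A minimal sketch of this closed-form Gaussian MAP update; the example numbers are hypothetical.

```python
import numpy as np

def map_estimate(x_prior, P_prior, H, R, z):
    """Gaussian MAP update: P(+) = (P(-)^-1 + H^T R^-1 H)^-1,
    x(+) = x(-) + P(+) H^T R^-1 (z - H x(-))."""
    Rinv = np.linalg.inv(R)
    P_post = np.linalg.inv(np.linalg.inv(P_prior) + H.T @ Rinv @ H)
    x_post = x_prior + P_post @ H.T @ Rinv @ (z - H @ x_prior)
    return x_post, P_post

# example: scalar state with prior variance 4, one measurement with variance 1
x_post, P_post = map_estimate(np.array([0.0]), np.array([[4.0]]),
                              np.array([[1.0]]), np.array([[1.0]]), np.array([2.0]))
# x_post -> [1.6], P_post -> [[0.8]]  (prior pulled toward the measurement)
```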
Table of Content
SOLO
Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation

A continuous dynamic system is described by:

   dx(t) = f(x(t), t) dt + dw(t),      t ∈ [t_0, t_f]

   x(t)  – n-dimensional state vector
   dw(t) – n-dimensional process noise vector described by the covariance matrix Q:
           E{dw(t)} =: dŵ(t) = 0
           E{[dw(t) − dŵ(t)][dw(τ) − dŵ(τ)]^T} = Q(t) δ(t − τ)
   p[x(t)] – the probability density of the state x at time t

Fred Daum, from Raytheon Company, leads methods to design Nonlinear Filters
starting from the Fokker-Planck Equation.

The time evolution of the probability density function is described by the
Fokker–Planck equation:

   ∂p[x(t)]/∂t = − ∂/∂x { f[x(t), t] p[x(t)] } + ½ ∂/∂x { Q(t) ∂p[x(t)]/∂x }

where

   ∂/∂x { f[x(t), t] p[x(t)] } = Σ_{i=1}^{n} ∂/∂x_i { f_i[x(t), t] p[x(t)] }

   ∂p[x(t)]/∂x = [ ∂p[x(t)]/∂x_1, ∂p[x(t)]/∂x_2, …, ∂p[x(t)]/∂x_n ]^T

Fred Daum

Return to Stochastic Processes
SOLO
Nonlinear Filters

Assuming system measurements at discrete times t_k given by:

   z_k = h[x(t_k), t_k, v_k],      t_k ∈ [t_0, t_f]

   v_k – m-dimensional measurement noise vector at t_k

We are interested in the probability of the state x at time t given the set of discrete
measurements up to and including time t_k < t:

   p(x, t | Z_k),      Z_k = { z_1, z_2, …, z_k } – set of all measurements up to and including time t_k

Bayes' Rule:

   p(x, t_k | Z_k) = p(x, t_k | z_k, Z_{k−1}) = p(z_k | x, t_k) p(x, t_k | Z_{k−1}) / p(z_k | Z_{k−1})

   p(x, t_k | Z_{k−1}) – probability of x at time t_k given Z_{k−1} (apriori – before measurement z_k)
   p(x, t_k | Z_k)     – probability of x at time t_k given Z_k (aposteriori – after measurement z_k)
   p(z_k | x, t_k)     – probability of measurement z_k given the state x at time t_k (likelihood of measurement)
   p(z_k | Z_{k−1})    – probability of measurement z_k given Z_{k−1} (normalization of the conditional probability)
Nonlinear Filters based on the Fokker-Planck Equation
SOLO
In the Classical Particle Filter solution
the particles are drawn using the apriori
density, which decides their distribution (see
Figure).
After the measurement, the Likelihood of
Measurement is obtained, and nothing
prevents a low density of the previously drawn
particles in the Likelihood region.
This is Particle Degeneracy, which
produces the curse of dimensionality.
Nonlinear Filters
Fred Daum
Nonlinear Filters based on the Fokker-Planck Equation
[Figure: particles drawn to represent the prior density vs. the likelihood of the
measurement – Particle Degeneracy, a cause of the Curse of Dimensionality]
The Particle Filter solutions have implementation problems.
The Number of Particles necessary to
reduce the Filter Error increases with the
System Dimension. Daum gives the
Filter Error as a function of the Number of
Particles, with the System Dimension as a
Parameter.
http://sc.enseeiht.fr/doc/Seminar_Daum_2012_2.pdf
SOLO
Nonlinear Filters    Fred Daum
Nonlinear Filters based on the Fokker-Planck Equation

By taking the natural logarithm of the conditional probability, we get on the
right side a sum of logarithms:

   ln p(x, t_k|Z_k)  =  ln p(x, t_k|Z_{k−1})  +  ln p(z_k|x, t_k)  −  ln p(z_k|Z_{k−1})
     (aposteriori)        (apriori)              (likelihood)         (normalization)

The homotopy

   ln p(x, λ)  =  ln g(x)  +  λ ln h(x)  −  ln K(λ)
   (aposteriori)  (apriori)   (likelihood)   (normalization)

[Figure: flow of the density from the apriori p.d.f. to the aposteriori p.d.f., and the
induced flow of particles for Bayes' Rule – sample from the apriori density, flow the
particles, obtain samples from the aposteriori density]

To obtain the aposteriori probability p(x, t_k|Z_k) from the apriori probability p(x, t_k|Z_{k−1}) and the
likelihood p(z_k|x, t_k), Daum uses a homotopy procedure (see next slide) by choosing a homotopy
continuous parameter λ ∈ [0, 1]. He searches for a function f(x, λ) (not related to the
filtered system) that describes the flow of the particles and is associated with p(x, t_k|Z_k):

   dx/dλ = f(x, λ) + Q(x, λ) dw/dλ          Particle Flow Equation

   Q(x, λ) – Noise Spectrum to be defined

Since p(x, λ) is the p.d.f. associated with a system defined by f(x, λ), we have the
Fokker-Planck Equation:

   ∂p(x, λ)/∂λ = − ∂/∂x { f(x, λ) p(x, λ) } + ½ ∂/∂x { Q(x, λ) ∂p(x, λ)/∂x }

Here we describe Daum's proposed methods, called Particle Flow Filters.
01/13/15 192
Homotopy
In topology, two continuous functions from one topological space to
another are called homotopic (Greek ὁμός (homós) = same, similar,
and τόπος (tópos) = place) if one can be "continuously deformed" into
the other, such a deformation being called a homotopy between the two
functions. An outstanding use of homotopy is the definition of
homotopy groups and cohomotopy groups, important invariants in
algebraic topology.

A Homotopy of a Coffee
Cup into a Doughnut

Formal definition

Formally, a homotopy between two continuous functions f and g from a
topological space X to a topological space Y is defined to be a continuous
function H : X × [0,1] → Y from the product of the space X with the unit
interval [0,1] to Y such that, if x ∈ X, then H(x,0) = f(x) and H(x,1) = g(x).
If we think of the second parameter of H as time, then H describes a
continuous deformation of f into g: at time 0 we have the function f and
at time 1 we have the function g.

An alternative notation is to say that a homotopy between two
continuous functions f, g : X → Y is a family of continuous functions h_t :
X → Y for t ∈ [0,1] such that h_0 = f and h_1 = g, and the map t ↦ h_t is
continuous from [0,1] to the space of all continuous functions X → Y.
The two versions coincide by setting h_t(x) = H(x,t).
SOLO
SOLO Nonlinear Filters
Fred Daum
Fokker-Planck Equation:

   ∂p(x, λ)/∂λ = − ∂/∂x { f(x, λ) p(x, λ) } + ½ ∂/∂x { Q(x, λ) ∂p(x, λ)/∂x }

Definition of p(x, λ):

   ln p(x, λ) = ln g(x) + λ ln h(x) − ln K(λ)

We have:

   ∂ln p(x, λ)/∂λ = ln h(x) − d ln K(λ)/dλ = [ ∂p(x, λ)/∂λ ] / p(x, λ)

so the Fokker-Planck equation becomes a Partial Differential Equation for f given p:

   p(x, λ) [ ln h(x) − d ln K(λ)/dλ ] = − ∂/∂x { f(x, λ) p(x, λ) } + ½ ∂/∂x { Q(x, λ) ∂p(x, λ)/∂x }

Dividing by p(x, λ):

   ln h(x) − d ln K(λ)/dλ = − ∂f(x, λ)/∂x − f(x, λ) ∂ln p(x, λ)/∂x + [1/p(x, λ)] ½ ∂/∂x { Q(x, λ) ∂p(x, λ)/∂x }

Nonlinear Filters based on the Fokker-Planck Equation
SOLO Nonlinear Filters
Fred Daum

Differentiate this Equation as a function of x:

   [∂ln h(x)/∂x]^T = − ∂²f(x, λ)/∂x² − [∂f(x, λ)/∂x] ∂ln p(x, λ)/∂x − [∂²ln p(x, λ)/∂x²] f(x, λ)
                     + ∂/∂x { [1/p(x, λ)] ½ ∂/∂x [ Q(x, λ) ∂p(x, λ)/∂x ] }

One option to simplify the problem is to choose Q(x, λ) such that:

   − ∂²f(x, λ)/∂x² − [∂f(x, λ)/∂x] ∂ln p(x, λ)/∂x + ∂/∂x { [1/p(x, λ)] ½ ∂/∂x [ Q(x, λ) ∂p(x, λ)/∂x ] } = 0

We obtain

   [∂ln h(x)/∂x]^T = − [∂²ln p(x, λ)/∂x²] f(x, λ)

   f(x, λ) = − [∂²ln p(x, λ)/∂x²]^{-1} [∂ln h(x)/∂x]^T
Nonlinear Filters based on the Fokker-Planck Equation
SOLO Nonlinear Filters
Fred Daum
The second option to simplify the problem is to choose  Q(x, λ) = 0:

   dx/dλ = f(x, λ)          Particle Flow Equation

   ∂p(x, λ)/∂λ = − ∂/∂x { f(x, λ) p(x, λ) }          Fokker-Planck Equation

   ln p(x, λ) = ln g(x) + λ ln h(x) − ln K(λ)          Definition of p(x, λ)

   ∂ln p(x, λ)/∂λ = ln h(x) − d ln K(λ)/dλ = − [1/p(x, λ)] ∂/∂x { f(x, λ) p(x, λ) }

Define

   η(x, λ) := − [ ln h(x) − d ln K(λ)/dλ ] p(x, λ)          (known)

We obtain the P.D.E. for f given p:

   ∂/∂x { p(x, λ) f(x, λ) } = η(x, λ)

Nonlinear Filters based on the Fokker-Planck Equation
SOLO Nonlinear Filters
Fred Daum

With  q := p f  (f the unknown function; p and η known at the random points, i.e. at the particles):

   ∂q/∂x = ∂q_1/∂x_1 + ∂q_2/∂x_2 + … + ∂q_d/∂x_d = η(x, λ)          (a divergence equation)

1. Linear PDE in the unknown f or q.
2. Constant coefficient PDE in q.
3. First Order PDE.
4. Highly undetermined PDE.
5. Same as the Gauss divergence law in Maxwell's Equations.
6. Same as Euler's Equation in Fluid Dynamics.
7. Existence of solutions if and only if the integral of η is zero.

Exact Flow Solutions for g & h Gaussian Densities:

   f(x, λ) = A(λ) x + b(λ)

   A(λ) = −½ P H^T (λ H P H^T + R)^{-1} H

   b(λ) = (I + 2λ A(λ)) [ (I + λ A(λ)) P H^T R^{-1} z + A(λ) x̄ ]

Automatically stable under very mild conditions & extremely fast
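A hedged sketch of the exact (Gaussian prior, Gaussian likelihood) particle flow above, integrating dx/dλ = A(λ)x + b(λ) with simple Euler steps over λ ∈ [0, 1]; the step count and the Euler scheme are arbitrary illustrative choices, not part of Daum's formulation.

```python
import numpy as np

def exact_flow_update(particles, x_bar, P, H, R, z, n_steps=20):
    """Move particles from the prior N(x_bar, P) to the posterior for z = H x + v,
    v ~ N(0, R), using the closed-form Daum-Huang flow A(lam), b(lam) above."""
    lambdas = np.linspace(0.0, 1.0, n_steps + 1)
    Rinv = np.linalg.inv(R)
    I = np.eye(len(x_bar))
    x = particles.copy()                          # shape (Ns, n)
    for lo, hi in zip(lambdas[:-1], lambdas[1:]):
        lam, dlam = lo, hi - lo
        A = -0.5 * P @ H.T @ np.linalg.inv(lam * H @ P @ H.T + R) @ H
        b = (I + 2 * lam * A) @ ((I + lam * A) @ P @ H.T @ Rinv @ z + A @ x_bar)
        x = x + dlam * (x @ A.T + b)              # Euler step for every particle
    return x
```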
Fred Daum
SOLO Nonlinear Filters
F. Daum, J. Huang, Particle Flow for Nonlinear Filters, Bayesian Decision and
Transport, 7 April 2014
SOLO Nonlinear Filters
Fred Daum
Table of Content
199
Recursive Bayesian Estimation
References:
SOLO
1. Sage, A.P., & Melsa, J.L., “Estimation Theory with Applications to Communications
and Control”, McGraw Hill, 1971
2. Gordon, N.J., Salmond, D.J., Smith, A.M.F., “Novel Approach to
Nonlinear/Non- Gaussian Bayesian State Estimation”, IEE Proceedings Radar
and Signal Processing, vol. 140, No. 2, April 1993, pp. 107 - 113
7. Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques
Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation,
January 2005
5. Arulampalam,S., Maskell,S., Gordon,N., Clapp,T., “A Tutorial on Particle
Filters for On-line Non-linear/Non-Gaussian Bayesian Tracking”, IEEE
Transactions on Signal Processing, Vol. 50, No. 2, February 2002
6. Ristic,B., Arulampalam,S., Gordon,N., “Beyond the Kalman Filter Particle
Filters for Tracking Applications”, Artech House, 2004
4. Karlsson, R., “Simulation Based Methods for Target Tracking”, Department of
Electrical Engineering Linköpings Universitet, 2002
3. Doucet,A., de Freitas,N., Gordon,N., Ed. “Sequential Monte Carlo Methods in
Practice”, Springer, 2001
200
Recursive Bayesian Estimation
References (continue – 1):
SOLO
Fred Daum, “Particle Flow for Nonlinear Filters”, 19 July 2012
http://sc.enseeiht.fr/doc/Seminar_Daum_2012_2.pdf
https://www.ll.mit.edu/asap/asap_06/pdf/Papers/23_Daum_Pa.pdf
Fred Daum, Misha Krichman, “Non-Particle Filters”,
F. Daum, J. Huang, Particle Flow for Nonlinear Filters, Bayesian Decision and
Transport, 7 April 2014
http://meeting.xidian.edu.cn/workshop/miis2014/uploads/files/July-5th-930am_Fred
%20Daum_Particle%20flow%20for%20nonliner%20filters,%20Bayesuan
%20Decisions%20and%20Transport%20.pdf
http://www.dsi.unifi.it/users/chisci/idfric/Nonlinear_filtering_Chen.pdf
Zhe Chen, “Bayesian Filtering From Kalman Filters to Particle Filters,
and Beyond”, 18.05.06,
Table of Content
201January 13, 2015 201
SOLO
Technion
Israeli Institute of Technology
1964 – 1968 BSc EE
1968 – 1971 MSc EE
Israeli Air Force
1970 – 1974
RAFAEL
Israeli Armament Development Authority
1974 – 2013
Stanford University
1983 – 1986 PhD AA
202
“Proceedings of the IEEE”, March 2004, Special Issue on:
“Sequential State Estimation: From Kalman Filters to Particle Filters”
Julier, S. J. and Uhlmann, J. K., “Unscented Filtering and Nonlinear Estimation”,
pp. 401 – 422
Recursive Bayesian Estimation
203
SOLO
Neil GordonM. Sanjev
Arulampalam
Tim ClappSimon MaskellNando de FreitasArnaud Doucet
Branko Ristic
Genshiro Kitagawa Christophe Andrieu
Dan Crişan Fred Daum
Recursive Bayesian Estimation
204
Markov Chain Monte Carlo (MCMC)SOLO
Some MCMC Developments Related to Vision
Nicholas Constantine Metropolis
( 1915 – 1999)
Metropolis 1946
Hastings 1970
Heat bath
Miller, Grenader, 1994
Green 1995
DDMCMC 2001 - 2005
Waltz 1972, (labeling)
Rosenfeld, Hummel, Zucker
1976 (relaxation)
Geman brothers 1984,
(Gibbs sampler)
Kirkpatrick 1983
Swendsen-Wang 1987
(clustering)
Swendsen-Wang Cut 2003
205
Markov Chain Monte Carlo (MCMC)SOLO
A Brief History of MCMC
Nicholas Constantine Metropolis
( 1915 – 1999)
1942 – 1946: Real use of Monte Carlo started during WWII
- study of the atomic bomb (neutron diffusion in fissile material)
1948: Fermi, Metropolis, Ulam obtained Monte Carlo estimates for
the eigenvalues of the Schrödinger equations.
1950: Formulation of the basic construction of MCMC, e.g. the Metropolis method
- application to statistical physics models, such as the Ising model
1960 - 80: using MCMC to study phase transitions; material growth/defects,
macro molecules (polymers), etc.
1980s: Gibbs sampler (Geman brothers), Simulated annealing, data augmentation,
Swendsen-Wang, etc.
global optimization; image and speech; quantum field theory
1990s: Applications in genetics; computational biology.
206
Rao – Blackwell TheoremSOLO
Rao-Blackwell Theorem provides a process by which a possible improvement in
efficiency of an estimator can be obtained by taking its conditional expectation with
respect to a sufficient statistic.
The result on one parameter appeared in Rao (1945)
and in Blackwell (1947). Lehmann and Scheffé (1950)
called the result the Rao-Blackwell Theorem (RBT),
and the process is described as Rao-Blackwellization
(RB) by Berkson (1955). In computational terminology
it is called the Rao-Blackwellized Filter (RBF).
Calyampudi Radhakrishna Rao
and David Blackwell.
The Rao – Blackwell Theorem states that if g (x) is
any kind of estimator of a parameter θ, then the
conditional expectation of g (x) given T (x), where
T (x) is a sufficient statistic, is typically a better
estimator of θ, and is never worse.
Let x = (x1,…,xn) be a random sample from a probability distribution p(x, θ) where
θ = (θ1,…, θq) is an unknown vector parameter. Consider an estimator
g(x) = (g1(x),…,gq(x)) of θ and the q×q mean square and product matrix C(g):

   C(g) = (c_ij) = ( E{ [g_i(x) − θ_i] [g_j(x) − θ_j] } )

Let S be a sufficient statistic, which may be vector valued, such that the conditional expectation,
E{g|S} = T(x), is independent of θ. A general version of Rao – Blackwell is:

   C(g) − C(T) is nonnegative definite
207
SOLO Non-Gaussian Distribution Approximation
208
SOLO Non-Gaussian Distribution Approximation
http://www.dsi.unifi.it/users/chisci/idfric/Nonlinear_filtering_Chen.pdf

3 recursive bayesian estimation

  • 1.
    1 Recursive Bayesian Estimation SOLO HERMELIN Updated:22.02.09 11.01.14 http://www.solohermelin.com
  • 2.
    2 SOLO Table of ContentRecursive Bayesian Estimation Review of Probability Conditional Probability Total Probability Theorem Conditional Probability - Bayes Formula Statistical Independent Events Expected Value or Mathematical Expectation Variance and Central Moments Characteristic Function and Moment-Generating Function Probability Distribution and Probability Density Functions (Examples) Normal (Gaussian) Distribution Existence Theorems 1 & 2 Monte Carlo Method Estimation of the Mean and Variance of a Random Variable Generating Discrete Random Variables Existence Theorem 3 Markov Processes Functions of one Random Variable The Laws of Large Numbers Central Limit Theorem Problem Definition Stochastic Processes
  • 3.
    3 SOLO Table of Content(continue -1) Recursive Bayesian Estimation Bayesian Estimation Introduction Linear Gaussian Markov Systems Closed-Form Solutions of Estimation Kalman Filter Extended Kalman Filter General Bayesian Nonlinear Filters Additive Gaussian Nonlinear Filter Gauss – Hermite Quadrature Approximation Unscented Kalman Filter Monte Carlo Kalman Filter (MCKF) Non-Additive Non-Gaussian Nonlinear Filter Nonlinear Estimation Using Particle Filters Importance Sampling (IS) Sequential Importance Sampling (SIS) Sequential Importance Resampling (SIR) Monte Carlo Particle Filter (MCPF) Bayesian Maximum Likelihood Estimate (Maximum Aposteriori – MAP Estimate)
  • 4.
    4 SOLO Table of Content(continue -2) Recursive Bayesian Estimation References Nonlinear Filters based on the Fokker-Planck Equation
  • 5.
    5 SOLO Recursive BayesianEstimation kx1−kx kz1−kz 0x 1x 2x 1z 2z kZ :11:1 −kZ ( )11, −− kk wxf ( )kk vxh , ( )00 ,wxf ( )11,vxh ( )11,wxf ( )22 ,vxh Since this is a probabilistic problem, we start with a remainder of Probability Theory A discrete nonlinear system is defined by ( ) ( )kkk kkk vxkhz wxkfx ,, ,,1 11 = −= −− State vector dynamics Measurements kk vw ,1− State and Measurement Noise Vectors, respectively Problem Definition: Estimate the hidden States of a Non-linear Dynamic Stochastic System from Noisy Measurements . kx kz Table of Content
  • 6.
    6 SOLO Pr (A) isthe probability of the event A if S nAAAA ∪∪∪= 21 1A 2A nA jiOAA ji ≠∀/=∩ ( ) 0Pr ≥A(1) (3) If jiOAAandAAAA jin ≠∀/=∩∪∪∪= 21 ( ) 1Pr =S(2) then ( ) ( ) ( ) ( )nAAAA PrPrPrPr 21 +++=  Probability Axiomatic Definition Probability Geometric Definition Assume that the probability of an event in a geometric region A is defined as the ratio between A surface to surface of S. ( ) ( ) ( )SSurface ASurface A =Pr ( ) 0Pr ≥A(1) ( ) 1Pr =S(2) (3) If jiOAAandAAAA jin ≠∀/=∩∪∪∪= 21 then ( ) ( ) ( ) ( )nAAAA PrPrPrPr 21 +++=  S A Review of Probability A more detailed explanation of the subject is given in the “Probability” Presentation
  • 7.
    7 SOLO From those definitionwe can prove the following:( ) 0=/OP(1’) Proof: OOSandOSS /=/∩/∪= ( ) ( ) ( ) ( ) ( ) 0PrPrPrPr 3 =/⇒/+=⇒ OOSS ( ) ( )APAP −= 1(2’) Proof: OAAandAAS /=∩∪= ( ) ( ) ( ) ( ) ( ) ( ) ( )AAAAS Pr1PrPrPr1Pr 32 −=⇒+==⇒ ( ) 1Pr0 ≤≤ A(3’) Proof: ( ) ( ) ( ) ( ) ( ) 1Pr0Pr1Pr 1'2 ≤⇒≥−= AAA ( ) ( )APr0 1 ≤ ( ) 0Pr ≥A(1) ( ) 1Pr =S(2) (3) If jiOAAandAAAA jin ≠∀/=∩∪∪∪= 21 then ( ) ( ) ( ) ( )n AAAA PrPrPrPr 21 +++=  ( ) ( )AABAIf PrPr ≤⇒⊂(4’) Proof: ( ) ( ) ( ) ( ) ( ) ( )BAAABB PrPr0PrPrPr 00 3 ≤⇒≥+−= ≥≥  ( ) ( ) OAABandAABB /=∩−∪−= ( ) ( ) ( ) ( )BABABA ∩−+=∪ PrPrPrPr(5’) Proof: ( ) ( ) ( ) ( ) ( ) ( ) OABBAandABBAB OABAandABABA /=−∩∩−∪∩= /=−∩−∪=∪ ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )BABABA ABBAB ABABA ∩−+=∪⇒     −+∩= −+=∪ PrPrPrPr PrPrPr PrPrPr 3 3 Table of Content Review of Probability
  • 8.
    8 SOLO Conditional Probability S nAAAAααα ∪∪∪= 21  1αA jiOAA ji ≠∀/=∩ 1αβA mAAAB βββ ∪∪∪= 212αA 2αβA 1βA 2βA  Given two events A and B decomposed in elementary events jiOAAandAAAAA ji n i in ≠∀/=∩=∪∪∪= = αααααα  1 21 lkOAAandAAAAB lk m k km ≠∀/=∩=∪∪∪= = ββββββ  1 21 jiOAAandAAABA jir ≠∀/=∩∪∪∪=∩ αβαβαβαβαβ 21 ( ) ( ) ( ) ( )n AAAA ααα PrPrPrPr 21 +++=  ( ) ( ) ( ) ( )mAAAB βββ PrPrPrPr 21 +++=  ( ) ( ) ( ) ( ) nmrAAABA r ,PrPrPrPr 21 ≤+++=∩ βαβαβα  We want to find the probability of A event under the condition that the event B had occurred designed as P (A|B) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )B BA AAA AAA BA m r Pr Pr PrPrPr PrPrPr |Pr 21 21 ∩ = +++ +++ = βββ βαβαβα   Review of Probability
  • 9.
    9 SOLO Conditional Probability SnAAAA ααα ∪∪∪= 21  1αA jiOAA ji ≠∀/=∩ 1αβA mAAAB βββ ∪∪∪= 212αA 2αβA 1βA 2βA  If the events A and B are statistical independent, that the fact that B occurred will not affect the probability of A to occur. ( ) ( ) ( )B BA BA Pr Pr |Pr ∩ = ( ) ( ) ( )A BA AB Pr Pr |Pr ∩ = ( ) ( )ABA Pr|Pr = ( ) ( ) ( ) ( ) ( ) ( ) ( )BAAABBBABA PrPrPr|PrPr|PrPr ⋅=⋅=⋅=∩ Definition: n events Ai i = 1,2,…n are statistical independent if: ( ) nrAA r i i r i i ,,2PrPr 11  =∀=      ∏== Table of Content Review of Probability
  • 10.
    10 SOLO Conditional Probability -Bayes Formula Using the relation: ( ) ( ) ( ) ( ) ( )llll AABBBABA ββββ Pr|PrPr|PrPr ⋅=⋅=∩ ( ) ( ) ( ) klOBABABAB lk m k k , 1 ∀/=∩∩∩∩= = βββ ( ) ( )∑ = ∩= m k k BAB 1 PrPr β we obtain: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )∑= ⋅ ⋅ = ⋅ = m k kk llll l AAB AAB B AAB BA 1 Pr|Pr Pr|Pr Pr Pr|Pr |Pr ββ ββββ β Bayes Formula Thomas Bayes 1702 - 1761 Table of Content Review of Probability
  • 11.
    11 SOLO Total Probability Theorem Tableof Content jiOAAandSAAA jin ≠∀/=∩=∪∪∪ 21If we say that the set space S is decomposed in exhaustive and incompatible (exclusive) sets. The Total Probability Theorem states that for any event B, its probability can be decomposed in terms of conditional probability as follows: ( ) ( ) ( ) ( )∑∑ == == n i i n i i BPBABAB 11 |Pr,PrPr Using the relation: ( ) ( ) ( ) ( ) ( )llll AABBBABA Pr|PrPr|PrPr ⋅=⋅=∩ ( ) ( ) ( ) klOBABABAB lk n k k , 1 ∀/=∩∩∩∩= =  ( ) ( )∑= ∩= n k k BAB 1 PrPr For any event B we obtain: Review of Probability
  • 12.
    12 SOLO Statistical Independent Events () ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )∏∑∏∑∏∑ ∑∑∑ = −       ≠≠ =       ≠ =       = = −       ≠≠       ≠       == −+−+−=       −+−+−=      n i i n n kji kji i i n ji ji i i n i i tIndependen lStatisticaA n i i n n kji kji kji n ji ji ji n i i n i i AAAA AAAAAAAA i 1 1 3 ,. 3 1 2 . 2 1 1 1 1 1 3 ,. 2 . 1 11 Pr1PrPrPr Pr1PrPrPrPr    From Theorem of Addition Therefore ( )[ ]∏== −=      − n i i tIndependen lStatisticaA n i i AA i 11 Pr1Pr1  ( )[ ]∏== −−=      n i i tIndependen lStatisticaA n i i AA i 11 Pr11Pr  Since OAASAA n i i n i i n i i n i i /=               =               ====   1111 &         =      − ==  n i i n i i AA 11 PrPr1 ( )∏== =      n i i tIndependen lStatisticaA n i i AA i 11 PrPr  If the n events Ai i = 1,2,…n are statistical independent than are also statistical independentiA ( )∏= = n i iA 1 Pr      = =  n i i MorganDe A 1 Pr ( )[ ]∏= −= n i i tIndependen lStatisticaA A i 1 Pr1 ( ) nrAA r i i r i i ,,2PrPr 11  =∀=      ∏== Table of Content Review of Probability
  • 13.
    13 SOLO Review ofProbability Expected Value or Mathematical Expectation Given a Probability Density Function p (x) we define the Expected Value For a Continuous Random Variable: ( ) ( )∫ +∞ ∞− = dxxpxxE X: For a Discrete Random Variable: ( ) ( )∑= k kXk xpxxE : For a general function g (x) of the Random Variable x: ( )[ ] ( ) ( )∫ +∞ ∞− = dxxpxgxgE X: ( )xp x 0 ∞+∞− 0.1 ( )xE ( ) ( ) ( )∫ ∫ ∞+ ∞− +∞ ∞− = dxxp dxxpx xE X X : The Expected Value is the center of surface enclosed between the Probability Density Function and x axis. Table of Content
  • 14.
    14 SOLO Review ofProbability Variance Given a Probability Density Functions p (x) we define the Variance ( ) ( )[ ]{ } ( ) ( )[ ] ( ) ( )22222 2: xExExExExxExExExVar −=+−=−= Central Moment ( ) { }k k xEx =:'µ Given a Probability Density Functions p (x) we define the Central Moment of order k about the origin ( ) ( )[ ]{ } ( ) ( )∑= −− −      =−= k j jk j jkk k xE j k xExEx 0 '1: µµ Given a Probability Density Functions p (x) we define the Central Moment of order k about the Mean E (x) Table of Content
  • 15.
    15 SOLO Review ofProbability Moments Normal Distribution ( ) ( ) ( )[ ] σπ σ σ 2 2/exp ; 22 x xpX − = [ ] ( )    −⋅ = oddnfor evennforn xE n n 0 131 σ [ ] ( )      += =−⋅ = + 12!2 2 2131 12 knfork knforn xE kk n n σ π σ Proof: Start from: and differentiate k time with respect to a( ) 0exp 2 >=−∫ ∞ ∞− a a dxxa π Substitute a = 1/(2σ2 ) to obtain E [xn ] ( ) ( ) 0 2 1231 exp 12 22 > −⋅ =− + ∞ ∞− ∫ a a k dxxax kk k π [ ] ( ) ( )[ ] ( ) ( )[ ] ( ) ( ) 12 ! 0 122/ 0 222221212 !2 2 exp 2 22 2/exp 2 2 2/exp 2 1 2 + ∞+ = ∞∞ ∞− ++ =−= −=−= ∫ ∫∫ kk k k k xy kkk kdyyy xdxxxdxxxxE σ πσ σ π σ σπ σ σπ σ    Now let compute: [ ] [ ]( )2244 33 xExE == σ Chi-square
  • 16.
    16 SOLO Review ofProbability Functions of one Random Variable Let y = g (x) a given function of the random variable x defined o the domain Ω, with probability distribution pX (x). We want to find pY (y). Fundamental Theorem Assume x1, x2, …, xn all the solutions of the equation ( ) ( ) ( )n xgxgxgy ==== 21 ( ) ( ) ( ) ( ) ( ) ( ) ( )n nXXX Y xg xp xg xp xg xp yp ''' 2 2 1 1 +++=  ( ) ( ) xd xgd xg =:' Proof ( ) ( ) ( ) ( ) ( ) ( )∑∑∑ === ==±≤≤=+≤≤= n i i iX n i iiX n i iiiY yd xg xp xdxpxdxxxydyYyydyp 111 ' PrPr: q.e.d.
  • 17.
    17 SOLO Review ofProbability Functions of one Random Variable (continue – 1) Example 1 bxay += ( )       − = a by p a yp XY 1 Example 2 x a y = ( )       = y a p y a yp XY 2 Example 3 2 xay = ( ) ( )yU a y p a y p ya yp XXY                 −+         = 2 1 Example 4 xy = ( ) ( ) ( )[ ] ( )yUypypyp XXY −+= Table of Content
  • 18.
    18 SOLO Review ofProbability Characteristic Function and Moment-Generating Function Given a Probability Density Functions pX (x) we define the Characteristic Function or Moment Generating Function ( ) ( )[ ] ( ) ( ) ( ) ( ) ( ) ( )     = ==Φ ∑ ∫∫ +∞ ∞− +∞ ∞− x X XX X discretexxpxj continuousxxPdxjdxxpxj xjE ω ωω ωω exp expexp exp: This is in fact the complex conjugate of the Fourier Transfer of the Probability Density Function. This function is always defined since the sufficient condition of the existence of a Fourier Transfer : Given the Characteristic Function we can find the Probability Density Functions pX (x) using the Inverse Fourier Transfer: ( ) ( ) ( ) ∞<== ∫∫ +∞ ∞− ≥+∞ ∞− 1 0 dxxpdxxp X xp X ( ) ( ) ( )∫ +∞ ∞− Φ−= ωωω π dxjxp XX exp 2 1 is always fulfilled.
  • 19.
19 SOLO Review of Probability
Properties of the Moment-Generating Function
$\Phi_X(\omega) = \int_{-\infty}^{+\infty}\exp(j\omega x)\,p_X(x)\,dx \quad\Rightarrow\quad \Phi_X(\omega)\big|_{\omega=0} = \int_{-\infty}^{+\infty} p_X(x)\,dx = 1$
$\dfrac{d\,\Phi_X(\omega)}{d\,\omega} = \int_{-\infty}^{+\infty}(j x)\exp(j\omega x)\,p_X(x)\,dx \quad\Rightarrow\quad \dfrac{d\,\Phi_X(\omega)}{d\,\omega}\Big|_{\omega=0} = j\int_{-\infty}^{+\infty} x\,p_X(x)\,dx = j\,E(x)$
$\dfrac{d^2\Phi_X(\omega)}{d\,\omega^2} = \int_{-\infty}^{+\infty}(j x)^2\exp(j\omega x)\,p_X(x)\,dx \quad\Rightarrow\quad \dfrac{d^2\Phi_X(\omega)}{d\,\omega^2}\Big|_{\omega=0} = (j)^2\int_{-\infty}^{+\infty} x^2\,p_X(x)\,dx = (j)^2 E(x^2)$
⋮
$\dfrac{d^n\Phi_X(\omega)}{d\,\omega^n} = \int_{-\infty}^{+\infty}(j x)^n\exp(j\omega x)\,p_X(x)\,dx \quad\Rightarrow\quad \dfrac{d^n\Phi_X(\omega)}{d\,\omega^n}\Big|_{\omega=0} = (j)^n\int_{-\infty}^{+\infty} x^n\,p_X(x)\,dx = (j)^n E(x^n)$
This is the reason why Φ_X(ω) is also called the Moment-Generating Function.
20 SOLO Review of Probability
Properties of the Moment-Generating Function
Developing Φ_X(ω) in a Taylor series about ω = 0:
$\Phi_X(\omega) = \Phi_X(0) + \dfrac{d\Phi_X}{d\omega}\Big|_{0}\dfrac{\omega}{1!} + \dfrac{d^2\Phi_X}{d\omega^2}\Big|_{0}\dfrac{\omega^2}{2!} + \cdots + \dfrac{d^n\Phi_X}{d\omega^n}\Big|_{0}\dfrac{\omega^n}{n!} + \cdots = 1 + j\,E(x)\dfrac{\omega}{1!} + (j)^2 E(x^2)\dfrac{\omega^2}{2!} + \cdots + (j)^n E(x^n)\dfrac{\omega^n}{n!} + \cdots$
where $\Phi_X(\omega) = \int_{-\infty}^{+\infty}\exp(j\omega x)\,p_X(x)\,dx$.
21 SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(1) Binomial (Bernoulli): $p(k,n) = \dfrac{n!}{k!\,(n-k)!}\,p^k(1-p)^{n-k} = \binom{n}{k} p^k(1-p)^{n-k}$
(2) Poisson's Distribution: $p(k,k_0) \approx \dfrac{k_0^{\,k}}{k!}\exp(-k_0)$
(3) Normal (Gaussian): $p(x;\mu,\sigma) = \dfrac{\exp[-(x-\mu)^2/(2\sigma^2)]}{\sqrt{2\pi}\,\sigma}$
(4) Laplacian Distribution: $p(x;\mu,b) = \dfrac{1}{2b}\exp\!\left(-\dfrac{|x-\mu|}{b}\right)$
(Figure: the Binomial probabilities P(k,n) plotted against k.)
22 SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(5) Gamma Distribution: $p(x;k,\theta) = \begin{cases} x^{k-1}\dfrac{\exp(-x/\theta)}{\Gamma(k)\,\theta^k} & x \ge 0\\ 0 & x < 0\end{cases}$
(6) Beta Distribution: $p(x;\alpha,\beta) = \dfrac{x^{\alpha-1}(1-x)^{\beta-1}}{\int_0^1 u^{\alpha-1}(1-u)^{\beta-1}\,du} = \dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}$
(7) Cauchy Distribution: $p(x;x_0,\gamma) = \dfrac{1}{\pi}\left[\dfrac{\gamma}{(x-x_0)^2 + \gamma^2}\right]$
23 SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(8) Exponential Distribution: $p(x;\lambda) = \begin{cases}\lambda\exp(-\lambda x) & x \ge 0\\ 0 & x < 0\end{cases}$
(9) Chi-square Distribution: $p(x;k) = \begin{cases}\dfrac{x^{k/2-1}\exp(-x/2)}{\Gamma(k/2)\,2^{k/2}} & x \ge 0\\ 0 & x < 0\end{cases}$, where Γ is the gamma function $\Gamma(a) = \int_0^{\infty} t^{a-1}\exp(-t)\,dt$
(10) Student's t-Distribution: $p(x;\nu) = \dfrac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left(1 + x^2/\nu\right)^{-(\nu+1)/2}$
24 SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(11) Uniform Distribution (Continuous): $p(x;a,b) = \begin{cases}\dfrac{1}{b-a} & a \le x \le b\\ 0 & x < a,\ x > b\end{cases}$
(12) Rayleigh Distribution: $p(x;\sigma) = \dfrac{x}{\sigma^2}\exp\!\left(-\dfrac{x^2}{2\sigma^2}\right)$
(13) Rice Distribution: $p(x;v,\sigma) = \dfrac{x}{\sigma^2}\exp\!\left(-\dfrac{x^2+v^2}{2\sigma^2}\right)I_0\!\left(\dfrac{x\,v}{\sigma^2}\right)$
25 SOLO Review of Probability
Probability Distribution and Probability Density Functions (Examples)
(14) Weibull Distribution: $p(x;\gamma,\mu,\alpha) = \begin{cases}\dfrac{\gamma}{\alpha}\left(\dfrac{x-\mu}{\alpha}\right)^{\gamma-1}\exp\!\left[-\left(\dfrac{x-\mu}{\alpha}\right)^{\gamma}\right] & x \ge \mu,\ \gamma,\alpha > 0\\ 0 & x < \mu\end{cases}$
Table of Content
26 SOLO Review of Probability
Normal (Gaussian) Distribution — Karl Friedrich Gauss, 1777–1855
Probability Density Function: $p(x;\mu,\sigma) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right) =: \mathcal{N}(x;\mu,\sigma)$
Cumulative Distribution Function: $P(x;\mu,\sigma) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{x}\exp\!\left(-\dfrac{(u-\mu)^2}{2\sigma^2}\right)du$
Mean Value: $E(x) = \mu$
Variance: $Var(x) = \sigma^2$
Moment-Generating Function: $\Phi(\omega) = E[\exp(j\omega x)] = \dfrac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{+\infty}\exp(j\omega u)\exp\!\left(-\dfrac{(u-\mu)^2}{2\sigma^2}\right)du = \exp\!\left(j\mu\omega - \dfrac{\sigma^2\omega^2}{2}\right)$
27 SOLO Review of Probability
Moments of the Normal Distribution
$p_X(x;0,\sigma) = \dfrac{\exp[-x^2/(2\sigma^2)]}{\sqrt{2\pi}\,\sigma} =: \mathcal{N}(x;0,\sigma)$
$E[x^n] = \begin{cases} 1\cdot 3\cdots(n-1)\,\sigma^n & n \text{ even}\\ 0 & n \text{ odd}\end{cases} \qquad E[|x|^n] = \begin{cases} 1\cdot 3\cdots(n-1)\,\sigma^n & n = 2k\\ \sqrt{2/\pi}\;2^k k!\,\sigma^{2k+1} & n = 2k+1\end{cases}$
Proof: Start from $\int_{-\infty}^{\infty}\exp(-a x^2)\,dx = \sqrt{\pi/a}$, $a>0$, differentiate k times with respect to a to get $\int_{-\infty}^{\infty} x^{2k}\exp(-a x^2)\,dx = \dfrac{1\cdot 3\cdots(2k-1)}{2^k a^k}\sqrt{\pi/a}$, and substitute $a = 1/(2\sigma^2)$ to obtain E[x^n]. For the odd moments of |x|, the substitution $y = x^2/(2\sigma^2)$ gives $E[|x|^{2k+1}] = \sqrt{2/\pi}\,2^k k!\,\sigma^{2k+1}$.
Now let us compute: $E[x^4] = 3\sigma^4 = 3\,(E[x^2])^2$
Chi-square
28 SOLO Review of Probability
Normal (Gaussian) Distribution (continue – 1) — Karl Friedrich Gauss, 1777–1855
A Vector-Valued Gaussian Random Variable $\vec{x}$ has the Probability Density Function
$p(\vec{x};\bar{x},P) = \left|2\pi P\right|^{-1/2}\exp\!\left[-\tfrac{1}{2}(\vec{x}-\bar{x})^T P^{-1}(\vec{x}-\bar{x})\right] =: \mathcal{N}(\vec{x};\bar{x},P)$
where $\bar{x} = E\{\vec{x}\}$ is the Mean Value and $P = E\{(\vec{x}-\bar{x})(\vec{x}-\bar{x})^T\}$ the Covariance Matrix.
If P is diagonal, $P = \mathrm{diag}[\sigma_1^2\ \sigma_2^2\ \cdots\ \sigma_k^2]$, then the components of the random vector are uncorrelated, and
$p(\vec{x};\bar{x},P) = \left|2\pi P\right|^{-1/2}\exp\!\left[-\tfrac{1}{2}\sum_{i=1}^{k}\dfrac{(x_i-\bar{x}_i)^2}{\sigma_i^2}\right] = \prod_{i=1}^{k}\dfrac{1}{\sqrt{2\pi}\,\sigma_i}\exp\!\left(-\dfrac{(x_i-\bar{x}_i)^2}{2\sigma_i^2}\right)$
therefore the components of the random vector are also independent.
29 SOLO Review of Probability
The Laws of Large Numbers
The Law of Large Numbers is a fundamental concept in statistics and probability that describes how the average of a randomly selected sample from a large population is likely to be close to the average of the whole population. There are two laws of large numbers, the Weak Law and the Strong Law.
The Weak Law of Large Numbers
The Weak Law of Large Numbers states that if X₁, X₂, …, X_n, … is an infinite sequence of random variables that have the same expected value μ and variance σ², and are uncorrelated (i.e., the correlation between any two of them is zero), then the sample mean
$\bar{X}_n := (X_1 + \cdots + X_n)/n$
converges in probability (a weak convergence sense) to μ. We have
$\Pr\{|\bar{X}_n - \mu| < \varepsilon\} \to 1 \quad \text{for } n \to \infty$   (converges in probability)
The Strong Law of Large Numbers
The Strong Law of Large Numbers states that if X₁, X₂, …, X_n, … is an infinite sequence of random variables that have the same expected value μ and variance σ², are uncorrelated, and E(|X_i|) < ∞, then
$\Pr\{\lim_{n\to\infty}\bar{X}_n = \mu\} = 1$   (converges almost surely)
i.e. $\bar{X}_n$ converges almost surely to μ.
30 SOLO Review of Probability
The Laws of Large Numbers — Differences between the Weak Law and the Strong Law
The Weak Law states that, for a specified large n, (X₁ + ... + X_n)/n is likely to be near μ. Thus, it leaves open the possibility that |(X₁ + ... + X_n)/n − μ| > ε happens an infinite number of times, although at infrequent intervals.
The Strong Law shows that this almost surely will not occur. In particular, it implies that, with probability 1, for any positive value ε the inequality |(X₁ + ... + X_n)/n − μ| > ε holds only a finite number of times (as opposed to an infinite, but infrequent, number of times).
Almost sure convergence is also called strong convergence of random variables. This version is called the strong law because random variables which converge strongly (almost surely) are guaranteed to converge weakly (in probability). The strong law implies the weak law.
31 SOLO Review of Probability
The Laws of Large Numbers — Proof of the Weak Law of Large Numbers
Given $E(X_i) = \mu$ and $Var(X_i) = \sigma^2$ for all i, and $E[(X_i-\mu)(X_j-\mu)] = 0$ for all $i \ne j$, we have:
$E(\bar{X}_n) = E[(X_1+\cdots+X_n)/n] = n\mu/n = \mu$
$Var(\bar{X}_n) = E\{[\bar{X}_n - E(\bar{X}_n)]^2\} = E\left\{\left[\dfrac{(X_1-\mu)+\cdots+(X_n-\mu)}{n}\right]^2\right\} = \dfrac{E[(X_1-\mu)^2]+\cdots+E[(X_n-\mu)^2]}{n^2} = \dfrac{n\sigma^2}{n^2} = \dfrac{\sigma^2}{n}$
Using Chebyshev's inequality on $\bar{X}_n$ we obtain:
$\Pr\{|\bar{X}_n - \mu| \ge \varepsilon\} \le \dfrac{\sigma^2/n}{\varepsilon^2}$
Using this equation we obtain:
$\Pr\{|\bar{X}_n - \mu| \le \varepsilon\} = 1 - \Pr\{|\bar{X}_n - \mu| > \varepsilon\} \ge 1 - \dfrac{\sigma^2}{n\,\varepsilon^2}$
As n approaches infinity, the right-hand side approaches 1.  q.e.d.
Monte Carlo Integration    Table of Content
32 SOLO Review of Probability
Central Limit Theorem
The first version of this theorem was postulated by the French-born English mathematician Abraham de Moivre (1667–1754) in 1733, using the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. This was published in 1756 in "The Doctrine of Chances", 3rd Ed.
This finding was forgotten until 1812, when the French mathematician Pierre-Simon Laplace (1749–1827) recovered it in his work "Théorie Analytique des Probabilités", in which he approximated the binomial distribution with the normal distribution. This is known as the De Moivre – Laplace Theorem.
The present form of the Central Limit Theorem was given by the Russian mathematician Aleksandr Mikhailovich Lyapunov (1857–1918) in 1901.
33 SOLO Review of Probability
Central Limit Theorem (continue – 1)
Let X₁, X₂, …, X_m be a sequence of independent random variables with the same probability distribution function $p_X(x)$. Define the statistical mean:
$\bar{X}_m = \dfrac{X_1 + X_2 + \cdots + X_m}{m}$
We have:
$E(\bar{X}_m) = \dfrac{E(X_1)+E(X_2)+\cdots+E(X_m)}{m} = \mu$
$\sigma_{\bar{X}_m}^2 = Var(\bar{X}_m) = E\{[\bar{X}_m - E(\bar{X}_m)]^2\} = E\left\{\left[\dfrac{(X_1-\mu)+(X_2-\mu)+\cdots+(X_m-\mu)}{m}\right]^2\right\} = \dfrac{m\sigma^2}{m^2} = \dfrac{\sigma^2}{m}$
Define also the new random variable
$Y := \dfrac{\bar{X}_m - E(\bar{X}_m)}{\sigma_{\bar{X}_m}} = \dfrac{(X_1-\mu)+(X_2-\mu)+\cdots+(X_m-\mu)}{\sqrt{m}\,\sigma}$
The probability distribution of Y tends to become Gaussian (normal) as m tends to infinity, regardless of the probability distribution of the random variable, as long as the mean μ and the variance σ² are finite.
34 SOLO Review of Probability
Central Limit Theorem (continue – 2)
Proof: The Characteristic Function of $Y = \dfrac{(X_1-\mu)+\cdots+(X_m-\mu)}{\sqrt{m}\,\sigma}$ is
$\Phi_Y(\omega) = E[\exp(j\omega Y)] = E\left[\exp\!\left(j\omega\dfrac{(X_1-\mu)+\cdots+(X_m-\mu)}{\sqrt{m}\,\sigma}\right)\right] = \prod_{i=1}^{m} E\left[\exp\!\left(j\dfrac{\omega}{\sqrt{m}}\dfrac{X_i-\mu}{\sigma}\right)\right] = \left[\Phi_{\frac{X_i-\mu}{\sigma}}\!\left(\dfrac{\omega}{\sqrt{m}}\right)\right]^m$
Developing $\Phi_{\frac{X_i-\mu}{\sigma}}\!\left(\dfrac{\omega}{\sqrt{m}}\right)$ in a Taylor series:
$\Phi_{\frac{X_i-\mu}{\sigma}}\!\left(\dfrac{\omega}{\sqrt{m}}\right) = 1 + \dfrac{j\omega/\sqrt{m}}{1!}\underbrace{E\!\left(\dfrac{X_i-\mu}{\sigma}\right)}_{0} + \dfrac{(j\omega/\sqrt{m})^2}{2!}\underbrace{E\!\left[\left(\dfrac{X_i-\mu}{\sigma}\right)^2\right]}_{1} + \dfrac{(j\omega/\sqrt{m})^3}{3!}E\!\left[\left(\dfrac{X_i-\mu}{\sigma}\right)^3\right] + \cdots = 1 - \dfrac{\omega^2}{2m} + \mathcal{O}\!\left(\dfrac{\omega^2}{m}\right), \qquad \lim_{m\to\infty}\mathcal{O}\!\left(\dfrac{\omega^2}{m}\right)\Big/\dfrac{\omega^2}{m} = 0$
35 SOLO Review of Probability
Central Limit Theorem (continue – 3)
Proof (continue – 1): The Characteristic Function
$\Phi_Y(\omega) = \left[\Phi_{\frac{X_i-\mu}{\sigma}}\!\left(\dfrac{\omega}{\sqrt{m}}\right)\right]^m = \left[1 - \dfrac{\omega^2}{2m} + \mathcal{O}\!\left(\dfrac{\omega^2}{m}\right)\right]^m \xrightarrow[m\to\infty]{} \exp(-\omega^2/2)$
Therefore
$p_Y(y) = \dfrac{1}{2\pi}\int_{-\infty}^{+\infty}\exp(-j\omega y)\,\Phi_Y(\omega)\,d\omega \xrightarrow[m\to\infty]{} \dfrac{1}{2\pi}\int_{-\infty}^{+\infty}\exp(-j\omega y)\exp(-\omega^2/2)\,d\omega = \dfrac{1}{\sqrt{2\pi}}\exp(-y^2/2)$
The probability distribution of Y tends to become Gaussian (normal) as m tends to infinity (Convergence in Distribution).
Characteristic Function of Normal Distribution    Convergence Concepts    Monte Carlo Integration    Table of Content
38 SOLO Review of Probability
Existence Theorems
Existence Theorem 1: Given a function G(x) such that
$G(-\infty) = 0, \quad G(+\infty) = 1, \quad \lim_{x\to\infty} G(x) = 1$
$0 \le G(x_1) \le G(x_2) \text{ if } x_1 < x_2$   (G(x) is monotonic non-decreasing)
$\lim_{x_n\to x,\ x_n\ge x} G(x_n) = G(x^+) = G(x)$   (G(x) is continuous from the right)
we can find an experiment X and a random variable x, defined on X, such that its distribution function P(x) equals the given function G(x).
Proof of Existence Theorem 1: Assume that the outcome of the experiment X is any real number −∞ < x < +∞. We consider as events all intervals, and the intersections or unions of intervals, on the real axis. To specify the probability of those events we define P(x) = Prob{x ≤ x₁} = G(x₁). From our definition of G(x) it follows that P(x) is a distribution function.
Existence Theorem 2    Existence Theorem 3
39 SOLO Review of Probability
Existence Theorems
Existence Theorem 2: If a function F(x,y) is such that
$F(-\infty, y) = 0, \quad F(x, -\infty) = 0, \quad F(+\infty, +\infty) = 1$
$F(x_2,y_2) - F(x_1,y_2) - F(x_2,y_1) + F(x_1,y_1) \ge 0$
for every x₁ < x₂, y₁ < y₂, then two random variables x and y can be found such that F(x,y) is their joint distribution function.
Proof of Existence Theorem 2: Assume that the outcome of the experiment X is any real number −∞ < x < +∞, and that the outcome of the experiment Y is any real number −∞ < y < +∞. We consider as events all intervals, and the intersections or unions of intervals, on the real axes x and y. To specify the probability of those events we define P(x,y) = Prob{x ≤ x₁, y ≤ y₁} = F(x₁,y₁). From our definition of F(x,y) it follows that P(x,y) is a joint distribution function. The proof is similar to that of Existence Theorem 1.
40 SOLO Review of Probability
Monte Carlo Method
Monte Carlo methods are a class of computational algorithms that rely on repeated random sampling to compute their results. Monte Carlo methods are often used when simulating physical and mathematical systems. Because of their reliance on repeated computation and random or pseudo-random numbers, Monte Carlo methods are most suited to calculation by a computer. Monte Carlo methods tend to be used when it is infeasible or impossible to compute an exact result with a deterministic algorithm.
The term Monte Carlo method was coined in the 1940s by the physicists Stanislaw Ulam (1909–1984), Enrico Fermi (1901–1954), John von Neumann (1903–1957), and Nicholas Constantine Metropolis (1915–1999), working on nuclear weapon projects at the Los Alamos National Laboratory (a reference to the Monte Carlo Casino in Monaco, where Ulam's uncle would borrow money to gamble).
41 SOLO Review of Probability
Monte Carlo Approximation
Monte Carlo runs generate a set of random samples that approximate the distribution p(x):
$x^{(L)} \sim p(x)$   (the x^{(L)} are samples generated (drawn) from the distribution p(x))
So, with P samples, expectations with respect to the distribution are approximated by
$\int f(x)\,p(x)\,dx \approx \dfrac{1}{P}\sum_{L=1}^{P} f\!\left(x^{(L)}\right)$
and, in the usual way for Monte Carlo, this gives all the moments etc. of the distribution up to some degree of approximation:
$\mu_1 = E\{x\} = \int x\,p(x)\,dx \approx \dfrac{1}{P}\sum_{L=1}^{P} x^{(L)}$
$\mu_n = E\{(x-\mu_1)^n\} = \int (x-\mu_1)^n\,p(x)\,dx \approx \dfrac{1}{P}\sum_{L=1}^{P}\left(x^{(L)}-\mu_1\right)^n$
Table of Content
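As an illustration of the Monte Carlo approximation above, the following minimal sketch (an illustrative addition, not part of the original presentation) draws samples from an assumed Gaussian p(x) and approximates the mean, the second central moment and E{g(x)} for g(x) = x²; the sample size P and the choice of distribution are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed example: p(x) = N(mu, sigma^2); any distribution we can sample from works.
mu, sigma = 1.0, 2.0
P = 100_000                               # number of Monte Carlo samples
x = rng.normal(mu, sigma, size=P)         # x^(L) ~ p(x), L = 1..P

# Monte Carlo approximations of the moments
mu1_hat = x.mean()                        # E{x}        ~ (1/P) sum x^(L)
mu2_hat = ((x - mu1_hat) ** 2).mean()     # E{(x-mu)^2} ~ central moment of order 2
Eg_hat  = (x ** 2).mean()                 # E{g(x)} with g(x) = x^2

print(mu1_hat, mu2_hat, Eg_hat)           # approx. 1.0, 4.0, 5.0 (= sigma^2 + mu^2)
```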
42 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (Unknown Statistics)
A random variable x may take on any value in the range −∞ to +∞. Based on a sample of k values xᵢ, i = 1,2,…,k, we wish to compute the sample mean $\hat{m}_k$ and sample variance $\hat{\sigma}_k^2$ as estimates of the population mean m and variance σ². Monte Carlo simulations assume independent and identically distributed (i.i.d.) samples:
$E\{x_i\} = E\{x_j\} = m \ \ \forall i,j, \qquad E\{x_i^2\} = E\{x_j^2\} = \sigma^2 + m^2 \ \ \forall i,j, \qquad E\{x_i x_j\} \overset{x_i,x_j\ \text{indep.}}{=} E\{x_i\}\,E\{x_j\} = m^2,\ i \ne j$
Define the estimate of the population mean:
$\hat{m}_k := \dfrac{1}{k}\sum_{i=1}^{k} x_i \qquad\Rightarrow\qquad E\{\hat{m}_k\} = \dfrac{1}{k}\sum_{i=1}^{k} E\{x_i\} = m$   (Unbiased)
Compute:
$E\left\{\dfrac{1}{k}\sum_{i=1}^{k}(x_i - \hat{m}_k)^2\right\} = E\left\{\dfrac{1}{k}\sum_{i=1}^{k}\left(x_i - \dfrac{1}{k}\sum_{j=1}^{k}x_j\right)^2\right\} = \dfrac{k-1}{k}\,\sigma^2$   (Biased)
43 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 1)
We found:
$E\{\hat{m}_k\} = m$   (Unbiased), $\qquad E\left\{\dfrac{1}{k}\sum_{i=1}^{k}(x_i - \hat{m}_k)^2\right\} = \dfrac{k-1}{k}\,\sigma^2$   (Biased)
Therefore, the unbiased estimate of the sample variance of the population is defined as:
$\hat{\sigma}_k^2 := \dfrac{1}{k-1}\sum_{i=1}^{k}(x_i - \hat{m}_k)^2$
since
$E\{\hat{\sigma}_k^2\} = E\left\{\dfrac{1}{k-1}\sum_{i=1}^{k}(x_i - \hat{m}_k)^2\right\} = \sigma^2$   (Unbiased)
Monte Carlo simulations assume independent and identically distributed (i.i.d.) samples.
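To make the biased/unbiased distinction above concrete, here is a minimal sketch (an illustrative addition; the sample size, the Gaussian population and the number of trials are assumptions for the example only) that computes the sample mean and the unbiased sample variance, and checks the bias empirically.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def sample_mean_and_variance(x):
    """Sample mean and the unbiased sample variance (1/(k-1) normalisation)."""
    k = len(x)
    m_hat = x.sum() / k
    var_hat = ((x - m_hat) ** 2).sum() / (k - 1)   # unbiased: E{var_hat} = sigma^2
    return m_hat, var_hat

# Assumed example: k = 50 samples from N(m = 3, sigma^2 = 4), repeated many times
# to check that the 1/(k-1) estimator is unbiased while the 1/k estimator is not.
k, trials = 50, 20_000
biased, unbiased = [], []
for _ in range(trials):
    x = rng.normal(3.0, 2.0, size=k)
    m_hat, var_hat = sample_mean_and_variance(x)
    unbiased.append(var_hat)
    biased.append(var_hat * (k - 1) / k)           # the 1/k version

print(np.mean(unbiased))   # approx. 4.0 = sigma^2
print(np.mean(biased))     # approx. 4.0 * (k-1)/k ~ 3.92
```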
44 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 2)
A random variable x may take on any value in the range −∞ to +∞. Based on a sample of k values xᵢ, i = 1,2,…,k, we wish to compute the sample mean $\hat{m}_k$ and sample variance $\hat{\sigma}_k^2$ as estimates of the population mean m and variance σ². Monte Carlo simulations assume independent and identically distributed (i.i.d.) samples. Summarizing:
$E\{\hat{m}_k\} = E\left\{\dfrac{1}{k}\sum_{i=1}^{k}x_i\right\} = m, \qquad E\{\hat{\sigma}_k^2\} = E\left\{\dfrac{1}{k-1}\sum_{i=1}^{k}(x_i - \hat{m}_k)^2\right\} = \sigma^2$
45 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 3)
We found: $E\{\hat{m}_k\} = m$ and $E\{\hat{\sigma}_k^2\} = \sigma^2$.
Let us compute the variance of the mean estimate:
$\sigma_{\hat{m}_k}^2 := E\{(\hat{m}_k - m)^2\} = E\left\{\left[\dfrac{1}{k}\sum_{i=1}^{k}(x_i - m)\right]^2\right\} = \dfrac{1}{k^2}\left[\sum_{i=1}^{k}\underbrace{E\{(x_i-m)^2\}}_{\sigma^2} + \sum_{i=1}^{k}\sum_{j\ne i}\underbrace{E\{(x_i-m)(x_j-m)\}}_{0}\right] = \dfrac{k\,\sigma^2}{k^2}$
$\sigma_{\hat{m}_k}^2 := E\{(\hat{m}_k - m)^2\} = \dfrac{\sigma^2}{k}$
46 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 4)
Let us compute the variance of the variance estimate:
$\sigma_{\hat{\sigma}_k^2}^2 := E\{(\hat{\sigma}_k^2 - \sigma^2)^2\}, \qquad \hat{\sigma}_k^2 = \dfrac{1}{k-1}\sum_{i=1}^{k}(x_i - \hat{m}_k)^2 = \dfrac{1}{k-1}\sum_{i=1}^{k}\left[(x_i - m) - (\hat{m}_k - m)\right]^2$
Expanding the square gives terms in $(x_i-m)^2$, $(x_i-m)(\hat{m}_k-m)$ and $(\hat{m}_k-m)^2$:
$\hat{\sigma}_k^2 = \dfrac{1}{k-1}\sum_{i=1}^{k}(x_i-m)^2 - \dfrac{2}{k-1}(\hat{m}_k-m)\sum_{i=1}^{k}(x_i-m) + \dfrac{k}{k-1}(\hat{m}_k-m)^2$
and $\sigma_{\hat{\sigma}_k^2}^2$ is obtained by taking the expectation of the square of $(\hat{\sigma}_k^2 - \sigma^2)$ term by term.
47 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 4)
Since (xᵢ − m), (x_j − m) and (m̂_k − m) are all independent for i ≠ j, the cross terms either vanish or are of order 1/k² and higher. Carrying out the expansion and keeping the dominant terms in 1/k gives:
$\sigma_{\hat{\sigma}_k^2}^2 \approx \dfrac{\mu_4 - \sigma^4}{k}, \qquad \mu_4 := E\{(x_i - m)^4\}$
48 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 5)
We found:
$E\{\hat{m}_k\} = m, \qquad E\{\hat{\sigma}_k^2\} = \sigma^2, \qquad \sigma_{\hat{m}_k}^2 := E\{(\hat{m}_k - m)^2\} = \dfrac{\sigma^2}{k}, \qquad \sigma_{\hat{\sigma}_k^2}^2 := E\{(\hat{\sigma}_k^2 - \sigma^2)^2\} \approx \dfrac{\mu_4 - \sigma^4}{k}$
where $\mu_4 := E\{(x_i - m)^4\}$. Define the Kurtosis of the random variable xᵢ:
$\lambda := \dfrac{\mu_4}{\sigma^4}$
so that
$\sigma_{\hat{\sigma}_k^2}^2 := E\{(\hat{\sigma}_k^2 - \sigma^2)^2\} \approx \dfrac{(\lambda - 1)\,\sigma^4}{k}$
49 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 6)
For large k, according to the Central Limit Theorem, the estimates of the mean and of the variance are approximately Gaussian random variables:
$(\hat{m}_k - m) \sim \mathcal{N}\!\left(0,\ \sigma^2/k\right), \qquad (\hat{\sigma}_k^2 - \sigma^2) \sim \mathcal{N}\!\left(0,\ (\lambda-1)\sigma^4/k\right)$
We want to find a region around $\hat{\sigma}_k^2$ that will contain σ² with a predefined probability φ, as a function of the number of iterations k:
$\mathrm{Prob}\left[0 \le |\hat{\sigma}_k^2 - \sigma^2| \le n_\sigma\,\sigma_{\hat{\sigma}_k^2}\right] = \varphi$
Since $\hat{\sigma}_k^2$ is approximately Gaussian, n_σ is given by solving:
$\dfrac{1}{\sqrt{2\pi}}\int_{-n_\sigma}^{+n_\sigma}\exp\!\left(-\tfrac{1}{2}\zeta^2\right)d\zeta = \varphi$
Cumulative probability φ within n_σ standard deviations of the mean for a Gaussian random variable:
n_σ = 1.000 → φ = 0.6827;  n_σ = 1.645 → φ = 0.9000;  n_σ = 1.960 → φ = 0.9500;  n_σ = 2.576 → φ = 0.9900
This gives:
$-n_\sigma\sqrt{\dfrac{\lambda-1}{k}}\,\sigma^2 \le \hat{\sigma}_k^2 - \sigma^2 \le n_\sigma\sqrt{\dfrac{\lambda-1}{k}}\,\sigma^2 \qquad\Leftrightarrow\qquad \left(1 - n_\sigma\sqrt{\dfrac{\lambda-1}{k}}\right)\sigma^2 \le \hat{\sigma}_k^2 \le \left(1 + n_\sigma\sqrt{\dfrac{\lambda-1}{k}}\right)\sigma^2$
50 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 7)
From
$\mathrm{Prob}\left[0 \le |\hat{\sigma}_k^2 - \sigma^2| \le n_\sigma\,\sigma_{\hat{\sigma}_k^2}\right] = \varphi, \qquad \sigma_{\hat{\sigma}_k^2}^2 = \dfrac{\lambda-1}{k}\,\sigma^4$
we get
$\left(1 - n_\sigma\sqrt{\dfrac{\lambda-1}{k}}\right)\sigma^2 \le \hat{\sigma}_k^2 \le \left(1 + n_\sigma\sqrt{\dfrac{\lambda-1}{k}}\right)\sigma^2$
and, solving for σ², the confidence bounds are
$\underline{\sigma}^2 := \dfrac{\hat{\sigma}_k^2}{1 + n_\sigma\sqrt{(\lambda-1)/k}} \le \sigma^2 \le \dfrac{\hat{\sigma}_k^2}{1 - n_\sigma\sqrt{(\lambda-1)/k}} =: \bar{\sigma}^2$
51 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 8)
52 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 9)
53 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 10)
Monte-Carlo Procedure (a code sketch of this procedure follows this slide):
1. Choose the Confidence Level φ and find the corresponding n_σ using the normal (Gaussian) distribution:
n_σ = 1.000 → φ = 0.6827;  n_σ = 1.645 → φ = 0.9000;  n_σ = 1.960 → φ = 0.9500;  n_σ = 2.576 → φ = 0.9900
2. Run a few samples k₀ > 20 and estimate the mean and the kurtosis λ according to:
$\hat{m}_{k_0} := \dfrac{1}{k_0}\sum_{i=1}^{k_0} x_i, \qquad \hat{\lambda} := \dfrac{\dfrac{1}{k_0}\sum_{i=1}^{k_0}(x_i - \hat{m}_{k_0})^4}{\left[\dfrac{1}{k_0}\sum_{i=1}^{k_0}(x_i - \hat{m}_{k_0})^2\right]^2}$
3. Compute the bounds $\underline{\sigma}$ and $\bar{\sigma}$ as functions of k:
$\underline{\sigma}^2 := \dfrac{\hat{\sigma}_{k_0}^2}{1 + n_\sigma\sqrt{(\hat{\lambda}-1)/k}}, \qquad \bar{\sigma}^2 := \dfrac{\hat{\sigma}_{k_0}^2}{1 - n_\sigma\sqrt{(\hat{\lambda}-1)/k}}$
4. Find k for which $\mathrm{Prob}\left[0 \le |\hat{\sigma}_k^2 - \sigma^2| \le n_\sigma\,\sigma_{\hat{\sigma}_k^2}\right] = \varphi$ meets the required accuracy.
5. Run the remaining k − k₀ simulations.
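The following is a minimal sketch of the Monte-Carlo procedure above (an illustrative addition; the pilot sample, the confidence level and the 10 % relative-accuracy requirement are assumptions chosen to mirror the example that follows on the next slide). It estimates the kurtosis from a pilot run and then sizes the number of runs from the relation sigma²_{sigma-hat²} ≈ (λ−1)σ⁴/k.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def runs_needed(samples_pilot, n_sigma=1.96, rel_accuracy=0.1):
    """Number of Monte Carlo runs k so that |sigma_hat^2 - sigma^2| <= rel_accuracy*sigma^2
    with the confidence implied by n_sigma, using sigma_{sigma^2}^2 ~ (lambda-1)*sigma^4/k."""
    x = np.asarray(samples_pilot)
    m_hat = x.mean()
    s2_hat = ((x - m_hat) ** 2).mean()
    lam_hat = ((x - m_hat) ** 4).mean() / s2_hat ** 2     # kurtosis estimate
    # require  n_sigma * sqrt((lambda - 1) / k) <= rel_accuracy
    k = (n_sigma ** 2) * (lam_hat - 1.0) / rel_accuracy ** 2
    return int(np.ceil(k)), lam_hat

# Assumed example: pilot run of k0 = 50 Gaussian samples (kurtosis ~ 3),
# 95% confidence (n_sigma = 1.96) and 10% relative accuracy on sigma^2.
pilot = rng.normal(0.0, 1.0, size=50)
k_required, lam = runs_needed(pilot, n_sigma=1.96, rel_accuracy=0.1)
print(lam, k_required)   # for a Gaussian: lambda ~ 3 and k ~ (1.96/0.1)^2 * 2 ~ 768 (~800)
```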
54 SOLO Review of Probability
Estimation of the Mean and Variance of a Random Variable (continue – 11)
Monte-Carlo Procedure — Example: Assume a Gaussian distribution, for which the kurtosis is λ = 3.
1. Choose the Confidence Level φ = 95%, which gives the corresponding n_σ = 1.96.
2. The kurtosis is λ = 3.
3. Find k for which
$\mathrm{Prob}\left[0 \le |\hat{\sigma}_k^2 - \sigma^2| \le 1.96\sqrt{\dfrac{2}{k}}\,\hat{\sigma}_k^2\right] = 0.95$
Assume also that we require $|\hat{\sigma}_k^2 - \sigma^2| \le 0.1\,\sigma^2$ with probability φ = 95%:
$1.96\sqrt{\dfrac{2}{k}} = 0.1 \quad\Rightarrow\quad k \approx 800$
4. Run k > 800 simulations.
55 SOLO Review of Probability
Generating Discrete Random Variables — Pseudo-Random Number Generators
• First attempts to generate "random numbers": draw balls out of a stirred urn, or roll dice.
• 1927: L.H.C. Tippett published a table of 40,000 digits taken "at random" from census reports.
• 1939: M.G. Kendall and B. Babington-Smith created a mechanical machine to generate random numbers. They published a table of 100,000 digits.
• 1946: J. von Neumann proposed the "middle square method".
• 1948: D.H. Lehmer introduced the "linear congruential method".
• 1955: RAND Corporation published a table of 1,000,000 random digits obtained from electronic noise.
• 1965: M.D. MacLaren and G. Marsaglia proposed to combine two congruential generators.
• 1989: R.S. Wikramaratna proposed the additive congruential method.
56 SOLO Review of Probability
Generating Discrete Random Variables — Pseudo-Random Number Generators
A Random Number represents the value of a random variable uniformly distributed on (0,1). Pseudo-Random Numbers constitute a sequence of values which, although deterministically generated, have all the appearances of being independent and uniformly distributed on (0,1).
One approach (the multiplicative congruential method):
1. Define x₀ = integer initial condition, or seed.
2. Using integers a and m, recursively compute
$x_n = a\,x_{n-1}\ \mathrm{modulo}\ m \qquad (a\,x_{n-1} = k\,m + x_n,\ k \in \mathbb{Z},\ x_n < m)$
Therefore x_n takes the values 0,1,…,m−1, and the quantity u_n = x_n/m, called a pseudo-random number, is an approximation to the value of a uniform (0,1) random variable.
In general the integers a and m should be chosen to satisfy three criteria:
1. For any initial seed, the resultant sequence has the "appearance" of being a sequence of independent uniform (0,1) random variables.
2. For any initial seed, the number of values that can be generated before repetition begins is large.
3. The values can be computed efficiently on a digital computer.
Return to Monte Carlo Approximation
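A minimal sketch of a congruential generator (an illustrative addition; the default constants a = 7⁵ and m = 2³¹ − 1 are the 32-bit word values quoted on the next slide, and setting c ≠ 0 gives the mixed variant also described there):

```python
def lcg(seed, a=7**5, m=2**31 - 1, c=0):
    """Multiplicative (c = 0) or mixed (c != 0) congruential pseudo-random generator."""
    x = seed
    while True:
        x = (a * x + c) % m        # x_n = (a*x_{n-1} + c) modulo m
        yield x / m                # u_n = x_n / m approximates Uniform(0,1)

# Usage: draw a few pseudo-random numbers from a fixed seed.
gen = lcg(seed=12345)
u = [next(gen) for _ in range(5)]
print(u)
```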
57 SOLO Review of Probability
Generating Discrete Random Variables — Pseudo-Random Number Generators (continue – 1)
A guideline is to choose m to be a large prime number comparable to the computer word size. Examples:
32-bit word computer: $m = 2^{31} - 1, \quad a = 7^5 = 16{,}807$
36-bit word computer: $m = 2^{35} - 1, \quad a = 5^5 = 3{,}125$
Another generator of pseudo-random numbers uses recursions of the type (the mixed congruential method):
$x_n = (a\,x_{n-1} + c)\ \mathrm{modulo}\ m \qquad (a\,x_{n-1} + c = k\,m + x_n,\ k \in \mathbb{Z},\ x_n < m)$
58 SOLO Review of Probability
Generating Discrete Random Variables — Histograms
A histogram is a graphical display of tabulated frequencies, shown as bars. It shows what proportion of cases fall into each of several categories: it is a form of data binning. The categories are usually specified as non-overlapping intervals of some variable, and the categories (bars) must be adjacent. The intervals (or bands, or bins) are generally of the same size.
Histograms are used to plot the density of data, and often for density estimation: estimating the probability density function of the underlying variable. The total area of a histogram always equals 1. If the lengths of the intervals on the x-axis are all 1, then a histogram is identical to a relative frequency plot.
Mathematical Definition: In a more general mathematical sense, a histogram is a mapping mᵢ that counts the number of observations that fall into various disjoint categories (known as bins), whereas the graph of a histogram is merely one way to represent a histogram. Thus, if we let n be the total number of observations and k be the total number of bins, the histogram mᵢ meets the condition:
$n = \sum_{i=1}^{k} m_i$
A cumulative histogram is a mapping that counts the cumulative number of observations in all of the bins up to the specified bin. That is, the cumulative histogram Mᵢ of a histogram mᵢ is defined as:
$M_i = \sum_{j=1}^{i} m_j$
(Figure: an ordinary and a cumulative histogram of the same data — a random sample of 10,000 points from a normal distribution with a mean of 0 and a standard deviation of 1.)
Return to Table of Content
59 SOLO Review of Probability
Generating Discrete Random Variables — The Inverse Transform Method
Suppose we want to generate a discrete random variable X having probability density function:
$p(x) = \sum_j p_j\,\delta(x - x_j), \qquad p_j \ge 0,\ j = 0,1,\ldots, \qquad \sum_j p_j = 1$
To accomplish this, generate a random number U that is uniformly distributed over (0,1) and set:
$X = \begin{cases} x_0 & U < p_0\\ x_1 & p_0 \le U < p_0 + p_1\\ \ \vdots & \\ x_j & \sum_{i=0}^{j-1} p_i \le U < \sum_{i=0}^{j} p_i\\ \ \vdots & \end{cases}$
Since U is uniformly distributed, P(a ≤ U < b) = b − a for any a and b such that 0 < a < b < 1, and so:
$P(X = x_j) = P\left(\sum_{i=0}^{j-1} p_i \le U < \sum_{i=0}^{j} p_i\right) = p_j$
and so X has the desired distribution.
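A minimal sketch of the discrete inverse transform method above (an illustrative addition; the four-point distribution used here is an assumption made only for the example):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

def inverse_transform_discrete(values, probs, n_samples, rng):
    """Draw n_samples of a discrete random variable with P(X = values[j]) = probs[j]
    by inverting the cumulative sum of the probabilities with U ~ Uniform(0,1)."""
    cdf = np.cumsum(probs)                        # p0, p0+p1, ..., 1
    u = rng.uniform(size=n_samples)               # U ~ Uniform(0,1)
    idx = np.searchsorted(cdf, u, side="right")   # first j with U < sum_{i<=j} p_i
    return np.asarray(values)[idx]

# Assumed example: X in {0, 1, 2, 3} with probabilities 0.1, 0.2, 0.3, 0.4.
values, probs = [0, 1, 2, 3], [0.1, 0.2, 0.3, 0.4]
x = inverse_transform_discrete(values, probs, n_samples=100_000, rng=rng)
print(np.bincount(x) / len(x))   # histogram of the results, approx. [0.1, 0.2, 0.3, 0.4]
```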
60 SOLO Review of Probability
Generating Discrete Random Variables — The Inverse Transform Method (continue – 1)
Suppose we want to generate a discrete random variable X having probability density function $p(x) = \sum_j p_j\,\delta(x - x_j)$, $p_j \ge 0$, $\sum_j p_j = 1$.
(Figure: X drawn N times from p(x), and the histogram of the results.)
61 SOLO Review of Probability
Generating Discrete Random Variables — The Inverse Transform Method (continue – 2)
Generating a Poisson Random Variable:
$p_i = P(X = i) = e^{-\lambda}\dfrac{\lambda^i}{i!}, \qquad i = 0,1,\ldots, \qquad \sum_i p_i = 1$
The successive probabilities can be computed recursively:
$\dfrac{p_{i+1}}{p_i} = \dfrac{e^{-\lambda}\lambda^{i+1}/(i+1)!}{e^{-\lambda}\lambda^{i}/i!} = \dfrac{\lambda}{i+1}$
(Figure: X drawn N times from the Poisson distribution, and the histogram of the results.)
62 SOLO Review of Probability
Generating Discrete Random Variables — The Inverse Transform Method (continue – 3)
Generating a Binomial Random Variable:
$p_i = P(X = i) = \dfrac{n!}{i!\,(n-i)!}\,p^i(1-p)^{n-i}, \qquad i = 0,1,\ldots,n, \qquad \sum_i p_i = 1$
The successive probabilities can be computed recursively:
$\dfrac{p_{i+1}}{p_i} = \dfrac{\dfrac{n!}{(i+1)!\,(n-i-1)!}\,p^{i+1}(1-p)^{n-i-1}}{\dfrac{n!}{i!\,(n-i)!}\,p^i(1-p)^{n-i}} = \dfrac{n-i}{i+1}\cdot\dfrac{p}{1-p}$
(Figure: the Binomial probabilities P(k,n) and the histogram of the results.)
63 SOLO Review of Probability
Generating Discrete Random Variables — The Acceptance-Rejection Technique
Suppose we have an efficient method for simulating a random variable having probability density function {q_j, j ≥ 0}. We want to use this to obtain a random variable that has the probability density function {p_j, j ≥ 0}. Let c be a constant such that:
$\dfrac{p_j}{q_j} \le c \qquad \forall j \text{ s.t. } p_j \ne 0$
If such a c exists, it must satisfy:
$p_j \le c\,q_j \quad\Rightarrow\quad 1 = \sum_j p_j \le c\sum_j q_j = c$
Rejection Method:
Step 1: Simulate the value of Y, having probability density function q_j.
Step 2: Generate a random number U (uniformly distributed over (0,1)).
Step 3: If U < p_Y/(c q_Y), set X = Y and stop. Otherwise return to Step 1.
64 SOLO Review of Probability
Generating Discrete Random Variables — The Acceptance-Rejection Technique (continue – 1)
Theorem: The random variable X obtained by the rejection method has probability density function P{X = i} = p_i.
Proof:
$P\{X = i\} = P\{Y = i \mid \text{Acceptance}\} \overset{\text{Bayes}}{=} \dfrac{P\{Y = i,\ \text{Acceptance}\}}{P\{\text{Acceptance}\}} = \dfrac{P\{Y = i\}\,P\left\{U \le \dfrac{p_i}{c\,q_i}\right\}}{P\{\text{Acceptance}\}}$
By independence, and since U is uniformly distributed on (0,1):
$P\{Y = i\}\,P\left\{U \le \dfrac{p_i}{c\,q_i}\right\} = q_i\,\dfrac{p_i}{c\,q_i} = \dfrac{p_i}{c}$
Summing over all i yields:
$1 = \sum_i P\{X = i\} = \dfrac{\sum_i p_i}{c\,P\{\text{Acceptance}\}} \quad\Rightarrow\quad c\,P\{\text{Acceptance}\} = 1 \quad\Rightarrow\quad P\{\text{Acceptance}\} = \dfrac{1}{c} \le 1$
Therefore $P\{X = i\} = p_i$.  q.e.d.
65 SOLO Review of Probability
Generating Discrete Random Variables — The Acceptance-Rejection Technique (continue – 2)
Example: Generate a truncated Gaussian using the Accept-Reject method. Consider the case with
$p(x) \approx \dfrac{e^{-x^2/2}}{\sqrt{2\pi}} \ \text{ for } x \in [-4, 4], \qquad 0 \text{ otherwise}$
Consider the uniform proposal function
$q(x) = \begin{cases} 1/8 & x \in [-4, 4]\\ 0 & \text{otherwise}\end{cases}$
In the figure we can see the results of the Accept-Reject method using N = 10,000 samples. A code sketch of this example follows below.
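A minimal sketch of the truncated-Gaussian Accept-Reject example above (an illustrative addition; the loop structure and the sample-by-sample acceptance are implementation choices, not prescribed by the slides):

```python
import numpy as np

rng = np.random.default_rng(seed=4)

def truncated_gaussian_ar(n_samples, rng):
    """Accept-Reject sampling of a standard Gaussian truncated to [-4, 4],
    using the uniform proposal q(x) = 1/8 on [-4, 4]."""
    p = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)   # target density (tails cut)
    q = 1.0 / 8.0                                              # uniform proposal density
    c = p(0.0) / q                                             # max of p/q, reached at x = 0
    samples = []
    while len(samples) < n_samples:
        y = rng.uniform(-4.0, 4.0)          # Step 1: draw Y ~ q
        u = rng.uniform()                   # Step 2: draw U ~ Uniform(0,1)
        if u < p(y) / (c * q):              # Step 3: accept with probability p(Y)/(c q(Y))
            samples.append(y)
    return np.array(samples)

x = truncated_gaussian_ar(10_000, rng)
print(x.mean(), x.std())   # approx. 0 and approx. 1, as expected for the (truncated) Gaussian
```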
66 SOLO Review of Probability
Generating Continuous Random Variables — The Inverse Transform Algorithm
Let U be a uniform (0,1) random variable. For any continuous distribution function F, the random variable X defined by
$X = F^{-1}(U)$
has distribution F. [F⁻¹(u) is defined to be that value of x such that F(x) = u.]
Proof: Let $P_X(x)$ denote the Probability Distribution Function of X = F⁻¹(U):
$P_X(x) = P\{X \le x\} = P\{F^{-1}(U) \le x\}$
Since F is a distribution function, F(x) is a monotonic increasing function of x, and the inequality "a ≤ b" is equivalent to the inequality "F(a) ≤ F(b)", therefore:
$P_X(x) = P\{F(F^{-1}(U)) \le F(x)\} = P\{U \le F(x)\} \overset{U\ \text{uniform (0,1)}}{=} F(x), \qquad 0 \le F(x) \le 1$
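A minimal sketch of the continuous inverse transform algorithm above (an illustrative addition; the exponential distribution is chosen only because its CDF inverts in closed form, F⁻¹(u) = −ln(1−u)/λ):

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Assumed example: F(x) = 1 - exp(-lambda*x), so X = F^{-1}(U) = -ln(1 - U)/lambda.
lam = 2.0
u = rng.uniform(size=100_000)     # U ~ Uniform(0,1)
x = -np.log(1.0 - u) / lam        # inverse transform

print(x.mean(), x.var())          # approx. 1/lambda = 0.5 and 1/lambda^2 = 0.25
```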
67 SOLO Review of Probability
Importance Sampling
Let Y = (Y₁,…,Y_m) be a vector of random variables having a joint probability density function f(y₁,…,y_m), and suppose that we are interested in estimating
$\theta = E_f[h(Y_1,\ldots,Y_m)] = \int h(y_1,\ldots,y_m)\,f(y_1,\ldots,y_m)\,dy_1\cdots dy_m$
Suppose that a direct generation of the random vector Y so as to compute h(Y) is inefficient, possibly because (a) it is difficult to generate the random vector Y, or (b) the variance of h(Y) is large, or (c) both of the above.
Suppose that W = (W₁,…,W_m) is another random vector, which takes values in the same domain as Y, and has a joint density function g(w₁,…,w_m) from which samples can easily be generated. The estimation of θ can be expressed as:
$\theta = E_f[h(Y)] = \int h(w_1,\ldots,w_m)\dfrac{f(w_1,\ldots,w_m)}{g(w_1,\ldots,w_m)}\,g(w_1,\ldots,w_m)\,dw_1\cdots dw_m = E_g\!\left[\dfrac{h(W)\,f(W)}{g(W)}\right]$
Therefore, we can estimate θ by generating values of the random vector W, and then using as the estimator the resulting average of the values h(W) f(W)/g(W).
Return to Particle Filters
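A minimal importance-sampling sketch (an illustrative addition; the target f = N(0,1), the proposal g = N(0,2²) and the choice h(y) = y² are assumptions made only for this example):

```python
import numpy as np

rng = np.random.default_rng(seed=6)

# theta = E_f[h(Y)], estimated by sampling W ~ g and averaging h(W) f(W)/g(W).
def f_pdf(y):  return np.exp(-0.5 * y**2) / np.sqrt(2 * np.pi)                 # target density
def g_pdf(y):  return np.exp(-0.5 * (y / 2.0)**2) / (2.0 * np.sqrt(2 * np.pi)) # proposal density
h = lambda y: y**2

n = 100_000
w = rng.normal(0.0, 2.0, size=n)          # W ~ g
weights = f_pdf(w) / g_pdf(w)             # importance weights f(W)/g(W)
theta_hat = np.mean(h(w) * weights)       # average of h(W) f(W)/g(W)

print(theta_hat)                          # approx. 1.0 = E_f[Y^2] for the standard Gaussian
```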
68 SOLO Review of Probability
Monte Carlo Integration
The Monte Carlo method can be used to numerically evaluate multidimensional integrals:
$I = \int g(x_1,\ldots,x_m)\,dx_1\cdots dx_m = \int g(x)\,dx$
To use Monte Carlo we factorize
$g(x) = f(x)\cdot p(x), \qquad p(x) \ge 0, \qquad \int p(x)\,dx = 1$
in such a way that p(x) is interpreted as a Probability Density Function. We assume that we can draw $N_S$ samples $x^i,\ i = 1,\ldots,N_S$, from p(x):
$x^i \sim p(x),\ i = 1,\ldots,N_S \qquad\Rightarrow\qquad p(x) \approx \sum_{i=1}^{N_S}\delta(x - x^i)/N_S$
Using Monte Carlo we can approximate:
$I = \int f(x)\,p(x)\,dx \approx I_{N_S} = \int f(x)\left[\sum_{i=1}^{N_S}\delta(x - x^i)/N_S\right]dx = \dfrac{1}{N_S}\sum_{i=1}^{N_S} f(x^i)$
69 SOLO Review of Probability
Monte Carlo Integration
We draw $N_S$ samples $x^i \sim p(x),\ i = 1,\ldots,N_S$, and form
$I = \int f(x)\,p(x)\,dx \approx I_{N_S} = \dfrac{1}{N_S}\sum_{i=1}^{N_S} f(x^i)$
If the samples $x^i$ are independent, then $I_{N_S}$ is an unbiased estimate of I. According to the Law of Large Numbers, $I_{N_S}$ will almost surely converge to I:
$I_{N_S} \xrightarrow[N_S\to\infty]{a.s.} I$
If the variance of f(x) is finite, i.e.
$\sigma_f^2 := \int [f(x) - I]^2\,p(x)\,dx < \infty$
then the Central Limit Theorem holds and the estimation error converges in distribution to a Normal Distribution:
$\lim_{N_S\to\infty}\sqrt{N_S}\,(I_{N_S} - I) \sim \mathcal{N}(0, \sigma_f^2)$
The error of the MC estimate, e = I_{N_S} − I, is of the order of O(N_S^{-1/2}), meaning that the rate of convergence of the estimate is independent of the dimension of the integrand.
Numerical Integration of p(x_k|x_{k-1}) and p(z_k|x_k)    Return to Particle Filters
70 SOLO Review of Probability
Existence Theorems
Existence Theorem 3: Given a function S(ω) = S(−ω) or, equivalently, a positive-definite function R(τ) (R(τ) = R(−τ), and R(0) = max R(τ), for all τ), we can find a stochastic process x(t) having S(ω) as its power spectrum or R(τ) as its autocorrelation.
Proof of Existence Theorem 3: Define
$a^2 := \dfrac{1}{\pi}\int_{-\infty}^{+\infty} S(\omega)\,d\omega \qquad\text{and}\qquad f(\omega) := \dfrac{S(\omega)}{\pi\,a^2} = f(-\omega)$
Since $f(\omega) \ge 0$ and $\int_{-\infty}^{+\infty} f(\omega)\,d\omega = 1$, according to Existence Theorem 1 we can find a random variable ω with the even density function f(ω) and probability distribution function
$P(\omega) = \int_{-\infty}^{\omega} f(\tau)\,d\tau$
We now form the process
$x(t) := a\cos(\omega t + \vartheta)$
where ϑ is a random variable uniformly distributed in the interval (−π, +π) and independent of ω.
71 SOLO Review of Probability
Existence Theorems — Existence Theorem 3, Proof (continue – 1)
Since ϑ is uniformly distributed in the interval (−π,+π) and independent of ω:
$E\{e^{j\nu\vartheta}\} = \dfrac{1}{2\pi}\int_{-\pi}^{+\pi} e^{j\nu\vartheta}\,d\vartheta = \dfrac{e^{j\nu\pi} - e^{-j\nu\pi}}{2\pi\,j\,\nu} = \dfrac{\sin(\nu\pi)}{\nu\pi}$
so that $E\{\cos(\nu\vartheta)\} = E\{\sin(\nu\vartheta)\} = 0$ for ν = 1, 2. Therefore:
$E\{x(t)\} = a\,E\{\cos(\omega t)\}\,E\{\cos\vartheta\} - a\,E\{\sin(\omega t)\}\,E\{\sin\vartheta\} = 0$
and
$E\{x(t+\tau)\,x(t)\} = a^2 E\{\cos[\omega(t+\tau)+\vartheta]\cos(\omega t+\vartheta)\} = \dfrac{a^2}{2}E\{\cos(\omega\tau)\} + \dfrac{a^2}{2}E\{\cos[\omega(2t+\tau)+2\vartheta]\} = \dfrac{a^2}{2}E\{\cos(\omega\tau)\}$
where the last term vanishes because $E\{\cos(2\vartheta)\} = E\{\sin(2\vartheta)\} = 0$.
72 SOLO Review of Probability
Existence Theorems — Existence Theorem 3, Proof (continue – 2)
We have $x(t) = a\cos(\omega t + \vartheta)$ with
$E\{x(t)\} = 0, \qquad E\{x(t+\tau)\,x(t)\} = \dfrac{a^2}{2}E_\omega\{\cos(\omega\tau)\} = \dfrac{a^2}{2}\int_{-\infty}^{+\infty}\cos(\omega\tau)\,f(\omega)\,d\omega = R_x(\tau)$
Because of those two properties x(t) is wide-sense stationary, with a power spectrum given by the Fourier transform of the autocorrelation (using $R_x(\tau) = R_x(-\tau)$ and $S_x(\omega) = S_x(-\omega)$):
$S_x(\omega) = \int_{-\infty}^{+\infty} R_x(\tau)\left[\cos(\omega\tau) - j\sin(\omega\tau)\right]d\tau = \int_{-\infty}^{+\infty} R_x(\tau)\cos(\omega\tau)\,d\tau \qquad\text{(Fourier)}$
$R_x(\tau) = \dfrac{1}{2\pi}\int_{-\infty}^{+\infty} S_x(\omega)\left[\cos(\omega\tau) + j\sin(\omega\tau)\right]d\omega = \dfrac{1}{2\pi}\int_{-\infty}^{+\infty} S_x(\omega)\cos(\omega\tau)\,d\omega \qquad\text{(Inverse Fourier)}$
Therefore, by the definition of f(ω):
$S_x(\omega) = \pi\,a^2\,f(\omega) = S(\omega)$
q.e.d.
73 SOLO Markov Processes — Andrei Andreevich Markov, 1856–1922
A Markov Process is defined by:
$p\left[x(t) \in \Omega \mid x(\tau),\ \tau \le t_1\right] = p\left[x(t) \in \Omega \mid x(t_1)\right], \qquad \forall\, t > t_1$
i.e., for the Random Process, conditioning on the whole past up to any time t₁ is equivalent to conditioning on the process at t₁ alone.
Examples of Markov Processes:
1. Continuous Dynamic System: $\dot{x} = f(t, x, u, w), \qquad z = h(t, x, u, v)$
2. Discrete Dynamic System: $x_k = f(t_k, x_{k-1}, u_{k-1}, w_{k-1}), \qquad z_k = h(t_k, x_k, u_k, v_k)$
where x is the state-space vector (n×1), u the input vector (m×1), z the measurement vector (p×1), w the white input noise vector (n×1), and v the white measurement noise vector (p×1).
Recursive Bayesian Estimation
74 SOLO Recursive Bayesian Estimation
Markov Processes
Using this property we obtain:
$p(x_k \mid x_{k-1}, x_{k-2}, \ldots, x_0) = p(x_k \mid x_{k-1})$
Markov Process: the present discrete state probability depends only on the previous state. Therefore:
$p(x_k, x_{k-1}, \ldots, x_0) = p(x_k \mid x_{k-1}, \ldots, x_0)\,p(x_{k-1}, \ldots, x_0) = p(x_k \mid x_{k-1})\,p(x_{k-1} \mid x_{k-2}, \ldots, x_0)\,p(x_{k-2}, \ldots, x_0) = \cdots = p(x_0)\prod_{i=1}^{k} p(x_i \mid x_{i-1})$
The Markov Process is defined if we know p(x₀) and p(xᵢ|xᵢ₋₁) for each i.
Table of Content
75 SOLO Recursive Bayesian Estimation
Markov Processes
In a Markovian system the probability of the current true state depends only on the previous state, and is independent of the other, earlier states:
$p(x_k \mid x_{k-1}, x_{k-2}, \ldots, x_0) = p(x_k \mid x_{k-1})$
Similarly, the measurement at the k-th time step depends only upon the current true state, so it is conditionally independent of all other earlier states, given the current state:
$p(z_k \mid x_k, x_{k-1}, \ldots, x_0) = p(z_k \mid x_k)$
$p(z_k, x_k) = p(z_k \mid x_k)\,p(x_k) = p(x_k \mid z_k)\,p(z_k)$
From the definition of the Markovian system (see Figure: hidden states x₀, x₁, …, x_k propagated by f(x_{k-1}, u_{k-1}, w_{k-1}), measurements z₁, …, z_k produced by h(x_k, v_k)), p(x_k|x_{k-1}) is defined by f and the statistics of x and w, and p(z_k|x_k) is defined by h and the statistics of x and v.
76 SOLO Recursive Bayesian Estimation
Markov Processes — Analytic Computation of p(x_k|x_{k-1}) and p(z_k|x_k)
$x_k = f(k-1, x_{k-1}, u_{k-1}, w_{k-1}), \quad w_{k-1} \sim p_w(w),\ u_{k-1}\ \text{given},\ x_0 \sim p_{x_0}(x) \qquad\qquad z_k = h(k, x_k, v_k), \quad v_k \sim p_v(v)$
Suppose that we can obtain all the roots $w_{k-1}^j$, $j = 1,\ldots,N_{x_k}$, for which $x_k = f(x_{k-1}, u_{k-1}, w_{k-1}^j)$. Then, in analogy with the Fundamental Theorem for functions of one random variable,
$p(x_k \mid x_{k-1})\,dx_k = \Pr\{x_k \le X_k \le x_k + dx_k \mid x_{k-1}\} = \sum_{j=1}^{N_{x_k}} p_w\!\left(w_{k-1}^j\right)|dw_{k-1}^j| \qquad\Rightarrow\qquad p(x_k \mid x_{k-1}) = \sum_{j=1}^{N_{x_k}} p_w\!\left(w_{k-1}^j\right)\left|\dfrac{\partial f}{\partial w}\right|^{-1}_{w = w_{k-1}^j}$
In the same way, suppose that we can obtain all the roots $v_k^j$, $j = 1,\ldots,N_{z_k}$, for which $z_k = h(x_k, v_k^j)$; then
$p(z_k \mid x_k) = \sum_{j=1}^{N_{z_k}} p_v\!\left(v_k^j\right)\left|\dfrac{\partial h}{\partial v}\right|^{-1}_{v = v_k^j}$
This is a Conceptual, Not a Practical, Procedure.
77 SOLO Recursive Bayesian Estimation
Markov Processes — Analytic Computation of p(x_k|x_{k-1}) and p(z_k|x_k) (continue – 1)
For additive noise,
$x_k = f(x_{k-1}, u_{k-1}) + w_{k-1}, \qquad z_k = h(x_k) + v_k$
we have
$w_{k-1} = x_k - f(x_{k-1}, u_{k-1}), \qquad v_k = z_k - h(x_k)$
therefore
$p(x_k \mid x_{k-1}) = p_w\!\left[x_k - f(x_{k-1}, u_{k-1})\right] \qquad\text{and}\qquad p(z_k \mid x_k) = p_v\!\left[z_k - h(x_k)\right]$
78 SOLO Recursive Bayesian Estimation
Markov Processes — Numerical Computation of p(x_k|x_{k-1}) and p(z_k|x_k)
$x_k = f(x_{k-1}, w_{k-1}), \qquad z_k = h(x_k, v_k)$
where $w_{k-1}$ and $v_k$ are system and measurement white-noise sequences, independent of past and current states and of each other, and having known PDFs $p(w_{k-1})$ and $p(v_k)$.
We want to compute p(x_k|Z_{1:k}) recursively, assuming knowledge of p(x_{k-1}|Z_{1:k-1}), in two stages: prediction (before measurement) and update (after measurement).
1. Prediction (before measurement): $p(x_k \mid Z_{1:k-1}) = \int p(x_k \mid x_{k-1})\,p(x_{k-1} \mid Z_{1:k-1})\,dx_{k-1}$
2. Update (after measurement): $p(x_k \mid Z_{1:k}) = p(x_k \mid z_k, Z_{1:k-1}) \overset{\text{Bayes}}{=} \dfrac{p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})}{p(z_k \mid Z_{1:k-1})} = \dfrac{p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})}{\int p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})\,dx_k}$
We need to evaluate the following integrals:
$p(x_k \mid x_{k-1}) = \int \delta\!\left[x_k - f(x_{k-1}, w_{k-1})\right]p(w_{k-1})\,dw_{k-1}, \qquad p(z_k \mid x_k) = \int \delta\!\left[z_k - h(x_k, v_k)\right]p(v_k)\,dv_k$
Analytic solutions for those integral equations do not exist in the general case. We use the numeric Monte Carlo method to evaluate the integrals. Generate (draw):
$w_{k-1}^i \sim p(w_{k-1}), \qquad v_k^i \sim p(v_k), \qquad i = 1,\ldots,N_S$
then
$x_k^i = f(x_{k-1}^i, w_{k-1}^i) \ \Rightarrow\ p(x_k \mid x_{k-1}) \approx \sum_{i=1}^{N_S}\delta(x_k - x_k^i)/N_S, \qquad z_k^i = h(x_k^i, v_k^i) \ \Rightarrow\ p(z_k \mid x_k) \approx \sum_{i=1}^{N_S}\delta(z_k - z_k^i)/N_S$
79 SOLO Recursive Bayesian Estimation
Markov Processes — Monte Carlo Computation of p(x_k|x_{k-1}) and p(z_k|x_k)
$x_k = f(x_{k-1}, u_{k-1}, w_{k-1}),\ w_{k-1}\sim p_w(w),\ u_{k-1}\ \text{given},\ x_0 \sim p_{x_0}(x) \qquad\qquad z_k = h(x_k, v_k),\ v_k \sim p_v(v)$
0. Initialization: Generate (draw) $x_0^i \sim p_{x_0}(x),\ i = 1,\ldots,N_S$.
For k ∈ {1, 2, …}:
1. At stage k−1, generate (draw) $N_S$ samples $w_{k-1}^i \sim p_w(w_{k-1}),\ i = 1,\ldots,N_S$.
2. State update: $x_k^i = f(x_{k-1}^i, u_{k-1}, w_{k-1}^i),\ i = 1,\ldots,N_S$. Compute the histogram of $x_k \mid x_{k-1}$ to obtain $p(x_k \mid x_{k-1}) \approx \sum_{i=1}^{N_S}\delta(x_k - x_k^i)/N_S$.
3. Generate (draw) measurement noise $v_k^i \sim p_v(v),\ i = 1,\ldots,N_S$.
4. Measurement update: $z_k^i = h(x_k^i, v_k^i),\ i = 1,\ldots,N_S$. Compute the histogram of $z_k \mid x_k$ to obtain $p(z_k \mid x_k) \approx \sum_{i=1}^{N_S}\delta(z_k - z_k^i)/N_S$.
k := k+1 and return to step 1. (A code sketch of this sampling loop follows below.)
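A minimal sketch of the Monte Carlo propagation loop above (an illustrative addition; the scalar models, the 0.9 gain, the x²/20 measurement and the noise sizes are assumptions made only for this example, not models from the slides):

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# Assumed scalar models:
#   x_k = f(x_{k-1}, w_{k-1}) = 0.9*x_{k-1} + w_{k-1},   w ~ N(0, 0.5^2)
#   z_k = h(x_k, v_k)         = x_k**2 / 20 + v_k,       v ~ N(0, 1)
def f(x, w): return 0.9 * x + w
def h(x, v): return x**2 / 20.0 + v

N_S, K = 10_000, 5
x = rng.normal(0.0, 1.0, size=N_S)          # step 0: x_0^i ~ p(x_0)

for k in range(1, K + 1):
    w = rng.normal(0.0, 0.5, size=N_S)      # step 1: draw process noise samples
    x = f(x, w)                             # step 2: propagate samples of x_k | x_{k-1}
    v = rng.normal(0.0, 1.0, size=N_S)      # step 3: draw measurement noise samples
    z = h(x, v)                             # step 4: samples of z_k | x_k
    # histograms approximate p(x_k | x_{k-1}) and p(z_k | x_k)
    px, _ = np.histogram(x, bins=50, density=True)
    pz, _ = np.histogram(z, bins=50, density=True)
    print(k, x.mean(), z.mean())
```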
80 SOLO Stochastic Processes
Stochastic Processes deal with systems corrupted by noise. A description of those processes is given in the "Stochastic Processes" Presentation. Here we give only one aspect of those processes.
A continuous dynamic system is described by:
$d\,x(t) = f\left[x(t)\right]dt + d\,w(t), \qquad t \in [t_0, t_f]$
where x(t) is the n-dimensional state vector and dw(t) the n-dimensional process noise vector.
Assuming system measurements at discrete times t_k given by:
$z_k = h\left[x(t_k), v_k, t_k\right], \qquad t_k \in [t_0, t_f]$
where v_k is the m-dimensional measurement noise vector at t_k.
We are interested in the probability of the state x at time t given the set of discrete measurements up to and including time t_k < t:
$p\left[x, t \mid Z_k\right], \qquad Z_k = \{z_1, z_2, \ldots, z_k\}\ \text{— the set of all measurements up to and including time } t_k$
The time evolution of the probability density function is described by the Fokker–Planck equation.
81 SOLO Stochastic Processes
Fokker–Planck Equation — Adriaan Fokker (1887–1972), Max Planck (1858–1947)
The Fokker–Planck equation describes the time evolution of the probability density function of the position of a particle, and can be generalized to other observables as well. It is named after Adriaan Fokker and Max Planck and is also known as the Kolmogorov forward equation. The first use of the Fokker–Planck equation was the statistical description of Brownian motion of a particle in a fluid.
In one spatial dimension x, the Fokker–Planck equation for a process with drift D₁(x,t) and diffusion D₂(x,t) is
$\dfrac{\partial}{\partial t} f(x,t) = -\dfrac{\partial}{\partial x}\left[D_1(x,t)\,f(x,t)\right] + \dfrac{\partial^2}{\partial x^2}\left[D_2(x,t)\,f(x,t)\right]$
More generally, the time-dependent probability distribution may depend on a set of N macrovariables xᵢ. The general form of the Fokker–Planck equation is then
$\dfrac{\partial f}{\partial t} = -\sum_{i=1}^{N}\dfrac{\partial}{\partial x_i}\left[D_1^i(x_1,\ldots,x_N)\,f\right] + \sum_{i=1}^{N}\sum_{j=1}^{N}\dfrac{\partial^2}{\partial x_i\,\partial x_j}\left[D_2^{ij}(x_1,\ldots,x_N)\,f\right]$
where D₁ is the drift vector and D₂ the diffusion tensor; the latter results from the presence of the stochastic force.
(Figure: a solution of the one-dimensional Fokker–Planck equation, with both the drift and the diffusion term. The initial condition is a Dirac delta function at x = 1, and the distribution drifts towards x = 0.)
A. Fokker, "Die mittlere Energie rotierender elektrischer Dipole im Strahlungsfeld", Annalen der Physik 43 (1914), 810–820.
M. Planck, "Über einen Satz der statistischen Dynamik und eine Erweiterung in der Quantentheorie", Sitzungsberichte der Preussischen Akademie der Wissenschaften (1917), 324–341.
82 SOLO Stochastic Processes
Fokker–Planck Equation (continue – 1)
The Fokker–Planck equation can be used for computing the probability densities of stochastic differential equations. Consider the Itô stochastic differential equation:
$d\,X_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,d\,W_t$
where $X_t$ is the state and $W_t$ is a standard M-dimensional Wiener process. If the initial probability distribution is f(x, 0), then the probability distribution f(x, t) of the state is given by the Fokker–Planck equation with the drift and diffusion terms:
$D_1^i(x,t) = \mu_i(x,t), \qquad D_2^{ij}(x,t) = \tfrac{1}{2}\sum_k \sigma_{ik}(x,t)\,\sigma_{jk}(x,t)$
Similarly, a Fokker–Planck equation can be derived for Stratonovich stochastic differential equations. In this case, noise-induced drift terms appear if the noise strength is state-dependent.
83 SOLO Stochastic Processes
Fokker–Planck Equation (continue – 2) — Derivation of the Fokker–Planck Equation
Start with
$p_{x(t_k),x(t_{k-1})}(x_k, x_{k-1}) = p_{x(t_k)|x(t_{k-1})}(x_k \mid x_{k-1})\,p_{x(t_{k-1})}(x_{k-1})$
and
$p_{x(t_k)}(x_k) = \int_{-\infty}^{+\infty} p_{x(t_k),x(t_{k-1})}(x_k, x_{k-1})\,dx_{k-1} = \int_{-\infty}^{+\infty} p_{x(t_k)|x(t_{k-1})}(x_k \mid x_{k-1})\,p_{x(t_{k-1})}(x_{k-1})\,dx_{k-1}$
Define $t = t_k$, $t - \Delta t = t_{k-1}$, $x(t) = x_k$, $x(t-\Delta t) = x_{k-1}$; using the Chapman–Kolmogorov equation we obtain
$p_{x(t)}[x(t)] = \int_{-\infty}^{+\infty} p_{x(t)|x(t-\Delta t)}[x(t) \mid x(t-\Delta t)]\,p_{x(t-\Delta t)}[x(t-\Delta t)]\,d[x(t-\Delta t)]$
Let us use the Characteristic Function of $p_{x(t)|x(t-\Delta t)}$ with respect to the increment $x(t) - x(t-\Delta t)$:
$\Phi_{x(t)|x(t-\Delta t)}(s) := \int_{-\infty}^{+\infty}\exp\{-s\,[x(t)-x(t-\Delta t)]\}\,p_{x(t)|x(t-\Delta t)}[x(t) \mid x(t-\Delta t)]\,d[x(t)-x(t-\Delta t)]$
with the inverse transform
$p_{x(t)|x(t-\Delta t)}[x(t) \mid x(t-\Delta t)] = \dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty}\exp\{s\,[x(t)-x(t-\Delta t)]\}\,\Phi_{x(t)|x(t-\Delta t)}(s)\,ds$
Substituting into the Chapman–Kolmogorov equation:
$p_{x(t)}[x(t)] = \int_{-\infty}^{+\infty}\dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty}\exp\{s\,[x(t)-x(t-\Delta t)]\}\,\Phi_{x(t)|x(t-\Delta t)}(s)\,ds\;p_{x(t-\Delta t)}[x(t-\Delta t)]\,d[x(t-\Delta t)]$
84 SOLO Stochastic Processes
Fokker–Planck Equation (continue – 3) — Derivation of the Fokker–Planck Equation (continue – 1)
The Characteristic Function can be expressed in terms of the moments of the increment about x(t−Δt) as:
$\Phi_{x(t)|x(t-\Delta t)}(s) = 1 + \sum_{i=1}^{\infty}\dfrac{(-s)^i}{i!}\,E\left\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\right\}$
Therefore
$p_{x(t)}[x(t)] = \int_{-\infty}^{+\infty}\dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty}\exp\{s[x(t)-x(t-\Delta t)]\}\left[1 + \sum_{i=1}^{\infty}\dfrac{(-s)^i}{i!}E\left\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\right\}\right]ds\;p_{x(t-\Delta t)}[x(t-\Delta t)]\,d[x(t-\Delta t)]$
Using the fact that (with u := x(t) − x(t−Δt))
$\dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty}(-s)^i\exp(s\,u)\,ds = (-1)^i\dfrac{\partial^i\,\delta(u)}{\partial\,u^i}, \qquad i = 0, 1, 2, \ldots$
where δ[u] is the Dirac delta function, $\delta(u) = \dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty}\exp(s\,u)\,ds$ and $\int_{-\infty}^{+\infty}\delta(u)\,F(u)\,du = F(0)$, the inner integral over s is replaced by derivatives of the delta function.
85 SOLO Stochastic Processes
Fokker–Planck Equation (continue – 4) — Derivation of the Fokker–Planck Equation (continue – 2)
Useful results related to integrals involving the Delta (Dirac) function:
$\delta(u-a) = \dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty}\exp[s\,(u-a)]\,ds, \qquad \int_{-\infty}^{+\infty}\delta(u-a)\,f(u)\,du = f(a)$
$\dfrac{d\,\delta(u-a)}{d\,u} = \dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty} s\exp[s\,(u-a)]\,ds, \qquad \int_{-\infty}^{+\infty}\dfrac{d\,\delta(u-a)}{d\,u}\,f(u)\,du = -\dfrac{d\,f(u)}{d\,u}\Big|_{u=a}$
$\dfrac{d^i\,\delta(u-a)}{d\,u^i} = \dfrac{1}{2\pi j}\int_{-j\infty}^{+j\infty} s^i\exp[s\,(u-a)]\,ds, \qquad \int_{-\infty}^{+\infty}\dfrac{d^i\,\delta(u-a)}{d\,u^i}\,f(u)\,du = (-1)^i\dfrac{d^i f(u)}{d\,u^i}\Big|_{u=a}$
86 SOLO Stochastic Processes
Fokker–Planck Equation (continue – 5) — Derivation of the Fokker–Planck Equation (continue – 3)
Using the Dirac-delta results above, the zeroth-order term gives
$\int_{-\infty}^{+\infty}\delta[x(t)-x(t-\Delta t)]\,p_{x(t-\Delta t)}[x(t-\Delta t)]\,d[x(t-\Delta t)] = p_{x(t-\Delta t)}[x(t)]$
and the higher-order terms give
$\int_{-\infty}^{+\infty}\dfrac{\partial^i\,\delta[x(t)-x(t-\Delta t)]}{\partial\,x(t)^i}\,E\left\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\right\}\,p_{x(t-\Delta t)}[x(t-\Delta t)]\,d[x(t-\Delta t)] = \dfrac{\partial^i}{\partial\,x(t)^i}\left[E\left\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\right\}\,p_{x(t-\Delta t)}[x(t)]\right]$
so that
$p_{x(t)}[x(t)] = p_{x(t-\Delta t)}[x(t)] + \sum_{i=1}^{\infty}\dfrac{(-1)^i}{i!}\,\dfrac{\partial^i}{\partial\,x(t)^i}\left[E\left\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\right\}\,p_{x(t-\Delta t)}[x(t)]\right]$
Rearranging, dividing by Δt, and taking the limit Δt → 0, we obtain:
$\lim_{\Delta t\to 0}\dfrac{p_{x(t)}[x(t)] - p_{x(t-\Delta t)}[x(t)]}{\Delta t} = \sum_{i=1}^{\infty}\dfrac{(-1)^i}{i!}\,\dfrac{\partial^i}{\partial\,x(t)^i}\left[\lim_{\Delta t\to 0}\dfrac{E\left\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\right\}}{\Delta t}\,p_{x(t-\Delta t)}[x(t)]\right]$
87 SOLO Stochastic Processes
Fokker–Planck Equation (continue – 6) — Derivation of the Fokker–Planck Equation (continue – 4)
We found the limit expression above. Define:
$m_i[x(t^-), t] := \lim_{\Delta t\to 0}\dfrac{E\left\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\right\}}{\Delta t}, \qquad x(t^-) := \lim_{\Delta t\to 0} x(t-\Delta t)$
Therefore
$\dfrac{\partial\,p_{x(t)}[x(t)]}{\partial\,t} = \sum_{i=1}^{\infty}\dfrac{(-1)^i}{i!}\,\dfrac{\partial^i\left[m_i[x(t^-),t]\;p_{x(t)}[x(t)]\right]}{\partial\,x(t)^i}$
This equation is called the Stochastic Equation or Kinetic Equation. It is a partial differential equation that we must solve, with the initial condition:
$p_{x(t)}[x(t)]\big|_{t=t_0} = p_0[x(t_0)]$
88 SOLO Stochastic Processes
Fokker–Planck Equation (continue – 7) — Derivation of the Fokker–Planck Equation (continue – 5)
We want to find $p_{x(t)}[x(t)]$ where x(t) is the solution of
$\dfrac{d\,x(t)}{d\,t} = f[x(t), t] + n_g(t), \qquad t \in [t_0, t_f]$
with $n_g(t)$ a Wiener (Gauss) process: $\bar{n}_g := E\{n_g(t)\} = 0$ and $E\{[n_g(t)-\bar{n}_g][n_g(\tau)-\bar{n}_g]\} = Q(t)\,\delta(t-\tau)$. Then:
$m_1[x(t^-), t] := \lim_{\Delta t\to 0}\dfrac{E\{[x(t)-x(t-\Delta t)] \mid x(t-\Delta t)\}}{\Delta t} = E\left\{\dfrac{d\,x(t)}{d\,t}\right\} = f[x(t), t] + E\{n_g\} = f[x(t), t]$
$m_2[x(t^-), t] := \lim_{\Delta t\to 0}\dfrac{E\{[x(t)-x(t-\Delta t)]^2 \mid x(t-\Delta t)\}}{\Delta t} = E\{n_g^2\} = Q(t)$
$m_i[x(t^-), t] := \lim_{\Delta t\to 0}\dfrac{E\{[x(t)-x(t-\Delta t)]^i \mid x(t-\Delta t)\}}{\Delta t} = 0, \qquad i > 2$
Therefore we obtain the Fokker–Planck Equation:
$\dfrac{\partial\,p_{x(t)}[x(t)]}{\partial\,t} = -\dfrac{\partial\left[f(x(t),t)\;p_{x(t)}[x(t)]\right]}{\partial\,x(t)} + \dfrac{1}{2}\,Q(t)\,\dfrac{\partial^2\,p_{x(t)}[x(t)]}{\partial\,x(t)^2}$
Return to Daum
89 SOLO Recursive Bayesian Estimation
Bayesian Estimation Introduction
Problem: Estimate the hidden states of a non-linear dynamic stochastic system from noisy measurements.
Given a nonlinear discrete stochastic Markovian system, we want to use k discrete measurements Z_{1:k} = {z₁, z₂, …, z_k} to estimate the hidden state x_k. For this we want to compute the probability of x_k given all the measurements Z_{1:k}. If we know p(x_k|Z_{1:k}) then x_k is estimated using:
$\hat{x}_{k|k} := E\{x_k \mid Z_{1:k}\} = \int x_k\,p(x_k \mid Z_{1:k})\,dx_k$
$P_{k|k} = E\{(x_k - \hat{x}_{k|k})(x_k - \hat{x}_{k|k})^T \mid Z_{1:k}\} = \int (x_k - \hat{x}_{k|k})(x_k - \hat{x}_{k|k})^T\,p(x_k \mid Z_{1:k})\,dx_k$
or, more generally, we can compute all moments of the probability distribution p(x_k|Z_{1:k}):
$E\{g(x_k) \mid Z_{1:k}\} = \int g(x_k)\,p(x_k \mid Z_{1:k})\,dx_k$
The knowledge of p(x_k|Z_{1:k}) also allows the computation of the Maximum a Posteriori (MAP) estimate:
$\hat{x}_{k|k}^{MAP} = \arg\max_{x_k} p(x_k \mid Z_{1:k})$
90 SOLO Recursive Bayesian Estimation
Bayesian Estimation Introduction
To find the expression for p(x_k|Z_{1:k}) we use the theorem of joint probability (Bayes Rule):
$p(x_k \mid Z_{1:k}) \overset{\text{Bayes}}{=} \dfrac{p(x_k, Z_{1:k})}{p(Z_{1:k})}$
Since Z_{1:k} = {z_k, Z_{1:k-1}}:
$p(x_k \mid Z_{1:k}) = \dfrac{p(x_k, z_k, Z_{1:k-1})}{p(z_k, Z_{1:k-1})}$
The numerator of this expression is
$p(x_k, z_k, Z_{1:k-1}) \overset{\text{Bayes}}{=} p(z_k \mid x_k, Z_{1:k-1})\,p(x_k, Z_{1:k-1}) = p(z_k \mid x_k, Z_{1:k-1})\,p(x_k \mid Z_{1:k-1})\,p(Z_{1:k-1})$
Since the knowledge of x_k supersedes the need for Z_{1:k-1} = {z₁, z₂, …, z_{k-1}}:
$p(z_k \mid x_k, Z_{1:k-1}) \equiv p(z_k \mid x_k)$
and the denominator is
$p(z_k, Z_{1:k-1}) \overset{\text{Bayes}}{=} p(z_k \mid Z_{1:k-1})\,p(Z_{1:k-1})$
Therefore:
$p(x_k \mid Z_{1:k}) = \dfrac{p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})\,p(Z_{1:k-1})}{p(z_k \mid Z_{1:k-1})\,p(Z_{1:k-1})}$
91 SOLO Recursive Bayesian Estimation
Bayesian Estimation Introduction
The final result is:
$p(x_k \mid Z_{1:k}) = \dfrac{p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})}{p(z_k \mid Z_{1:k-1})}$
Since p(x_k|Z_{1:k}) is a probability distribution it must satisfy $\int p(x_k \mid Z_{1:k})\,dx_k = 1$:
$1 = \int p(x_k \mid Z_{1:k})\,dx_k = \int\dfrac{p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})}{p(z_k \mid Z_{1:k-1})}\,dx_k = \dfrac{\int p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})\,dx_k}{p(z_k \mid Z_{1:k-1})}$
Therefore:
$p(z_k \mid Z_{1:k-1}) = \int p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})\,dx_k$
and:
$p(x_k \mid Z_{1:k}) = \dfrac{p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})}{\int p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})\,dx_k}$
This is a recursive relation that needs the value of p(x_k|Z_{1:k-1}), assuming that p(z_k|x_k) is obtained from the Markovian system definition (z_k = h(x_k, v_k)).
92 SOLO Recursive Bayesian Estimation
Bayesian Estimation Introduction
The Correction Step is:
$p(x_k \mid Z_{1:k}) = \dfrac{p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})}{p(z_k \mid Z_{1:k-1})}$
or:
$\text{posterior} = \dfrac{\text{likelihood}\times\text{prior}}{\text{evidence}}$
prior: $p(x_k \mid Z_{1:k-1})$, given by the prediction equation
likelihood: $p(z_k \mid x_k)$, given by the observation model
evidence: $p(z_k \mid Z_{1:k-1}) = \int p(z_k \mid x_k)\,p(x_k \mid Z_{1:k-1})\,dx_k$, the normalizing constant in the denominator
93 SOLO Recursive Bayesian Estimation
Bayesian Estimation Introduction — Chapman–Kolmogorov Equation (Sydney Chapman 1888–1970, Andrey Nikolaevich Kolmogorov 1903–1987)
Using:
$p(x_k, x_{k-1} \mid Z_{1:k-1}) \overset{\text{Bayes}}{=} p(x_k \mid x_{k-1}, Z_{1:k-1})\,p(x_{k-1} \mid Z_{1:k-1})$
and, since for a Markov Process the knowledge of x_{k-1} supersedes the need for Z_{1:k-1} = {z₁, z₂, …, z_{k-1}},
$p(x_k \mid x_{k-1}, Z_{1:k-1}) = p(x_k \mid x_{k-1})$
we obtain:
$p(x_k \mid Z_{1:k-1}) = \int p(x_k, x_{k-1} \mid Z_{1:k-1})\,dx_{k-1} = \int p(x_k \mid x_{k-1})\,p(x_{k-1} \mid Z_{1:k-1})\,dx_{k-1}$
    94 Recursive Bayesian EstimationSOLO () ( ) ( ) ( )∫∫ −−−−−−−− == 11:11111:111:1 |||,| kkkkkkkkkkk xdZxpxxpxdZxxpZxp Using p (xk-1|Z1:k-1) from time-step k-1 and p (xk|xk-1) of the Markov system, compute: Initialize with p (x0) ( ) ( ) ( ) ( ) ( )∫ − − = kkkkk kkkk kk xdZxpxzp Zxpxzp Zxp 1:1 1:1 :1 || || | Using p (xk|Z1:k-1) from Prediction phase and p (zk|xk) of the Markov system, compute: { } ( )∫== kkkkkkkk xdZxpxZxEx :1:1| ||ˆ ( )( ){ } ( )( ) ( )∫ −−=−−= kkk T kkkkk T kkkkkk xdZxpxxxxZxxxxEP :1:1| |ˆˆ|ˆˆ At stage k k:=k+1 ( )1|11| ˆˆ −−− = kkkk xfx 0 Prediction phase (before zk measurement) 1 Correction Step (after zk measurement)2 Filtering3 kx1−kx kz1−kz 0x 1x 2x 1z 2z kZ :11:1 −kZ ( )11, −− kk wxf ( )kk vxh , ( )00 ,wxf ( )11,vxh ( )11,wxf ( )22 ,vxh Bayesian Estimation Introduction - Summary
95
SOLO Recursive Bayesian Estimation
Bayesian Estimation Introduction - Summary
1  Prediction phase (before the z_k measurement):
p(x_k | Z_{1:k-1}) = \int p(x_k | x_{k-1}) \, p(x_{k-1} | Z_{1:k-1}) \, dx_{k-1}
2  Correction Step (after the z_k measurement):
p(x_k | Z_{1:k}) = \frac{p(z_k | x_k) \, p(x_k | Z_{1:k-1})}{\int p(z_k | x_k) \, p(x_k | Z_{1:k-1}) \, dx_k}
This is a Conceptual Solution because the integrals are often not tractable. An optimal solution is possible for some restricted cases:
• Linear Systems with Gaussian Noises (system and measurements)
• Grid-Based Filters
Table of Content
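To make the two-step recursion concrete, the following sketch (an illustrative addition, not part of the original slides) runs the prediction/correction cycle on a fixed one-dimensional grid, the simplest grid-based filter. The scalar model, the grid limits and the noise standard deviations sig_w and sig_v are assumed for illustration only.

```python
import numpy as np

def gaussian(x, mu, sig):
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (np.sqrt(2 * np.pi) * sig)

def grid_bayes_filter(zs, f, h, sig_w, sig_v, grid):
    """One-dimensional grid-based recursive Bayesian filter (illustrative sketch)."""
    dx = grid[1] - grid[0]
    # transition kernel p(x_k | x_{k-1}) evaluated on the grid
    trans = gaussian(grid[:, None], f(grid)[None, :], sig_w)
    post = np.full(grid.size, 1.0 / (grid.size * dx))     # p(x_0): uniform prior
    estimates = []
    for z in zs:
        prior = trans @ post * dx                         # Chapman-Kolmogorov (prediction)
        lik = gaussian(z, h(grid), sig_v)                 # p(z_k | x_k)
        post = lik * prior
        post /= post.sum() * dx                           # divide by the evidence
        estimates.append(np.sum(grid * post) * dx)        # MMSE estimate E{x_k | Z_1:k}
    return np.array(estimates)

# usage with an assumed scalar model x_k = 0.9 x_{k-1} + w, z_k = x_k + v
grid = np.linspace(-10.0, 10.0, 401)
est = grid_bayes_filter([1.2, 0.8, 1.1], lambda x: 0.9 * x, lambda x: x, 1.0, 0.5, grid)
```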
    96 SOLO Linear Gaussian Systems ALinear Combination of Independent Gaussian random vectors is also a Gaussian random vector mmm XaXaXaS +++= 2211: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )    +++++++−=     +−    +−    +−= ΦΦ⋅Φ==Φ ∫ ∫ +∞ ∞− +∞ ∞− mmmm mmmm YYYm YpYp mYYmS aaajaaa ajaajaaja YdYdYYpSj m mmYY mm µµµωσσσω µωσωµωσωµωσω ωωωωω          2211 222 2 2 2 2 1 2 1 2 222 22 2 2 2 2 2 11 2 1 2 1 2 11,, 2 1 exp 2 1 exp 2 1 exp 2 1 exp ,,exp 21 11 1 ( ) ( )       − −= 2 2 2 exp 2 1 ,; i ii i iiiX X Xp i σ µ σπ σµ ( ) ( ) ( )     +−==Φ ∫ +∞ ∞− iiiiXiX jXdXpXj ii µωσωωω 22 2 1 expexp: Moment- Generating Function Gaussian distribution Define Proof: ( ) ( )iX ii i X i iYiii Xp aa Y p a YpXaY iii 11 : =      =→= ( ) ( ) ( ) ( ) ( ) ( )       +−=Φ===Φ ∫∫ +∞ ∞− +∞ ∞− iiiiiiX asign asign ii i iX iiiiYiY ajaXaXda a Xp XajYdYpYj i i ii µωσωωωω 222 2 1 expexpexp: 1 1 Review of Probability
    97 SOLO Linear Gaussian Systems(continue – 1) A Linear Combination of Independent Gaussian random vectors is also a Gaussian random vector mmm XaXaXaS +++= 2211: Therefore the Linear Combination of Independent Gaussian Random Variables is a Gaussian Random Variable with mmS mmS aaa aaa m m µµµµ σσσσ +++= +++=   2211 222 2 2 2 2 1 2 1 2 Therefore the Sm probability distribution is: ( ) ( )         − −= 2 2 2 exp 2 1 ,; m m m mm S S S SSm x Sp σ µ σπ σµ Proof (continue – 1): ( ) ( ) ( )      +++++++−=Φ mmmmS aaajaaam µµµωσσσωω  2211 222 2 2 2 2 1 2 1 2 2 1 exp We found: Review of Probability q.e.d.
    98 Recursive Bayesian EstimationSOLO LinearGaussian Markov Systems (continue – 2) ( ) ( )kkkk kkkk vuxkhz wuxkfx ,,, ,,,1 111 = −= −−− kkkk kkkkkkk vxHz wuGxx += Γ++Φ= −−−−−− 111111 wk-1 and vk, white noises, zero mean, Gaussian, independent ( ) ( ) ( ){ } ( ) ( ){ } ( )kPkekeEkxEkxke x T xxx =−= &: ( ) ( ) ( ){ } ( ) ( ){ } ( ) lk T www kQlekeEkwEkwke , 0 &: δ=−=  ( ) ( ) ( ){ } ( ) ( ){ } ( ) lk T vvv kRlekeEkvEkvke , 0 &: δ=−=  ( ) ( ){ } { }0=lekeE T vw    = ≠ = lk lk lk 1 0 ,δ ( ) ( )Qwwpw ,0;N= ( ) ( )Rvvpv ,0;N= ( ) ( )       −= − wQw Q wp T nw 1 2/12/ 2 1 exp 2 1 π ( ) ( )     −= − vRv R vp T pv 1 2/12/ 2 1 exp 2 1 π A Linear Gaussian Markov Systems is defined as ( ) ( )0|0000 ,;0 Pxxxp ttx == = N ( ) ( ) ( ) ( )    −−−= = − == 00 1 0|0002/1 0|0 2/0 2 1 exp 2 1 0 xxPxx P xp t T tntx π
    99 Recursive Bayesian EstimationSOLO LinearGaussian Markov Systems (continue – 3) 111111 −−−−−− Γ++Φ= kkkkkkk wuGxx Prediction phase (before zk measurement) { } { } { }   0 1:111111:1111:11| |||:ˆ −−−−−−−−−− Γ++Φ== kkkkkkkkkkkk ZwEuGZxEZxEx or 111|111| ˆˆ −−−−−− +Φ= kkkkkkk uGxx The expectation is { }[ ] { }[ ]{ } ( )[ ] ( )[ ]{ }1:1111|111111|111 1:11|1|1| |ˆˆ |ˆˆ: −−−−−−−−−−−−− −−−− Γ+−ΦΓ+−Φ= −−= k T kkkkkkkkkkkk k T kkkkkkkk ZwxxwxxE ZxExxExEP ( ) ( ){ } ( ){ } ( ){ } { } T k Q T kkk T k T kkkkk T k T kkkkk T k P T kkkkkkk wwExxwE wxxExxxxE kk 11111 0 1|1111 1 0 11|11111|111|111 ˆ ˆˆˆ 1|1 −−−−−−−−−− −−−−−−−−−−−−−− ΓΓ+Φ−Γ+ Γ−Φ+Φ−−Φ= −−         T kk T kkkkkk QPP 1111|111| −−−−−−− ΓΓ+ΦΦ= { } ( )1|1|1:1 ,ˆ;| −−− = kkkkkkk PxxZxP N Since is a Linear Combination of Independent Gaussian Random Variables: 111111 −−−−−− Γ++Φ= kkkkkkk wuGxx
    100 SOLO For the particularvector measurement equation where the measurement noise, is Gaussian (normal), with zero mean: ( ) ( )kkkv Rvvp ,0;N= ( ) ( ) ( )xp zxp xzp x zx xz , | , | = and independent of , the conditional probability can be written, using Bayes rule as: kx ( )xzp xz || ( )           − − ==−= 1 111 1111 1 1 , nxpp nx pxnxpxnpxpx xHz xHz zxfxHzv xn xn  ( ) ( ) 2/1 ,, /,, T vxzx JJvxpzxp = The measurement noise can be related to and by the function:v zx pxp p pp p I z f z f z f z f z f J =                 ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ =      ∂ ∂ =    1 1 1 1 ( ) ( ) ( ) ( )vpxpvxpzxp vxvxzx ⋅== ,, ,, kv Since the measurement noise is independent of :xv zThe joint probability of and is given by:x Recursive Bayesian Estimation Linear Gaussian Markov Systems (continue – 4) kkkk vxHz += Correction Step (after zk measurement) - 1st Way ( ) ( ) ( ) ( )1:1 1:1 :1 | || | − − = kk kkkk kk Zzp Zxpxzp Zxp
    101 ( ) ()kkkv Rvvp ,0;N= kkkk vxHz += Consider a Gaussian vector , where , measurement, , where the Gaussian noise is independent of and . v kx ( ) [ ]1|1| ,; −−= kkkkkkx Pxxxp  N kx ( ) ( ) ( ) ( )∫∫ +∞ ∞− +∞ ∞− == kkxkkxzkkkzxkz xdxpxzpxdzxpzp |, |, is Gaussian with( )kz zp ( ) ( ) ( ) ( ) 1| 0 −=+=+= kkkkkkkkk xHvExEHvxHEzE   ( ) ( )[ ] ( )[ ]{ } [ ][ ]{ } ( )[ ] ( )[ ]{ } [ ]{ } [ ]{ } [ ]{ } { } k T kkkk T kk T k T kkkk T kkkkk T k T kkkkkkk T kkkkkkkkkk T kkkkkkkkkkkk T kkkkk RHPHvvEHxxvEvxxEH HxxxxEHvxxHvxxHE xHvxHxHvxHEzEzzEzEz +=+−−−− −−=+−+−= −+−+=−−= −−− −−−− −− 1| 0 1| 0 1| 1|1|1|1| 1|1|cov           ( ) ( ) ( ) ( )[ ] ( )[ ] ( )[ ]       −−+−−−− +− = − xHzRHPHxHz RHPH zp TT Tpz ˆˆ 2 1 exp 2 1 1 2/12/ π ( ) ( ) ( ) ( )      −−−= − − −− − −− 1| 1 1|1|2/1 1| 2/1:1| 2 1 exp 2 1 |1:1 kkkkk T kkk kk nkkZx xxPxx P Zxp kk  π ( ) ( ) ( ) ( ) ( )    −−−=−= − kkk T kkkpkkkvkkxz xHzRxHz R xHzpxzp 1 2/12/| 2 1 exp 2 1 | π Recursive Bayesian EstimationSOLO Linear Gaussian Markov Systems (continue – 5) Correction Step (after zk measurement) 1st Way (continue – 1)
    102 Recursive Bayesian EstimationSOLO LinearGaussian Markov Systems (continue – 6) kkkk vxHz += ( ) ( )Rvvpv ,0;N= ( ) ( )       −= − vRv R vp T pv 1 2/12/ 2 1 exp 2 1 π Correction Step (after zk measurement) 1st Way (continue – 2) ( ) ( ) ( ) ( )    −−−= − − −− − −− 1| 1 1|1|2/1 1| 2/1:1| 2 1 exp 2 1 |1:1 kkkkk T kkk kk nkkZx xxPxx P Zxp kk  π ( ) ( ) ( ) ( ) ( )    −−−=−= − kkk T kkkpkkkvkkxz xHzRxHz R xHzpxzp 1 2/12/| 2 1 exp 2 1 | π ( ) ( ) [ ] [ ] [ ]       −+−− + = − − −− − 1| 1 1|1|2/1 1| 2/ ˆˆ 2 1 exp 2 1 kkkk T kkkk T kkk k T kkkk p kz xHzRHPHxHz RHPH zp π ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) [ ] [ ] [ ]      −+−+−−−−−−⋅ + == − − −−− − −− − − −− − 1| 1 1|1|1| 1 1|1| 1 2/1 1| 2/12/1 1|2/1:1 1:1 :1 ˆˆ 2 1 2 1 2 1 exp 2 1 | || | kkkkk T kkkk T kkkkkkkkk T kkkkkkk T kkk k T kkkk kkknkk kkkk kk xHzRHPHxHzxxPxxxHzRxHz RHPH RPZzp Zxpxzp Zxp  π from which
    103 ( ) () ( ) ( ) ( ) [ ] ( )1| 1 1|1|1| 1 1|1| 1 − − −−− − −− − −+−−−−+−− kkkk T kkkkk T kkkkkkkkk T kkkkkkk T kkk xHzHPHRxHzxxPxxxHzRxHz  ( )[ ] ( )[ ] ( ) ( ) ( ) [ ] ( ) ( ) [ ]{ }( ) ( ) ( ) ( ) ( ) ( ) [ ]( )1| 11 1|1|1| 1 1|1| 1 1| 1| 1 1| 1 1|1| 1 1|1| 1| 1 1|1|1|1| 1 1|1| − −− −−− − −− − − − − − − −− − −− − − −−−− − −− −+−+−−−−−− −+−−=−+−− −−+−−−−−−= kkkkk T kkk T kkkkkkkk T kkkkkkkkk T k T kkk kkkk T kkkkkk T kkkkkkkk T kkkkk T kkkk kkkkk T kkkkkkkkkkkk T kkkkkkkk xxHRHPxxxxHRxHzxHzRHxx xHzHPHRRxHzxHzHPHRxHz xxPxxxxHxHzRxxHxHz    [ ] [ ] 1111 1| 1111 1| 1 −−−− − −−−− − − ++/−/=+− k T kkk T kkkkkkk LemmaMatrixInverse T kkkkkk RHHRHPHRRRHPHRRwe have Define: [ ] [ ] 1 1| 1 1| 1 1| 1 1| 111 1|| : − − − − − − − − −−− − +−=+= kk T k T kkkkkkkkkk LemmaMatrixInverse kk T kkkkk PHHPHRHPPHRHPP ( )[ ] ( )[ ]1| 1 |1| 1 |1| 1 |1| − − − − − − − −+−−+−= kkkkk T kkkkkkkk T kkkkk T kkkkkk xHzRHPxxPxHzRHPxx  ( ) ( ) ( )[ ] ( )[ ]      −+−−+−−⋅= − − − − − − − 1| 1 |1| 1 |1| 1 |1|2/1 | 2/:1| 2 1 exp 2 1 | kkkkk T kkkkkkkk T kkkkk T kkkkkk kk nkkzx xHzRHPxxPxHzRHPxx P Zxp  π Recursive Bayesian EstimationSOLO Linear Gaussian Markov Systems (continue – 7) Correction Step (after zk measurement) 1st Way (continue – 3) then ( ) ( ) ( ) ( ) ( ) [ ] ( )1| 1 1|1|1| 1 1|1| 1 − − −−− − −− − −+−−−−+−− kkkkk T kkkk T kkkkkkkkk T kkkkkkk T kkk xHzRHPHxHzxxPxxxHzRxHz  ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )( ) ( ) ( )1| 1 |1|1| 1 || 1 1| 1| 1 | 1 |1|1| 1 | 1 || 1 1| − − −− −− − − −− −− −−− − −−+−−− −−−−−= kkkkk T kkkkkkkkkkkk T kkkk kkkkk T kkkkk T kkkkkkkk T kkkkkkkkk T kkkk xxPxxxxPPHRxHz xHzRHPPxxxHzRHPPPHRxHz  
    104 then ( )kkzx x Zxp k :1| |max () { }kk kkkkk T kkkkkkkk ZxE xHzRHPxxx :1 1| 1 |1| * | | ˆˆ:ˆ = −+== − − − Recursive Bayesian EstimationSOLO Linear Gaussian Markov Systems (continue – 8) Correction Step (after zk measurement) 1st Way (continue – 4) ( ) ( ) ( )[ ] ( )[ ]      −+−−+−−⋅= − − − − − − − 1| 1 1| 1 |1| 1 1|2/1 | 2/:1| 2 1 exp 2 1 | kkkkk T kkkkkk T kkkkk T kkkk kk nkkzx xHzRHxxPxHzRHxx P Zxp  π where:[ ] ( )( ){ }k T kkkkkkkk T kkkkk ZxxxxEHRHPP :1|| 111 1|| ˆˆ: −−=+= −−− −
105
SOLO Recursive Bayesian Estimation
Linear Gaussian Markov Systems (continue – 9)
Summary 1st Way – Kalman Filter
Initial Conditions:
\hat{x}_{0|0} = E\{x_0\},   P_{0|0} := E\{(x_0 - \hat{x}_{0|0})(x_0 - \hat{x}_{0|0})^T\}
Prediction phase (before the z_k measurement):
\hat{x}_{k|k-1} = \Phi_{k-1} \hat{x}_{k-1|k-1} + G_{k-1} u_{k-1}
P_{k|k-1} = \Phi_{k-1} P_{k-1|k-1} \Phi_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T
Correction Step (after the z_k measurement), with z_k = H_k x_k + v_k:
\hat{z}_{k|k-1} = E\{z_k | Z_{1:k-1}\} = E\{H_k x_k + v_k | Z_{1:k-1}\} = H_k \hat{x}_{k|k-1}
P_{k|k} := [P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1}
K_k := P_{k|k} H_k^T R_k^{-1}
\hat{x}_{k|k} = E\{x_k | Z_{1:k}\} = \hat{x}_{k|k-1} + K_k (z_k - H_k \hat{x}_{k|k-1}) = \hat{x}_{k|k-1} + K_k (z_k - \hat{z}_{k|k-1}) = \hat{x}_{k|k-1} + K_k i_k
    106 Recursive Bayesian EstimationSOLO LinearGaussian Markov Systems (continue – 10) kkkk vxHz += ( ) ( )Rvvpv ,0;N= ( ) ( )       −= − vRv R vp T pv 1 2/12/ 2 1 exp 2 1 π ( ) ( ) [ ] [ ] [ ]       −+−− + = − − −− − 1| 1 1|1|2/1 1| 2/ ˆˆ 2 1 exp 2 1 kkkkk T kkkk T kkkk k T kkkk p kz xHzRHPHxHz RHPH zp π from which { } 1|1:11| ˆ|ˆ −−− == kkkkkkk xHZzEz ( ) ( ){ } kk T kkkkk T kkkkkk zz kk SRHPHZzzzzEP =+=−−= −−−−− :ˆˆ 1|1:11|1|1| [ ][ ]{ } [ ] ( )[ ]{ } T kkkk T kkkkkkkk k T kkkkkk xz kk HPZvxxHxxE ZzzxxEP 1|1:11|1| 1:11|1|1| ˆˆ ˆˆ −−−− −−−− =+−−= −−= We also have Correction Step (after zk measurement) 2nd Way Define the innovation: 1|1| ˆˆ: −− −=−= kkkkkk xHzzzi
    107 Recursive Bayesian EstimationSOLO Jointand Conditional Gaussian Random Variables       = k k k z x yDefine: assumed that they are Gaussian distributed Prediction phase (before zk measurement) 2nd way (continue -1) { }         =             = − − − − − 1| 1| 1:1 1:1 1:1 ˆ ˆ | | | kk kk kk kk kk z x Zz Zx EZyE         =                 − −         − − = −− −− − − − − − − zz kk zx kk xz kk xx kk k T kkk kkk kkk kkkyy kk PP PP Z zz xx zz xx EP 1|1| 1|1| 1:1 1| 1| 1| 1| 1| ˆ ˆ ˆ ˆ where: [ ][ ]{ } 1|1:11|1|1| ˆˆ −−−−− =−−= kkk T kkkkkk xx kk PZxxxxEP [ ][ ]{ } kk T kkkkk T kkkkkk zz kk SRHPHZzzzzEP =+=−−= −−−−− :ˆˆ 1|1:11|1|1| [ ][ ]{ } T kkkk T kkkkkk xz kk HPZzzxxEP 1|1:11|1|1| ˆˆ −−−−− =−−= Linear Gaussian Markov Systems (continue – 11)
    108 ( ) () ( )    −−−= − − −− − − 1| 1 1|1|2/1 1| 1:1, ˆˆ 2 1 exp 2 1 |, kkk yy kk T kkk yy kk kkkzx yyPyy P Zzxp π Recursive Bayesian EstimationSOLO Joint and Conditional Gaussian Random Variables The conditional probability distribution function (pdf) of xk given zk is given by: Prediction phase (before zk measurement) 2nd Way (continue – 2) ( ) ( ) ( )      −−−= − − −− − − 1| 1 1|1|2/1 1| 1:1 ˆˆ 2 1 exp 2 1 | kkk zz kk T kkk zz kk kkz zzPzz P Zzp π ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )    −−−     −−− === − − −− − − −− − − − − − 1| 1 1|1| 1| 1 1|1| 2/1 1| 2/1 1| 1:1 1:1, |1:1| ˆˆ 2 1 exp ˆˆ 2 1 exp 2 2 | |, |,| kkk zz kk T kkk kkk yy kk T kkk yy kk zz kk kkz kkkzx kkzxkkkzx zzPzz yyPyy P P Zzp Zzxp zxpZzxp π π ( ) ( ) ( ) ( )    −−+−−−= − − −−− − −− − − 1| 1 1|1|1| 1 1|1|2/1 1| 2/1 1| ˆˆ 2 1 ˆˆ 2 1 exp 2 2 kkk zz kk T kkkkkk yy kk T kkk yy kk zz kk zzPzzyyPyy P P π π Linear Gaussian Markov Systems (continue – 12) We assumed that is Gaussian distributed:      = k k k z x y
    109 Recursive Bayesian EstimationSOLO Jointand Conditional Gaussian Random Variables Prediction phase (before zk measurement) 2nd Way (continue – 3) ( ) ( ) ( ) ( ) ( )    −−+−−−= − − −−− − −− − − 1| 1 1|1|1| 1 1|1|2/1 1| 2/1 1| | ˆˆ 2 1 ˆˆ 2 1 exp 2 2 | kkk zz kk T kkkkkk zz kk T kkk yy kk zz kk kkzx zzPzzyyPyy P P zxp π π Define: 1|1| ˆ:&ˆ: −− −=−= kkkkkkkk zzxx ςξ ( ) ( ) ( ) ( ) k zz kk T kk zz kk T kk zx kk T kk xz kk T kk xx kk T k kkkzz T k k k zz kk zx kk xz kk xx kk T k k k zz kk T k k k zz kk zx kk xz kk xx kk T k k kkk zz kk T kkkkkk yy kk T kkk PTTTT P TT TT P PP PP zzPzzyyPyyq ςςςςξςςξξξ ςς ς ξ ς ξ ςς ς ξ ς ξ 1 1|1|1|1|1| 1 1| 1|1| 1|1| 1 1| 1 1|1| 1|1| 1| 1 1|1|1| 1 1|1| ˆˆˆˆ: − −−−−− − − −− −− − − − −− −− − − −−− − −− −+++= −                    = −                    = −−−−−= Linear Gaussian Markov Systems (continue – 13)
    110 Recursive Bayesian EstimationSOLO Jointand Conditional Gaussian Random Variables Prediction phase (before zk measurement) 2nd way (continue – 4) Using Inverse Matrix Lemma: ( ) ( ) ( ) ( )         −−− −−− =      −−−−− −−−−−− 11111 111111 nxmnxnmxnmxmmxnmxmnxmnxnmxnmxm mxmnxmmxnmxmnxmnxnmxnmxmnxmnxn mxmmxn nxmnxn BADCDCBADC CBDCBADCBA CD BA         =         −− −− − −− −− zz kk zx kk xz kk xx kk zz kk zx kk xz kk xx kk TT TT PP PP 1|1| 1|1| 1 1|1| 1|1| in 1 1|1|1| 1 1| 1| 1 1|1|1| 1 1| 1| 1 1|1|1| 1 1| − −−− − − − − −−− − − − − −−− − − −= −= −= zz kk xz kk xz kk xx kk xz kk xx kk zx kk zz kk zz kk kkzxkkzzkkxzkkxxkkxx PPTT TTTTP PPPPT k zz kk T kk zz kk T kk zx kk T kk xz kk T kk xx kk T k PTTTTq ςςςςξςςξξξ 1 1|1|1|1|1| − −−−−− −+++= ( ) k zz kk T kk zz kk T k k xz kk xx kk zx kk T kk xz kk xx kk zx kk T kk xz kk T kk xx kk xx kk zx kk T k T k PT TTTTTTTTTT ςςςς ςςςςςξξςξ 1 1|1| 1| 1 1|1|1| 1 1|1|1|1| 1 1|1| − −− − − −−− − −−−− − −− −+ −+++= ( ) ( ) ( ) ( ) ( )k xz kk xx kkk xx kk T k xz kk xx kkkk zz kk xz kk xx kkkkzx zz kk T k k xz kk xx kk xx kk T k xz kk xx kkkk xx kk T k xz kk xx kkk TT TTTTTPTTTT TTTTTTTT zx kk Txz kk ςξςξςς ςςξξςξ 1| 1 1|1|1| 1 1| 0 1|1| 1 1|1|1| 1| 1 1|1|1| 1 1|1|1| 1 1| 1|1| − − −−− − −−− − −−− − − −−− − −−− − − = ++=−−+ +++= −−    Linear Gaussian Markov Systems (continue – 14)
    111 Recursive Bayesian EstimationSOLO Jointand Conditional Gaussian Random Variables Prediction phase (before zk measurement) 2nd way (continue – 5)         =         −− −− − −− −− zz kk zx kk xz kk xx kk zz kk zx kk xz kk xx kk TT TT PP PP 1|1| 1|1| 1 1|1| 1|1| 1 1|1|1| 1 1| 1| 1 1|1|1| 1 1| 1| 1 1|1|1| 1 1| − −−− − − − − −−− − − − − −−− − − −= −= −= zz kk xz kk xz kk xx kk xz kk xx kk zx kk zz kk zz kk kkzxkkzzkkxzkkxxkkxx PPTT TTTTP PPPPT ( ) ( )k xz kk xx kkk xx kk T k xz kk xx kkk TTTTTq ςξςξ 1| 1 1|1|1| 1 1| − − −−− − − ++= 1|1| ˆ:&ˆ: −− −=−= kkkkkkkk zzxx ςξ ( ) ( )[ ] ( )[ ]       −−−−−−−=       −= −−−−− − − − − 1|1|1|1|1|2/1 1| 2/1 1| 2/1 1| 2/1 1| | ˆˆˆˆ 2 1 exp 2 2 2 1 exp 2 2 | kkkkkkk xx kk T kkkkkkk yy kk zz kk yy kk zz kk kkzx zzKxxTzzKxx P P q P P zxp π π π π ( )1| 1 1|1|1| 1 1|1| ˆˆ − − −−− − −− −−−=+ kkk K zz kk xz kkkkkk xx kk xz kkk zzPPxxTT k   ςξ Linear Gaussian Markov Systems (continue – 15)
    112 Recursive Bayesian EstimationSOLO Jointand Conditional Gaussian Random Variables Prediction phase (before zk measurement) 2nd Way (continue – 6) ( ) ( )[ ] ( )[ ]      −−−−−−−= − − −−−−− − −−− 1| 1 1|1|1|1|1| 1 1|1|1|| ˆˆˆˆ 2 1 exp| kkk xx kk xz kkkkk xx kk T kkk xx kk xz kkkkkkkzx zzPPxxTzzPPxxczxp From this we can see that { } ( )1| 1 1|1|1|| ˆˆˆ| − − −−− −+== kkk K zz kk xz kkkkkkkk zzPPxxzxE k    ( )( ){ } T k zz kkk xx kk zx kk zz kk xz kk xx kk xx kkk T kkkkkk xx kk KPKP PPPPTZxxxxEP 1|1| 1| 1 1|1|1| 1 1|:1||| ˆˆ −− − − −−− − − −= −==−−= [ ][ ]{ } 1|1:11|1|1| ˆˆ −−−−− =−−= kkk T kkkkkk xx kk PZxxxxEP [ ][ ]{ } k T kkkkkk T kkkkkk zz kk SHPHRZzzzzEP =+=−−= −−−−− :ˆˆ 1|1:11|1|1| [ ][ ]{ } T kkkk T kkkkkk xz kk HPZzzxxEP 1|1:11|1|1| ˆˆ −−−−− =−−= Linear Gaussian Markov Systems (continue – 16)
    113 Recursive Bayesian EstimationSOLO Jointand Conditional Gaussian Random Variables Prediction phase (before zk measurement) 2nd Way (continue – 7) From this we can see that ( ) [ ] 111 1|1| 1 1|1|1|| −−− −− − −−− +=+−= kk T kkkkkk T kkkkk T kkkkkkk HRHPPHHPHRHPPP ( ) 1 1| 1 1|1| 1 1|1| − − − −− − −− =+== k T kkk T kkkkk T kkk zz kk xz kkk SHPHPHRHPPPK Linear Gaussian Markov Systems (continue – 17) kk T kkkkk KSKPP −= −1|| or [ ][ ]{ } 1|1:11|1|1| ˆˆ −−−−− =−−= kkk T kkkkkk xx kk PZxxxxEP [ ][ ]{ } k T kkkkkk T kkkkkk zz kk SHPHRZzzzzEP =+=−−= −−−−− :ˆˆ 1|1:11|1|1| [ ][ ]{ } T kkkk T kkkkkk xz kk HPZzzxxEP 1|1:11|1|1| ˆˆ −−−−− =−−=
    114 We found thatthe optimal Kk is [ ] 1 1|1| − −− += T kkkkk T kkkk HPHRHPK [ ] [ ] 1111 |1 11 & 1 |1 1 1| 1 −−−− + −−− + +−=+ − − − k T kkk T kkkkkk LemmaMatrixInverse existPR T kkkkk RHHRHPHRRHPHR kkk [ ] 1111 1| 1 1| 1 1| −−−− − − − − − +−= k T kkk T kkkkk T kkkk T kkkk RHHRHPHRHPRHPK [ ]{ } [ ] 1111 |1 111 |1|1 −−−− + −−− ++ +−+= k T kkk T kkkkk T kkk T kkkkk RHHRHPHRHHRHPP [ ] 1 | 1111 |1 −−−−− + =+= RHPRHHRHPK T kkk T kkk T kkkk If Rk -1 and Pk|k-1 -1 exist: Recursive Bayesian EstimationSOLO Linear Gaussian Markov Systems (continue – 18) Relation Between 1st and 2nd ways 2nd Way 1st Way = 2nd Way
    115 1|1| ˆˆ: −−−=−= kkkkkkkk zzxHzi Recursive Bayesian EstimationSOLO Linear Gaussian Markov Systems (continue – 19) Innovation The innovation is the quantity: We found that: { } ( ){ } { } 0ˆ||ˆ| 1|1:11:11|1:1 =−=−= −−−−− kkkkkkkkkk zZzEZzzEZiE [ ][ ]{ } { } k T kkkkkk T kkk T kkkkkk SHPHRZiiEZzzzzE =+==−− −−−−− :ˆˆ 1|1:11:11|1| Using the smoothing property of the expectation: { }{ } ( ) ( ) ( ) ( ) ( ) ( ) ( ) { }xEdxxpxdxdyyxpx dxdyypyxpxdyypdxyxpxyxEE x X x y YX x y yxp YYX y Y x YX YX ==         =           =      = ∫∫ ∫ ∫ ∫∫ ∫ ∞+ −∞= ∞+ −∞= ∞+ −∞= ∞+ −∞= ∞+ −∞= ∞+ −∞= ∞+ −∞= , || , , || ,    { } { }{ }1:1 −= k T jk T jk ZiiEEiiEwe have: Assuming, without loss of generality, that k-1 ≥ j, and innovation I (j) is Independent on Z1:k-1, and it can be taken outside the inner expectation: { } { }{ } { } 0 0 1:11:1 =         == −− T jkkk T jk T jk iZiEEZiiEEiiE 
    116 1|1| ˆˆ: −−−=−= kkkkkkkk zzxHzi Recursive Bayesian EstimationSOLO Linear Gaussian Markov Systems (continue – 20) Innovation (continue – 1) The innovation is the quantity: We found that: { } ( ){ } { } 0ˆ||ˆ| 1|1:11:11|1:1 =−=−= −−−−− kkkkkkkkkk zZzEZzzEZiE { } k T kkkkkk T kk SHPHRZiiE =+= −− :1|1:1 { } 0= T jk iiE { } jik T jk SiiE δ= The uncorrelated ness property of the innovation implies that since they are Gaussian, the innovation are independent of each other and thus the innovation sequence is Strictly White. Thus the innovation sequence is zero mean and white for the Kalman (Optimal) Filter. Without the Gaussian assumption, the innovation sequence is Wide Sense White. Table of Content
117
SOLO Recursive Bayesian Estimation
Closed-Form Solutions of Estimation
Closed-form solutions for the Optimal Recursive Bayesian Estimation can be derived only for special cases. The most important case:
• Dynamic and measurement models are linear
x_k = f(k-1, x_{k-1}, u_{k-1}, w_{k-1})  →  x_k = \Phi_{k-1} x_{k-1} + G_{k-1} u_{k-1} + \Gamma_{k-1} w_{k-1}
z_k = h(k, x_k, u_k, v_k)  →  z_k = H_k x_k + v_k
• Random noises are Gaussian
p_w(w) = \mathcal{N}(w; 0, Q) = \frac{1}{(2\pi)^{n/2} |Q|^{1/2}} \exp\left(-\frac{1}{2} w^T Q^{-1} w\right)
p_v(v) = \mathcal{N}(v; 0, R) = \frac{1}{(2\pi)^{p/2} |R|^{1/2}} \exp\left(-\frac{1}{2} v^T R^{-1} v\right)
• Solution: KALMAN FILTER
• In other non-linear/non-Gaussian cases: USE APPROXIMATIONS
118
SOLO Recursive Bayesian Estimation
Closed-Form Solutions of Estimation (continue – 1)
• Dynamic and measurement models are linear
x_k = \Phi_{k-1} x_{k-1} + G_{k-1} u_{k-1} + \Gamma_{k-1} w_{k-1}
z_k = H_k x_k + v_k
• The Optimal Estimator is the Kalman Filter, developed by R. E. Kalman in 1960 (Rudolf E. Kalman, 1920 – )
e_x(k) := x(k) - E\{x(k)\},   E\{e_x(k) e_x^T(k)\} = P_x(k)
e_w(k) := w(k) - E\{w(k)\} = w(k),   E\{e_w(k) e_w^T(l)\} = Q(k) \delta_{k,l}
e_v(k) := v(k) - E\{v(k)\} = v(k),   E\{e_v(k) e_v^T(l)\} = R(k) \delta_{k,l}
E\{e_w(k) e_v^T(l)\} = 0,   \delta_{k,l} = 1 if k = l, 0 if k ≠ l
• The K.F. is an Optimal Estimator, in the Minimum Mean Square Error (MMSE) sense, if:
- the state and measurement models are linear
- the random elements are Gaussian
• Under those conditions, the covariance matrix is:
- independent of the state (it can be calculated off-line)
- equal to the Cramér – Rao lower bound
Table of Content
119
SOLO Kalman Filter State Estimation in a Linear System (one cycle)
0  Initialization:  \hat{x}_0 = E\{x_0\},   P_{0|0} = E\{(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T\}
1  State vector prediction:  \hat{x}_{k|k-1} = \Phi_{k-1} \hat{x}_{k-1|k-1} + G_{k-1} u_{k-1}
2  Covariance matrix extrapolation:  P_{k|k-1} = \Phi_{k-1} P_{k-1|k-1} \Phi_{k-1}^T + Q_{k-1}
3  Innovation covariance:  S_k = H_k P_{k|k-1} H_k^T + R_k
4  Gain matrix computation:  K_k = P_{k|k-1} H_k^T S_k^{-1}
5  Measurement & innovation:  i_k = z_k - \hat{z}_{k|k-1} = z_k - H_k \hat{x}_{k|k-1}
6  Filtering:  \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k i_k
7  Covariance matrix updating:
P_{k|k} = P_{k|k-1} - P_{k|k-1} H_k^T S_k^{-1} H_k P_{k|k-1} = P_{k|k-1} - K_k S_k K_k^T = (I - K_k H_k) P_{k|k-1} = (I - K_k H_k) P_{k|k-1} (I - K_k H_k)^T + K_k R_k K_k^T
k := k + 1 and return to 1.
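The seven steps above map directly to code. The following sketch is an illustrative addition: the matrices Phi, G, H, Q, R and the initial moments are assumed inputs, and the Joseph form of step 7 is used for numerical symmetry.

```python
import numpy as np

def kalman_cycle(x_est, P_est, z, u, Phi, G, H, Q, R):
    """One prediction/correction cycle of the linear Kalman filter."""
    # 1-2: prediction of the state and its covariance
    x_pred = Phi @ x_est + G @ u
    P_pred = Phi @ P_est @ Phi.T + Q
    # 3-5: innovation, innovation covariance and gain
    i = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    # 6-7: filtering (state update and Joseph-form covariance update)
    x_upd = x_pred + K @ i
    I_KH = np.eye(P_pred.shape[0]) - K @ H
    P_upd = I_KH @ P_pred @ I_KH.T + K @ R @ K.T
    return x_upd, P_upd
```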
120
SOLO Kalman Filter State Estimation in a Linear System (one cycle)
(Figure: tracking-system block diagram – Input Data, Sensor Data Processing and Measurement Formation, Observation-to-Track Association, Gating Computations, Track Maintenance (Initialization, Confirmation and Deletion), Filtering and Prediction. Portrait: Rudolf E. Kalman, 1920 – .)
Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
121
SOLO Recursive Bayesian Estimation
General Bayesian Nonlinear Filters
• Additive Gaussian Noise
  - Extended Kalman Filter (EKF)
  - Gauss Hermite Kalman Filter (GHKF)
  - Unscented Kalman Filter (UKF)
  - Monte Carlo Kalman Filter (MCKF)
• Non-Additive Non-Gaussian Noise
  - Non-Resampling Particle Filters: Gaussian Particle Filter (GPF), Gauss Hermite Particle Filter (GHPF), Unscented Particle Filter (UPF), Monte Carlo Particle Filter (MCPF)
  - Resampling Particle Filters: Sequential Importance Sampling Particle Filter (SIS PF), Bootstrap Particle Filter (BPF)
Table of Content
122
SOLO Extended Kalman Filter
In the Extended Kalman Filter (EKF), the state transition and observation models need not be linear functions of the state but may instead be (differentiable) functions:
x(k+1) = f[k, x(k), u(k)] + w(k)            State vector dynamics
z(k+1) = h[k+1, x(k+1), u(k+1)] + v(k+1)    Measurements
e_x(k) := x(k) - E\{x(k)\},   E\{e_x(k) e_x^T(k)\} = P_x(k)
e_w(k) := w(k) - E\{w(k)\} = w(k),   E\{e_w(k) e_w^T(l)\} = Q(k) \delta_{k,l}
e_v(k) := v(k) - E\{v(k)\} = v(k),   E\{e_v(k) e_v^T(l)\} = R(k) \delta_{k,l},   E\{e_w(k) e_v^T(l)\} = 0
The function f can be used to compute the predicted state from the previous estimate and, similarly, the function h can be used to compute the predicted measurement from the predicted state. However, f and h cannot be applied to the covariance directly. Instead, a matrix of partial derivatives (the Jacobian) is computed from the Taylor expansions:
e_x(k+1) = f[k, x(k), u(k)] - f[k, E\{x(k)\}, u(k)] + w(k) = \left.\frac{\partial f}{\partial x}\right|_{E\{x(k)\}} e_x(k) + \frac{1}{2} e_x^T(k) \left.\frac{\partial^2 f}{\partial x^2}\right|_{E\{x(k)\}} e_x(k) + \dots + e_w(k)
e_z(k+1) = h[k+1, x(k+1), u(k+1)] - h[k+1, E\{x(k+1)\}, u(k+1)] + v(k+1) = \left.\frac{\partial h}{\partial x}\right|_{E\{x(k+1)\}} e_x(k+1) + \frac{1}{2} e_x^T(k+1) \left.\frac{\partial^2 h}{\partial x^2}\right|_{E\{x(k+1)\}} e_x(k+1) + \dots + e_v(k+1)
where the first derivatives are the Jacobians and the second derivatives the Hessians; the EKF retains only the first-order (Jacobian) terms.
Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986; Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
123
SOLO Extended Kalman Filter State Estimation (one cycle)
0  Initialization (k = 0):  \hat{x}_0 = E\{x_0\},   P_{0|0} = E\{(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T\}
1  State vector prediction:  \hat{x}_{k|k-1} = f(k-1, \hat{x}_{k-1|k-1}, u_{k-1})
2  Jacobians computation:  \Phi_{k-1} = \left.\frac{\partial f}{\partial x}\right|_{\hat{x}_{k-1|k-1}},   H_k = \left.\frac{\partial h}{\partial x}\right|_{\hat{x}_{k|k-1}}
3  Covariance matrix extrapolation:  P_{k|k-1} = \Phi_{k-1} P_{k-1|k-1} \Phi_{k-1}^T + Q_{k-1}
4  Innovation covariance:  S_k = H_k P_{k|k-1} H_k^T + R_k
5  Gain matrix computation:  K_k = P_{k|k-1} H_k^T S_k^{-1}
6  Measurement & innovation:  \hat{z}_{k|k-1} = h(k, \hat{x}_{k|k-1}),   i_k = z_k - \hat{z}_{k|k-1}
7  Filtering:  \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k i_k
8  Covariance matrix updating:
P_{k|k} = P_{k|k-1} - P_{k|k-1} H_k^T S_k^{-1} H_k P_{k|k-1} = P_{k|k-1} - K_k S_k K_k^T = (I - K_k H_k) P_{k|k-1} = (I - K_k H_k) P_{k|k-1} (I - K_k H_k)^T + K_k R_k K_k^T
k := k + 1 and return to 1.
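A code sketch of the cycle above (illustrative only; the user supplies f, h and their Jacobians f_jac, h_jac, which are assumed names, together with Q and R):

```python
import numpy as np

def ekf_cycle(x_est, P_est, z, u, f, f_jac, h, h_jac, Q, R):
    """One cycle of the Extended Kalman Filter (first-order linearization)."""
    # 1-3: nonlinear state prediction, Jacobian, covariance extrapolation
    x_pred = f(x_est, u)
    Phi = f_jac(x_est, u)
    P_pred = Phi @ P_est @ Phi.T + Q
    # 4-6: measurement Jacobian, innovation covariance, gain, innovation through the nonlinear h
    H = h_jac(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    i = z - h(x_pred)
    # 7-8: state and covariance update
    x_upd = x_pred + K @ i
    P_upd = P_pred - K @ S @ K.T
    return x_upd, P_upd
```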
124
SOLO Extended Kalman Filter State Estimation (one cycle)
(Figure: tracking-system block diagram – Input Data, Sensor Data Processing and Measurement Formation, Observation-to-Track Association, Gating Computations, Track Maintenance (Initialization, Confirmation and Deletion), Filtering and Prediction. Portrait: Rudolf E. Kalman, 1920 – .)
Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
125
SOLO Extended Kalman Filter
Criticism of the Extended Kalman Filter
Unlike its linear counterpart, the Extended Kalman Filter is not an optimal estimator. In addition, if the initial estimate of the state is wrong, or if the process is modeled incorrectly, the filter may quickly diverge, owing to its linearization.
Another problem with the Extended Kalman Filter is that the estimated covariance matrix tends to underestimate the true covariance matrix and therefore risks becoming inconsistent in the statistical sense unless "stabilizing noise" is added.
Having stated this, the Extended Kalman Filter can give reasonable performance, and is arguably the de facto standard in navigation systems and GPS.
Table of Content
    126 SOLO Additive Gaussian NonlinearFilter Consider the case of a Markovian process where the noise is additive and Gaussian: ( ) ( ) kkk kkk vxhz wxfx += += −− 11 ( ) ( )kkkw Qwwp ,0;N= ( ) ( )kkkv Rvvp ,0;N= ( ) ( )       −= kk T k k nkw wQw Q wp 2 1 exp 2 1 2/12/ π ( ) ( )     −= − kk T k k pkv vRv R vp 1 2/12/ 2 1 exp 2 1 π where wk and vk are independent white noises Gaussian, with zero mean and covariances Qk and Rk, respectively: Recursive Bayesian Estimation Therefore, since f (xk-1) is a deterministic function, by adding the Gaussian noise wk-1, we obtain xk also a Gaussian random variable. ( ) ( )( )111:11 ,;,| −−−− = kkkkkk QxfxZxxp N
    127 SOLO Additive Gaussian NonlinearFilter (continue – 1) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation ( ) ( )( )111:11 ,;,| −−−− = kkkkkk QxfxZxxp N ( ) ( ) ( )1:111:111:11 |,||, −−−−−− = kkkkk Bayes kkk ZxpZxxpZxxp ( ) ( ) ( ) ( )∫∫ −−−−−−−−− == 11:111:1111:111:1 |,||,| kkkkkkkkkkkk xdZxpZxxpxdZxxpZxp Using: we obtain: ( ) ( )( ) ( )∫ −−−−−− = 11:11111:1 |,;| kkkkkkkk xdZxpQxfxZxp N { } ( ) ( )( ) ( )[ ]∫ ∫∫ −−−−−−−− === kkkkkkkkkkkkkkkk xdxdZxpQxfxxxdZxpxZxEx 11:11111:11:11| |,;||:ˆ N ( )( )[ ] ( ) ( ) ( )∫∫ ∫ −−−−−−−−− == 11:11111:1111 ||,; kkkkkkkkkkkk xdZxpxfxdZxpxdQxfxx N Assume that is Gaussian with mean and covariance , then1−kx 1|1 ˆ −− kkx 1|1 −− kkP ( ) ( )1|11|111:11 ,ˆ;| −−−−−−− = kkkkkkk PxxZxp N { } ( ) ( )∫ −−−−−−−−− == 11|11|1111:11| ,ˆ;|ˆ k xx kkkkkkkkkk xdPxxxfZxEx N
    128 SOLO Additive Gaussian NonlinearFilter (continue – 2) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation ( ) ( )xx kkkkkkk PxxZxp 1|11|111:11 ,ˆ;| −−−−−−− = N { } ( ) ( )∫ −−−−−−−−− == 11|11|1111:11| ,ˆ;|ˆ k xx kkkkkkkkkk xdPxxxfZxEx N ( )( ){ } ( )[ ] ( )[ ]{ } ( )[ ] ( )[ ] ( )∫ −−−−−−−−−−−− −−−−−−−−−−− −+−+= −+−+=−−= 11|11|111|111|11 1:11|111|111:11|1|1| ,ˆ;ˆˆ |ˆˆ|ˆˆ k xx kkkkk T kkkkkkkk k T kkkkkkkkk T kkkkkk xx kk xdPxxxwxfxwxf ZxwxfxwxfEZxxxxEP N ( ) ( ) ( ) T kkkkkk xx kkkkkk T k xx kk xxQxdPxxxfxfP 1|1|111|11|11111| ˆˆ.ˆ, −−−−−−−−−−−− −+= ∫ N Let compute now { } ( )∫ −−−− == kkkkkkkkk xdZxpzZxzEz 1:11:111| |,|ˆ { } ( ) ( )[ ] ( )∫∫ −−−−−−− +=== k xx kkkkkkkk xx kkkkkkkkkkk xdPxxvxhxdPxxzZxzEz 1|1|1|1|1:111| ,ˆ;,ˆ;,|ˆ NN Since xk and vk are independent { } ( ) ( )∫ −−−−− == k xx kkkkkkkkkkk xdPxxxhZxzEz 1|1|1:111| ,ˆ;,|ˆ N Using the Gaussian approximation of p (xk| Z1:k-1) given by ( ) ( )xx kkkkkkk PxxZxp 1|1|1:1 ,ˆ;| −−− ≈ N
    129 SOLO Additive Gaussian NonlinearFilter (continue – 3) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation ( ) ( )xx kkkkkkk PxxZxp 1|1|1:1 ,ˆ;| −−− ≈ N Since xk and vk are independent { } ( ) ( )∫ −−−−− == k xx kkkkkkkkkkk xdPxxxhZxzEz 1|1|1:111| ,ˆ;,|ˆ N ( )( ){ } ( )[ ] ( )[ ]{ } ( )[ ] ( )[ ] ( )∫ −−−− −−−−−−− −+−+= −+−+=−−= k xx kkkkk T kkkkkkkk k T kkkkkkkkk T kkkkkk zz kk xdPxxzvxhzvxh ZzvxhzvxhEZzzzzEP 1|1|1|1| 1:11|1|1:11|1|1| .ˆ,ˆˆ |ˆˆ|ˆˆ N ( ) ( ) ( ) T kkkkkk xx kkkkkk T k zz kk zzRxdPxxxhxhP 1|1|1|1|1| ˆˆ,ˆ; −−−−− −+= ∫ N In the same way ( )( ){ } ( ) ( )[ ]{ } ( ) ( )[ ] ( )∫ −−−− −−−−−−− −+−= −+−=−−= k xx kkkkk T kkkkkkk k T kkkkkkkk T kkkkkk zx kk xdPxxzvxhxx ZzvxhxxEZzzxxEP 1|1|1|1| 1:11|1|1:11|1|1| .ˆ,ˆˆ |ˆˆ|ˆˆ N ( ) ( ) T kkkkk xx kkkkkk T k zx kk zxxdPxxxhxP 1|1|1|1|1| ˆˆ,ˆ; −−−−− −= ∫ N
    130 SOLO Additive Gaussian NonlinearFilter (continue – 4) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation { } ( ) ( )∫ −−−−− == k xx kkkkkkkkkkk xdPxxxhZxzEz 1|1|1:111| ,ˆ;,|ˆ N ( ) ( ) ( ) T kkkkkk xx kkkkkk T k zz kk zzRxdPxxxhxhP 1|1|1|1|1| ˆˆ,ˆ; −−−−− −+= ∫ N ( ) ( ) T kkkkk xx kkkkkk T k zx kk zxxdPxxxhxP 1|1|1|1|1| ˆˆ,ˆ; −−−−− −= ∫ N { } ( ) ( )∫ −−−−−−−−− == 11|11|1111:11| .ˆ;|ˆ k xx kkkkkkkkkk xdPxxxfZxEx N ( ) ( ) ( ) T kkkkkk xx kkkkkk T k xx kk xxQxdPxxxfxfP 1|1|111|11|11111| ˆˆ,ˆ; −−−−−−−−−−−− −+= ∫ N Summary Initialization0 { } ( ) ( ){ }T xxxxEP xEx 00000|0 00 ˆˆ ˆ −−= = For { }∞∈ ,,1 k State Prediction and its Covariance1 Measure Prediction and Covariances2 kx1−kx kz1−kz 0x 1x 2x 1z 2z kZ :11:1 −kZ ( ) 11 −− + kk wxf ( ) kk vxh + ( ) 00 wxf + ( ) 11 vxh + ( ) 11 wxf + ( ) 22 vxh +
    131 SOLO Additive Gaussian NonlinearFilter (continue – 5) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation Summary (continue – 1) We showed that the Kalman Filter, that uses this computations is given by: { } ( )1| 1 1|1|1|| ˆˆ|ˆ − − −−− −+== kkk K zz kk zx kkkkkkkk zzPPxzxEx k    ( )( ){ } T k zz kkk xx kk xz kk zz kk zx kk xx kkk T kkkkkk xx kk KPKP PPPPZxxxxEP 1 1|1| 1| 1 1|1|1|:1||| ˆˆ − −− − − −−− −= −=−−= Kalman Gain Computations3 1 1|1| − −−= zz kk xz kkk PPK k := k+1 & return to 1 Update State and its Covariance4
    132 SOLO Additive Gaussian NonlinearFilter (continue – 6) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation ( ) ( )∫= xdPxxxgI xx ,ˆ;N To obtain the Kalman Filter, we must approximate integrals of the type: Three approximation are presented: (1) Gauss – Hermite Quadrature Approximation (2) Unscented Transformation Approximation (3) Monte Carlo Approximation Table of Content
    133 SOLO Additive Gaussian NonlinearFilter (continue – 7) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation ( ) ( )∫= xdPxxxgI xx,ˆ;N To obtain the Kalman Filter, we must approximate integrals of the type: Gauss – Hermite Quadrature Approximation ( ) ( )[ ] ( ) ( )∫       −−−= − xdxxPxx P xgI xx T xx n ˆˆ 2 1 exp 2 1 1 2/1 π Let Pxx = ST S a Cholesky decomposition, and define: ( )xxSz ˆ 2 1 : 1 −= − ( ) ( )∫ − = zdezgI zz n T 2/ 2 2 π This integral can be approximated using the Gauss – Hermite quadrature rule: ( ) ( )∑∫ = − ≈ M i ii z zfwzdzfe 1 2 where the quadrature points zi and weights wi are defined as follows: Carl Friedrich Gauss 1777-1855 Charles Hermite 1822-1901 Andre – Louis Cholesky 1875 - 1918
    134 SOLO Additive Gaussian NonlinearFilter (continue – 8) ( ) ( ) kkk kkk vxhz wxfx += += −− 11 Recursive Bayesian Estimation Gauss – Hermite Quadrature Approximation (continue – 1) ( ) ( )∑∫ = − ≈ M i ii z zfwzdzfe 1 2 The quadrature points zi and weights wi are defined as follows: A set of orthonormal Hermite polynomials are generated from the recurrence relationship: ( ) ( ) ( ) ( ) ( )zH j j zH j zzH zHzH jjj 11 4/1 01 11 2 /1,0 −+ − + − + = == π or in matrix form: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )  ( ) Mj j zH zH zH zH zH zH zH z jM e M zh M J M M zh M M M ,,2,1 2 : 1 0 0 0 00 00 00 00 00 1 1 0 1 1 2 21 1 1 1 0          ==                 +                                 =             − − −− ββ β β β ββ β ( )  ( ) ( )zH j zH j zHz jjj jj 11 1 2 1 2 +− + + +=  ββ ( ) ( ) ( )zHezhJzhz MMMM β+=
    135 SOLO Additive Gaussian NonlinearFilter (continue –9) Recursive Bayesian Estimation Gauss – Hermite Quadrature Approximation (continue – 2) ( ) ( )∑∫ = − ≈ M i ii z zfwzdzfe 1 2 Orthonormal Hermitian Polynomials in matrix form: ( ) Mj j JJ j T M M M M ,,2,1 2 : 00 00 00 00 00 1 1 2 21 1   ===                     = − − β β β β ββ β ( ) ( ) ( )zHezhJzhz MMMM β+= Let evaluate this equation for the M roots zi for which ( ) MizH iM ,,2,10 == ( ) ( ) MizhJzhz iMii ,,2,1 == From this equation we can see that zi and are the eigenvalues and eigenvectors, respectively, of the symmetric matrix JM. ( ) ( ) ( ) ( )[ ] MizHzHzHzh T iMiii ,,1,,, 110  == − Because of the symmetry of JM the eigenvectors are orthogonal and can be normalized. Define: ( ) ( ) MjizHWWzHv M j ijiiij i j ,,2,1,:&/: 1 0 2 === ∑ − = We have: ( ) ( ) ( ) ( ) li li li li M j l lj i ij M j l j i j zhzh WWW zH W zH vv δ=⋅== ≠ − = − = ∑∑  0 1 0 1 0 1 : Table of Content
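The quadrature points and weights described above can be computed numerically from the symmetric tridiagonal matrix J_M (eigenvalues give the points, first eigenvector components give the weights). The sketch below is an illustrative addition: it assumes numpy, the physicists' weight e^{-z^2}, and a one-dimensional change of variables x = \hat{x} + \sqrt{2P} z for the Gaussian expectation.

```python
import numpy as np

def gauss_hermite(M):
    """Nodes z_i and weights w_i such that  int e^{-z^2} f(z) dz ~= sum_i w_i f(z_i)."""
    j = np.arange(1, M)
    beta = np.sqrt(j / 2.0)                   # recurrence coefficients of the orthonormal Hermite polynomials
    J = np.diag(beta, 1) + np.diag(beta, -1)  # symmetric tridiagonal Jacobi matrix J_M
    z, V = np.linalg.eigh(J)                  # eigenvalues = quadrature points
    w = np.sqrt(np.pi) * V[0, :] ** 2         # weights from the first component of each eigenvector
    return z, w

def gh_expectation(g, x_hat, P, M=5):
    """E{g(x)} for scalar x ~ N(x_hat, P), using the change of variables x = x_hat + sqrt(2P) z."""
    z, w = gauss_hermite(M)
    x = x_hat + np.sqrt(2.0 * P) * z
    return np.sum(w * g(x)) / np.sqrt(np.pi)
```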
136
SOLO Unscented Kalman Filter
When the state transition and observation models – that is, the predict and update functions f and h – are highly non-linear, the Extended Kalman Filter can give particularly poor performance [JU97]. This is because only the mean is propagated through the non-linearity.
The Unscented Kalman Filter (UKF) [JU97] uses a deterministic sampling technique known as the unscented transformation to pick a minimal set of sample points (called "sigma points") around the mean. These sigma points are then propagated through the non-linear functions, and the covariance of the estimate is then recovered. The result is a filter which more accurately captures the true mean and covariance. (This can be verified using Monte Carlo sampling or through a Taylor series expansion of the posterior statistics.) In addition, this technique removes the requirement to analytically calculate Jacobians, which for complex functions can be a difficult task in itself.
x_k = f(k-1, x_{k-1}, u_{k-1}) + w_{k-1}    State vector dynamics
z_k = h(k, x_k) + v_k                        Measurements
e_x(k) := x(k) - E\{x(k)\},   E\{e_x(k) e_x^T(k)\} = P_x(k);   E\{w(k) w^T(l)\} = Q(k)\delta_{k,l};   E\{v(k) v^T(l)\} = R(k)\delta_{k,l};   E\{w(k) v^T(l)\} = 0
The Unscented Algorithm uses \hat{x} = E\{x(k)\} and P_x(k) to determine \hat{z} = E\{z(k)\} and P_z(k).
    137 Unscented Kalman FilterSOLO () ( )[ ] ( ) n n j j j n x n x n x x x xx fx n xxf         ∂ ∂ =∇⋅ ∇⋅=+ ∑ ∑ = ∞ = 1 0 ˆ : ! 1 ˆ δδ δδ Develop the nonlinear function f in a Taylor series around xˆ Define also the operator ( )[ ] ( )xf x xfxfD n n j j jx n x n x x         ∂ ∂ =∇⋅= ∑=1 : δδδ Propagating Means and Covariances Through Nonlinear Transformations Consider a nonlinear function .( )xfy = Let compute Assume is a random variable with a probability density function pX (x) (known or unknown) with mean and covariance x { } ( ) ( ){ }Txx xxxxEPxEx ˆˆ,ˆ −−== ( ){ } { } ( )[ ]{ } ∑ ∑∑ ∑ ∞ = = ∞ = ∞ =                         ∂ ∂ =∇⋅= =+= 0 ˆ 10 ˆ 0 ! 1 ! 1 ! 1 ˆˆ n x n n j j j n x n x n n x f x xE n fxE n DE n xxfEy x δδ δ δ { } { } { } ( )( ){ } xxTT PxxxxExxE xxExE xxx =−−= =−= += ˆˆ 0ˆ ˆ δδ δ δ
    138 Unscented Kalman Filter SOLO PropagatingMeans and Covariances Through Nonlinear Transformations Consider a nonlinear function . (continue – 1) ( )xfy = { } { } { } ( )( ){ } xxTT PxxxxExxE xxExE xxx =−−= =−= += ˆˆ 0ˆ ˆ δδ δ δ ( ){ } ( ) +                         ∂ ∂ +                         ∂ ∂ +                         ∂ ∂ +                         ∂ ∂ +=                         ∂ ∂ =+= ∑∑∑ ∑∑ ∑ === = ∞ = = x n j j jx n j j jx n j j j x n j j j n x n n j j j f x xEf x xEf x xE f x xExff x xE n xxfEy xxx xx ˆ 4 1 ˆ 3 1 ˆ 2 1 ˆ 10 ˆ 1 !4 1 !3 1 !2 1 ˆ ! 1 ˆˆ δδδ δδδ Since all the differentials of f are computed around the mean (non-random)xˆ ( )[ ]{ } ( )[ ]{ } { }( )[ ] ( )[ ]xx xxT xxx TT xxx TT xxx fPfxxEfxxEfxE ˆˆˆˆ 2 ∇∇=∇∇=∇∇=∇⋅ δδδδδ ( )[ ]{ } { } { } 0 ˆ 1 0ˆ 1 ˆ0 ˆ =                 ∂ ∂ =                         ∂ ∂ =                 ∇⋅=∇⋅ ∑∑ == x n j j j x n j j j x xxx f x xEf x xEfxEfxE xx  δδδδ ( ){ } [ ]{ } ( ) ( )[ ] [ ]{ } [ ]{ } +++∇∇+==+= ∑ ∞ = xxxxxx xxT x n x n x fDEfDEfPxffDE n xxfEy ˆ 4 ˆ 3 ˆ 0 ˆ !4 1 !3 1 !2 1 ˆ ! 1 ˆˆ δδδδ
    139 Simon J. Julier UnscentedKalman FilterSOLO Propagating Means and Covariances Through Nonlinear Transformations Consider a nonlinear function . (continue - 2) ( )xfy = { } { } { } ( )( ){ } xxTT PxxxxExxE xxExE xxx =−−= =−= += ˆˆ 0ˆ ˆ δδ δ δ Unscented Transformation (UT), proposed by Julier and Uhlmann uses a set of “sigma points” to provide an approximation of the probabilistic properties through the nonlinear function Jeffrey K. Uhlman A set of “sigma points” S consists of p+1 vectors and their associated weights S = { i=0,1,..,p: x(i) , W(i) }. (1) Compute the transformation of the “sigma points” through the nonlinear transformation f: ( ) ( ) ( ) pixfy ii ,,1,0 == (2) Compute the approximation of the mean: ( ) ( ) ∑= ≈ p i ii yWy 0 ˆ The estimation is unbiased if: ( ) ( ) ( ) ( ) { } ( ) yWyyEWyWE p i i p i y ii p i ii ˆˆ 00 ˆ 0 ===       ∑∑∑ ===  ( ) 1 0 =∑= p i i W (3) The approximation of output covariance is given by ( ) ( ) ( ) ( ) ( )∑= −−≈ p i Tiiiyy yyyyWP 0 ˆˆ
    140 Unscented Kalman FilterSOLO PropagatingMeans and Covariances Through Nonlinear Transformations Consider a nonlinear function (continue – 3)( )xfy = One set of points that satisfies the above conditions consists of a symmetric set of symmetric p = 2nx points that lie on the covariance contour Pxx : th xn ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) x x ni x i xxxni i xxxi ni nWW nWW P W n xx P W n xx WWxx x x ,,1 2/1 2/1 1 ˆ 1 ˆ ˆ 0 0 0 0 0 00 =            −= −=         − −=         − += == + + where is the row or column of the matrix square root of nx Pxx /(1-W0) (the original covariance matrix Pxx multiplied by the number of dimensions of x, nx/(1-W0)). This implies: ( )( )i xx x WPn 01/ − xxx n i T i xxx i xxx P W n P W n P W nx 01 00 111 − =        −        − ∑= Unscented Transformation (UT) (continue – 1)
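As a concrete illustration of this symmetric sigma-point set (an added sketch, with W0 treated as a free design parameter, W0 < 1, and numpy's Cholesky factor used as the matrix square root):

```python
import numpy as np

def symmetric_sigma_points(x_mean, P, W0=1.0 / 3.0):
    """Symmetric set of 2*n_x + 1 sigma points on the sqrt(n_x) covariance contour."""
    n = x_mean.size
    S = np.linalg.cholesky(n * P / (1.0 - W0))    # columns are the square-root directions
    points, weights = [x_mean], [W0]
    for i in range(n):
        points.append(x_mean + S[:, i])
        points.append(x_mean - S[:, i])
        weights.extend([(1.0 - W0) / (2 * n)] * 2)
    return np.array(points), np.array(weights)

def unscented_transform(f, x_mean, P, W0=1.0 / 3.0):
    """Propagate the sigma points through f and recover the transformed mean and covariance."""
    X, W = symmetric_sigma_points(x_mean, P, W0)
    Y = np.array([f(x) for x in X])
    y_mean = W @ Y
    Pyy = (W[:, None] * (Y - y_mean)).T @ (Y - y_mean)
    return y_mean, Pyy
```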
    141 Unscented Kalman Filter SOLO PropagatingMeans and Covariances Through Nonlinear Transformations Consider a nonlinear function (continue – 3)( )xfy = Unscented Transformation (UT) (continue – 2) ( ) ( ) ( ) ( ) ( ) ( )         += = = == ∑ ∑ ∞ = − ∞ = 0 0 2,,1ˆ ! 1 ,,1ˆ ! 1 0ˆ n xx n x n x n x ii nnixfD n nixfD n ixf xfy i i   δ δ 1 2 Unscented Algorithm: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )∑∑ ∑ ∑ ∑∑ ∑∑ == = = ∞ = − = ∞ ==     ++ − + − +=     ++++ − += − + − +== x ii x i x iii x i x i x n i xx x n i x x n i xxx x n i n n x x n i n n x x n i ii UT xfDxfD n W xfD n W xf xfDxfDxfDxf n W xfW xfD nn W xfD nn W xfWyWy 1 640 1 20 1 6420 0 1 0 0 1 0 0 0 2 0 ˆ !6 1 ˆ !4 11 ˆ 2 11 ˆ ˆ !6 1 ˆ !4 1 ˆ !2 1 ˆ 1 ˆ ˆ ! 1 2 1 ˆ ! 1 2 1 ˆˆ   δδδ δδδ δδ ( ) i xxx i i P W n xxxx         − ±=±= 01 ˆˆ δ Since ( ) ( ) ( ) ( )    − =         ∂ ∂ −= ∑= − oddnxfD evennxfD xf x xxfD n x n x n n j j ij n x i i x i ˆ ˆ ˆˆ 1 δ δ δ δ
    142 Unscented Kalman Filter () ( ) ( ) ( )∑=     ++ − +∇∇+= x ii n i xx x xxT UT xfDxfD n W xfPxfy 1 640 ˆ !6 1 ˆ !4 11 ˆ 2 1 ˆˆ δδ ( ) i xxx i i P W n xxxx         − ±=±= 01 ˆˆ δ SOLO Propagating Means and Covariances Through Nonlinear Transformations Consider a nonlinear function (continue – 4)( )xfy = Unscented Transformation (UT) (continue – 3) Unscented Algorithm: ( ) ( ) ( ) ( ) ( )xfPxfP W n n W xfP W n P W n n W xfP W n P W n n W xfD n W xxTxxxT x n i T i xxx i xxxT x n i T i xxx i xxxT x n i x x x xx i ˆ 2 1 ˆ 12 11 ˆ 112 11 ˆ 112 11 ˆ 2 11 0 0 1 00 0 1 00 0 1 20 ∇∇=∇      − ∇ − =∇                 −        − ∇ − = ∇        −        − ∇ − = − ∑ ∑∑ = == δ Finally: We found ( ){ } [ ]{ } ( ) ( )[ ] [ ]{ } [ ]{ } +++∇∇+==+= ∑ ∞ = xxxxxx xxT x n x n x fDEfDEfPxffDE n xxfEy ˆ 4 ˆ 3 ˆ 0 ˆ !4 1 !3 1 !2 1 ˆ ! 1 ˆˆ δδδδ We can see that the two expressions agree exactly to the third order.
    143 Unscented Kalman Filter SOLO PropagatingMeans and Covariances Through Nonlinear Transformations Consider a nonlinear function (continue – 5)( )xfy = Unscented Transformation (UT) (continue – 4) Accuracy of the Covariance: ( ) ( ){ } { } ( ) ( ) ( ) ( ) ( ) ( ) ( )[ ] [ ]{ } [ ]{ } ( ) ( )[ ] [ ]{ } [ ]{ } T xxxxxx xxT x xxxxxx xxT x T m m xx n n xx TTTyy fDEfDEfPxf fDEfDEfPxf fD m xfDxffD n xfDxfE yyyyEyyyyEP       +++∇∇+⋅ ⋅      +++∇∇+−               ++      ++= −=−−= ∑∑ ∞ = ∞ =   ˆ 4 ˆ 3 ˆ ˆ 4 ˆ 3 ˆ 22 !4 1 !3 1 !2 1 ˆ !4 1 !3 1 !2 1 ˆ ! 1 ˆˆ ! 1 ˆˆ ˆˆˆˆ δδ δδ δδδδ ( ) ( ) ( ) ( ){ } ( ) ( ) ( ){ } ( ) ( ) ( ) ( )                     +       ++       ++= ∑∑ ∑∑ ∞ = ∞ = ∞ = ∞ = T m m x n n x T n n x T x T n n x T x T fD m fD n E xfxfD n ExfxfDExfD n ExfxfDExfxfxf 22 2 0 2 0 ! 1 ! 1 ˆˆ ! 1 ˆˆˆ ! 1 ˆˆˆˆˆ δδ δδδδ 
    144 Unscented Kalman Filter SOLO PropagatingMeans and Covariances Through Nonlinear Transformations Consider a nonlinear function (continue – 6)( )xfy = Unscented Transformation (UT) (continue – 5) Accuracy of the Covariance: ( ) ( ){ } { } ( )[ ] ( )[ ]{ } ( ) ( ) ( ) { } { }      0 1 1 22 0 1 1 ˆˆ !2!2 1 !! 1 4 1 ˆˆˆˆ > ∞ = ∞ = > ∞ = ∞ =       −       + ∇∇∇∇−= −=−−= ∑∑∑∑ ji i j Tj x i x ji i j Tj x i x T xx xxT xxx xxT x T x xx x TTTyy fDEfDE ji fDfD ji E fPfPP yyyyEyyyyEP δδδδ AA ( )[ ] ( )[ ]{ } ( ) ( ) ( ) ( ) ( ) { } { }       0 1 1 2 1 2 1 2 ~ 2 ~ 2 2 1 0 1 1 ~~ ˆˆ 4!2!2 1 !! 1 2 1 4 1 > ∞ = ∞ = = = = > ∞ = ∞ =       + −       + + ∇∇∇∇−= ∑∑ ∑∑ ∑ ∑∑ ji i j L k L m Tji L k ji i j Tji T xx xxT xxx xxT x T x xx x yy UT fDEfDE Lji fDfD jiL fPfPPP mk kk σσ σσ λ λ AA
146
SOLO Unscented Kalman Filter
(Figure: illustration of the unscented transformation – sigma points \chi_i = \{\hat{x}, \hat{x} \pm \alpha(\sqrt{P_x})_i\} with weights \beta_i are propagated through the nonlinearity f(\cdot) to give \psi_i = f(\chi_i); the weighted sample mean \hat{z} = \sum_i \beta_i \psi_i and the weighted sample covariance P_z = \sum_i \beta_i (\psi_i - \hat{z})(\psi_i - \hat{z})^T recover the transformed mean and covariance.)
Table of Content
147
SOLO Unscented Kalman Filter
UKF Summary
System Definition:
x_k = f(k-1, x_{k-1}, u_{k-1}) + w_{k-1},   E\{w_k\} = 0,   E\{w_k w_l^T\} = Q_k \delta_{k,l}
z_k = h(k, x_k) + v_k,   E\{v_k\} = 0,   E\{v_k v_l^T\} = R_k \delta_{k,l}
0  Initialization of the UKF. Augment the state with the process and measurement noises, x^a := [x^T  w^T  v^T]^T:
\hat{x}_{0|0} = E\{x_0\},   P_{0|0} = E\{(x_0 - \hat{x}_{0|0})(x_0 - \hat{x}_{0|0})^T\}
\hat{x}^a_0 = [\hat{x}_0^T  0  0]^T,   P^a_{0|0} = \mathrm{blockdiag}(P_{0|0}, Q, R)
For k = 1, 2, …
1  Calculate the Sigma Points (L is the augmented state dimension, \gamma = \sqrt{L + \lambda}):
\chi^0_{k-1|k-1} = \hat{x}_{k-1|k-1}
\chi^i_{k-1|k-1} = \hat{x}_{k-1|k-1} + \gamma (\sqrt{P_{k-1|k-1}})_i,   i = 1, \dots, L
\chi^{i+L}_{k-1|k-1} = \hat{x}_{k-1|k-1} - \gamma (\sqrt{P_{k-1|k-1}})_i,   i = 1, \dots, L
2  State Prediction and its Covariance:
\chi^i_{k|k-1} = f(k-1, \chi^i_{k-1|k-1}, u_{k-1}),   i = 0, 1, \dots, 2L
\hat{x}_{k|k-1} = \sum_{i=0}^{2L} W^{(m)}_i \chi^i_{k|k-1},   W^{(m)}_0 = \frac{\lambda}{L+\lambda},   W^{(m)}_i = \frac{1}{2(L+\lambda)},   i = 1, \dots, 2L
P_{k|k-1} = \sum_{i=0}^{2L} W^{(c)}_i (\chi^i_{k|k-1} - \hat{x}_{k|k-1})(\chi^i_{k|k-1} - \hat{x}_{k|k-1})^T,   W^{(c)}_0 = \frac{\lambda}{L+\lambda} + (1 - \alpha^2 + \beta),   W^{(c)}_i = \frac{1}{2(L+\lambda)},   i = 1, \dots, 2L
148
SOLO Unscented Kalman Filter
UKF Summary (continue – 1)
3  Measurement Prediction:
Z^i_{k|k-1} = h(k, \chi^i_{k|k-1}),   i = 0, 1, \dots, 2L
\hat{z}_{k|k-1} = \sum_{i=0}^{2L} W^{(m)}_i Z^i_{k|k-1}
4  Innovation and its Covariance:
i_k = z_k - \hat{z}_{k|k-1}
S_k = P^{zz}_{k|k-1} = \sum_{i=0}^{2L} W^{(c)}_i (Z^i_{k|k-1} - \hat{z}_{k|k-1})(Z^i_{k|k-1} - \hat{z}_{k|k-1})^T
5  Kalman Gain Computation:
P^{xz}_{k|k-1} = \sum_{i=0}^{2L} W^{(c)}_i (\chi^i_{k|k-1} - \hat{x}_{k|k-1})(Z^i_{k|k-1} - \hat{z}_{k|k-1})^T
K_k = P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1}
6  Update of the State and its Covariance:
\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k i_k
P_{k|k} = P_{k|k-1} - K_k S_k K_k^T
k := k + 1 and return to 1.
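A compact sketch of steps 1–6 for the simpler non-augmented, additive-noise case (an illustrative simplification of the summary above; alpha, beta, kappa are the usual scaling parameters, and f, h, Q, R are assumed user inputs):

```python
import numpy as np

def ukf_cycle(x_est, P_est, z, f, h, Q, R, alpha=1e-3, beta=2.0, kappa=0.0):
    """One UKF cycle with additive Gaussian noise (non-augmented form)."""
    L = x_est.size
    lam = alpha ** 2 * (L + kappa) - L
    Wm = np.full(2 * L + 1, 1.0 / (2 * (L + lam)))
    Wm[0] = lam / (L + lam)
    Wc = Wm.copy()
    Wc[0] += 1.0 - alpha ** 2 + beta
    # 1: sigma points around the previous estimate
    S = np.linalg.cholesky((L + lam) * P_est)
    X = np.vstack([x_est, x_est + S.T, x_est - S.T])
    # 2: propagate through f, predicted mean and covariance
    Xp = np.array([f(x) for x in X])
    x_pred = Wm @ Xp
    P_pred = (Wc[:, None] * (Xp - x_pred)).T @ (Xp - x_pred) + Q
    # 3-5: predicted measurement, innovation covariance, cross covariance, gain
    Zp = np.array([h(x) for x in Xp])
    z_pred = Wm @ Zp
    S_k = (Wc[:, None] * (Zp - z_pred)).T @ (Zp - z_pred) + R
    P_xz = (Wc[:, None] * (Xp - x_pred)).T @ (Zp - z_pred)
    K = P_xz @ np.linalg.inv(S_k)
    # 6: update
    x_upd = x_pred + K @ (z - z_pred)
    P_upd = P_pred - K @ S_k @ K.T
    return x_upd, P_upd
```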
149
SOLO Unscented Kalman Filter State Estimation (one cycle)
(Figure: tracking-system block diagram – Input Data, Sensor Data Processing and Measurement Formation, Observation-to-Track Association, Gating Computations, Track Maintenance (Initialization, Confirmation and Deletion), Filtering and Prediction. Portraits: Simon J. Julier, Jeffrey K. Uhlmann.)
Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
    151 Numerical Integration Usinga Monte Carlo ApproximationSOLO A Monte Carlo Approximation of the Expected Value Integrals uses Discrete Approximation to the Gaussian PDF ( )xx Pxx ,ˆ;N ( )xx Pxx ,ˆ;N can be approximated by: ( ) ( ) ( ) ( )∑∑ == −=−≈= ss N i i s N i iixx xx N xxwPxxx 11 1 ,ˆ; δδNp We can see that for any x we have ( ) ( )∫∑∫∑ ∞− ≤ ∞− = ≈=− x xx xx i i x N i ii dPxwdxw i s ττττδ ,ˆ; 1 N The weight wi is not the probability of the point xi . The probability density near xi is given by the density of the points in the region around xi , which can be obtained by a normalized histogram of all xi . Draw Ns samples from , where {xi , i = 1,2,…,Ns} are a set of support points (random samples of particles) with weights {wi = 1/Ns, i=1,2,…,Ns} ( )xx Pxx ,ˆ;N Monte Carlo Kalman Filter (MCKF)
    152 Numerical Integration Usinga Monte Carlo Approximation SOLO The Expected Value for any function g (x) can be estimated from: ( ){ } ( ) ( ) ( ) ( ) ( ) ( ) ( )∑∑∫ ∑∫ === ==−≈= sss N i i s N i ii N i ii xp xg N xgwxxwxgxdxpxgxgE 111 1 δ which is the sample mean. ( ) { } { } ( ) { } { }    ==+= ==+−= −−−−−−− lkk T lkkkkk lkk T lkkkkkk RvvEvEvxkhz QwwEwEwuxkfx , ,1111111 &0, &0,,1 δ δGiven the System Assuming that we computed the Mean and Covariance at stage k-1 let use the Monte Carlo Approximation to compute the predicted Mean and Covariance at stage k 1|11|1 ,ˆ −−−− kkkk Px 1|1| ,ˆ −− kkkk Px { } ( ) ( )∑= −−−− −== − s kk N i k i kk s Zxpkkk uxkf N xEx 1 11|1|1| ,,1 1 ˆ 1:1 ( ) ( ){ } ( ) { } ( ) T kkkkZxp T kkZxp T kkkkkk xx kk xxxxExxxxEP kkkk 1|1|||1|1|1| ˆˆˆˆ 1:11:1 −−−−− −=−−= −− Monte Carlo Kalman Filter (MCKF) (continue – 1) Draw Ns samples ( ) ( ) skkkkkkk i kk NiPxxZxpx ,,1,ˆ;|~ 1|11|111:111|1 == −−−−−−−−− N ~means Generate (Draw) samples from a predefined distribution
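A minimal illustration of the sample-mean estimate above (an added sketch; the mean, covariance and function g are arbitrary examples chosen for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(1)
x_hat = np.array([1.0, -0.5])
P = np.array([[0.5, 0.1], [0.1, 0.2]])
samples = rng.multivariate_normal(x_hat, P, 50000)   # x_i ~ N(x_hat, P), weights w_i = 1/N_s
g = lambda x: np.sin(x[..., 0]) * x[..., 1]           # any function g(x)
estimate = np.mean(g(samples))                        # E{g(x)} ~= (1/N_s) sum_i g(x_i)
```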
    153 Numerical Integration Usinga Monte Carlo Approximation SOLO ( )( ){ } ( ) { } ( ) ( )[ ] ( )[ ]{ } ( ) ( ) ( ){ } ( ) ( ) ( ) T N i k i kk s N i k i kk s Zxpk i kk T k i kk T kkkkZxp T kk i kkkk i kk T kkkkZxp T kkZxp T kkkkkk xx kk ss kk kk kkkk uxkf N uxkf N QuxfuxfE xxwuxkfwuxkfE xxxxExxxxEP       −      −−+= −+−+−= −=−−= ∑∑ = −−− = −−−−−−−−− −−−−−−−−−− −−−−− − − −− 1 11|1 1 11|1|11|111|1 1|1||111|1111|1 1|1|||1|1|1| ,,1 1 ,,1 1 ,, ˆˆ,,1,,1 ˆˆˆˆ 1:1 1:1 1:11:1 ( ) ( ) ( ) ( )      −      −−−−+= ∑∑∑ = −−− = −−− = −−−−−−− sss N i k i kk s N i k i kk s N i k i kk T k i kk s xx kk uxkf N uxkf N uxkfuxkf N QP 1 11|1 1 11|1 1 11|111|11| ,,1 1 ,,1 1 ,,1,,1 1 Using the Monte Carlo Approximation we obtain: { } ( ) ( )∑= −− == − s kk N i i kk s Zxpkkk xkh N zEz 1 1||1| , 1 ˆ 1:1 ( ) ( ) ( ) ( )            −+= ∑∑∑ = − = − = −−− sss N i i kk s N i i kk s N i i kk Ti kk s zz kk xkh N xkh N xkhxkh N RP 1 1| 1 1| 1 1|1|1| , 1 , 1 ,, 1 Monte Carlo Kalman Filter (MCKF) (continue – 2) ( ) ( ) skkkkkkk i kk NiPxxZxpx ,,1,ˆ;|~ 1|1|1:11| == −−−− N Now we approximate the predictive PDF, , as and we draw new Ns (not necessarily the same as before) samples. ( )1:1| −kk Zxp ( )1|1| ,ˆ; −− kkkkk PxxN
    154 Numerical Integration Usinga Monte Carlo Approximation SOLO In the same way we obtain: ( ) ( )            −= ∑∑∑ = − = − = −−− sss N i i kk s N i i kk s N i i kk Ti kk s zx kk xkh N x N xkhx N P 1 1| 1 1| 1 1|1|1| , 11 , 1 Monte Carlo Kalman Filter (MCKF) (continue – 3) The Kalman Filter Equations are: ( ) 1 1|1| − −−= zz kk zx kkk PPK ( )1|1|| ˆˆˆ −− −+= kkkkkkkk zzKxx T k zz kkk xx kk xx kk KPKPP 1|1|| −− −=
155
SOLO Monte Carlo Kalman Filter (MCKF)
MCKF Summary
System Definition:
x_k = f(k-1, x_{k-1}, u_{k-1}) + w_{k-1},   p(x_0) = \mathcal{N}(x_0; \hat{x}_0, P_{0|0}),   p(w) = \mathcal{N}(w; 0, Q)
z_k = h(k, x_k) + v_k,   p(v) = \mathcal{N}(v; 0, R)
0  Initialization of the MCKF. Augment the state space to include the process and measurement noises, x^a := [x^T  w^T  v^T]^T:
\hat{x}_{0|0} = E\{x_0\},   P_{0|0} = E\{(x_0 - \hat{x}_{0|0})(x_0 - \hat{x}_{0|0})^T\}
\hat{x}^a_0 = [\hat{x}_0^T  0  0]^T,   P^a_{0|0} = \mathrm{blockdiag}(P_{0|0}, Q, R)
For k = 1, 2, …
1  Assuming for k-1 a Gaussian distribution with mean \hat{x}^a_{k-1|k-1} and covariance P^a_{k-1|k-1}, generate (draw) N_s samples
x^{a,i}_{k-1|k-1} \sim \mathcal{N}(x^a_{k-1}; \hat{x}^a_{k-1|k-1}, P^a_{k-1|k-1}),   i = 1, \dots, N_s
2  State Prediction and its Covariance:
x^{a,i}_{k|k-1} = f(k-1, x^{a,i}_{k-1|k-1}, u_{k-1}),   i = 1, \dots, N_s
\hat{x}^a_{k|k-1} = \frac{1}{N_s} \sum_{i=1}^{N_s} x^{a,i}_{k|k-1}
P^a_{k|k-1} = \frac{1}{N_s} \sum_{i=1}^{N_s} x^{a,i}_{k|k-1} (x^{a,i}_{k|k-1})^T - \hat{x}^a_{k|k-1} (\hat{x}^a_{k|k-1})^T
3  Assuming a Gaussian distribution with mean \hat{x}^a_{k|k-1} and covariance P^a_{k|k-1}, generate (draw) new N_s samples
x^{a,j}_{k|k-1} \sim \mathcal{N}(x^a_k; \hat{x}^a_{k|k-1}, P^a_{k|k-1}),   j = 1, \dots, N_s
156
SOLO Monte Carlo Kalman Filter (MCKF)
MCKF Summary (continue – 1)
4  Measurement Prediction:
z^j_{k|k-1} = h(k, x^{a,j}_{k|k-1}),   j = 1, \dots, N_s
\hat{z}_{k|k-1} = \frac{1}{N_s} \sum_{j=1}^{N_s} z^j_{k|k-1}
5  Predicted Covariances Computation:
S_k = P^{zz}_{k|k-1} = \frac{1}{N_s} \sum_{j=1}^{N_s} (z^j_{k|k-1} - \hat{z}_{k|k-1})(z^j_{k|k-1} - \hat{z}_{k|k-1})^T
P^{xz}_{k|k-1} = \frac{1}{N_s} \sum_{j=1}^{N_s} (x^{a,j}_{k|k-1} - \hat{x}^a_{k|k-1})(z^j_{k|k-1} - \hat{z}_{k|k-1})^T
6  Kalman Gain Computation:  K^a_k = P^{xz}_{k|k-1} (P^{zz}_{k|k-1})^{-1}
7  Measurement & Innovation:  i_k = z_k - \hat{z}_{k|k-1}
8  Kalman Filter update:
\hat{x}^a_{k|k} = \hat{x}^a_{k|k-1} + K^a_k i_k
P^a_{k|k} = P^a_{k|k-1} - K^a_k S_k (K^a_k)^T
k := k + 1 and return to 1.
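A minimal sketch of the sampling-based moment computations above, written for the additive-noise case instead of the augmented state (illustrative only; the process noise is sampled explicitly and the measurement noise enters through R, and f, h, Q, R are assumed inputs):

```python
import numpy as np

def mckf_cycle(x_est, P_est, z, f, h, Q, R, Ns=2000, seed=0):
    """One Monte Carlo Kalman Filter cycle: moments estimated from Gaussian samples."""
    rng = np.random.default_rng(seed)
    # 1-2: sample the k-1 posterior, propagate with process noise, estimate predicted moments
    Xprev = rng.multivariate_normal(x_est, P_est, Ns)
    W = rng.multivariate_normal(np.zeros(len(Q)), Q, Ns)
    Xpred = np.array([f(x) for x in Xprev]) + W
    x_pred = Xpred.mean(axis=0)
    P_pred = (Xpred - x_pred).T @ (Xpred - x_pred) / Ns
    # 3-5: re-draw from the predicted Gaussian, predict the measurement, form the covariances
    Xs = rng.multivariate_normal(x_pred, P_pred, Ns)
    Zs = np.array([h(x) for x in Xs])            # h must return a measurement vector
    z_pred = Zs.mean(axis=0)
    S = (Zs - z_pred).T @ (Zs - z_pred) / Ns + R  # measurement noise enters through R
    P_xz = (Xs - x_pred).T @ (Zs - z_pred) / Ns
    # 6-8: Kalman gain, innovation, update
    K = P_xz @ np.linalg.inv(S)
    x_upd = x_pred + K @ (z - z_pred)
    P_upd = P_pred - K @ S @ K.T
    return x_upd, P_upd
```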
157
SOLO Monte Carlo Kalman Filter (MCKF)
(Figure: tracking-system block diagram – Input Data, Sensor Data Processing and Measurement Formation, Observation-to-Track Association, Gating Computations, Track Maintenance (Initialization, Confirmation and Deletion), Filtering and Prediction.)
Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
Table of Content
158
SOLO Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter
So far we assumed that p(x_k | Z_{1:k}) is a Gaussian PDF. If the true PDF is not Gaussian (multimodal, heavily skewed, or non-standard – not represented by any standard PDF), a Gaussian distribution can never describe it well.
x_k = f(x_{k-1}, w_{k-1}),   z_k = h(x_k, v_k)
w_{k-1} and v_k are system and measurement white-noise sequences, independent of past and current states and of each other, with known PDFs p(w_{k-1}) and p(v_k).
We want to compute p(x_k | Z_{1:k}) recursively, assuming knowledge of p(x_{k-1} | Z_{1:k-1}), in two stages: prediction (before) and update (after the measurement).
Prediction (before the measurement) – use the Chapman–Kolmogorov equation:
p(x_k | Z_{1:k-1}) = \int p(x_k | x_{k-1}) \, p(x_{k-1} | Z_{1:k-1}) \, dx_{k-1}
where
p(x_k | x_{k-1}) = \int p(x_k | x_{k-1}, w_{k-1}) \, p(w_{k-1} | x_{k-1}) \, dw_{k-1}
By assumption p(w_{k-1} | x_{k-1}) = p(w_{k-1}). Since, knowing x_{k-1} and w_{k-1}, x_k is deterministically given by the system equation,
p(x_k | x_{k-1}, w_{k-1}) = \delta(x_k - f(x_{k-1}, w_{k-1}))
Therefore:
p(x_k | x_{k-1}) = \int \delta(x_k - f(x_{k-1}, w_{k-1})) \, p(w_{k-1}) \, dw_{k-1}
    159 Nonlinear Estimation UsingParticle Filters SOLO Non-Additive Non-Gaussian Nonlinear Filter ( ) ( )kkk kkk vxhz wxfx , , 11 = = −− kk vw &1− are system and measurement white-noise sequences independent of past and current states and on each other and having known P.D.F.s ( ) ( )kk vpwp &1− We want to compute p (xk|Z1:k) recursively, assuming knowledge of p(xk-1|Z1:k-1) in two stages, prediction (before) and update (after measurement) Prediction (before measurement) ( ) ( ) ( )∫ −−−−− = 11:1111:1 ||| kkkkkkk xdZxpxxpZxp where: ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )∫ − − − − = − === kkkkk kkkk kk kkkk Bayes bp apabp bap kkkkk xdZxpxzp Zxpxzp Zzp Zxpxzp ZzxpZxp 1:1 1:1 1:1 1:1 | | 1:1:1 || || | || ,|| ( ) ( ) ( )∫= kkkkkkkk vdxvpvxzpxzp |,|| By assumption ( ) ( )kkk vpxvp =| Since by knowing , is deterministically given by system equationkk vx & kz ( ) ( )( ) ( ) ( )   ≠ = =−= kkk kkk kkkkkk vxhz vxhz vxhzvxzp ,0 ,1 ,,| δ Therefore: ( ) ( )( ) ( )∫ −= kkkkkkk vdvpvxhzxzp ,| δ ( ) ( )( ) ( )∫ −−−−− −= 11111 ,| kkkkkkk wdwpwxfxxxp δ 1 Update (after measurement)2
    160 Nonlinear Estimation UsingParticle Filters SOLO Non-Additive Non-Gaussian Nonlinear Filter ( ) ( )kkk kkk vxhz wxfx , , 11 = = −− kk vw &1− are system and measurement white-noise sequences independent of past and current states and on each other and having known P.D.F.s ( ) ( )kk vpwp &1− We want to compute p (xk|Z1:k) recursively, assuming knowledge of p(xk-1|Z1:k-1) in two stages, prediction (before) and update (after measurement) Prediction (before measurement) ( ) ( ) ( )∫ −−−−− = 11:1111:1 ||| kkkkkkk xdZxpxxpZxp ( ) ( )( ) ( )∫ −−−−− −= 11111 ,| kkkkkkk wdwpwxfxxxp δ Update (after measurement) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )∫ − − − − = − === kkkkk kkkk kk kkkk Bayes bp apabp bap kkkkk xdZxpxzp Zxpxzp Zzp Zxpxzp ZzxpZxp 1:1 1:1 1:1 1:1 | | 1:1:1 || || | || ,|| We need to evaluate the following integrals: ( ) ( )( ) ( )∫ −= kkkkkkk vdvpvxhzxzp ,| δ We use the numeric Monte Carlo Method to evaluate the integrals: Generate (Draw): ( ) ( ) Sk i kk i k Nivpvwpw ,,1~&~ 11 =−− ( ) ( )( ) S N i i k i k i kkk Nwxfxxxp S ∑= −−− −≈ 1 111 /,| δ ( ) ( )( ) S N i i k i k i kkk Nvxhzxzp S ∑= −≈ 1 /,| δ or ( ) ( ) ( ) S N i i kkkk i k i k i k Nxxxxpwxfx S ∑= −−− −≈→= 1 111 /|, δ ( ) ( ) ( ) S N i i kkkk i k i k i k Nzzxzpvxhz S ∑= −≈→= 1 /|, δ Analytic solutions for those integral equations do not exist in the general case. 1 2
161
SOLO Nonlinear Estimation Using Particle Filters
Non-Additive Non-Gaussian Nonlinear Filter
Monte Carlo Computations of p(x_k | x_{k-1}) and p(z_k | x_k)
x_k = f(k-1, x_{k-1}, u_{k-1}, w_{k-1}),   w_{k-1} with given p_w(w_{k-1}),   x_0 with given p_{x_0}(x_0)
z_k = h(k, x_k, v_k),   v_k with given p_v(v_k)
0  Initialization: generate (draw) x_0^i \sim p_{x_0}(x_0),   i = 1, \dots, N_S
For k = 1, 2, …
1  At stage k-1 generate (draw) N_S samples w_{k-1}^i \sim p_w(w_{k-1}),   i = 1, \dots, N_S
2  State Update: x_k^i = f(k-1, x_{k-1}^i, u_{k-1}, w_{k-1}^i),   i = 1, \dots, N_S, giving
p(x_k | x_{k-1}) \approx \frac{1}{N_S} \sum_{i=1}^{N_S} \delta(x_k - x_k^i)
3  Generate (draw) measurement noise v_k^i \sim p_v(v_k),   i = 1, \dots, N_S
4  Measurement Update: z_k^i = h(x_k^i, v_k^i),   i = 1, \dots, N_S, giving
p(z_k | x_k) \approx \frac{1}{N_S} \sum_{i=1}^{N_S} \delta(z_k - z_k^i)
k := k + 1 and return to 1.
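An illustrative sketch of steps 1–4: each particle is pushed through the system and measurement equations with its own noise draw. The names f, h, draw_w and draw_v are assumed user-supplied functions, not definitions from the slides.

```python
import numpy as np

def propagate_particles(particles, u, f, h, draw_w, draw_v, rng=np.random.default_rng()):
    """Push every particle through the system and measurement equations with fresh noise draws."""
    Ns = len(particles)
    w = draw_w(rng, Ns)                      # samples of the process noise p(w_{k-1})
    x_next = np.array([f(x, u, wi) for x, wi in zip(particles, w)])
    v = draw_v(rng, Ns)                      # samples of the measurement noise p(v_k)
    z_pred = np.array([h(x, vi) for x, vi in zip(x_next, v)])
    return x_next, z_pred                    # empirical approximations of p(x_k|x_{k-1}) and p(z_k|x_k)
```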
    162 Nonlinear Estimation UsingParticle Filters SOLO Non-Additive Non-Gaussian Nonlinear Filter ( ) ( )kkk kkk vxhz wxfx , , 11 = = −− kk vw &1− are system and measurement white-noise sequences independent of past and current states and on each other and having known P.D.F.s ( ) ( )kk vpwp &1− We want to compute p (xk|Z1:k) recursively, assuming knowledge of p(xk-1|Z1:k-1) in two stages, prediction (before) and update (after measurement) Prediction (before measurement) ( ) ( ) ( )∫ −−−−− = 11:1111:1 ||| kkkkkkk xdZxpxxpZxp Update (after measurement) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )∫ − − − − = − === kkkkk kkkk kk kkkk Bayes bp apabp bap kkkkk xdZxpxzp Zxpxzp Zzp Zxpxzp ZzxpZxp 1:1 1:1 1:1 1:1 | | 1:1:1 || || | || ,|| We use the numeric Monte Carlo Method to evaluate the integrals: Generate (Draw): ( ) ( ) Sk i kk i k Nivpvwpw ,,1~&~ 11 =−− ( ) ( ) ( ) S N i i kkkk i k i k i k Nxxxxpwxfx S ∑= −−− −≈→= 1 111 /|, δ ( ) ( ) ( ) S N i i kkkk i k i k i k Nzzxzpvxhz S ∑= −≈→= 1 /|, δ ( ) ( ) ( ) ( ) ( ) ( )∑∑ ∫∫∑ == −−−− = −−− −=−=−= SSS N i i kk S N i kkk i kk S k N i kk i kk S kk xx N xdZxpxx N xdZxpxx N Zxp 11 1 11:111 1 1:111:1 1 | 1 | 1 | δδδ    Since we use NS points to describe the probabilities we call those points, Particles. 1 2 Table of Content
163 Nonlinear Estimation Using Particle Filters SOLO Non-Additive Non-Gaussian Nonlinear Filter
x_k = f(x_{k-1}, w_{k-1}),  z_k = h(x_k, v_k)
Importance Sampling (IS)
Until now we assumed that p(x_k|Z_{1:k}) is a Gaussian PDF. If the true PDF is not Gaussian (multimodal, heavily skewed, or non-standard, i.e. not represented by any standard PDF), a Gaussian distribution can never describe it well. In such cases approximate Grid-Based Filters and Particle Filters yield an improvement, at the cost of a heavy computational demand.
To overcome this difficulty we use the Principle of Importance Sampling.
Suppose that p(x_k|Z_{1:k}) is a PDF from which it is difficult to draw samples. Also suppose that q(x_k|Z_{1:k}) is another PDF from which samples can easily be drawn (referred to as the Importance Density), for example a Gaussian PDF. Now assume that at each sample we can compute the scale factor w(x_k) between the two densities:
w(x_k) := \frac{p(x_k|Z_{1:k})}{q(x_k|Z_{1:k})} > 0
Using this we can write:
E_{p(x_k|Z_{1:k})}\{g(x_k)\} = \int g(x_k) \, p(x_k|Z_{1:k}) \, dx_k = \frac{\int g(x_k) \frac{p(x_k|Z_{1:k})}{q(x_k|Z_{1:k})} q(x_k|Z_{1:k}) \, dx_k}{\int \frac{p(x_k|Z_{1:k})}{q(x_k|Z_{1:k})} q(x_k|Z_{1:k}) \, dx_k} = \frac{\int g(x_k) \, w(x_k) \, q(x_k|Z_{1:k}) \, dx_k}{\int w(x_k) \, q(x_k|Z_{1:k}) \, dx_k}
164 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Importance Sampling (IS)
E_{p(x_k|Z_{1:k})}\{g(x_k)\} = \frac{\int g(x_k) \, w(x_k) \, q(x_k|Z_{1:k}) \, dx_k}{\int w(x_k) \, q(x_k|Z_{1:k}) \, dx_k}
Generate (draw) N_s particle samples {x_k^i, i = 1, ..., N_s} from q(x_k|Z_{1:k}):
x_k^i \sim q(x_k|Z_{1:k}), i = 1, ..., N_s
and estimate g(x_k) using a Monte Carlo approximation:
E_{p(x_k|Z_{1:k})}\{g(x_k)\} \approx \frac{\frac{1}{N_s}\sum_{i=1}^{N_s} g(x_k^i) \, w(x_k^i)}{\frac{1}{N_s}\sum_{i=1}^{N_s} w(x_k^i)} = \sum_{i=1}^{N_s} g(x_k^i) \, \tilde{w}_k^i
where the normalized weights are \tilde{w}_k^i := \frac{w(x_k^i)}{\sum_{j=1}^{N_s} w(x_k^j)}
Table of Content
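A minimal numerical sketch of this self-normalized importance-sampling estimate, assuming for illustration a standard normal target density p, a wider Gaussian importance density q, and g(x) = x^2 (all of these choices are illustrative, not from the slides):

import numpy as np

rng = np.random.default_rng(1)
N_s = 10_000

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

p = lambda x: gauss_pdf(x, 0.0, 1.0)      # target density (plays the role of p(x_k | Z_1:k))
q_mu, q_sigma = 0.5, 2.0
q = lambda x: gauss_pdf(x, q_mu, q_sigma) # importance density, easy to sample from
g = lambda x: x ** 2                      # quantity whose expectation we want

x = rng.normal(q_mu, q_sigma, N_s)        # x^i ~ q
w = p(x) / q(x)                           # unnormalized weights w(x^i)
w_tilde = w / w.sum()                     # normalized weights
estimate = np.sum(g(x) * w_tilde)         # ≈ E_p[g(x)] = 1 for this toy example
print(estimate)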
165 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Sampling (SIS)
It would be useful if the importance density could be generated recursively (sequentially).
Using Bayes' rule, with c := 1 / p(z_k|Z_{1:k-1}):
w(x_k) = \frac{p(x_k|Z_{1:k})}{q(x_k|Z_{1:k})} = \frac{p(x_k, z_k|Z_{1:k-1})}{q(x_k|Z_{1:k}) \, p(z_k|Z_{1:k-1})} = \frac{p(z_k|x_k) \, p(x_k|Z_{1:k-1})}{q(x_k|Z_{1:k}) \, p(z_k|Z_{1:k-1})} = \frac{c \, p(z_k|x_k) \, p(x_k|Z_{1:k-1})}{q(x_k|Z_{1:k})}
Using p(x_k, x_{k-1}|Z_{1:k-1}) = p(x_k|x_{k-1}, Z_{1:k-1}) \, p(x_{k-1}|Z_{1:k-1}) (Bayes) we obtain:
p(x_k|Z_{1:k-1}) = \int p(x_k, x_{k-1}|Z_{1:k-1}) \, dx_{k-1} = \int p(x_k|x_{k-1}, Z_{1:k-1}) \, p(x_{k-1}|Z_{1:k-1}) \, dx_{k-1}
In the same way:
q(x_k|Z_{1:k-1}) = \int q(x_k, x_{k-1}|Z_{1:k-1}) \, dx_{k-1} = \int q(x_k|x_{k-1}, Z_{1:k-1}) \, q(x_{k-1}|Z_{1:k-1}) \, dx_{k-1}
Therefore:
w(x_k) = \frac{c \, p(z_k|x_k) \int p(x_k|x_{k-1}, Z_{1:k-1}) \, p(x_{k-1}|Z_{1:k-1}) \, dx_{k-1}}{\int q(x_k|x_{k-1}, Z_{1:k-1}) \, q(x_{k-1}|Z_{1:k-1}) \, dx_{k-1}}
166 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Sampling (SIS) (continue – 1)
It would be useful if the importance density could be generated recursively. We obtained:
w(x_k) = \frac{c \, p(z_k|x_k) \int p(x_k|x_{k-1}, Z_{1:k-1}) \, p(x_{k-1}|Z_{1:k-1}) \, dx_{k-1}}{\int q(x_k|x_{k-1}, Z_{1:k-1}) \, q(x_{k-1}|Z_{1:k-1}) \, dx_{k-1}}
Suppose that at k-1 we have N_s particle samples and their probabilities {x_{k-1|k-1}^i, w_{k-1}^i, i = 1, ..., N_s}, which constitute a random measure that characterizes the posterior PDF for times up to t_{k-1}. Then
p(x_{k-1}|Z_{1:k-1}) \approx \sum_{i=1}^{N_s} p(x_{k-1|k-1}^i|Z_{1:k-1}) \, \delta(x_{k-1} - x_{k-1|k-1}^i)
q(x_{k-1}|Z_{1:k-1}) \approx \sum_{i=1}^{N_s} q(x_{k-1|k-1}^i|Z_{1:k-1}) \, \delta(x_{k-1} - x_{k-1|k-1}^i)
Substituting into w(x_k) (the δ-functions collapse the integrals onto the particles):
w(x_k) = \frac{c \, p(z_k|x_k) \sum_{i=1}^{N_s} p(x_k|x_{k-1|k-1}^i, Z_{1:k-1}) \, p(x_{k-1|k-1}^i|Z_{1:k-1})}{\sum_{i=1}^{N_s} q(x_k|x_{k-1|k-1}^i, Z_{1:k-1}) \, q(x_{k-1|k-1}^i|Z_{1:k-1})}
167 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Sampling (SIS) (continue – 2)
Start from (Bayes):
w(x_k) = \frac{p(x_k|Z_{1:k})}{q(x_k|Z_{1:k})} = \frac{c \, p(z_k|x_k) \, p(x_k|Z_{1:k-1})}{q(x_k|Z_{1:k})}
Using the particle approximations of the previous slide and the substitutions
q(x_k|x_{k-1|k-1}^i, Z_{1:k-1}) = q(x_k|x_{k-1|k-1}^i, z_k),   p(x_k|x_{k-1|k-1}^i, Z_{1:k-1}) = p(x_k|x_{k-1|k-1}^i)
together with the previous-stage weight
w_{k-1}(x_{k-1}) = \frac{p(x_{k-1}|Z_{1:k-1})}{q(x_{k-1}|Z_{1:k-1})}
we obtain, evaluated at the particles x_{k|k}^i:
w_k^i = c \, w_{k-1}^i \, \frac{p(z_k|x_{k|k}^i) \, p(x_{k|k}^i|x_{k-1|k-1}^i)}{q(x_{k|k}^i|x_{k-1|k-1}^i, z_k)}
where we define
w_k^i := w_k(x_{k|k}^i) = \frac{p(x_{k|k}^i|Z_{1:k})}{q(x_{k|k}^i|Z_{1:k})},   w_{k-1}^i := w_{k-1}(x_{k-1|k-1}^i) = \frac{p(x_{k-1|k-1}^i|Z_{1:k-1})}{q(x_{k-1|k-1}^i|Z_{1:k-1})}
168 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Sampling (SIS) (continue – 3)
0 Initialization: Generate (Draw) x_0^i \sim p(x_0), i = 1, ..., N_s
For k \in {1, 2, ...} run:
1 At stage k-1, Generate (Draw) N_s samples w_{k-1}^i \sim p(w_{k-1})
2 State Update: x_k^i = f(x_{k-1}^i, u_{k-1}, w_{k-1}^i), i = 1, ..., N_s
3 Start with the approximation p(x_k|x_{k-1}) \approx \sum_{i=1}^{N_s} \delta(x_k - x_k^i)/N_s; Generate (Draw) N_s samples v_k^i \sim p(v_k), compute z_k^i = h(x_k^i, v_k^i) and approximate p(z_k|x_k^i) \approx \sum_{i=1}^{N_s} \delta(z_k - z_k^i)/N_s
4 After measurement z_k we compute the weights
\tilde{w}_k^i = \tilde{w}_{k-1}^i \, \frac{p(z_k|x_k^i) \, p(x_k^i|x_{k-1}^i)}{q(x_k^i|x_{k-1}^i, Z_{1:k})},   w_k^i = \tilde{w}_k^i / \sum_{j=1}^{N_s} \tilde{w}_k^j
so that p(x_k|Z_{1:k}) \approx \sum_{i=1}^{N_s} w_k^i \, \delta(x_k - x_k^i)
k := k+1 & return to 1
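The steps above can be condensed into a short bootstrap-style SIS sketch in Python/NumPy. Here the importance density is taken as the transition prior, q(x_k|x_{k-1}, z_k) = p(x_k|x_{k-1}), so the weight update reduces to w_k ∝ w_{k-1} p(z_k|x_k); the system functions and noise levels are illustrative assumptions, not the presentation's example.

import numpy as np

rng = np.random.default_rng(2)
N_s = 500
sigma_w, sigma_v = 1.0, 0.5          # assumed process / measurement noise std

def f_sys(x, w):                     # hypothetical dynamics x_k = f(x_{k-1}, w_{k-1})
    return 0.9 * x + w

def h_meas(x):                       # hypothetical measurement function (noise added separately)
    return x

def likelihood(z, x):                # p(z_k | x_k^i) for additive Gaussian measurement noise
    return np.exp(-0.5 * ((z - h_meas(x)) / sigma_v) ** 2)

# 0: initialization
particles = rng.normal(0.0, 1.0, N_s)
weights = np.full(N_s, 1.0 / N_s)

def sis_step(particles, weights, z):
    # 1-2: draw process noise and propagate (sampling from q = p(x_k | x_{k-1}))
    particles = f_sys(particles, rng.normal(0.0, sigma_w, N_s))
    # 4: weight update w_k ∝ w_{k-1} p(z_k | x_k), then normalize
    weights = weights * likelihood(z, particles)
    weights = weights / weights.sum()
    return particles, weights

# one cycle with a made-up measurement
particles, weights = sis_step(particles, weights, z=1.2)
x_hat = np.sum(weights * particles)   # weighted-mean estimate of x_k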
169 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Sampling (SIS) (continue – 4)
The resulting Sequential Importance Sampling (SIS) algorithm is a Monte Carlo method that forms the basis for most sequential Monte Carlo filters. This sequential Monte Carlo method is known variously as:
• Bootstrap Filtering
• Condensation Algorithm
• Particle Filtering
• Interacting Particle Approximation
• Survival of the Fittest
170 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Sampling (SIS) (continue – 5)
Degeneracy Problem
A common problem with the SIS particle filter is the degeneracy phenomenon: after a few iterations, all but one particle have negligible weights. It can be shown that the variance of the importance weights w_k^i of the SIS algorithm can only increase over time, and this leads to the degeneracy problem.
A suitable measure of degeneracy is given by:
\hat{N}_{eff} = \frac{1}{\sum_{i=1}^{N} (w_k^i)^2},   where \sum_{i=1}^{N} w_k^i = 1
To see this, consider the following two cases:
1  w_k^i = 1/N, i = 1, ..., N  =>  \hat{N}_{eff} = 1 / \sum_{i=1}^{N} (1/N)^2 = N
2  w_k^i = 1 for i = j and w_k^i = 0 for i \ne j  =>  \hat{N}_{eff} = 1
Hence a small N_eff indicates severe degeneracy, and vice versa.
Table of Content
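A short sketch of this effective-sample-size test; the threshold value below is an illustrative choice, not prescribed by the slide.

import numpy as np

def effective_sample_size(weights):
    # N_eff = 1 / sum_i (w_i^2) for normalized weights
    return 1.0 / np.sum(np.asarray(weights) ** 2)

weights = np.array([0.7, 0.1, 0.1, 0.05, 0.05])   # already normalized
N_thr = 0.5 * len(weights)                         # example threshold: half the particle count
if effective_sample_size(weights) < N_thr:
    print("severe degeneracy - resample")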
171 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Resampling (SIR)
The Bootstrap (Resampling)
• Popularized by Brad Efron (1979)
• The Bootstrap is a name generically applied to statistical resampling schemes that allow uncertainty in the data to be assessed from the data themselves – in other words, "pulling yourself up by your bootstraps".
The disadvantage of bootstrapping is that, while it is asymptotically consistent (under some conditions), it does not provide general finite-sample guarantees and has a tendency to be overly optimistic. Its apparent simplicity may conceal the fact that important assumptions are being made when undertaking the bootstrap analysis (e.g. independence of samples), where these would be stated more formally in other approaches.
The advantage of bootstrapping over analytical methods is its great simplicity: it is straightforward to apply the bootstrap to derive estimates of standard errors and confidence intervals for complex estimators of complex parameters of the distribution, such as percentile points, proportions, odds ratios, and correlation coefficients.
Bradley Efron 1938 Stanford U.
172 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Resampling (SIR) (continue – 1)
Resampling
Whenever a significant degeneracy is observed during sampling (i.e., when N_eff falls below some threshold N_thr), having obtained
p(x_k|Z_{1:k}) \approx \sum_{i=1}^{N} w_k^i \, \delta(x_k - x_k^i)
we need to resample, replacing the weighted representation {x_k^i, w_k^i, i = 1, ..., N} with a random measure of equally weighted particles {x_k^{i*}, 1/N, i = 1, ..., N}.
This is done by first computing the Cumulative Distribution Function (C.D.F.) of the sampled distribution w_k^i:
Initialize the C.D.F.: c_1 = w_k^1
For i = 2:N compute the C.D.F.: c_i = c_{i-1} + w_k^i
173 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Resampling (SIR) (continue – 2)
Resampling (continue – 1)
Using the Inverse Transform method we generate N independent and identically distributed (i.i.d.) variables from the uniform distribution, sort them in ascending order, and compare them with the Cumulative Distribution Function (C.D.F.) of the normalized weights.
174 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Resampling (SIR) (continue – 3)
Resampling Algorithm (continue – 2)
0 Initialize the C.D.F.: c_1 = w_k^1; for i = 2:N compute the C.D.F.: c_i = c_{i-1} + w_k^i
1 Start at the bottom of the C.D.F.: i = 1. Draw a starting point from the uniform distribution: u_1 \sim U[0, N^{-1}]
2 For j = 1:N move along the C.D.F.: u_j = u_1 + (j - 1) N^{-1}
3 WHILE u_j > c_i: i := i + 1; END WHILE
4 Assign sample: x_k^{j*} = x_k^i; assign weight: w_k^j = N^{-1}; assign parent: i^j = i
5 END For
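A compact sketch of this systematic resampling step in Python/NumPy; np.searchsorted plays the role of the explicit WHILE loop over the C.D.F., with the same result as the pseudo-code above.

import numpy as np

def systematic_resample(particles, weights, rng=np.random.default_rng()):
    # Return N equally weighted particles drawn according to the C.D.F. of `weights`.
    N = len(weights)
    cdf = np.cumsum(weights)                 # c_i = c_{i-1} + w_k^i
    cdf[-1] = 1.0                            # guard against round-off
    u = rng.uniform(0.0, 1.0 / N) + np.arange(N) / N   # u_j = u_1 + (j-1)/N, u_1 ~ U[0, 1/N]
    parents = np.searchsorted(cdf, u)        # smallest i with u_j <= c_i
    new_particles = particles[parents]
    new_weights = np.full(N, 1.0 / N)
    return new_particles, new_weights, parents

# toy usage
particles = np.array([0.1, 0.5, 0.9, 1.3])
weights = np.array([0.1, 0.2, 0.6, 0.1])
particles, weights, parents = systematic_resample(particles, weights)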
175 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Resampling (SIR) (continue – 4)
Resampling
0 Start with the approximation p(x_k|Z_{1:k-1}) \approx {x_k^i, N^{-1}} = \sum_{i=1}^{N} \delta(x_k - x_k^i)/N
1 After measurement z_k we compute p(x_k|Z_{1:k}) \approx {x_k^i, \tilde{w}_k^i}, with
\tilde{w}_k^i = \tilde{w}_{k-1}^i \, \frac{p(z_k|x_k^i) \, p(x_k^i|x_{k-1}^i)}{q(x_k^i|x_{k-1}^i, Z_{1:k})}, normalized so that \sum_i \tilde{w}_k^i = 1, and p(x_k|Z_{1:k}) \approx \sum_{i=1}^{N} \tilde{w}_k^i \, \delta(x_k - x_k^i)
2 If N_{eff} = 1/\sum_{i=1}^{N} (w_k^i)^2 < N_{thr}, resample to obtain p(x_k|Z_{1:k}) \approx {x_k^{i*}, N^{-1}}
3 Prediction: x_{k+1}^i = f(x_k^{i*}, u_k, n_k^i) to obtain p(x_{k+1}|Z_{1:k}) \approx {x_{k+1}^i, N^{-1}}
k := k+1 & return to 1
176 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Resampling (SIR) (continue – 5)
Resampling
Although the resampling step reduces the effect of degeneracy, it introduces other practical problems:
1 It limits the possibility of parallel implementation.
2 The particles that have high w_k^i are statistically selected many times. This leads to a loss of diversity among the particles (sample impoverishment).
Several other techniques for generating samples from an unknown P.D.F., besides Importance Sampling, have been presented in the literature. If the P.D.F. is stationary, Markov Chain Monte Carlo (MCMC) methods have been proposed:
• Metropolis – Hastings (MH)
• Gibbs sampler (a special case of MH)
(see the Probability Presentation)
177 SOLO Nonlinear Estimation Using Particle Filters Non-Additive Non-Gaussian Nonlinear Filter
Sequential Importance Resampling (SIR) (continue – 6)
Selection of Importance Density
The choice of the Importance Density q(x_k|x_{k-1}, z_k) is one of the most critical issues in the design of the Particle Filter.
The Optimal Choice
The Optimal Importance Density q(x_k|x_{k-1}, z_k), which minimizes the variance of the importance weights conditioned upon x_{k-1}^i and z_k, has been shown to be (Bayes: Pr(a|b,c) = Pr(b|a,c) Pr(a|c) / Pr(b|c)):
q(x_k|x_{k-1}^i, z_k)_{opt} = p(x_k|x_{k-1}^i, z_k) = \frac{p(z_k|x_k, x_{k-1}^i) \, p(x_k|x_{k-1}^i)}{p(z_k|x_{k-1}^i)}
Substituting this into w_k^i = w_{k-1}^i \, \frac{p(z_k|x_k^i) \, p(x_k^i|x_{k-1}^i)}{q(x_k^i|x_{k-1}^i, z_k)} we obtain:
w_k^i = w_{k-1}^i \, p(z_k|x_{k-1}^i)
From this equation we can see that with the optimal importance density the weights at time k can be computed (and, if necessary, resampling can be performed) before the particles are propagated to time k.
In order to use the optimal importance function we must:
1 sample from p(x_k|x_{k-1}, z_k)
2 evaluate: p(z_k|x_{k-1}^i) = \int p(z_k|x_k) \, p(x_k|x_{k-1}^i) \, dx_k
In the general case either of these two tasks can be difficult.
178 Sequential Importance Resampling Particle Filter (SIRPF) SOLO
SIRPF Summary
System Definition: x_k = f(k-1, x_{k-1}, u_{k-1}, w_{k-1}), x_0 \sim p_x(x_0), w_{k-1} \sim p_w(w_{k-1});  z_k = h(k, x_k, v_k), v_k \sim p_v(v_k)
0 Initialization of SIRPF: \hat{x}_0 \sim p_x(x_0)
For k \in {1, 2, ...}:
1 Assuming for k-1 a Gaussian distribution with mean \hat{x}_{k-1|k-1} and covariance P_{k-1|k-1}, generate N_s samples x_{k-1|k-1}^i \sim N(x_{k-1}; \hat{x}_{k-1|k-1}, P_{k-1|k-1}), i = 1, ..., N_s
2 State Prediction and its Covariance: x_{k|k-1}^i = f(k-1, x_{k-1|k-1}^i, u_{k-1}), i = 1, ..., N_s
3 Assuming a Gaussian distribution with mean \hat{x}_{k|k-1} and covariance P_{k|k-1}, generate (draw) new N_s samples x_{k|k-1}^j \sim N(x_k; \hat{x}_{k|k-1}, P_{k|k-1}), j = 1, ..., N_s
Table of Content
179 Monte Carlo Particle Filter (MCPF) SOLO
MCPF Summary
System Definition: x_k = f(k-1, x_{k-1}, u_{k-1}) + w_{k-1}, E{w_k} = 0, E{w_k w_l^T} = Q_k \delta_{kl};  z_k = h(k, x_k) + v_k, E{v_k} = 0, E{v_k v_l^T} = R_k \delta_{kl}
0 Initialization of MCPF: \hat{x}_0 = E{x_0},  P_{0|0} = E{(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T}
Augmented state x^a := [x^T w^T v^T]^T:  \hat{x}_0^a = E{x_0^a} = [\hat{x}_0^T 0 0]^T,  P_{0|0}^a = E{(x_0^a - \hat{x}_0^a)(x_0^a - \hat{x}_0^a)^T} = diag(P_{0|0}, Q, R)
For k \in {1, 2, ...}:
1 Assuming for k-1 a Gaussian distribution with mean \hat{x}_{k-1|k-1} and covariance P_{k-1|k-1}, generate N_s samples x_{k-1|k-1}^i \sim N(x_{k-1}; \hat{x}_{k-1|k-1}, P_{k-1|k-1}), i = 1, ..., N_s
2 State Prediction and its Covariance:
x_{k|k-1}^i = f(k-1, x_{k-1|k-1}^i, u_{k-1}),  \hat{x}_{k|k-1} = \frac{1}{N_s}\sum_{i=1}^{N_s} x_{k|k-1}^i,  P_{k|k-1} = \frac{1}{N_s}\sum_{i=1}^{N_s} x_{k|k-1}^i (x_{k|k-1}^i)^T - \hat{x}_{k|k-1} \hat{x}_{k|k-1}^T
3 Assuming a Gaussian distribution with mean \hat{x}_{k|k-1} and covariance P_{k|k-1}, generate new N_s samples x_{k|k-1}^j \sim N(x_k; \hat{x}_{k|k-1}, P_{k|k-1}), j = 1, ..., N_s
180 Monte Carlo Particle Filter (MCPF) SOLO
MCPF Summary (continue – 1)
4 Measurement Prediction: z_{k|k-1}^j = h(k, x_{k|k-1}^j), j = 1, ..., N_s,  \hat{z}_{k|k-1} = \frac{1}{N_s}\sum_{j=1}^{N_s} z_{k|k-1}^j
5 Predicted Covariances Computations:
S_k = P_{k|k-1}^{zz} = \frac{1}{N_s}\sum_{j=1}^{N_s} (z_{k|k-1}^j - \hat{z}_{k|k-1})(z_{k|k-1}^j - \hat{z}_{k|k-1})^T
P_{k|k-1}^{xz} = \frac{1}{N_s}\sum_{j=1}^{N_s} (x_{k|k-1}^j - \hat{x}_{k|k-1})(z_{k|k-1}^j - \hat{z}_{k|k-1})^T
6 Innovation and its Covariance: i_k = z_k - \hat{z}_{k|k-1},  S_k
7 Kalman Gain Computation: K_k = P_{k|k-1}^{xz} (P_{k|k-1}^{zz})^{-1}
8 Kalman Filter update: \mu_{k|k}^x = \hat{x}_{k|k-1} + K_k i_k,  \Sigma_{k|k}^{xx} = P_{k|k-1} - K_k S_k K_k^T
9 Importance Sampling using the Gaussian mean and covariance: generate new N_s samples x_{k|k}^m \sim N(x_k; \mu_{k|k}^x, \Sigma_{k|k}^{xx}), m = 1, ..., N_s
10 Weight Update:
\tilde{w}_k^m = \frac{p(z_k|x_{k|k}^m) \, N(x_{k|k}^m; \hat{x}_{k|k-1}, P_{k|k-1})}{N(x_{k|k}^m; \mu_{k|k}^x, \Sigma_{k|k}^{xx})},  w_k^m = \tilde{w}_k^m / \sum_{l=1}^{N_s} \tilde{w}_k^l, m = 1, ..., N_s
11 Update State and its Covariance:
\hat{x}_{k|k} = \frac{1}{N_s}\sum_{m=1}^{N_s} w_k^m x_{k|k}^m,  P_{k|k} = \frac{1}{N_s}\sum_{m=1}^{N_s} (x_{k|k}^m - \hat{x}_{k|k})(x_{k|k}^m - \hat{x}_{k|k})^T
k := k+1 & return to 1
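A sketch of the sample-based measurement update of steps 4-8 in Python/NumPy, assuming the predicted state particles and their predicted measurements are stacked as arrays; the toy dimensions and data at the bottom are illustrative only.

import numpy as np

def mc_measurement_update(x_pred, z_pred, z_meas, P_pred):
    # Steps 4-8: sample means, covariances, gain and Gaussian update.
    # x_pred: (N_s, n) predicted state particles; z_pred: (N_s, m) predicted measurements.
    x_hat = x_pred.mean(axis=0)
    z_hat = z_pred.mean(axis=0)
    dx = x_pred - x_hat
    dz = z_pred - z_hat
    N_s = x_pred.shape[0]
    S = dz.T @ dz / N_s                   # P^zz (innovation covariance)
    P_xz = dx.T @ dz / N_s                # cross covariance
    K = P_xz @ np.linalg.inv(S)           # Kalman gain
    mu = x_hat + K @ (z_meas - z_hat)     # updated mean
    Sigma = P_pred - K @ S @ K.T          # updated covariance
    return mu, Sigma

# toy usage with random 2-state / 1-measurement particles
rng = np.random.default_rng(3)
x_pred = rng.normal(size=(500, 2))
z_pred = x_pred[:, :1] + rng.normal(scale=0.5, size=(500, 1))
P_pred = np.cov(x_pred.T)
mu, Sigma = mc_measurement_update(x_pred, z_pred, np.array([0.3]), P_pred)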
181 SOLO Monte Carlo Particle Filter (MCPF)
[Block diagram of a tracking system with the blocks: Sensor Data Processing and Measurement Formation; Observation-to-Track Association; Input Data; Track Maintenance (Initialization, Confirmation and Deletion); Filtering and Prediction; Gating Computations]
Samuel S. Blackman, "Multiple-Target Tracking with Radar Applications", Artech House, 1986
Samuel S. Blackman, Robert Popoli, "Design and Analysis of Modern Tracking Systems", Artech House, 1999
Table of Content
182 Estimators SOLO
Maximum Likelihood Estimate (MLE)
For the particular vector measurement equation z = H x + v, where the measurement noise v \sim N(0, R) is Gaussian with zero mean and independent of x, the conditional probability p_{z|x}(z|x) can be written, using Bayes' rule, as:
p_{z|x}(z|x) = \frac{p_{z,x}(z, x)}{p_x(x)}
The joint probability of x and z is given by p_{z,x}(z, x) = p_{x,v}(x, v), and since the measurement noise v is independent of x: p_{x,v}(x, v) = p_x(x) \cdot p_v(v).
The measurement noise can be related to x and z by the function v = z - H x, and the Jacobian of this transformation is the identity:
J = \frac{\partial v}{\partial z} = I_{p \times p},  so that  p_{z,x}(z, x) = p_{x,v}(x, z - H x) / |J J^T|^{1/2} = p_x(x) \, p_v(z - H x)
183 Estimators SOLO
Maximum Likelihood Estimate (continue – 1)
p_{z|x}(z|x) = \frac{p_{z,x}(z, x)}{p_x(x)} = p_v(z - H x) = \frac{1}{(2\pi)^{p/2} |R|^{1/2}} \exp\{-\frac{1}{2}(z - H x)^T R^{-1} (z - H x)\}   (Gaussian, zero mean)
L(z, x) := p_{z|x}(z|x) is called the Likelihood Function and is a measure of how likely the parameter x is, given the observation z.
\max_x p_{z|x}(z|x) \Leftrightarrow \min_x (z - H x)^T R^{-1} (z - H x)   (Weighted Least Squares with W = R^{-1})
\frac{\partial}{\partial x}[(z - H x)^T R^{-1} (z - H x)] = -2 H^T R^{-1} (z - H x) = 0
H^T R^{-1} z - H^T R^{-1} H x^* = 0
\hat{x} = x^* := (H^T R^{-1} H)^{-1} H^T R^{-1} z
\frac{\partial^2}{\partial x^2}[(z - H x)^T R^{-1} (z - H x)] = 2 H^T R^{-1} H
This is a positive definite matrix; therefore the solution minimizes (z - H x)^T R^{-1} (z - H x) and maximizes p_{z|x}(z|x).
Table of Content
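A small numerical sketch of this MLE / weighted least squares solution x* = (H^T R^{-1} H)^{-1} H^T R^{-1} z; the matrices and measurements below are arbitrary illustrative values.

import numpy as np

H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])            # 3 measurements of a 2-dimensional parameter
R = np.diag([0.5, 1.0, 2.0])          # measurement noise covariance
z = np.array([1.1, 2.0, 2.8])         # observed measurements

R_inv = np.linalg.inv(R)
x_mle = np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ z)   # x* = (H^T R^-1 H)^-1 H^T R^-1 z
print(x_mle)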
184 Estimators SOLO
Bayesian Maximum Likelihood Estimate (Maximum Aposteriori – MAP Estimate)
Consider a Gaussian vector x \sim N(\bar{x}(-), P(-)) and the measurement z = H x + v, where the Gaussian noise v \sim N(0, R) is independent of x.
p_x(x) = \frac{1}{(2\pi)^{n/2} |P(-)|^{1/2}} \exp\{-\frac{1}{2}(x - \bar{x}(-))^T P(-)^{-1} (x - \bar{x}(-))\}
p_{z|x}(z|x) = p_v(z - H x) = \frac{1}{(2\pi)^{p/2} |R|^{1/2}} \exp\{-\frac{1}{2}(z - H x)^T R^{-1} (z - H x)\}
p_z(z) = \int_{-\infty}^{+\infty} p_{z,x}(z, x) \, dx = \int_{-\infty}^{+\infty} p_{z|x}(z|x) \, p_x(x) \, dx
p_z(z) is Gaussian with
E(z) = E(H x + v) = H E(x) + E(v) = H \bar{x}(-)
cov(z) = E\{[z - E(z)][z - E(z)]^T\} = E\{[H(x - \bar{x}(-)) + v][H(x - \bar{x}(-)) + v]^T\} = H P(-) H^T + R
p_z(z) = \frac{1}{(2\pi)^{p/2} |H P(-) H^T + R|^{1/2}} \exp\{-\frac{1}{2}[z - H \bar{x}(-)]^T [H P(-) H^T + R]^{-1} [z - H \bar{x}(-)]\}
185 Estimators SOLO
Bayesian Maximum Likelihood Estimate (Maximum Aposteriori Estimate) (continue – 1)
With p_x(x), p_{z|x}(z|x) and p_z(z) as on the previous slide, Bayes' rule gives the posterior:
p_{x|z}(x|z) = \frac{p_{z|x}(z|x) \, p_x(x)}{p_z(z)} = \frac{|H P(-) H^T + R|^{1/2}}{(2\pi)^{n/2} |R|^{1/2} |P(-)|^{1/2}} \exp\{-\frac{1}{2}[(z - H x)^T R^{-1} (z - H x) + (x - \bar{x}(-))^T P(-)^{-1} (x - \bar{x}(-)) - (z - H \bar{x}(-))^T (H P(-) H^T + R)^{-1} (z - H \bar{x}(-))]\}
186 Estimators SOLO
Bayesian Maximum Likelihood Estimate (Maximum Aposteriori Estimate) (continue – 2)
Complete the square in the exponent
(z - H x)^T R^{-1} (z - H x) + (x - \bar{x}(-))^T P(-)^{-1} (x - \bar{x}(-)) - (z - H \bar{x}(-))^T (H P(-) H^T + R)^{-1} (z - H \bar{x}(-))
by defining the posterior covariance
P(+) := [P(-)^{-1} + H^T R^{-1} H]^{-1}
and using the matrix identity R^{-1} - R^{-1} H P(+) H^T R^{-1} = [H P(-) H^T + R]^{-1}. The three quadratic forms then collect into a single one:
[x - \bar{x}(-) - P(+) H^T R^{-1}(z - H \bar{x}(-))]^T P(+)^{-1} [x - \bar{x}(-) - P(+) H^T R^{-1}(z - H \bar{x}(-))]
so that
p_{x|z}(x|z) = \frac{1}{(2\pi)^{n/2} |P(+)|^{1/2}} \exp\{-\frac{1}{2}[x - \bar{x}(-) - P(+) H^T R^{-1}(z - H \bar{x}(-))]^T P(+)^{-1} [x - \bar{x}(-) - P(+) H^T R^{-1}(z - H \bar{x}(-))]\}
187 Estimators SOLO
Bayesian Maximum Likelihood Estimate (Maximum Aposteriori Estimate) (continue – 3)
We then have
p_{x|z}(x|z) = \frac{1}{(2\pi)^{n/2} |P(+)|^{1/2}} \exp\{-\frac{1}{2}[x - \bar{x}(+)]^T P(+)^{-1} [x - \bar{x}(+)]\}
where:
P(+) := [P(-)^{-1} + H^T R^{-1} H]^{-1}
\bar{x}(+) = x^* := \arg\max_x p_{x|z}(x|z) = \bar{x}(-) + P(+) H^T R^{-1} [z - H \bar{x}(-)]
Table of Content
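A numerical sketch of this MAP update (posterior mean x̄(+) and covariance P(+)); the prior, H, R and z below are arbitrary illustrative values.

import numpy as np

# prior x ~ N(x_bar, P_minus), measurement z = H x + v, v ~ N(0, R)
x_bar = np.array([0.0, 1.0])
P_minus = np.diag([4.0, 2.0])
H = np.array([[1.0, 0.5]])
R = np.array([[0.25]])
z = np.array([1.4])

P_plus = np.linalg.inv(np.linalg.inv(P_minus) + H.T @ np.linalg.inv(R) @ H)
x_map = x_bar + P_plus @ H.T @ np.linalg.inv(R) @ (z - H @ x_bar)
print(x_map, P_plus)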
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
Fred Daum, from Raytheon Company, leads methods to design Nonlinear Filters starting from the Fokker-Planck Equation.
A continuous dynamic system is described by:
d x(t) = f[x(t), t] \, dt + d w(t),  t \in [t_0, t_f]
x(t) – n-dimensional state vector
d w(t) – n-dimensional process noise vector, described by the covariance matrix Q:
E\{d w(t)\} =: \hat{dw}(t) = 0,  E\{[dw(t) - \hat{dw}(t)][dw(\tau) - \hat{dw}(\tau)]^T\} = Q(t) \, \delta(t - \tau)
p[x(t)] – the probability density of the state x at time t
The time evolution of the probability density function is described by the Fokker–Planck equation:
\frac{\partial p[x(t)]}{\partial t} = -\frac{\partial (f[x(t), t] \, p[x(t)])}{\partial x} + \frac{1}{2}\frac{\partial}{\partial x}\left(Q(t) \frac{\partial p[x(t)]}{\partial x}\right)
where
\frac{\partial (f[x(t), t] \, p[x(t)])}{\partial x} = \sum_{i=1}^{n} \frac{\partial (f_i[x(t), t] \, p[x(t)])}{\partial x_i},   \frac{\partial p[x(t)]}{\partial x} = \left[\frac{\partial p[x(t)]}{\partial x_1}, \frac{\partial p[x(t)]}{\partial x_2}, ..., \frac{\partial p[x(t)]}{\partial x_n}\right]^T
Fred Daum
Return to Stochastic Processes
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
Assume system measurements at discrete times t_k given by:
z(t_k) = h(x(t_k), v_k, t_k),  t_k \in [t_0, t_f]
v_k – m-dimensional measurement noise vector at t_k
We are interested in the probability of the state x at time t given the set of discrete measurements up to and including time t_k < t: p(x, t|Z_k), where Z_k = {z_1, z_2, ..., z_k} is the set of all measurements up to and including time t_k.
Bayes' Rule:
p(x, t_k|Z_k) = p(x, t_k|z_k, Z_{k-1}) = \frac{p(z_k|x, t_k) \, p(x, t_k|Z_{k-1})}{p(z_k|Z_{k-1})}
p(x, t_k|Z_{k-1}) – probability of x at time t_k given Z_{k-1} (a priori – before measurement z_k)
p(x, t_k|Z_k) – probability of x at time t_k given Z_k (a posteriori – after measurement z_k)
p(z_k|x, t_k) – probability of measurement z_k given the state x at time t_k (likelihood of the measurement)
p(z_k|Z_{k-1}) – probability of measurement z_k given Z_{k-1} (a priori – before measurement z_k; normalization of the conditional probability)
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
The Particle Filter solutions have implementation problems. The number of particles necessary to reduce the filter error increases with the system dimension. Daum gives the filter error as a function of the number of particles, with the system dimension as a parameter.
In the Classical Particle Filter solution the particles are drawn using the a-priori density, which decides their distribution (see Figure). After the measurement, the Likelihood of the Measurement is obtained, and nothing prevents a low density of the previously drawn particles in the likelihood region. This is Particle Degeneracy, which produces the curse of dimensionality.
[Figure: prior density; particles drawn to represent the prior density; likelihood of the measurement – particle degeneracy as a cause of the curse of dimensionality]
Fred Daum
http://sc.enseeiht.fr/doc/Seminar_Daum_2012_2.pdf
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
Here we describe Daum's proposed methods, called Particle Flow Filters.
By taking the natural logarithm of the conditional probability we get on the right side a sum of logarithms:
\ln p(x, t_k|Z_k) [a posteriori] = \ln p(x, t_k|Z_{k-1}) [a priori] + \ln p(z_k|x, t_k) [likelihood] - \ln p(z_k|Z_{k-1}) [normalization]
To obtain the a-posteriori probability p(x, t_k|Z_k) from the a-priori probability p(x, t_k|Z_{k-1}) and the likelihood p(z_k|x, t_k), Daum uses a homotopy procedure (see next slide), choosing a continuous homotopy parameter \lambda \in [0, 1]:
\ln p(x, \lambda) = \ln g(x) [a priori] + \lambda \ln h(x) [likelihood] - \ln K(\lambda) [normalization]
He searches for a function f(x, \lambda) (not related to the filtered system) that describes the flow of the particles and is associated with p(x, t_k|Z_k):
Particle Flow Equation:  \frac{dx}{d\lambda} = f(x, \lambda) + Q(x, \lambda) \frac{dw}{d\lambda},   with Q(x, \lambda) a noise spectrum to be defined.
Since p(x, \lambda) is the p.d.f. associated with a system defined by f(x, \lambda), we have the Fokker-Planck equation:
\frac{\partial p(x, \lambda)}{\partial \lambda} = -\frac{\partial (f(x, \lambda) \, p(x, \lambda))}{\partial x} + \frac{1}{2}\frac{\partial}{\partial x}\left(Q(x, \lambda) \frac{\partial p(x, \lambda)}{\partial x}\right)
[Figure: induced flow of particles for Bayes' rule – sample from the a-priori density, flow of density / flow of particles, sample from the a-posteriori density]
192 SOLO Homotopy
In topology, two continuous functions from one topological space to another are called homotopic (Greek ὁμός (homós) = same, similar, and τόπος (tópos) = place) if one can be "continuously deformed" into the other, such a deformation being called a homotopy between the two functions. An outstanding use of homotopy is the definition of homotopy groups and cohomotopy groups, important invariants in algebraic topology.
[Figure: a homotopy of a coffee cup into a doughnut]
Formal definition
Formally, a homotopy between two continuous functions f and g from a topological space X to a topological space Y is defined to be a continuous function H : X × [0,1] → Y from the product of the space X with the unit interval [0,1] to Y such that, if x ∈ X, then H(x,0) = f(x) and H(x,1) = g(x).
If we think of the second parameter of H as time, then H describes a continuous deformation of f into g: at time 0 we have the function f and at time 1 we have the function g.
An alternative notation is to say that a homotopy between two continuous functions f, g : X → Y is a family of continuous functions h_t : X → Y for t ∈ [0,1] such that h_0 = f and h_1 = g, and the map t ↦ h_t is continuous from [0,1] to the space of all continuous functions X → Y. The two versions coincide by setting h_t(x) = H(x,t).
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
Fokker-Planck equation:
\frac{\partial p(x, \lambda)}{\partial \lambda} = -\frac{\partial (f(x, \lambda) \, p(x, \lambda))}{\partial x} + \frac{1}{2}\frac{\partial}{\partial x}\left(Q(x, \lambda) \frac{\partial p(x, \lambda)}{\partial x}\right)
Definition of p(x, \lambda):  \ln p(x, \lambda) = \ln g(x) + \lambda \ln h(x) - \ln K(\lambda)
From the definition, \frac{\partial \ln p(x, \lambda)}{\partial \lambda} = \ln h(x) - \frac{d \ln K(\lambda)}{d\lambda}, and since \frac{\partial p}{\partial \lambda} = p \frac{\partial \ln p}{\partial \lambda}, we have the partial differential equation for f given p:
\left(\ln h(x) - \frac{d \ln K(\lambda)}{d\lambda}\right) p(x, \lambda) = -\frac{\partial (f(x, \lambda) \, p(x, \lambda))}{\partial x} + \frac{1}{2}\frac{\partial}{\partial x}\left(Q(x, \lambda) \frac{\partial p(x, \lambda)}{\partial x}\right)
Dividing by p(x, \lambda):
\ln h(x) - \frac{d \ln K(\lambda)}{d\lambda} = -f(x, \lambda)\frac{\partial \ln p(x, \lambda)}{\partial x} - \frac{\partial f(x, \lambda)}{\partial x} + \frac{1}{2 \, p(x, \lambda)}\frac{\partial}{\partial x}\left(Q(x, \lambda) \frac{\partial p(x, \lambda)}{\partial x}\right)
Fred Daum
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
\ln h(x) - \frac{d \ln K(\lambda)}{d\lambda} = -f(x, \lambda)\frac{\partial \ln p(x, \lambda)}{\partial x} - \frac{\partial f(x, \lambda)}{\partial x} + \frac{1}{2 \, p(x, \lambda)}\frac{\partial}{\partial x}\left(Q(x, \lambda) \frac{\partial p(x, \lambda)}{\partial x}\right)
Differentiate this equation with respect to x:
\frac{\partial \ln h(x)}{\partial x} = -f^T(x, \lambda)\frac{\partial^2 \ln p(x, \lambda)}{\partial x^2} - \frac{\partial \ln p(x, \lambda)}{\partial x}\frac{\partial f(x, \lambda)}{\partial x} - \frac{\partial^2 f(x, \lambda)}{\partial x \, \partial x} + \frac{\partial}{\partial x}\left[\frac{1}{2 \, p(x, \lambda)}\frac{\partial}{\partial x}\left(Q(x, \lambda) \frac{\partial p(x, \lambda)}{\partial x}\right)\right]
One option to simplify the problem is to choose Q(x, \lambda) such that:
-\frac{\partial \ln p(x, \lambda)}{\partial x}\frac{\partial f(x, \lambda)}{\partial x} - \frac{\partial^2 f(x, \lambda)}{\partial x \, \partial x} + \frac{\partial}{\partial x}\left[\frac{1}{2 \, p(x, \lambda)}\frac{\partial}{\partial x}\left(Q(x, \lambda) \frac{\partial p(x, \lambda)}{\partial x}\right)\right] = 0
We then obtain:
\frac{\partial \ln h(x)}{\partial x} = -f^T(x, \lambda)\frac{\partial^2 \ln p(x, \lambda)}{\partial x^2}
f(x, \lambda) = -\left(\frac{\partial^2 \ln p(x, \lambda)}{\partial x^2}\right)^{-1}\left(\frac{\partial \ln h(x)}{\partial x}\right)^T
Fred Daum
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
A second option to simplify the problem is to choose Q(x, \lambda) = 0.
Particle Flow Equation:  \frac{dx}{d\lambda} = f(x, \lambda)
Fokker-Planck equation:  \frac{\partial p(x, \lambda)}{\partial \lambda} = -\frac{\partial (f(x, \lambda) \, p(x, \lambda))}{\partial x}
Definition of p(x, \lambda):  \ln p(x, \lambda) = \ln g(x) + \lambda \ln h(x) - \ln K(\lambda)
Applying d/d\lambda to the definition and combining with the Fokker-Planck equation gives the P.D.E. for f given p:
\left(\ln h(x) - \frac{d \ln K(\lambda)}{d\lambda}\right) p(x, \lambda) = -\frac{\partial}{\partial x}[f(x, \lambda) \, p(x, \lambda)]
Define (p and \ln K are known):
\eta(x, \lambda) := -\left(\ln h(x) - \frac{d \ln K(\lambda)}{d\lambda}\right) p(x, \lambda)
We obtain the P.D.E. for f given p:
\frac{\partial}{\partial x}[p(x, \lambda) \, f(x, \lambda)] = \eta(x, \lambda)
Fred Daum
SOLO Nonlinear Filters
Nonlinear Filters based on the Fokker-Planck Equation
Second option, Q(x, \lambda) = 0 (continued). With q := p f we obtain:
\frac{\partial q(x, \lambda)}{\partial x} = \frac{\partial q_1}{\partial x_1} + \frac{\partial q_2}{\partial x_2} + ... + \frac{\partial q_n}{\partial x_n} = \eta(x, \lambda)
where f is the unknown function and p & \eta are known at random points.
1. Linear PDE in the unknown f or q.
2. Constant-coefficient PDE in q.
3. First-order PDE.
4. Highly underdetermined PDE.
5. Same form as the Gauss divergence law in Maxwell's Equations.
6. Same form as Euler's Equation in Fluid Dynamics.
7. Existence of solutions if and only if the integral of \eta is zero.
Exact Flow Solution for Gaussian densities g & h:
f(x, \lambda) = A(\lambda) x + b(\lambda)
A(\lambda) = -\frac{1}{2} P H^T (\lambda H P H^T + R)^{-1} H
b(\lambda) = (I + 2\lambda A)[(I + \lambda A) P H^T R^{-1} z + A \bar{x}]
Automatically stable under very mild conditions & extremely fast.
Fred Daum
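A small sketch of this exact Gaussian particle flow, moving prior samples toward the posterior by Euler steps in λ. The prior, measurement model and step count below are illustrative assumptions, and practical Daum-Huang implementation details (e.g. re-estimating P and the linearization for nonlinear h) are not addressed here.

import numpy as np

rng = np.random.default_rng(4)
n_particles, n_lambda = 1000, 20

# prior N(x_bar, P), linear measurement z = H x + v, v ~ N(0, R)
x_bar = np.array([0.0, 0.0])
P = np.diag([4.0, 1.0])
H = np.array([[1.0, 1.0]])
R = np.array([[0.5]])
z = np.array([2.0])

particles = rng.multivariate_normal(x_bar, P, n_particles)   # samples from the prior
d_lambda = 1.0 / n_lambda

for step in range(n_lambda):
    lam = (step + 0.5) * d_lambda
    A = -0.5 * P @ H.T @ np.linalg.inv(lam * H @ P @ H.T + R) @ H
    b = (np.eye(2) + 2 * lam * A) @ ((np.eye(2) + lam * A) @ P @ H.T @ np.linalg.inv(R) @ z + A @ x_bar)
    # dx/dlambda = A(lambda) x + b(lambda), integrated with a simple Euler step
    particles = particles + d_lambda * (particles @ A.T + b)

print(particles.mean(axis=0))   # should be close to the Kalman posterior mean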
Fred Daum SOLO Nonlinear Filters
F. Daum, J. Huang, "Particle Flow for Nonlinear Filters, Bayesian Decisions and Transport", 7 April 2014
SOLO Nonlinear Filters Fred Daum Table of Content
199 SOLO Recursive Bayesian Estimation
References:
1. Sage, A.P., & Melsa, J.L., "Estimation Theory with Applications to Communications and Control", McGraw Hill, 1971
2. Gordon, N.J., Salmond, D.J., Smith, A.M.F., "Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation", IEE Proceedings Radar and Signal Processing, Vol. 140, No. 2, April 1993, pp. 107-113
3. Doucet, A., de Freitas, N., Gordon, N., Ed., "Sequential Monte Carlo Methods in Practice", Springer, 2001
4. Karlsson, R., "Simulation Based Methods for Target Tracking", Department of Electrical Engineering, Linköpings Universitet, 2002
5. Arulampalam, S., Maskell, S., Gordon, N., Clapp, T., "A Tutorial on Particle Filters for On-line Non-linear/Non-Gaussian Bayesian Tracking", IEEE Transactions on Signal Processing, Vol. 50, No. 2, February 2002
6. Ristic, B., Arulampalam, S., Gordon, N., "Beyond the Kalman Filter: Particle Filters for Tracking Applications", Artech House, 2004
7. Haug, A.J., "A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes", MITRE Corporation, January 2005
200 SOLO Recursive Bayesian Estimation
References (continue – 1):
Fred Daum, "Particle Flow for Nonlinear Filters", 19 July 2012, http://sc.enseeiht.fr/doc/Seminar_Daum_2012_2.pdf
Fred Daum, Misha Krichman, "Non-Particle Filters", https://www.ll.mit.edu/asap/asap_06/pdf/Papers/23_Daum_Pa.pdf
F. Daum, J. Huang, "Particle Flow for Nonlinear Filters, Bayesian Decisions and Transport", 7 April 2014, http://meeting.xidian.edu.cn/workshop/miis2014/uploads/files/July-5th-930am_Fred%20Daum_Particle%20flow%20for%20nonliner%20filters,%20Bayesuan%20Decisions%20and%20Transport%20.pdf
Zhe Chen, "Bayesian Filtering: From Kalman Filters to Particle Filters, and Beyond", 18.05.06, http://www.dsi.unifi.it/users/chisci/idfric/Nonlinear_filtering_Chen.pdf
Table of Content
    201January 13, 2015201 SOLO Technion Israeli Institute of Technology 1964 – 1968 BSc EE 1968 – 1971 MSc EE Israeli Air Force 1970 – 1974 RAFAEL Israeli Armament Development Authority 1974 – 2013 Stanford University 1983 – 1986 PhD AA
    202 “Proceedings of theIEEE”, March 2004, Special Issue on: “Sequential State Estimation: From Kalman Filters to Particle Filters” Julier, S.,J. and Uhlmann, J.,K., “Unscented Filtering and Nonlinear Estimation”, pp.401 - 422 Recursive Bayesian Estimation
203 SOLO Recursive Bayesian Estimation
Neil Gordon, M. Sanjeev Arulampalam, Tim Clapp, Simon Maskell, Nando de Freitas, Arnaud Doucet, Branko Ristic, Genshiro Kitagawa, Christophe Andrieu, Dan Crişan, Fred Daum
204 Markov Chain Monte Carlo (MCMC) SOLO
Some MCMC Developments Related to Vision
Nicholas Constantine Metropolis (1915 – 1999)
Metropolis 1946; Hastings 1970; Heat bath, Miller, Grenander, 1994; Green 1995; DDMCMC 2001 - 2005; Waltz 1972 (labeling); Rosenfeld, Hummel, Zucker 1976 (relaxation); Geman brothers 1984 (Gibbs sampler); Kirkpatrick 1983; Swendsen-Wang 1987 (clustering); Swendsen-Wang Cut 2003
205 Markov Chain Monte Carlo (MCMC) SOLO
A Brief History of MCMC
Nicholas Constantine Metropolis (1915 – 1999)
1942 – 1946: Real use of Monte Carlo started during WWII – study of the atomic bomb (neutron diffusion in fissile material).
1948: Fermi, Metropolis, Ulam obtained Monte Carlo estimates for the eigenvalues of the Schrödinger equation.
1950s: Formulation of the basic construction of MCMC, e.g. the Metropolis method – application to statistical physics models, such as the Ising model.
1960 - 80: Using MCMC to study phase transitions; material growth/defects, macromolecules (polymers), etc.
1980s: Gibbs sampler (Geman brothers), simulated annealing, data augmentation, Swendsen-Wang, etc.; global optimization; image and speech; quantum field theory.
1990s: Applications in genetics; computational biology.
206 Rao – Blackwell Theorem SOLO
The Rao-Blackwell Theorem provides a process by which a possible improvement in the efficiency of an estimator can be obtained by taking its conditional expectation with respect to a sufficient statistic. The result for one parameter appeared in Rao (1945) and in Blackwell (1947). Lehmann and Scheffé (1950) called the result the Rao-Blackwell Theorem (RBT), and the process is described as Rao-Blackwellization (RB) by Berkson (1955). In computational terminology it is called the Rao-Blackwellized Filter (RBF).
Calyampudi Radhakrishna Rao and David Blackwell
The Rao – Blackwell Theorem states that if g(x) is any kind of estimator of a parameter θ, then the conditional expectation of g(x) given T(x), where T(x) is a sufficient statistic, is typically a better estimator of θ, and is never worse.
Let x = (x_1, ..., x_n) be a random sample from a probability distribution p(x, θ), where θ = (θ_1, ..., θ_q) is an unknown vector parameter. Consider an estimator g(x) = (g_1(x), ..., g_q(x)) of θ and the q×q mean square and product matrix C(g):
C(g) = (c_{ij}) = ( E{[g_i(x) - θ_i][g_j(x) - θ_j]} )
Let S be a sufficient statistic, which may be vector valued, such that the conditional expectation E{g|S} = T(x) does not depend on θ. A general version of Rao – Blackwell is:
C(g) – C(T) is nonnegative definite
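For a scalar estimator, the reason the conditioned estimator can never be worse is the law of total variance; this standard identity is added here for completeness (it is not spelled out on the slide):

% Law of total variance applied to the estimator g(x) and the sufficient statistic S,
% with T(x) = E[g(x) | S] having the same bias as g(x):
\operatorname{Var}\bigl(g(x)\bigr)
  = \operatorname{Var}\bigl(\mathbb{E}[\,g(x)\mid S\,]\bigr)
  + \mathbb{E}\bigl[\operatorname{Var}\bigl(g(x)\mid S\bigr)\bigr]
  \;\ge\; \operatorname{Var}\bigl(T(x)\bigr)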
    208 SOLO Non-Gaussian DistributionApproximation http://www.dsi.unifi.it/users/chisci/idfric/Nonlinear_filtering_Chen.pdf

Editor's Notes

  • #11 John Minkoff, “Signals, Noise, and Active Sensors - Radar, Sonar, Laser Radar”
  • #16 A. Papoulis, “Probability, Random Variables and Stochastic Processes”,McGraw-Hill, 1965, pp.147-148
  • #17 Papoulis, “Probability, Random Variables and Stochastic Processes”,McGraw-Hill, 1965, pp.126-132
  • #18 Papoulis, “Probability, Random Variables and Stochastic Processes”,McGraw-Hill, 1965, pp.126-132
  • #27 John Minkoff, “Signals, Noise, and Active Sensors - Radar, Sonar, Laser Radar”
  • #28 A. Papoulis, “Probability, Random Variables and Stochastic Processes”,McGraw-Hill, 1965, pp.147-148
  • #30 A.Papoulis, “Probability, Random Variables and Stochastic Processes”,McGraw-Hill, 1965, pp.263-266 http://en.wikipedia.org/wiki/Law_of_large_numbers
  • #31 Papoulis, “Probability, Random Variables and Stochastic Processes”,McGraw-Hill, 1965, pp.263-266 http://en.wikipedia.org/wiki/Law_of_large_numbers
  • #32 A.Papoulis, “Probability, Random Variables and Stochastic Processes”,McGraw-Hill, 1965, pp.260-263 http://en.wikipedia.org/wiki/Law_of_large_numbers
  • #33 http://en.wikipedia.org/wiki/Central_limit_theorem
  • #39 A. Papoulis, “ Probability, Random Variables and StochasticProcesses”, McGraw-Hill, 1965, pp.99-100
  • #40 A. Papoulis, “ Probability, Random Variables and StochasticProcesses”, McGraw-Hill, 1965, pp.169
  • #41 http://en.wikipedia.org/wiki/Monte_Carlo_sampling http://www.lanl.gov/news/pdf/Metropolis_bio.pdf
  • #43 A. Gelb, Ed., “Applied Optimal Estimation”,MIT Press, 1974, pg.147, Prob. 4-10
  • #44 A. Gelb, Ed., “Applied Optimal Estimation”,MIT Press, 1974, pg.147, Problem 4-10
  • #45 A. Gelb, Ed., “Applied Optimal Estimation”,MIT Press, 1974, pg.147, Problem 4-10
  • #46 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #47 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #48 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #49 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #50 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #51 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #52 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #53 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #54 Taylor, J., H., “Handbook of the Direct Statistical Analysis of Missile Guidance Systems via CADET”,“ The Analytic Sciences Corporation”, NTIS, AD-A013 397, 31 May 1975, Appendix C, “The Monte-Carlo Method: Application and Reliability”
  • #55 Bar-Shalom, Y., Xiao-Rong, L., “Estimation and Tracking: Principles, Techniques, and Software”, Artech House, 1993, pp. 108-109
  • #56 University of Alberta “ Principles of Monte Carlo Simulation”, February 2001
• #57 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 36 - 37
• #58 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 36 - 37
• #59 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 36 - 37; Coddington, P., “Monte Carlo Simulation for Statistical Physics”, CPS 713, Northeast Parallel Architectures Center, January 1996; http://en.wikipedia.org/wiki/Histogram
• #60 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 44 - 50
• #61 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 44 - 45
• #62 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 49 - 50
• #63 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 50 - 51
• #64 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 51 - 52
• #65 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 51 - 52
• #66 Karlsson, R., “Simulation Based Methods for Target Tracking”, Linkoping Studies in Science and Technology, Thesis No. 930, 2002, pp. 34 – 35, http://www.control.isy.liu.se/research/reports/LicentiateThesis/Lic930.pdf
• #67 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 59 - 60
• #68 S.M. Ross, “A Course in Simulation”, Macmillan & Collier Macmillan Publishers, 1990, pp. 135 - 136
  • #69 Ristic, B., Arulampalam, S., Gordon, N., “Beyond the Kalman Filter – Particle Filter for Tracking Applications”, Artech House, 2004, pp. 35-36
  • #70 Ristic, B., Arulampalam, S., Gordon, N., “Beyond the Kalman Filter – Particle Filter for Tracking Applications”, Artech House, 2004, pp. 35-36
  • #71 A. Papoulis, “ Probability, Random Variables and StochasticProcesses”, McGraw-Hill, 1965, pp.350
  • #72 A. Papoulis, “ Probability, Random Variables and StochasticProcesses”, McGraw-Hill, 1965, pp.350
  • #73 A. Papoulis, “ Probability, Random Variables and StochasticProcesses”, McGraw-Hill, 1965, pp.303, 350
  • #75 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #76 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #77 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #78 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #79 Gordon, N.J., Salmond, D.J., Smith, A.M.F., “Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation”, IEE Proceedings Radar and Signal Processing, vol. 140, No. 2, April 1993, pp. 107 - 113
  • #81 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 77- 82
  • #82 http://en.wikipedia.org/wiki/Adriaan_Fokker http://en.wikipedia.org/wiki/Max_Planck http://jeff560.tripod.com/f.html http://en.wikipedia.org/wiki/Fokker-Planck_equation
  • #83 http://en.wikipedia.org/wiki/Adriaan_Fokker http://en.wikipedia.org/wiki/Max_Planck http://en.wikipedia.org/wiki/Fokker-Planck_equation
  • #84 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 77- 82
  • #85 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 77- 82
  • #86 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 77- 82
  • #87 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 77- 82
  • #88 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 77- 82
  • #89 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 77- 82
  • #90 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #91 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #92 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #93 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #94 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #95 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #96 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #97 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #98 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #99 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #100 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #103 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #107 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #108 Bar-Shalom, Y., Li, X-R., “Estimation and Tracking: Principles, Techniques and Software”, Artech House, 1993, pp. 43-44, 132
  • #109 Bar-Shalom, Y., Li, X-R., “Estimation and Tracking: Principles, Techniques and Software”, Artech House, 1993, pp. 43-44, 132
  • #110 Bar-Shalom, Y., Li, X-R., “Estimation and Tracking: Principles, Techniques and Software”, Artech House, 1993, pp. 43-44, 132
  • #111 Bar-Shalom, Y., Li, X-R., “Estimation and Tracking: Principles, Techniques and Software”, Artech House, 1993, pp. 43-44, 132
  • #112 Bar-Shalom, Y., Li, X-R., “Estimation and Tracking: Principles, Techniques and Software”, Artech House, 1993, pp. 43-44, 132
  • #113 Bar-Shalom, Y., Li, X-R., “Estimation and Tracking: Principles, Techniques and Software”, Artech House, 1993, pp. 43-44, 132
  • #114 Bar-Shalom, Y., Li, X-R., “Estimation and Tracking: Principles, Techniques and Software”, Artech House, 1993, pp. 43-44, 132
  • #115 Kailath, T., Sayed, A.H., Hassibi, B, “Linear Estimators”, Prentice Hall, 2000,pp.96
  • #116 Kailath, T., Sayed, A.H., Hassibi, B, “Linear Estimators”, Prentice Hall, 2000,pp.96
  • #117 Kailath, T., Sayed, A.H., Hassibi, B, “Linear Estimators”, Prentice Hall, 2000,pp.96
  • #118 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #119 Sage, A.P., &amp; Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw Hill, 1971, pp. 272 - 283
  • #120 http://en.wikipedia.org/wiki/Rudolf_Kalman
  • #121 http://en.wikipedia.org/wiki/Rudolf_Kalman
  • #122 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005
  • #124 http://en.wikipedia.org/wiki/Rudolf_Kalman
  • #125 http://en.wikipedia.org/wiki/Rudolf_Kalman
  • #127 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #128 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #129 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #130 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #131 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #132 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #133 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #134 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #135 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #136 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005 Ito, Kazufumi, Xiong Kaiqi, “Gaussian Filters for Nonlinear Filtering Problems”, IEEE Transactions on Automatic Control, Vol. 45, No. 5, May 2000, pp. 910 - 927
  • #137 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #138 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #139 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #140 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #141 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #142 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #143 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #144 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defense Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
• #145 Wan, E.A., van der Merwe, R., “The Unscented Kalman Filter”, Ch. 7 of Haykin, S., Ed., “Kalman Filter and Neural Networks”, John Wiley & Sons, 2001, pp. 272
  • #146 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defence Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/
  • #147 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defence Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/ http://cslu.cse.ogi.edu/nsel/Doc/snow00-presentation/sld001.htm
  • #148 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defence Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/ http://cslu.cse.ogi.edu/nsel/Doc/snow00-presentation/sld001.htm
  • #149 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defence Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/ http://cslu.cse.ogi.edu/nsel/Doc/snow00-presentation/sld001.htm
  • #150 http://en.wikipedia.org/wiki/Rudolf_Kalman
  • #151 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defence Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/ http://cslu.cse.ogi.edu/nsel/Doc/snow00-presentation/sld001.htm
  • #152 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005
  • #153 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005
  • #154 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005
  • #155 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005
  • #156 Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005
  • #157 Julier, S.J., Uhlmann, J.K., “A New Extension of the Kalman Filter to Nonlinear Systems”, Proc. of AeroSense: The 11th Int. Symp. on Aerospace/Defence Sensing, Simulation and Controls., 1997 http://cslu.cse.ogi.edu/nsel/ukf/ http://cslu.cse.ogi.edu/nsel/Doc/snow00-presentation/sld001.htm
  • #158 http://en.wikipedia.org/wiki/Rudolf_Kalman
  • #159 Gordon, N.J., Salmond, D.J., Smith, A.M.F., “Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation”, IEE Proceedings Radar and Signal Processing, vol. 140, No. 2, April 1993, pp. 107 - 113
  • #160 Gordon, N.J., Salmond, D.J., Smith, A.M.F., “Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation”, IEE Proceedings Radar and Signal Processing, vol. 140, No. 2, April 1993, pp. 107 - 113
  • #161 Gordon, N.J., Salmond, D.J., Smith, A.M.F., “Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation”, IEE Proceedings Radar and Signal Processing, vol. 140, No. 2, April 1993, pp. 107 - 113
  • #163 Gordon, N.J., Salmond, D.J., Smith, A.M.F., “Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation”, IEE Proceedings Radar and Signal Processing, vol. 140, No. 2, April 1993, pp. 107 - 113
  • #164-#171, #173-#178:
    Haug, A.J., “A Tutorial on Bayesian Estimation and Tracking Techniques Applicable to Nonlinear and Non-Gaussian Processes”, MITRE Corporation, January 2005
    http://www.ece.iastate.edu/~namrata/EE520/Gordonnovelapproach.pdf
    Arulampalam, S., Maskell, S., Gordon, N., Clapp, T., “A Tutorial on Particle Filters for On-line Non-linear/Non-Gaussian Bayesian Tracking”, IEEE Transactions on Signal Processing, Vol. 50, No. 2, February 2002
    Ristic, B., Arulampalam, S., Gordon, N., “Beyond the Kalman Filter: Particle Filters for Tracking Applications”, Artech House, 2004
    Karlsson, R., “Simulation Based Methods for Target Tracking”, Department of Electrical Engineering, Linköpings Universitet, 2002
  • #172:
    University of Alberta, “Principles of Monte Carlo Simulation”, February 2001
    http://en.wikipedia.org/wiki/Bootstrapping_(statistics)
    Efron, B., “Bootstrap Methods: Another Look at the Jackknife”, The Annals of Statistics, Vol. 7, 1979, pp. 1-26
  • #189-#192, #194-#197 Daum, F., Huang, J., “Particle Flow for Nonlinear Filters, Bayesian Decision and Transport”, 7 April 2014
  • #193 http://en.wikipedia.org/wiki/Homotopy
  • #199 Sage, A.P. & Melsa, J.L., “Estimation Theory with Applications to Communications and Control”, McGraw-Hill, 1971, pp. 77-82
  • #205:
    Zhu, Dellaert, Tu, “ICCV05 Tutorial: MCMC for Vision”, October 2005
    Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E., “Equation of State Calculations by Fast Computing Machines”, Journal of Chemical Physics, Vol. 21, No. 6, 1953, pp. 1087-1092
    Hastings, W., “Monte Carlo Sampling Methods Using Markov Chains and Their Applications”, Biometrika, Vol. 57, 1970, pp. 97-109
    Geman, S. and Geman, D., “Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1984, No. 6, pp. 721-741
  • #206:
    http://civs.stat.ucla.edu/MCMC/MCMC_tutorial.htm
    Zhu, Dellaert, Tu, “ICCV05 Tutorial: MCMC for Vision”, October 2005
  • #207 http://en.wikipedia.org/wiki/Rao%E2%80%93Blackwell_theorem http://www.scholarpedia.org/article/Rao-Blackwell_theorem
  • #208, #209 Chen, Z., “Bayesian Filtering: From Kalman Filters to Particle Filters, and Beyond”, Manuscript, 18.05.06, p. 15, http://www.dsi.unifi.it/users/chisci/idfric/Nonlinear_filtering_Chen.pdf