1. Controlled sequential Monte Carlo
Jeremy Heng,
Department of Statistics, Harvard University
Joint work with Adrian Bishop (UTS, CSIRO), George
Deligiannidis & Arnaud Doucet (Oxford)
Computation and Econometrics Workshop
@ The National Art Center, 27 July 2018
2. Outline
1 State space models
2 Sequential Monte Carlo
3 Controlled sequential Monte Carlo
4 Bayesian parameter inference
5 Extensions and future work
4. State space models
[Graphical model: latent Markov chain X0 → X1 → … → XT with observations Y0, Y1, …, YT, all governed by a parameter θ]
Latent Markov chain
$$X_0 \sim \mu_\theta, \qquad X_t \mid X_{t-1} \sim f_{t,\theta}(X_{t-1}, \cdot), \quad t \in [1:T]$$
Observations
$$Y_t \mid X_{0:T} \sim g_{t,\theta}(X_t, \cdot), \quad t \in [0:T]$$
13. Neuroscience example
• Joint work with Demba Ba, Harvard School of Engineering
• 3000 measurements $y_t \in \{0, \ldots, 50\}$ collected from a neuroscience experiment (Temereanca et al., 2008)
[Plot: the observed counts $y_t$ over the 3000 time steps]
15. Neuroscience example
• Observation model
$$Y_t \mid X_t \sim \text{Binomial}\left(50, \kappa(X_t)\right), \qquad \kappa(u) = (1 + \exp(-u))^{-1}$$
• Latent Markov chain
$$X_0 \sim N(0, 1), \qquad X_t \mid X_{t-1} \sim N(\alpha X_{t-1}, \sigma^2)$$
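To make the model concrete, here is a minimal simulation sketch in Python (numpy assumed; the values α = 0.99 and σ² = 0.11 are those used later in the talk, and the function name is ours):

```python
import numpy as np

def simulate_ssm(T, alpha=0.99, sigma2=0.11, trials=50, seed=0):
    """Simulate X_0 ~ N(0,1), X_t | X_{t-1} ~ N(alpha X_{t-1}, sigma2),
    Y_t | X_t ~ Binomial(trials, kappa(X_t)) with logistic kappa."""
    rng = np.random.default_rng(seed)
    x = np.empty(T + 1)
    x[0] = rng.normal(0.0, 1.0)
    for t in range(1, T + 1):
        x[t] = rng.normal(alpha * x[t - 1], np.sqrt(sigma2))
    kappa = 1.0 / (1.0 + np.exp(-x))   # success probability per time step
    y = rng.binomial(trials, kappa)    # one binomial count per time step
    return x, y

x, y = simulate_ssm(T=2999)            # 3000 measurements, as in the data
```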
18. Objects of interest
• Online estimation: compute filtering distributions
$$p(x_t \mid y_{0:t}, \theta), \quad t \geq 0$$
• Offline estimation: compute the smoothing distribution
$$p(x_{0:T} \mid y_{0:T}, \theta), \quad T \in \mathbb{N}$$
or the marginal likelihood
$$p(y_{0:T} \mid \theta) = \int_{\mathsf{X}^{T+1}} \mu_\theta(x_0) \prod_{t=1}^{T} f_{t,\theta}(x_{t-1}, x_t) \prod_{t=0}^{T} g_{t,\theta}(x_t, y_t) \, \mathrm{d}x_{0:T}$$
• Parameter inference
$$\arg\max_{\theta \in \Theta} \; p(y_{0:T} \mid \theta), \qquad p(\theta \mid y_{0:T}) \propto p(\theta) \, p(y_{0:T} \mid \theta)$$
21. Outline
1 State space models
2 Sequential Monte Carlo
3 Controlled sequential Monte Carlo
4 Bayesian parameter inference
5 Extensions and future work
22. Sequential Monte Carlo
• Sequential Monte Carlo (SMC), aka the bootstrap particle filter (BPF), recursively simulates an interacting particle system of size N:
$$(X_t^1, \ldots, X_t^N), \quad t \in [0:T]$$
• Unbiased and consistent marginal likelihood estimator
$$\hat{p}(y_{0:T} \mid \theta) = \prod_{t=0}^{T} \frac{1}{N} \sum_{n=1}^{N} g_{t,\theta}(X_t^n, y_t)$$
• Consistent approximation of the smoothing distribution
$$\frac{1}{N} \sum_{n=1}^{N} \varphi(X_{0:T}^n) \to \int \varphi(x_{0:T}) \, p(x_{0:T} \mid y_{0:T}, \theta) \, \mathrm{d}x_{0:T}$$
as $N \to \infty$
Del Moral (2004). Feynman-Kac formulae.
25. Sequential Monte Carlo
[Diagram: particles propagated through the hidden chain, with the resampled particles marked at each time step]
For time $t = 0$ and particle $n \in [1:N]$:
• sample $X_0^n \sim \mu_\theta$
• weight $W_0^n \propto g_{0,\theta}(X_0^n, y_0)$
• sample ancestor $A_0^n \sim \mathcal{R}(W_0^1, \ldots, W_0^N)$; resampled particle $X_0^{A_0^n}$
For time $t \in [1:T]$ and particle $n \in [1:N]$:
• sample $X_t^n \sim f_{t,\theta}(X_{t-1}^{A_{t-1}^n}, \cdot)$
• weight $W_t^n \propto g_{t,\theta}(X_t^n, y_t)$
• sample ancestor $A_t^n \sim \mathcal{R}(W_t^1, \ldots, W_t^N)$; resampled particle $X_t^{A_t^n}$
Note this is for a given $\theta$!
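A compact bootstrap particle filter for this model, as a sketch of the recursion above (multinomial resampling at every step; numpy and scipy assumed; this is our own minimal version, not the talk's MATLAB implementation):

```python
import numpy as np
from scipy.stats import binom

def bootstrap_pf(y, alpha=0.99, sigma2=0.11, trials=50, N=1024, seed=1):
    """Bootstrap particle filter for the neuroscience model; returns the
    log of the unbiased marginal likelihood estimator prod_t mean_n g_t."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=N)              # X_0^n ~ mu
    log_p_hat = 0.0
    for t in range(len(y)):
        if t > 0:
            x = rng.normal(alpha * x, np.sqrt(sigma2))  # X_t^n ~ f_t
        logw = binom.logpmf(y[t], trials, 1.0 / (1.0 + np.exp(-x)))
        m = logw.max()                            # log-sum-exp for stability
        w = np.exp(logw - m)
        log_p_hat += m + np.log(w.mean())         # running product of averages
        a = rng.choice(N, size=N, p=w / w.sum())  # ancestors A_t^n
        x = x[a]                                  # resampled particles
    return log_p_hat
```

For example, `bootstrap_pf(y)` applied to the simulated data above returns one realization of $\log \hat{p}(y_{0:T} \mid \theta)$.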
33. Ideal Metropolis-Hastings
We cannot run the ideal Metropolis-Hastings algorithm to sample from
$$p(\theta \mid y_{0:T}) \propto p(\theta) \, p(y_{0:T} \mid \theta),$$
since it requires evaluating the intractable marginal likelihood $p(y_{0:T} \mid \theta)$:
For iteration $i \geq 1$
1. Sample $\theta^* \sim q(\theta^{(i-1)}, \cdot)$
2. With probability
$$\min\left(1, \; \frac{p(\theta^*) \, p(y_{0:T} \mid \theta^*) \, q(\theta^*, \theta^{(i-1)})}{p(\theta^{(i-1)}) \, p(y_{0:T} \mid \theta^{(i-1)}) \, q(\theta^{(i-1)}, \theta^*)}\right)$$
set $\theta^{(i)} = \theta^*$; otherwise set $\theta^{(i)} = \theta^{(i-1)}$
34. Particle marginal Metropolis-Hastings (PMMH)
For iteration $i \geq 1$
1. Sample $\theta^* \sim q(\theta^{(i-1)}, \cdot)$
2. Compute $\hat{p}(y_{0:T} \mid \theta^*)$ with SMC
3. With probability
$$\min\left(1, \; \frac{p(\theta^*) \, \hat{p}(y_{0:T} \mid \theta^*) \, q(\theta^*, \theta^{(i-1)})}{p(\theta^{(i-1)}) \, \hat{p}(y_{0:T} \mid \theta^{(i-1)}) \, q(\theta^{(i-1)}, \theta^*)}\right)$$
set $\theta^{(i)} = \theta^*$ and $\hat{p}(y_{0:T} \mid \theta^{(i)}) = \hat{p}(y_{0:T} \mid \theta^*)$; otherwise set $\theta^{(i)} = \theta^{(i-1)}$ and $\hat{p}(y_{0:T} \mid \theta^{(i)}) = \hat{p}(y_{0:T} \mid \theta^{(i-1)})$
Andrieu, Doucet & Holenstein (2010). Journal of the Royal Statistical Society: Series B.
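A sketch of the PMMH loop built on the `bootstrap_pf` sketch above, with α fixed and a Gaussian random walk on σ² under a flat prior on (0, ∞); these proposal and prior choices are ours, for illustration only:

```python
import numpy as np

def pmmh(y, n_iters=1000, N=1024, step=0.01, seed=2):
    """PMMH targeting p(sigma2 | y_{0:T}) with alpha = 0.99 held fixed."""
    rng = np.random.default_rng(seed)
    sigma2 = 0.11                                  # initial state of the chain
    log_p = bootstrap_pf(y, sigma2=sigma2, N=N, seed=int(rng.integers(1 << 31)))
    chain = np.empty(n_iters)
    for i in range(n_iters):
        prop = sigma2 + step * rng.normal()        # random-walk proposal
        if prop > 0.0:                             # flat prior support
            log_p_prop = bootstrap_pf(y, sigma2=prop, N=N,
                                      seed=int(rng.integers(1 << 31)))
            # symmetric proposal: the ratio reduces to the likelihood estimates
            if np.log(rng.uniform()) < log_p_prop - log_p:
                sigma2, log_p = prop, log_p_prop   # accept
        chain[i] = sigma2
    return chain
```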
35. Particle marginal Metropolis-Hastings (PMMH)
• Exact approximation:
$$\frac{1}{m} \sum_{i=1}^{m} \varphi(\theta^{(i)}) \to \int_\Theta \varphi(\theta) \, p(\theta \mid y_{0:T}) \, \mathrm{d}\theta$$
as $m \to \infty$, for any $N \geq 1$
• Performance depends on the variance of $\hat{p}(y_{0:T} \mid \theta)$
Andrieu & Vihola (2015). The Annals of Applied Probability.
• Account for computational cost to optimize efficiency
Doucet, Pitt, Deligiannidis & Kohn (2015). Biometrika.
Sherlock, Thiery, Roberts & Rosenthal (2015). The Annals of Statistics.
• The proposed method lowers the variance of $\hat{p}(y_{0:T} \mid \theta)$ at fixed cost
39. Neuroscience example: SMC performance
Relative variance of the log-marginal likelihood estimator,
$$\sigma^2 \mapsto \mathrm{Var}\left[\frac{\log \hat{p}(y_{0:T} \mid (\alpha, \sigma^2))}{\log p(y_{0:T} \mid (\alpha, \sigma^2))}\right], \qquad \text{with } \alpha = 0.99 \text{ fixed},$$
for $N = 1024$, $N = 2048$ and $N = 5529$, compared against controlled SMC.
[Plot: log relative variance over $\sigma^2 \in [0.05, 0.2]$ for each sampler]
43. Mismatch between dynamics and observations
SMC weights samples from the model dynamics
$$X_t \mid X_{t-1} \sim N(\alpha X_{t-1}, \sigma^2)$$
without taking the observations into account.
[Plots: densities over $x \in [-12, 0]$ illustrating the mismatch between the particles proposed from the dynamics and the region favoured by the observations]
47. Outline
1 State space models
2 Sequential Monte Carlo
3 Controlled sequential Monte Carlo
4 Bayesian parameter inference
5 Extensions and future work
48. Optimal dynamics
• Fix $\theta$ and suppress notational dependence on it
• Sampling from the smoothing distribution $p(x_{0:T} \mid y_{0:T})$:
$$X_0 \sim p(x_0 \mid y_{0:T}), \qquad X_t \mid X_{t-1} \sim p(x_t \mid x_{t-1}, y_{t:T}), \quad t \in [1:T]$$
• Define the backward information filter $\psi_t^*(x_t) = p(y_{t:T} \mid x_t)$; then
$$p(x_0 \mid y_{0:T}) = \frac{\mu(x_0) \, \psi_0^*(x_0)}{\mu(\psi_0^*)}$$
with $\mu(\psi_0^*) = \int_{\mathsf{X}} \psi_0^*(x_0) \, \mu(x_0) \, \mathrm{d}x_0$, and
$$p(x_t \mid x_{t-1}, y_{t:T}) = \frac{f_t(x_{t-1}, x_t) \, \psi_t^*(x_t)}{f_t(\psi_t^*)(x_{t-1})}$$
with $f_t(\psi_t^*)(x_{t-1}) = \int_{\mathsf{X}} \psi_t^*(x_t) \, f_t(x_{t-1}, x_t) \, \mathrm{d}x_t$
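These are the right dynamics because the smoothing distribution factorizes along exactly these kernels: by the Markov property and the conditional independence structure of the model,
$$p(x_{0:T} \mid y_{0:T}) = p(x_0 \mid y_{0:T}) \prod_{t=1}^{T} p(x_t \mid x_{0:t-1}, y_{0:T}) = p(x_0 \mid y_{0:T}) \prod_{t=1}^{T} p(x_t \mid x_{t-1}, y_{t:T}),$$
since, given $x_{t-1}$, the state $x_t$ is conditionally independent of $x_{0:t-2}$ and of the earlier observations $y_{0:t-1}$.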
51. Controlled state space model
• Given a policy
$$\psi = (\psi_0, \ldots, \psi_T),$$
i.e. a collection of positive and bounded functions
• Construct the ψ-controlled dynamics
$$X_0 \sim \mu^{\psi}, \qquad X_t \mid X_{t-1} \sim f_t^{\psi}(X_{t-1}, \cdot), \quad t \in [1:T]$$
where
$$\mu^{\psi}(x_0) = \frac{\mu(x_0) \, \psi_0(x_0)}{\mu(\psi_0)}, \qquad f_t^{\psi}(x_{t-1}, x_t) = \frac{f_t(x_{t-1}, x_t) \, \psi_t(x_t)}{f_t(\psi_t)(x_{t-1})}$$
• Introducing a ψ-controlled observation model
$$Y_t \mid X_{0:T} \sim g_t^{\psi}(X_t, \cdot), \quad t \in [0:T]$$
gives a ψ-controlled state space model
54. Controlled state space model
• Define the controlled observation densities $(g_0^{\psi}, \ldots, g_T^{\psi})$ so that
$$p^{\psi}(x_{0:T} \mid y_{0:T}) = p(x_{0:T} \mid y_{0:T}), \qquad p^{\psi}(y_{0:T}) = p(y_{0:T})$$
• Achieved with
$$g_0^{\psi}(x_0, y_0) = \frac{\mu(\psi_0) \, g_0(x_0, y_0) \, f_1(\psi_1)(x_0)}{\psi_0(x_0)},$$
$$g_t^{\psi}(x_t, y_t) = \frac{g_t(x_t, y_t) \, f_{t+1}(\psi_{t+1})(x_t)}{\psi_t(x_t)}, \quad t \in [1:T-1],$$
$$g_T^{\psi}(x_T, y_T) = \frac{g_T(x_T, y_T)}{\psi_T(x_T)}$$
• Requirements on the policy ψ:
– evaluating $g_t^{\psi}$ must be tractable
– sampling $\mu^{\psi}$ and $f_t^{\psi}$ must be feasible
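One can check the two identities directly: in the product defining the ψ-controlled model, every $\psi_t(x_t)$, every $f_{t+1}(\psi_{t+1})(x_t)$ and the constant $\mu(\psi_0)$ appears once in a numerator and once in a denominator, so
$$\mu^{\psi}(x_0) \prod_{t=1}^{T} f_t^{\psi}(x_{t-1}, x_t) \prod_{t=0}^{T} g_t^{\psi}(x_t, y_t) = \mu(x_0) \prod_{t=1}^{T} f_t(x_{t-1}, x_t) \prod_{t=0}^{T} g_t(x_t, y_t),$$
and the controlled model has the same joint distribution, hence the same smoothing distribution and marginal likelihood.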
58. Neuroscience example
• Gaussian policy for the neuroscience example:
$$\psi_t(x_t) = \exp\left(-a_t x_t^2 - b_t x_t - c_t\right), \qquad (a_t, b_t, c_t) \in \mathbb{R}^3$$
• Both requirements are satisfied since
$$\mu^{\psi} = N(-k_0 b_0, k_0), \qquad f_t^{\psi}(x_{t-1}, \cdot) = N\left(k_t \{\alpha \sigma^{-2} x_{t-1} - b_t\}, \, k_t\right)$$
with $k_0 = (1 + 2a_0)^{-1}$ and $k_t = (\sigma^{-2} + 2a_t)^{-1}$
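A sketch of sampling from these controlled dynamics in Python, using the closed forms above (the coefficient arrays a, b are assumed given, with $1 + 2a_0 > 0$ and $\sigma^{-2} + 2a_t > 0$ so the variances are positive; $c_t$ cancels in the normalized densities):

```python
import numpy as np

def sample_controlled(a, b, alpha=0.99, sigma2=0.11, N=1024, seed=3):
    """Sample N trajectories from the psi-controlled dynamics for the
    Gaussian policy psi_t(x) = exp(-a_t x^2 - b_t x - c_t)."""
    rng = np.random.default_rng(seed)
    T = len(a) - 1
    x = np.empty((T + 1, N))
    k0 = 1.0 / (1.0 + 2.0 * a[0])                       # k_0 = (1 + 2 a_0)^{-1}
    x[0] = rng.normal(-k0 * b[0], np.sqrt(k0), size=N)  # mu^psi = N(-k0 b0, k0)
    for t in range(1, T + 1):
        kt = 1.0 / (1.0 / sigma2 + 2.0 * a[t])          # k_t = (sigma^{-2} + 2 a_t)^{-1}
        mean = kt * (alpha / sigma2 * x[t - 1] - b[t])
        x[t] = rng.normal(mean, np.sqrt(kt))            # f_t^psi(x_{t-1}, .)
    return x
```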
60. Controlled SMC
• Construct ψ-controlled SMC as SMC applied to the ψ-controlled state space model
$$X_0 \sim \mu^{\psi}, \qquad X_t \mid X_{t-1} \sim f_t^{\psi}(X_{t-1}, \cdot), \quad t \in [1:T]$$
$$Y_t \mid X_{0:T} \sim g_t^{\psi}(X_t, \cdot), \quad t \in [0:T]$$
• Unbiased and consistent marginal likelihood estimator
$$\hat{p}^{\psi}(y_{0:T}) = \prod_{t=0}^{T} \frac{1}{N} \sum_{n=1}^{N} g_t^{\psi}(X_t^n, y_t)$$
• Consistent approximation of the smoothing distribution
$$\frac{1}{N} \sum_{n=1}^{N} \varphi(X_{0:T}^n) \to \int \varphi(x_{0:T}) \, p(x_{0:T} \mid y_{0:T}) \, \mathrm{d}x_{0:T}$$
as $N \to \infty$
63. Optimal policy
• The policy $\psi_t^*(x_t) = p(y_{t:T} \mid x_t)$ is optimal: ψ*-controlled SMC gives independent samples from the smoothing distribution,
$$X_{0:T}^n \sim p(x_{0:T} \mid y_{0:T}), \quad n \in [1:N],$$
and a zero-variance estimator of the marginal likelihood for any $N \geq 1$:
$$\hat{p}^{\psi^*}(y_{0:T}) = p(y_{0:T})$$
• (Proposition 1) The optimal policy satisfies the backward recursion
$$\psi_T^*(x_T) = g_T(x_T, y_T),$$
$$\psi_t^*(x_t) = g_t(x_t, y_t) \, f_{t+1}(\psi_{t+1}^*)(x_t), \quad t \in [T-1:0]$$
• The backward recursion is typically intractable but can be approximated
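The recursion is the tower property applied to the backward information filter: conditioning on $x_t$, the current observation $y_t$ is independent of $(x_{t+1}, y_{t+1:T})$, so
$$\psi_t^*(x_t) = p(y_{t:T} \mid x_t) = g_t(x_t, y_t) \int p(y_{t+1:T} \mid x_{t+1}) \, f_{t+1}(x_t, x_{t+1}) \, \mathrm{d}x_{t+1} = g_t(x_t, y_t) \, f_{t+1}(\psi_{t+1}^*)(x_t).$$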
67. Connection to optimal control
• $V_t^* = -\log \psi_t^*$ are the optimal value functions of the Kullback-Leibler control problem
$$\inf_{\psi \in \Psi} \; \mathrm{KL}\left(p^{\psi}(x_{0:T}) \,\middle\|\, p(x_{0:T} \mid y_{0:T})\right)$$
• This connection is useful for both methodology and analysis
Bertsekas & Tsitsiklis (1996). Neuro-dynamic programming.
Tsitsiklis & Van Roy (2001). IEEE Transactions on Neural Networks.
• Methods to approximate the backward recursion are known as approximate dynamic programming (ADP) for finite horizon control problems
70. Approximate dynamic programming
• First run standard SMC to get $(X_0^n, \ldots, X_T^n)$, $n \in [1:N]$
• For time $T$, approximate
$$\psi_T^*(x_T) = g_T(x_T, y_T)$$
by least squares:
$$\hat{\psi}_T = \arg\min_{\xi \in \mathcal{F}} \sum_{n=1}^{N} \left\{\log \xi(X_T^n) - \log g_T(X_T^n, y_T)\right\}^2$$
• For $t \in [T-1:0]$, approximate
$$\psi_t^*(x_t) = g_t(x_t, y_t) \, f_{t+1}(\psi_{t+1}^*)(x_t)$$
by least squares, using $\psi_{t+1}^* \approx \hat{\psi}_{t+1}$:
$$\hat{\psi}_t = \arg\min_{\xi \in \mathcal{F}} \sum_{n=1}^{N} \left\{\log \xi(X_t^n) - \log g_t(X_t^n, y_t) - \log f_{t+1}(\hat{\psi}_{t+1})(X_t^n)\right\}^2$$
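A sketch of one ADP backward pass for the neuroscience model with the Gaussian policy class, fitting the quadratic log-policy by ordinary least squares (our own minimal version: it ignores the positivity constraints on the fitted coefficients that a careful implementation would enforce):

```python
import numpy as np
from scipy.stats import binom

def log_f_psi(x, coef, alpha=0.99, sigma2=0.11):
    """log f_{t+1}(psi_{t+1})(x) in closed form, for Gaussian dynamics and
    psi(x') = exp(-(a x'^2 + b x' + c)); coef = (a, b, c)."""
    a, b, c = coef
    k = 1.0 / (1.0 / sigma2 + 2.0 * a)        # integrated precision inverse
    m = alpha * x
    return (0.5 * np.log(k / sigma2) + 0.5 * k * (m / sigma2 - b) ** 2
            - 0.5 * m ** 2 / sigma2 - c)

def adp_backward(xs, y, trials=50):
    """One backward pass over particles xs of shape (T+1, N) from a prior
    SMC run; fits log psi_t(x) = -(a_t x^2 + b_t x + c_t) by least squares
    and returns the (T+1, 3) array of coefficients."""
    T = xs.shape[0] - 1
    coefs = np.empty((T + 1, 3))
    target = binom.logpmf(y[T], trials, 1.0 / (1.0 + np.exp(-xs[T])))
    for t in range(T, -1, -1):
        X = np.column_stack([xs[t] ** 2, xs[t], np.ones_like(xs[t])])
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        coefs[t] = -beta                      # log psi = -(a x^2 + b x + c)
        if t > 0:                             # next target: log g + log f(psi)
            target = (binom.logpmf(y[t - 1], trials,
                                   1.0 / (1.0 + np.exp(-xs[t - 1])))
                      + log_f_psi(xs[t - 1], coefs[t]))
    return coefs
```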
73. Approximate dynamic programming
• (Proposition 3) Error bounds
$$\mathbb{E}\left\|\hat{\psi}_t - \psi_t^*\right\| \leq \sum_{s=t}^{T} C_{t-1,s-1} \, e_s^N$$
where the $C_{t,s}$ are stability constants and the $e_t^N$ are least squares errors
• (Theorem 1) As $N \to \infty$, $\hat{\psi}$ converges to the idealized ADP
$$\tilde{\psi}_T = \arg\min_{\xi \in \mathcal{F}} \mathbb{E}\left\{\log \xi(X_T) - \log g_T(X_T, y_T)\right\}^2,$$
$$\tilde{\psi}_t = \arg\min_{\xi \in \mathcal{F}} \mathbb{E}\left\{\log \xi(X_t) - \log g_t(X_t, y_t) - \log f_{t+1}(\tilde{\psi}_{t+1})(X_t)\right\}^2$$
75. Neuroscience example: controlled SMC
Marginal likelihood estimates of $\hat{\psi}$-controlled SMC with $N = 128$ ($\alpha = 0.99$, $\sigma^2 = 0.11$).
Variance reduction ≈ 22 times.
[Boxplots: log-marginal likelihood estimates at iterations 0 and 1, on a scale from −3125 to −3105]
We can do better!
77. Policy refinement
• The current policy $\hat{\psi}$ defines the dynamics
$$X_0 \sim \mu^{\hat{\psi}}, \qquad X_t \mid X_{t-1} \sim f_t^{\hat{\psi}}(X_{t-1}, \cdot), \quad t \in [1:T]$$
• Further control these dynamics with a policy $\phi = (\phi_0, \ldots, \phi_T)$:
$$X_0 \sim (\mu^{\hat{\psi}})^{\phi}, \qquad X_t \mid X_{t-1} \sim (f_t^{\hat{\psi}})^{\phi}(X_{t-1}, \cdot), \quad t \in [1:T]$$
• This is equivalent to controlling the model dynamics $\mu$ and $f_t$ with the policy
$$\hat{\psi} \cdot \phi = (\hat{\psi}_0 \cdot \phi_0, \ldots, \hat{\psi}_T \cdot \phi_T)$$
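For the Gaussian policy class of the neuroscience example, the product of two policies stays in the class: multiplying the exponentials simply adds the quadratic coefficients. A sketch with illustrative numbers:

```python
import numpy as np

# log psi(x) = -(a x^2 + b x + c): multiplying policies adds coefficients,
# so the refined policy psi_hat * phi_hat is again a Gaussian-form policy.
psi_hat = np.array([[0.50, -1.0, 0.2],     # one (a, b, c) row per time step
                    [0.30,  0.4, 0.0]])
phi_hat = np.array([[0.05,  0.1, 0.0],
                    [0.02, -0.3, 0.0]])
psi_refined = psi_hat + phi_hat            # coefficients of psi_hat * phi_hat
```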
80. Policy refinement
• (Proposition 1) Optimal refinement $\phi^*$ of the current policy $\hat{\psi}$:
$$\phi_T^*(x_T) = g_T^{\hat{\psi}}(x_T, y_T),$$
$$\phi_t^*(x_t) = g_t^{\hat{\psi}}(x_t, y_t) \, f_{t+1}^{\hat{\psi}}(\phi_{t+1}^*)(x_t), \quad t \in [T-1:0]$$
• Optimal policy $\psi^* = \hat{\psi} \cdot \phi^*$
• Approximate the backward recursion to obtain $\hat{\phi} \approx \phi^*$:
– using particles from $\hat{\psi}$-controlled SMC
– with the same function class $\mathcal{F}$
• Run controlled SMC with the refined policy $\hat{\psi} \cdot \hat{\phi}$
84. Neuroscience example: controlled SMC
Marginal likelihood estimates of controlled SMC at iterations 2 and 3 with $N = 128$ ($\alpha = 0.99$, $\sigma^2 = 0.11$).
Further variance reduction: ≈ 24 times at iteration 2, then ≈ 1.3 times at iteration 3.
[Boxplots: log-marginal likelihood estimates at iterations 0–3, on a scale from −3125 to −3105]
86. Neuroscience example: controlled SMC
Coefficients $(a_t^i, b_t^i)$ of the Gaussian approximation estimated at iteration $i \geq 1$.
[Plots: coefficients $a_t^i$ and $b_t^i$ over time $t \in [0, 3000]$ for iterations 1, 2 and 3]
87. Effect of policy refinement
• Residual from ADP when fitting $\hat{\psi}$:
$$\varepsilon_t(x_t) = \log \hat{\psi}_t(x_t) - \log g_t(x_t, y_t) - \log f_{t+1}(\hat{\psi}_{t+1})(x_t)$$
• Performance of $\hat{\psi}$-controlled SMC is related to $\varepsilon_t$
• The next ADP refinement re-fits the residual (like $L_2$-boosting):
$$\hat{\phi}_t = \arg\min_{\xi \in \mathcal{F}} \sum_{n=1}^{N} \left\{\log \xi(X_t^n) - \varepsilon_t(X_t^n) - \log f_{t+1}^{\hat{\psi}}(\hat{\phi}_{t+1})(X_t^n)\right\}^2$$
Bühlmann & Yu (2003). Journal of the American Statistical Association.
This explains the previous plots!
92. Effect of policy refinement
Coefficients of the policy at time 0 over 30 iterations.
[Plots: the two policy coefficients at time 0 against iteration number, over iterations 0–30]
(Theorem 2) Under regularity conditions:
• policy refinement generates a Markov chain on the space of policies
• this chain converges to a unique stationary distribution
• the stationary distribution can be characterized as $N \to \infty$
96. Outline
1 State space models
2 Sequential Monte Carlo
3 Controlled sequential Monte Carlo
4 Bayesian parameter inference
5 Extensions and future work
97. Neuroscience example: PMMH performance
Trace plots of the particle marginal Metropolis-Hastings chain.
[Trace plots: $\alpha$ over [0.99, 1] and $\sigma^2$ over [0.07, 0.14], across 1000 iterations]
BPF: N = 5529
cSMC: N = 128, 3 refinements
Each iteration takes 2–3 seconds
98. Neuroscience example: PMMH
Autocorrelation function of the particle marginal Metropolis-Hastings chain.
[ACF plots for the two parameters up to lag 50, BPF vs. cSMC]
100,000 iterations with BPF (3 days)
≈ 20,000 iterations with cSMC (1/2 day)
≈ 2,000 iterations with cSMC + parallel computation (1.5 hours)
100. Parallel computation for Markov chain Monte Carlo
Serial MCMC: iterations → ∞
Parallel MCMC: processors → ∞
Jacob, O'Leary & Atchadé (2017). Unbiased Markov chain Monte Carlo with couplings.
Heng & Jacob (2017). Unbiased Hamiltonian Monte Carlo with couplings.
Middleton, Deligiannidis, Doucet & Jacob (2018). Unbiased Markov chain Monte Carlo for intractable target distributions.
101. Outline
1 State space models
2 Sequential Monte Carlo
3 Controlled sequential Monte Carlo
4 Bayesian parameter inference
5 Extensions and future work
102. Static models
• Bayesian model: compute the posterior
$$p(\theta \mid y) \propto p(\theta) \, p(y \mid \theta)$$
or the model evidence $p(y) = \int_\Theta p(\theta) \, p(y \mid \theta) \, \mathrm{d}\theta$
• Introduce artificial tempered posteriors
$$p_t(\theta \mid y) \propto p(\theta) \, p(y \mid \theta)^{\lambda_t}$$
where $0 = \lambda_0 < \cdots < \lambda_T = 1$ (Jarzynski 1997; Neal 2001; Chopin 2004; Del Moral, Doucet & Jasra 2006)
• Optimally controlled SMC gives independent samples from $p(\theta \mid y)$ and a zero-variance estimator of $p(y)$
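In this setting the role of the observation densities is played by the tempering increments, a standard fact for SMC samplers of this type (Del Moral, Doucet & Jasra, 2006): successive targets differ by
$$\frac{p_t(\theta \mid y)}{p_{t-1}(\theta \mid y)} \propto p(y \mid \theta)^{\lambda_t - \lambda_{t-1}},$$
so the potentials are $g_t(\theta) = p(y \mid \theta)^{\lambda_t - \lambda_{t-1}}$, the transitions $f_t$ are MCMC moves leaving the tempered targets invariant, and the controlled construction above applies unchanged.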
105. Static models
Model evidence estimates for a Cox point process model in 900 dimensions.
Variance reduction ≈ 370 times compared to annealed importance sampling (AIS).
[Boxplots: log-evidence estimates at iterations 0–3 and for AIS, on a scale from 440 to 500]
106. Concluding remarks
• Extensions:
– online filtering
– relaxing the requirements on policies
• Controlled sequential Monte Carlo. Heng, Bishop, Deligiannidis & Doucet. arXiv:1708.08396.
• MATLAB code: https://github.com/jeremyhengjm/controlledSMC