Controlled sequential Monte Carlo
Jeremy Heng,
Department of Statistics, Harvard University
Joint work with Adrian Bishop (UTS, CSIRO), George
Deligiannidis & Arnaud Doucet (Oxford)
Computation and Econometrics Workshop
@ The National Art Center, 27 July 2018
Outline
1 State space models
2 Sequential Monte Carlo
3 Controlled sequential Monte Carlo
4 Bayesian parameter inference
5 Extensions and future work
State space models
[Graphical model: latent chain X0, X1, ..., XT with observations Y0, Y1, ..., YT, all depending on a parameter θ]
Latent Markov chain
X0 ∼ µθ, Xt|Xt−1 ∼ ft,θ(Xt−1, ·), t ∈ [1 : T]
Observations
Yt|X0:T ∼ gt,θ(Xt, ·), t ∈ [0 : T]
Neuroscience example
• Joint work with Demba Ba, Harvard School of Engineering
• 3000 measurements yt ∈ {0, . . . , 50} collected from a
neuroscience experiment (Temereanca et al., 2008)
[Figure: observed counts yt plotted against time t = 1, ..., 3000]
Neuroscience example
• Observation model
Yt|Xt ∼ Binomial(50, κ(Xt)), κ(u) = (1 + exp(−u))⁻¹
• Latent Markov chain
X0 ∼ N(0, 1), Xt|Xt−1 ∼ N(αXt−1, σ²)
• Unknown parameters
θ = (α, σ²) ∈ [0, 1] × (0, ∞)
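For concreteness, a minimal Python sketch (not part of the original slides; function and variable names are my own) that simulates this model:

import numpy as np

def simulate_ssm(T, alpha, sigma2, n_trials=50, seed=0):
    # X0 ~ N(0,1), Xt | Xt-1 ~ N(alpha*Xt-1, sigma2), Yt | Xt ~ Binomial(n_trials, kappa(Xt))
    rng = np.random.default_rng(seed)
    x = np.empty(T + 1)
    x[0] = rng.normal(0.0, 1.0)
    for t in range(1, T + 1):
        x[t] = rng.normal(alpha * x[t - 1], np.sqrt(sigma2))
    prob = 1.0 / (1.0 + np.exp(-x))        # kappa(Xt) = logistic(Xt)
    y = rng.binomial(n_trials, prob)       # one observed count per time step
    return x, y

# e.g. a synthetic data set at the parameter values used later in the talk
x, y = simulate_ssm(T=3000, alpha=0.99, sigma2=0.11)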
Objects of interest
• Online estimation: compute filtering distributions
p(xt|y0:t, θ), t ≥ 0
• Offline estimation: compute smoothing distribution
p(x0:T |y0:T , θ), T ∈ N
or marginal likelihood
p(y0:T |θ) = ∫_{X^{T+1}} µθ(x0) ∏_{t=1}^{T} ft,θ(xt−1, xt) ∏_{t=0}^{T} gt,θ(xt, yt) dx0:T
• Parameter inference
arg max_{θ∈Θ} p(y0:T |θ), p(θ|y0:T ) ∝ p(θ)p(y0:T |θ)
Sequential Monte Carlo
• Sequential Monte Carlo (SMC), aka the bootstrap particle filter
(BPF), recursively simulates an interacting particle system of size N
(X_t^1, . . . , X_t^N), t ∈ [0 : T]
• Unbiased and consistent marginal likelihood estimator
p̂(y0:T |θ) = ∏_{t=0}^{T} (1/N) ∑_{n=1}^{N} gt,θ(X_t^n, yt)
• Consistent approximation of smoothing distribution
(1/N) ∑_{n=1}^{N} ϕ(X_{0:T}^n) → ∫ ϕ(x0:T) p(x0:T |y0:T , θ) dx0:T
as N → ∞
Del Moral (2004). Feynman-Kac formulae.
Sequential Monte Carlo
[Diagram: graphical model of the state space model, with the resampled particles marked at each step]
For time t = 0 and particle n ∈ [1 : N]
  sample X_0^n ∼ µθ
  weight W_0^n ∝ g0,θ(X_0^n, y0)
  sample ancestor A_0^n ∼ R(W_0^1, . . . , W_0^N), resampled particle: X_0^{A_0^n}
For time t = 1 and particle n ∈ [1 : N]
  sample X_1^n ∼ f1,θ(X_0^{A_0^n}, ·)
  weight W_1^n ∝ g1,θ(X_1^n, y1)
  sample ancestor A_1^n ∼ R(W_1^1, . . . , W_1^N), resampled particle: X_1^{A_1^n}
Repeat for time t ∈ [2 : T]. Note this is for a given θ!
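The recursion above maps directly onto code. Below is a minimal Python sketch of a bootstrap particle filter for the neuroscience model, with multinomial resampling at every step; it is an illustrative sketch rather than the talk's MATLAB implementation, and all names are my own.

import numpy as np
from scipy.stats import binom

def bootstrap_pf(y, alpha, sigma2, N=1024, n_trials=50, seed=1):
    # Returns one realisation of the estimator log p-hat(y_{0:T} | alpha, sigma2)
    rng = np.random.default_rng(seed)
    T = len(y) - 1
    log_Z = 0.0
    # t = 0: sample from the initial distribution and weight by g_0
    x = rng.normal(0.0, 1.0, size=N)
    logw = binom.logpmf(y[0], n_trials, 1.0 / (1.0 + np.exp(-x)))
    for t in range(1, T + 1):
        # accumulate log of (1/N) sum_n g_{t-1,theta}(X^n_{t-1}, y_{t-1})
        m = logw.max()
        log_Z += m + np.log(np.mean(np.exp(logw - m)))
        # multinomial resampling according to the normalised weights
        w = np.exp(logw - m)
        anc = rng.choice(N, size=N, p=w / w.sum())
        # propagate with the model dynamics (no information from y_t used here)
        x = rng.normal(alpha * x[anc], np.sqrt(sigma2))
        logw = binom.logpmf(y[t], n_trials, 1.0 / (1.0 + np.exp(-x)))
    m = logw.max()
    log_Z += m + np.log(np.mean(np.exp(logw - m)))
    return log_Z

With the synthetic data from the earlier sketch, bootstrap_pf(y, 0.99, 0.11) returns one realisation of log p̂(y0:T |θ).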
Ideal Metropolis-Hastings
We cannot run the ideal Metropolis-Hastings algorithm to sample from
p(θ|y0:T ) ∝ p(θ)p(y0:T |θ)
because the marginal likelihood p(y0:T |θ) is intractable. It would proceed as follows.
For iteration i ≥ 1
1. Sample θ* ∼ q(θ^{(i−1)}, ·)
2. With probability
min{ 1, [p(θ*) p(y0:T |θ*) q(θ*, θ^{(i−1)})] / [p(θ^{(i−1)}) p(y0:T |θ^{(i−1)}) q(θ^{(i−1)}, θ*)] }
set θ^{(i)} = θ*, otherwise set θ^{(i)} = θ^{(i−1)}
Particle marginal Metropolis-Hastings (PMMH)
For iteration i ≥ 1
1. Sample θ* ∼ q(θ^{(i−1)}, ·)
2. Compute p̂(y0:T |θ*) with SMC
3. With probability
min{ 1, [p(θ*) p̂(y0:T |θ*) q(θ*, θ^{(i−1)})] / [p(θ^{(i−1)}) p̂(y0:T |θ^{(i−1)}) q(θ^{(i−1)}, θ*)] }
set θ^{(i)} = θ* and p̂(y0:T |θ^{(i)}) = p̂(y0:T |θ*), otherwise set
θ^{(i)} = θ^{(i−1)} and p̂(y0:T |θ^{(i)}) = p̂(y0:T |θ^{(i−1)})
Andrieu, Doucet & Holenstein (2010). Journal of the Royal Statistical Society: Series B.
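A minimal Python sketch of this loop (not from the slides): the algorithm is generic in the likelihood estimator, so loglik_hat below can be any unbiased estimator, e.g. the bootstrap_pf sketch above. The Gaussian random-walk proposal, step sizes and names are illustrative assumptions.

import numpy as np

def pmmh(loglik_hat, log_prior, theta0, n_iters, step_sizes, seed=2):
    # Particle marginal Metropolis-Hastings with a symmetric random-walk proposal,
    # so the q terms cancel in the acceptance ratio.
    rng = np.random.default_rng(seed)
    step = np.asarray(step_sizes, dtype=float)
    theta = np.asarray(theta0, dtype=float)
    ll = loglik_hat(theta)
    chain = [theta.copy()]
    for _ in range(n_iters):
        prop = theta + step * rng.standard_normal(theta.shape)
        ll_prop = loglik_hat(prop)
        log_accept = (log_prior(prop) + ll_prop) - (log_prior(theta) + ll)
        if np.log(rng.uniform()) < log_accept:
            theta, ll = prop, ll_prop        # keep the estimate attached to the accepted theta
        chain.append(theta.copy())
    return np.array(chain)

# Example wiring for the neuroscience model, theta = (alpha, sigma2):
# log_prior = lambda th: 0.0 if (0 <= th[0] <= 1 and th[1] > 0) else -np.inf
# loglik_hat = lambda th: bootstrap_pf(y, th[0], th[1], N=128)
# chain = pmmh(loglik_hat, log_prior, theta0=[0.95, 0.1], n_iters=1000, step_sizes=[0.002, 0.01])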
Particle marginal Metropolis-Hastings (PMMH)
• Exact approximation:
(1/m) ∑_{i=1}^{m} ϕ(θ^{(i)}) → ∫_Θ ϕ(θ) p(θ|y0:T ) dθ
as m → ∞, for any N ≥ 1
• Performance depends on the variance of p̂(y0:T |θ)
Andrieu & Vihola (2015). The Annals of Applied Probability.
• Account for computational cost to optimize efficiency
Doucet, Pitt, Deligiannidis & Kohn (2015). Biometrika.
Sherlock, Thiery, Roberts & Rosenthal (2015). The Annals of Statistics.
• Proposed method lowers variance of p̂(y0:T |θ) at fixed cost
Neuroscience example: SMC performance
Relative variance of the log-marginal likelihood estimator as a function of σ²,
σ² ↦ Var[ log p̂(y0:T |(α, σ²)) / log p(y0:T |(α, σ²)) ], with α = 0.99 fixed
[Figures: this relative variance plotted over σ² ∈ [0.05, 0.2] for the BPF with N = 1024, N = 2048 and N = 5529, and for N = 5529 vs. controlled SMC]
Mismatch between dynamics and observations
SMC weights samples from the model dynamics
Xt|Xt−1 ∼ N(αXt−1, σ²)
without taking observations into account
[Figures: densities over the latent state illustrating this mismatch]
Optimal dynamics
• Fix θ and suppress notational dependency
• Sampling from smoothing distribution p(x0:T |y0:T )
X0 ∼ p(x0|y0:T ), Xt|Xt−1 ∼ p(xt|xt−1, yt:T ), t ∈ [1 : T]
• Define the backward information filter ψ*_t(xt) = p(yt:T |xt), then
p(x0|y0:T ) = µ(x0) ψ*_0(x0) / µ(ψ*_0)
with µ(ψ*_0) = ∫_X ψ*_0(x0) µ(x0) dx0, and
p(xt|xt−1, yt:T ) = ft(xt−1, xt) ψ*_t(xt) / ft(ψ*_t)(xt−1)
with ft(ψ*_t)(xt−1) = ∫_X ψ*_t(xt) ft(xt−1, xt) dxt
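To see why the transition formula holds, a one-line check (not spelled out on the slide) using the Markov property:
\[
p(x_t \mid x_{t-1}, y_{t:T})
 = \frac{p(x_t, y_{t:T} \mid x_{t-1})}{p(y_{t:T} \mid x_{t-1})}
 = \frac{f_t(x_{t-1}, x_t)\, p(y_{t:T} \mid x_t)}{\int_{\mathsf{X}} f_t(x_{t-1}, x_t')\, p(y_{t:T} \mid x_t')\, \mathrm{d}x_t'}
 = \frac{f_t(x_{t-1}, x_t)\, \psi^*_t(x_t)}{f_t(\psi^*_t)(x_{t-1})},
\]
and similarly p(x0|y0:T ) = µ(x0) ψ*_0(x0) / µ(ψ*_0) at the initial time.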
Controlled state space model
• Given a policy
ψ = (ψ0, . . . , ψT )
i.e. positive and bounded functions
• Construct ψ-controlled dynamics
X0 ∼ µ^ψ, Xt|Xt−1 ∼ f_t^ψ(Xt−1, ·), t ∈ [1 : T]
where
µ^ψ(x0) = µ(x0) ψ0(x0) / µ(ψ0), f_t^ψ(xt−1, xt) = ft(xt−1, xt) ψt(xt) / ft(ψt)(xt−1)
• Introducing a ψ-controlled observation model
Yt|X0:T ∼ g_t^ψ(Xt, ·), t ∈ [0 : T]
gives a ψ-controlled state space model
Controlled state space model
• Define controlled observation densities (g_0^ψ, . . . , g_T^ψ) so that
p^ψ(x0:T |y0:T ) = p(x0:T |y0:T ), p^ψ(y0:T ) = p(y0:T )
• Achieved with
g_0^ψ(x0, y0) = µ(ψ0) g0(x0, y0) f1(ψ1)(x0) / ψ0(x0),
g_t^ψ(xt, yt) = gt(xt, yt) ft+1(ψt+1)(xt) / ψt(xt), t ∈ [1 : T − 1],
g_T^ψ(xT , yT ) = gT (xT , yT ) / ψT (xT )
• Requirements on policy ψ
– Evaluating g_t^ψ tractable
– Sampling µ^ψ and f_t^ψ feasible
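A short check of these identities, not shown on the slide: multiplying the controlled dynamics and controlled observation densities, all ψ terms telescope and we recover the original model,
\[
\mu^{\psi}(x_0)\, g^{\psi}_0(x_0,y_0) \prod_{t=1}^{T} f^{\psi}_t(x_{t-1},x_t)\, g^{\psi}_t(x_t,y_t)
 = \mu(x_0)\, g_0(x_0,y_0) \prod_{t=1}^{T} f_t(x_{t-1},x_t)\, g_t(x_t,y_t),
\]
since each factor f_{t+1}(ψ_{t+1})(x_t) introduced by g_t^ψ cancels the denominator of f_{t+1}^ψ, and µ(ψ0), ψ0, . . . , ψT cancel likewise. Integrating over x0:T gives p^ψ(y0:T ) = p(y0:T ), and dividing by it gives p^ψ(x0:T |y0:T ) = p(x0:T |y0:T ).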
Neuroscience example
• Gaussian policy for neuroscience example
ψt(xt) = exp(−at xt² − bt xt − ct), (at, bt, ct) ∈ R³
• Both requirements satisfied since
µ^ψ = N(−k0 b0, k0), f_t^ψ(xt−1, ·) = N(kt{ασ⁻² xt−1 − bt}, kt)
with k0 = (1 + 2a0)⁻¹ and kt = (σ⁻² + 2at)⁻¹
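The corresponding Gaussian algebra is a few lines of code. A Python sketch (my own, with illustrative names) that samples the ψ-controlled transition and evaluates log ft(ψt)(xt−1), the quantity needed inside the controlled weights; it assumes the log-quadratic policy above and the Gaussian dynamics of the neuroscience model.

import numpy as np

def twisted_transition(x_prev, a_t, b_t, alpha, sigma2, rng):
    # Sample X_t ~ f_t^psi(x_prev, .) for psi_t(x) = exp(-a_t x^2 - b_t x - c_t)
    # applied to f_t(x_prev, .) = N(alpha x_prev, sigma2)
    k_t = 1.0 / (1.0 / sigma2 + 2.0 * a_t)        # twisted variance
    mean = k_t * (alpha * x_prev / sigma2 - b_t)  # twisted mean
    return rng.normal(mean, np.sqrt(k_t))

def log_f_psi_norm(x_prev, a_t, b_t, c_t, alpha, sigma2):
    # log f_t(psi_t)(x_prev) = log of the integral of N(x; alpha x_prev, sigma2) psi_t(x) dx,
    # obtained by completing the square (assumes 1 + 2 a_t sigma2 > 0)
    m = alpha * x_prev
    k_t = 1.0 / (1.0 / sigma2 + 2.0 * a_t)
    eta = m / sigma2 - b_t
    return 0.5 * np.log(k_t / sigma2) + 0.5 * k_t * eta**2 - 0.5 * m**2 / sigma2 - c_t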
Controlled SMC
• Construct ψ-controlled SMC as SMC applied to the ψ-controlled
state space model
X0 ∼ µ^ψ, Xt|Xt−1 ∼ f_t^ψ(Xt−1, ·), t ∈ [1 : T]
Yt|X0:T ∼ g_t^ψ(Xt, ·), t ∈ [0 : T]
• Unbiased and consistent marginal likelihood estimator
p̂^ψ(y0:T ) = ∏_{t=0}^{T} (1/N) ∑_{n=1}^{N} g_t^ψ(X_t^n, yt)
• Consistent approximation of smoothing distribution
(1/N) ∑_{n=1}^{N} ϕ(X_{0:T}^n) → ∫ ϕ(x0:T) p(x0:T |y0:T ) dx0:T
as N → ∞
Optimal policy
• Policy ψ*_t(xt) = p(yt:T |xt) is optimal: ψ*-controlled SMC gives
independent samples from the smoothing distribution
X_{0:T}^n ∼ p(x0:T |y0:T ), n ∈ [1 : N],
and a zero variance estimator of the marginal likelihood for any N ≥ 1
p̂^{ψ*}(y0:T ) = p(y0:T )
• (Proposition 1) Optimal policy satisfies the backward recursion
ψ*_T(xT ) = gT (xT , yT ),
ψ*_t(xt) = gt(xt, yt) ft+1(ψ*_{t+1})(xt), t ∈ [T − 1 : 0]
• Backward recursion typically intractable but can be approximated
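The recursion in Proposition 1 follows by conditioning on the current state; a one-line check:
\[
\psi^*_t(x_t) = p(y_{t:T} \mid x_t)
 = g_t(x_t, y_t) \int_{\mathsf{X}} f_{t+1}(x_t, x_{t+1})\, p(y_{t+1:T} \mid x_{t+1})\, \mathrm{d}x_{t+1}
 = g_t(x_t, y_t)\, f_{t+1}(\psi^*_{t+1})(x_t),
\]
with ψ*_T(xT ) = gT (xT , yT ) at the final time.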
Connection to optimal control
• V*_t = − log ψ*_t are the optimal value functions of the
Kullback-Leibler control problem
inf_{ψ∈Ψ} KL( p^ψ(x0:T ) ‖ p(x0:T |y0:T ) )
• Connection useful for methodology and analysis
Bertsekas & Tsitsiklis (1996). Neuro-dynamic programming.
Tsitsiklis & Van Roy (2001). IEEE Transactions on Neural Networks.
• Methods to approximate the backward recursion are known as
approximate dynamic programming (ADP) for finite
horizon control problems
Approximate dynamic programming
• First run standard SMC to get (X_0^n, . . . , X_T^n), n ∈ [1 : N]
• For time T, approximate
ψ*_T(xT ) = gT (xT , yT )
by least squares
ψ̂_T = arg min_{ξ∈F} ∑_{n=1}^{N} {log ξ(X_T^n) − log gT (X_T^n, yT )}²
• For t ∈ [T − 1 : 0], approximate
ψ*_t(xt) = gt(xt, yt) ft+1(ψ*_{t+1})(xt)
by least squares and ψ*_{t+1} ≈ ψ̂_{t+1}
ψ̂_t = arg min_{ξ∈F} ∑_{n=1}^{N} {log ξ(X_t^n) − log gt(X_t^n, yt) − log ft+1(ψ̂_{t+1})(X_t^n)}²
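With the log-quadratic policy class F = {exp(−a x² − b x − c)}, each minimisation is an ordinary least squares fit of a quadratic in x. A minimal Python sketch of one backward step (illustrative, not from the slides): here target would hold log gt(X_t^n, yt) + log ft+1(ψ̂_{t+1})(X_t^n) evaluated at the particles.

import numpy as np

def fit_log_quadratic(x, target):
    # Least-squares fit of  -a x^2 - b x - c  to `target` over the particles x,
    # i.e. one backward ADP step with the Gaussian policy class.
    A = np.column_stack([-x**2, -x, -np.ones_like(x)])   # design matrix for log xi(x)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    a, b, c = coef
    return a, b, c

# Toy illustration: recover a known quadratic from noisy evaluations
rng = np.random.default_rng(3)
x = rng.normal(size=500)
true_a, true_b, true_c = 0.3, -0.5, 1.0
target = -(true_a * x**2 + true_b * x + true_c) + 0.01 * rng.standard_normal(500)
print(fit_log_quadratic(x, target))   # approximately (0.3, -0.5, 1.0)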
Approximate dynamic programming
• (Proposition 3) Error bounds
E‖ψ̂_t − ψ*_t‖ ≤ ∑_{s=t}^{T} C_{t−1,s−1} e_s^N
where the C_{t,s} are stability constants and the e_t^N are least squares errors
• (Theorem 1) As N → ∞, ψ̂ converges to the idealized ADP
ψ̃_T = arg min_{ξ∈F} E[{log ξ(XT ) − log gT (XT , yT )}²],
ψ̃_t = arg min_{ξ∈F} E[{log ξ(Xt) − log gt(Xt, yt) − log ft+1(ψ̃_{t+1})(Xt)}²]
Neuroscience example: controlled SMC
Marginal likelihood estimates of ψ̂-controlled SMC with N = 128
(α = 0.99, σ² = 0.11)
Variance reduction ≈ 22 times
[Figure: distribution of log-marginal likelihood estimates at iteration 0 (BPF) and iteration 1]
We can do better!
Policy refinement
• Current policy ψ̂ defines the dynamics
X0 ∼ µ^ψ̂, Xt|Xt−1 ∼ f_t^ψ̂(Xt−1, ·), t ∈ [1 : T]
• Further control these dynamics with a policy φ = (φ0, . . . , φT )
X0 ∼ (µ^ψ̂)^φ, Xt|Xt−1 ∼ (f_t^ψ̂)^φ(Xt−1, ·), t ∈ [1 : T]
• Equivalent to controlling the model dynamics µ and ft with the policy
ψ̂ · φ = (ψ̂0 · φ0, . . . , ψ̂T · φT )
Policy refinement
• (Proposition 1) Optimal refinement φ* of the current policy ψ̂ satisfies
φ*_T(xT ) = g_T^ψ̂(xT , yT ),
φ*_t(xt) = g_t^ψ̂(xt, yt) f_{t+1}^ψ̂(φ*_{t+1})(xt), t ∈ [T − 1 : 0]
• Optimal policy ψ* = ψ̂ · φ*
• Approximate the backward recursion to obtain φ̂ ≈ φ*
– using particles from ψ̂-controlled SMC
– same function class F
• Run controlled SMC with the refined policy ψ̂ · φ̂
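For the log-quadratic policies used in the neuroscience example, multiplying policies amounts to adding coefficients, so the refined policy ψ̂ · φ̂ is a coefficient-wise sum; a tiny illustrative helper (my own naming):

def multiply_policies(psi_coeffs, phi_coeffs):
    # Combine two log-quadratic policies psi(x) = exp(-a x^2 - b x - c):
    # their product corresponds to adding the (a, b, c) coefficients at each time.
    return [(a1 + a2, b1 + b2, c1 + c2)
            for (a1, b1, c1), (a2, b2, c2) in zip(psi_coeffs, phi_coeffs)]

Each refinement iteration then runs ψ̂-controlled SMC, fits φ̂ with the ADP regression of the previous sketch applied to the controlled model, and updates ψ̂ ← ψ̂ · φ̂.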
Neuroscience example: controlled SMC
Marginal likelihood estimates of controlled SMC iteration 2 with
N = 128 (α = 0.99, σ² = 0.11)
Further variance reduction ≈ 24 times
[Figure: distribution of log-marginal likelihood estimates at iterations 0, 1 and 2]
Neuroscience example: controlled SMC
Marginal likelihood estimates of controlled SMC iteration 3 with
N = 128 (α = 0.99, σ² = 0.11)
Further variance reduction ≈ 1.3 times
[Figure: distribution of log-marginal likelihood estimates at iterations 0 to 3]
Neuroscience example: controlled SMC
Coefficients (a_t^i, b_t^i) of the Gaussian approximation estimated at
iteration i ≥ 1
[Figures: estimated coefficients a_t (left) and b_t (right) over time t = 0, . . . , 3000 at iterations 1, 2 and 3]
Effect of policy refinement
• Residual from ADP when fitting ψ̂
εt(xt) = log ψ̂t(xt) − log gt(xt, yt) − log ft+1(ψ̂t+1)(xt)
• Performance of ψ̂-controlled SMC related to εt
• Next ADP refinement re-fits the residual (like L2-boosting)
φ̂_t = arg min_{ξ∈F} ∑_{n=1}^{N} {log ξ(X_t^n) − εt(X_t^n) − log f_{t+1}^ψ̂(φ̂_{t+1})(X_t^n)}²
Bühlmann & Yu (2003). Journal of the American Statistical Association.
This explains previous plots!
Effect of policy refinement
Coefficients of the policy at time 0 over 30 iterations
[Figures: the two coefficients of the time-0 policy plotted over 30 refinement iterations]
(Theorem 2) Under regularity conditions
• Policy refinement generates a Markov chain on policy space
• Converges to a unique stationary distribution
• Characterization of the stationary distribution as N → ∞
Neuroscience example: PMMH performance
Trace plots of the particle marginal Metropolis-Hastings chain
[Figures: trace plots of α and σ² over 1000 iterations]
BPF: N = 5529
cSMC: N = 128, 3 refinements
Each iteration takes 2-3 seconds
Neuroscience example: PMMH
Autocorrelation function of the particle marginal Metropolis-Hastings chain
[Figures: autocorrelation functions for the two parameters up to lag 50, comparing BPF and cSMC]
100,000 iterations with BPF (3 days)
≈ 20,000 iterations with cSMC (1/2 day)
≈ 2,000 iterations with cSMC + parallel computation (1.5 hours)
Parallel computation for Markov chain Monte Carlo
Serial MCMC: iterations → ∞
Parallel MCMC: processors → ∞
Jacob, O'Leary & Atchadé (2017). Unbiased Markov chain Monte Carlo with couplings.
Heng & Jacob (2017). Unbiased Hamiltonian Monte Carlo with couplings.
Middleton, Deligiannidis, Doucet & Jacob (2018). Unbiased Markov chain Monte Carlo for intractable target distributions.
Static models
• Bayesian model: compute posterior
p(θ|y) ∝ p(θ)p(y|θ)
or model evidence p(y) = ∫_Θ p(θ)p(y|θ)dθ
• Introduce artificial tempered posteriors
pt(θ|y) ∝ p(θ)p(y|θ)^{λt}
where 0 = λ0 < · · · < λT = 1 (Jarzynski 1997; Neal 2001;
Chopin 2004; Del Moral, Doucet & Jasra 2006)
• Optimally controlled SMC gives independent samples from
p(θ|y) and a zero variance estimator of p(y)
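As a rough sketch of how the tempering path enters the computation, in the spirit of the annealed importance sampling references above rather than the controlled version, the log-density of pt and the incremental importance weight between successive temperatures are:

import numpy as np

def log_tempered(theta, lam, log_prior, log_lik):
    # Unnormalised log-density of p_t(theta|y) proportional to p(theta) p(y|theta)^lambda_t
    return log_prior(theta) + lam * log_lik(theta)

def incremental_log_weight(theta, lam_prev, lam_next, log_lik):
    # AIS-style incremental weight when moving from temperature lam_prev to lam_next:
    # log p_t(theta|y) - log p_{t-1}(theta|y) = (lam_next - lam_prev) * log p(y|theta)
    return (lam_next - lam_prev) * log_lik(theta)

# Example: a linear temperature ladder between the prior (lambda=0) and posterior (lambda=1)
lambdas = np.linspace(0.0, 1.0, 11)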
Static models
Model evidence estimates for a Cox point process model in 900 dimensions
Variance reduction ≈ 370 times compared to AIS
[Figure: distribution of log-evidence estimates at controlled SMC iterations 0 to 3 and for AIS]
Concluding remarks
• Extensions:
– Online filtering
– Relax requirements on policies
• Controlled sequential Monte Carlo.
Heng, Bishop, Deligiannidis & Doucet. arXiv:1708.08396.
• MATLAB code:
https://github.com/jeremyhengjm/controlledSMC