Stratified Monte Carlo and bootstrapping for
approximate Bayesian computation
Umberto Picchini
@uPicchini
Chalmers University of Technology
and University of Gothenburg
Sweden
ABC World Seminar, 7 May 2020
Joint work with Richard Everitt:
P and Everitt (2019) Stratified sampling and bootstrapping for
approximate Bayesian computation, arXiv:1905.07976.
This is ongoing work. Feedback is very welcome.
Umberto Picchini, @uPicchini 2/47
I am going to talk about essentially three main points:
• how to use bootstrap samples to reduce the computational
effort when approximating the ABC likelihood;
• how to reduce the bootstrap-induced bias of the approximated
ABC likelihood using stratified Monte Carlo;
• how to obtain reliable inference with a larger than usual ABC
threshold, via stratified MC.
The above is illustrated using a pseudomarginal ABC-MCMC
approach.
However other samplers could be used.
Umberto Picchini, @uPicchini 3/47
Notation
• We are interested in Bayesian inference for parameters θ;
• the likelihood p(xobs|θ) for data xobs is “unavailable”;
• we assume the ability to simulate pseudo-data x∗ from a
simulator x∗ ∼ p(x|θ);
• we use an ABC approach (approximate Bayesian computation);
• we compute summary statistics sobs = S(xobs) for the data and
s∗ = S(x∗) for the simulated data;
• the key idea in ABC is to give higher weights to summaries s∗
close to sobs and with higher probability accept parameters θ∗
generating s∗ ≈ sobs.
Umberto Picchini, @uPicchini 4/47
A possibility to weight the proximity of s∗ to sobs is to use a kernel
function Kδ(s∗, sobs) and write the augmented posterior

    πδ(θ, s∗ | sobs) ∝ π(θ) × Kδ(s∗, sobs) × p(s∗ | θ).

For example we may use a Gaussian kernel

    Kδ(s∗, sobs) ∝ exp( −(1/(2δ²)) (s∗ − sobs)ᵀ(s∗ − sobs) ).

Other kernels are possible, e.g. Epanechnikov’s kernel.
We can then write the (marginal) ABC posterior as

    πδ(θ | sobs) ∝ π(θ) ∫ Kδ(s∗, sobs) p(s∗ | θ) ds∗,

where the integral is the “ABC likelihood”.
Umberto Picchini, @uPicchini 5/47
The ABC likelihood: ∫ Kδ(s∗, sobs) p(s∗ | θ) ds∗.
This can trivially be approximated unbiasedly via Monte Carlo as [1]

    ∫ Kδ(s∗, sobs) p(s∗ | θ) ds∗ ≈ (1/M) Σ_{r=1}^M Kδ(s∗r, sobs),    s∗r ∼ p(s∗ | θ) iid,

and plugged into a Metropolis-Hastings ABC-MCMC algorithm,
proposing a move θ∗ ∼ q(θ|θ#) and accepting with probability

    min{ 1, [ (1/M) Σ_{r=1}^M Kδ(s∗r, sobs) ] / [ (1/M) Σ_{r=1}^M Kδ(s#r, sobs) ] · π(θ∗)/π(θ#) · q(θ#|θ∗)/q(θ∗|θ#) }.

Problem: if the model simulator is computationally expensive,
having M large is infeasible. Often M = 1 is used.

[1] Lee, Andrieu, Doucet (2012): Discussion of Fearnhead-Prangle 2012, JRSS-B.
Umberto Picchini, @uPicchini 6/47
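For concreteness, here is a minimal Python sketch of the Monte Carlo estimator above with a Gaussian kernel; `simulate` and `summarize` are hypothetical placeholders for the model simulator and the summary-statistics function (illustrative code, not the implementation from the paper).

```python
import numpy as np

def abc_loglik_mc(theta, s_obs, simulate, summarize, M, delta, rng):
    """Log of the unbiased MC estimate (1/M) * sum_r K_delta(s*_r, s_obs),
    where each s*_r is the summary of an independent simulation from p(x|theta)."""
    s_obs = np.atleast_1d(s_obs)
    ns = s_obs.size
    logk = np.empty(M)
    for r in range(M):
        s_star = np.atleast_1d(summarize(simulate(theta, rng)))  # s* ~ p(s|theta)
        d2 = np.sum((s_star - s_obs) ** 2)
        logk[r] = -ns * np.log(delta) - d2 / (2.0 * delta ** 2)  # log Gaussian kernel
    return np.logaddexp.reduce(logk) - np.log(M)                 # log of the average
```

In the acceptance probability above, the ratio of the two ABC-likelihood estimates can then be computed as the exponential of the difference of two such log-estimates.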
So we have the approximated ABC posterior (up to a constant c)

    πδ(θ | sobs) ≈ c · π(θ) · (1/M) Σ_{r=1}^M Kδ(s∗r, sobs),    s∗r ∼ p(s∗ | θ) iid (M times),

where the Monte Carlo average is a non-negative unbiased estimator of the ABC likelihood.
• this makes the algorithm an instance of pseudomarginal MCMC
[Andrieu, Roberts 2009] [2];
• no matter the value of M, the ABC-MCMC will sample exactly
from πδ(θ|sobs);
• typically ABC-MCMC is computationally intensive and a small M
is chosen, say M = 1;
• the lower the M, the higher the variance of the estimate of the ABC
likelihood ∫ Kδ(s∗, sobs) p(s∗ | θ) ds∗;
• the larger the variance, the worse the mixing of the chain (due to
occasional overestimation of the likelihood).

[2] Andrieu, C., and Roberts, G. O. (2009). The pseudo-marginal approach for
efficient Monte Carlo computations. The Annals of Statistics, 37(2), 697-725.
Umberto Picchini, @uPicchini 7/47
Dilemma:
a small M will decrease the runtime considerably, however it will
increase the chance of overestimating the likelihood → possibly
high rejection rates.
Question: is it worth having M > 1 to reduce the variance of the
ABC likelihood, given the higher computational cost?
Bornn et al. 2017 [3] found that no, it is not worth it, and M = 1 is
just fine (when using a uniform kernel).
Basically, using M = 1 is so much faster to run that the decreased
variance obtained with M > 1 is not worth the higher
computational cost.

[3] L. Bornn, N. S. Pillai, A. Smith, and D. Woodard. The use of a single
pseudo-sample in approximate Bayesian computation. Statistics and
Computing, 27(3):583-590, 2017.
Umberto Picchini, @uPicchini 8/47
Can we devise a strategy to cheaply generate many artificial
datasets?
Umberto Picchini, @uPicchini 9/47
Data resampling (nonparametric bootstrap)
In a similar context (based on synthetic likelihood approaches) Everitt
2017 [4] used the following approach:
At any proposed θ
• simulate, say, M = 1 dataset x∗ ∼ p(x|θ);
• sample with replacement from the elements of x∗ to obtain a
bootstrapped dataset (with dimension dim(xobs));
• repeat the resampling R times to obtain x∗1, ..., x∗R bootstrap
datasets from x∗;
• compute the summaries s∗1, ..., s∗R for each resampled dataset;
• compute the ABC likelihood approximation

    (1/R) Σ_{r=1}^R Kδ(s∗r, sobs);

• plug this into ABC-MCMC.

[4] Everitt (2017). Bootstrapped synthetic likelihood. arXiv:1711.05825.
Umberto Picchini, @uPicchini 10/47
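A sketch of the resampling scheme above for iid data, reusing the hypothetical `summarize` placeholder from before; one simulator call is followed by R cheap bootstrap resamples.

```python
import numpy as np

def abc_loglik_bootstrap(x_star, s_obs, summarize, R, delta, rng):
    """x_star is ONE simulated dataset (M = 1). Resample it R times with
    replacement and average the Gaussian kernel over the resampled summaries."""
    x_star = np.asarray(x_star)
    s_obs = np.atleast_1d(s_obs)
    n, ns = x_star.shape[0], s_obs.size
    logk = np.empty(R)
    for r in range(R):
        idx = rng.integers(0, n, size=n)              # bootstrap indices, with replacement
        s_r = np.atleast_1d(summarize(x_star[idx]))   # summary of the resampled dataset
        d2 = np.sum((s_r - s_obs) ** 2)
        logk[r] = -ns * np.log(delta) - d2 / (2.0 * delta ** 2)
    return np.logaddexp.reduce(logk) - np.log(R)      # log of (1/R) sum_r K_delta
```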
This is cheap compared to producing M independent summaries
from the model, when the simulator is computationally intensive.
It reduces the variance of the ABC likelihood compared to using
M = 1 without resampling.
However it is worse than using several (M ≫ 1) independent samples.
Umberto Picchini, @uPicchini 11/47
Example: data is 1000 iid observations from N(θ = 0, 1).
Set Gaussian prior on θ → known analytic posterior. Use sufficient
stats S(xobs) = ¯xobs throughout.
[Figure: ABC posteriors vs the exact posterior. (a) pseudomarginal ABC, M = 500. (b) M = 1 and R = 500 resamples.]
• Left: pseudo-marginal ABC with M = 500 independent
datasets;
• Right: M = 1 and R = 500 resampled datasets
Umberto Picchini, @uPicchini 12/47
What we saw is hardly surprising:
of course resampling introduces additional variability in the
estimator.
Example: true loglikelihood of the summary statistic ¯x
[Figure: true loglikelihood of x̄ for θ ∈ [−0.1, 0.1].]
Umberto Picchini, @uPicchini 13/47
We estimate the ABC loglikelihood at several θ-values in [−0.1, 0.1] with
a very small threshold, δ = 3 × 10⁻⁵:

[Figure: estimated ABC loglikelihoods over θ ∈ [−0.1, 0.1].]

Figure: Solid lines are mean values over 500 repetitions. Dashed lines are
2.5 and 97.5 percentiles.
In black: analytic loglikelihood (indistinguishable).
In red: ABC loglikelihood approximated via M = 500 independent datasets.
Umberto Picchini, @uPicchini 14/47
Let’s zoom in very central values...
[Figure: zoom of the ABC loglikelihoods for θ ∈ [−0.01, 0.01].]
Figure: Solid lines are mean values over 500 repetitions. Dashed lines are
2.5 and 97.5 percentiles.
We notice some bias: this can be reduced by increasing the number of model
simulations M to values ≫ 500.
Umberto Picchini, @uPicchini 15/47
We now bootstrap the M = 1 simulated dataset R = 500 times and compute

    (1/R) Σ_{r=1}^R Kδ(s∗r, sobs).

[Figure: estimated ABC loglikelihoods over θ ∈ [−0.1, 0.1].]

In blue: ABC loglikelihood approximated via M = 1 and R = 500
bootstrapped datasets; large bias + overly variable likelihood.
In red: no bootstrapping and M = 500 independent datasets.
Umberto Picchini, @uPicchini 16/47
Stratified Monte Carlo
Stratified Monte Carlo is a variance reduction technique.
In full generality: want to approximate

    µ = ∫_D f(x) p(x) dx

over some space D, for some function f and density (or probability
mass) function p.
Now partition D into J “strata” D1, ..., DJ:
• ∪_{j=1}^J Dj = D
• Dj ∩ Dj' = ∅ for j ≠ j'
Umberto Picchini, @uPicchini 17/47
Samples from bivariate N2(0, I2)
6 concentric rings and 7 strata.
Each stratum has exactly 3 points sampled from within it.
Better to oversample inside the most “important” strata, where the
integrand has higher mass.
Umberto Picchini, @uPicchini 18/47
Ideally the statistician should decide how many Monte Carlo draws to
sample from each stratum Dj.
• Call this number ñj;
• define ωj := P(X ∈ Dj).
Probabilities ωj should be known.
Then we approximate µ = ∫_D f(x) p(x) dx with

    ˆµstrat = Σ_{j=1}^J (ωj / ñj) Σ_{x∗∈Dj} f(x∗),    x∗ ∼ p(x | x ∈ Dj).

This is the (unbiased) stratified MC estimator.
Variance reduction compared to the vanilla MC estimator can be obtained if
we know how many ñj to sample from each stratum (e.g. the “proportional
allocation method” [5]).

[5] Art Owen (2013), Monte Carlo theory, methods and examples.
Umberto Picchini, @uPicchini 19/47
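Purely for illustration, a sketch of the stratified estimator under the idealized assumptions of this slide: the ωj are known and we can draw directly within each stratum (`sample_in_stratum` is a hypothetical placeholder).

```python
def mu_strat(f, omega, n_tilde, sample_in_stratum, rng):
    """Stratified MC: sum_j (omega_j / n~_j) * (sum of f over the n~_j draws from D_j)."""
    total = 0.0
    for j, (w_j, n_j) in enumerate(zip(omega, n_tilde)):
        draws = [sample_in_stratum(j, rng) for _ in range(n_j)]  # x* ~ p(x | x in D_j)
        total += (w_j / n_j) * sum(f(x) for x in draws)
    return total
```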
However in our setting we can’t assume the ability to simulate
from within a given stratum; so we can’t decide ñj.
And we can’t assume to know anything about ωj := P(X ∈ Dj).
We use a “post-stratification” approach (e.g. Owen 2013 [6]):
• first generate many x∗ ∼ p(x) (i.e. from the model simulator);
• count the number of x∗ ending up in each stratum Dj;
• call these frequencies nj (a random variable).
So these frequencies are known after the simulation is done, not
before.
However we still do not know anything about the ωj = P(X ∈ Dj).
We are going to address this soon within an ABC framework.

[6] Art Owen: “Monte Carlo theory, methods and examples”, 2013.
Umberto Picchini, @uPicchini 20/47
Define strata for ABC
Suppose we have an ns-dimensional summary, i.e. ns = dim(sobs),
and consider the Gaussian kernel

    Kδ(s∗, sobs) = (1/δ^ns) exp( −(1/(2δ²)) (s∗ − sobs)ᵀ(s∗ − sobs) ).

In ABC the µ to approximate via stratified MC is the ABC likelihood

    ∫_D Kδ(s∗, sobs) p(s∗ | θ) ds∗.

So let’s partition D...
Umberto Picchini, @uPicchini 21/47
Define strata for ABC
Example: partition D using three strata:
• D1 = {s∗ s.t. ‖s∗ − sobs‖² < δ/2}
• D2 = {s∗ s.t. ‖s∗ − sobs‖² < δ} \ D1
• D3 = D \ (D1 ∪ D2)
And more explicitly:
• D1 = {s∗ s.t. (s∗ − sobs)ᵀ(s∗ − sobs) ∈ (0, δ/2]}
• D2 = {s∗ s.t. (s∗ − sobs)ᵀ(s∗ − sobs) ∈ (δ/2, δ]}
• D3 = {s∗ s.t. (s∗ − sobs)ᵀ(s∗ − sobs) ∈ (δ, ∞)}.
Because of our resampling approach, for every θ we have R ≫ 1
simulated summaries, say R = 500.
We just need to count how many summaries fall into D1, D2 or D3.
This gives us n1, n2 and n3 = R − (n1 + n2).
Umberto Picchini, @uPicchini 22/47
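A small illustrative helper (my own sketch, not code from the paper): given the R squared distances and the threshold δ, count how many resampled summaries fall into each of the three strata above.

```python
import numpy as np

def strata_counts(d2, delta):
    """d2: array of R squared distances (s*_r - s_obs)' (s*_r - s_obs).
    Returns (n1, n2, n3) for the strata (0, delta/2], (delta/2, delta], (delta, inf)."""
    d2 = np.asarray(d2)
    n1 = int(np.sum(d2 <= delta / 2))
    n2 = int(np.sum((d2 > delta / 2) & (d2 <= delta)))
    n3 = int(d2.size - n1 - n2)
    return n1, n2, n3
```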
[Figure: R = 25 simulated s∗ distributed across three strata around sobs, with n1 = 2, n2 = 13, n3 = 10.]
Umberto Picchini, @uPicchini 23/47
How about the strata probabilities?
We still need to estimate the strata probabilities ωj = P(s∗ ∈ Dj).
This is easy because ωj = ∫_{Dj} p(s∗|θ) ds∗, which we estimate by
another MC simulation.
So
1. simulate once from the model x∗ ∼ p(x|θ);
2. resample R times from x∗ to obtain x∗1, ..., x∗R;
3. compute summaries s∗1, ..., s∗R;
4. obtain distances dr := (s∗r − sobs)ᵀ(s∗r − sobs);

    ˆω1 := (1/R) Σ_{r=1}^R I{dr ≤ δ/2},    ˆω2 := (1/R) Σ_{r=1}^R I{δ/2 < dr ≤ δ},    ˆω3 := 1 − Σ_{j=1}^2 ˆωj.
Umberto Picchini, @uPicchini 24/47
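The ˆωj are obtained by exactly the same counting, applied to the distances from this second simulated-and-resampled dataset; a sketch reusing the hypothetical `strata_counts` helper above.

```python
def strata_prob_hat(d2_other, delta):
    """Estimate (w1, w2, w3) as the fractions of the R resampled summaries
    (from the SECOND simulated dataset) landing in each stratum."""
    n1, n2, n3 = strata_counts(d2_other, delta)   # helper sketched earlier
    R = n1 + n2 + n3
    return n1 / R, n2 / R, n3 / R
```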
We finally have a (biased) estimator of the ABC likelihood using J
strata:

    ˆˆµstrat = Σ_{j=1}^J (ˆωj / nj) Σ_{r∈Dj} Kδ(s∗r, sobs),

with bias due both to resampling and to using the estimated ωj.
All in all, we simulated twice from the model at the given θ: once to
obtain the nj and once to obtain the ˆωj.
Umberto Picchini, @uPicchini 25/47
Notice the above is not quite OK. What if some nj = 0?
(neglected stratum)
In our ABC-MCMC we reject the proposal θ∗ as soon as some nj = 0, so
the actual estimator is

    ˆˆµstrat = [ Σ_{j=1}^J (ˆωj / nj) Σ_{r∈Dj} Kδ(s∗r, sobs) ] · I{nj>0, ∀j}.
This is both a curse and a blessing. We’ll see why...
Umberto Picchini, @uPicchini 26/47
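Putting the pieces together, an illustrative sketch of the estimator with the neglected-stratum indicator: distances from one resampled dataset supply the nj and the kernel values, while the ˆωj come from the other dataset (again my own sketch under these assumptions).

```python
import numpy as np

def mu_hat_strat(d2, omega_hat, delta, ns=1):
    """d2: squared summary distances from the dataset supplying the n_j and
    kernel values; omega_hat: (w1_hat, w2_hat, w3_hat) from the other dataset;
    ns: dimension of the summary statistic. Returns 0 if any stratum is empty."""
    d2 = np.asarray(d2)
    k = delta ** (-ns) * np.exp(-d2 / (2.0 * delta ** 2))   # Gaussian kernel values
    edges = [(-np.inf, delta / 2), (delta / 2, delta), (delta, np.inf)]
    mu = 0.0
    for (lo, hi), w_j in zip(edges, omega_hat):
        in_j = (d2 > lo) & (d2 <= hi)
        n_j = int(in_j.sum())
        if n_j == 0:                                         # neglected stratum
            return 0.0                                       # -> proposal rejected
        mu += (w_j / n_j) * k[in_j].sum()
    return mu
```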
Stratified MC within ABC-MCMC
As usual, we accept a proposal using a MH step:
propose θ∗ ∼ q(θ|θ#) and accept with probability

    1 ∧ [ ˆˆµstrat(θ∗) / ˆˆµstrat(θ#) ] · [ π(θ∗) / π(θ#) ] · [ q(θ#|θ∗) / q(θ∗|θ#) ].

If we accept, set θ# := θ∗ and ˆˆµstrat(θ#) := ˆˆµstrat(θ∗).
Repeat a few thousand times, with

    ˆˆµstrat = [ Σ_{j=1}^J (ˆωj / nj) Σ_{r∈Dj} Kδ(s∗r, sobs) ] · I{nj>0, ∀j}.
Umberto Picchini, @uPicchini 27/47
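A sketch of one such MH iteration, assuming for brevity a symmetric proposal (so the q-ratio cancels); `propose`, `log_prior` and `mu_hat_strat_at` (which would simulate twice, resample and evaluate the estimator above) are hypothetical placeholders.

```python
import numpy as np

def mh_step(theta_cur, mu_cur, propose, log_prior, mu_hat_strat_at, rng):
    """One Metropolis-Hastings step with a symmetric proposal."""
    theta_prop = propose(theta_cur, rng)
    mu_prop = mu_hat_strat_at(theta_prop, rng)   # returns 0 if a stratum is neglected
    if mu_prop > 0.0:
        log_alpha = (np.log(mu_prop) - np.log(mu_cur)
                     + log_prior(theta_prop) - log_prior(theta_cur))
        if np.log(rng.uniform()) < log_alpha:
            return theta_prop, mu_prop           # accept
    return theta_cur, mu_cur                     # reject
```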
Reprising the Gaussian example
Data: 1000 iid observations ∼ N(θ = 0, 1). Gaussian prior →
exact posterior
Red: exact posterior.
Blue: different types of ABC-MCMC posteriors.
[Figure: ABC posteriors (blue) vs the exact posterior (red). (a) pseudomarginal ABC, M = 500, small δ. (b) M = 1 and R = 500 resamples, small δ. (c) M = 1, R = 500 and stratification, large δ.]
With stratification and only M = 1 we get results as good as with
M = 500 (compare left and right).
...and I used a 10x larger δ with the stratified approach!
Umberto Picchini, @uPicchini 28/47
Basically, if the model is simple enough, we can afford using a very
small δ if we have the computational power to run very many
iterations.
Even if the ABC likelihood is badly approximated (highly variable)
and hence chain-mixing is poor, we just need to run many MCMC
iterations.
But what if computational power is limited? If the model is
complex, we can use a larger δ, and employ stratification.
Umberto Picchini, @uPicchini 29/47
Basically when δ is “very small”, ABC likelihood is very
“concentrated” around sobs → most simulated summaries end
up in low probability regions → we often only obtain a poor
ABC-likelihood approximation with high variance.
[Figure: simulated summaries s∗ scattered around sobs, with a small δ-neighbourhood of sobs.]
Umberto Picchini, @uPicchini 30/47
With a larger δ we get more samples in higher probability
regions, and in each stratum, hence a better approximated
(though wider) ABC likelihood.
If we only accept θ∗ when ALL strata are represented, a good
approximation of a “wider ABC likelihood” (larger δ) works better
than a poorly approximated “narrow ABC likelihood” (small δ).
Umberto Picchini, @uPicchini 31/47
Did stratification work to reduce the variance of the likelihood?
Here M = 1 and R = 500 bootstrapped datasets:
[Figure: ABC loglikelihoods over θ ∈ [−0.1, 0.1], with stratification.]
And here we have standard pseudomarginal ABC with M = 500:
[Figure: ABC loglikelihoods over θ ∈ [−0.1, 0.1], pseudomarginal ABC.]
Umberto Picchini, @uPicchini 32/47
The very reason why it works so well is that we necessarily need
at least one sample in each stratum, since
    ˆˆµstrat = [ Σ_{j=1}^J (ˆωj / nj) Σ_{r∈Dj} Kδ(s∗r, sobs) ] · I{nj>0, ∀j},
which crucially implies that the most internal stratum needs a
sample (hence relatively close to sobs).
This is why we need to use a larger δ when implementing
stratification.
Umberto Picchini, @uPicchini 33/47
Further variance reduction
But we can do more:
at each θ∗
• (1) we have simulated once from p(x|θ∗), resampled and
computed the strata frequencies nj;
• (2) we have simulated once more from p(x|θ∗), resampled and
estimated the strata probabilities ωj.
At the same θ∗, swap the roles of the two simulated datasets:
• reuse the simulated data from (1) to estimate the ωj instead;
• reuse the simulated data from (2) to compute the nj instead.
We have now obtained a second likelihood estimate at nearly zero cost: call
this ˆˆµ(2)strat.
Use ¯µstrat = (ˆˆµstrat + ˆˆµ(2)strat)/2 in the MCMC.
Umberto Picchini, @uPicchini 34/47
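An illustrative sketch of the role-swapping trick, reusing the hypothetical helpers sketched earlier: each dataset supplies the counts/kernel values in one estimate and the strata probabilities in the other, and the two estimates are averaged.

```python
def mu_bar_strat(d2_a, d2_b, delta, ns=1):
    """d2_a, d2_b: squared summary distances from the two resampled datasets."""
    mu_1 = mu_hat_strat(d2_a, strata_prob_hat(d2_b, delta), delta, ns)  # a -> n_j, b -> w_j
    mu_2 = mu_hat_strat(d2_b, strata_prob_hat(d2_a, delta), delta, ns)  # roles swapped
    return 0.5 * (mu_1 + mu_2)
```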
Averaging the two likelihoods reduces the likelihood variance
further.
[Figure: Variances for ABC loglikelihoods over θ ∈ [−0.1, 0.1].]
What you see above implies a 50% reduction in the estimated
variance compared to the already good stratified estimator ˆˆµstrat.
More details in the paper...
Umberto Picchini, @uPicchini 35/47
An important point to keep in mind:
• we cannot create as many strata as we want;
• if one stratum is “neglected”, i.e. nj = 0 for some j, then we
have ˆˆµstrat = 0 since
    ˆˆµstrat = [ Σ_{j=1}^J (ˆωj / nj) Σ_{r∈Dj} Kδ(s∗r, sobs) ] · I{nj>0, ∀j};
• this forces us to keep the number of strata “small” (we used 3
strata) to reduce the chance of nj = 0.
• we can also enlarge δ; this is OK if you use stratification with the
estimator above.
Umberto Picchini, @uPicchini 36/47
Further questions
Using a larger δ increases the acceptance rate... however
• resampling gives some computational overhead. Worth it?
Let's see in the next example...
Umberto Picchini, @uPicchini 37/47
Time series
Our methodology is not restricted to iid data.
The next example uses the “block bootstrap” [7], where we resample blocks of
observations for stationary time series. The set of blocks is

    B = { block1, block2, ..., block_{nobs/B} } = { (1 : B), (B + 1 : 2B), ..., (nobs − B + 1 : nobs) }.

• blocks are chosen to be sufficiently large such that they retain the
short-range dependence structure of the data;
• so that a resampled time series, constructed by concatenating
resampled blocks, has similar statistical properties to the real data.
At each θ we resample blocks of indices of the simulated data.

[7] Kunsch, H. R. (1989). The jackknife and the bootstrap for general
stationary observations. The Annals of Statistics, 1217-1241.
Umberto Picchini, @uPicchini 38/47
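A sketch of a simple non-overlapping block-bootstrap resampler consistent with the description above (illustrative code; B is assumed to divide nobs).

```python
import numpy as np

def block_bootstrap(x, B, rng):
    """Resample a time series x (length n_obs along axis 0) by concatenating
    n_obs/B non-overlapping blocks of length B drawn with replacement."""
    x = np.asarray(x)
    n_blocks = x.shape[0] // B                        # assumes B divides n_obs
    starts = rng.integers(0, n_blocks, size=n_blocks) * B
    return np.concatenate([x[s:s + B] for s in starts], axis=0)
```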
2D Lotka-Volterra time series
A predator-prey model with an intractable likelihood (Markov
jump process).
Two interacting species: X1 (# predators) and X2 (# prey).
Populations evolve according to three interactions:
• A prey may be born, with rate θ1X2, increasing X2 by one.
• The predator-prey interaction in which X1 increases by one
and X2 decreases by one, with rate θ2X1X2.
• A predator may die, with rate θ3X1, decreasing X1 by one.
Its solution may be simulated exactly using the “Gillespie
algorithm” [8].

[8] Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical
reactions. The Journal of Physical Chemistry, 81(25), 2340-2361.
Umberto Picchini, @uPicchini 39/47
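For concreteness, a standard Gillespie/SSA sketch for the three predator-prey reactions above (my own illustrative code, not the implementation used in the talk).

```python
import numpy as np

def gillespie_lv(theta, x1_0, x2_0, t_end, rng):
    """Exact simulation of the Lotka-Volterra Markov jump process.
    Reactions: prey birth (rate theta1*X2), predation (rate theta2*X1*X2,
    X1 + 1 and X2 - 1), predator death (rate theta3*X1)."""
    th1, th2, th3 = theta
    t, x1, x2 = 0.0, x1_0, x2_0
    times, states = [t], [(x1, x2)]
    while True:
        rates = np.array([th1 * x2, th2 * x1 * x2, th3 * x1])
        total = rates.sum()
        if total == 0.0:                        # both populations extinct
            break
        t += rng.exponential(1.0 / total)       # waiting time to the next event
        if t > t_end:
            break
        which = rng.choice(3, p=rates / total)  # which reaction fires
        if which == 0:
            x2 += 1                             # prey birth
        elif which == 1:
            x1, x2 = x1 + 1, x2 - 1             # predation
        else:
            x1 -= 1                             # predator death
        times.append(t)
        states.append((x1, x2))
    return times, states
```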
We have 32 observations for each species simulated via Gillespie’s
algorithm.
At each θ we simulate and resample 4 blocks each having size
B = 8.
We want inference for reaction rates (θ1, θ2, θ3).
Set vague priors log θj ∼ U(−6, 2)
Umberto Picchini, @uPicchini 40/47
We run several experiments:
• pseudomarginal ABC-MCMC with M = 2 independent datasets for each
θ;
• ABC-MCMC with M = 2, R = 500 resampled datasets,
allocated across three strata:
    D1 = {s∗ s.t. distance ∈ (0, δ/2]}
    D2 = {s∗ s.t. distance ∈ (δ/2, δ]}
    D3 = {s∗ s.t. distance ∈ (δ, ∞)};
• ABC sequential Monte Carlo (ABC-SMC), with 1,000 particles.
Very expensive, 12 hrs of computation.
We used the very general ABC-SMC with automatically determined
sequence of δ's from Fan-Sisson [9], generalizing Del Moral et al. 2012.

[9] Y. Fan and S. Sisson. ABC samplers, in Handbook of approximate
Bayesian computation. Chapman and Hall/CRC, 2018.
Umberto Picchini, @uPicchini 41/47
We start with ABC-SMC: slow, but safe gold-standard procedure.
After 12 hrs determines that δ = 0.205 is ok (acceptance rate ≈
1%).
                         θ1                    θ2                       θ3                    ESS     accept. rate (%)   runtime (min)   ESS/min   efficiency increase
true parameters          1                     0.005                    0.6
ABC-SMC (δ = 0.205)      0.966 [0.749,1.173]   0.0048 [0.0036,0.0058]   0.587 [0.485,0.711]   –       1                  –               –         –
pmABC (δ = 0.230)        1.085 [0.881,1.401]   0.0052 [0.0043,0.0063]   0.598 [0.475,0.720]   22.4    1.5                16.2            1.38      1
strat. ABC (δ = 1.380)   0.943 [0.742,1.177]   0.0048 [0.0037,0.0060]   0.613 [0.482,0.757]   145.8   7.5                28.7            5.08      3.7
Figure: Lotka-Volterra: Posterior mean and 95% intervals for θ.
For stratified ABC we used a δ 6-times larger than for standard
ABC!
With stratification: inference similar to ABC-SMC, but acceptance
rate way higher than pmABC and better tails exploration.
Umberto Picchini, @uPicchini 42/47
10,000 MCMC iterations is a very small number when targeting a 1%
acceptance rate.
High autocorrelation, potentially bad tails exploration.
Here is an example from parameter θ1.
Without stratification: (δ = 0.23), M = 2
[Figure: trace plot of θ1 over 10,000 iterations, without stratification.]
With stratification: (δ = 1.34), M = 2, R = 500
[Figure: trace plot of θ1 over 10,000 iterations, with stratification.]
Umberto Picchini, @uPicchini 43/47
Low acceptance rate means that typically we are unable to explore
the tails of the posterior adequately.
→ high IAT (integrated autocorrelation time)
→ Low ESS (effective sample size).
Remember, we are assuming an expensive model. So we cannot
just increase the number of iterations ad libitum.
Umberto Picchini, @uPicchini 44/47
Conclusions
• stratified Monte Carlo is effective in reducing resampling bias;
• Allows for precise ABC while using a larger δ;
• smaller variance ABC likelihood → better mixing MCMC;
• Downside: neglected strata may increase rejection rate;
• to reduce chances for neglected strata, initialize the simulation
with bootstrapping without stratification. Run initial iterations
this way to cheaply approach high posterior mass.
• Downside: requires a bootstrapping strategy that makes sense
for the given data;
• more research needed for constructing optimal strata.
Umberto Picchini, @uPicchini 45/47
• Ongoing work: comments most welcome!
• P and Everitt (2019). Stratified sampling and resampling for
approximate Bayesian computation,
https://arxiv.org/abs/1806.05982
picchini@chalmers.se
@uPicchini
Umberto Picchini, @uPicchini 46/47
PhD opportunity
PhD position with me. Deadline 29 May.
Simulation-based Bayesian inference and deep learning for
stochastic modelling
https://tinyurl.com/ru5j9co
Umberto Picchini, @uPicchini 47/47

More Related Content

What's hot

ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapterChristian Robert
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapterChristian Robert
 
CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceChristian Robert
 
Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Christian Robert
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsChristian Robert
 
A nonlinear approximation of the Bayesian Update formula
A nonlinear approximation of the Bayesian Update formulaA nonlinear approximation of the Bayesian Update formula
A nonlinear approximation of the Bayesian Update formulaAlexander Litvinenko
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Christian Robert
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixturesChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussionChristian Robert
 
Approximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts modelApproximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts modelMatt Moores
 
Precomputation for SMC-ABC with undirected graphical models
Precomputation for SMC-ABC with undirected graphical modelsPrecomputation for SMC-ABC with undirected graphical models
Precomputation for SMC-ABC with undirected graphical modelsMatt Moores
 
Final PhD Seminar
Final PhD SeminarFinal PhD Seminar
Final PhD SeminarMatt Moores
 
Convergence of ABC methods
Convergence of ABC methodsConvergence of ABC methods
Convergence of ABC methodsChristian Robert
 

What's hot (20)

ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapter
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapter
 
CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergence
 
Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]Inference in generative models using the Wasserstein distance [[INI]
Inference in generative models using the Wasserstein distance [[INI]
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximations
 
A nonlinear approximation of the Bayesian Update formula
A nonlinear approximation of the Bayesian Update formulaA nonlinear approximation of the Bayesian Update formula
A nonlinear approximation of the Bayesian Update formula
 
ABC workshop: 17w5025
ABC workshop: 17w5025ABC workshop: 17w5025
ABC workshop: 17w5025
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixtures
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
Likelihood-free Design: a discussion
Likelihood-free Design: a discussionLikelihood-free Design: a discussion
Likelihood-free Design: a discussion
 
Karnaugh maps
Karnaugh mapsKarnaugh maps
Karnaugh maps
 
15_representation.pdf
15_representation.pdf15_representation.pdf
15_representation.pdf
 
14_autoencoders.pdf
14_autoencoders.pdf14_autoencoders.pdf
14_autoencoders.pdf
 
10_rnn.pdf
10_rnn.pdf10_rnn.pdf
10_rnn.pdf
 
Approximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts modelApproximate Bayesian computation for the Ising/Potts model
Approximate Bayesian computation for the Ising/Potts model
 
Precomputation for SMC-ABC with undirected graphical models
Precomputation for SMC-ABC with undirected graphical modelsPrecomputation for SMC-ABC with undirected graphical models
Precomputation for SMC-ABC with undirected graphical models
 
Final PhD Seminar
Final PhD SeminarFinal PhD Seminar
Final PhD Seminar
 
Convergence of ABC methods
Convergence of ABC methodsConvergence of ABC methods
Convergence of ABC methods
 

Similar to Stratified Monte Carlo and bootstrapping for approximate Bayesian computation

Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...Umberto Picchini
 
ABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space modelsABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space modelsUmberto Picchini
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksFederico Cerutti
 
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...Huang Po Chun
 
Guided sequential ABC schemes for simulation-based inference
Guided sequential ABC schemes for simulation-based inferenceGuided sequential ABC schemes for simulation-based inference
Guided sequential ABC schemes for simulation-based inferenceUmberto Picchini
 
Low-rank response surface in numerical aerodynamics
Low-rank response surface in numerical aerodynamicsLow-rank response surface in numerical aerodynamics
Low-rank response surface in numerical aerodynamicsAlexander Litvinenko
 
Introduction to Bootstrap and elements of Markov Chains
Introduction to Bootstrap and elements of Markov ChainsIntroduction to Bootstrap and elements of Markov Chains
Introduction to Bootstrap and elements of Markov ChainsUniversity of Salerno
 
Spike sorting: What is it? Why do we need it? Where does it come from? How is...
Spike sorting: What is it? Why do we need it? Where does it come from? How is...Spike sorting: What is it? Why do we need it? Where does it come from? How is...
Spike sorting: What is it? Why do we need it? Where does it come from? How is...NeuroMat
 
Slides econometrics-2017-graduate-2
Slides econometrics-2017-graduate-2Slides econometrics-2017-graduate-2
Slides econometrics-2017-graduate-2Arthur Charpentier
 
Approximate Thin Plate Spline Mappings
Approximate Thin Plate Spline MappingsApproximate Thin Plate Spline Mappings
Approximate Thin Plate Spline MappingsArchzilon Eshun-Davies
 
Markov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing themMarkov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing themPierre Jacob
 

Similar to Stratified Monte Carlo and bootstrapping for approximate Bayesian computation (20)

Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...
 
ABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space modelsABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space models
 
Absorbing Random Walk Centrality
Absorbing Random Walk CentralityAbsorbing Random Walk Centrality
Absorbing Random Walk Centrality
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural Networks
 
AI Lesson 29
AI Lesson 29AI Lesson 29
AI Lesson 29
 
Lesson 29
Lesson 29Lesson 29
Lesson 29
 
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
Divide_and_Contrast__Source_free_Domain_Adaptation_via_Adaptive_Contrastive_L...
 
UNIT III.pptx
UNIT III.pptxUNIT III.pptx
UNIT III.pptx
 
Guided sequential ABC schemes for simulation-based inference
Guided sequential ABC schemes for simulation-based inferenceGuided sequential ABC schemes for simulation-based inference
Guided sequential ABC schemes for simulation-based inference
 
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...
 
Low-rank response surface in numerical aerodynamics
Low-rank response surface in numerical aerodynamicsLow-rank response surface in numerical aerodynamics
Low-rank response surface in numerical aerodynamics
 
Hoip10 articulo surface reconstruction_upc
Hoip10 articulo surface reconstruction_upcHoip10 articulo surface reconstruction_upc
Hoip10 articulo surface reconstruction_upc
 
Introduction to Bootstrap and elements of Markov Chains
Introduction to Bootstrap and elements of Markov ChainsIntroduction to Bootstrap and elements of Markov Chains
Introduction to Bootstrap and elements of Markov Chains
 
Talk 5
Talk 5Talk 5
Talk 5
 
Spike sorting: What is it? Why do we need it? Where does it come from? How is...
Spike sorting: What is it? Why do we need it? Where does it come from? How is...Spike sorting: What is it? Why do we need it? Where does it come from? How is...
Spike sorting: What is it? Why do we need it? Where does it come from? How is...
 
Slides econometrics-2017-graduate-2
Slides econometrics-2017-graduate-2Slides econometrics-2017-graduate-2
Slides econometrics-2017-graduate-2
 
Approximate Thin Plate Spline Mappings
Approximate Thin Plate Spline MappingsApproximate Thin Plate Spline Mappings
Approximate Thin Plate Spline Mappings
 
Markov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing themMarkov chain Monte Carlo methods and some attempts at parallelizing them
Markov chain Monte Carlo methods and some attempts at parallelizing them
 
Calculus Homework Help
Calculus Homework HelpCalculus Homework Help
Calculus Homework Help
 
4th Semester CS / IS (2013-June) Question Papers
4th Semester CS / IS (2013-June) Question Papers 4th Semester CS / IS (2013-June) Question Papers
4th Semester CS / IS (2013-June) Question Papers
 

More from Umberto Picchini

Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Umberto Picchini
 
A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...
A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...
A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...Umberto Picchini
 
Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...
Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...
Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...Umberto Picchini
 
My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...Umberto Picchini
 
Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...Umberto Picchini
 
Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Umberto Picchini
 

More from Umberto Picchini (6)

Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Bayesian inference for mixed-effects models driven by SDEs and other stochast...
Bayesian inference for mixed-effects models driven by SDEs and other stochast...
 
A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...
A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...
A likelihood-free version of the stochastic approximation EM algorithm (SAEM)...
 
Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...
Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...
Inference via Bayesian Synthetic Likelihoods for a Mixed-Effects SDE Model of...
 
My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...
 
Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...
 
Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)
 

Recently uploaded

Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physicsvishikhakeshava1
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxAleenaTreesaSaji
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 

Recently uploaded (20)

Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Work, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE PhysicsWork, Energy and Power for class 10 ICSE Physics
Work, Energy and Power for class 10 ICSE Physics
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 

Stratified Monte Carlo and bootstrapping for approximate Bayesian computation

  • 1. Stratified Monte Carlo and bootstrapping for approximate Bayesian computation Umberto Picchini @uPicchini Chalmers University of Technology and University of Gothenburg Sweden ABC World Seminar, 7 May 2020
  • 2. Joint work with Richard Everitt: P and Everitt (2019) Stratified sampling and bootstrapping for approximate Bayesian computation, arXiv:1905.07976. This is ongoing work. Feedback is very welcome. Umberto Picchini, @uPicchini 2/47
  • 3. I am going to talk of essentially three main points: • how to use bootstrap samples to reduce the computational effort when approximating the ABC likelihood; • how to reduce the bootstrap-induced bias of the approximated ABC likelihood using stratified Monte Carlo. • how to obtain reliable inference with a larger than usual ABC threshold, via stratified MC. The above is illustrated using a pseudomarginal ABC-MCMC approach. However other samplers could be used. Umberto Picchini, @uPicchini 3/47
  • 4. Notation • We are interested in Bayesian inference for parameters θ; • the likelihood p(xobs|θ) for data xobs is “unavailable”; • we assume the ability to simulate pseudo-data x∗ from a simulator x∗ ∼ p(x|θ); • we use an ABC approach (approximate Bayesian computation); • we provide summary statistics sobs = S(xobs) to data and s∗ = S(x∗) to simulated data; • the key idea in ABC is to give higher weights to summaries s∗ close to sobs and with higher probability accept parameters θ∗ generating s∗ ≈ sobs. Umberto Picchini, @uPicchini 4/47
  • 5. A possibility to weight the proximity of s∗ to sobs is to use a kernel funct Kδ(s∗, sobs) and write πδ(θ, s∗ |sobs) augmented posterior ∝ π(θ) × Kδ(s∗ , sobs) × p(s∗ |θ) for example we may use a Gaussian kernel Kδ(s∗ , sobs) ∝ exp(− 1 2δ2 (s∗ − sobs) (s∗ − sobs)). Other kernels are possible, e.g. Epanechnikov’s kernel. We can then write the (marginal) ABC posterior as πδ(θ|sobs) ∝ π(θ) Kδ(s∗ , sobs)p(s∗ |θ)ds∗ ABC likelihood Umberto Picchini, @uPicchini 5/47
  • 6. The ABC likelihood: Kδ(s∗, sobs)p(s∗|θ)ds∗. This can trivially be approximated unbiasedly via Monte Carlo as1 Kδ(s∗ , sobs)p(s∗ |θ)ds∗ ≈ 1 M M r=1 Kδ(s∗r , sobs), s∗r ∼ iid p(s∗ |θ). and plugged in a Metropolis-Hastings ABC-MCMC algorithm, proposing a move θ∗ ∼ q(θ|θ#), and accepting with probability min 1, 1 M M r=1 Kδ(s∗r, s) 1 M M r=1 Kδ(s#r, s) · π(θ∗) π(θ#) · q(θ#|θ∗) q(θ∗|θ#) Problem: if the model simulator is computationally expensive, having M large is unfeasible. Often M = 1 is used. 1 Lee, Andrieu, Doucet (2012): Discussion of Fearnhead-Prangle 2012 JRSS-B. Umberto Picchini, @uPicchini 6/47
  • 7. So we have the approximated ABC posterior (up to a constant c) πδ(θ|sobs) ≈ c · π(θ) · 1 M M r=1 Kδ(s∗r , sobs) non-negative unbiased , s∗r ∼ iid M times p(s∗ |θ). • this makes the algorithm an instance of pseudomarginal MCMC [Andrieu, Roberts 2009]2 • No matter the value of M, the ABC-MCMC will sample exactly from πδ(θ|sobs) • Typically ABC-MCMC is computationally intensive and a small M is chosen, say M = 1; • the lower the M the higher the variance of the estimate of the ABC likelihood Kδ(s∗ , s)p(s∗ |θ)ds∗ • the larger the variance the worse the mixing of the chain (due to occasional overestimation of the likelihood). 2 Andrieu, C., and Roberts, G. O. (2009). The pseudo-marginal approach for efficient Monte Carlo computations. The Annals of Statistics, 37(2), 697-725. Umberto Picchini, @uPicchini 7/47
  • 8. Dilemma: a small M will decrease the runtime considerably, however it will increase the chance to overestimate the likelihood → possibly high-rejection rates. Question: is it worth to have M > 1 to reduce the variance of the ABC likelihood given the higher computational cost? Bornn et al 20173 found that no, it is not worth and M = 1 is just fine (when using a uniform kernel). Basically using M = 1 is so much faster to run that the decreased variance obtained with M > 1 is not worth given the higher computational cost. 3 L. Bornn, N. S. Pillai, A. Smith, and D. Woodard. The use of a single pseudo-sample in approximate Bayesian computation. Statistics and Computing, 27(3):583590, 2017. Umberto Picchini, @uPicchini 8/47
  • 9. Can we devise a strategy to cheaply generate many artificial datasets? Umberto Picchini, @uPicchini 9/47
  • 10. Data resampling (nonparametric bootstrap) In a similar context (based on synthetic likelihood approaches) Everitt 20174 used the following approach: At any proposed θ • simulate say M = 1 datasets x∗ ∼ p(x|θ); • sample with replacement from elements in x∗ to obtain a bootstrapped dataset (with dimension dim(xobs)); • repeat the resampling R times to obtain x∗1 , ..., x∗R bootstrap datasets from x∗ ; • compute the summaries s∗1 , ..., s∗R for each resampled dataset; • compute ABC likelihood approximation 1 R R r=1 Kδ(s∗r , sobs) • plug this into ABC-MCMC 4 Everitt (2017). Bootstrapped synthetic likelihood. arXiv:1711.05825. Umberto Picchini, @uPicchini 10/47
  • 11. This is cheap compared to producing M independent summaries from the model, when the simulator is computationally intensive. It reduces the variance of the ABC likelihood compared to using M = 1 without resampling. However it is worse than using several M 1 independent samples. Umberto Picchini, @uPicchini 11/47
  • 12. Example: data is 1000 iid observations from N(θ = 0, 1). Set Gaussian prior on θ → known analytic posterior. Use sufficient stats S(xobs) = ¯xobs throughout. -0.15 -0.1 -0.05 0 0.05 0.1 0.15 0 2 4 6 8 10 12 14 pseudomarginal ABC Exact Bayes (a) pseudomarginal ABC, M = 500 -0.15 -0.1 -0.05 0 0.05 0.1 0.15 0 2 4 6 8 10 12 14 ABC with resampling Exact Bayes (b) M = 1 and R = 500 resamples • Left: pseudo-marginal ABC with M = 500 independent datasets; • Right: M = 1 and R = 500 resampled datasets Umberto Picchini, @uPicchini 12/47
  • 13. What we saw is hardly surprising: Of course resampling introduces additional variability in the estimand. Example: true loglikelihood of the summary statistic ¯x -0.1 -0.08 -0.06 -0.04 -0.02 0 0.02 0.04 0.06 0.08 0.1 -3 -2 -1 0 1 2 3 Umberto Picchini, @uPicchini 13/47
  • 14. We estimate the ABC loglikelihood at several θ-values in [−0.1, 0.1] with a very small threshold, δ = 3 × 10−5 : -0.1 -0.08 -0.06 -0.04 -0.02 0 0.02 0.04 0.06 0.08 0.1 -3.5 -3 -2.5 -2 -1.5 -1 -0.5 0 0.5 105 Figure: Solid lines are mean values over 500 repetitions. Dashed lines are 2.5 and 97.5 percentiles. In black: analytic loglikelihood (undistinguishable). In red: ABC loglikelihood approximated via M=500 independent datasets. Umberto Picchini, @uPicchini 14/47
  • 15. Let’s zoom in very central values... -0.01 -0.008 -0.006 -0.004 -0.002 0 0.002 0.004 0.006 0.008 0.01 -50 -40 -30 -20 -10 0 10 Figure: Solid lines are mean values over 500 repetitions. Dashed lines are 2.5 and 97.5 percentiles. We notice some bias: this can be reduced by increasing model simulations M to values 500 Umberto Picchini, @uPicchini 15/47
  • 16. We now use bootstrap M = 1 simulated datasets R = 500 times: -0.1 -0.08 -0.06 -0.04 -0.02 0 0.02 0.04 0.06 0.08 0.1 -3.5 -3 -2.5 -2 -1.5 -1 -0.5 0 0.5 106 1 R R r=1 Kδ(s∗r , sobs) In blue: ABC loglikelihood approximated via M=1 and R=500 bootstrapped datasets; large bias+overly variable likelihood In red: no bootstrapping and M=500 independent datasets; Umberto Picchini, @uPicchini 16/47
  • 17. Stratified Monte Carlo Stratified Monte Carlo is a variance reduction technique. In full generality: want to approximate µ = D f(x)p(x)dx over some space D, for some function f and density (or probability mass) function p. Now partition D into J “strata” D1, ..., DJ : • ∪J j=1Dj = D • Dj ∩ Dj = ∅, j = j Umberto Picchini, @uPicchini 17/47
  • 18. Samples from bivariate N2(0, I2) 6 concentric rings and 7 strata. Each stratum has exactly 3 points sampled from within it. Better to oversample inside the most “important” strata, where the integrand has higher mass. Umberto Picchini, @uPicchini 18/47
  • 19. Ideally the statistician should decide how many Monte Carlo draws to sample from each stratum Dj. • Call this number ˜nj; • define ωj := P(X ∈ Dj) Probabilities ωj should be known. Then we approximate µ = D f(x)p(x)dx with ˆµstrat = J j=1 ωj ˜nj x∗∈Dj f(x∗ ) , x∗ ∼ p(x|x ∈ Dj) This is the (unbiased) stratified MC estimator. Variance reduction compared to vanilla MC estimator can be obtained if we know how many ˜nj to sample from each stratum (e.g “proportional allocation method”5 ) 5 Art Owen (2013), Monte Carlo theory, methods and examples. Umberto Picchini, @uPicchini 19/47
  • 20. However in our settings we can’t assume ability to simulate from within a given stratum; so we can’t decide ˜nj. And we can’t assume to know anything about ωj := P(X ∈ Dj). We use a “post stratification” approach (e.g. Owen 2013)6 • first generate many x∗ ∼ p(x) (i.e. from the model simulator); • count the number of x∗ ending up in each stratum Dj; • call these frequencies nj (a random variable); So these frequencies are known after the simulation is done, not before. However we still do not know anything about the ωj = P(X ∈ Dj). We are going to address this soon within an ABC framework. 6 Art Owen: “Monte Carlo theory, methods and examples” 2013. Umberto Picchini, @uPicchini 20/47
  • 21. Define strata for ABC Suppose we have an ns-dimensional summary, i.e. ns = dim(sobs) and consider the Gaussian kernel Kδ(s∗ , sobs) = 1 δns exp − 1 2δ2 (s∗ − sobs) (s∗ − sobs) . In ABC the µ to approximate via stratified MC is the ABC likelihood D Kδ(s∗ , sobs)p(s∗ |θ)ds∗ So lets partition D... Umberto Picchini, @uPicchini 21/47
  • 22. Define strata for ABC Example to partition D using three strata: • D1 = {s∗ s.t. s∗ − sobs < δ/2} • D2 = {s∗ s.t. s∗ − sobs < δ}D1 • D3 = D{D1 ∪ D2} And more explicitly: • D1 = {s∗ s.t. (s∗ − sobs) (s∗ − sobs) ∈ (0, δ/2]} • D2 = {s∗ s.t. (s∗ − sobs) (s∗ − sobs) ∈ (δ/2, δ]} • D3 = {s∗ s.t. (s∗ − sobs) (s∗ − sobs) ∈ (δ, ∞)}. Because of our resampling approach, for every θ we have R 1 simulated summaries, say R = 500. We just need to count how many summaries fall into D1 instead of D2 instead of D3. This give us n1, n2 and n3 = R − (n1 + n2). Umberto Picchini, @uPicchini 22/47
  • 23. n1n1n1 ssdsdddsobs n1=2 n2=13 n3=10 R = 25 simulated s∗ distributed across three strata. Umberto Picchini, @uPicchini 23/47
  • 24. How about the strata probabilities? We still need to estimate the strata probabilities ωj = P(s∗ ∈ Dj). This is easy because ωj = Dj p(s∗|θ)ds∗ which we estimate by another MC simulation. So 1 simulate once from the model x∗ ∼ p(x|θ) 2 resample R times from x∗ to obtain x∗1, ..., x∗R 3 compute summaries s∗1, ..., s∗R 4 obtain distances dr := (s∗r − sobs) (s∗r − sobs) ˆω1 := 1 R R r=1 I{dr≤δ/2}, ˆω2 := 1 R R r=1 I{δ/2<dr≤δ}, ˆω3 := 1 − 2 j=1 ˆωj. Umberto Picchini, @uPicchini 24/47
  • 25. We finally have a (biased) estimator of the ABC likelihood using J strata: ˆˆµstrat = Σ_{j=1}^J (ˆωj/nj) Σ_{r: s∗r∈Dj} Kδ(s∗r, sobs). The bias is due both to resampling and to using estimated ωj. All in all we simulated twice from the model at a given θ: once to obtain the nj and once to obtain the ˆωj. Umberto Picchini, @uPicchini 25/47
  • 26. Notice the above is not quite ok. What if some nj = 0? (a neglected stratum) In our ABC-MCMC we reject the proposal θ∗ as soon as some nj = 0, so the actual estimator is ˆˆµstrat = [Σ_{j=1}^J (ˆωj/nj) Σ_{r: s∗r∈Dj} Kδ(s∗r, sobs)] · I{nj > 0, ∀j}. This is both a curse and a blessing. We’ll see why... Umberto Picchini, @uPicchini 26/47
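Putting the pieces together, a sketch (again with placeholder names) of the estimator with the neglected-stratum indicator built in:

```python
import numpy as np

def gauss_kernel(d_sq, delta, ns):
    # K_delta(s*, sobs) evaluated from the squared distance d_sq = (s*-sobs)'(s*-sobs)
    return np.exp(-0.5 * d_sq / delta ** 2) / delta ** ns

def mu_strat(d1, counts1, omega_hat, delta, ns):
    """d1: squared distances of the R bootstrapped summaries used for the kernel sums;
    counts1: their strata frequencies n_j; omega_hat: estimated strata probabilities."""
    if np.any(counts1 == 0):
        return 0.0                                              # neglected stratum -> zero
    strata = np.digitize(d1, [delta / 2, delta], right=True)    # 0, 1, 2 = stratum index
    mu = 0.0
    for j in range(3):
        mu += omega_hat[j] / counts1[j] * gauss_kernel(d1[strata == j], delta, ns).sum()
    return mu
```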
  • 27. Stratified MC within ABC-MCMC. As usual, we accept a proposal using a MH step: propose θ∗ ∼ q(θ|θ#) and accept with probability 1 ∧ [ˆˆµstrat(θ∗)/ˆˆµstrat(θ#)] · [π(θ∗)/π(θ#)] · [q(θ#|θ∗)/q(θ∗|θ#)]. If we accept, set θ# := θ∗ and ˆˆµstrat(θ#) := ˆˆµstrat(θ∗). Repeat a few thousand times, with ˆˆµstrat = [Σ_{j=1}^J (ˆωj/nj) Σ_{r: s∗r∈Dj} Kδ(s∗r, sobs)] · I{nj > 0, ∀j}. Umberto Picchini, @uPicchini 27/47
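A sketch of one MH iteration using this estimator; `abc_lik_strat`, `log_prior` and `prop_sd` are hypothetical names, and a symmetric random-walk proposal is assumed so the q-ratio cancels:

```python
import numpy as np

def mcmc_step(theta_curr, mu_curr, rng):
    theta_prop = theta_curr + rng.normal(0.0, prop_sd, size=theta_curr.shape)
    mu_prop = abc_lik_strat(theta_prop, rng)      # stratified ABC likelihood estimate
    if mu_prop == 0.0:                            # neglected stratum -> immediate rejection
        return theta_curr, mu_curr
    log_alpha = (np.log(mu_prop) - np.log(mu_curr)
                 + log_prior(theta_prop) - log_prior(theta_curr))
    if np.log(rng.uniform()) < log_alpha:
        return theta_prop, mu_prop                # accept: carry the estimate forward
    return theta_curr, mu_curr                    # reject
```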
  • 28. Reprising the Gaussian example. Data: 1000 iid observations ∼ N(θ = 0, 1). Gaussian prior → exact posterior. [Figure; red: exact posterior, blue: different types of ABC-MCMC posteriors. (a) pseudomarginal ABC, M = 500, small δ; (b) ABC with resampling, M = 1 and R = 500 resamples, small δ; (c) resampling + stratification, M = 1, R = 500, large δ.] With stratification and only M = 1 we get results as good as with M = 500 (compare left and right). ...and I used a 10x larger δ with the stratified approach! Umberto Picchini, @uPicchini 28/47
  • 29. Basically, if the model is simple enough, we can afford to use a very small δ, provided we have the computational power to run very many iterations. Even if the ABC likelihood is badly approximated (highly variable) and hence chain mixing is poor, we just need to run many MCMC iterations. But what if computational power is limited? If the model is complex, we can use a larger δ and employ stratification. Umberto Picchini, @uPicchini 29/47
  • 30. Basically, when δ is “very small”, the ABC likelihood is very “concentrated” around sobs → most simulated summaries end up in low-probability regions → we often only obtain a poor ABC-likelihood approximation with high variance. [Figure: simulated summaries s∗ scattered around sobs, with kernel bandwidth δ.] Umberto Picchini, @uPicchini 30/47
  • 31. With a larger δ we get more samples in higher probability regions, and in each stratum, hence a better approximated (though wider) ABC likelihood. If we only accept θ∗ when ALL strata are represented, a good approximation of a “wider ABC likelihood” (larger δ) works better than a poorly approximated “narrow ABC likelihood” (small δ). Umberto Picchini, @uPicchini 31/47
  • 32. Did stratification work to reduce the variance of the likelihood? Here M = 1 and R = 500 bootstrapped datasets: [Figure: stratified ABC loglikelihood approximation.] And here we have standard pseudomarginal ABC with M = 500: [Figure: pseudomarginal ABC loglikelihood approximation.] Umberto Picchini, @uPicchini 32/47
  • 33. The very reason why it works so well is that we necessarily need at least one sample in each stratum, since ˆˆµstrat = [Σ_{j=1}^J (ˆωj/nj) Σ_{r: s∗r∈Dj} Kδ(s∗r, sobs)] · I{nj > 0, ∀j}, which crucially implies that the innermost stratum needs a sample (hence one relatively close to sobs). This is why we need to use a larger δ when implementing stratification. Umberto Picchini, @uPicchini 33/47
  • 34. Further variance reduction. But we can do more: at each θ∗ • (1) we have simulated once from p(x|θ∗), resampled and computed the strata frequencies nj; • (2) we have simulated once more from p(x|θ∗), resampled and estimated the strata probabilities ωj. At the same θ∗, swap the roles of the two simulated datasets: • reuse the simulated data from (1) to estimate the ωj instead; • reuse the simulated data from (2) to compute the nj instead. We have now obtained a second likelihood at nearly zero cost: call this ˆˆµ(2)strat. Use ¯µstrat = (ˆˆµstrat + ˆˆµ(2)strat)/2 in the MCMC (see the sketch below). Umberto Picchini, @uPicchini 34/47
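A sketch of the role swap, reusing the distances d1, d2 from the two simulations and the hypothetical `mu_strat` helper above:

```python
import numpy as np

# d1, d2: squared distances from simulations (1) and (2); strata boundaries as before
counts1 = np.bincount(np.digitize(d1, [delta / 2, delta], right=True), minlength=3)
counts2 = np.bincount(np.digitize(d2, [delta / 2, delta], right=True), minlength=3)

mu_a = mu_strat(d1, counts1, counts2 / R, delta, ns)   # n_j from (1), omega_j from (2)
mu_b = mu_strat(d2, counts2, counts1 / R, delta, ns)   # roles swapped, nearly free
mu_bar = 0.5 * (mu_a + mu_b)                           # averaged estimator used in the MCMC
```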
  • 35. Averaging the two likelihoods reduces the likelihood variance further. [Figure: variances of the ABC loglikelihoods.] What you see above implies a 50% reduction in the estimated variance compared to the already good stratified estimator ˆˆµstrat. More details in the paper... Umberto Picchini, @uPicchini 35/47
  • 36. An important point to keep in mind: • we cannot create as many strata as we want; • if one stratum is “neglected”, i.e. nj = 0 for some j, then we have ˆˆµstrat = 0, since ˆˆµstrat = [Σ_{j=1}^J (ˆωj/nj) Σ_{r: s∗r∈Dj} Kδ(s∗r, sobs)] · I{nj > 0, ∀j}; • this forces us to keep the number of strata “small” (we used 3 strata) to reduce the chance of some nj = 0; • and to enlarge δ. This is ok if you use stratification with the estimator above. Umberto Picchini, @uPicchini 36/47
  • 37. Further questions. Using a larger δ increases the acceptance rate... however • resampling gives some computational overhead. Is it worth it? Let’s see in the next example... Umberto Picchini, @uPicchini 37/47
  • 38. Time series. Our methodology is not restricted to iid data. The next example uses the “block bootstrap”7, where we resample blocks of observations for stationary time series: B = {(1 : B) [block 1], (B + 1 : 2B) [block 2], ..., (nobs − B + 1 : nobs) [block nobs/B]}. • Blocks are chosen to be sufficiently large that they retain the short-range dependence structure of the data; • so that a resampled time series, constructed by concatenating resampled blocks, has similar statistical properties to the real data. At each θ we resample blocks of indices of the simulated data (a sketch follows below). 7 Kunsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. The Annals of Statistics, 1217-1241. Umberto Picchini, @uPicchini 38/47
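A minimal sketch of non-overlapping block-bootstrap resampling (assuming, for simplicity, that the block size B divides nobs):

```python
import numpy as np

def block_bootstrap(x, B, rng):
    """Resample non-overlapping blocks of length B with replacement and
    concatenate them into a series of the same length as x."""
    n_blocks = len(x) // B
    picks = rng.integers(0, n_blocks, size=n_blocks)        # which blocks to reuse
    return np.concatenate([x[p * B:(p + 1) * B] for p in picks])

# example: resample a length-32 series in 4 blocks of size B = 8
rng = np.random.default_rng(0)
x_resampled = block_bootstrap(np.arange(32.0), 8, rng)
```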
  • 39. 2D Lotka-Volterra time series A predator-prey model with an intractable likelihood (Markov jump process). Two interacting species: X1 (# predators) and X2 (# prey). Populations evolve according to three interactions: • A prey may be born, with rate θ1X2, increasing X2 by one. • The predator-prey interaction in which X1 increases by one and X2 decreases by one, with rate θ2X1X2. • A predator may die, with rate θ3X1, decreasing X1 by one. Its solution may be simulated exactly using the “Gillespie algorithm”8. 8 Gillespie, D. T. (1977). Exact stochastic simulation of coupled chemical reactions. The journal of physical chemistry, 81(25), 2340-2361. Umberto Picchini, @uPicchini 39/47
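For concreteness, a minimal Gillespie simulator for this model (a sketch: the initial state and stopping time below are illustrative choices, not the settings used in the experiments; the rates are the true parameters reported later):

```python
import numpy as np

def gillespie_lv(theta, x1, x2, t_end, rng):
    """Exact simulation of the Lotka-Volterra Markov jump process.
    theta = (theta1, theta2, theta3); x1 = # predators, x2 = # prey."""
    t, path = 0.0, [(0.0, x1, x2)]
    while t < t_end:
        rates = np.array([theta[0] * x2,          # prey birth:     x2 -> x2 + 1
                          theta[1] * x1 * x2,     # predation:      x1 + 1, x2 - 1
                          theta[2] * x1])         # predator death: x1 -> x1 - 1
        total = rates.sum()
        if total == 0.0:                          # no reaction can fire anymore
            break
        t += rng.exponential(1.0 / total)         # exponential waiting time
        j = rng.choice(3, p=rates / total)        # which reaction fires
        if j == 0:
            x2 += 1
        elif j == 1:
            x1, x2 = x1 + 1, x2 - 1
        else:
            x1 -= 1
        path.append((t, x1, x2))
    return np.array(path)

# example run at theta = (1, 0.005, 0.6), illustrative initial state (50, 100)
path = gillespie_lv((1.0, 0.005, 0.6), 50, 100, 30.0, np.random.default_rng(0))
```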
  • 40. We have 32 observations for each species simulated via Gillespie’s algorithm. At each θ we simulate and resample 4 blocks each having size B = 8. We want inference for reaction rates (θ1, θ2, θ3). Set vague priors log θj ∼ U(−6, 2) Umberto Picchini, @uPicchini 40/47
  • 41. We run several experiments: • pseudomarginal ABC-MCMC with M = 2 independent datasets for each θ; • ABC-MCMC with M = 2, R = 500 resampled datasets allocated across three strata: D1 = {s∗ s.t. distance ∈ (0, δ/2]}, D2 = {s∗ s.t. distance ∈ (δ/2, δ]}, D3 = {s∗ s.t. distance ∈ (δ, ∞)}; • ABC sequential Monte Carlo (ABC-SMC) with 1,000 particles. Very expensive, 12 hrs of computation. We used the very general ABC-SMC with an automatically determined sequence of δ’s from Fan-Sisson9, generalizing Del Moral et al. 2012. 9 Y. Fan and S. Sisson. ABC samplers, in Handbook of approximate Bayesian computation. Chapman and Hall/CRC, 2018. Umberto Picchini, @uPicchini 41/47
  • 42. We start with ABC-SMC: slow, but a safe gold-standard procedure. After 12 hrs it determines that δ = 0.205 is ok (acceptance rate ≈ 1%).

Table: Lotka-Volterra: posterior mean and 95% intervals for θ, plus sampling efficiency.

                         θ1                     θ2                        θ3                     ESS     accept. rate (%)   runtime (min)   ESS/min   efficiency increase
true parameters          1                      0.005                     0.6
ABC-SMC (δ = 0.205)      0.966 [0.749, 1.173]   0.0048 [0.0036, 0.0058]   0.587 [0.485, 0.711]   –       1                  –               –
pmABC (δ = 0.230)        1.085 [0.881, 1.401]   0.0052 [0.0043, 0.0063]   0.598 [0.475, 0.720]   22.4    1.5                16.2            1.38      1
strat. ABC (δ = 1.380)   0.943 [0.742, 1.177]   0.0048 [0.0037, 0.0060]   0.613 [0.482, 0.757]   145.8   7.5                28.7            5.08      3.7

For stratified ABC we used a δ 6-times larger than for standard ABC! With stratification: inference similar to ABC-SMC, but an acceptance rate way higher than pmABC and better tails exploration. Umberto Picchini, @uPicchini 42/47
  • 43. 10,000 MCMC iterations is a very small number when targeting a 1% acceptance rate: high autocorrelation and potentially bad tails exploration. Here is an example of trace plots for parameter θ1. Without stratification (δ = 0.23, M = 2): [trace plot]. With stratification (δ = 1.34, M = 2, R = 500): [trace plot]. Umberto Picchini, @uPicchini 43/47
  • 44. Low acceptance rate means that typically we are unable to explore the tails of the posterior adequately. → high IAT (integrated autocorrelation time) → Low ESS (effective sample size). Remember, we are assuming an expensive model. So we cannot just increase the number of iterations ad libitum. Umberto Picchini, @uPicchini 44/47
  • 45. Conclusions • stratified Monte Carlo is effective in reducing resampling bias; • Allows for precise ABC while using a larger δ; • smaller variance ABC likelihood → better mixing MCMC; • Downside: neglected strata may increase rejection rate; • to reduce chances for neglected strata, initialize the simulation with bootstrapping without stratification. Run initial iterations this way to cheaply approach high posterior mass. • Downside: requires a bootstrapping strategy that makes sense for the given data; • more research needed for constructing optimal strata. Umberto Picchini, @uPicchini 45/47
  • 46. • Ongoing work: comments most welcome! • P and Everitt (2019). Stratified sampling and bootstrapping for approximate Bayesian computation, https://arxiv.org/abs/1905.07976 picchini@chalmers.se @uPicchini Umberto Picchini, @uPicchini 46/47
  • 47. PhD opportunity PhD position with me. Deadline 29 May. Simulation-based Bayesian inference and deep learning for stochastic modelling https://tinyurl.com/ru5j9co Umberto Picchini, @uPicchini 47/47