Parallel Markov Chain Monte Carlo
Scott C. Schmidler∗
Department of Statistical Science
Duke University
SAMSI Workshop
December 11, 2017
∗ Joint work with Doug VanDerwerken
Markov chain Monte Carlo integration
A general problem in (esp Bayesian) statistics and statistical
mechanics is calculation of integrals of the form:
$\bar{h}_\pi = E_\pi[h(X)] = \int_{\mathcal{X}} h(x)\,\pi(dx)$
A common, powerful approach is Monte Carlo integration:
$\bar{h} \approx \frac{1}{n}\sum_{i=1}^{n} h(X_i)$ for $X_1, X_2, \ldots, X_n \sim \pi$
When sampling π is difficult, can construct a Markov chain.
MCMC
Many ways to do so: Metropolis-Hastings, Gibbs sampling,
Langevin & Hamiltonian methods, etc.
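For concreteness, a minimal random-walk Metropolis sketch of this recipe (illustrative only; the function names, step size, and toy target are my own, not from the talk):

```python
import numpy as np

def rw_metropolis(log_pi, x0, n_steps, step=0.5, rng=None):
    """Random-walk Metropolis: one generic way to construct such a chain."""
    rng = rng or np.random.default_rng(0)
    x, lp = x0, log_pi(x0)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.standard_normal()
        lp_prop = log_pi(prop)
        # accept with probability min(1, pi(prop)/pi(x))
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Estimate E_pi[h(X)] for pi = N(0,1) and h(x) = x^2 (true value: 1).
draws = rw_metropolis(lambda x: -0.5 * x**2, x0=0.0, n_steps=50_000)
print((draws**2).mean())
```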
Problem: MCMC can be slow
When $X_0, X_1, X_2, \ldots, X_n$ come from a Markov chain, the ergodic average
$\hat{\mu}_h = \frac{1}{n}\sum_{i=1}^{n} h(X_i)$
can converge very slowly.
Mixing time
$\tau_\varepsilon = \sup_{\pi_0} \min\{n : \|\pi_{n'} - \pi\|_{TV} < \varepsilon \;\; \forall\, n' \ge n\}$
where
$\|\pi_n - \pi\|_{TV} = \sup_{A \subset \mathcal{X}} |\pi_n(A) - \pi(A)|$
In problems with multimodality, high dimensions, or simply strong
dependence, mixing times can be very, very long.
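A toy illustration of this definition for a discrete-state chain with transition matrix P (a hypothetical example of mine, not from the talk): two nearly-disconnected "modes" blow the mixing time up by three orders of magnitude.

```python
import numpy as np

def mixing_time(P, pi, eps=0.01, max_n=100_000):
    """Smallest n with sup_x ||P^n(x, .) - pi||_TV < eps (worst-case start)."""
    Pn = np.eye(len(pi))
    for n in range(1, max_n + 1):
        Pn = Pn @ P
        tv = 0.5 * np.abs(Pn - pi).sum(axis=1).max()  # worst case over starts
        if tv < eps:
            return n
    return None

pi = np.array([0.5, 0.5])
fast = np.array([[0.5, 0.5], [0.5, 0.5]])      # well-connected states
slow = np.array([[0.999, 0.001], [0.001, 0.999]])  # two sticky "modes"
print(mixing_time(fast, pi), mixing_time(slow, pi))  # 1 vs. ~1955
```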
Rapid and slow mixing
One way to characterize this is rapid mixing.
Let $(\mathcal{X}^{(d)}, \mathcal{F}^{(d)}, \lambda^{(d)})$ be a sequence of measure spaces, and $\pi^{(d)}$ densities wrt $\lambda^{(d)}$, for $d \in \mathbb{N}$ the problem size.
P is rapidly mixing if $\tau^{(d)}$ is bounded above by a polynomial in d.
P is torpidly mixing if $\tau^{(d)}$ is bounded below by an exponential in d.
Even if the chain is “rapidly” mixing, τ may be impractically large.
Computation is changing
At the same time, the computing landscape has shifted
dramatically.
Moore’s law (exponential growth of processor speed) is dead.
Future growth must come through parallelism:
Multi-core platforms
Cluster computing
Massive parallelism (GPUs)
Cloud computing
Parallel algorithms
Basic idea: Break a problem into pieces that can be solved
independently - preferably asynchronously - and recombined into a
full solution.
Integration (wrt prob measure π): $\int_\Theta h(\theta)\,\pi(d\theta)$
One possibility:
partition the space: $\Theta = \bigcup_{j=1}^{J} \Theta_j$
integrate within each element $\Theta_j$: $\mu_j = \int_{\Theta_j} h(\theta)\pi(\theta)\,d\theta$
sum the results: $\mu = \sum_j \mu_j$
Easily done for grid-based quadrature, but . . .
For fixed accuracy $\varepsilon$, # evals grows exponentially in dim(Θ)
In contrast, Monte Carlo integration "spends" function evals only in relevant parts of Θ. (Hence preferred for d > 8.)
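A quick sanity check of the exponential-growth claim (illustrative arithmetic only): a tensor-product grid with m points per axis costs m^d evaluations, while plain Monte Carlo error O(n^{-1/2}) is dimension-free.

```python
m = 10  # grid points per axis
for d in (2, 4, 8, 16):
    print(f"d={d}: grid needs {m**d:.0e} evaluations")
```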
Parallelization
Our goal: Combine the best of both worlds: expend computation
only in regions of significant probability, while enabling parallel
evaluation in distinct regions.
Quandary: MCMC is an inherently serial algorithm, and the number of steps required may be exponential in dim(Θ).
MCMC is a serial algorithm
MCMC is inherently serial:
Cannot compute $X_t$ without first computing $X_1, X_2, \ldots, X_{t-1}$.
⇒ incompatible with parallelization
What we can do:
Parallelize individual steps (e.g. expensive likelihood calcs)
Proposing moves in parallel, or precomputing acceptance
ratios, for individual steps
Markov chains with natural parallel structure
Parallel tempering
Population MCMC
but . . . such chains have inherent limitations on number of
processors; cannot parallelize component chains
Split ’big data’ and recombine results in ad hoc ways
Particle filtering/SMC
MCMC is a serial algorithm
Moreover, these approaches all require processor synchronization.
Achievable only on dedicated clusters with high-speed connectivity.
Without this, parallelization may yield a slow-down compared to a single processor.
Finally, all require the component (or joint) chains to reach
equilibrium for valid inference.
⇒ Cannot reduce the number of serial steps required.
e.g. Parallel Tempering:
may speed convergence vs single-temp, but . . .
increasing # processors > # temps doesn’t help.
When mixing is slow, e.g. in the presence of multimodality, it may not help
(e.g. Woodard, S., Huber 2009).
These algs fundamentally limited by mixing time of joint process.
Goal of this work
Goal: A procedure that can be applied to any Markov chain Monte
Carlo algorithm (including above methods) to make it parallel, with
the ability to take advantage of as many processors as available:
Asynchronously parallel.
Ideally, linear speedup in # processors.
Not limited by the mixing time of the component chain(s).
Basic idea (not quite what we do)
Given a partition $\Theta = \bigcup_{j=1}^{J} \Theta_j$.
For each j, run an MCMC chain $\theta^{(j)}_1, \ldots, \theta^{(j)}_{n_j}$ on the target distribution restricted to $\Theta_j$:
$\pi_j(\theta) \triangleq \pi(\theta)\,1_{\Theta_j}(\theta)/w_j$
where $w_j = \int_{\Theta_j} \pi(\theta)\,\nu(d\theta)$.
Then for ergodic averages $\hat{\mu}_{j,n} = n_j^{-1}\sum_{i=1}^{n_j} f(\theta^{(i)}_j)$ we have
$\hat{\mu}_{j,n} \longrightarrow E_{\pi_j}(f) = \int_{\Theta_j} f(\theta)\,\pi_j(\theta)\,\nu(d\theta)$
as $n_j \to \infty$, for each $j \in \{1, \ldots, J\}$.
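A minimal sketch of such a restricted chain, assuming a random-walk Metropolis kernel that simply rejects proposals leaving Θ_j (an illustration of mine, not necessarily the talk's implementation); note that w_j is never needed, since proposals outside the region have density zero:

```python
import numpy as np

def restricted_metropolis(log_g, in_region, x0, n_steps, step=0.3, rng=None):
    """RW-Metropolis targeting pi_j ∝ g(x) 1_{Θ_j}(x): proposals leaving
    Θ_j are always rejected, so the chain never leaves the element."""
    rng = rng or np.random.default_rng(1)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    assert in_region(x)
    lg = log_g(x)
    chain = np.empty((n_steps, x.size))
    for i in range(n_steps):
        prop = x + step * rng.standard_normal(x.size)
        if in_region(prop):                       # 1_{Θ_j}(prop) = 1
            lg_prop = log_g(prop)
            if np.log(rng.uniform()) < lg_prop - lg:
                x, lg = prop, lg_prop
        chain[i] = x
    return chain
```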
Combining the chains
If we can also construct estimators for the weights, $\hat{w}_{j,n} \to w_j$,
then the combined estimator
$\hat{\mu}_n = \sum_{j=1}^{J} \hat{w}_{j,n}\,\hat{\mu}_{j,n} \longrightarrow \mu = E_\pi(f)$
If the $\hat{\mu}_{j,n}$'s and $\hat{w}_{j,n}$'s are unbiased and independent, then $\hat{\mu}_n$ is unbiased.
Notice:
Need only ˆµj,n’s and ˆwj,n’s to converge, not the chains!
Requires only that each chain mix locally
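A sketch of the combination step (hypothetical inputs; renormalizing the estimated weights to sum to one is my own convention):

```python
import numpy as np

def combined_estimate(mu_hat, w_hat):
    """Combine per-element averages mu_hat[j] with weight estimates w_hat[j]."""
    w = np.asarray(w_hat, dtype=float)
    return float(np.dot(w / w.sum(), mu_hat))

# e.g. two elements with estimated weights .3/.7 and local means 1.0 / -2.0:
print(combined_estimate([1.0, -2.0], [0.3, 0.7]))   # -1.1
```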
Estimating the weights
Let $g(\theta)$ be the unnormalized target density, i.e. $\pi(\theta) = g(\theta)/c$.
Estimating $c_j = \int_{\Theta_j} g(\theta)\,\nu(d\theta)$ is equivalent to estimating the normalizing constant of the density $g_j(\theta) = g(\theta)\,1_{\Theta_j}(\theta)$.
Many techniques available (but requires care).
Then form
$\hat{w}_{j,n} = \sum_{i=1}^{n} \hat{c}^{(i)}_j \Big/ \sum_{i=1}^{n} \sum_{k=1}^{J} \hat{c}^{(i)}_k$
which is consistent (but not unbiased) for $w_j$.
Other ratio estimators may improve efficiency (Tin 1965), allowing reduction in n.
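A sketch of this ratio estimator, assuming the replicated estimates are arranged in an (n × J) array (names are illustrative):

```python
import numpy as np

def weight_estimates(c_hat):
    """c_hat: (n, J) array of replicated estimates c-hat^(i)_j.
    Returns w-hat_j = sum_i c^(i)_j / sum_{i,k} c^(i)_k
    (consistent, but not unbiased)."""
    c_hat = np.asarray(c_hat, dtype=float)
    return c_hat.sum(axis=0) / c_hat.sum()
```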
Estimating the weights
Approach 1: Markov chain output
Estimate cj directly from MCMC trajectories.
HME (Newton & Raftery 1994, Raftery et al. 2007)
Chib’s method (Chib 1995, Chib & Jeliazkov 2001)
Bridge/path sampling (Meng & Wong 1996, Gelman & Meng
1998, Meng & Schilling 2002).
Note: the restriction to Θj helps avoid problems (e.g. Wolpert & S. 2012).
Estimating the weights
Approach 2: Adaptive importance sampling
Construct an approximation $q_j$ to $\pi_j$ from MCMC draws:
$t(m_j, S_j)$ distribution for sample mean $m_j$, covariance $S_j$
Adaptive mixture of t-distributions (Ji & S. 2013, Wang & S. 2013)
Draw $\theta_t \overset{iid}{\sim} q_j$ to get the unbiased IS estimate
$\hat{c}_j = T^{-1} \sum_{t=1}^{T} g(\theta_t)\,1_{\Theta_j}(\theta_t)/q_j(\theta_t)$
Again, $q_j$ need only approximate π locally on $\Theta_j$, so $\lambda^*_j = \sup \pi_j/q_j$ is much smaller.
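A sketch of this IS estimator, assuming a multivariate-t proposal via scipy.stats.multivariate_t (scipy ≥ 1.6); function and argument names are illustrative:

```python
import numpy as np
from scipy.stats import multivariate_t

def c_hat_is(log_g, in_region, m, S, T=1000, df=4, rng=None):
    """Unbiased IS estimate of c_j = ∫_{Θ_j} g dν with a fitted t_df(m, S)
    proposal (heavy tails help keep the importance weights' variance finite)."""
    rng = rng or np.random.default_rng(2)
    q = multivariate_t(loc=m, shape=S, df=df)
    theta = np.atleast_2d(q.rvs(size=T, random_state=rng))
    keep = np.array([in_region(t) for t in theta])   # indicator 1_{Θ_j}
    w = np.zeros(T)
    if keep.any():
        logg = np.array([log_g(t) for t in theta[keep]])
        w[keep] = np.exp(logg - q.logpdf(theta[keep]))
    return w.mean()
```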
Estimating the weights
Approach 2: Adaptive importance sampling (cont’d)
More generally, may use a sequence of distributions $q_{j,t}$.
Markov chain: $\theta_t \mid \theta_{t-1} \sim q_j(\theta_t \mid \theta_{t-1})$
Adaptive MIS chain (Ji & S. 2013, Wang & S. 2013)
'Sample' ('trajectory') denotes independent (conditional) draws.
Averaging n independent $\hat{c}_j$'s decreases the variance as $n^{-1}$.
A pseudo-marginal approach (Andrieu & Roberts 2009) using these techniques is significantly less efficient.
Mixture of normals
Consider a simple mixture of 2 normals:
π(z) =
1
2
NM(z; −1M, σ2
1IM) +
1
2
NM(z; 1M, σ2
2IM)
Upper bounds on spectral gap (WSH07a,b) yield:
Thm: RW-MH is torpidly mixing.
Thm: Tempering is torpidly mixing for σ1 = σ2.
Lower bounds on hitting times obtained by (SW10) yield:
Thm: Equi-energy sampler torpidly mixing for σ1 = σ2.
Thm: Haario adaptive RW kernel torpidly mixing for σ1 = σ2.
Towards some theory
However, if the partition $\Theta = \bigcup_{j=1}^{J} \Theta_j$ is such that:
the $\Theta_j$'s are convex
$\pi_j$ is log-concave for $j = 1, \ldots, J$,
then
$\pi_j$ can be sampled in polynomial time (Frieze, Kannan, et al.)
$c_j$ can be estimated in polynomial time (Lovász, Vempala)
+ some additional technical restrictions gives:
⇒ we can sample π and approximate $E_\pi(h(X))$ in polynomial time
. . . assuming we can initialize within the basins of attraction in poly time!
(VanDerwerken & S., 2015)
FPRAS for mixture-of-normals
Theorem
Under the above conditions, the PMCMC algorithm returns a sample in time O(poly(d)) from a distribution $\hat{\pi}$ for which $\|\hat{\pi} - \pi^*\|_{TV} \le \varepsilon$, with probability at least $1 - \delta$.
HPD region of modes sampled in poly-time
Use samples to estimate an HPD hyperellipsoid $B_j$ at each mode, where π is log-concave on $B_j$
Apply log-concave integration
Similar result allows construction of a rapidly mixing MIS chain
using adaptive mixture IS instead (VanDerwerken & S., 2015)
FPRAS for mixture-of-normals
Note: exponentially faster than estimating the transition matrix as in MD.
Shows the problem's difficulty is finding the modes, not mixing between them. (Hard even in the normal problem?)
Currently exploring limits of generalizability.
Problems with Approach #1
This approach has some shortcomings:
1. Requires # chains (processors) equal to the partition size, which could be exponential in dim(Θ).
2. Where does the partition come from?
3. The restriction to $\pi_j$ requires rejection; makes evaluating the transition density hard for the $\hat{w}_j$'s.
4. The restriction could slow down mixing of the chains.
Solution
No need for a 1-to-1 correspondence between chains and estimators.
For L independent chains, let
$\hat{\mu}_{j,n} = n_j^{-1} \sum_{l=1}^{L} \sum_{k=1}^{K_l} f(\theta_{lk})\,1_{\Theta_j}(\theta_{lk})$
where $n_j = \sum_{l=1}^{L} \sum_{k=1}^{K_l} 1_{\Theta_j}(\theta_{lk})$ is the # of draws in $\Theta_j$ from any chain.
L can be much smaller; need not be exponential in dim(Θ).
⇒ Chains are unrestricted; they can cross between partition elements.
The partition is imposed on the samples after the fact.
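A sketch of this pooled estimator, assuming partition labels have already been assigned to the pooled draws (names are illustrative):

```python
import numpy as np

def pooled_element_averages(draws, labels, f, J):
    """Pool draws from all L unrestricted chains; labels[i] in {0..J-1}
    gives the partition element containing draws[i], assigned post hoc."""
    fx = np.array([f(x) for x in draws])
    labels = np.asarray(labels)
    n_j = np.bincount(labels, minlength=J)   # draws in Θ_j from any chain
    mu_hat = np.array([fx[labels == j].mean() if n_j[j] > 0 else np.nan
                       for j in range(J)])
    return mu_hat, n_j
```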
Adaptive partitioning
Still need a partition.
Key: Must not grow exponentially in dim(Θ).
PACE clustering algorithm (VanDerwerken & S., 2013)
Let $x^{(j)}_t$ denote draw t from chain j, and $\mathcal{X}_i$ the set of draws available at iteration i.
1. Define $x^*_i = \arg\max_{x^{(j)}_t \in \mathcal{X}_i} \log \pi(x^{(j)}_t)$.
2. Assign all draws lying in $B_\varepsilon(x^*_i)$ to $C_i$, and set $\mathcal{X}_{i+1} = \mathcal{X}_i \setminus C_i$.
3. Repeat (1)-(2) until a $1 - \alpha$ fraction of the draws is clustered (e.g. 98%).
4. Reallocate all draws to the nearest cluster (Voronoi).
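A sketch of these four steps, assuming Euclidean balls B_ε and a 2-D array of pooled draws (an illustration of mine, not the authors' code):

```python
import numpy as np

def pace(draws, log_pi_vals, eps, alpha=0.02):
    """PACE sketch: peel off the eps-ball around the highest-density
    remaining draw, then Voronoi-reallocate everything to the centers."""
    draws = np.asarray(draws, dtype=float)          # shape (n, d)
    log_pi_vals = np.asarray(log_pi_vals, dtype=float)
    remaining = np.arange(len(draws))
    centers = []
    while len(remaining) > alpha * len(draws):      # step 3: loop until 1-alpha clustered
        best = remaining[np.argmax(log_pi_vals[remaining])]        # step 1
        centers.append(draws[best])
        d = np.linalg.norm(draws[remaining] - draws[best], axis=1)
        remaining = remaining[d > eps]                              # step 2: remove B_eps
    centers = np.asarray(centers)
    dist = np.linalg.norm(draws[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1), centers                             # step 4: Voronoi
```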
Examples: Multimodal target
Mixture of normals:
$\pi(x) = \sum_{k=1}^{4} w_k\,N(\mu_k, \Sigma_k)$
Weights: $w = (0.02, 0.20, 0.20, 0.58)$
Means: $\mu_1 = (3, 3)$, $\mu_2 = (7, -3)$, $\mu_3 = (2, 7)$, and $\mu_4 = (-5, 0)$
Covariances:
$\Sigma_1 = \begin{pmatrix} 1 & .2 \\ .2 & 1 \end{pmatrix}$,
$\Sigma_2 = \begin{pmatrix} 2 & -.5 \\ -.5 & .5 \end{pmatrix}$,
$\Sigma_3 = \begin{pmatrix} 1.3 & .3 \\ .3 & .4 \end{pmatrix}$,
$\Sigma_4 = \begin{pmatrix} 1 & 1 \\ 1 & 2.5 \end{pmatrix}$
Multimodal example
Langevin diffusion:
$d\theta_t = \frac{\sigma^2}{2}\,\nabla \log \pi(\theta_t)\,dt + \sigma\,dW_t$
10 chains initialized uniformly
25k iterations each, in parallel on 10 processors
Cluster the first 1k draws after 250 burn-in ⇒ 7-element partition.
In parallel, 1 processor per element (7 total) each generated:
$n \approx 5000$ trajectories of length T = 5, and corresponding $\hat{c}_j$'s,
initialized $\overset{iid}{\sim} t_4(m_j, J^{-1}(\theta))$
$t_4$ perturbations instead of Gaussian to ensure $\mathrm{var}(\hat{c}_j) < \infty$
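The talk simulates the diffusion itself; as a sketch, an Euler–Maruyama (unadjusted Langevin) discretization, with a step size dt of my own choosing and discretization bias ignored:

```python
import numpy as np

def langevin_chain(grad_log_pi, x0, n_steps, sigma=1.0, dt=0.01, rng=None):
    """Euler-Maruyama discretization of
    d theta = (sigma^2/2) grad log pi(theta) dt + sigma dW."""
    rng = rng or np.random.default_rng(3)
    x = np.asarray(x0, dtype=float)
    path = np.empty((n_steps, x.size))
    for i in range(n_steps):
        x = (x + 0.5 * sigma**2 * dt * grad_log_pi(x)
               + sigma * np.sqrt(dt) * rng.standard_normal(x.size))
        path[i] = x
    return path
```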
Multimodal example
[Figure: Clustering of 10 chains initialized uniformly within dashed lines; clusters numbered 1-7. Ellipses show 95% contours for the component densities of the target.]
Multimodal example
Estimated weights: [0.02, 0.23, 0.20, 0.55].
(True weights: [0.02, 0.20, 0.20, 0.58])
Using the AIS approach instead:
5000 $\hat{c}_j$'s from samples of size T = 5 from the $t_4$ distribution
requires 18s vs. 90s for simulating diffusions
Clustering + IS takes < 1/2 the time of the parallel 25k chains, so the weights are estimated in parallel before sampling is complete.
Estimated partition weights:
$\hat{w}_7 = [.378, .201, .201, .105, .020, .093, .002]$
Estimated component weights (nearly exact):
$\hat{w} = [.020, .201, .201, .578]$
Multimodal example: higher dimensions
Harder example:
p = 10 dimensions
4 component means drawn uniformly on $(-10, 10)^p$
Random covariance matrices $L^T L$ with $L \sim MN_{p \times p}(0, I_p, I_p)$
Weights ∼ Dirichlet(1, 1, 1, 1)
- 20 parallel r.w. Metropolis chains, 100k iterations each
- Proposal scales tuned adaptively during the first 1k iterations
- Next 49k draws clustered ⇒ 4 partition elements
- IS using $t_4(m_j, S_j)$ for cluster center $m_j$, empirical covariance $S_j$, T = 100, n = 1000
Results:
$d_{TV}(\hat{w}, w) = .0024$, $\|\hat{\mu} - E(X)\|_{L_1} = 0.17$
Multimodal example: higher dimensions
Sensitivity to partitioning:
Repeating with a different $\varepsilon$ gives an 8-element partition:
$d_{TV}(\hat{w}, w) = .0074$, $\|\hat{\mu} - E(X)\|_{L_1} = 0.12$
More, smaller weights to estimate, but better mixing within (smaller) partition elements.
Since successfully repeated in 50 and 100 dimensions.
Multimodal example: higher dimensions
p = 50:
2 components: w = [0.1, 0.9]
Random means ∼ $U(-10, 10)^p$; covariances $L^T L$ for $L \sim MN_{p \times p}(0, I_p, I_p)$.
Parallel MCMC:
14 chains, initialized uniformly
Normal RW-MH with adaptive covariance tuned during a 100k-iteration burn-in
2M post-burn-in draws each, thinned to 1000 draws.
Partition size: 2 ($2 \ll 2^p$)
AIS using $t_4(m_j, \Sigma_j)$ (5M draws): $\hat{w} = [.101, .899]$
Pooling the chains directly gives $\tilde{w} = [.210, .790]$, as 3 chains happen to get stuck in mode 1 and 11 in mode 2.
Beyond multimodality
Parallelization easily visualized for multimodal problems, but our
approach is completely general.
What about other types of slowly-mixing chains?
E.g. component-wise chains with strong dependence between dimensions (such as correlated Gibbs samplers).
Example: Probit regression
Probit regression model:
Assigns probabilities $1 - \Phi(\beta X)$, $\Phi(\beta X)$ to response $Y \in \{0, 1\}$ for covariate X.
Posterior:
$\pi_0(\beta) \prod_{i=1}^{n} \Phi(\beta X_i)^{y_i}\,\{1 - \Phi(\beta X_i)\}^{1 - y_i}$
Data: N = 2000 pairs simulated with $X \sim \mathrm{Bern}(\tfrac{1}{2})$, $\beta = 5/\sqrt{2}$.
Diffuse prior: $\pi_0(\beta) = N(0, 10^2)$.
Model also studied by Nobile (1998), Imai & van Dyk (2005).
The traditional Gibbs sampler (Albert & Chib 1993) mixes slowly: autocorrelation ρ > 0.999.
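A minimal sketch of the Albert & Chib (1993) data-augmentation Gibbs sampler, assuming a N(0, τ²I) prior on β (the standard construction, not necessarily the talk's exact implementation):

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter, tau2=100.0, rng=None):
    # z_i | beta ~ N(x_i'beta, 1), truncated to (0, inf) if y_i = 1
    # and (-inf, 0) if y_i = 0; then beta | z ~ N(B X'z, B) with
    # B = (X'X + I/tau2)^{-1} under the beta ~ N(0, tau2 I) prior.
    rng = rng or np.random.default_rng(4)
    n, p = X.shape
    B = np.linalg.inv(X.T @ X + np.eye(p) / tau2)
    L = np.linalg.cholesky(B)
    beta = np.zeros(p)
    out = np.empty((n_iter, p))
    for it in range(n_iter):
        mu = X @ beta
        a = np.where(y == 1, -mu, -np.inf)   # standardized lower truncation
        b = np.where(y == 1, np.inf, -mu)    # standardized upper truncation
        z = truncnorm.rvs(a, b, loc=mu, scale=1.0, size=n, random_state=rng)
        beta = B @ (X.T @ z) + L @ rng.standard_normal(p)
        out[it] = beta
    return out
```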
Probit regression: Parallel MCMC
10 parallel chains initialized U(0, 20), run for 50k iterations.
Partition formed by Voronoi cells with centers at the deciles of the 500k pooled draws.
Weights estimated via AIS with:
$q_j = N(m_j, 2 s_j)$ for the mean $m_j$ and sd $s_j$ of the draws in $\Theta_j$
n = 500, T = 10
TV distance to "truth" (200k independent rejection-sampling draws), calculated on a fine discretization, gives $d_{TV} = .075$.
$d_{TV}$ for the serial Gibbs sampler reaches .075 at ∼1.2 million iterations.
⇒ Parallelized Gibbs sampler: same accuracy with < 1/2 as many draws, and more than 20× speedup due to parallelization.
Probit regression: higher dimensional
N = 500 points for p = 8 covariates drawn from:
$(1,\ \mathrm{Bern}(\tfrac{1}{2}),\ U(0,1),\ N(0,1),\ \mathrm{Exp}(1),\ N(5,1),\ \mathrm{Pois}(10),\ N(20,25))$
with $\beta = (0.25, 5, 1, -1.5, -0.1, 0, 0, 0)$.
Compare:
1M iterations of the serial Gibbs sampler
300k iterations each for 10 parallel chains
Partitioning: $\varepsilon^2 = p$ for normalized dimensions
Weights: AIS with $q_j = t_4(m^*_j, S_j)$ for the empirical mode $m^*_j$ and covariance $S_j$ in element j, using n = 500, T = 50.
$\beta_2$ is much slower to converge (ρ > 0.999) than the others (ρ < 0.95).
So compare the marginal distribution of $\beta_2$ with "truth" (5M MH samples, ρ < 0.95) using $d_{TV}$ calculated by discretization.
Multivariate Probit Regression: Parallel vs serial Gibbs
[Figure: Total variation distance vs. thousands of iterations for the parallel and serial Gibbs samplers.]
Using the PACE convergence threshold 0.10 (VDW & S., 2013), the parallel Gibbs sampler converges ∼20× faster.
Example: Loss of Heterozygosity
Data from Seattle Barrett Esophagus project.
LOH is a genetic change undergone by cancer cells; chromosomal regions with high loss rates may contain regulatory genes.
Loss frequencies modeled by a mixture (Desai & Emond, 2004; also studied by Craiu et al. 2009, 2011):
$X_i \sim \eta\,\mathrm{Bin}(N_i, \pi_1) + (1 - \eta)\,\text{Beta-Bin}(N_i, \pi_2, \gamma)$,
where γ controls the beta-binomial overdispersion.
Likelihood:
$\prod_{i=1}^{40} \left[\, \eta \binom{n_i}{x_i} \pi_1^{x_i} (1 - \pi_1)^{n_i - x_i} + (1 - \eta) \binom{n_i}{x_i} \frac{B\!\left(x_i + \frac{\pi_2}{\omega_2},\; n_i - x_i + \frac{1 - \pi_2}{\omega_2}\right)}{B\!\left(\frac{\pi_2}{\omega_2},\; \frac{1 - \pi_2}{\omega_2}\right)} \right]$,
for $\omega_2 = e^\gamma/(2(1 + e^\gamma))$ and beta function B.
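A sketch of this log-likelihood, using scipy.special's gammaln/betaln for numerical stability (function and parameter names are mine):

```python
import numpy as np
from scipy.special import gammaln, betaln

def log_lik(eta, pi1, pi2, gamma, x, n):
    """Log-likelihood of the binomial / beta-binomial mixture above."""
    w2 = np.exp(gamma) / (2.0 * (1.0 + np.exp(gamma)))
    a, b = pi2 / w2, (1.0 - pi2) / w2
    log_choose = gammaln(n + 1) - gammaln(x + 1) - gammaln(n - x + 1)
    log_bin = log_choose + x * np.log(pi1) + (n - x) * np.log(1.0 - pi1)
    log_bb = log_choose + betaln(x + a, n - x + b) - betaln(a, b)
    # mixture combined on the log scale, summed over the 40 observations
    return np.logaddexp(np.log(eta) + log_bin,
                        np.log(1.0 - eta) + log_bb).sum()
```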
Example: Loss of Heterozygosity
8 parallel chains initialized at logit(u) for $u \sim U[0, 1]^4$.
Clustering in logistic space (choosing $\varepsilon^2 = 0.1$) yields 7 clusters.
Weight estimation via AIS using:
$t_4(m_j, S_j)$ for cluster mean $m_j$, empirical covariance $S_j$
n = 10000 and T = 100.
Results agree with previous analyses, except γ is slightly smaller.
Our results were confirmed 4 times by iid importance sampling using a 3-component $t_4$ mixture, overdispersed covariances, n = 500,000.

                 η            π1           π2           γ
Parallel MCMC    .816 (.001)  .299 (.001)  .678 (.002)  9.49 (.51)
IS               .814 (.001)  .299 (.001)  .676 (.001)  9.84 (.06)
Conclusions
A general scheme for parallelizing any MCMC algorithm
Requires approximating normalizing constants, but only on local regions
Requires MCMC to mix locally only
Doesn't solve all problems, e.g. hitting the modes in the first place (which can be provably intractable)
Potentially powerful. Bigger applications in progress.
References
VanDerwerken, DN and Schmidler, SC (2013). Parallel Markov Chain Monte Carlo. arXiv:1312.7479.
VanDerwerken, DN and Schmidler, SC (2017). Parallel Markov Chain Monte Carlo (revised and expanded version).
Scott C. Schmidler Parallel Markov Chain Monte Carlo

More Related Content

What's hot

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
The Statistical and Applied Mathematical Sciences Institute
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
MCMC and likelihood-free methods
MCMC and likelihood-free methodsMCMC and likelihood-free methods
MCMC and likelihood-free methods
Christian Robert
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
Valentin De Bortoli
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Unbiased Bayes for Big Data
Unbiased Bayes for Big DataUnbiased Bayes for Big Data
Unbiased Bayes for Big Data
Christian Robert
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear models
Caleb (Shiqiang) Jin
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
The Statistical and Applied Mathematical Sciences Institute
 
Richard Everitt's slides
Richard Everitt's slidesRichard Everitt's slides
Richard Everitt's slides
Christian Robert
 
Approximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsApproximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-Likelihoods
Stefano Cabras
 
Chris Sherlock's slides
Chris Sherlock's slidesChris Sherlock's slides
Chris Sherlock's slides
Christian Robert
 
Jere Koskela slides
Jere Koskela slidesJere Koskela slides
Jere Koskela slides
Christian Robert
 
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithmsRao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Christian Robert
 

What's hot (20)

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
MCMC and likelihood-free methods
MCMC and likelihood-free methodsMCMC and likelihood-free methods
MCMC and likelihood-free methods
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Unbiased Bayes for Big Data
Unbiased Bayes for Big DataUnbiased Bayes for Big Data
Unbiased Bayes for Big Data
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Bayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear modelsBayesian hybrid variable selection under generalized linear models
Bayesian hybrid variable selection under generalized linear models
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Richard Everitt's slides
Richard Everitt's slidesRichard Everitt's slides
Richard Everitt's slides
 
Approximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-LikelihoodsApproximate Bayesian Computation with Quasi-Likelihoods
Approximate Bayesian Computation with Quasi-Likelihoods
 
Chris Sherlock's slides
Chris Sherlock's slidesChris Sherlock's slides
Chris Sherlock's slides
 
Jere Koskela slides
Jere Koskela slidesJere Koskela slides
Jere Koskela slides
 
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithmsRao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
 

Similar to QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop, Paralell Markov Chain Monte Carlo - Scott Schmidler, Dec 11, 2017

Monte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptxMonte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptx
HaibinSu2
 
Firefly exact MCMC for Big Data
Firefly exact MCMC for Big DataFirefly exact MCMC for Big Data
Firefly exact MCMC for Big Data
Gianvito Siciliano
 
My Prize Winning Physics Poster from 2006
My Prize Winning Physics Poster from 2006My Prize Winning Physics Poster from 2006
My Prize Winning Physics Poster from 2006
Dr. Catherine Sinclair She/Her
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computation
Umberto Picchini
 
Modeling and quantification of uncertainties in numerical aerodynamics
Modeling and quantification of uncertainties in numerical aerodynamicsModeling and quantification of uncertainties in numerical aerodynamics
Modeling and quantification of uncertainties in numerical aerodynamics
Alexander Litvinenko
 
HMC and NUTS
HMC and NUTSHMC and NUTS
HMC and NUTS
Marco Banterle
 
Subquad multi ff
Subquad multi ffSubquad multi ff
Subquad multi ff
Fabian Velazquez
 
Presentation.pdf
Presentation.pdfPresentation.pdf
Presentation.pdf
Chiheb Ben Hammouda
 
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Varad Meru
 
Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...
Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...
Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...
Rafael Nogueras
 
intro
introintro
Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...
Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...
Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...
Alexander Litvinenko
 
High-Dimensional Network Estimation using ECL
High-Dimensional Network Estimation using ECLHigh-Dimensional Network Estimation using ECL
High-Dimensional Network Estimation using ECL
HPCC Systems
 
HOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptxHOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptx
SayedulHassan1
 
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Varad Meru
 
Finding Dense Subgraphs
Finding Dense SubgraphsFinding Dense Subgraphs
Finding Dense Subgraphs
Carlos Castillo (ChaTo)
 
Distributed ADMM
Distributed ADMMDistributed ADMM
Distributed ADMM
Pei-Che Chang
 
HOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptxHOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptx
SayedulHassan1
 
Traveling Salesman Problem in Distributed Environment
Traveling Salesman Problem in Distributed EnvironmentTraveling Salesman Problem in Distributed Environment
Traveling Salesman Problem in Distributed Environment
csandit
 
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENTTRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
cscpconf
 

Similar to QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop, Paralell Markov Chain Monte Carlo - Scott Schmidler, Dec 11, 2017 (20)

Monte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptxMonte Carlo Berkeley.pptx
Monte Carlo Berkeley.pptx
 
Firefly exact MCMC for Big Data
Firefly exact MCMC for Big DataFirefly exact MCMC for Big Data
Firefly exact MCMC for Big Data
 
My Prize Winning Physics Poster from 2006
My Prize Winning Physics Poster from 2006My Prize Winning Physics Poster from 2006
My Prize Winning Physics Poster from 2006
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computation
 
Modeling and quantification of uncertainties in numerical aerodynamics
Modeling and quantification of uncertainties in numerical aerodynamicsModeling and quantification of uncertainties in numerical aerodynamics
Modeling and quantification of uncertainties in numerical aerodynamics
 
HMC and NUTS
HMC and NUTSHMC and NUTS
HMC and NUTS
 
Subquad multi ff
Subquad multi ffSubquad multi ff
Subquad multi ff
 
Presentation.pdf
Presentation.pdfPresentation.pdf
Presentation.pdf
 
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
 
Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...
Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...
Self-sampling Strategies for Multimemetic Algorithms in Unstable Computationa...
 
intro
introintro
intro
 
Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...
Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...
Computation of Electromagnetic Fields Scattered from Dielectric Objects of Un...
 
High-Dimensional Network Estimation using ECL
High-Dimensional Network Estimation using ECLHigh-Dimensional Network Estimation using ECL
High-Dimensional Network Estimation using ECL
 
HOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptxHOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptx
 
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passin...
 
Finding Dense Subgraphs
Finding Dense SubgraphsFinding Dense Subgraphs
Finding Dense Subgraphs
 
Distributed ADMM
Distributed ADMMDistributed ADMM
Distributed ADMM
 
HOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptxHOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptx
 
Traveling Salesman Problem in Distributed Environment
Traveling Salesman Problem in Distributed EnvironmentTraveling Salesman Problem in Distributed Environment
Traveling Salesman Problem in Distributed Environment
 
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENTTRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
TRAVELING SALESMAN PROBLEM IN DISTRIBUTED ENVIRONMENT
 

More from The Statistical and Applied Mathematical Sciences Institute

Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
The Statistical and Applied Mathematical Sciences Institute
 
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
The Statistical and Applied Mathematical Sciences Institute
 
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
The Statistical and Applied Mathematical Sciences Institute
 
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
The Statistical and Applied Mathematical Sciences Institute
 
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
The Statistical and Applied Mathematical Sciences Institute
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
The Statistical and Applied Mathematical Sciences Institute
 

More from The Statistical and Applied Mathematical Sciences Institute (20)

Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
 
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
 
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
 
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
 
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
 
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
 
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
 
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
 
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
 
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
 
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
 
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
 
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 

Recently uploaded

Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
bennyroshan06
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
Celine George
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
PedroFerreira53928
 
Digital Tools and AI for Teaching Learning and Research
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop, Parallel Markov Chain Monte Carlo - Scott Schmidler, Dec 11, 2017

• 9. MCMC is a serial algorithm
  Moreover, these approaches all require processor synchronization:
    Achievable only on dedicated clusters with high-speed connectivity.
    Without this, parallelization may be slower than a single processor.
  Finally, all require the component (or joint) chains to reach equilibrium for valid inference.
  ⇒ Cannot reduce the number of serial steps required.
  e.g. Parallel tempering: may speed convergence vs. a single-temperature chain, but . . .
    increasing # processors beyond # temperatures doesn't help.
    When mixing is slow, e.g. in the presence of multimodality, it may not help at all (Woodard, S., & Huber 2009).
  These algorithms are fundamentally limited by the mixing time of the joint process.
• 10. Goal of this work
  Goal: A procedure that can be applied to any Markov chain Monte Carlo algorithm (including the above methods) to make it parallel, with the ability to take advantage of as many processors as are available:
    Asynchronously parallel.
    Ideally, linear speedup in # processors.
    Not limited by the mixing time of the component chain(s).
• 11. Basic idea (not quite what we do)
  Given a partition $\Theta = \bigcup_{j=1}^J \Theta_j$.
  For each $j$, run an MCMC chain $\theta^{(j)}_1, \ldots, \theta^{(j)}_{n_j}$ on the target distribution restricted to $\Theta_j$:
    $\pi_j(\theta) \triangleq \pi(\theta)\,1_{\Theta_j}(\theta)/w_j$  where  $w_j = \int_{\Theta_j} \pi(\theta)\,\nu(d\theta)$.
  Then the ergodic averages $\hat\mu_{j,n} = n_j^{-1} \sum_{i=1}^{n_j} f(\theta^{(j)}_i)$ satisfy
    $\hat\mu_{j,n} \longrightarrow E_{\pi_j}(f) = \int_{\Theta_j} f(\theta)\,\pi_j(\theta)\,\nu(d\theta)$ as $n_j \to \infty$, for each $j \in \{1, \ldots, J\}$.
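As a concrete illustration of one such restricted chain, a minimal sketch in Python (the names `log_pi` and `in_element`, and the random-walk proposal, are assumptions of this sketch, not the talk's implementation):

```python
import numpy as np

def restricted_rw_metropolis(log_pi, in_element, theta0, n_iter, step=0.5, rng=None):
    """Random-walk Metropolis targeting pi restricted to one partition element.

    log_pi     : unnormalized log target density on the full space
    in_element : indicator function for the partition element Theta_j
    Proposals landing outside Theta_j are rejected, which leaves the
    restricted density pi_j invariant.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    draws = np.empty((n_iter, theta.size))
    lp = log_pi(theta)
    for i in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)
        if in_element(prop):  # outside Theta_j => automatic rejection
            lp_prop = log_pi(prop)
            if np.log(rng.uniform()) < lp_prop - lp:
                theta, lp = prop, lp_prop
        draws[i] = theta
    return draws
```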
• 12. Combining the chains
  If we can also construct estimators for the weights, $\hat w_{j,n} \to w_j$,
  then the combined estimator
    $\hat\mu_n = \sum_{j=1}^J \hat w_{j,n}\,\hat\mu_{j,n} \longrightarrow \mu = E_\pi(f)$.
  If the $\hat\mu_{j,n}$'s and $\hat w_{j,n}$'s are unbiased and independent, then $\hat\mu_n$ is unbiased.
  Notice: Need only the $\hat\mu_{j,n}$'s and $\hat w_{j,n}$'s to converge, not the chains!
  Requires only that each chain mix locally.
• 13. Estimating the weights
  Let $g(\theta)$ be the unnormalized target density, i.e. $\pi(\theta) = g(\theta)/c$.
  Estimating $c_j = \int_{\Theta_j} g(\theta)\,\nu(d\theta)$ is equivalent to estimating the normalizing constant of the density $g_j(\theta) = g(\theta)\,1_{\Theta_j}(\theta)$.
  Many techniques are available (but this requires care). Then form
    $\hat w_{j,n} = \sum_{i=1}^n \hat c_j^{(i)} \Big/ \sum_{i=1}^n \sum_{k=1}^J \hat c_k^{(i)}$,
  which is consistent (but not unbiased) for $w_j$.
  Other ratio estimators may improve efficiency (Tin 1965), allowing reduction in $n$.
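A minimal sketch of this ratio estimator together with the combined estimator of slide 12 (Python; the array `c_hat` of replicated local normalizing-constant estimates is hypothetical input):

```python
import numpy as np

def ratio_weights(c_hat):
    """Ratio estimator for the partition weights.

    c_hat : array of shape (n, J); c_hat[i, j] is the i-th independent
            estimate of the local normalizing constant c_j.
    Returns w_hat[j] = sum_i c_hat[i, j] / sum_i sum_k c_hat[i, k].
    """
    c_hat = np.asarray(c_hat, dtype=float)
    return c_hat.sum(axis=0) / c_hat.sum()

def combined_estimate(mu_hat, c_hat):
    """Combined estimator sum_j w_hat[j] * mu_hat[j] of E_pi(f)."""
    return float(ratio_weights(c_hat) @ np.asarray(mu_hat, dtype=float))
```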
• 14. Estimating the weights
  Approach 1: Markov chain output
  Estimate $c_j$ directly from MCMC trajectories:
    HME (Newton & Raftery 1994, Raftery et al. 2007)
    Chib's method (Chib 1995, Chib & Jeliazkov 2001)
    Bridge/path sampling (Meng & Wong 1996, Gelman & Meng 1998, Meng & Schilling 2002)
  Note: the restriction to $\Theta_j$ helps avoid problems (e.g. Wolpert & S. 2012).
• 15. Estimating the weights
  Approach 2: Adaptive importance sampling
  Construct an approximation $q_j$ to $\pi_j$ from the MCMC draws:
    $t(m_j, S_j)$ distribution for sample mean $m_j$, covariance $S_j$
    Adaptive mixture of t-distributions (Ji & S. 2013, Wang & S. 2013)
  Draw $\theta_t \overset{iid}{\sim} q_j$ to get the unbiased IS estimate
    $\hat c_j = T^{-1} \sum_{t=1}^T g(\theta_t)\,1_{\Theta_j}(\theta_t)/q_j(\theta_t)$.
  Again, $q_j$ need only approximate $\pi$ locally on $\Theta_j$, so $\lambda^*_j = \sup \pi_j/q_j$ is much smaller.
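A minimal sketch of this local IS estimate (Python/scipy; the $t(m_j, S_j)$ proposal is fit by sample mean and covariance as on the slide, and `log_g`, `in_element` are assumed user-supplied):

```python
import numpy as np
from scipy.stats import multivariate_t

def is_estimate_cj(log_g, in_element, draws_j, T=5000, df=4, rng=None):
    """Unbiased importance-sampling estimate of c_j = integral of g over Theta_j.

    The proposal q_j is a t(m_j, S_j) distribution fit to the chain's
    draws in Theta_j via the sample mean and sample covariance.
    """
    rng = np.random.default_rng() if rng is None else rng
    m_j = draws_j.mean(axis=0)                 # sample mean m_j
    S_j = np.cov(draws_j, rowvar=False)        # sample covariance S_j
    q_j = multivariate_t(loc=m_j, shape=S_j, df=df)
    theta = q_j.rvs(size=T, random_state=rng)  # iid draws from q_j
    inside = np.array([bool(in_element(th)) for th in theta])
    w = np.zeros(T)
    if inside.any():
        lg = np.array([log_g(th) for th in theta[inside]])
        w[inside] = np.exp(lg - q_j.logpdf(theta[inside]))
    return float(w.mean())  # T^{-1} sum_t g(theta_t) 1_{Theta_j}(theta_t) / q_j(theta_t)
```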
• 16. Estimating the weights
  Approach 2: Adaptive importance sampling (cont'd)
  More generally, may use a sequence of distributions $q_{j,t}$:
    Markov chain $\theta_t \mid \theta_{t-1} \sim q_j(\theta_t \mid \theta_{t-1})$
    Adaptive MIS chain (Ji & S. 2013, Wang & S. 2013)
  'Sample' ('trajectory') denotes independent (conditional) draws.
  Averaging $n$ independent $\hat c_j$'s decreases variance as $n^{-1}$.
  A pseudo-marginal approach (Andrieu & Roberts 2009) using these techniques is significantly less efficient.
• 17. Mixture of normals
  Consider a simple mixture of 2 normals:
    $\pi(z) = \tfrac12 N_M(z; -\mathbf{1}_M, \sigma_1^2 I_M) + \tfrac12 N_M(z; \mathbf{1}_M, \sigma_2^2 I_M)$
  Upper bounds on the spectral gap (WSH07a,b) yield:
    Thm: RW-MH is torpidly mixing.
    Thm: Tempering is torpidly mixing for $\sigma_1 \neq \sigma_2$.
  Lower bounds on hitting times obtained by (SW10) yield:
    Thm: The equi-energy sampler is torpidly mixing for $\sigma_1 \neq \sigma_2$.
    Thm: The Haario adaptive RW kernel is torpidly mixing for $\sigma_1 \neq \sigma_2$.
• 18. Towards some theory
  However, if the partition $\bigcup_{j=1}^J \Theta_j$ is such that:
    the $\Theta_j$'s are convex
    $\pi_j$ is log-concave for $j = 1, \ldots, J$,
  then
    $\pi_j$ can be sampled in polynomial time (Frieze, Kannan, et al.)
    $c_j$ can be estimated in polynomial time (Lovasz, Vempala)
  + some additional technical restrictions gives:
  ⇒ we can sample $\pi$ and approximate $E_\pi(h(x))$ in polynomial time
  . . . assuming we can initialize within the basins of attraction in poly time! (VanDerwerken & S., 2015)
• 19. FPRAS for mixture-of-normals
  Theorem: Under the above conditions, the PMCMC algorithm returns a sample in time $O(\mathrm{poly}(d))$ from a distribution $\hat\pi$ for which $\|\hat\pi - \pi^*\|_{TV} \leq \epsilon$ with probability at least $1 - \delta$.
    HPD region of modes sampled in poly-time
    Use samples to estimate an HPD hyperellipsoid $B_j$ at each mode, where $\pi$ is log-concave on $B_j$
    Apply log-concave integration
  A similar result allows construction of a rapidly mixing MIS chain using adaptive mixture IS instead (VanDerwerken & S., 2015).
• 20. FPRAS for mixture-of-normals
  Note: exponentially faster than estimating the transition matrix as in MD.
  Shows the problem difficulty is finding modes, not mixing between them. (Hard even in the normal problem?)
  Currently exploring limits of generalizability.
• 21. Problems with Approach #1
  This approach has some shortcomings:
    1. Requires # chains (processors) equal to the partition size, which could be exponential in dim(Θ).
    2. Where does the partition come from?
    3. Restriction to $\pi_j$ requires rejection; makes evaluating the transition density hard for the $\hat w_j$'s.
    4. Restriction could slow down mixing of the chains.
• 22. Solution
  No need for a 1-to-1 correspondence between chains and estimators.
  For $L$ independent chains, let
    $\hat\mu_{j,n} = n_j^{-1} \sum_{l=1}^L \sum_{k=1}^{K_l} f(\theta_{lk})\,1_{\Theta_j}(\theta_{lk})$
  where $n_j = \sum_{l=1}^L \sum_{k=1}^{K_l} 1_{\Theta_j}(\theta_{lk})$ is the # draws in $\Theta_j$ from any chain.
  $L$ can be much smaller; need not be exponential in dim(Θ).
  ⇒ Chains are unrestricted, can cross between partition elements.
  Partition imposed on samples after the fact.
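A minimal sketch of imposing the partition on pooled, unrestricted chains after the fact (Python; `assign` is a hypothetical partition-lookup function, e.g. the Voronoi rule of slide 24):

```python
import numpy as np

def posthoc_element_means(chains, assign, f, J):
    """Per-element ergodic averages from unrestricted chains.

    chains : list of arrays, one (K_l, dim) array of draws per chain
    assign : function mapping a draw to its partition element in {0,...,J-1}
    f      : scalar function of a draw
    Pools all chains, then imposes the partition after the fact.
    Returns (mu_hat_j, n_j) for each element j.
    """
    sums = np.zeros(J)
    counts = np.zeros(J, dtype=int)
    for draws in chains:
        for theta in draws:
            j = assign(theta)
            sums[j] += f(theta)
            counts[j] += 1
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts, counts
```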
• 23. [Figure]
• 24. Adaptive partitioning
  Still need a partition. Key: Must not grow exponentially in dim(Θ).
  PACE clustering algorithm (VanDerwerken & S., 2013):
  Let $x^{(j)}_t$ denote draw $t$ from chain $j$, and $\mathcal{X}_i$ the set of draws available at iteration $i$.
    1. Define $x^*_i = \arg\max_{x^{(j)}_t \in \mathcal{X}_i} \{\log \pi(x^{(j)}_t)\}$.
    2. Assign all draws lying in $B_\epsilon(x^*_i)$ to $C_i$, and set $\mathcal{X}_{i+1} = \mathcal{X}_i \setminus C_i$.
    3. Repeat (1)-(2) until $1 - \alpha$ of the draws are clustered (e.g. 98%).
    4. Reallocate all draws to the nearest cluster center (Voronoi).
  A sketch of these steps in code follows below.
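A minimal sketch of the four PACE steps (Python; Euclidean $\epsilon$-balls and the names `log_pi`, `eps`, `alpha` are assumptions of this sketch):

```python
import numpy as np

def pace_partition(draws, log_pi, eps, alpha=0.02):
    """PACE clustering: peel off eps-balls around the highest-density
    remaining draw until a fraction 1 - alpha of draws are clustered,
    then reallocate every draw to its nearest center (Voronoi)."""
    draws = np.asarray(draws, dtype=float)
    logp = np.array([log_pi(x) for x in draws])
    remaining = np.ones(len(draws), dtype=bool)
    centers = []
    while remaining.mean() > alpha:
        # Step 1: highest-density remaining draw becomes a center x*.
        idx = np.flatnonzero(remaining)
        c = draws[idx[np.argmax(logp[idx])]]
        centers.append(c)
        # Step 2: remove all remaining draws within the eps-ball of x*.
        dist = np.linalg.norm(draws[remaining] - c, axis=1)
        remaining[np.flatnonzero(remaining)[dist <= eps]] = False
    centers = np.array(centers)
    # Step 4: Voronoi reallocation of all draws to the nearest center.
    d2 = ((draws[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)
```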
• 25. Examples: Multimodal target
  Mixture of normals: $\pi(x) = \sum_{j=1}^4 w_j\,N(x; \mu_j, \Sigma_j)$
  Weights: $w = (0.02, 0.20, 0.20, 0.58)$
  Means: $\mu_1 = (3, 3)$, $\mu_2 = (7, -3)$, $\mu_3 = (2, 7)$, and $\mu_4 = (-5, 0)$
  Covariances:
    $\Sigma_1 = \begin{pmatrix} 1 & .2 \\ .2 & 1 \end{pmatrix}$, $\Sigma_2 = \begin{pmatrix} 2 & -.5 \\ -.5 & .5 \end{pmatrix}$, $\Sigma_3 = \begin{pmatrix} 1.3 & .3 \\ .3 & .4 \end{pmatrix}$, $\Sigma_4 = \begin{pmatrix} 1 & 1 \\ 1 & 2.5 \end{pmatrix}$
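For concreteness, this target written out in code (Python/scipy, using the weights, means, and covariances above):

```python
import numpy as np
from scipy.stats import multivariate_normal

# The 4-component mixture target from the slide.
weights = np.array([0.02, 0.20, 0.20, 0.58])
means = [np.array([3, 3]), np.array([7, -3]), np.array([2, 7]), np.array([-5, 0])]
covs = [np.array([[1.0, 0.2], [0.2, 1.0]]),
        np.array([[2.0, -0.5], [-0.5, 0.5]]),
        np.array([[1.3, 0.3], [0.3, 0.4]]),
        np.array([[1.0, 1.0], [1.0, 2.5]])]

def log_pi(x):
    """Log density of the mixture (usable by the samplers and by PACE)."""
    comps = [w * multivariate_normal(m, C).pdf(x)
             for w, m, C in zip(weights, means, covs)]
    return np.log(np.sum(comps, axis=0))
```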
• 26. Multimodal example
  Langevin diffusion: $d\theta_t = \frac{\sigma^2}{2}\nabla\log\pi(\theta_t)\,dt + \sigma\,dW_t$
  10 chains initialized uniformly; 25k iterations each, in parallel on 10 processors.
  Cluster first 1k draws after 250 burn-in ⇒ 7-element partition.
  In parallel, 1 processor per element (7 total), each generated:
    $n \approx 5000$ trajectories of length $T = 5$, and corresponding $\hat c_j$'s
    initialized iid $\sim t_4(m_j, J^{-1}(\theta))$
    $t_4$ perturbations instead of Gaussian to ensure $\mathrm{var}(\hat c_j) < \infty$
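A minimal sketch of one such chain via an Euler-Maruyama discretization of the Langevin diffusion (Python; the step size `h` and the finite-difference gradient are conveniences of this sketch, not the talk's implementation):

```python
import numpy as np

def num_grad(f, x, h=1e-5):
    """Central finite-difference gradient of a scalar function f."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def langevin_chain(log_pi, theta0, n_iter, sigma=0.5, h=0.1, rng=None):
    """Unadjusted Langevin update:
    theta += (sigma^2/2) * grad log pi(theta) * h + sigma * sqrt(h) * N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float).copy()
    draws = np.empty((n_iter, theta.size))
    for t in range(n_iter):
        drift = 0.5 * sigma**2 * num_grad(log_pi, theta)
        theta = theta + drift * h + sigma * np.sqrt(h) * rng.standard_normal(theta.size)
        draws[t] = theta
    return draws
```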
• 27. Multimodal example
  [Figure: clustering of 10 chains initialized uniformly within dashed lines; cluster labels 1-7. Ellipses show 95% contours for the component densities of the target.]
• 28. Multimodal example
  Estimated weights: [0.02, 0.23, 0.20, 0.55]. (True weights: [0.02, 0.20, 0.20, 0.58])
  Using the AIS approach instead:
    5000 $\hat c_j$'s from samples of size $T = 5$ from the $t_4$ distribution
    requires 18s vs. 90s for simulating diffusions
  Clustering + IS takes < 1/2 the time of the parallel 25k chains, so the weights are estimated in parallel before sampling completes.
  Estimated partition weights: $\hat w_7$ = [.378, .201, .201, .105, .020, .093, .002]
  Estimated component weights (nearly exact): $\hat w$ = [.020, .201, .201, .578]
• 29. Multimodal example: higher dimensions
  Harder example: p = 10 dimensions
    4 component means drawn uniformly on $(-10, 10)^p$
    Random covariance matrices $L^T L$ with $L \sim MN_{p \times p}(0, I_p, I_p)$
    Weights $\sim$ Dirichlet(1, 1, 1, 1)
  - 20 parallel random-walk Metropolis chains, 100k iterations each
  - Proposal scales tuned adaptively during the first 1k iterations
  - Next 49k draws clustered ⇒ 4 partition elements
  - IS using $t_4(m_j, S_j)$ for cluster center $m_j$, empirical covariance $S_j$, $T = 100$, $n = 1000$
  Results: $d_{TV}(\hat w, w) = .0024$; $\|\hat\mu - E(X)\|_{L_1} = 0.17$
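A minimal sketch of how such a random test target could be generated (Python; the seed is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p, K = 10, 4

# Component means uniform on (-10, 10)^p.
mus = rng.uniform(-10, 10, size=(K, p))

# Random covariances L^T L with L ~ MN_{p x p}(0, I_p, I_p),
# i.e. L has iid standard-normal entries.
Ls = rng.standard_normal(size=(K, p, p))
Sigmas = np.einsum("kij,kil->kjl", Ls, Ls)  # L^T L for each component

# Mixture weights ~ Dirichlet(1, 1, 1, 1).
ws = rng.dirichlet(np.ones(K))
```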
• 30. Multimodal example: higher dimensions
  Sensitivity to partitioning: Repeating with a different $\epsilon$ gives an 8-element partition:
    $d_{TV}(\hat w, w) = .0074$; $\|\hat\mu - E(X)\|_{L_1} = 0.12$
  More, smaller weights to estimate, but better mixing within the (smaller) partition elements.
  Since successfully repeated in 50 and 100 dimensions.
• 31. Multimodal example: higher dimensions
  p = 50: 2 components: w = [0.1, 0.9]
  Random means $\sim U(-10, 10)^p$; covariances $L^T L$ for $L \sim MN_{p \times p}(0, I_p, I_p)$.
  Parallel MCMC:
    14 chains, initialized uniformly
    Normal RW MH with adaptive covariance tuned during a 100k-iteration burn-in
    2M post-burn-in draws each, thinned to 1000 draws
  Partition size: 2 ($\epsilon^2 = 2p$)
  AIS using $t_4(m_j, \Sigma_j)$ (5M draws): $\hat w$ = [.101, .899]
  Pooling the chains directly gives $\tilde w$ = [.210, .790], as 3 chains happen to get stuck in mode 1, 11 in mode 2.
• 32. Beyond multimodality
  Parallelization is easily visualized for multimodal problems, but our approach is completely general.
  What about other types of slowly-mixing chains?
  E.g. component-wise chains with strong dependence between dimensions (such as correlated Gibbs samplers).
• 33. Example: Probit regression
  Probit regression model: assigns probabilities $1 - \Phi(\beta X)$, $\Phi(\beta X)$ to response $Y \in \{0, 1\}$ for covariate $X$.
  Posterior: $\pi(\beta \mid y) \propto \pi_0(\beta) \prod_{i=1}^n \Phi(\beta X_i)^{y_i} \{1 - \Phi(\beta X_i)\}^{1 - y_i}$
  Data: N = 2000 pairs simulated with $X \sim \mathrm{Bern}(1/2)$, $\beta = 5/\sqrt{2}$.
  Diffuse prior: $\pi_0(\beta) = N(0, 10^2)$.
  Model also studied by Nobile (1998), Imai & van Dyk (2005).
  The traditional Gibbs sampler (Albert & Chib 1993) mixes slowly: autocorrelation $\rho > 0.999$.
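A minimal sketch of the Albert & Chib (1993) data-augmentation Gibbs sampler for this scalar-coefficient model (Python/scipy; `tau2 = 100` matches the diffuse $N(0, 10^2)$ prior above):

```python
import numpy as np
from scipy.stats import truncnorm

def probit_gibbs(X, y, n_iter, tau2=100.0, rng=None):
    """Albert & Chib (1993) Gibbs sampler for scalar-beta probit regression.

    Alternates: z_i | beta ~ N(beta * X_i, 1) truncated to the side
    determined by y_i, then beta | z ~ N(V * sum(X_i z_i), V) with
    V = (1/tau2 + sum X_i^2)^{-1}.
    """
    rng = np.random.default_rng() if rng is None else rng
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    V = 1.0 / (1.0 / tau2 + (X ** 2).sum())  # conditional variance of beta
    beta, betas = 0.0, np.empty(n_iter)
    for t in range(n_iter):
        mu = beta * X
        # Truncated-normal latent variables: z_i > 0 iff y_i = 1
        # (bounds are standardized relative to loc=mu, scale=1).
        a = np.where(y == 1, -mu, -np.inf)
        b = np.where(y == 1, np.inf, -mu)
        z = truncnorm.rvs(a, b, loc=mu, scale=1.0, random_state=rng)
        beta = rng.normal(V * (X * z).sum(), np.sqrt(V))
        betas[t] = beta
    return betas
```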
• 34. Probit regression: Parallel MCMC
  10 parallel chains initialized U(0, 20), run for 50k iterations.
  Partition formed by Voronoi cells with centers the deciles of the 500k pooled draws.
  Weights estimated via AIS with:
    $q_j = N(m_j, 2 s_j)$ for the mean $m_j$ and sd $s_j$ of the draws in $\Theta_j$
    $n = 500$, $T = 10$
  TV distance to "truth" (200k independent rejection-sampling draws), calculated on a fine discretization, gives $d_{TV} = .075$.
  $d_{TV}$ for the serial Gibbs sampler reaches .075 at ∼1.2 million iterations.
  ⇒ Parallelized Gibbs sampler: same accuracy with < 1/2 as many draws, and more than 20× speedup due to parallelization.
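A minimal sketch of the decile-based Voronoi assignment (Python; one-dimensional, as the posterior here is over a scalar $\beta$):

```python
import numpy as np

def decile_voronoi_assign(pooled, draws):
    """1-D Voronoi partition with centers at the deciles of the pooled
    draws; returns the element index (nearest center) of each draw."""
    centers = np.quantile(pooled, np.arange(0.1, 1.0, 0.1))  # 9 deciles
    return np.abs(draws[:, None] - centers[None, :]).argmin(axis=1)
```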
• 35. Probit regression: higher dimensional
  N = 500 points for p = 8 covariates drawn from:
    (1, Bern(1/2), U(0,1), N(0,1), Exp(1), N(5,1), Pois(10), N(20,25))
  with $\beta = (0.25, 5, 1, -1.5, -0.1, 0, 0, 0)$.
  Compare:
    1M iterations of the serial Gibbs sampler
    300k iterations each for 10 parallel chains
  Partitioning: $\epsilon^2 = p$ for normalized dimensions.
  Weights: AIS with $q_j = t_4(m^*_j, S_j)$ for empirical mode $m^*_j$ and covariance $S_j$ in element $j$, using $n = 500$, $T = 50$.
  $\beta_2$ is much slower to converge ($\rho > 0.999$) than the others ($\rho < 0.95$).
  So compare the marginal distribution of $\beta_2$ with "truth" (5M MH samples, $\rho < 0.95$) using $d_{TV}$ calculated by discretization.
• 36. Multivariate probit regression: Parallel vs. serial Gibbs
  [Figure: total variation distance vs. thousands of iterations.]
  Using PACE convergence threshold 0.10 (VDW & S., 2013), the parallel Gibbs sampler converges ∼20× faster.
• 37. Example: Loss of Heterozygosity
  Data from the Seattle Barrett's Esophagus project.
  LOH is a genetic change undergone by cancer cells; chromosomal regions with high loss rates may contain regulatory genes.
  Loss frequencies modeled by a mixture (Desai & Emond, 2004; also studied by Craiu et al. 2009, 2011):
    $X_i \sim \eta\,\mathrm{Bin}(N_i, \pi_1) + (1 - \eta)\,\text{Beta-Bin}(N_i, \pi_2, \gamma)$,
  where $\gamma$ controls the beta-binomial overdispersion. Likelihood:
    $\prod_{i=1}^{40} \left[ \eta \binom{n_i}{x_i} \pi_1^{x_i} (1 - \pi_1)^{n_i - x_i} + (1 - \eta) \binom{n_i}{x_i} \frac{B\left(x_i + \frac{\pi_2}{\omega_2},\, n_i - x_i + \frac{1 - \pi_2}{\omega_2}\right)}{B\left(\frac{\pi_2}{\omega_2},\, \frac{1 - \pi_2}{\omega_2}\right)} \right]$,
  for $\omega_2 = e^\gamma / (2(1 + e^\gamma))$ and beta function $B$.
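A minimal sketch of this likelihood on the log scale (Python/scipy; the `logaddexp` formulation is an implementation choice of this sketch, made for numerical stability):

```python
import numpy as np
from scipy.special import betaln, gammaln

def loh_loglik(eta, pi1, pi2, gamma, x, n):
    """Log-likelihood of the binomial / beta-binomial mixture above.

    x, n : arrays of loss counts and trial counts (40 regions).
    """
    w2 = np.exp(gamma) / (2.0 * (1.0 + np.exp(gamma)))
    log_choose = gammaln(n + 1) - gammaln(x + 1) - gammaln(n - x + 1)
    # Binomial component: eta * C(n,x) * pi1^x * (1-pi1)^(n-x)
    log_bin = np.log(eta) + log_choose + x * np.log(pi1) + (n - x) * np.log(1 - pi1)
    # Beta-binomial component with a = pi2/w2, b = (1-pi2)/w2
    a, b = pi2 / w2, (1.0 - pi2) / w2
    log_bb = (np.log(1 - eta) + log_choose
              + betaln(x + a, n - x + b) - betaln(a, b))
    return float(np.logaddexp(log_bin, log_bb).sum())
```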
• 38. Example: Loss of Heterozygosity
  8 parallel chains initialized at logit(u) for $u \sim U[0, 1]^4$.
  Clustering in logistic space (choosing $\epsilon = 0.1$) yields 7 clusters.
  Weight estimation via AIS using:
    $t_4(m_j, S_j)$ for cluster mean $m_j$, empirical covariance $S_j$
    $n = 10000$ and $T = 100$
  Results agree with previous analyses, except $\gamma$ slightly smaller.
  Our results confirmed 4 times by iid importance sampling using a 3-component $t_4$ mixture, overdispersed covariances, $n = 500{,}000$.

                     η            π1           π2           γ
    Parallel MCMC    .816 (.001)  .299 (.001)  .678 (.002)  9.49 (.51)
    IS               .814 (.001)  .299 (.001)  .676 (.001)  9.84 (.06)
• 39. Conclusions
  A general scheme for parallelizing any MCMC algorithm:
    Requires approximating normalizing constants, but only on local regions
    Requires the MCMC to mix locally only
    Doesn't solve all problems, e.g. hitting the modes in the first place (which can be provably intractable)
  Potentially powerful. Bigger applications in progress.
• 40. References
  VanDerwerken, D. N. and Schmidler, S. C. (2013). Parallel Markov Chain Monte Carlo. arXiv:1312.7479.
  VanDerwerken, D. N. and Schmidler, S. C. (2017). Parallel Markov Chain Monte Carlo (revised and expanded version).