International Conference on Monte Carlo techniques
Closing conference of thematic cycle
Paris July 5-8th 2016
Campus les cordeliers
Jere Koskela's slides
International Conference on Monte Carlo techniques
Closing conference of thematic cycle
Paris July 5-8th 2016
Campus les cordeliers
Chris Sherlock's slides
Rao-Blackwellisation schemes for accelerating Metropolis-Hastings algorithms
Christian Robert
Aggregate of three different papers on Rao-Blackwellisation, from Casella & Robert (1996), to Douc & Robert (2010), to Banterle et al. (2015), presented during an OxWaSP workshop on MCMC methods, Warwick, Nov 20, 2015
International Conference on Monte Carlo techniques
Closing conference of thematic cycle
Paris July 5-8th 2016
Campus les Cordeliers
Slides of Richard Everitt's presentation
Random Matrix Theory and Machine Learning - Part 3
Fabian Pedregosa
ICML 2021 tutorial on random matrix theory and machine learning.
Part 3 covers: 1. Motivation: Average-case versus worst-case in high dimensions 2. Algorithm halting times (runtimes) 3. Outlook
Probabilistic Control of Switched Linear Systems with Chance Constraints
Leo Asselborn
This presentation proposes an approach to algorithmically synthesize control strategies for set-to-set transitions of uncertain discrete-time switched linear systems, based on a combination of tree search and reachable set computations in a stochastic setting. The initial state and disturbances are assumed to be Gaussian distributed, and a time-variant hybrid control law stabilizes the system towards a goal set. The algorithmic solution computes sequences of discrete states via tree search, and the continuous controls are obtained from solving embedded semi-definite programs (SDP). These programs take polytopic input constraints as well as time-varying probabilistic state constraints into account. An example demonstrating the principles of the solution procedure, with focus on handling the chance constraints, is included.
Probabilistic Control of Uncertain Linear Systems Using Stochastic Reachability
Leo Asselborn
This presentation proposes an approach to algorithmically synthesize control strategies for set-to-set transitions of discrete-time uncertain systems based on reachable set computations in a stochastic setting. For given Gaussian distributions of the initial states and disturbances, state sets which are reachable to a chosen confidence level under the effect of time-variant control laws are computed by using principles of the ellipsoidal calculus. The proposed algorithm iterates over LMI-constrained semi-definite programming problems to compute probabilistically stabilizing controllers, while ellipsoidal input constraints are considered. An example for illustration is included.
Sequential quasi-Monte Carlo (SQMC) is a quasi-Monte Carlo (QMC) version of sequential Monte Carlo (or particle filtering), a popular class of Monte Carlo techniques used to carry out inference in state space models. In this talk I will first review the SQMC methodology as well as some theoretical results. Although SQMC converges faster than the usual Monte Carlo error rate, its performance deteriorates quickly as the dimension of the hidden variable increases. However, I will show with an example that SQMC may perform well for some "high" dimensional problems. I will conclude this talk with some open problems and potential applications of SQMC in complicated settings.
Seminar at IEEE Computational Intelligence Society, Singapore Chapter at School of Electrical and Electronic Engineering, NTU, Singapore, 20 February 2019
Robust Control of Uncertain Switched Linear Systems based on Stochastic Reach...
Leo Asselborn
This presentation proposes an approach to algorithmically synthesize control strategies for set-to-set transitions of uncertain discrete-time switched linear systems based on a combination of tree search and reachable set computations in a stochastic setting. For given Gaussian distributions of the initial states and disturbances, state sets which are reachable to a chosen confidence level under the effect of time-variant hybrid control laws are computed by using principles of the ellipsoidal calculus. The proposed algorithm iterates over sequences of the discrete states and LMI-constrained semi-definite programming (SDP) problems to compute stabilizing controllers, while polytopic input constraints are considered. An example for illustration is included.
We combined low-rank tensor techniques and the FFT to compute kriging estimates, estimate variance, and compute conditional covariance. We are able to solve 3D problems at very high resolution.
Susie Bayarri Plenary Lecture given at the ISBA (International Society for Bayesian Analysis) World Meeting in Montreal, Canada on June 30, 2022, by Pierre E. Jacob (https://sites.google.com/site/pierrejacob/)
Talk on the design of non-negative unbiased estimators, useful to perform exact inference for intractable target distributions.
Corresponds to the article http://arxiv.org/abs/1309.6473
SMC^2: an algorithm for sequential analysis of state-space models
Pierre Jacob
In these slides I presented the SMC^2 method (see the article here: http://arxiv.org/abs/1101.1528 ) to an audience of marine biogeochemistry people, emphasizing the model evidence estimation aspect.
This is a short presentation for a 15-minute talk at Bayesian Inference for Stochastic Processes 7, on the SMC^2 algorithm.
http://arxiv.org/abs/1101.1528
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich in features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization.
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
Richard's entangled adventures in wonderland
Richard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters spanning 0.4–0.9 µm) and novel JWST images with 14 filters spanning 0.8–5 µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data at > 2.3 µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and 30.3–31.0 AB mag (5σ, r = 0.1″ circular aperture) in individual filters. We measure photometric redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts z = 11.5–15. These objects show compact half-light radii of R_1/2 ∼ 50–200 pc, stellar masses of M⋆ ∼ 10^7–10^8 M⊙, and star-formation rates of SFR ∼ 0.1–1 M⊙ yr^−1. Our search finds no candidates at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to infer the properties of the evolving luminosity function without binning in redshift or luminosity that marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results, and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5 from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical models for evolution of the dark matter halo mass function.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes on Io’s surface have been monitored from both spacecraft and ground-based telescopes. Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images show that a plume deposit from a powerful eruption at Pillan Patera has covered part of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive optics at visible wavelengths.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.
Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
A brief overview of the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
1. Unbiased MCMC with couplings
Pierre E. Jacob
Department of Statistics, Harvard University
joint work with John O’Leary, Yves Atchadé, and other fantastic
people acknowledged throughout
Department of Statistics, University of Oxford
July 19, 2019
Pierre E. Jacob Unbiased MCMC
2. Outline
1 It is about Monte Carlo and bias
2 Unbiased estimators from Markov kernels
3 Design of coupled Markov chains
4 It can fail
5 Beyond parallel computation and burn-in
4. Setting
Continuous or discrete space of dimension d.
Target probability distribution π,
with probability density function x → π(x).
Goal: approximate π, e.g. approximate
E_π[h(X)] = ∫ h(x) π(x) dx = π(h),
for a class of “test” functions h.
5. Markov chain Monte Carlo
Initially, X_0 ∼ π_0, then X_t | X_{t−1} ∼ P(X_{t−1}, ·) for t = 1, . . . , T.
Estimator:
(1/(T − b)) Σ_{t=b+1}^T h(X_t),
where b iterations are discarded as burn-in.
This might converge to E_π[h(X)] as T → ∞ by the ergodic theorem.
It is biased for any fixed b, T, since π_0 ≠ π.
Averaging independent copies of such estimators for fixed T, b would not provide a consistent estimator of E_π[h(X)] as the number of independent copies goes to infinity.
6. Example: Metropolis–Hastings kernel P
Initialize: X0 ∼ π0.
At each iteration t ≥ 0, with the Markov chain at state X_t:
1. propose X′ ∼ k(X_t, ·),
2. sample U ∼ U([0, 1]),
3. if U ≤ [π(X′) k(X′, X_t)] / [π(X_t) k(X_t, X′)], set X_{t+1} = X′; otherwise set X_{t+1} = X_t.
Hastings, Monte Carlo sampling methods using Markov chains and
their applications, Biometrika, 1970.
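The MH step above fits in a few lines of Python. This is a minimal sketch with illustrative names (not from the paper's code), using a Normal random-walk proposal; since that proposal is symmetric, the Hastings ratio reduces to π(X′)/π(X_t):

```python
import numpy as np

def mh_step(x, log_pi, sigma, rng):
    """One Metropolis-Hastings step with a Normal random-walk proposal.

    The proposal k is symmetric, so the Hastings ratio
    pi(x') k(x', x) / (pi(x) k(x, x')) reduces to pi(x') / pi(x)."""
    x_prop = x + sigma * rng.standard_normal()   # propose X' ~ k(X_t, .)
    log_u = np.log(rng.uniform())                # U ~ U([0, 1])
    if log_u <= log_pi(x_prop) - log_pi(x):      # accept
        return x_prop
    return x                                     # reject: stay at X_t

# Target pi = N(0, 1), initial distribution pi_0 = N(10, 3^2), as on the slides.
rng = np.random.default_rng(1)
log_pi = lambda v: -0.5 * v ** 2
x = 10.0 + 3.0 * rng.standard_normal()
chain = [x]
for _ in range(5000):
    x = mh_step(x, log_pi, sigma=0.5, rng=rng)
    chain.append(x)
```

After a burn-in, the chain samples approximately from π; as the slides stress, for any fixed burn-in the resulting average remains biased, which is the problem the coupling construction addresses.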
7. MCMC path
π = N(0, 1), MH with Normal proposal std = 0.5, π_0 = N(10, 3²)
8. MCMC marginal distributions
π = N(0, 1), RWMH with Normal proposal std = 0.5, π_0 = N(10, 3²)
9. Parallel computing
R processors generate unbiased estimators, independently.
The number of estimators completed by time t is random.
There are different ways of aggregating estimators,
either discarding on-going calculations at time t or not.
Different regimes can be considered, e.g.
as R → ∞ for fixed t,
as t → ∞ for fixed R,
as t and R both go to infinity.
Glynn & Heidelberger, Analysis of parallel replicated
simulations under a completion time constraint, 1991.
11. Outline
1 It is about Monte Carlo and bias
2 Unbiased estimators from Markov kernels
3 Design of coupled Markov chains
4 It can fail
5 Beyond parallel computation and burn-in
12. Coupled chains
Generate two Markov chains (X_t) and (Y_t) as follows:
sample X_0 and Y_0 from π_0 (independently or not),
sample X_1 | X_0 ∼ P(X_0, ·),
for t ≥ 1, sample (X_{t+1}, Y_t) | (X_t, Y_{t−1}) ∼ P̄((X_t, Y_{t−1}), ·).
P̄ is such that X_{t+1} | X_t ∼ P(X_t, ·) and Y_t | Y_{t−1} ∼ P(Y_{t−1}, ·),
so each chain marginally evolves according to P.
⇒ X_t and Y_t have the same distribution for all t ≥ 0.
P̄ is also such that there exists τ such that X_τ = Y_{τ−1},
and the chains are faithful.
13. Debiasing idea (one slide version)
Limit as a telescoping sum, for all k ≥ 0:
E_π[h(X)] = lim_{t→∞} E[h(X_t)] = E[h(X_k)] + Σ_{t=k+1}^∞ E[h(X_t) − h(X_{t−1})].
Since for all t ≥ 0, X_t and Y_t have the same distribution,
= E[h(X_k)] + Σ_{t=k+1}^∞ E[h(X_t) − h(Y_{t−1})].
If we can swap expectation and limit,
= E[h(X_k) + Σ_{t=k+1}^∞ (h(X_t) − h(Y_{t−1}))],
so h(X_k) + Σ_{t=k+1}^∞ (h(X_t) − h(Y_{t−1})) is unbiased.
14. Unbiased estimators
The unbiased estimator is given by
H_k(X, Y) = h(X_k) + Σ_{t=k+1}^{τ−1} (h(X_t) − h(Y_{t−1})),
with the convention Σ_{t=k+1}^{τ−1} {·} = 0 if τ − 1 < k + 1.
h(X_k) alone is biased; the other terms correct for the bias.
Cost: τ − 1 calls to P̄ and 1 + max(0, k − τ) calls to P.
Glynn & Rhee, Exact estimation for Markov chain equilibrium
expectations, 2014.
Note: the same reasoning works with arbitrary lags L ≥ 1.
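As a concrete sketch of H_k, consider a toy two-state chain (my illustrative example, not from the paper), with a lag-1 coupled pair built from a maximal coupling of the transition rows; averaging many replicates of H_k recovers E_π[h(X)] without any burn-in tuning:

```python
import numpy as np

# Two-state transition matrix with stationary distribution pi = (2/3, 1/3).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

def max_couple(p, q, rng):
    """Sample (a, b) with a ~ p, b ~ q and P(a = b) maximal (finite spaces)."""
    w = np.minimum(p, q)
    s = w.sum()
    if rng.uniform() < s:                        # meet with maximal probability
        j = rng.choice(len(p), p=w / s)
        return j, j
    a = rng.choice(len(p), p=(p - w) / (1 - s))  # otherwise draw from residuals
    b = rng.choice(len(q), p=(q - w) / (1 - s))
    return a, b

def H_k(h, k, rng):
    """Unbiased estimator H_k(X, Y) from a lag-1 coupled pair of chains."""
    X = [rng.integers(2)]                        # X_0 ~ pi_0 (uniform here)
    X.append(rng.choice(2, p=P[X[0]]))           # X_1 | X_0 ~ P(X_0, .)
    Y = [rng.integers(2)]                        # Y_0 ~ pi_0
    t = 1
    while X[t] != Y[t - 1]:                      # until meeting: X_tau = Y_{tau-1}
        x_next, y_next = max_couple(P[X[t]], P[Y[t - 1]], rng)
        X.append(x_next)
        Y.append(y_next)
        t += 1
    tau = t
    while len(X) <= k:                           # extend merged chain if tau <= k
        X.append(rng.choice(2, p=P[X[-1]]))
    return h(X[k]) + sum(h(X[t]) - h(Y[t - 1]) for t in range(k + 1, tau))

rng = np.random.default_rng(0)
estimates = [H_k(lambda v: v, 2, rng) for _ in range(4000)]
# The average of the unbiased estimates approaches E_pi[h(X)] = 1/3.
```

The loop stops sampling Y once the chains meet, which implicitly uses faithfulness: after τ the two chains are identical and all further correction terms vanish.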
15. Conditions
Jacob, O’Leary, Atchadé, Unbiased MCMC with couplings.
1. The marginal chain converges: E[h(X_t)] → E_π[h(X)], and h(X_t) has (2 + η)-finite moments for all t.
2. The meeting time τ has geometric tails: ∃ C < +∞, ∃ δ ∈ (0, 1), ∀ t ≥ 0: P(τ > t) ≤ C δ^t.
3. The chains stay together: X_t = Y_{t−1} for all t ≥ τ.
Condition 2 is itself implied by e.g. a geometric drift condition.
Under these conditions, H_k(X, Y) is unbiased and has finite expected cost and finite variance, for all k.
16. Conditions: update
Middleton, Deligiannidis, Doucet, Jacob, Unbiased MCMC for
intractable target distributions.
1. The marginal chain converges: E[h(X_t)] → E_π[h(X)], and h(X_t) has (2 + η)-finite moments for all t.
2. The meeting time τ has polynomial tails: ∃ C < +∞, ∃ δ > 2(2η^{−1} + 1), ∀ t ≥ 0: P(τ > t) ≤ C t^{−δ}.
3. The chains stay together: X_t = Y_{t−1} for all t ≥ τ.
Condition 2 is itself implied by e.g. a polynomial drift condition.
17. Improved unbiased estimators
Efficiency matters, so in practice we recommend a variation of the previous estimator, defined for integers k ≤ m as
H_{k:m}(X, Y) = (1/(m − k + 1)) Σ_{t=k}^m H_t(X, Y),
which can also be written as
(1/(m − k + 1)) Σ_{t=k}^m h(X_t) + Σ_{t=k+1}^{τ−1} min(1, (t − k)/(m − k + 1)) (h(X_t) − h(Y_{t−1})),
i.e. a standard MCMC average + a bias correction term.
As k → ∞, the bias correction is zero with increasing probability.
Note: changing the lag is another way of modifying efficiency while preserving the lack of bias.
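In code, H_{k:m} splits into exactly those two pieces. The helper below (names and trajectory layout are mine, operating on precomputed coupled trajectories with meeting time τ and X_τ = Y_{τ−1}) makes the structure explicit:

```python
import numpy as np

def H_km(h, X, Y, tau, k, m):
    """Time-averaged estimator H_{k:m} = standard MCMC average + bias correction.

    X holds states X_0, ..., X_max(m, tau-1); Y holds Y_0, ..., Y_{tau-1};
    tau is the meeting time of a faithful lag-1 coupling (X_tau = Y_{tau-1})."""
    mcmc_average = np.mean([h(X[t]) for t in range(k, m + 1)])
    bias_correction = sum(
        min(1.0, (t - k) / (m - k + 1)) * (h(X[t]) - h(Y[t - 1]))
        for t in range(k + 1, tau))
    return mcmc_average + bias_correction

# Toy trajectories with tau = 3 (X_3 = Y_2): the correction has a single term.
X = [0, 1, 2, 3, 4]
Y = [5, 7, 3]
value = H_km(lambda v: v, X, Y, tau=3, k=1, m=4)
# mean(X_1..X_4) = 2.5, correction = (1/4) * (X_2 - Y_1) = -1.25, value = 1.25
```

When the chains have already met by time k + 1, the correction sum is empty and H_{k:m} reduces to the usual burn-in MCMC average, matching the remark that the correction vanishes with increasing probability as k grows.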
18. Efficiency
Writing H_{k:m}(X, Y) = MCMC_{k:m} + BC_{k:m},
and denoting the MSE of MCMC_{k:m} by MSE_{k:m},
V[H_{k:m}(X, Y)] ≤ MSE_{k:m} + 2 √(MSE_{k:m} · E[BC²_{k:m}]) + E[BC²_{k:m}].
Under a geometric drift condition and regularity assumptions on h, for some δ < 1, C < ∞,
E[BC²_{k:m}] ≤ C δ^k / (m − k + 1)²,
and δ is directly related to the tails of the meeting time.
Similarly, under polynomial drift we obtain a term of the form k^{−δ} in the numerator instead of δ^k.
19. Signed measure estimator
Replacing function evaluations by delta masses leads to
π̂(·) = (1/(m − k + 1)) Σ_{t=k}^m δ_{X_t}(·) + Σ_{t=k+1}^{τ−1} min(1, (t − k)/(m − k + 1)) (δ_{X_t}(·) − δ_{Y_{t−1}}(·)),
which is of the form
π̂(·) = Σ_{n=1}^N ω_n δ_{Z_n}(·),
where Σ_{n=1}^N ω_n = 1 but some ω_n might be negative.
Unbiasedness reads: E[E_π̂[h(X)]] = E_π[h(X)].
20. Assessing convergence of MCMC
Total variation distance between X_k ∼ π_k and π = lim_{k→∞} π_k:
‖π_k − π‖_TV = (1/2) sup_{h:|h|≤1} |E[h(X_k)] − E_π[h(X)]|
= (1/2) sup_{h:|h|≤1} |E[Σ_{t=k+1}^{τ−1} (h(X_t) − h(Y_{t−1}))]|
≤ E[max(0, τ − k − 1)].
[Figure: histogram (density) of meeting times, and the resulting upper bound as a function of k, on a log scale.]
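The bound E[max(0, τ − k − 1)] is straightforward to estimate from independent meeting times. A sketch (the meeting times below are made-up placeholders standing in for repeated coupled runs):

```python
import numpy as np

def tv_upper_bound(meeting_times, k):
    """Monte Carlo estimate of E[max(0, tau - k - 1)], an upper bound on
    the total variation distance between pi_k and pi."""
    taus = np.asarray(meeting_times, dtype=float)
    return np.maximum(0.0, taus - k - 1.0).mean()

# Hypothetical meeting times from independent coupled runs.
taus = [3, 5, 4, 12, 7, 6]
bounds = [tv_upper_bound(taus, k) for k in range(15)]
# The bound is non-increasing in k and hits zero once k >= max(tau) - 1.
```

This gives a practical, simulation-based diagnostic for how many iterations are needed before the marginal distribution π_k is close to π.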
21. Assessing convergence of MCMC
With L-lag couplings, τ^(L) = inf{t ≥ L : X_t = Y_{t−L}},
d_TV(π_k, π) ≤ E[0 ∨ ⌈(τ^(L) − L − k)/L⌉],
d_W(π_k, π) ≤ E[Σ_{j=1}^{⌈(τ^(L)−L−k)/L⌉} ‖X_{k+jL} − Y_{k+(j−1)L}‖_1].
[Figure: d_TV upper bounds against iteration count (log scale) for two samplers, labelled SSG and PT.]
Biswas & Jacob, Estimating Convergence of Markov chains with
L-Lag Couplings, 2019.
22. Outline
1 It is about Monte Carlo and bias
2 Unbiased estimators from Markov kernels
3 Design of coupled Markov chains
4 It can fail
5 Beyond parallel computation and burn-in
23. Designing coupled chains
To implement the proposed unbiased estimators,
we need to sample from a Markov kernel P̄
such that, when (X_{t+1}, Y_t) is sampled from P̄((X_t, Y_{t−1}), ·),
marginally X_{t+1} | X_t ∼ P(X_t, ·) and Y_t | Y_{t−1} ∼ P(Y_{t−1}, ·),
it is possible that X_{t+1} = Y_t exactly for some t ≥ 0,
and if X_t = Y_{t−1}, then X_{t+1} = Y_t almost surely.
24. Couplings of MCMC algorithms
Many practical couplings in the literature. . .
Propp & Wilson, Exact sampling with coupled Markov chains
and applications to statistical mechanics, Random Structures &
Algorithms, 1996.
Johnson, Studying convergence of Markov chain Monte Carlo
algorithms using coupled sample paths, JASA, 1996.
Neal, Circularly-coupled Markov chain sampling, UoT tech
report, 1999.
Pinto & Neal, Improving Markov chain Monte Carlo estimators
by coupling to an approximating chain, UoT tech report, 2001.
Glynn & Rhee, Exact estimation for Markov chain equilibrium
expectations, Journal of Applied Probability, 2014.
25. Couplings
(X, Y) follows a coupling of p and q if X ∼ p and Y ∼ q.
The coupling inequality states that
P(X = Y) ≤ 1 − ‖p − q‖_TV,
for any coupling, with ‖p − q‖_TV = (1/2) ∫ |p(x) − q(x)| dx.
Maximal couplings achieve the bound.
27. Maximal coupling: sampling algorithm
Requires: evaluations of p and q, sampling from p and q.
1. Sample X ∼ p and W ∼ U([0, 1]).
If W ≤ q(X)/p(X), output (X, X).
2. Otherwise, sample Y ∼ q and W ∼ U([0, 1])
until W > p(Y)/q(Y), and output (X, Y).
Output: a pair (X, Y) such that X ∼ p, Y ∼ q,
and P(X = Y) is maximal.
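The two-step sampler above translates directly into code. Here is a sketch for two Normal distributions (function names are mine, not from any package); since both densities share the same normalizing constant, unnormalized log-densities suffice for the ratios:

```python
import numpy as np

def maximal_coupling(rng, sample_p, logpdf_p, sample_q, logpdf_q):
    """Sample (X, Y) with X ~ p, Y ~ q and P(X = Y) maximal."""
    x = sample_p(rng)
    if np.log(rng.uniform()) <= logpdf_q(x) - logpdf_p(x):    # W <= q(X)/p(X)
        return x, x                                           # output (X, X)
    while True:                                               # otherwise, reject
        y = sample_q(rng)
        if np.log(rng.uniform()) > logpdf_p(y) - logpdf_q(y): # W > p(Y)/q(Y)
            return x, y

# Couple p = N(0, 1) and q = N(1, 1); then P(X = Y) = 1 - ||p - q||_TV,
# which equals 2 * Phi(-1/2) ~ 0.617 for these two Normals.
rng = np.random.default_rng(42)
logpdf = lambda v, m: -0.5 * (v - m) ** 2   # same constant for both, so omitted
pairs = [maximal_coupling(rng,
                          lambda r: r.standard_normal(),
                          lambda v: logpdf(v, 0.0),
                          lambda r: 1.0 + r.standard_normal(),
                          lambda v: logpdf(v, 1.0))
         for _ in range(20000)]
meet_freq = np.mean([x == y for x, y in pairs])
```

The empirical meeting frequency matches 1 − ‖p − q‖_TV, and each marginal is exactly p or q, which is what the coupling inequality says is the best achievable.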
28. Back to Metropolis–Hastings (kernel P)
At each iteration t, with the Markov chain at state X_t:
1. propose X′ ∼ k(X_t, ·),
2. sample U ∼ U([0, 1]),
3. if U ≤ [π(X′) k(X′, X_t)] / [π(X_t) k(X_t, X′)], set X_{t+1} = X′; otherwise set X_{t+1} = X_t.
How to propagate two MH chains from states X_t and Y_{t−1}
such that {X_{t+1} = Y_t} can happen?
29. Coupling of Metropolis–Hastings (kernel ¯P)
At each iteration t, with two Markov chains at states X_t, Y_{t−1}:
1. propose (X′, Y′) from a maximal coupling of k(X_t, ·) and k(Y_{t−1}, ·),
2. sample U ∼ U([0, 1]),
3. if U ≤ [π(X′) k(X′, X_t)] / [π(X_t) k(X_t, X′)], set X_{t+1} = X′; otherwise set X_{t+1} = X_t,
4. if U ≤ [π(Y′) k(Y′, Y_{t−1})] / [π(Y_{t−1}) k(Y_{t−1}, Y′)], set Y_t = Y′; otherwise set Y_t = Y_{t−1}.
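Putting the pieces together: a sketch of the coupled random-walk MH kernel on π = N(0, 1), with maximally coupled proposals and a shared uniform (helper names and the starting points are mine):

```python
import numpy as np

def max_coupled_proposals(mx, my, sigma, rng):
    """Maximal coupling of the proposals k(x, .) = N(mx, sigma^2) and
    k(y, .) = N(my, sigma^2), via the rejection sampler of slide 27."""
    logpdf = lambda v, m: -0.5 * ((v - m) / sigma) ** 2
    x = mx + sigma * rng.standard_normal()
    if np.log(rng.uniform()) <= logpdf(x, my) - logpdf(x, mx):
        return x, x
    while True:
        y = my + sigma * rng.standard_normal()
        if np.log(rng.uniform()) > logpdf(y, mx) - logpdf(y, my):
            return x, y

def coupled_mh_step(x, y, log_pi, sigma, rng):
    """One draw from the coupled kernel P-bar: coupled proposals, common U."""
    xp, yp = max_coupled_proposals(x, y, sigma, rng)
    log_u = np.log(rng.uniform())                 # the same U for both chains
    x_new = xp if log_u <= log_pi(xp) - log_pi(x) else x
    y_new = yp if log_u <= log_pi(yp) - log_pi(y) else y
    return x_new, y_new

rng = np.random.default_rng(7)
log_pi = lambda v: -0.5 * v ** 2                  # pi = N(0, 1)
x, y = 4.0, -4.0
t, meeting_time = 0, None
while meeting_time is None and t < 100_000:
    x, y = coupled_mh_step(x, y, log_pi, sigma=1.0, rng=rng)
    t += 1
    if x == y:
        meeting_time = t
```

The kernel is faithful by construction: once x == y, the maximal coupling returns identical proposals and the common U yields identical accept/reject decisions, so the chains stay together forever.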
30. Coupling of Gibbs
Gibbs sampler: update component i of the chain,
leaving π(dx^i | x^1, . . . , x^{i−1}, x^{i+1}, . . . , x^d) invariant.
For instance, we can propose X′ ∼ k(X_t^i, ·) to replace X_t^i,
and accept or not with a Metropolis–Hastings step.
These proposals can be maximally coupled across two chains, at each component update.
The chains meet when all components have met.
Likewise we can couple parallel tempering chains,
and meeting occurs when entire ensembles of chains meet.
31. Hamiltonian Monte Carlo
Introduce the potential energy U(q) = − log π(q),
and the total energy E(q, p) = U(q) + (1/2)|p|².
Hamiltonian dynamics for (q(s), p(s)), where s ≥ 0:
(d/ds) q(s) = ∇_p E(q(s), p(s)),
(d/ds) p(s) = −∇_q E(q(s), p(s)).
Solving the Hamiltonian dynamics exactly is not feasible,
but discretization + a Metropolis–Hastings correction ensure that π remains invariant.
Common random numbers can make two HMC chains contract,
under assumptions on the target such as strong log-concavity.
Heng & Jacob, Unbiased HMC with couplings, Biometrika, 2019.
32. Coupling of HMC
Figure 2 of Mangoubi & Smith, Rapid mixing of HMC on strongly log-concave distributions, 2017:
coupling two copies X_1, X_2, . . . (blue) and Y_1, Y_2, . . . (green) of HMC by choosing the same momentum p_i at every step.
See also Bou-Rabee, Eberle & Zimmer, Coupling and Convergence
for Hamiltonian Monte Carlo, 2018.
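A sketch of the common-momentum idea for π = N(0, 1), so U(q) = q²/2, which is strongly log-concave. Exact meeting does not occur here; the point is that the distance between the two copies contracts. Step sizes and names are my own choices:

```python
import numpy as np

def leapfrog(q, p, eps, L, grad_U):
    """Leapfrog discretization of the Hamiltonian dynamics."""
    p = p - 0.5 * eps * grad_U(q)
    for step in range(L):
        q = q + eps * p
        if step < L - 1:
            p = p - eps * grad_U(q)
    p = p - 0.5 * eps * grad_U(q)
    return q, p

def coupled_hmc_step(qx, qy, eps, L, U, grad_U, rng):
    """One coupled HMC step: the two copies share the momentum draw and
    the uniform used in the Metropolis-Hastings correction."""
    p0 = rng.standard_normal()                    # common momentum
    log_u = np.log(rng.uniform())                 # common accept threshold
    energy = lambda q, p: U(q) + 0.5 * p ** 2
    qx_new, px = leapfrog(qx, p0, eps, L, grad_U)
    qy_new, py = leapfrog(qy, p0, eps, L, grad_U)
    if log_u <= energy(qx, p0) - energy(qx_new, px):
        qx = qx_new
    if log_u <= energy(qy, p0) - energy(qy_new, py):
        qy = qy_new
    return qx, qy

rng = np.random.default_rng(11)
U = lambda q: 0.5 * q ** 2                        # pi = N(0, 1)
grad_U = lambda q: q
qx, qy = 3.0, -3.0
distances = [abs(qx - qy)]
for _ in range(100):
    qx, qy = coupled_hmc_step(qx, qy, eps=0.1, L=10, U=U, grad_U=grad_U, rng=rng)
    distances.append(abs(qx - qy))
```

Since contraction alone never produces exact meeting on a continuous space, Heng & Jacob combine such contracting HMC steps with occasional coupled MH steps so that the chains meet exactly.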
33. Particle Independent Metropolis–Hastings
Initialization: run SMC, obtain π̂^(0)(·), Ẑ^(0),
where E[Ẑ^(0)] = Z, the normalizing constant of π.
At each iteration t, given (π̂^(t)(·), Ẑ^(t)):
1. run SMC, obtain (π̂′(·), Ẑ′),
2. sample U ∼ U([0, 1]),
3. if U ≤ Ẑ′ / Ẑ^(t), set the (t + 1)-th state to (π̂′(·), Ẑ′); otherwise set the (t + 1)-th state to (π̂^(t)(·), Ẑ^(t)).
For any number of particles, T^{−1} Σ_{t=1}^T π̂^(t)(·) → π as T → ∞.
Andrieu, Doucet, Holenstein, Particle MCMC, JRSS B, 2010.
34. Coupled Particle Independent Metropolis–Hastings
At each iteration t, given (π̂^(t)(·), Ẑ^(t)) and (π̃^(t−1)(·), Z̃^(t−1)):
1. run SMC, obtain (π̂′(·), Ẑ′),
2. sample U ∼ U([0, 1]),
3. if U ≤ Ẑ′ / Ẑ^(t), set the (t + 1)-th "1st" state to (π̂′(·), Ẑ′); otherwise set the (t + 1)-th "1st" state to (π̂^(t)(·), Ẑ^(t)),
4. if U ≤ Ẑ′ / Z̃^(t−1), set the t-th "2nd" state to (π̂′(·), Ẑ′); otherwise set the t-th "2nd" state to (π̃^(t−1)(·), Z̃^(t−1)).
35. Unbiased SMC
PIMH turns any SMC sampler into an MCMC scheme
that can be debiased using the coupled chains machinery.
Thus any MCMC algorithm that can be plugged into an SMC sampler can then be de-biased.
No need for couplings of the MCMC chains themselves.
Middleton, Deligiannidis, Doucet, Jacob, Unbiased Smoothing using
Particle Independent Metropolis-Hastings, AISTATS, 2019.
36. Outline
1 It is about Monte Carlo and bias
2 Unbiased estimators from Markov kernels
3 Design of coupled Markov chains
4 It can fail
5 Beyond parallel computation and burn-in
37. Bimodal target
Target is mixture of univariate Normal distributions:
π = 0.5 · N(−4, 1) + 0.5 · N(+4, 1).
MCMC: random walk Metropolis–Hastings,
with proposal standard deviation σ that will vary.
Initial distribution π0 will vary.
38. Bimodal target
With σ = 3, π_0 = N(10, 10²). . .
[Figure: estimated density over x ∈ (−10, 10).]
39. Bimodal target
With σ = 1, π_0 = N(10, 10²). . .
[Figure: estimated density over x ∈ (−10, 10).]
40. Bimodal target
With σ = 1, π_0 = N(10, 1²). . .
[Figure: estimated density over x ∈ (−10, 10).]
41. Outline
1 It is about Monte Carlo and bias
2 Unbiased estimators from Markov kernels
3 Design of coupled Markov chains
4 It can fail
5 Beyond parallel computation and burn-in
42. Unbiased property
Unbiased MCMC estimators can be used to tackle problems for
which standard MCMC methods are ill-suited.
For instance, problems where one needs to approximate
∫ [∫ h(x_1, x_2) π_2(x_2 | x_1) dx_2] π_1(x_1) dx_1,
or equivalently
E_1[E_2[h(X_1, X_2) | X_1]].
43. Example 1: cut distribution
First model, parameter θ1, data Y1, posterior π1(θ1|Y1).
Second model, parameter θ2, data Y2, conditional posterior:
π_2(θ_2 | Y_2, θ_1) = p_2(θ_2 | θ_1) p_2(Y_2 | θ_1, θ_2) / p_2(Y_2 | θ_1).
Plummer, Cuts in Bayesian graphical models, 2015.
Jacob, Murray, Holmes, Robert, Better together? Statistical learning
in models made of modules.
44. Example: epidemiological study
Model of virus prevalence:
∀ i = 1, . . . , I: Z_i ∼ Binomial(N_i, φ_i),
where Z_i is the number of women infected with high-risk HPV in a sample of size N_i in country i.
Impact of prevalence on cervical cancer occurrence:
∀ i = 1, . . . , I: Y_i ∼ Poisson(λ_i T_i), log(λ_i) = θ_{2,1} + θ_{2,2} φ_i,
where Y_i is the number of cancer cases arising from T_i woman-years of follow-up in country i.
Plummer, Cuts in Bayesian graphical models, 2015.
45. Example: epidemiological study
Approximations of the cut distribution marginals:
[Figure: marginal densities of θ_{2,1} (roughly −2.5 to −1.5) and θ_{2,2} (roughly 10 to 25).]
Red curves were obtained by a long MCMC run targeting π_2(θ_2 | Y_2, θ_1), for draws θ_1 from the first posterior π_1(θ_1 | Y_1).
Black bars were obtained with unbiased MCMC.
46. Example 2: normalizing constants
We might be interested in the normalizing constant Z of π(x) ∝ exp(−U(x)), defined as Z = ∫ exp(−U(x)) dx.
Introduce
∀ λ ∈ [0, 1]: π_λ(x) ∝ exp(−U_λ(x)).
Thermodynamic integration or path sampling identity:
log(Z_1/Z_0) = − ∫_0^1 [∫ ∂_λ U_λ(x) π_λ(dx)] q(λ)^{−1} q(dλ),
where q(dλ) is any distribution supported on [0, 1].
Rischard, Jacob, Pillai, Unbiased estimation of log normalizing
constants with applications to Bayesian cross-validation.
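A quick numerical check of the identity on a Gaussian path where everything is available in closed form (this toy example is mine, not from the paper): take π_λ = N(0, v_λ) with 1/v_λ = (1 − λ)/v_0 + λ/v_1, so U_λ(x) = x²/(2 v_λ), Z_λ = √(2π v_λ), and log(Z_1/Z_0) = (1/2) log(v_1/v_0):

```python
import numpy as np

v0, v1 = 1.0, 4.0
rng = np.random.default_rng(3)
n = 200_000

# q = U([0, 1]), so q(lambda) = 1 and the identity becomes a plain average of
# -d/dlambda U_lambda(x) over lambda ~ q and x ~ pi_lambda.
lam = rng.uniform(size=n)
v_lam = 1.0 / ((1.0 - lam) / v0 + lam / v1)
x = np.sqrt(v_lam) * rng.standard_normal(n)       # exact draws from pi_lambda
                                                  # (in general: MCMC draws)
dU_dlam = 0.5 * x ** 2 * (1.0 / v1 - 1.0 / v0)    # d/dlambda of x^2 / (2 v_lam)
log_ratio_hat = -np.mean(dU_dlam)
log_ratio_exact = 0.5 * np.log(v1 / v0)
```

In realistic problems the inner expectations under π_λ are intractable, and the paper's point is that they can be replaced by unbiased MCMC estimators while keeping the overall estimator of log(Z_1/Z_0) unbiased.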
47. Example 3: Bayesian cross-validation
Data y = {y_1, . . . , y_n}, made of n units.
Partition y into training (T) and validation (V) sets.
For each split y = {T, V}, assess predictive performance with
log p(V | T) = log ∫ p(V | θ, T) p(dθ | T).
Note: log p(V | T) is a log-ratio of normalizing constants.
Finally, we might want to approximate
CV = (n choose n_T)^{−1} Σ_{T,V} log p(V | T).
Rischard, Jacob, Pillai, Unbiased estimation of log normalizing
constants with applications to Bayesian cross-validation.
48. Discussion
Perfect samplers, designed to sample i.i.d. from π, would
yield the same benefits and more.
What about regeneration?
Implications of lack of bias are numerous.
Proposed estimators can be obtained
using coupled MCMC kernels,
or using the “MCMC → SMC → coupled PIMH” route.
Cannot work if underlying MCMC doesn’t work. . .
but parallelism allows for more expensive MCMC kernels.
Thank you for listening!
49. References
with John O’Leary, Yves F. Atchadé
Unbiased Markov chain Monte Carlo with couplings, 2019.
with Fredrik Lindsten, Thomas Schön
Smoothing with Couplings of Conditional Particle Filters, 2018.
with Jeremy Heng
Unbiased Hamiltonian Monte Carlo with couplings, 2019.
with Lawrence Middleton, George Deligiannidis, Arnaud
Doucet
Unbiased Markov chain Monte Carlo for intractable target
distributions, 2019.
Unbiased Smoothing using Particle Independent
Metropolis-Hastings, 2019.
50. References
with Maxime Rischard, Natesh Pillai
Unbiased estimation of log normalizing constants with
applications to Bayesian cross-validation, 2019.
with Niloy Biswas
Estimating Convergence of Markov chains with L-Lag Couplings,
2019.
Funding provided by the National Science Foundation, grants
DMS-1712872 and DMS-1844695.