Markov chain Monte Carlo (MCMC) methods are commonly used to approximate properties of target probability distributions. However, MCMC estimators are generally biased for any fixed number of iterations. These slides discuss techniques for constructing unbiased estimators from MCMC output, including regeneration, sequential Monte Carlo samplers, and coupled Markov chains. In particular, running two coupled Markov chains and combining the differences of their values up to the time they meet yields an unbiased estimator, provided certain conditions hold.
Markov chain Monte Carlo methods and some attempts at parallelizing them
1. Markov chain Monte Carlo methods and some attempts at parallelizing them
Pierre E. Jacob
Department of Statistics, Harvard University
(and many fantastic collaborators!)
MIT IDS.190, October 2019
blog: https://statisfaction.wordpress.com
2. Setting
Continuous or discrete space of dimension d.
Target probability distribution π, with probability density/mass function x ↦ π(x).
Goal: approximate π, e.g. compute
Eπ[h(X)] = ∫ h(x) π(x) dx = π(h),
for a class of "test" functions h.
3. Monte Carlo
Originates from physics, and is still very much a research topic in physics, e.g. K. Binder et al., Monte Carlo methods in statistical physics, 2012.
Often state-of-the-art for numerical integration, e.g. E. Novak, Some results on the complexity of numerical integration, 2016.
Plays an important role in Bayesian inference, e.g. P. Green et al., Bayesian computation: a summary of the current state, and samples backwards and forwards, 2015.
Can be useful for many other tasks in statistics, e.g. J. Besag, MCMC for Statistical Inference, 2001.
See also P. Diaconis, The MCMC revolution, 2009.
4. Outline
1 Monte Carlo and bias
2 Sequential Monte Carlo samplers
3 Regeneration
4 Unbiased estimators from coupled Markov chains
5 Bonus: new convergence diagnostics for MCMC
5. Outline
1 Monte Carlo and bias
2 Sequential Monte Carlo samplers
3 Regeneration
4 Unbiased estimators from coupled Markov chains
5 Bonus: new convergence diagnostics for MCMC
6. Markov chain Monte Carlo
Initially X0 ∼ π0, then Xt | Xt−1 ∼ P(Xt−1, ·) for t = 1, …, T.
Estimator:
(1/(T − b)) Σ_{t=b+1}^{T} h(Xt),
where b iterations are discarded as burn-in.
Might converge to Eπ[h(X)] as T → ∞ by the ergodic theorem.
Biased for any fixed b, T, unless π0 is equal to π.
Averaging independent copies of such estimators, for fixed b and T, would not provide a consistent estimator of Eπ[h(X)] as the number of independent copies goes to infinity.
7. Example: Metropolis–Hastings kernel P
With the Markov chain at state Xt,
1 propose X′ ∼ q(Xt, ·),
2 sample U ∼ Uniform(0, 1),
3 if U ≤ π(X′)q(X′, Xt) / (π(Xt)q(Xt, X′)), set Xt+1 = X′; otherwise set Xt+1 = Xt.
Hastings, Monte Carlo sampling methods using Markov chains and their applications, 1970.
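A minimal Python sketch of this kernel, assuming the running example of these slides (target π = N(0, 1), Normal random-walk proposal with std 0.5, initialization from π0 = N(10, 3²)); with a symmetric proposal the q ratio cancels. The final lines compute the burn-in estimator of slide 6.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_pi(x):
    # log-density of the target N(0, 1), up to an additive constant
    return -0.5 * x**2

def rwmh(n_iters, proposal_std=0.5):
    xs = np.empty(n_iters + 1)
    xs[0] = rng.normal(10.0, 3.0)  # X_0 ~ pi_0 = N(10, 3^2)
    for t in range(n_iters):
        prop = xs[t] + proposal_std * rng.normal()  # symmetric proposal
        # accept if U <= pi(X')/pi(X_t); the q ratio cancels for a random walk
        if np.log(rng.uniform()) <= log_pi(prop) - log_pi(xs[t]):
            xs[t + 1] = prop
        else:
            xs[t + 1] = xs[t]
    return xs

chain = rwmh(n_iters=10_000)
b = 100  # burn-in; the average below remains biased for any fixed b, T
print(chain[b + 1:].mean())
```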
8. MCMC trace
[Trace plot of the chain.] π = N(0, 1), RWMH with Normal proposal std = 0.5, π0 = N(10, 3²).
9. MCMC marginal distributions
[Marginal distributions of Xt across iterations.] π = N(0, 1), RWMH with Normal proposal std = 0.5, π0 = N(10, 3²).
10. Independent replicates and MCMC
The bias is the difference |E[h(Xt)] − Eπ[h(X)]| for fixed t.
The bias has always been recognized as an obstacle on the way to parallelizing Monte Carlo calculations, e.g.
"When running parallel Monte Carlo with many computers, it is more important to start with an unbiased (or low-bias) estimate than with a low-variance estimate."
Rosenthal, Parallel computing and Monte Carlo algorithms, 2000.
For general statistical estimators, mean squared error is often the preferred measure of accuracy.
In Monte Carlo, variance can be both quantified and arbitrarily reduced with independent runs, but neither is true for the bias.
11. Outline
1 Monte Carlo and bias
2 Sequential Monte Carlo samplers
3 Regeneration
4 Unbiased estimators from coupled Markov chains
5 Bonus: new convergence diagnostics for MCMC
12. Importance sampling
Importance sampling relies on a proposal distribution q, chosen by the user to approximate π.
1 Sample X1:N ∼ q, independently.
2 Weight w(Xn) = π(Xn)/q(Xn).
3 Normalize the weights to obtain W1:N.
The procedure yields
π̂N(·) = Σ_{n=1}^{N} Wn δ_{Xn}(·),
which approximates π as N → ∞ under conditions on q and π.
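A sketch of these three steps as self-normalized importance sampling, assuming an illustrative pair of distributions (target N(0, 1), proposal N(1, 2²)):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
N = 100_000
xs = rng.normal(1.0, 2.0, size=N)                    # 1. X_{1:N} ~ q = N(1, 2^2)
w = norm.pdf(xs, 0.0, 1.0) / norm.pdf(xs, 1.0, 2.0)  # 2. w(X_n) = pi(X_n)/q(X_n)
W = w / w.sum()                                      # 3. normalized weights W_{1:N}
print(np.sum(W * xs))                                # estimates E_pi[X] = 0
```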
13. Importance sampling with MCMC proposals
Finding a proposal q that approximates π might be difficult. Can we use MCMC as an importance sampling proposal?
Something that would look like:
1 Sample X1:N by running N chains for T steps.
2 Weight w(Xn) somehow (?).
3 Normalize the weights to obtain W1:N.
An immediate difficulty is that the marginal distributions of MCMC chains are generally intractable, so importance weights seem hard to compute.
14. Annealed importance sampling
For instance, sample X0 ∼ π0 and X1 | X0 ∼ P(X0, ·).
Problem: the marginal distribution of X1 is intractable.
Introduce the backward kernel L(x1, x0) = P(x0, x1)π(x0)/π(x1).
Then consider
proposal distribution q̄(x0, x1) = π0(x0)P(x0, x1),
target distribution π̄(x0, x1) = π(x1)L(x1, x0).
Writing down the importance sampling procedure leads to
tractable weights ∝ π(x0)/π0(x0),
desired marginal distribution: ∫ π̄(x0, x1) dx0 = π(x1).
Neal, Annealed importance sampling, 2001.
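A sketch of this one-step construction, assuming illustrative choices π0 = N(1, 2²), π = N(0, 1), and a single random-walk MH move as the kernel P; the weights depend only on x0, and the weighted x1 samples approximate π. (With π0 far from π the weights degenerate, which is what the tempered sequences of the next slide address.)

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
N = 100_000
x0 = rng.normal(1.0, 2.0, size=N)                     # X_0 ~ pi_0
logw = norm.logpdf(x0, 0, 1) - norm.logpdf(x0, 1, 2)  # log pi(x_0)/pi_0(x_0)

# one MH move X_1 | X_0 ~ P(X_0, .); P leaves pi invariant, weights unchanged
prop = x0 + 0.5 * rng.normal(size=N)
accept = np.log(rng.uniform(size=N)) <= norm.logpdf(prop, 0, 1) - norm.logpdf(x0, 0, 1)
x1 = np.where(accept, prop, x0)

W = np.exp(logw - logw.max())
W /= W.sum()
print(np.sum(W * x1))                                 # estimates E_pi[X] = 0
```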
15. Sequential Monte Carlo samplers
Del Moral, Doucet & Jasra, SMC samplers, 2006.
AIS and SMC samplers work by introducing a sequence of target distributions πt, for t = 0, …, T, and a sequence of MCMC kernels Pt targeting πt.
Then N chains start from π0 and
move through the specified Markov kernels,
are weighted using ratios of successive target distributions,
are resampled according to the weights (in SMC samplers).
At the final step T, the weighted samples approximate π.
The resampling steps induce interaction between the chains, which possibly means communication between machines.
Whiteley, Lee & Heine, On the role of interaction in sequential Monte Carlo algorithms, 2016.
16. Sequential Monte Carlo samplers
[Illustration.] π = N(0, 1), adaptive SMC sampler with MH moves, π0 = N(10, 3²).
17. Outline
1 Monte Carlo and bias
2 Sequential Monte Carlo samplers
3 Regeneration
4 Unbiased estimators from coupled Markov chains
5 Bonus: new convergence diagnostics for MCMC
18. Regeneration in Markov chain samplers
Mykland, Tierney & Yu, Regeneration in Markov chain samplers, 1995.
[Trace plot: x versus iteration, 0 to 200.]
We might be able to identify regeneration times (Tn)n≥1 such that the tours (X_{T_{n−1}}, …, X_{T_n − 1}) are i.i.d., and such that
Σ_{n=1}^{N} Σ_{t=T_{n−1}}^{T_n − 1} h(Xt) / Σ_{n=1}^{N} (Tn − Tn−1) → Eπ[h(X)] almost surely as N → ∞,
…but it might be difficult to identify these times.
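A sketch of this ratio estimator, assuming hypothetical inputs: the chain values xs and an array T of identified regeneration times, so that the tours run from T[0] up to (but excluding) T[-1].

```python
import numpy as np

def regenerative_estimate(h, xs, T):
    # sum of h over the complete tours, divided by the total tour length
    num = sum(h(xs[t]) for t in range(T[0], T[-1]))
    den = T[-1] - T[0]
    return num / den
```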
19. Brockwell and Kadane's regeneration technique
Design a new chain such that regeneration is easier to identify.
State space E ∪ {α}, Markov kernel P̃ on E ∪ {α} that targets π̃, such that π̃ is equal to π on E.
Set π̃(α) (to be chosen), and design a "re-entry" proposal φ on E.
If Xt = α, propose X′ ∼ φ on E, with acceptance probability min(1, π(X′)/(π̃(α)φ(X′)));
if Xt ∈ E, propose a move to α, with acceptance probability min(1, π̃(α)φ(Xt)/π(Xt)).
Perform these moves with probability ω; otherwise sample Xt+1 ∼ P(Xt, ·) if Xt ∈ E, and set Xt+1 = α if Xt = α.
With the new chain, every re-entry into E is a regeneration.
20. Illustration of regeneration technique
π = N(0, 1), MH with Normal proposal std = 0.5, π0 = N(10, 3²).
Set π̃(α) = 1, φ = N(2, 1), ω = 0.1.
[Trace plot: x versus iteration, 0 to 200.]
Brockwell & Kadane, Identification of regeneration times in MCMC simulation, with application to adaptive schemes, 2005.
See also Nummelin, MC's for MCMC'ists, 2002.
21. Outline
1 Monte Carlo and bias
2 Sequential Monte Carlo samplers
3 Regeneration
4 Unbiased estimators from coupled Markov chains
5 Bonus: new convergence diagnostics for MCMC
22. Coupled chains
Glynn & Rhee, Exact estimation for MC equilibrium expectations, 2014.
Generate two chains (Xt) and (Yt) as follows:
sample X0 and Y0 from π0 (independently, or not),
sample X1 | X0 ∼ P(X0, ·),
for t ≥ 1, sample (Xt+1, Yt) | (Xt, Yt−1) ∼ P̄((Xt, Yt−1), ·).
P̄ must be such that
Xt+1 | Xt ∼ P(Xt, ·) and Yt | Yt−1 ∼ P(Yt−1, ·) (thus Xt and Yt have the same distribution for all t ≥ 0),
there exists a random time τ such that Xt = Yt−1 for all t ≥ τ (the chains meet and remain "faithful").
23. Metropolis on Normal target: coupled paths
[Coupled trace plots: x versus iteration, 0 to 200.]
π = N(0, 1), RWMH with Normal proposal std = 0.5, π0 = N(10, 3²).
24. Metropolis on Normal target: coupled paths
[Coupled trace plots: x versus iteration, 0 to 200.]
π = N(0, 1), RWMH with Normal proposal std = 0.5, π0 = N(10, 3²).
25. Debiasing idea (one slide version)
Limit as a telescoping sum, for all k ≥ 0:
Eπ[h(X)] = lim_{t→∞} E[h(Xt)] = E[h(Xk)] + Σ_{t=k+1}^{∞} E[h(Xt) − h(Xt−1)].
Since Xt and Yt have the same distribution for all t ≥ 0,
Eπ[h(X)] = E[h(Xk)] + Σ_{t=k+1}^{∞} E[h(Xt) − h(Yt−1)].
If we can swap expectation and limit,
Eπ[h(X)] = E[h(Xk) + Σ_{t=k+1}^{∞} (h(Xt) − h(Yt−1))].
The random variable inside the above expectation is unbiased for Eπ[h(X)].
26. Unbiased estimators
An unbiased estimator, for any user-chosen k, is given by
Hk(X, Y) = h(Xk) + Σ_{t=k+1}^{τ−1} (h(Xt) − h(Yt−1)),
with the convention Σ_{t=k+1}^{τ−1} {·} = 0 if τ − 1 < k + 1.
h(Xk) alone is biased; the other terms correct for the bias.
Cost: τ − 1 calls to P̄ and 1 + max(0, k − τ) calls to P.
Glynn & Rhee, Exact estimation for Markov chain equilibrium expectations, 2014. Also Agapiou, Roberts & Vollmer, Unbiased Monte Carlo: Posterior estimation for intractable/infinite-dimensional models, 2018.
Note: the same reasoning would work with arbitrary lags L ≥ 1.
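A direct transcription of Hk, assuming the coupled chains are stored as arrays xs = (X0, …, XT) and ys = (Y0, …, YT−1) with meeting time tau, so that xs[t] == ys[t - 1] for t ≥ tau:

```python
def H_k(h, xs, ys, tau, k=0):
    est = h(xs[k])
    # bias-correction terms; the range is empty when tau - 1 < k + 1
    for t in range(k + 1, tau):
        est += h(xs[t]) - h(ys[t - 1])
    return est
```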
27. Conditions
Jacob, O'Leary, Atchadé, Unbiased MCMC with couplings, 2019.
1 The marginal chain converges: E[h(Xt)] → Eπ[h(X)], and h(Xt) has a finite (2 + η)-th moment for all t.
2 The meeting time τ has geometric tails: ∃ C < +∞, ∃ δ ∈ (0, 1), ∀ t ≥ 0: P(τ > t) ≤ Cδ^t.
3 The chains stay together: Xt = Yt−1 for all t ≥ τ.
Condition 2 is itself implied by e.g. a geometric drift condition.
Under these conditions, Hk(X, Y) is unbiased and has finite expected cost and finite variance, for all k.
28. Metropolis on Normal target: meeting times
[Histogram of meeting times, 0 to 200.]
π = N(0, 1), RWMH with Normal proposal std = 0.5, π0 = N(10, 3²).
29. Metropolis on Normal target: estimators of Eπ[X]
[Histogram of estimators, roughly −1000 to 1000.] k = 0.
E[2τ] ≈ 96, V[H0(X, Y)] ≈ 65,000.
30. Asymptotic inefficiency
Final estimator: average of R independent estimators.
In a given computing time, more estimators can be produced if each estimator is cheaper.
An appropriate measure of performance is
[expected cost] × [variance],
called the asymptotic inefficiency.
Glynn & Whitt, Asymptotic efficiency of simulation estimators, 1992.
Glynn & Heidelberger, Bias properties of budget constrained simulations, 1990.
31. Metropolis on Normal target: estimators of Eπ[X]
[Histogram of estimators, roughly −200 to 100.] k = 100.
E[max(k + τ, 2τ)] ≈ 148, V[Hk(X, Y)] ≈ 100.
32. Metropolis on Normal target: estimators of Eπ[X]
[Histogram of estimators, roughly −4 to 4.] k = 200.
E[max(k + τ, 2τ)] ≈ 248, V[Hk(X, Y)] ≈ 1.
33. Time-averaged unbiased estimators
Efficiency matters, thus in practice we recommend a variation of the previous estimator, defined for integers k ≤ m as
Hk:m(X, Y) = (1/(m − k + 1)) Σ_{t=k}^{m} Ht(X, Y),
which can also be written
(1/(m − k + 1)) Σ_{t=k}^{m} h(Xt) + Σ_{t=k+1}^{τ−1} min(1, (t − k)/(m − k + 1)) (h(Xt) − h(Yt−1)),
i.e. a standard MCMC average + a bias correction term.
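With the same array convention as before (xs long enough to cover max(m, τ − 1)), a short sketch of Hk:m as the MCMC average plus the weighted correction term:

```python
import numpy as np

def H_km(h, xs, ys, tau, k, m):
    mcmc_average = np.mean([h(xs[t]) for t in range(k, m + 1)])
    bias_correction = sum(
        min(1.0, (t - k) / (m - k + 1)) * (h(xs[t]) - h(ys[t - 1]))
        for t in range(k + 1, tau)
    )
    return mcmc_average + bias_correction
```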
34. Metropolis on Normal target: time-averaged estimators
[Histogram of estimators, roughly −0.4 to 0.4.] k = 200, m = 1000.
E[max(m + τ, 2τ)] ≈ 1048, V[Hk:m(X, Y)] ≈ 0.028.
35. How to design appropriate coupled chains?
To implement the proposed unbiased estimators, we need to sample from a Markov kernel P̄ such that, when (Xt+1, Yt) is sampled from P̄((Xt, Yt−1), ·),
marginally Xt+1 | Xt ∼ P(Xt, ·) and Yt | Yt−1 ∼ P(Yt−1, ·),
it is possible that Xt+1 = Yt exactly for some t ≥ 0,
if Xt = Yt−1, then Xt+1 = Yt almost surely.
36. Couplings of MCMC algorithms
We can find many couplings in the literature…
Propp & Wilson, Exact sampling with coupled Markov chains and applications to statistical mechanics, Random Structures & Algorithms, 1996.
Johnson, Studying convergence of Markov chain Monte Carlo algorithms using coupled sample paths, JASA, 1996.
Neal, Circularly-coupled Markov chain sampling, UoT tech report, 1999.
Pinto & Neal, Improving Markov chain Monte Carlo estimators by coupling to an approximating chain, UoT tech report, 2001.
Glynn & Rhee, Exact estimation for Markov chain equilibrium expectations, Journal of Applied Probability, 2014.
37. Couplings of MCMC algorithms
Conditional particle filters: Jacob, Lindsten, Schön, Smoothing with Couplings of Conditional Particle Filters, 2019.
Metropolis–Hastings, Gibbs samplers, parallel tempering: Jacob, O'Leary, Atchadé, Unbiased MCMC with couplings, 2019.
Hamiltonian Monte Carlo: Heng & Jacob, Unbiased HMC with couplings, 2019.
Pseudo-marginal MCMC, exchange algorithm: Middleton, Deligiannidis, Doucet, Jacob, Unbiased MCMC for intractable target distributions, 2018.
Particle independent Metropolis–Hastings: Middleton, Deligiannidis, Doucet, Jacob, Unbiased Smoothing using Particle Independent Metropolis-Hastings, 2019.
38. Maximal couplings
(X, Y) follows a coupling of p and q if X ∼ p and Y ∼ q.
The coupling inequality states that
P(X = Y) ≤ 1 − ‖p − q‖TV,
for any coupling, with ‖p − q‖TV = (1/2) ∫ |p(x) − q(x)| dx.
Maximal couplings achieve the bound.
40. Maximal coupling: algorithm
Requires: evaluations of p and q, sampling from p and q.
1 Sample X ∼ p and W ∼ Uniform(0, 1). If W ≤ q(X)/p(X), set Y = X and output (X, Y).
2 Otherwise, sample Y′ ∼ q and W′ ∼ Uniform(0, 1) until W′ > p(Y′)/q(Y′), then set Y = Y′ and output (X, Y).
Output: a pair (X, Y) such that X ∼ p, Y ∼ q, and P(X = Y) is maximal.
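A sketch of this algorithm for two univariate Normals p and q (illustrative choices); it uses only density evaluations and draws from p and q, on the log scale for numerical stability:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

def max_coupling(mu_p, mu_q, sigma):
    log_p = lambda z: norm.logpdf(z, mu_p, sigma)
    log_q = lambda z: norm.logpdf(z, mu_q, sigma)
    x = rng.normal(mu_p, sigma)                      # step 1: X ~ p
    if np.log(rng.uniform()) <= log_q(x) - log_p(x):
        return x, x                                  # W <= q(X)/p(X): set Y = X
    while True:                                      # step 2: Y' ~ q until W' > p(Y')/q(Y')
        y = rng.normal(mu_q, sigma)
        if np.log(rng.uniform()) > log_p(y) - log_q(y):
            return x, y
```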
41. Back to Metropolis–Hastings (kernel P)
At each iteration t, with the Markov chain at state Xt,
1 propose X′ ∼ q(Xt, ·),
2 sample U ∼ Uniform(0, 1),
3 if U ≤ π(X′)q(X′, Xt) / (π(Xt)q(Xt, X′)), set Xt+1 = X′; otherwise set Xt+1 = Xt.
How can we propagate two MH chains from states Xt and Yt−1 such that {Xt+1 = Yt} can happen?
42. Coupling of Metropolis–Hastings (kernel P̄)
At each iteration t, with the two Markov chains at states Xt and Yt−1,
1 propose (X′, Y′) from a maximal coupling of q(Xt, ·) and q(Yt−1, ·),
2 sample U ∼ Uniform(0, 1),
3 if U ≤ π(X′)q(X′, Xt) / (π(Xt)q(Xt, X′)), set Xt+1 = X′; otherwise set Xt+1 = Xt;
if U ≤ π(Y′)q(Y′, Yt−1) / (π(Yt−1)q(Yt−1, Y′)), set Yt = Y′; otherwise set Yt = Yt−1.
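Combining the pieces, a sketch of one transition of P̄ for the random-walk case, reusing log_pi and max_coupling from the earlier sketches; a single shared uniform U drives both accept/reject steps:

```python
def coupled_mh_step(x, y, proposal_std=0.5):
    # coupled proposals from a maximal coupling of q(x, .) and q(y, .)
    xp, yp = max_coupling(x, y, sigma=proposal_std)
    log_u = np.log(rng.uniform())  # common uniform for both chains
    x_next = xp if log_u <= log_pi(xp) - log_pi(x) else x
    y_next = yp if log_u <= log_pi(yp) - log_pi(y) else y
    return x_next, y_next
```

Once the chains are at the same state, the maximal coupling returns identical proposals and the shared U produces identical decisions, so the chains remain faithful.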
43. Scaling with dimension (not doing so well)
With a naive maximal coupling of the proposals…
[Average meeting time versus dimension (1 to 5), log scale; initialization: target offset.]
44. Scaling with dimension (much better)
With "reflection-maximal" couplings of the proposals…
[Average meeting time versus dimension (1 to 50); initialization: target offset.]
45. Hamiltonian Monte Carlo
Introduce the potential energy U(q) = −log π(q), and the total energy E(q, p) = U(q) + (1/2)|p|².
Hamiltonian dynamics for (q(s), p(s)), where s ≥ 0:
(d/ds) q(s) = ∇p E(q(s), p(s)),
(d/ds) p(s) = −∇q E(q(s), p(s)).
Solving the Hamiltonian dynamics exactly is not feasible, but discretization + a Metropolis–Hastings correction ensure that π remains invariant.
Common random numbers can make two HMC chains contract, under assumptions on the target such as strong log-concavity.
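A sketch of HMC with leapfrog discretization and an MH correction, plus a coupled step that feeds both chains the same momentum draw and the same accept/reject uniform (common random numbers); the standard-Normal target and the step-size/trajectory-length values are illustrative choices, not fixed by the slide:

```python
import numpy as np

rng = np.random.default_rng(5)

def U(q):            # potential U(q) = -log pi(q), here pi = N(0, I_d)
    return 0.5 * np.dot(q, q)

def grad_U(q):
    return q

def leapfrog(q, p, step, n_steps):
    p = p - 0.5 * step * grad_U(q)
    for _ in range(n_steps - 1):
        q = q + step * p
        p = p - step * grad_U(q)
    q = q + step * p
    p = p - 0.5 * step * grad_U(q)
    return q, p

def hmc_step(q, p, log_u, step=0.1, n_steps=10):
    q_new, p_new = leapfrog(q, p, step, n_steps)
    # MH correction: accept with probability exp(E(q, p) - E(q', p'))
    log_accept = (U(q) + 0.5 * np.dot(p, p)) - (U(q_new) + 0.5 * np.dot(p_new, p_new))
    return q_new if log_u <= log_accept else q

def coupled_hmc_step(x, y):
    p = rng.normal(size=x.shape)   # shared momentum at every step
    log_u = np.log(rng.uniform())  # shared accept/reject uniform
    return hmc_step(x, p, log_u), hmc_step(y, p, log_u)
```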
46. Coupling of Hamiltonian Monte Carlo
Mangoubi & Smith, Rapid mixing of HMC on strongly log-concave distributions, 2017.
Bou-Rabee, Eberle & Zimmer, Coupling and Convergence for Hamiltonian Monte Carlo, 2018.
Heng & Jacob, Unbiased HMC with couplings, 2019.
47. Coupling of Hamiltonian Monte Carlo
Figure 2 of Mangoubi & Smith, Rapid mixing of HMC on strongly log-concave distributions, 2017: coupling two copies X1, X2, … (blue) and Y1, Y2, … (green) of HMC by choosing the same momentum pi at every step.
48. Scaling of Hamiltonian Monte Carlo
[Average meeting time versus dimension (10 to 300); initialization: target offset.]
49. Outline
1 Monte Carlo and bias
2 Sequential Monte Carlo samplers
3 Regeneration
4 Unbiased estimators from coupled Markov chains
5 Bonus: new convergence diagnostics for MCMC
50. Assessing finite-time bias of MCMC
Total variation distance between Xk ∼ πk and π = lim_{k→∞} πk:
‖πk − π‖TV = (1/2) sup_{h: |h|≤1} |E[h(Xk)] − Eπ[h(X)]|
= (1/2) sup_{h: |h|≤1} |E[Σ_{t=k+1}^{τ−1} (h(Xt) − h(Yt−1))]|
≤ E[max(0, τ − k − 1)].
[Left: histogram of meeting times, 0 to 200. Right: resulting upper bound on ‖πk − π‖TV as a function of k, log scale.]
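A sketch of this bound as a Monte Carlo estimate, assuming taus is an array of independent meeting times from repeated coupled runs (lag 1):

```python
import numpy as np

def tv_upper_bound(taus, k):
    # estimate of E[max(0, tau - k - 1)], an upper bound on ||pi_k - pi||_TV
    return np.mean(np.maximum(0, taus - k - 1))
```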
51. Assessing finite-time bias of MCMC
With L-lag couplings, τ(L) = inf{t ≥ L : Xt = Yt−L}, and
‖πk − π‖TV ≤ E[max(0, ⌈(τ(L) − L − k)/L⌉)].
[Upper bounds on dTV versus iterations, for SSG and PT.]
Biswas, Jacob & Vanetti, Estimating Convergence of Markov chains with L-Lag Couplings, 2019.
52. Discussion
Perfect samplers, which sample i.i.d. from π, would yield the same benefits and more. Is any of this helping create perfect samplers?
If the underlying MCMC "doesn't work", the proposed unbiased estimators will have large cost and/or large variance.
Choice of tuning parameters? Choice of lag? Why couple only two chains?
Lack of bias is useful beyond parallel computation.
So far we have used Markovian couplings: can we do better?
Thank you for listening!
Funding provided by the National Science Foundation, grants DMS-1712872 and DMS-1844695.