Bayesian inference with
models made of modules
Pierre E. Jacob
2022 ISBA World Meeting
Susie Bayarri Lecture
June 30, 2022
Pierre E. Jacob Models made of modules 1
Outline
1 Models made of modules and issues with joint modeling
2 Cutting feedback: cut posterior distributions
3 Computation involved when cutting feedback
1 Models made of modules and issues with joint modeling
Models made of modules
Let’s start with a simple example...
Ogle, Barber & Sartor (2013). Feedback and Modularization in a Bayesian
Meta–analysis of Tree Traits Affecting Forest Dynamics.
Zooming in... we see arrows... and diodes...?
First module:
parameter θ1, data Y1
prior: p1(θ1)
likelihood: p1(Y1|θ1)
Second module:
parameter θ2, data Y2
prior: p2(θ2|θ1)
likelihood: p2 (Y2|θ1, θ2)
Joint model approach
Parameter (θ1, θ2), with prior
p(θ1, θ2) = p1(θ1)p2(θ2|θ1).
Data (Y1, Y2), likelihood
p(Y1, Y2|θ1, θ2) = p1(Y1|θ1)p2(Y2|θ1, θ2).
Posterior distribution
π (θ1, θ2|Y1, Y2) ∝ p1 (θ1) p1(Y1|θ1)p2 (θ2|θ1) p2 (Y2|θ1, θ2).
Departures from the joint model approach: why, how, when?
Example: biased data
Liu, Bayarri & Berger (2009). Modularization in Bayesian analysis,
with emphasis on analysis of computer models.
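Location model:
$Y_1^i \sim \mathrm{Normal}(\theta_1, 1)$ for all $i = 1, \ldots, n_1$,
$\theta_1 \sim \mathrm{Normal}(0, 1)$.
Extra data Y2 suspected to be biased:
$Y_2^i \sim \mathrm{Normal}(\theta_1 + \theta_2, 1)$ for all $i = 1, \ldots, n_2$,
$\theta_2 \sim \mathrm{Normal}(0, v)$.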
If interest is in θ1: are the extra data useful or harmful?
If interest is in θ2: joint model or “two-step” approach?
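In this Gaussian example the module-1 posterior and the joint posterior are both available in closed form, so the effect of the extra data on θ1 can be checked directly. The following is a minimal sketch, not taken from the slides: the function names, the simulated data, the bias of 1 and the values of v are illustrative assumptions.

```python
import numpy as np

def module1_posterior(y1):
    """Mean and variance of theta1 given Y1 only (prior Normal(0, 1), unit noise)."""
    precision = 1.0 + len(y1)
    return y1.sum() / precision, 1.0 / precision

def joint_posterior(y1, y2, v):
    """Mean vector and covariance of (theta1, theta2) given (Y1, Y2) in the joint model."""
    n1, n2 = len(y1), len(y2)
    # Precision matrix and linear term of the Gaussian log-posterior.
    precision = np.array([[1.0 + n1 + n2, n2],
                          [n2, 1.0 / v + n2]])
    linear = np.array([y1.sum() + y2.sum(), y2.sum()])
    cov = np.linalg.inv(precision)
    return cov @ linear, cov

# Synthetic data (assumption): theta1 = 0 and the extra data carry a bias of 1.
rng = np.random.default_rng(2022)
y1 = rng.normal(0.0, 1.0, size=100)
y2 = rng.normal(1.0, 1.0, size=1000)

m1, s1 = module1_posterior(y1)
print(f"module 1 only: mean {m1:.3f}, variance {s1:.4f}")
for v in (100.0, 0.01):
    mean, cov = joint_posterior(y1, y2, v)
    print(f"joint, v = {v}: mean {mean[0]:.3f}, variance {cov[0, 0]:.4f}")
```

With a small v the joint model treats Y2 as nearly unbiased: the posterior of θ1 is more concentrated but is pulled towards the biased data. With a large v the joint answer for θ1 stays close to the module-1 answer.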
Example: SARS-CoV-2 prevalence
Nicholson et al. (2022). Interoperability of statistical models in
pandemic preparedness: principles and reality.
Prevalence π of SARS-CoV-2 in the UK, estimated from
- randomized surveillance data: u positive out of U tested,
  Hypergeometric model with parameter π;
- targeted surveillance data (patients with clinical need,
  health & care workers): n positive out of N tested,
  Binomial model involving π, P(tested|infected) and
  P(tested|not infected).
If interest is in π: are the extra data useful or harmful?
If interest is in e.g. P(tested|infected): joint model or
“two-step” approach?
Example: stochastic dynamical models
Parslow, Cressie, Campbell, Jones & Murray (2013). Bayesian
learning and predictability in a stochastic nonlinear dynamical model.
Geophysics model of the temperature of the ocean, φ.
The temperature φ can be used as “forcings” in a model of
plankton population size β, for example in an SDE model
$d\beta_t = \mu(\beta_t, \varphi_t)\,dt + \sigma(\beta_t, \varphi_t)\,dW_t$.
We might want to:
propagate uncertainty about the geophysics to the biology?
allow/prevent feedback from the biology to the geophysics?
Example: PKPD
Bennett & Wakefield (2001). Errors-in-variables in joint population
pharmacokinetic/pharmacodynamic modeling.
Lunn, Best, Spiegelhalter, Graham & Neuenschwander (2009). Combining
MCMC with ‘sequential’ PKPD modelling.
Pharmacokinetics (PK):
models the time course of drug absorption.
$Y_t \sim \mathrm{Normal}(\log C_t, v_{PK})$ for all $t$, where $C_t = \mathrm{function}(t, \theta_{PK})$.
From this we extract $(C_t^{(j)})_{t \geq 0}$ for individual $j = 1, \ldots, J$.
Pharmacodynamics (PD):
models the effect of drugs.
$Z_j \sim \mathrm{Normal}(E_j, v_{PD})$ for all $j$, where $E_j = \mathrm{function}(C_{t_j}^{(j)}, \theta_{PD})$,
and where $t_j$ is the time at which $E_j$ is measured.
Example: HPV prevalence and cervical cancer incidence
Plummer (2014). Cuts in Bayesian graphical models.
Human papillomavirus prevalence ϕi in country i:
∀i = 1, . . . , I Zi ∼ Binomial(Ni, ϕi),
Zi: number of women infected with high-risk HPV,
Ni: population size in country i.
Impact of prevalence onto cervical cancer occurrence:
∀i = 1, . . . , I Yi ∼ Poisson(λiTi), log(λi) = η1 + η2 ϕi,
Yi is number of cases during study in country i,
Ti: woman-years of follow-up in country i.
Example: two-step estimation
Murphy & Topel (1985). Estimation and Inference in Two-Step
Econometric Models.
Impact of unanticipated money growth on unemployment.
$M_t = \theta X_{1t} + \varepsilon_t$ for all $t$,
Mt: proportional growth in the M1 definition of money,
X1t: lagged Mt, lagged unemployment, more variables.
$\log \frac{U_t}{1 - U_t} = \beta X_{2t} + \gamma \varepsilon_t + W_t$ for all $t$,
Ut: annual average unemployment rate,
X2t: minimum wage, more variables.
Joint estimation “inappropriate or computationally infeasible”.
Example: multiple imputation
Missing data: imputation of missing values, then analysis
of completed data.
Jackson, Best & Richardson (2009). Bayesian graphical models
for regression on multiple data sets with different variables.
Multiphase inference: a first analyst pre-processes raw
data, then a second analyst uses the processed data.
Blocker & Meng (2013). The potential and perils of
preprocessing: Building new foundations.
More examples
Environmental epidemiology: estimation of environmental
exposure, then associated health effects.
Blangiardo, Hansell & Richardson (2011). A Bayesian model of
time activity data to investigate health effect of air pollution in
time series studies.
Causal inference with propensity scores: estimation of
probability of individuals receiving treatment, then
treatment effect adjusted for propensity score.
Zigler, Watts, Yeh, Wang, Coull & Dominici (2013). Model
feedback in Bayesian propensity score estimation.
Saarela, Belzile & Stephens (2016). A Bayesian view of doubly
robust causal inference.
Jacob, Murray, Holmes & Robert (2017). Better together? Statistical
learning in models made of modules.
Supporters say aye, opponents say no
Setup: model 2 depends on an input that is itself estimated
using model 1.
Bayesian analysis with the joint model:
Pros:
- coherency, simultaneous treatment of uncertainty, and
  other appeals of standard Bayes,
- computational toolbox is already available.
Cons:
- computationally challenging,
  as difficulties pile up with more modules,
- parameters might be hard to interpret as their meaning
  changes across modules,
- module misspecification means that incorporating more
  data is not necessarily beneficial, and sometimes harmful.
2 Cutting feedback: cut posterior distributions
Plug-in approach
Simple:
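1 first estimate θ1 given Y1, e.g. $\hat{\theta}_1 = \int \theta_1 \, p_1(\theta_1 | Y_1) \, d\theta_1$,
2 inference on θ2 given Y2 and θ̂1 using
$p_2(\theta_2 | \hat{\theta}_1, Y_2) = \frac{p_2(\theta_2 | \hat{\theta}_1) \, p_2(Y_2 | \hat{\theta}_1, \theta_2)}{p_2(Y_2 | \hat{\theta}_1)}$.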
Con: uncertainty about θ1 is ignored in the estimation of θ2.
Pro: misspecification of the 2nd module does not impact θ1.
Cut approach
Propagate uncertainty without allowing feedback.
Define the cut distribution:
$\pi^{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2) = p_1(\theta_1 | Y_1) \, p_2(\theta_2 | \theta_1, Y_2)$.
Known as Bayesian two-step estimation, or as “cutting
feedback”, a term suggested by Nicky Best according to Jonty
Rougier, in a comment on Sansó, Forest & Zantedeschi (2008).
Ideal sampling procedure:
1 Sample θ1 from p1(θ1|Y1).
2 Sample θ2 given θ1 from p2(θ2|θ1, Y2).
3 Output (θ1, θ2).
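For the biased-data example (Normal location model with a suspected bias θ2), both steps of the ideal procedure are exact Normal draws. The sketch below illustrates the three steps under that assumption; the function name `sample_cut`, the simulated data and the value v = 1 are illustrative choices, not part of the slides.

```python
import numpy as np

def sample_cut(y1, y2, v, size, rng):
    """Draw `size` exact samples of (theta1, theta2) from the cut distribution."""
    n1, n2 = len(y1), len(y2)
    # Step 1: theta1 ~ p1(theta1 | Y1), a Normal with precision 1 + n1.
    prec1 = 1.0 + n1
    theta1 = rng.normal(y1.sum() / prec1, np.sqrt(1.0 / prec1), size=size)
    # Step 2: theta2 ~ p2(theta2 | theta1, Y2), a Normal with precision 1/v + n2
    # whose mean depends on the theta1 drawn in step 1.
    prec2 = 1.0 / v + n2
    theta2 = rng.normal((y2.sum() - n2 * theta1) / prec2, np.sqrt(1.0 / prec2))
    # Step 3: output the pairs.
    return theta1, theta2

rng = np.random.default_rng(1)
y1 = rng.normal(0.0, 1.0, size=100)    # synthetic data (assumption)
y2 = rng.normal(1.0, 1.0, size=1000)   # synthetic data, biased by 1 (assumption)
theta1, theta2 = sample_cut(y1, y2, v=1.0, size=100_000, rng=rng)
print(theta1.mean(), theta2.mean())
print(np.cov(theta1, theta2))
```

The spread of θ2 in the output combines its conditional variance given θ1 with the variability of θ1 itself, which is how uncertainty is propagated without feedback from Y2 to θ1.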
From the OpenBUGS manual,
Spiegelhalter, Thomas, Best & Lunn, 2004:
The cut function acts as a kind of valve in the graph:
prior information is allowed to flow downwards through
the cut, but likelihood information is prevented from
flowing upwards.
Difference between cut and standard posterior density functions:
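$\pi^{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2) \propto p_1(\theta_1) \, p_1(Y_1 | \theta_1) \, \frac{p_2(\theta_2 | \theta_1) \, p_2(Y_2 | \theta_1, \theta_2)}{p_2(Y_2 | \theta_1)} \propto \frac{\pi(\theta_1, \theta_2 | Y_1, Y_2)}{p_2(Y_2 | \theta_1)}$.
The term p2(Y2|θ1) is a measure of feedback of Y2 onto θ1:
$p_2(Y_2 | \theta_1) = \int p_2(Y_2 | \theta_1, \theta_2) \, p_2(\theta_2 | \theta_1) \, d\theta_2$.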
The marginal distribution of θ1 differs:
p1(θ1|Y1) for cut, π(θ1|Y1, Y2) for standard posterior.
The conditional distribution of θ2 given θ1 is the same: p2(θ2|θ1, Y2).
A variational representation
Among all distributions q(θ1, θ2) with marginal p1(θ1|Y1) on θ1,
πcut (θ1, θ2; Y1, Y2) minimizes
Kullback–Leibler (q(θ1, θ2), π(θ1, θ2|Y1, Y2)).
Lemma 1 in Yu, Nott & Smith (2021). Variational inference for
cutting feedback in misspecified models.
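One way to see this result is through the chain rule for the Kullback–Leibler divergence. The short derivation below is a sketch added for convenience; the notation $q_1$ for the θ1-marginal of q and $q(\theta_2 | \theta_1)$ for its conditional is introduced here and does not appear in the slides.

```latex
\mathrm{KL}\bigl(q \,\|\, \pi(\cdot \mid Y_1, Y_2)\bigr)
= \mathrm{KL}\bigl(q_1 \,\|\, \pi(\theta_1 \mid Y_1, Y_2)\bigr)
+ \int q_1(\theta_1)\,
  \mathrm{KL}\bigl(q(\theta_2 \mid \theta_1) \,\|\, \pi(\theta_2 \mid \theta_1, Y_1, Y_2)\bigr)\, d\theta_1 .
```

With $q_1$ fixed to $p_1(\theta_1 | Y_1)$, the first term is a constant. The second term is non-negative and vanishes when $q(\theta_2 | \theta_1) = \pi(\theta_2 | \theta_1, Y_1, Y_2) = p_2(\theta_2 | \theta_1, Y_2)$, which gives the cut distribution.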
We can also view the cut distribution as a valid representation
of beliefs about the parameters.
Updating beliefs
Bissiri, Holmes & Walker (2016). A general framework for updating
belief distributions.
Prior beliefs p(θ), loss associated with data l(y, θ),
assume loss and prior are independent pieces of information.
We look for an update “p(θ|y)”= ψ(l(y, θ), p(θ)).
Coherence:
$\psi(l(y', \theta), \psi(l(y, \theta), p(\theta))) = \psi(l(y', \theta) + l(y, \theta), p(\theta))$.
Result: for coherence and optimality, we need to define
$p(\theta | y) = \operatorname{argmin}_q \int l(y, \theta) \, q(d\theta) + \mathrm{KL}(q(\theta), p(\theta))$.
Solution: p(θ|y) ∝ exp(−l(y, θ))p(θ).
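The update and its coherence property can be checked numerically on a grid. This is a minimal sketch under stated assumptions: the grid over θ, the Normal(0, 1) prior, the squared-error loss and the two data points are illustrative choices, and `belief_update` is a hypothetical helper name.

```python
import numpy as np

def belief_update(prior, loss):
    """Return weights proportional to exp(-loss) * prior on a fixed grid."""
    log_w = np.log(prior) - loss
    w = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
    return w / w.sum()

theta = np.linspace(-5.0, 5.0, 2001)   # grid over theta (assumption)
prior = np.exp(-0.5 * theta**2)
prior /= prior.sum()                   # discretized Normal(0, 1) prior

def loss(y):
    # Negative Normal(theta, 1) log-likelihood, up to a constant in theta.
    return 0.5 * (y - theta) ** 2

y, y_new = 1.2, -0.4
sequential = belief_update(belief_update(prior, loss(y)), loss(y_new))
batch = belief_update(prior, loss(y) + loss(y_new))
print(np.max(np.abs(sequential - batch)))   # coherence: the two updates agree
print(np.sum(theta * batch))                # close to (y + y_new) / 3
```

With the loss equal to a negative log-likelihood, the update coincides with the standard posterior, here a Normal with mean (y + y_new) / 3.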
Cut distributions are valid belief updates
Carmona & Nicholls (2020). Semi-Modular Inference: enhanced
learning in multi-modular models by tempering the influence of
components.
Retrieve the cut distribution in the framework of Bissiri,
Holmes & Walker (2016), with the loss
“l(y, θ)” = − {log p1(Y1|θ1)p2(Y2|θ1, θ2) − log p2(Y2|θ1)}.
The loss involves $p_2(Y_2 | \theta_1) = \int p_2(Y_2 | \theta_1, \theta_2) \, p_2(\theta_2 | \theta_1) \, d\theta_2$ and
thus depends on the prior... the argument needs adjustments.
Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for
prequentially additive loss functions arising in Semi-Modular
Inference.
Variants
Allow a controlled amount of feedback.
Carmona & Nicholls (2020). Semi-Modular Inference: enhanced
learning in multi-modular models by tempering the influence of
components.
Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for
prequentially additive loss functions arising in Semi-Modular
Inference.
Cut + replace minus log-likelihood by other loss functions.
Frazier & Nott (2022). Cutting feedback and modularized
analyses in generalized Bayesian inference.
Asymptotics
Asymptotics of two-step estimators in Murphy & Topel (1985).
Estimation and Inference in Two-Step Econometric Models.
Extended to cut distributions in Pompe & Jacob (2022).
Asymptotics of cut distributions and robust modular inference using
Posterior Bootstrap.
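Scenario A: $n_1 / n_2 \to \alpha > 0$ and
$p^\star(Y_{1,1:n_1}, Y_{2,1:n_2}) = \prod_{i=1}^{n_1} p_1^\star(Y_{1,i}) \prod_{i=1}^{n_2} p_2^\star(Y_{2,i})$.
Scenario B: $n_1 = n_2 = n$ and
$p^\star(Y_{1,1:n_1}, Y_{2,1:n_2}) = \prod_{i=1}^{n} p^\star(Y_{1,i}, Y_{2,i}) \neq \prod_{i=1}^{n} p_1^\star(Y_{1,i}) \, p_2^\star(Y_{2,i})$.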
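Under regularity conditions, introduce the two-step MLEs:
$\hat{\theta}_1 = \operatorname{argmax}_{\theta_1} \log p_1(Y_1 | \theta_1) \to \theta_1^\star$,
$\hat{\theta}_2 = \operatorname{argmax}_{\theta_2} \log p_2(Y_2 | \hat{\theta}_1, \theta_2) \to \theta_2^\star$.
Their asymptotic joint distribution is Normal with variance
ΣA under scenario A, ΣB under scenario B.
Sandwich formula: $E_\star\left[-\frac{d^2 \ell(\theta^\star)}{d\theta^2}\right]^{-1} E_\star\left[\left(\frac{d \ell(\theta^\star)}{d\theta}\right)^2\right] E_\star\left[-\frac{d^2 \ell(\theta^\star)}{d\theta^2}\right]^{-1}$
+ possible dependencies between Y1 and Y2 in scenario B.
For (θ1, θ2) drawn from the cut distribution,
$\sqrt{n}(\theta_1 - \hat{\theta}_1, \theta_2 - \hat{\theta}_2) \to \mathrm{Normal}(0, \Sigma_C)$.
$\Sigma_C \equiv E_\star\left[-\frac{d^2 \ell(\theta^\star)}{d\theta^2}\right]^{-1}$ does not match ΣA or ΣB.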
Frazier & Nott (2022). Cutting feedback and modularized analyses in
generalized Bayesian inference.
Focus on the asymptotic behaviour of p2(θ2|θ1, Y2),
assuming θ1 is fixed in a neighborhood of the MLE limit $\theta_1^\star$.
The asymptotic conditional distribution p2(θ2|θ1, Y2) is Normal
with both mean and variance depending explicitly on θ1.
Allows a finer understanding of the impact of
the uncertainty of θ1 onto that of θ2.
Supporters say aye, opponents say no
Setup: model 2 depends on an input that is itself estimated
using model 1.
Cut distribution:
Pros:
- Can mitigate effect of misspecification.
- Facilitates interoperability across teams.
- Can resolve computational intractability of joint model.
- Not completely unprincipled.
Cons:
- Can lead to sub-optimal estimation/prediction accuracy.
- Is no replacement for constructive model criticism.
- Associated computations present their own challenges.
Asking the data whether to cut
Jacob, Murray, Holmes & Robert (2017). Better together? Statistical
learning in models made of modules.
We can try to be principled about whether to cut or not.
Natural route: introduce measures of predictive performance
that can be evaluated on test data. But which data: Y1? Y2?
Postulate: parameters are meaningful in the module that first
defines them. Thus distributions of parameters should be
compared on predictions in the module that defines them.
In the first module, θ1 is defined in its relation to Y1.
We propose to assess candidate distributions for θ1 based on
predictive performance for Y1.
Two candidates:
p1(θ1|Y1) and π(θ1|Y1, Y2).
Using the prequential approach and the logarithmic scoring
rule, we compare
p1(Y1) and π(Y1|Y2).
If p1(Y1) > π(Y1|Y2), we support the use of distributions on
(θ1, θ2) that admit p1(θ1|Y1) as first marginal, e.g. cut.
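In the biased-data example, both quantities in this comparison are Gaussian densities, since (Y1, Y2) is jointly Normal once θ1 and θ2 are integrated out. The sketch below computes them on the log scale; it is an illustration under that model only, with hypothetical function names and synthetic data in which Y2 is biased by 3 while v = 0.01 claims the bias is small.

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_p1(y1):
    """log p1(Y1): evidence of module 1, with theta1 ~ Normal(0, 1) integrated out."""
    n1 = len(y1)
    cov = np.ones((n1, n1)) + np.eye(n1)
    return multivariate_normal.logpdf(y1, mean=np.zeros(n1), cov=cov)

def log_pi_y1_given_y2(y1, y2, v):
    """log pi(Y1 | Y2) = log p(Y1, Y2) - log p(Y2) under the joint model."""
    n1, n2 = len(y1), len(y2)
    n = n1 + n2
    cov = np.ones((n, n))     # theta1 is shared by every observation
    cov[n1:, n1:] += v        # theta2 is shared by the Y2 observations
    cov += np.eye(n)          # unit observation noise
    y = np.concatenate([y1, y2])
    log_joint = multivariate_normal.logpdf(y, mean=np.zeros(n), cov=cov)
    log_y2 = multivariate_normal.logpdf(y2, mean=np.zeros(n2), cov=cov[n1:, n1:])
    return log_joint - log_y2

rng = np.random.default_rng(3)
y1 = rng.normal(0.0, 1.0, size=20)   # synthetic data (assumption)
y2 = rng.normal(3.0, 1.0, size=50)   # synthetic data, biased by 3 (assumption)
print("log p1(Y1)      :", log_p1(y1))
print("log pi(Y1 | Y2) :", log_pi_y1_given_y2(y1, y2, v=0.01))
```

When the second module is badly misspecified in this way, conditioning on Y2 degrades the prediction of Y1, and the comparison favours distributions with p1(θ1|Y1) as first marginal.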
3 Computation involved when cutting feedback
Confusion about cut distributions
From Gelman (2020) blog post entitled How to “cut” using
Stan, if you must.
Question (rephrased for brevity):
Have cut posteriors been implemented in Stan?
Reply:
This topic has come up before, and I don’t think this
“cut” is a good idea. If you want to implement it, [...]
you’d first fit model 1 and get posterior simulations,
then approx those simulations by a mixture of multivariate
normal or t distributions, then use that as a prior
for model 2. [...]
This would in fact amount to a two-step approximation of the
standard posterior distribution, not the cut distribution!
Modular approximations of the standard posterior
Lunn, Barrett, Sweeting & Thompson (2013). Fully Bayesian
hierarchical modelling in two stages, with application to
meta-analysis.
Goudie, Presanis, Lunn, De Angelis & Wernisch (2016). Model
surgery: joining and splitting models with Markov melding.
Manderson & Goudie (2021). A numerically stable algorithm for
integrating Bayesian models using Markov melding.
Leonelli, Barons & Smith (2018). A conditional independence
framework for coherent modularized inference.
Huge interest in approximating the supraBayesian with a
de-centralized strategy, but this is not about cutting feedback.
Sampling from the cut distribution
The density of the cut distribution is
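$\pi^{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2) \propto \frac{\pi(\theta_1, \theta_2 | Y_1, Y_2)}{p_2(Y_2 | \theta_1)}$.
The term p2(Y2|θ1) is typically intractable,
$p_2(Y_2 | \theta_1) = \int p_2(Y_2 | \theta_1, \theta_2) \, p_2(\theta_2 | \theta_1) \, d\theta_2$.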
MCMC approach for doubly intractable targets:
Liu & Goudie (2021). Stochastic approximation cut algorithm for
inference in modularized Bayesian models.
OpenBUGS’ approach via the cut function: alternate between
sampling $\theta_1'$ from $K_1(\theta_1, d\theta_1')$ targeting $p_1(d\theta_1 | Y_1)$,
sampling $\theta_2'$ from $K_2^{\theta_1'}(\theta_2, d\theta_2')$ targeting $p_2(d\theta_2 | \theta_1', Y_2)$.
This does not leave the cut distribution invariant. Iterating the
kernel $K_2^{\theta_1'}$ enough times mitigates the issue.
Plummer (2014). Cuts in Bayesian graphical models.
In a perfect sampling world, we could sample
θ1 from p1(θ1|Y1),
θ2 given θ1 from p2(θ2|θ1, Y2),
then (θ1, θ2) would be exactly following the cut distribution.
For many models, exact sampling is not feasible.
In an MCMC world, we can sample
θ1 approximately from p1(θ1|Y1) using MCMC,
θ2 given θ1 approximately from p2(θ2|θ1, Y2) using MCMC,
then resulting samples approximate the cut distribution,
in the limit of the numbers of iterations in both stages.
Con: can involve tuning and convergence diagnostics for many
MCMC runs at the 2nd stage, each with a different target.
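A two-stage scheme of this kind can be sketched on the biased-data example, where the answer is known exactly and can serve as a check. Everything below is an illustrative assumption rather than the method of any cited paper: the random-walk Metropolis kernel, the step sizes, the burn-in, the thinning, the inner chain length and the choice to start each inner chain at the mean of Y2 minus θ1.

```python
import numpy as np

def rw_metropolis(log_target, x0, n_iter, step, rng):
    """Random-walk Metropolis chain of length n_iter for a scalar target."""
    x, lp = x0, log_target(x0)
    chain = np.empty(n_iter)
    for t in range(n_iter):
        proposal = x + step * rng.normal()
        lp_proposal = log_target(proposal)
        if np.log(rng.uniform()) < lp_proposal - lp:
            x, lp = proposal, lp_proposal
        chain[t] = x
    return chain

def nested_mcmc_cut(y1, y2, v, n_outer, n_inner, rng, burn=1000, thin=10):
    n1, n2 = len(y1), len(y2)
    # Stage 1: one long chain targeting p1(theta1 | Y1), then burn-in and thinning.
    log_p1 = lambda t1: -0.5 * t1**2 - 0.5 * np.sum((y1 - t1) ** 2)
    step1 = 2.4 / np.sqrt(1.0 + n1)
    stage1 = rw_metropolis(log_p1, y1.mean(), burn + thin * n_outer, step1, rng)
    theta1 = stage1[burn::thin]
    # Stage 2: for each retained theta1, a fresh short chain targeting
    # p2(theta2 | theta1, Y2); only its final state is kept.
    step2 = 2.4 / np.sqrt(1.0 / v + n2)
    theta2 = np.empty(n_outer)
    for k, t1 in enumerate(theta1):
        log_p2 = lambda t2, t1=t1: -0.5 * t2**2 / v - 0.5 * np.sum((y2 - t1 - t2) ** 2)
        theta2[k] = rw_metropolis(log_p2, y2.mean() - t1, n_inner, step2, rng)[-1]
    return theta1, theta2

rng = np.random.default_rng(0)
y1 = rng.normal(0.0, 1.0, size=50)   # synthetic data (assumption)
y2 = rng.normal(1.0, 1.0, size=50)   # synthetic data, biased by 1 (assumption)
theta1, theta2 = nested_mcmc_cut(y1, y2, v=1.0, n_outer=1000, n_inner=20, rng=rng)
print(theta1.mean(), theta2.mean())
```

Each inner chain has its own target, so in less friendly models the inner length and step size would need to be chosen and checked for every θ1, which is the practical burden noted above.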
Unbiased estimation of the cut distribution
Jacob, O’Leary & Atchadé (2020). Unbiased Markov chain Monte
Carlo with couplings.
By coupling pairs of π-invariant chains in a particular way,
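[Figure: a pair of coupled chains over iterations 0 to 200, with their meeting marked.]
we can construct a signed measure $\hat{\pi}(d\theta) = \sum_{n=1}^{N} \omega_n \delta_{\theta_n}(d\theta)$
with $E[\hat{\pi}(h)] = \pi(h)$ for a class of test functions h.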
In an unbiased MCMC world, we can approximate without bias
- p1(dθ1|Y1) with a random measure
  $\hat{\pi}_1(d\theta_1) = \sum_{n=1}^{N_1} \omega_{1,n} \delta_{\theta_{1,n}}(d\theta_1)$,
  obtained from coupled p1(dθ1|Y1)-invariant chains;
- p2(dθ2|θ1, Y2) for any θ1, with a random measure
  $\hat{\pi}_2(d\theta_2 | \theta_1) = \sum_{n=1}^{N_2} \omega_{2,n} \delta_{\theta_{2,n}}(d\theta_2)$,
  obtained from coupled p2(dθ2|θ1, Y2)-invariant chains.
Using the tower property, we can estimate without bias
expectations with respect to πcut(θ1, θ2; Y1, Y2).
Inexact approximations of the cut distribution
Variational inference:
Yu, Nott & Smith (2021). Variational inference for cutting
feedback in misspecified models.
Carmona & Nicholls (2022). Scalable Semi-Modular Inference
with Variational Meta-Posteriors.
Posterior bootstrap:
Pompe & Jacob (2022). Asymptotics of cut distributions and
robust modular inference using Posterior Bootstrap.
adapting techniques developed earlier
by Newton & Raftery, Fong, Lyddon, Holmes & Walker.
Modular approximate Bayesian computation:
Chakraborty, Nott, Drovandi, Frazier & Sisson (2022).
Modularized Bayesian analyses and cutting feedback in
likelihood-free inference.
Discussion
A very appealing aspect of Bayesian analysis is its unified
treatment of many statistical questions.
Modular approaches appear to depart from the main
framework, and thus bring discomfort.
If the essence of Bayesian analysis is not the direct application
of Bayes’ theorem, but
probability distributions for the quantities of interest,
constructed from data and prior information,
a framework for statistical inference
supported by principles & decision theory,
a good excuse to play with Markov chains,
then cut posteriors might be essentially Bayesian.
Thank you!
ISBA 2022 Susie Bayarri lecture

  • 1. Bayesian inference with models made of modules Pierre E. Jacob 2022 ISBA World Meeting Susie Bayarri Lecture June 30, 2022 Pierre E. Jacob Models made of modules 1
  • 2. Outline 1 Models made of modules and issues with joint modeling 2 Cutting feedback: cut posterior distributions 3 Computation involved when cutting feedback Pierre E. Jacob Models made of modules 2
  • 3. Outline 1 Models made of modules and issues with joint modeling 2 Cutting feedback: cut posterior distributions 3 Computation involved when cutting feedback Pierre E. Jacob Models made of modules 2
  • 4. Models made of modules Let’s start with a simple example. . . Pierre E. Jacob Models made of modules 3
  • 5. Models made of modules Ogle, Barber & Sartor (2013). Feedback and Modularization in a Bayesian Meta–analysis of Tree Traits Affecting Forest Dynamics. Pierre E. Jacob Models made of modules 4
  • 6. Models made of modules Zooming in. . . we see arrows. . . and diodes . . . ? Pierre E. Jacob Models made of modules 5
  • 7. Models made of modules. First module: parameter $\theta_1$, data $Y_1$, prior $p_1(\theta_1)$, likelihood $p_1(Y_1 \mid \theta_1)$. Second module: parameter $\theta_2$, data $Y_2$, prior $p_2(\theta_2 \mid \theta_1)$, likelihood $p_2(Y_2 \mid \theta_1, \theta_2)$.
  • 9. Joint model approach. Parameter $(\theta_1, \theta_2)$, with prior $p(\theta_1, \theta_2) = p_1(\theta_1)\, p_2(\theta_2 \mid \theta_1)$. Data $(Y_1, Y_2)$, with likelihood $p(Y_1, Y_2 \mid \theta_1, \theta_2) = p_1(Y_1 \mid \theta_1)\, p_2(Y_2 \mid \theta_1, \theta_2)$. Posterior distribution: $\pi(\theta_1, \theta_2 \mid Y_1, Y_2) \propto p_1(\theta_1)\, p_1(Y_1 \mid \theta_1)\, p_2(\theta_2 \mid \theta_1)\, p_2(Y_2 \mid \theta_1, \theta_2)$. Departures from the joint model approach: why, how, when?
  • 11. Example: biased data. Liu, Bayarri & Berger (2009). Modularization in Bayesian analysis, with emphasis on analysis of computer models. Location model: $Y_1^i \sim \mathrm{Normal}(\theta_1, 1)$ for all $i = 1, \ldots, n_1$, with $\theta_1 \sim \mathrm{Normal}(0, 1)$. Extra data $Y_2$ suspected to be biased: $Y_2^i \sim \mathrm{Normal}(\theta_1 + \theta_2, 1)$ for all $i = 1, \ldots, n_2$, with $\theta_2 \sim \mathrm{Normal}(0, v)$. If interest is in $\theta_1$: are the extra data useful or harmful? If interest is in $\theta_2$: joint model or “two-step” approach?
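The "useful or harmful" question can be explored numerically, since this Normal model is conjugate. The sketch below is not from the talk: the function names are hypothetical, and the closed forms are derived under the model exactly as stated on the slide (unit observation variances, $\theta_1 \sim \mathrm{Normal}(0, 1)$, $\theta_2 \sim \mathrm{Normal}(0, v)$). It returns the posterior mean and variance of $\theta_1$ when using $Y_1$ alone and when using the joint model.

```python
def module1_posterior(y1):
    """Posterior (mean, variance) of theta1 given Y1 only.

    Prior theta1 ~ Normal(0, 1), observations Y1_i ~ Normal(theta1, 1).
    """
    precision = 1.0 + len(y1)
    return sum(y1) / precision, 1.0 / precision


def joint_posterior_theta1(y1, y2, v):
    """Marginal posterior (mean, variance) of theta1 given (Y1, Y2), joint model.

    Integrating out theta2 ~ Normal(0, v), the average of Y2 given theta1 is
    Normal(theta1, v + 1/n2), so Y2 adds precision w = 1 / (v + 1/n2).
    """
    y2_bar = sum(y2) / len(y2)
    w = 1.0 / (v + 1.0 / len(y2))
    precision = 1.0 + len(y1) + w
    mean = (sum(y1) + w * y2_bar) / precision
    return mean, 1.0 / precision
```

With a small prior variance `v` and strongly biased `y2`, the joint posterior mean of $\theta_1$ is pulled toward the biased data while its variance shrinks; with a very large `v`, the two answers nearly coincide.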
  • 13. Example: SARS-CoV-2 prevalence. Nicholson et al. (2022). Interoperability of statistical models in pandemic preparedness: principles and reality. Prevalence $\pi$ of SARS-CoV-2 in the UK, estimated from (i) randomized surveillance data: $u$ positive out of $U$ tested, Hypergeometric model with parameter $\pi$; (ii) targeted surveillance data (patients with clinical need, health & care workers): $n$ positive out of $N$ tested, Binomial model involving $\pi$, $P(\text{tested} \mid \text{infected})$ and $P(\text{tested} \mid \text{not infected})$. If interest is in $\pi$: are the extra data useful or harmful? If interest is in e.g. $P(\text{tested} \mid \text{infected})$: joint model or “two-step” approach?
  • 15. Example: stochastic dynamical models. Parslow, Cressie, Campbell, Jones & Murray (2013). Bayesian learning and predictability in a stochastic nonlinear dynamical model. Geophysics model of the temperature of the ocean, $\varphi$. The temperature $\varphi$ can be used as “forcings” in a model of plankton population size $\beta$, for example in an SDE model $d\beta_t = \mu(\beta_t, \varphi_t)\, dt + \sigma(\beta_t, \varphi_t)\, dW_t$. We might want to: propagate uncertainty about the geophysics to the biology? Allow or prevent feedback from the biology to the geophysics?
  • 16. Example: PKPD. Bennett & Wakefield (2001). Errors-in-variables in joint population pharmacokinetic/pharmacodynamic modeling. Lunn, Best, Spiegelhalter, Graham & Neuenschwander (2009). Combining MCMC with ‘sequential’ PKPD modelling. Pharmacokinetics (PK) models the time course of drug absorption: $Y_t \sim \mathrm{Normal}(\log C_t, v_{PK})$ for all $t$, where $C_t = \mathrm{function}(t, \theta_{PK})$. From this we extract $(C_t^{(j)})_{t \ge 0}$ for individuals $j = 1, \ldots, J$. Pharmacodynamics (PD) models the effect of drugs: $Z_j \sim \mathrm{Normal}(E_j, v_{PD})$ for all $j$, where $E_j = \mathrm{function}(C_{t_j}^{(j)}, \theta_{PD})$ and $t_j$ is the time at which $E_j$ is measured.
  • 17. Example: HPV prevalence and cervical cancer incidence. Plummer (2014). Cuts in Bayesian graphical models. Human papillomavirus prevalence $\phi_i$ in country $i$: $Z_i \sim \mathrm{Binomial}(N_i, \phi_i)$ for all $i = 1, \ldots, I$, where $Z_i$ is the number of women infected with high-risk HPV and $N_i$ is the population size in country $i$. Impact of prevalence on cervical cancer occurrence: $Y_i \sim \mathrm{Poisson}(\lambda_i T_i)$ with $\log(\lambda_i) = \eta_1 + \eta_2 \phi_i$ for all $i = 1, \ldots, I$, where $Y_i$ is the number of cases during the study in country $i$ and $T_i$ is the woman-years of follow-up in country $i$.
  • 19. Example: two-step estimation. Murphy & Topel (1985). Estimation and Inference in Two-Step Econometric Models. Impact of unanticipated money growth on unemployment. First equation: $M_t = \theta X_{1t} + \epsilon_t$ for all $t$, where $M_t$ is the proportional growth in the M1 definition of money and $X_{1t}$ contains lagged $M_t$, lagged unemployment, and more variables. Second equation: $\log \frac{U_t}{1 - U_t} = \beta X_{2t} + \gamma \epsilon_t + W_t$ for all $t$, where $U_t$ is the annual average unemployment rate and $X_{2t}$ contains the minimum wage and more variables. Joint estimation “inappropriate or computationally infeasible”.
  • 21. Example: multiple imputation. Missing data: imputation of missing values, then analysis of completed data. Jackson, Best & Richardson (2009). Bayesian graphical models for regression on multiple data sets with different variables. Multiphase inference: a first analyst pre-processes raw data, then a second analyst uses the processed data. Blocker & Meng (2013). The potential and perils of preprocessing: Building new foundations.
  • 24. More examples. Environmental epidemiology: estimation of environmental exposure, then associated health effects. Blangiardo, Hansell & Richardson (2011). A Bayesian model of time activity data to investigate health effect of air pollution in time series studies. Causal inference with propensity scores: estimation of the probability of individuals receiving treatment, then treatment effect adjusted for propensity score. Zigler, Watts, Yeh, Wang, Coull & Dominici (2013). Model feedback in Bayesian propensity score estimation. Saarela, Belzile & Stephens (2016). A Bayesian view of doubly robust causal inference. Jacob, Murray, Holmes & Robert (2017). Better together? Statistical learning in models made of modules.
  • 26. Supporters say aye, opponents say no. Setup: model 2 depends on an input that is itself estimated using model 1. Bayesian analysis with the joint model. Pros: coherency, simultaneous treatment of uncertainty, and other appeals of standard Bayes; the computational toolbox is already available. Cons: computationally challenging, as difficulties pile up with more modules; parameters might be hard to interpret, as their meaning changes across modules; module misspecification means that incorporating more data is not necessarily beneficial, and is sometimes harmful.
  • 27. Outline: (1) Models made of modules and issues with joint modeling; (2) Cutting feedback: cut posterior distributions; (3) Computation involved when cutting feedback.
  • 29. Plug-in approach. Simple: (1) first estimate $\theta_1$ given $Y_1$, e.g. $\hat\theta_1 = \int \theta_1\, p_1(\theta_1 \mid Y_1)\, d\theta_1$; (2) perform inference on $\theta_2$ given $Y_2$ and $\hat\theta_1$ using $p_2(\theta_2 \mid \hat\theta_1, Y_2) = \frac{p_2(\theta_2 \mid \hat\theta_1)\, p_2(Y_2 \mid \hat\theta_1, \theta_2)}{p_2(Y_2 \mid \hat\theta_1)}$. Con: uncertainty about $\theta_1$ is ignored in the estimation of $\theta_2$. Pro: misspecification of the 2nd module does not impact $\theta_1$.
  • 32. Cut approach. Propagate uncertainty without allowing feedback. Define the cut distribution: $\pi_{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2) = p_1(\theta_1 \mid Y_1)\, p_2(\theta_2 \mid \theta_1, Y_2)$. Known as Bayesian two-step estimation, or as “cutting feedback”, a term suggested by Nicky Best according to Jonty Rougier, in a comment on Sansó, Forest & Zantedeschi (2008). Ideal sampling procedure: (1) sample $\theta_1$ from $p_1(\theta_1 \mid Y_1)$; (2) sample $\theta_2$ given $\theta_1$ from $p_2(\theta_2 \mid \theta_1, Y_2)$; (3) output $(\theta_1, \theta_2)$.
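The ideal sampling procedure can be carried out exactly in the biased data example, because both conditionals are Normal. The sketch below is an illustration under that assumption, not code from the talk; `sample_cut` and `sample_plugin` are hypothetical names, and the plug-in sampler is included only to show how fixing $\theta_1$ at its posterior mean removes one source of spread in $\theta_2$.

```python
import random


def sample_cut(y1, y2, v, num_samples, rng):
    """Exact draws from the cut distribution in the biased data example."""
    prec1 = 1.0 + len(y1)        # posterior precision of theta1 given Y1
    prec2 = 1.0 / v + len(y2)    # posterior precision of theta2 given theta1, Y2
    draws = []
    for _ in range(num_samples):
        # Step 1: theta1 ~ p1(theta1 | Y1); Y2 plays no role here.
        theta1 = rng.gauss(sum(y1) / prec1, prec1 ** -0.5)
        # Step 2: theta2 ~ p2(theta2 | theta1, Y2), conditional on the draw.
        mean2 = sum(y - theta1 for y in y2) / prec2
        theta2 = rng.gauss(mean2, prec2 ** -0.5)
        draws.append((theta1, theta2))
    return draws


def sample_plugin(y1, y2, v, num_samples, rng):
    """Plug-in alternative: theta1 is fixed at its posterior mean."""
    theta1_hat = sum(y1) / (1.0 + len(y1))
    prec2 = 1.0 / v + len(y2)
    mean2 = sum(y - theta1_hat for y in y2) / prec2
    return [(theta1_hat, rng.gauss(mean2, prec2 ** -0.5))
            for _ in range(num_samples)]
```

Calling both with the same data and a seeded generator such as `random.Random(1)` and comparing the spread of the $\theta_2$ draws shows the uncertainty about $\theta_1$ that the cut distribution propagates and the plug-in approach discards.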
  • 33. Cut approach. From the OpenBUGS manual (Spiegelhalter, Thomas, Best & Lunn, 2004): “The cut function acts as a kind of valve in the graph: prior information is allowed to flow downwards through the cut, but likelihood information is prevented from flowing upwards.”
  • 36. Cut approach. Difference between cut and standard posterior density functions: $\pi_{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2) \propto p_1(\theta_1)\, p_1(Y_1 \mid \theta_1)\, \frac{p_2(\theta_2 \mid \theta_1)\, p_2(Y_2 \mid \theta_1, \theta_2)}{p_2(Y_2 \mid \theta_1)} \propto \frac{\pi(\theta_1, \theta_2 \mid Y_1, Y_2)}{p_2(Y_2 \mid \theta_1)}$. The term $p_2(Y_2 \mid \theta_1)$ is a measure of feedback of $Y_2$ onto $\theta_1$: $p_2(Y_2 \mid \theta_1) = \int p_2(Y_2 \mid \theta_1, \theta_2)\, p_2(\theta_2 \mid \theta_1)\, d\theta_2$. The marginal distribution of $\theta_1$ differs: $p_1(\theta_1 \mid Y_1)$ for the cut, $\pi(\theta_1 \mid Y_1, Y_2)$ for the standard posterior. The conditional distribution of $\theta_2$ is the same: $p_2(\theta_2 \mid \theta_1, Y_2)$.
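As a worked instance of the feedback term, consider the biased data example from the earlier slides. The choice of example and the notation $\bar{Y}_2$ are ours, not from this slide. Integrating out $\theta_2 \sim \mathrm{Normal}(0, v)$ gives, as a function of $\theta_1$:

```latex
\begin{align*}
p_2(Y_2 \mid \theta_1)
  &= \int \prod_{i=1}^{n_2} \mathrm{Normal}\big(Y_2^i;\, \theta_1 + \theta_2,\, 1\big)\,
     \mathrm{Normal}\big(\theta_2;\, 0,\, v\big)\, d\theta_2 \\
  &\propto \exp\!\left( -\frac{(\bar{Y}_2 - \theta_1)^2}{2\,(v + 1/n_2)} \right),
  \qquad \bar{Y}_2 = \frac{1}{n_2} \sum_{i=1}^{n_2} Y_2^i .
\end{align*}
```

When $v$ is large this factor is nearly flat in $\theta_1$, so the cut and standard posteriors nearly agree; when $v$ is small the factor pulls $\theta_1$ toward $\bar{Y}_2$, which is exactly the feedback that the cut removes.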
  • 38. A variational representation. Among all distributions $q(\theta_1, \theta_2)$ with marginal $p_1(\theta_1 \mid Y_1)$ on $\theta_1$, $\pi_{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2)$ minimizes the Kullback–Leibler divergence $\mathrm{KL}\big(q(\theta_1, \theta_2),\, \pi(\theta_1, \theta_2 \mid Y_1, Y_2)\big)$. Lemma 1 in Yu, Nott & Smith (2021). Variational inference for cutting feedback in misspecified models. We can also view the cut distribution as a valid representation of beliefs about the parameters.
  • 42. Updating beliefs. Bissiri, Holmes & Walker (2016). A general framework for updating belief distributions. Prior beliefs $p(\theta)$, loss associated with data $l(y, \theta)$; assume loss and prior are independent pieces of information. We look for an update “$p(\theta \mid y)$” $= \psi(l(y, \theta), p(\theta))$. Coherence: $\psi\big(l(y', \theta), \psi(l(y, \theta), p(\theta))\big) = \psi\big(l(y', \theta) + l(y, \theta), p(\theta)\big)$. Result: for coherence and optimality, we need to define $p(\theta \mid y) = \operatorname{argmin}_q \left\{ \int l(y, \theta)\, q(d\theta) + \mathrm{KL}(q(\theta), p(\theta)) \right\}$. Solution: $p(\theta \mid y) \propto \exp(-l(y, \theta))\, p(\theta)$.
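The step from the minimization problem to its solution can be checked in one line. The derivation below is a sketch under the assumptions that $q$ and $p$ admit densities and that the normalizing constant $Z$ is finite; the symbols $\tilde{p}$ and $Z$ are introduced here for illustration and do not appear on the slide.

```latex
\begin{align*}
\int l(y,\theta)\, q(\theta)\, d\theta + \mathrm{KL}(q, p)
  &= \int q(\theta) \log \frac{q(\theta)}{p(\theta)\, e^{-l(y,\theta)}}\, d\theta \\
  &= \mathrm{KL}(q, \tilde{p}) - \log Z,
  \qquad \tilde{p}(\theta) = \frac{e^{-l(y,\theta)}\, p(\theta)}{Z},
  \quad Z = \int e^{-l(y,\theta)}\, p(\theta)\, d\theta .
\end{align*}
```

Since $\log Z$ does not depend on $q$ and the Kullback–Leibler divergence is non-negative and vanishes only at $q = \tilde{p}$, the minimizer is $\tilde{p}(\theta) \propto \exp(-l(y,\theta))\, p(\theta)$. Taking $l$ to be the negative log-likelihood recovers the usual posterior.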
  • 44. Cut distributions are valid belief updates. Carmona & Nicholls (2020). Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components. Retrieve the cut distribution in the framework of Bissiri, Holmes & Walker (2016), with the loss “$l(y, \theta)$” $= -\left\{ \log p_1(Y_1 \mid \theta_1)\, p_2(Y_2 \mid \theta_1, \theta_2) - \log p_2(Y_2 \mid \theta_1) \right\}$. The loss involves $p_2(Y_2 \mid \theta_1) = \int p_2(Y_2 \mid \theta_1, \theta_2)\, p_2(\theta_2 \mid \theta_1)\, d\theta_2$ and thus depends on the prior... the argument needs adjustments. Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for prequentially additive loss functions arising in Semi-Modular Inference.
  • 46. Variants. Allow a controlled amount of feedback: Carmona & Nicholls (2020). Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components. Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for prequentially additive loss functions arising in Semi-Modular Inference. Cut, and replace the minus log-likelihood by other loss functions: Frazier & Nott (2022). Cutting feedback and modularized analyses in generalized Bayesian inference.
  • 49. Asymptotics. Asymptotics of two-step estimators in Murphy & Topel (1985). Estimation and Inference in Two-Step Econometric Models. Extended to cut distributions in Pompe & Jacob (2022). Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap. Scenario A: $n_1 / n_2 \to \alpha > 0$ and $p^\star(Y_{1,1:n_1}, Y_{2,1:n_2}) = \prod_{i=1}^{n_1} p_1^\star(Y_{1,i}) \prod_{i=1}^{n_2} p_2^\star(Y_{2,i})$. Scenario B: $n_1 = n_2 = n$ and $p^\star(Y_{1,1:n_1}, Y_{2,1:n_2}) = \prod_{i=1}^{n} p^\star(Y_{1,i}, Y_{2,i}) \neq \prod_{i=1}^{n} p_1^\star(Y_{1,i})\, p_2^\star(Y_{2,i})$.
  • 52. Asymptotics. Under regularity conditions, introduce the two-step MLEs: $\hat\theta_1 = \operatorname{argmax}_{\theta_1} \log p_1(Y_1 \mid \theta_1) \to \theta_1^\star$ and $\hat\theta_2 = \operatorname{argmax}_{\theta_2} \log p_2(Y_2 \mid \hat\theta_1, \theta_2) \to \theta_2^\star$. Their asymptotic joint distribution is Normal with variance $\Sigma_A$ under scenario A, $\Sigma_B$ under scenario B. Sandwich formula: $\mathbb{E}_\star\!\left[-\frac{d^2 \ell(\theta^\star)}{d\theta^2}\right]^{-1} \mathbb{E}_\star\!\left[\left(\frac{d \ell(\theta^\star)}{d\theta}\right)^2\right] \mathbb{E}_\star\!\left[-\frac{d^2 \ell(\theta^\star)}{d\theta^2}\right]^{-1}$, plus possible dependencies between $Y_1$ and $Y_2$ in scenario B. For $(\theta_1, \theta_2)$ drawn from the cut distribution, $\sqrt{n}\,(\theta_1 - \hat\theta_1, \theta_2 - \hat\theta_2) \to \mathrm{Normal}(0, \Sigma_C)$, where $\Sigma_C \equiv \mathbb{E}_\star\!\left[-\frac{d^2 \ell(\theta^\star)}{d\theta^2}\right]^{-1}$ does not match $\Sigma_A$ or $\Sigma_B$.
  • 54. Asymptotics. Frazier & Nott (2022). Cutting feedback and modularized analyses in generalized Bayesian inference. Focus on the asymptotic behaviour of $p_2(\theta_2 \mid \theta_1, Y_2)$, assuming $\theta_1$ is fixed in a neighborhood of the MLE limit $\theta_1^\star$. The asymptotic conditional distribution $p_2(\theta_2 \mid \theta_1, Y_2)$ is Normal, with both mean and variance depending explicitly on $\theta_1$. This allows a finer understanding of the impact of the uncertainty of $\theta_1$ on that of $\theta_2$.
  • 56. Supporters say aye, opponents say no. Setup: model 2 depends on an input that is itself estimated using model 1. Cut distribution. Pros: can mitigate the effect of misspecification; facilitates interoperability across teams; can resolve computational intractability of the joint model; not completely unprincipled. Cons: can lead to sub-optimal estimation/prediction accuracy; is no replacement for constructive model criticism; associated computations present their own challenges.
  • 58. Asking the data whether to cut. Jacob, Murray, Holmes & Robert (2017). Better together? Statistical learning in models made of modules. We can try to be principled about whether to cut or not. Natural route: introduce measures of predictive performance that can be evaluated on test data. But which data: $Y_1$? $Y_2$? Postulate: parameters are meaningful in the module that first defines them. Thus distributions of parameters should be compared on predictions in the module that defines them.
  • 61. Asking the data whether to cut. In the first module, $\theta_1$ is defined in its relation to $Y_1$. We propose to assess candidate distributions for $\theta_1$ based on predictive performance for $Y_1$. Two candidates: $p_1(\theta_1 \mid Y_1)$ and $\pi(\theta_1 \mid Y_1, Y_2)$. Using the prequential approach and the logarithmic scoring rule, we compare $p_1(Y_1)$ and $\pi(Y_1 \mid Y_2)$. If $p_1(Y_1) > \pi(Y_1 \mid Y_2)$, we support the use of distributions on $(\theta_1, \theta_2)$ that admit $p_1(\theta_1 \mid Y_1)$ as first marginal, e.g. the cut distribution.
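For the biased data example, both quantities in this comparison are available in closed form, so the criterion can be sketched in a few lines. This is an illustration under the Normal model from the earlier slides, not the authors' code; all function names are hypothetical. Each evidence is accumulated prequentially as a sum of one-step-ahead log predictive densities, and $\pi(Y_1 \mid Y_2)$ is obtained by first updating the prior on $\theta_1$ with $Y_2$ (with $\theta_2$ integrated out).

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of Normal(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def prequential_log_evidence(y1, prior_mean, prior_var):
    """log p(Y1) as a sum of one-step-ahead log predictive densities.

    Assumes Y1_i ~ Normal(theta1, 1) and theta1 ~ Normal(prior_mean, prior_var).
    """
    mean, var, total = prior_mean, prior_var, 0.0
    for y in y1:
        total += log_normal_pdf(y, mean, var + 1.0)  # predictive for next point
        post_prec = 1.0 / var + 1.0                   # conjugate update
        mean = (mean / var + y) / post_prec
        var = 1.0 / post_prec
    return total


def theta1_given_y2(y2, v):
    """(mean, variance) of theta1 given Y2 only, with theta2 integrated out."""
    w = 1.0 / (v + 1.0 / len(y2))
    prec = 1.0 + w
    return w * (sum(y2) / len(y2)) / prec, 1.0 / prec


def log_score_difference(y1, y2, v):
    """log p1(Y1) - log pi(Y1 | Y2); a positive value favours cutting."""
    log_p1 = prequential_log_evidence(y1, 0.0, 1.0)
    m, s2 = theta1_given_y2(y2, v)
    return log_p1 - prequential_log_evidence(y1, m, s2)
```

A positive difference says that $Y_1$ is predicted better without $Y_2$, which under the postulate above supports distributions with first marginal $p_1(\theta_1 \mid Y_1)$.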
  • 62. Outline: (1) Models made of modules and issues with joint modeling; (2) Cutting feedback: cut posterior distributions; (3) Computation involved when cutting feedback.
  • 65. Confusion about cut distributions. From Gelman (2020), blog post entitled How to “cut” using Stan, if you must. Question (rephrased for brevity): Have cut posteriors been implemented in Stan? Reply: “This topic has come up before, and I don’t think this “cut” is a good idea. If you want to implement it, [...] you’d first fit model 1 and get posterior simulations, then approx those simulations by a mixture of multivariate normal or t distributions, then use that as a prior for model 2. [...]” This would in fact amount to a two-step approximation of the standard posterior distribution, not the cut distribution!
  • 67. Modular approximations of the standard posterior. Lunn, Barrett, Sweeting & Thompson (2013). Fully Bayesian hierarchical modelling in two stages, with application to meta-analysis. Goudie, Presanis, Lunn, De Angelis & Wernisch (2016). Model surgery: joining and splitting models with Markov melding. Manderson & Goudie (2021). A numerically stable algorithm for integrating Bayesian models using Markov melding. Leonelli, Barons & Smith (2018). A conditional independence framework for coherent modularized inference. Huge interest in approximating the supraBayesian with a decentralized strategy, but this is not about cutting feedback.
  • 69. Sampling from the cut distribution. The density of the cut distribution is $\pi_{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2) \propto \frac{\pi(\theta_1, \theta_2 \mid Y_1, Y_2)}{p_2(Y_2 \mid \theta_1)}$. The term $p_2(Y_2 \mid \theta_1) = \int p_2(Y_2 \mid \theta_1, \theta_2)\, p_2(\theta_2 \mid \theta_1)\, d\theta_2$ is typically intractable. MCMC approach for doubly intractable targets: Liu & Goudie (2021). Stochastic approximation cut algorithm for inference in modularized Bayesian models.
  • 70. Sampling from the cut distribution. OpenBUGS’ approach via the cut function: alternate between sampling $\theta_1'$ from $K_1(\theta_1, d\theta_1')$ targeting $p_1(d\theta_1 \mid Y_1)$, and sampling $\theta_2'$ from $K_{2,\theta_1'}(\theta_2, d\theta_2')$ targeting $p_2(d\theta_2 \mid \theta_1', Y_2)$. This does not leave the cut distribution invariant. Iterating the kernel $K_{2,\theta_1'}$ enough times mitigates the issue. Plummer (2014). Cuts in Bayesian graphical models.
  • 71. Sampling from the cut distribution. In a perfect sampling world, we could sample $\theta_1$ from $p_1(\theta_1 \mid Y_1)$, then $\theta_2$ given $\theta_1$ from $p_2(\theta_2 \mid \theta_1, Y_2)$; then $(\theta_1, \theta_2)$ would follow the cut distribution exactly. For many models, exact sampling is not feasible.
  • 72. Sampling from the cut distribution. In an MCMC world, we can sample $\theta_1$ approximately from $p_1(\theta_1 \mid Y_1)$ using MCMC, then $\theta_2$ given $\theta_1$ approximately from $p_2(\theta_2 \mid \theta_1, Y_2)$ using MCMC; the resulting samples approximate the cut distribution, in the limit of the numbers of iterations in both stages. Con: this can involve tuning and convergence diagnostics for many MCMC runs at the 2nd stage, each with a different target.
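The two-stage scheme can be sketched with a generic random-walk Metropolis kernel, again on the biased data example. This is a minimal illustration, not an implementation from any of the cited papers: the function names, step sizes and iteration counts are arbitrary assumptions, and in this toy model exact sampling would of course be preferable. For each retained first-stage draw of $\theta_1$, a fresh second-stage chain targets $p_2(\theta_2 \mid \theta_1, Y_2)$ and only its last state is kept.

```python
import math
import random


def rw_metropolis(log_target, init, num_iters, step, rng):
    """Random-walk Metropolis on the real line; returns the whole chain."""
    x, lp = init, log_target(init)
    chain = []
    for _ in range(num_iters):
        prop = x + step * rng.gauss(0.0, 1.0)
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop) / target(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop
        chain.append(x)
    return chain


def nested_mcmc_cut(y1, y2, v, num_outer, burn1, inner_iters, rng):
    """Approximate cut draws: one inner chain per first-stage draw."""
    def log_p1(t1):
        # log p1(theta1) + log p1(Y1 | theta1), up to an additive constant
        return -0.5 * t1 ** 2 - 0.5 * sum((y - t1) ** 2 for y in y1)

    stage1 = rw_metropolis(log_p1, 0.0, burn1 + num_outer, 1.0, rng)[burn1:]
    draws = []
    for t1 in stage1:
        def log_p2(t2, t1=t1):
            # log p2(theta2) + log p2(Y2 | theta1, theta2), up to a constant
            return (-0.5 * t2 ** 2 / v
                    - 0.5 * sum((y - t1 - t2) ** 2 for y in y2))

        t2 = rw_metropolis(log_p2, 0.0, inner_iters, 1.0, rng)[-1]
        draws.append((t1, t2))
    return draws
```

The cost noted on the slide is visible in the structure: every outer draw triggers a separate inner run whose target depends on $\theta_1$, so `inner_iters` and the inner step size would in general need tuning per target.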
  • 76. Unbiased estimation of the cut distribution. Jacob, O’Leary & Atchadé (2020). Unbiased Markov chain Monte Carlo with couplings. [Figure: pair of coupled chains with their meeting time marked; horizontal axis from 0 to 200.] By coupling pairs of $\pi$-invariant chains in a particular way, we can construct a signed measure $\hat\pi(d\theta) = \sum_{n=1}^{N} \omega_n\, \delta_{\theta_n}(d\theta)$ with $\mathbb{E}[\hat\pi(h)] = \pi(h)$ for a class of test functions $h$.
  • 79. Unbiased estimation of the cut distribution. In an unbiased MCMC world, we can approximate without bias: $p_1(d\theta_1 \mid Y_1)$, with a random measure $\hat\pi_1(d\theta_1) = \sum_{n=1}^{N_1} \omega_{1,n}\, \delta_{\theta_{1,n}}(d\theta_1)$ obtained from coupled $p_1(d\theta_1 \mid Y_1)$-invariant chains; and $p_2(d\theta_2 \mid \theta_1, Y_2)$ for any $\theta_1$, with a random measure $\hat\pi_2(d\theta_2 \mid \theta_1) = \sum_{n=1}^{N_2} \omega_{2,n}\, \delta_{\theta_{2,n}}(d\theta_2)$ obtained from coupled $p_2(d\theta_2 \mid \theta_1, Y_2)$-invariant chains. Using the tower property, we can estimate without bias expectations with respect to $\pi_{\mathrm{cut}}(\theta_1, \theta_2; Y_1, Y_2)$.
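The tower property argument can be written out as follows. This is a sketch rather than a statement from the slides: the notation $\hat{H}$ and $\hat\pi_{\mathrm{cut}}$ is introduced here, and it assumes that, given each first-stage atom $\theta_{1,n}$, the second-stage measure is built from independent coupled chains, and that the inner expectation belongs to the class of test functions for which $\hat\pi_1$ is unbiased.

```latex
\begin{align*}
\pi_{\mathrm{cut}}(h)
  &= \int \left\{ \int h(\theta_1, \theta_2)\, p_2(d\theta_2 \mid \theta_1, Y_2) \right\}
     p_1(d\theta_1 \mid Y_1), \\
\hat{H}(\theta_1)
  &= \sum_{m=1}^{N_2} \omega_{2,m}\, h(\theta_1, \theta_{2,m}),
  \qquad
  \mathbb{E}\big[\hat{H}(\theta_1) \mid \theta_1\big]
  = \int h(\theta_1, \theta_2)\, p_2(d\theta_2 \mid \theta_1, Y_2), \\
\hat{\pi}_{\mathrm{cut}}(h)
  &= \sum_{n=1}^{N_1} \omega_{1,n}\, \hat{H}(\theta_{1,n}),
  \qquad
  \mathbb{E}\big[\hat{\pi}_{\mathrm{cut}}(h)\big] = \pi_{\mathrm{cut}}(h).
\end{align*}
```

The last equality follows by conditioning on the first-stage chains, replacing each $\hat{H}(\theta_{1,n})$ by its conditional expectation, and then using the unbiasedness of $\hat\pi_1$.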
  • 82. Inexact approximations of the cut distribution. Variational inference: Yu, Nott & Smith (2021). Variational inference for cutting feedback in misspecified models. Carmona & Nicholls (2022). Scalable Semi-Modular Inference with Variational Meta-Posteriors. Posterior bootstrap: Pompe & Jacob (2022). Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap, adapting techniques developed earlier by Newton & Raftery, Fong, Lyddon, Holmes & Walker. Modular approximate Bayesian computation: Chakraborty, Nott, Drovandi, Frazier & Sisson (2022). Modularized Bayesian analyses and cutting feedback in likelihood-free inference.
  • 86. Discussion. A very appealing aspect of Bayesian analysis is its unified treatment of many statistical questions. Modular approaches appear to depart from the main framework, and thus bring discomfort. If the essence of Bayesian analysis is not the direct application of Bayes’ theorem, but probability distributions for the quantities of interest, constructed from data and prior information; a framework for statistical inference supported by principles and decision theory; a good excuse to play with Markov chains; then cut posteriors might be essentially Bayesian. Thank you!