Bayesian inference with
models made of modules
Pierre E. Jacob
2022 ISBA World Meeting
Susie Bayarri Lecture
June 30, 2022
Pierre E. Jacob Models made of modules 1
Outline
1 Models made of modules and issues with joint modeling
2 Cutting feedback: cut posterior distributions
3 Computation involved when cutting feedback
Pierre E. Jacob Models made of modules 2
Models made of modules
Let’s start with a simple example. . .
Pierre E. Jacob Models made of modules 3
Models made of modules
Ogle, Barber & Sartor (2013). Feedback and Modularization in a Bayesian
Meta–analysis of Tree Traits Affecting Forest Dynamics.
Pierre E. Jacob Models made of modules 4
Models made of modules
Zooming in. . . we see arrows. . . and diodes . . . ?
Pierre E. Jacob Models made of modules 5
Models made of modules
First module:
parameter θ1, data Y1
prior: p1(θ1)
likelihood: p1(Y1|θ1)
Second module:
parameter θ2, data Y2
prior: p2(θ2|θ1)
likelihood: p2 (Y2|θ1, θ2)
Pierre E. Jacob Models made of modules 6
Joint model approach
Parameter (θ1, θ2), with prior
p(θ1, θ2) = p1(θ1)p2(θ2|θ1).
Data (Y1, Y2), likelihood
p(Y1, Y2|θ1, θ2) = p1(Y1|θ1)p2(Y2|θ1, θ2).
Posterior distribution
π (θ1, θ2|Y1, Y2) ∝ p1 (θ1) p1(Y1|θ1)p2 (θ2|θ1) p2 (Y2|θ1, θ2).
Departures from the joint model approach: why, how, when?
Pierre E. Jacob Models made of modules 7
Example: biased data
Liu, Bayarri & Berger (2009). Modularization in Bayesian analysis,
with emphasis on analysis of computer models.
Location model:
∀i = 1, . . . , n1: Y1,i ∼ Normal(θ1, 1),
θ1 ∼ Normal(0, 1).
Extra data Y2 suspected to be biased:
∀i = 1, . . . , n2: Y2,i ∼ Normal(θ1 + θ2, 1),
θ2 ∼ Normal(0, v).
If interest is in θ1: are the extra data useful or harmful?
If interest is in θ2: joint model or “two-step” approach?
Pierre E. Jacob Models made of modules 8
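As a concrete numerical illustration (not part of the original slides), the following Python sketch simulates this example and compares the closed-form posterior of θ1 under module 1 alone with its marginal under the joint model; the sample sizes n1, n2, the prior variance v and the true bias are arbitrary assumptions, chosen so that the extra data are plentiful and strongly biased.

```python
# Hypothetical illustration of the biased-data example (all settings assumed).
import numpy as np

rng = np.random.default_rng(1)
n1, n2, v = 20, 200, 0.01          # tight prior on the bias theta2
theta1_true, bias_true = 1.0, 0.5  # the extra data are in fact strongly biased
Y1 = rng.normal(theta1_true, 1.0, size=n1)
Y2 = rng.normal(theta1_true + bias_true, 1.0, size=n2)
S1, S2 = Y1.sum(), Y2.sum()

# Module 1 alone: theta1 | Y1 ~ Normal(S1 / (n1 + 1), 1 / (n1 + 1)).
m1_mean, m1_sd = S1 / (n1 + 1), np.sqrt(1.0 / (n1 + 1))

# Joint model: (theta1, theta2) | Y1, Y2 is bivariate Normal with precision Q.
Q = np.array([[1.0 + n1 + n2, n2],
              [n2, 1.0 / v + n2]])
joint_cov = np.linalg.inv(Q)
joint_mean = joint_cov @ np.array([S1 + S2, S2])

print("theta1 | Y1 only:", round(m1_mean, 3), "+/-", round(m1_sd, 3))
print("theta1 | Y1, Y2 :", round(joint_mean[0], 3), "+/-",
      round(np.sqrt(joint_cov[0, 0]), 3))
# With n2 >> n1 and a tight prior on theta2, the joint posterior of theta1 is
# dragged towards theta1_true + bias_true: here the extra data are harmful.
```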
Example: SARS-COV-2 prevalence
Nicholson et al. (2022). Interoperability of statistical models in
pandemic preparedness: principles and reality.
Prevalence π of SARS-COV-2 in the UK, estimated from
randomized surveillance data: u positive out of U tested,
Hypergeometric model with parameter π.
targeted surveillance data (patients with clinical need,
health & care workers): n positive out of N tested,
Binomial model involving π, P(tested|infected) and
P(tested|not infected).
If interest is in π: are the extra data useful or harmful?
If interest is in e.g. P(tested|infected): joint model or
“two-step” approach?
Pierre E. Jacob Models made of modules 9
Example: stochastic dynamical models
Parslow, Cressie, Campbell, Jones & Murray (2013). Bayesian
learning and predictability in a stochastic nonlinear dynamical model.
Geophysics model of the temperature of the ocean, φ.
The temperature φ can be used as “forcings” in a model of
plankton population size β, for example in an SDE model
dβt = µ(βt, φt)dt + σ(βt, φt)dWt.
We might want to:
propagate uncertainty about the geophysics to the biology?
allow/prevent feedback from the biology to the geophysics?
Pierre E. Jacob Models made of modules 10
Example: PKPD
Bennett & Wakefield (2001). Errors-in-variables in joint population
pharmacokinetic/pharmacodynamic modeling.
Lunn, Best, Spiegelhalter, Graham & Neuenschwander (2009). Combining
MCMC with ‘sequential’ PKPD modelling.
Pharmacokinetics (PK):
models the time course of drug absorption.
∀t Yt ∼ Normal(log Ct, vPK), where Ct = function(t, θPK).
From this we extract (C(j)(t))t≥0 for individual j = 1, . . . , J.
Pharmacodynamics (PD):
models the effect of drugs.
∀j Zj ∼ Normal(Ej, vPD), where Ej = function(C(j)(tj), θPD),
and where tj is the time at which Ej is measured.
Pierre E. Jacob Models made of modules 11
Example: HPV prevalence and cervical cancer incidence
Plummer (2014). Cuts in Bayesian graphical models.
Human papillomavirus prevalence ϕi in country i:
∀i = 1, . . . , I Zi ∼ Binomial(Ni, ϕi),
Zi: number of women infected with high-risk HPV,
Ni: population size in country i.
Impact of prevalence onto cervical cancer occurrence:
∀i = 1, . . . , I Yi ∼ Poisson(λiTi), log(λi) = η1 + η2 ϕi,
Yi is number of cases during study in country i,
Ti: woman-years of follow-up in country i.
Pierre E. Jacob Models made of modules 12
Example: two-step estimation
Murphy & Topel (1985). Estimation and Inference in Two-Step
Econometric Models.
Impact of unanticipated money growth on unemployment.
∀t Mt = θX1t + εt,
Mt: proportional growth in the M1 definition of money,
X1t: lagged Mt, lagged unemployment, more variables.
∀t log(Ut / (1 − Ut)) = βX2t + γεt + Wt,
Ut: annual average unemployment rate,
X2t: minimum wage, more variables.
Joint estimation “inappropriate or computationally infeasible”.
Pierre E. Jacob Models made of modules 13
Example: multiple imputation
Missing data: imputation of missing values, then analysis
of completed data.
Jackson, Best & Richardson (2009). Bayesian graphical models
for regression on multiple data sets with different variables.
Multiphase inference: a first analyst pre-processes raw
data, then a second analyst uses the processed data.
Blocker & Meng (2013). The potential and perils of
preprocessing: Building new foundations.
Pierre E. Jacob Models made of modules 14
More examples
Environmental epidemiology: estimation of environmental
exposure, then associated health effects.
Blangiardo, Hansell & Richardson (2011). A Bayesian model of
time activity data to investigate health effect of air pollution in
time series studies.
Causal inference with propensity scores: estimation of
probability of individuals receiving treatment, then
treatment effect adjusted for propensity score.
Zigler, Watts, Yeh, Wang, Coull & Dominici (2013). Model
feedback in Bayesian propensity score estimation.
Saarela, Belzile & Stephens (2016). A Bayesian view of doubly
robust causal inference.
Jacob, Murray, Holmes & Robert (2017). Better together? Statistical
learning in models made of modules.
Pierre E. Jacob Models made of modules 15
Supporters say aye, opponents say no
Setup: model 2 depends on an input that is itself estimated
using model 1.
Bayesian analysis with the joint model:
(+) coherency, simultaneous treatment of uncertainty, and
other appeals of standard Bayes,
(+) computational toolbox is already available,
(−) computationally challenging
as difficulties pile up with more modules,
(−) parameters might be hard to interpret as their meaning
changes across modules,
(−) module misspecification means that incorporating more
data is not necessarily beneficial, and sometimes harmful.
Pierre E. Jacob Models made of modules 16
Outline
1 Models made of modules and issues with joint modeling
2 Cutting feedback: cut posterior distributions
3 Computation involved when cutting feedback
Pierre E. Jacob Models made of modules 16
Plug-in approach
Simple:
1. First estimate θ1 given Y1, e.g. θ̂1 = ∫ θ1 p1(θ1|Y1) dθ1,
2. Inference on θ2 given Y2 and θ̂1 using
p2(θ2|θ̂1, Y2) = p2(θ2|θ̂1) p2(Y2|θ̂1, θ2) / p2(Y2|θ̂1).
(−) Uncertainty about θ1 is ignored in the estimation of θ2.
(+) Misspecification of 2nd module does not impact θ1.
Pierre E. Jacob Models made of modules 17
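In the conjugate Gaussian example above, both steps of the plug-in approach are available in closed form. A minimal sketch under the same assumed settings as before (an illustration, not code from the talk):

```python
# Plug-in approach in the Gaussian biased-data example (assumed settings).
import numpy as np

rng = np.random.default_rng(1)
n1, n2, v = 20, 200, 0.01
Y1 = rng.normal(1.0, 1.0, size=n1)
Y2 = rng.normal(1.5, 1.0, size=n2)

# Step 1: point estimate of theta1 from module 1 only (its posterior mean).
theta1_hat = Y1.sum() / (n1 + 1)

# Step 2: conjugate update of theta2 given Y2 and the plugged-in theta1_hat:
# theta2 | theta1_hat, Y2 ~ Normal((S2 - n2 theta1_hat) / (n2 + 1/v), 1 / (n2 + 1/v)).
prec2 = n2 + 1.0 / v
theta2_mean = (Y2.sum() - n2 * theta1_hat) / prec2
theta2_sd = np.sqrt(1.0 / prec2)
print("plug-in theta2:", round(theta2_mean, 3), "+/-", round(theta2_sd, 3))
# The reported spread of theta2 ignores the uncertainty about theta1, which is
# exactly the drawback listed above.
```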
Cut approach
Propagate uncertainty without allowing feedback.
Define the cut distribution:
πcut(θ1, θ2; Y1, Y2) = p1(θ1|Y1) p2(θ2|θ1, Y2).
Known as Bayesian two-step estimation, or as “cutting
feedback”, term suggested by Nicky Best according to Jonty
Rougier, in a comment on Sansó, Forest & Zantedeschi (2008).
Ideal sampling procedure:
1. Sample θ1 from p1(θ1|Y1).
2. Sample θ2 given θ1 from p2(θ2|θ1, Y2).
3. Output (θ1, θ2).
Pierre E. Jacob Models made of modules 18
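In the conjugate Gaussian example, both steps of this ideal procedure are exact, so the cut distribution can be sampled directly. A minimal sketch, under the same assumed settings as in the earlier illustrations:

```python
# Exact sampling from the cut distribution in the Gaussian biased-data example
# (assumed settings; illustration only).
import numpy as np

rng = np.random.default_rng(1)
n1, n2, v = 20, 200, 0.01
Y1 = rng.normal(1.0, 1.0, size=n1)
Y2 = rng.normal(1.5, 1.0, size=n2)
S1, S2 = Y1.sum(), Y2.sum()

def sample_cut(n_draws):
    # 1. Sample theta1 from p1(theta1 | Y1), the feedback-free module-1 posterior.
    theta1 = rng.normal(S1 / (n1 + 1), np.sqrt(1.0 / (n1 + 1)), size=n_draws)
    # 2. Sample theta2 given theta1 from p2(theta2 | theta1, Y2).
    prec2 = n2 + 1.0 / v
    theta2 = rng.normal((S2 - n2 * theta1) / prec2, np.sqrt(1.0 / prec2))
    # 3. Output the pairs (theta1, theta2).
    return theta1, theta2

theta1, theta2 = sample_cut(10_000)
print("cut theta1:", theta1.mean().round(3), "+/-", theta1.std().round(3))
print("cut theta2:", theta2.mean().round(3), "+/-", theta2.std().round(3))
# theta1 keeps its module-1 posterior, and theta2 inherits the extra spread due
# to the uncertainty about theta1, unlike under the plug-in approach.
```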
Cut approach
From the OpenBUGS manual,
Spiegelhalter, Thomas, Best & Lunn, 2004:
The cut function acts as a kind of valve in the graph:
prior information is allowed to flow downwards through
the cut, but likelihood information is prevented from
flowing upwards.
Pierre E. Jacob Models made of modules 19
Cut approach
Difference between cut and standard posterior density functions:
πcut(θ1, θ2; Y1, Y2) ∝ p1(θ1) p1(Y1|θ1) p2(θ2|θ1) p2(Y2|θ1, θ2) / p2(Y2|θ1)
∝ π(θ1, θ2|Y1, Y2) / p2(Y2|θ1).
The term p2(Y2|θ1) is a measure of feedback of Y2 onto θ1:
p2(Y2|θ1) = ∫ p2(Y2|θ1, θ2) p2(θ2|θ1) dθ2.
The marginal distribution of θ1 differs,
p1(θ1|Y1) for cut, π(θ1|Y1, Y2) for standard posterior,
The conditional distribution of θ2 is the same: p2(θ2|θ1, Y2).
Pierre E. Jacob Models made of modules 20
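In the Gaussian example the feedback term is available in closed form: integrating θ2 out, Y2 given θ1 is multivariate Normal with mean θ1·1 and covariance I + v·11ᵀ, whose log-density follows from the Sherman–Morrison identity. A small sketch (illustration only, under the same assumed parameter settings) evaluating log p2(Y2|θ1) on a grid shows how this term acts as an extra likelihood for θ1 in the standard posterior:

```python
# Feedback term p2(Y2 | theta1) in the Gaussian biased-data example (assumed settings).
import numpy as np

rng = np.random.default_rng(1)
n2, v = 200, 0.01
Y2 = rng.normal(1.5, 1.0, size=n2)   # assumed biased extra data
S2 = Y2.sum()

def log_feedback(theta1):
    # log N(Y2; theta1 * 1, I + v * 11^T) via Sherman-Morrison and the
    # matrix determinant lemma: det(I + v * 11^T) = 1 + n2 * v.
    resid = Y2 - theta1
    quad = resid @ resid - (v / (1.0 + n2 * v)) * (S2 - n2 * theta1) ** 2
    return -0.5 * (quad + np.log(1.0 + n2 * v) + n2 * np.log(2 * np.pi))

for t in np.linspace(0.5, 2.0, 7):
    print(f"theta1 = {t:.2f}   log p2(Y2 | theta1) = {log_feedback(t):.1f}")
# The standard posterior multiplies p1(theta1 | Y1) by this term (up to constants),
# pulling theta1 towards the biased data; the cut distribution leaves it out.
```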
A variational representation
Among all distributions q(θ1, θ2) with marginal p1(θ1|Y1) on θ1,
πcut (θ1, θ2; Y1, Y2) minimizes
Kullback–Leibler (q(θ1, θ2), π(θ1, θ2|Y1, Y2)) .
Lemma 1 in Yu, Nott & Smith (2021). Variational inference for
cutting feedback in misspecified models.
We can also view the cut distribution as a valid representation
of beliefs about the parameters.
Pierre E. Jacob Models made of modules 21
Updating beliefs
Bissiri, Holmes & Walker (2016). A general framework for updating
belief distributions.
Prior beliefs p(θ), loss associated with data l(y, θ),
assume loss and prior are independent pieces of information.
We look for an update “p(θ|y)”= ψ(l(y, θ), p(θ)).
Coherence:
ψ(l(y′, θ), ψ(l(y, θ), p(θ))) = ψ(l(y′, θ) + l(y, θ), p(θ)).
Result: for coherence and optimality, we need to define
p(θ|y) = argmin_q ∫ l(y, θ) q(dθ) + KL(q(θ), p(θ)).
Solution: p(θ|y) ∝ exp(−l(y, θ)) p(θ).
Pierre E. Jacob Models made of modules 22
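For completeness, here is the short standard argument (not spelled out on the slide) for why the optimal update takes this exponential form: the objective can be rewritten as a single Kullback–Leibler divergence.

```latex
\int l(y,\theta)\, q(\mathrm{d}\theta) + \mathrm{KL}\big(q(\theta),\, p(\theta)\big)
  = \int q(\theta)\,\log \frac{q(\theta)}{p(\theta)\exp(-l(y,\theta))}\,\mathrm{d}\theta
  = \mathrm{KL}\!\left(q(\theta),\, \frac{p(\theta)\exp(-l(y,\theta))}{Z(y)}\right) - \log Z(y),
\qquad Z(y) = \int p(\theta)\exp(-l(y,\theta))\,\mathrm{d}\theta,
```

so the minimizer is q(θ) ∝ p(θ) exp(−l(y, θ)), which is the stated solution.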
Cut distributions are valid belief updates
Carmona & Nicholls (2020). Semi-Modular Inference: enhanced
learning in multi-modular models by tempering the influence of
components.
Retrieve the cut distribution in the framework of Bissiri,
Holmes & Walker (2016), with the loss
“l(y, θ)” = − {log p(Y1|θ1)p(Y2|θ1, θ2) − log p2(Y2|θ1)} .
The loss involves p2(Y2|θ1) = ∫ p2(Y2|θ1, θ2) p2(θ2|θ1) dθ2 and
thus depends on the prior. . . the argument needs adjustments.
Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for
prequentially additive loss functions arising in Semi-Modular
Inference.
Pierre E. Jacob Models made of modules 23
Variants
Allow a controlled amount of feedback.
Carmona & Nicholls (2020). Semi-Modular Inference: enhanced
learning in multi-modular models by tempering the influence of
components.
Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for
prequentially additive loss functions arising in Semi-Modular
Inference.
Cut + replace minus log-likelihood by other loss functions.
Frazier & Nott (2022). Cutting feedback and modularized
analyses in generalized Bayesian inference.
Pierre E. Jacob Models made of modules 24
Asymptotics
Asymptotics of two-step estimators in Murphy & Topel (1985).
Estimation and Inference in Two-Step Econometric Models.
Extended to cut distributions in Pompe & Jacob (2022).
Asymptotics of cut distributions and robust modular inference using
Posterior Bootstrap.
scenario A: n1/n2 → α > 0,
p⋆(Y1,1:n1, Y2,1:n2) = ∏i=1,...,n1 p⋆1(Y1,i) · ∏i=1,...,n2 p⋆2(Y2,i).
scenario B: n1 = n2 = n,
p⋆(Y1,1:n1, Y2,1:n2) = ∏i=1,...,n p⋆(Y1,i, Y2,i) ≠ ∏i=1,...,n p⋆1(Y1,i) p⋆2(Y2,i).
Pierre E. Jacob Models made of modules 25
Asymptotics
Under regularity conditions, introduce the two-step MLEs:
θ̂1 = argmax_θ1 log p1(Y1|θ1) −→ θ⋆1,
θ̂2 = argmax_θ2 log p2(Y2|θ̂1, θ2) −→ θ⋆2.
Their asymptotic joint distribution is Normal with variance
ΣA under scenario A, ΣB under scenario B.
Sandwich formula: E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹ E⋆[(dℓ(θ⋆)/dθ)²] E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹,
+ possible dependencies between Y1 and Y2 in scenario B.
For (θ1, θ2) drawn from the cut distribution,
√n (θ1 − θ̂1, θ2 − θ̂2) → Normal(0, ΣC).
ΣC ≡ E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹ does not match ΣA or ΣB.
Pierre E. Jacob Models made of modules 26
Asymptotics
Frazier & Nott (2022). Cutting feedback and modularized analyses in
generalized Bayesian inference.
Focus on the asymptotic behaviour of p2(θ2|θ1, Y2),
assuming θ1 is fixed in a neighborhood of the MLE limit θ⋆1.
The asymptotic conditional distribution p2(θ2|θ1, Y2) is Normal
with both mean and variance depending explicitly on θ1.
Allows a finer understanding of the impact of
the uncertainty of θ1 onto that of θ2.
Pierre E. Jacob Models made of modules 27
Supporters say aye, opponents say no
Setup: model 2 depends on an input that is itself estimated
using model 1.
Cut distribution:
(+) Can mitigate effect of misspecification.
(+) Facilitates interoperability across teams.
(+) Can resolve computational intractability of joint model.
(+) Not completely unprincipled.
(−) Can lead to sub-optimal estimation/prediction accuracy.
(−) Is no replacement for constructive model criticism.
(−) Associated computations present their own challenges.
Pierre E. Jacob Models made of modules 28
Asking the data whether to cut
Jacob, Murray, Holmes & Robert (2017). Better together? Statistical
learning in models made of modules.
We can try to be principled about whether to cut or not.
Natural route: introduce measures of predictive performance
that can be evaluated on test data. But which data: Y1? Y2?
Postulate: parameters are meaningful in the module that first
defines them. Thus distributions of parameters should be
compared on predictions in the module that defines them.
Pierre E. Jacob Models made of modules 29
Asking the data whether to cut
In the first module, θ1 is defined in its relation to Y1.
We propose to assess candidate distributions for θ1 based on
predictive performance for Y1.
Two candidates:
p1(θ1|Y1) and π(θ1|Y1, Y2).
Using the prequential approach and the logarithmic scoring
rule, we compare
p1(Y1) and π(Y1|Y2).
If p1(Y1) > π(Y1|Y2), we support the use of distributions on
(θ1, θ2) that admit p1(θ1|Y1) as first marginal, e.g. cut.
Pierre E. Jacob Models made of modules 30
Outline
1 Models made of modules and issues with joint modeling
2 Cutting feedback: cut posterior distributions
3 Computation involved when cutting feedback
Pierre E. Jacob Models made of modules 30
Confusion about cut distributions
From Gelman (2020) blog post entitled How to “cut” using
Stan, if you must.
Question (rephrased for brevity):
Have cut posteriors been implemented in Stan?
Reply:
This topic has come up before, and I don’t think this
“cut” is a good idea. If you want to implement it, [. . . ]
you’d first fit model 1 and get posterior simulations,
then approx those simulations by a mixture of multivari-
ate normal or t distributions, then use that as a prior
for model 2. [. . . ]
This would in fact amount to a two-step approximation of the
standard posterior distribution, not the cut distribution!
Pierre E. Jacob Models made of modules 31
Modular approximations of the standard posterior
Lunn, Barrett, Sweeting & Thompson (2013). Fully Bayesian
hierarchical modelling in two stages, with application to
meta-analysis.
Goudie, Presanis, Lunn, De Angelis & Wernisch (2016). Model
surgery: joining and splitting models with Markov melding.
Manderson & Goudie (2021). A numerically stable algorithm for
integrating Bayesian models using Markov melding.
Leonelli, Barons & Smith (2018). A conditional independence
framework for coherent modularized inference.
Huge interest in approximating the supraBayesian with a
de-centralized strategy, but this is not about cutting feedback.
Pierre E. Jacob Models made of modules 32
Sampling from the cut distribution
The density of the cut distribution is
πcut(θ1, θ2; Y1, Y2) ∝ π(θ1, θ2|Y1, Y2) / p2(Y2|θ1).
The term p2(Y2|θ1) is typically intractable,
p2(Y2|θ1) = ∫ p2(Y2|θ1, θ2) p2(θ2|θ1) dθ2.
MCMC approach for doubly intractable targets:
Liu & Goudie (2021). Stochastic approximation cut algorithm for
inference in modularized Bayesian models.
Pierre E. Jacob Models made of modules 33
Sampling from the cut distribution
OpenBUGS’ approach via the cut function: alternate between
sampling θ1′ from K1(θ1, dθ1′) targeting p1(dθ1|Y1),
sampling θ2′ from K2,θ1′(θ2, dθ2′) targeting p2(dθ2|θ1′, Y2).
This does not leave the cut distribution invariant. Iterating the
kernel K2,θ1′ enough times mitigates the issue.
Plummer (2014). Cuts in Bayesian graphical models.
Pierre E. Jacob Models made of modules 34
Sampling from the cut distribution
In a perfect sampling world, we could sample
θ1 from p1(θ1|Y1),
θ2 given θ1 from p2(θ2|θ1, Y2),
then (θ1, θ2) would be exactly following the cut distribution.
For many models, exact sampling is not feasible.
Pierre E. Jacob Models made of modules 35
Sampling from the cut distribution
In an MCMC world, we can sample
θ1 approximately from p1(θ1|Y1) using MCMC,
θ2 given θ1 approximately from p2(θ2|θ1, Y2) using MCMC,
then resulting samples approximate the cut distribution,
in the limit of the numbers of iterations in both stages.
(−) Can involve tuning and convergence diagnostics for many
MCMC runs at the 2nd stage, each with a different target.
Pierre E. Jacob Models made of modules 36
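A minimal sketch of this two-stage MCMC recipe, applied to the conjugate Gaussian biased-data example with random-walk Metropolis–Hastings at both stages (an illustration under assumed settings, deliberately ignoring the conjugate shortcut; not code from the talk):

```python
# Two-stage MCMC approximation of the cut distribution (illustration only):
# stage 1 targets p1(theta1 | Y1); stage 2 runs, for each retained theta1 draw,
# a separate chain targeting p2(theta2 | theta1, Y2) and keeps its last state.
import numpy as np

rng = np.random.default_rng(1)
n1, n2, v = 20, 200, 0.01
Y1 = rng.normal(1.0, 1.0, size=n1)
Y2 = rng.normal(1.5, 1.0, size=n2)

def log_p1(theta1):          # log p1(theta1) + log p1(Y1 | theta1), up to constants
    return -0.5 * theta1**2 - 0.5 * np.sum((Y1 - theta1)**2)

def log_p2(theta2, theta1):  # log p2(theta2 | theta1) + log p2(Y2 | theta1, theta2)
    return -0.5 * theta2**2 / v - 0.5 * np.sum((Y2 - theta1 - theta2)**2)

def rwmh(logdens, init, n_iter, step):
    x, lx, chain = init, logdens(init), np.empty(n_iter)
    for i in range(n_iter):
        prop = x + step * rng.normal()
        lp = logdens(prop)
        if np.log(rng.uniform()) < lp - lx:
            x, lx = prop, lp
        chain[i] = x
    return chain

# Stage 1: one long chain for theta1, thinned after burn-in.
theta1_draws = rwmh(log_p1, 0.0, 20_000, 0.5)[5_000::15]

# Stage 2: a fresh chain for theta2 at each retained theta1 draw.
theta2_draws = np.array([rwmh(lambda t2: log_p2(t2, t1), 0.0, 2_000, 0.2)[-1]
                         for t1 in theta1_draws])

print("cut approx, theta1:", theta1_draws.mean().round(3), theta1_draws.std().round(3))
print("cut approx, theta2:", theta2_draws.mean().round(3), theta2_draws.std().round(3))
# Each inner chain has a different target p2(. | theta1, Y2) and would, in
# practice, need its own tuning and diagnostics, as flagged above.
```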
Unbiased estimation of the cut distribution
Jacob, O’Leary & Atchadé (2020). Unbiased Markov chain Monte
Carlo with couplings.
By coupling pairs of π-invariant chains in a particular way,
[figure: traces of coupled chains over iterations 0–200, annotated with their meeting time]
we can construct a signed measure π̂(dθ) = Σn=1,...,N ωn δθn(dθ)
with E[π̂(h)] = π(h) for a class of test functions h.
Pierre E. Jacob Models made of modules 37
Unbiased estimation of the cut distribution
In an unbiased MCMC world, we can approximate without bias
p1(dθ1|Y1) with a random measure
π̂1(dθ1) = Σn=1,...,N1 ω1,n δθ1,n(dθ1),
obtained from coupled p1(dθ1|Y1)-invariant chains.
p2(dθ2|θ1, Y2) for any θ1, with a random measure
π̂2(dθ2|θ1) = Σn=1,...,N2 ω2,n δθ2,n(dθ2),
obtained from coupled p2(dθ2|θ1, Y2)-invariant chains.
Using the tower property, we can estimate without bias
expectations with respect to πcut(θ1, θ2; Y1, Y2).
Pierre E. Jacob Models made of modules 38
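A minimal, hypothetical sketch of the coupling machinery on a toy one-dimensional standard Normal target, standing in for p1(dθ1|Y1): coupled random-walk Metropolis–Hastings chains with a maximal coupling of the proposals and a common acceptance uniform, combined into the simple unbiased estimator h(X0) + Σt=1,...,τ−1 (h(Xt) − h(Yt−1)). This is not the authors' code, and in the cut setting the same construction would be applied at both stages.

```python
# Toy sketch of unbiased MCMC with couplings (assumed settings; not the authors' code).
import numpy as np

rng = np.random.default_rng(2)

def log_target(x):
    return -0.5 * x**2                 # toy target: standard Normal

def norm_logpdf(x, mean, sd):
    return -0.5 * ((x - mean) / sd)**2 - np.log(sd * np.sqrt(2 * np.pi))

def max_coupled_proposals(x, y, sd):
    # rejection-based maximal coupling of Normal(x, sd^2) and Normal(y, sd^2)
    xp = rng.normal(x, sd)
    if np.log(rng.uniform()) + norm_logpdf(xp, x, sd) <= norm_logpdf(xp, y, sd):
        return xp, xp                  # identical proposals: the chains can meet
    while True:
        yp = rng.normal(y, sd)
        if np.log(rng.uniform()) + norm_logpdf(yp, y, sd) > norm_logpdf(yp, x, sd):
            return xp, yp

def mh_step(x, sd=1.0):
    prop = rng.normal(x, sd)
    return prop if np.log(rng.uniform()) < log_target(prop) - log_target(x) else x

def coupled_mh_step(x, y, sd=1.0):
    xp, yp = max_coupled_proposals(x, y, sd)
    logu = np.log(rng.uniform())       # common uniform keeps the pair faithful
    x_new = xp if logu < log_target(xp) - log_target(x) else x
    y_new = yp if logu < log_target(yp) - log_target(y) else y
    return x_new, y_new

def unbiased_estimate(h, max_iter=10**5):
    # X_0, Y_0 from the same initial distribution; X is advanced one extra step,
    # then the pair evolves until X_t = Y_{t-1} (the meeting time tau).
    x, y = 3.0 * rng.normal(), 3.0 * rng.normal()
    est = h(x)                         # h(X_0)
    x = mh_step(x)                     # X_1
    for _ in range(max_iter):          # max_iter is only a safety cap
        if x == y:                     # chains have met: corrections stop
            break
        est += h(x) - h(y)             # bias correction h(X_t) - h(Y_{t-1})
        x, y = coupled_mh_step(x, y)
    return est

reps = [unbiased_estimate(lambda z: z) for _ in range(2000)]
print("estimate of E[theta]:", np.mean(reps), "+/-", np.std(reps) / np.sqrt(len(reps)))
```

Averaging many independent replicates gives a consistent, embarrassingly parallel estimate; the time-averaged version of the estimator described in the paper reduces the variance.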
Inexact approximations of the cut distribution
Variational inference:
Yu, Nott & Smith (2021). Variational inference for cutting
feedback in misspecified models.
Carmona & Nicholls (2022). Scalable Semi-Modular Inference
with Variational Meta-Posteriors.
Posterior bootstrap:
Pompe & Jacob (2022). Asymptotics of cut distributions and
robust modular inference using Posterior Bootstrap.
adapting techniques developed earlier
by Newton & Raftery, Fong, Lyddon, Holmes & Walker.
Modular approximate Bayesian computation:
Chakraborty, Nott, Drovandi, Frazier & Sisson (2022).
Modularized Bayesian analyses and cutting feedback in
likelihood-free inference.
Pierre E. Jacob Models made of modules 39
Discussion
A very appealing aspect of Bayesian analysis is its unified
treatment of many statistical questions.
Modular approaches appear to depart from the main
framework, and thus bring discomfort.
If the essence of Bayesian analysis is not the direct application
of Bayes’ theorem, but
probability distributions for the quantities of interest,
constructed from data and prior information,
a framework for statistical inference
supported by principles & decision theory,
a good excuse to play with Markov chains,
then cut posteriors might be essentially Bayesian.
Thank you!
Pierre E. Jacob Models made of modules 40
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
 
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
 
NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...
NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...
NO1 Best Kala Jadu Expert Specialist In Germany Kala Jadu Expert Specialist I...
 
How I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prisonHow I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prison
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
 
Formulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdfFormulas dax para power bI de microsoft.pdf
Formulas dax para power bI de microsoft.pdf
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call
 
basics of data science with application areas.pdf
basics of data science with application areas.pdfbasics of data science with application areas.pdf
basics of data science with application areas.pdf
 
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
AI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdfAI Imagen for data-storytelling Infographics.pdf
AI Imagen for data-storytelling Infographics.pdf
 
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
 

ISBA 2022 Susie Bayarri lecture

  • 13. Example: SARS-COV-2 prevalence Nicholson et al. (2022). Interoperability of statistical models in pandemic preparedness: principles and reality. Prevalence π of SARS-COV-2 in the UK, estimated from randomized surveillance data: u positive out of U tested, Hypergeometric model with parameter π. targeted surveillance data (patients with clinical need, health & care workers): n positive out of N tested, Binomial model involving π, P(tested|infected) and P(tested|not infected). If interest is in π: are the extra data useful or harmful? If interest is in e.g. P(tested|infected): joint model or “two-step” approach? Pierre E. Jacob Models made of modules 9
  • 14. Example: stochastic dynamical models Parslow, Cressie, Campbell, Jones & Murray (2013). Bayesian learning and predictability in a stochastic nonlinear dynamical model. Geophysics model of the temperature of the ocean, φ. The temperature φ can be used as “forcings” in a model of plankton population size β, for example in an SDE model dβt = µ(βt, φt)dt + σ(βt, φt)dWt. Pierre E. Jacob Models made of modules 10
  • 15. Example: stochastic dynamical models Parslow, Cressie, Campbell, Jones & Murray (2013). Bayesian learning and predictability in a stochastic nonlinear dynamical model. Geophysics model of the temperature of the ocean, φ. The temperature φ can be used as “forcings” in a model of plankton population size β, for example in an SDE model dβt = µ(βt, φt)dt + σ(βt, φt)dWt. We might want to: propagate uncertainty about the geophysics to the biology? allow/prevent feedback from the biology to the geophysics? Pierre E. Jacob Models made of modules 10
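To make the role of the forcing concrete, here is a minimal Euler-Maruyama simulation in Python of an SDE for β driven by a temperature path φ; the specific drift μ, diffusion σ, temperature path and parameter values are hypothetical, chosen only to show how φt enters the dynamics of βt.

import numpy as np

rng = np.random.default_rng(1)
T, dt = 10.0, 0.01
n = int(T / dt)
t = np.linspace(0.0, T, n + 1)

phi = 15.0 + 2.0 * np.sin(2 * np.pi * t / 10.0)   # hypothetical temperature path, the "forcing"

def mu(beta, phi):     # hypothetical drift: growth modulated by temperature
    return 0.5 * beta * (phi / 15.0) * (1.0 - beta / 10.0)

def sigma(beta, phi):  # hypothetical diffusion
    return 0.1 * np.sqrt(max(beta, 0.0))

beta = np.empty(n + 1)
beta[0] = 1.0
for i in range(n):
    dW = rng.normal(0.0, np.sqrt(dt))
    beta[i + 1] = beta[i] + mu(beta[i], phi[i]) * dt + sigma(beta[i], phi[i]) * dW
print("final plankton state:", beta[-1])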
  • 16. Example: PKPD Bennett & Wakefield (2001). Errors-in-variables in joint population pharmacokinetic/pharmacodynamic modeling. Lunn, Best, Spiegelhalter, Graham & Neuenschwander (2009). Combining MCMC with ‘sequential’ PKPD modelling. Pharmacokinetics (PK): models the time course of drug absorption. ∀t Yt ∼ Normal(log Ct, vPK), where Ct = function(t, θPK). From this we extract (C_t^(j))_{t≥0} for individual j = 1, . . . , J. Pharmacodynamics (PD): models the effect of drugs. ∀j Zj ∼ Normal(Ej, vPD), where Ej = function(C_{tj}^(j), θPD), and where tj is the time at which Ej is measured. Pierre E. Jacob Models made of modules 11
  • 17. Example: HPV prevalence and cervical cancer incidence Plummer (2014). Cuts in Bayesian graphical models. Human papillomavirus prevalence ϕi in country i: ∀i = 1, . . . , I Zi ∼ Binomial(Ni, ϕi), Zi: number of women infected with high-risk HPV, Ni: population size in country i. Impact of prevalence onto cervical cancer occurrence: ∀i = 1, . . . , I Yi ∼ Poisson(λiTi), log(λi) = η1 + η2 ϕi, Yi is number of cases during study in country i, Ti: woman-years of follow-up in country i. Pierre E. Jacob Models made of modules 12
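For later reference in the computational part of the talk, a minimal Python transcription of the two modules of this example; the Binomial and Poisson structure follows the slide, while the data arrays and the Beta/Normal priors are hypothetical placeholders.

import numpy as np
from scipy import stats

# hypothetical data for I = 3 countries
Z = np.array([50, 160, 30]);  N = np.array([1000, 2000, 500])   # HPV surveys: infected out of tested
Y = np.array([12, 40, 5]);    T = np.array([1e4, 3e4, 5e3])     # cancer cases, woman-years of follow-up

def log_module1(phi):
    # first module: Z_i ~ Binomial(N_i, phi_i), with a hypothetical Beta(1, 1) prior on each phi_i
    return np.sum(stats.binom.logpmf(Z, N, phi) + stats.beta.logpdf(phi, 1.0, 1.0))

def log_module2(eta, phi):
    # second module: Y_i ~ Poisson(lambda_i T_i), log lambda_i = eta1 + eta2 phi_i,
    # with a hypothetical Normal(0, 10^2) prior on (eta1, eta2)
    lam = np.exp(eta[0] + eta[1] * phi)
    return (np.sum(stats.poisson.logpmf(Y, lam * T))
            + np.sum(stats.norm.logpdf(eta, 0.0, 10.0)))

phi0 = np.array([0.05, 0.08, 0.06])
print(log_module1(phi0), log_module2(np.array([-7.0, 1.0]), phi0))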
  • 18. Example: two-step estimation Murphy & Topel (1985). Estimation and Inference in Two-Step Econometric Models. Impact of unanticipated money growth on unemployment. ∀t Mt = θX1t + εt, Mt: proportional growth in the M1 definition of money, X1t: lagged Mt, lagged unemployment, more variables. ∀t log(Ut / (1 − Ut)) = βX2t + γεt + Wt, Ut: annual average unemployment rate, X2t: minimum wage, more variables. Pierre E. Jacob Models made of modules 13
  • 19. Example: two-step estimation Murphy & Topel (1985). Estimation and Inference in Two-Step Econometric Models. Impact of unanticipated money growth on unemployment. ∀t Mt = θX1t + εt, Mt: proportional growth in the M1 definition of money, X1t: lagged Mt, lagged unemployment, more variables. ∀t log(Ut / (1 − Ut)) = βX2t + γεt + Wt, Ut: annual average unemployment rate, X2t: minimum wage, more variables. Joint estimation “inappropriate or computationally infeasible”. Pierre E. Jacob Models made of modules 13
  • 20. Example: multiple imputation Missing data: imputation of missing values, then analysis of completed data. Jackson, Best & Richardson (2009). Bayesian graphical models for regression on multiple data sets with different variables. Pierre E. Jacob Models made of modules 14
  • 21. Example: multiple imputation Missing data: imputation of missing values, then analysis of completed data. Jackson, Best & Richardson (2009). Bayesian graphical models for regression on multiple data sets with different variables. Multiphase inference: a first analyst pre-processes raw data, then a second analyst uses the processed data. Blocker & Meng (2013). The potential and perils of preprocessing: Building new foundations. Pierre E. Jacob Models made of modules 14
  • 22. More examples Environmental epidemiology: estimation of environmental exposure, then associated health effects. Blangiardo, Hansell & Richardson (2011). A Bayesian model of time activity data to investigate health effect of air pollution in time series studies. Pierre E. Jacob Models made of modules 15
  • 23. More examples Environmental epidemiology: estimation of environmental exposure, then associated health effects. Blangiardo, Hansell & Richardson (2011). A Bayesian model of time activity data to investigate health effect of air pollution in time series studies. Causal inference with propensity scores: estimation of probability of individuals receiving treatment, then treatment effect adjusted for propensity score. Zigler, Watts, Yeh, Wang, Coull & Dominici (2013). Model feedback in Bayesian propensity score estimation. Saarela, Belzile & Stephens (2016). A Bayesian view of doubly robust causal inference. Pierre E. Jacob Models made of modules 15
  • 24. More examples Environmental epidemiology: estimation of environmental exposure, then associated health effects. Blangiardo, Hansell & Richardson (2011). A Bayesian model of time activity data to investigate health effect of air pollution in time series studies. Causal inference with propensity scores: estimation of probability of individuals receiving treatment, then treatment effect adjusted for propensity score. Zigler, Watts, Yeh, Wang, Coull & Dominici (2013). Model feedback in Bayesian propensity score estimation. Saarela, Belzile & Stephens (2016). A Bayesian view of doubly robust causal inference. Jacob, Murray, Holmes & Robert (2017). Better together? Statistical learning in models made of modules. Pierre E. Jacob Models made of modules 15
  • 25. Supporters say aye, opponents say no Setup: model 2 depends on an input that is itself estimated using model 1. Bayesian analysis with the joint model: (+) coherency, simultaneous treatment of uncertainty, and other appeals of standard Bayes, (+) computational toolbox is already available, Pierre E. Jacob Models made of modules 16
  • 26. Supporters say aye, opponents say no Setup: model 2 depends on an input that is itself estimated using model 1. Bayesian analysis with the joint model: (+) coherency, simultaneous treatment of uncertainty, and other appeals of standard Bayes, (+) computational toolbox is already available, (−) computationally challenging as difficulties pile up with more modules, (−) parameters might be hard to interpret as their meaning changes across modules, (−) module misspecification means that incorporating more data is not necessarily beneficial, and sometimes harmful. Pierre E. Jacob Models made of modules 16
  • 27. Outline 1 Models made of modules and issues with joint modeling 2 Cutting feedback: cut posterior distributions 3 Computation involved when cutting feedback Pierre E. Jacob Models made of modules 16
  • 28. Plug-in approach Simple: 1 first estimate θ1 given Y1, e.g. θ̂1 = ∫ θ1 p1(θ1|Y1) dθ1, 2 inference on θ2 given Y2 and θ̂1 using p2(θ2|θ̂1, Y2) = p2(θ2|θ̂1) p2(Y2|θ̂1, θ2) / p2(Y2|θ̂1). Pierre E. Jacob Models made of modules 17
  • 29. Plug-in approach Simple: 1 first estimate θ1 given Y1, e.g. θ̂1 = ∫ θ1 p1(θ1|Y1) dθ1, 2 inference on θ2 given Y2 and θ̂1 using p2(θ2|θ̂1, Y2) = p2(θ2|θ̂1) p2(Y2|θ̂1, θ2) / p2(Y2|θ̂1). (−) Uncertainty about θ1 is ignored in the estimation of θ2. (+) Misspecification of 2nd module does not impact θ1. Pierre E. Jacob Models made of modules 17
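A minimal Python sketch of the plug-in approach on a hypothetical two-module toy model (Beta-Binomial first module with a closed-form posterior mean, Poisson second module explored by Metropolis-Hastings at the plugged-in value); none of the numbers come from the slides.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# module 1 (hypothetical): y1 successes out of N1 trials, conjugate Beta(1, 1) prior on theta1
N1, y1 = 1000, 62
theta1_hat = (y1 + 1.0) / (N1 + 2.0)          # posterior mean, plugged in below

# module 2 (hypothetical): Y2_j ~ Poisson(exp(theta2 * theta1)), Normal(0, 10^2) prior on theta2
y2 = rng.poisson(lam=np.exp(5.0 * 0.06), size=20)

def log_p2(theta2, theta1):
    lam = np.exp(theta2 * theta1)
    return np.sum(stats.poisson.logpmf(y2, lam)) + stats.norm.logpdf(theta2, 0.0, 10.0)

# random-walk Metropolis-Hastings targeting p2(theta2 | theta1_hat, Y2)
theta2, chain = 0.0, []
for _ in range(5000):
    prop = theta2 + 0.5 * rng.normal()
    if np.log(rng.uniform()) < log_p2(prop, theta1_hat) - log_p2(theta2, theta1_hat):
        theta2 = prop
    chain.append(theta2)
print("plug-in posterior mean of theta2:", np.mean(chain[1000:]))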
  • 30. Cut approach Propagate uncertainty without allowing feedback. Define the cut distribution: πcut (θ1, θ2; Y1, Y2) = p1(θ1|Y1)p2 (θ2|θ1, Y2). Pierre E. Jacob Models made of modules 18
  • 31. Cut approach Propagate uncertainty without allowing feedback. Define the cut distribution: πcut (θ1, θ2; Y1, Y2) = p1(θ1|Y1)p2 (θ2|θ1, Y2). Known as Bayesian two-step estimation, or as “cutting feedback”, term suggested by Nicky Best according to Jonty Rougier, in a comment on Sansó, Forest & Zantedeschi (2008). Pierre E. Jacob Models made of modules 18
  • 32. Cut approach Propagate uncertainty without allowing feedback. Define the cut distribution: πcut (θ1, θ2; Y1, Y2) = p1(θ1|Y1)p2 (θ2|θ1, Y2). Known as Bayesian two-step estimation, or as “cutting feedback”, term suggested by Nicky Best according to Jonty Rougier, in a comment on Sansó, Forest & Zantedeschi (2008). Ideal sampling procedure: 1 Sample θ1 from p1(θ1|Y1). 2 Sample θ2 given θ1 from p2(θ2|θ1, Y2). 3 Output (θ1, θ2). Pierre E. Jacob Models made of modules 18
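When both conditionals can be sampled exactly, the ideal procedure really is a few lines per draw. Below is a Python sketch under hypothetical conjugate choices (Beta-Binomial first module, Normal second module whose observation precision depends on θ1), so that both steps are exact; the data and priors are made up.

import numpy as np

rng = np.random.default_rng(0)

# module 1 (hypothetical): y1 successes out of N1 trials, Beta(1, 1) prior on theta1 in (0, 1)
N1, y1 = 200, 45

# module 2 (hypothetical): Y2_i ~ Normal(theta2, 1/theta1), Normal(0, 100) prior on theta2
Y2 = rng.normal(loc=2.0, scale=2.0, size=30)
n2, prior_prec = Y2.size, 1.0 / 100.0

cut_samples = []
for _ in range(10000):
    # step 1: theta1 ~ p1(theta1 | Y1), exact by Beta-Binomial conjugacy
    theta1 = rng.beta(1.0 + y1, 1.0 + N1 - y1)
    # step 2: theta2 ~ p2(theta2 | theta1, Y2), exact by Normal-Normal conjugacy
    post_prec = prior_prec + n2 * theta1
    theta2 = rng.normal(theta1 * Y2.sum() / post_prec, 1.0 / np.sqrt(post_prec))
    cut_samples.append((theta1, theta2))
print("cut means (theta1, theta2):", np.mean(cut_samples, axis=0))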
  • 33. Cut approach From the OpenBUGS manual, Spiegelhalter, Thomas, Best & Lunn, 2004: The cut function acts as a kind of valve in the graph: prior information is allowed to flow downwards through the cut, but likelihood information is prevented from flowing upwards. Pierre E. Jacob Models made of modules 19
  • 34. Cut approach Difference between cut and standard posterior density functions: πcut (θ1, θ2; Y1, Y2) ∝ p1(θ1)p1(Y1|θ1) p2(θ2|θ1)p2(Y2|θ1, θ2) / p2(Y2|θ1) ∝ π(θ1, θ2|Y1, Y2) / p2(Y2|θ1). Pierre E. Jacob Models made of modules 20
  • 35. Cut approach Difference between cut and standard posterior density functions: πcut (θ1, θ2; Y1, Y2) ∝ p1(θ1)p1(Y1|θ1) p2(θ2|θ1)p2(Y2|θ1, θ2) / p2(Y2|θ1) ∝ π(θ1, θ2|Y1, Y2) / p2(Y2|θ1). The term p2(Y2|θ1) is a measure of feedback of Y2 onto θ1: p2(Y2|θ1) = ∫ p2(Y2|θ1, θ2)p2(θ2|θ1)dθ2. Pierre E. Jacob Models made of modules 20
  • 36. Cut approach Difference between cut and standard posterior density functions: πcut (θ1, θ2; Y1, Y2) ∝ p1(θ1)p1(Y1|θ1) p2(θ2|θ1)p2(Y2|θ1, θ2) / p2(Y2|θ1) ∝ π(θ1, θ2|Y1, Y2) / p2(Y2|θ1). The term p2(Y2|θ1) is a measure of feedback of Y2 onto θ1: p2(Y2|θ1) = ∫ p2(Y2|θ1, θ2)p2(θ2|θ1)dθ2. The marginal distribution of θ1 differs, p1(θ1|Y1) for cut, π(θ1|Y1, Y2) for standard posterior, The conditional distribution of θ2 is the same: p2(θ2|θ1, Y2). Pierre E. Jacob Models made of modules 20
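A small numerical check of these identities on a hypothetical toy model evaluated on grids: the feedback term p2(Y2|θ1) is obtained by integrating θ2 out numerically, the θ1-marginal of the cut density recovers p1(θ1|Y1) exactly, and the standard posterior θ1-marginal is reweighted by the feedback term. Model, priors and data below are made up.

import numpy as np
from scipy import stats

# hypothetical toy model on grids: theta1 in (0,1), theta2 real
t1 = np.linspace(0.01, 0.99, 200)
t2 = np.linspace(-8.0, 8.0, 400)
Y1, N1 = 12, 40                        # module 1: Y1 ~ Binomial(N1, theta1), Beta(1, 1) prior
Y2 = np.array([1.4, 0.2, 2.1])         # module 2: Y2_i ~ Normal(theta1 * theta2, 1), Normal(0, 2^2) prior

p1_post = stats.beta.pdf(t1, 1 + Y1, 1 + N1 - Y1)                   # p1(theta1 | Y1)
prior2 = stats.norm.pdf(t2, 0.0, 2.0)                               # p2(theta2 | theta1), no theta1 dependence here
lik2 = np.prod(stats.norm.pdf(Y2[None, None, :],
                              t1[:, None, None] * t2[None, :, None], 1.0), axis=2)

feedback = np.trapz(lik2 * prior2[None, :], t2, axis=1)             # p2(Y2 | theta1), theta2 integrated out

cut = p1_post[:, None] * lik2 * prior2[None, :] / feedback[:, None] # cut density on the grid
post = p1_post[:, None] * lik2 * prior2[None, :]                    # standard posterior, unnormalized
post /= np.trapz(np.trapz(post, t2, axis=1), t1)

print(np.allclose(np.trapz(cut, t2, axis=1), p1_post))              # cut theta1-marginal equals p1(theta1 | Y1)
post_marg1 = np.trapz(post, t2, axis=1)                             # posterior theta1-marginal, reweighted by feedback
print(np.trapz(t1 * p1_post, t1), np.trapz(t1 * post_marg1, t1))    # the two theta1 means differ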
  • 37. A variational representation Among all distributions q(θ1, θ2) with marginal p1(θ1|Y1) on θ1, πcut (θ1, θ2; Y1, Y2) minimizes Kullback–Leibler (q(θ1, θ2), π(θ1, θ2|Y1, Y2)) . Lemma 1 in Yu, Nott & Smith (2021). Variational inference for cutting feedback in misspecified models. Pierre E. Jacob Models made of modules 21
  • 38. A variational representation Among all distributions q(θ1, θ2) with marginal p1(θ1|Y1) on θ1, πcut (θ1, θ2; Y1, Y2) minimizes Kullback–Leibler (q(θ1, θ2), π(θ1, θ2|Y1, Y2)) . Lemma 1 in Yu, Nott & Smith (2021). Variational inference for cutting feedback in misspecified models. We can also view the cut distribution as a valid representation of beliefs about the parameters. Pierre E. Jacob Models made of modules 21
  • 39. Updating beliefs Bissiri, Holmes & Walker (2016). A general framework for updating belief distributions. Prior beliefs p(θ), loss associated with data l(y, θ), assume loss and prior are independent pieces of information. Pierre E. Jacob Models made of modules 22
  • 40. Updating beliefs Bissiri, Holmes & Walker (2016). A general framework for updating belief distributions. Prior beliefs p(θ), loss associated with data l(y, θ), assume loss and prior are independent pieces of information. We look for an update “p(θ|y)”= ψ(l(y, θ), p(θ)). Pierre E. Jacob Models made of modules 22
  • 41. Updating beliefs Bissiri, Holmes & Walker (2016). A general framework for updating belief distributions. Prior beliefs p(θ), loss associated with data l(y, θ), assume loss and prior are independent pieces of information. We look for an update “p(θ|y)”= ψ(l(y, θ), p(θ)). Coherence: ψ(l(y′, θ), ψ(l(y, θ), p(θ))) = ψ(l(y′, θ) + l(y, θ), p(θ)). Pierre E. Jacob Models made of modules 22
  • 42. Updating beliefs Bissiri, Holmes & Walker (2016). A general framework for updating belief distributions. Prior beliefs p(θ), loss associated with data l(y, θ), assume loss and prior are independent pieces of information. We look for an update “p(θ|y)”= ψ(l(y, θ), p(θ)). Coherence: ψ(l(y′, θ), ψ(l(y, θ), p(θ))) = ψ(l(y′, θ) + l(y, θ), p(θ)). Result: for coherence and optimality, we need to define p(θ|y) = argmin_q { ∫ l(y, θ)q(dθ) + KL(q(θ), p(θ)) }. Solution: p(θ|y) ∝ exp(−l(y, θ))p(θ). Pierre E. Jacob Models made of modules 22
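A minimal grid illustration of the resulting update in Python, with a hypothetical Normal prior and an additive absolute-error loss, just to show exp(−l(y, θ)) p(θ) being normalized into a belief distribution.

import numpy as np
from scipy import stats

theta = np.linspace(-5.0, 5.0, 1001)
y = np.array([0.8, 1.5, 1.1])                                  # hypothetical data
prior = stats.norm.pdf(theta, 0.0, 2.0)                        # hypothetical prior p(theta)
loss = np.sum(np.abs(y[:, None] - theta[None, :]), axis=0)     # additive absolute-error loss l(y, theta)

belief = np.exp(-loss) * prior                                 # "p(theta | y)" proportional to exp(-l(y, theta)) p(theta)
belief /= np.trapz(belief, theta)
print("belief-update mean:", np.trapz(theta * belief, theta))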
  • 43. Cut distributions are valid belief updates Carmona & Nicholls (2020). Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components. Retrieve the cut distribution in the framework of Bissiri, Holmes & Walker (2016), with the loss “l(y, θ)” = − {log p(Y1|θ1)p(Y2|θ1, θ2) − log p2(Y2|θ1)} . Pierre E. Jacob Models made of modules 23
  • 44. Cut distributions are valid belief updates Carmona & Nicholls (2020). Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components. Retrieve the cut distribution in the framework of Bissiri, Holmes & Walker (2016), with the loss “l(y, θ)” = − {log p(Y1|θ1)p(Y2|θ1, θ2) − log p2(Y2|θ1)} . The loss involves p2(Y2|θ1) = ∫ p2(Y2|θ1, θ2)p2(θ2|θ1)dθ2 and thus depends on the prior. . . the argument needs adjustments. Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for prequentially additive loss functions arising in Semi-Modular Inference. Pierre E. Jacob Models made of modules 23
  • 45. Variants Allow a controlled amount of feedback. Carmona & Nicholls (2020). Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components. Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for prequentially additive loss functions arising in Semi-Modular Inference. Pierre E. Jacob Models made of modules 24
  • 46. Variants Allow a controlled amount of feedback. Carmona & Nicholls (2020). Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components. Nicholls, Lee, Wu & Carmona (2022). Valid belief updates for prequentially additive loss functions arising in Semi-Modular Inference. Cut + replace minus log-likelihood by other loss functions. Frazier & Nott (2022). Cutting feedback and modularized analyses in generalized Bayesian inference. Pierre E. Jacob Models made of modules 24
  • 47. Asymptotics Asymptotics of two-step estimators in Murphy & Topel (1985). Estimation and Inference in Two-Step Econometric Models. Extended to cut distributions in Pompe & Jacob (2022). Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap. Pierre E. Jacob Models made of modules 25
  • 48. Asymptotics Asymptotics of two-step estimators in Murphy & Topel (1985). Estimation and Inference in Two-Step Econometric Models. Extended to cut distributions in Pompe & Jacob (2022). Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap. scenario A: n1/n2 → α > 0, p⋆(Y1,1:n1, Y2,1:n2) = ∏_{i=1}^{n1} p⋆1(Y1,i) ∏_{i=1}^{n2} p⋆2(Y2,i) Pierre E. Jacob Models made of modules 25
  • 49. Asymptotics Asymptotics of two-step estimators in Murphy & Topel (1985). Estimation and Inference in Two-Step Econometric Models. Extended to cut distributions in Pompe & Jacob (2022). Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap. scenario A: n1/n2 → α > 0, p⋆(Y1,1:n1, Y2,1:n2) = ∏_{i=1}^{n1} p⋆1(Y1,i) ∏_{i=1}^{n2} p⋆2(Y2,i) scenario B: n1 = n2 = n, p⋆(Y1,1:n1, Y2,1:n2) = ∏_{i=1}^{n} p⋆(Y1,i, Y2,i) ≠ ∏_{i=1}^{n} p⋆1(Y1,i) p⋆2(Y2,i) Pierre E. Jacob Models made of modules 25
  • 50. Asymptotics Under regularity conditions, introduce the two-step MLEs: θ̂1 = argmax_θ1 log p1(Y1|θ1) −→ θ⋆1, θ̂2 = argmax_θ2 log p2(Y2|θ̂1, θ2) −→ θ⋆2. Their asymptotic joint distribution is Normal with variance ΣA under scenario A, ΣB under scenario B. Pierre E. Jacob Models made of modules 26
  • 51. Asymptotics Under regularity conditions, introduce the two-step MLEs: θ̂1 = argmax_θ1 log p1(Y1|θ1) −→ θ⋆1, θ̂2 = argmax_θ2 log p2(Y2|θ̂1, θ2) −→ θ⋆2. Their asymptotic joint distribution is Normal with variance ΣA under scenario A, ΣB under scenario B. Sandwich formula: E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹ E⋆[(dℓ(θ⋆)/dθ)²] E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹ + possible dependencies between Y1 and Y2 in scenario B. Pierre E. Jacob Models made of modules 26
  • 52. Asymptotics Under regularity conditions, introduce the two-step MLEs: θ̂1 = argmax_θ1 log p1(Y1|θ1) −→ θ⋆1, θ̂2 = argmax_θ2 log p2(Y2|θ̂1, θ2) −→ θ⋆2. Their asymptotic joint distribution is Normal with variance ΣA under scenario A, ΣB under scenario B. Sandwich formula: E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹ E⋆[(dℓ(θ⋆)/dθ)²] E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹ + possible dependencies between Y1 and Y2 in scenario B. For (θ1, θ2) drawn from the cut distribution, √n(θ1 − θ̂1, θ2 − θ̂2) → Normal(0, ΣC). ΣC ≡ E⋆[−d²ℓ(θ⋆)/dθ²]⁻¹ does not match ΣA or ΣB. Pierre E. Jacob Models made of modules 26
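A small numerical illustration of the two-step MLEs in Python using scipy.optimize, on a hypothetical pair of modules (Exponential first module, Poisson second module); only the two-step structure matters here (maximize module 1 first, then module 2 with θ̂1 held fixed), the model itself is made up and the sandwich variance computation is omitted.

import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(1)

# hypothetical data: module 1 is Y1_i ~ Exponential(rate = theta1),
# module 2 is Y2_i ~ Poisson(exp(theta2 * theta1))
theta1_true, theta2_true = 2.0, 1.5
Y1 = rng.exponential(scale=1.0 / theta1_true, size=500)
Y2 = rng.poisson(lam=np.exp(theta2_true * theta1_true), size=500)

def nll1(theta1):
    return -np.sum(stats.expon.logpdf(Y1, scale=1.0 / theta1))

def nll2(theta2, theta1):
    return -np.sum(stats.poisson.logpmf(Y2, np.exp(theta2 * theta1)))

# step 1: maximize the module-1 log-likelihood
theta1_hat = optimize.minimize_scalar(nll1, bounds=(1e-3, 20.0), method="bounded").x
# step 2: maximize the module-2 log-likelihood with theta1_hat plugged in
theta2_hat = optimize.minimize_scalar(lambda t2: nll2(t2, theta1_hat),
                                      bounds=(-5.0, 5.0), method="bounded").x
print("two-step MLEs:", theta1_hat, theta2_hat)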
  • 53. Asymptotics Frazier & Nott (2022). Cutting feedback and modularized analyses in generalized Bayesian inference. Focus on the asymptotic behaviour of p2(θ2|θ1, Y2), assuming θ1 is fixed in a neighborhood of the MLE limit θ⋆1. Pierre E. Jacob Models made of modules 27
  • 54. Asymptotics Frazier & Nott (2022). Cutting feedback and modularized analyses in generalized Bayesian inference. Focus on the asymptotic behaviour of p2(θ2|θ1, Y2), assuming θ1 is fixed in a neighborhood of the MLE limit θ⋆1. The asymptotic conditional distribution p2(θ2|θ1, Y2) is Normal with both mean and variance depending explicitly on θ1. Allows a finer understanding of the impact of the uncertainty of θ1 onto that of θ2. Pierre E. Jacob Models made of modules 27
  • 55. Supporters say aye, opponents say no Setup: model 2 depends on an input that is itself estimated using model 1. Cut distribution: (+) Can mitigate effect of misspecification. (+) Facilitates interoperability across teams. (+) Can resolve computational intractability of joint model. (+) Not completely unprincipled. Pierre E. Jacob Models made of modules 28
  • 56. Supporters say aye, opponents say no Setup: model 2 depends on an input that is itself estimated using model 1. Cut distribution: (+) Can mitigate effect of misspecification. (+) Facilitates interoperability across teams. (+) Can resolve computational intractability of joint model. (+) Not completely unprincipled. (−) Can lead to sub-optimal estimation/prediction accuracy. (−) Is no replacement for constructive model criticism. (−) Associated computations present their own challenges. Pierre E. Jacob Models made of modules 28
  • 57. Asking the data whether to cut Jacob, Murray, Holmes & Robert (2017). Better together? Statistical learning in models made of modules. We can try to be principled about whether to cut or not. Natural route: introduce measures of predictive performance that can be evaluated on test data. But which data: Y1? Y2? Pierre E. Jacob Models made of modules 29
  • 58. Asking the data whether to cut Jacob, Murray, Holmes & Robert (2017). Better together? Statistical learning in models made of modules. We can try to be principled about whether to cut or not. Natural route: introduce measures of predictive performance that can be evaluated on test data. But which data: Y1? Y2? Postulate: parameters are meaningful in the module that first defines them. Thus distributions of parameters should be compared on predictions in the module that defines them. Pierre E. Jacob Models made of modules 29
  • 59. Asking the data whether to cut In the first module, θ1 is defined in its relation to Y1. We propose to assess candidate distributions for θ1 based on predictive performance for Y1. Pierre E. Jacob Models made of modules 30
  • 60. Asking the data whether to cut In the first module, θ1 is defined in its relation to Y1. We propose to assess candidate distributions for θ1 based on predictive performance for Y1. Two candidates: p1(θ1|Y1) and π(θ1|Y1, Y2). Using the prequential approach and the logarithmic scoring rule, we compare p1(Y1) and π(Y1|Y2). Pierre E. Jacob Models made of modules 30
  • 61. Asking the data whether to cut In the first module, θ1 is defined in its relation to Y1. We propose to assess candidate distributions for θ1 based on predictive performance for Y1. Two candidates: p1(θ1|Y1) and π(θ1|Y1, Y2). Using the prequential approach and the logarithmic scoring rule, we compare p1(Y1) and π(Y1|Y2). If p1(Y1) > π(Y1|Y2), we support the use of distributions on (θ1, θ2) that admit p1(θ1|Y1) as first marginal, e.g. cut. Pierre E. Jacob Models made of modules 30
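A sketch of this comparison by direct numerical integration on grids in Python (a non-prequential version of the log-score comparison; model, priors and data are hypothetical): compute log p1(Y1) and log π(Y1|Y2) = log p(Y1, Y2) − log p(Y2), and favour cut-type distributions for θ1 whenever the former is larger.

import numpy as np
from scipy import stats

t1 = np.linspace(0.001, 0.999, 400)            # grid for theta1
t2 = np.linspace(-8.0, 8.0, 800)               # grid for theta2
Y1, N1 = 12, 40                                # hypothetical module-1 data: Y1 ~ Binomial(N1, theta1)
Y2 = np.array([5.1, 4.4, 6.0])                 # hypothetical module-2 data: Y2_i ~ Normal(theta1 * theta2, 1)

prior1 = stats.beta.pdf(t1, 1.0, 1.0)          # Beta(1, 1) prior on theta1
lik1 = stats.binom.pmf(Y1, N1, t1)
prior2 = stats.norm.pdf(t2, 0.0, 2.0)          # Normal(0, 2^2) prior on theta2
lik2 = np.prod(stats.norm.pdf(Y2[None, None, :],
                              t1[:, None, None] * t2[None, :, None], 1.0), axis=2)

def integrate2d(f):                            # trapezoidal rule over (theta1, theta2)
    return np.trapz(np.trapz(f, t2, axis=1), t1)

log_p1_Y1 = np.log(np.trapz(prior1 * lik1, t1))                                        # log p1(Y1)
log_p_Y1Y2 = np.log(integrate2d(prior1[:, None] * lik1[:, None] * prior2[None, :] * lik2))
log_p_Y2 = np.log(integrate2d(prior1[:, None] * prior2[None, :] * lik2))
log_pi_Y1_given_Y2 = log_p_Y1Y2 - log_p_Y2                                             # log pi(Y1 | Y2)

print("log p1(Y1) =", log_p1_Y1, " log pi(Y1|Y2) =", log_pi_Y1_given_Y2)
print("cut supported for theta1" if log_p1_Y1 > log_pi_Y1_given_Y2 else "joint model supported for theta1")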
  • 62. Outline 1 Models made of modules and issues with joint modeling 2 Cutting feedback: cut posterior distributions 3 Computation involved when cutting feedback Pierre E. Jacob Models made of modules 30
  • 63. Confusion about cut distributions From Gelman (2020) blog post entitled How to “cut” using Stan, if you must. Question (rephrased for brevity): Have cut posteriors been implemented in Stan? Pierre E. Jacob Models made of modules 31
  • 64. Confusion about cut distributions From Gelman (2020) blog post entitled How to “cut” using Stan, if you must. Question (rephrased for brevity): Have cut posteriors been implemented in Stan? Reply: This topic has come up before, and I don’t think this “cut” is a good idea. If you want to implement it, [. . . ] you’d first fit model 1 and get posterior simulations, then approx those simulations by a mixture of multivariate normal or t distributions, then use that as a prior for model 2. [. . . ] Pierre E. Jacob Models made of modules 31
  • 65. Confusion about cut distributions From Gelman (2020) blog post entitled How to “cut” using Stan, if you must. Question (rephrased for brevity): Have cut posteriors been implemented in Stan? Reply: This topic has come up before, and I don’t think this “cut” is a good idea. If you want to implement it, [. . . ] you’d first fit model 1 and get posterior simulations, then approx those simulations by a mixture of multivariate normal or t distributions, then use that as a prior for model 2. [. . . ] This would in fact amount to a two-step approximation of the standard posterior distribution, not the cut distribution! Pierre E. Jacob Models made of modules 31
  • 66. Modular approximations of the standard posterior Lunn, Barrett, Sweeting & Thompson (2013). Fully Bayesian hierarchical modelling in two stages, with application to meta-analysis. Goudie, Presanis, Lunn, De Angelis & Wernisch (2016). Model surgery: joining and splitting models with Markov melding. Manderson & Goudie (2021). A numerically stable algorithm for integrating Bayesian models using Markov melding. Pierre E. Jacob Models made of modules 32
  • 67. Modular approximations of the standard posterior Lunn, Barrett, Sweeting & Thompson (2013). Fully Bayesian hierarchical modelling in two stages, with application to meta-analysis. Goudie, Presanis, Lunn, De Angelis & Wernisch (2016). Model surgery: joining and splitting models with Markov melding. Manderson & Goudie (2021). A numerically stable algorithm for integrating Bayesian models using Markov melding. Leonelli, Barons & Smith (2018). A conditional independence framework for coherent modularized inference. Huge interest in approximating the supraBayesian with a de-centralized strategy, but this is not about cutting feedback. Pierre E. Jacob Models made of modules 32
  • 68. Sampling from the cut distribution The density of the cut distribution is πcut (θ1, θ2; Y1, Y2) ∝ π(θ1, θ2|Y1, Y2) / p2(Y2|θ1). Pierre E. Jacob Models made of modules 33
  • 69. Sampling from the cut distribution The density of the cut distribution is πcut (θ1, θ2; Y1, Y2) ∝ π(θ1, θ2|Y1, Y2) / p2(Y2|θ1). The term p2(Y2|θ1) is typically intractable, p2(Y2|θ1) = ∫ p2(Y2|θ1, θ2)p2(θ2|θ1)dθ2. MCMC approach for doubly intractable targets: Liu & Goudie (2021). Stochastic approximation cut algorithm for inference in modularized Bayesian models. Pierre E. Jacob Models made of modules 33
  • 70. Sampling from the cut distribution OpenBUGS’ approach via the cut function: alternate between sampling θ1′ from K1(θ1, dθ1′) targeting p1(dθ1|Y1), sampling θ2′ from K2_{θ1′}(θ2, dθ2′) targeting p2(dθ2|θ1′, Y2). This does not leave the cut distribution invariant. Iterating the kernel K2_{θ1′} enough times mitigates the issue. Plummer (2014). Cuts in Bayesian graphical models. Pierre E. Jacob Models made of modules 34
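For concreteness, a Python sketch of this alternating scheme on a hypothetical toy model: the θ1-update ignores module 2 entirely, and the resulting chain does not leave the cut distribution invariant in general; increasing the number of inner K2 updates per outer step (inner_steps below) only mitigates the discrepancy.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# hypothetical modules: Y1 ~ Binomial(N1, theta1) with a flat Beta(1, 1) prior;
# Y2_i ~ Normal(theta1 * theta2, 1) with a Normal(0, 2^2) prior on theta2
N1, Y1 = 40, 12
Y2 = np.array([1.1, 0.4, 0.9])

def log_p1(theta1):
    return stats.binom.logpmf(Y1, N1, theta1) if 0.0 < theta1 < 1.0 else -np.inf

def log_p2(theta2, theta1):
    return (np.sum(stats.norm.logpdf(Y2, theta1 * theta2, 1.0))
            + stats.norm.logpdf(theta2, 0.0, 2.0))

def mh_step(x, logtarget, step):
    prop = x + step * rng.normal()
    return prop if np.log(rng.uniform()) < logtarget(prop) - logtarget(x) else x

theta1, theta2, inner_steps = 0.3, 0.0, 10
samples = []
for _ in range(5000):
    # K1: update theta1 targeting p1(theta1 | Y1); module 2 does not feed back
    theta1 = mh_step(theta1, log_p1, 0.1)
    # K2: several inner updates of theta2 targeting p2(theta2 | theta1, Y2)
    for _ in range(inner_steps):
        theta2 = mh_step(theta2, lambda t2: log_p2(t2, theta1), 0.5)
    samples.append((theta1, theta2))
print("approximate means (theta1, theta2):", np.mean(np.array(samples)[1000:], axis=0))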
  • 71. Sampling from the cut distribution In a perfect sampling world, we could sample θ1 from p1(θ1|Y1), θ2 given θ1 from p2(θ2|θ1, Y2), then (θ1, θ2) would be exactly following the cut distribution. For many models, exact sampling is not feasible. Pierre E. Jacob Models made of modules 35
  • 72. Sampling from the cut distribution In an MCMC world, we can sample θ1 approximately from p1(θ1|Y1) using MCMC, θ2 given θ1 approximately from p2(θ2|θ1, Y2) using MCMC, then resulting samples approximate the cut distribution, in the limit of the numbers of iterations in both stages. (−) Can involve tuning and convergence diagnostics for many MCMC runs at the 2nd stage, each with a different target. Pierre E. Jacob Models made of modules 36
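A sketch of this two-stage strategy in Python, on the same hypothetical toy model as in the previous sketch: one MCMC run for p1(θ1|Y1), then, for each retained θ1 draw, a separate MCMC run targeting p2(θ2|θ1, Y2); each second-stage chain needs its own burn-in and tuning, which is the cost flagged above.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# hypothetical modules, as before
N1, Y1 = 40, 12
Y2 = np.array([1.1, 0.4, 0.9])
log_p1 = lambda t1: stats.binom.logpmf(Y1, N1, t1) if 0.0 < t1 < 1.0 else -np.inf
log_p2 = lambda t2, t1: (np.sum(stats.norm.logpdf(Y2, t1 * t2, 1.0))
                         + stats.norm.logpdf(t2, 0.0, 2.0))

def mh_chain(x0, logtarget, step, n_iter):
    # plain random-walk Metropolis-Hastings, returning the whole trajectory
    x, out = x0, []
    for _ in range(n_iter):
        prop = x + step * rng.normal()
        if np.log(rng.uniform()) < logtarget(prop) - logtarget(x):
            x = prop
        out.append(x)
    return np.array(out)

# stage 1: one long chain targeting p1(theta1 | Y1), burned in and thinned
theta1_draws = mh_chain(0.3, log_p1, 0.1, 10000)[1000::100]

# stage 2: for each retained theta1, a separate chain targeting p2(theta2 | theta1, Y2)
cut_samples = [(t1, mh_chain(0.0, lambda t2: log_p2(t2, t1), 0.5, 1000)[-1])
               for t1 in theta1_draws]
print("approximate cut means:", np.mean(np.array(cut_samples), axis=0))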
  • 73. Unbiased estimation of the cut distribution Jacob, O’Leary & Atchadé (2020). Unbiased Markov chain Monte Carlo with couplings. By coupling pairs of π-invariant chains in a particular way, [figure: traces of two coupled chains over iterations 0–200, meeting after a finite time] Pierre E. Jacob Models made of modules 37
  • 76. Unbiased estimation of the cut distribution Jacob, O’Leary & Atchadé (2020). Unbiased Markov chain Monte Carlo with couplings. By coupling pairs of π-invariant chains in a particular way, [figure: traces of two coupled chains over iterations 0–200, meeting after a finite time] we can construct a signed measure π̂(dθ) = Σ_{n=1}^{N} ωn δθn(dθ) with E[π̂(h)] = π(h) for a class of test functions h. Pierre E. Jacob Models made of modules 37
  • 77. Unbiased estimation of the cut distribution In an unbiased MCMC world, we can approximate without bias p1(dθ1|Y1) with a random measure π̂1(dθ1) = Σ_{n=1}^{N1} ω1,n δθ1,n(dθ1), obtained from coupled p1(dθ1|Y1)-invariant chains. Pierre E. Jacob Models made of modules 38
  • 78. Unbiased estimation of the cut distribution In an unbiased MCMC world, we can approximate without bias p1(dθ1|Y1) with a random measure π̂1(dθ1) = Σ_{n=1}^{N1} ω1,n δθ1,n(dθ1), obtained from coupled p1(dθ1|Y1)-invariant chains. p2(dθ2|θ1, Y2) for any θ1, with a random measure π̂2(dθ2|θ1) = Σ_{n=1}^{N2} ω2,n δθ2,n(dθ2), obtained from coupled p2(dθ2|θ1, Y2)-invariant chains. Pierre E. Jacob Models made of modules 38
  • 79. Unbiased estimation of the cut distribution In an unbiased MCMC world, we can approximate without bias p1(dθ1|Y1) with a random measure π̂1(dθ1) = Σ_{n=1}^{N1} ω1,n δθ1,n(dθ1), obtained from coupled p1(dθ1|Y1)-invariant chains. p2(dθ2|θ1, Y2) for any θ1, with a random measure π̂2(dθ2|θ1) = Σ_{n=1}^{N2} ω2,n δθ2,n(dθ2), obtained from coupled p2(dθ2|θ1, Y2)-invariant chains. Using the tower property, we can estimate without bias expectations with respect to πcut(θ1, θ2; Y1, Y2). Pierre E. Jacob Models made of modules 38
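The building block behind these random measures is a pair of coupled chains that meet exactly after a random number of steps. Below is a stylized one-dimensional Python sketch of unbiased MCMC for a single target (a hypothetical module-1 posterior), using a maximal coupling of random-walk proposals and the time-averaged estimator with bias correction; wrapping a second-stage estimator inside each first-stage atom, as the tower-property argument requires, is omitted for brevity.

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# hypothetical 1-D target: module-1 posterior for Y1 ~ Binomial(N1, theta1), flat prior on (0, 1)
N1, Y1 = 40, 12
def log_target(t1):
    return stats.binom.logpmf(Y1, N1, t1) if 0.0 < t1 < 1.0 else -np.inf

sigma = 0.1  # random-walk proposal standard deviation

def max_coupled_proposals(x, y):
    # maximal coupling of Normal(x, sigma^2) and Normal(y, sigma^2), rejection sampler
    xp = rng.normal(x, sigma)
    if rng.uniform() * stats.norm.pdf(xp, x, sigma) <= stats.norm.pdf(xp, y, sigma):
        return xp, xp                       # identical proposals with maximal probability
    while True:
        yp = rng.normal(y, sigma)
        if rng.uniform() * stats.norm.pdf(yp, y, sigma) > stats.norm.pdf(yp, x, sigma):
            return xp, yp

def coupled_mh_step(x, y):
    # one coupled Metropolis-Hastings step: joint proposal, common acceptance uniform
    xp, yp = max_coupled_proposals(x, y)
    logu = np.log(rng.uniform())
    x_new = xp if logu < log_target(xp) - log_target(x) else x
    y_new = yp if logu < log_target(yp) - log_target(y) else y
    return x_new, y_new

def unbiased_estimate(h, k=50, m=300):
    # lag-one coupled chains: X runs one step ahead of Y; tau = first t with X_t = Y_{t-1}
    X = [rng.uniform()]
    Y = [rng.uniform()]
    X.append(coupled_mh_step(X[0], X[0])[0])    # X_1, a plain MH step from X_0
    t, tau = 1, None
    while tau is None or len(X) <= m:
        x_new, y_new = coupled_mh_step(X[-1], Y[-1])
        X.append(x_new)
        Y.append(y_new)
        t += 1
        if tau is None and x_new == y_new:
            tau = t
    # time-averaged estimator plus bias correction (as in Jacob, O'Leary & Atchade, 2020)
    est = np.mean([h(X[l]) for l in range(k, m + 1)])
    for l in range(k + 1, tau):
        est += min(1.0, (l - k) / (m - k + 1.0)) * (h(X[l]) - h(Y[l - 1]))
    return est

estimates = [unbiased_estimate(h=lambda t1: t1) for _ in range(50)]
print("average of unbiased estimates of E[theta1 | Y1]:", np.mean(estimates))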
  • 80. Inexact approximations of the cut distribution Variational inference: Yu, Nott & Smith (2021). Variational inference for cutting feedback in misspecified models. Carmona & Nicholls (2022). Scalable Semi-Modular Inference with Variational Meta-Posteriors. Pierre E. Jacob Models made of modules 39
  • 81. Inexact approximations of the cut distribution Variational inference: Yu, Nott & Smith (2021). Variational inference for cutting feedback in misspecified models. Carmona & Nicholls (2022). Scalable Semi-Modular Inference with Variational Meta-Posteriors. Posterior bootstrap: Pompe & Jacob (2022). Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap. adapting techniques developed earlier by Newton & Raftery, Fong, Lyddon, Holmes & Walker. Pierre E. Jacob Models made of modules 39
  • 82. Inexact approximations of the cut distribution Variational inference: Yu, Nott & Smith (2021). Variational inference for cutting feedback in misspecified models. Carmona & Nicholls (2022). Scalable Semi-Modular Inference with Variational Meta-Posteriors. Posterior bootstrap: Pompe & Jacob (2022). Asymptotics of cut distributions and robust modular inference using Posterior Bootstrap. adapting techniques developed earlier by Newton & Raftery, Fong, Lyddon, Holmes & Walker. Modular approximate Bayesian computation Chakraborty, Nott, Drovandi, Frazier & Sisson (2022). Modularized Bayesian analyses and cutting feedback in likelihood-free inference. Pierre E. Jacob Models made of modules 39
  • 83. Discussion A very appealing aspect of Bayesian analysis is its unified treatment of many statistical questions. Modular approaches appear to depart from the main framework, and thus bring discomfort. Pierre E. Jacob Models made of modules 40
  • 84. Discussion A very appealing aspect of Bayesian analysis is its unified treatment of many statistical questions. Modular approaches appear to depart from the main framework, and thus bring discomfort. If the essence of Bayesian analysis is not the direct application of Bayes’ theorem, but probability distributions for the quantities of interest, constructed from data and prior information, a framework for statistical inference supported by principles & decision theory, Pierre E. Jacob Models made of modules 40
  • 85. Discussion A very appealing aspect of Bayesian analysis is its unified treatment of many statistical questions. Modular approaches appear to depart from the main framework, and thus bring discomfort. If the essence of Bayesian analysis is not the direct application of Bayes’ theorem, but probability distributions for the quantities of interest, constructed from data and prior information, a framework for statistical inference supported by principles & decision theory, a good excuse to play with Markov chains, Pierre E. Jacob Models made of modules 40
  • 86. Discussion A very appealing aspect of Bayesian analysis is its unified treatment of many statistical questions. Modular approaches appear to depart from the main framework, and thus bring discomfort. If the essence of Bayesian analysis is not the direct application of Bayes’ theorem, but probability distributions for the quantities of interest, constructed from data and prior information, a framework for statistical inference supported by principles & decision theory, a good excuse to play with Markov chains, then cut posteriors might be essentially Bayesian. Thank you! Pierre E. Jacob Models made of modules 40