Complexities in Bayesian inverse problems:
Models and Distributions
Simon Cotter
University of Manchester
17th May 2019
Simon Cotter Bayesian Complexities 0 / 39
The Graveyard Shift
Last day
After lunch
My promises to you:
2 talks in 1 to keep you on your toes
(or if you go to sleep in the 1st you have a second chance!)
Almost no details!
As many pictures of gorillas as is scientifically justifiable
Simon Cotter Bayesian Complexities 1 / 39
Collaborators
Left: Catherine Powell (University of Manchester, UK), Center:
James Rynn (University of Manchester, UK), Right: Louise Wright
(National Physical Laboratory, UK)
Simon Cotter Bayesian Complexities 2 / 39
Collaborators
Left: Colin Cotter (Imperial College, UK), Center: Yannis
Kevrekidis (Johns Hopkins, US), Right: Paul Russell (formerly
University of Manchester, UK).
SLC is grateful to EPSRC for First Grant award EP/L023393/1
Simon Cotter Bayesian Complexities 3 / 39
Outline
1 Introduction
2 SGFEM Surrogate-Accelerated Inference
3 Transport Map-Accelerated Adaptive Importance Sampling
4 Conclusions
Simon Cotter Bayesian Complexities 3 / 39
Bayesian Inverse Problems
Find the unknown θ given n_z observations z, satisfying
z = G(θ) + η,    η ∼ N(0, Σ),
where
z ∈ R^{n_z} is a given vector of observations,
G : Θ → R^{n_z} is the observation operator,
θ ∈ Θ is the unknown parameter,
η ∈ R^{n_z} is a vector of observational noise.
Goal: Efficiently estimate the posterior density π(θ|z) for the
unknowns θ given the data z.
Simon Cotter Bayesian Complexities 4 / 39
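As a rough illustration (not from the slides), the Gaussian noise model above leads to the log-likelihood and unnormalised log-posterior sketched below; the function names, the generic forward map G and the prior callback are assumptions made for the example.

```python
import numpy as np

def log_likelihood(theta, z, G, Sigma):
    """Gaussian log-likelihood (up to an additive constant) for the model
    z = G(theta) + eta, eta ~ N(0, Sigma)."""
    r = z - G(theta)                              # data-model misfit
    return -0.5 * r @ np.linalg.solve(Sigma, r)   # -(1/2) ||z - G(theta)||^2_Sigma

def log_posterior(theta, z, G, Sigma, log_prior):
    """Unnormalised log-posterior: log L(z|theta) + log pi_0(theta)."""
    return log_likelihood(theta, z, G, Sigma) + log_prior(theta)
```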
Markov Chain Monte Carlo (MCMC) Methods
π(θ|z) ∝ L(z|θ) π_0(θ) ∝ exp( −(1/2) ‖z − G(θ)‖²_Σ ) π_0(θ).
Markov chain Monte Carlo estimates:
E_π[φ] = ∫_Θ φ(θ) π(θ|z) dθ ≈ (1/M) Σ_{i=1}^M φ(θ_i).
Extremely costly for some problem classes
Expensive likelihood evaluations (talk #1)
Poor mixing requiring high values of M (talk #2)
Simon Cotter Bayesian Complexities 5 / 39
Outline
1 Introduction
2 SGFEM Surrogate-Accelerated Inference
3 Transport Map-Accelerated Adaptive Importance Sampling
4 Conclusions
Simon Cotter Bayesian Complexities 5 / 39
Industrial Example
Simon Cotter Bayesian Complexities 6 / 39
Industrial Example
[Figure: measured temperature data from the experiment]
Simon Cotter Bayesian Complexities 7 / 39
Industrial Example
Possible unknowns:
λ — thermal conductivity,
I — laser intensity,
κ — heat transfer coefficient (affects boundary conditions),
σ — standard deviation of measurement noise.
. . .
Simon Cotter Bayesian Complexities 8 / 39
Forward Model: Heat Equation
Model the physical relationship between λ, I and the temperature
T of the material using the heat equation:
c_p ∂T/∂t − ∇·(λ ∇T) = Q(I).
Given a sample of λ and I, we can use numerical methods
(FEMs) to approximate T at the measurement times.
One evaluation of the FEM code takes ≈ 30 seconds, for a
“reasonable level of accuracy”.
Simon Cotter Bayesian Complexities 9 / 39
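The FEM solver itself is not shown in the talk; purely as an illustration of what one forward evaluation involves, the toy 1D explicit finite-difference march below plays the role of G(λ, I). The grid, time step, source term Q and all names are assumptions, not the NPL/Manchester code.

```python
import numpy as np

def heat_forward_1d(lam, intensity, c_p=1.0, length=1.0, nx=50, nt=800, dt=1e-5):
    """Toy explicit finite-difference solve of c_p dT/dt = lam d2T/dx2 + Q in 1D,
    standing in for the full FEM forward model of the talk. The source Q deposits
    heat at the front face, proportional to `intensity` (illustrative only)."""
    dx = length / (nx - 1)
    T = np.zeros(nx)                 # initial temperature field
    Q = np.zeros(nx)
    Q[0] = intensity                 # heat input at x = 0
    for _ in range(nt):
        lap = np.zeros(nx)
        lap[1:-1] = (T[2:] - 2.0 * T[1:-1] + T[:-2]) / dx**2   # interior Laplacian
        T = T + dt * (lam * lap + Q) / c_p   # explicit Euler step (dt must be small for stability)
    return T                         # temperature field at the final (measurement) time
```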
Metropolis-Hastings Algorithm (FEM)
Algorithm 1: Metropolis-Hastings Algorithm
set initial state X(0) = θ0
for m = 1, 2, . . . , M do
draw proposal
evaluate likelihood by FEM approximation of G (expensive!)
compute acceptance probability α
accept proposal with probability α
end for
output chain X = (θ0, θ1, . . . , θM)
30 seconds per (time-dependent) PDE solve =⇒ 10 million samples
takes 3 × 10^8 seconds ≈ 9.5 years (single CPU)
Simon Cotter Bayesian Complexities 10 / 39
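A bare-bones random-walk Metropolis-Hastings loop in the spirit of Algorithm 1 is sketched below (an illustrative sketch, not the talk's implementation). It makes the cost structure explicit: every iteration calls `log_post` once, and in the FEM setting that single call is the ≈30 s PDE solve, giving 10^7 × 30 s = 3 × 10^8 s ≈ 9.5 years. The step size and proposal choice are assumptions.

```python
import numpy as np

def metropolis_hastings(log_post, theta0, n_samples, step=0.1, rng=None):
    """Random-walk Metropolis-Hastings. `log_post` is the unnormalised
    log-posterior; with a FEM forward model each call is an expensive PDE solve,
    with a surrogate it is a cheap polynomial evaluation."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    lp = log_post(theta)
    chain = [theta.copy()]
    for _ in range(n_samples):
        proposal = theta + step * rng.standard_normal(theta.shape)  # symmetric proposal
        lp_prop = log_post(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:                    # accept with probability alpha
            theta, lp = proposal, lp_prop
        chain.append(theta.copy())
    return np.array(chain)
```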
Surrogate Model
The temperature T is a function of θ = (λ, I).
c_p ∂T(θ)/∂t − ∇·(λ ∇T(θ)) = Q(I).
Instead of solving for individual samples of λ and I:
Offline: build an approximation of the form
Tapprox(θ) = Σ_{i=1}^{nk} Ti Ψi(θ),    (Surrogate)
where the Ψi are orthogonal (Legendre) polynomials in θ.
This can be cheaply evaluated in MCMC routines (Online).
Simon Cotter Bayesian Complexities 11 / 39
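A sketch of the online step only: evaluating a precomputed expansion Σ Ti Ψi(θ) built from tensor products of Legendre polynomials. The offline SGFEM solve that produces the coefficients is not reproduced; `coeffs`, `degrees`, the scalar coefficients and the mapping of (λ, I) to [−1, 1] are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

def eval_surrogate(theta, coeffs, degrees, bounds):
    """Evaluate Tapprox(theta) = sum_i T_i Psi_i(theta), where each Psi_i is a
    product of 1D Legendre polynomials. `coeffs[i]` is the precomputed (offline)
    coefficient T_i (a scalar here for simplicity), `degrees[i]` gives the
    per-parameter degrees of Psi_i, and `bounds` maps each physical parameter
    to the reference interval [-1, 1]."""
    x = np.array([2.0 * (t - lo) / (hi - lo) - 1.0 for t, (lo, hi) in zip(theta, bounds)])
    total = 0.0
    for T_i, degs in zip(coeffs, degrees):
        psi = 1.0
        for xj, d in zip(x, degs):
            unit = np.zeros(d + 1)
            unit[d] = 1.0                       # select the degree-d Legendre polynomial
            psi *= legendre.legval(xj, unit)
        total += T_i * psi
    return total
```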
Metropolis-Hastings Algorithm (SGFEM)
Algorithm 2: Metropolis-Hastings Algorithm with SGFEM Surrogate
compute SGFEM solution (cost dependent on FEM
discretisation parameters + maximum polynomial degree k)
set initial state X(0) = θ0
for m = 1, 2, . . . , M do
draw proposal
evaluate likelihood by evaluating SGFEM approximation of
G (cheap!)
compute acceptance probability α
accept proposal with probability α
end for
output chain X = (θ0, θ1, . . . , θM)
Simon Cotter Bayesian Complexities 12 / 39
Results: Copper Sample
Offline: Compute surrogate solution: ≈ 16 mins
(solved 600K equations × nt = 800 time steps)
Online: Generate M = 10^7 samples of θ from the approximate
posterior πapprox(θ|d) using a standard MCMC method: ≈ 26 mins
Computational Costs:
            Offline                   Online
Standard    -                         M × (nt × CPDE)
Surrogate   nt × O(nk × CPDE)         M × Ceval
Simon Cotter Bayesian Complexities 13 / 39
Posterior Density, π(θ|z)
[Figure: joint posterior density and marginal posterior densities of the two unknowns]
Simon Cotter Bayesian Complexities 14 / 39
Posterior Convergence in k (Polynomial Degree)
[Figure: marginal posterior densities for increasing polynomial degree k]
Simon Cotter Bayesian Complexities 15 / 39
V. HOANG, C. SCHWAB, AND A. STUART, Complexity Analysis
of Accelerated MCMC Methods for Bayesian Inversion,
Inverse Problems, 29 (2013), p. 085010.
Y. MARZOUK, H. NAJM, AND L. RAHN, Stochastic Spectral
Methods for Efficient Bayesian Solution of Inverse Problems,
Journal of Computational Physics, 224 (2007), pp. 560–586.
F. NOBILE AND R. TEMPONE, Analysis and Implementation
Issues for the Numerical Approximation of Parabolic
Equations with Random Coefficients, International Journal for
Numerical Methods in Engineering, 80 (2009), pp. 979–1006.
J. A. RYNN, S. L. COTTER, C. E. POWELL, AND L. WRIGHT,
Surrogate Accelerated Bayesian Inversion for the
Determination of the Thermal Diffusivity of a Material,
Metrologia, 56 (2019), p. 015018.
Simon Cotter Bayesian Complexities 16 / 39
Outline
1 Introduction
2 SGFEM Surrogate-Accelerated Inference
3 Transport Map-Accelerated Adaptive Importance Sampling
4 Conclusions
Simon Cotter Bayesian Complexities 17 / 39
Motivation
Simon Cotter Bayesian Complexities 18 / 39
Motivation
Simon Cotter Bayesian Complexities 19 / 39
Motivation
Simon Cotter Bayesian Complexities 20 / 39
Multiscale Systems
[Figure: time series of the numbers of molecules X1, X2, X3 and the slow variable (X1+X2)/2]
∅ --k1--> X1,   X1 --k2 x1--> X2,   X2 --k3 x2--> X1,   X2 --k4 x2--> X3,   X3 --k5 x3--> ∅
Simon Cotter Bayesian Complexities 21 / 39
Constrained approximation: Simple Example
[Figure: pairwise marginal densities of the rate constants k1, k2, k3, k4]
Figure: CMA approximation of the posterior arising from observations of
the slow variable S = X1 + X2, concentrated around a manifold
k1(k2 + k3 + k4)/(k2 k4) = C, i.e. more challenging than this plot suggests. (Any
visualisation suggestions?)
Simon Cotter Bayesian Complexities 22 / 39
Importance Sampling
Sample x_i ∼ ν
Compute weights w_i = π(x_i)/ν(x_i)
Monte Carlo estimates:
E_π(f) ≈ (1/Σ_j w_j) Σ_{i=1}^N f(x_i) w_i
The variance of the weights is an indicator of efficiency
Small when π ≈ ν
Simon Cotter Bayesian Complexities 23 / 39
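A minimal self-normalised importance sampling estimator for the slide above (an illustrative sketch; the densities, sampler and function names are placeholders). The effective sample size line makes the "variance of the weights" diagnostic concrete: when π ≈ ν the weights are nearly constant and ess ≈ N, whereas a poorly matched proposal (the "disadvantages" figures that follow) gives weights spanning many orders of magnitude and a collapsed ess.

```python
import numpy as np

def importance_sampling(f, target_pdf, proposal_pdf, proposal_sampler, N, rng=None):
    """Self-normalised importance sampling: x_i ~ nu, w_i = pi(x_i)/nu(x_i),
    E_pi[f] estimated by sum_i f(x_i) w_i / sum_j w_j."""
    rng = np.random.default_rng() if rng is None else rng
    xs = np.array([proposal_sampler(rng) for _ in range(N)])
    ws = np.array([target_pdf(x) / proposal_pdf(x) for x in xs])    # importance weights
    estimate = np.sum(ws * np.array([f(x) for x in xs])) / np.sum(ws)
    ess = np.sum(ws) ** 2 / np.sum(ws ** 2)   # effective sample size; collapses when weights degenerate
    return estimate, ess
```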
Advantages of Importance Sampling: 10^2 samples
[Figure: importance sampling density estimate with 10^2 samples]
Simon Cotter Bayesian Complexities 24 / 39
Advantages of Importance Sampling: 10^3 samples
[Figure: importance sampling density estimate with 10^3 samples]
Simon Cotter Bayesian Complexities 24 / 39
Advantages of Importance Sampling: 10^4 samples
[Figure: importance sampling density estimate with 10^4 samples]
Simon Cotter Bayesian Complexities 24 / 39
Advantages of Importance Sampling: 10^5 samples
[Figure: importance sampling density estimate with 10^5 samples]
Simon Cotter Bayesian Complexities 24 / 39
Advantages of Importance Sampling: 10^6 samples
[Figure: importance sampling density estimate with 10^6 samples]
Simon Cotter Bayesian Complexities 24 / 39
Advantages of Importance Sampling: Weights
[Figure: importance weight function π/ν]
Simon Cotter Bayesian Complexities 25 / 39
Disadvantages of Importance Sampling: 10^2 samples
[Figure: importance sampling density estimate with 10^2 samples]
Simon Cotter Bayesian Complexities 26 / 39
Disadvantages of Importance Sampling: 10^3 samples
[Figure: importance sampling density estimate with 10^3 samples]
Simon Cotter Bayesian Complexities 26 / 39
Disadvantages of Importance Sampling: 10^4 samples
[Figure: importance sampling density estimate with 10^4 samples]
Simon Cotter Bayesian Complexities 26 / 39
Disadvantages of Importance Sampling: 10^5 samples
[Figure: importance sampling density estimate with 10^5 samples]
Simon Cotter Bayesian Complexities 26 / 39
Disadvantages of Importance Sampling: 10^6 samples
[Figure: importance sampling density estimate with 10^6 samples]
Simon Cotter Bayesian Complexities 26 / 39
Disadvantages of Importance Sampling: Weights
[Figure: importance weight function π/ν spanning hundreds of orders of magnitude]
Simon Cotter Bayesian Complexities 27 / 39
Ensemble Transport Adaptive Importance Sampling
Proposal distribution in the kth iteration is informed by M ensemble
members with states θ_i^{(k)}:
χ^{(k)} = (1/M) Σ_{i=1}^M q(·; θ_i^{(k)}, β)
q(·; ·, β) a transition kernel, e.g. Gaussian, MALA proposal,
etc, with scaling parameter β
Resampling step; ensemble transform method
For large M, greedy approximation used, “multinomial
approximation”
C. Cotter, SLC, P. Russell, “Ensemble transport adaptive importance
sampling”, SIAM JUQ 2019.
S. Reich, “A non-parametric ensemble transform method for Bayesian
inference”, SISC 2013.
Simon Cotter Bayesian Complexities 28 / 39
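Below is a simplified sketch of one such adaptive importance sampling iteration, with an isotropic Gaussian kernel q(·; θ_i, β) = N(θ_i, β²I) and plain multinomial resampling in place of the ensemble transform step. It is a stand-in under those assumptions, not the implementation from the papers cited above.

```python
import numpy as np

def etais_iteration(ensemble, log_target, beta, rng=None):
    """One adaptive importance sampling step with the mixture proposal
    chi = (1/M) sum_i N(theta_i, beta^2 I). Resampling is multinomial here,
    a simplification of the ensemble transform resampling used in ETAIS."""
    rng = np.random.default_rng() if rng is None else rng
    M, d = ensemble.shape
    # draw each new particle from the mixture: pick a centre, then perturb it
    centres = ensemble[rng.integers(M, size=M)]
    proposals = centres + beta * rng.standard_normal((M, d))
    # evaluate the mixture proposal density chi at every proposal
    sq_dists = np.sum((proposals[:, None, :] - ensemble[None, :, :]) ** 2, axis=2)
    log_chi = (np.log(np.mean(np.exp(-0.5 * sq_dists / beta**2), axis=1) + 1e-300)
               - 0.5 * d * np.log(2.0 * np.pi * beta**2))
    # importance weights w_i proportional to pi(theta_i) / chi(theta_i), normalised
    log_w = np.array([log_target(p) for p in proposals]) - log_chi
    w = np.exp(log_w - np.max(log_w))
    w /= np.sum(w)
    new_ensemble = proposals[rng.choice(M, size=M, p=w)]   # resample for the next iteration
    return new_ensemble, proposals, w
```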
Ensemble Transport Adaptive Importance Sampling:
Prior and Posterior
[Figure: prior and posterior densities]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
Current State Xi
[Figure: current ensemble states]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
MALA Proposals
[Figure: MALA proposal kernels]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
Aggregate Proposal
[Figure: aggregate mixture proposal]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
Aggregate Proposal
[Figure: aggregate mixture proposal against the target]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
Aggregate Proposal and Weight Function
[Figure: target distribution, proposal distribution, and weight function π/χ]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
Samples from Proposal
[Figure: samples drawn from the aggregate proposal]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
Sample Weights
[Figure: importance weights of the proposed samples]
Simon Cotter Bayesian Complexities 29 / 39
Ensemble Transport Adaptive Importance Sampling:
Resampled States
[Figure: resampled ensemble states]
Simon Cotter Bayesian Complexities 29 / 39
ETAIS - pros and cons
PROS:
Possible big speed-ups with parallelisation
Well-informed proposals
Reduces variance of importance weights
Adaptive to global differences in scales of parameters
CONS:
Posterior concentrated on lower dimensional manifold:
Stability issues
Slow convergence
Requires large ensemble size (expensive)
Particle transition kernel q needs to “know” about the manifold
Simon Cotter Bayesian Complexities 30 / 39
Motivation
Simon Cotter Bayesian Complexities 31 / 39
Transport maps
Find a homeomorphism T : R^d → R^d which maps the target
measure μ (density π) to an easily explored reference measure μ_r (density π_r):
μ(T^{-1}(A)) = μ_r(A)
Simple proposal densities on π_r map to complex, informed
densities on π via T^{-1}:
v ∼ T^{-1}(q(·, u; β))
A low-dimensional approximate map ˜T can be computed from a
posterior sample.
M. Parno, Y. Marzouk, “Transport Map Accelerated Markov Chain Monte
Carlo”, SIAM/ASA Journal on Uncertainty Quantification, 2018.
Simon Cotter Bayesian Complexities 32 / 39
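As a toy illustration of the idea (not the fitted polynomial maps of Parno and Marzouk), the sketch below uses a lower-triangular map ˜T that straightens a Rosenbrock-style banana: a plain random-walk step taken in the reference space pulls back through ˜T⁻¹ to a proposal that follows the curved target. The map, its coefficient and the function names are assumptions made for the example.

```python
import numpy as np

A = 1.0   # curvature coefficient of the toy banana (illustrative)

def T(theta):
    """Toy lower-triangular map: r1 = theta1, r2 = theta2 - A*theta1^2."""
    return np.array([theta[0], theta[1] - A * theta[0] ** 2])

def T_inv(r):
    """Inverse map back to the target parameter space."""
    return np.array([r[0], r[1] + A * r[0] ** 2])

def mapped_rw_proposal(theta, beta, rng):
    """Random-walk step taken in reference space and pulled back through T_inv:
    v = T_inv(T(theta) + beta * xi). This triangular map has unit Jacobian, so no
    extra term appears in the acceptance ratio; general maps would need one."""
    r = T(theta) + beta * rng.standard_normal(2)
    return T_inv(r)
```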
Transport map simplification of Rosenbrock
[Figure: left, target parameter space, (a) original sample θ from the MH-RW algorithm;
right, reference parameter space, (b) push forward of θ onto the reference space]
Figure: The effect of the approximate transport map ˜T on a sample from
the Rosenbrock target density.
Simon Cotter Bayesian Complexities 33 / 39
Rosenbrock density
[Figure: relative L2 error against number of samples (10^2 to 10^6) for RWMH, TRWMH, ETAIS-RW and ETAIS-TRW]
Simon Cotter Bayesian Complexities 34 / 39
Constrained approximation: Simple Example
[Figure: pairwise marginal densities of the rate constants k1, k2, k3, k4]
Figure: CMA approximation of the posterior arising from observations of
the slow variable S = X1 + X2, concentrated around a manifold
k1(k2 + k3 + k4)/(k2 k4) = C, i.e. more challenging than this plot suggests. (Any
visualisation suggestions?)
Simon Cotter Bayesian Complexities 35 / 39
Multiscale stochastic reaction network example
[Figure: relative error against number of samples (10^2 to 10^6) for MH-logRW, MH-logTRW, ETAIS-logRW and ETAIS-logTRW, with an O(1/√N) reference line]
Figure: Sampling algorithms with a log preconditioner for ˜T.
Simon Cotter Bayesian Complexities 36 / 39
Simon Cotter Bayesian Complexities 37 / 39
References
SLC, I. Kevrekidis, P. Russell, 2019. “Transport map
accelerated adaptive importance sampling, and application to
inverse problems arising from multiscale stochastic reaction
networks.” arXiv preprint arXiv:1901.11269, submitted to
SIAM UQ.
M. Parno, Y. Marzouk, “Transport Map Accelerated Markov
Chain Monte Carlo”, SIAM/ASA Journal on Uncertainty
Quantification, 2018.
C. Cotter, SLC, P. Russell, “Ensemble transport adaptive
importance sampling”, SIAM/ASA Journal on Uncertainty
Quantification, 2019.
S. Reich, “A non-parametric ensemble transform method for
Bayesian inference”, SISC 2013.
SLC, “Constrained approximation of effective generators for
multiscale stochastic reaction networks and application to
conditioned path sampling”, Journal of Computational
Physics, 2016
Simon Cotter Bayesian Complexities 38 / 39
Outline
1 Introduction
2 SGFEM Surrogate-Accelerated Inference
3 Transport Map-Accelerated Adaptive Importance Sampling
4 Conclusions
Simon Cotter Bayesian Complexities 38 / 39
Conclusions
Multiple possible reasons for extortionate costs in Bayesian
inference
High cost of likelihood evaluations due to numerical
approximation of PDEs
Surrogates can sample efficiently from approximation of
posterior
Does πapprox(θ|d) converge to π(θ|d)? In what sense?
How is the error ‖π(θ|d) − πapprox(θ|d)‖ affected by the error
between the solution to the forward model and the chosen
surrogate, ‖T(θ) − Tapprox(θ)‖?
Large number of MCMC samples required due to complex
posterior structure
Many posteriors are concentrated on lower-dimensional
manifolds
Optimal transport maps can simplify target distributions
Ensemble adaptive importance sampling schemes can be
stabilised and accelerated for lower ensemble sizes
Simon Cotter Bayesian Complexities 39 / 39
More Related Content

What's hot

conference_poster_4
conference_poster_4conference_poster_4
conference_poster_4
Jiayi Jiang
 
2010 3-24 cryptography stamatiou
2010 3-24 cryptography stamatiou2010 3-24 cryptography stamatiou
2010 3-24 cryptography stamatiou
vafopoulos
 
Optics Fourier Transform Ii
Optics Fourier Transform IiOptics Fourier Transform Ii
Optics Fourier Transform Ii
diarmseven
 

What's hot (20)

Mark Girolami's Read Paper 2010
Mark Girolami's Read Paper 2010Mark Girolami's Read Paper 2010
Mark Girolami's Read Paper 2010
 
conference_poster_4
conference_poster_4conference_poster_4
conference_poster_4
 
2010 3-24 cryptography stamatiou
2010 3-24 cryptography stamatiou2010 3-24 cryptography stamatiou
2010 3-24 cryptography stamatiou
 
Fourier analysis techniques fourier series
Fourier analysis techniques   fourier seriesFourier analysis techniques   fourier series
Fourier analysis techniques fourier series
 
Fourier analysis
Fourier analysisFourier analysis
Fourier analysis
 
5. fourier properties
5. fourier properties5. fourier properties
5. fourier properties
 
Optics Fourier Transform Ii
Optics Fourier Transform IiOptics Fourier Transform Ii
Optics Fourier Transform Ii
 
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsMonte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problems
 
Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation
 
Fourier Transform
Fourier TransformFourier Transform
Fourier Transform
 
Fourier transform
Fourier transformFourier transform
Fourier transform
 
Monte Carlo Statistical Methods
Monte Carlo Statistical MethodsMonte Carlo Statistical Methods
Monte Carlo Statistical Methods
 
Lesson 5: Continuity
Lesson 5: ContinuityLesson 5: Continuity
Lesson 5: Continuity
 
Unbiased MCMC with couplings
Unbiased MCMC with couplingsUnbiased MCMC with couplings
Unbiased MCMC with couplings
 
Lesson 5: Continuity
Lesson 5: ContinuityLesson 5: Continuity
Lesson 5: Continuity
 
Monte Caro Simualtions, Sampling and Markov Chain Monte Carlo
Monte Caro Simualtions, Sampling and Markov Chain Monte CarloMonte Caro Simualtions, Sampling and Markov Chain Monte Carlo
Monte Caro Simualtions, Sampling and Markov Chain Monte Carlo
 
Fourier transform
Fourier transformFourier transform
Fourier transform
 
Fourier transforms
Fourier transforms Fourier transforms
Fourier transforms
 
An Intuitive Approach to Fourier Optics
An Intuitive Approach to Fourier OpticsAn Intuitive Approach to Fourier Optics
An Intuitive Approach to Fourier Optics
 
Fourier transforms
Fourier transformsFourier transforms
Fourier transforms
 

Similar to MUMS: Transition & SPUQ Workshop - Complexities in Bayesian Inverse Problems: Models and Distributions - Simon Cotter, May 17, 2019

Lattices, sphere packings, spherical codes
Lattices, sphere packings, spherical codesLattices, sphere packings, spherical codes
Lattices, sphere packings, spherical codes
wtyru1989
 
Black hole microstate geometries from string amplitudes
Black hole microstate geometries from string amplitudesBlack hole microstate geometries from string amplitudes
Black hole microstate geometries from string amplitudes
djt36
 
A Review of Probability and its Applications Shameel Farhan new applied [Comp...
A Review of Probability and its Applications Shameel Farhan new applied [Comp...A Review of Probability and its Applications Shameel Farhan new applied [Comp...
A Review of Probability and its Applications Shameel Farhan new applied [Comp...
shameel farhan
 
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Sean Meyn
 

Similar to MUMS: Transition & SPUQ Workshop - Complexities in Bayesian Inverse Problems: Models and Distributions - Simon Cotter, May 17, 2019 (20)

Lattices, sphere packings, spherical codes
Lattices, sphere packings, spherical codesLattices, sphere packings, spherical codes
Lattices, sphere packings, spherical codes
 
Richard Everitt's slides
Richard Everitt's slidesRichard Everitt's slides
Richard Everitt's slides
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
 
Monte carlo
Monte carloMonte carlo
Monte carlo
 
Fol
FolFol
Fol
 
Seminaire ihp
Seminaire ihpSeminaire ihp
Seminaire ihp
 
Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017
 
Current limitations of sequential inference in general hidden Markov models
Current limitations of sequential inference in general hidden Markov modelsCurrent limitations of sequential inference in general hidden Markov models
Current limitations of sequential inference in general hidden Markov models
 
ABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space modelsABC with data cloning for MLE in state space models
ABC with data cloning for MLE in state space models
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapter
 
Black hole microstate geometries from string amplitudes
Black hole microstate geometries from string amplitudesBlack hole microstate geometries from string amplitudes
Black hole microstate geometries from string amplitudes
 
Stratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computationStratified sampling and resampling for approximate Bayesian computation
Stratified sampling and resampling for approximate Bayesian computation
 
A Review of Probability and its Applications Shameel Farhan new applied [Comp...
A Review of Probability and its Applications Shameel Farhan new applied [Comp...A Review of Probability and its Applications Shameel Farhan new applied [Comp...
A Review of Probability and its Applications Shameel Farhan new applied [Comp...
 
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
Quasi-Stochastic Approximation: Algorithm Design Principles with Applications...
 
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
Quantum Minimax Theorem in Statistical Decision Theory (RIMS2014)
 
Complete l fuzzy metric spaces and common fixed point theorems
Complete l fuzzy metric spaces and  common fixed point theoremsComplete l fuzzy metric spaces and  common fixed point theorems
Complete l fuzzy metric spaces and common fixed point theorems
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
 
Função de mão única
Função de mão únicaFunção de mão única
Função de mão única
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixtures
 

More from The Statistical and Applied Mathematical Sciences Institute

Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
The Statistical and Applied Mathematical Sciences Institute
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
The Statistical and Applied Mathematical Sciences Institute
 

More from The Statistical and Applied Mathematical Sciences Institute (20)

Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
 
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
 
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
 
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
 
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
 
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
 
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
 
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
 
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
 
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
 
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
 
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
 
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 

Recently uploaded

IATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdffIATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdff
17thcssbs2
 
The basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptxThe basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

Open Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPointOpen Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPoint
 
philosophy and it's principles based on the life
philosophy and it's principles based on the lifephilosophy and it's principles based on the life
philosophy and it's principles based on the life
 
IATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdffIATP How-to Foreign Travel May 2024.pdff
IATP How-to Foreign Travel May 2024.pdff
 
Championnat de France de Tennis de table/
Championnat de France de Tennis de table/Championnat de France de Tennis de table/
Championnat de France de Tennis de table/
 
[GDSC YCCE] Build with AI Online Presentation
[GDSC YCCE] Build with AI Online Presentation[GDSC YCCE] Build with AI Online Presentation
[GDSC YCCE] Build with AI Online Presentation
 
Dementia (Alzheimer & vasular dementia).
Dementia (Alzheimer & vasular dementia).Dementia (Alzheimer & vasular dementia).
Dementia (Alzheimer & vasular dementia).
 
Pragya Champions Chalice 2024 Prelims & Finals Q/A set, General Quiz
Pragya Champions Chalice 2024 Prelims & Finals Q/A set, General QuizPragya Champions Chalice 2024 Prelims & Finals Q/A set, General Quiz
Pragya Champions Chalice 2024 Prelims & Finals Q/A set, General Quiz
 
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxslides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
 
The basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptxThe basics of sentences session 4pptx.pptx
The basics of sentences session 4pptx.pptx
 
factors influencing drug absorption-final-2.pptx
factors influencing drug absorption-final-2.pptxfactors influencing drug absorption-final-2.pptx
factors influencing drug absorption-final-2.pptx
 
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdfINU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
 
Basic Civil Engineering notes on Transportation Engineering, Modes of Transpo...
Basic Civil Engineering notes on Transportation Engineering, Modes of Transpo...Basic Civil Engineering notes on Transportation Engineering, Modes of Transpo...
Basic Civil Engineering notes on Transportation Engineering, Modes of Transpo...
 
size separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceuticssize separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceutics
 
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdfPost Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
 
How to the fix Attribute Error in odoo 17
How to the fix Attribute Error in odoo 17How to the fix Attribute Error in odoo 17
How to the fix Attribute Error in odoo 17
 
REPRODUCTIVE TOXICITY STUDIE OF MALE AND FEMALEpptx
REPRODUCTIVE TOXICITY  STUDIE OF MALE AND FEMALEpptxREPRODUCTIVE TOXICITY  STUDIE OF MALE AND FEMALEpptx
REPRODUCTIVE TOXICITY STUDIE OF MALE AND FEMALEpptx
 
2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx
 
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdfTelling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
Telling Your Story_ Simple Steps to Build Your Nonprofit's Brand Webinar.pdf
 
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 2 STEPS Using Odoo 17
 
Morse OER Some Benefits and Challenges.pptx
Morse OER Some Benefits and Challenges.pptxMorse OER Some Benefits and Challenges.pptx
Morse OER Some Benefits and Challenges.pptx
 

MUMS: Transition & SPUQ Workshop - Complexities in Bayesian Inverse Problems: Models and Distributions - Simon Cotter, May 17, 2019

  • 1. Complexities in Bayesian inverse problems: Models and Distributions Simon Cotter University of Manchester 17th May 2019 Simon Cotter Bayesian Complexities 0 / 39
  • 2. The Graveyard Shift Last day After lunch My promises to you: 2 talks in 1 to keep you on your toes (or if you go to sleep in the 1st you have a second chance!) Almost no details! As many pictures of gorillas as is scientifically justifiable Simon Cotter Bayesian Complexities 1 / 39
  • 9. Collaborators. Left: Catherine Powell (University of Manchester, UK); Center: James Rynn (University of Manchester, UK); Right: Louise Wright (National Physical Laboratories, UK).
  • 10. Collaborators. Left: Colin Cotter (Imperial College, UK); Center: Yannis Kevrekidis (Johns Hopkins, US); Right: Paul Russell (formerly University of Manchester, UK). SLC is grateful to EPSRC for First Grant award EP/L023393/1.
  • 11. Outline: 1 Introduction; 2 SGFEM Surrogate-Accelerated Inference; 3 Transport Map-Accelerated Adaptive Importance Sampling; 4 Conclusions.
  • 12. Bayesian Inverse Problems. Find the unknown θ given n_z observations z, satisfying
    z = G(θ) + η,  η ∼ N(0, Σ),
    where z ∈ R^{n_z} is a given vector of observations, G : Θ → R^{n_z} is the observation operator, θ ∈ Θ is the unknown parameter, and η ∈ R^{n_z} is a vector of observational noise.
    Goal: efficiently estimate the posterior density π(θ|z) for the unknowns θ given the data z.
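To make this setup concrete, the minimal sketch below generates synthetic data of the stated form; the observation operator G, the true parameter value and the noise level are invented purely for illustration and are not the model considered in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy observation operator G: R^2 -> R^{n_z} (purely illustrative stand-in).
def G(theta, t=np.linspace(0.0, 1.0, 10)):
    lam, intensity = theta
    return intensity * (1.0 - np.exp(-lam * t))

theta_true = np.array([2.0, 1.5])      # "unknown" parameter used to make the data
Sigma = 0.05**2 * np.eye(10)           # observational noise covariance
eta = rng.multivariate_normal(np.zeros(10), Sigma)
z = G(theta_true) + eta                # data: z = G(theta) + eta, eta ~ N(0, Sigma)
```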
  • 18. Markov Chain Monte Carlo (MCMC) Methods.
    π(θ|z) ∝ L(z|θ) π0(θ) ∝ exp(−½ ‖z − G(θ)‖²_Σ) π0(θ).
    Markov chain Monte Carlo estimates
    E_π[φ] = ∫_Θ φ(θ) π(θ|z) dθ ≈ (1/M) Σ_{i=1}^{M} φ(θ_i).
    Extremely costly for some problem classes: expensive likelihood evaluations (talk #1); poor mixing requiring high values of M (talk #2).
  • 23. Outline: 1 Introduction; 2 SGFEM Surrogate-Accelerated Inference; 3 Transport Map-Accelerated Adaptive Importance Sampling; 4 Conclusions.
  • 24. Industrial Example. [figure-only slide]
  • 25. Industrial Example. [figure-only slide; axis values not recoverable]
  • 26. Industrial Example. Possible unknowns: λ (thermal conductivity), I (laser intensity), κ (heat transfer coefficient, affects boundary conditions), σ (standard deviation of measurement noise), . . .
  • 27. Forward Model: Heat Equation. Model the physical relationship between λ, I and the temperature T of the material using the heat equation:
    c_p ∂T/∂t − ∇·(λ ∇T) = Q(I).
    Given a sample of λ and I, we can use numerical methods (FEMs) to approximate T at the measurement times. One evaluation of the FEM code takes ≈ 30 seconds, for a “reasonable level of accuracy”.
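For a rough sense of what one forward solve involves, here is a minimal 1-D backward-Euler finite-difference stand-in for the heat equation above; the actual forward model in the talk is a time-dependent FEM code, so the geometry, boundary conditions, material constants and source term below are all assumed for illustration.

```python
import numpy as np

def forward_temperature(lam, intensity, nx=50, nt=200, length=0.01, t_end=1.0, cp=3.45e6):
    """Minimal 1-D backward-Euler stand-in for c_p dT/dt - d/dx(lam dT/dx) = Q(I),
    with insulated ends. Returns the temperature history at the far boundary."""
    dx, dt = length / (nx - 1), t_end / nt
    r = lam * dt / (cp * dx**2)
    # Backward Euler: (I - r*L) T^{n+1} = T^n + dt*Q/cp, with L the discrete Laplacian.
    A = np.eye(nx) * (1 + 2 * r)
    A += np.diag(-r * np.ones(nx - 1), 1) + np.diag(-r * np.ones(nx - 1), -1)
    A[0, 1] = A[-1, -2] = -2 * r           # reflecting (insulated) boundaries
    Q = np.zeros(nx)
    Q[0] = intensity / dx                  # laser heating deposited in the first cell
    T, history = np.zeros(nx), []
    for _ in range(nt):
        T = np.linalg.solve(A, T + dt * Q / cp)
        history.append(T[-1])
    return np.array(history)
```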
  • 29. Metropolis-Hastings Algorithm (FEM).
    Algorithm 1: Metropolis-Hastings Algorithm.
    Set initial state X(0) = θ0. For m = 1, 2, . . . , M: draw proposal; evaluate likelihood by FEM approximation of G (expensive!); compute acceptance probability α; accept proposal with probability α. Output chain X = (θ0, θ1, . . . , θM).
    30 seconds per (time-dependent) PDE solve ⇒ 10⁷ samples takes 3 × 10⁸ seconds ≈ 9.5 years (single CPU).
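In code, the structure of Algorithm 1 might look like the sketch below; a random-walk proposal is assumed, and `forward_model` and `log_prior` are placeholders for the FEM solve and the chosen prior.

```python
import numpy as np

def metropolis_hastings(z, forward_model, log_prior, Sigma, theta0, M, step=0.05, seed=1):
    """Random-walk MH targeting pi(theta|z) prop. to exp(-0.5||z - G(theta)||^2_Sigma) pi0(theta)."""
    rng = np.random.default_rng(seed)
    Sigma_inv = np.linalg.inv(Sigma)

    def log_post(theta):
        r = z - forward_model(theta)       # expensive when forward_model is a PDE solve
        return -0.5 * r @ Sigma_inv @ r + log_prior(theta)

    chain = [np.asarray(theta0, dtype=float)]
    lp = log_post(chain[0])
    for _ in range(M):
        prop = chain[-1] + step * rng.standard_normal(chain[-1].shape)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept with probability alpha
            chain.append(prop)
            lp = lp_prop
        else:
            chain.append(chain[-1].copy())
    return np.array(chain)
```

With roughly 30 seconds per call to `forward_model`, the M likelihood evaluations inside the loop dominate the cost, which is exactly the bottleneck the surrogate is designed to remove.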
  • 30. Surrogate Model. The temperature T is a function of θ = (λ, I):
    c_p ∂T(θ)/∂t − ∇·(λ ∇T(θ)) = Q(I).
    Instead of solving for individual samples of λ and I, offline we build an approximation of the form
    T_approx(θ) = Σ_{i=1}^{n_k} T_i Ψ_i(θ),   (Surrogate)
    where the Ψ_i are orthogonal (Legendre) polynomials in θ. This can be cheaply evaluated in MCMC routines (online).
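A stripped-down sketch of the online stage is given below: `coeffs` stands in for the coefficients T_i produced by the offline SGFEM solve (random numbers here, purely so the snippet runs), and the parameter ranges used to map (λ, I) onto the Legendre reference interval are invented.

```python
import numpy as np
from numpy.polynomial import legendre

# Hypothetical offline output: coefficients for a degree-k tensor Legendre basis
# in theta = (lambda, I), for each of nz observation times. Random here so it runs.
k, nz = 4, 10
rng = np.random.default_rng(2)
coeffs = rng.standard_normal((k + 1, k + 1, nz))
bounds = np.array([[300.0, 400.0], [1.0e12, 1.3e12]])    # illustrative parameter ranges

def surrogate_T(theta):
    """Cheap online evaluation T_approx(theta) = sum_i T_i Psi_i(theta)."""
    # Map each parameter to the Legendre reference interval [-1, 1].
    x = 2 * (theta - bounds[:, 0]) / (bounds[:, 1] - bounds[:, 0]) - 1
    P1 = legendre.legvander(np.atleast_1d(x[0]), k)[0]    # Psi_a(lambda), a = 0..k
    P2 = legendre.legvander(np.atleast_1d(x[1]), k)[0]    # Psi_b(I),      b = 0..k
    return np.einsum('a,b,abn->n', P1, P2, coeffs)

print(surrogate_T(np.array([355.0, 1.18e12])))            # fast: no PDE solve needed
```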
  • 32. Metropolis-Hastings Algorithm (SGFEM).
    Algorithm 2: Metropolis-Hastings Algorithm with SGFEM Surrogate.
    Compute SGFEM solution (cost dependent on FEM discretisation parameters and maximum polynomial degree k). Set initial state X(0) = θ0. For m = 1, 2, . . . , M: draw proposal; evaluate likelihood by evaluating SGFEM approximation of G (cheap!); compute acceptance probability α; accept proposal with probability α. Output chain X = (θ0, θ1, . . . , θM).
  • 33. Results: Copper Sample.
    Offline: compute surrogate solution: ≈ 16 mins (solved 600K equations × n_t = 800 time steps).
    Online: generate M = 10⁷ samples of θ from the approximate posterior π_approx(θ|d) using a standard MCMC method: ≈ 26 mins.
    Computational costs (offline / online):
    Standard: none / M × (n_t × C_PDE).
    Surrogate: n_t × O(n_k × C_PDE) / M × C_eval.
  • 35. Posterior Density, π(θ|z). [figure-only slide; axis values not recoverable]
  • 36. Posterior Convergence in k (Polynomial Degree). [figure-only slide]
  • 37. V. Hoang, C. Schwab, and A. Stuart, Complexity Analysis of Accelerated MCMC Methods for Bayesian Inversion, Inverse Problems, 29 (2013), p. 085010.
    Y. Marzouk, H. Najm, and L. Rahn, Stochastic Spectral Methods for Efficient Bayesian Solution of Inverse Problems, Journal of Computational Physics, 224 (2007), pp. 560–586.
    F. Nobile and R. Tempone, Analysis and Implementation Issues for the Numerical Approximation of Parabolic Equations with Random Coefficients, International Journal for Numerical Methods in Engineering, 80 (2009), pp. 979–1006.
    J. A. Rynn, S. L. Cotter, C. E. Powell, and L. Wright, Surrogate Accelerated Bayesian Inversion for the Determination of the Thermal Diffusivity of a Material, Metrologia, 56 (2019), p. 015018.
  • 38. Outline: 1 Introduction; 2 SGFEM Surrogate-Accelerated Inference; 3 Transport Map-Accelerated Adaptive Importance Sampling; 4 Conclusions.
  • 39–41. Motivation. [figure-only slides]
  • 42. Multiscale Systems.
    ∅ →(k1) X1,  X1 ⇄ X2 (forward rate k2·x1, backward rate k3·x2),  X2 →(k4·x2) X3,  X3 →(k5·x3) ∅.
    [figure: trajectories of X1, X2, X3 and (X1+X2)/2, number of molecules against time t]
  • 43. Constrained approximation: Simple Example.
    Figure: CMA approximation of the posterior arising from observations of the slow variable S = X1 + X2, concentrated around a manifold k1(k2 + k3 + k4)/(k2·k4) = C, i.e. more challenging than this plot suggests. (Any visualisation suggestions?)
    [pairwise marginal plots in k1, k2, k3, k4; axis values not recoverable]
  • 44. Importance Sampling.
    Sample x_i ∼ ν. Compute weights w_i = π(x_i)/ν(x_i). Monte Carlo estimates:
    E_π(f) ≈ (1/Σ_j w_j) Σ_{i=1}^{N} f(x_i) w_i.
    The variance of the weights is an indicator of efficiency; it is small when π ≈ ν.
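A self-normalised importance sampling estimator is only a few lines of code; the target and proposal densities below are simple one-dimensional stand-ins chosen so the snippet runs.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 10_000

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Illustrative choice: target pi = N(1, 0.5^2), proposal nu = N(0, 1).
x = rng.normal(0.0, 1.0, size=N)                          # x_i ~ nu
w = normal_pdf(x, 1.0, 0.5) / normal_pdf(x, 0.0, 1.0)     # w_i = pi(x_i) / nu(x_i)

f = lambda t: t ** 2
estimate = np.sum(f(x) * w) / np.sum(w)                   # E_pi[f] approx sum f(x_i) w_i / sum w_j
ess = np.sum(w) ** 2 / np.sum(w ** 2)                     # effective sample size: collapses when pi and nu mismatch
print(estimate, ess)                                      # estimate should be close to 1^2 + 0.5^2 = 1.25
```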
  • 49–53. Advantages of Importance Sampling: 10², 10³, 10⁴, 10⁵ and 10⁶ samples. [figure-only slides]
  • 54. Advantages of Importance Sampling: Weights. [figure: importance weights on a log scale]
  • 55–59. Disadvantages of Importance Sampling: 10², 10³, 10⁴, 10⁵ and 10⁶ samples. [figure-only slides]
  • 60. Disadvantages of Importance Sampling: Weights. [figure: importance weights on a log scale, spanning hundreds of orders of magnitude]
  • 61. Ensemble Transport Adaptive Importance Sampling.
    Proposal distribution in the kth iteration informed by M ensemble members with states θ_i:
    χ^(k) = (1/M) Σ_{i=1}^{M} q(· ; θ_i^(k), β),
    with q(· ; ·, β) a transition kernel, e.g. Gaussian, MALA proposal, etc., with scaling parameter β.
    Resampling step: ensemble transform method. For large M, a greedy approximation is used (“multinomial approximation”).
    C. Cotter, SLC, P. Russell, “Ensemble transport adaptive importance sampling”, SIAM JUQ 2019.
    S. Reich, “A non-parametric ensemble transform method for Bayesian inference”, SISC 2013.
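The sketch below shows one iteration of an ETAIS-style scheme in one dimension, using Gaussian transition kernels and plain multinomial resampling in place of the ensemble transform resampling of the actual method; the target density is an arbitrary bimodal stand-in.

```python
import numpy as np

rng = np.random.default_rng(4)

def target_pdf(x):
    """Stand-in target density (unnormalised): a 1-D bimodal mixture."""
    return np.exp(-0.5 * (x - 2) ** 2) + 0.5 * np.exp(-0.5 * (x + 2) ** 2 / 0.25)

def etais_iteration(ensemble, beta=0.5, n_samples=1000):
    """One adaptive importance sampling step with a Gaussian mixture proposal centred
    on the current ensemble (multinomial resampling, not the ensemble transform)."""
    M = len(ensemble)
    # Sample from chi = (1/M) sum_i N(theta_i, beta^2): pick a member, then perturb.
    idx = rng.integers(M, size=n_samples)
    x = ensemble[idx] + beta * rng.standard_normal(n_samples)
    # Proposal density evaluated as the mixture over all ensemble members.
    chi = np.mean(np.exp(-0.5 * ((x[:, None] - ensemble[None, :]) / beta) ** 2)
                  / (beta * np.sqrt(2 * np.pi)), axis=1)
    w = target_pdf(x) / chi                               # importance weights
    # Resample a new ensemble of size M from the weighted cloud.
    new_ensemble = rng.choice(x, size=M, replace=True, p=w / w.sum())
    return new_ensemble, x, w

ensemble = rng.normal(0.0, 5.0, size=100)                 # initial ensemble from a broad prior
for _ in range(5):
    ensemble, samples, weights = etais_iteration(ensemble)
```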
  • 65–73. Ensemble Transport Adaptive Importance Sampling, illustrated step by step [figure-only slides]: prior and posterior; current states X_i; MALA proposals; aggregate proposal; aggregate proposal and weight function; samples from the proposal; sample weights; resampled states.
  • 74. ETAIS: pros and cons.
    PROS: possible big speed-ups with parallelisation; well-informed proposals; reduces variance of importance weights; adaptive to global differences in scales of parameters.
    CONS: when the posterior is concentrated on a lower-dimensional manifold there are stability issues and slow convergence; a large ensemble size is required (expensive); the particle transition kernel q needs to “know” about the manifold.
  • 84. Transport maps.
    Find a homeomorphism T : R^d → R^d which maps the target measure π to an easily explored reference measure π_r:
    µ(T⁻¹(A)) = µ_r(A).
    Simple proposal densities on π_r map to complex, informed densities on π via T⁻¹: v ∼ T⁻¹(q(·, u; β)).
    A low-dimensional approximation can be computed from a posterior sample.
    M. Parno, Y. Marzouk, “Transport Map Accelerated Markov Chain Monte Carlo”, SIAM Journal on Uncertainty Quantification, 2018.
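As a toy illustration of the idea (not the polynomial maps fitted from posterior samples in the work cited above), a Rosenbrock-type “banana” target admits an exact map to a Gaussian reference, and a plain random-walk proposal made in reference coordinates pulls back to a curved, well-adapted proposal on the target; the density parameters below are assumed.

```python
import numpy as np

rng = np.random.default_rng(5)
a, s1, s2 = 1.0, 1.0, 0.1     # assumed parameters of the banana target

def log_target(theta):
    """Rosenbrock-type banana density (unnormalised), as a stand-in target."""
    t1, t2 = theta
    return -0.5 * ((t1 - a) / s1) ** 2 - 0.5 * ((t2 - t1 ** 2) / s2) ** 2

def T(theta):                 # target -> reference (known exactly here; fitted in practice)
    t1, t2 = theta
    return np.array([(t1 - a) / s1, (t2 - t1 ** 2) / s2])

def T_inv(r):                 # reference -> target
    t1 = a + s1 * r[0]
    return np.array([t1, t1 ** 2 + s2 * r[1]])

# Random-walk MH carried out in reference coordinates. This particular map has a
# constant Jacobian, so the usual |det grad T| factors cancel in the acceptance ratio;
# for a general fitted map they must be included.
chain = [np.array([0.0, 0.0])]
for _ in range(10_000):
    r_prop = T(chain[-1]) + 0.5 * rng.standard_normal(2)  # simple isotropic step
    prop = T_inv(r_prop)                                   # pulled back: curved proposal
    if np.log(rng.uniform()) < log_target(prop) - log_target(chain[-1]):
        chain.append(prop)
    else:
        chain.append(chain[-1])
chain = np.array(chain)
```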
  • 87. Transport map simplification of Rosenbrock.
    Figure: the effect of the approximate transport map T̃ on a sample from the Rosenbrock target density. (a) Original sample θ from the MH-RW algorithm (target parameter space). (b) Push forward of θ onto the reference space.
  • 88. Rosenbrock density. [figure: relative L² error against number of samples (10²–10⁶) for RWMH, TRWMH, ETAIS-RW and ETAIS-TRW]
  • 89. Constrained approximation: Simple Example (repeat of slide 43).
    Figure: CMA approximation of the posterior arising from observations of the slow variable S = X1 + X2, concentrated around a manifold k1(k2 + k3 + k4)/(k2·k4) = C, i.e. more challenging than this plot suggests. (Any visualisation suggestions?)
  • 90. Multiscale stochastic reaction network example.
    Figure: sampling algorithms with a log preconditioner for T̃. [relative error against number of samples (10²–10⁶) for MH-logRW, MH-logTRW, ETAIS-logRW and ETAIS-logTRW, with an O(1/√N) reference line]
  • 92. References.
    SLC, I. Kevrekidis, P. Russell, “Transport map accelerated adaptive importance sampling, and application to inverse problems arising from multiscale stochastic reaction networks”, arXiv preprint arXiv:1901.11269, 2019, submitted to SIAM UQ.
    M. Parno, Y. Marzouk, “Transport Map Accelerated Markov Chain Monte Carlo”, SIAM Journal on Uncertainty Quantification, 2018.
    C. Cotter, SLC, P. Russell, “Ensemble transport adaptive importance sampling”, SIAM Journal on Uncertainty Quantification, 2019.
    S. Reich, “A non-parametric ensemble transform method for Bayesian inference”, SISC 2013.
    SLC, “Constrained approximation of effective generators for multiscale stochastic reaction networks and application to conditioned path sampling”, Journal of Computational Physics, 2016.
  • 93. Outline: 1 Introduction; 2 SGFEM Surrogate-Accelerated Inference; 3 Transport Map-Accelerated Adaptive Importance Sampling; 4 Conclusions.
  • 94. Conclusions. Multiple possible reasons for extortionate costs in Bayesian inference.
    High cost of likelihood evaluations due to numerical approximation of PDEs: surrogates allow efficient sampling from an approximation of the posterior. Does π_approx(θ|d) converge to π(θ|d)? In what sense? How is the error ‖π(θ|d) − π_approx(θ|d)‖ affected by the error between the solution to the forward model and the chosen surrogate, ‖T(θ) − T_approx(θ)‖?
    Large number of MCMC samples required due to complex posterior structure: many posteriors are concentrated on lower-dimensional manifolds; optimal transport maps can simplify target distributions; ensemble adaptive importance sampling schemes can be stabilised and accelerated for lower ensemble sizes.