So-called “inverse” problems arise when the parameters of a physical system cannot be directly observed. The mapping between these latent parameters and the space of noisy observations is represented as a mathematical model, often involving a system of differential equations. We seek to infer the parameter values that best fit our observed data. However, it is also vital to obtain accurate quantification of the uncertainty involved with these parameters, particularly when the output of the model will be used for forecasting. Bayesian inference provides well-calibrated uncertainty estimates, represented by the posterior distribution over the parameters. In this talk, I will give a brief introduction to Markov chain Monte Carlo (MCMC) algorithms for sampling from the posterior distribution and describe how they can be combined with numerical solvers for the forward model. We apply these methods to two examples of ODE models: growth curves in ecology, and thermogravimetric analysis (TGA) in chemistry. This is joint work with Matthew Berry, Mark Nelson, Brian Monaghan and Raymond Longbottom.
Bayesian Inference and Uncertainty Quantification for Inverse Problems
1. Bayesian Inference and Uncertainty Quantification for Inverse Problems
Matt Moores
@MooresMt
https://uow.edu.au/~mmoores/
Centre for Environmental Informatics (CEI), University of Wollongong, NSW, Australia
TIDE Hub Seminar Series
joint with Matthew Berry, Mark Nelson, Brian Monaghan and Raymond Longbottom
1 / 19
3. Introduction
Physical model F(t; θ)
Unknown parameters θ = {θ1, . . . , θk}
Initial conditions F0 and derivatives ∂F/∂t, etc.
Observed data yt at times t = 1, . . . , T:
    yt = F(t; θ) + εt
where εt is random noise.
2 / 19
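As a concrete illustration of the data model yt = F(t; θ) + εt, the following Python sketch simulates noisy observations from a logistic growth curve. The choice of F, the parameter values, and the noise level are all illustrative (in the spirit of the ecology example later in the talk), not taken from the talk itself:

```python
import numpy as np

def logistic_growth(t, theta):
    """Logistic growth curve F(t; theta), a simple closed-form physical model.

    theta = (K, r, P0): carrying capacity, growth rate, initial population.
    (An illustrative choice of F; the talk's growth-curve model may differ.)
    """
    K, r, P0 = theta
    return K / (1.0 + (K / P0 - 1.0) * np.exp(-r * t))

rng = np.random.default_rng(0)
theta_true = (100.0, 0.5, 5.0)   # "unknown" parameters theta = {theta_1, ..., theta_k}
t = np.arange(1, 21)             # observation times t = 1, ..., T
eps = rng.normal(0.0, 2.0, size=t.size)    # random noise eps_t
y = logistic_growth(t, theta_true) + eps   # y_t = F(t; theta) + eps_t
```

In an inverse problem, only t and y would be observed; the goal is to recover theta_true (with uncertainty) from them.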
10. Drawbacks
No guarantee of convexity:
will only find the nearest local optimum
results are completely dependent on initialization
Standard errors are underestimated:
95% profile CIs for α and β achieve only ~83% coverage
6 / 19
13. Inverse Problem
Find parameters θ consistent with y
Physical model F : Θ → Y doesn’t need to be invertible:
although lack of identifiability for θ can create problems
we don’t need F in closed form: can use a numerical solver for the DE
Likelihood L(y | θ) is based on
    E[Yi | θ] = F(ti; θ)
as well as the distribution of the random noise (not necessarily Gaussian)
7 / 19
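A minimal Python sketch of such a likelihood, assuming additive Gaussian noise with known σ; the function names and the toy linear forward model are purely illustrative:

```python
import numpy as np

def log_likelihood(theta, t, y, forward, sigma):
    """Gaussian log-likelihood log L(y | theta).

    `forward(t, theta)` evaluates the physical model F(t; theta), either in
    closed form or via a numerical DE solver.  The Gaussian noise model is
    one convenient choice; any noise distribution with a tractable density
    could be substituted here.
    """
    mu = forward(t, theta)   # E[Y_i | theta] = F(t_i; theta)
    resid = y - mu
    n = y.size
    return -0.5 * n * np.log(2.0 * np.pi * sigma**2) - 0.5 * np.sum(resid**2) / sigma**2

# Toy example: a linear forward model (purely illustrative).
forward = lambda t, theta: theta[0] * t
t = np.array([1.0, 2.0, 3.0])
y = 2.0 * t
# The true slope fits the data better than a wrong one:
ll_true = log_likelihood((2.0,), t, y, forward, sigma=1.0)
ll_bad = log_likelihood((3.0,), t, y, forward, sigma=1.0)
```

The same function works whether `forward` is an analytic formula or a wrapper around a numerical DE solver, which is the point of the likelihood-based formulation.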
20. Thermogravimetric Analysis
The TGA model for a single reaction involves the Arrhenius equations:
    dM/dt = −M A exp{−E/(R T)}
    dT/dt = α
where M is the mass fraction, T is the temperature, and R is the ideal gas constant. The initial mass M0, the initial temperature T0, and the heating rate α are experimentally controlled.
The unknown parameters θ are:
A: pre-exponential factor
E: activation energy
σ²: variance of the additive Gaussian noise
13 / 19
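The two ODEs above can be integrated with a classical Runge-Kutta scheme. A self-contained Python sketch follows; the parameter values for A, E, and α are illustrative round numbers, not fitted results from the talk:

```python
import numpy as np

R = 8.314  # ideal gas constant, J/(mol K)

def tga_rhs(state, A, E, alpha):
    """Right-hand side of the single-reaction TGA model:
    dM/dt = -M A exp(-E/(R T)),  dT/dt = alpha."""
    M, T = state
    return np.array([-M * A * np.exp(-E / (R * T)), alpha])

def rk4_solve(rhs, state0, t_grid, *args):
    """Classical 4th-order Runge-Kutta on a fixed time grid."""
    states = [np.asarray(state0, dtype=float)]
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        h, s = t1 - t0, states[-1]
        k1 = rhs(s, *args)
        k2 = rhs(s + 0.5 * h * k1, *args)
        k3 = rhs(s + 0.5 * h * k2, *args)
        k4 = rhs(s + h * k3, *args)
        states.append(s + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4))
    return np.array(states)

# Illustrative parameter values only, not fitted results from the talk:
A, E, alpha = 1.0e10, 1.5e5, 10.0 / 60.0  # 1/s, J/mol, K/s (10 K/min heating)
t = np.linspace(0.0, 2400.0, 2401)        # heat from 300 K up to 700 K
traj = rk4_solve(tga_rhs, [1.0, 300.0], t, A, E, alpha)  # initial [M0, T0]
M, T = traj[:, 0], traj[:, 1]
```

With these values the mass fraction M stays near 1 at low temperature and then decays as T rises, reproducing the characteristic TGA mass-loss curve.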
22. Reparameterisation
We have good prior information for E (physically interpretable) and σ² (measurement noise), but not for A.
The distributions for A and E are also very highly correlated, slowing the mixing of MCMC.
Instead, we reparameterise the model in terms of the temperature, Tm, at which the rate dM/dt is maximised:
    A exp{−E/(R Tm)} = (E/(R Tm²)) α
In our MCMC algorithm, we first propose q(E∗ | Ej−1), then propose q(Tm∗ | Tm,j−1). The value of A∗ can then be obtained from the equation above.
We solve F(t; E∗, A∗) numerically using a Runge-Kutta method.
14 / 19
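Recovering A∗ from the proposed (E∗, Tm∗) is a one-line computation. A Python sketch of this step, with illustrative proposed values (not fitted results from the talk):

```python
import numpy as np

R = 8.314  # ideal gas constant, J/(mol K)

def A_from_E_Tm(E, Tm, alpha):
    """Recover the pre-exponential factor A from (E, Tm) via the
    reparameterisation  A exp(-E/(R Tm)) = (E/(R Tm^2)) alpha."""
    return (E * alpha / (R * Tm**2)) * np.exp(E / (R * Tm))

# Illustrative proposed values (not fitted results from the talk):
E_star, Tm_star, alpha = 1.5e5, 650.0, 10.0 / 60.0
A_star = A_from_E_Tm(E_star, Tm_star, alpha)

# The defining identity holds by construction:
lhs = A_star * np.exp(-E_star / (R * Tm_star))
rhs = (E_star / (R * Tm_star**2)) * alpha
```

Because A is a deterministic function of (E, Tm), proposing in (E, Tm) space sidesteps the strong A-E posterior correlation that slows mixing.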
25. Posterior for Functions of θ
A function of a random variable is also a random variable.
Now that we have random samples from our posterior π(A, E, Tm, σ² | y), we can use our model to obtain predictions for any measurable function of the parameters, g(A, E, Tm).
For example, the critical length of a stockpile:
    Lcr = g(A, E, Tm) = K √( exp{E/(R Tm)} · R Tm² / (A E) )
Its posterior expectation is
    E[g(A, E, Tm) | y] = ∫₀^1000 ∫₀^∞ ∫₀^∞ g(A, E, Tm) dπ(A, E, Tm | y)
                       ≈ (1/J) Σ_{j=1}^J g(Aj, Ej, Tj)
where K is a constant and {Aj, Ej, Tj} for j = 1, . . . , J are the MCMC samples (after discarding burn-in).
16 / 19
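A Python sketch of this Monte Carlo estimate, with K set to 1 (a placeholder; the real constant depends on material properties) and toy Gaussian "posterior samples" standing in for real MCMC output; all numerical values are illustrative:

```python
import numpy as np

R = 8.314  # ideal gas constant, J/(mol K)
K = 1.0    # placeholder; the real constant depends on material properties

def L_cr(A, E, Tm):
    """Critical stockpile length g(A, E, Tm) (K is a material constant)."""
    return K * np.sqrt(np.exp(E / (R * Tm)) * R * Tm**2 / (A * E))

def posterior_mean(g, samples):
    """Monte Carlo estimate of E[g(A, E, Tm) | y]: the average of g over
    the post-burn-in MCMC samples {(A_j, E_j, Tm_j)}."""
    return float(np.mean([g(A, E, Tm) for (A, E, Tm) in samples]))

# Toy "posterior samples" for illustration; real ones come from the MCMC run.
rng = np.random.default_rng(1)
samples = [(1.0e10 * np.exp(rng.normal(0.0, 0.1)),
            1.5e5 + rng.normal(0.0, 1.0e3),
            650.0 + rng.normal(0.0, 5.0)) for _ in range(1000)]
est = posterior_mean(L_cr, samples)
```

The same `posterior_mean` helper works for any measurable g, which is exactly the point of the slide: no re-running of the sampler is needed to quantify uncertainty in derived quantities.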
27. Advanced Methods
Parallel and distributed computation
Sequential updating of π(~
θ | ~
y) and streaming inference
Spatio-temporal modelling of F(x, y, z, t; ~
θ)
Emulation of the forward map Θ → Y
(e.g. using artificial neural networks)
Approximating the likelihood L (e.g. ABC)
Accounting for multi-modality
(e.g. ill-posed problems)
Accounting for error in the numerical solver (probabilistic numerics)
Accounting for model misspecification (discrepancy function)
Multi-level Monte Carlo methods
Hamiltonian Monte Carlo
18 / 19
28. Further Reading
R. J. Longbottom, B. J. Monaghan, D. J. Pinson, N. A. S. Webster and S. J. Chew
In situ Phase Analysis during Self-sintering of BOS Filter Cake for Improved Recycling.
ISIJ International 60(11): 2436–2445, 2020.
A. M. Stuart
Inverse problems: A Bayesian perspective.
Acta Numerica 19: 451–559, 2010.
L. C. Astfalck, E. J. Cripps, J. P. Gosling, M. R. Hodkiewicz and I. A. Milne
Expert elicitation of directional metocean parameters.
Ocean Engineering 161: 268–276, 2018.
B. P. Carlin and A. E. Gelfand
An iterative Monte Carlo method for nonconjugate Bayesian analysis.
Statistics and Computing 1(2): 119–128, 1991.
19 / 19