Large-scale Inverse Problems
Tania Bakhos, Peter Kitanidis
Institute for Computational Mathematical Engineering, Stanford University
Arvind K. Saibaba
Department of Electrical and Computer Engineering,Tufts University
June 28, 2015
Outline
1 Introduction
2 Linear Inverse Problems
3 Geostatistical Approach
Bayes’ theorem
Coin toss example
Covariance modeling
Non-Gaussian priors
4 Data Assimilation
Application: CO2 monitoring
5 Uncertainty quantification
MCMC
6 Concluding remarks
What is an Inverse Problem?
Parameters s −→ Model h(s) −→ Data y
The forward direction maps parameters through the model to data; the inverse problem goes back from the data y to the parameters s and quantities of interest.
Inverse problems: Applications
In the geosciences:
CO2 monitoring in the subsurface
Contaminant source identification
Climate change
Hydraulic tomography
In other fields:
Medical imaging
Non-destructive testing
Neuroscience
Image deblurring
Application: Contaminant source identification¹
Initial conditions −→ Transport processes −→ Predictions/Measurements
¹http://www.solinst.com/Prod/660/660d2.html, Stockie, SIAM Review 2011
Application: Hydraulic Tomography
Manage underground sites
To better locate natural
resources
Contaminant remediation
Source: http://web.stanford.edu/~jonghyun/research.html
Field pictures
Transient Hydraulic Tomography
Results from a field experiment conducted at the Boise Hydrological Research Site (BHRS) [2]
Figure 1: Hydraulic head measurements at observation wells (left) and log10 estimate of the hydraulic conductivity (right)
[2] Cardiff, Barrash and Kitanidis, Water Resources Research 47(12), 2011.
CSEM: Oil Exploration
Source: Morten et al., 72nd EAGE Conference 2010, Barcelona; and Newman et al., Geophysics 72(2), 2010.
Monitoring CO2 emissions
Atmospheric transport model
Observations from monitoring stations, satellite observations, etc.
Source: Anna Michalak's plenary talk
https://www.pathlms.com/siam/courses/1043/sections/1257
Application: Global Seismic Inversion
Bui-Thanh, Tan, et al. SISC 35.6 (2013): A2494-A2523.
Need for Uncertainty Quantification
"Uncertainty quantification (UQ) is the science of quantitative characterization and reduction of uncertainties in applications. It tries to determine how likely certain outcomes are if some aspects of the system are not exactly known." - Wikipedia
"... how do we quantify uncertainties in the predictions of our large-scale simulations, given limitations in observational data, computational resources, and our understanding of physical processes?" [6]
"Well, what I'm saying is that there are known knowns and that there are known unknowns. But there are also unknown unknowns; things we don't know that we don't know." - Gin Rummy, paraphrasing D. Rumsfeld
[6] Bui et al. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. IEEE Computer Society Press, 2012.
Statistical framework for inverse problems
Estimate model parameters (and their uncertainties) from data.
Propagate uncertainties forward to predict quantities of interest and their uncertainties.
Optimal experiment design: what experimental conditions yield the most information?
Challenge: this framework is often intractable because the problem is
mathematically ill-posed (sensitive to noise),
computationally challenging,
and the data carry insufficient information.
Opportunities and challenges
Central question in our research:
How can we exploit structure to overcome the curse of dimensionality and develop scalable algorithms for statistical inverse problems?
What do we mean by scalable? Scalable in the
amount of data,
discretization of the unknown random field,
number of processors.
Sessions at SIAM Geosciences
Plenary talks
IP1 The Seismic Inverse Problem: Towards Wave Equation Based Velocity Estimation
Fons ten Kroode, Shell Research, The Netherlands
McCaw Hall, 8:30-9:15 AM (Monday)
Contributed Talks
CP 3: Inverse Modeling
4:30 PM - 6:30 PM, Monday June 29th, Room: Fisher Conference Center room #5
Minisymposia at SIAM Geosciences
MS 54 Recent Advances in Geophysical Inverse Problems
Tania Bakhos, Peter Kitanidis, Arvind Saibaba
9:30 AM - 11:30 AM, Thursday July 2, Room: Bechtel Conference Center - Main Hall
MS 12 Bayesian Methods for Large-scale Geophysical Inverse Problems
Omar Ghattas, Noemi Petra, Georg Stadler
2:00 PM - 4:00 PM, Monday June 29, Room: Fisher Conference Center room #4
MS 2, MS 9, MS 15 Full-waveform Inversion
William Symes, Hughes Djikpesse
9:30 AM - 11:30 AM, 2:00 - 4:00 PM and 4:30 - 6:30 PM, Room: Fisher Conference Center room #1
MS 19 Full Waveform Inversion
MS 36 3D Elastic Waveform Inversion: Challenges in Modeling and Inversion
MS 58 Forward and Inverse Problems in Geodesy, Geodynamics, and Geomagnetism
MS 46 Data Assimilation in Subsurface Applications: Advances in Model Uncertainty Quantification
Introduction
What is an inverse problem?
Forward problem: Compute the output given a system and an input.
Inverse problem: Compute either the input or the system given the output.
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
Example
Figure 2 : Magnetization inside volcano of Mt. Vesuvius from measurements of magnetic
field
Hansen, Per Christian. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM,
2010
Challenges
Inverse problems are ill-posed: they violate at least one of the three conditions for well-posedness.
Existence: the problem must have at least one solution.
Uniqueness: the problem must have only one solution.
Stability: the solution must depend continuously on the data.
The mathematical term well-posed problem stems from a definition given by Jacques Hadamard.
Image processing
Consider the equation
$$b = Ax + \epsilon$$
Notation:
b : observations - the blurry image.
x : true image, which we want to estimate.
A : blurring operator - given.
$\epsilon$ : noise in the data.
Forward problem: given the true image x and the blurring matrix A, compute the blurred image b.
What is the inverse problem? The reverse: given b and A, recover x (the true image).
Image processing
From http://www.math.vt.edu/people/jmchung/resources/CSGF07.pdf
Review of basic linear algebra
A square real matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal if its inverse equals its transpose, i.e. $UU^T = I$ and $U^T U = I$.
A real symmetric matrix $A = A^T$ has a spectral decomposition $A = U \Lambda U^T$, where U is orthogonal and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is a diagonal matrix whose entries are the eigenvalues of A.
A real square matrix that is not symmetric can be diagonalized by two orthogonal matrices via the singular value decomposition (SVD), $A = U \Sigma V^T$, where $\Sigma$ is a diagonal matrix whose entries are the singular values of A.
Need for regularization
Perturbation theory
Would like to solve: $Ax = b$ (1). Instead we solve: $A(x + \delta x) = b + \epsilon$ (2).
Subtracting equation (1) from equation (2):
$$A \delta x = \epsilon \quad \Rightarrow \quad \delta x = A^{-1} \epsilon$$
Can show the following bounds:
$$\|\delta x\|_2 \le \|A^{-1}\|_2 \, \|\epsilon\|_2, \qquad \|x\|_2 \ge \|b\|_2 / \|A\|_2$$
Important result:
$$\frac{\|\delta x\|_2}{\|x\|_2} \le \underbrace{\|A\|_2 \|A^{-1}\|_2}_{\mathrm{cond}(A)} \frac{\|\epsilon\|_2}{\|b\|_2}$$
The more ill-conditioned the blurring operator A is, the worse the reconstruction.
TSVD
Regularization controls the amplification of noise.
Truncated SVD: discard all the singular values that are smaller than a chosen number. The naive solution is given by
$$x = A^{-1} b = V \Sigma^{-1} U^T b = \sum_{i=1}^{N} \frac{u_i^T b}{\sigma_i} v_i$$
For TSVD we truncate the singular values, so the solution is given by
$$x_k = \sum_{i=1}^{k} \frac{u_i^T b}{\sigma_i} v_i, \qquad k < N$$
This yields the same solution as imposing a minimum 2-norm constraint on the least squares problem $\min_x \|Ax - b\|_2$.
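To make the truncation concrete, here is a minimal numpy sketch (not from the talk); the Vandermonde test matrix is an arbitrary stand-in for an ill-conditioned A:

```python
import numpy as np

def tsvd_solve(A, b, k):
    """TSVD solution: keep only the k largest singular components."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ b) / s[:k])

rng = np.random.default_rng(0)
A = np.vander(np.linspace(0, 1, 20), 10)        # ill-conditioned test matrix
x_true = rng.standard_normal(10)
b = A @ x_true + 1e-6 * rng.standard_normal(20)
x6 = tsvd_solve(A, b, k=6)                      # truncation level k < N
```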
TSVD
Figure 3 : Exact image (top left), TSVD k = 658 (top right), k = 218 (bottom left) and
k = 7243 (bottom right)
658 was too low (over-smoothed) and 7243 too high (under-smoothed).
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
Selective SVD
A variant of the TSVD is the SSVD, where we include only the components that contribute significantly to the regularized solution. Given a threshold $\tau$,
$$x = \sum_{|u_i^T b| > \tau} \frac{u_i^T b}{\sigma_i} v_i$$
This method is advantageous when some of the components $u_i^T b$ corresponding to large singular values are small.
Tikhonov regularization
Least squares objective function
$$\hat{x} = \arg\min_x \; \|Ax - b\|_2^2 + \alpha^2 \|x\|_2^2$$
where $\alpha$ is a regularization parameter.
The first term $\|Ax - b\|_2^2$ measures how well the solution predicts the noisy data, sometimes referred to as "goodness-of-fit".
The second term $\|x\|_2^2$ measures the regularity of the solution.
The balance between the two terms is controlled by the parameter $\alpha$.
Relation between Tikhonov and TSVD
The solution to the Tikhonov problem is given by
$$x_\alpha = (A^T A + \alpha^2 I)^{-1} A^T b$$
If we replace A by its SVD,
$$x_\alpha = (V \Sigma^2 V^T + \alpha^2 V V^T)^{-1} V \Sigma U^T b = V (\Sigma^2 + \alpha^2 I)^{-1} \Sigma U^T b = \sum_{i=1}^{n} \phi_i^\alpha \frac{u_i^T b}{\sigma_i} v_i$$
where $\phi_i^\alpha = \frac{\sigma_i^2}{\sigma_i^2 + \alpha^2}$ are called filter factors.
Note:
$$\phi_i^\alpha \approx \begin{cases} 1 & \sigma_i \gg \alpha \\ \sigma_i^2/\alpha^2 & \sigma_i \ll \alpha \end{cases} \qquad\qquad \phi_i^{\mathrm{TSVD}} = \begin{cases} 1 & i \le k \\ 0 & i > k \end{cases}$$
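A hedged numpy sketch of the same identity: the Tikhonov solution computed through the SVD with the filter factors above, checked against the normal-equations form (test matrix arbitrary):

```python
import numpy as np

def tikhonov_solve(A, b, alpha):
    """Tikhonov solution x_alpha = sum_i phi_i (u_i^T b / sigma_i) v_i,
    with filter factors phi_i = sigma_i^2 / (sigma_i^2 + alpha^2)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    phi = s**2 / (s**2 + alpha**2)
    return Vt.T @ (phi * (U.T @ b) / s)

# Sanity check against (A^T A + alpha^2 I)^{-1} A^T b
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 12)); b = rng.standard_normal(30); alpha = 0.1
x1 = tikhonov_solve(A, b, alpha)
x2 = np.linalg.solve(A.T @ A + alpha**2 * np.eye(12), A.T @ b)
assert np.allclose(x1, x2)
```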
Relation between Tikhonov and TSVD
For each k in TSVD there exists an α such that the solution to the Tikhonov
problem and the solution based on TSVD are approximately equal.
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
Choice of parameter α
How do we choose the optimal α?
The L-curve is a log-log plot of the norm of the regularized solution versus the residual norm. The best parameter lies at the corner of the L (the point of maximum curvature).
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
General form of Tikhonov regularization
The Tikhonov formulation can be generalized to
$$\min_x \; \|Ax - b\|_2^2 + \alpha^2 \|Lx\|_2^2$$
where L is a discrete smoothing operator. Common choices are the discrete first and second derivative operators.
Comparison of regularization methods
Figure 4 : The original image (top left) and blurred image (top right). Tikhonov
regularization (bottom left) and TSVD (bottom right).
http://www2.compute.dtu.dk/~pcha/DIP/chap8.pdf
Summary
Regularization suppresses components from noise and enforces regularity on the
computed solution.
Figure 5 : Illustration of why regularization is needed
Geophysical model problem
Unknown mass with density f(t) located at depth d below the surface.
No mass outside the source.
We measure the vertical component of the gravity field, g(s).
Figure 6: Gravity surveying example problem.
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
Geophysical model problem
The magnitude of the gravity field along s due to the mass element f(t) dt is
$$\frac{f(t)\,dt}{d^2 + (s - t)^2}$$
and its direction is from the point at s to the point at t. The vertical component, with $r^2 = d^2 + (s-t)^2$, is
$$dg = \frac{\sin\theta}{r^2} f(t)\,dt$$
Using $\sin\theta = d/r$ and integrating, we get the forward problem:
$$g(s) = \int_0^1 \frac{d}{(d^2 + (s - t)^2)^{3/2}} f(t)\,dt$$
Geophysical model problem
Swapping the roles of the knowns and unknowns in the forward problem, we get the inverse problem:
$$\int_0^1 \underbrace{\frac{d}{(d^2 + (s - t)^2)^{3/2}}}_{K(s,t)} f(t)\,dt = g(s)$$
where f(t) is the quantity we wish to estimate given measurements of g(s).
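For illustration, a midpoint-rule discretization of this kernel (a sketch; the depth d = 0.25 and the test source are arbitrary choices, not taken from the talk):

```python
import numpy as np

n, d = 64, 0.25
t = (np.arange(n) + 0.5) / n                  # midpoint quadrature nodes in [0, 1]
s = t.copy()                                  # collocation points for g(s)
h = 1.0 / n
K = h * d / (d**2 + (s[:, None] - t[None, :])**2)**1.5

f_true = np.sin(np.pi * t)                    # hypothetical density profile
g = K @ f_true                                # synthetic, noise-free data
print(f"cond(K) = {np.linalg.cond(K):.2e}")   # large: ill-posedness in action
```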
TSVD
Figure 7 : Exact solution (bottom right) and TSVD solutions
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
Tikhonov regularization
Figure 8 : Exact solution (bottom right) and Tikhonov solutions
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
Large-scale inverse problems
The SVD is infeasible for large-scale problems: $O(N^3)$.
Apply iterative methods to the linear system
$$(A^T A + \alpha^2 I)\, x(\alpha) = A^T b$$
Generate a sequence of vectors (Krylov subspace):
$$\mathcal{K}_k(A^T A, A^T b) \stackrel{\mathrm{def}}{=} \mathrm{span}\{A^T b, (A^T A) A^T b, \ldots, (A^T A)^{k-1} A^T b\}$$
Lanczos bidiagonalization (LBD):
$$A V_k = U_k B_k, \qquad A^T U_k = V_k B_k^T + \beta_k v_{k+1} e_k^T, \qquad U_k^T U_k = I, \quad V_k^T V_k = I$$
$$B_k = \begin{bmatrix} \alpha_1 & & & \\ \beta_1 & \alpha_2 & & \\ & \beta_2 & \ddots & \\ & & \ddots & \alpha_k \end{bmatrix}$$
The singular values of $B_k$ converge to the singular values of A (typically the largest ones converge first).
Large-scale iterative solvers
CGLS
The LBD can be rewritten as
$$(A^T A + \alpha^2 I) V_k = V_k (B_k B_k^T + \alpha^2 I)$$
Find $x_k = V_k y_k$ such that
$$y_k = (B_k B_k^T + \alpha^2 I)^{-1} \|b\|_2 e_1$$
obtained by a Galerkin projection on the residual.
LSQR
Find $x_k = V_k y_k$ by solving a k × k system of equations:
$$y_k = \arg\min_y \; \left\| \begin{bmatrix} B_k \\ \beta_k e_k^T \end{bmatrix} y - \|b\|_2 e_1 \right\|_2^2 + \alpha^2 \|y\|_2^2$$
Solve a small regularized least squares problem at each step.
Additionally, the regularization parameter α can be estimated at each iteration.
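In practice one rarely forms $A^T A$ explicitly; SciPy's LSQR accepts the damping parameter α directly and touches A only through matrix-vector products. A small sketch (random test matrix, assumed parameters):

```python
import numpy as np
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(2)
A = rng.standard_normal((200, 100)) * (0.8 ** np.arange(100))  # decaying columns
b = A @ rng.standard_normal(100) + 1e-3 * rng.standard_normal(200)

# Damped LSQR solves min ||Ax - b||_2^2 + alpha^2 ||x||_2^2 iteratively,
# using only products with A and A^T (matrix-free friendly).
x_reg = lsqr(A, b, damp=1e-2, iter_lim=50)[0]
```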
Semi-convergence behavior
Standard convergence criteria for iterative solvers, based on the residual, do not work well for inverse problems, because the measurements are corrupted by noise. We need different stopping criteria / regularization methods.
From http://www.math.vt.edu/people/jmchung/resources/CSGF07.pdf
CGLS convergence
Hansen, PC. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010
Bayes’ theorem
Reverend Thomas Bayes
Interpretation: Inductive argument
p(Hypothesis|Evidence) ∝ p(Evidence|Hypothesis)p(Hypothesis)
(left) http://www.gaussianwaves.com/2013/10/bayes-theorem/ (right) Wikipedia
Coin toss experiment: Bayesian analysis
Say we have a "biased coin":
$$X_1, X_2, \ldots, X_{n+1}, \qquad p(X_i = 1 \mid \pi) = \pi, \quad p(X_i = 0 \mid \pi) = 1 - \pi$$
What is the probability of observing a certain sequence? H, T, H, ... or H, H, H, ...
After n + 1 trials we have
$$p(X_1 = x_1, \ldots, X_{n+1} = x_{n+1} \mid \pi) = \prod_{k=1}^{n+1} p(X_k = x_k \mid \pi) = \pi^{\sum_k x_k} (1 - \pi)^{n+1 - \sum_k x_k}$$
The $X_i$ are conditionally independent.
Bayesian update: Uniform prior
Let's assume that we don't have any information:
$$p(\pi) = \begin{cases} 1 & 0 < \pi < 1 \\ 0 & \text{otherwise} \end{cases}$$
Bayesian analysis: Uniform prior
Bayes' rule:
$$p(\pi \mid x_1, \ldots, x_{n+1}) = \frac{p(x_1, \ldots, x_{n+1} \mid \pi)\, p(\pi)}{p(x_1, \ldots, x_{n+1})}$$
Applying Bayes' rule:
$$p(\pi \mid x_1, \ldots, x_{n+1}) \propto \pi^{\sum_k x_k} (1 - \pi)^{n+1 - \sum_k x_k} \times I_{0 < \pi < 1}$$
Summary of the distribution:
$$\text{Conditional mean:} \quad \frac{n}{n+2} \frac{\sum_k x_k}{n} + \frac{1}{n+2}, \qquad \text{Maximum:} \quad \frac{\sum_k x_k}{n}$$
We can approximate the distribution by a Gaussian (Laplace's approximation):
$$p(\pi \mid x_1, \ldots, x_{n+1}) \approx \mathcal{N}(\mu, \sigma^2), \qquad \mu = \frac{\sum_k x_k}{n}, \quad \sigma^2 = \frac{\mu(1 - \mu)}{n}$$
Prior: Beta distribution
π follows a Beta(α, β) distribution:
$$p(\pi) \propto \pi^{\alpha - 1} (1 - \pi)^{\beta - 1}$$
The Beta distribution is analytically tractable; it is an example of a conjugate prior.
Bayesian update: Beta prior (figures shown for α = 5, β = 2 and α = 0.5, β = 0.5)
Bayesian Analysis: Beta prior
Applying Bayes' rule:
$$p(\pi \mid x_1, \ldots, x_{n+1}) \propto \pi^{\sum_k x_k} (1 - \pi)^{n+1 - \sum_k x_k} \times \pi^{\alpha - 1} (1 - \pi)^{\beta - 1} = \pi^{\sum_k x_k + \alpha - 1} (1 - \pi)^{n+1 - \sum_k x_k + \beta - 1}$$
Conditional mean:
$$E[\pi \mid x_1, \ldots, x_{n+1}] = \int_0^1 \pi \, p(\pi \mid x_1, \ldots, x_{n+1}) \, d\pi = \frac{n}{n + \alpha + \beta} \frac{\sum_k x_k}{n} + \frac{\alpha + \beta}{n + \alpha + \beta} \frac{\alpha}{\alpha + \beta}$$
Observe that this gives the right limit as n → ∞.
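The conjugate update is one line of arithmetic; a small sketch (the true bias 0.7 is an arbitrary test value):

```python
import numpy as np

def beta_update(flips, a=5.0, b=2.0):
    """Beta(a, b) prior + Bernoulli likelihood -> Beta(a + heads, b + tails)."""
    heads = int(np.sum(flips))
    tails = len(flips) - heads
    a_post, b_post = a + heads, b + tails
    return a_post, b_post, a_post / (a_post + b_post)   # posterior mean

flips = np.random.default_rng(3).random(1000) < 0.7     # biased coin, pi = 0.7
print(beta_update(flips))   # posterior mean approaches 0.7 as n grows
```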
Inverse problems: Bayesian viewpoint
Consider the measurement equation
$$y = h(s) + v, \qquad v \sim \mathcal{N}(0, \Gamma_{\mathrm{noise}})$$
Notation:
y : observations or measurements - given.
s : model parameters, which we want to estimate.
h(s) : parameter-to-observation map - given.
v : additive i.i.d. Gaussian noise.
Using Bayes' rule, the posterior pdf is
$$p(s \mid y) \propto \underbrace{p(y \mid s)}_{\text{data misfit}} \, \underbrace{p(s)}_{\text{prior}}$$
Data misfit - "how well the model reproduces the data".
Prior - "prior knowledge of the unknown field": smoothness, sparsity, etc.
Geostatistical approach
Let s(x) be the parameter field we wish to recover:
$$s(x) = \underbrace{\sum_{k=1}^{p} f_k(x) \beta_k}_{\text{deterministic term}} + \underbrace{\epsilon(x)}_{\text{random term}}$$
Possible choices for $f_k(x)$:
Low order polynomials: $f_1 = 1$, $f_2 = x$, $f_3 = x^2$, etc.
Zonation model: $f_k$ is nonzero only in certain regions.
There are several possible choices for $\epsilon(x)$; we will assume Gaussian random fields and revisit this assumption later in this talk.
Gaussian Random Fields
GRFs are multidimensional generalizations of Gaussian processes.
Definition: A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution.
A Gaussian process is completely specified by its mean function and covariance function:
$$\mu(x) \stackrel{\mathrm{def}}{=} E[f(x)], \qquad \kappa(x, y) \stackrel{\mathrm{def}}{=} E[(f(x) - \mu(x))(f(y) - \mu(y))]$$
The GP is denoted as
$$f(x) \sim \mathcal{N}(\mu(x), \kappa(x, y))$$
Figure 9: Samples from Gaussian random fields
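A minimal sketch of drawing such samples on a 1D grid by a dense Cholesky factorization (fine at this size; the fast methods discussed later are what make this scale). The kernel and grid are arbitrary choices:

```python
import numpy as np

n, L = 500, 0.1
x = np.linspace(0, 1, n)
K = np.exp(-np.abs(x[:, None] - x[None, :]) / L)   # exponential covariance
C = np.linalg.cholesky(K + 1e-10 * np.eye(n))      # small jitter for stability
samples = C @ np.random.default_rng(4).standard_normal((n, 5))  # 5 GRF draws
```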
Geostatistical approach
Model priors as Gaussian random fields:
$$s \mid \beta \sim \mathcal{N}(X\beta, \Gamma_{\mathrm{prior}}), \qquad p(\beta) \propto 1$$
Posterior distribution: applying Bayes' theorem,
$$p(s, \beta \mid y) \propto p(y \mid s, \beta)\, p(s \mid \beta)\, p(\beta) \propto \exp\left(-\frac{1}{2}\|y - h(s)\|^2_{\Gamma_{\mathrm{noise}}^{-1}} - \frac{1}{2}\|s - X\beta\|^2_{\Gamma_{\mathrm{prior}}^{-1}}\right)$$
Maximum a posteriori (MAP) estimate:
$$\hat{s}, \hat{\beta} = \arg\min_{s, \beta}\; -\log p(s, \beta \mid y) = \arg\min_{s, \beta}\; \frac{1}{2}\|y - h(s)\|^2_{\Gamma_{\mathrm{noise}}^{-1}} + \frac{1}{2}\|s - X\beta\|^2_{\Gamma_{\mathrm{prior}}^{-1}}$$
MAP Estimate - Linear Inverse Problems
Maximum a posteriori (MAP) estimate for h(s) = Hs:
$$\hat{s}, \hat{\beta} = \arg\min_{s, \beta}\; \frac{1}{2}\|y - Hs\|^2_{\Gamma_{\mathrm{noise}}^{-1}} + \frac{1}{2}\|s - X\beta\|^2_{\Gamma_{\mathrm{prior}}^{-1}}$$
Obtained by solving the system of equations
$$\begin{bmatrix} H\Gamma_{\mathrm{prior}}H^T + \Gamma_{\mathrm{noise}} & HX \\ (HX)^T & 0 \end{bmatrix} \begin{bmatrix} \hat{\xi} \\ \hat{\beta} \end{bmatrix} = \begin{bmatrix} y \\ 0 \end{bmatrix}, \qquad \hat{s} = X\hat{\beta} + \Gamma_{\mathrm{prior}}H^T\hat{\xi}$$
Solved using a matrix-free Krylov solver.
Requires fast ways to compute $Hx$ and $\Gamma_{\mathrm{prior}}x$.
Preconditioner [21] using a low-rank representation of $\Gamma_{\mathrm{prior}}$.
[21] Preconditioned iterative solver developed in Saibaba and Kitanidis, WRR 2012.
Interpolation using Gaussian Processes [22]
The posterior is Gaussian with
$$\mu_{\mathrm{post}}(x^*) = \kappa(x^*, x)\left(\kappa(x, x) + \sigma^2 I\right)^{-1} y(x)$$
$$\mathrm{cov}_{\mathrm{post}}(x^*, x^*) = \kappa(x^*, x^*) - \kappa(x^*, x)\left(\kappa(x, x) + \sigma^2 I\right)^{-1} \kappa(x, x^*)$$
[22] Gaussian Processes for Machine Learning, Rasmussen and Williams
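A direct transcription of these two formulas into numpy (a sketch; the squared-exponential kernel and noise level are assumptions, not choices from the talk):

```python
import numpy as np

def gp_posterior(x_tr, y_tr, x_te, kappa, sigma2):
    """Posterior mean and covariance of a GP with i.i.d. Gaussian noise."""
    Kxx = kappa(x_tr[:, None], x_tr[None, :]) + sigma2 * np.eye(len(x_tr))
    Ksx = kappa(x_te[:, None], x_tr[None, :])
    Kss = kappa(x_te[:, None], x_te[None, :])
    mu = Ksx @ np.linalg.solve(Kxx, y_tr)
    cov = Kss - Ksx @ np.linalg.solve(Kxx, Ksx.T)
    return mu, cov

kappa = lambda x, y: np.exp(-(x - y)**2 / 0.05)
x_tr = np.linspace(0, 1, 10)
mu, cov = gp_posterior(x_tr, np.sin(2 * np.pi * x_tr),
                       np.linspace(0, 1, 50), kappa, 1e-4)
```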
Application: CO2 monitoring
Challenge: real-time monitoring of CO2 concentration from a time series of noisy seismic traveltime tomography data: 288 measurements and 234 × 217 unknowns.
A.K. Saibaba, Ambikasaran, Li, Darve, Kitanidis, Oil and Gas Science and Technology 67.5 (2012): 857.
Matérn covariance family
Matérn class of covariance kernels:
$$\kappa(x, y) = \frac{(\alpha r)^\nu}{2^{\nu - 1}\Gamma(\nu)} K_\nu(\alpha r), \qquad \alpha > 0, \; \nu > 0$$
Here $r = \|x - y\|_2$ is the radial distance between the points x and y.
Examples: exponential kernel (ν = 1/2), Gaussian kernel (ν = ∞).
Matérn covariance kernels
Deconvolution equation:
$$y(t) = \int_0^T f(t - \tau)\, s(\tau)\, d\tau$$
Matérn covariance kernels for ν = 1/2, 3/2, ∞:
$$\kappa(x, y) = \exp(-|x - y|/L)$$
$$\kappa(x, y) = \left(1 + \sqrt{3}|x - y|/L\right)\exp\left(-\sqrt{3}|x - y|/L\right)$$
$$\kappa(x, y) = \exp(-|x - y|^2/L^2)$$
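A sketch of evaluating the general Matérn kernel with SciPy's modified Bessel function; the ν = 1/2 case should reproduce the exponential kernel above (α = 1, and the r = 0 limit is handled explicitly):

```python
import numpy as np
from scipy.special import gamma, kv

def matern(r, alpha=1.0, nu=1.5):
    """Matern kernel (alpha r)^nu / (2^{nu-1} Gamma(nu)) K_nu(alpha r);
    the r -> 0 limit equals 1."""
    r = np.asarray(r, dtype=float)
    out = np.ones_like(r)
    nz = r > 0
    ar = alpha * r[nz]
    out[nz] = ar**nu / (2**(nu - 1) * gamma(nu)) * kv(nu, ar)
    return out

r = np.linspace(0, 3, 200)
print(np.allclose(matern(r, nu=0.5), np.exp(-r)))   # nu = 1/2: exponential kernel
```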
Fast covariance evaluations
Consider the Gaussian priors
$$s \mid \beta \sim \mathcal{N}(X\beta, \Gamma_{\mathrm{prior}})$$
Covariance matrices are dense - expensive to store and compute. For example, a dense $10^6 \times 10^6$ matrix costs 7.45 TB. Typically we only need to evaluate $\Gamma_{\mathrm{prior}}x$ and $\Gamma_{\mathrm{prior}}^{-1}x$.
Standard approaches:
FFT based methods,
Fast Multipole Method,
Hierarchical matrices,
Kronecker tensor product approximations.
Compared to the naive $O(N^2)$:
Storage cost: $O(N \log^\alpha N)$. Matvec cost: $O(N \log^\beta N)$.
Toeplitz Matrices
A Toeplitz matrix T is an N × N matrix with entries $T_{ij} = t_{i-j}$, i.e. a matrix of the form
$$T = \begin{bmatrix} t_0 & t_{-1} & t_{-2} & \cdots & t_{-(N-1)} \\ t_1 & t_0 & t_{-1} & & \\ t_2 & t_1 & t_0 & & \\ \vdots & & & \ddots & \\ t_{N-1} & & & \cdots & t_0 \end{bmatrix}$$
Suppose the points are $x_i = i h$ and $y_j = j h$ for $i, j = 1, \ldots, N$:
Stationary kernels: $Q_{ij} = \kappa(x_i, y_j) = \kappa((i - j)h)$
Translation-invariant kernels: $Q_{ij} = \kappa(x_i, y_j) = \kappa(|i - j|h)$
Need to store only O(N) entries, compared to $O(N^2)$ entries.
FFT based methods
Toeplitz matrices arise from stationary covariance kernels on regular grids. Periodic embedding into a circulant matrix:
$$\begin{bmatrix} c & b & a \\ b & c & b \\ a & b & c \end{bmatrix} \;\Longrightarrow\; \begin{bmatrix} c & b & a & a & b \\ b & c & b & a & a \\ a & b & c & b & a \\ a & a & b & c & b \\ b & a & a & b & c \end{bmatrix}$$
Circulant matrices are diagonalizable by the Fourier basis.
Matrix-vector products for Toeplitz matrices: O(N log N).
Restricted to regular, equispaced grids.
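A sketch of the O(N log N) matvec via circulant embedding, checked against the dense Toeplitz matrix (a symmetric kernel is assumed; grid and length scale are arbitrary):

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_matvec(col, x):
    """Multiply a symmetric Toeplitz matrix (given by its first column)
    with a vector via circulant embedding and the FFT."""
    n = len(col)
    c = np.concatenate([col, col[-2:0:-1]])       # circulant of size 2n - 2
    x_pad = np.concatenate([x, np.zeros(n - 2)])
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(x_pad)).real[:n]

grid = np.linspace(0, 1, 256)
col = np.exp(-np.abs(grid - grid[0]) / 0.1)       # exponential kernel, 1st column
x = np.random.default_rng(5).standard_normal(256)
print(np.allclose(toeplitz_matvec(col, x), toeplitz(col) @ x))
```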
H-matrix formulation: An Intuitive Explanation
Consider, for $x_i, y_i = (i - 1)\frac{1}{N - 1}$, $i = 1, \ldots, N$, the kernels
$$\kappa_\alpha(x, y) = \frac{1}{|x - y| + \alpha}, \quad \alpha > 0, \qquad \text{and} \qquad \kappa(x, y) = \exp(-|x - y|)$$
Figures 10-11: blockwise ε-rank with α = 10⁻⁶, ε = 10⁻⁶, N = M = 256
Exponentially decaying singular values of off-diagonal blocks
$$\kappa_\alpha(x, y) = \frac{1}{|x - y| + \alpha}, \qquad \alpha > 0 \tag{1}$$
Figure 12: First 32 singular values of off-diagonal sub-blocks of the matrix corresponding to the non-overlapping segments (left) [0, 0.5] × [0.5, 1] and (right) [0, 0.25] × [0.75, 1.0]
The decay of the singular values can be related to the smoothness of the kernel.
Prof. SVD - Gene Golub
Rank-10, rank-20, and rank-100 approximations (of a photograph of Gene Golub).
Hierarchical matrices [24]
Hierarchical separation of space.
Low rank sub-blocks for well separated clusters.
Mild restrictions on the types of permissible kernels.
Levels 0 through 3 of the block partitioning (clustering and block clustering): full-rank blocks near the diagonal, low-rank blocks elsewhere.
[24] Hackbusch - 2000, Grasedyck and Hackbusch - 2003, Bebendorf - 2008
Quasi-linear geostatistical approach
Maximum a posteriori estimate:
$$\arg\min_{s, \beta}\; \frac{1}{2}\|y - h(s)\|^2_{\Gamma_{\mathrm{noise}}^{-1}} + \frac{1}{2}\|s - X\beta\|^2_{\Gamma_{\mathrm{prior}}^{-1}}$$
Algorithm 2: Quasi-linear geostatistical approach
1: while not converged do
2:   Solve the system of equations [26]
$$\begin{bmatrix} J_k\Gamma_{\mathrm{prior}}J_k^T + \Gamma_{\mathrm{noise}} & J_k X \\ (J_k X)^T & 0 \end{bmatrix} \begin{bmatrix} \xi_{k+1} \\ \beta_{k+1} \end{bmatrix} = \begin{bmatrix} y - h(s_k) + J_k s_k \\ 0 \end{bmatrix}$$
   where the Jacobian $J_k = \frac{\partial h}{\partial s}\big|_{s = s_k}$
3:   Update $s_{k+1} = X\beta_{k+1} + \Gamma_{\mathrm{prior}}J_k^T\xi_{k+1}$
4: end while
[26] Preconditioned iterative solver developed in Saibaba and Kitanidis, WRR 2012.
MAP Estimate - Quasi-linear Inverse Problems
At each step,
linearize to get a local Gaussian approximation;
solve a sequence of linear inverse problems.
Non-Gaussian priors
Gaussian random fields often produce smooth reconstructions, but we often need discontinuous reconstructions (facies detection, tumor location).
Several possibilities:
Total Variation regularization
Level Set approach
Markov Random Fields
Wavelet based reconstructions
This is only scratching the surface; there are lots of techniques (and literature) available.
Total variation regularization
Total variation in 1D:
$$TV(f) = \sup \sum_{k=1}^{n-1} |f(x_{k+1}) - f(x_k)|$$
A measure of the arc length of a curve.
Gif: Wikipedia. Figure: Kaipio et al. Statistical and computational inverse problems. Vol. 160. Springer Science & Business Media, 2006
Total Variation Regularization
MAP estimate (penalize discontinuous changes):
$$\min_s\; \frac{1}{2}\|y - h(s)\|^2_{\Gamma_{\mathrm{noise}}^{-1}} + \alpha \int_\Omega |\nabla s|\, ds, \qquad |\nabla s| \approx \sqrt{\nabla s \cdot \nabla s + \varepsilon}$$
Figure 13: Inverse wave propagation problem. (left) Cross-sections of the inverted and target models, (right) surface model of the target.
Akcelic, Biros and Ghattas, Supercomputing, ACM/IEEE 2002 Conference. IEEE, 2002.
Level Set approach
$$s(x) = c_f(x) H(\phi(x)) + c_b(x)\left(1 - H(\phi(x))\right), \qquad H(x) = \frac{1}{2}(1 + \mathrm{sign}(x))$$
Figure 14: Image courtesy of Wikipedia
Topologically flexible - able to recover multiple connected components.
Evolve the shape by minimizing an objective function.
Bayesian Level Set approach
Level set function:
$$s(x) = c_f(x) H(\phi(x)) + c_b(x)\left(1 - H(\phi(x))\right), \qquad H(x) = \frac{1}{2}(1 + \mathrm{sign}(x))$$
Employ a Gaussian random field as the prior for φ(x).
Groundwater flow:
$$-\nabla \cdot \left(\kappa \nabla u(x)\right) = f(x), \quad x \in \Omega, \qquad u = 0, \quad x \in \partial\Omega$$
Transformation: $s = \log \kappa$.
Iglesias et al. Preprint arXiv:1504.00313
Tracking trajectories
360° panorama - Teliportme
4D Var Filtering
Consider the dynamical system
$$\frac{\partial v}{\partial t} = F(v) + \eta, \qquad v(x, 0) = v_0(x)$$
3D Var filtering:
$$J_3(v) \stackrel{\mathrm{def}}{=} \|y(T) - h(v(x; T))\|^2_{\Gamma_{\mathrm{noise}}^{-1}} + \frac{1}{2}\|v_0(x) - v_0^*(x)\|^2_{\Gamma_{\mathrm{prior}}^{-1}}$$
4D Var filtering:
$$J_4(v) \stackrel{\mathrm{def}}{=} \sum_{k=1}^{N_t} \|y(t_k) - h(v(x; t_k))\|^2_{\Gamma_{\mathrm{noise}}^{-1}} + \frac{1}{2}\|v_0(x) - v_0^*(x)\|^2_{\Gamma_{\mathrm{prior}}^{-1}}$$
Optimization problem:
$$\hat{v}_0 \stackrel{\mathrm{def}}{=} \arg\min_{v_0} J_k(v), \qquad k = 3, 4$$
Application: Contaminant source identification
Transport equations:
$$\frac{\partial c}{\partial t} + v \cdot \nabla c = D \nabla^2 c, \qquad D \nabla c \cdot n = 0, \qquad c(x, 0) = c_0(x)$$
Estimate the initial conditions from measurements of the contaminant field.
Akcelik, Volkan, et al. Proceedings of the 2005 ACM/IEEE conference on Supercomputing. IEEE Computer Society, 2005.
Linear Dynamical System
State-space chain: the state evolves as $s_{k-1} \to s_k \to s_{k+1}$ through the transition operator F, driven by the system noise $u_k$; each state $s_k$ is mapped through the observation operator $H_k$ to the measurement $y_k$, corrupted by the measurement noise $v_k$.
State Evolution equations
Linear evolution equations, obtained by discretizing a PDE:
$$s_{k+1} = F_k s_k + u_k, \qquad u_k \sim \mathcal{N}(0, \Gamma_{\mathrm{prior}})$$
$$y_{k+1} = H_{k+1} s_{k+1} + v_k, \qquad v_k \sim \mathcal{N}(0, \Gamma_{\mathrm{noise}})$$
Nonlinear evolution equations:
$$s_{k+1} = f(s_k) + u_k, \qquad y_{k+1} = h(s_{k+1}) + v_k$$
These can be linearized (Extended Kalman Filter) or handled as is (ensemble filtering).
Kalman Filter
Current state $\mathcal{N}(\hat{s}_{k|k}, \Sigma_{k|k})$ → update → predict → future state $\mathcal{N}(\hat{s}_{k+1|k+1}, \Sigma_{k+1|k+1})$, with transition matrix $F_k$, observation operator $H_k$, system noise $w_k \sim \mathcal{N}(0, \Gamma_{\mathrm{sys}})$ and measurement noise $v_k \sim \mathcal{N}(0, \Gamma_{\mathrm{noise}})$.
All variables are modeled as Gaussian random variables, completely specified by the mean and covariance matrix.
The Kalman filter provides a recursive way to update state knowledge and predictions.
Standard implementation of Kalman Filter
Predict:
$$\hat{s}_{k+1|k} = F_k \hat{s}_{k|k}$$
$$\Sigma_{k+1|k} = F_k \Sigma_{k|k} F_k^T + \Gamma_{\mathrm{prior}} \qquad O(N^3)$$
Update:
$$S_k = H_k \Sigma_{k+1|k} H_k^T + \Gamma_{\mathrm{noise}} \qquad O(n_m N^2)$$
$$K_k = \Sigma_{k+1|k} H_k^T S_k^{-1} \qquad O(n_m N^2)$$
$$\hat{s}_{k+1|k+1} = \hat{s}_{k+1|k} + K_k \left(y_k - H_k \hat{s}_{k+1|k}\right) \qquad O(n_m N)$$
$$\Sigma_{k+1|k+1} = \left(\Sigma_{k+1|k}^{-1} + H_k^T \Gamma_{\mathrm{noise}}^{-1} H_k\right)^{-1} \qquad O(n_m N^2 + N^3)$$
N: number of unknowns; $n_m$: number of measurements.
Storage cost $O(N^2)$ and computational cost $O(N^3)$: prohibitively expensive for large-scale implementation.
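One predict/update cycle, transcribed from the equations above (a dense sketch with exactly the costs the slide warns about; Q and R stand in for Γ_prior and Γ_noise):

```python
import numpy as np

def kalman_step(s, Sigma, y, F, H, Q, R):
    """Standard (dense) Kalman filter: one predict + update cycle."""
    # Predict
    s_pred = F @ s
    Sigma_pred = F @ Sigma @ F.T + Q
    # Update
    S = H @ Sigma_pred @ H.T + R                 # innovation covariance S_k
    K = Sigma_pred @ H.T @ np.linalg.inv(S)      # Kalman gain K_k
    s_new = s_pred + K @ (y - H @ s_pred)
    Sigma_new = Sigma_pred - K @ H @ Sigma_pred  # equals (Sigma^-1 + H^T R^-1 H)^-1
    return s_new, Sigma_new
```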
Ensemble Kalman Filter
The EnKF is a Monte Carlo approximation of the Kalman filter.
Ensemble of state variables: $X = [x_1, \ldots, x_N]$
The ensemble of realizations is propagated individually: can reuse legacy codes, easily parallelizable.
To update the filter, compute statistics based on the ensemble.
Unlike the Kalman filter, it can be readily applied to nonlinear problems.
The ensemble mean and covariance can be computed as
$$E[X] = \frac{1}{N}\sum_{k=1}^{N} x_k, \qquad C = \frac{1}{N - 1} A A^T$$
where A is the mean-subtracted ensemble.
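A sketch of the stochastic ("perturbed observation") EnKF analysis step built from these ensemble statistics; forming C densely here is for clarity only and defeats the purpose at scale:

```python
import numpy as np

def enkf_update(X, y, H, R, seed=0):
    """X: n_state x n_ens forecast ensemble; returns the analysis ensemble."""
    n_ens = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)         # mean-subtracted ensemble
    C = A @ A.T / (n_ens - 1)                     # sample covariance
    K = C @ H.T @ np.linalg.inv(H @ C @ H.T + R)  # ensemble Kalman gain
    rng = np.random.default_rng(seed)
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=n_ens).T
    return X + K @ (Y - H @ X)                    # perturbed-observation update
```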
Application: Real-time CO2 monitoring
Sources fire a pulse; receivers measure the time delay.
Measurements: travel time of each source-receiver pair.
6 sources, 48 receivers = 288 measurements.
Assumption: rays travel in a straight-line path:
$$t_{sr} = \int_{\mathrm{source}}^{\mathrm{recv}} \underbrace{\frac{1}{v(x)}}_{\text{slowness}}\, d\ell + \text{noise}$$
Model problem for: reflection seismology, CT scanning, etc.
Random Walk Forecast model
The evolution of CO2 can be modeled as
$$s_{k+1} = F_k s_k + u_k, \qquad u_k \sim \mathcal{N}(0, \Gamma_{\mathrm{prior}})$$
$$y_{k+1} = H_{k+1} s_{k+1} + v_k, \qquad v_k \sim \mathcal{N}(0, \Gamma_{\mathrm{noise}})$$
Random walk assumption: $F_k = I$.
A useful modeling assumption when measurements can be acquired rapidly.
Applications: Electrical Impedance Tomography, Electrical Resistivity Tomography, Seismic Travel-time Tomography.
Treat $\Gamma_{\mathrm{prior}}$ using the hierarchical matrix approach.
A.K. Saibaba, E.L. Miller, P.K. Kitanidis, A Fast Kalman Filter for time-lapse Electrical Resistivity Tomography. Proceedings of IGARSS 2014, Montreal.
Results: Kalman Filter
Figure 15 : True and estimated CO2-induced changes in slowness (reciprocal of velocity)
between two wells for the grid size 234 × 219 at times 3, 30 and 60 hours respectively.
Comparison of costs of different algorithms
Grid size 59 × 55.
$\Gamma_{\mathrm{prior}}$ is constructed using the kernel $\kappa(r) = \theta \exp(-\sqrt{r})$.
$\Gamma_{\mathrm{noise}} = \sigma^2 I$ with $\sigma^2 = 10^{-4}$.
Saibaba, Arvind K., et al. Inverse Problems 31.1 (2015): 015009.
Error in the reconstruction
$\Gamma_{\mathrm{prior}}$ is constructed using the kernel $\kappa(r) = \theta \exp(-\sqrt{r})$.
$\Gamma_{\mathrm{noise}} = \sigma^2 I$ with $\sigma^2 = 10^{-4}$.
Saibaba, Arvind K., et al. Inverse Problems 31.1 (2015): 015009.
Conditional Realizations
Figure 16 : Conditional realizations of CO2-induced changes in slowness (reciprocal of
velocity) between two wells for the grid size 59 × 55 at times 3, 30 and 60 hours
respectively.
Saibaba, Arvind K., et al. Inverse Problems 31.1 (2015): 015009.
Inverse problems: Bayesian viewpoint
Consider the measurement equation
$$y = h(s) + v, \qquad v \sim \mathcal{N}(0, \Gamma_{\mathrm{noise}})$$
Notation: y: observations or measurements (given); s: model parameters (to estimate); h(s): parameter-to-observation map (given); v: additive i.i.d. Gaussian noise.
Using Bayes' rule, the posterior pdf is
$$p(s \mid y) \propto \underbrace{p(y \mid s)}_{\text{data misfit}} \, \underbrace{p(s)}_{\text{prior}}$$
The posterior distribution is the Bayesian solution to the inverse problem.
Bayesian Inference: Quantifying uncertainty
Maximum a posteriori (MAP) estimate: $\arg\max_s p(s \mid y)$
Conditional mean:
$$s_{\mathrm{CM}} = E_{s|y}[s] = \int s\, p(s \mid y)\, ds$$
Credibility intervals: find sets C(y) such that
$$P[s \in C(y) \mid y] = 1 - \alpha$$
Sample realizations from the posterior.
Linear Inverse Problems
Recall that the distribution is given by
$$p(s \mid y) \propto \exp\left(-\frac{1}{2}\|y - Hs\|^2_{\Gamma_{\mathrm{noise}}^{-1}} - \frac{1}{2}\|s - \mu\|^2_{\Gamma_{\mathrm{prior}}^{-1}}\right)$$
Posterior distribution:
$$s \mid y \sim \mathcal{N}(s_{\mathrm{MAP}}, \Gamma_{\mathrm{post}})$$
$$\Gamma_{\mathrm{post}} = \left(\Gamma_{\mathrm{prior}}^{-1} + H^T \Gamma_{\mathrm{noise}}^{-1} H\right)^{-1} = \Gamma_{\mathrm{prior}} - \Gamma_{\mathrm{prior}} H^T \left(H \Gamma_{\mathrm{prior}} H^T + \Gamma_{\mathrm{noise}}\right)^{-1} H \Gamma_{\mathrm{prior}}$$
$$s_{\mathrm{MAP}} = \Gamma_{\mathrm{post}} \left(H^T \Gamma_{\mathrm{noise}}^{-1} y + \Gamma_{\mathrm{prior}}^{-1} \mu\right)$$
Observe that $\Gamma_{\mathrm{post}} \preceq \Gamma_{\mathrm{prior}}$.
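Both expressions for Γ_post are easy to verify numerically on a small problem; a sketch with arbitrary dimensions:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 50, 20
H = rng.standard_normal((m, n))
L = rng.standard_normal((n, n))
G_prior = L @ L.T + n * np.eye(n)                 # SPD prior covariance
G_noise = 0.1 * np.eye(m)

G_post1 = np.linalg.inv(np.linalg.inv(G_prior) + H.T @ np.linalg.inv(G_noise) @ H)
G_post2 = G_prior - G_prior @ H.T @ np.linalg.inv(
    H @ G_prior @ H.T + G_noise) @ H @ G_prior    # Woodbury form
print(np.allclose(G_post1, G_post2))              # the two formulas agree
```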
Application: CO2 monitoring
$$\mathrm{Variance} = \mathrm{diag}(\Gamma_{\mathrm{post}}) = \mathrm{diag}\left(\left(\Gamma_{\mathrm{prior}}^{-1} + H^T \Gamma_{\mathrm{noise}}^{-1} H\right)^{-1}\right)$$
A.K. Saibaba, Ambikasaran, Li, Darve, Kitanidis, Oil and Gas Science and Technology 67.5 (2012): 857.
Nonlinear Inverse Problems
Linearize the forward operator (at the MAP point):
$$h(s) = h(s_{\mathrm{MAP}}) + \frac{\partial h}{\partial s}\left(s - s_{\mathrm{MAP}}\right) + O\!\left(\|s - s_{\mathrm{MAP}}\|_2^2\right)$$
Groundwater flow equations:
$$-\nabla \cdot \left(\kappa(x)\nabla\phi\right) = Q\,\delta(x - x_{\mathrm{source}}), \quad x \in \Omega, \qquad \phi = 0, \quad x \in \partial\Omega_D$$
Inverse problem: estimate the hydraulic conductivity κ from discrete measurements of φ. To make the problem well-posed, work with s = log κ.
Figure 17: (left) Reconstruction of log conductivity (right) Posterior variance
Saibaba, Arvind K., et al., Advances in Water Resources 82 (2015): 124-138.
Monte Carlo sampling
Suppose X has density p(x) and we are interested in f(X):
$$E[f(X)] = \int f(x)\, p(x)\, dx = \lim_{N \to \infty} \frac{1}{N}\sum_{k=1}^{N} f(x_k)$$
Approximate using sample averages:
$$E[f(X)] \approx \frac{1}{N}\sum_{k=1}^{N} f(x_k)$$
Here p(x) is understood to be the posterior distribution.
If samples are easy to generate, the procedure is straightforward; use the Central Limit Theorem to generate confidence intervals.
Generating samples from p(x) may not be straightforward.
Acceptance-rejection sampling
Approximate the distribution using an easier distribution:
$$\frac{\text{points under curve}}{\text{points generated}} \times \text{box area} \;\xrightarrow{\,n \to \infty\,}\; \int_A^B f(x)\, dx$$
From the PyMC2 website: http://pymc-devs.github.io/pymc/theory.html
Markov chains
Consider a sequence of random variables $X_1, X_2, \ldots$ with
$$p(X_{t+1} = x_{t+1} \mid X_t = x_t, \ldots, X_1 = x_1) = p(X_{t+1} = x_{t+1} \mid X_t = x_t)$$
The future depends only on the present - not the past!
Under some conditions, the chain has a stationary distribution.
Implementation
Create a Markov chain whose stationary distribution is p(x):
1. Draw a proposal y from $q(y \mid x_n)$.
2. Calculate the acceptance ratio
$$\alpha(x_n, y) = \min\left(1, \frac{p(y)\, q(x_n \mid y)}{p(x_n)\, q(y \mid x_n)}\right)$$
3. Accept/reject:
$$x_{n+1} = \begin{cases} y & \text{with probability } \alpha(x_n, y) \\ x_n & \text{with probability } 1 - \alpha(x_n, y) \end{cases}$$
If $q(x, y) = q(y, x)$ then $\alpha(x_n, y) = \min\{1, p(y)/p(x_n)\}$.
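A compact random-walk Metropolis sketch (the symmetric Gaussian proposal lets the ratio simplify as noted above; the target and step size are arbitrary test choices):

```python
import numpy as np

def metropolis(log_p, x0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_p(x)
    chain = [x.copy()]
    for _ in range(n_steps):
        y = x + step * rng.standard_normal(x.shape)   # propose
        lp_y = log_p(y)
        if np.log(rng.random()) < lp_y - lp:          # accept w.p. min(1, p(y)/p(x))
            x, lp = y, lp_y
        chain.append(x.copy())
    return np.array(chain)

chain = metropolis(lambda x: -0.5 * np.sum(x**2), np.zeros(2), 5000)  # N(0, I) target
```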
MCMC demo
Demo at: http://chifeng.scripts.mit.edu/stuff/mcmc-demo/
Properties of MCMC sampling
Ergodic theorem for expectations:
$$\lim_{N \to \infty} \frac{1}{N}\sum_{k=1}^{N} f(x_k) = \int_\Omega f(x)\, p(x)\, dx$$
However, the samples $x_k$ are no longer i.i.d., so the estimator has higher variance than plain Monte Carlo sampling.
Popular sampling strategies:
Metropolis-Hastings
Gibbs samplers
Hamiltonian MCMC
Adaptive MCMC with Delayed Rejection (DRAM)
Metropolis adjusted Langevin Algorithm (MALA)
Curse of dimensionality
What is the probability of hitting a hypersphere inscribed in a hypercube?
In dimension n = 100, the probability is less than $2 \times 10^{-70}$.
Stochastic Newton MCMC
Martin, James, et al. SISC 34.3 (2012): A1460-A1487.
Opportunities
Theoretical and numerical
“Big data” meets “Big Models”
Model reduction
Posterior uncertainty quantification
Applications
New application areas, new technologies that generate inverse problems
Combining multiple modalities to make better predictions
Software that transcends application areas
Resources for learning Inverse Problems
Books
Hansen, Per Christian. Discrete inverse problems: insight and algorithms.
Vol. 7. SIAM, 2010.
Hansen, Per Christian. Rank-deficient and discrete ill-posed problems:
numerical aspects of linear inversion. Vol. 4. SIAM, 1998.
Hansen, Per Christian, James G. Nagy, and Dianne P. O’Leary. Deblurring
images: matrices, spectra, and filtering. Vol. 3. SIAM, 2006.
Tarantola, Albert. Inverse problem theory and methods for model parameter
estimation. SIAM, 2005.
Kaipio, Jari, and Erkki Somersalo. Statistical and computational inverse
problems. Vol. 160. Springer Science & Business Media, 2006.
Vogel, Curtis R. Computational methods for inverse problems. Vol. 23.
SIAM, 2002.
PK Kitanidis. Introduction to geostatistics: applications in hydrogeology.
Cambridge University Press, 1997.
Cressie, Noel. Statistics for spatial data. John Wiley & Sons, 2015.
Resources for learning Inverse Problems
Software Packages
Regularization Tools (MATLAB)
Website:
http://www2.compute.dtu.dk/~pcha/Regutools/regutools.html
PEST
Website http://www.pesthomepage.org/
bgaPEST
Website: http://pubs.usgs.gov/tm/07/c09/
MUQ
Website: https://bitbucket.org/mituq/muq
  • 16. Application: Global Seismic Inversion. Bui-Thanh, Tan, et al. SISC 35.6 (2013): A2494–A2523.
  • 17. Need for Uncertainty Quantification. "Uncertainty quantification (UQ) is the science of quantitative characterization and reduction of uncertainties in applications. It tries to determine how likely certain outcomes are if some aspects of the system are not exactly known." – Wikipedia. "... how do we quantify uncertainties in the predictions of our large-scale simulations, given limitations in observational data, computational resources, and our understanding of physical processes?" – Bui et al., Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, IEEE Computer Society Press, 2012. "Well, what I'm saying is that there are known knowns and that there are known unknowns. But there are also unknown unknowns; things we don't know that we don't know." – Gin Rummy, paraphrasing D. Rumsfeld.
  • 20. Statistical framework for inverse problems. Estimate model parameters (and uncertainties) from data. Propagate uncertainties forward to predict quantities of interest (and their uncertainties). Optimal experiment design: what experimental conditions yield the most information? Challenge: the framework is often intractable because the problem is mathematically ill-posed (sensitive to noise), computationally challenging, and the data carry insufficient information.
  • 22. Opportunities and challenges. Central question in our research: how to exploit structure in order to overcome the curse of dimensionality and develop scalable algorithms for statistical inverse problems? What do we mean by scalable? Scalable in the amount of data, in the discretization of the unknown random field, and in the number of processors.
  • 24. Sessions at SIAM Geosciences. Plenary talks: IP1 The Seismic Inverse Problem: Towards Wave Equation Based Velocity Estimation, Fons ten Kroode, Shell Research, The Netherlands; McCaw Hall, 8:30–9:15 AM (Monday). Contributed talks: CP3 Inverse Modeling, 4:30–6:30 PM, Monday June 29th, Room: Fisher Conference Center room #5.
  • 25. Minisymposia at SIAM Geosciences. MS 54 Recent Advances in Geophysical Inverse Problems (Tania Bakhos, Peter Kitanidis, Arvind Saibaba), 9:30–11:30 AM Thursday July 2, Room: Bechtel Conference Center – Main Hall. MS 12 Bayesian Methods for Large-scale Geophysical Inverse Problems (Omar Ghattas, Noemi Petra, Georg Stadler), 2:00–4:00 PM Monday June 29, Room: Fisher Conference Center room #4. MS 2, MS 9, MS 15 Full-waveform Inversion (William Symes, Hughes Djikpesse), 9:30–11:30 AM, 2:00–4:00 PM and 4:30–6:30 PM, Room: Fisher Conference Center room #1. Also: MS 19 Full Waveform Inversion; MS 36 3D Elastic Waveform Inversion: Challenges in Modeling and Inversion; MS 58 Forward and Inverse Problems in Geodesy, Geodynamics, and Geomagnetism; MS 46 Data Assimilation in Subsurface Applications: Advances in Model Uncertainty Quantification.
  • 26. Outline (section divider: Linear Inverse Problems).
  • 27. Introduction. What is an inverse problem? Forward problem: compute the output given a system and an input. Inverse problem: compute either the input or the system given the output. Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 28. Example. Figure 2: Magnetization inside the volcano of Mt. Vesuvius, from measurements of the magnetic field. Hansen, Per Christian. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 29. Challenges. Inverse problems are ill-posed: they fail at least one of the three conditions for well-posedness. Existence: the problem must have at least one solution. Uniqueness: the problem must have only one solution. Stability: the solution must depend continuously on the data. The mathematical term well-posed problem stems from a definition given by Jacques Hadamard.
  • 30. Image processing. Consider the equation $b = Ax + \epsilon$. Notation: $b$: observations, the blurry image; $x$: true image, which we want to estimate; $A$: blurring operator, given; $\epsilon$: noise in the data.
  • 31. Image processing. Forward problem: given the true image $x$ and the blurring matrix $A$, we get the blurred image $b$. What is the inverse problem? The opposite of the forward problem: given $b$ and $A$, we compute $x$ (the true image).
  • 34. Image processing. From http://www.math.vt.edu/people/jmchung/resources/CSGF07.pdf
  • 35. Review of basic linear algebra. A square real matrix $U \in \mathbb{R}^{n\times n}$ is orthogonal if its inverse equals its transpose, i.e. $UU^T = I$ and $U^T U = I$. A real symmetric matrix $A = A^T$ has a spectral decomposition $A = U\Lambda U^T$, where $U$ is orthogonal and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ is a diagonal matrix whose entries are the eigenvalues of $A$. A real square matrix that is not symmetric can be diagonalized by two orthogonal matrices with the singular value decomposition (SVD), $A = U\Sigma V^T$, where $\Sigma$ is a diagonal matrix whose entries are the singular values of $A$.
  • 36. Need for regularization: perturbation theory. We would like to solve $Ax = b$ (1); instead we solve $A(x + \delta x) = b + \epsilon$ (2). Subtracting (1) from (2) gives $A\,\delta x = \epsilon \Rightarrow \delta x = A^{-1}\epsilon$. One can show the bounds $\|\delta x\|_2 \le \|A^{-1}\|_2\|\epsilon\|_2$ and $\|x\|_2 \ge \|b\|_2/\|A\|_2$. Important result:
$$\frac{\|\delta x\|_2}{\|x\|_2} \le \underbrace{\|A\|_2\|A^{-1}\|_2}_{\mathrm{cond}(A)}\,\frac{\|\epsilon\|_2}{\|b\|_2}.$$
The more ill-conditioned the blurring operator $A$ is, the worse the reconstruction.
  • 37. TSVD. Regularization controls the amplification of noise. Truncated SVD: discard all the singular values that are smaller than a chosen number. The naive solution is
$$x = A^{-1}b = V\Sigma^{-1}U^T b = \sum_{i=1}^{N}\frac{u_i^T b}{\sigma_i}v_i.$$
For TSVD we truncate the singular values, so the solution is
$$x_k = \sum_{i=1}^{k}\frac{u_i^T b}{\sigma_i}v_i, \qquad k < N.$$
This yields the same solution as imposing a minimum 2-norm constraint on the least squares problem $\min_x \|Ax - b\|_2$.
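As a concrete illustration of the truncated SVD, here is a minimal Python sketch. The toy 1D Gaussian blur operator, the noise level, and the truncation level k are all illustrative placeholders, not taken from the talk.

```python
import numpy as np

def tsvd_solve(A, b, k):
    """Truncated-SVD solution x_k = sum_{i<=k} (u_i^T b / sigma_i) v_i."""
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    coeffs = (U[:, :k].T @ b) / sigma[:k]      # filtered spectral coefficients
    return Vt[:k, :].T @ coeffs

# Toy 1D Gaussian blur: A is ill-conditioned, b is noisy data.
n = 100
t = np.arange(n)
A = np.exp(-(t[:, None] - t[None, :]) ** 2 / 20.0)
A /= A.sum(axis=1, keepdims=True)              # row-normalized blur operator
x_true = np.zeros(n); x_true[40:60] = 1.0      # piecewise-constant test image
b = A @ x_true + 1e-3 * np.random.default_rng(0).standard_normal(n)
x_k = tsvd_solve(A, b, k=20)                   # try smaller/larger k to see over/under-smoothing
```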
  • 40. TSVD. Figure 3: Exact image (top left) and TSVD reconstructions with k = 658 (top right), k = 218 (bottom left) and k = 7243 (bottom right); 658 was too low (over-smoothed) and 7243 too high (under-smoothed). Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 41. Selective SVD. A variant of the TSVD is the SSVD, where we only include components that contribute significantly to the regularized solution. Given a threshold $\tau$,
$$x = \sum_{|u_i^T b| > \tau}\frac{u_i^T b}{\sigma_i}v_i.$$
This method is advantageous when some of the components $u_i^T b$ corresponding to large singular values are small.
  • 42. Tikhonov regularization. Least squares objective function
$$\hat{x} = \arg\min_x \|Ax - b\|_2^2 + \alpha^2\|x\|_2^2,$$
where $\alpha$ is a regularization parameter. The first term $\|Ax - b\|_2^2$ measures how well the solution predicts the noisy data, sometimes referred to as "goodness-of-fit". The second term $\|x\|_2^2$ measures the regularity of the solution. The balance between the terms is controlled by the parameter $\alpha$.
  • 43. Relation between Tikhonov and TSVD. The solution to the Tikhonov problem is
$$x_\alpha = (A^T A + \alpha^2 I)^{-1}A^T b.$$
If we replace $A$ by its SVD,
$$x_\alpha = (V\Sigma^2 V^T + \alpha^2 VV^T)^{-1}V\Sigma U^T b = V(\Sigma^2 + \alpha^2 I)^{-1}\Sigma U^T b = \sum_{i=1}^{n}\phi_i^\alpha\,\frac{u_i^T b}{\sigma_i}v_i,$$
where $\phi_i^\alpha = \dfrac{\sigma_i^2}{\sigma_i^2 + \alpha^2}$ are called filter factors. Note:
$$\phi_i^\alpha \approx \begin{cases}1 & \sigma_i \gg \alpha\\ \sigma_i^2/\alpha^2 & \sigma_i \ll \alpha\end{cases}\qquad \phi_i^{\mathrm{TSVD}} = \begin{cases}1 & i \le k\\ 0 & i > k\end{cases}$$
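The filter-factor formula translates directly into a few lines of NumPy. A minimal sketch (A, b and alpha are random placeholders), with a sanity check against the normal-equations solve $(A^TA + \alpha^2 I)^{-1}A^Tb$:

```python
import numpy as np

def tikhonov_svd(A, b, alpha):
    """Tikhonov solution via filter factors: x = V diag(phi_i / sigma_i) U^T b."""
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    phi = sigma**2 / (sigma**2 + alpha**2)     # filter factors phi_i^alpha
    return Vt.T @ (phi * (U.T @ b) / sigma)

rng = np.random.default_rng(0)
A, b, alpha = rng.standard_normal((30, 20)), rng.standard_normal(30), 0.1
x1 = tikhonov_svd(A, b, alpha)
x2 = np.linalg.solve(A.T @ A + alpha**2 * np.eye(20), A.T @ b)
assert np.allclose(x1, x2)                     # both forms give the same solution
```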
  • 44. Relation between Tikhonov and TSVD. For each k in TSVD there exists an α such that the solution to the Tikhonov problem and the solution based on TSVD are approximately equal. Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 45. Choice of parameter α. How do we choose the optimal α? The L-curve is a log-log plot of the norm of the regularized solution versus the residual norm. The best parameter lies at the corner of the L (the point of maximum curvature). Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 46. General form of Tikhonov regularization. The Tikhonov formulation can be generalized to
$$\min_x \|Ax - b\|_2^2 + \alpha^2\|Lx\|_2^2,$$
where $L$ is a discrete smoothing operator. Common choices are the discrete first and second derivative operators.
  • 47. Comparison of regularization methods. Figure 4: The original image (top left) and blurred image (top right); Tikhonov regularization (bottom left) and TSVD (bottom right). http://www2.compute.dtu.dk/~pcha/DIP/chap8.pdf
  • 48. Summary. Regularization suppresses components from noise and enforces regularity on the computed solution. Figure 5: Illustration of why regularization is needed.
  • 49. Geophysical model problem. Unknown mass with density $f(t)$ located at depth $d$ below the surface; no mass outside the source. We measure the vertical component of the gravity field, $g(s)$. Figure 6: Gravity surveying example problem. Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 50. Geophysical model problem. The magnitude of the gravity field along $s$ due to the mass element at $t$ is $\dfrac{f(t)\,dt}{d^2 + (s-t)^2}$, directed from the point at $s$ to the point at $t$, so the vertical component is
$$dg = \frac{\sin\theta}{r^2}f(t)\,dt.$$
Using $\sin\theta = d/r$ and integrating, we get the forward problem:
$$g(s) = \int_0^1 \frac{d}{(d^2 + (s-t)^2)^{3/2}}\,f(t)\,dt.$$
  • 51. Geophysical model problem. Reading the forward problem the other way around, we get the inverse problem:
$$\int_0^1 \underbrace{\frac{d}{(d^2 + (s-t)^2)^{3/2}}}_{K(s,t)}\,f(t)\,dt = g(s),$$
where $f(t)$ is the quantity we wish to estimate, given measurements of $g(s)$.
  • 52. TSVD. Figure 7: Exact solution (bottom right) and TSVD solutions. Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 53. Tikhonov regularization. Figure 8: Exact solution (bottom right) and Tikhonov solutions. Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 54. Large-scale inverse problems. The SVD is infeasible for large-scale problems: $O(N^3)$. Instead, apply iterative methods to the linear system
$$(A^T A + \alpha^2 I)\,x(\alpha) = A^T b.$$
Generate a sequence of vectors spanning the Krylov subspace
$$\mathcal{K}_k(A^T A, A^T b) \stackrel{\mathrm{def}}{=} \mathrm{Span}\{A^T b,\, (A^T A)A^T b,\, \ldots,\, (A^T A)^{k-1}A^T b\}.$$
Lanczos bidiagonalization (LBD):
$$AV_k = U_k B_k, \qquad A^T U_k = V_k B_k^T + \beta_k v_{k+1}e_k^T, \qquad U_k^T U_k = I, \quad V_k^T V_k = I,$$
with the lower bidiagonal matrix
$$B_k = \begin{bmatrix}\alpha_1 & & & \\ \beta_1 & \alpha_2 & & \\ & \beta_2 & \ddots & \\ & & \ddots & \alpha_k\end{bmatrix}.$$
The singular values of $B_k$ converge to the singular values of $A$ (typically the largest ones converge first).
  • 55. Large-scale iterative solvers. CGLS: the LBD can be rewritten as
$$(A^T A + \alpha^2 I)V_k = V_k(B_k B_k^T + \alpha^2 I).$$
Find $x_k = V_k y_k$ such that
$$y_k = (B_k B_k^T + \alpha^2 I)^{-1}\|b\|_2 e_1,$$
obtained by a Galerkin projection on the residual. LSQR: find $x_k = V_k y_k$ by solving a $k\times k$ system of equations,
$$y_k = \arg\min_y \left\|\begin{bmatrix}B_k\\ \beta_k e_k^T\end{bmatrix}y - \|b\|_2 e_1\right\|_2^2 + \alpha^2\|y\|_2^2,$$
i.e. solve a small regularized least squares problem at each step. Additionally, the regularization parameter $\alpha$ can be estimated at each iteration.
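In practice the bidiagonalization machinery above is packaged in standard libraries; SciPy's lsqr, for instance, accepts a matrix-free LinearOperator and a damp parameter that plays the role of α. A minimal sketch with a made-up toy forward operator (a discrete integration operator and its adjoint; the operator and parameter values are illustrative assumptions):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

n = 200
# Toy forward model: cumulative sum (discrete integration) and its adjoint.
A = LinearOperator((n, n),
                   matvec=lambda x: np.cumsum(x) / n,
                   rmatvec=lambda y: np.cumsum(y[::-1])[::-1] / n,
                   dtype=float)
b = A.matvec(np.ones(n)) + 1e-3 * np.random.default_rng(0).standard_normal(n)

# damp=alpha adds the alpha^2 ||x||^2 term; iter_lim caps the bidiagonalization steps,
# which acts as an additional regularization (cf. the semi-convergence slide below).
x_k = lsqr(A, b, damp=1e-2, iter_lim=50)[0]
```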
  • 56. Semi-convergence behavior. Standard convergence criteria for iterative solvers based on the residual do not work well for inverse problems, because the measurements are corrupted by noise. We need different stopping criteria / regularization methods. From http://www.math.vt.edu/people/jmchung/resources/CSGF07.pdf
  • 57. CGLS convergence (two figure slides). Hansen, P.C. Discrete inverse problems: insight and algorithms. Vol. 7. SIAM, 2010.
  • 59. Outline (section divider: Geostatistical Approach).
  • 60. Bayes' theorem. Reverend Thomas Bayes. Interpretation as an inductive argument:
$$p(\mathrm{Hypothesis}\,|\,\mathrm{Evidence}) \propto p(\mathrm{Evidence}\,|\,\mathrm{Hypothesis})\,p(\mathrm{Hypothesis}).$$
(left) http://www.gaussianwaves.com/2013/10/bayes-theorem/ (right) Wikipedia.
  • 61. Coin toss experiment: Bayesian analysis. Say we have a "biased coin" and tosses $X_1, X_2, \ldots, X_{n+1}$ with
$$p(X_i = 1\,|\,\pi) = \pi, \qquad p(X_i = 0\,|\,\pi) = 1 - \pi.$$
What is the probability of observing a certain sequence, e.g. H, T, H, ... or H, H, H, ...? After $n+1$ trials we have
$$p(X_1 = x_1, \ldots, X_{n+1} = x_{n+1}\,|\,\pi) = \prod_{k=1}^{n+1}p(X_k = x_k\,|\,\pi) = \pi^{\sum x_k}(1-\pi)^{n+1-\sum x_k}.$$
The $X_i$ are conditionally independent.
  • 62. Bayesian update: uniform prior. Let's assume that we don't have any information:
$$p(\pi) = \begin{cases}1 & 0 < \pi < 1\\ 0 & \text{otherwise}\end{cases}$$
  • 63. Bayesian analysis: uniform prior. Bayes' rule:
$$p(\pi\,|\,x_1, \ldots, x_{n+1}) = \frac{p(x_1, \ldots, x_{n+1}\,|\,\pi)\,p(\pi)}{p(x_1, \ldots, x_{n+1})}.$$
Applying Bayes' rule,
$$p(\pi\,|\,x_1, \ldots, x_{n+1}) \propto \pi^{\sum x_k}(1-\pi)^{n+1-\sum x_k}\times \mathbb{I}_{0<\pi<1}.$$
Summary of the distribution:
$$\text{Conditional mean: } \frac{n}{n+2}\,\frac{\sum x_k}{n} + \frac{1}{n+2}, \qquad \text{Maximum: } \frac{\sum x_k}{n}.$$
We can approximate the distribution by a Gaussian (Laplace's approximation):
$$p(\pi\,|\,x_1, \ldots, x_{n+1}) \approx N(\mu, \sigma^2), \qquad \mu = \frac{\sum x_k}{n}, \quad \sigma^2 = \frac{\mu(1-\mu)}{n}.$$
  • 66. Prior: Beta distribution. $\pi$ follows a Beta($\alpha, \beta$) distribution,
$$p(\pi) \propto \pi^{\alpha-1}(1-\pi)^{\beta-1}.$$
The Beta distribution is analytically tractable; it is an example of a conjugate prior.
  • 67. Bayesian update: Beta prior with α = 5, β = 2, and with α = 0.5, β = 0.5 (figures).
  • 69. Bayesian analysis: Beta prior. Applying Bayes' rule,
$$p(\pi\,|\,x_1, \ldots, x_{n+1}) \propto \pi^{\sum x_k}(1-\pi)^{n+1-\sum x_k}\times \pi^{\alpha-1}(1-\pi)^{\beta-1} = \pi^{\sum x_k + \alpha - 1}(1-\pi)^{n+1-\sum x_k+\beta-1}.$$
Conditional mean:
$$E[\pi\,|\,x_1, \ldots, x_{n+1}] = \int_0^1 \pi\,p(\pi\,|\,x_1, \ldots, x_{n+1})\,d\pi = \frac{n}{n+\alpha+\beta}\,\frac{\sum x_k}{n} + \frac{\alpha+\beta}{n+\alpha+\beta}\,\frac{\alpha}{\alpha+\beta}.$$
Observe that this gives the right limit as $n \to \infty$.
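Because the Beta prior is conjugate, the posterior update reduces to bookkeeping on the two hyperparameters. A small sketch with made-up tosses (the data are illustrative), verifying that the posterior mean matches the shrinkage formula above:

```python
import numpy as np

alpha, beta = 5.0, 2.0                  # prior hyperparameters
x = np.array([1, 0, 1, 1, 0, 1, 1])     # observed tosses (1 = heads), made-up data
n_trials, heads = len(x), x.sum()

# Conjugate update: posterior is Beta(alpha + heads, beta + tails).
alpha_post = alpha + heads
beta_post = beta + n_trials - heads
post_mean = alpha_post / (alpha_post + beta_post)

# Equivalently, the shrinkage form from the slide: a convex combination of the
# sample mean and the prior mean alpha/(alpha+beta).
w = n_trials / (n_trials + alpha + beta)
shrunk = w * (heads / n_trials) + (1 - w) * (alpha / (alpha + beta))
assert np.isclose(post_mean, shrunk)
```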
  • 70. Inverse problems: Bayesian viewpoint. Consider the measurement equation
$$y = h(s) + v, \qquad v \sim N(0, \Gamma_{\mathrm{noise}}).$$
Notation: $y$: observations or measurements, given; $s$: model parameters, which we want to estimate; $h(s)$: parameter-to-observation map, given; $v$: additive i.i.d. Gaussian noise. Using Bayes' rule, the posterior pdf is
$$p(s\,|\,y) \propto \underbrace{p(y\,|\,s)}_{\text{data misfit}}\;\underbrace{p(s)}_{\text{prior}}.$$
Data misfit: "how well the model reproduces data". Prior: "prior knowledge of the unknown field" (smoothness, sparsity, etc.).
  • 73. Geostatistical approach. Let $s(x)$ be the parameter field we wish to recover:
$$s(x) = \underbrace{\sum_{k=1}^{p}f_k(x)\beta_k}_{\text{deterministic term}} + \underbrace{\epsilon(x)}_{\text{random term}}.$$
Possible choices for $f_k(x)$: low order polynomials ($f_1 = 1$, $f_2 = x$, $f_3 = x^2$, etc.), or a zonation model in which $f_k$ is nonzero only in certain regions. There are several possible choices for $\epsilon(x)$; we will assume Gaussian random fields, and revisit this assumption later in this talk.
  • 74. Gaussian Random Fields. GRFs are multidimensional generalizations of Gaussian processes. Definition: a Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution. A Gaussian process is completely specified by its mean function and covariance function,
$$\mu(x) \stackrel{\mathrm{def}}{=} E[f(x)], \qquad \kappa(x, y) \stackrel{\mathrm{def}}{=} E[(f(x) - \mu(x))(f(y) - \mu(y))],$$
and the GP is denoted $f(x) \sim N(\mu(x), \kappa(x, y))$. Figure 9: Samples from Gaussian random fields.
  • 77. Geostatistical approach. Model priors as Gaussian random fields:
$$s\,|\,\beta \sim N(X\beta, \Gamma_{\mathrm{prior}}), \qquad p(\beta) \propto 1.$$
Posterior distribution, applying Bayes' theorem:
$$p(s, \beta\,|\,y) \propto p(y\,|\,s, \beta)\,p(s\,|\,\beta)\,p(\beta) \propto \exp\left(-\frac12\|y - h(s)\|^2_{\Gamma^{-1}_{\mathrm{noise}}} - \frac12\|s - X\beta\|^2_{\Gamma^{-1}_{\mathrm{prior}}}\right).$$
Maximum a posteriori (MAP) estimate:
$$\hat{s}, \hat{\beta} = \arg\min_{s,\beta}\,-\log p(s, \beta\,|\,y) = \arg\min_{s,\beta}\,\frac12\|y - h(s)\|^2_{\Gamma^{-1}_{\mathrm{noise}}} + \frac12\|s - X\beta\|^2_{\Gamma^{-1}_{\mathrm{prior}}}.$$
  • 80. MAP estimate – linear inverse problems. For $h(s) = Hs$, the MAP estimate
$$\hat{s}, \hat{\beta} = \arg\min_{s,\beta}\,\frac12\|y - Hs\|^2_{\Gamma^{-1}_{\mathrm{noise}}} + \frac12\|s - X\beta\|^2_{\Gamma^{-1}_{\mathrm{prior}}}$$
is obtained by solving the system of equations
$$\begin{bmatrix}H\Gamma_{\mathrm{prior}}H^T + \Gamma_{\mathrm{noise}} & HX\\ (HX)^T & 0\end{bmatrix}\begin{bmatrix}\hat{\xi}\\ \hat{\beta}\end{bmatrix} = \begin{bmatrix}y\\ 0\end{bmatrix}, \qquad \hat{s} = X\hat{\beta} + \Gamma_{\mathrm{prior}}H^T\hat{\xi}.$$
This is solved using a matrix-free Krylov solver; it requires fast ways to compute $Hx$ and $\Gamma_{\mathrm{prior}}x$, and a preconditioner using a low-rank representation of $\Gamma_{\mathrm{prior}}$ (preconditioned iterative solver developed in Saibaba and Kitanidis, WRR 2012). A dense sketch of this solve follows.
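For small problems the saddle-point system can simply be assembled and solved densely, which makes the structure explicit. In this sketch the forward operator H, the linear drift X, the exponential prior covariance and the data are random placeholders chosen for illustration; large-scale problems would use the matrix-free Krylov approach instead.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 20, 50, 2                        # measurements, unknowns, drift functions
pts = np.linspace(0, 1, n)
H = rng.standard_normal((m, n))            # placeholder forward operator
X = np.column_stack([np.ones(n), pts])     # drift: f_1 = 1, f_2 = x
G_prior = np.exp(-np.abs(np.subtract.outer(pts, pts)) / 0.2)   # exponential covariance
G_noise = 1e-4 * np.eye(m)
y = rng.standard_normal(m)                 # placeholder data

# Assemble and solve the saddle-point system from the slide.
K = np.block([[H @ G_prior @ H.T + G_noise, H @ X],
              [(H @ X).T,                   np.zeros((p, p))]])
sol = np.linalg.solve(K, np.concatenate([y, np.zeros(p)]))
xi, beta = sol[:m], sol[m:]
s_hat = X @ beta + G_prior @ H.T @ xi      # MAP estimate
```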
  • 83. Interpolation using Gaussian Processes (Gaussian Processes for Machine Learning, Rasmussen and Williams). The posterior is Gaussian with
$$\mu_{\mathrm{post}}(x^*) = \kappa(x^*, x)\,(\kappa(x, x) + \sigma^2 I)^{-1}\,y(x),$$
$$\mathrm{cov}_{\mathrm{post}}(x^*, x^*) = \kappa(x^*, x^*) - \kappa(x^*, x)\,(\kappa(x, x) + \sigma^2 I)^{-1}\,\kappa(x, x^*).$$
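These two formulas are a direct transcription away from working code. A minimal sketch with a squared-exponential kernel and made-up 1D training data (the kernel, length scale and noise level are illustrative assumptions):

```python
import numpy as np

def kern(a, b, L=0.3):
    """Squared-exponential covariance kappa(a, b) = exp(-(a-b)^2 / L^2)."""
    return np.exp(-np.subtract.outer(a, b) ** 2 / L**2)

x = np.array([0.1, 0.4, 0.6, 0.9])       # training locations
y = np.sin(2 * np.pi * x)                 # toy observations
xs = np.linspace(0, 1, 101)               # prediction locations
sigma2 = 1e-4                              # observation noise variance

Kxx = kern(x, x) + sigma2 * np.eye(len(x))
Ksx = kern(xs, x)
mu_post = Ksx @ np.linalg.solve(Kxx, y)                          # posterior mean
cov_post = kern(xs, xs) - Ksx @ np.linalg.solve(Kxx, Ksx.T)      # posterior covariance
```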
  • 84. Application: CO2 monitoring. Challenge: real-time monitoring of CO2 concentration from a time series of noisy seismic traveltime tomography data; 288 measurements and 234 × 217 unknowns. A.K. Saibaba, Ambikasaran, Li, Darve, Kitanidis, Oil and Gas Science and Technology 67.5 (2012): 857.
  • 85. Matérn covariance family. The Matérn class of covariance kernels is
$$\kappa(x, y) = \frac{(\alpha r)^\nu}{2^{\nu-1}\Gamma(\nu)}K_\nu(\alpha r), \qquad \alpha > 0, \; \nu > 0,$$
where $r = \|x - y\|_2$ is the radial distance between the points $x$ and $y$ and $K_\nu$ is the modified Bessel function of the second kind. Examples: the exponential kernel ($\nu = 1/2$) and the Gaussian kernel ($\nu = \infty$).
  • 86. Matérn covariance kernels. Deconvolution equation
$$y(t) = \int_0^T f(t - \tau)\,s(\tau)\,d\tau,$$
with Matérn covariance kernels $\nu = 1/2,\, 3/2,\, \infty$:
$$\kappa(x, y) = \exp(-|x-y|/L), \qquad \kappa(x, y) = (1 + \sqrt{3}\,|x-y|/L)\exp(-\sqrt{3}\,|x-y|/L), \qquad \kappa(x, y) = \exp(-|x-y|^2/L^2).$$
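The three special cases quoted above can be evaluated without Bessel functions; only general ν requires $K_\nu$. A sketch, with the correlation length L as a placeholder:

```python
import numpy as np

def matern(r, nu, L=1.0):
    """The three Matern special cases used on the slide (nu = 1/2, 3/2, infinity)."""
    r = np.abs(r)
    if nu == 0.5:                                  # exponential kernel
        return np.exp(-r / L)
    if nu == 1.5:
        return (1 + np.sqrt(3) * r / L) * np.exp(-np.sqrt(3) * r / L)
    if nu == np.inf:                               # Gaussian kernel
        return np.exp(-(r / L) ** 2)
    raise ValueError("general nu requires scipy.special.kv (modified Bessel function)")

r = np.linspace(0, 3, 100)
K12, K32, Kinf = matern(r, 0.5), matern(r, 1.5), matern(r, np.inf)
```

Note how the smoothness of the kernel increases with ν, which is exactly what the three reconstructions in the figures illustrate.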
  • 89. Fast covariance evaluations. Consider the Gaussian priors $s\,|\,\beta \sim N(X\beta, \Gamma_{\mathrm{prior}})$. Covariance matrices are dense, hence expensive to store and compute with; for example, a dense $10^6 \times 10^6$ matrix costs 7.45 TB. Typically, we only need to evaluate $\Gamma_{\mathrm{prior}}x$ and $\Gamma^{-1}_{\mathrm{prior}}x$. Standard approaches: FFT based methods, the Fast Multipole Method, hierarchical matrices, and Kronecker tensor product approximations. Compared to the naive $O(N^2)$: storage cost $O(N\log^\alpha N)$, matvec cost $O(N\log^\beta N)$.
  • 92. Toeplitz matrices. A Toeplitz matrix $T$ is an $N\times N$ matrix with entries $T_{ij} = t_{i-j}$, i.e. a matrix of the form
$$T = \begin{bmatrix}t_0 & t_{-1} & t_{-2} & \cdots & t_{-(N-1)}\\ t_1 & t_0 & t_{-1} & & \\ t_2 & t_1 & t_0 & \ddots & \\ \vdots & & \ddots & \ddots & \\ t_{N-1} & & \cdots & & t_0\end{bmatrix}.$$
Suppose the points are $x_i = i\times h$ and $y_j = j\times h$ for $i, j = 1, \ldots, N$. Stationary kernels give $Q_{ij} = \kappa(x_i, y_j) = \kappa((i-j)h)$; translation-invariant kernels give $Q_{ij} = \kappa(x_i, y_j) = \kappa(|i-j|h)$. We need to store only $O(N)$ entries, compared to $O(N^2)$ entries.
  • 93. FFT based methods. Toeplitz matrices arise from stationary covariance kernels on regular grids. Periodic embedding:
$$\begin{bmatrix}c & b & a\\ b & c & b\\ a & b & c\end{bmatrix} \Longrightarrow \begin{bmatrix}c & b & a & a & b\\ b & c & b & a & a\\ a & b & c & b & a\\ a & a & b & c & b\\ b & a & a & b & c\end{bmatrix}.$$
The circulant embedding is diagonalizable by the Fourier basis, giving matrix-vector products for Toeplitz matrices in $O(N\log N)$. Restricted to regular, equispaced grids.
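The embedding trick is a few lines of NumPy: extend the first column periodically, diagonalize the circulant with the FFT, and read off the first N entries of the product. A sketch for a symmetric Toeplitz covariance (the exponential samples are illustrative), checked against the dense matrix:

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_matvec(first_col, x):
    """Multiply the symmetric Toeplitz matrix T (T[i,j] = first_col[|i-j|]) by x in O(N log N)."""
    n = len(first_col)
    # Embed T in a circulant matrix of size 2n-2 by wrapping the column around.
    c = np.concatenate([first_col, first_col[-2:0:-1]])
    lam = np.fft.fft(c)                            # eigenvalues of the circulant
    xpad = np.concatenate([x, np.zeros(n - 2)])    # zero-pad to the embedded size
    return np.real(np.fft.ifft(lam * np.fft.fft(xpad))[:n])

# Check against the dense matrix on a small example.
col = np.exp(-np.arange(8) / 3.0)                  # exponential covariance samples
x = np.random.default_rng(2).standard_normal(8)
assert np.allclose(toeplitz_matvec(col, x), toeplitz(col) @ x)
```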
  • 94. H-matrix formulation: an intuitive explanation. Consider, for $x_i = y_i = (i-1)\frac{1}{N-1}$, $i = 1, \ldots, N$, the kernels
$$\kappa_\alpha(x, y) = \frac{1}{|x-y| + \alpha}, \quad \alpha > 0, \qquad \text{and} \qquad \kappa(x, y) = \exp(-|x-y|).$$
Figures 10 and 11: blockwise numerical ranks (tolerance $\epsilon = 10^{-6}$, $\alpha = 10^{-6}$, $N = M = 256$).
  • 96. Exponentially decaying singular values of off-diagonal blocks, for
$$\kappa_\alpha(x, y) = \frac{1}{|x-y| + \alpha}, \qquad \alpha > 0. \tag{1}$$
Figure 12: First 32 singular values of off-diagonal sub-blocks of the matrix corresponding to the non-overlapping segments (left) $[0, 0.5]\times[0.5, 1]$ and (right) $[0, 0.25]\times[0.75, 1.0]$. The decay of the singular values can be related to the smoothness of the kernel.
  • 97. Prof. SVD – Gene Golub. An image of Gene Golub and its rank-10, rank-20, and rank-100 approximations (figures).
  • 101. Hierarchical matrices (Hackbusch 2000; Grasedyck and Hackbusch 2003; Bebendorf 2008). Hierarchical separation of space; low-rank sub-blocks correspond to well separated clusters; mild restrictions on the types of permissible kernels. Figures: block partitioning at levels 0 through 3, with full-rank blocks near the diagonal and low-rank blocks elsewhere.
  • 106. Clustering and block clustering (figures), and the resulting H-matrix structure with full-rank and low-rank blocks (Hackbusch 2000; Grasedyck and Hackbusch 2003; Bebendorf 2008).
  • 109. Quasi-linear geostatistical approach. Maximum a posteriori estimate:
$$\arg\min_{s,\beta}\,\frac12\|y - h(s)\|^2_{\Gamma^{-1}_{\mathrm{noise}}} + \frac12\|s - X\beta\|^2_{\Gamma^{-1}_{\mathrm{prior}}}.$$
Algorithm 2 (quasi-linear geostatistical approach):
1: while not converged do
2: Solve the system of equations (preconditioned iterative solver developed in Saibaba and Kitanidis, WRR 2012)
$$\begin{bmatrix}J_k\Gamma_{\mathrm{prior}}J_k^T + \Gamma_{\mathrm{noise}} & J_k X\\ (J_k X)^T & 0\end{bmatrix}\begin{bmatrix}\xi_{k+1}\\ \beta_{k+1}\end{bmatrix} = \begin{bmatrix}y - h(s_k) + J_k s_k\\ 0\end{bmatrix},$$
where the Jacobian is $J_k = \left.\frac{\partial h}{\partial s}\right|_{s = s_k}$.
3: Update $s_{k+1} = X\beta_{k+1} + \Gamma_{\mathrm{prior}}J_k^T\xi_{k+1}$.
4: end while
  • 111. MAP estimate – quasi-linear inverse problems. At each step, linearize to get a local Gaussian approximation; solve a sequence of linear inverse problems (figures).
  • 115. Non-Gaussian priors. Gaussian random fields often produce smooth reconstructions, but we often need discontinuous reconstructions (facies detection, tumor location). Several possibilities: Total Variation regularization, the level set approach, Markov random fields, and wavelet based reconstructions. This is only scratching the surface; lots of techniques (and literature) are available.
  • 117. Total variation regularization. Total variation in 1D:
$$TV(f) = \sup \sum_{k=1}^{n-1}|f(x_{k+1}) - f(x_k)|,$$
a measure of the arc length of a curve. Gif: Wikipedia; figure: Kaipio et al., Statistical and computational inverse problems. Vol. 160. Springer Science & Business Media, 2006.
  • 118. Total Variation regularization. MAP estimate (penalize discontinuous changes):
$$\min_s\,\frac12\|y - h(s)\|^2_{\Gamma^{-1}_{\mathrm{noise}}} + \alpha\int_\Omega |\nabla s|\,ds, \qquad |\nabla s| \approx \sqrt{\nabla s\cdot\nabla s + \varepsilon}.$$
Figure 13: Inverse wave propagation problem; (left) cross-sections of inverted and target models, (right) surface model of the target. Akcelik, Biros and Ghattas, Supercomputing, ACM/IEEE 2002 Conference. IEEE, 2002.
  • 119. Level set approach.
$$s(x) = c_f(x)H(\phi(x)) + c_b(x)(1 - H(\phi(x))), \qquad H(x) = \frac12(1 + \operatorname{sign}(x)).$$
Figure 14: Image courtesy of Wikipedia. The level set method is topologically flexible, able to recover multiple connected components; the shape is evolved by minimizing an objective function.
  • 120. Bayesian level set approach. Level set function
$$s(x) = c_f(x)H(\phi(x)) + c_b(x)(1 - H(\phi(x))), \qquad H(x) = \frac12(1 + \operatorname{sign}(x)),$$
employing a Gaussian random field as the prior for $\phi(x)$. Groundwater flow:
$$-\nabla\cdot(\kappa\nabla u(x)) = f(x) \quad x \in \Omega, \qquad u = 0 \quad x \in \partial\Omega,$$
with the transformation $s = \log\kappa$. Iglesias et al., preprint arXiv:1504.00313.
  • 121. Outline (section divider: Data Assimilation).
  • 122. Tracking trajectories (figures).
  • 124. 360° panorama – Teliportme.
  • 125. 4D Var filtering. Consider the dynamical system
$$\frac{\partial v}{\partial t} = F(v) + \eta, \qquad v(x, 0) = v_0(x).$$
3D Var filtering:
$$J_3(v) \stackrel{\mathrm{def}}{=} \|y(T) - h(v(x; T))\|^2_{\Gamma^{-1}_{\mathrm{noise}}} + \frac12\|v_0(x) - v_0^*(x)\|^2_{\Gamma^{-1}_{\mathrm{prior}}}.$$
4D Var filtering:
$$J_4(v) \stackrel{\mathrm{def}}{=} \sum_{k=1}^{N_t}\|y(t_k) - h(v(x; t_k))\|^2_{\Gamma^{-1}_{\mathrm{noise}}} + \frac12\|v_0(x) - v_0^*(x)\|^2_{\Gamma^{-1}_{\mathrm{prior}}}.$$
Optimization problem:
$$\hat{v}_0 \stackrel{\mathrm{def}}{=} \arg\min_{v_0} J_k(v), \qquad k = 3, 4.$$
  • 126. Application: contaminant source identification. Transport equations:
$$\frac{\partial c}{\partial t} + v\cdot\nabla c = D\nabla^2 c, \qquad D\nabla c\cdot n = 0, \qquad c(x, 0) = c_0(x).$$
Estimate the initial conditions from measurements of the contaminant field. Akcelik, Volkan, et al. Proceedings of the 2005 ACM/IEEE Conference on Supercomputing. IEEE Computer Society, 2005.
  • 127. Linear dynamical system (diagram): states $s_{k-1}, s_k, s_{k+1}, \ldots$ evolve through the transition operator $F$ with system noise $u_k$; each state is observed through $H_k$ with measurement noise $v_k$, producing the data $y_{k-1}, y_k, y_{k+1}, \ldots$
  • 128. State evolution equations. Linear evolution equations, obtained by discretizing a PDE:
$$s_{k+1} = F_k s_k + u_k, \quad u_k \sim N(0, \Gamma_{\mathrm{prior}}), \qquad y_{k+1} = H_{k+1}s_{k+1} + v_k, \quad v_k \sim N(0, \Gamma_{\mathrm{noise}}).$$
Nonlinear evolution equations:
$$s_{k+1} = f(s_k) + u_k, \qquad y_{k+1} = h(s_{k+1}) + v_k,$$
which can be linearized (extended Kalman filter) or handled as is (ensemble filtering).
  • 130. Kalman filter. All variables are modeled as Gaussian random variables, completely specified by the mean and covariance matrix: the filter moves from the current state $N(\hat{s}_{k|k}, \Sigma_{k|k})$ through predict and update steps to the future state $N(\hat{s}_{k+1|k+1}, \Sigma_{k+1|k+1})$, given the transition matrix $F_k$, observation matrix $H_k$, system noise $w_k \sim N(0, \Gamma_{\mathrm{sys}})$ and measurement noise $v_k \sim N(0, \Gamma_{\mathrm{noise}})$. The Kalman filter provides a recursive way to update state knowledge and predictions.
  • 131. Standard implementation of Kalman Filter Predict ˆsk+1|k = ˆsk|k − Σk+1|k = Fk Σk|k FT k + Γprior O(N3 ) Update Sk = Hk Σk+1|k HT k + Γnoise O(nmN2 ) Kk = Σk+1|k HT S−1 k O(nmN2 ) ˆsk+1|k+1 = ˆsk+1|k + Kk (yk − Hkˆsk+1|k ) O(nmN) Σk+1|k+1 = (Σ−1 k+1|k + HT k Γ−1 noiseHk )−1 O(nmN2 + N3 ) N: number of unknowns and nm: number of measurements Storage cost O(N2 ) and computational cost O(N3 ) This cost is prohibitively expensive for large-scale implementation Bakhos, Kitanidis, Saibaba Large-Scale Inverse Problems June 28, 2015 89 / 114
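The recursion translates almost line for line into dense linear algebra; a minimal NumPy sketch, assuming all operators fit in memory (which is exactly what fails at large scale):

```python
# One Kalman filter cycle, mirroring the predict/update equations above.
import numpy as np

def kalman_step(s, Sigma, F, H, Gamma_prior, Gamma_noise, y):
    # Predict: propagate mean and covariance through the linear dynamics
    s_pred = F @ s
    Sigma_pred = F @ Sigma @ F.T + Gamma_prior     # O(N^3)
    # Update: innovation covariance, Kalman gain, corrected mean and covariance
    S = H @ Sigma_pred @ H.T + Gamma_noise
    K = Sigma_pred @ H.T @ np.linalg.inv(S)
    s_new = s_pred + K @ (y - H @ s_pred)
    Sigma_new = Sigma_pred - K @ H @ Sigma_pred    # equal to the information form on the slide
    return s_new, Sigma_new
```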
• 134. Ensemble Kalman Filter
The EnKF is a Monte Carlo approximation of the Kalman filter.
Ensemble of state variables: X = [x_1, …, x_N]
Realizations in the ensemble are propagated individually
Can reuse legacy codes; easily parallelizable
To update the filter, compute statistics based on the ensemble
Unlike the Kalman filter, can be readily applied to nonlinear problems
The ensemble mean and covariance can be computed as
E[X] = (1/N) Σ_{k=1}^N x_k,  C = (1/(N − 1)) A Aᵀ
where A is the mean-subtracted ensemble.
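A minimal sketch of a stochastic (perturbed-observation) EnKF update built from these ensemble statistics; forming the full covariance C explicitly is for exposition only and would be avoided at scale:

```python
import numpy as np

def enkf_update(X, H, Gamma_noise, y, seed=0):
    rng = np.random.default_rng(seed)
    n, N = X.shape
    A = X - X.mean(axis=1, keepdims=True)      # mean-subtracted ensemble
    C = A @ A.T / (N - 1)                      # sample covariance C = A A^T / (N - 1)
    S = H @ C @ H.T + Gamma_noise
    K = C @ H.T @ np.linalg.inv(S)             # gain from ensemble statistics
    # Perturbed observations: each member assimilates its own noisy copy of the data
    Y = y[:, None] + rng.multivariate_normal(np.zeros(y.size), Gamma_noise, N).T
    return X + K @ (Y - H @ X)
```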
• 135. Application: Real-time CO2 monitoring
Sources fire a pulse; receivers measure the time delay.
Measurements: travel time of each source–receiver pair.
6 sources, 48 receivers = 288 measurements
Assumption: rays travel along straight-line paths
t_sr = ∫_source^receiver (1/v(x)) dℓ + noise,  where 1/v(x) is the slowness
Model problem for: reflection seismology, CT scanning, etc.
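A rough sketch of how one row of the straight-ray travel-time operator could be assembled on a pixel grid. This uses point sampling along the ray; a careful code would use exact ray-cell intersection lengths, and all names here are illustrative:

```python
import numpy as np

def ray_row(src, rec, n, samples=1000):
    # Accumulate approximate ray length in each pixel of an n x n unit-pixel grid.
    src, rec = np.asarray(src, float), np.asarray(rec, float)
    pts = src + np.linspace(0.0, 1.0, samples)[:, None] * (rec - src)
    seg = np.linalg.norm(rec - src) / samples      # length per sample point
    row = np.zeros(n * n)
    for x, y in pts:
        i, j = min(int(y), n - 1), min(int(x), n - 1)
        row[i * n + j] += seg
    return row    # travel time t_sr ~ ray_row(src, rec, n) @ slowness_vector
```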
• 137. Random Walk Forecast model
Evolution of CO2 can be modeled as
s_{k+1} = F_k s_k + u_k,  u_k ∼ N(0, Γ_prior)
y_{k+1} = H_{k+1} s_{k+1} + v_{k+1},  v_{k+1} ∼ N(0, Γ_noise)
Random walk assumption: F_k = I
Useful modeling assumption when measurements can be acquired rapidly
Applications: Electrical Impedance Tomography, Electrical Resistivity Tomography, Seismic Travel-time Tomography
Treat Γ_prior using the hierarchical matrix approach
A.K. Saibaba, E.L. Miller, P.K. Kitanidis. A Fast Kalman Filter for time-lapse Electrical Resistivity Tomography. Proceedings of IGARSS 2014, Montreal.
• 138. Results: Kalman Filter
Figure 15: True and estimated CO2-induced changes in slowness (reciprocal of velocity) between two wells, for the 234 × 219 grid, at times 3, 30, and 60 hours.
• 139. Comparison of costs of different algorithms
Grid size 59 × 55
Γ_prior is constructed using the kernel κ(r) = θ exp(−√r)
Γ_noise = σ²I with σ² = 10⁻⁴
Saibaba, Arvind K., et al. Inverse Problems 31.1 (2015): 015009.
• 140. Error in the reconstruction
Γ_prior is constructed using the kernel κ(r) = θ exp(−√r)
Γ_noise = σ²I with σ² = 10⁻⁴
Saibaba, Arvind K., et al. Inverse Problems 31.1 (2015): 015009.
• 141. Conditional Realizations
Figure 16: Conditional realizations of CO2-induced changes in slowness (reciprocal of velocity) between two wells, for the 59 × 55 grid, at times 3, 30, and 60 hours.
Saibaba, Arvind K., et al. Inverse Problems 31.1 (2015): 015009.
• 142. Outline
1 Introduction
2 Linear Inverse Problems
3 Geostatistical Approach
  Bayes' theorem
  Coin toss example
  Covariance modeling
  Non-Gaussian priors
4 Data Assimilation
  Application: CO2 monitoring
5 Uncertainty quantification
  MCMC
6 Concluding remarks
• 143. Inverse problems: Bayesian viewpoint
Consider the measurement equation
y = h(s) + v,  v ∼ N(0, Γ_noise)
Notation:
y: observations or measurements (given)
s: model parameters, which we want to estimate
h(s): parameter-to-observation map (given)
v: additive i.i.d. Gaussian noise
Using Bayes' rule, the posterior pdf is
p(s|y) ∝ p(y|s) · p(s)
where p(y|s) is the likelihood (data misfit term) and p(s) is the prior.
The posterior distribution is the Bayesian solution to the inverse problem.
• 145. Bayesian Inference: Quantifying uncertainty
Maximum a posteriori (MAP) estimate: arg max_s p(s|y)
Conditional mean: s_CM = E_{s|y}[s] = ∫ s p(s|y) ds
Credibility intervals: find sets C(y) such that p[s ∈ C(y) | y] = 1 − α
Sample realizations from the posterior
• 146. Linear Inverse Problems
Recall that the posterior distribution is given by
p(s|y) ∝ exp( −(1/2) ‖y − Hs‖²_{Γ_noise⁻¹} − (1/2) ‖s − µ‖²_{Γ_prior⁻¹} )
Posterior distribution: s|y ∼ N(s_MAP, Γ_post), with
Γ_post = (Γ_prior⁻¹ + Hᵀ Γ_noise⁻¹ H)⁻¹ = Γ_prior − Γ_prior Hᵀ (H Γ_prior Hᵀ + Γ_noise)⁻¹ H Γ_prior
s_MAP = Γ_post (Hᵀ Γ_noise⁻¹ y + Γ_prior⁻¹ µ)
Observe that Γ_post ⪯ Γ_prior.
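For small problems these formulas can be evaluated directly; a minimal dense sketch (at scale one would use the low-rank Woodbury form on the second line of the slide rather than explicit inverses):

```python
import numpy as np

def gaussian_posterior(H, Gamma_prior, Gamma_noise, y, mu):
    # Information-form posterior covariance and MAP estimate from the slide
    A = np.linalg.inv(Gamma_prior) + H.T @ np.linalg.inv(Gamma_noise) @ H
    Gamma_post = np.linalg.inv(A)
    s_map = Gamma_post @ (H.T @ np.linalg.solve(Gamma_noise, y)
                          + np.linalg.solve(Gamma_prior, mu))
    return s_map, Gamma_post
```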
• 147. Application: CO2 monitoring
Variance = diag(Γ_post) = diag( (Γ_prior⁻¹ + Hᵀ Γ_noise⁻¹ H)⁻¹ )
A.K. Saibaba, Ambikasaran, Li, Darve, Kitanidis. Oil and Gas Science and Technology 67.5 (2012): 857.
• 148. Nonlinear Inverse Problems
Linearize the forward operator (at the MAP point):
h(s) = h(s_MAP) + (∂h/∂s)(s − s_MAP) + O(‖s − s_MAP‖²₂)
Groundwater flow equations:
−∇ · (κ(x) ∇φ) = Q δ(x − x_source),  x ∈ Ω
φ = 0,  x ∈ ∂Ω_D
Inverse problem: estimate the hydraulic conductivity κ from discrete measurements of φ.
To make the problem well-posed, work with s = log κ.
Saibaba, Arvind K., et al. Advances in Water Resources 82 (2015): 124-138.
• 149. Nonlinear Inverse Problems
Figure 17: (left) Reconstruction of log conductivity; (right) posterior variance.
Saibaba, Arvind K., et al. Advances in Water Resources 82 (2015): 124-138.
• 150. Monte Carlo sampling
Suppose X has density p(x) and we are interested in f(X):
E[f(X)] = ∫ f(x) p(x) dx = lim_{N→∞} (1/N) Σ_{k=1}^N f(x_k)
Approximate using sample averages: E[f(X)] ≈ (1/N) Σ_{k=1}^N f(x_k)
Here p(x) is understood to be the posterior distribution.
If samples are easy to generate, the procedure is straightforward; use the Central Limit Theorem to generate confidence intervals.
However, generating samples from p(x) may not be straightforward.
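A tiny self-contained example of the sample-average estimator with a CLT-based interval, assuming direct sampling is possible (here X ~ N(0, 1) and f(x) = x², so the true value is 1):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)                # i.i.d. samples from p(x)
f = x**2
est = f.mean()                                  # sample-average estimate of E[f(X)]
half = 1.96 * f.std(ddof=1) / np.sqrt(f.size)   # 95% CLT half-width
print(f"E[f(X)] ~ {est:.4f} +/- {half:.4f}")    # expect roughly 1.0000
```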
• 151. Acceptance-rejection sampling
Approximate the distribution by an easier distribution:
(points under curve / points generated) × box area → ∫_A^B f(x) dx as n → ∞
From the PyMC2 website: http://pymc-devs.github.io/pymc/theory.html
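The boxed-area identity on this slide is hit-or-miss Monte Carlo integration; a short sketch, assuming M bounds f from above on [A, B] (function name and test integrand are illustrative):

```python
import numpy as np

def hit_or_miss(f, A, B, M, n=1_000_000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(A, B, n)
    y = rng.uniform(0.0, M, n)
    frac_under = np.mean(y < f(x))        # points under curve / points generated
    return frac_under * (B - A) * M       # times the box area

print(hit_or_miss(np.sin, 0.0, np.pi, 1.0))   # integral of sin on [0, pi] is 2
```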
• 152. Markov chains
Consider a sequence of random variables X_1, X_2, … with
p(X_{t+1} = x_{t+1} | X_t = x_t, …, X_1 = x_1) = p(X_{t+1} = x_{t+1} | X_t = x_t)
The future depends only on the present, not the past!
Under some conditions, the chain has a stationary distribution.
• 153. Implementation
Create a Markov chain whose stationary distribution is p(x):
1 Draw a proposal y from q(y | x_n)
2 Calculate the acceptance ratio α(x_n, y) = min{1, p(y) q(x_n | y) / (p(x_n) q(y | x_n))}
3 Accept/reject: x_{n+1} = y with probability α(x_n, y), and x_{n+1} = x_n with probability 1 − α(x_n, y)
If q(x, y) = q(y, x), then α(x_n, y) = min{1, p(y)/p(x_n)}.
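A minimal random-walk Metropolis sketch of these three steps; with a symmetric Gaussian proposal the q terms cancel, giving the simplified ratio on the last line of the slide (names and step size are illustrative assumptions):

```python
import numpy as np

def metropolis(log_p, x0, n_steps, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, float)
    lp = log_p(x)
    chain = []
    for _ in range(n_steps):
        y = x + step * rng.standard_normal(x.shape)   # symmetric proposal q(y|x)
        lp_y = log_p(y)
        if np.log(rng.uniform()) < lp_y - lp:         # accept w.p. min(1, p(y)/p(x))
            x, lp = y, lp_y
        chain.append(x.copy())
    return np.array(chain)

# Example: sample a 2D standard Gaussian from its unnormalized log-density
samples = metropolis(lambda z: -0.5 * np.sum(z**2), np.zeros(2), 5000)
```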
• 154. MCMC demo
Demo at: http://chifeng.scripts.mit.edu/stuff/mcmc-demo/
• 155. Properties of MCMC sampling
Ergodic theorem for expectations:
lim_{N→∞} (1/N) Σ_{k=1}^N f(x_k) = ∫_Ω f(x) p(x) dx
However, the samples x_k are no longer i.i.d., so the estimator has higher variance than i.i.d. Monte Carlo sampling.
Popular sampling strategies:
Metropolis-Hastings
Gibbs samplers
Hamiltonian MCMC
Delayed Rejection Adaptive Metropolis (DRAM)
Metropolis-adjusted Langevin Algorithm (MALA)
• 156. Curse of dimensionality
What is the probability of hitting a hypersphere inscribed in a hypercube?
In dimension n = 100, the probability is < 2 × 10⁻⁷⁰.
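This probability is the ratio of the inscribed ball's volume to the cube's; a quick log-space check of the n = 100 figure:

```python
# Fraction of the unit hypercube occupied by the inscribed ball of radius 1/2:
# V = pi^(n/2) * (1/2)^n / Gamma(n/2 + 1), evaluated in log space to avoid underflow.
import math

def inscribed_ball_fraction(n):
    log_v = (n / 2) * math.log(math.pi) - n * math.log(2.0) - math.lgamma(n / 2 + 1)
    return math.exp(log_v)

print(inscribed_ball_fraction(100))   # ~1.9e-70, consistent with the slide's bound
```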
• 157. Stochastic Newton MCMC
Martin, James, et al. SISC 34.3 (2012): A1460-A1487.
• 158. Outline
1 Introduction
2 Linear Inverse Problems
3 Geostatistical Approach
  Bayes' theorem
  Coin toss example
  Covariance modeling
  Non-Gaussian priors
4 Data Assimilation
  Application: CO2 monitoring
5 Uncertainty quantification
  MCMC
6 Concluding remarks
• 159. Opportunities
Theoretical and numerical:
"Big data" meets "Big Models"
Model reduction
Posterior uncertainty quantification
Applications:
New application areas, new technologies that generate inverse problems
Combining multiple modalities to make better predictions
Software that transcends application areas
• 160. Resources for learning Inverse Problems
Books:
Hansen, Per Christian. Discrete Inverse Problems: Insight and Algorithms. Vol. 7. SIAM, 2010.
Hansen, Per Christian. Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion. Vol. 4. SIAM, 1998.
Hansen, Per Christian, James G. Nagy, and Dianne P. O'Leary. Deblurring Images: Matrices, Spectra, and Filtering. Vol. 3. SIAM, 2006.
Tarantola, Albert. Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM, 2005.
Kaipio, Jari, and Erkki Somersalo. Statistical and Computational Inverse Problems. Vol. 160. Springer Science & Business Media, 2006.
Vogel, Curtis R. Computational Methods for Inverse Problems. Vol. 23. SIAM, 2002.
Kitanidis, P.K. Introduction to Geostatistics: Applications in Hydrogeology. Cambridge University Press, 1997.
Cressie, Noel. Statistics for Spatial Data. John Wiley & Sons, 2015.
• 161. Resources for learning Inverse Problems
Software packages:
Regularization Tools (MATLAB): http://www2.compute.dtu.dk/~pcha/Regutools/regutools.html
PEST: http://www.pesthomepage.org/
bgaPEST: http://pubs.usgs.gov/tm/07/c09/
MUQ: https://bitbucket.org/mituq/muq