Although we are often told not to do it, statistical scientists frequently predict the value of outcome measures of physical systems at input points far from the observed data. Since predictions are made in new regions of the input space, statistical theory cannot dictate optimal rules for measures of uncertainty associated with extrapolation. This talk presents several solutions based on simple principles. The solutions are illustrated via the analysis of data generated by dropping spheres of varying radii and masses from different heights. Some of the techniques apply to more complex physical systems. The efficacy of these techniques is demonstrated using data (experimental and simulated) of the level of complexity physical scientists frequently face. Scientists should tailor these techniques to fit the needs of a particular application.
MUMS: Transition & SPUQ Workshop - Some Strategies to Quantify Uncertainty for Extrapolation in Physical Systems - Aaron Danielson, May 14, 2019
4. DON’T EXTRAPOLATE?
• Extrapolation: estimate the value of a variable y* at input points x* beyond the range of observed data (x, y), using the associations between x and y learned from the observed data.
• But these associations may change across the input space. The physical system may operate differently at different points in the input space.
• Then the extrapolations are misleading.
• How do we quantify uncertainty in these cases?
5. COMPUTER EXPERIMENTS
• Scientific applications use mathematical models to
describe physical systems.
• Growth in computer power enables the study of
complex phenomena that otherwise might be too
time consuming or expensive to observe.
• To understand how inputs impact the system,
scientists vary simulation inputs and observe the
response. Conduct a computer experiment.
6. SOME GOALS FOR UQ
• Model Calibration:
• Use data from computer experiments.
• Use data from field experiments.
• Model the shared signal.
• Estimate parameters that govern the system with uncertainty.
• Model Validation:
• Demonstrate the model fits the data.
• Account for discrepancy between theory and empirical observations.
• Prediction (and … Extrapolation):
• Build a predictive model for the system with estimates of uncertainty.
• Hope the predictive model is “better” than using the code alone or field data alone.
7. ONE CONVENTIONAL APPROACH
(NO DISCREPANCY)
• Empirical data: (xf, yf)
• Simulation data: (xs, ts, ys)
• ts is needed to run the model.
• But, its value is not known in the field.
• Calibration Parameter: θ
• Simulator: η(x, t)
• Observation Error: ϵf ∼ N(0, 1/λf)
• View computer code as a single realization of a Gaussian Process (Sacks, Welch, Mitchell, & Wynn, 1989).
Ys(xs) = η(xs, ts)
Yf(xf) = η(xf, θ) + ϵf
[Figure: simulated output and field observations of y plotted against x on [0, 1].]
8. CONVENTIONAL APPROACH
(WITH DISCREPANCY)
• The discrepancy δ(xf) measures systematic deviation between theory and empirical observation.
• Conventional models posit δ(xf) as a realization of a Gaussian process.
• This is the Kennedy-O'Hagan (KOH) framework (Kennedy & O'Hagan, 2001) and (Higdon et al., 2004).
Ys(xs) = η(xs, ts)
Yf(xf) = η(xf, θ) + δ(xf) + ϵf
9. GAUSSIAN PROCESSES
(INFORMAL DESCRIPTIONS)
• A prior on the space of functions for the data.
• Regression that models observed data primarily via a covariance function rather than the mean function.
• A Gaussian process model is essentially a normal regression model with correlated errors.
• An N-dimensional version of kriging from spatial statistics, where spatial coordinates are replaced by a general input space.
• Can interpolate data and provide statistical uncertainty at un-sampled inputs.
• But computationally intensive when the number of observations is large. Disadvantages associated with GP discrepancy are discussed throughout this talk.
10. MORE ON GAUSSIAN PROCESSES
• Simple example. Ignore the calibration parameter.
• Model η(x) = μ + w(x) where:
• Mean zero: 𝔼[w(x)] = 0
• Variance: Var(w(x)) = σ²
• Correlation between data points: Corr(w(x), w(x′)) = ∏_{i=1}^{d} exp{−βi(xi − x′i)²}
• When β is large, the function wiggles around the data more extremely.
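The correlation function above is easy to sketch in code; the input points and β values below are illustrative choices, not values fit to the talk's data.

```python
import math

def sq_exp_corr(x1, x2, betas):
    """Product squared-exponential correlation between two input points."""
    return math.exp(-sum(b * (u - v) ** 2 for b, u, v in zip(betas, x1, x2)))

# Correlation decays with distance between inputs; a larger beta makes it
# decay faster, so the fitted function wiggles more between observations.
near = sq_exp_corr([0.1], [0.2], betas=[1.0])    # nearby points, high correlation
far = sq_exp_corr([0.1], [2.0], betas=[1.0])     # distant points, low correlation
rough = sq_exp_corr([0.1], [0.2], betas=[50.0])  # same distance, larger beta
```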
11. A SIMPLE EXAMPLE: BALLS
FALLING THROUGH THE AIR
• Drop balls from 3 different
locations.
• Record time and vertical
displacement.
• Use two sets as training data.
• Reserve last as a validation set.
• Look out below!
12. A SIMPLE EXAMPLE: BALLS
FALLING THROUGH THE AIR
• Simulation does not require complicated computation.
• Can fit using the standard model with or without a discrepancy.
Ys(xs) = (1/2) ts xs²
ts = acceleration due to gravity
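As a minimal sketch, the free-fall simulator is a one-line function; the names eta, x, and t below mirror the slides' notation.

```python
def eta(x, t):
    """Free-fall simulator: vertical displacement after fall time x
    under gravitational acceleration t (the calibration parameter)."""
    return 0.5 * t * x ** 2

# With t = 9.81 m/s^2, a 2-second drop covers 0.5 * 9.81 * 4 = 19.62 m.
y = eta(2.0, 9.81)
```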
13. BAYESIAN FRAMEWORK
• Combine likelihood L(ω | ys, xs, ts, yf, xf) and prior π(ω) to get the posterior distribution for ω.
• Prior specifications for parameters depend on the discrepancy and observation error used in the model.
• To predict, sample ω and draw from the distribution of y* given ω and the data (x*, ys, xs, ts, yf, xf).
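A toy Metropolis sampler illustrates the sample-then-predict idea on a stripped-down version of the problem: here ω is just the gravity parameter θ, the field data are synthetic, and the flat prior and Gaussian noise level are assumptions for illustration, not the talk's full model.

```python
import math, random

# Toy Metropolis sketch (assumed setup): infer gravity theta from synthetic
# drop data y = (1/2)*theta*x^2 + noise, with N(0, 0.5^2) observation
# error and a flat prior on theta.
rng = random.Random(1)
xs = [0.5, 1.0, 1.5, 2.0]                      # fall times
ys = [0.5 * 9.81 * x ** 2 + rng.gauss(0, 0.5) for x in xs]

def log_post(theta, sd=0.5):
    # log posterior up to a constant: Gaussian likelihood, flat prior
    return -sum((y - 0.5 * theta * x ** 2) ** 2 for x, y in zip(xs, ys)) / (2 * sd ** 2)

theta, samples = 5.0, []
for _ in range(5000):
    prop = theta + rng.gauss(0, 0.5)           # random-walk proposal
    if math.log(rng.random()) < log_post(prop) - log_post(theta):
        theta = prop
    samples.append(theta)

post_mean = sum(samples[1000:]) / len(samples[1000:])  # discard burn-in
```

For prediction, each retained draw of θ would then generate a draw of y* at a new fall time x*, propagating parameter uncertainty into the prediction.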
20. LIMITATIONS OF THE
CONVENTIONAL APPROACH
• The predictive variance is bounded:
var(ynew | xnew) ≤ (λη + λδ) / (λη λδ)
• λη and λδ are precision parameters associated with the covariance functions for η(⋅, ⋅) and δ(⋅).
21. DISCREPANCY HAS BOUNDED
VARIANCE
• The variance of the discrepancy does not depend on x:
var(δ(x*) | x*, ω) = (1/λδ) exp{−∑_{k=1}^{p} βδk |x*k − x*k|²} = 1/λδ.
• Compare this to OLS:
var(Y(x*) | x*, ω) = s² (1 + (x*)⊤(X⊤X)⁻¹x*).
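The contrast can be checked numerically; the precision λδ, noise level s², and design points below are arbitrary illustrative choices.

```python
# The GP discrepancy variance at any single point x* is the constant
# 1/lambda_delta; the OLS predictive variance grows with x*.
lambda_delta = 4.0
gp_var = 1.0 / lambda_delta                    # same at every x*

xs = [0.0, 0.25, 0.5, 0.75, 1.0]               # observed inputs, design [1, x]
n, sx, sxx = len(xs), sum(xs), sum(x * x for x in xs)
det = n * sxx - sx * sx                        # determinant of X^T X (2x2)

def ols_pred_var(x_star, s2=1.0):
    # s^2 * (1 + (1, x*) (X^T X)^{-1} (1, x*)^T) via the explicit 2x2 inverse
    q = (sxx - 2.0 * sx * x_star + n * x_star * x_star) / det
    return s2 * (1.0 + q)
```

Moving x* from 0.5 to 10 keeps the GP discrepancy variance fixed while the OLS predictive variance keeps growing.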
22. FUNKY BEHAVIOR FAR FROM THE
EMPIRICAL DATA
• The variance of the discrepancy converges to an upper bound as the distance between an input point and the empirical observations increases.
• The covariance terms converge to zero. In the limit, predicted points are independent of the empirical data … and the discrepancy:
lim_{∥x*−x∥→∞} cov(x*, x) = lim_{∥x*−x∥→∞} (1/λδ) exp{−∑_{k=1}^{p} βδk |x*k − xk|²} = 0.
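A quick numerical check of the decaying covariance, with illustrative βδ and λδ values:

```python
import math

def delta_cov(x_star, x, beta_delta, lambda_delta=1.0):
    """Covariance between the discrepancy at x* and at an observed input x."""
    return (1.0 / lambda_delta) * math.exp(
        -sum(b * (u - v) ** 2 for b, u, v in zip(beta_delta, x_star, x)))

# As x* moves away from the observed x, the covariance decays toward zero:
# far-field predictions become effectively independent of the field data.
close = delta_cov([1.0], [0.0], beta_delta=[1.0])
distant = delta_cov([100.0], [0.0], beta_delta=[1.0])
```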
23. WHAT TO DO?
• Linear models do not have this problem.
Uncertainty associated with the parameter values
leads to unbounded variance over the input space.
• Observation errors can be structured to enforce
unbounded prediction variance.
• Probability distributions with assumptions similar
to the constant coefficient of variation property
(CCV).
24. MODELS WITH LINEAR TERMS
• Setting δ(xf) = xf⊤βδ ensures the variance of the discrepancy is unbounded.
• And the association between the discrepancy and extrapolations does not disappear.
• Other choices with these properties include polynomials and splines.
• Can also use a model for η(⋅, ⋅) with unbounded variance.
26. OBSERVATION ERROR
• Take a cue from the literature on heteroskedasticity: changes in the variance as we traverse the input space.
• Model error as ϵ(x, τ) ∼ N(0, (g(x, τ) + 1)σ²) where g(x, τ) is a function of the input data x and a parameter τ.
• Some examples:
• g(xf, τ) = d(xf, x0)^τ where x0 is a measure of central tendency such as the spatial median, or the distance from k nearest neighbors.
• g(xf, τ) = ∥xf∥^τ where ∥⋅∥ is the Euclidean norm.
• g(xf, τ) = ∑_p |xf,p|^τp
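One of the listed choices, the Euclidean-norm form of g, can be sketched directly; σ² = 1 and the input points below are illustrative.

```python
import math

def g_norm(xf, tau):
    """g(xf, tau) = ||xf||^tau, a variance-inflation function of the input."""
    return math.sqrt(sum(v * v for v in xf)) ** tau

def error_var(xf, tau, sigma2=1.0):
    """Observation-error variance (g(x, tau) + 1) * sigma^2."""
    return (g_norm(xf, tau) + 1.0) * sigma2

# Variance grows as the input moves away from the origin, so prediction
# uncertainty keeps increasing in the extrapolation region.
at_origin = error_var([0.0, 0.0], tau=2.0)     # just sigma^2
far_out = error_var([3.0, 4.0], tau=2.0)       # norm 5 -> (25 + 1) * sigma^2
```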
28. CONSTANT COEFFICIENT OF
VARIATION (CCV)
• Classic assumption: κ = σ/μ
• Written differently: σ² ∝ μ²
• Some common distributions satisfy this: the lognormal, gamma, and exponential distributions.
• Can be applied to regression settings (Amemiya, 1973).
• The parameter κ can be estimated or set a priori. For example, κ² = 0.1 implies the variance is 10% of the mean squared.
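For the gamma distribution the CCV property is easy to verify: with shape a and scale s, the mean is a·s and the standard deviation is √a·s, so the coefficient of variation 1/√a does not depend on the mean. The shape and scale values below are illustrative.

```python
import math

def gamma_cv(shape, scale):
    """Coefficient of variation sigma/mu for a Gamma(shape, scale) variable."""
    mean = shape * scale               # mu = a * s
    sd = math.sqrt(shape) * scale      # sigma = sqrt(a) * s
    return sd / mean                   # = 1 / sqrt(a), free of the mean

# Holding the shape fixed, the CV is identical at every mean level,
# so sigma^2 is proportional to mu^2 as the CCV assumption requires.
cv_small = gamma_cv(shape=4.0, scale=0.5)      # mean 2
cv_large = gamma_cv(shape=4.0, scale=50.0)     # mean 200
```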
29. CCV ASSUMPTIONS
• Suppose a subject matter expert believes
• var(δ(xf)) ∝ 𝔼[δ(xf)]², or
• var(Yf(xf)) ∝ 𝔼[Yf(xf)]².
• Model the discrepancy with a CCV distribution.
• Assume δ(xf) ∼ N(μ, κμ²) or δ(xf) ∼ N(μ, exp{μ²/κ1}/κ2) where
• μ = xf⊤βδ.
• Assume Yf(xf) is lognormal, gamma, or exponential if it has support on the positive reals.
31. MORE ON CCV MODELS
GAMMA EXAMPLE
• Model Yf(xf) with 𝔼[η(xi, θ)] = (1/2)θxi² as:
p(yf(xf) | xf, θ, κ) ∝ yf^{(1/(2κ))θxf² − 1} exp{−yf/κ}
• Conditional mean: (1/2)θxf²
• Conditional variance: κ(1/2)θxf²
• No Gaussian process in this model.
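Reading the kernel above as a gamma density with shape θxf²/(2κ) and scale κ reproduces the stated moments; the θ, κ, and x values below are illustrative.

```python
import random

# Gamma reading of the kernel: shape = theta*x^2/(2*kappa), scale = kappa,
# so the mean is (1/2)*theta*x^2 and the variance kappa*(1/2)*theta*x^2,
# i.e. the variance grows with the mean in CCV style.
theta, kappa, x = 9.81, 0.05, 2.0
shape = theta * x ** 2 / (2.0 * kappa)
mean = shape * kappa                   # (1/2) * theta * x^2 = 19.62
var = shape * kappa ** 2               # kappa * mean

rng = random.Random(0)
draws = [rng.gammavariate(shape, kappa) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)  # should sit near 19.62
```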
33. COMPARISON OF STRATEGIES

| Simulator | Discrepancy | Observation Error | Discrepancy Var Bounded | Prediction Var Bounded | Discrepancy Dissipates |
| η(xf; θ)  | δ(xf)       | ϵ                 | Yes                     | Yes                    | Yes                    |
| η(xf; θ)  | δ(xf)       | ϵ(x, τ)           | Yes                     | No                     | Yes                    |
| η(xf; θ)  | xf⊤βδ       | ϵ                 | No                      | No                     | No                     |
| η(xf; θ)  | xf⊤βδ       | ϵ(x, τ)           | No                      | No                     | No                     |
| η(xf; θ)  | CCV         | ϵ                 | No                      | No                     | No                     |
34. DISCREPANCY WITH VARIABLES
EXTERNAL TO MODEL
• Besides fall times x and vertical distances y, we
observe additional information z, the radius and
density of each ball.
• Expand the models to incorporate this.
• Can use different functional forms:
Yf(xf) = η(x, θ) + δ(x, z) + ϵ
Yf(xf) = η(x, θ) + δ(x) + ϵ(z)
δ(x, z) = (x, z)⊤βδ
δ(x, z) = x⊤βδ + δ(z)
δ(x, z) = z⊤βδ
36. HOW MUCH DOES THIS HELP?
(GP DISCREPANCY BASED ON Z)
Yf(xf ) = η(x, θ) + δ(z) + ϵ(x)
37. HOW TO CHOOSE?
• The scientific application provides a natural choice.
• For the ball drop, we expect a positive bias. Variance should increase with time. Extra variables are important to the physical process.
• Elicitation of expert opinion.
• No universal answers; just strategies for consideration.
38. REFERENCES
• Amemiya, Takeshi. "Regression analysis when the variance of the dependent variable is proportional to the square
of its expectation." Journal of the American Statistical Association 68.344 (1973): 928-934.
• Fang, Zhide, and Douglas P. Wiens. "Robust extrapolation designs and weights for biased regression models with
heteroscedastic errors." Canadian Journal of Statistics 27.4 (1999): 751-770.
• Gelman, Andrew, et al. Bayesian data analysis. Chapman and Hall/CRC, 2013.
• Higdon, Dave, et al. "Combining field data and computer simulations for calibration and prediction." SIAM Journal
on Scientific Computing 26.2 (2004): 448-466.
• Higdon, Dave, et al. "Computer model calibration using high-dimensional output." Journal of the American
Statistical Association 103.482 (2008): 570-583.
• Kennedy, Marc C., and Anthony O'Hagan. "Bayesian calibration of computer models." Journal of the Royal
Statistical Society: Series B (Statistical Methodology) 63.3 (2001): 425-464.
• Khan, Rasul A. "A remark on estimating the mean of a normal distribution with known coefficient of variation."
Statistics 49.3 (2015): 705-710.
• Sacks, Jerome, et al. "Design and analysis of computer experiments." Statistical science (1989): 409-423.
• Zabarankin, Michael and Stan Uryasev. Statistical decision problems. Springer-Verlag New York, 2016.
41. PSEUDOLIKELIHOOD
APPROACH TO CCV
• Model empirical observations as:
p(yi(xi) | xi, y−i, x−i, θ, κ) ∝ yi^{𝔼[η(xi, θ) | y−i, x−i]/κ − 1} exp{−yi/κ}
• And, model the simulations as a Gaussian Process.
• Matrix Gamma distribution as an alternative.