Lectures 8-9: Gaussian Processes and
Statistical Emulators
Outline
• Motivation for emulation of simulators
• Design for simulator runs
• Some emulation strategies
• Gaussian process (GaSP) emulators
• Fitting GaSP emulators and the Robust R-package
• The differences between use of Gaussian processes in Spatial Statistics
and UQ
• Emulation in more complicated situations
– Functional emulators via Kronecker products
– Functional emulators via basis decomposition
– Functional emulators via parallel partial emulation
– Coupling emulators
Crucial assumption: I will be assuming the “black box” scenario, where all
we can do is run the simulator at various inputs; i.e., we do not have access
to the internal code.
Motivations for emulation (approximation)
of complex computer models
Simulators (complex computer models of processes) often take hours to
weeks for a single run. One often needs fast simulator approximations
(emulators, surrogate models, meta models, response surface
approximations, . . .) for Uncertainty Quantification analyses such as
• prediction of the simulator at unobserved values of the inputs
• optimization of the simulator over input values
• inverse problems (learning unknown simulator parameters from data)
• propagating uncertainty in inputs through the simulator
• data assimilation (predicting reality with a combination of simulator
output and observational data)
• assessing simulator bias and detecting suspect simulator components
• interfacing systems of simulators (or systems of simulators/stat models)
Statistical Design of Runs of the Simulator
(needed to create the emulator) McKay et al. (1979); Sacks et al. (1989); Welch
et al. (1992); Bates et al. (1996); Lim et al. (2002); Santner et al. (2003)
Notation: For these lectures, x ∈ X will denote the d-dimensional vector of
inputs to the simulator (computer model). These could be initial
conditions, control inputs, computer model parameters, ...
The simulator output is denoted by $y^M(x)$.
Goal for design: Choose m points $D = \{x_1, \ldots, x_m\}$ at which the simulator is to be evaluated, yielding $y^D = (y^M(x_1), \ldots, y^M(x_m))'$. From these, the emulator (approximation) to the simulator will be constructed.
– Folklore says that m should be at least 10d although many more runs
are often needed (but often not available).
Criterion: In general, should be problem specific. General purpose criteria
involve finding “space-filling” designs.
Most common space filling design: Maximin Latin Hypercube Design
• A Latin Hypercube Design (LHD) is a design in a grid whereby each sample
is the only one in each axis-aligned hyperplane containing it.
• A maximin LHD is an LHD that maximizes $\min_{i \neq j} \delta(x_i, x_j)$, where $\delta(\cdot, \cdot)$ is a distance on X.
Figure 1: Left: 47 point maximin LHD; d = 6; 2d-projection;
Right: 47 point “0-Correlation” LHD; same 2d projection.
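A design of this sort is easy to generate in R; a minimal sketch using the lhs package (an assumption here; the slides do not say how the design in Figure 1 was produced):

```r
# Minimal sketch (not the code behind Figure 1): a 47-point maximin
# Latin hypercube in d = 6 dimensions on [0,1]^6.
library(lhs)                      # provides maximinLHS()

set.seed(1)
D <- maximinLHS(n = 47, k = 6)    # rows = design points, columns = inputs

min(dist(D))                      # the maximin criterion: smallest pairwise distance

# A 2d-projection, as in Figure 1:
plot(D[, 1], D[, 2], xlab = "input 1", ylab = "input 2")
```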
TITAN2D simulating pyroclastic flow on Montserrat (Bayarri et al. (2009);
Dalbey et al. (2012))
Inputs: Initial conditions: x1 = flow volume (V ); x2 = flow direction (ϕ);
Model parameters: x3 = basal friction (δbed); x4 = internal friction (δint).
Background of the application: The simulator, TITAN2D, for given inputs
V , ϕ, δbed, and δint, yields a description of the pyroclastic flow over a large
space-time grid of points. Each run of TITAN2D takes two hours.
Of primary interest is the maximum (over time) flow height, $y^M(V, \varphi, \delta_{bed}, \delta_{int})$, at k spatial locations over the island.
• Flow heights $y^M(V, \varphi, \delta_{bed}, \delta_{int}) > 1$ m are deemed to be catastrophic.
The analysis begins
• by choosing a Latin hypercube design to select m = 2048 design points in the feasible input region $X = [10^5\,\mathrm{m}^3, 10^{9.5}\,\mathrm{m}^3] \times [0, 2\pi] \times [5, 18] \times [15, 35]$;
• running TITAN2D at these preliminary points, yielding the 'data'
$$y^D = \begin{pmatrix} y^M(x_1) \\ y^M(x_2) \\ \vdots \\ y^M(x_m) \end{pmatrix} = \begin{pmatrix} (y^M_1(x_1), y^M_2(x_1), \ldots, y^M_k(x_1)) \\ (y^M_1(x_2), y^M_2(x_2), \ldots, y^M_k(x_2)) \\ \vdots \\ (y^M_1(x_m), y^M_2(x_m), \ldots, y^M_k(x_m)) \end{pmatrix};$$
• constructing the emulator from $y^D$ (in general, a matrix of size $2048 \times 10^9$).
Adaptive design: (Aslett et al. (1998); Lam and Notz (2008); Ranjan et al.
(2008); Cumming and Goldstein (2009); Gramacy and Lee (2009); Loeppky et al.
(2010); Spiller et al. (2014)).
[Figure: standard error (meters) of the emulator vs. initiation angle (radians).]
Figure 2: Standard error of the emulator of a function of interest in the volcano
problem. Red: original Latin hypercube design of 256 points. Blue: standard error
with 9 additional points chosen to maximize the resulting information.
Some Emulation Strategies
Recall the goal: We want to develop an emulator (approximation) of the
computer model that allows us to predict the computer model outcome
yM
(x∗
) at new inputs x∗
. This can be done by
• Regression: fine if the simulator output is smooth enough and
– one knows good basis functions to regress upon,
– or can find good basis functions by, e.g., PCA (or POD or EOF).
∗ This often works, but is often suboptimal (e.g., for TITAN2D).
• Polynomial chaos: statisticians question its effectiveness as a general
tool for emulation (it fails for TITAN2D and many other processes).
• Gaussian stochastic processes (GaSP’s), of a variety of types, typically
what are called separable GaSP’s.
• Combinations of the above (e.g., GaSP's on coefficients of basis expansions; Bayarri et al. (2007); Bowman and Woods (2016)).
An Aside: the Multivariate Normal Distribution
If Y = (Y1, Y2, . . . , Ym) has a multivariate normal distribution with mean
µ = (µ1, µ2, . . . , µm) and m × m positive definite covariance matrix Σ
having entries σi,j (notation: Y ∼ MV N(µ, Σ)), then
• each $Y_i$ is marginally normally distributed with
– mean $E[Y_i] = \mu_i$,
– variance $E[(Y_i - \mu_i)^2] = \sigma_{i,i}$;
• $\sigma_{i,j} = E[(Y_i - \mu_i)(Y_j - \mu_j)]$ is called the covariance between $Y_i$ and $Y_j$;
• $c_{i,j} = \sigma_{i,j}/\sqrt{\sigma_{i,i}\sigma_{j,j}}$ is called the correlation between $Y_i$ and $Y_j$.
GaSP Emulators
Model the (real-valued, for now) simulator output $y^M(x)$ as an unknown function via a Gaussian stochastic process:
$$y^M(\cdot) \sim \mathrm{GaSP}\big(\mu(\cdot),\; \sigma^2 c(\cdot, \cdot)\big),$$
with mean function $\mu(x)$, variance $\sigma^2$, and correlation function $c(\cdot, \cdot)$ if, for any inputs $\{x_1, \ldots, x_m\}$ from X,
$$(y^M(x_1), \ldots, y^M(x_m)) \sim MVN\big((\mu(x_1), \ldots, \mu(x_m)),\; \sigma^2 C\big), \qquad (1)$$
where C is the correlation matrix with (i, j) element $c_{i,j} = c(x_i, x_j)$.
• This is a random distribution on functions of x.
• All we really need to know is the induced MVN distribution in (1) of the function evaluated at a finite set of points.
It is common to choose the following forms for the mean and correlations:
• Model the unknown mean function via regression, as
$$\mu(x) = \Psi(x)\,\theta \equiv \sum_{i=1}^{l}\psi_i(x)\theta_i,$$
$\Psi(\cdot) = (\psi_1(\cdot), \ldots, \psi_l(\cdot))$ being a vector of specified basis functions and the $\theta_i$ being unknown (e.g., $\mu(V, \varphi, \delta_{bed}, \delta_{int}) = \theta_1 + \theta_2 V$ for TITAN2D).
• As the correlation function arising from the d-dimensional x, utilize the separable power exponential family
$$c(x, x^*) = \prod_{j=1}^{d}\exp\{-(|x_j - x_j^*|/\gamma_j)^{\alpha_j}\};$$
– $\gamma_j > 0$ determines how fast the correlation decays to 0;
– $\alpha_j \in (0, 2]$ determines continuity, differentiability, . . .
∗ We set the $\alpha_j = 1.9$ ($\alpha_j = 2$ can have numerical problems);
– the product form greatly speeds computation and allows stochastic inputs to be handled easily (see the sketch below).
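A minimal sketch of this separable correlation (illustrative only; the helper name pow_exp_corr is introduced here and reused in later sketches):

```r
# Sketch: separable power exponential correlation matrix for a design X
# (m x d matrix), range parameters gamma (length d), smoothness alpha (length d).
pow_exp_corr <- function(X, gamma, alpha) {
  m <- nrow(X)
  C <- matrix(1, m, m)
  for (j in 1:ncol(X)) {
    Dj <- abs(outer(X[, j], X[, j], "-"))      # |x_ij - x_kj| for all pairs
    C  <- C * exp(-(Dj / gamma[j])^alpha[j])   # product over the d inputs
  }
  C
}

X <- matrix(runif(20), ncol = 2)               # toy 10-point design, d = 2
C <- pow_exp_corr(X, gamma = c(0.5, 0.5), alpha = c(1.9, 1.9))
```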
Example: Suppose d = 2 and m = 2, so we have two designed inputs $x_1 = (x_{11}, x_{12})$ and $x_2 = (x_{21}, x_{22})$. The correlation matrix for $(y^M(x_1), y^M(x_2))$ is then
$$C = \begin{pmatrix} 1 & e^{-[|x_{11}-x_{21}|/\gamma_1]^{\alpha_1}}\,e^{-[|x_{12}-x_{22}|/\gamma_2]^{\alpha_2}} \\ e^{-[|x_{11}-x_{21}|/\gamma_1]^{\alpha_1}}\,e^{-[|x_{12}-x_{22}|/\gamma_2]^{\alpha_2}} & 1 \end{pmatrix}.$$
• As, say, $\gamma_1 \to \infty$,
$$C \to \begin{pmatrix} 1 & e^{-[|x_{12}-x_{22}|/\gamma_2]^{\alpha_2}} \\ e^{-[|x_{12}-x_{22}|/\gamma_2]^{\alpha_2}} & 1 \end{pmatrix}.$$
Typically the emulator will then be constant in the first coordinate.
• If any $\gamma_i \to 0$,
$$C \to \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
which gives a terrible emulator.
• After obtaining the computer runs $y^D$ at the design points, the traditional strategy was to estimate the GaSP parameters by maximum likelihood using this data, and then use the standard Kriging formulas for the emulator predictive mean and variance.
– Maximum likelihood (least squares fit) is not a good idea here; too often the range parameters end up being 0 or ∞.
Figure 3: GaSP mean and 90% confidence bands fit by maximum likelihood to a
damped sine wave (the red curve) for m=10 (left) and m=9 (right) points.
Advantages of the GaSP Emulator
• It is an interpolator of the simulator values $y^M(x_i)$ at the design inputs $x_i$.
• It provides an assessment of the accuracy of the approximation, which
is quite reliable (in a conservative sense) when it is not crazy.
• The separable form properly allows very different fits to the various
inputs.
• The analysis stays within probability (Bayesian) calculus.
Disadvantages of the GaSP Emulator
• Maximum likelihood is very unreliable (fixable, as we will see).
• It requires inversion of an m × m matrix, requiring special techniques if
m is large (lots of research on this).
• It is a stationary process and, hence, not always suitable as an
emulator.
Improving on maximum likelihood (least squares)
estimation of the unknown GaSP parameters (Lopes (2011);
Ranjan and Karsten (2011); Roustant et al. (2012); Gu et al. (2016, 2018)).
Step 1. Deal with the crucial parameters $(\theta, \sigma^2)$ via a fully Bayesian analysis, using the objective prior $\pi(\theta, \sigma^2) = 1/\sigma^2$.
• Alas, dealing with the correlation parameters by full Bayesian methods is computationally intractable.
Step 2. Estimate $\gamma = (\gamma_1, \ldots, \gamma_d)$ as the mode $\hat\gamma$ of its marginal posterior distribution, arising
• by integrating out $\theta$ and $\sigma^2$ with respect to their objective prior;
• multiplying the resulting integrated likelihood by the reference prior for $\gamma$ (the most popular objective Bayesian prior).
Step 3. The resulting GaSP emulator of $y^M(x^*)$ at new input $x^*$ is a t-process with mean function and covariance function in closed form.
The GaSP emulator $\tilde y^M(x^*)$ of $y^M(x^*)$ at new input $x^*$ is a t-distribution
$$\tilde y^M(x^*) \sim T\big(\hat\mu(x^*),\; \hat\sigma^2 V(x^*),\; m - l\big),$$
where
$$\hat\mu(x^*) = \Psi(x^*)\hat\theta + C(x^*)C^{-1}(y^D - \Psi\hat\theta),$$
$$\hat\sigma^2 = \frac{1}{m - l}\Big(y^{D\prime} C^{-1} y^D - \hat\theta'(\Psi' C^{-1}\Psi)\hat\theta\Big),$$
$$V(x^*) = 1 - C(x^*)C^{-1}C(x^*)' + \big(\Psi(x^*) - C(x^*)C^{-1}\Psi\big)(\Psi' C^{-1}\Psi)^{-1}\big(\Psi(x^*) - C(x^*)C^{-1}\Psi\big)',$$
with $\hat\theta = (\Psi' C^{-1}\Psi)^{-1}\Psi' C^{-1} y^D$, $\Psi = (\Psi_j(x_i))$, $\Psi(x^*) = (\Psi_1(x^*), \ldots, \Psi_l(x^*))$, and $C(x^*) = (c(x_1, x^*), \ldots, c(x_m, x^*))$.
• Not the usual Kriging formula, because of the use of the posterior for $(\theta, \sigma^2)$.
• This is an interpolator of the simulator values $y^M(x_i)$ at the design inputs $x_i$.
• It provides an assessment of the accuracy of the approximation, also incorporating the uncertainty arising from estimating $\theta$ and $\sigma^2$.
• The only potential computational challenge is computing $C^{-1}$ if m is very large. (A minimal numerical sketch of these formulas follows.)
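A direct transcription of these formulas (a sketch only: γ is fixed rather than estimated, and the mean basis is the constant $\Psi(x) = 1$, so l = 1; it reuses the pow_exp_corr helper from earlier):

```r
# Sketch: GaSP predictive mean, variance, and degrees of freedom at x.star,
# for fixed correlation parameters gamma and constant mean basis (l = 1).
gasp_predict <- function(X, y, x.star, gamma, alpha = 1.9) {
  m <- length(y)
  C <- pow_exp_corr(X, gamma, rep(alpha, ncol(X)))
  c.star <- apply(X, 1, function(xi)
    prod(exp(-(abs(xi - x.star) / gamma)^alpha)))       # C(x*)
  Ci  <- solve(C)
  Psi <- matrix(1, m, 1)                                # constant basis
  theta.hat  <- solve(t(Psi) %*% Ci %*% Psi, t(Psi) %*% Ci %*% y)
  resid      <- y - Psi %*% theta.hat
  sigma2.hat <- c(t(resid) %*% Ci %*% resid) / (m - 1)  # equals the slide's form
  mu <- c(theta.hat) + c(c.star %*% Ci %*% resid)
  V  <- 1 - c(c.star %*% Ci %*% c.star) +
        c(1 - c.star %*% Ci %*% Psi)^2 / c(t(Psi) %*% Ci %*% Psi)
  list(mean = mu, var = sigma2.hat * V, df = m - 1)
}
```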
Figure 4: Mean of the emulator of TITAN2D, predicting ‘maximum flow
height’ at a location, as a function of flow volume and angle, for fixed δbed =
15 and δint = 27. Left: Plymouth, Right: Bramble Airport. Black points:
max-height simulator outputs at design points.
Details of Steps 1 and 3
• Note that
$$(y^M(x_1), \ldots, y^M(x_m), y^M(x^*)) \sim MVN\big((\mu(x_1), \ldots, \mu(x_m), \mu(x^*)),\; \sigma^2 C^*\big),$$
where
$$C^* = \begin{pmatrix} C & C(x^*)' \\ C(x^*) & 1 \end{pmatrix}.$$
• Multiplying this by $\pi(\theta, \sigma^2) = 1/\sigma^2$ gives the joint density of $(y^M(x_1), \ldots, y^M(x_m), y^M(x^*), \theta, \sigma^2)$.
• Compute the conditional density of $(y^M(x^*), \theta, \sigma^2)$ given $(y^M(x_1), \ldots, y^M(x_m))$.
• Integrate out $\theta$ and $\sigma^2$ to obtain the posterior predictive density $\tilde y^M(x^*)$ of the target $y^M(x^*)$.
Details of Versions of Step 2
Version 1. Finding the Marginal Maximum Likelihood Estimate (MMLE) of the correlation parameters $\gamma = (\gamma_1, \ldots, \gamma_d)$:
• Starting with the likelihood $L(\theta, \sigma^2, \gamma)$ arising from
$$(y^M(x_1), \ldots, y^M(x_m)) \sim MVN\big((\mu(x_1), \ldots, \mu(x_m)),\; \sigma^2 C\big),$$
integrate out $\theta$ and $\sigma^2$, using the objective prior $\pi(\theta, \sigma^2) = 1/\sigma^2$, obtaining the marginal likelihood for $\gamma$
$$L(\gamma) = \int L(\theta, \sigma^2, \gamma)\,\frac{1}{\sigma^2}\,d\theta\,d\sigma^2 \;\propto\; |C(\gamma)|^{-\frac12}\,|X' C(\gamma)^{-1} X|^{-\frac12}\,\big(S^2(\gamma)\big)^{-\left(\frac{n-p}{2}\right)},$$
where $S^2(\gamma) = (Y - X\hat\theta)' C(\gamma)^{-1}(Y - X\hat\theta)$ is the residual sum of squares and $\hat\theta = (X' C(\gamma)^{-1} X)^{-1} X' C(\gamma)^{-1} Y$ is the least squares estimator of θ.
• The MMLE is the value of γ that maximizes $L(\gamma)$.
Definition 0.1 (An Aside: Robust Estimation.) Estimation of the correlation parameters in the GaSP is called robust if the following two situations do NOT happen (even approximately), where $\hat C$ is the estimated correlation matrix:
(i) $\hat C = 1_n 1_n^T$,
(ii) $\hat C = I_n$.
When $\hat C \approx 1_n 1_n^T$, the correlation matrix is almost singular, leading to very large computational errors in the GaSP predictive mean.
When $\hat C \approx I_n$, the GaSP predictive mean will degenerate to the fitted mean plus impulse functions, as shown in the next figures.
[Figure: two panels, y vs. x on [0, 1], with emulator means overlaid.] Example of the problem when $\hat C \approx I_n$: emulation of the function $y = 3\sin(5\pi x)x + \cos(7\pi x)$, graphed as the black solid curves (overlapping the green curves in the left panel). The n = 12 input function values are the black circles. The left panel is for α = 1.9 and the right panel for α = 1, for the power exponential correlation function.
• The blue curves give the emulator mean from the MLE approach;
• the red curves (overlapping with green on the left) give the emulator mean from the MMLE approach;
• the green curves give the emulator mean from the posterior mode approach.
Here are three common ways of parameterizing the range parameters in the power exponential correlation function:
$$c_{\beta_l}(|x_{il} - x_{jl}|) = \exp\{-\beta_l |x_{il} - x_{jl}|^{\alpha_l}\},$$
$$c_{\gamma_l}(|x_{il} - x_{jl}|) = \exp\{-(|x_{il} - x_{jl}|/\gamma_l)^{\alpha_l}\},$$
$$c_{\xi_l}(|x_{il} - x_{jl}|) = \exp\{-\exp(\xi_l)|x_{il} - x_{jl}|^{\alpha_l}\},$$
for any $l = 1, \ldots, d$.
Lemma 0.1 Robustness is lacking in either of the following two cases.
Case 1. If $\hat\beta_l = 0$ for all $1 \le l \le d$ (or $\hat\gamma_l = \infty$ or $\hat\xi_l = -\infty$ in the other parameterizations), then $\hat C = 1_m 1_m^T$.
Case 2. If any $\hat\beta_l = \infty$ (equivalent to $\hat\gamma_l = 0$ or $\hat\xi_l = \infty$), then $\hat C = I_m$.
Version 2. Finding the Reference Posterior Mode (RPM) of the correlation parameters:
The standard objective prior (the reference prior) for β (Paulo (2005)) is $\pi^R(\beta) \propto |I^\star(\beta)|^{1/2}$ where, with l being the dimension of θ and d the dimension of β,
$$I^\star(\beta) = \begin{pmatrix} (m - l) & \mathrm{tr}\,W_1 & \mathrm{tr}\,W_2 & \cdots & \mathrm{tr}\,W_d \\ & \mathrm{tr}\,W_1^2 & \mathrm{tr}\,W_1 W_2 & \cdots & \mathrm{tr}\,W_1 W_d \\ & & \ddots & & \vdots \\ & & & & \mathrm{tr}\,W_d^2 \end{pmatrix}$$
and
$$W_k = \frac{\partial C}{\partial \beta_k}\,C(\beta)^{-1}\big[I_n - X(X' C(\beta)^{-1} X)^{-1} X' C(\beta)^{-1}\big].$$
The posterior mode is then found by maximizing
• $L(\psi^{-1}(\beta))\,\pi^R(\beta)$ in the β parameterization, where $\beta = \psi(\gamma)$;
• $L(\psi^{-1}(\exp(\xi)))\,\pi^R(\exp(\xi))\exp(\sum_l \xi_l)$ in the ξ parameterization;
• $L(\gamma)\,\pi^R(\psi(\gamma))\,\psi'(\gamma)$ in the γ parameterization.
Tail behaviors, as some $\beta_l \to \infty$ or as $\beta_l \to 0$ for all l, in the two cases where $1_n$ is or is not a column of X:

| | $1_n$ in X: some $\beta_l \to \infty$ | $1_n$ in X: $\beta_l \to 0$ all l | $1_n$ not in X: some $\beta_l \to \infty$ | $1_n$ not in X: $\beta_l \to 0$ all l |
| Profile Lik | $O(1)$ | $O(\gamma_{(1)}^{-\alpha/2})$ | $O(1)$ | $O(\gamma_{(1)}^{-\alpha/2})$ |
| Marginal Lik | $O(1)$ | $O(1)$ | $O(1)$ | $O(\gamma_{(1)}^{-\alpha/2})$ |
| Post β, p = 1 | $O(e^{-\beta C})$ | $O(1)$ | $O(\beta^{1/2} e^{-\beta C})$ | $O(\beta^{-1/2})$ |
| Post β, p ≥ 2 | $O(\prod_{l\in E} e^{-\beta_l C_l})$ | $O(\beta_{(p)}^{-(p-1)})$ | $O((\prod_{l\in E}\beta_l)^{1/2}\prod_{l=1}^{p} e^{-\beta_l C_l})$ | $O(\beta_{(p)}^{-(p-1/2)})$ |
| Post γ, p = 1 | $O(e^{-C/\gamma^\alpha}/\gamma^{\alpha+1})$ | $O(\gamma^{-\alpha-1})$ | $O(e^{-C/\gamma^\alpha}/\gamma^{\alpha/2+1})$ | $O(\gamma^{-\alpha/2-1})$ |
| Post γ, p ≥ 2 | $O(\prod_{l\in E} e^{-C_l/\gamma_l^\alpha}/\gamma_l^{\alpha+1})$ | $O(\prod_{l=1}^{p}\gamma_l^{-\alpha-1}\,\gamma_{(1)}^{(1-p)\alpha})$ | $O(\prod_{l\in E} e^{-C_l/\gamma_l^\alpha}/\gamma_l^{\alpha/2+1})$ | $O(\prod_{l=1}^{p}\gamma_l^{-\alpha-1}\,\gamma_{(1)}^{(1/2-p)\alpha})$ |
| Post ξ, p = 1 | $O(e^{-e^{\xi}C + \xi})$ | $O(e^{\xi})$ | $O(e^{-e^{\xi}C + \frac32\xi})$ | $O(e^{\xi/2})$ |
| Post ξ, p ≥ 2 | $O(\prod_{l\in E} e^{-e^{\xi_l}C_l + \xi_l})$ | $O(e^{\sum_{l=1}^{p-1}\xi_l}/e^{(p-2)\xi_{(p)}})$ | $O(\prod_{l\in E} e^{-e^{\xi_l}C_l + \frac32\xi_l})$ | $O(e^{\sum_{l=1}^{p-1}\xi_l}/e^{(p-1/2)\xi_{(p)}})$ |

Tail behaviors of the profile likelihood (inserting the MLEs of θ and $\sigma^2$), the marginal likelihood, and the posterior distributions for different parameterizations of the power exponential correlation function, using the reference prior.
• The $O(1)$ entries are the cases where the tail behavior is constant, so that there is danger of non-robustness (the MLE could be at ∞).
• The entries that diverge in the tail are the non-robust cases where the posterior goes to infinity.
• Thus use the posterior mode for either the γ or ξ parameterizations.
Another Improvement
One of the most frequently used Matérn correlation functions is
$$c_l(d_l) = \left(1 + \frac{\sqrt{5}\,d_l}{\gamma_l} + \frac{5 d_l^2}{3\gamma_l^2}\right)\exp\left(-\frac{\sqrt{5}\,d_l}{\gamma_l}\right),$$
where $d_l$ stands for any of the $|x_{il} - x_{jl}|$. Denoting $\tilde d_l = d_l/\gamma_l$, the following properties can be established.
• When $\tilde d_l \to 0$, $c_l(\tilde d_l) \approx 1 - C\tilde d_l^2$ with C > 0 being a constant. This thus behaves similarly to $\exp(-\tilde d_l^2) \approx 1 - \tilde d_l^2$, which corresponds to the power exponential correlation with $\alpha_l = 2$, and thus has similar smoothness near design points.
• When $\tilde d_l \to \infty$, the dominant part of $c_l(\tilde d_l)$ is $\exp(-\sqrt{5}\,\tilde d_l)$, which matches the power exponential correlation with $\alpha_l = 1$. Thus the Matérn correlation prevents the correlation from decreasing as quickly with distance as the Gaussian correlation does. This can be of benefit in emulation, since some inputs may have almost no effect on the computer model, corresponding to near-constant correlations for distant inputs. (A sketch of this correlation function appears below.)
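For concreteness, a direct transcription of this Matérn (5/2) correlation as an R function:

```r
# Sketch: Matern 5/2 correlation as a function of distance d and range gamma.
matern52 <- function(d, gamma) {
  s <- sqrt(5) * d / gamma
  (1 + s + s^2 / 3) * exp(-s)   # equals 1 + sqrt(5)d/gamma + 5d^2/(3 gamma^2), times exp(-sqrt(5)d/gamma)
}

# Visualize the slower-than-Gaussian decay:
curve(matern52(x, gamma = 1), from = 0, to = 5,
      xlab = "distance d", ylab = "correlation")
```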
We test the following five functions:
i. 1-dimensional Higdon function, $Y = \sin(2\pi X/10) + 0.2\sin(2\pi X/2.5)$, where $X \in [0, 10]$.
ii. 2-dimensional Lim function, $Y = \frac{1}{6}\big[(30 + 5X_1\sin(5X_1))(4 + \exp(-5X_2)) - 100\big] + \epsilon$, where $X_i \in [0, 1]$ for i = 1, 2.
iii. 3-dimensional Pepelyshev function, $Y = 4(X_1 - 2 + 8X_2 - 8X_2^2)^2 + (3 - 4X_2)^2 + 16\sqrt{X_3 + 1}\,(2X_3 - 1)^2$, where $X_i \in [0, 1]$ for i = 1, 2, 3.
iv. 4-dimensional Park function, $Y = \frac{2}{3}\exp(X_1 + X_2) - X_4\sin(X_3) + X_3$, where $X_i \in [0, 1)$ for i = 1, 2, 3, 4.
v. 5-dimensional Friedman function, $Y = 10\sin(\pi X_1 X_2) + 20(X_3 - 0.5)^2 + 10X_4 + 5X_5$, where $X_i \in [0, 1]$ for i = 1, 2, 3, 4, 5.
                     Robust GaSP ξ   Robust GaSP γ   MLE      DiceKriging
1-dim Higdon         .00011          .00012          .00013   .00013
2-dim Lim            .0064           .0080           .021     .0083
3-dim Pepelyshev     .083            .15             3.5      .79
4-dim Park           .00011          .00011          .033     .00063
5-dim Friedman       .026            .038            4.7      .44
Table 1: Average MSE of the four estimation procedures for the five experimental functions. The sample size is n = 20 for the Higdon function and n = 10p for the others. Designs are generated by maximin LHD.
This suggests that the best default is to use the posterior mode in the ξ parameterization, with the Matérn correlation function.
[Figure: four scatter plots, MSE difference vs. design number.] Difference of MSE between the MLE GaSP and the robust GaSP (ξ parameterization) for each of N = 500 designs, for the Lim function (upper left), Pepelyshev function (upper right), Park function (lower left), and Friedman function (lower right).
The Jointly Robust Prior
Evaluation of the reference prior (especially computation of the needed derivatives) and, hence, determination of the posterior mode, can be somewhat costly if p is large.
An approximation to the reference prior that has the same tail behavior in terms of robustness is
$$\pi^{JR}(\beta) = \left(\sum_{l=1}^{p} C_l\beta_l\right)^{0.2}\exp\left(-b\sum_{l=1}^{p} C_l\beta_l\right),$$
where $b = n^{-1/p}(a + p)$ and $C_l$ equals the mean of $|x^D_{il} - x^D_{jl}|$, for $1 \le i, j \le n$, $i \neq j$.
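A direct transcription of this prior as an R function (a sketch, up to the normalizing constant; it assumes a = 0.2, matching the exponent displayed above):

```r
# Sketch: the jointly robust prior pi^JR(beta), up to a normalizing constant.
# beta: inverse-range parameters; Cl: mean absolute input differences per
# dimension; n: number of design points; a = 0.2 assumed from the exponent above.
jr_prior <- function(beta, Cl, n, a = 0.2) {
  p <- length(beta)
  b <- n^(-1 / p) * (a + p)
  t <- sum(Cl * beta)
  t^a * exp(-b * t)
}
```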
Software
All the above methodology has been implemented in
RobustGaSP
Robust Gaussian Stochastic Process Emulation in R
by Mengyang Gu and Jesus Palomo
• There are many choices of the correlation function (Matérn is the default).
• Choice between the reference prior and its jointly robust approximation
(the approximation is the default).
• Inclusion of a nugget is possible, with the resulting prior changed
appropriately.
• There is also the capability of identifying and removing ‘inert’ inputs.
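A minimal usage sketch (rgasp and predict are the package's actual interface; the toy one-dimensional data here are made up for illustration):

```r
# Minimal RobustGaSP sketch on a toy one-dimensional function.
library(RobustGaSP)

set.seed(1)
X <- matrix(seq(0, 1, length.out = 12), ncol = 1)   # design
y <- 3 * sin(5 * pi * X) * X + cos(7 * pi * X)      # simulator output

fit <- rgasp(design = X, response = y)              # robust GaSP fit
                                                    # (posterior mode, JR prior by default)
Xnew <- matrix(seq(0, 1, length.out = 200), ncol = 1)
pred <- predict(fit, Xnew)   # list with $mean, $lower95, $upper95, $sd
```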
Differences between use of Gaussian processes in
Spatial Statistics and UQ
• Spatial statistics typically has only d = 2 or d = 3.
• Typically there are many fewer ‘observations’ in UQ, exacerbated by
the larger d.
– This makes estimation of the range parameters γ much more
difficult than in spatial statistics.
– But, luckily, the UQ design points are typically spread out, making
the α correlation parameters less important, compared with their
importance in spatial statistics.
• Spatial processes are often (though not always) smoother than UQ
processes.
• Instead of the product correlation structure, spatial statistics often uses correlations such as
$$c(x, y) = e^{-|x - y|^{\alpha}/\gamma}.$$
Emulation in more complicated situations
• Functional emulators via Kronecker products
• Functional emulators via basis decomposition
• Functional emulators via parallel partial emulation
• Coupling emulators
Functional emulators via Kronecker products
Example: A Vehicle Crash Model. Collision of a vehicle with a barrier
is implemented as a non-linear dynamic analysis code using a finite element
representation of the vehicle. The focus is on velocity changes of the
driver’s head, as in the following 30mph crash.
There are d = 2 inputs to the simulator: crash barrier type B and crash
vehicle velocity v.
Obvious approach – Discretize: Sample the m output functions from
the simulator (arising from the design on input space) at a discrete number
nT of time points and use t as just another input to the emulator.
• Only feasible if the functions are fairly regular.
• There are now $m \times n_T$ inputs, so computing $C^{-1}$ might be untenable.
– But Kronecker products come to the rescue.
Example: Suppose m = 2 and the original design inputs were $x_1$ and $x_2$. Also suppose we discretize at $t_1$ and $t_2$. Then there are four modified inputs $\{(x_1, t_1), (x_1, t_2), (x_2, t_1), (x_2, t_2)\}$ with correlation matrix (assuming the product exponential form and setting the correlation parameters to 1 for simplicity)
$$\begin{pmatrix}
1 & e^{-|t_1-t_2|} & e^{-|x_1-x_2|} & e^{-|x_1-x_2|}e^{-|t_1-t_2|} \\
e^{-|t_1-t_2|} & 1 & e^{-|x_1-x_2|}e^{-|t_1-t_2|} & e^{-|x_1-x_2|} \\
e^{-|x_1-x_2|} & e^{-|x_1-x_2|}e^{-|t_1-t_2|} & 1 & e^{-|t_1-t_2|} \\
e^{-|x_1-x_2|}e^{-|t_1-t_2|} & e^{-|x_1-x_2|} & e^{-|t_1-t_2|} & 1
\end{pmatrix}
= \begin{pmatrix} 1 \times A & e^{-|x_1-x_2|} \times A \\ e^{-|x_1-x_2|} \times A & 1 \times A \end{pmatrix}
= \begin{pmatrix} 1 & e^{-|x_1-x_2|} \\ e^{-|x_1-x_2|} & 1 \end{pmatrix} \otimes A,
\qquad A = \begin{pmatrix} 1 & e^{-|t_1-t_2|} \\ e^{-|t_1-t_2|} & 1 \end{pmatrix},$$
where ⊗ denotes the Kronecker product of the two matrices.
In general, if $C_D$ is the correlation matrix arising from the original designed input to the GaSP, and $C_T$ is the correlation matrix that arises from the discretization of time, then using the combined input results in the overall correlation matrix
$$C = C_D \otimes C_T.$$
The wonderful thing about Kronecker products is
$$C^{-1} = (C_D \otimes C_T)^{-1} = C_D^{-1} \otimes C_T^{-1}, \qquad |C| = |C_D \otimes C_T| = |C_D|^{n_T}\,|C_T|^{m}.$$
(A numerical check of these identities appears below.)
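These identities are easy to verify numerically; a sketch with small exponential correlation matrices:

```r
# Sketch: numerical check of the Kronecker inverse/determinant identities.
set.seed(1)
make_corr <- function(n) {                 # small exponential correlation matrix
  pts <- sort(runif(n))
  exp(-abs(outer(pts, pts, "-")))
}
CD <- make_corr(3)   # m = 3 design points
CT <- make_corr(4)   # nT = 4 time points

C <- kronecker(CD, CT)
max(abs(solve(C) - kronecker(solve(CD), solve(CT))))  # ~ 0: inverse identity
det(C) - det(CD)^4 * det(CT)^3                        # ~ 0: |CD|^{nT} |CT|^{m}
```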
Functional emulators via basis decomposition
Example: Consider a vehicle being driven over a road with two major
potholes.
– $x = (x_1, \ldots, x_7)$ is the vector of key vehicle characteristics;
– $y^R(x; t)$ is the time-history curve of resulting forces.
A finite element PDE computer model of the vehicle being driven over the road
– depends on $x = (x_1, \ldots, x_7)$ and unknown calibration parameters $u = (u_1, u_2)$;
– yields the time-history force curve $y^M(x, u; t)$.
Parameter                            Type (label)       Uncertainty
Damping 1 (force dissipation)        Calibration (u1)   15%
Damping 2 (force dissipation)        Calibration (u2)   15%
Bushing Stiffness (Voided)           Unmeasured (x1)    15%
Bushing Stiffness (Non-Voided)       Unmeasured (x2)    10%
Front rebound travel until Contact   Unmeasured (x3)    5%
Front rebound bumper stiffness       Unmeasured (x4)    8%
Sprung Mass                          Unmeasured (x5)    5%
Unsprung Mass                        Unmeasured (x6)    12%
Body Pitch Inertia                   Unmeasured (x7)    12%
Table 2: I/U Map: Vehicle characteristics ('unmeasured') and model calibration inputs, and their prior uncertainty ranges.
Field Data: Seven runs of a given test vehicle over the same road containing two potholes. Denote the r-th field time-history curve by $y^F_r(x^*; t)$, r = 1, …, 7, where $x^* = (x^*_1, \ldots, x^*_7)$ refers to the unknown vehicle characteristics of the given test vehicle.
Model Data: The computer model of the vehicle was 'run over the potholes' at 65 input values of $z = (x, u) = (x_1, \ldots, x_7, u_1, u_2)$; let $z_r = (x_r, u_r)$, r = 1, …, 65, denote the corresponding parameter vectors, which were chosen by a Latin-hypercube design over the input uncertainty ranges.
Let $y^M(z_r; t)$ denote the r-th computer model time-history curve, r = 1, 2, …, 65.
[Figure 5: Forces from field and model data for the two potholes. Panels show the 7 field runs and 65 model runs (tension vs. meters) for Ch45, Pothole 1 (top) and Pothole 2 (bottom).]
Wavelet Representation of the Curves:
We view $y^F_r(x^*; t)$, $y^M(z_r; t)$, and $y^R(x^*; t)$ as random functions, and must ultimately perform a Bayesian analysis with these random functions. To do this, we used a wavelet representation of the functions, as follows.
• Each curve was replaced with its values on a dyadic grid with $2^{12} = 4{,}096$ points, so that the number of resolution levels associated with the wavelet decomposition is L = 13 (counting the mean of each curve as level 0).
• The R wavethresh package was used to obtain the decomposition, with thresholding of coefficients at the fourth and higher levels whose absolute value was below the 0.975 percentile of the absolute values of the wavelet coefficients in the level.
• The union, over all curves (field and model), of wavelet basis elements was taken as the final basis, yielding a total of 289 basis elements, $\psi_i(t)$, i = 1, …, 289. (A minimal wavethresh sketch follows.)
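A minimal sketch of the decomposition step with wavethresh (only the basic transform calls are shown; the application-specific thresholding rule above is omitted):

```r
# Sketch: wavelet decomposition of one curve sampled on a dyadic grid.
library(wavethresh)

t <- seq(0, 1, length.out = 4096)               # 2^12 grid points
y <- 3 * sin(5 * pi * t) * t + cos(7 * pi * t)  # a toy 'curve'

wds   <- wd(y)                      # discrete wavelet transform (levels 0..11)
coef4 <- accessD(wds, level = 4)    # detail coefficients at one resolution level
yrec  <- wr(wds)                    # inverse transform; max(abs(yrec - y)) ~ 0
```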
Thus the 65 model response curves and 7 field response curves are represented as
$$y^M(z_j; t) = \sum_{i=1}^{289} w^M_i(z_j)\,\psi_i(t), \qquad j = 1, \ldots, 65,$$
$$y^F_r(x^*; t) = \sum_{i=1}^{289} w^F_{ir}(x^*)\,\psi_i(t), \qquad r = 1, \ldots, 7,$$
where the $w^M_i(z_j)$ and $w^F_{ir}(x^*)$ are the coefficients computed through the wavelet decomposition.
[Figure 6: The accuracy of the wavelet decomposition; panels show the first field run (Ch45, Potholes 1 and 2) and its wavelet reconstruction (tension vs. meters).]
GaSP Approximation to each of the Model Wavelet Coefficient Functions $w^M_i(z)$:
Formally (and dropping the index i for convenience),
$$w^M(\cdot) \sim \mathrm{GaSP}\big(\mu,\; \tfrac{1}{\lambda^M} c^M(\cdot, \cdot)\big),$$
with the usual separable correlation function, the parameters $\alpha_j$ and $\beta_j$ being estimated by maximum likelihood (recall there are 289 pairs of them), as well as $\mu$ and $\lambda^M$.
The posterior of the i-th coefficient, given GaSP parameters and model-run data, at a new input $z^*$, is
$$\tilde w^M_i(z^*) \sim N\big(\hat\mu_i(z^*),\; \hat V_i(z^*)\big),$$
where $\hat\mu_i(z^*)$ and $\hat V_i(z^*)$ are given by the usual Kriging expressions.
Thus the overall emulator is (assuming that the $\tilde w^M_i$ are independent)
$$\tilde y^M(z^*; t) \sim N\Big(\sum_{i=1}^{289}\hat\mu_i(z^*)\psi_i(t),\; \sum_{i=1}^{289}\hat V_i(z^*)\psi_i^2(t)\Big).$$
Bayesian comparison of model and field data: calibration/tuning; estimation of bias; and development of tolerance bands for prediction
For the i-th wavelet coefficient, the computer model is related to reality via
$$w^R_i(x^*) = w^M_i(x^*, u^*) + b_i(x^*),$$
where $(x^*, u^*)$ are the true (but unknown) values of the field vehicle inputs and model calibration parameters, respectively. Hence $b_i(x^*)$ is here just an unknown constant.
(As usual, there is inevitable confounding between $u^*$ and the $b_i$, but prediction is little affected.)
The replicated field data is modeled as
$$w^F_{ir}(x^*) = w^R_i(x^*) + \epsilon_{ir} = w^M_i(x^*, u^*) + b_i(x^*) + \epsilon_{ir}, \qquad r = 1, \ldots, 7,$$
where the $\epsilon_{ir}$ are i.i.d. $N(0, \sigma^2_i)$ random errors.
The sufficient statistics, $\bar w^F_i(x^*) = \frac{1}{7}\sum_{r=1}^{7} w^F_{ir}(x^*)$ and $S^2_i(x^*) = \sum_{r=1}^{7}\big(w^F_{ir}(x^*) - \bar w^F_i(x^*)\big)^2$, then have distributions
$$\bar w^F_i(x^*) \sim N\Big(w^M_i(x^*, u^*) + b_i(x^*),\; \frac{\sigma^2_i}{7}\Big), \qquad S^2_i(x^*)/\sigma^2_i \sim \text{Chi-Square}(6),$$
which we assume to be independent across i.
Prior Distributions:
$$\pi(b, x^*, u^*, \sigma^2, \tau^2) = \pi(b \mid x^*, u^*, \sigma^2, \tau^2)\,\pi(\tau^2 \mid x^*, u^*, \sigma^2)\,\pi(\sigma^2 \mid x^*, u^*)\,\pi(x^*, u^*),$$
where $b$, $\sigma^2$ and $\tau^2$ refer to the vectors of $b_i$, $\sigma^2_i$ and $\tau^2_j$; and, with $\bar\sigma^2_j$ denoting the average of the $\sigma^2_i$ at wavelet level j,
$$\pi(b \mid x^*, u^*, \sigma^2, \tau^2) = \prod_{i=1}^{289} N(b_i \mid 0, \tau^2_j)$$
(we did also try Cauchy and mixture priors, to some benefit),
$$\pi(\tau^2 \mid x^*, u^*, \sigma^2) \propto \prod_{j=0}^{12}\frac{1}{\tau^2_j + \bar\sigma^2_j/7}, \qquad \pi(\sigma^2 \mid x^*, u^*) \propto \prod_{i=1}^{289}\frac{1}{\sigma^2_i}.$$
Finally, the I/U map is translated into
$$\pi(x^*, u^*) = \prod_{i=1}^{2} p(u^*_i)\prod_{i=1}^{7} p(x^*_i),$$
$$p(u^*_i) = \mathrm{Uniform}(u^*_i \mid 0.125, 0.875), \quad i = 1, 2,$$
$$p(x^*_i) \propto N(x^*_i \mid 0.5, 0.1111^2)\,I_{(0.1667, 0.8333)}(x^*_i), \quad i = 1, 2, 3,$$
$$p(x^*_4) \propto N(x^*_4 \mid 0.5, 0.0641^2)\,I_{(0.3077, 0.6923)}(x^*_4),$$
$$p(x^*_i) \propto N(x^*_i \mid 0.5, 0.1176^2)\,I_{(0.1471, 0.8529)}(x^*_i), \quad i = 5, 6,$$
$$p(x^*_7) \propto N(x^*_7 \mid 0.5, 0.1026^2)\,I_{(0.1923, 0.8077)}(x^*_7),$$
where $I_A(u)$ is 1 if $u \in A$ and 0 otherwise.
Posterior Distribution: Denote the available data by $D = \{\bar w^F_i, S^2_i, \hat\mu_i(\cdot), \hat V_i(\cdot) : i = 1, \ldots, 289\}$. The posterior distribution of $b, x^*, u^*, \sigma^2, \tau^2$ and $w^{M*} \equiv w^M(x^*, u^*)$ can be expressed as
$$\pi(w^{M*}, b, x^*, u^*, \sigma^2, \tau^2 \mid D) = \pi(w^{M*} \mid b, x^*, u^*, \sigma^2, \tau^2, D)\;\pi(b \mid x^*, u^*, \sigma^2, \tau^2, D)\;\pi(x^*, u^*, \sigma^2, \tau^2 \mid D),$$
where
$$\pi(w^{M*} \mid b, x^*, u^*, \sigma^2, \tau^2, D) = \prod_{i=1}^{289} N(w^{M*}_i \mid \tilde\mu_i, \tilde V_i),$$
$$\tilde\mu_i = \frac{\hat V_i(x^*, u^*)}{\hat V_i(x^*, u^*) + \sigma^2_i/7}\,(\bar w^F_i - b_i) + \frac{\sigma^2_i/7}{\hat V_i(x^*, u^*) + \sigma^2_i/7}\,\hat\mu_i(x^*, u^*), \qquad \tilde V_i = \frac{\hat V_i(x^*, u^*)\,\sigma^2_i/7}{\hat V_i(x^*, u^*) + \sigma^2_i/7};$$
$$\pi(b \mid x^*, u^*, \sigma^2, \tau^2, D) = \prod_{i=1}^{289} N(b_i \mid \mu^*_i, V^*_i),$$
$$\mu^*_i = \frac{\tau^2_j\,(\bar w^F_i - \hat\mu_i(x^*, u^*))}{\tau^2_j + \hat V_i(x^*, u^*) + \sigma^2_i/7}, \qquad V^*_i = \frac{\tau^2_j\,(\hat V_i(x^*, u^*) + \sigma^2_i/7)}{\tau^2_j + \hat V_i(x^*, u^*) + \sigma^2_i/7};$$
$$\pi(x^*, u^*, \sigma^2, \tau^2 \mid D) \propto L(x^*, u^*, \sigma^2, \tau^2)\,\pi(x^*, u^*, \sigma^2, \tau^2),$$
$$L(x^*, u^*, \sigma^2, \tau^2) = \prod_{i=1}^{289}\frac{(\sigma^2_i)^{-3}}{\tau^2_j + \frac{\sigma^2_i}{7} + \hat V_i(x^*, u^*)}\exp\left\{-\frac{1}{2}\left(\frac{(\bar w^F_i - \hat\mu_i(x^*, u^*))^2}{\tau^2_j + \frac{\sigma^2_i}{7} + \hat V_i(x^*, u^*)} + \frac{S^2_i}{\sigma^2_i}\right)\right\}.$$
Computation: A Metropolis-Hastings-Gibbs sampling scheme was used, the Metropolis step being needed to sample from $\pi(x^*, u^*, \sigma^2, \tau^2 \mid D)$. The proposals used were
• for the $\sigma^2_i$: Inverse Gamma distributions with shape 3 and scales $2/S^2_i$;
• for the $\tau^2_j$: local moves $\propto 1/\tau^2_j$ in $\big(0.5\,\tau^{2(old)}_j,\; 2\,\tau^{2(old)}_j\big)$;
• for the $x^*$ and $u^*$, a mixture of prior and local moves:
$$g_u(z) = \prod_{i=1}^{9}\big\{0.5\,\mathrm{Unif}(z_i \mid T_i) + 0.5\,\mathrm{Unif}(z_i \mid T^*_i)\big\},$$
where $T_i = (a_i, b_i)$ is the support of each prior and $T^*_i = \big(\max\{a_i, z^{(old)}_i - 0.05\},\; \min\{b_i, z^{(old)}_i + 0.05\}\big)$.
Computation was initially done by a standard Markov chain Monte Carlo analysis:
• Closed-form full conditionals are available for b, and for the emulator wavelet coefficients $w^{M*} \equiv w^M(x^*, u^*)$.
• Metropolis-Hastings steps were used for $(x^*, u^*, \sigma^2, \tau^2)$; efficient proposal distributions were available, so all seemed fine.
Shock: The original computation failed and could not be fixed using traditional methods; the answers were also 'wrong'!
• Problem: Some of the $\sigma^2_i$ (variances corresponding to certain wavelet coefficients of the field data) got 'stuck' at very large values, with the effect that the corresponding biases were estimated as near zero.
• Likely Cause: Modeling the $b_i$ as hierarchically normally distributed; biases for many wavelet coefficients can be expected to be small, but some are likely to be large.
Ideal solution: Improve the hierarchical models; for instance, one could consider use of more robust models (e.g., Cauchy models) for the $b_i$.
Pragmatic solution: Cheat computationally, and only allow generation of the $\sigma^2_i$ from the replicate information (i.e., from the $s^2_i = \sum_{r=1}^{7}\big(w^F_{ir}(x^*) - \bar w^F_i\big)^2$), not allowing transference of information from the $b_i$ to the $\sigma^2_i$.
• We call such cheating modularization, the idea being to not always allow Bayesian updating to flow both ways between modules (components) of a complex model.
• Another name given to this idea is cutting feedback (Best, Spiegelhalter, ...); related notions are inconsistent dependency networks (Heckerman) and inconsistent Gibbs for missing-data problems (Gelman and others).
• In the road load analysis, the modularization approach gives very similar answers to the improved modeling approach.
Inference: Bias estimates, predictions, and associated accuracy statements can all be constructed from the posterior sample
$$\{(w^{M*})^{(h)}, b^{(h)}, x^{*(h)}, u^{*(h)}, (\sigma^2)^{(h)}\}, \qquad h = 1, \ldots, N,$$
and an auxiliary sample $\epsilon^{(h)}$, h = 1, …, N, from a multivariate normal distribution with zero mean and diagonal covariance matrix $\mathrm{Diag}(\sigma^2)^{(h)}$.
• The posterior sample of bias curves is
$$b^{(h)}(t) = \sum_{i=1}^{289} b^{(h)}_i\,\psi_i(t), \qquad h = 1, \ldots, N.$$
• The posterior sample of bias-corrected predictions of reality is
$$(y^R)^{(h)}(t) = \sum_{i=1}^{289}\big((w^{M*}_i)^{(h)} + b^{(h)}_i\big)\psi_i(t), \qquad h = 1, \ldots, N.$$
• The posterior sample of individual (field) bias-corrected prediction curves is
$$(y^F)^{(h)}(t) = \sum_{i=1}^{289}\big((w^{M*}_i)^{(h)} + b^{(h)}_i + \epsilon^{(h)}_i\big)\psi_i(t), \qquad h = 1, \ldots, N.$$
[Figure 7: Posterior distributions of $u^*$ and $x^*$ (histograms for the nine inputs, labeled u-1 through u-9; Ch45).]
[Figure 8: Posterior bias curve estimate and 90% tolerance bands (Ch45, Region 1; tension (N) vs. distance (m)).]
[Figure: bias-corrected prediction of an individual curve (Ch45, Region 1), showing model data, field data, model prediction, and 90% tolerance bounds.]
[Figure: nominal model prediction of an individual curve (Ch45, Region 1), showing model data, field data, model prediction, and 90% tolerance bounds.]
[Figure 10: Multiplicative extrapolation of bias to Vehicle B; bias-corrected predictions of individual curves (Ch45 and Ch60, Regions 1 and 2), each showing field data, model prediction, and 90% tolerance bounds.]
Functional emulators via Parallel Partial emulation
Example: In the pyroclastic flow example, the full output of TITAN2D is $y^M(x) = (y^M_1(x), y^M_2(x), \ldots, y^M_k(x))$, where each $y^M_i(x)$ is the pyroclastic flow height (and speed and direction) at the k space-time grid points on which TITAN2D is run. This is a huge (discretized) function, with k as large as $10^9$. One realization of the function, only looking at maximum flow height at 24,000 spatial locations, looks like this:
Movie Time
Determination of 1m contours of maximum flow over time at k = 23, 040
spatial locations, using m = 50 simulator runs at various inputs to develop
the emulator.
The Big Issue: This wildly varying function varies even more wildly over
the inputs, so it is virtually unimaginable to capture it with any of the
previous methods, or any previous emulator method (stat or math).
So we have to trust to the magic of GaSP’s and hope to get lucky (you
can’t force UQ)!
Run the simulator at $x^D = \{x_1, \ldots, x_m\}$, yielding outputs $y^D = (y^M(x_1)', \ldots, y^M(x_m)')'$ (a matrix of size up to $2048 \times 10^9$).
The simplest imaginable GaSP for the k-dimensional $y^M(x)$: an independent GaSP is assigned to each coordinate $y^M_j(x)$, with
• prior mean functions of the regression form $\Psi(x)\theta_j$, where $\Psi(x)$ is a common l-vector of given basis functions and the $\theta_j$ are differing unknown regression coefficients;
• differing unknown prior variances $\sigma^2_j$;
• common estimated correlation parameters $\hat\gamma$ (discussed later).
The mean function of the posterior GaSP for $y^M_j(x^*)$ at new input $x^*$ is
$$\hat\mu_j(x^*) = \Psi(x^*)\hat\theta_j + C(x^*)C^{-1}(y^D_j - \Psi\hat\theta_j),$$
where $y^D_j$ is the j-th column of $y^D$ and $\hat\theta_j = (\Psi' C^{-1}\Psi)^{-1}\Psi' C^{-1} y^D_j$, with Ψ being the earlier specified $m \times l$ design matrix, C being the earlier specified $m \times m$ correlation matrix, $\Psi(x^*) = (\Psi_1(x^*), \ldots, \Psi_l(x^*))$, and $C(x^*) = (c(x_1, x^*), \ldots, c(x_m, x^*))$.
This can be rewritten
$$\hat\mu_j(x^*) = \sum_{i=1}^{m} h_i(x^*)\,y^D_{ij},$$
where $h_i(x^*)$ is the i-th element of the m-vector
$$h(x^*) = \big(\Psi(x^*) - C(x^*)C^{-1}\Psi\big)(\Psi' C^{-1}\Psi)^{-1}\Psi' C^{-1} + C(x^*)C^{-1}.$$
As Ψ and C (and the functions of them) can be pre-computed, computing $h(x^*)$ (at a new $x^*$) requires roughly $m^2$ numerical operations. (A sketch of this weight computation appears below.)
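A sketch of the weight computation (constant mean basis $\Psi(x) = 1$, so l = 1; it reuses the pow_exp_corr helper from earlier, and in practice Ci and A would be precomputed once):

```r
# Sketch: parallel partial emulator weights h(x.star); X is the m x d design
# and C its m x m correlation matrix.
pp_weights <- function(X, C, x.star, gamma, alpha = 1.9) {
  m  <- nrow(X)
  Ci <- solve(C)                                     # precomputable
  Psi <- matrix(1, m, 1)                             # constant basis, l = 1
  A <- solve(t(Psi) %*% Ci %*% Psi, t(Psi) %*% Ci)   # precomputable, 1 x m
  c.star <- apply(X, 1, function(xi)
    prod(exp(-(abs(xi - x.star) / gamma)^alpha)))    # C(x*)
  drop((1 - c.star %*% Ci %*% Psi) %*% A + c.star %*% Ci)  # the m-vector h(x*)
}
# The full k-dimensional emulator mean is then h %*% yD, an O(mk) operation.
```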
Finally, we can write the complete parallel partial posterior (PP) mean vector (the emulator of the full simulator output at a new input $x^*$) as
$$\hat\mu(x^*) = (\hat\mu_1(x^*), \ldots, \hat\mu_k(x^*)) = h(x^*)\,y^D.$$
• The overall computational cost is just O(mk) when k ≫ m.
– It is crucially important to have differing $\theta_j$ and $\sigma^2_j$ at each coordinate, but this comes with essentially no computational cost.
• Computation of all the PP emulator variances is $O(m^2 k)$, but one rarely needs to compute all of them.
• The emulator is an interpolator so, when $x^*$ equals one of the runs $x_i$, the emulator will return the exact values from the computer run.
• As the emulator mean is just a weighted average of the actual simulator runs, it hopefully captures some of the dynamics of the process.
What happens if the assumptions are relaxed?
• If different coordinates are allowed different bases, the cost goes up to $O([m^2 l + l^3]k)$. (Recall the cost of the PP emulator was O(mk).)
– For TITAN2D, m ≈ 2000, l = 4, and k ≈ $10^9$ ⇒ $O(10^{16})$ computations, compared to $O(10^{12})$ for the PP emulator.
• If the correlation parameters $\gamma_j$ are allowed to vary at each coordinate, the computational cost would be $O(n\,m^3 k)$, because there would be differing m × m correlation matrices $C_j$ at each coordinate and the inversion of $C_j$ would need to be done n times in order to estimate $\gamma_j$.
– For TITAN2D, n ≈ 150, m ≈ 2000, k ≈ $10^9$ ⇒ $O(10^{21})$ computations.
• In either case the emulator would still be an interpolator, but would no longer be a weighted average of the simulator runs.
Figure 11: The mean of the emulator of 'maximum flow height over time' from TITAN2D, at 24,000 spatial locations over Montserrat and for new input values $V = 10^{7.462}$, $\varphi = 2.827$, $\delta_{bed} = 11.111$, and $\delta_{int} = 27.7373$.
Figure 12: Variance of the emulator of 'maximum flow height over time' from TITAN2D, at 24,000 spatial locations over Montserrat and for new input values $V = 10^{7.462}$, $\varphi = 2.827$, $\delta_{bed} = 11.111$, and $\delta_{int} = 27.7373$.
Movie Time
Determination of 1m contours of maximum flow over time at k = 23, 040
spatial locations, using m = 50 simulator runs at various inputs to develop
the emulator.
The spatial 'elephant in the room'
is the key (and clearly invalid) assumption that simulator output values at all coordinates (e.g., space-time locations) are independent.
The usual attempted solution: Introduce a second spatial process over the output coordinates of $y^M(x) = (y^M_1(x), y^M_2(x), \ldots, y^M_k(x))$ to reflect the clear dependence. Usual assumptions on this process:
• It is also a Gaussian process, with correlation function λ(i, j), leading to the k × k correlation matrix Λ.
• Because k is huge, the process must be chosen so that Λ is sparse (e.g., only allow correlation with nearby points), to allow for the needed inversions of Λ.
• Separability with the GaSP over the process input space is assumed, so that the covariance matrix of the joint Gaussian process is (letting σ denote the diagonal matrix of coordinate standard deviations)
$$\Sigma = \sigma\Lambda\sigma \otimes C, \quad\text{and thus}\quad \Sigma^{-1} = \sigma^{-1}\Lambda^{-1}\sigma^{-1} \otimes C^{-1}.$$
The problem: It is difficult to add plausible spatial structure while keeping the computation manageable when k is huge.
The Surprise: The spatial elephant can (mostly) be ignored, as the PP emulator will give essentially the same answers. Indeed, for any spatial structure Λ, the following can be shown:
• The emulator mean of $y^M(x^*) = (y^M_1(x^*), \ldots, y^M_k(x^*))$, at a new input $x^*$, is exactly the same as the PP emulator mean. (Intuition: it does not matter that the $y^M_i(x^*)$ are spatially related, as they are all unknown.)
• The emulator variance at coordinate j is still $\hat\sigma^2_j V_j(x^*)$, with only $\hat\sigma^2_j$ depending on the spatial structure, and only in a minor way; thus one can just use the (slightly conservative) PP emulator variance.
The remaining little elephant: If one actually needs random draws from the emulator, the PP emulator's draws will be too rough (because each coordinate is independent), which might be harmful in some applications.
• A relatively simple fix to obtain smoother draws is to divide the grid into squares of moderate size s (e.g., s = 4), have the squares be independent, but allow a dependent spatial process in each square.
• If Λ in each square is assigned the objective prior $\pi(\Lambda) = |\Lambda|^{-s}$, the mean and variance of the emulator will then be the same as for the PP emulator.
Additional concerns with the assumptions for the PP emulator:
• The likelihood from which the correlation parameters γ are estimated
might be bad because of the assumption of independence of
coordinates.
– In practice, use of a joint spatial process seems to give worse results,
because of considerable numerical instabilities in the likelihood.
– Also, the estimates of γ should primarily be driven by the varying
simulator output over the inputs xi at each fixed location.
– The likelihood is almost certainly too concentrated but, as we are
only using it to obtain plug-in estimates, this is not a major concern.
• Assuming common values of the correlation parameters γ at all
coordinates is potentially problematical, as the simulator may have
very different levels of smoothness in different regions of input space.
– One could utilize different γ in a few different regions, with minimal
additional cost, as in Gramacy and Lee (2008).
– Simulations (see later) indicate this is not a problem for TITAN2D.
Introduction of a nugget
Often certain inputs have very little effect on the simulator output, and
emulators that ignore that input can do better at prediction. But, for
deterministic simulators, one must then introduce a ‘nugget’ (i.i.d.
Gaussian errors) in the GaSP model.
We simply let the correlation matrix be C + ξI, renormalized to be a
correlation matrix, with ξ unknown. The computations are then only
slightly more complicated.
Example: In TITAN2D, δint has only a minor effect, so we will investigate
• the full 4 input emulator,
• the 3 input emulator with δint removed and a nugget inserted.
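In RobustGaSP this corresponds to setting the package's nugget.est flag (a sketch only; the toy design and response below stand in for the actual TITAN2D data, and the column names are hypothetical):

```r
# Sketch: 3-input emulator with an estimated nugget, after dropping the
# nearly inert delta_int input.
library(RobustGaSP)

set.seed(1)
X4 <- matrix(runif(200), ncol = 4)         # toy stand-in for the TITAN2D design
colnames(X4) <- c("V", "phi", "delta_bed", "delta_int")
y  <- sin(2 * pi * X4[, 1]) + X4[, 3]^2    # toy response, nearly inert in delta_int

X3  <- X4[, c("V", "phi", "delta_bed")]    # drop the nearly inert input
fit <- rgasp(design = X3, response = y,
             nugget.est = TRUE)            # xi estimated along with the rest
```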
Emulator:              PP GaSP      PP GaSP      MS GaSP      LMC GaSP
Parameter estimation:  robust est.  robust est.  robust est.  DiceKriging
Inputs:                4 inputs     3 inputs and an estimated nugget
Mean Square Error      0.109        0.097        0.103        0.137
95% CI Coverage        0.926        0.950        0.924        0.909
95% CI Length          0.521        0.536        0.491        0.478
time (s) using R       50.0         28.1         31337.7      3407.6
Table 3: Performance of various emulators, developed from 50 simulator runs, of max flow height over all spatial locations except the crater and non-flow areas.
• The first emulator uses all 4 inputs, while the remaining three emulators use 3 inputs (V, δbed, ϕ) and a nugget, all with the same regressor h(x) = (1, V).
• The LMC emulator uses coregionalization with an SVD output decomposition.
• Evaluations are based on n* = 633 held-out inputs over k = 17,311 locations.
• The last row shows the computational times of the emulators, using R.
Determining Hazard Probabilities
Goal: Determine $P_{H,T}(k)$, the probability, at location k, that the maximum pyroclastic flow height exceeds H over the next T years.
Implementation:
• Perform a statistical analysis of historical data to determine the posterior distribution of the simulator inputs (V, δbed, ϕ).
• Draw 100,000 samples from this posterior, and evaluate the emulator at these inputs to estimate the distribution $F_k$ of maximum flow heights at each location k.
• Assuming pyroclastic flows follow a stationary Poisson process, an exact expression can be given, in terms of the $F_k$, for the probability distribution of maximum flow heights over T years at location k.
• From these, determination of the $P_{H,T}(k)$ is straightforward. (A Monte Carlo sketch of the exceedance computation follows.)
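A Monte Carlo sketch of the last two steps at a single location k. The Poisson-maximum composition used below, $P = 1 - \exp\{-\lambda T (1 - F_k(H))\}$, is one standard form of such an expression; whether it is exactly the one used in the cited analysis is an assumption here, and the flow rate λ is a hypothetical input:

```r
# Sketch: hazard probability at one location from posterior draws.
# heights: emulated max flow heights at location k, one per posterior draw
# of the inputs; lambda: assumed pyroclastic flow rate per year (hypothetical).
hazard_prob <- function(heights, H, T.years, lambda) {
  p.exceed <- mean(heights > H)          # Monte Carlo estimate of 1 - F_k(H)
  1 - exp(-lambda * T.years * p.exceed)  # max of a marked Poisson process over T
}

# e.g., P(max height > 1 m in the next 2.5 years), assuming lambda = 2/year:
# hazard_prob(heights, H = 1, T.years = 2.5, lambda = 2)
```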
Figure 13: For SVH, contours of the probabilities that the maximum flow
heights exceed 0.5 (left), 1 (center) and 2 (right) meters over the next T = 2.5
years at each location on SVH. The shaded area is Belham Valley, which is
still inhabited.
Coupling emulators (closed form) to emulate coupled computer models (Kyzyurova (2017))
Coupled simulators:
• $f^M(x)$ is the output of a simulator with input x.
– Example: $f^M(x)$ is TITAN2D.
• $g^M(z)$ is a simulator with input z.
– Example: $g^M(z)$ is a computer model that determines the damage to a structure incurred by being hit by a pyroclastic flow with properties z.
• Of interest is $g^M \circ f^M(x) = g^M(f^M(x))$, the coupled simulator computing the damage from a pyroclastic flow arising from inputs x.
The problem: It is usually difficult to directly link two simulators.
• The output of $f^M$ will often not be in the form needed as input to $g^M$.
• It is difficult to determine a good design in terms of inputs x for the coupled emulator.
• It may well be that many more runs of $f^M$ are available than runs of $g^M$.
A solution: Separately develop emulators $\tilde f^M$ of $f^M$ and $\tilde g^M$ of $g^M$, and couple the emulators.
• This is always possible by Monte Carlo (generate outputs from $\tilde f^M$ and use them in $\tilde g^M$); a sketch is given below.
• For GaSP's, a closed-form mean and variance of the coupled emulator are available!
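A sketch of the Monte Carlo version, assuming both emulators have been fit with RobustGaSP as earlier, f feeds a one-dimensional input to g, and the Student-t predictive is approximated by a normal:

```r
# Sketch: Monte Carlo coupling of two fitted rgasp emulators fitF and fitG.
couple_mc <- function(fitF, fitG, u, nsamp = 1000) {
  pf <- predict(fitF, matrix(u, nrow = 1))
  f.draws <- rnorm(nsamp, mean = pf$mean, sd = pf$sd)  # draws from emulator of f
                                                       # (normal approx. to the t)
  pg <- predict(fitG, matrix(f.draws, ncol = 1))       # emulate g at those draws
  g.draws <- rnorm(nsamp, mean = pg$mean, sd = pg$sd)
  c(mean = mean(g.draws), var = var(g.draws))
}
```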
Theorem. Suppose the GaSP for $g^M$ has the linear mean function $h(z')\beta = \beta_0 + \beta_1 z'_b$, and a product power exponential correlation function with $\alpha_j = 2$ for inputs $j \in \{b, \ldots, d\}$ that arise from $f^M$. For each $j \in \{b, \ldots, d\}$, let $f^M_j$ be an independent emulator of $f_j$, the function which gives rise to the value of input j for $g(\cdot)$. Then the mean $E\xi$ and variance $V\xi$ of the linked emulator ξ of the coupled simulator $(g \circ (f_b, \ldots, f_d))(u)$ are
$$E\xi = \beta_0 + \beta_1\mu^*_{f_b}(u^b) + \sum_{i=1}^{m} a_i\prod_{j=1}^{b-1}\exp\left\{-\left(\frac{|u_j - z_{ij}|}{\delta_j}\right)^{\alpha_j}\right\}\prod_{j=b}^{d} I^i_j,$$
$$\begin{aligned}
V\xi ={}& \sigma^2(1 + \eta) + \beta_0^2 + 2\beta_0\beta_1\mu^*_{f_b}(u^b) + \beta_1^2\big(\sigma^{*2}_{f_b}(u^b) + (\mu^*_{f_b}(u^b))^2\big) - (E\xi)^2 \\
&+ \left[\sum_{k,l=1}^{m}\big(a_l a_k - \sigma^2\{C_z^{-1}\}_{k,l}\big)\prod_{j=1}^{b-1} e^{-\left[\left(\frac{|u_j - z_{kj}|}{\delta_j}\right)^{\alpha_j} + \left(\frac{|u_j - z_{lj}|}{\delta_j}\right)^{\alpha_j}\right]}\prod_{j=b}^{d} I1^{k,l}_j\right] \\
&+ 2\sum_{i=1}^{m} a_i\prod_{j=1}^{b-1}\exp\left\{-\left(\frac{|u_j - z_{ij}|}{\delta_j}\right)^{\alpha_j}\right\}\big(\beta_0 I^i_b + \beta_1 I^{+i}_b\big)\prod_{j=b+1}^{d} I^i_j,
\end{aligned}$$
where $a = (a_1, \ldots, a_m)^T = C_z^{-1}\big(g^M(z) - h(z)\beta\big)$ and
$$I^i_j = \frac{1}{\sqrt{1 + 2\,\sigma^{*2}_{f_j}(u_j)/\delta_j^2}}\exp\left\{-\frac{\big(z_{ij} - \mu^*_{f_j}(u_j)\big)^2}{\delta_j^2 + 2\sigma^{*2}_{f_j}(u_j)}\right\},$$
$$I1^{k,l}_j = \frac{1}{\sqrt{1 + 4\,\sigma^{*2}_{f_j}(u_j)/\delta_j^2}}\;\exp\left\{-\frac{\left(\frac{z_{kj} + z_{lj}}{2} - \mu^*_{f_j}(u_j)\right)^2}{\frac{\delta_j^2}{2} + 2\sigma^{*2}_{f_j}(u_j)}\right\}\,e^{-\frac{(z_{kj} - z_{lj})^2}{2\delta_j^2}},$$
$$I^{+i}_b = \frac{2\,\frac{\sigma^{*2}_{f_b}(u^b)}{\delta_b^2}\,z_{ib} + \mu^*_{f_b}(u^b)}{\left(1 + 2\,\frac{\sigma^{*2}_{f_b}(u^b)}{\delta_b^2}\right)^{3/2}}\exp\left\{-\frac{\big(z_{ib} - \mu^*_{f_b}(u^b)\big)^2}{\delta_b^2 + 2\sigma^{*2}_{f_b}(u^b)}\right\}.$$
[Figure 14: Top panels are the functions f(x) and g(z) and their emulators. Bottom left is the (closed-form) coupled emulator of g ◦ f. Bottom right is the emulator of the math-coupled g ◦ f (constructed from the same inputs/outputs).]
Other complex emulation scenarios
• Dynamic emulators (Conti et al. (2009); Liu and West (2009); Conti and
O’Hagan (2010); Reichert et al. (2011)).
• Emulating models with qualitative factors (Qian et al. (2008)).
• Nonstationary emulators (Gramacy and Lee (2008); Ba and Joseph (2012)).
• Emulating multivariate output (Bayarri et al. (2009); Paulo et al.
(2012); Fricker et al. (2013); Overstall and Woods (2016)).
• Evaluating the quality of emulators (Bastos and O’Hagan (2009);
Overstall and Woods (2016)).
References
Aslett, R., R. J. Buck, S. G. Duvall, J. Sacks, and W. J. Welch (1998).
Circuit optimization via sequential computer experiments: Design of an
output buffer. Applied Statistics 47, 31–48.
Ba, S. and V. R. Joseph (2012). Composite Gaussian process models for
emulating expensive functions. The Annals of Applied Statistics 6(4),
1838–1860.
Bastos, L. S. and A. O'Hagan (2009). Diagnostics for Gaussian process
emulators. Technometrics 51(4), 425–438.
Bates, R. A., R. J. Buck, E. Riccomagno, and H. P. Wynn (1996).
Experimental design and observation for large systems (Disc: p95–111).
Journal of the Royal Statistical Society, Series B, Methodological 58,
77–94.
Bayarri, M. J., J. O. Berger, E. S. Calder, K. Dalbey, S. Lunagomez, A. K.
Patra, E. B. Pitman, E. T. Spiller, and R. L. Wolpert (2009). Using
statistical and computer models to quantify volcanic hazards.
Technometrics 51, 402–413.
Bayarri, M. J., J. O. Berger, G. García-Donato, F. Liu, J. Palomo,
R. Paulo, J. Sacks, J. Walsh, J. A. Cafeo, and R. Parthasarathy (2007).
Computer model validation with functional output. Annals of
Statistics 35, 1874–1906.
Bayarri, M. J., J. O. Berger, M. C. Kennedy, A. Kottas, R. Paulo, J. Sacks,
J. A. Cafeo, C. H. Lin, and J. Tu (2009). Predicting vehicle
crashworthiness: validation of computer models for functional and
hierarchical data. Journal of the American Statistical Association 104,
929–942.
Bowman, V. E. and D. C. Woods (2016). Emulation of multivariate
simulators using thin-plate splines with application to atmospheric
dispersion. SIAM/ASA Journal on Uncertainty Quantification 4(1),
1323–1344.
Conti, S., J. P. Gosling, J. Oakley, and A. O'Hagan (2009). Gaussian
process emulation of dynamic computer codes. Biometrika 96(3), 663–676.
Conti, S. and A. O'Hagan (2010). Bayesian emulation of complex
multi-output and dynamic computer models. Journal of Statistical
Planning and Inference 140(3), 640–651.
Cumming, J. A. and M. Goldstein (2009). Small sample Bayesian designs
for complex high-dimensional models based on information gained using
fast approximations. Technometrics 51, 377–388.
Dalbey, K., M. Jones, E. B. Pitman, E. S. Calder, M. Bursik, and A. K.
Patra (2012). Hazard risk analysis using computer models of physical
phenomena and surrogate statistical models. International Journal for
Uncertainty Quantification. To appear.
Fricker, T. E., J. E. Oakley, and N. M. Urban (2013). Multivariate Gaussian
process emulators with nonseparable covariance structures.
Technometrics 55(1), 47–56.
Gramacy, R. B. and H. K. H. Lee (2008). Bayesian treed Gaussian process
models with an application to computer modeling. Journal of the
American Statistical Association 103(483), 1119–1130.
Gramacy, R. B. and H. K. H. Lee (2009). Adaptive design and analysis of
supercomputer experiments. Technometrics 51, 130–145.
doi:10.1198/TECH.2009.0015.
Gu, M., J. Palomo, and J. O. Berger (2016). RobustGaSP: Robust
Gaussian Stochastic Process Emulation. R package version 0.5.4.
Gu, M., X. Wang, and J. Berger (2018). Robust Gaussian stochastic process
emulation. Annals of Statistics 46(6A), 3038–3066.
Kyzyurova, K. N. (2017). On Uncertainty Quantification for Systems of
Computer Models. Ph.D. thesis, Duke University.
Lam, C. Q. and W. I. Notz (2008). Sequential adaptive designs in
computer experiments for response surface model fit. Statistics and
Applications 66(9), 207–233.
Lim, Y. B., J. Sacks, W. Studden, and W. J. Welch (2002). Design and
analysis of computer experiments when the output is highly correlated
over the input space. Canadian Journal of Statistics 30(1), 109–126.
Liu, F. and M. West (2009). A dynamic modelling strategy for Bayesian
computer model emulation. Bayesian Analysis 4(2), 393–412.
Loeppky, J. L., L. M. Moore, and B. J. Williams (2010). Batch sequential
designs for computer experiments. Journal of Statistical Planning and
Inference 140(6), 1452–1464.
Lopes, D. (2011). Development and Implementation of Bayesian Computer
Model Emulators. Ph.D. thesis, Duke University.
McKay, M. D., W. J. Conover, and R. J. Beckman (1979). A comparison of
three methods for selecting values of input variables in the analysis of
output from a computer code. Technometrics 21, 239–245.
Overstall, A. M. and D. C. Woods (2016). Multivariate emulation of
computer simulators: model selection and diagnostics with application to
a humanitarian relief model. Journal of the Royal Statistical Society:
Series C (Applied Statistics) 65(4), 483–505.
Paulo, R. (2005). Default priors for Gaussian processes. The Annals of
Statistics 33(2), 556–582.
Paulo, R., G. García-Donato, and J. Palomo (2012). Calibration of
computer models with multivariate output. Computational Statistics and
Data Analysis 56(12), 3959–3974.
Qian, P. Z. G., H. Wu, and C. F. J. Wu (2008). Gaussian process models
for computer experiments with qualitative and quantitative factors.
Technometrics 50(3), 383–396.
Ranjan, P., R. Haynes, and R. Karsten (2011). A computationally stable
approach to Gaussian process interpolation of deterministic computer
simulation data. Technometrics 53(4), 366–378.
Ranjan, P., D. Bingham, and G. Michailidis (2008). Sequential experiment
design for contour estimation from complex computer codes.
Technometrics 50(4), 527–541 (errata in Technometrics 53).
Reichert, P., G. White, M. J. Bayarri, and E. B. Pitman (2011).
Mechanism-based emulation of dynamic simulation models: Concept
and application in hydrology. Computational Statistics & Data
Analysis 55, 1638–1655.
Roustant, O., D. Ginsbourger, and Y. Deville (2012). DiceKriging,
DiceOptim: Two R packages for the analysis of computer experiments by
kriging-based metamodeling and optimization. Journal of Statistical
Software 51(1), 1–55.
Sacks, J., W. J. Welch, T. J. Mitchell, and H. P. Wynn (1989). Design and
analysis of computer experiments (C/R: p423–435). Statistical Science 4,
409–423.
Santner, T. J., B. Williams, and W. Notz (2003). The Design and Analysis
of Computer Experiments. Springer-Verlag.
Spiller, E. T., M. Bayarri, J. O. Berger, E. S. Calder, A. K. Patra, E. B.
Pitman, and R. L. Wolpert (2014). Automating emulator construction
for geophysical hazard maps. SIAM/ASA Journal on Uncertainty
Quantification 2(1), 126–152.
Welch, W. J., R. J. Buck, J. Sacks, H. P. Wynn, T. J. Mitchell, and M. D.
Morris (1992). Screening, predicting, and computer experiments.
Technometrics 34, 15–25.
91

More Related Content

What's hot

Gradient Descent
Gradient DescentGradient Descent
Gradient DescentJinho Choi
 
Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...Umberto Picchini
 
March12 natarajan
March12 natarajanMarch12 natarajan
March12 natarajanBBKuhn
 
Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...Umberto Picchini
 
Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Umberto Picchini
 
My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...Umberto Picchini
 
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...Kim Hammar
 
Distributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and RelatedDistributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and RelatedPei-Che Chang
 
Lecture8 multi class_svm
Lecture8 multi class_svmLecture8 multi class_svm
Lecture8 multi class_svmStéphane Canu
 
Lecture9 multi kernel_svm
Lecture9 multi kernel_svmLecture9 multi kernel_svm
Lecture9 multi kernel_svmStéphane Canu
 
Tensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEsTensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEsAlexander Litvinenko
 
Bayesian Subset Simulation
Bayesian Subset SimulationBayesian Subset Simulation
Bayesian Subset SimulationJulien Bect
 
Explanation on Tensorflow example -Deep mnist for expert
Explanation on Tensorflow example -Deep mnist for expertExplanation on Tensorflow example -Deep mnist for expert
Explanation on Tensorflow example -Deep mnist for expert홍배 김
 
Risk-Aversion, Risk-Premium and Utility Theory
Risk-Aversion, Risk-Premium and Utility TheoryRisk-Aversion, Risk-Premium and Utility Theory
Risk-Aversion, Risk-Premium and Utility TheoryAshwin Rao
 

What's hot (20)

Gradient Descent
Gradient DescentGradient Descent
Gradient Descent
 
Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...Inference for stochastic differential equations via approximate Bayesian comp...
Inference for stochastic differential equations via approximate Bayesian comp...
 
AINL 2016: Strijov
AINL 2016: StrijovAINL 2016: Strijov
AINL 2016: Strijov
 
March12 natarajan
March12 natarajanMarch12 natarajan
March12 natarajan
 
Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...Accelerated approximate Bayesian computation with applications to protein fol...
Accelerated approximate Bayesian computation with applications to protein fol...
 
Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)Intro to Approximate Bayesian Computation (ABC)
Intro to Approximate Bayesian Computation (ABC)
 
My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...My data are incomplete and noisy: Information-reduction statistical methods f...
My data are incomplete and noisy: Information-reduction statistical methods f...
 
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...
 
1 - Linear Regression
1 - Linear Regression1 - Linear Regression
1 - Linear Regression
 
Regression
RegressionRegression
Regression
 
Distributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and RelatedDistributed Architecture of Subspace Clustering and Related
Distributed Architecture of Subspace Clustering and Related
 
Lecture8 multi class_svm
Lecture8 multi class_svmLecture8 multi class_svm
Lecture8 multi class_svm
 
Lecture9 multi kernel_svm
Lecture9 multi kernel_svmLecture9 multi kernel_svm
Lecture9 multi kernel_svm
 
Tensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEsTensor train to solve stochastic PDEs
Tensor train to solve stochastic PDEs
 
MCQMC 2016 Tutorial
MCQMC 2016 TutorialMCQMC 2016 Tutorial
MCQMC 2016 Tutorial
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Bayesian Subset Simulation
Bayesian Subset SimulationBayesian Subset Simulation
Bayesian Subset Simulation
 
Explanation on Tensorflow example -Deep mnist for expert
Explanation on Tensorflow example -Deep mnist for expertExplanation on Tensorflow example -Deep mnist for expert
Explanation on Tensorflow example -Deep mnist for expert
 
Risk-Aversion, Risk-Premium and Utility Theory
Risk-Aversion, Risk-Premium and Utility TheoryRisk-Aversion, Risk-Premium and Utility Theory
Risk-Aversion, Risk-Premium and Utility Theory
 
MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...
MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...
MUMS Opening Workshop - Quantifying Nonparametric Modeling Uncertainty with B...
 

Similar to 2018 MUMS Fall Course - Gaussian Processes and Statistic Emulators (EDITED) - Jim Berger , October 23, 2018

Svm map reduce_slides
Svm map reduce_slidesSvm map reduce_slides
Svm map reduce_slidesSara Asher
 
Paper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipelinePaper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipelineChenYiHuang5
 
Lecture: Monte Carlo Methods
Lecture: Monte Carlo MethodsLecture: Monte Carlo Methods
Lecture: Monte Carlo MethodsFrank Kienle
 
MCQMC_talk_Chiheb_Ben_hammouda.pdf
MCQMC_talk_Chiheb_Ben_hammouda.pdfMCQMC_talk_Chiheb_Ben_hammouda.pdf
MCQMC_talk_Chiheb_Ben_hammouda.pdfChiheb Ben Hammouda
 
Parallel Optimization in Machine Learning
Parallel Optimization in Machine LearningParallel Optimization in Machine Learning
Parallel Optimization in Machine LearningFabian Pedregosa
 
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmApache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmArvind Surve
 
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmApache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmArvind Surve
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsYoung-Geun Choi
 
Complexity Analysis
Complexity Analysis Complexity Analysis
Complexity Analysis Shaista Qadir
 
Efficient Computation of Regret-ratio Minimizing Set: A Compact Maxima Repres...
Efficient Computation ofRegret-ratio Minimizing Set:A Compact Maxima Repres...Efficient Computation ofRegret-ratio Minimizing Set:A Compact Maxima Repres...
Efficient Computation of Regret-ratio Minimizing Set: A Compact Maxima Repres...Abolfazl Asudeh
 
The Concurrent Constraint Programming Research Programmes -- Redux (part2)
The Concurrent Constraint Programming Research Programmes -- Redux (part2)The Concurrent Constraint Programming Research Programmes -- Redux (part2)
The Concurrent Constraint Programming Research Programmes -- Redux (part2)Pierre Schaus
 
Computer Graphics Unit 1
Computer Graphics Unit 1Computer Graphics Unit 1
Computer Graphics Unit 1aravindangc
 
Tensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationTensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationAlexander Litvinenko
 
Efficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketchingEfficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketchingHsing-chuan Hsieh
 

Similar to 2018 MUMS Fall Course - Gaussian Processes and Statistic Emulators (EDITED) - Jim Berger , October 23, 2018 (20)

Presentation.pdf
Presentation.pdfPresentation.pdf
Presentation.pdf
 
Svm map reduce_slides
Svm map reduce_slidesSvm map reduce_slides
Svm map reduce_slides
 
Paper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipelinePaper Study: Melding the data decision pipeline
Paper Study: Melding the data decision pipeline
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
Lecture: Monte Carlo Methods
Lecture: Monte Carlo MethodsLecture: Monte Carlo Methods
Lecture: Monte Carlo Methods
 
2018 MUMS Fall Course - Issue Arising in Several Working Groups: Probabilisti...
2018 MUMS Fall Course - Issue Arising in Several Working Groups: Probabilisti...2018 MUMS Fall Course - Issue Arising in Several Working Groups: Probabilisti...
2018 MUMS Fall Course - Issue Arising in Several Working Groups: Probabilisti...
 
MCQMC_talk_Chiheb_Ben_hammouda.pdf
MCQMC_talk_Chiheb_Ben_hammouda.pdfMCQMC_talk_Chiheb_Ben_hammouda.pdf
MCQMC_talk_Chiheb_Ben_hammouda.pdf
 
Parallel Optimization in Machine Learning
Parallel Optimization in Machine LearningParallel Optimization in Machine Learning
Parallel Optimization in Machine Learning
 
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmApache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
 
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias BoehmApache SystemML Optimizer and Runtime techniques by Matthias Boehm
Apache SystemML Optimizer and Runtime techniques by Matthias Boehm
 
ML unit-1.pptx
ML unit-1.pptxML unit-1.pptx
ML unit-1.pptx
 
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep models
 
Input analysis
Input analysisInput analysis
Input analysis
 
Complexity Analysis
Complexity Analysis Complexity Analysis
Complexity Analysis
 
Efficient Computation of Regret-ratio Minimizing Set: A Compact Maxima Repres...
Efficient Computation ofRegret-ratio Minimizing Set:A Compact Maxima Repres...Efficient Computation ofRegret-ratio Minimizing Set:A Compact Maxima Repres...
Efficient Computation of Regret-ratio Minimizing Set: A Compact Maxima Repres...
 
The Concurrent Constraint Programming Research Programmes -- Redux (part2)
The Concurrent Constraint Programming Research Programmes -- Redux (part2)The Concurrent Constraint Programming Research Programmes -- Redux (part2)
The Concurrent Constraint Programming Research Programmes -- Redux (part2)
 
Computer Graphics Unit 1
Computer Graphics Unit 1Computer Graphics Unit 1
Computer Graphics Unit 1
 
Tensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantificationTensor Train data format for uncertainty quantification
Tensor Train data format for uncertainty quantification
 
Efficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketchingEfficient anomaly detection via matrix sketching
Efficient anomaly detection via matrix sketching
 

More from The Statistical and Applied Mathematical Sciences Institute

More from The Statistical and Applied Mathematical Sciences Institute (20)

Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
Causal Inference Opening Workshop - Latent Variable Models, Causal Inference,...
 
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
2019 Fall Series: Special Guest Lecture - 0-1 Phase Transitions in High Dimen...
 
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
Causal Inference Opening Workshop - Causal Discovery in Neuroimaging Data - F...
 
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
Causal Inference Opening Workshop - Smooth Extensions to BART for Heterogeneo...
 
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
Causal Inference Opening Workshop - A Bracketing Relationship between Differe...
 
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
Causal Inference Opening Workshop - Testing Weak Nulls in Matched Observation...
 
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...Causal Inference Opening Workshop - Difference-in-differences: more than meet...
Causal Inference Opening Workshop - Difference-in-differences: more than meet...
 
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
Causal Inference Opening Workshop - New Statistical Learning Methods for Esti...
 
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
Causal Inference Opening Workshop - Bipartite Causal Inference with Interfere...
 
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
Causal Inference Opening Workshop - Bridging the Gap Between Causal Literatur...
 
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
Causal Inference Opening Workshop - Some Applications of Reinforcement Learni...
 
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
 
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
Causal Inference Opening Workshop - Assisting the Impact of State Polcies: Br...
 
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
Causal Inference Opening Workshop - Experimenting in Equilibrium - Stefan Wag...
 
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
Causal Inference Opening Workshop - Targeted Learning for Causal Inference Ba...
 
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
Causal Inference Opening Workshop - Bayesian Nonparametric Models for Treatme...
 
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
2019 Fall Series: Special Guest Lecture - Adversarial Risk Analysis of the Ge...
 
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
2019 Fall Series: Professional Development, Writing Academic Papers…What Work...
 
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
2019 GDRR: Blockchain Data Analytics - Machine Learning in/for Blockchain: Fu...
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 

Recently uploaded

fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 

Recently uploaded (20)

fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 

2018 MUMS Fall Course - Gaussian Processes and Statistic Emulators (EDITED) - Jim Berger , October 23, 2018

  • 1. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Lectures 8-9: Gaussian Processes and Statistical Emulators 1
  • 2. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Outline • Motivation for emulation of simulators • Design for simulator runs • Some emulation strategies • Gaussian process (GaSP) emulators • Fitting GaSP emulators and the Robust R-package • The differences between use of Gaussian processes in Spatial Statistics and UQ • Emulation in more complicated situations – Functional emulators via Kronecker products – Functional emulators via basis decomposition – Functional emulators via parallel partial emulation – Coupling emulators 2
  • 3. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Crucial assumption: I will be assuming the “black box” scenario, where all we can do is run the simulator at various inputs; i.e., we do not have access to the internal code. 3
  • 4. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Motivations for emulation (approximation) of complex computer models Simulators (complex computer models of processes) often take hours to weeks for a single run. One often needs fast simulator approximations (emulators, surrogate models, meta models, response surface approximations, . . .) for Uncertainty Quantification analyses such as • prediction of the simulator at unobserved values of the inputs • optimization of the simulator over input values • inverse problems (learning unknown simulator parameters from data) • propagating uncertainty in inputs through the simulator • data assimilation (predicting reality with a combination of simulator output and observational data) • assessing simulator bias and detecting suspect simulator components • interfacing systems of simulators (or systems of simulators/stat models) 4
  • 5. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Statistical Design of Runs of the Simulator (needed to create the emulator) McKay et al. (1979); Sacks et al. (1989); Welch et al. (1992); Bates et al. (1996); Lim et al. (2002); Santner et al. (2003) Notation: For these lectures, x ∈ X will denote the d-dimensional vector of inputs to the simulator (computer model). These could be initial conditions, control inputs, computer model parameters, ... The simulator output is denoted by yM (x). Goal for design: Choose m points D = {x1, .., xm} at which the simulator is to be evaluated, yielding yD = (yM (x1), . . . , yM (xm))′ . From these, the emulator (approximation) to the simulator will be constructed. – Folklore says that m should be at least 10d although many more runs are often needed (but often not available). Criterion: In general, should be problem specific. General purpose criteria involve finding “space-filling” designs. 5
  • 6. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Most common space filling design: Maximin Latin Hypercube Design • A Latin Hypercube Design (LHD) is a design in a grid whereby each sample is the only one in each axis-aligned hyperplane containing it. • A maximin LHD is an LHD that maximizes mini=j δ(xi, xj), where δ(·, ·) is a distance on X. Figure 1: Left: 47 point maximin LHD; d = 6; 2d-projection; Right: 47 point “0-Correlation” LHD; same 2d projection. 6
  • 7. SAMSI Fall,2018 ✬ ✫ ✩ ✪ TITAN2D simulating pyroclastic flow on Montserrat (Bayarri et al. (2009); Dalbey et al. (2012)) Inputs: Initial conditions: x1 = flow volume (V ); x2 = flow direction (ϕ); Model parameters: x3 = basal friction (δbed); x4 = internal friction (δint). 7
  • 8. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Background of the application: The simulator, TITAN2D, for given inputs V , ϕ, δbed, and δint, yields a description of the pyroclastic flow over a large space-time grid of points. Each run of TITAN2D takes two hours. Of primary interest is the maximum (over time) flow height, yM (V, ϕ, δbed, δint), at k spatial locations over the island. • Flow heights yM (V, ϕ, δbed, δint) > 1m are deemed to be catastrophic. The analysis begins • by choosing a Latin hypercube design to select m = 2048 design points in the feasible input region X = [105 m3 , 109.5 m3 ] × [0, 2π] × [5, 18] × [15, 35]; • running TITAN2D at these preliminary points, yielding the ‘data’ yD =         yM (x1) yM (x2) ... yM (xm)         =         (yM 1 (x1), yM 2 (x1), . . . yM k (x1)) (yM 1 (x2), yM 2 (x2), . . . yM k (x2)) ... (yM 1 (xm), yM 2 (xm), . . . yM k (xm))         ; • constructing the emulator from yD (in general, a matrix of size 2048 × 109 ). 8
  • 9. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Adaptive design: (Aslett et al. (1998); Lam and Notz (2008); Ranjan et al. (2008); Cumming and Goldstein (2009); Gramacy and Lee (2009); Loeppky et al. (2010); Spiller et al. (2014)). 1 2 3 4 5 6 0 0.05 0.1 0.15 0.2 0.25 0.3 Initiation angle, (radians) standarderror,(meters) Figure 2: Standard error of the emulator of a function of interest in the volcano problem. Red: original Latin hypercube design of 256 points. Blue: standard error with 9 additional points chosen to maximize the resulting information. 9
  • 10. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Some Emulation Strategies Recall the goal: We want to develop an emulator (approximation) of the computer model that allows us to predict the computer model outcome yM (x∗ ) at new inputs x∗ . This can be done by • Regression: fine if the simulator output is smooth enough and – one knows good basis functions to regress upon, – or can find good basis functions by, e.g., PCA (or POD or EOD). ∗ This often works, but is often suboptimal (e.g., for TITAN2D). • Polynomial chaos: statisticians question its effectiveness as a general tool for emulation (it fails for TITAN2D and many other processes). • Gaussian stochastic processes (GaSP’s), of a variety of types, typically what are called separable GaSP’s. • Combinations of the above (e.g. GaSP’s on coefficients of basis expansions (Bayarri et al. (2007); Bowman and Woods (2016)). 10
  • 11. SAMSI Fall,2018 ✬ ✫ ✩ ✪ An Aside: the Multivariate Normal Distribution If Y = (Y1, Y2, . . . , Ym) has a multivariate normal distribution with mean µ = (µ1, µ2, . . . , µm) and m × m positive definite covariance matrix Σ having entries σi,j (notation: Y ∼ MV N(µ, Σ)), then • each Yi is marginally normally distributed with – mean E[Yi] = µi, – variance E[(Yi − µi)2 ] = σi,i; • σi,j = E[(Yi − µi)(Yj − µj)] is called the covariance between Yi and Yj; • ci,j = σi,j √ σi,iσj,j is called the correlation between Yi and Yj. 11
  • 12. SAMSI Fall,2018 ✬ ✫ ✩ ✪ GaSP Emulators Model the real-valued for now simulator output yM (x) as an unknown function via a Gaussian stochastic process: yM (·) ∼ GaSP( µ(·), σ2 c (·, ·)), with mean function µ(x), variance σ2 , and correlation function c (·, ·) if, for any inputs {x1, . . . , xm} from X, (yM (x1), . . . , yM (xm)) ∼ MV N (µ(x1), . . . , µ(xm)), σ2 C , (1) where C is the correlation matrix with (i, j) element ci,j = c (xi, xj). • This is a random distribution on functions of x. • All we really need to know is the induced MV N distribution in (1) of the function evaluated at a finite set of points. 12
  • 13. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Common is to choose the following forms for the mean and correlations: • Model the unknown mean function via regression, as µ(x) = Ψ(x) θ ≡ l i=1 ψi(x)θi , Ψ(·) = (ψ1(·), . . . , ψl(·)) being a vector of specified basis functions and the θi being unknown (e.g., µ(V, ϕ, δbed, δint) = θ1 + θ2V for TITAN2D). • As the correlation function arising from the d-dimensional x, utilize the separable power exponential family c (x, x∗ ) = d j=1 exp {−(|xj − xj ∗ |/γj)αj }; – γj > 0 determines how fast the correlation decays to 0 – αj ∈ (0, 2] determines continuity, differentiability, . . . ∗ We set the αj = 1.9. (αj = 2 can have numerical problems.) – the product form greatly speeds computation and allows stochastic inputs to be handled easily. 13
  • 14. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Example: Suppose d = 2 and m = 2, so we have two designed inputs x1 = (x11, x12) and x2 = (x21, x22). The correlation matrix for (yM (x1), yM (x2)) is then C =   1 e−[|x11−x21|/γ1]α1 e−[|x12−x22|/γ2]α2 e−[|x11−x21|/γ1]α1 e−[|x12−x22|/γ2]α2 1   . • As, say, γ1 → ∞, C →   1 e−[|x12−x22|/γ2]α2 e−[|x12−x22|/γ2]α2 1   . Typically the emulator will then be constant in the first coordinate. • If any γi → 0, C →   1 0 0 1   , which gives a terrible emulator. 14
  • 15. SAMSI Fall,2018 ✬ ✫ ✩ ✪ • After obtaining the computer runs yD at the design points, the traditional strategy was to estimate the GaSP parameters by maximum likelihood using this data, and then use the standard Kriging formula for the emulator predictive mean and variance. – Maximum likelihood (least squares fit) is not a good idea here; too often the range parameters end up being 0 or ∞. Figure 3: GaSP mean and 90% confidence bands fit by maximum likelihood to a damped sine wave (the red curve) for m=10 (left) and m=9 (right) points. 15
  • 16. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Advantages of the GaSP Emulator • It is an interpolator of the simulator values yM (xi) at the design inputs xi. • It provides an assessment of the accuracy of the approximation, which is quite reliable (in a conservative sense) when it is not crazy. • The separable form properly allows very different fits to the various inputs. • The analysis stays within probability (Bayesian) calculus. Disadvantages of the GaSP Emulator • Maximum likelihood is very unreliable (fixable, as we will see). • It requires inversion of an m × m matrix, requiring special techniques if m is large (lots of research on this). • It is a stationary process and, hence, not always suitable as an emulator. 16
  • 17. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Improving on maximum likelihood (least squares) estimation of the unknown GaSP parameters (Lopes (2011); Ranjan and Karsten (2011); Roustant et al. (2012); Gu et al. (2016, 2018)). Step 1. Deal with the crucial parameters (θ, σ2 ) via a fully Bayesian analysis, using the objective prior: π(θ, σ2 ) = 1/σ2 . • Alas, dealing with the correlation parameters by full Bayesian methods is computationally intractable. Step 2. Estimate γ = (γ1, . . . , γd) as the mode ˆγ of its marginal posterior distribution arising • by integrating out θ and σ2 with respect to their objective prior; • multiplying the resulting integrated likelihood by the reference prior for γ (the most popular objective Bayesian prior). Step 3. The resulting GaSP emulator of yM (x∗ ) at new input x∗ is a t-process with mean function and covariance function in closed form. 17
  • 18. SAMSI Fall,2018 ✬ ✫ ✩ ✪ The GaSP emulator, yM (x∗ ), of yM (x∗ ) at new input x∗ is a t-distribution yM (x∗ ) ∼ T ˆµ(x∗ ) , ˆσ2 V (x∗ ) , m − l , where ˆµ(x∗ ) = Ψ(x∗ )ˆθ + C(x∗ )C−1 (yD − Ψˆθ) ˆσ2 = 1 m − l yD′ C−1 yD − ˆθ ′ (Ψ′ C−1 Ψ)ˆθ V (x∗ ) = 1 − C(x∗ )C−1 C(x∗ )′ + (Ψ(x∗ ) − C(x∗ )C−1 Ψ)(Ψ′ C−1 Ψ)−1 (Ψ(x∗ ) − C(x∗ )C−1 Ψ)′ , with ˆθ = (Ψ′ C−1 Ψ)−1 Ψ′ C−1 yD , Ψ = (Ψj(xi)), Ψ(x∗ ) = (Ψ1(x∗ ), . . . , Ψl(x∗ )), and C(x∗ ) = (c(x1, x∗ ), . . . , c(xm, x∗ )). • Not the usual Kriging formula because of use of posterior for (θ, σ2 ). • This is an interpolator of the simulator values yM (xi) at the design inputs xi. • It provides an assessment of the accuracy of the approximation, also incorporating the uncertainty arising from estimating θ and σ2 . • The only potential computational challenge is computing C−1 if m is very large. 18
  • 19. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Figure 4: Mean of the emulator of TITAN2D, predicting ‘maximum flow height’ at a location, as a function of flow volume and angle, for fixed δbed = 15 and δint = 27. Left: Plymouth, Right: Bramble Airport. Black points: max-height simulator outputs at design points. 19
  • 20. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Details of Steps 1 and 3 • Note that (yM (x1), . . . , yM (xm), yM (x∗ )) ∼ MV N (µ(x1), . . . , µ(xm), µ(x∗ )), σ2 C∗ , where C∗ =   C C(x∗ )′ C(x∗ ) 1   . • Multiplying this by π(θ, σ2 ) = 1/σ2 gives the joint density of (yM (x1), . . . , yM (xm), yM (x∗ ), θ, σ2 ) • Compute the conditional density of (yM (x∗ ), θ, σ2 ) given (yM (x1), . . . , yM (xm)). • Integrate out θ and σ2 to obtain the posterior predictive density ˜yM (x∗ ) of the target yM (x∗ ). 20
  • 21. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Details of Versions of Step 2 Version 1. Finding the Marginal Maximum Likelihood Estimate (MMLE) of the correlation parameters γ = (γ1, . . . , γd): • Starting with the likelihood L(θ, σ2 , γ) arising from (yM (x1), . . . , yM (xm)) ∼ MV N (µ(x1), . . . , µ(xm)), σ2 C , integrate out θ and σ2 , using the objective prior π(θ, σ2 ) = 1/σ2 , obtaining the marginal likelihood for γ L(γ) = L(θ, σ2 , γ) 1 σ2 dθ dσ2 ∝ |C(γ)|− 1 2 |X′ C(γ)−1 X|− 1 2 (S2 (γ))−( n−p 2 ) , where S2 (γ) = (Y − Xˆθ)′ C(γ)−1 (Y − Xˆθ) is the residual sum of squares and ˆθ = (X′ C(γ)−1 X)−1 X′ C(γ)−1 Y is the least squares estimator of θ,. • The MMLE estimate is that which maximizes L(γ). 21
  • 22. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Definition 0.1 (An Aside: Robust Estimation.) Estimation of the correlation parameters in the GaSP is called robust, if the following two situations do NOT happen: (i) ˆC = 1n1T n , (ii) ˆC = In, (even approximately), where ˆC is the estimated correlation matrix. When ˆC ≈ 1n1T n , the correlation matrix is almost singular, leading to very large computational errors in the GaSP predictive mean. When ˆC ≈ In, the GaSP predictive mean will degenerate to the fitted mean and impulse functions, as shown in the next figures. 22
  • 23. SAMSI Fall,2018 ✬ ✫ ✩ ✪ 0.0 0.2 0.4 0.6 0.8 1.0 −3−2−10123 x y 0.0 0.2 0.4 0.6 0.8 1.0 −3−2−10123 x y Example of the problem when ˆC ≈ In: Emulation of the function y = 3sin(5πx)x + cos(7πx), graphed as the black solid curves (overlapping the green curves in the left panel). The n = 12 input function values are the black circles. The left panel is for α = 1.9 and the right panel for α = 1, for the power exponential correlation function. • The blue curves give the emulator mean from the MLE approach; • the red curves (overlapping with green on left) give the emulator mean from the MMLE approach; • the green curves give the emulator mean from the posterior mode approach. 23
  • 24. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Here are three common ways of parameterizing the range parameters in power exponential correlation function: cβl (|xil − xjl|) = exp{−βl|xil − xjl|αl }, cγl (|xil − xjl|) = exp{−(|xil − xjl|/γl)αl }, cξl (|xil − xjl|) = exp {− exp(ξl)|xil − xjl|αl } , for any l = 1, · · · , d. Lemma 0.1 Robustness is lacking in either of the following two cases. Case 1. If for all 1 ≤ l ≤ d, ˆβl = 0 (or ˆγl = ∞ or ˆξl = −∞ in the other parameterizations), then ˆC = 1m1T m. Case 2. If any ˆβl = ∞ (equivalent to ˆγl = 0 or ˆξl = ∞), then ˆC = Im. 24
  • 25. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Version 2. Finding the Reference Posterior Mode (RPM) of the correlation parameters: The standard objective prior (the reference prior) for β (Paulo (2005)) is πR (β) ∝ |I⋆ (β)|1/2 where, with l being the dimension of θ and d the dimension of β, I⋆ (β) =         (m − l) trW1 trW2 · · · trWd trW2 1 trW1W2 · · · trW1Wd ... · · · ... trW2 d         and Wk = ∂C ∂βk C(β)−1 [In − X(X′ C(β)−1 X)−1 X′ C(β)−1 ] . The posterior mode is then found by maximizing • L(ψ−1 (β)) πR (β) in the β parameterization, where β = ψ(γ); • L(ψ−1 (exp(ξ))) πR (exp(ξ)) exp( l ξl) in the ξ parameterization; • L(γ) πR (ψ(γ)) ψ′ (γ) in the γ parameterization. 25
  • 26. SAMSI Fall,2018 ✬ ✫ ✩ ✪ 1n is a column of X 1n is not a column of X some βl → ∞ βl → 0 for all l some βl → ∞ βl → 0 for all l Profile Lik O(1) O(γ −α/2 (1) ) O(1) O(γ −α/2 (1) ) Marginal Lik O(1) O(1) O(1) O((γ −α/2 (1) )) Post β, p = 1 O(exp(−βC)) O(1) O(β 1 2 exp(−βC)) O(β−1/2 ) p ≥ 2 O( l∈E exp(−βlCl)) O(β −(p−1) (p) ) O(( l∈E βl) 1 2 p l=1 exp(−βl)Cl) O(β −(p−1/2) (p) ) Post γ, p = 1 O(exp(−C/γα) γ(α+1) ) O(γ−α−1 ) O(exp(−C/γα) γ(α/2+1) ) O(γ−α/2−1 ) p ≥ 2 O( l∈E exp(−Cl/γα l ) γ (α+1) l ) O( p l=1 γ−α−1 l γ (1−p)α (1) ) O( l∈E exp(−Cl/γα l ) γ (α/2+1) l ) O( p l=1 γ−α−1 l γ (1/2−p)α (1) ) Post ξ, p = 1 O(exp(− exp(ξ)C + ξ)) O(exp(ξ)) O(exp(− exp(ξ)C + 3 2 ξ)) O(exp(ξ/2)) p ≥ 2 O( l∈E exp(− exp(ξl)Cl + ξl)) O( exp( p−1 l=1 ξl) exp((p−2)ξ(p)) ) O( l∈E exp(− exp(ξl)Cl) + 3 2 ξl) O( exp( p−1 i=1 ξl) exp((p−1/2)ξ(p)) ) Tail behaviors of the profile likelihood (insert the MLE’s of θ and σ2 ), the marginal likelihood and the posterior distributions for different parameterizations of the power exponential correlation function, using the reference prior. • Blue gives the cases where the tail behavior is constant, so that there is danger of non-robustness (the mle could be at ∞). • Red gives the non-robust cases where the posterior goes to infinity in the tail. • Thus use the posterior mode for either the γ or ξ parameterizations. 26
  • 27. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Another Improvement One of the most frequently used Mat´ern correlation functions is cl(dl) = 1 + √ 5dl γl + 5d2 l 3γ2 l exp − √ 5dl γl , where dl stands for any of the |xil − xjl|. Denoting ˜dl = dl/γl, the following properties can be established. • When ˜dl → 0, cl( ˜dl) ≈ 1 − C ˜d2 l with C > 0 being a constant. This thus behaves similarly to exp(− ˜d2 l ) ≈ 1 − ˜d2 l , which corresponds to the power exponential correlation with αl = 2, and thus has similar smoothness near design points. • When ˜dl → ∞, the dominant part of cl( ˜dl) is exp − √ 5 ˜dl which matches the power exponential correlation with αl = 1. Thus the Mat´ern correlation prevents the correlation from decreasing quickly with distance, as does the Gaussian correlation. This can be of benefit in emulation since some inputs may have almost no effect on the computer model, corresponding to near constant correlations for distant inputs. 27
  • 28. SAMSI Fall,2018 ✬ ✫ ✩ ✪ We test the following five functions: i. 1 dimensional Higdon function, Y = sin(2πX/10) + 0.2 sin(2πX/2.5), where X ∈ [0, 10]. ii. 2 dimensional Lim function, Y = 1 6 [(30 + 5X1 sin(5X1))(4 + exp(−5X2)) − 100] + ǫ, where Xi ∈ [0, 1], for i = 1, 2. iii. 3 dimensional Pepelyshev function, Y = 4(X1 − 2 + 8X2 − 8X2 2 )2 + (3 − 4X2)2 + 16 √ X3 + 1(2X3 − 1)2 , where Xi ∈ [0, 1], for i = 1, 2, 3. iv. 4 dimensional Park function, Y = 2 3 exp(X1 + X2) − X4 sin(X3) + X3, where Xi ∈ [0, 1), for i = 1, 2, 3, 4. v. 5 dimensional Friedman function from, Y = 10 sin(πX1X2) + 20(X3 − 0.5)2 + 10X4 + 5X5, where Xi ∈ [0, 1], for i = 1, 2, 3, 4, 5. 28
  • 29. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Robust GaSP ξ Robust GaSP γ MLE DiceKriging 1-dim Higdon .00011 .00012 .00013 .00013 2-dim Lim .0064 .0080 .021 .0083 3-dim Pepelyshev .083 .15 3.5 .79 4-dim Park .00011 .00011 .033 .00063 5-dim Friedman .026 .038 4.7 .44 Table 1: Average MSE of the four estimation procedures for the five exper- imental functions. The sample size is n = 20 for the Higdon function and n = 10p for the others. Designs are generated by maxmin LHD. This suggests that optimal is to use the posterior mode in the ξ parameterization, with the Matern correlation function. 29
  • 30. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: four panels of scatter plots of MSE difference versus design index.] Difference of MSE for the MLE GaSP and robust GaSP (ξ parameterization) for each of N = 500 designs for the Lim function (upper left), Pepelyshev function (upper right), Park function (lower left) and Friedman function (lower right). 30
  • 31. SAMSI Fall,2018 ✬ ✫ ✩ ✪ The Jointly Robust Prior. Evaluation of the reference prior (especially computation of the needed derivatives) and, hence, determination of the posterior mode can be somewhat costly if p is large. An approximation to the reference prior that has the same tail behavior in terms of robustness is
$$\pi^{JR}(\beta) = \Big(\sum_{l=1}^{p} C_l\beta_l\Big)^{a}\, \exp\Big(-b\sum_{l=1}^{p} C_l\beta_l\Big),$$
where $a = 0.2$, $b = n^{-1/p}(a + p)$, and $C_l$ equals the mean of $|x^D_{il} - x^D_{jl}|$ over $1 \le i, j \le n$, $i \ne j$. 31
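To make the formula concrete, here is a minimal R sketch (mine, not from the slides; the function name jr_prior is illustrative) of evaluating the unnormalized jointly robust prior at a given β for a design matrix xD:

jr_prior <- function(beta, xD, a = 0.2) {
  n <- nrow(xD); p <- ncol(xD)
  # C_l: mean of |x_il - x_jl| over all pairs i != j
  C <- apply(xD, 2, function(col) {
    d <- abs(outer(col, col, "-"))
    sum(d) / (n * (n - 1))      # diagonal terms are zero
  })
  b <- n^(-1/p) * (a + p)
  s <- sum(C * beta)
  s^a * exp(-b * s)             # unnormalized density
}

# Example on a random 10-point design in [0,1]^2:
set.seed(1)
xD <- matrix(runif(20), nrow = 10)
jr_prior(c(1, 1), xD)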
  • 32. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Software. All the above methodology has been implemented in RobustGaSP (Robust Gaussian Stochastic Process Emulation in R) by Mengyang Gu and Jesus Palomo. • There are many choices of the correlation function (Matérn is the default). • Choice between the reference prior and its jointly robust approximation (the approximation is the default). • Inclusion of a nugget is possible, with the resulting prior changed appropriately. • There is also the capability of identifying and removing 'inert' inputs. A short usage sketch follows. 32
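A minimal usage sketch (mine; argument names follow the package documentation, and defaults may differ across versions):

library(RobustGaSP)

set.seed(1)
design   <- matrix(runif(30 * 2), ncol = 2)   # 30 runs, d = 2 inputs
response <- apply(design, 1, function(x) sin(2 * pi * x[1]) + x[2]^2)

# Fit the emulator; the jointly robust prior is the package default.
model <- rgasp(design = design, response = matrix(response))

# Predict at new inputs, with emulator uncertainty.
x_new <- matrix(runif(10 * 2), ncol = 2)
pred  <- predict(model, x_new)
pred$mean; pred$sd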
  • 33. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Differences between use of Gaussian processes in Spatial Statistics and UQ • Spatial statistics typically has only d = 2 or d = 3. • Typically there are many fewer 'observations' in UQ, exacerbated by the larger d. – This makes estimation of the range parameters γ much more difficult than in spatial statistics. – But, luckily, the UQ design points are typically spread out, making the α correlation parameters less important, compared with their importance in spatial statistics. • Spatial processes are often (though not always) smoother than UQ processes. • Instead of the product correlation structure, spatial statistics often uses correlations such as $c(x, y) = e^{-|x-y|^{\alpha}/\gamma}$. 33
  • 34. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Emulation in more complicated situations • Functional emulators via Kronecker products • Functional emulators via basis decomposition • Functional emulators via parallel partial emulation • Coupling emulators 34
  • 35. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Functional emulators via Kronecker products Example: A Vehicle Crash Model. Collision of a vehicle with a barrier is implemented as a non-linear dynamic analysis code using a finite element representation of the vehicle. The focus is on velocity changes of the driver’s head, as in the following 30mph crash. There are d = 2 inputs to the simulator: crash barrier type B and crash vehicle velocity v. 35
  • 36. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Obvious approach – Discretize: Sample the m output functions from the simulator (arising from the design on input space) at a discrete number $n_T$ of time points and use t as just another input to the emulator. • Only feasible if the functions are fairly regular. • There are now $m \times n_T$ inputs, so computing $C^{-1}$ might be untenable. – But Kronecker products come to the rescue. 36
  • 37. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Example: Suppose m = 2 and the original design inputs were $x_1$ and $x_2$. Also suppose we discretize at $t_1$ and $t_2$. Then there are four modified inputs $\{(x_1, t_1), (x_1, t_2), (x_2, t_1), (x_2, t_2)\}$ with correlation matrix (assuming the product exponential form and setting the correlation parameters to 1 for simplicity)
$$\begin{pmatrix} 1 & e^{-|t_1-t_2|} & e^{-|x_1-x_2|} & e^{-|x_1-x_2|}e^{-|t_1-t_2|} \\ e^{-|t_1-t_2|} & 1 & e^{-|x_1-x_2|}e^{-|t_1-t_2|} & e^{-|x_1-x_2|} \\ e^{-|x_1-x_2|} & e^{-|x_1-x_2|}e^{-|t_1-t_2|} & 1 & e^{-|t_1-t_2|} \\ e^{-|x_1-x_2|}e^{-|t_1-t_2|} & e^{-|x_1-x_2|} & e^{-|t_1-t_2|} & 1 \end{pmatrix} = \begin{pmatrix} 1 & e^{-|x_1-x_2|} \\ e^{-|x_1-x_2|} & 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & e^{-|t_1-t_2|} \\ e^{-|t_1-t_2|} & 1 \end{pmatrix},$$
where ⊗ denotes the Kronecker product of the two matrices. 37
  • 38. SAMSI Fall,2018 ✬ ✫ ✩ ✪ In general, if $C_D$ is the correlation matrix arising from the original designed input to the GaSP, and $C_T$ is the correlation matrix that arises from the discretization of time, then using the combined input results in the overall correlation matrix $C = C_D \otimes C_T$. The wonderful thing about Kronecker products is
$$C^{-1} = (C_D \otimes C_T)^{-1} = C_D^{-1} \otimes C_T^{-1}, \qquad |C| = |C_D \otimes C_T| = |C_D|^{n_T}\,|C_T|^{m}.$$ 38
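A quick numerical check of these two identities (a sketch of mine; R's %x% operator computes the Kronecker product):

m <- 3; nT <- 4
xs <- sort(runif(m)); ts <- sort(runif(nT))
CD <- exp(-abs(outer(xs, xs, "-")))   # m x m correlation in x
CT <- exp(-abs(outer(ts, ts, "-")))   # nT x nT correlation in t
C  <- CD %x% CT                       # Kronecker product

max(abs(solve(C) - solve(CD) %x% solve(CT)))   # ~ 0, up to rounding
det(C) - det(CD)^nT * det(CT)^m                # ~ 0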
  • 39. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Functional emulators via basis decomposition Example: Consider a vehicle being driven over a road with two major potholes. – $x = (x_1, \dots, x_7)$ is the vector of key vehicle characteristics; – $y^R(x; t)$ is the time-history curve of resulting forces. A finite element PDE computer model of the vehicle being driven over the road – depends on $x = (x_1, \dots, x_7)$ and unknown calibration parameters $u = (u_1, u_2)$; – yields the time-history force curve $y^M(x, u; t)$. 39
  • 40. SAMSI Fall,2018 ✬ ✫ ✩ ✪
Parameter                          | Type (label)     | Uncertainty
Damping 1 (force dissipation)      | Calibration (u1) | 15%
Damping 2 (force dissipation)      | Calibration (u2) | 15%
Bushing Stiffness (Voided)         | Unmeasured (x1)  | 15%
Bushing Stiffness (Non-Voided)     | Unmeasured (x2)  | 10%
Front rebound travel until contact | Unmeasured (x3)  | 5%
Front rebound bumper stiffness     | Unmeasured (x4)  | 8%
Sprung Mass                        | Unmeasured (x5)  | 5%
Unsprung Mass                      | Unmeasured (x6)  | 12%
Body Pitch Inertia                 | Unmeasured (x7)  | 12%
Table 2: I/U Map: Vehicle characteristics ('unmeasured') and model calibration inputs, and their prior uncertainty ranges. 40
  • 41. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Field Data: Seven runs of a given test vehicle over the same road containing two potholes. Denote the r-th field time-history curve by $y^F_r(x^*; t)$, $r = 1, \dots, 7$, where $x^* = (x^*_1, \dots, x^*_7)$ refers to the unknown vehicle characteristics of the given test vehicle. Model Data: The computer model of the vehicle was 'run over the potholes' at 65 input values of $z = (x, u) = (x_1, \dots, x_7, u_1, u_2)$; let $z_r = (x_r, u_r)$, $r = 1, \dots, 65$, denote the corresponding parameter vectors, which were chosen by a Latin hypercube design over the input uncertainty ranges. Let $y^M(z_r; t)$ denote the r-th computer model time-history curve, $r = 1, \dots, 65$. 41
  • 42. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: panels of tension (N) versus distance (m), Channel 45, showing the seven field runs and the 65 model runs for Pothole 1 and Pothole 2.] Figure 5: Forces from field and model data for two potholes. 42
  • 43. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Wavelet Representation of the Curves: We view $y^F_r(x^*; t)$, $y^M(z_r; t)$, and $y^R(x^*; t)$ as random functions, and must ultimately perform a Bayesian analysis with these random functions. To do this, we used a wavelet representation of the functions as follows. • Each curve was replaced with its values on a dyadic grid with $2^{12} = 4{,}096$ points, so that the number of resolution levels associated with the wavelet decomposition is L = 13 (including the mean of each curve as being at level 0). • The R wavethresh package was used to obtain the decomposition, with thresholding of coefficients at the fourth and higher levels whose absolute value was below the 0.975 quantile of the absolute values of the wavelet coefficients in that level. • The union, over all curves (field and model), of the retained wavelet basis elements was taken as the final basis, yielding a total of 289 basis elements, $\psi_i(t)$, $i = 1, \dots, 289$. 43
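A sketch (mine) of this wavelet step with the wavethresh package: decompose a curve sampled on a dyadic grid and zero out small coefficients level by level, keeping the survivors as basis elements. The curve y is a stand-in, and the exact filter and thresholding choices of the original analysis are not recorded on the slide.

library(wavethresh)

n <- 2^12                                # dyadic grid, 4096 points
t <- seq(0, 1, length.out = n)
y <- sin(8 * pi * t) * exp(-3 * t)       # stand-in 'force curve'

wds <- wd(y)                             # discrete wavelet transform
for (lev in 4:(nlevelsWT(wds) - 1)) {    # fourth and higher levels
  d   <- accessD(wds, level = lev)
  cut <- quantile(abs(d), 0.975)
  d[abs(d) < cut] <- 0                   # drop small coefficients
  wds <- putD(wds, level = lev, v = d)
}
y_approx <- wr(wds)                      # reconstruct from kept coefficients
max(abs(y - y_approx))                   # reconstruction error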
  • 44. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Thus the 65 model response curves and 7 field response curves are represented as
$$y^M(z_j; t) = \sum_{i=1}^{289} w^M_i(z_j)\,\psi_i(t), \quad j = 1, \dots, 65,$$
$$y^F_r(x^*; t) = \sum_{i=1}^{289} w^F_{ir}(x^*)\,\psi_i(t), \quad r = 1, \dots, 7,$$
where the $w^M_i(z_j)$ and $w^F_{ir}(x^*)$ are the coefficients computed through the wavelet decomposition. 44
  • 45. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: the first field run versus its wavelet reconstruction, Ch45, Potholes 1 and 2; tension (N) versus distance (m).] Figure 6: The accuracy of the wavelet decomposition. 45
  • 46. SAMSI Fall,2018 ✬ ✫ ✩ ✪ GaSP Approximation to each of the Model Wavelet Coefficient Functions $w^M_i(z)$: Formally (and dropping the index i for convenience)
$$w^M(\cdot) \sim \mathrm{GaSP}\Big(\mu,\; \tfrac{1}{\lambda^M}\, c^M(\cdot, \cdot)\Big)$$
with the usual separable correlation function, the parameters $\alpha_j$ and $\beta_j$ being estimated by maximum likelihood (recall there are 289 pairs of them), as well as $\mu$ and $\lambda^M$. The posterior of the i-th coefficient, given GaSP parameters and model-run data, at a new input $z^*$, is
$$\tilde w^M_i(z^*) \sim N\big(\hat\mu_i(z^*),\, \hat V_i(z^*)\big),$$
where $\hat\mu_i(z^*)$ and $\hat V_i(z^*)$ are given by the usual kriging expressions. Thus the overall emulator is (assuming that the $\tilde w^M_i$ are independent)
$$\tilde y^M(z^*; t) \sim N\Big(\sum_{i=1}^{289}\hat\mu_i(z^*)\,\psi_i(t),\; \sum_{i=1}^{289}\hat V_i(z^*)\,\psi_i^2(t)\Big).$$ 46
  • 47. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Bayesian comparison of model and field data: calibration/tuning; estimation of bias; and development of tolerance bands for prediction. For the i-th wavelet coefficient, the computer model is related to reality via
$$w^R_i(x^*) = w^M_i(x^*, u^*) + b_i(x^*),$$
where $(x^*, u^*)$ are the true (but unknown) values of the field vehicle inputs and model calibration parameters, respectively. Hence $b_i(x^*)$ is here just an unknown constant. (As usual, there is inevitable confounding between $u^*$ and the $b_i$, but prediction is little affected.) 47
  • 48. SAMSI Fall,2018 ✬ ✫ ✩ ✪ The replicated field data is modeled as
$$w^F_{ir}(x^*) = w^R_i(x^*) + \epsilon_{ir} = w^M_i(x^*, u^*) + b_i(x^*) + \epsilon_{ir}, \quad r = 1, \dots, 7,$$
where the $\epsilon_{ir}$ are i.i.d. $N(0, \sigma^2_i)$ random errors. The sufficient statistics,
$$\bar w^F_i(x^*) = \frac{1}{7}\sum_{r=1}^{7} w^F_{ir}(x^*) \quad\text{and}\quad S^2_i(x^*) = \sum_{r=1}^{7}\big(w^F_{ir}(x^*) - \bar w^F_i(x^*)\big)^2,$$
then have distributions
$$\bar w^F_i(x^*) \sim N\Big(w^M_i(x^*, u^*) + b_i(x^*),\; \frac{\sigma^2_i}{7}\Big), \qquad S^2_i(x^*)/\sigma^2_i \sim \text{Chi-Square}(6),$$
which we assume to be independent across i. 48
  • 49. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Prior Distributions:
$$\pi(b, x^*, u^*, \sigma^2, \tau^2) = \pi(b \mid x^*, u^*, \sigma^2, \tau^2)\,\pi(\tau^2 \mid x^*, u^*, \sigma^2)\,\pi(\sigma^2 \mid x^*, u^*)\,\pi(x^*, u^*),$$
where $b$, $\sigma^2$ and $\tau^2$ refer to the vectors of the $b_i$, $\sigma^2_i$ and $\tau^2_j$; and, with $\bar\sigma^2_j$ denoting the average of the $\sigma^2_i$ at wavelet level j,
$$\pi(b \mid x^*, u^*, \sigma^2, \tau^2) = \prod_{i=1}^{289} N(b_i \mid 0, \tau^2_j)$$
(we did also try Cauchy and mixture priors, to some benefit),
$$\pi(\tau^2 \mid x^*, u^*, \sigma^2) \propto \prod_{j=0}^{12} \frac{1}{\tau^2_j + \bar\sigma^2_j/7}, \qquad \pi(\sigma^2 \mid x^*, u^*) \propto \prod_{i=1}^{289} \frac{1}{\sigma^2_i}.$$ 49
  • 50. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Finally, the I/U map is translated into
$$\pi(x^*, u^*) = \prod_{i=1}^{2} p(u^*_i)\,\prod_{i=1}^{7} p(x^*_i),$$
$$p(u^*_i) = \mathrm{Uniform}(u^*_i \mid 0.125, 0.875), \quad i = 1, 2,$$
$$p(x^*_i) \propto N(x^*_i \mid 0.5, 0.1111^2)\, I_{(0.1667,\, 0.8333)}(x^*_i), \quad i = 1, 2, 3,$$
$$p(x^*_4) \propto N(x^*_4 \mid 0.5, 0.0641^2)\, I_{(0.3077,\, 0.6923)}(x^*_4),$$
$$p(x^*_i) \propto N(x^*_i \mid 0.5, 0.1176^2)\, I_{(0.1471,\, 0.8529)}(x^*_i), \quad i = 5, 6,$$
$$p(x^*_7) \propto N(x^*_7 \mid 0.5, 0.1026^2)\, I_{(0.1923,\, 0.8077)}(x^*_7),$$
where $I_A(u)$ is 1 if $u \in A$ and 0 otherwise. 50
  • 51. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Posterior Distribution: Denote the available data by $D = \{\bar w^F_i, S^2_i, \hat\mu_i(\cdot), \hat V_i(\cdot) : i = 1, \dots, 289\}$. The posterior distribution of $b$, $x^*$, $u^*$, $\sigma^2$, $\tau^2$ and $w^{M*} \equiv w^M(x^*, u^*)$ can be expressed as
$$\pi(w^{M*}, b, x^*, u^*, \sigma^2, \tau^2 \mid D) = \pi(w^{M*} \mid b, x^*, u^*, \sigma^2, \tau^2, D)\; \pi(b \mid x^*, u^*, \sigma^2, \tau^2, D)\; \pi(x^*, u^*, \sigma^2, \tau^2 \mid D),$$
where
$$\pi(w^{M*} \mid b, x^*, u^*, \sigma^2, \tau^2, D) = \prod_{i=1}^{289} N(w^{M*}_i \mid \tilde\mu_i, \tilde V_i),$$
$$\tilde\mu_i = \frac{\hat V_i(x^*, u^*)}{\hat V_i(x^*, u^*) + \sigma^2_i/7}\,(\bar w^F_i - b_i) + \frac{\sigma^2_i/7}{\hat V_i(x^*, u^*) + \sigma^2_i/7}\,\hat\mu_i(x^*, u^*), \qquad \tilde V_i = \frac{\hat V_i(x^*, u^*)\,\sigma^2_i/7}{\hat V_i(x^*, u^*) + \sigma^2_i/7};$$
$$\pi(b \mid x^*, u^*, \sigma^2, \tau^2, D) = \prod_{i=1}^{289} N(b_i \mid \mu^*_i, V^*_i),$$
$$\mu^*_i = \frac{\tau^2_j\,(\bar w^F_i - \hat\mu_i(x^*, u^*))}{\tau^2_j + \hat V_i(x^*, u^*) + \sigma^2_i/7}, \qquad V^*_i = \frac{\tau^2_j\,(\hat V_i(x^*, u^*) + \sigma^2_i/7)}{\tau^2_j + \hat V_i(x^*, u^*) + \sigma^2_i/7};$$ 51
  • 52. SAMSI Fall,2018 ✬ ✫ ✩ ✪
$$\pi(x^*, u^*, \sigma^2, \tau^2 \mid D) \propto L(x^*, u^*, \sigma^2, \tau^2)\,\pi(x^*, u^*, \sigma^2, \tau^2),$$
$$L(x^*, u^*, \sigma^2, \tau^2) = \prod_{i=1}^{289} \frac{(\sigma^2_i)^{-3}}{\sqrt{\tau^2_j + \frac{\sigma^2_i}{7} + \hat V_i(x^*, u^*)}}\; \exp\left\{-\frac{1}{2}\left(\frac{\big(\bar w^F_i - \hat\mu_i(x^*, u^*)\big)^2}{\tau^2_j + \frac{\sigma^2_i}{7} + \hat V_i(x^*, u^*)} + \frac{S^2_i}{\sigma^2_i}\right)\right\}.$$ 52
  • 53. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Computation: A Metropolis-Hastings-Gibbs sampling scheme was used, the Metropolis step being needed to sample from $\pi(x^*, u^*, \sigma^2, \tau^2 \mid D)$. The proposal used was • for the $\sigma^2_i$: Inverse Gamma distributions with shape 3 and scales $2/S^2_i$; • for the $\tau^2_j$: local moves $\propto 1/\tau^2_j$ in $(0.5\,\tau^{2(\mathrm{old})}_j,\; 2\,\tau^{2(\mathrm{old})}_j)$; • for the $x^*$ and $u^*$, a mixture of prior and local moves:
$$g_u(z) = \prod_{i=1}^{9}\big\{0.5\,\mathrm{Unif}(z_i \mid T_i) + 0.5\,\mathrm{Unif}(z_i \mid T^*_i)\big\},$$
where $T_i = (a_i, b_i)$ is the support of each prior and $T^*_i = \big(\max\{a_i, z^{(\mathrm{old})}_i - 0.05\},\; \min\{b_i, z^{(\mathrm{old})}_i + 0.05\}\big)$. 53
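A sketch (mine) of drawing from this mixture proposal for $(x^*, u^*)$: each of the nine components is refreshed either from its full prior support or by a local uniform move within ±0.05, each with probability 1/2. The supports below are read off the truncated I/U-map priors; the function name propose_z is illustrative, and the Metropolis acceptance step is not shown.

propose_z <- function(z_old, a, b, delta = 0.05) {
  k <- length(z_old)
  local <- runif(k) < 0.5                       # which components move locally
  lo <- ifelse(local, pmax(a, z_old - delta), a)
  hi <- ifelse(local, pmin(b, z_old + delta), b)
  runif(k, lo, hi)
}

# Supports of (u1, u2, x1, ..., x7) from the truncated priors:
a <- c(0.125, 0.125, rep(0.1667, 3), 0.3077, rep(0.1471, 2), 0.1923)
b <- c(0.875, 0.875, rep(0.8333, 3), 0.6923, rep(0.8529, 2), 0.8077)
propose_z((a + b) / 2, a, b)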
  • 54. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Computation was initially done by a standard Markov chain Monte Carlo analysis: • Closed form full conditionals are available for $b$, and for the emulator wavelet coefficients $w^{M*} \equiv w^M(x^*, u^*)$. • Metropolis-Hastings steps were used for $(x^*, u^*, \sigma^2, \tau^2)$; efficient proposal distributions were available, so all seemed fine. Shock: The original computation failed and could not be fixed using traditional methods; the answers were also 'wrong'! • Problem: Some of the $\sigma^2_i$ (variances corresponding to certain wavelet coefficients of the field data) got 'stuck' at very large values, with the effect that the corresponding biases were estimated as near zero. • Likely Cause: Modeling the $b_i$ as hierarchically normally distributed; biases for many wavelet coefficients can be expected to be small, but some are likely to be large. 54
  • 55. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Ideal solution: Improve the hierarchical models; for instance, one could consider use of more robust models (e.g., Cauchy models) for the $b_i$. Pragmatic solution: Cheat computationally, and only allow generation of the $\sigma^2_i$ from the replicate information (i.e., from the $S^2_i = \sum_{r=1}^{7}\big(w^F_{ir}(x^*) - \bar w^F_i\big)^2$), not allowing transference of information from the $b_i$ to the $\sigma^2_i$. • We call such cheating modularization, the idea being to not always allow Bayesian updating to flow both ways between modules (components) of a complex model. • Another name given to this idea is cutting feedback (Best, Spiegelhalter, ...); related notions are inconsistent dependency networks (Heckerman) and inconsistent Gibbs for missing-data problems (Gelman and others). • In the road load analysis, the modularization approach gives very similar answers to the improved modeling approach. 55
  • 56. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Inference: Bias estimates, predictions, and associated accuracy statements can all be constructed from the posterior sample $\{(w^{M*})^{(h)}, b^{(h)}, x^{*(h)}, u^{*(h)}, (\sigma^2)^{(h)}\}$, $h = 1, \dots, N$, and an auxiliary sample $\epsilon^{(h)}$, $h = 1, \dots, N$, from a multivariate normal distribution with zero mean and diagonal covariance matrix $\mathrm{Diag}(\sigma^2)^{(h)}$. • The posterior sample of bias curves is
$$b^{(h)}(t) = \sum_{i=1}^{289} b^{(h)}_i\,\psi_i(t), \quad h = 1, \dots, N.$$
• The posterior sample of bias-corrected predictions of reality is
$$(y^R)^{(h)}(t) = \sum_{i=1}^{289}\big((w^{M*}_i)^{(h)} + b^{(h)}_i\big)\,\psi_i(t), \quad h = 1, \dots, N.$$
• The posterior sample of individual (field) bias-corrected prediction curves is
$$(y^F)^{(h)}(t) = \sum_{i=1}^{289}\big((w^{M*}_i)^{(h)} + b^{(h)}_i + \epsilon^{(h)}_i\big)\,\psi_i(t), \quad h = 1, \dots, N.$$ 56
  • 57. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: histograms of the posterior draws of the nine inputs, Ch45.] Figure 7: Posterior distribution of $u^*$ and $x^*$. 57
  • 58. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: bias curve with MCMC 90% tolerance bounds; Ch45, Region 1; tension (N) versus distance (m).] Figure 8: Posterior bias curve estimate and 90% tolerance bands. 58
  • 59. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: bias-corrected prediction of an individual curve, Ch45, Region 1, showing model data, field data, the model prediction, and 90% tolerance bounds; tension (N) versus distance (m).] 59
  • 60. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: nominal (uncorrected) model prediction of an individual curve, Ch45, Region 1, showing model data, field data, the model prediction, and 90% tolerance bounds; tension (N) versus distance (m).] 60
  • 61. SAMSI Fall,2018 ✬ ✫ ✩ ✪ [Figure: bias-corrected individual-curve predictions for Ch45 and Ch60 in Regions 1 and 2, each showing field data, the model prediction, and 90% tolerance bounds.] Figure 10: Multiplicative extrapolation of bias to Vehicle B. 61
  • 62. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Functional emulators via Parallel Partial emulation Example: In the pyroclastic flow example, the full output of TITAN2D is $y^M(x) = (y^M_1(x), y^M_2(x), \dots, y^M_k(x))$, where each $y^M_i(x)$ is the pyroclastic flow height (and speed and direction) at one of the k space-time grid points on which TITAN2D is run. This is a huge (discretized) function, with k as large as $10^9$. One realization of the function, only looking at maximum flow height at 24,000 spatial locations, looks like this: 62
  • 63. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Movie Time Determination of 1m contours of maximum flow over time at k = 23,040 spatial locations, using m = 50 simulator runs at various inputs to develop the emulator. 63
  • 64. SAMSI Fall,2018 ✬ ✫ ✩ ✪ The Big Issue: This wildly varying function varies even more wildly over the inputs, so it is virtually unimaginable to capture it with any of the previous methods, or any previous emulator method (stat or math). So we have to trust to the magic of GaSPs and hope to get lucky (you can't force UQ)! Run the simulator at $x^D = \{x_1, \dots, x_m\}$, yielding outputs $y^D = (y^M(x_1)', \dots, y^M(x_m)')'$ (a matrix of size up to $2048 \times 10^9$). The simplest imaginable GaSP for the k-dimensional $y^M(x)$: An independent GaSP is assigned to each coordinate $y^M_j(x)$, with • prior mean functions of the regression form $\Psi(x)\theta_j$, where $\Psi(x)$ is a common l-vector of given basis functions and the $\theta_j$ are differing unknown regression coefficients; • differing unknown prior variances $\sigma^2_j$; • common estimated correlation parameters $\hat\gamma$ (discussed later). 64
  • 65. SAMSI Fall,2018 ✬ ✫ ✩ ✪ The mean function of the posterior GaSP for $y^M_j(x^*)$ at new input $x^*$ is
$$\hat\mu_j(x^*) = \Psi(x^*)\hat\theta_j + C(x^*)\,C^{-1}\big(y^D_j - \Psi\hat\theta_j\big), \quad\text{where}\quad \hat\theta_j = (\Psi' C^{-1}\Psi)^{-1}\Psi' C^{-1} y^D_j,$$
with $y^D_j$ being the j-th column of $y^D$, $\Psi$ the earlier specified $m \times l$ design matrix, $C$ the earlier specified $m \times m$ correlation matrix, $\Psi(x^*) = (\Psi_1(x^*), \dots, \Psi_l(x^*))$, and $C(x^*) = (c(x_1, x^*), \dots, c(x_m, x^*))$. This can be rewritten
$$\hat\mu_j(x^*) = \sum_{i=1}^{m} h_i(x^*)\,y^D_{ij},$$
where $h_i(x^*)$ is the i-th element of the m-vector
$$h(x^*) = \big(\Psi(x^*) - C(x^*)C^{-1}\Psi\big)(\Psi' C^{-1}\Psi)^{-1}\Psi' C^{-1} + C(x^*)C^{-1}.$$
As $\Psi$ and $C$ (and the functions of them) can be pre-computed, computing $h(x^*)$ (at a new $x^*$) requires roughly $m^2$ numerical operations. 65
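A sketch (mine) of these formulas in R, with a unit-range squared-exponential correlation standing in for the estimated one: after pre-computation, each new $x^*$ costs roughly $m^2$ operations, and the full k-vector of coordinate means is the single matrix product $h(x^*)\,y^D$.

corr <- function(A, B) {                 # correlation between two input sets
  D2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * A %*% t(B)
  exp(-pmax(D2, 0))                      # squared-exponential, unit range
}

pp_mean <- function(x_star, X, yD, Psi, psi_fun) {
  C  <- corr(X, X); Ci <- solve(C)
  A  <- solve(t(Psi) %*% Ci %*% Psi) %*% t(Psi) %*% Ci
  cx <- corr(matrix(x_star, nrow = 1), X)           # C(x*)
  px <- matrix(psi_fun(x_star), nrow = 1)           # Psi(x*)
  h  <- (px - cx %*% Ci %*% Psi) %*% A + cx %*% Ci  # weight vector h(x*)
  drop(h %*% yD)                                    # means at all k coordinates
}

# Example: m = 20 runs, d = 2 inputs, k = 1000 coordinates, trend (1, x1):
set.seed(1)
X   <- matrix(runif(40), ncol = 2)
yD  <- matrix(rnorm(20 * 1000), nrow = 20)
Psi <- cbind(1, X[, 1])
mu  <- pp_mean(runif(2), X, yD, Psi, function(x) c(1, x[1]))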
  • 66. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Finally, we can write the complete parallel partial posterior (PP) mean vector (the emulator of the full simulator output at a new input $x^*$) as
$$\hat\mu(x^*) = (\hat\mu_1(x^*), \dots, \hat\mu_k(x^*)) = h(x^*)\,y^D.$$
• The overall computational cost is just $O(mk)$ when $k \gg m$. – It is crucially important to have differing $\theta_j$ and $\sigma^2_j$ at each coordinate, but this comes with essentially no computational cost. • Computation of all the PP emulator variances is $O(m^2 k)$, but one rarely needs to compute all of them. • The emulator is an interpolator so, when $x^*$ equals one of the runs $x_i$, the emulator will return the exact values from the computer run. • As the emulator mean is just a weighted average of the actual simulator runs, it hopefully captures some of the dynamics of the process. 66
  • 67. SAMSI Fall,2018 ✬ ✫ ✩ ✪ What happens if the assumptions are relaxed? • If different coordinates are allowed different bases, the cost goes up to $O([m^2 l + l^3]k)$. (Recall the cost of the PP emulator was $O(mk)$.) – For TITAN2D, $m \approx 2000$, $l = 4$, and $k \approx 10^9$ ⇒ $O(10^{16})$ computations, compared to $O(10^{12})$ for the PP emulator. • If the correlation parameters, $\gamma_j$, are allowed to vary at each coordinate, the computational cost would be $O(n\,m^3 k)$, because there would be differing $m \times m$ correlation matrices $C_j$ at each coordinate and the inversion of $C_j$ would need to be done n times in order to estimate $\gamma_j$. – For TITAN2D, $n \approx 150$, $m \approx 2000$, $k \approx 10^9$ ⇒ $O(10^{21})$ computations. • In either case the emulator would still be an interpolator, but would no longer be a weighted average of the simulator runs. 67
  • 68. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Figure 11: The mean of the emulator of 'maximum flow height over time' from TITAN2D, at 24,000 spatial locations over Montserrat and for new input values $V = 10^{7.462}$, $\varphi = 2.827$, $\delta_{bed} = 11.111$, and $\delta_{int} = 27.7373$. 68
  • 69. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Figure 12: Variance of the emulator of 'maximum flow height over time' from TITAN2D, at 24,000 spatial locations over Montserrat and for new input values $V = 10^{7.462}$, $\varphi = 2.827$, $\delta_{bed} = 11.111$, and $\delta_{int} = 27.7373$. 69
  • 70. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Movie Time Determination of 1m contours of maximum flow over time at k = 23,040 spatial locations, using m = 50 simulator runs at various inputs to develop the emulator. 70
  • 71. SAMSI Fall,2018 ✬ ✫ ✩ ✪ The spatial 'elephant in the room' is the key (and clearly invalid) assumption that simulator output values at all coordinates (e.g., space-time locations) are independent. The usual attempted solution: Introduce a second spatial process over the output coordinates of $y^M(x) = (y^M_1(x), y^M_2(x), \dots, y^M_k(x))$ to reflect the clear dependence. Usual assumptions on this process: • It is also a Gaussian process, with correlation function $\lambda(i, j)$, leading to the $k \times k$ correlation matrix $\Lambda$. • Because k is huge, the process must be chosen so that $\Lambda$ is sparse (e.g., only allow correlation with nearby points), to allow for the needed inversions of $\Lambda$. • Separability with the GaSP over the process input space is assumed, so that the covariance matrix of the joint Gaussian process is (letting $\sigma$ denote the diagonal matrix of coordinate standard deviations)
$$\Sigma = \sigma\Lambda\sigma \otimes C, \quad\text{and thus}\quad \Sigma^{-1} = \sigma^{-1}\Lambda^{-1}\sigma^{-1} \otimes C^{-1}.$$
The problem: It is difficult to add plausible spatial structure while keeping the computation manageable when k is huge. 71
  • 72. SAMSI Fall,2018 ✬ ✫ ✩ ✪ The Surprise: The spatial elephant can (mostly) be ignored, as the PP emulator will give essentially the same answers. Indeed, for any spatial structure $\Lambda$, the following can be shown: • The emulator mean of $y^M(x^*) = (y^M_1(x^*), \dots, y^M_k(x^*))$, at a new input $x^*$, is exactly the same as the PP emulator mean. (Intuition: it does not matter that the $y^M_i(x^*)$ are spatially related, as they are all unknown.) • The emulator variance at coordinate j is still $\hat\sigma^2_j V_j(x^*)$, with only $\hat\sigma^2_j$ depending on the spatial structure, and only in a minor way; thus one can just use the (slightly conservative) PP emulator variance. The remaining little elephant: If one actually needs random draws from the emulator, the PP emulator's draws will be too rough (because each coordinate is independent), which might be harmful in some applications. • A relatively simple fix to obtain smoother draws is to divide the grid into squares of moderate size s (e.g., s = 4), have the squares be independent, but allow a dependent spatial process in each square. • If $\Lambda$ in each square is assigned the objective prior $\pi(\Lambda) = |\Lambda|^{-s}$, the mean and variance of the emulator will then be the same as the PP emulator. 72
  • 73. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Additional concerns with the assumptions for the PP emulator: • The likelihood from which the correlation parameters $\gamma$ are estimated might be bad because of the assumption of independence of coordinates. – In practice, use of a joint spatial process seems to give worse results, because of considerable numerical instabilities in the likelihood. – Also, the estimates of $\gamma$ should primarily be driven by the varying simulator output over the inputs $x_i$ at each fixed location. – The likelihood is almost certainly too concentrated but, as we are only using it to obtain plug-in estimates, this is not a major concern. • Assuming common values of the correlation parameters $\gamma$ at all coordinates is potentially problematical, as the simulator may have very different levels of smoothness in different regions of input space. – One could utilize different $\gamma$ in a few different regions, with minimal additional cost, as in Gramacy and Lee (2008). – Simulations (see later) indicate this is not a problem for TITAN2D. 73
  • 74. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Introduction of a nugget Often certain inputs have very little effect on the simulator output, and emulators that ignore such an input can do better at prediction. But, for deterministic simulators, one must then introduce a 'nugget' (i.i.d. Gaussian errors) in the GaSP model. We simply let the correlation matrix be $C + \xi I$, renormalized to be a correlation matrix, with $\xi$ unknown (a one-line sketch follows). The computations are then only slightly more complicated. Example: In TITAN2D, $\delta_{int}$ has only a minor effect, so we will investigate • the full 4-input emulator, • the 3-input emulator with $\delta_{int}$ removed and a nugget inserted. 74
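The renormalization in one line (a sketch of mine): with nugget ξ, divide by 1 + ξ so the diagonal is again 1.

add_nugget <- function(C, xi) (C + xi * diag(nrow(C))) / (1 + xi)

# Example: a 3 x 3 exponential correlation matrix with nugget 0.1
x <- c(0.1, 0.4, 0.9)
C <- exp(-abs(outer(x, x, "-")))
add_nugget(C, 0.1)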
  • 75. SAMSI Fall,2018 ✬ ✫ ✩ ✪
Emulator:             | PP GaSP     | PP GaSP           | MS GaSP           | LMC GaSP
Parameter estimation: | robust est. | robust est.       | robust est.       | DiceKriging
Inputs:               | 4 inputs    | 3 inputs + nugget | 3 inputs + nugget | 3 inputs + nugget
Mean Square Error     | 0.109       | 0.097             | 0.103             | 0.137
95% CI Coverage       | 0.926       | 0.950             | 0.924             | 0.909
95% CI Length         | 0.521       | 0.536             | 0.491             | 0.478
Time (s) using R      | 50.0        | 28.1              | 31337.7           | 3407.6
Table 3: Performance of various emulators, developed from 50 simulator runs, of max flow height over all spatial locations except the crater and non-flow areas. • The first emulator uses all 4 inputs, while the remaining three emulators use 3 inputs $(V, \delta_{bed}, \varphi)$ and an estimated nugget, all with the same regressor $h(x) = (1, V)$. • The LMC emulator uses coregionalization with an SVD output decomposition. • Evaluations are based on $n^* = 633$ held-out inputs over $k = 17{,}311$ locations. • The last row shows the computational times of the emulators, using R. 75
  • 76. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Determining Hazard Probabilities Goal: Determine $P_{H,T}(k)$, the probability, at location k, that the maximum pyroclastic flow height exceeds H over the next T years. Implementation: • Perform statistical analysis of historical data to determine the posterior distribution of the simulator inputs $(V, \delta_{bed}, \varphi)$. • Draw 100,000 samples from this posterior, and evaluate the emulator at these inputs to estimate the distribution $F_k$ of maximum flow heights at each location k. • Assuming pyroclastic flows follow a stationary Poisson process, an exact expression can be given, in terms of the $F_k$, for the probability distribution of maximum flow heights over T years at location k. • From these, determination of the $P_{H,T}(k)$ is straightforward (a sketch follows this list). 76
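A sketch (mine) of the last two steps. If flows occur as a stationary Poisson process with an assumed rate λ per year, and each flow's maximum height at location k is an independent draw from $F_k$, then $P_{H,T}(k) = 1 - \exp\{-\lambda T (1 - F_k(H))\}$. The slides do not record λ or the exact expression used, so treat this as one standard version of the calculation with illustrative inputs.

hazard_prob <- function(heights_k, H, T_years, lambda) {
  Fk_H <- mean(heights_k <= H)             # empirical CDF of max height at H
  1 - exp(-lambda * T_years * (1 - Fk_H))  # P(max height over T years > H)
}

# Example with hypothetical stand-ins for the 100,000 emulator evaluations:
set.seed(1)
heights_k <- rexp(1e5)                     # hypothetical flow-height draws
hazard_prob(heights_k, H = 1, T_years = 2.5, lambda = 2)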
  • 77. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Figure 13: For SVH, contours of the probabilities that the maximum flow heights exceed 0.5 (left), 1 (center) and 2 (right) meters over the next T = 2.5 years at each location on SVH. The shaded area is Belham Valley, which is still inhabited. 77
  • 78. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Coupling emulators (closed form) to emulate coupled computer models (Kyzyurova (2017)). Coupled simulators: • $f^M(x)$ is the output of a simulator with input x. – Example: $f^M(x)$ is TITAN2D. • $g^M(z)$ is a simulator with input z. – Example: $g^M(z)$ is a computer model that determines the damage to a structure incurred by being hit by a pyroclastic flow with properties z. • Of interest is $g^M \circ f^M(x) = g^M(f^M(x))$, the coupled simulator computing the damage from a pyroclastic flow arising from inputs x. The problem: It is usually difficult to directly link two simulators. • The output of $f^M$ will often not be in the form needed as input to $g^M$. • It is difficult to determine a good design in terms of inputs x for the coupled emulator. • It may well be that many more runs of $f^M$ are available than runs of $g^M$. 78
  • 79. SAMSI Fall,2018 ✬ ✫ ✩ ✪ A solution: Separately develop emulators $\tilde f^M$ of $f^M$ and $\tilde g^M$ of $g^M$, and couple the emulators. • Always possible by Monte Carlo (generate outputs from $\tilde f^M$ and use them in $\tilde g^M$); a sketch is given after the theorem. • For GaSPs, a closed-form mean and variance of the coupled emulator is available! Theorem. Suppose the GaSP for $g^M$ has the linear mean function $h(z)\beta = \beta_0 + \beta_1 z_b$, and a product power correlation function with $\alpha_j = 2$ for the inputs $j \in \{b, \dots, d\}$ that arise from $f^M$. For each $j \in \{b, \dots, d\}$, let $f^M_j$ be an independent emulator of $f_j$, the function which gives rise to the value of input j for $g(\cdot)$. Then the mean $E\xi$ and variance $V\xi$ of the linked emulator $\xi$ of the coupled simulator $(g \circ (f_b, \dots, f_d))(u)$ are 79
  • 80. SAMSI Fall,2018 ✬ ✫ ✩ ✪
$$E\xi = \beta_0 + \beta_1\,\mu^*_{f_b}(u^b) + \sum_{i=1}^{m} a_i \prod_{j=1}^{b-1} \exp\left(-\left(\frac{|u_j - z_{ij}|}{\delta_j}\right)^{\alpha_j}\right) \prod_{j=b}^{d} I^i_j,$$
$$\begin{aligned} V\xi = {} & \sigma^2(1 + \eta) + \beta_0^2 + 2\beta_0\beta_1\,\mu^*_{f_b}(u^b) + \beta_1^2\big(\sigma^{*2}_{f_b}(u^b) + (\mu^*_{f_b}(u^b))^2\big) - (E\xi)^2 \\ & + \sum_{k,l=1}^{m}\big(a_l a_k - \sigma^2\{C_z^{-1}\}_{k,l}\big) \prod_{j=1}^{b-1} e^{-\left[\left(\frac{|u_j - z_{kj}|}{\delta_j}\right)^{\alpha_j} + \left(\frac{|u_j - z_{lj}|}{\delta_j}\right)^{\alpha_j}\right]} \prod_{j=b}^{d} I1^{k,l}_j \\ & + 2\sum_{i=1}^{m} a_i \prod_{j=1}^{b-1}\exp\left(-\left(\frac{|u_j - z_{ij}|}{\delta_j}\right)^{\alpha_j}\right)\big(\beta_0 I^i_b + \beta_1 I^{+i}_b\big)\prod_{j=b+1}^{d} I^i_j, \end{aligned}$$ 80
  • 81. SAMSI Fall,2018 ✬ ✫ ✩ ✪ where $a = (a_1, \dots, a_m)^T = C_z^{-1}\big(g^M(z) - h(z)\beta\big)$ and
$$I^i_j = \frac{1}{\sqrt{1 + 2\,\sigma^{*2}_{f_j}(u^j)/\delta_j^2}}\; \exp\left(-\frac{\big(z_{ij} - \mu^*_{f_j}(u^j)\big)^2}{\delta_j^2 + 2\,\sigma^{*2}_{f_j}(u^j)}\right),$$
$$I1^{k,l}_j = \frac{1}{\sqrt{1 + 4\,\sigma^{*2}_{f_j}(u^j)/\delta_j^2}}\; e^{-\frac{\left(\frac{z_{kj} + z_{lj}}{2} - \mu^*_{f_j}(u^j)\right)^2}{\frac{\delta_j^2}{2} + 2\,\sigma^{*2}_{f_j}(u^j)}}\; e^{-\frac{(z_{kj} - z_{lj})^2}{2\delta_j^2}},$$
$$I^{+i}_b = \frac{\frac{2\,\sigma^{*2}_{f_b}(u^b)}{\delta_b^2}\, z_{ib} + \mu^*_{f_b}(u^b)}{\Big(1 + \frac{2\,\sigma^{*2}_{f_b}(u^b)}{\delta_b^2}\Big)^{3/2}}\; \exp\left(-\frac{\big(z_{ib} - \mu^*_{f_b}(u^b)\big)^2}{\delta_b^2 + 2\,\sigma^{*2}_{f_b}(u^b)}\right).$$ 81
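For comparison with the closed form, here is the Monte Carlo coupling mentioned above, sketched (by me) with two RobustGaSP emulators; the simulators f_sim and g_sim, the designs, and the normal approximation to each predictive are all illustrative assumptions.

library(RobustGaSP)
set.seed(1)

f_sim <- function(x) sin(2 * pi * x)          # stand-in inner simulator
g_sim <- function(z) exp(-z^2)                # stand-in outer simulator

Xf <- matrix(seq(-1, 1, length.out = 12));     ef <- rgasp(Xf, f_sim(Xf))
Zg <- matrix(seq(-1.2, 1.2, length.out = 12)); eg <- rgasp(Zg, g_sim(Zg))

linked_mc <- function(x, n = 2000) {
  pf <- predict(ef, matrix(x))                # f-emulator predictive at x
  z  <- rnorm(n, pf$mean, pf$sd)              # draws of f(x)
  pg <- predict(eg, matrix(z))                # g-emulator at those draws
  y  <- rnorm(n, pg$mean, pg$sd)
  c(mean = mean(y), var = var(y))             # Monte Carlo E(xi), V(xi)
}
linked_mc(0.3)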
  • 82. SAMSI Fall,2018 ✬ ✫ ✩ ✪ −1 −0.63 −0.26 0.11 0.48 0.85 −101 ● ● ● ● ● ● x f f(x2) f(x3) f(x1) f(x6) f(x5) −101 ● ● ● ● ● ● z g −1 −0.63 −0.26 0.11 0.48 0.85 −101 ● ● ● ● ● ● x gOf −1 −0.63 −0.26 0.11 0.48 0.85 −101 ● ● ● ● ● ● x gOf Figure 14: Top figures are functions f(x) and g(z) and their emulators. Bottom left is the (closed form) coupled emulator of g ◦ f. Bottom right is the emulator of the math coupled g ◦ f (constructed from the same inputs/outputs). 82
  • 83. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Other complex emulation scenarios • Dynamic emulators (Conti et al. (2009); Liu and West (2009); Conti and O'Hagan (2010); Reichert et al. (2011)). • Emulating models with qualitative factors (Qian et al. (2008)). • Nonstationary emulators (Gramacy and Lee (2008); Ba and Joseph (2012)). • Emulating multivariate output (Bayarri et al. (2009); Paulo et al. (2012); Fricker et al. (2013); Overstall and Woods (2016)). • Evaluating the quality of emulators (Bastos and O'Hagan (2009); Overstall and Woods (2016)). 83
  • 84. SAMSI Fall,2018 ✬ ✫ ✩ ✪ References Aslett, R., R. J. Buck, S. G. Duvall, J. Sacks, and W. J. Welch (1998). Circuit optimization via sequential computer experiments: Design of an output buffer. Applied Statistics 47, 31–48. Ba, S. and V. R. Joseph (2012). Composite Gaussian process models for emulating expensive functions. The Annals of Applied Statistics 6(4), 1838–1860. Bastos, L. S. and A. O'Hagan (2009). Diagnostics for Gaussian process emulators. Technometrics 51(4), 425–438. Bates, R. A., R. J. Buck, E. Riccomagno, and H. P. Wynn (1996). Experimental design and observation for large systems (Disc: p95–111). Journal of the Royal Statistical Society, Series B, Methodological 58, 77–94. Bayarri, M. J., J. O. Berger, E. S. Calder, K. Dalbey, S. Lunagomez, A. K. Patra, E. B. Pitman, E. T. Spiller, and R. L. Wolpert (2009). Using 84
  • 85. SAMSI Fall,2018 ✬ ✫ ✩ ✪ statistical and computer models to quantify volcanic hazards. Technometrics 51, 402–413. Bayarri, M. J., J. O. Berger, G. García-Donato, F. Liu, J. Palomo, R. Paulo, J. Sacks, J. Walsh, J. A. Cafeo, and R. Parthasarathy (2007). Computer model validation with functional output. Annals of Statistics 35, 1874–1906. Bayarri, M. J., J. O. Berger, M. C. Kennedy, A. Kottas, R. Paulo, J. Sacks, J. A. Cafeo, C. H. Lin, and J. Tu (2009). Predicting vehicle crashworthiness: validation of computer models for functional and hierarchical data. Journal of the American Statistical Association 104, 929–942. Bowman, V. E. and D. C. Woods (2016). Emulation of multivariate simulators using thin-plate splines with application to atmospheric dispersion. SIAM/ASA Journal on Uncertainty Quantification 4(1), 1323–1344. 85
  • 86. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Conti, S., J. P. Gosling, J. Oakley, and A. O'Hagan (2009). Gaussian process emulation of dynamic computer codes. Biometrika 96(3), 663–676. Conti, S. and A. O'Hagan (2010). Bayesian emulation of complex multi-output and dynamic computer models. Journal of Statistical Planning and Inference 140(3), 640–651. Cumming, J. A. and M. Goldstein (2009). Small sample Bayesian designs for complex high-dimensional models based on information gained using fast approximations. Technometrics 51, 377–388. Dalbey, K., M. Jones, E. B. Pitman, E. S. Calder, M. Bursik, and A. K. Patra (2012). Hazard risk analysis using computer models of physical phenomena and surrogate statistical models. International Journal for Uncertainty Quantification. To appear. Fricker, T. E., J. E. Oakley, and N. M. Urban (2013). Multivariate Gaussian process emulators with nonseparable covariance structures. Technometrics 55(1), 47–56. 86
  • 87. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Gramacy, R. B. and H. K. H. Lee (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association 103(483), 1119–1130. Gramacy, R. B. and H. K. H. Lee (2009). Adaptive design and analysis of supercomputer experiments. Technometrics 51, 130–145. doi:10.1198/TECH.2009.0015. Gu, M., J. Palomo, and J. O. Berger (2016). RobustGaSP: Robust Gaussian Stochastic Process Emulation. R package version 0.5.4. Gu, M., X. Wang, and J. Berger (2018). Robust Gaussian stochastic process emulation. Annals of Statistics. Kyzyurova, K. N. (2017). On Uncertainty Quantification for Systems of Computer Models. Ph.D. thesis, Duke University. Lam, C. Q. and W. I. Notz (2008). Sequential adaptive designs in computer experiments for response surface model fit. Statistics and Applications 66(9), 207–233. 87
  • 88. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Lim, Y. B., J. Sacks, W. Studden, and W. J. Welch (2002). Design and analysis of computer experiments when the output is highly correlated over the input space. Canadian Journal of Statistics 30(1), 109–126. Liu, F. and M. West (2009). A dynamic modelling strategy for Bayesian computer model emulation. Bayesian Analysis 4(2), 393–412. Loeppky, J. L., L. M. Moore, and B. J. Williams (2010). Batch sequential designs for computer experiments. Journal of Statistical Planning and Inference 140(6), 1452–1464. Lopes, D. (2011). Development and Implementation of Bayesian Computer Model Emulators. Ph.D. dissertation, Duke University. McKay, M. D., W. J. Conover, and R. J. Beckman (1979). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21, 239–245. Overstall, A. M. and D. C. Woods (2016). Multivariate emulation of computer simulators: model selection and diagnostics with application to 88
  • 89. SAMSI Fall,2018 ✬ ✫ ✩ ✪ a humanitarian relief model. Journal of the Royal Statistical Society: Series C (Applied Statistics) 65(4), 483–505. Paulo, R. (2005). Default priors for Gaussian processes. The Annals of Statistics 33(2), 556–582. Paulo, R., G. García-Donato, and J. Palomo (2012). Calibration of computer models with multivariate output. Computational Statistics and Data Analysis 56(12), 3959–3974. Qian, P. Z. G., H. Wu, and C. F. J. Wu (2008). Gaussian process models for computer experiments with qualitative and quantitative factors. Technometrics 50(3), 383–396. Ranjan, P., R. Haynes, and R. Karsten (2011). A computationally stable approach to Gaussian process interpolation of deterministic computer simulation data. Technometrics 53, 366–378. Ranjan, P., D. Bingham, and G. Michailidis (2008). Sequential experiment 89
  • 90. SAMSI Fall,2018 ✬ ✫ ✩ ✪ design for contour estimation from complex computer codes. Technometrics 50(4), 527–541 (errata: Technometrics 53). Reichert, P., G. White, M. J. Bayarri, and E. B. Pitman (2011). Mechanism-based emulation of dynamic simulation models: Concept and application in hydrology. Computational Statistics & Data Analysis 55, 1638–1655. Roustant, O., D. Ginsbourger, and Y. Deville (2012). DiceKriging, DiceOptim: Two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization. Journal of Statistical Software 51(1), 1–55. Sacks, J., W. J. Welch, T. J. Mitchell, and H. P. Wynn (1989). Design and analysis of computer experiments (C/R: p423–435). Statistical Science 4, 409–423. Santner, T. J., B. Williams, and W. Notz (2003). The Design and Analysis of Computer Experiments. Springer-Verlag. 90
  • 91. SAMSI Fall,2018 ✬ ✫ ✩ ✪ Spiller, E. T., M. Bayarri, J. O. Berger, E. S. Calder, A. K. Patra, E. B. Pitman, and R. L. Wolpert (2014). Automating emulator construction for geophysical hazard maps. SIAM/ASA Journal on Uncertainty Quantification 2(1), 126–152. Welch, W. J., R. J. Buck, J. Sacks, H. P. Wynn, T. J. Mitchell, and M. D. Morris (1992). Screening, predicting, and computer experiments. Technometrics 34, 15–25. 91