MUMS Undergraduate Workshop - Quantifying Uncertainty in Hazard Forecasting - Elaine Spiller, February 26, 2019

MULOGO
Quantifying uncertainty in hazard forecasting
Elaine Spiller
Marquette University
February 26, 2019

MULOGO
interdisciplinary research team
Along with many current and former students...

MULOGO
XKCD comic from September 28, 2017

MULOGO
Eyjafjallaj¨okull – $2-5bn

MULOGO
Nevada del Ruiz – 23,000 fatalities

MULOGO
Mount Pelée, Martinique – 30,000 fatalities
“One hundred years ago, government ofﬁcials in Martinique
made the mistake of assuming that, despite signs to the
contrary, Mount Pelée would behave in 1902 as it had in 1851 –
when a rain of ash from what they considered a benign volcano
surprised, but did not harm those living under its shadow.”
(Cristina Reed, Geotimes 2002)

MULOGO
Pyroclastic – “broken ﬁre” – ﬂows

MULOGO
Montserrat – A volcanologist playground

MULOGO
Data at Montserrat – valleys traversed by PFs

MULOGO
Data at Montserrat – PF frequency and volume
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate

MULOGO
What is a random variable?
A quantity whose value is random and to which a probability
distribution is assigned
RV Example (discrete): Number of siblings
x 0 1 2 3 . . .
Pr 0.28 0.3 0.24 0.08 . . .
RV Example (continuous): Annual teaching income, US
Probability distribution function (pdf)
of teachers’ salaries across US
40,000 100,000
x~

MULOGO
How do we calculate a probability?
Example: What is the probability a teaching makes more
than $85K annually?
40,000 100,000
P(salary>85K)=∫0
∞
I(x)p(x)dx
Indicator function I(X) = 1 if X ≥ 85K, I(X) = 0 otherwise
To simulate, Monte Carlo
P(salary> $85K) ≈ 1
M
M
1 I(X) with X ∼ p(X)

MULOGO
Exponential distributions – one parameter
p(x) = λ exp(−λx), x ≥ 0
0 1 2 3 4 5 6
x
0
0.5
1
1.5
2
2.5
3
3.5
4

MULOGO
Normal distributions – two parameters
p(x) =
1
√
2πσ2
exp −
(x − µ)2
2σ2
-8 -6 -4 -2 0 2 4 6 8
x
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4

MULOGO
Normal distributions – two parameters
p(x) =
1
√
2πσ2
exp −
(x − µ)2
2σ2
-8 -6 -4 -2 0 2 4 6 8
x
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8

MULOGO
On Twitter yesterday
Coded
by
Rasmus Bååth (2012)
normal
µ
σ1σ2
t distrib.
µ
σ1σ2
df1df2
uniform
L H
beta
a, b
beta−binomial
a, b
Bernoulli
θ
gamma
S, R
inv−gamma
α, β
binomial
p, n
neg. binomial
p, r
folded t
µ
σ
df
Poisson
λ
chi−square
k
noncentral
chi−square
k, λ
double exp.
µ
τ1τ2
exponential
λ
shifted exp.
λ
F dist.
df1, df2
gen. gamma
r, λ, b
logistic
µ
τ1τ2
log−normal
µ
σ
Pareto
µ, σ
Weibull
v, λ
categorical
v, λ
noncentral
hypergeom.
n1, n2, m1, ψ
r−cens.
normal
µ
σ1
c
σ2
l−cens.
normal
µ
σ1
c
σ2
Cauchy
x0
γ1γ2
half−t
σ
df
half−Cauchy
γ
half−normal
σ

MULOGO
Data at Montserrat – (negative) slope α
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
log10(p(v)) ∝ −α log10(v) → p(v) ∝ v−α

MULOGO
Learning about α from data
Bayes Theorem
p(α | data) ∝ p(data | α)p(α)
Typically can’t compute p(α | data), but can sample

MULOGO
Data at Montserrat – p(α | data)
0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8
0
500
1000
1500
2000
2500

MULOGO
Data and data models at Montserrat
0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8
0
500
1000
1500
2000
2500
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
α < 1 indicates so-called heavy tails

MULOGO
Rare Events
Pareto model is much more likely to observe
future volumes that far exceed those in the
recent history...
Consider a record of 10 volumes (V1, . . . , V10)
non-heavy tailed:
P(V11 > 10 max(V1, ..., V10)) = 1/200, 000
heavy tailed:
P(V11 > 10 max(V1, ..., V10)) = 1/100

MULOGO
What happens at larger-than-recorded volumes?
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate

MULOGO
We would like records from many volcanic eruptions

MULOGO
Best we can do: simulate replicate volcanic eruptions

MULOGO
Models: systems of PDEs
du
dt
= h(t, u(t), parameters)
u − state of the system
u(0) = uo − initial conditions, IC
parameters ∈ RN
boundary conidtions, BC
For UQ, we’d like the mapping
g : IC, BC, parameters −→ u or F(u)

MULOGO
Models: systems of PDEs
du
dt
= h(t, u(t), parameters)
u − state of the system
u(0) = uo − initial conditions, IC
parameters ∈ RN
boundary conidtions, BC
For UQ, we’d like the mapping
Computer model : inputs −→ outputs

MULOGO
Physics based models as a “lab”
Assume: ﬂow layer thin relative to lateral extension
continuity
x momentum
∂h
∂t
+
∂hux
∂x
+
∂huy
∂y
= es
∂hux
∂t
+
∂(hu2
x + kapgzh2/2)
∂x
+
∂huyux
∂y
=
gxh + uxes−
ux
u2
x + u2
y
(gz +
u2
x
κx
)h tan(φbed)−sgn(∂uxy)hkap
∂hgz
∂y
sin(φint)
1 Gravitational driving force
2 Coulomb friction at the base – φbed
3 Intergranular Coulomb force – φint
due to velocity gradients normal to ﬂow direction
(see Savage; Bursik; Pitman)

MULOGO
Simulated pyroclastic ﬂows at four different inputs (V, θ)

MULOGO
Belham Valley Probabilistic Hazard Map (t=2.5 yrs)
incorporate any/all sources
of knowledge
physics of granular ﬂow
data on frequency/size
of ﬂows
avoid one-off simulations
Methodology developed for hazard mapping works for UQ

MULOGO
Physical scenarios: data and models of p(V, θ)
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
Possible statistical models for physical scenarios, p(V, θ)
linear volume model
(2 parameters)
frequency model, rate= λ
(1 parameter)
uniform
Von Mises (2 pars)
(Gaussian on a circle)

MULOGO
Monte Carlo Simulation
Idea literally named for gambling
“roll” the “die” N times
“die” is probabilistic scenario model
“roll” is the ﬂow model exercised at a
sampled scenario
P(hazard)= (# of catastrophes)/N

MULOGO
four draws from p(V, θ)

MULOGO
Cartoon p(physical scenario)
Physical Scenarios

MULOGO
Physical Scenarios
Monte Carlo Samples

MULOGO
Physical Scenarios
scenarios leading
to catastrophe

MULOGO
Physical Scenarios
to catastrophe
scenarios leading

MULOGO
Uncertainty
Aleatory variability — random scenarios
volume
initiation angle
frequency
Epistemic uncertainty — imperfect descriptions
probabilistic models (of random scenarios)
numerical resolution
physical parameters

MULOGO
What if we have a small computational budget?
-1 -0.5 0 0.5 1
x
-1.5
-1
-0.5
0
0.5
1
1.5
2
y

MULOGO
Notation for the rest of this talk
Computer model: g(·)
Inputs: x = {x1, . . . , xn} ∈ Rn
Outputs: y, for convenience, consider scalar output
Design: D = {x(1)
, . . . , x(m)
}
Corresponding response: yM
= {y(1)
, . . . , y(m)
} ∈ Rm
Quantity of Interest:
P[Y > threshold] = 1g(x)≥threshold p(x)dx
where Y = g(X) and X ∼ p

MULOGO
GP emulators – statistical surrogate
Recall our quantify of interest:
Goal: We want to predict, with uncertainty, simulator output g(·)
at untested inputs. E.g., we want ˜g(·) ≈ g(·)
Time to evaluate ˜g(·) << Time to evaluate g(·)
Strategy: View computer model output as a single realization of
a random process – Gaussian Process (GP)
(Sacks, Welch, Mitchell, and Wynn, 1989)

MULOGO
˜g(·): predictive estimate of g(·) with uncertainty
How can we get this?

MULOGO
Random functions – short range correlations
-1 -0.5 0 0.5 1
x
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
y

MULOGO
Random functions – long range correlations
-1 -0.5 0 0.5 1
x
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
y

MULOGO
Random functions
-1 -0.5 0 0.5 1
x
-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
y

MULOGO
Random functions
Assume computer model is a realization of a random function
y(x) = µ + Z(x)
-3 -2 -1 0 1 2 3
y
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
where
E[Z(x)] = 0
V ar[Z(x)] = σ2
Z(x) ∼ N(0, σ2
)
Corr Z(x), Z(x ) =
d
i=1
e−φi(xi−xi)2
For m design points, correlation matrix Σ = σ2
R

MULOGO
Computer model runs
-1 -0.5 0 0.5 1
x
-1.5
-1
-0.5
0
0.5
1
1.5
2
y
Only consider GP to go through simulator runs

MULOGO
Conditional GPs
Larger φ – more “wiggly”
-1 -0.5 0 0.5 1
x
-3
-2
-1
0
1
2
3
y

MULOGO
Conditional GPs
Smaller φ – less “wiggly”
-1 -0.5 0 0.5 1
x
-2
-1
0
1
2
3
y
Need to ﬁnd “right” φ = {φ1, . . . , φd}

MULOGO
Predictive emulator with uncertainty

MULOGO
GP emulators – statistical surrogate
Recall our quantify of interest:
Goal: We want to predict, with uncertainty, simulator output g(·)
at untested inputs. E.g., we want ˜g(·) ≈ g(·)
Time to evaluate ˜g(·) << Time to evaluate g(·)
Strategy: View computer model output as a single realization of
a random process – Gaussian Process (GP)

MULOGO
Emulator at one map site

MULOGO
Emulator at one map site
Initiation angle (from East)
10 7
10 8
10 9
Volume(m3
)
East North South West East
0°
90 °
180 °
270 °
360 °

MULOGO
Making a hazard map
1 Run N = 2048 TITAN2D, store data for each location
2 Repeat following process, in parallel, for each site
1 choose subdesign
2 ﬁt emulator
3 draw catastrophic contours, ψ(θ)’s
3 Choose model for aleatory variability of scenarios
4 Run probability calculations, in parallel, for each site
Note
step 1 is expensive, 2 is parallelizable, 4 is post processing!
details in SIAM/ASA JUQ (Spiller 2014), overview in IJUQ (Bayarri 2015)

MULOGO
P(catastrophe in 2.5 years)

MULOGO
Epistemic uncertainty: uncertainty in probability model
Recall volume data used to characterize p(V, θ) (aleatory variability)
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate

MULOGO
Epistemic uncertainty: uncertainty in probability model
Recall volume data used to characterize p(V, θ) (aleatory variability)
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
each red curve corresponds to a different slope p(V, θ|α)
now probability calculation is cheap —
we can ﬁnd P(hazard) for each α!

MULOGO
Epistemic uncertainty: in probability & physical models
uncertainty in prob model
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
uncertainty in phys model
0 5 10 15 20
8
10
12
14
16
18
Volume (x106
m3
)
BasalFriction(degrees)
50 replicates of φ vs. V at Montserrat
Repeat probability calculations many times
vary α – probability model
vary friction uncertainty – physical model

MULOGO
Histograms of catastrophic probabilities – close
red – ﬁx friction, vary α s blue – ﬁx α = ˆα, vary friction
0.3 0.4 0.5 0.6 0.7 0.8
0
50
100
150
200
250
300
probability of catastrophe
P(catastrophe)=0.49

MULOGO
Histograms of catastrophic probabilities – far
red – ﬁx friction, vary α s blue – ﬁx α = ˆα, vary friction
0 0.01 0.02 0.03
0
50
100
150
200
250
300
350
probability of catastrophe
P(catastrophe)=0.0067

MULOGO
A retrospective “validation”
use data from 1995-2003
to estimate Poisson
frequencies for
(top, stationary)
(mid, low activity)
(bottom, high activity)
forecast probabilities of
inundation for 2004-2010
under these three
scenarios
white overlay, extent of
deposits for 2004-2010

MULOGO
Aleatoric variability – short term modeling
1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008
Years
0
100
200
300
400
500
600
700
800
900
1000
Eventcount
500 1000 1500 2000 2500 3000 3500 4000 4500
Days from January 1, 1996

MULOGO
Aleatoric variability – short term modeling
Sep 17, 1996
0
80117
153
215
280
106
107
108
Jun 25, 1997
0
80117
153
215
280
106
107
108
Aug 3, 1997
0
80117
153
215
280
106
107
108
Sep 21, 1997
0
80117
153
215
280
106
107
108
Dec 26, 1997
0
80117
153
215
280
106
107
108
Jul 3, 1998
0
80117
153
215
280
106
107
108
Mar 20, 2000
0
80117
153
215
280
106
107
108
Jul 29, 2001
0
80117
153
215
280
106
107
108
Jul 13, 2003
0
80117
153
215
280
106
107
108
May 20, 2006
0
80117
153
215
280
106
107
108

MULOGO
Short-term probabilistic hazard maps
90 days 180 days
µo = 135◦
µo = 0◦

MULOGO
To wrap up
Take home message
Emulator of physical model identiﬁes important regions of
state space independent of probabilistic model
Enables fast, ﬂexible direct or MC probability calculations
w/o more physical simulations
Framework for exploring multiple sources of epistemic
uncertainty and aleatory variability
Not a replacement, but a tool for civil protection and
scientists to forecast dynamic hazards and quantify
uncertainty

MULOGO
The End
Thanks to NSF:DMS-1228317-1228265-1228217,
EAR-1331353, DMS-1622403-1621853-1622467,
DMS-1638521, DMS-1821311-1821338-1821289,
SES-1521855

MUMS Undergraduate Workshop - Quantifying Uncertainty in Hazard Forecasting - Elaine Spiller, February 26, 2019

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to MUMS Undergraduate Workshop - Quantifying Uncertainty in Hazard Forecasting - Elaine Spiller, February 26, 2019

Similar to MUMS Undergraduate Workshop - Quantifying Uncertainty in Hazard Forecasting - Elaine Spiller, February 26, 2019 (20)

More from The Statistical and Applied Mathematical Sciences Institute

More from The Statistical and Applied Mathematical Sciences Institute (20)

Recently uploaded

Recently uploaded (20)

MUMS Undergraduate Workshop - Quantifying Uncertainty in Hazard Forecasting - Elaine Spiller, February 26, 2019