The document discusses quantifying uncertainty in hazard forecasting using probabilistic models and simulations. It summarizes using Gaussian processes to build statistical surrogates of computationally expensive simulation models, allowing estimation of model outputs and uncertainties at untested inputs within the modeled space. Monte Carlo simulation is used to sample from probabilistic models of physical scenarios and propagate scenarios through simulations to estimate hazard probabilities and maps while accounting for input uncertainties.
6. MULOGO
Mount Pelée, Martinique – 30,000 fatalities
“One hundred years ago, government officials in Martinique
made the mistake of assuming that, despite signs to the
contrary, Mount Pelée would behave in 1902 as it had in 1851 –
when a rain of ash from what they considered a benign volcano
surprised, but did not harm those living under its shadow.”
(Cristina Reed, Geotimes 2002)
10. MULOGO
Data at Montserrat – PF frequency and volume
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
11. MULOGO
What is a random variable?
A quantity whose value is random and to which a probability
distribution is assigned
RV Example (discrete): Number of siblings
x 0 1 2 3 . . .
Pr 0.28 0.3 0.24 0.08 . . .
RV Example (continuous): Annual teaching income, US
Probability distribution function (pdf)
of teachers’ salaries across US
40,000 100,000
x~
12. MULOGO
How do we calculate a probability?
Example: What is the probability a teaching makes more
than $85K annually?
40,000 100,000
P(salary>85K)=∫0
∞
I(x)p(x)dx
Indicator function I(X) = 1 if X ≥ 85K, I(X) = 0 otherwise
To simulate, Monte Carlo
P(salary> $85K) ≈ 1
M
M
1 I(X) with X ∼ p(X)
24. MULOGO
Data and data models at Montserrat
0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8
0
500
1000
1500
2000
2500
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
α < 1 indicates so-called heavy tails
25. MULOGO
Rare Events
Pareto model is much more likely to observe
future volumes that far exceed those in the
recent history...
Consider a record of 10 volumes (V1, . . . , V10)
non-heavy tailed:
P(V11 > 10 max(V1, ..., V10)) = 1/200, 000
heavy tailed:
P(V11 > 10 max(V1, ..., V10)) = 1/100
26. MULOGO
What happens at larger-than-recorded volumes?
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
29. MULOGO
Models: systems of PDEs
du
dt
= h(t, u(t), parameters)
u − state of the system
u(0) = uo − initial conditions, IC
parameters ∈ RN
boundary conidtions, BC
For UQ, we’d like the mapping
g : IC, BC, parameters −→ u or F(u)
30. MULOGO
Models: systems of PDEs
du
dt
= h(t, u(t), parameters)
u − state of the system
u(0) = uo − initial conditions, IC
parameters ∈ RN
boundary conidtions, BC
For UQ, we’d like the mapping
Computer model : inputs −→ outputs
31. MULOGO
Physics based models as a “lab”
Assume: flow layer thin relative to lateral extension
continuity
x momentum
∂h
∂t
+
∂hux
∂x
+
∂huy
∂y
= es
∂hux
∂t
+
∂(hu2
x + kapgzh2/2)
∂x
+
∂huyux
∂y
=
gxh + uxes−
ux
u2
x + u2
y
(gz +
u2
x
κx
)h tan(φbed)−sgn(∂uxy)hkap
∂hgz
∂y
sin(φint)
1 Gravitational driving force
2 Coulomb friction at the base – φbed
3 Intergranular Coulomb force – φint
due to velocity gradients normal to flow direction
(see Savage; Bursik; Pitman)
34. MULOGO
Belham Valley Probabilistic Hazard Map (t=2.5 yrs)
incorporate any/all sources
of knowledge
physics of granular flow
data on frequency/size
of flows
avoid one-off simulations
Methodology developed for hazard mapping works for UQ
35. MULOGO
Physical scenarios: data and models of p(V, θ)
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
Possible statistical models for physical scenarios, p(V, θ)
linear volume model
(2 parameters)
frequency model, rate= λ
(1 parameter)
uniform
Von Mises (2 pars)
(Gaussian on a circle)
36. MULOGO
Monte Carlo Simulation
Idea literally named for gambling
“roll” the “die” N times
“die” is probabilistic scenario model
“roll” is the flow model exercised at a
sampled scenario
P(hazard)= (# of catastrophes)/N
43. MULOGO
Uncertainty
Aleatory variability — random scenarios
volume
initiation angle
frequency
Epistemic uncertainty — imperfect descriptions
probabilistic models (of random scenarios)
numerical resolution
physical parameters
44. MULOGO
What if we have a small computational budget?
-1 -0.5 0 0.5 1
x
-1.5
-1
-0.5
0
0.5
1
1.5
2
y
45. MULOGO
Notation for the rest of this talk
Computer model: g(·)
Inputs: x = {x1, . . . , xn} ∈ Rn
Outputs: y, for convenience, consider scalar output
Design: D = {x(1)
, . . . , x(m)
}
Corresponding response: yM
= {y(1)
, . . . , y(m)
} ∈ Rm
Quantity of Interest:
P[Y > threshold] = 1g(x)≥threshold p(x)dx
where Y = g(X) and X ∼ p
46. MULOGO
GP emulators – statistical surrogate
Recall our quantify of interest:
P[Y > threshold] = 1g(x)≥threshold p(x)dx
where Y = g(X) and X ∼ p
Goal: We want to predict, with uncertainty, simulator output g(·)
at untested inputs. E.g., we want ˜g(·) ≈ g(·)
Time to evaluate ˜g(·) << Time to evaluate g(·)
Strategy: View computer model output as a single realization of
a random process – Gaussian Process (GP)
(Sacks, Welch, Mitchell, and Wynn, 1989)
52. MULOGO
Random functions
Assume computer model is a realization of a random function
y(x) = µ + Z(x)
-3 -2 -1 0 1 2 3
y
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
where
E[Z(x)] = 0
V ar[Z(x)] = σ2
Z(x) ∼ N(0, σ2
)
Corr Z(x), Z(x ) =
d
i=1
e−φi(xi−xi)2
For m design points, correlation matrix Σ = σ2
R
53. MULOGO
Computer model runs
-1 -0.5 0 0.5 1
x
-1.5
-1
-0.5
0
0.5
1
1.5
2
y
Only consider GP to go through simulator runs
58. MULOGO
GP emulators – statistical surrogate
Recall our quantify of interest:
P[Y > threshold] = 1g(x)≥threshold p(x)dx
where Y = g(X) and X ∼ p
Goal: We want to predict, with uncertainty, simulator output g(·)
at untested inputs. E.g., we want ˜g(·) ≈ g(·)
Time to evaluate ˜g(·) << Time to evaluate g(·)
Strategy: View computer model output as a single realization of
a random process – Gaussian Process (GP)
60. MULOGO
Emulator at one map site
Initiation angle (from East)
10 7
10 8
10 9
Volume(m3
)
East North South West East
0°
90 °
180 °
270 °
360 °
61. MULOGO
Making a hazard map
1 Run N = 2048 TITAN2D, store data for each location
2 Repeat following process, in parallel, for each site
1 choose subdesign
2 fit emulator
3 draw catastrophic contours, ψ(θ)’s
3 Choose model for aleatory variability of scenarios
4 Run probability calculations, in parallel, for each site
Note
step 1 is expensive, 2 is parallelizable, 4 is post processing!
details in SIAM/ASA JUQ (Spiller 2014), overview in IJUQ (Bayarri 2015)
63. MULOGO
Epistemic uncertainty: uncertainty in probability model
Recall volume data used to characterize p(V, θ) (aleatory variability)
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
64. MULOGO
Epistemic uncertainty: uncertainty in probability model
Recall volume data used to characterize p(V, θ) (aleatory variability)
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
each red curve corresponds to a different slope p(V, θ|α)
now probability calculation is cheap —
we can find P(hazard) for each α!
65. MULOGO
Epistemic uncertainty: in probability & physical models
uncertainty in prob model
5e+04 5e+05 5e+06 5e+07 5e+08
0.020.050.100.200.501.002.005.0020.0050.00
PF Volume (m3
)
AnnualRate
uncertainty in phys model
0 5 10 15 20
8
10
12
14
16
18
Volume (x106
m3
)
BasalFriction(degrees)
50 replicates of φ vs. V at Montserrat
Repeat probability calculations many times
vary α – probability model
vary friction uncertainty – physical model
66. MULOGO
Histograms of catastrophic probabilities – close
red – fix friction, vary α s blue – fix α = ˆα, vary friction
0.3 0.4 0.5 0.6 0.7 0.8
0
50
100
150
200
250
300
probability of catastrophe
P(catastrophe)=0.49
67. MULOGO
Histograms of catastrophic probabilities – far
red – fix friction, vary α s blue – fix α = ˆα, vary friction
0 0.01 0.02 0.03
0
50
100
150
200
250
300
350
probability of catastrophe
P(catastrophe)=0.0067
68. MULOGO
A retrospective “validation”
use data from 1995-2003
to estimate Poisson
frequencies for
(top, stationary)
(mid, low activity)
(bottom, high activity)
forecast probabilities of
inundation for 2004-2010
under these three
scenarios
white overlay, extent of
deposits for 2004-2010
69. MULOGO
Aleatoric variability – short term modeling
1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008
Years
0
100
200
300
400
500
600
700
800
900
1000
Eventcount
500 1000 1500 2000 2500 3000 3500 4000 4500
Days from January 1, 1996
72. MULOGO
To wrap up
Take home message
Emulator of physical model identifies important regions of
state space independent of probabilistic model
Enables fast, flexible direct or MC probability calculations
w/o more physical simulations
Framework for exploring multiple sources of epistemic
uncertainty and aleatory variability
Not a replacement, but a tool for civil protection and
scientists to forecast dynamic hazards and quantify
uncertainty
73. MULOGO
The End
Thanks to NSF:DMS-1228317-1228265-1228217,
EAR-1331353, DMS-1622403-1621853-1622467,
DMS-1638521, DMS-1821311-1821338-1821289,
SES-1521855