1. Bayesian Subset Simulation
— a kriging-based subset simulation algorithm for
the estimation of small probabilities of failure —
Ling Li, Julien Bect, Emmanuel Vazquez
Supelec, France
PSAM11-ESREL12
Helsinki, June 26, 2012
3. A classical problem in (probabilistic) reliability... (1/2)
❍ Consider a system subject to uncertainties,
◮ aleatory and/or epistemic,
◮ represented by a random vector X ∼ PX,
where PX is a probability measure on X ⊂ Rd.
❍ Assume that the system fails when f(X) > u
◮ f : X → R is a cost function,
◮ u ∈ R is the critical level.
❍ x ↦ u − f(x) is sometimes called the "limit state function".
5. A classical problem in (probabilistic) reliability... (2/2)
❍ Define the failure region
Γ = {x ∈ X : f(x) > u}.
❍ The probability of failure is
α = PX{Γ} = ∫_X 1_{f>u} dPX.
[Figure: a 1d illustration of the cost function f, the critical level u, the failure region Γ, and the density of PX.]
A fundamental numerical problem in reliability analysis:
how to estimate α using a computer program that can provide f(x) for any given x ∈ X?
7. The venerable Monte Carlo method
❍ The Monte Carlo (MC) estimator
α̂MC = (1/m) Σ_{i=1}^{m} 1_{f(Xi)>u}, with X1, ..., Xm iid ∼ PX,
has a coefficient of variation given by
δ = √((1 − α)/(αm)) ≈ 1/√(αm).
❍ Computation time for a given δ?
m ≈ 1/(δ²α) ⇒ τMC ≈ τ0/(δ²α).
Ex: with δ = 50%, α = 10⁻⁵, τ0 = 5 min, τMC ≈ 4 years.
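As a sanity check of the formulas above, here is a minimal Monte Carlo sketch. The toy problem itself (f(x) = x with X ∼ N(0, 1) and u = 3) is an illustrative assumption, not from the talk:

```python
import numpy as np

# Toy problem (illustrative assumption): f(x) = x, X ~ N(0, 1),
# failure when f(X) > u = 3, so the true alpha = P(X > 3) ~ 1.35e-3.
rng = np.random.default_rng(0)
u, m = 3.0, 1_000_000

x = rng.standard_normal(m)
alpha_hat = np.mean(x > u)                     # MC estimator of alpha

# Coefficient of variation: delta = sqrt((1 - alpha) / (alpha * m))
delta = np.sqrt((1 - alpha_hat) / (alpha_hat * m))

# Budget formula from the slide: m ~ 1 / (delta^2 * alpha).  With
# delta = 0.5, alpha = 1e-5 and tau0 = 5 min per evaluation, this
# gives m ~ 4e5 runs, i.e. roughly 4 years of computation.
print(f"alpha_hat = {alpha_hat:.2e}, delta = {delta:.1%}")
```

With m = 10⁶ samples, δ ≈ 1/√(αm) ≈ 2.7%, consistent with the formula.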
10. A short and selective review of existing techniques
❍ The MC estimator is impractical when
◮ either f is expensive to evaluate (i.e., τ0 is large),
◮ or Γ is a rare event under PX (i.e., α is small).
❍ Approximation techniques (and related adaptive sampling
schemes) address the first issue.
◮ parametric: FORM/SORM, polynomial RSM, . . .
◮ non-parametric: kriging (Gaussian processes), SVM, . . .
❍ Variance reduction techniques (e.g., importance sampling)
address the second issue.
◮ Subset simulation (Au & Beck, 2001) is especially
appropriate for very small α, since δ ∝ |log α|/√m.
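Plain subset simulation can be sketched in a few lines. The toy target below (X ∼ N(0, 1), f(x) = x, u = 3.5) and all tuning choices (m, p0, the random-walk proposal, the stage cap) are illustrative assumptions:

```python
import numpy as np

def f(x):
    return x   # toy cost function; the true alpha = P(X > 3.5) ~ 2.3e-4

def subset_simulation(u, m=2000, p0=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(m)           # stage 0: i.i.d. sample from P_X
    y = f(x)
    alpha = 1.0
    for _ in range(30):                  # cap on the number of stages
        u_t = np.quantile(y, 1 - p0)     # intermediate level: p0-quantile
        if u_t >= u:                     # last stage: count exceedances of u
            return alpha * np.mean(y > u)
        alpha *= p0                      # standard approximation: each stage
                                         # contributes a factor p0
        # Seeds above u_t are grown into short Metropolis chains that
        # target P_X conditioned on {f > u_t}.
        seeds = x[y > u_t]
        steps = -(-m // len(seeds))      # ceil(m / n_seeds)
        chain = []
        for cur in seeds:
            for _ in range(steps):
                prop = cur + rng.normal()
                ratio = np.exp(0.5 * (cur**2 - prop**2))  # N(0,1) density ratio
                if rng.random() < ratio and f(prop) > u_t:
                    cur = prop
                chain.append(cur)
        x = np.asarray(chain[:m])
        y = f(x)
    return alpha * np.mean(y > u)

alpha_hat = subset_simulation(u=3.5)
```

Each stage only needs to estimate a conditional probability of order p0, which is what yields the δ ∝ |log α|/√m behaviour instead of 1/√(αm).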
12. What if I have an expensive f and a small α? (1/2)
❍ Some parametric approximation techniques (e.g., FORM/SORM) can still be used...
◮ strong assumptions ⇒ a "structural" error that cannot be reduced by adding more samples.
❍ Contribution of this paper: Bayesian Subset Simulation (BSS)
◮ Bayesian: uses a Gaussian process prior on f (kriging)
◮ flexibility of a non-parametric approach,
◮ framework to design efficient adaptive sampling schemes.
◮ generalizes subset simulation
◮ in the framework of Sequential Monte Carlo (SMC)
methods (Del Moral et al., 2006).
13. What if I have an expensive f and a small α ? (2/2)
❍ Some recent related work
◮ V. Dubourg, F. Deheeger and B. Sudret
Metamodel-based importance sampling for
structural reliability analysis. Preprint submitted to
Probabilistic Engineering Mechanics (available on arXiv).
➥ use kriging + (adaptive) importance sampling
◮ J.-M. Bourinet, F. Deheeger and M. Lemaire
Assessing small failure probabilities by combined
subset simulation and Support Vector Machines,
Structural Safety, 33:6, 343–353, 2011.
➥ use SVM + subset simulation
14. Example: deflection of a cantilever beam
❍ We consider a cantilever beam of length L = 6 m, with a uniformly distributed load (Rajashekhar & Ellingwood, 1993).
http://en.wikipedia.org/wiki/File:Beam1svg.svg
❍ The maximal deflection of the beam is
f(x1, x2) = (3/2) L⁴ x1 / (E x2³),
with x1 the load per unit area and x2 the depth.
❍ Young's modulus: E = 2.6 × 10⁴ MPa.
16. Example: deflection of a cantilever beam
❍ We assume an imperfect knowledge of x1 and x2:
◮ X1 ∼ N(µ1, σ1²), with µ1 = 10⁻³ MPa, σ1 = 0.2 µ1,
◮ X2 ∼ N(µ2, σ2²), with µ2 = 300 mm, σ2 = 0.1 µ2,
◮ truncated, independent Gaussian variables.
❍ A failure occurs when f(X1, X2) > u = L/325.
◮ Reference value: α ≈ 3.94 × 10⁻⁶,
◮ obtained by MC with m = 10¹⁰ (⇒ δ ≈ 0.5%).
❍ Note: our beam is thicker than that of Rajashekhar & Ellingwood, to make α smaller!
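The beam example can be reproduced numerically. The units below (mm and MPa) are an assumption chosen so that f and u = L/325 are both in mm, and the crude rejection used for truncation is a simplification of the truncated Gaussians above:

```python
import numpy as np

L = 6000.0           # beam length [mm]
E = 2.6e4            # Young's modulus [MPa]
u = L / 325.0        # critical deflection, ~18.5 mm

def f(x1, x2):
    """Maximal deflection [mm]: (3/2) L^4 x1 / (E x2^3)."""
    return 1.5 * L**4 * x1 / (E * x2**3)

mu1, mu2 = 1e-3, 300.0      # nominal load [MPa] and depth [mm]
nominal = f(mu1, mu2)       # ~2.77 mm, far below u: failures are rare

# Plain MC is hopeless here: with alpha ~ 3.94e-6, a sample of
# m = 1e5 draws typically contains no failure at all.
rng = np.random.default_rng(0)
m = 100_000
x1 = mu1 + 0.2 * mu1 * rng.standard_normal(m)
x2 = mu2 + 0.1 * mu2 * rng.standard_normal(m)
ok = (x1 > 0) & (x2 > 0)    # crude stand-in for the truncation
n_fail = np.count_nonzero(f(x1[ok], x2[ok]) > u)
```

The nominal deflection sits almost 16 mm below the critical level, which is why the failure probability is so small.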
35. And now... Bayesian subset simulation! (1/2)
❍ In the previous experiment, subset simulation performed
N = m + (1 − p0)(T − 1)m = 88000 evaluations of f,
where T = 6 is the number of stages.
❍ Idea: we can do much better with a Gaussian process prior.
❍ Key idea #1 (sequential Monte Carlo)
◮ SS uses an expensive sequence of target densities
qt ∝ 1_{f > u_{t−1}} πX,
where ut is the target level at stage t.
◮ We replace them by the cheaper densities
qt ∝ Pn(f > u_{t−1}) πX,
where Pn is the GP posterior given n evaluations of f.
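Under a Gaussian process posterior, Pn(f > u)(x) has the closed form Φ((µn(x) − u)/σn(x)), where µn and σn are the kriging mean and standard deviation. A minimal 1-d sketch; the squared-exponential kernel, its length-scale, and the toy data are illustrative assumptions:

```python
import numpy as np
from math import erf, sqrt

def kern(a, b, ell=0.3):
    """Squared-exponential kernel with unit variance (1-d inputs)."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

def gp_posterior(x_tr, y_tr, x_te, jitter=1e-10):
    """Kriging mean and standard deviation at the test points."""
    K = kern(x_tr, x_tr) + jitter * np.eye(len(x_tr))
    Ks = kern(x_te, x_tr)
    mu = Ks @ np.linalg.solve(K, y_tr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-15))

def prob_exceed(mu, sigma, u):
    """P_n(f > u)(x) = Phi((mu_n(x) - u) / sigma_n(x))."""
    z = (mu - u) / sigma
    return np.array([0.5 * (1 + erf(v / sqrt(2))) for v in z])

# Toy model of f(x) = x observed at three points; near the data the
# posterior is nearly certain whether f exceeds u = 0.8 or not.
x_tr = np.array([0.0, 0.5, 1.0])
p = prob_exceed(*gp_posterior(x_tr, x_tr, np.array([0.0, 1.0])), u=0.8)
```

Evaluating Pn(f > u) only costs kriging predictions, which is what makes the surrogate target densities qt cheap compared with evaluating f itself.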
37. And now... Bayesian subset simulation! (2/2)
❍ Key idea #2 (adaptive sampling)
◮ At each stage t, we improve our GP model around the
next target level ut .
◮ Strategy: Stepwise Uncertainty Reduction (SUR)
(Vazquez & Piera-Martinez (2007), Vazquez & Bect (2009))
◮ Other strategies could be used as well. . .
(e.g., Picheny et al. (2011))
❍ Miscellaneous details
◮ Number of evaluations per stage: chosen adaptively.
◮ Number of stages T , levels ut : chosen adaptively.
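One common way to adapt the levels, used here purely as an illustrative sketch of the idea (p0 is the target conditional probability per stage): take ut as the (1 − p0)-quantile of the current particle values, capped at the final level u so the last stage never overshoots.

```python
import numpy as np

def next_level(y_particles, u_final, p0=0.1):
    """(1 - p0)-quantile of the current values, capped at u_final."""
    return min(np.quantile(y_particles, 1 - p0), u_final)

# Example: 1000 evenly spaced values on [0, 1); the 0.9-quantile is
# ~0.899, and the cap triggers once it would exceed u_final.
y = np.arange(1000) / 1000.0
```

This way the number of stages T and the levels ut are driven by the data rather than fixed in advance, as the slide describes.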
87. Performance?
❍ Preliminary Monte Carlo studies (PhD thesis of Ling Li, 2012).
◮ Test cases in dimensions d = 2 and d = 6.
◮ Comparison with plain subset simulation and the 2SMART algorithm (Deheeger, 2007; Bourinet et al., 2011).
⇒ very significant savings in evaluations (for a comparable MSE).
❍ Our estimate is biased (nothing is free...).
◮ Typically weakly biased in our experiments.
◮ Two sources of bias, both of which can be removed:
◮ level-adaptation bias
➥ solution: two passes,
◮ Bayesian bias
➥ solution: evaluate all points at the last stage.
89. Closing remarks
❍ Estimating small probabilities of failure on expensive computer models is possible, using a blend of:
◮ advanced simulation techniques (here, SMC),
◮ meta-modelling (here, Gaussian process modelling).
❍ Benchmarking w.r.t. state-of-the-art techniques
◮ work in progress.
❍ Open questions
◮ How well do we need to know f at the intermediate stages?
◮ How smooth should f be for BSS to be efficient?
◮ Theoretical properties?
90. References
❍ This talk is based on the paper
◮ Ling Li, Julien Bect, Emmanuel Vazquez, Bayesian Subset Simulation: a kriging-based subset simulation algorithm for the estimation of small probabilities of failure, Proceedings of PSAM 11 & ESREL 2012, June 25-29, 2012, Helsinki, Finland.
❍ For more information on kriging-based adaptive sampling strategies (a.k.a. sequential design of experiments)
◮ Julien Bect, David Ginsbourger, Ling Li, Victor Picheny, Emmanuel Vazquez, Sequential design of computer experiments for the estimation of a probability of failure, Statistics and Computing, 22(3):773–793, 2012.