
Bayesian Subset Simulation



Slides for my talk at PSAM11-ESREL12 (June 25-29, 2012, Helsinki, Finland).


Bayesian Subset Simulation — a kriging-based subset simulation algorithm for the estimation of small probabilities of failure
Ling Li, Julien Bect, Emmanuel Vazquez (Supélec, France)
PSAM11-ESREL12, Helsinki, June 26, 2012

A classical problem in (probabilistic) reliability (1/2)

❍ Consider a system subject to uncertainties,
  ◮ aleatory and/or epistemic,
  ◮ represented by a random vector X ∼ P_X, where P_X is a probability measure on X ⊂ ℝ^d.
❍ Assume that the system fails when f(X) > u:
  ◮ f : X → ℝ is a cost function,
  ◮ u ∈ ℝ is the critical level.
❍ x ↦ u − f(x) is sometimes called the "limit state function".

A classical problem in (probabilistic) reliability (2/2)

❍ Define the failure region Γ = {x ∈ X : f(x) > u}.
❍ The probability of failure is
      $\alpha = P_X\{\Gamma\} = \int_X \mathbf{1}_{f>u} \, dP_X.$
  [Figure: a 1d illustration showing the density P_X, the cost function f(x), the critical level u, and the failure region Γ.]

A fundamental numerical problem in reliability analysis:
How to estimate α using a computer program that can provide f(x) for any given x ∈ X?

The venerable Monte Carlo method

❍ The Monte Carlo (MC) estimator
      $\hat{\alpha}_{\mathrm{MC}} = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}_{f(X_i) > u}, \qquad X_1, \ldots, X_m \stackrel{\text{iid}}{\sim} P_X,$
  has a coefficient of variation given by
      $\delta = \sqrt{\frac{1 - \alpha}{\alpha m}} \approx \frac{1}{\sqrt{\alpha m}}.$
❍ Computation time for a given δ?
      $m \approx \frac{1}{\delta^2 \alpha} \quad\Rightarrow\quad \tau_{\mathrm{MC}} \approx \frac{\tau_0}{\delta^2 \alpha}.$
  Example: with δ = 50%, α = 10⁻⁵, τ₀ = 5 min, we get τ_MC ≈ 4 years.
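
A numerical companion to the estimator above. This is a minimal sketch (our own illustration, not code from the paper); the cost function f and the threshold u are toy placeholders chosen purely for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Toy cost function (placeholder for an expensive simulator)."""
    return np.sum(x**2, axis=-1)

m, u = 100_000, 18.0
X = rng.standard_normal((m, 2))      # X_1, ..., X_m iid ~ P_X (here N(0, I))
alpha_hat = np.mean(f(X) > u)        # (1/m) * sum_i 1_{f(X_i) > u}

# Coefficient of variation: sqrt((1 - alpha) / (alpha m)) ≈ 1 / sqrt(alpha m)
if alpha_hat > 0:
    delta = np.sqrt((1 - alpha_hat) / (alpha_hat * m))
    print(f"alpha_hat = {alpha_hat:.2e}, delta ≈ {delta:.1%}")
```

Even at this moderate α (about 10⁻⁴ for this toy f), 10⁵ samples leave a coefficient of variation near 30%, which illustrates why plain MC breaks down for small failure probabilities.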

A short and selective review of existing techniques

❍ The MC estimator is impractical when
  ◮ either f is expensive to evaluate (i.e., τ₀ is large),
  ◮ or Γ is a rare event under P_X (i.e., α is small).
❍ Approximation techniques (and related adaptive sampling schemes) address the first issue:
  ◮ parametric: FORM/SORM, polynomial RSM, ...
  ◮ non-parametric: kriging (Gaussian processes), SVM, ...
❍ Variance reduction techniques (e.g., importance sampling) address the second issue.
  ◮ Subset simulation (Au & Beck, 2001) is especially appropriate for very small α, since δ ∝ |log α|/√m (a sketch is given after this list).
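
For concreteness, here is a minimal sketch of subset simulation in standard Gaussian input space with a toy cost function. It is our own simplified illustration, not the authors' implementation: the intermediate levels are adaptive (1 − p₀)-quantiles, the conditional sampler is a plain random-walk Metropolis kernel restricted to the current level set, and the proposal step size (0.5) and number of moves per stage (5) are arbitrary tuning assumptions. For clarity it also evaluates f on rejected proposals, which a careful implementation of an expensive f would avoid.

```python
import numpy as np

def subset_simulation(f, d, u, m=1000, p0=0.1, max_stages=20, rng=None):
    """Toy subset-simulation estimator of P{f(X) > u}, with X ~ N(0, I_d)."""
    rng = rng or np.random.default_rng()
    X = rng.standard_normal((m, d))
    Y = f(X)
    alpha = 1.0
    for _ in range(max_stages):
        u_t = np.quantile(Y, 1.0 - p0)        # adaptive intermediate level
        if u_t >= u:                          # final stage reached
            return alpha * np.mean(Y > u)
        alpha *= p0                           # each stage contributes a factor p0
        keep = Y > u_t                        # survivors seed the next population
        idx = rng.integers(keep.sum(), size=m)
        X, Y = X[keep][idx], Y[keep][idx]
        for _ in range(5):                    # Metropolis moves on N(0,I) | {f > u_t}
            Xp = X + 0.5 * rng.standard_normal((m, d))
            log_acc = 0.5 * (np.sum(X**2, 1) - np.sum(Xp**2, 1))
            Yp = f(Xp)                        # wasteful: evaluated even if rejected
            ok = (np.log(rng.random(m)) < log_acc) & (Yp > u_t)
            X[ok], Y[ok] = Xp[ok], Yp[ok]
    return alpha * np.mean(Y > u)

# Toy check: for f(x) = ||x||^2 in d = 2, P{f > 25} = exp(-12.5) ≈ 3.7e-6.
# alpha_hat = subset_simulation(lambda x: np.sum(x**2, axis=-1), d=2, u=25.0)
```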

What if I have an expensive f and a small α? (1/2)

❍ Some parametric approximation techniques (e.g., FORM/SORM) can still be used...
  ◮ strong assumptions ⇒ "structural" error that cannot be reduced by adding more samples.
❍ Contribution of this paper: Bayesian Subset Simulation (BSS).
  ◮ Bayesian: uses a Gaussian process prior on f (kriging),
    ◮ flexibility of a non-parametric approach,
    ◮ framework to design efficient adaptive sampling schemes;
  ◮ generalizes subset simulation,
    ◮ in the framework of Sequential Monte Carlo (SMC) methods (Del Moral et al., 2006).

What if I have an expensive f and a small α? (2/2)

❍ Some recent related work:
  ◮ V. Dubourg, F. Deheeger and B. Sudret. Metamodel-based importance sampling for structural reliability analysis. Preprint submitted to Probabilistic Engineering Mechanics (available on arXiv).
    ➥ uses kriging + (adaptive) importance sampling
  ◮ J.-M. Bourinet, F. Deheeger and M. Lemaire. Assessing small failure probabilities by combined subset simulation and Support Vector Machines. Structural Safety, 33(6):343–353, 2011.
    ➥ uses SVM + subset simulation

Example: deflection of a cantilever beam

❍ We consider a cantilever beam of length L = 6 m, with uniformly distributed load (Rajashekhar & Ellingwood, 1993).
  [Figure: http://en.wikipedia.org/wiki/File:Beam1svg.svg]
❍ The maximal deflection of the beam is
      $f(x_1, x_2) = \frac{3 L^4 x_1}{2 E x_2^3},$
  with x₁ the load per unit area and x₂ the depth.
❍ Young's modulus: E = 2.6 × 10⁴ MPa.

Example: deflection of a cantilever beam (continued)

❍ We assume imperfect knowledge of x₁ and x₂:
  ◮ X₁ ∼ N(µ₁, σ₁²), with µ₁ = 10⁻³ MPa and σ₁ = 0.2 µ₁,
  ◮ X₂ ∼ N(µ₂, σ₂²), with µ₂ = 300 mm and σ₂ = 0.1 µ₂,
  ◮ truncated, independent Gaussian variables.
❍ A failure occurs when f(X₁, X₂) > u = L/325.
  ◮ Reference value: α ≈ 3.94 × 10⁻⁶,
  ◮ obtained by MC with m = 10¹⁰ (⇒ δ ≈ 0.5%); a crude MC sketch of this computation follows.
❍ Note: our beam is thicker than that of Rajashekhar & Ellingwood, to make α smaller!
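
A minimal sketch of the reference computation, using the numbers stated on the slides (L = 6 m, E = 2.6 × 10⁴ MPa, u = L/325), working in mm and MPa. The truncation bounds of the Gaussian inputs are not given in the slides; truncation at ±5 standard deviations is our assumption here, and m is far smaller than the 10¹⁰ runs behind the reference value, so the estimate is only rough:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

L, E = 6000.0, 2.6e4                 # length (mm), Young's modulus (MPa)
u = L / 325.0                        # critical deflection (mm)

mu1, sigma1 = 1e-3, 0.2e-3           # load per unit area (MPa)
mu2, sigma2 = 300.0, 30.0            # beam depth (mm)

def truncated_normal(mu, sigma, size, k=5.0):
    """Gaussian truncated at +/- k standard deviations (k is an assumption)."""
    return stats.truncnorm.rvs(-k, k, loc=mu, scale=sigma,
                               size=size, random_state=rng)

m = 10_000_000                       # far fewer runs than the m = 1e10 reference
x1 = truncated_normal(mu1, sigma1, m)
x2 = truncated_normal(mu2, sigma2, m)
deflection = 3 * L**4 * x1 / (2 * E * x2**3)
print(f"alpha_hat = {np.mean(deflection > u):.2e}")   # reference: about 3.94e-6
```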

[Animation over several slides: subset simulation with p₀ = 10% and m = 16 000; the plots are not recoverable from the transcript.]

And now... Bayesian subset simulation! (1/2)

❍ In the previous experiment, subset simulation performed
      $N = m + (1 - p_0)(T - 1)\, m = 88\,000$
  evaluations of f, where T = 6 is the number of stages.
❍ Idea: we can do much better with a Gaussian process prior.
❍ Key idea #1 (sequential Monte Carlo):
  ◮ SS uses an expensive sequence of target densities
        $q_t \propto \mathbf{1}_{f > u_{t-1}} \, \pi_X,$
    where u_t is the target level at stage t.
  ◮ We replace them by the cheaper densities
        $q_t \propto P_n(f > u_{t-1}) \, \pi_X,$
    where P_n is the GP posterior given n evaluations of f (see the sketch after this slide).
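
Since the posterior distribution of f(x) under a GP model is Gaussian, P_n(f(x) > u) has a closed form in terms of the kriging mean and standard deviation. A minimal sketch; the arrays supplying the posterior moments are placeholders for the output of whatever GP library is used:

```python
import numpy as np
from scipy.stats import norm

def excursion_probability(mean_n, sd_n, u):
    """P_n(f(x) > u) for a Gaussian posterior with moments mean_n, sd_n at x.

    mean_n, sd_n: arrays of kriging mean / standard deviation at the
    evaluation points (placeholders for the output of a GP library).
    """
    return norm.sf((u - mean_n) / np.maximum(sd_n, 1e-12))

# The stage-t SMC target density is then, up to normalization,
#   q_t(x) ∝ excursion_probability(mean_n(x), sd_n(x), u_{t-1}) * pi_X(x),
# which is cheap to evaluate compared with running the true simulator f.
```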

And now... Bayesian subset simulation! (2/2)

❍ Key idea #2 (adaptive sampling):
  ◮ At each stage t, we improve our GP model around the next target level u_t.
  ◮ Strategy: Stepwise Uncertainty Reduction (SUR) (Vazquez & Piera-Martinez, 2007; Vazquez & Bect, 2009); a simplified stand-in is sketched below.
  ◮ Other strategies could be used as well (e.g., Picheny et al., 2011).
❍ Miscellaneous details:
  ◮ Number of evaluations per stage: chosen adaptively.
  ◮ Number of stages T and levels u_t: chosen adaptively.
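
The SUR criterion used in the paper involves an expectation over candidate future observations, which we do not reproduce here. As a cruder stand-in in the same spirit, the sketch below selects, among the current SMC particles, the point whose classification with respect to the target level u_t is most uncertain under the GP posterior; the function name and its GP-moment inputs are hypothetical placeholders:

```python
import numpy as np
from scipy.stats import norm

def next_evaluation_point(particles, gp_mean, gp_sd, u_t):
    """Pick the particle whose classification w.r.t. u_t is most uncertain.

    particles: (m, d) array of current SMC particles;
    gp_mean, gp_sd: kriging posterior moments at those particles
    (placeholders for the output of a GP library).
    """
    p = norm.sf((u_t - gp_mean) / np.maximum(gp_sd, 1e-12))
    misclassification = np.minimum(p, 1.0 - p)   # maximal when p ≈ 1/2
    return particles[np.argmax(misclassification)]
```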

[Animation over several slides: BSS with p₀ = 10% and m = 16 000; the plots are not recoverable from the transcript.]

Performance?

❍ Preliminary Monte Carlo studies (PhD thesis of Ling Li, 2012):
  ◮ test cases in dimensions d = 2 and d = 6,
  ◮ comparison with plain subset simulation and the ²SMART algorithm (Deheeger, 2007; Bourinet et al., 2011),
  ⇒ very significant evaluation savings (for a comparable MSE).
❍ Our estimate is biased (nothing is free...):
  ◮ typically weakly biased in our experiments;
  ◮ two sources of bias, both of which can be removed:
    ◮ level-adaptation bias ➥ solution: two passes,
    ◮ Bayesian bias ➥ solution: evaluate all points at the last stage.

Closing remarks

❍ Estimating small probabilities of failure on expensive computer models is possible, using a blend of:
  ◮ advanced simulation techniques (here, SMC),
  ◮ meta-modelling (here, Gaussian process modelling).
❍ Benchmarking with respect to state-of-the-art techniques: work in progress.
❍ Open questions:
  ◮ How well do we need to know f at the intermediate stages?
  ◮ How smooth should f be for BSS to be efficient?
  ◮ Theoretical properties?

References

❍ This talk is based on the paper:
  ◮ Ling Li, Julien Bect and Emmanuel Vazquez. Bayesian Subset Simulation: a kriging-based subset simulation algorithm for the estimation of small probabilities of failure. Proceedings of PSAM 11 & ESREL 2012, June 25-29, 2012, Helsinki, Finland.
❍ For more information on kriging-based adaptive sampling strategies (a.k.a. sequential design of experiments):
  ◮ Julien Bect, David Ginsbourger, Ling Li, Victor Picheny and Emmanuel Vazquez. Sequential design of computer experiments for the estimation of a probability of failure. Statistics and Computing, 22(3):773–793, 2012.