QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop. Importance sampling the union of rare events, with an application to power systems analysis. Art Owen, Dec 15, 2017
1. SAMSI Monte Carlo workshop 1
Importance sampling the union of rare events, with an application to power systems analysis
Art B. Owen
Stanford University
With Yury Maximov and Michael Chertkov
of Los Alamos National Laboratory.
December 15, 2017
2. SAMSI Monte Carlo workshop 2
MCQMC 2018
July 1–8, 2018
Rennes, France
MC, QMC, MCMC, RQMC, SMC, MLMC, MIMC, · · ·
Call for papers is now online:
http://mcqmc2018.inria.fr/
3. SAMSI Monte Carlo workshop 3
Rare event sampling
Motivation: an electrical grid has N nodes. Power p1, p2, · · · , pN
• Random pi > 0, e.g., wind generation,
• Random pi < 0, e.g. consumption,
• Fixed pi for controllable nodes,
AC phase angles θi
• (θ1, . . . , θN ) = F(p1, . . . , pN )
• Constraints: |θi − θj| < ¯θ if i ∼ j
• Find P(max_{i∼j} |θi − θj| ≥ ¯θ)
Simplified model
• (p1, . . . , pN ) Gaussian
• θ linear in p1, . . . , pN
4. SAMSI Monte Carlo workshop 4
Gaussian setup
For x ∼ N(0, I_d) and H_j = {x : ω_jᵀ x ≥ τ_j}, find µ = P(x ∈ ∪_{j=1}^J H_j).
WLOG ‖ω_j‖ = 1. Ordinarily τ_j > 0. d is in the hundreds, J in the thousands.
[Figure: half-space boundaries in R².] Solid: deciles of x. Dashed: 10⁻³ · · · 10⁻⁷.
5. SAMSI Monte Carlo workshop 5
Basic bounds
Let P_j ≡ P(ω_jᵀ x ≥ τ_j) = Φ(−τ_j). Then

    max_{1≤j≤J} P_j ≡ µ_ ≤ µ ≤ µ̄ ≡ Σ_{j=1}^J P_j

Inclusion-exclusion

For u ⊆ 1:J = {1, 2, . . . , J}, let

    H_u = ∪_{j∈u} H_j,   H_u(x) ≡ 1{x ∈ H_u},   P_u = P(H_u) = E(H_u(x))

    µ = Σ_{|u|>0} (−1)^{|u|−1} P_u
Plethora of bounds
Survey by Yang, Alajaji & Takahara (2014)
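As a small numeric sketch (not from the slides), the two basic bounds computed from the P_j alone; the τ_j values below are invented for illustration.

```python
import numpy as np
from scipy.stats import norm

# P_j = P(omega_j^T x >= tau_j) = Phi(-tau_j); these tau_j are made up.
tau = np.array([4.0, 4.2, 4.5, 5.0])
P = norm.cdf(-tau)

lower = P.max()   # max_j P_j <= mu
upper = P.sum()   # mu <= mu_bar, the union bound
print(lower, upper)
```

The true µ lies between these two numbers; they coincide in order of magnitude when one event dominates.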
6. SAMSI Monte Carlo workshop 6
Other uses
• Other engineering reliability
• False discoveries in statistics: J correlated test statistics
• Speculative: inference after model selection
Taylor, Fithian, Markovic, Tian
7. SAMSI Monte Carlo workshop 7
Importance sampling
For x ∼ p, seek η = E_p(f(x)) = ∫ f(x)p(x) dx. Take

    η̂ = (1/n) Σ_{i=1}^n f(x_i)p(x_i) / q(x_i),   x_i iid ∼ q

Unbiased if q(x) > 0 whenever f(x)p(x) ≠ 0.

Variance

Var(η̂) = σ_q²/n, where

    σ_q² = ∫ f²p²/q − η² = · · · = ∫ (fp − ηq)²/q

Numerator: seek q ≈ fp/η, i.e., nearly proportional to fp.
Denominator: watch out for small q.
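A minimal importance sampling sketch (not from the slides): estimating η = P(x ≥ τ) for x ∼ N(0, 1) with the shifted proposal q = N(τ, 1), which puts about half its samples in the rare region.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

tau = 5.0
n = 10_000
x = rng.normal(loc=tau, size=n)           # x_i ~ q = N(tau, 1)
f = (x >= tau).astype(float)              # indicator of the rare event
w = norm.pdf(x) / norm.pdf(x, loc=tau)    # likelihood ratio p(x_i)/q(x_i)
eta_hat = np.mean(f * w)

print(eta_hat, norm.cdf(-tau))            # both near 2.9e-7
```

Plain Monte Carlo with n = 10,000 would almost surely return 0 here; the IS estimate has a relative error of a few percent.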
8. SAMSI Monte Carlo workshop 8
Self-normalized IS
    η̂_SNIS = [ Σ_{i=1}^n f(x_i)p(x_i)/q(x_i) ] / [ Σ_{i=1}^n p(x_i)/q(x_i) ]

Available for unnormalized p and/or q.
Good for Bayes, limited effectiveness for rare events.
Optimal SNIS puts 1/2 the samples in the rare event.
Optimal plain IS puts all samples there.
Best possible asymptotic coefficient of variation is 2/√n.
9. SAMSI Monte Carlo workshop 9
Mixture sampling
For α_j ≥ 0, Σ_j α_j = 1,

    η̂_α = (1/n) Σ_{i=1}^n f(x_i)p(x_i) / Σ_j α_j q_j(x_i),   x_i iid ∼ q_α ≡ Σ_j α_j q_j

Defensive mixtures

Take q_0 = p, α_0 > 0. Get p/q_α ≤ 1/α_0. Hesterberg (1995)

Additional refs

Use the q_j (each integrating to 1) as control variates. O & Zhou (2000).
Optimization over α is convex. He & O (2014).
Multiple IS. Veach & Guibas (1994). Veach’s Oscar.
Elvira, Martino, Luengo, Bugallo (2015) Generalizations.
11. SAMSI Monte Carlo workshop 11
Mixture of conditional sampling
α_0, α_1, . . . , α_J ≥ 0,   Σ_j α_j = 1,   q_α = Σ_j α_j q_j,   q_0 ≡ p

    µ = P(x ∈ ∪_{j=1}^J H_j) = P(x ∈ H_{1:J}) = E(H_{1:J}(x))

Mixture IS

    µ̂_α = (1/n) Σ_{i=1}^n H_{1:J}(x_i)p(x_i) / Σ_{j=0}^J α_j q_j(x_i)   (where q_j = pH_j/P_j)

        = (1/n) Σ_{i=1}^n H_{1:J}(x_i) / Σ_{j=0}^J α_j H_j(x_i)/P_j

with H_0 ≡ 1 and P_0 = 1.
It would be nice to have αj/Pj constant.
12. SAMSI Monte Carlo workshop 12
ALORE
At Least One Rare Event
Like AMIS of Cornuet, Marin, Mira, Robert (2012)
Put α*_0 = 0, and take α*_j ∝ P_j. That is

    α*_j = P_j / Σ_{j'=1}^J P_{j'} = P_j / µ̄,   (µ̄ is the union bound)

Then

    µ̂_{α*} = (1/n) Σ_{i=1}^n H_{1:J}(x_i) / [ Σ_{j=1}^J α*_j H_j(x_i)/P_j ]

           = µ̄ × (1/n) Σ_{i=1}^n H_{1:J}(x_i) / Σ_{j=1}^J H_j(x_i)

If x ∼ q_{α*} then H_{1:J}(x) = 1. So

    µ̂_{α*} = µ̄ × (1/n) Σ_{i=1}^n 1/S(x_i),   S(x) ≡ Σ_{j=1}^J H_j(x)
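A self-contained sketch of the ALORE estimator on a toy problem where the conditional samplers are exact truncated normals and the true µ is known. The events H_j = {x ≥ τ_j} for x ∼ N(0, 1) are invented for illustration; their union is {x ≥ min_j τ_j}, so µ = Φ(−min_j τ_j) exactly.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

tau = np.array([3.0, 3.5])    # invented thresholds; H_j = {x >= tau_j}
P = norm.cdf(-tau)            # P_j = P(x in H_j)
mu_bar = P.sum()              # union bound
alpha = P / mu_bar            # alpha*_j proportional to P_j

n = 4000
j = rng.choice(len(tau), size=n, p=alpha)      # pick a mixture component
u = rng.uniform(size=n)
x = -norm.ppf(u * norm.cdf(-tau[j]))           # x ~ p conditioned on H_j

S = (x[:, None] >= tau[None, :]).sum(axis=1)   # S(x_i) = number of events at x_i
mu_hat = mu_bar * np.mean(1.0 / S)

mu_true = norm.cdf(-tau.min())
print(mu_hat, mu_true)
```

Since 1 ≤ S ≤ J, the estimate is automatically pinned between µ̄/J and µ̄, which is the robustness property discussed later in the talk.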
13. SAMSI Monte Carlo workshop 13
Prior work
Adler, Blanchet, Liu (2008, 2012) estimate

    w(b) = P( max_{t∈T} f(t) > b )

for a Gaussian random field f(t) over T ⊂ Rᵈ.
They also consider a finite set T = {t1, . . . , tN }.
Comparisons
• We use the same mixture estimate as them for finite T and Gaussian data.
• Their analysis is for Gaussian distributions.
We consider arbitrary sets of J events.
• They take limits as b → ∞.
We have non-asymptotic bounds and more general limits.
• Our analysis is limited to finite sets.
They handle extrema over a continuum.
14. SAMSI Monte Carlo workshop 14
Theorem
O, Maximov & Chertkov (2017)

Let H_1, . . . , H_J be events defined by x ∼ p.
Let q_j(x) = p(x)H_j(x)/P_j for P_j = P(x ∈ H_j).
Let x_i iid ∼ q_{α*} = Σ_{j=1}^J α*_j q_j for α*_j = P_j/µ̄.
Take

    µ̂ = µ̄ × (1/n) Σ_{i=1}^n 1/S(x_i),   S(x_i) = Σ_{j=1}^J H_j(x_i)

Then E(µ̂) = µ and

    Var(µ̂) = (1/n) ( µ̄ Σ_{s=1}^J T_s/s − µ² ) ≤ µ(µ̄ − µ)/n

where T_s ≡ P(S(x) = s).

The RHS follows because Σ_{s=1}^J T_s/s ≤ Σ_{s=1}^J T_s = µ.
15. SAMSI Monte Carlo workshop 15
Remarks
With S(x) equal to the number of rare events at x,

    µ̂ = µ̄ × (1/n) Σ_{i=1}^n 1/S(x_i)

    1 ≤ S(x) ≤ J  ⟹  µ̄/J ≤ µ̂ ≤ µ̄

Robustness

The usual problem with rare event estimation is getting no rare events in n tries.
Then µ̂ = 0.
Here the corresponding failure is never seeing S ≥ 2 rare events.
Then µ̂ = µ̄, an upper bound and probably fairly accurate if all S(x_i) = 1.
Conditioning vs sampling from N(µj, I)
Avoids wasting samples outside the failure zone.
Avoids awkward likelihood ratios.
16. SAMSI Monte Carlo workshop 16
General Gaussians
For y ∼ N(η, Σ) let the event be γ_jᵀ y ≥ κ_j.
Translate into ω_jᵀ x ≥ τ_j for

    ω_j = Σ^{1/2} γ_j / (γ_jᵀ Σ γ_j)^{1/2}   and   τ_j = (κ_j − γ_jᵀ η) / (γ_jᵀ Σ γ_j)^{1/2}
Adler et al.
Their context has all κj = b
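A sketch of this translation. The choice of square root is left open on the slide; a Cholesky factor of Σ is used here, and the example numbers are invented.

```python
import numpy as np

def to_standard_halfspace(gamma, kappa, eta, Sigma):
    """Map {gamma^T y >= kappa}, y ~ N(eta, Sigma), to {omega^T x >= tau}
    with x ~ N(0, I) and ||omega|| = 1, via y = eta + L x, Sigma = L L^T."""
    L = np.linalg.cholesky(Sigma)
    scale = np.sqrt(gamma @ Sigma @ gamma)     # = ||L^T gamma||
    omega = (L.T @ gamma) / scale
    tau = (kappa - gamma @ eta) / scale
    return omega, tau

# Example: y1 ~ N(1, 4); the event {y1 >= 5} should become {x1 >= 2}.
omega, tau = to_standard_halfspace(
    np.array([1.0, 0.0]), 5.0, np.array([1.0, 0.0]), np.diag([4.0, 1.0]))
print(omega, tau)   # [1. 0.] 2.0
```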
17. SAMSI Monte Carlo workshop 17
Sampling
Get x ∼ N(0, I) such that ωᵀx ≥ τ. Write y = ωᵀx.

1) Sample u ∼ U(0, 1).
2) Let y = Φ⁻¹(Φ(τ) + u(1 − Φ(τ))). May easily get y = ∞.
3) Sample z ∼ N(0, I).
4) Deliver x = ωy + (I − ωωᵀ)z.

Better numerics from

1) Sample u ∼ U(0, 1).
2) Let y = Φ⁻¹(uΦ(−τ)).
3) Sample z ∼ N(0, I).
4) Let x = ωy + (I − ωωᵀ)z.
5) Deliver x := −x.

I.e., sample x ∼ N(0, I) subject to ωᵀx ≤ −τ and deliver −x.
Step 4 via ωy + z − ω(ωᵀz).
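The two recipes can be compared directly. This sketch implements the stable version and shows the naive step 2 overflowing already at τ = 10, where Φ(τ) rounds to 1.0 in double precision; the vector ω is invented.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def sample_halfspace(omega, tau, rng):
    """Draw x ~ N(0, I) conditioned on omega^T x >= tau (||omega|| = 1),
    using the stable recipe: sample omega^T x <= -tau, then negate."""
    u = rng.uniform()
    y = norm.ppf(u * norm.cdf(-tau))          # y <= -tau, finite even for large tau
    z = rng.standard_normal(omega.shape[0])
    x = omega * y + z - omega * (omega @ z)   # omega*y + (I - omega omega^T) z
    return -x

omega = np.array([1.0, 0.0, 0.0])
x = sample_halfspace(omega, 10.0, rng)
print(omega @ x)                              # >= 10, as required

# Naive step 2 at tau = 10: Phi(10) rounds to 1.0, so Phi^{-1} returns inf.
print(norm.ppf(norm.cdf(10.0) + 0.5 * (1.0 - norm.cdf(10.0))))   # inf
```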
18. SAMSI Monte Carlo workshop 18
More bounds
Recall that T_s = P(S(x) = s). Therefore

    µ̄ = Σ_{j=1}^J P_j = E( Σ_{j=1}^J H_j(x) ) = E(S) = Σ_{s=1}^J s T_s

    µ = E( max_{1≤j≤J} H_j(x) ) = P(S > 0), so

    µ̄ = E(S | S > 0) × P(S > 0) = µ × E(S | S > 0)

Therefore

    n × Var(µ̂) = µ̄ Σ_{s=1}^J T_s/s − µ²

                = µE(S | S > 0) × µE(S⁻¹ | S > 0) − µ²

                = µ² ( E(S | S > 0) E(S⁻¹ | S > 0) − 1 )
19. SAMSI Monte Carlo workshop 19
Bounds continued
    Var(µ̂) = (µ²/n) ( E(S | S > 0) × E(S⁻¹ | S > 0) − 1 )

Lemma

Let S be a random variable in {1, 2, . . . , J} for J ∈ N. Then

    E(S) E(S⁻¹) ≤ (J + J⁻¹ + 2) / 4

with equality if and only if S ∼ U{1, J}.

Corollary

    Var(µ̂) ≤ (µ²/n) × (J + J⁻¹ − 2) / 4.
Thanks to Yanbo Tang and Jeffrey Negrea for an improved proof of the lemma.
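A quick numeric check of the lemma (J = 7 is an arbitrary choice): equality holds at the two-point uniform distribution on {1, J}, and randomly drawn distributions on {1, . . . , J} stay below the bound.

```python
import numpy as np

J = 7
bound = (J + 1 / J + 2) / 4

# Equality case: S uniform on the two-point set {1, J}.
ES = (1 + J) / 2
EinvS = (1 + 1 / J) / 2
assert np.isclose(ES * EinvS, bound)

# Random pmfs on {1, ..., J} never exceed the bound.
rng = np.random.default_rng(0)
s = np.arange(1, J + 1)
for _ in range(100):
    p = rng.dirichlet(np.ones(J))
    assert (p @ s) * (p @ (1.0 / s)) <= bound + 1e-12
print("lemma bound verified for J =", J)
```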
20. SAMSI Monte Carlo workshop 20
Numerical comparison
The R package mvtnorm of Genz & Bretz (2009) computes

    P(a ≤ y ≤ b) = P( ∩_j {a_j ≤ y_j ≤ b_j} )   for y ∼ N(η, Σ)

Their code makes sophisticated use of quasi-Monte Carlo.
Adaptive, up to 25,000 evals in FORTRAN.
It was not designed for rare events.
It computes an intersection.

Usage

Pack the ω_j into Ω and the τ_j into T:

    1 − µ = P(Ωᵀx ≤ T) = P(y ≤ T),   y ∼ N(0, ΩᵀΩ)

It can handle up to 1000 inputs, i.e. J ≤ 1000.
Upshot
ALORE works (much) better for rare events.
mvtnorm works better for non-rare events.
21. SAMSI Monte Carlo workshop 21
Polygon example
P(J, τ): regular J-sided polygon in R² circumscribing the circle of radius τ > 0.

    P = { x : (sin(2πj/J), cos(2πj/J))ᵀ x ≤ τ,  j = 1, . . . , J }

a priori bounds

    µ = P(x ∈ Pᶜ) ≤ P(χ²_(2) ≥ τ²) = exp(−τ²/2)

This is pretty tight. A trigonometric argument gives

    1 ≤ µ / exp(−τ²/2) ≤ 1 / ( 1 − (J tan(π/J) − π) τ² / (2π) ),

where (J tan(π/J) − π) τ²/(2π) ≐ π²τ²/(6J²). Let's use J = 360.
So µ ≐ exp(−τ²/2) for reasonable τ.
22. SAMSI Monte Carlo workshop 22
IS vs MVN
ALORE had n = 1000. MVN had up to 25,000. 100 random repeats.
    τ    µ            E((µ̂_ALORE/µ − 1)²)   E((µ̂_MVN/µ − 1)²)
    2    1.35×10⁻¹    0.000399              9.42×10⁻⁸
    3    1.11×10⁻²    0.000451              9.24×10⁻⁷
    4    3.35×10⁻⁴    0.000549              2.37×10⁻²
    5    3.73×10⁻⁶    0.000600              1.81×10⁰
    6    1.52×10⁻⁸    0.000543              4.39×10⁻¹
    7    2.29×10⁻¹¹   0.000559              3.62×10⁻¹
    8    1.27×10⁻¹⁴   0.000540              1.34×10⁻¹
For τ = 5 MVN had a few outliers.
23. SAMSI Monte Carlo workshop 23
Polygon again
[Two histograms.] Results of 100 estimates of P(x ∈ P(360, 6)), divided by exp(−6²/2).
Left panel: ALORE. Right panel: pmvnorm.
24. SAMSI Monte Carlo workshop 24
Symmetry
Maybe the circumscribed polygon is too easy due to symmetry.
Redo it with just ω_j = (cos(2πj/J), sin(2πj/J))ᵀ for the 72 prime numbers j ∈ {1, 2, . . . , 360}.
For τ = 6 the variance of µ̂/exp(−18) is 0.00077 for ALORE (n = 1000), and 8.5 for pmvnorm.
25. SAMSI Monte Carlo workshop 25
High dimensional half spaces
Take random ω_j ∈ Rᵈ for large d.
By concentration of measure they're nearly orthogonal.

    µ ≐ 1 − Π_{j=1}^J (1 − P_j) ≐ Σ_j P_j = µ̄   (for rare events)
Simulation
• Dimension d ∼ U{20, 50, 100, 200, 500}
• Constraints J ∼ U{d/2, d, 2d}
• τ such that − log10(¯µ) ∼ U[4, 8] (then rounded to 2 places).
27. SAMSI Monte Carlo workshop 27
Power systems
Network with N nodes called busses and M edges.
Typically M/N is not very large.
Power p_i at bus i. Each bus is Fixed or Random or Slack:
the slack bus S has p_S = − Σ_{i≠S} p_i.

    p = (p_Fᵀ, p_Rᵀ, p_S)ᵀ ∈ Rᴺ

Randomness driven by p_R ∼ N(η_R, Σ_RR)

Inductances B_ij

A Laplacian: B_ij = 0 if i ≁ j in the network, and B_ii = − Σ_{j≠i} B_ij.

Taylor approximation

    θ = B⁺ p   (pseudo-inverse)

Therefore θ is Gaussian, as are all θ_i − θ_j for i ∼ j.
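A minimal sketch of the DC/Taylor step on an invented 3-bus network; the susceptance values are made up for illustration.

```python
import numpy as np

# Invented 3-bus network: (bus i, bus j, B_ij).
edges = [(0, 1, 10.0), (1, 2, 5.0), (0, 2, 2.0)]
N = 3
B = np.zeros((N, N))
for i, j, b in edges:
    B[i, j] = B[j, i] = b
for i in range(N):
    B[i, i] = -B[i].sum()        # B_ii = -sum_{j != i} B_ij, so rows sum to zero

p = np.array([1.2, -0.5, -0.7])  # balanced injections: sum(p) = 0
theta = np.linalg.pinv(B) @ p    # DC phase angles, theta = B^+ p

# theta is linear in p, so Gaussian p_R gives Gaussian theta and
# Gaussian phase differences theta_i - theta_j along each edge.
print(B @ theta)                 # recovers p (up to the null space of B)
```

Since balanced p lies in the range of B, B B⁺ p returns p exactly, which is the consistency check printed above.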
28. SAMSI Monte Carlo workshop 28
Examples
Our examples come from MATPOWER (a Matlab toolbox).
Zimmerman, Murillo-Sánchez & Thomas (2011)
We used n = 10,000.
Polish winter peak grid
2383 busses, d = 326 random busses, J = 5772 phase constraints.
    ω̄     µ̂           se/µ̂     µ_          µ̄
    π/4   3.7×10⁻²³   0.0024   3.6×10⁻²³   4.2×10⁻²³
    π/5   2.6×10⁻¹²   0.0022   2.6×10⁻¹²   2.9×10⁻¹²
    π/6   3.9×10⁻⁷    0.0024   3.9×10⁻⁷    4.4×10⁻⁷
    π/7   2.0×10⁻³    0.0027   2.0×10⁻³    2.4×10⁻³

ω̄ is the phase constraint, µ̂ is the ALORE estimate, se is the estimated standard
error, µ_ is the largest single event probability and µ̄ is the union bound.
29. SAMSI Monte Carlo workshop 29
Pegase 2869
Fliscounakis et al. (2013) “large part of the European system”
N = 2869 busses. d = 509 random busses. J = 7936 phase constraints.
Results
    ω̄     µ̂           se/µ̂      µ_          µ̄
    π/2   3.5×10⁻²⁰   0*        3.3×10⁻²⁰   3.5×10⁻²⁰
    π/3   8.9×10⁻¹⁰   5.0×10⁻⁵  7.7×10⁻¹⁰   8.9×10⁻¹⁰
    π/4   4.3×10⁻⁶    1.8×10⁻³  3.5×10⁻⁶    4.6×10⁻⁶
    π/5   2.9×10⁻³    3.5×10⁻³  1.8×10⁻³    4.1×10⁻³
Notes
θ̄ = π/2 is unrealistically large. We got µ̂ = µ̄. All 10,000 samples had S = 1.
Some sort of Wilson interval or Bayesian approach could help.
One half space was sampled 9408 times, another 592 times.
30. SAMSI Monte Carlo workshop 30
Other models
IEEE case 14, IEEE case 300 and Pegase 1354 were all dominated by one
failure mode, so µ ≐ µ̄ and no sampling is needed.
Another model had random power corresponding to wind generators but phase
failures were not rare events in that model.
The Pegase 13659 model was too large for our computer. The Laplacian had
37,250 rows and columns.
The Pegase 9241 model was large and slow and phase failure was not a rare
event.
Caveats
We used a DC approximation to AC power flow (which is common) and the phase
estimates were based on a Taylor approximation.
31. SAMSI Monte Carlo workshop 31
Next steps
1) Nonlinear boundaries
2) Non-Gaussian models
3) Optimizing cost subject to a constraint on µ
Sampling half-spaces will work if we have convex failure regions and a
log-concave nominal density.
32. SAMSI Monte Carlo workshop 32
Thanks
• Michael Chertkov and Yury Maximov, co-authors
• Center for Nonlinear Studies at LANL, for hospitality
• Alan Genz for help on mvtnorm
• Bert Zwart, pointing out Adler et al.
• Yanbo Tang and Jeffrey Negrea, improved Lemma proof
• NSF DMS-1407397 & DMS-1521145
• DOE/GMLC 2.0 Project: Emergency monitoring and controls through new
technologies and analytics
• Jianfeng Lu, Ilse Ipsen
• Sue McDonald, Kerem Jackson, Thomas Gehrmann