This talk will report briey on some findings from the problem of picking the weights for a weighted function space in QMC. Then it will be mostly about importance sampling. We want to estimate the probability _ of a union of J rare events. The method uses n samples, each of which picks one of the rare events at random, samples conditionally on that rare event happening and counts the total number of rare events that happen. It was used by Naiman and Priebe for scan
statistics, Shi, Siegmund and Yakir for genomic scans and Adler, Blanchet and Liu for extrema of Gaussian processes. We call it ALOE, for `at least one event'. The ALOE estimate is unbiased and we find that it has a coefficient of variation no larger than p (J + J�1 � 2)=(4n). The coefficient of variation is also no larger than p (__=_ � 1)=n where __ is the union bound. Our motivating problem comes from power system reliability, where the phase differences between connected nodes have a joint Gaussian distribution and the J rare events arise from unacceptably large phase differences. In the grid reliability problems even some events defined by 5772
constraints in 326 dimensions, with probability below 10�22, are estimated with a coefficient of variation of about 0:0024 with only n = 10;000 sample values. In a genomic context, the rare events become false discoveries. There we are interested in the possibility of a large number of simultaneous events, not just one or more. Some work with Kenneth Tay will be presented on that problem.
Joint with Yury Maximov and Michael Chertkov Los Alamos National Laboratory and Kenneth Tay, Stanford
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Similar to QMC: Transition Workshop - Importance Sampling the Union of Rare Events with an Application to Power Systems Analysis - Art Owen, May 9, 2018
Similar to QMC: Transition Workshop - Importance Sampling the Union of Rare Events with an Application to Power Systems Analysis - Art Owen, May 9, 2018 (20)
QMC: Transition Workshop - Importance Sampling the Union of Rare Events with an Application to Power Systems Analysis - Art Owen, May 9, 2018
1. SAMSI QMC Transition workshop 1
SAMSI QMC transition
workshop
Art Owen
Stanford University
May 9, 2018
2. SAMSI QMC Transition workshop 2
Importance sampling the union
of rare events, with bounded
error and an application to
power systems analysis and
another application to genomic
testing, plus something about
the weight working group
A. B. Owen + Yury Maximov + Michael Chertkov with Kenneth J. Tay and Yue Hui
May 9, 2018
3. SAMSI QMC Transition workshop 3
Acknowledgments
1) Mac Hyman, Ilse Ipsen & SAMSI staff
2) Fred, Frances, Pierre
May 9, 2018
4. SAMSI QMC Transition workshop 4
Weight working group
• We want µ = f(x) dx
• We use ˆµ = (1/n)
n
i=1 f(xi)
The RKHS theory is strong for f in a weighted Sobolev space
Which weighted space?
For most applications, f belongs to
• All of the RHKS, or
• none of them.
We had in mind a pipeline
f
1
→ weights
2
→ polynomial lattice rule
3
→ ˆµ
We got stuck on 2. Note from after the talk: chatting with Pierre L’Ecuyer it is clear
that the sticking point is more like step 1.5, choosing a figure of merit.
Lattice builder + SSJ: Godin, Jemel, L’Ecuyer, Marion, Munger
May 9, 2018
5. SAMSI QMC Transition workshop 5
Interleaving
This is a strategy by Dick & Baldeaux to build a new QMC input by smushing
together digits from two old ones.
You need a net in [0, 1]2s
to get points in [0, 1]s
.
Should we smush them all, or just some?
Yue Hui has looked into strategies for interleaving some but not all inputs to QMC
Preliminary results:
• small differences from which inputs get interleaved
• enormous gains for a variable with a large additive component
• lesser gains from variables that interact
Tentative finding: Variables with large closed Sobol’ index but not so large total
Sobol’ index are promising.
May 9, 2018
6. SAMSI QMC Transition workshop 6
Rare event sampling
Motivation: an electrical grid has N nodes. Power p1, p2, · · · , pN
• Random pi > 0, e.g., wind generation,
• Random pi < 0, e.g., consumption,
• Fixed pi for controllable nodes,
AC phase angles θi
• (θ1, . . . , θN ) = F(p1, . . . , pN )
• Constraints: |θi − θj| < ¯θ if i ∼ j
• Find P maxi∼j |θi − θj| ¯θ
Simplified model
• (p1, . . . , pN ) Gaussian
• θ linear in p1, . . . , pN
May 9, 2018
7. SAMSI QMC Transition workshop 7
Gaussian setup
For x ∼ N(0, Id), Hj = {x | ωT
j x τj} find µ = P(x ∈ ∪J
j=1Hj).
WLOG ωj = 1. Ordinarily τj > 0. d hundreds. J thousands.
−10 −5 0 5 10
−6−4−20246
c(−7,6)
q
q
q
q
q
q
Solid: deciles of x . Dashed: 10−3
· · · 10−7
. May 9, 2018
8. SAMSI QMC Transition workshop 8
Basic bounds
Let Pj ≡ P(ωT
j x τj) = Φ(−τj)
then µ ¯µ ≡
J
j=1
Pj (union bound)
and µ µ ≡ max
1 j J
Pj
Inclusion-exclusion
For u ⊆ 1:J = {1, 2, . . . , J}, let
Hu = ∪j∈uHj Hu(x) ≡ 1{x ∈ Hu} Hc
u(x) ≡ 1 − Hu(x)
Pu = P(Hu) = E(Hu(x))
µ =
|u|>0
(−1)|u|−1
Pu
May 9, 2018
9. SAMSI QMC Transition workshop 9
Importance sampling
For x ∼ p, seek η = Ep(f(x)) = f(x)p(x) dx. Take
ˆη =
1
n
n
i=1
f(xi)p(xi)
q(xi)
, xi
iid
∼ q
Unbiased if q(x) > 0 whenever f(x)p(x) = 0.
Variance
Var(ˆη) = σ2
q /n, where
σ2
q =
f2
p2
q
− µ2
= · · · =
(fp − µq)2
q
Num: seek q ≈ fp/µ, i.e., nearly proportional to fp
Den: watch out for small q.
May 9, 2018
10. SAMSI QMC Transition workshop 10
Mixture sampling
For αj 0, j αj = 1
ˆηα =
1
n
n
i=1
f(xi)p(xi)
j αjqj(xi)
, xi
iid
∼ qα ≡
j
αjqj
Defensive mixtures
Take q0 = p, α0 > 0. Get p/qα 1/α0. Hesterberg (1995)
Additional refs
Use qj = 1 as control variates. O & Zhou (2000).
Optimization over α is convex. He & O (2014).
Multiple IS. Veach & Guibas (1994). Veach’s Oscar.
Elvira, Martino, Luengo, Bugallo (2015) Generalizations.
May 9, 2018
12. SAMSI QMC Transition workshop 12
Mixture of conditional sampling
α1, . . . , αJ 0,
j
αj = 1, qα =
j
αjqj
µ = P(x ∈ ∪J
j=1Hj) = P(x ∈ H1:J ) = E H1:J (x)
Mixture IS
ˆµα =
1
n
n
i=1
H1:J (xi)p(xi)
J
j=1 αjqj(xi)
(where qj = pHj/Pj)
=
1
n
n
i=1
H1:J (xi)
J
j=1 αjHj(xi)/Pj
with H0 ≡ 1 and P0 = 1.
It would be nice to have αj/Pj constant.
May 9, 2018
13. SAMSI QMC Transition workshop 13
ALOE
At Least One (Rare) Event
Like AMIS of Cornuet, Marin, Mira, Robert (2012)
Take αj = α∗
j ∝ Pj. That is
α∗
j =
Pj
J
j =1 Pj
=
Pj
¯µ
, ( ¯µ is the union bound )
Then
ˆµα∗ =
1
n
n
i=1
H1:J (xi)
J
j=1 αjHj(xi)/Pj
= ¯µ ×
1
n
n
i=1
H1:J (xi)
J
j=1 Hj(xi)
If x ∼ qα∗ then H1:J (x) = 1. So
ˆµα∗ = ¯µ ×
1
n
n
i=1
1
S(xi)
, S(x) ≡
J
j=1
Hj(x)
May 9, 2018
14. SAMSI QMC Transition workshop 14
Some prior uses
Frigessi & Vercelis (1985)
Combinatorial enumeration.
Naiman & Priebe (2001)
Scan statistics, genetics and imaging.
Naiman, Priebe & Cope (2001)
Scan statistics, marked point processes (e.g., mine fields)
Shi, Siegmund, Yakir (2007)
Scan statistics, linkage analysis
Adler, Blanchet, Liu (2012)
Excursions of random fields over T ⊂ Rd
Nobody seems to have named it.
Thanks to: Bert Zwart, Jose Blanchet, David Siegmund for literature pointers
May 9, 2018
15. SAMSI QMC Transition workshop 15
Theorem
O, Maximov & Chertkov (2017)
Let H1, . . . , HJ be events defined by x ∼ p.
Let qj(x) = p(x)Hj(x)/Pj for Pj = P(x ∈ Hj).
Let xi
iid
∼ qα∗ =
J
j=1 α∗
j qj for α∗
j = Pj/¯µ.
Take
ˆµ = ¯µ ×
1
n
n
i=1
1
S(xi)
, S(xi) =
J
j=1
Hj(xi)
Then E(ˆµ) = µ and
Var(ˆµ) =
1
n
¯µ
J
s=1
Ts
s
− µ2 µ(¯µ − µ)
n
where Ts ≡ P(S(x) = s), x ∼ p.
The RHS follows because
J
s=1
Ts
s
J
s=1
Ts = µ.
May 9, 2018
16. SAMSI QMC Transition workshop 16
Remarks
With S(x) equal to the number of rare events at x,
ˆµ = ¯µ ×
1
n
n
i=1
1
S(xi)
1 S(x) J =⇒
¯µ
J
ˆµ ¯µ
Robustness
The usual problem with rare event estimation is getting no rare events in n tries.
Then ˆµ = 0.
Here the corresponding failure is never seeing S 2 rare events.
Then ˆµ = ¯µ, an upper bound and probably fairly accurate if all S(xi) = 1
Conditioning vs sampling from N(µj, I)
Avoids wasting samples outside the failure zone.
Avoids awkward likelihood ratios.
May 9, 2018
17. SAMSI QMC Transition workshop 17
General Gaussians
For y ∼ N(η, Σ) let the event be γT
j y κj.
Same as xT
ωj τj for x dim N(0, I) and
ωj =
γT
j Σ1/2
γT
j Σγj
and τj =
κj − γT
j η
γT
j Σγj
Optimizations
Later we will want to vary η.
That changes τj but not ωj.
May 9, 2018
18. SAMSI QMC Transition workshop 18
More bounds
For event count S(x) =
J
j=1 Hj(x) and x ∼ N(0, I):
E(S) =
J
j=1
Pj = ¯µ
P(S > 0) = E max
1 j J
Hj(x) = µ
E(S−1
| S > 0) =
J
s=1
P(S = s)
s
P(S > 0) =
J
s=1
P(S = s)
s
µ
Therefore
n × Var(ˆµ) = ¯µ
J
s=1
P(S = s)
s
− µ2
= ¯µ × µ × E(S−1
| S > 0) − µ2
= µ2
× E(S | S > 0) × E(S−1
| S > 0) − µ2
May 9, 2018
19. SAMSI QMC Transition workshop 19
Bounds continued
Var(ˆµ)
µ2
1
n
E(S | S > 0) × E(S−1
| S > 0) − 1
Lemma
Let S be a random variable in {1, 2, . . . , J} for J ∈ N. Then
E(S)E(S−1
)
J + J−1
+ 2
4
by Cauchy-Schwarz, with equality if and only if S ∼ U{1, J}.
Corollary
Var(ˆµ)
µ2
n
J + J−1
− 2
4
.
For J > 2 there must be a sharper bound for half-spaces and a Gaussian.
Thanks to Yanbo Tang and Jeffrey Negrea for an improved proof of the lemma.
May 9, 2018
20. SAMSI QMC Transition workshop 20
Numerical comparison
R function mvtnorm of Genz & Bretz (2009) gets
P a y b = P ∩j{aj yj bj} for y ∼ N(η, Σ)
Their code makes sophisticated use of quasi-Monte Carlo.
Adaptive, up to 25,000 evals in FORTRAN.
It was not designed for rare events.
It computes an intersection.
Usage
Pack ωj into Ω and τj into T
1 − µ = P(ΩT
x T ) = P(y T ), y ∼ N(0, ΩT
Ω)
It can handle up to 1000 inputs, i.e., J 1000
Upshot
ALOE works (much) better for rare events.
mvtnorm works better for non-rare events.
May 9, 2018
21. SAMSI QMC Transition workshop 21
Circumscribed polygon
P(J, τ) regular J sided polygon in R2
outside circle of radius τ > 0.
q
a priori bounds
µ = P(x ∈ Pc
) P(χ2
(2) τ2
) = exp(−τ2
/2)
This is pretty tight. A trigonometric argument gives
1
µ
exp(−τ2/2)
1 −
J tan π
J − π τ2
2π
.
= 1 −
π2
τ2
6J2
Let’s use J = 360
So µ
.
= exp(−τ2
/2) for reasonable τ.
May 9, 2018
22. SAMSI QMC Transition workshop 22
IS vs MVN
ALOE had n = 1000. MVN had up to 25,000. 100 random repeats.
τ µ E((ˆµALOE/µ − 1)2
) E((ˆµMVN/µ − 1)2
)
2 1.35×10−01
0.000399 9.42×10−08
3 1.11×10−02
0.000451 9.24×10−07
4 3.35×10−04
0.000549 2.37×10−02
5 3.73×10−06
0.000600 1.81×10+00
6 1.52×10−08
0.000543 4.39×10−01
7 2.29×10−11
0.000559 3.62×10−01
8 1.27×10−14
0.000540 1.34×10−01
For τ = 5 MVN had a few outliers.
May 9, 2018
23. SAMSI QMC Transition workshop 23
Polygon again
ALOE
Relative estimate
Frequency
0.96 1.00 1.04
0246810 Pmvnorm
Relative estimate
Frequency
1 2 3 4 5 6
0102030405060
Results of 100 estimates of the P(x ∈ P(360, 6)), divided by exp(−62
/2).
Left panel: ALOE. Right panel: pmvnorm.
May 9, 2018
24. SAMSI QMC Transition workshop 24
More examples
Similar results for high dimensions.
ALOE very effective on rare events.
MVN can go > 1000× the union bound.
May 9, 2018
25. SAMSI QMC Transition workshop 25
Power systems
Network with N nodes called busses and M edges.
Typically M/N is not very large.
Power pi at bus i. Each bus is Fixed or Random or Slack:
slack bus S has pS = − i=S pi.
p = (pT
F , pT
R, pS)T
∈ RN
Randomness driven by pR ∼ N(ηR, ΣRR)
Inductances Bij
A Laplacian Bij = 0 if i ∼ j in the network. Bii = − j=i Bij.
Taylor approximation
θ = B+
p (pseudo inverse)
Therefore θ is Gaussian as are all θi − θj for i ∼ j.
May 9, 2018
26. SAMSI QMC Transition workshop 26
Examples
Our examples come from MATPOWER (a Matlab toolbox).
Zimmernan, Murillo-S´anchez & Thomas (2011)
We used n = 10,000.
Polish winter peak grid
2383 busses, d = 326 random busses, J = 5772 phase constraints.
¯ω ˆµ se/ˆµ µ ¯µ
π/4 3.7 × 10−23
0.0024 3.6 × 10−23
4.2 × 10−23
π/5 2.6 × 10−12
0.0022 2.6 × 10−12
2.9 × 10−12
π/6 3.9 × 10−07
0.0024 3.9 × 10−07
4.4 × 10−07
π/7 2.0 × 10−03
0.0027 2.0 × 10−03
2.4 × 10−03
¯ω is the phase constraint, ˆµ is the ALOE estimate, se is the estimated standard
error, µ is the largest single event probability and ¯µ is the union bound. May 9, 2018
27. SAMSI QMC Transition workshop 27
Pegase 2869
Fliscounakis et al. (2013) “large part of the European system”
N = 2869 busses. d = 509 random busses. J = 7936 phase constraints.
Results
¯ω ˆµ se/ˆµ µ ¯µ
π/2 3.5 × 10−20
0∗
3.3 × 10−20
3.5 × 10−20
π/3 8.9 × 10−10
5.0 × 10−5
7.7 × 10−10
8.9 × 10−10
π/4 4.3 × 10−06
1.8 × 10−3
3.5 × 10−06
4.6 × 10−06
π/5 2.9 × 10−03
3.5 × 10−3
1.8 × 10−03
4.1 × 10−03
Notes
¯θ = π/2 is unrealistically large. We got ˆµ = ¯µ. All 10,000 samples had S = 1.
Some sort of Wilson interval or Bayesian approach could help.
One half space was sampled 9408 times, another 592 times. May 9, 2018
28. SAMSI QMC Transition workshop 28
Other models
IEEE case 14 and IEEE case 300 and Pegase 1354 were all dominated by one
failure mode so µ
.
= ¯µ and no sampling is needed.
Another model had random power corresponding to wind generators but phase
failures were not rare events in that model.
The Pegase 13659 model was too large for our computer. The Laplacian had
37,250 rows and columns.
The Pegase 9241 model was large and slow and phase failure was not a rare
event.
Caveats
We used a DC approximation to AC power flow (which is common) and the phase
estimates were based on a Taylor approximation.
May 9, 2018
29. SAMSI QMC Transition workshop 29
Genomics (in progress)
Similar problem comes up with false discoveries. Sketch
Measure G genes on n people.
Expression X = (X1 X2 · · · XG) ∈ Rn×G
And a phenotype: Y ∈ Rn
Exploration
Which genes (if any) correlate with Y ?
What could go wrong’?
Try G = 10,000 tests at α = 5% significance.
Expect at least 500 discoveries by chance.
Bonferroni
So · · · test at level α/G, e.g., 5 × 10−6
.
Seems conservative
May 9, 2018
30. SAMSI QMC Transition workshop 30
Multiple testing
FD = # False discoveries
TD = # True discoveries
False discovery proportion and rate
FDP =
FD
FD+TD
, denom not 0
0, else
FDR = E(FDP)
If Y Gaussian and independent of X,
then |XT
g Y | > τ brings a false discovery.
Benjamini-Hochberg
Controls FDR at e.g., 10% for independent tests
(and some dependent cases)
Correlations among genes =⇒ False discoveries come in bursts.
May 9, 2018
31. SAMSI QMC Transition workshop 31
Testing for association
Center and scale columns Xg.
Gene g is significantly associated with Y if
XT
g Y τ or XT
g Y −τ
For Gaussian Y we have similar geometry to the power grid problem
Using ALOE
We get P(FD k) X held fixed Y random
Power systems case was k = 1.
May 9, 2018
32. SAMSI QMC Transition workshop 32
Bursts from Benjamini-Hochberg
We will use actual X, (Duchenne Muscular dystrophy data)
Gaussian noise Y
Thanks to Jingshu Wang
Target false discovery rate 10%.
log2(1 + #FD) 0 1 2 3 4 6 7 8 9 10 11
3,756 genes, 30 subj 909 50 18 7 3 3 1 4 2 1 2
3,756 Indep. p-values 894 93 13
12,625 genes, 23 subj 929 30 19 6 3 1 1 3 1 5 2
12,625 Indep. p-values 902 85 12 1
Table 1: In 1000 simulations of BH with Y ∼ N(0, In), and real data X or indep
p-values.
May 9, 2018
34. SAMSI QMC Transition workshop 34
Duchenne data p = 0.005
3756 genes
Black: ALOE estimates (5 times)
Red: binomial reference
Source: Kenneth Tay May 9, 2018
35. SAMSI QMC Transition workshop 35
Duchenne data p = 5 × 10−5
3756 genes
Black: ALOE estimates
Red: binomial reference
Source: Kenneth Tay
May 9, 2018
36. SAMSI QMC Transition workshop 36
Complete null
In the complete null, no genes associate with Y .
What if some do?
Does that raise or lower the number of false discoveries?
May 9, 2018
37. SAMSI QMC Transition workshop 37
Null domination
P FD k | TD = 0 P FD k | TD > 0
Dudoit and van der Laan show t-tests have asymptotic null domination
K. Tay simulations show: not even close for small n and big G.
May 9, 2018
38. SAMSI QMC Transition workshop 38
Generalization I
Replace half spaces by convex sets
−10 −5 0 5 10
−6−4−20246
c(−7,6)
q
q
q
Algorithm will go through
Variance bounds too
Coefficient of variation: ok unless P(blue) P(red)
May 9, 2018
39. SAMSI QMC Transition workshop 39
Generalization II
Replace Gaussian by log concave density
−10 −5 0 5 10
−6−4−20246
c(−7,6)
q
q
q
We can find the instantons
We may have to do location shifts (instead of conditional sampling) May 9, 2018
40. SAMSI QMC Transition workshop 40
Generalization III
Sharper power systems models are nonlinear. Let
Hj(x) ≡ {x | gj(x) τj}
• Boundary not known a priori
• Sampling xi reveals gj(xi)
• Kriging for gj.
May 9, 2018
41. SAMSI QMC Transition workshop 41
Generalization IV
To minimize cost subject to P(Hu) ε we want to estimate
∂E(Hu)
∂τj
After some algebra
∂µ
∂τj
= −ϕ(τj) 1 − E(H−j(x) | ωT
j x = τj)
= −ϕ(τj)E(Hc
−j(x) | ωT
j x = τj)
= −ϕ(τj)P(No other events | Hj barely happened)
The goal is to get noisy estimates from the same data used to estimate µ.
Notes
1) For large τj most ωT
j x τj satisfy ωT
j x
.
= τj
2) Adapt the sampling to ensure that several Hj happen even if τj is large.
May 9, 2018
42. SAMSI QMC Transition workshop 42
Thanks
• Weight working group + Yue Hui
• Michael Chertkov and Yury Maximov, co-authors
• Center for Nonlinear Studies at LANL, for hospitality
• Kenneth Tay for genomic examples
• Jingshu Wang for Duchenne muscular dystrophy data
• Alan Genz for discussions on mvtnorm
• Bert Zwart, Jose Blanchet, David Siegmund for references
• Yanbo Tang and Jeffrey Negrea, improved Lemma proof
• NSF DMS-1407397 & DMS-1521145
• DOE/GMLC 2.0 Project: Emergency monitoring and controls through new
technologies and analytics
• SAMSI, Ilse Ipsen, David Banks
• Sue, Kerem, Thomas, Rita, Rebecca, Rick May 9, 2018