# Omiros' talk on the Bernoulli factory problem

Talk given at BigMC seminar, 15 Apr. 2010, by Prof. Omiros Papaspiliopoulos, UPF


1. Simulating events of unknown probability by reverse time martingales (poprzez martyngały z czasem odwróconym). Omiros Papaspiliopoulos, UPF, Barcelona. Joint work with Krzysztof Latuszynski, Ioannis Kosmidis and Gareth O. Roberts (Warwick University).
2. Outline: motivation; the Bernoulli factory - general results and previous approaches; reverse time martingale approach to sampling; application to the Bernoulli factory problem.
3. Generic description of the problem. Let $p \in (0,1)$ be unknown. Given a black box that samples $p$-coins, can we construct a black box that samples $f(p)$-coins for a known $f$? For example $f(p) = \min(1, 2p)$.
5. Some history (see for example Peres, 1992). von Neumann posed and solved the problem $f(p) = 1/2$:
   1. set $n = 1$;
   2. sample $X_n, X_{n+1}$;
   3. if $(X_n, X_{n+1}) = (0,1)$ output 1 and STOP;
   4. if $(X_n, X_{n+1}) = (1,0)$ output 0 and STOP;
   5. set $n := n + 2$ and GOTO 2.
   Let's check why this works.
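The reason this works: the discordant pairs $(0,1)$ and $(1,0)$ each occur with probability $p(1-p)$, so conditioning on seeing a discordant pair yields an exactly fair bit regardless of $p$. A minimal sketch (the helper names are ours, not from the talk):

```python
import random

def p_coin(p):
    """A black-box biased coin: returns 1 with (nominally unknown) probability p."""
    return 1 if random.random() < p else 0

def von_neumann(flip):
    """von Neumann's scheme: draw pairs until a discordant pair appears.

    (0,1) and (1,0) are equally likely, each with probability p(1-p),
    so the output is an exactly fair bit whatever p is.
    """
    while True:
        x, y = flip(), flip()
        if (x, y) == (0, 1):
            return 1
        if (x, y) == (1, 0):
            return 0
```

Note that the expected number of $p$-coin flips per output bit is $1/(p(1-p))$, which blows up as $p$ approaches 0 or 1.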
6. The Bernoulli factory problem: for known $f$ and unknown $p$, how do we generate an $f(p)$-coin? von Neumann solved $f(p) = 1/2$. Asmussen conjectured $f(p) = 2p$, but it turned out to be difficult.
7. Exact simulation of diffusions as a Bernoulli factory. This is the description of EA closest in spirit to Beskos and Roberts (2005). Simulate $X_T$ at time $T > 0$ from
$$dX_t = \alpha(X_t)\,dt + dW_t, \quad X_0 = x \in \mathbb{R},\ t \in [0,T], \tag{1}$$
driven by the Brownian motion $\{W_t;\ 0 \le t \le T\}$.
8. Let $\Omega \equiv C([0,T],\mathbb{R})$, with co-ordinate mappings $B_t : \Omega \to \mathbb{R}$, $t \in [0,T]$, such that for any $t$, $B_t(\omega) = \omega(t)$, and the cylinder $\sigma$-algebra $\mathcal{C} = \sigma(\{B_t;\ 0 \le t \le T\})$. Write $W^x = \{W_t^x;\ 0 \le t \le T\}$ for the Brownian motion started at $x \in \mathbb{R}$, and $W^{x,u} = \{W_t^{x,u};\ 0 \le t \le T\}$ for the Brownian bridge.
9. Assumptions:
   1. The drift function $\alpha$ is differentiable.
   2. The function $h(u) = \exp\{A(u) - (u-x)^2/2T\}$, $u \in \mathbb{R}$, for $A(u) = \int_0^u \alpha(y)\,dy$, is integrable.
   3. The function $(\alpha^2 + \alpha')/2$ is bounded below by $\ell > -\infty$ and above by $\ell + r < \infty$. Define
$$\phi(u) = \frac{1}{r}\left[\frac{\alpha^2(u) + \alpha'(u)}{2} - \ell\right] \in [0,1]. \tag{2}$$
11. Let $\mathbb{Q}$ be the probability measure induced by the solution $X$ of (1) on $(\Omega, \mathcal{C})$, $\mathbb{W}$ the corresponding probability measure for $W^x$, and $\mathbb{Z}$ the probability measure defined by the following simple change of measure from $\mathbb{W}$: $d\mathbb{W}/d\mathbb{Z}(\omega) \propto \exp\{-A(B_T)\}$. Note that a stochastic process distributed according to $\mathbb{Z}$ has similar dynamics to the Brownian motion, except that the marginal distribution at time $T$ (with density, say, $h$) is biased according to $A$. Then
$$\frac{d\mathbb{Q}}{d\mathbb{Z}}(\omega) \propto \exp\left\{-rT \int_0^T T^{-1}\phi(B_t)\,dt\right\} \le 1 \quad \mathbb{Z}\text{-a.s.} \tag{3}$$
12. EA using the Bernoulli factory:
   1. simulate $u \sim h$;
   2. generate a $C_s$ coin, where $s := e^{-rTJ}$ and $J := \int_0^T T^{-1}\phi(W_t^{x,u})\,dt$;
   3. if $C_s = 1$ output $u$ and STOP;
   4. if $C_s = 0$ GOTO 1.
   Exploiting the Markov property, we can assume from now on that $rT < 1$.
13. The challenging part of the algorithm is Step 2, since exact computation of $J$ is impossible due to the integration over a Brownian bridge path. On the other hand, it is easy to generate $J$-coins:
$$C_J = \mathbb{I}\big(\psi < \phi(W_\chi^{x,u})\big), \quad \psi \sim U(0,1),\ \chi \sim U(0,T),$$
independent of the Brownian bridge $W^{x,u}$ and of each other. Therefore we deal with another instance of the problem studied here: given $p$-coins, how to generate $f(p)$-coins, where here $f$ is the exponential function. (Note the interplay between unbiased estimation and exact simulation.)
15. Keane and O'Brien - existence result. Keane and O'Brien (1994): let $f : P \subseteq (0,1) \to [0,1]$. Then it is possible to simulate an $f(p)$-coin $\iff$ $f$ is constant, or $f$ is continuous and for some $n \in \mathbb{N}$ and all $p \in P$ satisfies
$$\min\{f(p),\ 1 - f(p)\} \ge \min\{p,\ 1 - p\}^n.$$
However, their proof is not constructive. Note that the result rules out $\min\{1, 2p\}$, but not $\min\{1 - 2\varepsilon,\ 2p\}$.
22. Nacu-Peres (2005) Theorem - Bernstein polynomial approach. There exists an algorithm which simulates $f$ $\iff$ there exist polynomials
$$g_n(x,y) = \sum_{k=0}^n \binom{n}{k} a(n,k)\,x^k y^{n-k}, \quad h_n(x,y) = \sum_{k=0}^n \binom{n}{k} b(n,k)\,x^k y^{n-k}$$
with the following properties:
   - $0 \le a(n,k) \le b(n,k) \le 1$;
   - $\binom{n}{k} a(n,k)$ and $\binom{n}{k} b(n,k)$ are integers;
   - $\lim_{n\to\infty} g_n(p, 1-p) = f(p) = \lim_{n\to\infty} h_n(p, 1-p)$;
   - for all $m < n$, the coefficients of $g_n(x,y) - (x+y)^{n-m} g_m(x,y)$ and of $(x+y)^{n-m} h_m(x,y) - h_n(x,y)$ are nonnegative.
   Nacu & Peres provide coefficients for $f(p) = \min\{2p,\ 1 - 2\varepsilon\}$ explicitly. Given an algorithm for $f(p) = \min\{2p,\ 1 - 2\varepsilon\}$, Nacu & Peres develop a calculus that reduces simulating any real analytic $g$ to nesting the algorithm for $f$.
29. Too nice to be true? At time $n$ the N-P algorithm computes sets $A_n$ and $B_n$, subsets of all 0-1 strings of length $n$. The cardinalities of $A_n$ and $B_n$ are precisely $\binom{n}{k} a(n,k)$ and $\binom{n}{k} b(n,k)$. The upper polynomial approximation converges slowly to $f$: the length of the 0-1 strings is $2^{15} = 32768$ and above, e.g. $2^{25} = 16777216$, so one has to deal efficiently with the set of $2^{2^{25}}$ strings, of length $2^{25}$ each. We shall develop a reverse time martingale approach to the problem: we will construct reverse time super- and submartingales that perform a random walk on the Nacu-Peres polynomial coefficients $a(n,k)$, $b(n,k)$ and result in a black box whose algorithmic cost is linear in the number of original $p$-coins.
30. Before giving the most general algorithm, let us consider, step by step, how to simulate events of unknown probability constructively.
33. Algorithm 1 - randomization. Lemma: sampling events of probability $s \in [0,1]$ is equivalent to constructing an unbiased estimator of $s$ taking values in $[0,1]$ with probability 1.
Proof: let $\hat S$ be the estimator, with $E\hat S = s$ and $P(\hat S \in [0,1]) = 1$. Draw $G_0 \sim U(0,1)$, obtain $\hat S$ and define the coin $C_s := \mathbb{I}\{G_0 \le \hat S\}$. Then
$$P(C_s = 1) = E\,\mathbb{I}(G_0 \le \hat S) = E\big[E\big(\mathbb{I}(G_0 \le \hat s) \mid \hat S = \hat s\big)\big] = E\hat S = s.$$
The converse is straightforward, since an $s$-coin is an unbiased estimator of $s$ with values in $[0,1]$.
Algorithm 1:
   1. simulate $G_0 \sim U(0,1)$;
   2. obtain $\hat S$;
   3. if $G_0 \le \hat S$ set $C_s := 1$, otherwise set $C_s := 0$;
   4. output $C_s$.
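Algorithm 1 is three lines of code. A minimal sketch (the estimator passed in below is our own illustrative example, not from the talk):

```python
import random

def coin_from_estimator(sample_s_hat):
    """Algorithm 1: turn an unbiased [0,1]-valued estimator of s into an s-coin.

    Draw G0 ~ U(0,1), draw one value of the estimator S-hat, and output
    I{G0 <= S-hat}.  If E[S-hat] = s and S-hat lies in [0,1] a.s., the
    output is 1 with probability exactly s.
    """
    g0 = random.random()
    return 1 if g0 <= sample_s_hat() else 0
```

For instance, an estimator that returns 0.1 or 0.7 with equal probability is unbiased for $s = 0.4$, and feeding it to `coin_from_estimator` yields a 0.4-coin.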
36. Algorithm 2 - lower and upper monotone deterministic bounds. Let $l_1, l_2, \ldots$ and $u_1, u_2, \ldots$ be sequences of monotone lower and upper bounds for $s$ converging to $s$, i.e. $l_n \uparrow s$ and $u_n \downarrow s$.
Algorithm 2:
   1. simulate $G_0 \sim U(0,1)$; set $n = 1$;
   2. compute $l_n$ and $u_n$;
   3. if $G_0 \le l_n$ set $C_s := 1$;
   4. if $G_0 > u_n$ set $C_s := 0$;
   5. if $l_n < G_0 \le u_n$ set $n := n + 1$ and GOTO 2;
   6. output $C_s$.
Remark: $P(N > n) = u_n - l_n$.
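A sketch of Algorithm 2 (the example bounds for $s = 1/3$ at the end are our own illustration):

```python
import random

def coin_from_deterministic_bounds(lower, upper):
    """Algorithm 2: sample an s-coin from deterministic monotone bounds.

    lower(n) increases to s and upper(n) decreases to s; a single uniform
    G0 is compared against tighter and tighter bounds until it falls
    outside [l_n, u_n].  P(more than n iterations) = u_n - l_n.
    """
    g0 = random.random()
    n = 1
    while True:
        ln, un = lower(n), upper(n)
        if g0 <= ln:
            return 1
        if g0 > un:
            return 0
        n += 1

# Example bounds for s = 1/3, converging at rate 2^{-n}:
lower = lambda n: max(0.0, 1/3 - 0.5**n)
upper = lambda n: min(1.0, 1/3 + 0.5**n)
```

Note that the same $G_0$ is reused across iterations; only the bounds are refined.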
37. This is a practically useful technique, suggested for example in Devroye (1986), and implemented for simulation from random measures in Papaspiliopoulos and Roberts (2008).
40. Algorithm 3 - monotone stochastic bounds. Assume
$$L_n \le U_n, \tag{4}$$
$$L_n \in [0,1] \text{ and } U_n \in [0,1], \tag{5}$$
$$L_{n-1} \le L_n \text{ and } U_{n-1} \ge U_n, \tag{6}$$
$$E L_n = l_n \uparrow s \text{ and } E U_n = u_n \downarrow s, \tag{7}$$
and let $\mathcal{F}_0 = \{\emptyset, \Omega\}$, $\mathcal{F}_n = \sigma\{L_n, U_n\}$, $\mathcal{F}_{k,n} = \sigma\{\mathcal{F}_k, \mathcal{F}_{k+1}, \ldots, \mathcal{F}_n\}$ for $k \le n$.
Algorithm 3:
   1. simulate $G_0 \sim U(0,1)$; set $n = 1$;
   2. obtain $L_n$ and $U_n$ conditionally on $\mathcal{F}_{1,n-1}$;
   3. if $G_0 \le L_n$ set $C_s := 1$;
   4. if $G_0 > U_n$ set $C_s := 0$;
   5. if $L_n < G_0 \le U_n$ set $n := n + 1$ and GOTO 2;
   6. output $C_s$.
41. Algorithm 3 - monotone stochastic bounds. Lemma: assume (4), (5), (6) and (7). Then Algorithm 3 outputs a valid $s$-coin. Moreover, the probability that it needs $N > n$ iterations equals $u_n - l_n$.
Proof: the probability that Algorithm 3 needs more than $n$ iterations equals $E(U_n - L_n) = u_n - l_n \to 0$ as $n \to \infty$. Since $0 \le U_n - L_n$ is a decreasing sequence a.s., we also have $U_n - L_n \to 0$ a.s. So there exists a random variable $\hat S$ such that, for almost every realization of the sequences $\{L_n(\omega)\}_{n\ge1}$ and $\{U_n(\omega)\}_{n\ge1}$, we have $L_n(\omega) \uparrow \hat S(\omega)$ and $U_n(\omega) \downarrow \hat S(\omega)$. By (5), $\hat S \in [0,1]$ a.s. Thus for a fixed $\omega$ the algorithm outputs an $\hat S(\omega)$-coin a.s. (by Algorithm 2). Clearly $E L_n \le E \hat S \le E U_n$, hence $E \hat S = s$.
42. Algorithm 4 - reverse time martingales. Keep conditions (4), (5), (7) and the filtrations $\mathcal{F}_0 = \{\emptyset, \Omega\}$, $\mathcal{F}_n = \sigma\{L_n, U_n\}$, $\mathcal{F}_{k,n} = \sigma\{\mathcal{F}_k, \mathcal{F}_{k+1}, \ldots, \mathcal{F}_n\}$ for $k \le n$. The final step is to weaken the third condition (6) and let $L_n$ be a reverse time supermartingale and $U_n$ a reverse time submartingale with respect to $\mathcal{F}_{n,\infty}$. Precisely, assume that for every $n = 1, 2, \ldots$ we have
$$E(L_{n-1} \mid \mathcal{F}_{n,\infty}) = E(L_{n-1} \mid \mathcal{F}_n) \le L_n \quad \text{a.s.}, \tag{8}$$
$$E(U_{n-1} \mid \mathcal{F}_{n,\infty}) = E(U_{n-1} \mid \mathcal{F}_n) \ge U_n \quad \text{a.s.} \tag{9}$$
43. Algorithm 4 - reverse time martingales. Algorithm 4:
   1. simulate $G_0 \sim U(0,1)$; set $n = 1$; set $\tilde L_0 \equiv L_0 \equiv 0$ and $\tilde U_0 \equiv U_0 \equiv 1$;
   2. obtain $L_n$ and $U_n$ given $\mathcal{F}_{0,n-1}$;
   3. compute $L_n^* = E(L_{n-1} \mid \mathcal{F}_n)$ and $U_n^* = E(U_{n-1} \mid \mathcal{F}_n)$;
   4. compute
$$\tilde L_n = \tilde L_{n-1} + \frac{L_n - L_n^*}{U_n^* - L_n^*}\big(\tilde U_{n-1} - \tilde L_{n-1}\big), \tag{10}$$
$$\tilde U_n = \tilde U_{n-1} - \frac{U_n^* - U_n}{U_n^* - L_n^*}\big(\tilde U_{n-1} - \tilde L_{n-1}\big); \tag{11}$$
   5. if $G_0 \le \tilde L_n$ set $C_s := 1$;
   6. if $G_0 > \tilde U_n$ set $C_s := 0$;
   7. if $\tilde L_n < G_0 \le \tilde U_n$ set $n := n + 1$ and GOTO 2;
   8. output $C_s$.
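The heart of Algorithm 4 is the correction step (10)-(11), which rescales the raw super/submartingale values into monotone bounds nested inside the previous interval. A sketch of this one step in isolation (function name ours; the caller must supply the conditional expectations $L_n^*$, $U_n^*$):

```python
def martingale_correction(Lt_prev, Ut_prev, Ln, Un, Lstar, Ustar):
    """One step of the mean-preserving transformation (10)-(11).

    Maps (Ln, Un) into (L~n, U~n) inside [Lt_prev, Ut_prev] = [L~_{n-1}, U~_{n-1}],
    where Lstar = E(L_{n-1} | F_n) and Ustar = E(U_{n-1} | F_n).
    The construction preserves means: E[L~n] = E[Ln], E[U~n] = E[Un].
    """
    width = Ut_prev - Lt_prev
    Lt = Lt_prev + (Ln - Lstar) / (Ustar - Lstar) * width   # eq. (10)
    Ut = Ut_prev - (Ustar - Un) / (Ustar - Lstar) * width   # eq. (11)
    return Lt, Ut
```

As a sanity check, with $\tilde L_0 = 0$, $\tilde U_0 = 1$ and $L_1^* = 0$, $U_1^* = 1$ the first step returns $\tilde L_1 = L_1$ and $\tilde U_1 = U_1$, matching the observation on the following slides.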
44. Algorithm 4 - reverse time martingales. Theorem: assume (4), (5), (7), (8) and (9). Then Algorithm 4 outputs a valid $s$-coin. Moreover, the probability that it needs $N > n$ iterations equals $u_n - l_n$.
We show that $\tilde L$ and $\tilde U$ satisfy (4), (5), (7) and (6), and hence Algorithm 4 is valid since Algorithm 3 was valid. In fact, we have constructed a mean preserving transformation, in the sense that $E[\tilde L_n] = E[L_n]$ and $E[\tilde U_n] = E[U_n]$. Therefore, the proof is based on establishing this property and appealing to Algorithm 3.
48. By construction $L_0 = 0$, $U_0 = 1$ a.s., thus $L_1^* = 0$, $U_1^* = 1$. Therefore $\tilde L_1 = L_1$ and $\tilde U_1 = U_1$ (intuitive). Thus
$$\tilde L_2 = L_1 + \frac{L_2 - L_2^*}{U_2^* - L_2^*}(U_1 - L_1).$$
Take the conditional expectation given $\mathcal{F}_2$. Therefore the result holds for $n = 1, 2$; then use induction (following the same approach).
49. Unbiased estimators. Theorem: suppose that for an unknown value of interest $s \in \mathbb{R}$ there exist a constant $M < \infty$ and random sequences $L_n$ and $U_n$ s.t., for every $n = 1, 2, \ldots$:
   - $P(L_n \le U_n) = 1$;
   - $P(L_n \in [-M, M]) = 1$ and $P(U_n \in [-M, M]) = 1$;
   - $E L_n = l_n \uparrow s$ and $E U_n = u_n \downarrow s$;
   - $E(L_{n-1} \mid \mathcal{F}_{n,\infty}) = E(L_{n-1} \mid \mathcal{F}_n) \le L_n$ a.s. and $E(U_{n-1} \mid \mathcal{F}_{n,\infty}) = E(U_{n-1} \mid \mathcal{F}_n) \ge U_n$ a.s.
   Then one can construct an unbiased estimator of $s$.
Proof: after rescaling, one can use Algorithm 4 to sample events of probability $(M + s)/2M$, which gives an unbiased estimator of $(M + s)/2M$ and consequently of $s$.
50. A version of the Nacu-Peres Theorem. An algorithm that simulates a function $f$ on $P \subseteq (0,1)$ exists if and only if for all $n \ge 1$ there exist polynomials $g_n(p)$ and $h_n(p)$ of the form
$$g_n(p) = \sum_{k=0}^n \binom{n}{k} a(n,k)\,p^k(1-p)^{n-k}, \quad h_n(p) = \sum_{k=0}^n \binom{n}{k} b(n,k)\,p^k(1-p)^{n-k},$$
s.t.
(i) $0 \le a(n,k) \le b(n,k) \le 1$;
(ii) $\lim_{n\to\infty} g_n(p) = f(p) = \lim_{n\to\infty} h_n(p)$;
(iii) for all $m < n$, their coefficients satisfy
$$a(n,k) \ge \sum_{i=0}^{k} \frac{\binom{n-m}{k-i}\binom{m}{i}}{\binom{n}{k}}\,a(m,i), \quad b(n,k) \le \sum_{i=0}^{k} \frac{\binom{n-m}{k-i}\binom{m}{i}}{\binom{n}{k}}\,b(m,i). \tag{12}$$
54. Algorithm 4 - reverse time martingales. Proof: polynomials $\Rightarrow$ algorithm. Let $X_1, X_2, \ldots$ be iid tosses of a $p$-coin. Define $\{L_n, U_n\}_{n\ge1}$ as follows: if $\sum_{i=1}^n X_i = k$, let $L_n = a(n,k)$ and $U_n = b(n,k)$. In the rest of the proof we check that (4), (5), (7), (8) and (9) hold for $\{L_n, U_n\}_{n\ge1}$ with $s = f(p)$. Thus executing Algorithm 4 with $\{L_n, U_n\}_{n\ge1}$ yields a valid $f(p)$-coin. Clearly (4) and (5) hold due to (i). For (7) note that $E L_n = g_n(p) \uparrow f(p)$ and $E U_n = h_n(p) \downarrow f(p)$.
58. Algorithm 4 - reverse time martingales. Proof, continued: to obtain (8) and (9), let $H_n$ be the number of heads in $\{X_1, \ldots, X_n\}$, i.e. $H_n = \sum_{i=1}^n X_i$, and let $\mathcal{G}_n = \sigma(H_n)$. Thus $L_n = a(n, H_n)$ and $U_n = b(n, H_n)$, hence $\mathcal{F}_n \subseteq \mathcal{G}_n$ and it is enough to check that $E(L_m \mid \mathcal{G}_n) \le L_n$ and $E(U_m \mid \mathcal{G}_n) \ge U_n$ for $m < n$. The distribution of $H_m$ given $H_n$ is hypergeometric and
$$E(L_m \mid \mathcal{G}_n) = E\big(a(m, H_m) \mid H_n\big) = \sum_{i=0}^{H_n} \frac{\binom{n-m}{H_n-i}\binom{m}{i}}{\binom{n}{H_n}}\,a(m,i) \le a(n, H_n) = L_n.$$
Clearly the distribution of $H_m$ given $H_n$ is the same as the distribution of $H_m$ given $\{H_n, H_{n+1}, \ldots\}$. The argument for $U_n$ is identical.
59. Practical issues for the Bernoulli factory. Given a function $f$, finding polynomial envelopes satisfying the required properties is not easy. Section 3 of N-P provides explicit formulas for polynomial envelopes of $f(p) = \min\{2p,\ 1 - 2\varepsilon\}$: the coefficients $a(n,k)$ and $b(n,k)$ satisfy (ii) and (iii), and one can easily compute $n_0 = n_0(\varepsilon)$ s.t. for $n \ge n_0$ condition (i) also holds, which is enough for the algorithm (however $n_0$ is substantial, e.g. $n_0(\varepsilon) = 32768$ for $\varepsilon = 0.1$, and it increases as $\varepsilon$ decreases). The probability that Algorithm 4 needs $N > n$ inputs equals $h_n(p) - g_n(p)$. The polynomials provided in N-P satisfy $h_n(p) - g_n(p) \le C\rho^n$ for $p \in [0, 1/2 - 4\varepsilon]$, guaranteeing fast convergence, and $h_n(p) - g_n(p) \le Dn^{-1/2}$ elsewhere. Using similar techniques one can establish polynomial envelopes s.t. $h_n(p) - g_n(p) \le C\rho^n$ for $p \in [0,1] \setminus \big(1/2 - (2+c)\varepsilon,\ 1/2 - (2-c)\varepsilon\big)$.
60. Moreover, despite the fact that the techniques developed in N-P for simulating a real analytic $g$ exhibit exponentially decaying tails, they are often not practical: nesting the algorithm for $f(p) = \min\{2p,\ 1 - 2\varepsilon\}$ $k$ times is very inefficient, since one needs at least $n_0(\varepsilon)^k$ original $p$-coins for a single output. Nevertheless, both algorithms, i.e. the original N-P and our martingale modification, use the same number of original $p$-coins per $f(p)$-coin output with $f(p) = \min\{2p,\ 1 - 2\varepsilon\}$, and consequently also for simulating any real analytic function using the N-P methodology. A significant improvement in terms of $p$-coins can be achieved only if the monotone super/sub-martingales can be constructed directly and used along with Algorithm 3. This is discussed next.
61. Bernoulli factory for alternating series expansions. Proposition: let $f : [0,1] \to [0,1]$ have an alternating series expansion
$$f(p) = \sum_{k=0}^\infty (-1)^k a_k p^k \quad \text{with } 1 \ge a_0 \ge a_1 \ge \cdots.$$
Then an $f(p)$-coin can be simulated by Algorithm 3, and the probability that it needs $N > n$ iterations equals $a_n p^n$.
63. Proof: let $X_1, X_2, \ldots$ be a sequence of $p$-coins and define $L_0 := 0$, $U_0 := a_0$, and
$$L_n := \begin{cases} U_{n-1} - a_n \prod_{k=1}^n X_k & \text{if } n \text{ is odd}, \\ L_{n-1} & \text{if } n \text{ is even}, \end{cases} \qquad U_n := \begin{cases} U_{n-1} & \text{if } n \text{ is odd}, \\ L_{n-1} + a_n \prod_{k=1}^n X_k & \text{if } n \text{ is even}. \end{cases}$$
Clearly (4), (5), (6) and (7) are satisfied with $s = f(p)$. Moreover, $u_n - l_n = E U_n - E L_n = a_n p^n \le a_n$. Thus if $a_n \to 0$ the algorithm converges for $p \in [0,1]$, otherwise for $p \in [0,1)$. The exponential function is precisely in this family.
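As a concrete instance, a sketch under the assumption $a_k = 1/k!$, i.e. $f(p) = e^{-p}$ (the motivating case from the EA slides): the running product $X_1 \cdots X_n$ is an unbiased estimator of $p^n$, and the bounds above plug straight into Algorithm 3.

```python
import random
from math import factorial

def exp_coin(flip):
    """Simulate an e^{-p}-coin from p-coins via the alternating series
    e^{-p} = sum_k (-1)^k p^k / k!, using the L_n, U_n of the proof.

    The product X_1 * ... * X_n of coin flips is unbiased for p^n, so
    E[U_n - L_n] = p^n / n!  ->  0, and Algorithm 3 applies.
    """
    g0 = random.random()        # the single uniform G_0
    L, U = 0.0, 1.0             # L_0 = 0, U_0 = a_0 = 1
    prod, n = 1, 1
    while True:
        prod *= flip()          # X_1 * ... * X_n
        a_n = 1.0 / factorial(n)
        if n % 2 == 1:          # odd n: lower bound moves up
            L = U - a_n * prod
        else:                   # even n: upper bound moves down
            U = L + a_n * prod
        if g0 <= L:
            return 1
        if g0 > U:
            return 0
        n += 1
```

Note that once some $X_k = 0$ the product vanishes, the gap $U_n - L_n$ collapses to zero, and the loop terminates immediately.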
64. Some references:
   - A. Beskos, G.O. Roberts. Exact simulation of diffusions. Annals of Applied Probability, 15(4): 2422-2444, 2005.
   - A. Beskos, O. Papaspiliopoulos, G.O. Roberts. Retrospective exact simulation of diffusion sample paths with applications. Bernoulli, 12: 1077-1098, 2006.
   - A. Beskos, O. Papaspiliopoulos, G.O. Roberts, P. Fearnhead. Exact and computationally efficient likelihood-based estimation for discretely observed diffusion processes (with discussion). Journal of the Royal Statistical Society B, 68(3): 333-382, 2006.
   - L. Devroye. Non-Uniform Random Variate Generation. Springer-Verlag, New York, 1986.
   - M.S. Keane and G.L. O'Brien. A Bernoulli factory. ACM Transactions on Modeling and Computer Simulation (TOMACS), 4(2): 213-219, 1994.
65. References, continued:
   - P.E. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations. Springer-Verlag, 1995.
   - K. Latuszynski, I. Kosmidis, O. Papaspiliopoulos, G.O. Roberts. Simulating events of unknown probabilities via reverse time martingales. Random Structures and Algorithms, to appear.
   - S. Nacu and Y. Peres. Fast simulation of new coins from old. Annals of Applied Probability, 15(1): 93-115, 2005.
   - O. Papaspiliopoulos, G.O. Roberts. Retrospective Markov chain Monte Carlo for Dirichlet process hierarchical models. Biometrika, 95: 169-186, 2008.
   - Y. Peres. Iterating von Neumann's procedure for extracting random bits. Annals of Statistics, 20(1): 590-597, 1992.