Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Omiros' talk on the Bernoulli factory problem

4,048 views

Published on

Talk given at BigMC seminar, 15 Apr. 2010, by Prof. Omiros Papaspiliopoulos, UPF

Published in: Education
  • Sex in your area is here: ❶❶❶ http://bit.ly/2Qu6Caa ❶❶❶
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • Dating direct: ❤❤❤ http://bit.ly/2Qu6Caa ❤❤❤
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Omiros' talk on the Bernoulli factory problem

  1. 1. Simulating events of unknown probability by reverse time martingales (poprzez martyngaly z czasem odwr´conym) o Omiros Papaspiliopoulos UPF, Barcelona joint work with Krzysztof Latuszynski Ioannis Kosmidis Gareth O. Roberts (Warwick University)
  2. 2. Motivation The Bernoulli factory - general results and previous approaches Reverse time martingale approach to sampling Application to the Bernoulli Factory problem
  3. 3. Generic description of the problem Let p ∈ (0, 1) be unknown. Given a black box that samples p−coins Can we construct a black box that samples f (p) coins for known f ? For example f (p) = min(1, 2p)
  4. 4. Some history (see for example Peres, 1992) von Neumann posed and solved the problem: f (p) = 1/2
  5. 5. Some history (see for example Peres, 1992) von Neumann posed and solved the problem: f (p) = 1/2 1. set n = 1; 2. sample Xn , Xn+1 3. if (Xn , Xn+1 ) = (0, 1) output 1 and STOP 4. if (Xn , Xn+1 ) = (1, 0) output 0 and STOP 5. set n := n + 2 and GOTO 2. Lets check why this work
  6. 6. The Bernoulli Factory problem for known f and unknown p, how to generate an f (p)−coin? von Neumann: f (p) = 1/2 Asmussen conjectured f (p) = 2p, but it turned out difficult
  7. 7. Exact simulation of diffusions as Bernoulli factory This is the description of EA closest in spirit to Beskos and Roberts (2005) Simulate XT at time T > 0 from: dXt = α(Xt ) dt + dWt , X0 = x ∈ R, t ∈ [0, T ] (1) driven by the Brownian motion {Wt ; 0 ≤ t ≤ T }
  8. 8. Ω ≡ C ([0, T ], R), co-ordinate mappings Bt : Ω → R, t ∈ [0, T ], such that for any t, Bt (ω) = ω(t) and the cylinder σ-algebra C = σ({Bt ; 0 ≤ t ≤ T }). W x = {Wtx ; 0 ≤ t ≤ T } the Brownian motion started at x ∈ R, and by W x,u = {Wtx,u ; 0 ≤ t ≤ T } the Brownian bridge.
  9. 9. 1. The drift function α is differentiable. 2. The function h(u) = exp{A(u) − (u − x)2 /2T }, u ∈ R, for u A(u) = 0 α(y )dy , is integrable. 3. The function (α2 + α )/2 is bounded below by > −∞, and above by r + < ∞. 1 2 φ(u) = [(α + α )/2 − ] ∈ [0, 1] , (2) r
  10. 10. Q be the probability measure induced by the solution X of (1) on (Ω, C), W the corresponding probability measure for W x , and Z be the probability measure defined as the following simple change of measure from W: dW/dZ(ω) ∝ exp{−A(BT )}. Note that a stochastic process distributed according to Z has similar dynamics to the Brownian motion, with the exception of the distribution of the marginal distribution at time T (with density, say, h) which is biased according to A.
  11. 11. Q be the probability measure induced by the solution X of (1) on (Ω, C), W the corresponding probability measure for W x , and Z be the probability measure defined as the following simple change of measure from W: dW/dZ(ω) ∝ exp{−A(BT )}. Note that a stochastic process distributed according to Z has similar dynamics to the Brownian motion, with the exception of the distribution of the marginal distribution at time T (with density, say, h) which is biased according to A. Then, T dQ (ω) ∝ exp −rT T −1 φ(Bt )dt ≤1 Z − a.s. (3) dZ 0
  12. 12. EA using Bernoulli factory 1. simulate u ∼ h T 2. generate a Cs coin where s := e −rTJ , and J := 0 T −1 φ(Wtx,u )dt; 3. If Cs = 1 output u and STOP; 4. If Cs = 0 GOTO 1. Exploiting the Markov property, we can assume from now on that rT < 1.
  13. 13. The challenging part of the algorithm is Step 2, since exact computation of J is impossible due to the integration over a Brownian bridge path. On the other hand, it is easy to generate J-coins: x,u CJ = I(ψ < φ(Wχ )), ψ ∼ U(0, 1) , χ ∼ U(0, T ) independent of the Brownian bridge W x,u and of each other. Therefore, we deal with another instance of the problem studied in this article: given p-coins how to generate f (p)-coins, where here f is the exponential function. (Note the interplay between unbiased estimation and exact simulation here)
  14. 14. Keane and O’Brien - existence result Keane and O’Brien (1994): Let p ∈ P ⊆ (0, 1) → [0, 1] then it is possible to simulate an f (p)−coin ⇐⇒ f is constant f is continuous and for some n ∈ N and all p ∈ P satisfies n min f (p), 1 − f (p) ≥ min p, 1 − p Note that the result rules out min{1, 2p}, but not min{1 − 2 , 2p}
  15. 15. Keane and O’Brien - existence result Keane and O’Brien (1994): Let p ∈ P ⊆ (0, 1) → [0, 1] then it is possible to simulate an f (p)−coin ⇐⇒ f is constant f is continuous and for some n ∈ N and all p ∈ P satisfies n min f (p), 1 − f (p) ≥ min p, 1 − p however their proof is not constructive Note that the result rules out min{1, 2p}, but not min{1 − 2 , 2p}
  16. 16. Nacu-Peres (2005) Theorem - Bernstein polynomial approach There exists an algorithm which simulates f ⇐⇒ there exist polynomials n n n n gn (x, y ) = a(n, k)x k y n−k , hn (x, y ) = b(n, k)x k y n−k k k k=0 k=0 with following properties
  17. 17. Nacu-Peres (2005) Theorem - Bernstein polynomial approach There exists an algorithm which simulates f ⇐⇒ there exist polynomials n n n n gn (x, y ) = a(n, k)x k y n−k , hn (x, y ) = b(n, k)x k y n−k k k k=0 k=0 with following properties 0 ≤ a(n, k) ≤ b(n, k) ≤ 1
  18. 18. Nacu-Peres (2005) Theorem - Bernstein polynomial approach There exists an algorithm which simulates f ⇐⇒ there exist polynomials n n n n gn (x, y ) = a(n, k)x k y n−k , hn (x, y ) = b(n, k)x k y n−k k k k=0 k=0 with following properties 0 ≤ a(n, k) ≤ b(n, k) ≤ 1 n n k a(n, k) and k b(n, k) are integers
  19. 19. Nacu-Peres (2005) Theorem - Bernstein polynomial approach There exists an algorithm which simulates f ⇐⇒ there exist polynomials n n n n gn (x, y ) = a(n, k)x k y n−k , hn (x, y ) = b(n, k)x k y n−k k k k=0 k=0 with following properties 0 ≤ a(n, k) ≤ b(n, k) ≤ 1 n n k a(n, k) and k b(n, k) are integers limn→∞ gn (p, 1 − p) = f (p) = limn→∞ hn (p, 1 − p)
  20. 20. Nacu-Peres (2005) Theorem - Bernstein polynomial approach There exists an algorithm which simulates f ⇐⇒ there exist polynomials n n n n gn (x, y ) = a(n, k)x k y n−k , hn (x, y ) = b(n, k)x k y n−k k k k=0 k=0 with following properties 0 ≤ a(n, k) ≤ b(n, k) ≤ 1 n n k a(n, k) and k b(n, k) are integers limn→∞ gn (p, 1 − p) = f (p) = limn→∞ hn (p, 1 − p) for all m < n (x+y )n−m gm (x, y ) gn (x, y ) and (x+y )n−m hm (x, y ) hn (x, y )
  21. 21. Nacu-Peres (2005) Theorem - Bernstein polynomial approach There exists an algorithm which simulates f ⇐⇒ there exist polynomials n n n n gn (x, y ) = a(n, k)x k y n−k , hn (x, y ) = b(n, k)x k y n−k k k k=0 k=0 with following properties 0 ≤ a(n, k) ≤ b(n, k) ≤ 1 n n k a(n, k) and k b(n, k) are integers limn→∞ gn (p, 1 − p) = f (p) = limn→∞ hn (p, 1 − p) for all m < n (x+y )n−m gm (x, y ) gn (x, y ) and (x+y )n−m hm (x, y ) hn (x, y ) Nacu & Peres provide coefficients for f (p) = min{2p, 1 − 2ε} explicitly.
  22. 22. Nacu-Peres (2005) Theorem - Bernstein polynomial approach There exists an algorithm which simulates f ⇐⇒ there exist polynomials n n n n gn (x, y ) = a(n, k)x k y n−k , hn (x, y ) = b(n, k)x k y n−k k k k=0 k=0 with following properties 0 ≤ a(n, k) ≤ b(n, k) ≤ 1 n n k a(n, k) and k b(n, k) are integers limn→∞ gn (p, 1 − p) = f (p) = limn→∞ hn (p, 1 − p) for all m < n (x+y )n−m gm (x, y ) gn (x, y ) and (x+y )n−m hm (x, y ) hn (x, y ) Nacu & Peres provide coefficients for f (p) = min{2p, 1 − 2ε} explicitly. Given an algorithm for f (p) = min{2p, 1 − 2ε} Nacu & Peres develop a calculus that collapses every real analytic g to nesting the algorithm for f and simulating g .
  23. 23. too nice to be true? at time n the N-P algorithm computes sets An and Bn - subsets of all 01 strings of length n
  24. 24. too nice to be true? at time n the N-P algorithm computes sets An and Bn - subsets of all 01 strings of length n n the cardinalities of An and Bn are precisely k a(n, k) and n k b(n, k)
  25. 25. too nice to be true? at time n the N-P algorithm computes sets An and Bn - subsets of all 01 strings of length n n the cardinalities of An and Bn are precisely k a(n, k) and n k b(n, k) the upper polynomial approximation is converging slowly to f
  26. 26. too nice to be true? at time n the N-P algorithm computes sets An and Bn - subsets of all 01 strings of length n n the cardinalities of An and Bn are precisely k a(n, k) and n k b(n, k) the upper polynomial approximation is converging slowly to f length of 01 strings is 215 = 32768 and above, e.g. 225 = 16777216
  27. 27. too nice to be true? at time n the N-P algorithm computes sets An and Bn - subsets of all 01 strings of length n n the cardinalities of An and Bn are precisely k a(n, k) and n k b(n, k) the upper polynomial approximation is converging slowly to f length of 01 strings is 215 = 32768 and above, e.g. 225 = 16777216 25 one has to deal efficiently with the set of 22 strings, of length 225 each.
  28. 28. too nice to be true? at time n the N-P algorithm computes sets An and Bn - subsets of all 01 strings of length n n the cardinalities of An and Bn are precisely k a(n, k) and n k b(n, k) the upper polynomial approximation is converging slowly to f length of 01 strings is 215 = 32768 and above, e.g. 225 = 16777216 25 one has to deal efficiently with the set of 22 strings, of length 225 each. we shall develop a reverse time martingale approach to the problem
  29. 29. too nice to be true? at time n the N-P algorithm computes sets An and Bn - subsets of all 01 strings of length n n the cardinalities of An and Bn are precisely k a(n, k) and n k b(n, k) the upper polynomial approximation is converging slowly to f length of 01 strings is 215 = 32768 and above, e.g. 225 = 16777216 25 one has to deal efficiently with the set of 22 strings, of length 225 each. we shall develop a reverse time martingale approach to the problem we will construct reverse time super- and submartingales that perform a random walk on the Nacu-Peres polynomial coefficients a(n, k), b(n, k) and result in a black box that has algorithmic cost linear in the number of original p−coins
  30. 30. Before giving the most general algorithm, let us think gradually how to simulate events of unknown probability constructively
  31. 31. Algorithm 1 - randomization Lemma: Sampling events of probability s ∈ [0, 1] is equivalent to constructing an unbiased estimator of s taking values in [0, 1] with probability 1.
  32. 32. Algorithm 1 - randomization Lemma: Sampling events of probability s ∈ [0, 1] is equivalent to constructing an unbiased estimator of s taking values in [0, 1] with probability 1. ˆ ˆ ˆ Proof: Let S, s.t. ES = s and P(S ∈ [0, 1]) = 1 be the ˆ estimator. Then draw G0 ∼ U(0, 1), obtain S and define a coin ˆ Cs := I{G0 ≤ S}. ˆ s ˆ s P(Cs = 1) = E I(G0 ≤ S) = E E I(G0 ≤ ˆ) | S = ˆ ˆ = ES = s. The converse is straightforward since an s−coin is an unbiased estimator of s with values in [0, 1].
  33. 33. Algorithm 1 - randomization Lemma: Sampling events of probability s ∈ [0, 1] is equivalent to constructing an unbiased estimator of s taking values in [0, 1] with probability 1. ˆ ˆ ˆ Proof: Let S, s.t. ES = s and P(S ∈ [0, 1]) = 1 be the ˆ estimator. Then draw G0 ∼ U(0, 1), obtain S and define a coin ˆ Cs := I{G0 ≤ S}. ˆ s ˆ s P(Cs = 1) = E I(G0 ≤ S) = E E I(G0 ≤ ˆ) | S = ˆ ˆ = ES = s. The converse is straightforward since an s−coin is an unbiased estimator of s with values in [0, 1]. Algorithm 1 1. simulate G0 ∼ U(0, 1); 2. ˆ obtain S; 3. ˆ if G0 ≤ S set Cs := 1, otherwise set Cs := 0; 4. output Cs .
  34. 34. Algorithm 2 - lower and upper monotone deterministic bounds let l1 , l2 , ... and u1 , u2 , ... be sequences of lower and upper monmotone bounds for s converging to s, i.e. li s and ui s.
  35. 35. Algorithm 2 - lower and upper monotone deterministic bounds let l1 , l2 , ... and u1 , u2 , ... be sequences of lower and upper monmotone bounds for s converging to s, i.e. li s and ui s. Algorithm 2 1. simulate G0 ∼ U(0, 1); set n = 1; 2. compute ln and un ; 3. if G0 ≤ ln set Cs := 1; 4. if G0 > un set Cs := 0; 5. if ln < G0 ≤ un set n := n + 1 and GOTO 2; 6. output Cs .
  36. 36. Algorithm 2 - lower and upper monotone deterministic bounds let l1 , l2 , ... and u1 , u2 , ... be sequences of lower and upper monmotone bounds for s converging to s, i.e. li s and ui s. Algorithm 2 1. simulate G0 ∼ U(0, 1); set n = 1; 2. compute ln and un ; 3. if G0 ≤ ln set Cs := 1; 4. if G0 > un set Cs := 0; 5. if ln < G0 ≤ un set n := n + 1 and GOTO 2; 6. output Cs . Remark: P(N > n) = un − ln .
  37. 37. This is a practically useful technique, suggested for example in Devroye (1986), and implemented for simulation from random measures in P+Roberts (2008).
  38. 38. Algorithm 3 - monotone stochastic bounds Ln ≤ Un (4) Ln ∈ [0, 1] and Un ∈ [0, 1] (5) Ln−1 ≤ Ln and Un−1 ≥ Un (6) E Ln = ln s and E Un = un s. (7) F0 = {∅, Ω}, Fn = σ{Ln , Un }, Fk,n = σ{Fk , Fk+1 , ...Fn } for k ≤ n.
  39. 39. Algorithm 3 - monotone stochastic bounds Ln ≤ Un (4) Ln ∈ [0, 1] and Un ∈ [0, 1] (5) Ln−1 ≤ Ln and Un−1 ≥ Un (6) E Ln = ln s and E Un = un s. (7) F0 = {∅, Ω}, Fn = σ{Ln , Un }, Fk,n = σ{Fk , Fk+1 , ...Fn } for k ≤ n. Algorithm 3 1. simulate G0 ∼ U(0, 1); set n = 1; 2. obtain Ln and Un ; conditionally on F1,n−1 3. if G0 ≤ Ln set Cs := 1; 4. if G0 > Un set Cs := 0; 5. if Ln < G0 ≤ Un set n := n + 1 and GOTO 2; 6. output Cs .
  40. 40. Algorithm 3 - monotone stochastic bounds Ln ≤ Un (4) Ln ∈ [0, 1] and Un ∈ [0, 1] (5) Ln−1 ≤ Ln and Un−1 ≥ Un (6) E Ln = ln s and E Un = un s. (7) F0 = {∅, Ω}, Fn = σ{Ln , Un }, Fk,n = σ{Fk , Fk+1 , ...Fn } for k ≤ n. Algorithm 3 1. simulate G0 ∼ U(0, 1); set n = 1; 2. obtain Ln and Un ; conditionally on F1,n−1 3. if G0 ≤ Ln set Cs := 1; 4. if G0 > Un set Cs := 0; 5. if Ln < G0 ≤ Un set n := n + 1 and GOTO 2; 6. output Cs .
  41. 41. Algorithm 3 - monotone stochastic bounds Lemma Assume (4), (5), (6) and (7). Then Algorithm 3 outputs a valid s−coin. Moreover the probability that it needs N > n iterations equals un − ln . Proof. Probability that Algorithm 3 needs more then n iterations equals E(Un − Ln ) = un − ln → 0 as n → ∞. And since 0 ≤ Un − Ln is a decreasing sequence a.s., we also have Un − Ln → 0 a.s. So there exists a ˆ random variable S, such that for almost every realization of sequences {Ln (ω)}n≥1 and {Un (ω)}n≥1 we have Ln (ω) ˆ S(ω) and Un (ω) ˆ S(ω). ˆ By (5) we have S ∈ [0, 1] a.s. Thus for a fixed ω the algorithm outputs ˆ ˆ an S(ω)−coin a.s (by Algorithm 2). Clearly E Ln ≤ E S ≤ E Un and ˆ hence E S = s.
  42. 42. Algorithm 4 - reverse time martingales Ln ≤ Un Ln ∈ [0, 1] and Un ∈ [0, 1] Ln−1 ≤ Ln and Un−1 ≥ Un E Ln = ln s and E Un = un s. F0 = {∅, Ω}, Fn = σ{Ln , Un }, Fk,n = σ{Fk , Fk+1 , ...Fn } for k ≤ n. The final step is to weaken 3rd condition and let Ln be a reverse time supermartingale and Un a reverse time submartingale with respect to Fn,∞ . Precisely, assume that for every n = 1, 2, ... we have E (Ln−1 | Fn,∞ ) = E (Ln−1 | Fn ) ≤ Ln a.s. and (8) E (Un−1 | Fn,∞ ) = E (Un−1 | Fn ) ≥ Un a.s. (9)
  43. 43. Algorithm 4 - reverse time martingales Algorithm 4 ˜ 1. simulate G0 ∼ U(0, 1); set n = 1; set L0 ≡ L0 ≡ 0 and ˜ U0 ≡ U0 ≡ 1 2. obtain Ln and Un given F0,n−1 , 3. compute L∗ = E (Ln−1 | Fn ) and Un = E (Un−1 | Fn ). n ∗ 4. compute ˜ ˜ Ln − L∗ ˜ n ˜ Ln = Ln−1 + ∗ Un−1 − Ln−1 (10) Un − L∗n ˜ ˜ U ∗ − Un ˜ ˜ Un = Un−1 − n Un−1 − Ln−1 (11) ∗ Un − L∗ n 5. ˜ if G0 ≤ Ln set Cs := 1; 6. ˜ if G0 > Un set Cs := 0; 7. ˜ ˜ if Ln < G0 ≤ Un set n := n + 1 and GOTO 2; 8. output Cs .
  44. 44. Algorithm 4 - reverse time martingales Theorem Assume (4), (5), (7), (8) and (9). Then Algorithm 4 outputs a valid s−coin. Moreover the probability that it needs N > n iterations equals un − ln . ˜ ˜ We show that L and U satisfy (4), (5), (7) and (6) and hence Algorithm 4 is valid since Algorithm 3 was valid. In fact, we have constructed a mean preserving transformation, in the ˜ ˜ sense E[Ln ] = E[Ln ], and E[Un ] = E[Un ]. Therefore, the proof is based on establishing this property and appealing to Algorithm 3.
  45. 45. By construction L0 = 0, U0 = 1 a.s thus L∗ = 0, U0 = 1 0 ∗
  46. 46. By construction L0 = 0, U0 = 1 a.s thus L∗ = 0, U0 = 1 0 ∗ ˜ ˜ Therefore, L1 = L1 , and U1 = U1 (intuitive)
  47. 47. By construction L0 = 0, U0 = 1 a.s thus L∗ = 0, U0 = 1 0 ∗ ˜ ˜ Therefore, L1 = L1 , and U1 = U1 (intuitive) Thus, ˜ L2 − L∗ 2 L2 = L1 + ∗ (U1 − L1 ) U2 − L∗ 2
  48. 48. By construction L0 = 0, U0 = 1 a.s thus L∗ = 0, U0 = 1 0 ∗ ˜ ˜ Therefore, L1 = L1 , and U1 = U1 (intuitive) Thus, ˜ L2 − L∗ 2 L2 = L1 + ∗ (U1 − L1 ) U2 − L∗ 2 Take conditional expectation given F2 Therefore result holds for n = 1, 2, then use induction (following the same approach)
  49. 49. Unbiased estimators Theorem Suppose that for an unknown value of interest s ∈ R, there exist a constant M < ∞ and random sequences Ln and Un s.t. P(Ln ≤ Un ) = 1 for every n = 1, 2, ... P(Ln ∈ [−M, M]) = 1 and P(Un ∈ [−M, M]) = 1 for every n = 1, 2, ... E Ln = ln s and E U n = un s E (Ln−1 | Fn,∞ ) = E (Ln−1 | Fn ) ≤ Ln a.s. and E (Un−1 | Fn,∞ ) = E (Un−1 | Fn ) ≥ Un a.s. Then one can construct an unbiased estimator of s. Proof. After rescaling, one can use Algorithm 4 to sample events of probability (M + s)/2M, which gives an unbiased estimator of (M + s)/2M and consequently of s.
  50. 50. A version of the Nacu-Peres Theorem An algorithm that simulates a function f on P ⊆ (0, 1) exists if and only if for all n ≥ 1 there exist polynomials gn (p) and hn (p) of the form n n n n gn (p) = a(n, k)p k (1−p)n−k and hn (p) = b(n, k)p k (1−p)n−k , k k k=0 k=0 s.t. (i) 0 ≤ a(n, k) ≤ b(n, k) ≤ 1, (ii) limn→∞ gn (p) = f (p) = limn→∞ hn (p), (iii) For all m < n, their coefficients satisfy k n−m m k n−m m k−i i k−i i a(n, k) ≥ n a(m, i), b(n, k) ≤ n b(m, i). (12) i=0 k i=0 k
  51. 51. Algorithm 4 - reverse time martingales Proof: polynomials ⇒ algorithm. Let X1 , X2 , . . . iid tosses of a p−coin.
  52. 52. Algorithm 4 - reverse time martingales Proof: polynomials ⇒ algorithm. Let X1 , X2 , . . . iid tosses of a p−coin. Define {Ln , Un }n≥1 as follows:
  53. 53. Algorithm 4 - reverse time martingales Proof: polynomials ⇒ algorithm. Let X1 , X2 , . . . iid tosses of a p−coin. Define {Ln , Un }n≥1 as follows: n if i=1 Xi = k, let Ln = a(n, k) and Un = b(n, k).
  54. 54. Algorithm 4 - reverse time martingales Proof: polynomials ⇒ algorithm. Let X1 , X2 , . . . iid tosses of a p−coin. Define {Ln , Un }n≥1 as follows: n if i=1 Xi = k, let Ln = a(n, k) and Un = b(n, k). In the rest of the proof we check that (4), (5), (7), (8) and (9) hold for {Ln , Un }n≥1 with s = f (p). Thus executing Algorithm 4 with {Ln , Un }n≥1 yields a valid f (p)−coin. Clearly (4) and (5) hold due to (i). For (7) note that E Ln = gn (p) f (p) and E Un = hn (p) f (p).
  55. 55. Algorithm 4 - reverse time martingales Proof - continued. To obtain (8) and (9) define the sequence of random variables Hn to be
  56. 56. Algorithm 4 - reverse time martingales Proof - continued. To obtain (8) and (9) define the sequence of random variables Hn to be n the number of heads in {X1 , . . . , Xn }, i.e. Hn = i=1 Xi
  57. 57. Algorithm 4 - reverse time martingales Proof - continued. To obtain (8) and (9) define the sequence of random variables Hn to be n the number of heads in {X1 , . . . , Xn }, i.e. Hn = i=1 Xi and let Gn = σ(Hn ). Thus Ln = a(n, Hn ) and Un = b(n, Hn ), hence Fn ⊆ Gn and it is enough to check that E(Lm |Gn ) ≤ Ln and E(Um |Gn ) ≥ Un for m < n.
  58. 58. Algorithm 4 - reverse time martingales Proof - continued. To obtain (8) and (9) define the sequence of random variables Hn to be n the number of heads in {X1 , . . . , Xn }, i.e. Hn = i=1 Xi and let Gn = σ(Hn ). Thus Ln = a(n, Hn ) and Un = b(n, Hn ), hence Fn ⊆ Gn and it is enough to check that E(Lm |Gn ) ≤ Ln and E(Um |Gn ) ≥ Un for m < n. The distribution of Hm given Hn is hypergeometric and Hn n−m m Hn −i i E(Lm |Gn ) = E(a(m, Hm )|Hn ) = n a(m, i) ≤ a(n, Hn ) = Ln . i=0 Hn Clearly the distribution of Hm given Hn is the same as the distribution of Hm given {Hn , Hn+1 , . . . }. The argument for Un is identical.
  59. 59. Practical issues for Bernoulli factory Given a function f , finding polynomial envelopes satisfying properties required is not easy. Section 3 of N-P provides explicit formulas for polynomial envelopes of f (p) = min{2p, 1 − 2ε} that satisfy conditions, precisely a(n, k) and b(n, k) satisfy (ii) and (iii) and one can easily compute n0 = n0 (ε) s.t. for n ≥ n0 condition (i) also holds, which is enough for the algorithm (however n0 is substantial, e.g. n0 (ε) = 32768 for ε = 0.1 and it increases as ε decreases). The probability that Algorithm 4 needs N > n inputs equals hn (p) − gn (p). The polynomials provided in N-P satisfy hn (p) − gn (p) ≤ C ρn for p ∈ [0, 1/2 − 4ε] guaranteeing fast convergence, and hn (p) − gn (p) ≤ Dn−1/2 elsewhere. Using similar techniques one can establish polynomial envelopes s.t. hn (p) − gn (p) ≤ C ρn for p ∈ [0, 1] (1/2 − (2 + c)ε, 1/2 − (2 − c)ε).
  60. 60. Moreover, we note that despite the fact that the techniques developed in N-P for simulating a real analytic g exhibit exponentially decaying tails, they are often not practical. Nesting k times the algorithm for f (p) = min{2p, 1 − 2ε} is very inefficient. One needs at least n0 (ε)k of original p−coins for a single output. Nevertheless, both algorithms, i.e. the original N-P and our martingale modification, use the same number of original p−coins for a single f (p)−coin output with f (p) = min{2p, 1 − 2ε} and consequently also for simulating any real analytic function using methodology of N-P. A significant improvement in terms of p−coins can be achieved only if the monotone super/sub-martingales can be constructed directly and used along with Algorithm 3. This is discussed in the next subsection.
  61. 61. Bernoulli Factory for alternating series expansions Proposition Let f : [0, 1] → [0, 1] have an alternating series expansion ∞ f (p) = (−1)k ak p k with 1 ≥ a0 ≥ a1 ≥ . . . k=0 Then an f (p)−coin can be simulated by Algorithm 3 and the probability that it needs N > n iterations equals an p n .
  62. 62. Proof. Let X1 , X2 , . . . be a sequence of p−coins and define U0 := a0 L0 := 0, n Un−1 − an k=1 Xk if n is odd, Ln := Ln−1 if n is even, Un−1 if n is odd, Un := n Ln−1 + an k=1 Xk if n is even. Clearly (4), (5), (6) and (7) are satisfied with s = f (p). Moreover, un − ln = E Un − E Ln = an p n ≤ an . Thus if an → 0, the algorithm converges for p ∈ [0, 1], otherwise for p ∈ [0, 1).
  63. 63. Proof. Let X1 , X2 , . . . be a sequence of p−coins and define U0 := a0 L0 := 0, n Un−1 − an k=1 Xk if n is odd, Ln := Ln−1 if n is even, Un−1 if n is odd, Un := n Ln−1 + an k=1 Xk if n is even. Clearly (4), (5), (6) and (7) are satisfied with s = f (p). Moreover, un − ln = E Un − E Ln = an p n ≤ an . Thus if an → 0, the algorithm converges for p ∈ [0, 1], otherwise for p ∈ [0, 1). The exponential function is precisely in this family
  64. 64. Some References A. Beskos, G.O. Roberts. Exact simulation of diffusions. Ann. Appl. Probab. 15(4): 2422–2444, 2005. A. Beskos, O. Papaspiliopoulos, G.O. Roberts. Retrospective exact simulation of diffusion sample paths with applications. Bernoulli 12: 1077–1098, 2006. A. Beskos, O. Papaspiliopoulos, G.O. Roberts, and P. Fearnhead. Exact and computationally efficient likelihood-based estimation for discreetly observed diffusion processes (with discussion). Journal of the Royal Statistical Society B, 68(3):333–382, 2006. L. Devroye. Non-uniform random variable generation. Springer-Verlag, New York, 1986. M.S. Keane and G.L. O’Brien. A Bernoulli factory. ACM Transactions on Modelling and Computer Simulation (TOMACS), 4(2):213–219, 1994.
  65. 65. P.E. Kloeden and E.Platen. Numerical solution of stochastic differential equations, Springer-Verlag, 1995. K. Latuszynski, I. Kosmidis, O. Papaspiliopoulos, G.O. Roberts. Simulating events of unknown probabilities via reverse time martingales. Random Structures and Algorithms, to appear. S. Nacu and Y. Peres. Fast simulation of new coins from old. Annals of Applied Probability, 15(1):93–115, 2005. O. Papaspiliopoulos, G.O. Roberts. Retrospective Markov chain Monte Carlo for Dirichlet process hierarchical models. Biometrika, 95:169–186, 2008. Y. Peres. Iterating von Neumann’s procedure for extracting random bits. Annals of Statistics, 20(1): 590–597, 1992.

×