Coordinate sampler: A non-reversible Gibbs-like sampler

Talk given at CIRM for the Quasi-Monte Carlo Methods and Applications research school, 05 November 2020

  1. Coordinate Sampler: A Non-Reversible Gibbs-like Sampler. Christian P. Robert, U Paris Dauphine PSL & University of Warwick. Joint work with Wu Changye, plus loans from Arnaud Doucet. Statistics and Computing 30, 721–730 (2020)
  2. Outline: Background; Versions of PDMP; Coordinate Sampler; Numerical comparison; Conclusion
  3. Background: Generic issue. Goal: sample from a target known up to a constant, defined over $\mathbb{R}^d$, $\pi(x) \propto \gamma(x)$, with energy $U(x) = -\log \pi(x)$, $U \in C^1$.
  4. Background: Marketing arguments. Current default workhorse: reversible MCMC methods. Non-reversible MCMC algorithms based on piecewise deterministic Markov processes perform well empirically.
  5. Background: Marketing arguments. Non-reversible MCMC algorithms based on piecewise deterministic Markov processes perform well empirically. Quantitative convergence rates and variance now available.
     • Physics roots (Peters & De With, 2012; Krauth et al., 2009, 2015, 2016)
     • Mesquita and Hespanha (2010) show geometric ergodicity for exponentially decaying tail targets
     • Monmarché (2016) gives sharp results for compact state-spaces
     • Bierkens et al. (2016a,b) show ergodicity of targets on the real line
  6. Background: Motivation: piecewise deterministic Markov process. The PDMP sampler is a (new?) continuous-time, non-reversible MCMC method based on auxiliary variables:
     1. particle physics simulation [Peters and de With, 2012]
     2. empirically state-of-the-art performances [Bouchard-Côté et al., 2018]
     3. exact subsampling in big data settings [Bierkens et al., 2016]
     4. geometric ergodicity for a large class of distributions [Deligiannidis et al., 2017; Bierkens et al., 2017]
     5. ability to deal with an intractable potential $U(x) = \int U_\omega(x)\,\mu(d\omega)$ [Pakman et al., 2016]
  7. Background: Older versions. Use of an alternative methodology based on Birth-&-Death (point) processes. Idea: create a Markov chain in continuous time, i.e. a Markov jump process. The time till the next modification (jump) is exponentially distributed with intensity $q(\theta, \theta')$ depending on the current and future states. [Preston, 1976; Ripley, 1977; Geyer & Møller, 1994; Stephens, 1999]
  9. Background: Older versions. Difference with MH-MCMC: whenever a jump occurs, the corresponding move is always accepted. Acceptance probabilities are replaced with holding times. Implausible configurations, $L(\theta)\pi(\theta) \ll 1$, die quickly. It is sufficient to have detailed balance, $L(\theta)\pi(\theta)q(\theta, \theta') = L(\theta')\pi(\theta')q(\theta', \theta)$ for all $\theta, \theta'$, for $\tilde\pi(\theta) \propto L(\theta)\pi(\theta)$ to be stationary. [Cappé et al., 2000]
  11. Background: Recent reference [Bouchard-Côté et al., 2018]
  12. Background: Hamiltonian setup. All MCMC schemes presented here target an extended distribution on $\mathcal{Z} = \mathbb{R}^d \times \mathbb{R}^d$, $\rho(z) = \pi(x) \times \psi(v) = \exp(-H(z))$, with $H$ the Hamiltonian, where $z = (x, v)$ is the extended state and $\psi(v)$ is [by default] the multivariate standard Normal density. Physics takes $v$ as velocity or momentum variables, allowing for deterministic dynamics on $\mathbb{R}^d$. Obviously, sampling from $\rho$ provides samples from $\pi$.
  13. Background: Piecewise deterministic Markov process. A piecewise deterministic Markov process $\{z_t \in \mathcal{Z}\}_{t \in [0,\infty)}$ has three ingredients:
     1. Deterministic dynamics: between events, deterministic evolution based on the ODE $dz_t/dt = \Phi(z_t)$
     2. Event occurrence rate: $\lambda(t) = \lambda(z_t)$
     3. Transition dynamics: at event time $\tau$, the state prior to $\tau$ is denoted by $z_{\tau-}$, and the new state is generated by $z_\tau \sim Q(\cdot \mid z_{\tau-})$.
     [Davis, 1984, 1993]
  14. Background: Implementation. Algorithm 1: Simulation of PDMP
     Starting point $z_0$, $\tau_0 \leftarrow 0$.
     for $k = 1, 2, 3, \dots$ do
       Sample inter-event time $\eta_k$ from the distribution $P(\eta_k > t) = \exp\{-\int_0^t \lambda(z_{\tau_{k-1}+s})\,ds\}$.
       $\tau_k \leftarrow \tau_{k-1} + \eta_k$, $z_{\tau_{k-1}+s} \leftarrow \Psi_s(z_{\tau_{k-1}})$ for $s \in (0, \eta_k)$, where $\Psi$ is the ODE flow of $\Phi$.
       $z_{\tau_k-} \leftarrow \Psi_{\eta_k}(z_{\tau_{k-1}})$, $z_{\tau_k} \sim Q(\cdot \mid z_{\tau_k-})$.
     end
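Algorithm 1 can be sketched in a few lines of Python. The generic loop below takes the three ingredients (event-time sampler, flow, jump kernel) as functions. The toy instance is an assumption made for illustration only: a one-dimensional standard Gaussian target, $U(x) = x^2/2$, velocities in $\{-1, +1\}$, rate $\lambda(x, v) = \max(0, vx)$, and a jump that flips the velocity. The names `simulate_pdmp`, `event_time`, `flow`, `jump` and `second_moment` are hypothetical.

```python
import math
import random

def simulate_pdmp(z0, event_time, flow, jump, n_events, rng):
    """Algorithm 1: return the skeleton [(tau_k, z_{tau_k})] of a PDMP."""
    t, z = 0.0, z0
    skeleton = [(t, z)]
    for _ in range(n_events):
        eta = event_time(z, rng)   # inter-event time eta_k
        z_minus = flow(z, eta)     # state just before the event
        z = jump(z_minus, rng)     # z_{tau_k} ~ Q(. | z_{tau_k -})
        t += eta
        skeleton.append((t, z))
    return skeleton

# Toy instance (assumption): U(x) = x^2 / 2, v in {-1, +1}, lambda(x, v) = max(0, v x).
def event_time(z, rng):
    x, v = z
    a = v * x
    e = rng.expovariate(1.0)
    # Solve int_0^t max(0, a + s) ds = e for t (exact inversion).
    return -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)

def flow(z, s):
    x, v = z
    return (x + v * s, v)          # linear dynamics dx/dt = v, dv/dt = 0

def jump(z, rng):
    x, v = z
    return (x, -v)                 # deterministic velocity flip

def second_moment(skeleton):
    """Time average of x^2 along the piecewise linear path (v^2 = 1)."""
    total, horizon = 0.0, skeleton[-1][0]
    for (t0, (x, v)), (t1, _) in zip(skeleton, skeleton[1:]):
        tau = t1 - t0
        total += x * x * tau + x * v * tau ** 2 + tau ** 3 / 3.0
    return total / horizon

skeleton = simulate_pdmp((0.0, 1.0), event_time, flow, jump, 20000, random.Random(1))
print(second_moment(skeleton))     # should be close to 1 for a N(0, 1) target
```

Expectations are estimated by integrating along the continuous path rather than by averaging the skeleton points, since the event locations alone are not distributed according to the target.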
  15. Background: Simulation of PDMP: constraints.
     • requires being able to compute exactly the flow $z_t = \Phi_t(z_0)$: the simplest algorithms are based on $\Phi(z) = (v; 0_d)$, hence $\Phi_t(z_0) = (x_0 + v_0 t; v_0)$, except Hamiltonian BPS with Hamiltonian dynamics for a proxy Gaussian Hamiltonian [Vanetti et al., 20
     • requires the ability to simulate event times (inversion, thinning, superposition)
     • requires simulations from $Q$
  16. Versions of PDMP. Outline: Background; Versions of PDMP; Coordinate Sampler; Numerical comparison; Conclusion
  17. Versions of PDMP: Basic bouncy particle sampler. Simulation of a continuous-time piecewise linear trajectory $(x_t)_t$, with each segment in the trajectory specified by an initial position $x$, a length $\tau$, and a velocity $v$. [Bouchard-Côté et al., 2018]
  18. Versions of PDMP: Basic bouncy particle sampler. Simulation of a continuous-time piecewise linear trajectory $(x_t)_t$, with each segment in the trajectory specified by an initial position $x$, a length $\tau$, and a velocity $v$. The length is specified by an inhomogeneous Poisson point process with intensity function $\lambda(x, v) = \max\{0, \langle \nabla U(x), v \rangle\}$. [Bouchard-Côté et al., 2018]
  19. Versions of PDMP: Basic bouncy particle sampler. Simulation of a continuous-time piecewise linear trajectory $(x_t)_t$, with each segment in the trajectory specified by an initial position $x$, a length $\tau$, and a velocity $v$. The new velocity after bouncing is given by a Newtonian elastic collision, $R(x)v = v - 2\,\dfrac{\langle \nabla U(x), v \rangle}{\|\nabla U(x)\|^2}\,\nabla U(x)$. [Bouchard-Côté et al., 2018]
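The bounce operator is short enough to check numerically. The sketch below implements $R(x)v$ for a gradient passed in as a plain list; the function name `reflect` and the numerical values are illustrative assumptions.

```python
def reflect(v, grad):
    """Return R(x)v = v - 2 <grad, v> / ||grad||^2 * grad, with grad = grad U(x)."""
    dot = sum(vi * gi for vi, gi in zip(v, grad))
    norm2 = sum(gi * gi for gi in grad)
    return [vi - 2.0 * dot / norm2 * gi for vi, gi in zip(v, grad)]

v = [1.0, 2.0]
grad = [3.0, 0.0]            # hypothetical value of grad U(x)
print(reflect(v, grad))      # the component along grad changes sign
```

The reflection preserves the norm of the velocity, flips the sign of $\langle \nabla U(x), v \rangle$, and applying it twice returns the original velocity.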
  20. Versions of PDMP: Basic bouncy particle sampler [(C.) Bouchard-Côté et al., 2018]
  21. Versions of PDMP: Implementation hardships. Generally speaking, the main difficulties of implementing a PDMP come from
     1. computing the ODE flow $\Psi$: linear dynamics, quadratic dynamics
     2. simulating the inter-event time $\eta_k$: many techniques of superposition and thinning for Poisson processes [Devroye, 1986]
  22. Versions of PDMP: Poisson process on $\mathbb{R}_+$. Definition (Poisson process): a Poisson process with rate $\lambda(\cdot)$ on $\mathbb{R}_+$ is a sequence $\tau_1, \tau_2, \dots$ of rv's when the intervals $\tau_1, \tau_2 - \tau_1, \tau_3 - \tau_2, \dots$ are iid with $P(\tau_i - \tau_{i-1} > T) = \exp\{-\int_{\tau_{i-1}}^{\tau_{i-1}+T} \lambda(t)\,dt\}$, $\tau_0 = 0$.
  23. Versions of PDMP: Poisson process on $\mathbb{R}_+$. Definition (Poisson process): a Poisson process with rate $\lambda(\cdot)$ on $\mathbb{R}_+$ is a sequence $\tau_1, \tau_2, \dots$ of rv's when the intervals $\tau_1, \tau_2 - \tau_1, \tau_3 - \tau_2, \dots$ are iid with a rarely available cdf.
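One case where the cdf can be inverted in closed form is a rate that is linear in time and truncated at zero. This is a worked example rather than part of the talk: the symbols $a$, $b$ and $E$ are introduced here, with $b > 0$ and $E \sim \mathrm{Exp}(1)$. Such a rate arises, for instance, along a linear segment $x + vt$ when $U(x) = \|x\|^2/2$, with $a = \langle v, x \rangle$ and $b = \|v\|^2$.

```latex
% Rate along the segment: \lambda(t) = (a + b t)_+ , with b > 0.
% The first event time T solves \int_0^T \lambda(t)\,dt = E, where E \sim \mathrm{Exp}(1).
\begin{align*}
a \ge 0:&\quad aT + \tfrac{b}{2}T^2 = E
  &&\Longrightarrow\quad T = \frac{-a + \sqrt{a^2 + 2bE}}{b},\\
a < 0:&\quad \tfrac{b}{2}\Bigl(T + \tfrac{a}{b}\Bigr)^2 = E
  &&\Longrightarrow\quad T = \frac{-a + \sqrt{2bE}}{b},\\
\text{both cases}:&\quad T = \frac{-a + \sqrt{(a)_+^2 + 2bE}}{b}.
\end{align*}
```

When $a < 0$ the rate is zero until time $-a/b$, which is why the square root loses its $a^2$ term.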
  24. Versions of PDMP: Simulation by thinning. Theorem (Lewis and Shedler, 1979): Let $\lambda, \Lambda : \mathbb{R}_+ \to \mathbb{R}_+$ be continuous functions such that $\lambda(\cdot) \le \Lambda(\cdot)$. Let $\tau_1, \tau_2, \dots$ be the increasing sequence of a Poisson process with rate $\Lambda(\cdot)$. For all $i$, if $\tau_i$ is removed from the sequence with probability $1 - \lambda(\tau_i)/\Lambda(\tau_i)$, then the remaining $\tilde\tau_1, \tilde\tau_2, \dots$ form a non-homogeneous Poisson process with rate $\lambda(\cdot)$.
  25. Versions of PDMP: Simulation by thinning. Theorem (Lewis and Shedler, 1979): Let $\lambda, \Lambda : \mathbb{R}_+ \to \mathbb{R}_+$ be continuous functions such that $\lambda(\cdot) \le \Lambda(\cdot)$. Let $\tau_1, \tau_2, \dots$ be the increasing sequence of a Poisson process with rate $\Lambda(\cdot)$. Simulation from the upper bound [which needs to be found].
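A minimal sketch of thinning, assuming a constant bound $\Lambda$ so that the bounding process is homogeneous and its events can be generated with exponential gaps. The intensity $1 + \sin t$, the bound 2, and the name `thinned_poisson` are illustrative choices, not taken from the talk.

```python
import math
import random

def thinned_poisson(rate, bound, horizon, rng):
    """Event times on [0, horizon] of a Poisson process with intensity rate(t),
    obtained by thinning a homogeneous process of constant rate `bound`."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(bound)          # next candidate from the bounding process
        if t > horizon:
            return events
        if rng.random() < rate(t) / bound:   # keep with probability lambda(t) / Lambda
            events.append(t)

rate = lambda t: 1.0 + math.sin(t)           # illustrative intensity, bounded by 2
events = thinned_poisson(rate, 2.0, 1000.0, random.Random(0))
expected = 1000.0 + 1.0 - math.cos(1000.0)   # integral of the intensity over [0, 1000]
print(len(events), expected)
```

The number of retained events is Poisson with mean equal to the integrated intensity, which gives a quick sanity check. A tighter bound wastes fewer candidates, which is why finding a good $\Lambda$ matters in practice.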
  26. Versions of PDMP: Simulation by superposition theorem. Theorem (Kingman, 1992): Let $\Pi_1, \Pi_2, \dots$ be a countable collection of independent Poisson processes on $\mathbb{R}_+$ with resp. rates $\lambda_n(\cdot)$. If $\sum_{n=1}^\infty \lambda_n(t) < \infty$ for all $t$, then the superposition process $\Pi = \bigcup_{n=1}^\infty \Pi_n$ is a Poisson process with rate $\lambda(t) = \sum_{n=1}^\infty \lambda_n(t)$.
  27. Versions of PDMP: Simulation by superposition theorem. Theorem (Kingman, 1992): Let $\Pi_1, \Pi_2, \dots$ be a countable collection of independent Poisson processes on $\mathbb{R}_+$ with resp. rates $\lambda_n(\cdot)$. If $\sum_{n=1}^\infty \lambda_n(t) < \infty$ for all $t$, then the superposition process is a Poisson process with rate $\lambda(t)$. Decomposition of $U = \sum_j U_j$ plus thinning.
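In simulation, the superposition theorem is used in reverse: to draw the first event of a process whose rate is a sum, draw the first event of each component and keep the earliest. The sketch below assumes constant component rates to keep the example short (the theorem covers time-varying rates); `first_event` and the rate values are hypothetical.

```python
import random

def first_event(rates, rng):
    """First event of the superposition of independent Poisson processes with
    constant rates: returns (time, index of the component that fired)."""
    times = [rng.expovariate(r) for r in rates]
    k = min(range(len(rates)), key=times.__getitem__)
    return times[k], k

rng = random.Random(3)
rates = [0.5, 1.0, 2.5]
draws = [first_event(rates, rng) for _ in range(20000)]
mean_time = sum(t for t, _ in draws) / len(draws)
share_last = sum(1 for _, k in draws if k == 2) / len(draws)
print(mean_time, share_last)   # compare with 1 / sum(rates) and rates[2] / sum(rates)
```

With constant rates, the earliest time is exponential with rate $\sum_n \lambda_n$ and component $n$ fires first with probability $\lambda_n / \sum_m \lambda_m$, which is what the two printed summaries estimate.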
  28. Versions of PDMP: Simulation by superposition plus thinning. Almost all implementations of discrete-time schemes consist in sampling a Bernoulli rv of parameter $\alpha(z)$. For $\Phi(z) = (x + v\epsilon, v)$ and $\alpha(z) = 1 \wedge \pi(x + v\epsilon)/\pi(x)$:
     • sampling the inter-event time for strictly convex $U$ can be obtained by solving $t^\star = \arg\min_t U(x + vt)$ and additional randomization
     • thinning: if there exists $\bar\alpha$ such that $\alpha(\Phi^k(z)) \ge \bar\alpha(x, k)$, accept-reject
     • superposition and thinning: when $\alpha(z) = 1 \wedge \rho(\Phi(z))/\rho(z)$ and $\rho(\cdot) = \prod_i \rho_i(\cdot)$, then $\bar\alpha(z, k) = \prod_i \bar\alpha_i(z, k)$
  29. Versions of PDMP: Extended generator. Definition: Let $D(\mathcal{L})$ be the set of measurable functions $f : \mathcal{Z} \to \mathbb{R}$ such that there exists a measurable function $h : \mathcal{Z} \to \mathbb{R}$ with $t \mapsto h(z_t)$ integrable $P_z$-a.s. for each $z \in \mathcal{Z}$ and such that the process $C_t^f = f(z_t) - f(z_0) - \int_0^t h(z_s)\,ds$ is a local martingale. Then $h \triangleq \mathcal{L}f$ and $(\mathcal{L}, D(\mathcal{L}))$ is the extended generator of the process $\{z_t\}_{t \ge 0}$.
  30. Versions of PDMP: Extended generator of PDMP. Theorem (Davis, 1993): The generator $\mathcal{L}$ of the above PDMP is, for $f \in D(\mathcal{L})$, $\mathcal{L}f(z) = \nabla f(z) \cdot \Phi(z) + \lambda(z) \int_{z'} [f(z') - f(z)]\,Q(dz' \mid z)$. Furthermore, $\mu(dz)$ is an invariant distribution of the above PDMP if $\int \mathcal{L}f(z)\,\mu(dz) = 0$ for all $f \in D(\mathcal{L})$.
  31. Versions of PDMP: PDMP-based sampler. A PDMP-based sampler is an auxiliary variable technique. Given a target $\pi(x)$,
     1. introduce an auxiliary variable $V \in \mathcal{V}$ along with a density $\pi(v \mid x)$,
     2. choose appropriate $\Phi$, $\lambda$ and $Q$ for $\pi(x)\pi(v \mid x)$ to be the unique invariant distribution of the Markov process.
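  32. Versions of PDMP: Bouncy Particle Sampler (Bouchard-Côté et al., 2018). $\mathcal{V} = \mathbb{R}^d$, and $\pi(v \mid x) = \varphi(v)$, the $N(0, I_d)$ density.
     1. Deterministic dynamics: $dx_t/dt = v_t$, $dv_t/dt = 0$
     2. Event occurrence rate: $\lambda(x, v) = \langle v, \nabla U(x) \rangle_+ + \lambda^{\text{ref}}$
     3. Transition dynamics: $Q((dx', dv') \mid (x, v)) = \dfrac{\langle v, \nabla U(x) \rangle_+}{\lambda(x, v)}\,\delta_x(dx')\,\delta_{R_{\nabla U(x)}v}(dv') + \dfrac{\lambda^{\text{ref}}}{\lambda(x, v)}\,\delta_x(dx')\,\varphi(dv')$, where $R_{\nabla U(x)}v = v - 2\,\dfrac{\langle \nabla U(x), v \rangle}{\langle \nabla U(x), \nabla U(x) \rangle}\,\nabla U(x)$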
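The three ingredients of the bouncy particle sampler can be put together for a target where event times are available by inversion. This is a sketch under stated assumptions: a standard Gaussian target, $U(x) = \|x\|^2/2$ so that $\nabla U(x) = x$, and the two-component mixture in $Q$ implemented by racing a bounce clock against a refreshment clock (superposition). The function names and the tuning values are hypothetical.

```python
import math
import random

def reflect(v, grad):
    """R v = v - 2 <grad, v> / <grad, grad> * grad."""
    dot = sum(vi * gi for vi, gi in zip(v, grad))
    norm2 = sum(gi * gi for gi in grad)
    return [vi - 2.0 * dot / norm2 * gi for vi, gi in zip(v, grad)]

def bps_gaussian(d, n_events, lam_ref, rng):
    """BPS for U(x) = |x|^2 / 2; returns the time average of x_1^2."""
    x = [0.0] * d
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    total_time, integral = 0.0, 0.0
    for _ in range(n_events):
        a = sum(vi * xi for vi, xi in zip(v, x))   # <v, grad U(x)>
        b = sum(vi * vi for vi in v)               # |v|^2
        e = rng.expovariate(1.0)
        # Bounce clock: solve int_0^t max(0, a + b s) ds = e.
        t_bounce = (-a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * b * e)) / b
        t_refresh = rng.expovariate(lam_ref)       # refreshment clock
        tau = min(t_bounce, t_refresh)
        # Exact integral of (x_1 + v_1 s)^2 over the segment.
        integral += x[0] ** 2 * tau + x[0] * v[0] * tau ** 2 + v[0] ** 2 * tau ** 3 / 3.0
        total_time += tau
        x = [xi + vi * tau for xi, vi in zip(x, v)]
        if t_bounce < t_refresh:
            v = reflect(v, x)                      # bounce against grad U(x) = x
        else:
            v = [rng.gauss(0.0, 1.0) for _ in range(d)]   # refresh v ~ N(0, I_d)
    return integral / total_time

estimate = bps_gaussian(2, 40000, 1.0, random.Random(2))
print(estimate)   # should be close to 1, the variance of each component
```

For targets without a closed-form event time, the bounce clock would instead be simulated by thinning against an upper bound on the rate.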
  33. Versions of PDMP: Zig-Zag Sampler (Bierkens et al., 2016). $\mathcal{V} = \{+1, -1\}^d$, and $\pi(v \mid x) \sim \text{Uniform}(\{+1, -1\}^d)$.
     1. Deterministic dynamics: $dx_t/dt = v_t$, $dv_t/dt = 0$
     2. Event occurrence rate: $\lambda(x, v) = \sum_{i=1}^d \lambda_i(x, v) = \sum_{i=1}^d \left[\{v_i \partial_i U(x)\}_+ + \lambda_i^{\text{ref}}\right]$
     3. Transition dynamics: $Q((dx', dv') \mid (x, v)) = \sum_{i=1}^d \dfrac{\lambda_i(x, v)}{\lambda(x, v)}\,\delta_x(dx')\,\delta_{F_i v}(dv')$, where $F_i$ is the operator that flips the $i$-th component of $v$ and keeps the others unchanged
  34. Versions of PDMP: Continuous-time Hamiltonian Monte Carlo (Neal, 1999). $\mathcal{V} = \mathbb{R}^d$, and $\pi(v \mid x) = \varphi(v)$, the $N(0, I_d)$ density.
     1. Deterministic dynamics: $dx_t/dt = v_t$, $dv_t/dt = -\nabla U(x_t)$
     2. Event occurrence rate: $\lambda(x, v) = \lambda_0(x)$
     3. Transition dynamics: $Q((dx', dv') \mid (x, v)) = \delta_x(dx')\,\varphi(dv')$
  35. Versions of PDMP: Continuous-time Riemannian manifold HMC (Girolami & Calderhead, 2011). $\mathcal{V} = \mathbb{R}^d$, and $\pi(v \mid x) = N(0, G(x))$, with Hamiltonian $H(x, v) = U(x) + \tfrac12 v^T G(x)^{-1} v + \tfrac12 \log|G(x)|$.
     1. Deterministic dynamics: $dx_t/dt = \partial H/\partial v\,(x_t, v_t)$, $dv_t/dt = -\partial H/\partial x\,(x_t, v_t)$
     2. Event occurrence rate: $\lambda(x, v) = \lambda_0(x)$
     3. Transition dynamics: $Q((dx', dv') \mid (x, v)) = \delta_x(dx')\,\varphi(dv' \mid x')$
  36. Versions of PDMP: Randomized BPS. Define $a = \dfrac{\langle v, \nabla U(x) \rangle}{\langle \nabla U(x), \nabla U(x) \rangle}\,\nabla U(x)$ and $b = v - a$. Regular BPS: move $v' = -a + b$. Alternatives:
     1. Fearnhead et al. (2016): $v' \sim Q_x(dv' \mid v) = \max\{0, -\langle v', \nabla U(x) \rangle\}\,dv'$
     2. Wu and Robert (2017): $v' = -a + b'$, where $b'$ is a Gaussian variate over the space orthogonal to $\nabla U(x)$ in $\mathbb{R}^d$.
  37. Versions of PDMP: HMC-BPS (Vanetti et al., 2017). $\rho(x) \propto \exp\{-V(x)\}$ is a Gaussian approximation of the target $\pi(x)$, with $\hat H(x, v) = V(x) + \tfrac12 v^T v$ and $\tilde U(x) = U(x) - V(x)$.
     1. Deterministic dynamics: $dx_t/dt = v_t$, $dv_t/dt = -\nabla V(x_t)$
     2. Event occurrence rate: $\lambda(x, v) = \langle v, \nabla \tilde U(x) \rangle_+ + \lambda^{\text{ref}}$
     3. Transition dynamics: $Q((dx', dv') \mid (x, v)) = \dfrac{\langle v, \nabla \tilde U(x) \rangle_+}{\lambda(x, v)}\,\delta_x(dx')\,\delta_{R_{\nabla \tilde U(x)}v}(dv') + \dfrac{\lambda^{\text{ref}}}{\lambda(x, v)}\,\delta_x(dx')\,\varphi(dv')$
  38. Versions of PDMP: Discretisation.
     1. Sherlock and Thiery (2017) consider a delayed rejection approach with only point-wise evaluations of the target, making a speed flip move once a proposal involving a flip in speed and a drift in the variable of interest is rejected. They also add a random perturbation for ergodicity, plus another perturbation based on a Brownian argument. Requires calibration.
     2. Vanetti et al. (2017)
     Benefit: bypassing the generation of inter-event times of inhomogeneous Poisson processes.
  39. Versions of PDMP: Discretisation.
     1. Sherlock and Thiery (2017)
     2. Vanetti et al. (2017) unify many threads and relate PDMP, HMC, and discrete versions, with convergence results. The main idea improves upon existing deterministic methods by accounting for the target. It borrows from the earlier slice sampler idea of Murray et al. (AISTATS, 2010), exploiting exact Hamiltonian dynamics for an approximation to the true target, except that bouncing avoids the slice step. The discrete BPS is both correct with respect to the target and free of event-time simulation.
     Benefit: bypassing the generation of inter-event times of inhomogeneous Poisson processes.
  40. Coordinate Sampler. Outline: Background; Versions of PDMP; Coordinate Sampler; Numerical comparison; Conclusion
  41. Coordinate Sampler: Coordinate sampler. A generalisation of the Zig-Zag sampler such that
     1. the velocity set used in the coordinate sampler (CS) is made of an orthonormal basis of $\mathbb{R}^d$, while for the Zig-Zag sampler (ZS) it is restricted to $\{-1, 1\}^d$
     2. the event rate function $\lambda(\cdot)$ in ZS is much larger than for CS, especially for high-dimensional distributions: events occur more frequently in ZS [with lower efficiency]
     3. CS targets only one component at a time, while ZS modifies all components at the same time
  42. Coordinate Sampler: Coordinate Sampler. Generalisation of ZS where the bounce is uniformly random on $\mathcal{V} = \{\pm e_1, \dots, \pm e_d\}$.
     1. Deterministic dynamics: $dx_t/dt = v_t$, $dv_t/dt = 0$
     2. Event occurrence rate: $\lambda(x, v) = \langle v, \nabla U(x) \rangle_+ + \lambda^{\text{ref}}$
     3. Transition dynamics: $Q((dx', dv') \mid (x, v)) = \sum_{v^* \in \mathcal{V}} \dfrac{\lambda(x, -v^*)}{\lambda(x)}\,\delta_x(dx')\,\delta_{v^*}(dv')$, where $\lambda(x) = \sum_{v \in \mathcal{V}} \lambda(x, v) = 2d\lambda^{\text{ref}} + \sum_{i=1}^d \left|\partial U(x)/\partial x_i\right|$
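A minimal sketch of the coordinate sampler defined by these three ingredients, assuming a standard Gaussian target, $U(x) = \|x\|^2/2$, so that the event time along a coordinate direction can be drawn by inversion. A velocity $v = s\,e_i$ is stored as the pair `(i, s)`. The function name, the pair encoding and the tuning values are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def coordinate_sampler_gaussian(d, n_events, lam_ref, rng):
    """CS for U(x) = |x|^2 / 2; returns the time average of x_j^2 for each j."""
    x = [0.0] * d
    i, s = 0, 1.0                                   # current velocity v = s * e_i
    velocities = [(j, sj) for j in range(d) for sj in (1.0, -1.0)]
    total_time, integrals = 0.0, [0.0] * d
    for _ in range(n_events):
        a = s * x[i]                                # <v, grad U(x)> = s * x_i
        e = rng.expovariate(1.0)
        t_grad = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        # Total rate is <v, grad U>_+ + lam_ref: race the two clocks.
        tau = min(t_grad, rng.expovariate(lam_ref))
        for j in range(d):
            if j == i:   # moving coordinate: integrate (x_i + s t)^2 exactly
                integrals[j] += x[j] ** 2 * tau + x[j] * s * tau ** 2 + tau ** 3 / 3.0
            else:        # frozen coordinates stay constant on the segment
                integrals[j] += x[j] ** 2 * tau
        total_time += tau
        x[i] += s * tau                             # only one component moves
        # New velocity v* drawn with probability lambda(x, -v*) / lambda(x).
        weights = [max(0.0, -sj * x[j]) + lam_ref for j, sj in velocities]
        i, s = rng.choices(velocities, weights=weights)[0]
    return [val / total_time for val in integrals]

moments = coordinate_sampler_gaussian(3, 60000, 1.0, random.Random(4))
print(moments)   # each entry should be close to 1
```

The weights favour directions along which the potential decreases, since $\lambda(x, -v^*)$ is large exactly when moving along $v^*$ goes downhill in $U$.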
  43. Coordinate Sampler: Validation of Coordinate Sampler. Extended generator: $\mathcal{L}f = \langle \nabla_x f(x, v), v \rangle + \lambda(x, v) \sum_{v' \in \mathcal{V}} \dfrac{\lambda(x, -v')}{\lambda(x)}\,[f(x, v') - f(x, v)]$. Theorem: For any $\lambda^{\text{ref}} > 0$, the PDMP induced by CS enjoys $\pi(x)\varphi(v)$ as its unique invariant distribution, provided the potential $U$ is $C^1$.
  44. Coordinate Sampler: Geometric Ergodicity of Coordinate Sampler. Geometric ergodicity for distributions with tails faster than exponential and slower than Gaussian. Assumptions: assume $U : \mathbb{R}^d \to \mathbb{R}_+$ satisfies
     A1 $\partial^2 U(x)/\partial x_i \partial x_j$ is locally Lipschitz continuous for all $i, j$
     A2 $\int \|\nabla U(x)\|\,\pi(dx) < \infty$
     A3 $\lim_{|x| \to \infty} e^{U(x)/2}/\|\nabla U(x)\| > 0$
     A4 $V \ge c_0$ for some positive constant $c_0$
     [Deligiannidis et al., 2017]
  45. Coordinate Sampler: Geometric Ergodicity of Coordinate Sampler. Geometric ergodicity for distributions with tails faster than exponential and slower than Gaussian. Assume $U : \mathbb{R}^d \to \mathbb{R}_+$ satisfies further conditions:
     C1 $\lim_{|x| \to \infty} \|\nabla U(x)\| = \infty$, $\lim_{|x| \to \infty} \|\Delta U(x)\| \le \alpha_1 < \infty$ and $\lambda^{\text{ref}} > \sqrt{8\alpha_1}$
     C2 $\lim_{|x| \to \infty} \|\nabla U(x)\| = 2\alpha_2 > 0$, $\lim_{|x| \to \infty} \|\Delta U(x)\| = 0$ and $\lambda^{\text{ref}} < \alpha_2/14d$
     [Deligiannidis et al., 2017]
  46. Coordinate Sampler: Geometric Ergodicity of Coordinate Sampler. The Lyapunov function for the Markov process induced by the coordinate sampler is $V(x, v) = e^{U(x)/2} \big/ \sqrt{\lambda^{\text{ref}} + \langle \nabla U(x), -v \rangle_+}$. Theorem: Suppose A1-A4 hold and one of the conditions C1, C2 holds; then CS is $V$-uniformly ergodic.
  47. Coordinate Sampler: Coordinate Sampler vs Zig-Zag Sampler.
     1. The cardinality of $\mathcal{V}$ is $2d$ for CS, while for ZS it is $2^d$;
     2. Along each piecewise segment, CS only changes one component of $x$, while ZS modifies all components at the same time (con);
     3. $\lambda_{CS} = \{v_i \partial_i U(x)\}_+ + \lambda^{\text{ref}}$ if $v = v_i e_i$, while $\lambda_{ZS} = \sum_{i=1}^d \{v_i \partial_i U(x)\}_+ + \lambda^{\text{ref}}$ if $v = \sum_{i=1}^d v_i e_i$. Generally speaking, $\lambda_{CS}$ is much smaller than $\lambda_{ZS}$ (pro).
  48. Coordinate Sampler: Coordinate Sampler vs Zig-Zag Sampler. Suppose that
     1. $\lambda(x, v)$ of CS and $\lambda_i(x, v)$ of ZS have the same scale;
     2. simulating the first event time of a Poisson process with rate $\lambda(x, v)$ of CS or $\lambda_i(x, v)$ of ZS has the same computation cost, $O(c)$.
     Then an $O(dc)$ computation cost results in
     1. ZS making each component evolve with scale O( /d),
     2. CS making each component evolve O( ) [gain $O(d)$] on average.
  49. Numerical comparison. Outline: Background; Versions of PDMP; Coordinate Sampler; Numerical comparison; Conclusion
  50. Numerical comparison: Example 1: Banana-shaped Distribution. Target with density $\pi(x) \propto \exp\{-(x_1 - 1)^2 - \kappa(x_2 - x_1^2)^2\}$, where a large $\kappa$ increases curvature and difficulty. [Figure: ratio of ESS per second against $\log_2(\kappa)$ for the first component, the second component, and the log-likelihood]
  51. Numerical comparison: Example 1: Banana-shaped Distribution. [Figure: ratio of ESS per second against $\log_2(\kappa)$] The x-axis corresponds to $\log_2(\kappa)$, the y-axis to the ratio of ESS's per second for CS versus ZS. The red line is the efficiency ratio for the first component, blue for the second, and green for the log-likelihood.
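Any PDMP sampler needs the potential and its gradient. For the banana-shaped target of Example 1, $U(x) = (x_1 - 1)^2 + \kappa(x_2 - x_1^2)^2$ up to a constant, and both can be written directly. The function names below are hypothetical.

```python
def banana_potential(x, kappa):
    """U(x) = (x1 - 1)^2 + kappa * (x2 - x1^2)^2, up to an additive constant."""
    x1, x2 = x
    return (x1 - 1.0) ** 2 + kappa * (x2 - x1 ** 2) ** 2

def banana_gradient(x, kappa):
    """Gradient of U, whose i-th entry drives the event rate of coordinate i."""
    x1, x2 = x
    r = x2 - x1 ** 2
    return [2.0 * (x1 - 1.0) - 4.0 * kappa * x1 * r, 2.0 * kappa * r]

print(banana_potential([1.0, 1.0], 4.0))   # the mode (1, 1) has potential 0
print(banana_gradient([1.0, 1.0], 4.0))    # and a vanishing gradient
```

Because the gradient is not linear in $x$, event times for this target are not available by the simple inversion used for Gaussian targets, so an implementation would typically rely on thinning.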
  52. Numerical comparison: Example 2: Multivariate Gaussian Distribution. $\pi(x) \propto \exp\{-\tfrac12 x^T A^{-1} x\}$
     1. MVN1: $A_{ii} = 1$ for $i = 1, \dots, d$ and $A_{ij} = 0.9$ for $i \ne j$.
     2. MVN2: $A_{ii} = 1$ for $i = 1, \dots, d$ and $A_{ij} = 0.9^{|i-j|}$ for $i \ne j$.
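The two covariance structures can be built as follows. This is a small sketch with hypothetical function names and the correlation 0.9 exposed as a parameter `rho`.

```python
def mvn1_covariance(d, rho=0.9):
    """MVN1: unit variances and constant correlation rho off the diagonal."""
    return [[1.0 if i == j else rho for j in range(d)] for i in range(d)]

def mvn2_covariance(d, rho=0.9):
    """MVN2: unit variances and correlation rho^|i-j| between components i and j."""
    return [[rho ** abs(i - j) for j in range(d)] for i in range(d)]

A1 = mvn1_covariance(4)
A2 = mvn2_covariance(4)
print(A1[0])
print(A2[0])
```

In MVN1 every pair of components is strongly correlated, whereas in MVN2 the correlation decays geometrically with the distance between indices.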
  53. Numerical comparison: Example 2: Multivariate Gaussian Distribution. [Figure: two panels of efficiency ratios (MinESS, MeanESS, MaxESS) against dimension] The lhs plot shows results for MVN1 and the rhs for MVN2; the x-axis indexes the dimension $d$ and the y-axis the efficiency ratios of CS over ZS in terms of minimal (red), mean (blue), and maximal (green) ESS across components.
  54. Numerical comparison: Example 2: Multivariate Gaussian Distribution. The upper plot shows results for MVN1 and the lower for MVN2. The x-axis indexes the dimension $d$ of the distribution, and the y-axis the efficiency ratios of CS over ZS in terms of minimum, mean, median and maximum of ESS across the components, per number of calls to the event rate function.
  55. Numerical comparison: Example 3: Bayesian Logistic Posterior. Simulated dataset of $N$ observations $\{(r_n, t_n)\}_{n=1}^N$, where each $r_{n,i}$ is drawn from a standard Normal distribution and $t_n$ is drawn from $\{-1, 1\}$ uniformly: $\pi(x) \propto \prod_{n=1}^N \dfrac{\exp(t_n x^T r_n)}{1 + \exp(x^T r_n)}$
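The sketch below simulates such a dataset and evaluates the potential $U(x) = \sum_n [\log(1 + e^{x^T r_n}) - t_n x^T r_n]$ and its gradient. The formula is transcribed as it appears on the slide; whether the labels were meant to be coded in $\{0, 1\}$ rather than $\{-1, 1\}$ is not something the slide settles, so the label set is an assumption that can be changed in `simulate_data`. All function names are hypothetical.

```python
import math
import random

def simulate_data(n_obs, d, rng, labels=(-1, 1)):
    """Covariates r_{n,i} ~ N(0, 1) and labels t_n uniform on `labels`."""
    r = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_obs)]
    t = [rng.choice(labels) for _ in range(n_obs)]
    return r, t

def softplus(a):
    """Numerically stable log(1 + exp(a))."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))

def sigmoid(a):
    """Numerically stable exp(a) / (1 + exp(a))."""
    if a >= 0.0:
        return 1.0 / (1.0 + math.exp(-a))
    ea = math.exp(a)
    return ea / (1.0 + ea)

def potential(x, r, t):
    """U(x) = sum_n [log(1 + exp(x.r_n)) - t_n x.r_n]."""
    total = 0.0
    for rn, tn in zip(r, t):
        a = sum(xi * ri for xi, ri in zip(x, rn))
        total += softplus(a) - tn * a
    return total

def gradient(x, r, t):
    """grad U(x) = sum_n [sigmoid(x.r_n) - t_n] r_n."""
    g = [0.0] * len(x)
    for rn, tn in zip(r, t):
        a = sum(xi * ri for xi, ri in zip(x, rn))
        c = sigmoid(a) - tn
        for i, ri in enumerate(rn):
            g[i] += c * ri
    return g

r, t = simulate_data(40, 10, random.Random(5))
x = [0.1] * 10
print(potential(x, r, t))
print(gradient(x, r, t)[:3])
```

For CS only the partial derivative along the current coordinate is needed at each event, so a production version would compute one entry of the gradient rather than the full vector.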
  56. Numerical comparison: Example 3: Bayesian Logistic Posterior. [Figure: ESS per second (MinESS, MeanESS, MaxESS) by sampler type, CS and ZS] Comparison of CS versus ZS: the y-axis stands for ESS per second, $d = 10$, $N = 40$.
  57. Conclusion: Forward.
     1. Comparing the coordinate sampler and the Zig-Zag sampler theoretically
     2. Optimising reparametrization
     3. Riemannian manifold technique
  58. Conclusion: References.
     Bierkens, J., Fearnhead, P., and Roberts, G. (2016). The zig-zag process and super-efficient sampling for Bayesian analysis of big data. arXiv preprint arXiv:1607.03188.
     Bierkens, J., Roberts, G., and Zitt, P.-A. (2017). Ergodicity of the zigzag process. arXiv preprint arXiv:1712.09875.
     Bouchard-Côté, A., Vollmer, S. J., and Doucet, A. (2018). The bouncy particle sampler: a non-reversible rejection-free Markov chain Monte Carlo method. Journal of the American Statistical Association (to appear).
     Davis, M. H. (1984). Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models. Journal of the Royal Statistical Society. Series B (Methodological), pages 353–388.
     Davis, M. H. (1993). Markov Models & Optimization, volume 49. CRC Press.
     Deligiannidis, G., Bouchard-Côté, A., and Doucet, A. (2017). Exponential ergodicity of the bouncy particle sampler. arXiv preprint arXiv:1705.04579.
     Fearnhead, P., Bierkens, J., Pollock, M., and Roberts, G. O. (2016). Piecewise deterministic Markov processes for continuous-time Monte Carlo. arXiv preprint arXiv:1611.07873.
     Kingman, J. F. C. (1992). Poisson processes, volume 3. Clarendon Press.
     Lewis, P. A. and Shedler, G. S. (1979). Simulation of nonhomogeneous Poisson processes by thinning. Naval Research Logistics (NRL), 26(3):403–413.
     Peters, E. A. and de With, G. (2012). Rejection-free Monte Carlo sampling for general potentials. Physical Review E, 85(2):026703.
     Sherlock, C. and Thiery, A. H. (2017). A discrete bouncy particle sampler. arXiv preprint arXiv:1707.05200.
     Vanetti, P., Bouchard-Côté, A., Deligiannidis, G., and Doucet, A. (2017). Piecewise deterministic Markov chain Monte Carlo. arXiv preprint arXiv:1707.05296.
     Wu, C. and Robert, C. P. (2017). Generalized bouncy particle sampler. arXiv preprint arXiv:1706.04781.