Multimodality and label switching: a discussion


ICMS Discussion, March 2010


This is a discussion of the presentations of John Geweke and of Sylvia Frühwirth-Schnatter, during the ICMS conference on March 3-5, 2010, in Edinburgh.


  1. Multimodality and label switching: a discussion. Christian P. Robert, Université Paris-Dauphine and CREST, INSEE. http://www.ceremade.dauphine.fr/~xian. Workshop on mixtures, ICMS, February 28, 2010.
  2. Outline: (1) Does MCMC converge? (2) Postprocessing MCMC output.
  3-7. Monte Carlo perspective. When, given a target $\pi$, an MCMC sampler never visits more than 50% of the support of $\pi$, it can be argued that the sampler does not converge! Two-component normal mixture and Gibbs sampler: case when both means $\mu_i$ are the only unknowns, with different weights and same variance: identifiable model. [Figure: plot in the $(\mu_1, \mu_2)$ plane, both axes running from −1 to 4.] (C.) Simple MCMC does not work.
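The setting of this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the sample size, the true means, the weight $p = 0.7$, the $N(0, 100)$ prior on each mean, and the function names `simulate_mixture` and `gibbs_two_means` are all hypothetical choices.

```python
import math
import random


def simulate_mixture(n, p, mu1, mu2, rng):
    """Draw n points from p N(mu1, 1) + (1 - p) N(mu2, 1)."""
    return [rng.gauss(mu1 if rng.random() < p else mu2, 1.0) for _ in range(n)]


def gibbs_two_means(x, p, n_iter, prior_var=100.0, seed=1):
    """Gibbs sampler when only the two means are unknown.

    Weights (p, 1 - p) and the unit variance are treated as known;
    each mean gets an assumed N(0, prior_var) prior.
    """
    rng = random.Random(seed)
    mu = [rng.choice(x), rng.choice(x)]  # start from two data points
    weights = (p, 1.0 - p)
    chain = []
    for _ in range(n_iter):
        # Step 1: sample the allocation of each observation given the means.
        sums, counts = [0.0, 0.0], [0, 0]
        for xi in x:
            w1 = weights[0] * math.exp(-0.5 * (xi - mu[0]) ** 2)
            w2 = weights[1] * math.exp(-0.5 * (xi - mu[1]) ** 2)
            j = 0 if rng.random() * (w1 + w2) < w1 else 1
            sums[j] += xi
            counts[j] += 1
        # Step 2: sample each mean from its conjugate normal conditional.
        for j in range(2):
            prec = counts[j] + 1.0 / prior_var
            mu[j] = rng.gauss(sums[j] / prec, math.sqrt(1.0 / prec))
        chain.append((mu[0], mu[1]))
    return chain


data = simulate_mixture(200, 0.7, 0.0, 2.5, random.Random(0))
chain = gibbs_two_means(data, p=0.7, n_iter=1000)
share_ordered = sum(m1 < m2 for m1, m2 in chain) / len(chain)
print(f"share of draws with mu1 < mu2: {share_ordered:.3f}")
```

A share close to 0 or to 1 would indicate that the chain stayed on one side of the diagonal $\mu_1 = \mu_2$ for the whole run, which is the behaviour the slide describes; rerunning with other seeds or starting values is a simple way to see whether a second mode is ever reached.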
  8-9. Imposed permutations may miss the mark. While duplicating the MCMC sampler according to all permutations $\rho$ in $S_k$ produces perfect exchangeability [nice!],
     • it does not bring additional energy to the MCMC sampler
     • it does not identify other modes (under- or over-fitting)
     • it does not apply in nearly-but-not-exchangeable settings
  10. Illustrations. Example (Two-mean Gaussian mixture): case of $p\,\mathcal{N}(\mu_1, 1) + (1 - p)\,\mathcal{N}(\mu_2, 1)$ with $p = 0.5$. [Figure: plot in the $(\mu_1, \mu_2)$ plane, both axes running from −1 to 4.]
  11. Illustrations (2). Example (Two-mean Gaussian mixture and outliers): same model, but data from 5-component mixture.
  12-13. Illustrations (3). Example (Outlier Gaussian mixture): case of $p\,\mathcal{N}(0, 1) + (1 - p)\,\mathcal{N}(\mu, \sigma^2)$ with $p$ known.
  14-16. Postprocessing MCMC output [subsections: Chib’s solution; Nested sampling]. Postprocessing issues. When assessing the number $k$ of components via the evidence $Z_k = \int_{\Theta_k} \pi_k(\theta_k) L_k(\theta_k)\,\mathrm{d}\theta_k$, aka the marginal likelihood, label switching is a liability and an uninteresting phenomenon. Indeed, $\int_{\Theta_k} \pi_k(\theta_k) L_k(\theta_k)\,\mathrm{d}\theta_k = k! \int_{\Theta_k / S_k} \pi_k(\theta_k) L_k(\theta_k)\,\mathrm{d}\theta_k$ means that integrating over the restricted space is [more than] ok!
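The $k!$ identity can be checked numerically for $k = 2$. The sketch below assumes an exchangeable toy model that is not taken from the slides (equal weights, unit variances, independent $N(0, 10)$ priors on the means, five made-up observations) and uses a plain midpoint rule on a square grid; all function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def prior_times_likelihood(mu1, mu2, data, prior_var=10.0):
    """pi(theta) L(theta) for 0.5 N(mu1, 1) + 0.5 N(mu2, 1), symmetric in (mu1, mu2)."""
    value = normal_pdf(mu1, 0.0, prior_var) * normal_pdf(mu2, 0.0, prior_var)
    for x in data:
        value *= 0.5 * normal_pdf(x, mu1, 1.0) + 0.5 * normal_pdf(x, mu2, 1.0)
    return value


def evidence_full_and_restricted(data, lo=-6.0, hi=8.0, m=120):
    """Midpoint-rule integrals over the square and over the region mu1 < mu2."""
    h = (hi - lo) / m
    grid = [lo + (i + 0.5) * h for i in range(m)]
    full = restricted = 0.0
    for i, a in enumerate(grid):
        for j, b in enumerate(grid):
            cell = prior_times_likelihood(a, b, data) * h * h
            full += cell
            if i < j:
                restricted += cell
            elif i == j:
                restricted += 0.5 * cell  # the diagonal cuts these cells in half
    return full, restricted


data = [-0.5, 0.2, 1.9, 2.4, 3.0]  # made-up observations
full, restricted = evidence_full_and_restricted(data)
print("full / restricted =", full / restricted)
```

Because the integrand is symmetric under the swap of $\mu_1$ and $\mu_2$, the ratio should equal $2! = 2$ up to floating-point error, so nothing is lost by integrating over the ordered region and multiplying by $k!$.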
  17-18. Chib’s representation. Direct application of Bayes’ theorem: given $x \sim f_k(x \mid \theta_k)$ and $\theta_k \sim \pi_k(\theta_k)$, $Z_k = m_k(x) = \dfrac{f_k(x \mid \theta_k)\,\pi_k(\theta_k)}{\pi_k(\theta_k \mid x)}$. Use of an approximation to the posterior: $\hat{Z}_k = \hat{m}_k(x) = \dfrac{f_k(x \mid \theta_k^*)\,\pi_k(\theta_k^*)}{\hat{\pi}_k(\theta_k^* \mid x)}$.
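Chib’s identity is easiest to see in a model where the posterior is available in closed form, so that no approximation is needed. The sketch below assumes a conjugate toy model that is not in the slides ($x_i \sim N(\theta, 1)$, $\theta \sim N(0, 4)$, made-up data) and checks that the ratio returns the same log-evidence at any $\theta^*$; the function names are hypothetical.

```python
import math


def log_normal_pdf(x, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def chib_log_evidence(data, theta_star, prior_var=4.0):
    """log m(x) = log f(x | theta*) + log pi(theta*) - log pi(theta* | x)."""
    n = len(data)
    log_lik = sum(log_normal_pdf(x, theta_star, 1.0) for x in data)
    log_prior = log_normal_pdf(theta_star, 0.0, prior_var)
    # Conjugate posterior: precision n + 1 / prior_var, mean sum(x) / precision.
    post_var = 1.0 / (n + 1.0 / prior_var)
    post_mean = post_var * sum(data)
    log_post = log_normal_pdf(theta_star, post_mean, post_var)
    return log_lik + log_prior - log_post


def exact_log_evidence(data, prior_var=4.0):
    """Closed form: x ~ N(0, I + prior_var * 11')."""
    n, s = len(data), sum(data)
    ss = sum(x * x for x in data)
    quad = ss - prior_var * s * s / (1.0 + n * prior_var)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(1.0 + n * prior_var) - 0.5 * quad


data = [0.8, 1.3, -0.2, 2.1, 0.9]  # made-up observations
for theta_star in (0.3, -1.7, 1.0):
    print(theta_star, chib_log_evidence(data, theta_star), exact_log_evidence(data))
```

In mixture models the posterior density in the denominator is not available, which is why the slides replace it with an estimate; the choice of a high-posterior $\theta^*$ such as the MAP then matters for the variance of that estimate, not for the identity itself.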
  19-21. Case of latent variables. For missing variable $z$ as in mixture models, natural Rao-Blackwell estimate $\hat{\pi}_k(\theta_k^* \mid x) = \dfrac{1}{T} \sum_{t=1}^{T} \pi_k(\theta_k^* \mid x, z_k^{(t)})$, where the $z_k^{(t)}$’s are Gibbs sampled latent variables. But convergence impaired by lack of label switching. (C.) Simple MCMC does not work.
  22-23. Compensation for label switching. For mixture models, $z_k^{(t)}$ usually fails to visit all configurations in a balanced way, despite the symmetry predicted by the theory, $\pi_k(\theta_k \mid x) = \pi_k(\rho(\theta_k) \mid x) = \dfrac{1}{k!} \sum_{\rho \in S_k} \pi_k(\rho(\theta_k) \mid x)$ for all $\rho$’s in $S_k$, set of all permutations of $\{1, \ldots, k\}$. Consequences on numerical approximation, biased by an order $k!$. Recover the theoretical symmetry by using $\tilde{\pi}_k(\theta_k^* \mid x) = \dfrac{1}{T\,k!} \sum_{\rho \in S_k} \sum_{t=1}^{T} \pi_k(\rho(\theta_k^*) \mid x, z_k^{(t)})$. [Berkhof, Mechelen, & Gelman, 2003]
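The plain and symmetrised Rao-Blackwell estimates can be compared on a toy case. The sketch below is a minimal illustration, assuming a mixture with known weights, unit variances and independent $N(0, 10)$ priors on the means, so that $\pi_k(\theta_k \mid x, z)$ is a product of normal densities. The data and the three allocation vectors are made up to mimic a Gibbs run that never switches labels, and the function names are hypothetical.

```python
import itertools
import math


def log_normal_pdf(x, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * (x - mean) ** 2 / var


def logsumexp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_cond_posterior(theta, data, z, prior_var=10.0):
    """log pi(theta | x, z): independent normal conditionals, one per component."""
    total = 0.0
    for j, mu_j in enumerate(theta):
        members = [x for x, zi in zip(data, z) if zi == j]
        post_var = 1.0 / (len(members) + 1.0 / prior_var)
        total += log_normal_pdf(mu_j, post_var * sum(members), post_var)
    return total


def log_rao_blackwell(theta_star, data, allocations, symmetrise):
    """Log of the average of pi(rho(theta*) | x, z^(t)) over t and, optionally, over rho."""
    k = len(theta_star)
    perms = list(itertools.permutations(range(k))) if symmetrise else [tuple(range(k))]
    terms = [
        log_cond_posterior([theta_star[i] for i in perm], data, z)
        for perm in perms
        for z in allocations
    ]
    return logsumexp(terms) - math.log(len(terms))


data = [-0.2, 0.1, 0.3, 2.4, 2.6, 3.1]  # made-up observations
allocations = [  # made-up allocations, all with the same labelling
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
theta_star = [0.0, 2.7]
plain = log_rao_blackwell(theta_star, data, allocations, symmetrise=False)
sym = log_rao_blackwell(theta_star, data, allocations, symmetrise=True)
print("plain:", plain, "symmetrised:", sym, "difference:", plain - sym)
```

When the chain has visited a single labelling and the modes are well separated, the permuted terms contribute almost nothing, so the symmetrised estimate is smaller than the plain one by about $\log(k!)$, and the resulting evidence estimate is larger by the same amount. The symmetrised estimate is also unchanged when the components of $\theta_k^*$ are permuted.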
  24-25. Galaxy dataset. Using only the original estimate, with $\theta_k^*$ as the MAP estimator, $\log(\hat{m}_k(x)) = -105.1396$ for $k = 3$ (based on $10^3$ simulations), while introducing the permutations leads to $\log(\hat{m}_k(x)) = -103.3479 = -105.1396 + \log(3!)$.
      k                        2        3        4        5        6        7        8
      log m_k(x) estimate  -115.68  -103.35  -102.66  -101.93  -102.88  -105.48  -108.44
      Estimations of the marginal likelihoods by the symmetrised Chib’s approximation (based on $10^5$ Gibbs iterations and, for $k > 5$, 100 permutations selected at random in $S_k$). [Lee, Marin, Mengersen & Robert, 2008]
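The correction reported for $k = 3$ is simple arithmetic and can be reproduced directly. The two numbers below are copied from the slide; the variable names are only illustrative.

```python
import math

log_m_original = -105.1396  # unsymmetrised Chib estimate for k = 3 (from the slide)
log_m_reported = -103.3479  # symmetrised value reported on the slide
k = 3

correction = math.log(math.factorial(k))  # log(3!) = log(6)
log_m_symmetrised = log_m_original + correction
print(f"log(3!) = {correction:.4f}")
print(f"corrected estimate = {log_m_symmetrised:.4f}")
print(f"gap to reported value = {abs(log_m_symmetrised - log_m_reported):.1e}")
```

The recomputed value agrees with the reported one up to the rounding of the inputs to four decimals. Since $\log(k!)$ grows quickly with $k$, leaving the correction out would penalise larger models more heavily when comparing the entries of the table.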
  26. Comparison between evidence approximations.
      (1) Nested sampling: $M = 1000$ points, with 10 random walk moves at each step, simulations from the constrained prior and a stopping rule at 95% of the observed maximum likelihood
      (2) $T = 10^4$ MCMC (=Gibbs) simulations producing non-parametric estimates $\varphi$
      (3) Monte Carlo estimates $Z_1$, $Z_2$, $Z_3$ using product of two Gaussian kernels
      (4) numerical integration based on $850 \times 950$ grid [reference value, confirmed by Chib’s]
  27-29. Comparison (cont’d). Graphs based on samples of 10 observations (slide 27), 50 observations (slide 28) and 100 observations (slide 29) for $\mu = 2$ and $\sigma = 3/2$ (150 replicas). V1=Nested sampling, V2=importance sampling, V3=harmonic mean, V4=bridge sampling. [Chopin & Robert, 2010]
  30. Comparison (cont’d). [Lee, Marin, Mengersen & Robert, 2010]
