Variational Bayes
                       VBmix
                    Summary




            Variational Bayes
         using the R package VBmix


       Matt Moores              Zoé van Havre

      Bayesian Research & Applications Group
Queensland University of Technology, Brisbane, Australia
            CRICOS provider no. 00213J


         Thursday October 11, 2012




               BRAG Oct. 11      Variational Bayes


Outline



  1   Variational Bayes
        Introduction
        univariate Gaussian
        mixture of Gaussians


  2   VBmix






Exact Inference

  When the posterior distribution is analytically tractable
      eg. Normal distribution with natural conjugate priors

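  $$p(\theta \mid Y) = p(\mu, \sigma^2 \mid Y) = p(\mu \mid \sigma^2, Y)\, p(\sigma^2 \mid Y) \qquad (1)$$
  $$\sim \mathcal{N}\!\left(m', \frac{\sigma^2}{\nu'}\right) \mathrm{IG}(a', b') \qquad (2)$$

  where
  $$\nu' = \nu_0 + n$$
  $$m' = \frac{1}{\nu'} \left( \nu_0 m_0 + n \bar{y} \right)$$
  $$a' = a_0 + \frac{n}{2}$$
  $$b' = b_0 + \frac{1}{2} \left[ \sum_{i=1}^{n} (y_i - \bar{y})^2 + \frac{\nu_0 n (\bar{y} - m_0)^2}{\nu_0 + n} \right]$$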
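
The conjugate update in equations (1) and (2) can be sketched in R as follows. This is a minimal illustration rather than code from the talk: the function name `nig_posterior` and the default prior values are assumptions, and the factor 1/2 is applied to both data terms in b′, as in the standard conjugate result.

```r
# Hypothetical helper (not from the slides): exact conjugate update for a
# Normal likelihood with a Normal-inverse-gamma prior.
# m0, nu0, a0, b0 are the prior hyperparameters; the defaults are arbitrary.
nig_posterior <- function(y, m0 = 0, nu0 = 1, a0 = 1, b0 = 1) {
  n <- length(y)
  ybar <- mean(y)
  nu1 <- nu0 + n                                  # nu'
  m1 <- (nu0 * m0 + n * ybar) / nu1               # m'
  a1 <- a0 + n / 2                                # a'
  b1 <- b0 + 0.5 * (sum((y - ybar)^2) +
                    nu0 * n * (ybar - m0)^2 / nu1) # b'
  list(m = m1, nu = nu1, a = a1, b = b1)
}

# Tiny worked example with three observations
post <- nig_posterior(c(1, 2, 3))
print(post)
```

With these three observations and the default prior, the posterior mean m′ is pulled from the sample mean 2 towards the prior mean 0.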




Approximate Inference



  Stochastic approximation
      Markov chain Monte Carlo
  Analytic approximation
      expectation propagation
      Laplace approximation
      variational Bayes






Variational Bayes


         VB is derived from the calculus of variations
         (Euler, Lagrange, et al.)
              integration and differentiation of functionals
              (functions of functions)
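         Kullback-Leibler (KL) divergence
              measures the discrepancy between our approximation q(θ)
              and the true posterior distribution p(θ|Y)
              (it is not symmetric, so it is not a true distance)

              $$\mathrm{KL}(q \,\|\, p) = -\int q(\theta) \ln \frac{p(\theta \mid Y)}{q(\theta)} \, d\theta \qquad (3)$$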
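
As a small numerical illustration of equation (3), the sketch below evaluates the KL divergence between two univariate Gaussians by quadrature and compares it with the well-known closed form. The function names `kl_numeric` and `kl_exact` and the example parameters are assumptions made for this illustration only.

```r
# KL(q || p) for q = N(mq, sq^2) and p = N(mp, sp^2), by numerical integration.
# The integration range of +/- 10 standard deviations of q is an assumption
# that is ample for Gaussian tails.
kl_numeric <- function(mq, sq, mp, sp) {
  integrand <- function(x) {
    dnorm(x, mq, sq) *
      (dnorm(x, mq, sq, log = TRUE) - dnorm(x, mp, sp, log = TRUE))
  }
  integrate(integrand, lower = mq - 10 * sq, upper = mq + 10 * sq)$value
}

# Closed form for two univariate Gaussians, used as a check
kl_exact <- function(mq, sq, mp, sp) {
  log(sp / sq) + (sq^2 + (mq - mp)^2) / (2 * sp^2) - 0.5
}

kl_qp <- kl_numeric(0, 1, 1, 2)  # KL(q || p)
kl_pq <- kl_numeric(1, 2, 0, 1)  # KL(p || q): a different value
print(c(kl_qp = kl_qp, kl_pq = kl_pq))
```

The two directions give different values, which is why the choice of KL(q||p) rather than KL(p||q) matters for the resulting approximation.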


Kullback & Leibler (1951) On Information and Sufficiency



Mean Field Variational Bayes



    If the posterior distribution is analytically intractable,
    approximate it using a distribution that is tractable
          eg. using mean field theory:
          $$q(\theta) = \prod_{m=1}^{M} q_m(\theta_m) \qquad (4)$$

    then minimise the KL divergence using convex optimisation
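
For reference, the standard identities behind this step (see e.g. §10.1 of Bishop (2006)) can be written as below. The symbols $\mathcal{L}(q)$ for the lower bound and $\mathbb{E}_{-m}$ for an expectation over every factor except $q_m$ are notation introduced here, not taken from the slides.

```latex
% Decomposition of the log evidence: ln p(Y) does not depend on q,
% so minimising KL(q || p) is the same as maximising the lower bound L(q).
\ln p(Y) = \mathcal{L}(q) + \mathrm{KL}(q \,\|\, p),
\qquad
\mathcal{L}(q) = \int q(\theta) \ln \frac{p(Y, \theta)}{q(\theta)} \, d\theta

% Optimal factor under the mean field factorisation (4),
% holding all the other factors fixed:
\ln q_m^{*}(\theta_m) = \mathbb{E}_{-m}\!\left[ \ln p(Y, \theta) \right] + \text{const}
```

Cycling through the factors and applying the second identity to each in turn gives the iterative scheme used in the rest of the talk.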




Parisi (1988) Statistical Field Theory



VB for the univariate Gaussian distribution
  The exact posterior distribution is analytically tractable
  (see equation 1):

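  $$p(\mu, \sigma^2 \mid Y) = p(\mu \mid \sigma^2, Y)\, p(\sigma^2 \mid Y)$$

  but for the purpose of illustration:

  $$q(\mu, \sigma^2) = q_\mu(\mu) \times q_{\sigma^2}(\sigma^2)$$
  $$q_\mu(\mu) \sim \mathcal{N}\!\left( \frac{\nu_0 m_0 + n \bar{y}}{\nu_0 + n},\; \frac{\mathrm{E}[\sigma^2]}{\nu_0 + n} \right)$$
  $$q_{\sigma^2}(\sigma^2) \sim \mathrm{IG}\!\left( a_0 + \frac{n}{2},\; b_0 + \frac{1}{2} \mathrm{E}_\mu\!\left[ \sum_{i=1}^{n} (y_i - \mu)^2 + \nu_0 (\mu - m_0)^2 \right] \right)$$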

  this lends itself to estimation via a variant of the EM algorithm
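
The expectation over µ in the second parameter of the inverse-gamma factor can be expanded with the identity E[(c − µ)²] = (c − E[µ])² + Var[µ]. The worked step below is a sketch added for clarity; it uses only the moments of the Gaussian factor for µ.

```latex
\mathrm{E}_\mu\!\left[ \sum_{i=1}^{n} (y_i - \mu)^2 + \nu_0 (\mu - m_0)^2 \right]
  = \sum_{i=1}^{n} \left( y_i - \mathrm{E}[\mu] \right)^2
  + \nu_0 \left( \mathrm{E}[\mu] - m_0 \right)^2
  + (n + \nu_0)\, \mathrm{Var}[\mu]
```

With ν0 = 0 and Var[µ] = 1/(n E[τ]), where τ = 1/σ² is the precision, the last term reduces to 1/E[τ], which appears to be the origin of the `1 / Etau` term in the R code on the next slide.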


R code for VB
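  # y, n, the starting values of m_vb, a_vb, b_vb, LB and oldLB, and the
  # function calcLowerBound() are defined elsewhere (not shown on the slide)
  while (LB - oldLB > 0.1) {
    # E-step: expectations under the current approximation
    Emu <- m_vb
    Etau <- a_vb / b_vb
    # M-step: update the variational parameters
    m_vb <- mean(y)
    n_vb <- n
    a_vb <- n / 2
    b_vb <- (sum((y - Emu)^2) + 1 / Etau) / 2
    # check convergence of the lower bound
    oldLB <- LB
    LB <- calcLowerBound(m_vb, n_vb, a_vb, b_vb)
  }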
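
A self-contained variant of the loop above is sketched below so that it can be run as is. It rests on several assumptions: a noninformative prior (consistent with `a_vb <- n / 2` on the slide), a hypothetical function name `vb_gaussian`, and convergence monitored on the change in `b_vb` rather than on the lower bound, since `calcLowerBound()` is not shown in the talk.

```r
# Sketch of VB for a univariate Gaussian with unknown mean and precision.
# Not the authors' implementation: the stopping rule uses the change in b_vb
# as a stand-in for the change in the lower bound.
vb_gaussian <- function(y, tol = 1e-10, max_iter = 1000) {
  n <- length(y)
  m_vb <- mean(y)  # mean of q(mu); fixed by the data under this prior
  a_vb <- n / 2    # shape of q(tau); also fixed
  b_vb <- 1        # rate of q(tau); arbitrary starting value
  for (iter in seq_len(max_iter)) {
    # E-step: expectations under the current approximation
    Emu <- m_vb
    Etau <- a_vb / b_vb
    # M-step: only b_vb changes between iterations
    b_new <- (sum((y - Emu)^2) + 1 / Etau) / 2
    converged <- abs(b_new - b_vb) < tol
    b_vb <- b_new
    if (converged) break
  }
  list(m = m_vb, n = n, a = a_vb, b = b_vb, iterations = iter)
}

set.seed(42)
y <- rnorm(100, mean = 1, sd = 2)  # simulated data for illustration
fit <- vb_gaussian(y)
print(c(E_mu = fit$m, E_tau = fit$a / fit$b, iterations = fit$iterations))
```

Under these assumptions the update for `b_vb` is a contraction with factor 1/n, so it settles within a few iterations, and at the fixed point E[τ] = a_vb / b_vb equals 1 / var(y).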



VB in action


  [Figure: three frames in the (µ, τ) plane, with µ on the horizontal axis and
  τ on the vertical axis, both from 0.0 to 2.0]
      iteration 0
      iteration 1: bound is −100.6
      iteration 2: bound is −100.2




Gaussian Mixture Model
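  Likelihood function:
  $$p(y \mid \lambda, \mu, \sigma^2) = \prod_{i=1}^{n} \sum_{j=1}^{k} \lambda_j \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\!\left( -\frac{(y_i - \mu_j)^2}{2\sigma_j^2} \right)$$

  where
  $$\sum_{j=1}^{k} \lambda_j = 1$$

  Natural conjugate priors:
  $$p(\lambda) \sim \mathrm{Dirichlet}(\alpha)$$
  $$p(\mu_j \mid \sigma_j^2) \sim \mathcal{N}\!\left( m_j, \frac{\sigma_j^2}{\nu_j} \right)$$
  $$p(\sigma_j^2) \sim \mathrm{IG}(a_j, b_j)$$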
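
The likelihood above can be evaluated on the log scale as in the sketch below. The function name `gmm_loglik` and the example values are assumptions made for illustration; the log-sum-exp step is a common numerical safeguard and is not something the slides prescribe.

```r
# Log-likelihood of a univariate Gaussian mixture.
# lambda: mixture weights (summing to 1); mu, sigma2: component means, variances.
gmm_loglik <- function(y, lambda, mu, sigma2) {
  # n x k matrix with entries log(lambda_j) + log N(y_i | mu_j, sigma2_j)
  logp <- sapply(seq_along(lambda), function(j) {
    log(lambda[j]) + dnorm(y, mu[j], sqrt(sigma2[j]), log = TRUE)
  })
  logp <- matrix(logp, nrow = length(y))  # keep the shape when n = 1
  # log-sum-exp over components for each observation, then sum over i
  row_max <- apply(logp, 1, max)
  sum(row_max + log(rowSums(exp(logp - row_max))))
}

y <- c(-1.2, 0.3, 2.5, 1.9)  # made-up observations
ll <- gmm_loglik(y, lambda = c(0.3, 0.7), mu = c(-1, 2), sigma2 = c(1, 0.5))
print(ll)
```

The product over observations becomes a sum of logs, while the inner sum over components stays inside the logarithm, which is what makes the exact posterior hard to work with.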


Exact Inference for GMM


    Complexity of the posterior distribution is O(k^n)
         computationally infeasible for more than a small handful of
         observations and mixture components
         back of the envelope:
              if k = 2 and n = 50, it would take approximately 15 min
              on an NVIDIA Tesla M2050 (1288 GFLOPS peak throughput)
              if k = 2 and n = 100, it would take about 31 billion years
    For EM, Gibbs sampling and Variational Bayes, we approximate
    the posterior by introducing a matrix Z of indicator variables,
    such that zij = 1 if yi has the label j, and zij = 0 otherwise.
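
The back-of-the-envelope figures can be reproduced with the arithmetic below. It assumes, as a simplification, one floating-point operation per term of the k^n expansion and sustained peak throughput; the variable and function names are illustrative.

```r
# Rough cost of enumerating all k^n label allocations
flops <- 1288e9  # peak throughput quoted on the slide, in operations per second

seconds_needed <- function(k, n) k^n / flops

minutes_50 <- seconds_needed(2, 50) / 60
years_100 <- seconds_needed(2, 100) / (60 * 60 * 24 * 365.25)

print(c(minutes_50 = minutes_50, years_100 = years_100))
```

Doubling n from 50 to 100 multiplies the cost by 2^50, which is what turns minutes into tens of billions of years.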

Robert & Mengersen (2011) Exact Bayesian analysis of mixtures



Variational Bayes for GMM
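  mean field approximation:
  $$q(\theta) = q(Z) \times q(\lambda) \prod_{j=1}^{k} q(\mu_j \mid \sigma_j^2)\, q(\sigma_j^2)$$

  Variational E-step:
  $$q(Z) = \prod_{i=1}^{n} \prod_{j=1}^{k} \rho_{ij}^{z_{ij}}$$
  $$\rho_{ij} = \frac{\omega_{ij}}{\sum_{x=1}^{k} \omega_{ix}}$$
  $$\log \omega_{ij} = \mathrm{E}[\log \lambda_j] - \frac{1}{2} \mathrm{E}[\log \sigma_j^2] - \frac{1}{2} \log 2\pi - \frac{1}{2} \mathrm{E}_{\mu_j, \sigma_j^2}\!\left[ \frac{(y_i - \mu_j)^2}{\sigma_j^2} \right]$$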
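
The normalisation step of the E-step, which turns the log ω values into responsibilities ρ, can be sketched as follows. The function name `normalise_resp` and the example matrix are assumptions; subtracting the row maximum before exponentiating is a standard safeguard against underflow and is not taken from the slides.

```r
# Convert an n x k matrix of log(omega_ij) into responsibilities rho_ij,
# so that each row sums to 1.
normalise_resp <- function(log_omega) {
  # subtract the row-wise maximum so that exp() cannot underflow to all zeros
  shifted <- log_omega - apply(log_omega, 1, max)
  omega <- exp(shifted)
  omega / rowSums(omega)
}

# Made-up values: the first row would underflow without the shift
log_omega <- matrix(c(-1000, -1001,
                         -5,    -3), nrow = 2, byrow = TRUE)
rho <- normalise_resp(log_omega)
print(rho)
```

Only differences between the entries of a row matter, so the first row yields the same responsibilities as the pair (0, −1) would.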


Variational Bayes for GMM, continued

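  M-step:
  $$\hat{n}_j = \sum_{i=1}^{n} \rho_{ij} \qquad \bar{y}_j = \frac{1}{\hat{n}_j} \sum_{i=1}^{n} \rho_{ij} y_i \qquad \hat{s}_j^2 = \frac{1}{\hat{n}_j} \sum_{i=1}^{n} \rho_{ij} (y_i - \bar{y}_j)^2$$
  $$q(\sigma_j^2) \sim \mathrm{IG}\!\left( a_0 + \frac{\hat{n}_j}{2},\; b_0 + \frac{\hat{n}_j \hat{s}_j^2}{2} + \frac{\nu_0 \hat{n}_j (\bar{y}_j - m_0)^2}{2(\nu_0 + \hat{n}_j)} \right)$$
  $$q(\mu_j \mid \sigma_j^2) \sim \mathcal{N}\!\left( \frac{\nu_0 m_0 + \hat{n}_j \bar{y}_j}{\nu_0 + \hat{n}_j},\; \frac{\sigma_j^2}{\nu_0 + \hat{n}_j} \right)$$
  $$q(\lambda_1, \ldots, \lambda_k) \sim \mathrm{Dirichlet}(\alpha_0 + \hat{n}_1, \ldots, \alpha_0 + \hat{n}_k)$$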
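
The M-step updates can be written compactly in R. The sketch below is an illustration under assumed names: the function `m_step`, its argument names and the default prior values are not from the talk, and the responsibilities in the example are hard 0/1 assignments chosen so that the result is easy to check by hand.

```r
# M-step for a univariate Gaussian mixture under mean field VB.
# y: observations (length n); rho: n x k matrix of responsibilities.
m_step <- function(y, rho, m0 = 0, nu0 = 1, a0 = 1, b0 = 1, alpha0 = 1) {
  nj <- colSums(rho)                                   # effective counts
  ybar <- colSums(rho * y) / nj                        # weighted means
  s2 <- colSums(rho * outer(y, ybar, "-")^2) / nj      # weighted variances
  list(
    alpha = alpha0 + nj,                               # Dirichlet parameters
    nu = nu0 + nj,
    m = (nu0 * m0 + nj * ybar) / (nu0 + nj),           # means of q(mu_j | .)
    a = a0 + nj / 2,                                   # IG shapes
    b = b0 + nj * s2 / 2 +
      nu0 * nj * (ybar - m0)^2 / (2 * (nu0 + nj))      # IG scales
  )
}

y <- c(1, 2, 3, 10, 12)
rho <- cbind(c(1, 1, 1, 0, 0), c(0, 0, 0, 1, 1))  # hard assignments
upd <- m_step(y, rho)
print(upd)
```

With hard assignments each component reduces to the single-Gaussian conjugate update applied to its own observations.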





VBmix



    An R package by Pierrick Bruneau
         Variational Bayesian inference for mixtures of Gaussians
              see §10.2 of Bishop (2006)
         open source (GPL v3)
         implemented in C using the GNU Scientific Library (GSL)
         Windows binary unavailable on CRAN




Christopher M. Bishop (2006) Pattern Recognition and Machine Learning



VBmix for Fisher’s iris data
  install.packages("VBmix")  # requires GSL, Qt, fftw3
  library(VBmix)

  # 3 component mixture of multivariate Gaussians
  fit_vb <- varbayes(irisdata, ncomp=20)
  factor(ZtoLabels(fit_vb$model$resp))

  # ground truth
  irislabels

  # fit GMM using maximum likelihood, for comparison
  fit_em <- classicEM(irisdata, 4)
  fit_em$labels



Summary


 VB is an analytic approximation to the posterior distribution
     suited to standard models with natural conjugate priors
          update equations derived using calculus of variations
          to minimise the KL divergence
     algorithm resembles Expectation-Maximisation (EM)
          can become stuck on suboptimal local maxima
     tends to underestimate the uncertainty in the posterior
 The R package VBmix provides fast, approximate inference
 for mixtures of multivariate Gaussians.




Appendix   For Further Reading



For Further Reading I

     Christopher M. Bishop
     Pattern Recognition and Machine Learning.
     Springer, 2006.
     John Ormerod & Matt Wand
     Explaining Variational Approximations.
     The American Statistician, 64(2): 140–153, 2010.
     Mike Jordan, Zoubin Ghahramani, Tommi Jaakkola, & Lawrence Saul
     An Introduction to Variational Methods for Graphical Models.
     Machine Learning, 37: 183–233, 1999.
     Pierrick Bruneau, Marc Gelgon & Fabien Picarougne
     Parsimonious reduction of Gaussian mixture models with a
     variational-Bayes approach.
     Pattern Recognition, 43(3): 850–858, 2010.






For Further Reading II

     Clare McGrory & Mike Titterington
     Variational approximations in Bayesian model selection for finite mixture
     distributions.
     Computational Statistics & Data Analysis, 51: 5352–5367, 2007.
     Solomon Kullback & Richard Leibler
     On Information and Sufficiency.
     The Annals of Mathematical Statistics, 22: 79–86, 1951.
     Giorgio Parisi
     Statistical Field Theory.
     Addison-Wesley, 1988.
     Christian Robert & Kerrie Mengersen
     Exact Bayesian analysis of mixtures
     In Mengersen, Robert & Titterington (eds.)
     Mixtures: Estimation and Applications
     John Wiley & Sons, 2011.

Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

Variational Bayes

  • 1. Variational Bayes VBmix Summary Variational Bayes using the R package VBmix Matt Moores Zoé van Havre Bayesian Research & Applications Group Queensland University of Technology, Brisbane, Australia CRICOS provider no. 00213J Thursday October 11, 2012 BRAG Oct. 11 Variational Bayes
  • 2. Outline
      1. Variational Bayes: introduction; univariate Gaussian; mixture of Gaussians
      2. VBmix
  • 3. Exact Inference
      When the posterior distribution is analytically tractable, e.g. a Normal distribution with natural conjugate priors:
      $$p(\theta \mid Y) = p(\mu, \sigma^2 \mid Y) = p(\mu \mid \sigma^2, Y)\, p(\sigma^2 \mid Y) \qquad (1)$$
      $$\sim \mathcal{N}\!\left(m', \frac{\sigma^2}{\nu'}\right) \mathrm{IG}(a', b') \qquad (2)$$
      where
      $$\nu' = \nu_0 + n$$
      $$m' = \frac{1}{\nu'} \left(\nu_0 m_0 + n \bar{y}\right)$$
      $$a' = a_0 + \frac{n}{2}$$
      $$b' = b_0 + \frac{1}{2} \left[ \sum_{i=1}^{n} (y_i - \bar{y})^2 + \frac{\nu_0 n (\bar{y} - m_0)^2}{\nu_0 + n} \right]$$
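The conjugate update on the Exact Inference slide can be sketched in a few lines of R. This is a minimal illustration, not code from the talk: the function name `conjugate_update`, its argument names, and the toy data are assumptions made here, and the expression for b′ assumes the factor of 1/2 applies to both the sum of squares and the prior term, as in the mixture M-step later in the deck.

```r
# Posterior hyperparameters for Normal data with a Normal-inverse-gamma prior:
#   mu | sigma^2 ~ N(m0, sigma^2 / nu0),  sigma^2 ~ IG(a0, b0)
# (function and argument names are illustrative, not from the slides)
conjugate_update <- function(y, m0, nu0, a0, b0) {
  n    <- length(y)
  ybar <- mean(y)
  nu1  <- nu0 + n                              # nu'
  m1   <- (nu0 * m0 + n * ybar) / nu1          # m'
  a1   <- a0 + n / 2                           # a'
  b1   <- b0 + 0.5 * (sum((y - ybar)^2) +
                      nu0 * n * (ybar - m0)^2 / (nu0 + n))  # b'
  list(nu = nu1, m = m1, a = a1, b = b1)
}

# Toy data, chosen only for illustration
y    <- c(1, 2, 3)
post <- conjugate_update(y, m0 = 0, nu0 = 1, a0 = 1, b0 = 1)
str(post)
```

Because the posterior has the same form as the prior, the returned list can be fed back in as the prior for a further batch of data.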
  • 4. Approximate Inference
      Stochastic approximation:
        - Markov chain Monte Carlo
      Analytic approximation:
        - expectation propagation
        - Laplace approximation
        - variational Bayes
  • 5. Variational Bayes
      VB is derived from the calculus of variations (Euler, Lagrange, et al.): integration and differentiation of functionals (functions of functions).
      The Kullback-Leibler (KL) divergence measures the discrepancy between our approximation q(θ) and the true posterior distribution p(θ|Y):
      $$\mathrm{KL}(q \,\|\, p) = -\int q(\theta) \ln \frac{p(\theta \mid Y)}{q(\theta)} \, d\theta \qquad (3)$$
      Kullback & Leibler (1951) On Information and Sufficiency
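As a small numerical check of equation (3), the sketch below evaluates the KL divergence between two univariate Gaussians by quadrature and compares it with the standard closed form. The two Gaussians are stand-ins chosen here for illustration (in practice p(θ|Y) is the posterior, which is usually not available in this form), and the function names are hypothetical.

```r
# KL(q || p) = integral of q(theta) * (log q(theta) - log p(theta)) d theta
kl_numeric <- function(m_q, s_q, m_p, s_p) {
  integrand <- function(theta) {
    dnorm(theta, m_q, s_q) *
      (dnorm(theta, m_q, s_q, log = TRUE) - dnorm(theta, m_p, s_p, log = TRUE))
  }
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

# Closed form for two univariate Gaussians (s_q, s_p are standard deviations)
kl_closed <- function(m_q, s_q, m_p, s_p) {
  log(s_p / s_q) + (s_q^2 + (m_q - m_p)^2) / (2 * s_p^2) - 0.5
}

kl_numeric(0, 1, 1, 2)
kl_closed(0, 1, 1, 2)
# Swapping the arguments gives a different value: KL is not symmetric
kl_closed(1, 2, 0, 1)
```

The asymmetry matters for VB, which minimises KL(q‖p) rather than KL(p‖q).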
  • 6. Mean Field Variational Bayes
      If the posterior distribution is analytically intractable, approximate it using a distribution that is tractable, e.g. using mean field theory:
      $$q(\theta) = \prod_{m=1}^{M} q_m(\theta_m) \qquad (4)$$
      then minimise the KL divergence using convex optimisation.
      Parisi (1988) Statistical Field Theory
  • 7. VB for the univariate Gaussian distribution
      The exact posterior distribution is analytically tractable (see equation 1):
      $$p(\mu, \sigma^2 \mid Y) = p(\mu \mid \sigma^2, Y)\, p(\sigma^2 \mid Y)$$
      but for the purpose of illustration:
      $$q(\mu, \sigma^2) = q_\mu(\mu) \times q_{\sigma^2}(\sigma^2)$$
      $$q_\mu(\mu) \sim \mathcal{N}\!\left( \frac{\nu_0 m_0 + n \bar{y}}{\nu_0 + n},\ \frac{\mathrm{E}[\sigma^2]}{\nu_0 + n} \right)$$
      $$q_{\sigma^2}(\sigma^2) \sim \mathrm{IG}\!\left( a_0 + \frac{n}{2},\ b_0 + \frac{1}{2} \mathrm{E}_\mu\!\left[ \sum_{i=1}^{n} (y_i - \mu)^2 + \nu_0 (\mu - m_0)^2 \right] \right)$$
      This lends itself to estimation via a variant of the EM algorithm.
  • 8. R code for VB
      # y (the data), n = length(y), the starting values of m_vb, a_vb, b_vb,
      # LB and oldLB, and the helper calcLowerBound() are not shown on the slide
      while (LB - oldLB > 0.1) {
        # E-step: expectations under the current approximation
        Emu  <- m_vb
        Etau <- a_vb / b_vb
        # M-step: update the parameters of q(mu) and q(tau)
        m_vb <- mean(y)
        n_vb <- n
        a_vb <- n / 2
        b_vb <- (sum((y - Emu)^2) + 1 / Etau) / 2
        # check convergence of the lower bound
        oldLB <- LB
        LB <- calcLowerBound(m_vb, n_vb, a_vb, b_vb)
      }
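The loop on this slide depends on starting values and a `calcLowerBound()` helper that the slide does not show. The sketch below is a self-contained variant under stated assumptions: it keeps the slide's update equations (which correspond to setting the prior hyperparameters to zero), picks arbitrary starting values, and replaces the lower-bound test with a check on the change in E[τ]. The function name `vb_gaussian` and its arguments are hypothetical.

```r
vb_gaussian <- function(y, tol = 1e-8, max_iter = 100) {
  n <- length(y)
  # Starting values are an assumption; the slide does not show its own
  m_vb <- 0
  n_vb <- n
  a_vb <- n / 2
  b_vb <- 1
  for (iter in seq_len(max_iter)) {
    # E-step: expectations under the current approximation
    Emu  <- m_vb
    Etau <- a_vb / b_vb
    # M-step: same updates as on the slide
    m_vb <- mean(y)
    n_vb <- n
    a_vb <- n / 2
    b_vb <- (sum((y - Emu)^2) + 1 / Etau) / 2
    # Convergence: change in E[tau], swapped in for the lower-bound check
    if (abs(a_vb / b_vb - Etau) < tol) break
  }
  list(m = m_vb, n = n_vb, a = a_vb, b = b_vb, iterations = iter)
}

set.seed(42)                        # simulated data, for illustration only
y   <- rnorm(50, mean = 1, sd = 2)
fit <- vb_gaussian(y)
c(mean = fit$m, Etau = fit$a / fit$b, iterations = fit$iterations)
```

With these particular updates the fixed point satisfies E[τ] = (n − 1) / Σ(yᵢ − ȳ)², the reciprocal of the sample variance, which gives a quick sanity check on the result.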
  • 9. VB in action [Figure: iteration 0; axes µ and τ, each from 0.0 to 2.0]
  • 10. VB in action [Figure: iteration 1, bound is −100.6; axes µ and τ, each from 0.0 to 2.0]
  • 11. VB in action [Figure: iteration 2, bound is −100.2; axes µ and τ, each from 0.0 to 2.0]
  • 12. Gaussian Mixture Model
      Likelihood function:
      $$p(y \mid \lambda, \mu, \sigma^2) = \prod_{i=1}^{n} \sum_{j=1}^{k} \lambda_j \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\!\left( -\frac{(y_i - \mu_j)^2}{2\sigma_j^2} \right)$$
      where
      $$\sum_{j=1}^{k} \lambda_j = 1$$
      Natural conjugate priors:
      $$p(\lambda) \sim \mathrm{Dirichlet}(\alpha)$$
      $$p(\mu_j \mid \sigma_j^2) \sim \mathcal{N}\!\left( m_j, \frac{\sigma_j^2}{\nu_j} \right)$$
      $$p(\sigma_j^2) \sim \mathrm{IG}(a_j, b_j)$$
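The mixture likelihood on this slide can be evaluated directly for fixed parameter values. The following sketch computes its logarithm; the function name `gmm_loglik` and the example values are assumptions made here for illustration.

```r
# Log of the Gaussian mixture likelihood:
#   sum_i log( sum_j lambda_j * N(y_i | mu_j, sigma2_j) )
gmm_loglik <- function(y, lambda, mu, sigma2) {
  stopifnot(abs(sum(lambda) - 1) < 1e-8)      # mixture weights must sum to 1
  # n x k matrix with entries lambda_j * N(y_i | mu_j, sigma2_j)
  dens <- sapply(seq_along(lambda), function(j) {
    lambda[j] * dnorm(y, mean = mu[j], sd = sqrt(sigma2[j]))
  })
  dens <- matrix(dens, nrow = length(y))      # keep matrix shape when n == 1
  sum(log(rowSums(dens)))
}

# Two-component example with made-up values
y <- c(-1.2, 0.3, 4.8, 5.1)
gmm_loglik(y, lambda = c(0.5, 0.5), mu = c(0, 5), sigma2 = c(1, 1))
```

Each evaluation costs O(nk); the difficulty discussed on the next slide arises only when the product of sums is expanded to obtain the exact posterior.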
  • 13. Exact Inference for GMM
      Complexity of the posterior distribution is $O(k^n)$:
        - computationally infeasible for more than a small handful of observations and mixture components
        - back of the envelope:
            if k = 2 and n = 50, it would take approximately 15 min on an NVIDIA Tesla M2050 (1288 GFLOPS peak throughput)
            if k = 2 and n = 100, it would take 31 billion years
      For EM, Gibbs sampling and Variational Bayes, we approximate the posterior by introducing a matrix Z of indicator variables, such that $z_{ij} = 1$ if $y_i$ has the label j, and $z_{ij} = 0$ otherwise.
      Robert & Mengersen (2011) Exact Bayesian analysis of mixtures
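The back-of-the-envelope figures on this slide can be reproduced with a few lines of arithmetic. The sketch assumes one floating-point operation per term of the k^n expansion and sustained peak throughput, which is a simplification introduced here rather than a claim from the slide.

```r
flops <- 1288e9                      # peak throughput quoted on the slide
seconds <- function(k, n) k^n / flops  # assumes one operation per term

minutes_n50        <- seconds(2, 50) / 60
billion_years_n100 <- seconds(2, 100) / (60 * 60 * 24 * 365.25) / 1e9

minutes_n50          # roughly 15 minutes
billion_years_n100   # roughly 31 billion years
```

Doubling n from 50 to 100 multiplies the cost by 2^50, which is why the estimate jumps from minutes to billions of years.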
  • 14. Variational Bayes for GMM
      Mean field approximation:
      $$q(\theta) = q(Z) \times q(\lambda) \prod_{j=1}^{k} q(\mu_j \mid \sigma_j^2)\, q(\sigma_j^2)$$
      Variational E-step:
      $$q(Z) = \prod_{i=1}^{n} \prod_{j=1}^{k} \rho_{ij}^{z_{ij}}$$
      $$\rho_{ij} = \frac{\omega_{ij}}{\sum_{x=1}^{k} \omega_{ix}}$$
      $$\log \omega_{ij} = \mathrm{E}[\log \lambda_j] - \frac{1}{2} \mathrm{E}[\log \sigma_j^2] - \frac{1}{2} \log 2\pi - \frac{1}{2} \mathrm{E}_{\mu_j, \sigma_j^2}\!\left[ \frac{(y_i - \mu_j)^2}{\sigma_j^2} \right]$$
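The normalisation step of the variational E-step, from log ω to ρ, is usually done on the log scale to avoid underflow. The sketch below shows one way to do this; it takes the n × k matrix of log ω values as given (computing the expectations is not shown), and the function name `responsibilities` is hypothetical.

```r
# rho_ij = omega_ij / sum_x omega_ix, computed stably from log(omega)
responsibilities <- function(log_omega) {
  # Subtract each row's maximum before exponentiating (log-sum-exp trick);
  # the shift cancels in the ratio, so rho is unchanged
  shifted <- log_omega - apply(log_omega, 1, max)
  omega   <- exp(shifted)
  omega / rowSums(omega)
}

# Made-up values: the second row would underflow to 0/0 without the shift
log_omega <- matrix(c(0,     log(3),
                      -1000, -1000), nrow = 2, byrow = TRUE)
rho <- responsibilities(log_omega)
rho
rowSums(rho)
```

Each row of the result is a probability vector over the k components for one observation.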
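  • 15. Variational Bayes for GMM, continued
      M-step:
      $$\hat{n}_j = \sum_{i=1}^{n} \rho_{ij} \qquad \bar{y}_j = \frac{1}{\hat{n}_j} \sum_{i=1}^{n} \rho_{ij} y_i \qquad \hat{s}_j^2 = \frac{1}{\hat{n}_j} \sum_{i=1}^{n} \rho_{ij} (y_i - \bar{y}_j)^2$$
      $$q(\sigma_j^2) \sim \mathrm{IG}\!\left( a_0 + \frac{\hat{n}_j}{2},\ b_0 + \frac{\hat{n}_j \hat{s}_j^2}{2} + \frac{\nu_0 \hat{n}_j (\bar{y}_j - m_0)^2}{2(\nu_0 + \hat{n}_j)} \right)$$
      $$q(\mu_j \mid \sigma_j^2) \sim \mathcal{N}\!\left( \frac{\nu_0 m_0 + \hat{n}_j \bar{y}_j}{\nu_0 + \hat{n}_j},\ \frac{\sigma_j^2}{\nu_0 + \hat{n}_j} \right)$$
      $$q(\lambda_1, \ldots, \lambda_k) \sim \mathrm{Dirichlet}(\alpha_0 + \hat{n}_1, \ldots, \alpha_0 + \hat{n}_k)$$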
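The M-step updates on this slide translate almost line for line into vectorised R. The following is a minimal sketch for the univariate case, assuming the same prior hyperparameters (m0, ν0, a0, b0, α0) for every component as on the slide; the function name `vb_mstep` and the toy inputs are illustrative.

```r
# y: numeric vector of length n; rho: n x k matrix of responsibilities
vb_mstep <- function(y, rho, m0, nu0, a0, b0, alpha0) {
  n_hat <- colSums(rho)                                    # effective counts
  y_bar <- colSums(rho * y) / n_hat                        # weighted means
  s2    <- colSums(rho * outer(y, y_bar, "-")^2) / n_hat   # weighted variances
  list(
    a     = a0 + n_hat / 2,
    b     = b0 + n_hat * s2 / 2 +
            nu0 * n_hat * (y_bar - m0)^2 / (2 * (nu0 + n_hat)),
    m     = (nu0 * m0 + n_hat * y_bar) / (nu0 + n_hat),
    nu    = nu0 + n_hat,
    alpha = alpha0 + n_hat
  )
}

# Toy example with hard assignments: two points per component
y   <- c(0, 2, 10, 12)
rho <- cbind(c(1, 1, 0, 0), c(0, 0, 1, 1))
str(vb_mstep(y, rho, m0 = 0, nu0 = 1, a0 = 1, b0 = 1, alpha0 = 1))
```

A component whose effective count is exactly zero would produce a division by zero in this sketch, so a practical implementation would need to guard against or prune empty components.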
  • 16. VBmix
      An R package by Pierrick Bruneau:
        - Variational Bayesian inference for mixtures of Gaussians (see §10.2 of Bishop (2006))
        - open source (GPL v3)
        - implemented in C using the GNU Scientific Library (GSL)
        - Windows binary unavailable on CRAN
      Christopher M. Bishop (2006) Pattern Recognition and Machine Learning
  • 17. VBmix for Fisher's iris data
      install.packages("VBmix") # requires GSL, Qt, fftw3
      library(VBmix)
      # 3 component mixture of multivariate Gaussians
      fit_vb <- varbayes(irisdata, ncomp=20)
      factor(ZtoLabels(fit_vb$model$resp))
      # ground truth
      irislabels
      # fit GMM using maximum likelihood, for comparison
      fit_em <- classicEM(irisdata, 4)
      fit_em$labels
  • 18. Summary
      VB is an analytic approximation to the posterior distribution:
        - suited to standard models with natural conjugate priors
        - update equations derived using calculus of variations to minimise the KL divergence
        - algorithm resembles Expectation-Maximisation (EM)
        - can become stuck on suboptimal local maxima
        - tends to underestimate the uncertainty in the posterior
      The R package VBmix provides fast, approximate inference for mixtures of multivariate Gaussians.
  • 19. Appendix: For Further Reading I
      Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
      John Ormerod & Matt Wand. Explaining Variational Approximations. The American Statistician, 64(2): 140–153, 2010.
      Mike Jordan, Zoubin Ghahramani, Tommi Jaakkola & Lawrence Saul. An Introduction to Variational Methods for Graphical Models. Machine Learning, 37: 183–233, 1999.
      Pierrick Bruneau, Marc Gelgon & Fabien Picarougne. Parsimonious reduction of Gaussian mixture models with a variational-Bayes approach. Pattern Recognition, 43(3): 850–858, 2010.
  • 20. Appendix: For Further Reading II
      Clare McGrory & Mike Titterington. Variational approximations in Bayesian model selection for finite mixture distributions. Computational Statistics & Data Analysis, 51: 5352–5367, 2007.
      Solomon Kullback & Richard Leibler. On Information and Sufficiency. The Annals of Mathematical Statistics, 22: 79–86, 1951.
      Giorgio Parisi. Statistical Field Theory. Addison-Wesley, 1988.
      Christian Robert & Kerrie Mengersen. Exact Bayesian analysis of mixtures. In Mengersen, Robert & Titterington (eds.), Mixtures: Estimation and Applications. John Wiley & Sons, 2011.