CS340 Machine learning
Gibbs sampling in Markov random fields




Image denoising




[Figure: clean image x and noisy observation y]

x̂ = E[x | y, θ]




Ising model
• 2D Grid on {-1,+1} variables
• Neighboring variables are correlated




p(x|θ) = (1/Z(θ)) ∏_{<ij>} ψ_ij(x_i, x_j)
Ising model

ψ_ij(x_i, x_j) = [ e^W    e^−W ]
                 [ e^−W   e^W  ]

 W > 0: ferromagnet
 W < 0: antiferromagnet (frustrated system)

p(x|θ) = (1/Z(θ)) exp[−βH(x|θ)]

H(x) = −xᵀWx = −∑_{<ij>} W_ij x_i x_j


     β=1/T = inverse temperature
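As a hedged illustration of the energy above (a minimal sketch, not course code; the grid size, W, and temperature are assumed for the example), the sum over nearest-neighbor pairs can be evaluated directly on a small grid of ±1 spins:

```python
import numpy as np

# Minimal sketch (assumed values): evaluate the Ising energy
# H(x) = -sum_{<ij>} W * x_i * x_j over nearest-neighbor pairs of a small
# grid of +/-1 spins, and the unnormalized weight exp(-beta * H(x)).
def ising_energy(x, W=1.0):
    horiz = np.sum(x[:, :-1] * x[:, 1:])   # horizontal neighbor pairs
    vert = np.sum(x[:-1, :] * x[1:, :])    # vertical neighbor pairs
    return -W * (horiz + vert)

rng = np.random.default_rng(0)
x = rng.choice([-1, 1], size=(4, 4))       # a random 4x4 spin configuration
beta = 1.0 / 2.0                           # T = 2 (the "cold" case on the next slide)
print(ising_energy(x), np.exp(-beta * ising_energy(x)))
```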




Samples from an Ising model

                  W=1




      Cold T=2               Hot T=5




MacKay, fig. 31.2
Boltzmann distribution
• Probability distribution in terms of clique potentials
p(x|θ) = (1/Z(θ)) ∏_{c∈C} ψ_c(x_c | θ_c)


• In terms of energy functions
ψ_c(x_c) = exp[−H_c(x_c)]
log p(x) = −[ ∑_{c∈C} H_c(x_c) + log Z ]




Ising model
• 2D Grid on {-1,+1} variables
• Neighboring variables are correlated

H_ij = [ −W_ij    W_ij ]
       [  W_ij   −W_ij ]

W > 0: ferromagnet
W < 0: antiferromagnet (frustrated)


H(x) = −xᵀWx = −∑_{<ij>} W_ij x_i x_j

p(x|θ) = (1/Z(θ)) exp[−βH(x|θ)]


β=1/T = inverse temperature
Local evidence
                                                          
p(x, y) = p(x) p(y|x) = (1/Z) [∏_{<ij>} ψ_ij(x_i, x_j)] ∏_i p(y_i | x_i)

p(y_i | x_i) = N(y_i | x_i, σ²)




Gibbs sampling
• A way to draw samples from p(x_{1:D} | y, θ) one variable at a time,
  i.e., from the full conditionals p(x_i | x_{−i}) (see the sketch after the steps below)

1. x_1^{s+1} ∼ p(x_1 | x_2^s, …, x_D^s)

2. x_2^{s+1} ∼ p(x_2 | x_1^{s+1}, x_3^s, …, x_D^s)

3. (general step i)  x_i^{s+1} ∼ p(x_i | x_{1:i−1}^{s+1}, x_{i+1:D}^s)

4. x_D^{s+1} ∼ p(x_D | x_1^{s+1}, …, x_{D−1}^{s+1})




Gibbs sampling from a 2d Gaussian




MacKay, fig. 29.13
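A hedged sketch of what that figure illustrates (the correlation value and sweep count are assumptions): for a zero-mean bivariate Gaussian with correlation ρ, both full conditionals are themselves Gaussian, so Gibbs sampling alternates two exact Gaussian draws.

```python
import numpy as np

# Gibbs sampling from a zero-mean, unit-variance bivariate Gaussian with
# correlation rho, using the exact conditionals x1|x2 ~ N(rho*x2, 1-rho^2)
# and x2|x1 ~ N(rho*x1, 1-rho^2).
def gibbs_2d_gaussian(rho=0.95, n_sweeps=500, seed=0):
    rng = np.random.default_rng(seed)
    sd = np.sqrt(1.0 - rho**2)
    x1, x2 = 0.0, 0.0
    samples = []
    for _ in range(n_sweeps):
        x1 = rng.normal(rho * x2, sd)   # resample x1 given x2
        x2 = rng.normal(rho * x1, sd)   # resample x2 given x1
        samples.append((x1, x2))
    return np.array(samples)
```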
Gibbs sampling in an MRF
• Full conditional depends only on Markov blanket

p(X_i = ℓ | x_−i) = p(X_i = ℓ, x_−i) / ∑_ℓ′ p(X_i = ℓ′, x_−i)

                  = (1/Z) [∏_{j∈N_i} ψ_ij(X_i = ℓ, x_j)] [∏_{<jk>: j,k∈F_i} ψ_jk(x_j, x_k)]
                    / { (1/Z) ∑_ℓ′ [∏_{j∈N_i} ψ_ij(X_i = ℓ′, x_j)] [∏_{<jk>: j,k∈F_i} ψ_jk(x_j, x_k)] }

                  = ∏_{j∈N_i} ψ_ij(X_i = ℓ, x_j) / ∑_ℓ′ ∏_{j∈N_i} ψ_ij(X_i = ℓ′, x_j)

  (the factors ψ_jk over edges not touching node i are identical in the
   numerator and denominator, so they cancel along with 1/Z)
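A sketch of this formula for a general discrete pairwise MRF (the data structures psi, nbrs, and n_labels are assumptions made for illustration):

```python
import numpy as np

# Full conditional p(X_i = l | x_{-i}) for a discrete pairwise MRF: only the
# potentials touching node i (its Markov blanket) need to be evaluated.
# psi[i][j] is a table with psi[i][j][a, b] = psi_ij(x_i = a, x_j = b);
# nbrs[i] lists the neighbors of node i.
def full_conditional(i, x, psi, nbrs, n_labels):
    p = np.ones(n_labels)
    for label in range(n_labels):
        for j in nbrs[i]:
            p[label] *= psi[i][j][label, x[j]]
    return p / p.sum()   # normalize over the candidate labels
```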




Gibbs sampling in an Ising model
• Let ψ_ij(x_i, x_j) = exp(J x_i x_j), with x_i ∈ {+1, −1} (writing J for the coupling W)

p(X_i = +1 | x_−i) = ∏_{j∈N_i} ψ_ij(X_i = +1, x_j)
                     / [ ∏_{j∈N_i} ψ_ij(X_i = +1, x_j) + ∏_{j∈N_i} ψ_ij(X_i = −1, x_j) ]

                   = exp[J ∑_{j∈N_i} x_j] / ( exp[J ∑_{j∈N_i} x_j] + exp[−J ∑_{j∈N_i} x_j] )

                   = exp[J w_i] / ( exp[J w_i] + exp[−J w_i] )

                   = σ(2 J w_i),    where  w_i = ∑_{j∈N_i} x_j

σ(u) = 1/(1 + e^−u)
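A minimal sketch of this update on a grid (hedged: the boundary handling and function name are assumptions, not the course demo):

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Resample one spin x[i, j] of a grid Ising model with coupling J from
# p(X_i = +1 | x_{-i}) = sigmoid(2 * J * w_i), where w_i sums the neighbors.
def gibbs_update_ising(x, i, j, J, rng):
    H, W = x.shape
    w = 0.0
    if i > 0:     w += x[i - 1, j]
    if i < H - 1: w += x[i + 1, j]
    if j > 0:     w += x[i, j - 1]
    if j < W - 1: w += x[i, j + 1]
    x[i, j] = 1 if rng.random() < sigmoid(2.0 * J * w) else -1
    return x
```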




Adding in local evidence
• Final form is

p(X_i = +1 | x_−i, y) = exp[J w_i] φ_i(+1, y_i)
                        / ( exp[J w_i] φ_i(+1, y_i) + exp[−J w_i] φ_i(−1, y_i) )




                        Run demo
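A hedged sketch of such a demo (assumptions: the coupling J, Gaussian noise level σ, and sweep count are illustrative; this is not the original course script). With Gaussian local evidence, log φ_i(+1, y_i) − log φ_i(−1, y_i) = 2y_i/σ², so the full conditional becomes σ(2J w_i + 2y_i/σ²):

```python
import numpy as np

# Gibbs sampling for image denoising with an Ising prior and Gaussian local
# evidence: y is the observed noisy image, x the latent +/-1 image.
def gibbs_denoise(y, J=1.0, sigma=0.5, n_sweeps=10, seed=0):
    rng = np.random.default_rng(seed)
    x = np.where(y > 0, 1, -1)                  # initialize by thresholding y
    H, W = y.shape
    for _ in range(n_sweeps):
        for i in range(H):
            for j in range(W):
                w = 0.0                          # sum of neighboring spins
                if i > 0:     w += x[i - 1, j]
                if i < H - 1: w += x[i + 1, j]
                if j > 0:     w += x[i, j - 1]
                if j < W - 1: w += x[i, j + 1]
                logit = 2.0 * J * w + 2.0 * y[i, j] / sigma**2
                p_plus = 1.0 / (1.0 + np.exp(-logit))
                x[i, j] = 1 if rng.random() < p_plus else -1
    return x
```

Averaging the sampled x over sweeps (after burn-in) approximates the posterior mean x̂ = E[x | y, θ] from the first slide.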




Gibbs sampling for DAGs
• The Markov blanket of a node is the set that
  renders it independent of the rest of the graph.
• These are the node's parents, children, and co-parents.
p(X_i | X_−i) = p(X_i, X_−i) / ∑_x p(X_i = x, X_−i)

              = p(X_i, U_{1:n}, Y_{1:m}, Z_{1:m}, R) / ∑_x p(x, U_{1:n}, Y_{1:m}, Z_{1:m}, R)

              = p(X_i | U_{1:n}) [∏_j p(Y_j | X_i, Z_j)] P(U_{1:n}, Z_{1:m}, R)
                / ∑_x p(X_i = x | U_{1:n}) [∏_j p(Y_j | X_i = x, Z_j)] P(U_{1:n}, Z_{1:m}, R)

              = p(X_i | U_{1:n}) [∏_j p(Y_j | X_i, Z_j)]
                / ∑_x p(X_i = x | U_{1:n}) [∏_j p(Y_j | X_i = x, Z_j)]

  (U_{1:n} = parents of X_i, Y_{1:m} = its children, Z_{1:m} = the co-parents,
   R = all remaining variables; the factors not involving X_i cancel)




p(X_i | X_−i) ∝ p(X_i | Pa(X_i)) ∏_{Y_j ∈ ch(X_i)} p(Y_j | Pa(Y_j))
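A sketch of this formula for a small discrete DAG (hedged: the cpt and children interfaces are illustrative assumptions):

```python
import numpy as np

# Full conditional of node i in a discrete DAG: proportional to its own CPT
# times the CPTs of its children, evaluated at the current assignment.
# cpt(node, assignment) returns p(node = assignment[node] | its parents);
# children[i] lists the children of node i.
def full_conditional_dag(i, assignment, values, cpt, children):
    p = np.ones(len(values))
    for k, v in enumerate(values):
        trial = dict(assignment)
        trial[i] = v
        p[k] = cpt(i, trial)                     # p(X_i = v | Pa(X_i))
        for c in children[i]:
            p[k] *= cpt(c, trial)                # p(Y_j | Pa(Y_j)) for each child
    return p / p.sum()
```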
Birats




Samples




Posterior predictive check




Boltzmann machines
Ising model where the graph structure is arbitrary, and the
weights W are learned by maximum likelihood


                                 Restricted Boltzmann machine
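A hedged sketch of block Gibbs sampling in a restricted Boltzmann machine with ±1 units (W, b, c are assumed parameter names, not from the slides): because the bipartite graph has no within-layer edges, the hidden units are conditionally independent given the visibles, and vice versa, so each layer can be resampled in one step.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# One block-Gibbs step for an RBM with +/-1 units: resample all hidden units
# given the visibles, then all visible units given the new hiddens.
def rbm_gibbs_step(v, W, b, c, rng):
    p_h = sigmoid(2.0 * (v @ W + c))             # p(h_j = +1 | v)
    h = np.where(rng.random(p_h.shape) < p_h, 1, -1)
    p_v = sigmoid(2.0 * (h @ W.T + b))           # p(v_i = +1 | h)
    v = np.where(rng.random(p_v.shape) < p_v, 1, -1)
    return v, h
```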




Hopfield network
Boltzmann machine with no hidden nodes (fully connected Ising model)
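A brief sketch (assumed, not from the slides): in the zero-temperature limit the Gibbs update σ(2β·w_i) becomes deterministic, giving the usual asynchronous Hopfield update.

```python
import numpy as np

# Asynchronous Hopfield update: set unit i to the sign of its local field.
# With symmetric weights and zero diagonal this never increases the energy
# H(x) = -x^T W x.
def hopfield_update(x, W, i):
    local_field = W[i] @ x
    x[i] = 1 if local_field >= 0 else -1
    return x
```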




