SlideShare a Scribd company logo
1 of 7
Download to read offline
PR 1181           Indira       Bala        Giri MV




                                      Pattern Recognition 33 (2000) 1919}1925




                MRF parameter estimation by MCMC method
                                          Lei Wang, Jun Liu*, Stan Z. Li
           School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore
                       Received 13 January 1999; received in revised form 28 July 1999; accepted 28 July 1999


Abstract

   Markov random "eld (MRF) modeling is a popular pattern analysis method and MRF parameter estimation plays an
important role in MRF modeling. In this paper, a method based on Markov Chain Monte Carlo (MCMC) is proposed to
estimate MRF parameters. Pseudo-likelihood is used to represent likelihood function and it gives a good estimation
result.   2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: MRF; MCMC; Least-squares "t; Parameter estimation; Pseudo-likelihood




1. Introduction                                                         consuming. Here a method based on MCMC is used to
                                                                        estimate the parameters which can give a good solution
   The objective of mathematical modeling in pattern                    to the estimation.
analysis is aimed to extract the intrinsic characteristics of              The general parameter estimation principle is as fol-
the pattern in a few parameters so as to represent the                  lows. Let F denote any "nite set which comprises of
pattern e!ectively. Markov random "eld modeling is                      a random "eld and f3F is an observation of F. On
a very popular pattern analysis method and it plays an                  F a family of distributions
important role in pattern recognition and computer vi-                     "+ (F; ): 3 ,
sion. Markov random "eld models were popularized by
Besag to model spatial interactions on lattice system [1].              is considered where LRB is a set of parameters. The
It can be used in texture classi"cation and segmentation                &true' parameter H3 is not known and needs to be
as well as image restoration [2]. The most important                    determined or at least approximated. The only available
characteristic of MRF modeling is that the global pat-                  information is hidden in the observation f which is a
terns can be formed via stochastic propagation of local                 realization of F. Now, the problem is how to choose K
interactions. MRF parameter estimation is necessary in                  as a substitute for H if f is picked at random from
MRF modeling after the form of model is given. During                      (F; H).
the past years, many authors presented methods to esti-                     In this paper, the estimation of parameters is based
mate MRF parameters. Simulated annealing [3], max-                      on deriving posterior distribution calculated using the
imum likelihood [4], coding method [1], mean "eld                       Metropolis}Hastings algorithm. This is a Markov chain
approximations [5], Bayesian estimation [6] and least-                  Monte Carlo (MCMC) technique [8].
squares (LS) "t [7] are discussed to estimate MRF para-                     The paper is arranged as follows. MRF image model is
meters.                                                                 discussed in Section 2. MCMC parameter estimation is
   Least-squares (LS) methods and maximum likelihood                    proposed in Section 3. The experiments are shown in
methods are often used. However, LS is not accurate in                  Section 4 and conclusion is given in Section 5.
estimation and maximum likelihood method is time-
                                                                        2. MRF modeling

   * Corresponding author.                                                 This section introduces some notations related to
   E-mail    addresses:       elwang@263.net        (L.    Wang),       MRF modeling which will be used in the following sec-
ejliu@ntu.edu.sg (J. Liu), szli@szli.eee.ntu.edu.sg (S.Z. Li).          tions of the paper.

0031-3203/00/$20.00           2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
PII: S 0 0 3 1 - 3 2 0 3 ( 9 9 ) 0 0 1 7 8 - 8
PR 1181


1920                                  L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925

   A lattice is a square array of pixels, or sites,                  where c ( j) denotes the neighbor of site j in the ith cli-
                                                                             G
+( j, k): 0)j)N!1, 0)k)N!1,. We adopt a                              que c .
                                                                          G
simple numbering of sites by assigning sequence number
i"k#Nj to site ( j, k). Letting M"N denote the num-                    Then the distributions have the form
ber of sites,
                                                                     P( f, )Z( ) exp (1 , H( f )2), 3
S+0, 1,2, M!1,
                                                                     where 1 , H( f )2 L        H is the inner product of
                                                                                            G G G
index the set of sites. A random eld model is a distribu-             and H, and Z( )        exp (1 , H( f )2) is the normaliz-
                                                                                             D
tion for the M-tuple random vector F, which contains                 ing partition function. The conditional probability is as
a random variable F(i) for the value of site i. The sites in         follows:
S are related to one another via a neighborhood system.
A neighborhood system for S is dened as                                                 exp(!1 , H( f )2)
                                                                     P( f  f H )                    H      ,                (2)
                                                                         H ,               exp(! 1 , H(z )2)
                                                                                     XH ZL                 H
N+N ∀i3S,,
    G
                                                                     where H( f ) is the local histogram only calculated in the
                                                                                H
where N is the set of sites neighboring i. The neighbor-             neighborhood of site j. H(z ) denotes the local histogram
          G                                                                                       H
ing relationship has the following properties:                       replacing f with z and the neighborhood of j is xed.
                                                                                H        H
                                                                     The computation of Z( ) is infeasible because there are
(1) a site is not neighboring to itself;                             a combinatorial number of elements in the conguration
(2) the neighboring relationship is mutual.                          space. In order to avoid using the partition function Z( ),
                                                                     the pseudo-likelihood function
   A clique c is a set of sites in which all pairs of sites are
mutual neighbors. The set of all cliques in a neighbor-
                                                                     P¸( f  )log “ P( f  f H )
hood system is denoted as Q.                                                             H ,
                                                                                   HZS
   Suppose F is an MRF. Let f3F be a realization of F.
A clique function, or potential function,  ( f ), is asso-
                                                A                             “ (1 , H( f )2)!log       exp(1 , H(z )2)      (3)
ciated with each clique and the energy function, ;( f ), of                               H                         H
MRF can be expressed as the sum of clique functions.                           HZ S
                                                                                                   XH ZL


                                                                     can be used to replace likelihood function. The pseudo-
;( f )  ( f ).                                                     likelihood does not involve the partition function Z( ).
           A                                                         Hence it is much easier to be calculated.
       AZ/
  To a homogeneous MRF, the potential function is
independent of locations. Thus, the number of clique
                                                                     3. MCMC estimation of MRF parameters
potentials can be reduced to the number of clique types,
that is, each potential corresponding to a clique type.                 According to Bayesian theorem, the posterior distribu-
  Consider a multi-level logistic (MLL) model [4]. Let               tion of conditional on f is
L+1,2, m, be the label set and ( ,2, ) be the
                                                L
parameter vector for clique potentials where each com-
                                                                                P( )P( f  )
ponent corresponds to a clique type. Consider the distri-            P(  f )                JP( )P( f  ).                  (4)
bution of Gibbs form,                                                         P( )P( f  ) d

                                                                     According to Gilks et al. [8], any features of the posterior
P(Ff, )JP(Ff  )Z( ) exp(!;( f, )),                    (1)
                                                                     distribution are legitimate for Bayesian inference: mo-
                                                                     ments, quantiles, highest posterior density regions, etc.
where ;( f, ) is energy function and depends linearly on
                                                                     All these quantiles can be expressed in terms of posterior
 . Suppose H( f )(H ,2, H ) is the histogram of
                                L                                   expectations of functions of . The posterior expectation
cliques of f, n denotes the index of clique type. Let
                                                                     of a function g( ) is
        1 if z0,
 (z)                                                                            g( )P( )P( f  ) d
        0 otherwise.                                                 E[g( ) f ]                    .                        (5)
                                                                                   P( )P( f  ) d

H     2            ( f !f )!1 , i1,2, n,                           The integrations in this expression are di$cult to be
 G                     H  HY                                         solved in Bayesian inference. Monte Carlo integration
   HZ1 HYZAG H
PR 1181


                                   L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925                            1921

including Markov Chain Monte Carlo (MCMC)                         of say m iterations, + R, tm#1,2, n, will be depen-
approach [6] can be used to deal with the di$culty [8].           dent samples approximately from 
(.). Let
The task is to evaluate the expectation
                                                                      1          L
                                                                   M                  R.                                (7)
        g( )P( ) d                                                  n!m
E(g( ))            .                                   (6)                   RK
          P( ) d                                                 This is an ergodic average. Convergence to the re-
                                                                  quired expectation is ensured by the ergodic theorem.
A Markov chain can be adopted for the purpose of                  Eq. (7) shows how a Markov chain can be used to
evaluation. Suppose we generate a sequence of random              estimate E(  f ). Such a Markov chain can be constructed
variables + , ,2,. At each time t*0, the next state             by Metropolis}Hastings algorithm [8]. At each time t,
 R is sampled from a distribution P( R R) which de-          the next state R is chosen by rst sampling a candidate
pends only on the current state R of the chain. This              point  from a proposal distribution q(  R). The choice
Markov chain is assumed to be time-homogeneous.                   of proposal distribution is almost arbitrary; here a
Thus, the sequence will gradually converge to a unique            multivariate normal distribution centered on the cur-
stationary distribution 
(.). After a su$cient long burn-in       rent value R is adopted. The candidate  is accepted




            Fig. 1. Textures used in the experiment. (a) Number of graylevels M2, (b) M2, (c) M4, (d) M4.
PR 1181


1922                                     L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925

with probability                                                         The Metropolis}Hastings algorithm can be sum-
                                                                        marized in the following procedures:
                   P(  f )q(  R)
 ( R, )min 1,                      .                                      Initialize ; set t0 and ¹maximum number of
                   P( R f )q( R )
                                                                            iteration
                                                                            While t(¹
The transition kernel for the Metropolis-Hastings algo-
                                                                            BEGIN
rithm is
                                                                            Sample a point  from q(. R)
                                                                            Sample a uniform (0, 1) random variable v
P( R R)q( R R) ( R, R)#I( R R)                                  If v) ( R, ), set R . Otherwise set R R
                                                                            Increment t

                      
            ; 1! q(  R) ( R, )d 
                                                                            END


where I(.) denotes the indicator function (taking 1 when                4. Experiments
its argument is true, and 0 otherwise). If the candidate 
is accepted, the next state becomes R , otherwise                      In order to inspect the performance of the method
  R R. Since P( f)JP( )P(f ) and the prior P( ) can                proposed in this paper, a Gibbs sampler [4] is used to
be assumed to be #at when the prior information is                      sample textures with the specied parameters. Here a sec-
totally unavailable,                                                    ond-order neighborhood system is used and four double-
                                                                        site cliques + , , , , corresponding to 03, 903,
                                                                                               
                   P(  f )q(  R)                                    453 and 1353 individually are adopted as non-zero para-
 ( R, )min 1,                                                         meters. Fig. 1 shows four 128;128 textures generated
                   P( R f )q( R )
                                                                        from the Gibbs Sampler. The rst two textures are sam-
                   P( f  )q(  R)                                    pled with two graylevels and the next two textures are
       min 1,                       .                          (8)     sampled with four graylevels. The parameters of the four
                   P( f  R)q( R )
                                                                        textures are listed in Table 1. In order to get acceptable
                                                                        parameters, the MCMC procedure described in the pre-
Since the choice of proposal distribution here is normal                vious section should be repeated until stability of the
centered on the current value, q(  R)q( R ) due                    Markov chain is reached. The choice of starting values
to the symmetric property of the proposed distribution.                    will not a!ect the stationary distribution if the chain is
Thus, the acceptance probability formula can be                         irreducible. In our experiments,  are chosen randomly.
reduced to                                                              The usual informal approach to detection of convergence
                                                                        is visual inspection of plots of the Monte-Carlo output
                   P( f  )                                            + R, t1,2, n,. From Figs. 2}5, three independent sam-
 ( R, )min 1,              .                                  (9)
                   P( f  R)                                            ples of Markov chains for texture 4 are given. From the

Thus, the Metropolis}Hastings algorithm is switched to
Metropolis algorithm. When we use pseudo-likelihood to
represent the likelihood function, we get

 ( R, )min(1, exp(P¸( f  )!P¸( f  R)))


       min 1, exp               (1 , H( f )2!1 R, H( f )2)
                                           H            H
                           HZ1

          !log             exp(1 , H(z )2)
                                       H
                   XH ZL

          #log             exp(1 R, H(z )2)       .            (10)
                                       H
                   XH ZL

With this acceptance probability, the          can be approxi-          Fig. 2. 1000 iterations with di!erent starting values for estima-
mated e!ectively.                                                       ting
                                                                              
                                                                                for texture 4.
PR 1181


                                       L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925                                   1923




Fig. 3. 1000 iterations with di!erent starting values for estima-
                                                                      Fig. 4. 1000 iterations with di!erent starting values for estima-
ting    for texture 4.
                                                                     ting
                                                                            
                                                                              for texture 4.


gures, we observe that the length of burn-in depends on              creased. Initially, we set n1000. If the estimates M do
 . The Markov chains converge in less than 300 iter-                 not agree adequately, we increase 500 iterations each
ations in most examples according to visual inspection of             time until estimates are similar. We only need to inspect
the monitoring statistics. Here we set burn-in m500.                 the mean M and variance of the Monte}Carlo output. In
More formal methods for convergence diagnostics can be                our experiments in Table 1, n1000 is enough. The
found in Refs. [9,10]. Decision about the iteration num-              results of MCMC approach in Table 1 are acceptable
ber is an important and practical matter. The aim is to               where      denotes the average standard deviation of
run the chain long enough to obtain adequate precision                Markov chains after burn-in. In order to verify the per-
in the estimator. Here three chains are run in parallel               formance of this method, least- squares (LS) t method
with di!erent starting values M from Eq. (7). If they do not          proposed by Derin and Elliott [7] is also used in
agree adequately, the iteration number n must be in-                  our experiments. From Table 1, it can be seen that LS




Table 1
MRF parameter estimation

Textures                    Method
                                                                                                                          
Texture 1                   Specied                     1                      1                 !0.5                  !0.5
                            LS                           0.8448                 0.8734            !0.4332               !0.4382
                            MCMC                         0.9884                 0.9899            !0.5076               !0.5078
                                                         0.0436                 0.0323             0.0278                0.0400
Texture 2                   Specied                     1                   !0.8                     0.5               !0.5
                            LS                           0.9949              !0.8157                  0.4960            !0.3244
                            MCMC                         1.0093              !0.8586                  0.5569            !0.4522
                                                         0.0147               0.0235                  0.0168             0.0245
Texture 3                   Specied                     0.3                    0.3                   0.3                  0.3
                            LS                           0.1152                 0.1520                0.1867               0.1444
                            MCMC                         0.3478                 0.2762                0.2960               0.2877
                                                         0.0266                 0.0165                0.0130               0.0086

Texture 4                   Specied                     0.5                    1                 !0.5                     0.7
                            LS                           0.0415                 0.4364             0                       0.4201
                            MCMC                         0.5525                 1.0394            !0.5951                  0.6810
                                                         0.0321                 0.0364             0.0490                  0.0156
PR 1181


1924                                   L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925

                                                                      cation and segmentation as well as image restoration.
                                                                      MRF parameter estimation plays an important role in
                                                                      MRF modeling. In order to estimate MRF parameter
                                                                      e!ectively and e$ciently, an MRF parameter estimation
                                                                      method based on MCMC is proposed in this paper.
                                                                      A Markov chain is constructed to sample the MRF
                                                                      parameters via Monte Carlo method. MLL model is used
                                                                      as image model. In order to avoid to calculate the nor-
                                                                      malizing partition function, pseudo-likelihood function
                                                                      is used to represent likelihood function. Compared to
                                                                      least-squares t method, our method is more accurate
                                                                      and can be used for multi-graylevel texture parameter
                                                                      estimation e!ectively as seen from the experiments in the
                                                                      paper. This method can be extended to be used in multi-
                                                                      resolution analysis of texture modeling and segmentation
                                                                      of textured images.
Fig. 5. 1000 iterations with di!erent starting values for estima-
ting    for texture 4.
      


method is e!ective only to the textures with two gray-                Acknowledgements
levels, while MCMC method is e!ective to all examples
in the experiments even more graylevels are adopted in                  We wish to thank the constructive comments and
the model. From the comparison, MCMC method pro-                      suggestions of the reviewers.
posed in this paper is much better than LS method. The
MCMC routines are run on a Sun Ultra 2 workstation,
each analysis takes less than 3 min to perform 1000
iterations.                                                           References

                                                                       [1] J. Besag, Spatial interaction and the statistical analysis of
5. Conclusion                                                              lattice systems, J. Roy. Statist. Soc. Ser. B 36 (1974)
                                                                           192}236.
   Markov random elds (MRFs) modeling is a popular                    [2] S. Barker, Image segmentation using Markov random eld
pattern analysis method. It can be used in texture classi-                 models, Ph.D. Thesis, University of Cambridge, 1998.
cation and segmentation as well as image restoration.                 [3] S. Geman, D. Geman, Stochastic relaxation, Gibbs distri-
MRF parameter estimation plays an important role in                        bution and the Bayesian restoratopn of images, IEEE
MRF modeling. In order to estimate MRF parameter                           Trans. PAMI 6 (6) (1984) 721}741.
                                                                       [4] S. Li, Markov Random Field Modeling in Computer
e!ectively and e$ciently, an MRF parameter estimation
                                                                           Vision, Springer, New York, 1995.
method based on method MCMC is proposed in this                        [5] D. Chandler, Introduction to Modern Statistical Mechan-
paper. A Markov chain is constructed to sample the                         ics, Oxford University Press, Oxford, 1987.
MRF parameters via Monte Carlo method. MLL model                       [6] R. Aykroyd, Bayesian estimation for homogeneous and
is used as image model. In order to avoid to calculate the                 inhomogeneous Gaussian random elds, IEEE Trans.
normalizing partition function, pseudo-likelihood func-                    Pattern Analysis Mach. Intell. 20 (5) (1998) 533}539.
tion is used to represent likelihood function. Compared                [7] H. Derin, H. Elliott, Modeling and segmentation of
to least-squares t method, the proposed method is more                    noisy and textured images using Gibbs random elds,
accurate and can be used for multilevel-graylevel texture                  IEEE Trans. Pattern Analysis Mach. Intell. 9 (1) (1987)
parameter estimation e!ectively as seen from the experi-                   39}55.
                                                                       [8] W. Gilks, S. Richardson, D. Spiegelhalter, Markov
ments in the paper. This method can be extended to be
                                                                           chain Monte Carlo in Practice, Chapman  Hall, London,
used in multiresolution analysis of texture modeling and                   1996.
segmentation of textured images.                                       [9] M. Cowles, B. Carlin, Markov Chain Monte Carlo conver-
                                                                           gence diagnostics: a comparative review, Technical Re-
                                                                           port, Division of Biostatistics, School of Public Health,
                                                                           University of Minnesota, 1994.
6. Summary                                                            [10] S. Brooks, P. Dellaportas, G. Roberts, An approach to
                                                                           diagnosing total variation convergence of MCMC
  Markov random elds (MRFs) modeling is a popular                         algorithms, University of Cambridge, http://www.stats.
pattern analysis method. It can be used in texture classi-                 bris.ac.uk/ maspb/mypapers/brodr96.html, 1996.
PR 1181


                                      L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925                                  1925

About the Author*LEI WANG received his B. Eng and M. Eng degrees from Harbin Institute of Technology, China, in 1992 and 1995,
respectively. He is currently a Ph.D. candidate in School of Electrical and Electronic Engineering, Nanyang Technological University,
Singapore. His research interests include pattern recognition, image compression, image processing, and image retrieval.

About the Author*JUN LIU received his B.S. and M.S. degree from Jiao Tong University, Xian, China in 1982 and 1984, respectively.
He obtained his Ph.D. degree from Oakland University, MI, USA in 1989. He is currently an associate professor with Division of
Information Engineering, School of Electrical and Electronic Engineering, Nanyang Technological University. His research interests
include pattern recognition, image processing and multimedia database.

About the Author*STAN Z. LI received the B.Sc degree from Hunan University, China, in 1982, M.Sc degree from the National
University of Defense Technology, China, in 1985 and Ph.D. degree from the University of Surrey, UK, in 1991. All degrees are in EEE.
He is currently a senior lecturer at Nanyang Technological University, Singapore. His research interests include computer vision,
pattern recognition, image processing and optimization methods.

More Related Content

What's hot

Functions form 3
Functions form 3Functions form 3
Functions form 3
MrHunte
 
Light Scattering by Nonspherical Particles
Light Scattering by Nonspherical ParticlesLight Scattering by Nonspherical Particles
Light Scattering by Nonspherical Particles
avinokurov
 
Engr 371 final exam april 2006
Engr 371 final exam april 2006Engr 371 final exam april 2006
Engr 371 final exam april 2006
amnesiann
 
Differential Geometry
Differential GeometryDifferential Geometry
Differential Geometry
lapuyade
 
Engr 371 final exam april 2010
Engr 371 final exam april 2010Engr 371 final exam april 2010
Engr 371 final exam april 2010
amnesiann
 
Parallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-JoinsParallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-Joins
Jonny Daenen
 
IGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdfIGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdf
grssieee
 

What's hot (18)

AI Lesson 06
AI Lesson 06AI Lesson 06
AI Lesson 06
 
3-D Visual Reconstruction: A System Perspective
3-D Visual Reconstruction: A System Perspective3-D Visual Reconstruction: A System Perspective
3-D Visual Reconstruction: A System Perspective
 
Computer Science Assignment Help
Computer Science Assignment HelpComputer Science Assignment Help
Computer Science Assignment Help
 
Functions form 3
Functions form 3Functions form 3
Functions form 3
 
Software Construction Assignment Help
Software Construction Assignment HelpSoftware Construction Assignment Help
Software Construction Assignment Help
 
Light Scattering by Nonspherical Particles
Light Scattering by Nonspherical ParticlesLight Scattering by Nonspherical Particles
Light Scattering by Nonspherical Particles
 
Enumeration of 2-level polytopes
Enumeration of 2-level polytopesEnumeration of 2-level polytopes
Enumeration of 2-level polytopes
 
AI Lesson 05
AI Lesson 05AI Lesson 05
AI Lesson 05
 
Engr 371 final exam april 2006
Engr 371 final exam april 2006Engr 371 final exam april 2006
Engr 371 final exam april 2006
 
Practical computation of Hecke operators
Practical computation of Hecke operatorsPractical computation of Hecke operators
Practical computation of Hecke operators
 
Differential Geometry
Differential GeometryDifferential Geometry
Differential Geometry
 
ARIC Team Seminar
ARIC Team SeminarARIC Team Seminar
ARIC Team Seminar
 
Engr 371 final exam april 2010
Engr 371 final exam april 2010Engr 371 final exam april 2010
Engr 371 final exam april 2010
 
Parallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-JoinsParallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-Joins
 
Venn diagram
Venn diagramVenn diagram
Venn diagram
 
Lesson 19: The Mean Value Theorem
Lesson 19: The Mean Value TheoremLesson 19: The Mean Value Theorem
Lesson 19: The Mean Value Theorem
 
IGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdfIGARSS2011 FR3.T08.3 BenDavid.pdf
IGARSS2011 FR3.T08.3 BenDavid.pdf
 
Lecture 21 problem reduction search ao star search
Lecture 21 problem reduction search ao star searchLecture 21 problem reduction search ao star search
Lecture 21 problem reduction search ao star search
 

Similar to 10.1.1.1.3210

11.quadrature radon transform for smoother tomographic reconstruction
11.quadrature radon transform for smoother  tomographic reconstruction11.quadrature radon transform for smoother  tomographic reconstruction
11.quadrature radon transform for smoother tomographic reconstruction
Alexander Decker
 
EARTHSC 5642 Spring 2015 Dr. von FreseEARTHSC 5642.docx
EARTHSC 5642     Spring 2015   Dr. von FreseEARTHSC 5642.docxEARTHSC 5642     Spring 2015   Dr. von FreseEARTHSC 5642.docx
EARTHSC 5642 Spring 2015 Dr. von FreseEARTHSC 5642.docx
jacksnathalie
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
ijceronline
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
ijceronline
 

Similar to 10.1.1.1.3210 (20)

11.quadrature radon transform for smoother tomographic reconstruction
11.quadrature radon transform for smoother  tomographic reconstruction11.quadrature radon transform for smoother  tomographic reconstruction
11.quadrature radon transform for smoother tomographic reconstruction
 
11.[23 36]quadrature radon transform for smoother tomographic reconstruction
11.[23 36]quadrature radon transform for smoother  tomographic reconstruction11.[23 36]quadrature radon transform for smoother  tomographic reconstruction
11.[23 36]quadrature radon transform for smoother tomographic reconstruction
 
論文紹介:Towards Robust Adaptive Object Detection Under Noisy Annotations
論文紹介:Towards Robust Adaptive Object Detection Under Noisy Annotations論文紹介:Towards Robust Adaptive Object Detection Under Noisy Annotations
論文紹介:Towards Robust Adaptive Object Detection Under Noisy Annotations
 
Macrocanonical models for texture synthesis
Macrocanonical models for texture synthesisMacrocanonical models for texture synthesis
Macrocanonical models for texture synthesis
 
math camp
math campmath camp
math camp
 
EARTHSC 5642 Spring 2015 Dr. von FreseEARTHSC 5642.docx
EARTHSC 5642     Spring 2015   Dr. von FreseEARTHSC 5642.docxEARTHSC 5642     Spring 2015   Dr. von FreseEARTHSC 5642.docx
EARTHSC 5642 Spring 2015 Dr. von FreseEARTHSC 5642.docx
 
Fixed point result in menger space with ea property
Fixed point result in menger space with ea propertyFixed point result in menger space with ea property
Fixed point result in menger space with ea property
 
Dycops2019
Dycops2019 Dycops2019
Dycops2019
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
An investigation of inference of the generalized extreme value distribution b...
An investigation of inference of the generalized extreme value distribution b...An investigation of inference of the generalized extreme value distribution b...
An investigation of inference of the generalized extreme value distribution b...
 
Hm2513521357
Hm2513521357Hm2513521357
Hm2513521357
 
Hm2513521357
Hm2513521357Hm2513521357
Hm2513521357
 
ICML2019読み会(2019/08/04, @オムロン)「On the Universality of Invariant Networks」紹介スライド
ICML2019読み会(2019/08/04, @オムロン)「On the Universality of Invariant Networks」紹介スライドICML2019読み会(2019/08/04, @オムロン)「On the Universality of Invariant Networks」紹介スライド
ICML2019読み会(2019/08/04, @オムロン)「On the Universality of Invariant Networks」紹介スライド
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Chang etal 2012a
Chang etal 2012aChang etal 2012a
Chang etal 2012a
 
Slides: A glance at information-geometric signal processing
Slides: A glance at information-geometric signal processingSlides: A glance at information-geometric signal processing
Slides: A glance at information-geometric signal processing
 
Analytic construction of points on modular elliptic curves
Analytic construction of points on modular elliptic curvesAnalytic construction of points on modular elliptic curves
Analytic construction of points on modular elliptic curves
 
Interferogram Filtering Using Gaussians Scale Mixtures in Steerable Wavelet D...
Interferogram Filtering Using Gaussians Scale Mixtures in Steerable Wavelet D...Interferogram Filtering Using Gaussians Scale Mixtures in Steerable Wavelet D...
Interferogram Filtering Using Gaussians Scale Mixtures in Steerable Wavelet D...
 
Er24902905
Er24902905Er24902905
Er24902905
 
10.1.1.474.2861
10.1.1.474.286110.1.1.474.2861
10.1.1.474.2861
 

10.1.1.1.3210

  • 1. PR 1181 Indira Bala Giri MV Pattern Recognition 33 (2000) 1919}1925 MRF parameter estimation by MCMC method Lei Wang, Jun Liu*, Stan Z. Li School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore Received 13 January 1999; received in revised form 28 July 1999; accepted 28 July 1999 Abstract Markov random "eld (MRF) modeling is a popular pattern analysis method and MRF parameter estimation plays an important role in MRF modeling. In this paper, a method based on Markov Chain Monte Carlo (MCMC) is proposed to estimate MRF parameters. Pseudo-likelihood is used to represent likelihood function and it gives a good estimation result. 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. Keywords: MRF; MCMC; Least-squares "t; Parameter estimation; Pseudo-likelihood 1. Introduction consuming. Here a method based on MCMC is used to estimate the parameters which can give a good solution The objective of mathematical modeling in pattern to the estimation. analysis is aimed to extract the intrinsic characteristics of The general parameter estimation principle is as fol- the pattern in a few parameters so as to represent the lows. Let F denote any "nite set which comprises of pattern e!ectively. Markov random "eld modeling is a random "eld and f3F is an observation of F. On a very popular pattern analysis method and it plays an F a family of distributions important role in pattern recognition and computer vi- "+ (F; ): 3 , sion. Markov random "eld models were popularized by Besag to model spatial interactions on lattice system [1]. is considered where LRB is a set of parameters. The It can be used in texture classi"cation and segmentation &true' parameter H3 is not known and needs to be as well as image restoration [2]. The most important determined or at least approximated. The only available characteristic of MRF modeling is that the global pat- information is hidden in the observation f which is a terns can be formed via stochastic propagation of local realization of F. Now, the problem is how to choose K interactions. MRF parameter estimation is necessary in as a substitute for H if f is picked at random from MRF modeling after the form of model is given. During (F; H). the past years, many authors presented methods to esti- In this paper, the estimation of parameters is based mate MRF parameters. Simulated annealing [3], max- on deriving posterior distribution calculated using the imum likelihood [4], coding method [1], mean "eld Metropolis}Hastings algorithm. This is a Markov chain approximations [5], Bayesian estimation [6] and least- Monte Carlo (MCMC) technique [8]. squares (LS) "t [7] are discussed to estimate MRF para- The paper is arranged as follows. MRF image model is meters. discussed in Section 2. MCMC parameter estimation is Least-squares (LS) methods and maximum likelihood proposed in Section 3. The experiments are shown in methods are often used. However, LS is not accurate in Section 4 and conclusion is given in Section 5. estimation and maximum likelihood method is time- 2. MRF modeling * Corresponding author. This section introduces some notations related to E-mail addresses: elwang@263.net (L. Wang), MRF modeling which will be used in the following sec- ejliu@ntu.edu.sg (J. Liu), szli@szli.eee.ntu.edu.sg (S.Z. Li). tions of the paper. 0031-3203/00/$20.00 2000 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved. PII: S 0 0 3 1 - 3 2 0 3 ( 9 9 ) 0 0 1 7 8 - 8
  • 2. PR 1181 1920 L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925 A lattice is a square array of pixels, or sites, where c ( j) denotes the neighbor of site j in the ith cli- G +( j, k): 0)j)N!1, 0)k)N!1,. We adopt a que c . G simple numbering of sites by assigning sequence number i"k#Nj to site ( j, k). Letting M"N denote the num- Then the distributions have the form ber of sites, P( f, )Z( ) exp (1 , H( f )2), 3 S+0, 1,2, M!1, where 1 , H( f )2 L H is the inner product of G G G index the set of sites. A random eld model is a distribu- and H, and Z( ) exp (1 , H( f )2) is the normaliz- D tion for the M-tuple random vector F, which contains ing partition function. The conditional probability is as a random variable F(i) for the value of site i. The sites in follows: S are related to one another via a neighborhood system. A neighborhood system for S is dened as exp(!1 , H( f )2) P( f f H ) H , (2) H , exp(! 1 , H(z )2) XH ZL H N+N ∀i3S,, G where H( f ) is the local histogram only calculated in the H where N is the set of sites neighboring i. The neighbor- neighborhood of site j. H(z ) denotes the local histogram G H ing relationship has the following properties: replacing f with z and the neighborhood of j is xed. H H The computation of Z( ) is infeasible because there are (1) a site is not neighboring to itself; a combinatorial number of elements in the conguration (2) the neighboring relationship is mutual. space. In order to avoid using the partition function Z( ), the pseudo-likelihood function A clique c is a set of sites in which all pairs of sites are mutual neighbors. The set of all cliques in a neighbor- P¸( f )log “ P( f f H ) hood system is denoted as Q. H , HZS Suppose F is an MRF. Let f3F be a realization of F. A clique function, or potential function, ( f ), is asso- A “ (1 , H( f )2)!log exp(1 , H(z )2) (3) ciated with each clique and the energy function, ;( f ), of H H MRF can be expressed as the sum of clique functions. HZ S XH ZL can be used to replace likelihood function. The pseudo- ;( f ) ( f ). likelihood does not involve the partition function Z( ). A Hence it is much easier to be calculated. AZ/ To a homogeneous MRF, the potential function is independent of locations. Thus, the number of clique 3. MCMC estimation of MRF parameters potentials can be reduced to the number of clique types, that is, each potential corresponding to a clique type. According to Bayesian theorem, the posterior distribu- Consider a multi-level logistic (MLL) model [4]. Let tion of conditional on f is L+1,2, m, be the label set and ( ,2, ) be the L parameter vector for clique potentials where each com- P( )P( f ) ponent corresponds to a clique type. Consider the distri- P( f ) JP( )P( f ). (4) bution of Gibbs form, P( )P( f ) d According to Gilks et al. [8], any features of the posterior P(Ff, )JP(Ff )Z( ) exp(!;( f, )), (1) distribution are legitimate for Bayesian inference: mo- ments, quantiles, highest posterior density regions, etc. where ;( f, ) is energy function and depends linearly on All these quantiles can be expressed in terms of posterior . Suppose H( f )(H ,2, H ) is the histogram of L expectations of functions of . The posterior expectation cliques of f, n denotes the index of clique type. Let of a function g( ) is 1 if z0, (z) g( )P( )P( f ) d 0 otherwise. E[g( ) f ] . (5) P( )P( f ) d H 2 ( f !f )!1 , i1,2, n, The integrations in this expression are di$cult to be G H HY solved in Bayesian inference. Monte Carlo integration HZ1 HYZAG H
  • 3. PR 1181 L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925 1921 including Markov Chain Monte Carlo (MCMC) of say m iterations, + R, tm#1,2, n, will be depen- approach [6] can be used to deal with the di$culty [8]. dent samples approximately from (.). Let The task is to evaluate the expectation 1 L M R. (7) g( )P( ) d n!m E(g( )) . (6) RK P( ) d This is an ergodic average. Convergence to the re- quired expectation is ensured by the ergodic theorem. A Markov chain can be adopted for the purpose of Eq. (7) shows how a Markov chain can be used to evaluation. Suppose we generate a sequence of random estimate E( f ). Such a Markov chain can be constructed variables + , ,2,. At each time t*0, the next state by Metropolis}Hastings algorithm [8]. At each time t, R is sampled from a distribution P( R R) which de- the next state R is chosen by rst sampling a candidate pends only on the current state R of the chain. This point from a proposal distribution q( R). The choice Markov chain is assumed to be time-homogeneous. of proposal distribution is almost arbitrary; here a Thus, the sequence will gradually converge to a unique multivariate normal distribution centered on the cur- stationary distribution (.). After a su$cient long burn-in rent value R is adopted. The candidate is accepted Fig. 1. Textures used in the experiment. (a) Number of graylevels M2, (b) M2, (c) M4, (d) M4.
  • 4. PR 1181 1922 L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925 with probability The Metropolis}Hastings algorithm can be sum- marized in the following procedures: P( f )q( R) ( R, )min 1, . Initialize ; set t0 and ¹maximum number of P( R f )q( R ) iteration While t(¹ The transition kernel for the Metropolis-Hastings algo- BEGIN rithm is Sample a point from q(. R) Sample a uniform (0, 1) random variable v P( R R)q( R R) ( R, R)#I( R R) If v) ( R, ), set R . Otherwise set R R Increment t ; 1! q( R) ( R, )d END where I(.) denotes the indicator function (taking 1 when 4. Experiments its argument is true, and 0 otherwise). If the candidate is accepted, the next state becomes R , otherwise In order to inspect the performance of the method R R. Since P( f)JP( )P(f ) and the prior P( ) can proposed in this paper, a Gibbs sampler [4] is used to be assumed to be #at when the prior information is sample textures with the specied parameters. Here a sec- totally unavailable, ond-order neighborhood system is used and four double- site cliques + , , , , corresponding to 03, 903, P( f )q( R) 453 and 1353 individually are adopted as non-zero para- ( R, )min 1, meters. Fig. 1 shows four 128;128 textures generated P( R f )q( R ) from the Gibbs Sampler. The rst two textures are sam- P( f )q( R) pled with two graylevels and the next two textures are min 1, . (8) sampled with four graylevels. The parameters of the four P( f R)q( R ) textures are listed in Table 1. In order to get acceptable parameters, the MCMC procedure described in the pre- Since the choice of proposal distribution here is normal vious section should be repeated until stability of the centered on the current value, q( R)q( R ) due Markov chain is reached. The choice of starting values to the symmetric property of the proposed distribution. will not a!ect the stationary distribution if the chain is Thus, the acceptance probability formula can be irreducible. In our experiments, are chosen randomly. reduced to The usual informal approach to detection of convergence is visual inspection of plots of the Monte-Carlo output P( f ) + R, t1,2, n,. From Figs. 2}5, three independent sam- ( R, )min 1, . (9) P( f R) ples of Markov chains for texture 4 are given. From the Thus, the Metropolis}Hastings algorithm is switched to Metropolis algorithm. When we use pseudo-likelihood to represent the likelihood function, we get ( R, )min(1, exp(P¸( f )!P¸( f R))) min 1, exp (1 , H( f )2!1 R, H( f )2) H H HZ1 !log exp(1 , H(z )2) H XH ZL #log exp(1 R, H(z )2) . (10) H XH ZL With this acceptance probability, the can be approxi- Fig. 2. 1000 iterations with di!erent starting values for estima- mated e!ectively. ting for texture 4.
  • 5. PR 1181 L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925 1923 Fig. 3. 1000 iterations with di!erent starting values for estima- Fig. 4. 1000 iterations with di!erent starting values for estima- ting for texture 4. ting for texture 4. gures, we observe that the length of burn-in depends on creased. Initially, we set n1000. If the estimates M do . The Markov chains converge in less than 300 iter- not agree adequately, we increase 500 iterations each ations in most examples according to visual inspection of time until estimates are similar. We only need to inspect the monitoring statistics. Here we set burn-in m500. the mean M and variance of the Monte}Carlo output. In More formal methods for convergence diagnostics can be our experiments in Table 1, n1000 is enough. The found in Refs. [9,10]. Decision about the iteration num- results of MCMC approach in Table 1 are acceptable ber is an important and practical matter. The aim is to where denotes the average standard deviation of run the chain long enough to obtain adequate precision Markov chains after burn-in. In order to verify the per- in the estimator. Here three chains are run in parallel formance of this method, least- squares (LS) t method with di!erent starting values M from Eq. (7). If they do not proposed by Derin and Elliott [7] is also used in agree adequately, the iteration number n must be in- our experiments. From Table 1, it can be seen that LS Table 1 MRF parameter estimation Textures Method Texture 1 Specied 1 1 !0.5 !0.5 LS 0.8448 0.8734 !0.4332 !0.4382 MCMC 0.9884 0.9899 !0.5076 !0.5078 0.0436 0.0323 0.0278 0.0400 Texture 2 Specied 1 !0.8 0.5 !0.5 LS 0.9949 !0.8157 0.4960 !0.3244 MCMC 1.0093 !0.8586 0.5569 !0.4522 0.0147 0.0235 0.0168 0.0245 Texture 3 Specied 0.3 0.3 0.3 0.3 LS 0.1152 0.1520 0.1867 0.1444 MCMC 0.3478 0.2762 0.2960 0.2877 0.0266 0.0165 0.0130 0.0086 Texture 4 Specied 0.5 1 !0.5 0.7 LS 0.0415 0.4364 0 0.4201 MCMC 0.5525 1.0394 !0.5951 0.6810 0.0321 0.0364 0.0490 0.0156
  • 6. PR 1181 1924 L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925 cation and segmentation as well as image restoration. MRF parameter estimation plays an important role in MRF modeling. In order to estimate MRF parameter e!ectively and e$ciently, an MRF parameter estimation method based on MCMC is proposed in this paper. A Markov chain is constructed to sample the MRF parameters via Monte Carlo method. MLL model is used as image model. In order to avoid to calculate the nor- malizing partition function, pseudo-likelihood function is used to represent likelihood function. Compared to least-squares t method, our method is more accurate and can be used for multi-graylevel texture parameter estimation e!ectively as seen from the experiments in the paper. This method can be extended to be used in multi- resolution analysis of texture modeling and segmentation of textured images. Fig. 5. 1000 iterations with di!erent starting values for estima- ting for texture 4. method is e!ective only to the textures with two gray- Acknowledgements levels, while MCMC method is e!ective to all examples in the experiments even more graylevels are adopted in We wish to thank the constructive comments and the model. From the comparison, MCMC method pro- suggestions of the reviewers. posed in this paper is much better than LS method. The MCMC routines are run on a Sun Ultra 2 workstation, each analysis takes less than 3 min to perform 1000 iterations. References [1] J. Besag, Spatial interaction and the statistical analysis of 5. Conclusion lattice systems, J. Roy. Statist. Soc. Ser. B 36 (1974) 192}236. Markov random elds (MRFs) modeling is a popular [2] S. Barker, Image segmentation using Markov random eld pattern analysis method. It can be used in texture classi- models, Ph.D. Thesis, University of Cambridge, 1998. cation and segmentation as well as image restoration. [3] S. Geman, D. Geman, Stochastic relaxation, Gibbs distri- MRF parameter estimation plays an important role in bution and the Bayesian restoratopn of images, IEEE MRF modeling. In order to estimate MRF parameter Trans. PAMI 6 (6) (1984) 721}741. [4] S. Li, Markov Random Field Modeling in Computer e!ectively and e$ciently, an MRF parameter estimation Vision, Springer, New York, 1995. method based on method MCMC is proposed in this [5] D. Chandler, Introduction to Modern Statistical Mechan- paper. A Markov chain is constructed to sample the ics, Oxford University Press, Oxford, 1987. MRF parameters via Monte Carlo method. MLL model [6] R. Aykroyd, Bayesian estimation for homogeneous and is used as image model. In order to avoid to calculate the inhomogeneous Gaussian random elds, IEEE Trans. normalizing partition function, pseudo-likelihood func- Pattern Analysis Mach. Intell. 20 (5) (1998) 533}539. tion is used to represent likelihood function. Compared [7] H. Derin, H. Elliott, Modeling and segmentation of to least-squares t method, the proposed method is more noisy and textured images using Gibbs random elds, accurate and can be used for multilevel-graylevel texture IEEE Trans. Pattern Analysis Mach. Intell. 9 (1) (1987) parameter estimation e!ectively as seen from the experi- 39}55. [8] W. Gilks, S. Richardson, D. Spiegelhalter, Markov ments in the paper. This method can be extended to be chain Monte Carlo in Practice, Chapman Hall, London, used in multiresolution analysis of texture modeling and 1996. segmentation of textured images. [9] M. Cowles, B. Carlin, Markov Chain Monte Carlo conver- gence diagnostics: a comparative review, Technical Re- port, Division of Biostatistics, School of Public Health, University of Minnesota, 1994. 6. Summary [10] S. Brooks, P. Dellaportas, G. Roberts, An approach to diagnosing total variation convergence of MCMC Markov random elds (MRFs) modeling is a popular algorithms, University of Cambridge, http://www.stats. pattern analysis method. It can be used in texture classi- bris.ac.uk/ maspb/mypapers/brodr96.html, 1996.
  • 7. PR 1181 L. Wang et al. / Pattern Recognition 33 (2000) 1919}1925 1925 About the Author*LEI WANG received his B. Eng and M. Eng degrees from Harbin Institute of Technology, China, in 1992 and 1995, respectively. He is currently a Ph.D. candidate in School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include pattern recognition, image compression, image processing, and image retrieval. About the Author*JUN LIU received his B.S. and M.S. degree from Jiao Tong University, Xian, China in 1982 and 1984, respectively. He obtained his Ph.D. degree from Oakland University, MI, USA in 1989. He is currently an associate professor with Division of Information Engineering, School of Electrical and Electronic Engineering, Nanyang Technological University. His research interests include pattern recognition, image processing and multimedia database. About the Author*STAN Z. LI received the B.Sc degree from Hunan University, China, in 1982, M.Sc degree from the National University of Defense Technology, China, in 1985 and Ph.D. degree from the University of Surrey, UK, in 1991. All degrees are in EEE. He is currently a senior lecturer at Nanyang Technological University, Singapore. His research interests include computer vision, pattern recognition, image processing and optimization methods.