Spatially Coherent Latent Topic Model for Concurrent Object Segmentation and Classification
Authors: Liangliang Cao, Li Fei-Fei
Presenter: Shao-Chuan Wang
Outline
- Motivation
- A Review on Graphical Models
- Today's topic: the paper
- Their Results
Motivation: real-world problems are often full of "noise"
- Bags of words (local features): the spatial relationships of objects are ignored, which has its limits
- When classifying a test image, what is its "subject"? Flag? Banner? People? Sports field?
(From Prof. Fei-Fei's ICCV09 tutorial slide)
Outline
- Motivation
- A Review on Graphical Models
- Today's topic: the paper
- Their Results
Generative vs. Discriminative
- Generative model: model p(x, y) or p(x|y)p(y)
- Discriminative model: model p(y|x)
[Plots of p(x|y) and p(y|x) over the data x omitted]
(From Prof. Antonio Torralba's course slide)
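To make the contrast concrete, here is a minimal sketch (not from the slides; the toy 1-D data and training details are assumptions) showing both routes to p(y|x): a generative model fits p(x|y) and applies Bayes' rule, while a discriminative logistic regression fits p(y|x) directly.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data: class 0 ~ N(-1, 1), class 1 ~ N(+1, 1).
x = np.concatenate([rng.normal(-1, 1, 500), rng.normal(1, 1, 500)])
y = np.repeat([0, 1], 500)

# Generative route: model p(x|y) and p(y), then use Bayes' rule for p(y|x).
mu0, mu1 = x[y == 0].mean(), x[y == 1].mean()
def p_y1_given_x(q):  # equal priors and unit variances assumed
    l0 = np.exp(-(q - mu0) ** 2 / 2)
    l1 = np.exp(-(q - mu1) ** 2 / 2)
    return l1 / (l0 + l1)

# Discriminative route: model p(y|x) directly (logistic regression by
# plain gradient descent on the mean log-loss).
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(w * x + b)))
    w -= 0.1 * ((p - y) * x).mean()
    b -= 0.1 * (p - y).mean()

# Both routes give a similar posterior at a test point.
print(p_y1_given_x(0.5), 1 / (1 + np.exp(-(w * 0.5 + b))))
```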
Generative model: an example
- Naïve Bayes model (c: class, w: visual words), drawn as a Bayesian network: class node c with children w1, …, wn
- Once we have learned the distributions P(c) and P(wi|c), we can classify a query image via the posterior P(c|w) ∝ P(c) Πi P(wi|c)
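A minimal, self-contained sketch of that classification rule (the class priors and word likelihoods below are made-up toy numbers; in practice they are learned from word counts in training images):

```python
import numpy as np

# Toy naive Bayes over visual words: P(c | w) ∝ P(c) * Π_i P(w_i | c).
log_prior = np.log(np.array([0.5, 0.5]))            # P(c), 2 classes
word_given_class = np.array([[0.7, 0.1, 0.1, 0.1],  # P(w | c=0), 4 words
                             [0.1, 0.1, 0.4, 0.4]]) # P(w | c=1)

def classify(word_ids):
    """Most probable class for a bag of visual-word indices."""
    log_post = log_prior + np.log(word_given_class[:, word_ids]).sum(axis=1)
    return int(np.argmax(log_post))

print(classify([0, 0, 1]))  # -> 0 (words typical of class 0)
print(classify([2, 3, 3]))  # -> 1 (words typical of class 1)
```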
Generative model: another example
- Gaussian mixture model
- How do we infer the parameters from unlabeled data, even when we know the structure of the underlying probability distribution?
A graphical model (a directed graph)
- Object class c, with prior P(c)
- Mean μ and inverse variance γ, with P(μ|c) and P(γ|c)
- Observed data x, with P(x|μ, γ)
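The directed graph encodes a factorization of the joint distribution; for this example it reads:

```latex
P(c, \mu, \gamma, x) = P(c)\, P(\mu \mid c)\, P(\gamma \mid c)\, P(x \mid \mu, \gamma)
```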
- Nodes represent variables (observed or hidden)
- Links show dependencies
- A conditional distribution sits at each node

Inference of latent variables:
- Expectation maximization (EM): first make a "soft guess" of the latent variables (E-step); then, assuming that guess is correct, solve the resulting optimization problem for the parameters (M-step)
- Markov chain Monte Carlo (MCMC)
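As an illustration of the E-step/M-step alternation (a minimal sketch, not the paper's model: a 1-D two-component Gaussian mixture with made-up data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Unlabeled 1-D data from two Gaussians; EM never sees the labels.
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

# Initial guesses for mixture weights, means, and variances.
pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

for _ in range(50):
    # E-step: "soft guess" of each point's latent component (responsibilities).
    dens = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters as if the soft assignments were correct.
    nk = resp.sum(axis=0)
    pi, mu = nk / len(x), (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print(mu)  # converges near the true means [-2, 3]
```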
- MCMC: use Gibbs sampling from the posterior, drawing each variable in turn from its conditional given all the others
- Can be slow to converge
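A minimal Gibbs-sampling sketch (a toy bivariate Gaussian target, not the paper's model) showing the draw-each-variable-in-turn pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                 # correlation of a toy bivariate Gaussian target
x = y = 0.0
samples = []
for _ in range(5000):
    # Gibbs sampling: alternately draw each variable from its conditional.
    x = rng.normal(rho * y, np.sqrt(1 - rho ** 2))  # x | y
    y = rng.normal(rho * x, np.sqrt(1 - rho ** 2))  # y | x
    samples.append((x, y))

# Discard burn-in; the sample correlation approaches rho = 0.8.
print(np.corrcoef(np.array(samples[500:]).T)[0, 1])
```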
Variational methods / Variational Message Passing (VMP)
- Algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003)
(Image from Wikipedia)
Outline
- Motivation
- A Review on Graphical Models
- Today's topic: the paper
- Their Results
Back to the topic: the paper
Key ideas:
- Latent topics are spatially coherent
- Generate the topic distribution at the region level (a bag of words per region) rather than per image
- Over-segment the image first, then merge regions that share a topic (see the merging sketch after the region descriptors below); this avoids obtaining regions larger than the objects
- One topic per region
- Can recognize objects under occlusion
Describe a region by:
- Homogeneous appearance ar: the average of color or texture features over the region
- SIFT-based visual words: wr
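A minimal sketch of the merge-by-topic idea (hypothetical data structures, not the paper's code): given the over-segmentation's region adjacency and one topic label per region, connected regions with the same topic are merged into object segments.

```python
# Hypothetical illustration of "over-segment, then merge regions that
# share a topic"; region ids, adjacency, and topic labels are made up.
adjacency = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}    # region graph
topic = {0: 'cow', 1: 'cow', 2: 'grass', 3: 'grass'}  # one topic per region

def merge_by_topic(adjacency, topic):
    """Flood-fill: union adjacent regions that carry equal topic labels."""
    segments, visited = [], set()
    for seed in adjacency:
        if seed in visited:
            continue
        group, stack = [], [seed]
        while stack:
            r = stack.pop()
            if r in visited:
                continue
            visited.add(r)
            group.append(r)
            stack.extend(n for n in adjacency[r] if topic[n] == topic[r])
        segments.append((topic[seed], sorted(group)))
    return segments

print(merge_by_topic(adjacency, topic))
# [('cow', [0, 1]), ('grass', [2]), ('grass', [3])]
```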
Concurrent segmentation and classification: the Spatial Latent Topic Model
Notation:
- Image Id
- Regions r = 1, 2, …, Rd
- Latent topic zr ∈ {1, 2, …, K}
- Appearance ar ∈ {1, 2, …, A}
- Visual words wr = (wr1, wr2, …, wrMr); each wrm ∈ {1, 2, …, W}
- P(zr|θd): topic probability (multinomial distribution) parameterized by θd
- P(θd|λ): Dirichlet prior on θd, parameterized by λ
- α, β: parameters describing the probability of generating the appearance and the visual words given the topic
Spatial Latent Topic Model (unsupervised)
- θd is multinomial, with a Dirichlet prior
- Learn by maximizing the log-likelihood: an optimization problem whose closed-form solution is intractable (a reconstruction of the likelihood is sketched below)
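From the notation above, the per-image likelihood being maximized has the following form (a reconstruction consistent with the slide's definitions, not copied verbatim from the paper):

```latex
P(I_d \mid \lambda, \alpha, \beta)
  = \int P(\theta_d \mid \lambda)
    \prod_{r=1}^{R_d} \sum_{z_r=1}^{K}
      P(z_r \mid \theta_d)\, P(a_r \mid z_r, \alpha)
      \prod_{m=1}^{M_r} P(w_{rm} \mid z_r, \beta)\; d\theta_d
```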
Variational Message Passing (Winn 2005)
- The coupling among the hidden variables θ, α, β makes the maximization intractable
- Instead, maximize a lower bound L(Q) of the log-likelihood
- Goal: find a tractable Q(H) that closely approximates the true posterior distribution P(H|V); the underlying decomposition holds as an equality for any distribution Q, so maximizing L(Q) is equivalent to minimizing KL(Q||P)
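The decomposition referred to on the slide is the standard one (H: hidden variables, V: observed variables):

```latex
\ln P(V)
  = \underbrace{\sum_{H} Q(H) \ln \frac{P(H, V)}{Q(H)}}_{\mathcal{L}(Q)}
  + \underbrace{\sum_{H} Q(H) \ln \frac{Q(H)}{P(H \mid V)}}_{\mathrm{KL}(Q\,\|\,P)},
\qquad \mathrm{KL}(Q\,\|\,P) \ge 0
```

so ln P(V) ≥ L(Q), with equality exactly when Q(H) = P(H|V).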
Variational Message Passing (Winn 2005)
- Further factorization assumption (Jordan et al., 1999; Jaakkola, 2001; Parisi, 1988), restricting the family of distributions Q: Q(H) = Πi Qi(Hi)
- For each factor Qj, the bound then decomposes as
  L(Q) = ∫ Qj(Hj) ⟨ln P(H, V)⟩~Qj dHj + H[Qj] + const,
  where ⟨·⟩~Qj is the expectation over all factors except Qj and H[Qj] is the entropy term
Variational Message Passing (Winn 2005)
- The resulting factor update is Eqn. (6) in the paper
- Bayesian-network representation
- Markov blanket: the update for each node only needs messages from its parents, its children, and its children's co-parents
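In VMP form, the optimal factor update depends only on the node's Markov blanket (a standard statement of Winn's result, not a verbatim copy of the paper's Eqn. (6); pa_j denotes the parents of node j, ch_j its children):

```latex
\ln Q_j^{*}(H_j)
  = \big\langle \ln P(H_j \mid \mathrm{pa}_j) \big\rangle_{\sim Q_j}
  + \sum_{k \in \mathrm{ch}_j} \big\langle \ln P(H_k \mid \mathrm{pa}_k) \big\rangle_{\sim Q_j}
  + \text{const}
```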
Spatial Latent Topic Model (supervised)
- θ now becomes a C × K matrix, i.e., θ depends on the observed class label c
- For a query image Id, find its most probable category c (see below)
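The classification rule the slide refers to, reconstructed from the setup (a hedged reconstruction; the paper's exact expression may differ):

```latex
c^{*} = \arg\max_{c} P(c \mid I_d) = \arg\max_{c} P(I_d \mid c)\, P(c)
```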

Process
- Training step: maximize the total likelihood of the training images with respect to λ, α, θ, and zr; the learned λ and α are then fixed
- Testing phase, for a query image Id: estimate its θd and zr
- For the classification task: report its most probable latent topic as its category
- For the segmentation task: merge regions that share the same zr (Eq. (3))
Outline
- Motivation
- A Review on Graphical Models
- Today's topic: the paper
- Their Results
Experimental Results: supervised segmentation
- Dataset: 13 classes of nature scenes
- # of training images: 100
- # of topics: 60
- # of categories: 13
Experimental Results: supervised classification
- Dataset: 28 classes from Caltech 101
- # of training images: 30; # of test images: 30
- # of topics in category: 28; # of topics in clutter: 34
- 6 background classes are left unlabeled
Variational Message Passing
Following this framework, we use the graphical model provided by this paper to derive the update equations.