Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and Classification

A paper review

Transcript

  • 1. Spatially coherent latent topic model for concurrent object segmentation and classification
    Authors: Liangliang Cao, Li Fei-Fei
    Presenter: Shao-Chuan Wang
  • 2. Outline
    Motivation
    A Review on Graphical Models
    Today’s topic: the paper
    Their Results
  • 3. Motivation: Real-world problems are often full of “noise”
    Bag of words (local features)
    Spatial relationships among objects are ignored (a limitation of this representation)
    When classifying a test image, what is its “subject”?
    Flag?
    Banner?
    People?
    Sports field?
    From Prof. Fei-Fei’s ICCV09 tutorial slide
  • 4. Outline
    Motivation
    A Review on Graphical Models
    Today’s topic: the paper
    Their Results
  • 5. Generative vs Discriminative
    Generative model: model p(x, y) or p(x|y)p(y)
    Discriminative model: model p(y|x)
    [Figure: class-conditional densities p(x|y) for two classes (top) and the posterior p(y|x) (bottom), plotted against x = data]
    From Prof. Antonio Torralba course slide
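
    A minimal sketch of the contrast on toy 1-D data (my own illustration, not from the slides): the generative route models p(x|y) and p(y) and applies Bayes’ rule; the discriminative route fits p(y|x) directly.

        import numpy as np

        rng = np.random.default_rng(0)
        # Toy 1-D data: class 0 centered near x = 20, class 1 near x = 45.
        x = np.concatenate([rng.normal(20, 5, 200), rng.normal(45, 8, 200)])
        y = np.concatenate([np.zeros(200), np.ones(200)])

        # Generative: one Gaussian p(x|y=c) per class plus the prior p(y),
        # combined with Bayes' rule to get p(y|x).
        mu = np.array([x[y == c].mean() for c in (0, 1)])
        sd = np.array([x[y == c].std() for c in (0, 1)])
        prior = np.array([(y == c).mean() for c in (0, 1)])

        def generative_posterior(xq):
            lik = np.exp(-0.5 * ((xq - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
            joint = lik * prior                 # p(x|y) p(y)
            return joint[1] / joint.sum()       # p(y=1|x)

        # Discriminative: fit p(y=1|x) = sigmoid(a*x + b) by gradient ascent
        # on the logistic log-likelihood (x standardized for stability).
        xs = (x - x.mean()) / x.std()
        a, b = 0.0, 0.0
        for _ in range(2000):
            p = 1.0 / (1.0 + np.exp(-(a * xs + b)))
            a += 0.5 * np.mean((y - p) * xs)
            b += 0.5 * np.mean(y - p)

        xq = 30.0
        print(generative_posterior(xq))          # generative estimate of p(y=1|x)
        zq = a * (xq - x.mean()) / x.std() + b
        print(1.0 / (1.0 + np.exp(-zq)))         # discriminative estimate of p(y=1|x)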
  • 6. Naïve Bayesian model
    Generative model: an example
    (c: class, w: visual words)
    Once we have learnt the distributions, we classify a query image by its most probable class.
    [Diagram: Bayesian network with a class node c pointing to word nodes w1, …, wn]
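
    The slide’s equation does not survive the transcript; the standard naïve Bayes decision rule it refers to is (reconstructed, not copied from the slide):

        % Words are conditionally independent given the class:
        P(c \mid w_1,\dots,w_n) \;\propto\; P(c)\prod_{i=1}^{n} P(w_i \mid c),
        \qquad
        c^{*} = \arg\max_{c}\; P(c)\prod_{i=1}^{n} P(w_i \mid c)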

  • 7. Generative model: Another example
    Mixture Gaussian Model
    How can we infer the parameters from unlabeled data, even if we
    know the underlying structure of the probability distribution?
  • 8. A graphical model
    [Diagram: directed graph with nodes for the object class c (prior P(c)), mean μ (P(μ|c)), inverse variance γ (P(γ|c)), and observed data x (P(x|μ,γ))]
    • Directed graph
    • 9. Nodes represent variables (here c, μ, γ are hidden; x is observed)
    • Links show dependencies
    • 10. Conditional distributions at each node
  • Inference of latent variables
    Expectation maximization (EM)
    “Soft-guess” the latent variables first (E-step)
    Treating those soft assignments as correct, solve the optimization problem for the parameters (M-step); a toy sketch follows this slide
    • Markov-chain Monte Carlo (MCMC)
    • 11. Use Gibbs sampling from the posterior
    • 12. Slow to converge
    • 13. Variational method / Variational Message Passing (VMP)
    • 14. Algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003)
    Image from Wikipedia
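
    As a concrete instance of the E/M alternation above (a toy example of my own, not from the paper): EM for a two-component 1-D Gaussian mixture.

        import numpy as np

        rng = np.random.default_rng(1)
        x = np.concatenate([rng.normal(-2, 1.0, 300), rng.normal(3, 1.5, 300)])

        # Initial guesses for mixture weights, means, standard deviations.
        pi = np.array([0.5, 0.5])
        mu = np.array([-1.0, 1.0])
        sd = np.array([1.0, 1.0])

        for _ in range(100):
            # E-step: "soft-guess" the latent assignments (responsibilities).
            dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) \
                   / (sd * np.sqrt(2 * np.pi))
            resp = dens / dens.sum(axis=1, keepdims=True)
            # M-step: re-estimate parameters as if the soft guesses were correct.
            nk = resp.sum(axis=0)
            pi = nk / len(x)
            mu = (resp * x[:, None]).sum(axis=0) / nk
            sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

        print(pi, mu, sd)   # roughly [0.5, 0.5], [-2, 3], [1, 1.5]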
  • 15. Outline
    Motivation
    A Review on Graphical Models
    Today’s topic: the paper
    Their Results
  • 16. Back to the topic: the paper
    [Figure: bag-of-words representation vs. over-segmentation of the same image]
    Key Ideas:
    Latent topics are spatially coherent
    Generate the topic distribution at the region level
    Over-segment first, then merge regions that share the same topic
    This avoids obtaining regions larger than the objects
    One topic per region
    Can recognize objects under occlusion
    • Describe a region (a schematic follows this slide):
    • 17. Homogeneous appearance ar: average of the color or texture features
    • 18. SIFT-based visual words: wr
    • 19. Concurrent segmentation and classification
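
    A schematic of the two region descriptors (array layouts and names are my own illustration; the slides only specify the averaging and the visual-word assignment):

        import numpy as np

        def region_descriptors(color_feat, word_ids, region_mask, vocab_size):
            """ar: average color/texture feature over the region (homogeneous appearance).
            wr: the SIFT-based visual word indices falling inside the region.
            color_feat:  (H, W, D) per-pixel color/texture features
            word_ids:    (H, W) visual-word index per pixel, -1 where no word fires
            region_mask: (H, W) boolean mask of one over-segmented region
            """
            ar = color_feat[region_mask].mean(axis=0)        # appearance descriptor
            ids = word_ids[region_mask]
            wr = ids[ids >= 0]                               # words inside the region
            hist = np.bincount(wr, minlength=vocab_size)     # histogram view of wr
            return ar, wr, hist

        # Toy call: 4x4 image, 3-D color features, vocabulary of 5 visual words.
        rng = np.random.default_rng(0)
        feats = rng.random((4, 4, 3))
        words = rng.integers(-1, 5, size=(4, 4))
        mask = np.zeros((4, 4), dtype=bool)
        mask[:2, :2] = True
        ar, wr, hist = region_descriptors(feats, words, mask, vocab_size=5)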
  • Spatial Latent Topic Model
    Notation:
    Image Id
    Region r ∈ {1,2,…,Rd}
    Latent topic zr ∈ {1,2,…,K}
    Appearance ar ∈ {1,2,…,A}
    Visual words wr = (wr1, wr2, …, wrMr); each wrm ∈ {1,2,…,W}
    P(zr |θd):
    topic probability (Multinomial distribution) parameterized by θd
    P(θd|λ):
    Dirichlet prior of θd, parameterized by λ
    α, β:
    parameters describing the probability of generating appearance and visual words given topic
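
    Putting the notation together, the generative process corresponds to a joint distribution of roughly this form (my reconstruction from the definitions above; the paper’s own equation may differ in detail):

        P(\theta_d, \mathbf{z}, \mathbf{a}, \mathbf{w} \mid \lambda, \alpha, \beta)
          = P(\theta_d \mid \lambda)
            \prod_{r=1}^{R_d} P(z_r \mid \theta_d)\, P(a_r \mid z_r, \alpha)
            \prod_{m=1}^{M_r} P(w_{rm} \mid z_r, \beta)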
  • 20. Spatial Latent Topic Model (Unsupervised)
    [Plate diagram: θd has the Dirichlet prior P(θd|λ); each region’s topic zr is drawn from the multinomial P(zr|θd); ar and wr are generated given zr]
    Maximize the log-likelihood
    This is an optimization problem whose closed-form solution is intractable
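
    Concretely, the objective is the marginal log-likelihood (reconstructed from the model structure; here P(wr|zr,β) abbreviates the product over the region’s words):

        \mathcal{L}(\lambda, \alpha, \beta)
          = \sum_{d} \log \int P(\theta_d \mid \lambda)
            \prod_{r=1}^{R_d} \sum_{z_r=1}^{K}
            P(z_r \mid \theta_d)\, P(a_r \mid z_r, \alpha)\, P(w_r \mid z_r, \beta)
            \; d\theta_d

    The integral over θd coupled with the sums over the zr is what rules out a closed-form solution.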
  • 21. Variational Message Passing (Winn 2005)
    Coupling among the hidden variables θ, α, β makes the maximization intractable
    Instead, maximize a lower bound on L
    Goal: find a tractable Q(H) that closely approximates the true posterior distribution P(H|V) (the underlying decomposition holds for any distribution Q)
    Or, equivalently, minimize KL(Q||P)
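
    The decomposition behind this slide, in standard form (from the variational-inference literature, not copied from the paper):

        \ln P(V) = \mathcal{L}(Q) + \mathrm{KL}\big(Q(H) \,\|\, P(H \mid V)\big),
        \qquad
        \mathcal{L}(Q) = \sum_{H} Q(H) \ln \frac{P(H, V)}{Q(H)}

    Since KL ≥ 0, L(Q) is a lower bound on ln P(V), with equality exactly when Q(H) = P(H|V); maximizing L(Q) over Q is therefore the same as minimizing the KL term.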
  • 22. Variational Message Passing (Winn 2005)
    Further factorization assumptions (Jordan et al., 1999; Jaakkola, 2001; Parisi, 1988) restrict the family of distributions Q
    Under this factorization, the bound splits into an expected log-joint plus an entropy term (see below)
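
    The factorized form referred to above (standard mean-field, reconstructed rather than copied):

        Q(H) = \prod_{i} Q_i(H_i),
        \qquad
        \mathcal{L}(Q) = \big\langle \ln P(H, V) \big\rangle_{Q} + \sum_{i} \mathbb{H}[Q_i]

    where ⟨·⟩_Q is the expectation under the factorized Q and H[Qi] is the entropy of the factor Qi, the slide’s “entropy term”.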
  • 23. Variational Message Passing (Winn 2005)
    Optimizing one factor at a time yields the update in Eqn. (6) of the paper
    [Diagram: Bayesian-network representation; the update for a node involves only its Markov blanket (parents, children, and co-parents)]
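
    The generic per-factor update this refers to (standard VMP form from Winn 2005; the paper’s Eqn. (6) specializes it to this model):

        \ln Q_j^{*}(H_j)
          = \big\langle \ln P(H, V) \big\rangle_{\prod_{i \neq j} Q_i} + \text{const}

    For a Bayesian network, only the factors that mention H_j survive the expectation, which is why each update depends only on the node’s Markov blanket.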
  • 24. Spatial Latent Topic Model (Supervised)
    θ now becomes a C × K matrix, i.e. θ depends on the observed class label c
    For a query image Id, find its most probable category c:
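
    That is, something of the following form (my reconstruction of the slide’s missing equation):

        c^{*} = \arg\max_{c}\; P(c \mid I_d)
              = \arg\max_{c}\; P(\mathbf{a}_d, \mathbf{w}_d \mid c, \lambda, \alpha, \beta)\, P(c)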
  • 25. Process
    Training step
    Maximize the total likelihood of the training images with respect to λ, α, θ and zr
    The learned λ, α are then fixed
    Testing phase: for a query image Id
    Estimate its θd and zr
    For the classification task, report its most probable latent topic as its category
    For the segmentation task, merge the regions that share the same zr (a sketch of this step follows)
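
    A runnable sketch of the merge step only (the slide says regions with the same zr are merged; adjacency handling is omitted here):

        from collections import defaultdict

        def merge_regions_by_topic(region_ids, z):
            """Group over-segmented regions by their inferred topic z_r;
            regions sharing a topic form one merged segment."""
            segments = defaultdict(list)
            for r in region_ids:
                segments[z[r]].append(r)
            return dict(segments)   # topic -> list of merged region ids

        # Toy usage: five regions, topics z_r inferred upstream.
        print(merge_regions_by_topic([1, 2, 3, 4, 5],
                                     {1: 7, 2: 7, 3: 2, 4: 7, 5: 2}))
        # {7: [1, 2, 4], 2: [3, 5]}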
  • 26. Outline
    Motivation
    A Review on Graphical Models
    Today’s topic: the paper
    Their Results
  • 27. Experimental Results
    Unsupervised segmentation
    Occlusion case:
  • 28. Experimental Results
    Supervised segmentation
    Dataset
    13 classes of natural scenes
    # of training images: 100
    # of topics: 60
    # of categories: 13
  • 29. Experimental Results
    Supervised classification
    Dataset
    28 classes from Caltech 101
    # of training images: 30
    # of test images: 30
    # of topics in category: 28
    # of topics in clutter: 34
    6 background classes are left unlabeled
  • 30. ~ Thank you ~
  • 31. Variational Message Passing
    Following this framework, and using the graphical model provided by this paper: