Spatially Coherent Latent Topic Model For Concurrent Object Segmentation and Classification - Presentation Transcript
Spatially coherent latent topic model for concurrent object segmentation and classification. Authors: Liangliang Cao, Li Fei-Fei. Presenter: Shao-Chuan Wang.
Outline: Motivation; A Review on Graphical Models; Today’s topic: the paper; Their Results
Motivation: Real-world problems are often full of “noise”. Bag of words (local features): spatial relationships among objects are ignored, which limits the model. When classifying a test image, what is its “subject”? Flag? Banner? People? Sports field? (From Prof. Fei-Fei’s ICCV09 tutorial slides.)
Generative vs. Discriminative. Generative model: model p(x, y), or p(x|y)p(y). Discriminative model: model p(y|x). [Figure: two plots over x = data.] From Prof. Antonio Torralba’s course slides.
Generative model: An example. Naïve Bayes model (c: class, w: visual words), drawn as a Bayesian network: c → w1, …, wn. Once we have learnt the distribution, for a query image we pick the class c that maximizes p(c) ∏n p(wn|c).
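As a rough illustration of the Naïve Bayes example, the sketch below learns p(c) and p(w|c) from lists of visual-word indices and classifies a query image by the largest log p(c) + Σ log p(wn|c). The function names, the Laplace smoothing, and the toy data are assumptions made for illustration; they are not from the slides.

```python
import math

def train_naive_bayes(images, labels, vocab_size, smoothing=1.0):
    """Estimate log p(c) and log p(w|c) from visual-word index lists (Laplace-smoothed)."""
    log_prior, log_lik = {}, {}
    for c in sorted(set(labels)):
        docs = [img for img, y in zip(images, labels) if y == c]
        log_prior[c] = math.log(len(docs) / len(images))
        counts = [smoothing] * vocab_size
        for img in docs:
            for w in img:
                counts[w] += 1
        total = sum(counts)
        log_lik[c] = [math.log(n / total) for n in counts]
    return log_prior, log_lik

def classify(image, log_prior, log_lik):
    """Return the class maximizing log p(c) + sum_n log p(w_n | c)."""
    return max(log_prior,
               key=lambda c: log_prior[c] + sum(log_lik[c][w] for w in image))

# Toy data (made up): words 0-1 are typical of "flag", words 2-3 of "field".
train_images = [[0, 1, 0], [1, 0, 0, 1], [2, 3, 3], [3, 2, 2, 3]]
train_labels = ["flag", "flag", "field", "field"]
log_prior, log_lik = train_naive_bayes(train_images, train_labels, vocab_size=4)
prediction = classify([0, 0, 1, 3], log_prior, log_lik)
```

Working in log space avoids numerical underflow when an image contains many visual words.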
Generative model: Another example. Gaussian mixture model. How do we infer from unlabeled data, even if we know the underlying structure of the probability distribution?
A graphical model: object class c with P(c); mean μ with P(μ|c); inverse variance γ with P(γ|c); observed data x with P(x|μ,γ).
Directed graph
Nodes represent variables
Hidden
Links show dependencies
Conditional distributions at each node
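The directed graph above factorizes the joint as P(c) P(μ|c) P(γ|c) P(x|μ,γ), so a sample can be drawn node by node, parents first (ancestral sampling). A minimal sketch follows; the concrete distribution families (Gaussian for μ, Gamma for γ, Gaussian for x) and all parameter values are assumptions chosen for illustration, since the slide only names the conditionals.

```python
import random

def sample_from_graph(rng, class_prior, mean_params, precision_params):
    """Ancestral sampling for the graph c -> mu, c -> gamma, (mu, gamma) -> x."""
    classes = list(class_prior)
    # c ~ P(c)
    c = rng.choices(classes, weights=[class_prior[k] for k in classes])[0]
    # mu ~ P(mu | c), assumed Gaussian with class-specific (mean, std)
    m0, s0 = mean_params[c]
    mu = rng.gauss(m0, s0)
    # gamma ~ P(gamma | c), assumed Gamma with class-specific (shape, scale)
    shape, scale = precision_params[c]
    gamma = rng.gammavariate(shape, scale)
    # x ~ P(x | mu, gamma), Gaussian with mean mu and variance 1 / gamma
    x = rng.gauss(mu, gamma ** -0.5)
    return c, mu, gamma, x

rng = random.Random(0)
class_prior = {"car": 0.3, "face": 0.7}                      # hypothetical classes
mean_params = {"car": (0.0, 1.0), "face": (5.0, 1.0)}        # hypothetical values
precision_params = {"car": (2.0, 1.0), "face": (2.0, 0.5)}   # hypothetical values
c, mu, gamma, x = sample_from_graph(rng, class_prior, mean_params, precision_params)
```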
Inference of latent variables: Expectation maximization (EM). First make a “soft guess” of the latent variables (E-step); then, treating that guess as correct, solve an optimization problem for the parameters (M-step).
Variational methods: algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003)
Image from Wikipedia
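To make the E-step and M-step concrete, here is a small sketch of EM for a one-dimensional Gaussian mixture, the model from the earlier example slide. The function names, the initialization (unit variances, equal weights), the variance floor, and the toy data are assumptions for illustration only.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, init_means, n_iter=50):
    """EM for a 1-D Gaussian mixture; returns (means, variances, weights)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: soft assignment (responsibility) of each point to each component.
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate the parameters, treating the responsibilities as fixed.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return means, variances, weights

# Toy unlabeled data (made up): two well-separated clusters.
data = [0.9, 1.0, 1.1, 1.2, 4.8, 5.0, 5.1, 5.2]
means, variances, weights = em_gmm_1d(data, init_means=[0.0, 6.0])
```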
Back to the topic: the paper. Key ideas: latent topics are spatially coherent; generate the topic distribution at the region level; over-segment, then merge regions with the same topic; avoid obtaining regions larger than the objects; one topic per region; can recognize objects under occlusion. [Figure labels: bag of words; over-segmentation.]
Describe a region:
Homogeneous Appearance ar: average of color or texture features
SIFT-based visual words: wr
Concurrent segmentation and classification
Spatial Latent Topic Model. Notation: image Id; region r ∈ {1,2,…,Rd}; latent topic zr ∈ {1,2,…,K}; appearance ar ∈ {1,2,…,A}; visual words wr = (wr1, wr2, …, wrMr), with each wri ∈ {1,2,…,W}. P(zr|θd): topic probability (multinomial distribution), parameterized by θd. P(θd|λ): Dirichlet prior on θd, parameterized by λ. α, β: parameters describing the probability of generating appearance and visual words given the topic.
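Read as a generative process, the notation above suggests the following sampling sketch for one image: draw θd from the Dirichlet prior, then for each region draw a single topic zr, an appearance code ar, and the region's visual words. Treating α and β as per-topic multinomial tables, fixing the number of words per region, and using 0-based indices are assumptions for illustration; the function and argument names are hypothetical.

```python
import random

def sample_image(rng, lam, alpha, beta, n_regions, words_per_region):
    """lam: length-K Dirichlet parameter.
    alpha[k]: distribution over the A appearance codes for topic k (assumed multinomial).
    beta[k]: distribution over the W visual words for topic k (assumed multinomial)."""
    # theta_d ~ Dirichlet(lam), drawn as normalized Gamma variates.
    g = [rng.gammavariate(l, 1.0) for l in lam]
    total = sum(g)
    theta = [v / total for v in g]
    regions = []
    for _ in range(n_regions):
        z = rng.choices(range(len(lam)), weights=theta)[0]           # one topic per region
        a = rng.choices(range(len(alpha[z])), weights=alpha[z])[0]   # appearance a_r
        w = rng.choices(range(len(beta[z])), weights=beta[z], k=words_per_region)  # words w_r
        regions.append((z, a, w))
    return theta, regions

rng = random.Random(0)
lam = [1.0, 1.0]                                        # K = 2 topics (made-up values)
alpha = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]              # A = 3 appearance codes
beta = [[0.4, 0.4, 0.1, 0.1], [0.1, 0.1, 0.4, 0.4]]     # W = 4 visual words
theta, regions = sample_image(rng, lam, alpha, beta, n_regions=5, words_per_region=3)
```

Because every word in a region shares that region's topic, the sampled topics are spatially coherent by construction, unlike a per-word bag-of-words topic model.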
Spatial Latent Topic Model (Unsupervised). Multinomial; Dirichlet prior. Maximize the log-likelihood: an optimization problem with no tractable closed-form solution.
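For reference, the log-likelihood implied by the notation slide can be written out as below. This is a reconstruction from that notation, not a copy of the paper's equation: it assumes D training images (D is a symbol introduced here), treats α and β as fixed parameters, and assumes ar and the words wri are conditionally independent given zr.

```latex
\mathcal{L} = \sum_{d=1}^{D} \log P(\mathbf{a}_d, \mathbf{w}_d \mid \lambda, \alpha, \beta),
\qquad
P(\mathbf{a}_d, \mathbf{w}_d \mid \lambda, \alpha, \beta)
= \int P(\theta_d \mid \lambda)
  \prod_{r=1}^{R_d} \sum_{z_r=1}^{K}
  P(z_r \mid \theta_d)\, P(a_r \mid z_r, \alpha)
  \prod_{i=1}^{M_r} P(w_{ri} \mid z_r, \beta)\; d\theta_d
```

The integral over θd sits outside a product of per-region sums, which is what rules out a closed-form maximizer.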
Variational Message Passing (Winn 2005): The coupling between the hidden variables θ, α, β makes the maximization intractable. Instead, maximize a lower bound on L. Goal: find a tractable Q(H) that closely approximates the true posterior distribution P(H|V). For any distribution Q, the log evidence equals the lower bound plus KL(Q||P), so maximizing the bound is equivalent to minimizing KL(Q||P).
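The identity behind this slide, ln P(V) = (lower bound) + KL(Q||P), can be checked numerically on a tiny discrete example. The sketch below uses a made-up joint table over a single hidden variable with three states; the function name and the numbers are illustrative assumptions.

```python
import math

def bound_and_kl(joint, q):
    """joint[h] = P(H=h, V=v) for the observed v; q[h] = Q(H=h).
    Returns (log evidence, lower bound, KL(Q || P(H|V)))."""
    evidence = sum(joint)
    posterior = [p / evidence for p in joint]
    lower_bound = sum(qh * math.log(ph / qh) for qh, ph in zip(q, joint) if qh > 0)
    kl = sum(qh * math.log(qh / post) for qh, post in zip(q, posterior) if qh > 0)
    return math.log(evidence), lower_bound, kl

joint = [0.10, 0.25, 0.05]   # made-up values of P(H=h, V=v)
q = [0.2, 0.5, 0.3]          # an arbitrary approximating distribution
log_evidence, lower_bound, kl = bound_and_kl(joint, q)

# Setting Q to the exact posterior closes the gap.
exact_q = [p / sum(joint) for p in joint]
_, tight_bound, zero_kl = bound_and_kl(joint, exact_q)
```

Since KL is non-negative, the bound never exceeds the log evidence, and it is tight exactly when Q equals the posterior.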
Variational Message Passing (Winn 2005): Further factorization assumptions (Jordan et al., 1999; Jaakkola, 2001; Parisi, 1988) restrict the family of distributions Q. The lower bound then contains an entropy term for each factor of Q.
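For orientation, the standard mean-field form of these quantities is sketched below. It follows the usual presentation of variational message passing rather than reproducing the slide's own equations; here the script H(Qi) denotes the entropy of factor Qi, and the subscript ∼Qj means an expectation over all factors except Qj.

```latex
\begin{align}
Q(H) &= \prod_i Q_i(H_i) \\
\mathcal{L}(Q) &= \big\langle \ln P(H, V) \big\rangle_{Q} + \sum_i \mathcal{H}(Q_i),
\qquad
\mathcal{H}(Q_i) = -\sum_{H_i} Q_i(H_i) \ln Q_i(H_i) \\
\ln Q_j^{*}(H_j) &= \big\langle \ln P(H, V) \big\rangle_{\sim Q_j} + \text{const}
\end{align}
```

Updating one factor Qj to Qj* while holding the others fixed never decreases the bound, which is why cycling through the factors converges to a local optimum.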
Variational Message Passing (Winn 2005): Eqn. (6) in the paper. Bayesian network representation. Markov blanket:
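In a Bayesian network, the Markov blanket of a node consists of its parents, its children, and its children's other parents (co-parents); in variational message passing, a node's update only needs messages from these neighbors. A small sketch of computing it, using the four-node graph from the earlier graphical-model slide as the example (the function name and dictionary layout are assumptions):

```python
def markov_blanket(parents, node):
    """parents maps each node of a DAG to the list of its parents."""
    children = [n for n, ps in parents.items() if node in ps]
    co_parents = {p for child in children for p in parents[child]}
    blanket = set(parents[node]) | set(children) | co_parents
    blanket.discard(node)  # a node is not part of its own blanket
    return blanket

# Graph from the earlier slide: c -> mu, c -> gamma, mu -> x, gamma -> x.
parents = {"c": [], "mu": ["c"], "gamma": ["c"], "x": ["mu", "gamma"]}
blanket_of_mu = markov_blanket(parents, "mu")
```

Here the blanket of mu contains its parent c, its child x, and its co-parent gamma.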
Spatial Latent Topic Model (Supervised): Now it becomes a C × K matrix, i.e. θ depends on the observed c. For a query image Id, find its most probable category c.
Process. Training step: maximize the total likelihood of the training images over λ, α, θ and zr; the learned λ, α are then fixed. Testing phase, for a query image Id: estimate its θd and zr. For the classification task, take its most probable latent topic as its category. For the segmentation task, merge regions that share the same zr.
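The segmentation step, merging over-segmented regions that share a topic, can be sketched with a union-find structure. Requiring the merged regions to be spatially adjacent is an assumption made here (the slide only says to merge regions with the same zr), and the function name and toy inputs are hypothetical.

```python
def merge_regions_by_topic(topics, adjacency):
    """topics[r]: inferred topic z_r of region r.
    adjacency: iterable of (r, s) pairs of neighboring regions.
    Returns a segment id for every region."""
    parent = list(range(len(topics)))

    def find(r):
        # Follow parent links to the root, halving the path along the way.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for r, s in adjacency:
        if topics[r] == topics[s]:
            parent[find(r)] = find(s)  # union the two segments
    return [find(r) for r in range(len(topics))]

# Toy example (made up): five regions in a row with inferred topics 0, 0, 1, 1, 0.
topics = [0, 0, 1, 1, 0]
adjacency = [(0, 1), (1, 2), (2, 3), (3, 4)]
segments = merge_regions_by_topic(topics, adjacency)
```

In this toy case regions 0 and 1 form one segment, regions 2 and 3 another, and region 4 stays separate because its only neighbor has a different topic.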
Experimental Results: Supervised segmentation. Dataset: 13 classes of natural scenes; # of training images: 100; # of topics: 60; # of categories: 13.
Experimental Results: Supervised classification. Dataset: 28 classes from Caltech 101; # of training images: 30; # of test images: 30; # of topics in category: 28; # of topics in clutter: 34; 6 background classes are left unlabeled.
~ Thank you ~
Variational Message Passing: following this framework and using the graphical model provided by the paper: