Object Recognition with Deformable Models

  1. Object Recognition with Deformable Models
     Pedro F. Felzenszwalb, Department of Computer Science, University of Chicago
     Joint work with: Dan Huttenlocher, Joshua Schwartz, David McAllester, Deva Ramanan.
  2. Example Problems
     • Detecting rigid objects
     • PASCAL challenge
     • Detecting non-rigid objects
     • Medical image analysis
     • Segmenting cells
  3. Deformable Models
     • Significant challenge:
       - Handling variation in appearance within object classes
       - Non-rigid objects, generic categories, etc.
     • Deformable models approach:
       - Consider each object as a deformed version of a template
       - Compact representation
       - Leads to interesting modeling and algorithmic problems
  4. Overview
     • Part I: Pictorial Structures
       - Deformable part models
       - Highly efficient matching algorithms
     • Part II: Deformable Shapes
       - Triangulated polygons
       - Hierarchical models
     • Part III: The PASCAL Challenge
       - Recognizing 20 object categories in realistic scenes
       - Discriminatively trained, multiscale, deformable part models
  5. Part I: Pictorial Structures
     • Introduced by Fischler and Elschlager in 1973
     • Part-based models:
       - Each part represents local visual properties
       - “Springs” capture spatial relationships
     • Matching a model to an image involves joint optimization of part locations (“stretch and fit”)
  6. Local Evidence + Global Decision
     • Parts have a match quality at each image location
     • Local evidence is noisy
       - Parts are detected in the context of the whole model
     [Figure: part, test image, match quality]
  7. Matching Problem
     • Model is represented by a graph $G = (V, E)$
       - $V = \{v_1, \ldots, v_n\}$ are the parts
       - $(v_i, v_j) \in E$ indicates a connection between parts
     • $m_i(l_i)$ is a cost for placing part $i$ at location $l_i$
     • $d_{ij}(l_i, l_j)$ is a deformation cost
     • Optimal configuration for the object is $L = (l_1, \ldots, l_n)$ minimizing
       $E(L) = \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j)$
  8. Matching Problem
     $E(L) = \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j)$
     • Assume $n$ parts, $k$ possible locations for each part
       - There are $k^n$ configurations $L$
     • If the graph is a tree we can use dynamic programming
       - $O(nk^2)$ algorithm
     • If $d_{ij}(l_i, l_j) = g(l_i - l_j)$ we can use min-convolutions
       - $O(nk)$ algorithm
       - As fast as matching each part separately!
  9. Dynamic Programming on Trees
     $E(L) = \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j)$
     [Figure: tree with leaf v2 attached to v1]
     • For each $l_1$ find the best $l_2$:
       - $\mathrm{Best}_2(l_1) = \min_{l_2} \left[ m_2(l_2) + d_{12}(l_1, l_2) \right]$
     • “Delete” $v_2$ and solve the problem with the smaller model
     • Keep removing leaves until there is a single part left
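The leaf-removal procedure on slide 9 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `match_tree`, the table layout of `m` and `d`, and the convention that part 0 is the root with `parent[c] < c` for every other part are assumptions made for this example. It folds each leaf into its parent by brute force, which gives the $O(nk^2)$ running time from slide 8.

```python
def match_tree(m, d, parent):
    """Minimise E(L) = sum_i m[i][l_i] + sum over edges d[(p, c)][l_p][l_c] on a tree.

    m[i][l]            cost of placing part i at location l
    parent[c]          parent of part c (part 0 is the root; assumes parent[c] < c)
    d[(p, c)][lp][lc]  deformation cost for the edge between parent p and child c
    """
    n, k = len(m), len(m[0])
    # cost[i][l]: best cost of the subtree rooted at part i when part i sits at l
    cost = [list(row) for row in m]
    argbest = {}
    # Children have larger indices than parents, so this visits leaves first.
    for c in range(n - 1, 0, -1):
        p = parent[c]
        argbest[c] = []
        for lp in range(k):
            # Best_c(lp) = min over lc of [cost_c(lc) + d_pc(lp, lc)]
            best_lc = min(range(k), key=lambda lc: cost[c][lc] + d[(p, c)][lp][lc])
            argbest[c].append(best_lc)
            # "Delete" part c by folding its best cost into the parent
            cost[p][lp] += cost[c][best_lc] + d[(p, c)][lp][best_lc]
    # Only the root is left: pick its best location, then trace the choices back.
    L = [0] * n
    L[0] = min(range(k), key=lambda l: cost[0][l])
    for c in range(1, n):
        L[c] = argbest[c][L[parent[c]]]
    return cost[0][L[0]], L
```

Each edge costs $k^2$ work in the inner search, and a tree with $n$ parts has $n - 1$ edges, so the total is $O(nk^2)$ instead of the $k^n$ needed to enumerate every configuration.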
  10. Min-Convolution Speedup
     [Figure: parts v1 and v2]
     $\mathrm{Best}_2(l_1) = \min_{l_2} \left[ m_2(l_2) + d_{12}(l_1, l_2) \right]$
     • Brute force: $O(k^2)$, where $k$ is the number of locations
     • Suppose $d_{12}(l_1, l_2) = g(l_1 - l_2)$:
       - $\mathrm{Best}_2(l_1) = \min_{l_2} \left[ m_2(l_2) + g(l_1 - l_2) \right]$
     • Min-convolution: $O(k)$ if $g$ is convex
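As a concrete instance of the speedup on slide 10, the sketch below handles one simple convex choice, $g(x) = c\,|x|$ on a 1D grid of locations. The choice of $g$, the 1D grid, and the name `min_convolution_abs` are assumptions made for illustration; the slides do not say which convex costs were used. For this $g$, two linear sweeps are enough.

```python
def min_convolution_abs(m2, c):
    """Best2[l1] = min over l2 of (m2[l2] + c * |l1 - l2|) in O(k) time.

    m2 is a list of match costs on a 1D grid; c >= 0 is the slope of g.
    """
    best = list(m2)
    # Forward sweep: accounts for every l2 <= l1.
    for i in range(1, len(best)):
        best[i] = min(best[i], best[i - 1] + c)
    # Backward sweep: accounts for every l2 >= l1.
    for i in range(len(best) - 2, -1, -1):
        best[i] = min(best[i], best[i + 1] + c)
    return best
```

For example, `min_convolution_abs([3, 0, 5, 7, 1], 1)` returns `[1, 0, 1, 2, 1]`, the same values a brute-force $O(k^2)$ search over all pairs $(l_1, l_2)$ produces.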
  11. Finding Motorbikes
     Model with 6 parts: 2 wheels, 2 headlights, front & back of seat
  12. Human Pose Estimation
  13. Human Tracking
     Ramanan, Forsyth, Zisserman, Tracking People by Learning their Appearance, IEEE Pattern Analysis and Machine Intelligence (PAMI), Jan 2007
  14. Part II: Deformable Shapes
     • Shape is a fundamental cue for recognizing objects
     • Many objects have no well-defined parts
       - We can capture their outlines using deformable models
  15. Triangulated Polygons
     • Polygonal templates
     • Delaunay triangulation gives a natural decomposition of an object
     • Consider deforming each triangle “independently”
     • A rabbit ear can be bent by changing the shape of a single triangle
  16. Structure of Triangulated Polygons
     • There are 2 graphs associated with a triangulated polygon
     • If the polygon is simple (no holes):
       - Dual graph is a tree
       - Graphical structure of the triangulation is a 2-tree
  17. Deformable Matching
     • Consider piecewise affine maps from model to image (taking triangles to triangles)
     • Find the globally optimal deformation using dynamic programming over the 2-tree
     [Figure: model; matching to MRI data]
  18. Hierarchical Shape Model
     • Shape-tree of curve from a to b:
       - Select midpoint c, store relative location c | a,b.
       - Left child is a shape-tree of the sub-curve from a to c.
       - Right child is a shape-tree of the sub-curve from c to b.
     [Figure: curve from a to b with interior points f, e, g, c, h, d, i, and its shape-tree]
       c | a,b
       e | a,c    d | c,b
       f | a,e    g | e,c    h | c,d    i | d,b
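The recursive definition on slide 18 can be sketched in a few lines. The encoding of the "relative location" is an assumption here: points are treated as complex numbers and `c | a,b` is stored as `(c - a) / (b - a)`, the position of c in the frame where a maps to 0 and b maps to 1. The slides do not specify the encoding, and the names `build_shape_tree` and `reconstruct` are hypothetical. The sketch also assumes the two endpoints of every sub-curve are distinct.

```python
def build_shape_tree(pts, i, j):
    """Shape-tree of the sub-curve pts[i..j]; each point is a complex number x + 1j*y."""
    if j - i < 2:
        return None                       # no interior point left to store
    mid = (i + j) // 2                    # index of the midpoint c
    a, b, c = pts[i], pts[j], pts[mid]
    rel = (c - a) / (b - a)               # c | a,b (assumed encoding, see lead-in)
    left = build_shape_tree(pts, i, mid)  # sub-curve from a to c
    right = build_shape_tree(pts, mid, j) # sub-curve from c to b
    return (rel, left, right)


def reconstruct(tree, a, b):
    """Interior points, in order, of the curve from a to b described by the tree."""
    if tree is None:
        return []
    rel, left, right = tree
    c = a + rel * (b - a)                 # place c relative to the given endpoints
    return reconstruct(left, a, c) + [c] + reconstruct(right, c, b)
```

With this encoding, reconstructing from the original endpoints recovers the original points, and reconstructing from translated, rotated, or scaled endpoints yields the correspondingly transformed curve, because every stored value is relative to its two reference points.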
  19. Deformations
     • Independently perturb relative locations stored in a shape-tree
       - Local and global properties are preserved
       - Reconstructed curve is perceptually similar to the original
  20. Matching
     [Figure: model curve from a to b with its shape-tree (nodes c | a,b; e | a,c; d | c,b; f | a,e; g | e,c; h | c,d; i | d,b), and an image curve with points p, q, r and sub-curve matches u, v, w]
     • Match(v, [p,q]) = w1
     • Match(u, [q,r]) = w2
     • Match(w, [p,r]) = w1 + w2 + dif((e|a,c), (q|p,r))
     • Similar to parsing with the CKY algorithm
  21. Recognizing Leaves
     • Nearest neighbor classification
     • 15 species, 75 examples per species (25 training, 50 test)
     • Results:
       - Shape-tree: 96.28
       - Inner distance: 94.13
       - Shape context: 88.12
  22. Part III: PASCAL Challenge
     • ~10,000 images, with ~25,000 target objects
       - Objects from 20 categories (person, car, bicycle, cow, table...)
       - Objects are annotated with labeled bounding boxes
  23. Model Overview
     [Figure: detection, root filter, part filters, deformation models]
     Model has a root filter plus deformable parts
  24. Histogram of Gradient (HOG) Features
     • Image is partitioned into 8x8 pixel blocks
     • In each block we compute a histogram of gradient orientations
       - Invariant to changes in lighting, small deformations, etc.
     • We compute features at different resolutions (pyramid)
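The per-block histogram step on slide 24 can be illustrated with a stripped-down sketch. Everything beyond "8x8 blocks" and "histogram of gradient orientations" is an assumption: the 9 unsigned orientation bins, the magnitude weighting, the clamped central differences, and the name `cell_orientation_histograms` are illustrative choices, and the normalization and multi-resolution pyramid of a full HOG pipeline are left out.

```python
import math


def cell_orientation_histograms(image, cell=8, bins=9):
    """Histogram of gradient orientations for each cell x cell block.

    image is a list of rows of grey values; returns hist[row][col][bin].
    """
    h, w = len(image), len(image[0])
    rows, cols = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cols)] for _ in range(rows)]
    for y in range(rows * cell):
        for x in range(cols * cell):
            # Central differences, clamped at the image border
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation in [0, pi): opposite directions share a bin
            angle = math.atan2(gy, gx) % math.pi
            b = min(int(angle / math.pi * bins), bins - 1)
            # Each pixel votes for its orientation bin, weighted by magnitude
            hist[y // cell][x // cell][b] += magnitude
    return hist
```

On an image whose intensity increases only from left to right, every vote lands in bin 0; on one that increases only from top to bottom, every vote lands in the middle bin.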
  25. Filters
     • Filters are rectangular templates defining weights for features
     • Score is the dot product of a filter and a subwindow of the HOG pyramid
     [Figure: filter H and subwindow W of the HOG pyramid]
     Score of H at this location is H ⋅ W
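The dot-product score on slide 25 amounts to the following, shown for a single pyramid level. The nested-list layout (`feature_map[y][x]` is a feature vector) and the names `filter_score` and `score_map` are assumptions made for this sketch; a practical system would use an array library and evaluate every pyramid level.

```python
def filter_score(feature_map, filt, top, left):
    """Dot product of filt with the same-sized subwindow of feature_map at (top, left).

    feature_map[y][x] and filt[y][x] are feature vectors (lists of floats).
    """
    return sum(
        weight * value
        for dy, filt_row in enumerate(filt)
        for dx, weights in enumerate(filt_row)
        for weight, value in zip(weights, feature_map[top + dy][left + dx])
    )


def score_map(feature_map, filt):
    """Filter score at every placement where the filter fits inside the map."""
    fh, fw = len(filt), len(filt[0])
    return [
        [filter_score(feature_map, filt, y, x)
         for x in range(len(feature_map[0]) - fw + 1)]
        for y in range(len(feature_map) - fh + 1)
    ]
```

The resulting score map plays the role of the match quality at each location from Part I: each part filter yields one such map.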
  26. Object Hypothesis
     • Score is the sum of filter scores plus deformation scores
     • Multiscale model captures features at two resolutions
     [Figure: image pyramid, HOG feature pyramid]
  27. Training
     • Training data consists of images with labeled bounding boxes
     • Need to learn the model structure, filters and deformation costs
  28. Connection With Linear Classifiers
     • Score of model is the sum of filter scores plus deformation scores
       - Bounding box in training data specifies that the score should be high for some placement in a range
     • w is a model: concatenation of filters and deformation parameters
     • x is a detection window
     • z are filter placements
     • The features paired with w are a concatenation of features and part displacements
  29. Latent SVMs
     [Equation labels: linear in w if z is fixed; regularization; hinge loss]
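The equations on slides 28 and 29 did not survive extraction. A form consistent with the surviving labels ("linear in w if z is fixed", "regularization", "hinge loss") is sketched below. The symbols $\Phi$ (the concatenated features and part displacements), $\lambda$ (the regularization weight), $y_i \in \{-1, +1\}$ (the labels) and $N$ (the number of training examples) are assumptions introduced here, as is the exact constant in front of the regularizer.

```latex
% Score of a detection window x: the best score over latent placements z.
% For a fixed z the score w . Phi(x, z) is linear in w.
f_w(x) = \max_{z} \; w \cdot \Phi(x, z)

% Training objective: regularization term plus hinge loss over the N examples.
w^{*} = \arg\min_{w} \; \lambda \lVert w \rVert^{2}
        + \sum_{i=1}^{N} \max\bigl(0,\; 1 - y_i \, f_w(x_i)\bigr)
```

If there is only one possible placement $z$ per window, the maximization disappears and the objective has the same form as a standard linear SVM.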
  30. Learned Models
     [Figure: Bicycle, Sofa, Car, Bottle]
  31. Example Results
  32. More Results
  33. Overall Results
     • 9 systems competed in the 2007 challenge
     • Out of 20 classes we get:
       - First place in 10 classes
       - Second place in 6 classes
     • Some statistics:
       - It takes ~2 seconds to evaluate a model in one image
       - It takes ~3 hours to train a model
       - MUCH faster than most systems
  34. Component Analysis
     [Figure: precision-recall curves, PASCAL2006 Person. Legend: Root (0.18), Root+Latent (0.24), Parts+Latent (0.29), Root+Parts+Latent (0.34)]
  35. Summary
     • Deformable models provide an elegant framework for object detection and recognition
       - Efficient algorithms for matching models to images
       - Applications: pose estimation, medical image analysis, object recognition, etc.
     • We can learn models from partially labeled data
       - Generalized standard ideas from machine learning
       - Leads to state-of-the-art results in PASCAL challenge
     • Future work: hierarchical models, grammars, 3D objects
