ECCV2010: feature learning for image classification, part 1

Published in: Education

  1. Part 1: Classical Image Classification Methods
     Kai Yu, Dept. of Media Analytics, NEC Laboratories America
     Andrew Ng, Computer Science Dept., Stanford University
  2. Outline of Part 2 (05/13/11)
     • Local Features, Sampling, Visual Words
     • Discriminative Methods
         • Bag-of-Words (BoW) representation
         • Spatial pyramid matching (SPM)
     • Generative Methods
         • Part-based methods
         • Topic models
  3. Outline of Part 2
     • Local Features, Sampling, Visual Words
     • Discriminative Methods
         • Bag-of-Words (BoW) representation
         • Spatial pyramid matching (SPM)
     • Generative Methods
         • Part-based methods
         • Topic models
  4. Local features
     • Distinctive descriptors of local image patches
     • Invariant to local translation, scale, … and sometimes rotation or general affine transformations
     • The most famous choice is the SIFT feature
  5. Sampling local features from images
     • Each image is represented as a set of points
     Image credits: F-F. Li, E. Nowak, J. Sivic
  6. Visual words
     • Similar points are grouped into one visual word
     • Algorithms: k-means, agglomerative clustering, …
     • Points from different images are then more easily compared
     Slide credit: Kristen Grauman
  7. Outline of Part 2
     • Local Features, Sampling, Visual Words, …
     • Discriminative Methods
         • Bag-of-Words (BoW) representation
         • Spatial pyramid matching (SPM)
     • Generative Methods
         • Part-based methods
         • Topic models
  8. Bag-of-words (BoW) representation
     • Analogy to documents
     Adapted from tutorial slides by Fei-Fei et al.
  9. BoW for object categorization
     • Works pretty well for whole-image classification
     • Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al. (2003, 2005)
     Slide credit: Svetlana Lazebnik
  10. Unsupervised Dictionary Learning
     • Sample local features from the images in an image database
     • Run k-means or another clustering algorithm to get the dictionary
     • The dictionary is also called a “codebook”
     [Figure: sampled features in SIFT space, clustered into regions R1, R2, R3]
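The dictionary-learning step on this slide can be sketched in a few lines. The following is a minimal illustration only, not the presenters' implementation: it runs plain Lloyd-style k-means on toy 2-D points standing in for 128-D SIFT descriptors. The function names (`kmeans_codebook`, `squared_distance`) are hypothetical, and the centers are initialized from the first k descriptors purely to keep the example deterministic (random or k-means++ initialization is more usual).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_codebook(descriptors, k, iterations=20):
    """Cluster descriptors into k centers (the "codebook") with Lloyd's algorithm."""
    # Assumption: deterministic initialization from the first k descriptors.
    centers = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        # Assignment step: group each descriptor with its nearest center.
        groups = [[] for _ in range(k)]
        for d in descriptors:
            nearest = min(range(k), key=lambda j: squared_distance(d, centers[j]))
            groups[nearest].append(d)
        # Update step: move each center to the mean of its group.
        for j, group in enumerate(groups):
            if group:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


# Toy 2-D "descriptors" forming two well-separated blobs.
toy = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
codebook = kmeans_codebook(toy, k=2)
print(codebook)  # one center near (0.03, 0.03), one near (5.03, 5.03)
```

Each returned center plays the role of one visual word. Real codebooks typically use far more descriptors and clusters than this toy example.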
  11. Compute BoW histogram for each image
     • Assign SIFT features to clusters (R1, R2, R3)
     • Compute the frequency of each cluster within an image
     [Figure: BoW histogram representations over R1, R2, R3]
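The two steps on this slide, assigning each descriptor to its nearest cluster and counting cluster frequencies, can be sketched as follows. This is a minimal illustration under assumptions: the codebook and descriptors are made-up 2-D values, the function name `bow_histogram` is hypothetical, and the histogram is normalized to sum to 1 (the slide does not say which normalization is used).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bow_histogram(descriptors, codebook):
    """Vector-quantize descriptors against the codebook and count word frequencies."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Hard assignment: index of the nearest codebook entry (visual word).
        word = min(range(len(codebook)), key=lambda j: squared_distance(d, codebook[j]))
        counts[word] += 1
    total = sum(counts)
    # Assumption: normalize so images with different numbers of features are comparable.
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Hypothetical 3-word codebook and four descriptors from one image.
codebook = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
image_descriptors = [(0.1, 0.2), (4.8, 5.1), (5.2, 4.9), (0.3, 4.7)]
hist = bow_histogram(image_descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

Whatever the number of descriptors in an image, the result has one entry per visual word, which is what makes it usable as a fixed-length input to a classifier.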
  12. Indication of BoW histogram
     • Summarizes the entire image by its distribution of visual word occurrences
     • Turns bags of different sizes into a fixed-length vector
     • Analogous to the bag-of-words representation commonly used for text categorization
  13. Image classification based on BoW histogram
     • Learn a classification model to determine the decision boundary
     • Nonlinear SVMs are commonly applied
     [Figure: “dog” and “bird” examples separated by a decision boundary in the BoW histogram vector space]
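The slide does not name a specific kernel for the nonlinear SVM. As one illustrative assumption, the sketch below uses the histogram intersection kernel, a common choice for comparing histograms, and builds the Gram matrix that a kernel SVM would consume. The histograms and function names are made up for the example.

```python
def intersection_kernel(h1, h2):
    """Histogram intersection: sum of the bin-wise minima of two histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(histograms):
    """Pairwise kernel values between all histograms, as a list of rows."""
    return [[intersection_kernel(a, b) for b in histograms] for a in histograms]


# Hypothetical normalized BoW histograms over a 3-word codebook.
dog_a = [0.5, 0.5, 0.0]
dog_b = [0.5, 0.25, 0.25]
bird = [0.0, 0.25, 0.75]

K = gram_matrix([dog_a, dog_b, bird])
for row in K:
    print(row)
```

In this toy data the two "dog" histograms score 0.75 against each other and only 0.25 and 0.5 against the "bird" histogram. A matrix like `K` can be handed to an SVM solver that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.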
  14. Issues
     • Sampling strategy
     • Learning the codebook: size? supervised? …
     • Classification: which method? scalability?
     • Scalability: how to handle millions of data points?
     • How to use spatial information?
  15. Spatial information
     • The BoW representation discards spatial layout.
     • This increases invariance to scale, translation, and deformation, but sacrifices discriminative power, especially when the spatial layout is important.
     Slide adapted from Bill Freeman
  16. Spatial pyramid matching
     • Compute BoW for image regions at different locations in various scales
     Figure credit: Svetlana Lazebnik
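Computing BoW over regions at several scales can be sketched as below. This is a simplified illustration under stated assumptions: features arrive as hypothetical `(x, y, word_index)` triples, level `l` splits the image into a 2^l × 2^l grid, and the per-cell counts are concatenated without the per-level weights used in the published spatial pyramid matching kernel.

```python
def spatial_pyramid(points, num_words, width, height, levels=1):
    """Concatenate per-cell visual-word counts over grids of 1x1, 2x2, ... cells."""
    feature = []
    for level in range(levels + 1):
        cells = 2 ** level  # cells per side at this pyramid level
        hist = [[0] * num_words for _ in range(cells * cells)]
        for x, y, word in points:
            # Map pixel coordinates to a grid cell, clamping points on the far edge.
            col = min(int(x * cells / width), cells - 1)
            row = min(int(y * cells / height), cells - 1)
            hist[row * cells + col][word] += 1
        for cell in hist:
            feature.extend(cell)
    return feature


# Four features in a 100x100 image, already assigned to one of 2 visual words.
points = [(10, 10, 0), (90, 10, 1), (10, 90, 1), (90, 90, 0)]
pyramid = spatial_pyramid(points, num_words=2, width=100, height=100, levels=1)
print(pyramid)  # [2, 2, 1, 0, 0, 1, 0, 1, 1, 0]
```

The first two entries are the whole-image histogram, which ignores layout. The remaining eight entries record which word appears in which quadrant, so rearranging the features changes them even though the whole-image counts stay the same.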
  17. A common pipeline for discriminative image classification using BoW
     • Dictionary learning: Dense/Sparse SIFT → K-means → dictionary
     • Image classification: Dense/Sparse SIFT → VQ Coding → Spatial Pyramid Pooling → Nonlinear SVM
  18. Combining multiple descriptors
     • Multiple Feature Detectors → Multiple Descriptors (SIFT, shape, color, …) → VQ Coding and Spatial Pooling → Nonlinear SVM
     Diagram from SurreyUVA_SRKDA, the winning team in PASCAL VOC 2008
  19. Outline of Part 2
     • Local Features, Sampling, Visual Words, …
     • Discriminative Methods
         • Bag-of-Words (BoW) representation
         • Spatial pyramid matching (SPM)
     • Generative Methods
         • Part-based methods
         • Topic models
  20. Topic models for images
     • Latent Dirichlet Allocation (LDA), Fei-Fei et al. ICCV 2005
     [Figure: graphical model with labels w, N, c, z, D; example scene label “beach”]
     Slide credit: Fei-Fei Li
  21. Part-based Model
     • Fischler & Elschlager 1973
     Rob Fergus, ICCV09 Tutorial
  22. For comprehensive coverage of object categorization models, please visit:
     Recognizing and Learning Object Categories
     Li Fei-Fei (Stanford), Rob Fergus (NYU), Antonio Torralba (MIT)
     http://people.csail.mit.edu/torralba/shortCourseRLOC/
