ECCV 2010: Feature Learning for Image Classification, Part 3

    1. Part 3: Image Classification using Sparse Coding: Advanced Topics
       Kai Yu, Dept. of Media Analytics, NEC Laboratories America
       Andrew Ng, Computer Science Dept., Stanford University
    2. Outline of Part 3
       - Why can sparse coding learn good features?
         - Intuition, topic model view, and geometric view
         - A theoretical framework: local coordinate coding
         - Two practical coding methods
       - Recent advances in sparse coding for image classification
    3. Outline of Part 3
       - Why can sparse coding learn good features?
         - Intuition, topic model view, and geometric view
         - A theoretical framework: local coordinate coding
         - Two practical coding methods
       - Recent advances in sparse coding for image classification
    4. Intuition: why does sparse coding help classification?
       - The coding is a nonlinear feature mapping.
       - It represents data in a higher-dimensional space.
       - Sparsity makes prominent patterns more distinctive.
       (Figure from http://www.dtreg.com/svm.htm)
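       For reference, the coding step these slides rely on is the standard sparse coding problem (not written out on the slide, but it is the usual formulation, and the lambda swept in the MNIST experiments below is its sparsity weight): given a dictionary B = [b_1, ..., b_k] and a patch x, the code a solves

           \min_a \; \| x - B a \|_2^2 + \lambda \| a \|_1 ,

       and B itself is learned by minimizing the same objective jointly over B (with \|b_j\|_2 \le 1 to avoid degenerate scaling) and the codes of unlabeled patches.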
    5. A "topic model" view of sparse coding
       - Each basis is a "direction" or a "topic".
       - Sparsity: each datum is a linear combination of only a few bases.
       - Applicable to image denoising, inpainting, and super-resolution.
       (Both figures adapted from the CVPR 2010 tutorial by F. Bach, J. Mairal, J. Ponce, and G. Sapiro.)
    6. A geometric view of sparse coding
       - Each basis acts somewhat like a pseudo data point, an "anchor point" on the data manifold.
       - Sparsity: each datum is a sparse combination of neighboring anchors.
       - The coding scheme exploits the manifold structure of the data.
       (Figure: bases as anchor points on the data manifold.)
    7. MNIST experiment: classification using sparse coding
       - 60K training images, 10K test images.
       - Dictionary size k = 512.
       - Linear SVM on the sparse codes (a pipeline sketch follows below).
       - Try different values of the sparsity weight lambda.
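       A minimal sketch of this experiment in Python with scikit-learn. This is not the authors' code: the dataset loader, the lasso-based encoder, and all hyperparameters below are illustrative assumptions, with alpha playing the role of lambda.

           import numpy as np
           from sklearn.datasets import fetch_openml
           from sklearn.decomposition import MiniBatchDictionaryLearning, SparseCoder
           from sklearn.svm import LinearSVC

           # 60K training / 10K test split of MNIST, pixel values scaled to [0, 1].
           X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
           X = X / 255.0
           X_tr, y_tr, X_te, y_te = X[:60000], y[:60000], X[60000:], y[60000:]

           # Step 1: learn a k = 512 dictionary from the (unlabeled) training images.
           dico = MiniBatchDictionaryLearning(n_components=512, alpha=0.05,
                                              batch_size=256, random_state=0)
           dico.fit(X_tr)

           # Step 2: encode each image as its sparse code over the dictionary.
           coder = SparseCoder(dictionary=dico.components_,
                               transform_algorithm="lasso_lars", transform_alpha=0.05)
           Z_tr, Z_te = coder.transform(X_tr), coder.transform(X_te)

           # Step 3: train a linear SVM on the sparse codes and evaluate.
           clf = LinearSVC().fit(Z_tr, y_tr)
           print("test error: %.2f%%" % (100 * (1 - clf.score(Z_te, y_te))))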
    8. MNIST experiment: lambda = 0.0005
       Each basis is like a part or a direction.
    9. MNIST experiment: lambda = 0.005
       Again, each basis is like a part or a direction.
    10. MNIST experiment: lambda = 0.05
       Now each basis looks more like a digit!
    11. MNIST experiment: lambda = 0.5
       Like clustering now!
    12. Geometric view of sparse coding
       - Test errors across the lambda settings: 4.54%, 3.75%, and 2.64% (best).
       - When sparse coding achieves its best classification accuracy, the learned bases look like digits: each basis has a clear local class association.
       - Implication: exploiting data geometry may be useful for classification.
    13. Distribution of coefficients (MNIST)
       Neighboring bases tend to get nonzero coefficients.
    14. Distribution of coefficients (SIFT, Caltech-101)
       Similar observation here!
    15. Recap: two different views of sparse coding
       View 1:
       - Discovers "topic" components.
       - Each basis is a "direction".
       - Sparsity: each datum is a linear combination of a few bases.
       - Related to topic models.
       View 2:
       - Captures the geometric structure of the data manifold.
       - Each basis is an "anchor point".
       - Sparsity: each datum is a linear combination of neighboring anchors.
       - Somewhat like a soft VQ (links back to BoW).
       Either view can be valid for sparse coding under certain circumstances; View 2 seems more helpful for classifying sensory data.
    16. Outline of Part 3
       - Why can sparse coding learn good features?
         - Intuition, topic model view, and geometric view
         - A theoretical framework: local coordinate coding
         - Two practical coding methods
       - Recent advances in sparse coding for image classification
    17. Key theoretical question
       Why can unsupervised feature learning via sparse coding help classification?
    18. The image classification setting for analysis
       Pipeline: dense local features -> sparse coding -> linear pooling -> linear SVM.
       The coding stage is a function on patches; the full pipeline is a function on images.
       Implication: learning an image classifier amounts to learning nonlinear functions on patches.
    19. Illustration: nonlinear learning via local coding
       (Figure: data points, bases, and a locally linear function approximation.)
    20. How to learn a nonlinear function?
       Step 1: learn the dictionary from unlabeled data.
    21. How to learn a nonlinear function?
       Step 2: use the dictionary to encode the data.
    22. How to learn a nonlinear function?
       Step 3: estimate the global linear weights on the sparse codes of the data.
       Nonlinear local learning is achieved by learning a global linear function on the codes.
    23. Local Coordinate Coding (LCC): connecting coding to nonlinear function learning (Yu et al., NIPS 2009)
       If f(x) is (alpha, beta)-Lipschitz smooth, the function approximation error is bounded by a coding error term plus a locality term (written out below).
       The key message: a good coding scheme should
       1. have a small coding error, and
       2. also be sufficiently local.
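       Written out, the bound sketched on this slide is roughly the following (my reconstruction following Yu et al., NIPS 2009; check the paper for the exact constants and exponent): for a coding (\gamma, C) with \gamma(x) = \sum_{v \in C} \gamma_v(x)\, v and an (\alpha, \beta)-Lipschitz-smooth f,

           \Big| f(x) - \sum_{v \in C} \gamma_v(x)\, f(v) \Big|
             \;\le\; \underbrace{\alpha \, \| x - \gamma(x) \|}_{\text{coding error}}
             \;+\; \underbrace{\beta \sum_{v \in C} |\gamma_v(x)| \, \| v - \gamma(x) \|^2}_{\text{locality term}} .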
    24. Outline of Part 3
       - Why can sparse coding learn good features?
         - Intuition, topic model view, and geometric view
         - A theoretical framework: local coordinate coding
         - Two practical coding methods
       - Recent advances in sparse coding for image classification
    25. Applications of the LCC theory
       - A fast implementation with a large dictionary (Wang et al., CVPR 2010).
       - A simple geometric way to improve BoW (Zhou et al., ECCV 2010).
    26. Applications of the LCC theory
       - A fast implementation with a large dictionary
       - A simple geometric way to improve BoW
    27. The larger the dictionary, the higher the accuracy, but also the higher the computational cost
       (Yu et al., NIPS 2009; Yang et al., CVPR 2009)
       The same observation holds for Caltech-256, PASCAL, ImageNet, ...
    28. Locality-constrained linear coding (LLC): a fast implementation of LCC (Wang et al., CVPR 2010)
       - Dictionary learning: k-means (or hierarchical k-means).
       - Coding for x:
         - Step 1 (ensure locality): find the K nearest bases.
         - Step 2 (ensure low coding error): solve a small constrained least-squares fit of x over those K bases only (see the sketch below).
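       A minimal NumPy sketch of the fast LLC encoder described on this slide. The sum-to-one constraint and the local least-squares solve follow the approximate algorithm in the CVPR 2010 paper; the dictionary, regularization constant, and K below are illustrative assumptions.

           import numpy as np

           def llc_encode(x, D, K=5, reg=1e-4):
               """Approximate LLC code of a descriptor x (d,) over a dictionary D (M, d)."""
               # Step 1: locality -- pick the K nearest bases.
               dists = np.linalg.norm(D - x, axis=1)
               idx = np.argsort(dists)[:K]
               B = D[idx]                              # (K, d) local bases
               # Step 2: low coding error -- least squares with a sum-to-one constraint.
               Bc = B - x                              # shift so x sits at the origin
               C = Bc @ Bc.T                           # local covariance (K, K)
               C += reg * np.trace(C) * np.eye(K)      # regularize for numerical stability
               w = np.linalg.solve(C, np.ones(K))
               w /= w.sum()                            # enforce sum(w) == 1
               code = np.zeros(D.shape[0])
               code[idx] = w                           # sparse code: nonzeros only on the K neighbors
               return code

           # Usage: D would normally come from k-means on unlabeled local descriptors.
           rng = np.random.default_rng(0)
           D = rng.normal(size=(512, 128))             # e.g., 512 bases for 128-d SIFT
           x = rng.normal(size=128)
           z = llc_encode(x, D, K=5)
           print(np.nonzero(z)[0], z[np.nonzero(z)])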
    29. LLC: competitive in accuracy, cheap in computation (Wang et al., CVPR 2010)
       - Compared with sparse coding, LLC's accuracy is comparable on some benchmarks and significantly better on others, at much lower computational cost.
       - This is one of the two major algorithms the NEC-UIUC team used to take the No. 1 position in the ImageNet challenge 2010!
    30. Applications of the LCC theory
       - A fast implementation with a large dictionary
       - A simple geometric way to improve BoW
    31. Interpreting "BoW + linear classifier"
       Quantizing data points to cluster centers yields a piecewise locally constant (zero-order) function approximation.
    32. Super-vector coding: a simple geometric way to improve BoW (VQ) (Zhou et al., ECCV 2010)
       Adding a local tangent at each cluster center turns the approximation into a piecewise locally linear (first-order) one.
    33. Super-vector coding: approximation guarantee
       If f(x) is beta-Lipschitz smooth, the function approximation error is bounded in terms of the quantization error, with the local tangent supplying the first-order correction.
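       Roughly, the argument is a first-order Taylor bound (my reconstruction; the slide's exact statement, including its additional condition, is not recoverable from this transcript): if v(x) is the codeword nearest to x and f is beta-Lipschitz smooth,

           \big| f(x) - f(v(x)) - \nabla f(v(x))^\top (x - v(x)) \big| \;\le\; \tfrac{\beta}{2} \, \| x - v(x) \|^2 ,

       so a small quantization error \|x - v(x)\| makes the piecewise linear (tangent) approximation accurate.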
    34. Super-vector coding: learning a nonlinear function via a global linear model
       - Let the VQ coding of x pick its cluster; the super-vector codes of the data are then fed to global linear weights to be learned (see the sketch below).
       - This is one of the two major algorithms the NEC-UIUC team used to take the No. 1 position in PASCAL VOC 2009!
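       A minimal NumPy sketch of the super-vector encoder. The block layout, a scaled occupancy entry plus the offset from the nearest center for each cluster, follows my reading of Zhou et al., ECCV 2010; the scaling constant s and the pooling step are simplified assumptions.

           import numpy as np

           def super_vector_encode(x, centers, s=1.0):
               """Super-vector code of a descriptor x (d,) given k cluster centers (k, d).
               Output dimension is k * (d + 1): per cluster, one occupancy entry and d offset entries."""
               k, d = centers.shape
               j = np.argmin(np.linalg.norm(centers - x, axis=1))   # VQ: nearest center
               code = np.zeros(k * (d + 1))
               block = j * (d + 1)
               code[block] = s                                      # zero-order (BoW-like) part
               code[block + 1: block + 1 + d] = x - centers[j]      # first-order local-tangent part
               return code

           # Usage: centers would normally come from k-means on local descriptors;
           # an image representation would pool (e.g., average) the codes of all its descriptors.
           rng = np.random.default_rng(0)
           centers = rng.normal(size=(64, 128))
           x = rng.normal(size=128)
           phi = super_vector_encode(x, centers)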
    35. Summary of geometric coding methods
       Vector quantization (BoW), (fast) local coordinate coding, and super-vector coding:
       - All lead to higher-dimensional, sparse, and localized codes.
       - All exploit the geometric structure of the data.
       - The new coding methods are suitable for linear classifiers.
       - Their implementations are quite straightforward.
    36. Things not covered here
       - Improved LCC using local tangents, Yu & Zhang, ICML 2010.
       - Mixture of sparse coding, Yang et al., ECCV 2010.
       - Deep coding networks, Lin et al., NIPS 2010.
       - Pooling methods:
         - Max pooling works well in practice, but appears ad hoc.
         - An interesting analysis of max pooling: Boureau et al., ICML 2010.
         - We are working on a linear pooling method with a similar effect to max pooling; some preliminary results are already in the super-vector coding paper (Zhou et al., ECCV 2010).
    37. Outline of Part 3
       - Why can sparse coding learn good features?
         - Intuition, topic model view, and geometric view
         - A theoretical framework: local coordinate coding
         - Two practical coding methods
       - Recent advances in sparse coding for image classification
    38. Fast approximation of sparse coding via neural networks (Gregor & LeCun, ICML 2010)
       - Aims to speed up coding (inference) time, not training time, potentially making sparse coding practical for video.
       - Idea: given a trained sparse coding model, use its inputs and outputs as training data for a feed-forward model (see the sketch below).
       - They report roughly a 20x speedup, but the method was not evaluated on real video data.
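       A sketch of that idea using scikit-learn. This is only the generic regress-the-codes recipe, not the LISTA architecture from the paper; the network size and the synthetic data are illustrative assumptions.

           import numpy as np
           from sklearn.decomposition import MiniBatchDictionaryLearning
           from sklearn.neural_network import MLPRegressor

           rng = np.random.default_rng(0)
           X = rng.normal(size=(10000, 64))                 # stand-in for image patches

           # "Slow" sparse coder: learn a dictionary and compute the sparse codes.
           dico = MiniBatchDictionaryLearning(n_components=128, alpha=0.1, random_state=0)
           Z = dico.fit_transform(X)                        # targets: sparse codes of X

           # Fast approximation: train a feed-forward regressor to map inputs to codes.
           approx = MLPRegressor(hidden_layer_sizes=(256,), max_iter=200, random_state=0)
           approx.fit(X, Z)

           # At coding time, one forward pass replaces the iterative sparse coding solve.
           Z_fast = approx.predict(X[:5])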
    39. Group sparse coding (Bengio et al., NIPS 2009)
       - Sparse coding operates on patches, so the pooled image representation is unlikely to be sparse.
       - Idea: enforce joint sparsity via an L1/L2 mixed norm on the sparse codes of a group of patches.
       - The resulting image representation becomes sparse, which saves memory, but classification accuracy decreases.
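       For concreteness, the L1/L2 penalty referred to here has the standard mixed-norm form (my notation, not copied from the paper): if A is the matrix of sparse codes for a group of patches, with row A_j holding the coefficients of dictionary element j across the group, the penalty is

           \Omega(A) \;=\; \sum_{j=1}^{k} \| A_{j,\cdot} \|_2 ,

       which drives whole rows to zero, so a dictionary element is either used by the group or switched off entirely; that is what makes the pooled image representation sparse.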
    40. Learning a hierarchical dictionary (Jenatton, Mairal, Obozinski, and Bach, 2010)
       The dictionary elements are organized in a tree; a node can be active only if its ancestors are active.
    41. References
       - Image Classification using Super-Vector Coding of Local Image Descriptors. Xi Zhou, Kai Yu, Tong Zhang, and Thomas Huang. ECCV 2010.
       - Efficient Highly Over-Complete Sparse Coding using a Mixture Model. Jianchao Yang, Kai Yu, and Thomas Huang. ECCV 2010.
       - Learning Fast Approximations of Sparse Coding. Karol Gregor and Yann LeCun. ICML 2010.
       - Improved Local Coordinate Coding using Local Tangents. Kai Yu and Tong Zhang. ICML 2010.
       - Sparse Coding and Dictionary Learning for Image Analysis. Francis Bach, Julien Mairal, Jean Ponce, and Guillermo Sapiro. CVPR 2010 tutorial.
       - Supervised Translation-Invariant Sparse Coding. Jianchao Yang, Kai Yu, and Thomas Huang. CVPR 2010.
       - Locality-Constrained Linear Coding for Image Classification. Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang, and Yihong Gong. CVPR 2010.
       - Group Sparse Coding. Samy Bengio, Fernando Pereira, Yoram Singer, and Dennis Strelow. NIPS 2009.
       - Nonlinear Learning using Local Coordinate Coding. Kai Yu, Tong Zhang, and Yihong Gong. NIPS 2009.
       - Linear Spatial Pyramid Matching using Sparse Coding for Image Classification. Jianchao Yang, Kai Yu, Yihong Gong, and Thomas Huang. CVPR 2009.
       - Efficient Sparse Coding Algorithms. Honglak Lee, Alexis Battle, Rajat Raina, and Andrew Y. Ng. NIPS 2007.
