The document discusses sparse coding and its applications in visual recognition tasks. It introduces sparse coding as an unsupervised learning technique that learns bases to represent image patches. Sparse coding has been shown to outperform bag-of-words models with vector quantization on datasets like Caltech-101 and PASCAL VOC. The document also discusses extensions of sparse coding, including hierarchical sparse coding and supervised methods, that have achieved further improvements on image classification benchmarks.
5. Learning everything from data. The machine learning pipeline: low-level sensing → pre-processing → feature extraction → feature selection → inference (prediction, recognition).
7. Winning Method in PASCAL VOC before 2009: multiple feature sampling methods, multiple visual descriptors, VQ coding, histogram, SPM, nonlinear SVM.
10. Develop better methods: better coding, better pooling, scalable linear classifier.
11. Sparse Coding (Olshausen &amp; Field, 1996). Originally developed to explain early visual processing in the brain (edge detection). Training: given a set of random patches x, learn a dictionary of bases [Φ1, Φ2, …]. Coding: for a data vector x, solve the LASSO to find the sparse coefficient vector a.
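To make the coding step concrete, here is a minimal sketch using scikit-learn's Lasso. The dictionary, patch size, number of bases, and alpha are illustrative assumptions; in practice the dictionary is learned from unlabeled patches (e.g. with sklearn.decomposition.DictionaryLearning):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Dictionary of 128 unit-norm bases for 8x8 patches (random here for
# illustration; learned from unlabeled patches in practice).
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
x = rng.standard_normal(64)            # one vectorized 8x8 patch

# Coding: solve sklearn's LASSO  min_a (1/2n)||x - D a||^2 + alpha*||a||_1
lasso = Lasso(alpha=0.1, max_iter=10_000)
lasso.fit(D, x)
a = lasso.coef_                        # sparse coefficient vector
print(np.count_nonzero(a), "of", a.size, "coefficients are nonzero")
```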
13. Self-taught Learning [Raina, Lee, Battle, Packer &amp; Ng, ICML 07]: learn from unlabeled images …; testing: what is this? (motorcycles vs. not motorcycles). Slide credit: Andrew Ng.
14. Classification Result on Caltech-101 (9K images, 101 classes): SIFT + VQ + nonlinear SVM: 64%; pixel-level sparse coding + linear SVM: 50%.
15. Sparse Coding on SIFT [Yang, Yu, Gong &amp; Huang, CVPR09]: local gradients (e.g., SIFT, HOG) → sparse coding → max pooling → scalable linear classifier.
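A minimal sketch of the max-pooling stage of this pipeline; the array shapes and the 1024-basis codebook are assumed values, and in the CVPR09 pipeline pooling runs per spatial-pyramid cell, with the cell vectors concatenated before the linear SVM:

```python
import numpy as np

def max_pool(codes: np.ndarray) -> np.ndarray:
    """Per-dimension max over the absolute sparse codes in a region.

    codes: (n_descriptors, n_bases) sparse codes of local SIFT descriptors
    returns: (n_bases,) pooled feature for the region
    """
    return np.abs(codes).max(axis=0)

# Hypothetical usage: 200 descriptors coded against 1024 bases.
rng = np.random.default_rng(0)
codes = rng.standard_normal((200, 1024)) * (rng.random((200, 1024)) < 0.02)
feature = max_pool(codes)              # (1024,) region descriptor
```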
16. Sparse Coding on SIFT [Yang, Yu, Gong &amp; Huang, CVPR09], Caltech-101: SIFT + VQ + nonlinear SVM: 64%; SIFT + sparse coding + linear SVM: 73%.
22. A Function Approximation View of Coding: the general formulation. The function approximation error is bounded above by a quantity that depends only on the codes and the bases, so minimizing that bound is an unsupervised learning objective.
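The equation on this slide did not survive extraction; the following LaTeX is a hedged reconstruction of the general formulation, with f a target function, a(x) the code, and φ_i the bases as above:

```latex
% Hedged reconstruction: coding x -> a(x) is useful when a smooth
% nonlinear f can be approximated by a function linear in the codes,
\[
  f(x) \;\approx\; \sum_i a_i(x)\, f(\phi_i),
\]
% and the approximation error is bounded by terms (reconstruction error,
% locality of the active bases) that depend only on x, a(x), and the
% dictionary -- not on f or any labels. Minimizing that bound over
% unlabeled data is therefore an unsupervised learning objective.
```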
25. Function Approximation based on LCC [Yu, Zhang, Gong, NIPS 10]: locally linear approximation. (Figure: data points and bases.)
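A hedged statement of the LCC approximation bound, rewritten in the notation used above (the paper's γ_v(x) becomes a_i(x); α and β are the Lipschitz-smoothness constants of f):

```latex
% For an (alpha, beta)-Lipschitz-smooth f, codes a_i(x) on bases phi_i,
% and reconstruction xbar = sum_i a_i(x) phi_i:
\[
  \Big| f(x) - \sum_i a_i(x)\, f(\phi_i) \Big|
  \;\le\;
  \alpha \,\big\| x - \bar{x} \big\|
  \;+\; \beta \sum_i \big| a_i(x) \big|\, \big\| \phi_i - \bar{x} \big\|^2,
  \qquad \bar{x} = \sum_i a_i(x)\, \phi_i .
\]
% The first term rewards accurate reconstruction; the second rewards
% locality (active bases close to x) -- hence "locally linear" coding.
```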
26. Function Approximation based on SVC [Zhou, Yu, Zhang &amp; Huang, ECCV 10]: piecewise local linear (first-order) approximation using a local tangent at each cluster center. (Figure: data points and cluster centers.)
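A hedged sketch of the super-vector idea; v(x) denotes the cluster center nearest to x (notation assumed, not from the slide):

```latex
% SVC augments the zeroth-order (piecewise-constant) VQ approximation
% with a local tangent term at the nearest cluster center v = v(x):
\[
  f(x) \;\approx\; f(v) \;+\; \nabla f(v)^\top (x - v),
\]
% i.e., a piecewise first-order (locally linear) approximation of f.
% The image feature then stacks, per cluster, an occupancy term and the
% displacement (x - v): the "super vector".
```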
27. PASCAL VOC Challenge 2009: No. 1 in 18 of 20 categories, using only the HOG feature on gray images. (Chart: per-class results, ours vs. best of other teams, with differences.)
28. ImageNet Challenge 2010 (1.4 million images, 1000 classes, top-5 hit rate): VQ + intersection kernel: ~40%; various coding methods + linear SVM: 64%~73%. (Chart: classification accuracy.)
29. Hierarchical Sparse Coding [Yu, Lin &amp; Lafferty, CVPR 11]: conv. filtering → pooling → conv. filtering → pooling; learning from unlabeled data.
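A hedged two-layer sketch of the [coding → pooling] stack, not the exact CVPR'11 model: the dictionaries are random here, pooling groups are one-dimensional for brevity, and sparse_encode is scikit-learn's batch LASSO coder:

```python
import numpy as np
from sklearn.decomposition import sparse_encode

def code(X, D, alpha=0.15):
    """Sparse-code the rows of X against dictionary D (rows are bases)."""
    return sparse_encode(X, D, algorithm="lasso_lars", alpha=alpha)

def pool(codes, group=4):
    """Max-pool each group of `group` consecutive code vectors."""
    n = (len(codes) // group) * group
    return np.abs(codes[:n].reshape(-1, group, codes.shape[1])).max(axis=1)

rng = np.random.default_rng(0)
normalize = lambda D: D / np.linalg.norm(D, axis=1, keepdims=True)
D1 = normalize(rng.standard_normal((128, 49)))   # layer-1 bases (learned in practice)
D2 = normalize(rng.standard_normal((256, 128)))  # layer-2 bases

patches = rng.standard_normal((64, 49))          # 64 vectorized 7x7 patches
h1 = pool(code(patches, D1))                     # layer 1: code + pool -> (16, 128)
h2 = pool(code(h1, D2))                          # layer 2: code + pool -> (4, 256)
```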
31. MNIST results: classification, HSC vs. CNN. HSC provides even better performance than CNN; more remarkably, HSC learns its features in an unsupervised manner.
32. MNIST results: effect of hierarchical learning. Comparing the Fisher scores of HSC and SC shows that discriminative power is significantly improved by HSC, even though HSC is unsupervised coding.
33. MNIST results: learned codebook. A single dimension in the second layer exhibits invariance to translation, rotation, and deformation.