Sparse Coding Extensions Visual Recognition

Sparse Coding and Its Extensions for Visual Recognition. Kai Yu, Media Analytics Department, NEC Labs America, Cupertino, CA

Visual Recognition is HOT in Computer Vision (12/30/11): Caltech 101, PASCAL VOC, 80 Million Tiny Images, ImageNet

The pipeline of machine visual perception: Low-level sensing -> Pre-processing -> Feature extraction -> Feature selection -> Inference (prediction, recognition). Annotation: "Most Efforts in Machine Learning".

Computer vision features: SIFT, Spin image, HoG, RIFT, GLOH. Slide credit: Andrew Ng

Learning everything from data: Low-level sensing -> Pre-processing -> Feature extraction -> Feature selection -> Inference (prediction, recognition). Annotation: "Machine Learning".

BoW + SPM Kernel: Bag-of-visual-words representation (BoW) based on vector quantization (VQ); Spatial pyramid matching (SPM) kernel. Figure credit: Fei-Fei Li, Svetlana Lazebnik
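
The VQ step behind the bag-of-visual-words histogram can be illustrated with a minimal sketch. It assumes a codebook is already available (for example from k-means, which is not shown) and uses hard nearest-codeword assignment; the function name `bow_histogram` and the toy arrays are hypothetical, and the spatial pyramid is left out.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Hard-assign each descriptor to its nearest codeword and return a normalized histogram."""
    # Squared Euclidean distances, shape (n_descriptors, n_codewords).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Vector quantization: index of the closest codeword for every descriptor.
    assignments = d2.argmin(axis=1)
    # Count how often each codeword is used, then normalize to sum to one.
    counts = np.bincount(assignments, minlength=codebook.shape[0])
    return counts / counts.sum()

# Toy data (assumed for illustration): 3 codewords and 4 local descriptors in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
descriptors = np.array([[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [4.0, 6.0]])
hist = bow_histogram(descriptors, codebook)
```

In an SPM-style pipeline the same histogram would be computed per spatial cell and the cell histograms concatenated.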

Winning Method in PASCAL VOC before 2009: Multiple Feature Sampling Methods -> Multiple Visual Descriptors -> VQ Coding, Histogram, SPM -> Nonlinear SVM

Convolution Neural Networks: Conv. Filtering -> Pooling -> Conv. Filtering -> Pooling

BoW+SPM: the same architecture. Local Gradients -> Pooling (e.g., SIFT, HOG) -> VQ Coding -> Average Pooling (obtain histogram) -> Nonlinear SVM

Develop better methods: Better Coding, Better Pooling, Scalable Linear Classifier

Sparse Coding: Sparse coding (Olshausen & Field, 1996). Originally developed to explain early visual processing in the brain (edge detection). Training: given a set of random patches x, learn a dictionary of bases [Φ1, Φ2, …]. Coding: for a data vector x, solve LASSO to find the sparse coefficient vector a.
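
The coding step can be sketched as follows, assuming the dictionary is already learned (a random unit-norm dictionary stands in for it here) and the LASSO is solved with plain ISTA (iterative soft-thresholding), which is only one of several possible solvers. The names `sparse_code_ista` and `lasso_objective`, the value of `lam`, and the toy sizes are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(x, D, a, lam):
    """0.5 * ||x - D a||^2 + lam * ||a||_1."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def sparse_code_ista(x, D, lam=0.1, n_iter=500):
    """Approximately minimize the LASSO objective over a with ISTA."""
    # Lipschitz constant of the gradient of the quadratic term (squared spectral norm).
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)                     # gradient of 0.5 * ||x - D a||^2
        a = soft_threshold(a - grad / L, lam / L)    # gradient step, then shrinkage
    return a

# Toy setup (assumed): 64 random unit-norm bases in 16 dimensions.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 64))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(64)
a_true[[35, 41, 62]] = [0.8, 0.3, 0.5]
x = D @ a_true
a = sparse_code_ista(x, D, lam=0.01)
```

Dictionary learning itself (the training step) would alternate a coding step like this with an update of the bases and is not shown.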

Sparse Coding Example. Natural images. Learned bases (Φ1, …, Φ64): “Edges”. Test example: x ≈ 0.8 * Φ36 + 0.3 * Φ42 + 0.5 * Φ63. [a1, …, a64] = [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, 0] (feature representation). Compact & easily interpretable. Slide credit: Andrew Ng

Self-taught Learning [Raina, Lee, Battle, Packer & Ng, ICML 07]. Unlabeled images … Motorcycles / Not motorcycles. Testing: What is this? Slide credit: Andrew Ng

Classification Result on Caltech 101 (9K images, 101 classes): SIFT VQ + Nonlinear SVM: 64%; Pixel Sparse Coding + Linear SVM: 50%

Sparse Coding on SIFT [Yang, Yu, Gong & Huang, CVPR09]: Local Gradients -> Pooling (e.g., SIFT, HOG) -> Sparse Coding -> Max Pooling -> Scalable Linear Classifier
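
The difference between the two pooling steps can be shown on a toy matrix of codes, one row per local descriptor. This is a sketch only: taking the maximum over absolute values is an assumption about how max pooling is applied to signed sparse codes, the function names are hypothetical, and the spatial pyramid cells are omitted.

```python
import numpy as np

def max_pool(codes):
    """Max pooling: strongest absolute response of each basis over all descriptors."""
    return np.abs(codes).max(axis=0)

def average_pool(codes):
    """Average pooling: mean response of each basis (a histogram when codes are one-hot VQ codes)."""
    return codes.mean(axis=0)

# Toy sparse codes (assumed): 3 local descriptors coded over 4 bases.
codes = np.array([[0.0, 0.8, 0.0, 0.0],
                  [0.0, 0.0, -0.3, 0.0],
                  [0.0, 0.5, 0.0, 0.0]])
image_feature = max_pool(codes)
mean_feature = average_pool(codes)
```

The pooled vector is what a linear classifier would consume as the image-level feature.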

Sparse Coding on SIFT [Yang, Yu, Gong & Huang, CVPR09], Caltech-101: SIFT VQ + Nonlinear SVM: 64%; SIFT Sparse Coding + Linear SVM: 73%

What have we learned? Local Gradients -> Pooling (e.g., SIFT, HOG) -> Sparse Coding -> Max Pooling -> Scalable Linear Classifier

MNIST Experiments: Error: 4.54%; Error: 3.75%; Error: 2.64%

Distribution of coefficients (SIFT, Caltech101): Neighbor bases tend to get nonzero coefficients


A Function Approximation View to Coding: A coding scheme is good if it helps learning f(x)

A Function Approximation View to Coding – The General Formulation: Function approx. error ≤ an unsupervised learning objective

Local Coordinate Coding (LCC). Yu, Zhang & Gong, NIPS 09; Wang, Yang, Yu, Lv, Huang, CVPR 10

Super-Vector Coding (SVC): Zero-order; Local tangent. Zhou, Yu, Zhang, and Huang, ECCV 10

Function Approximation based on LCC: data points, bases; locally linear. Yu, Zhang, Gong, NIPS 09
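
A locality-constrained code of the kind this slide relies on can be sketched as below. This is not the slide's own formulation (which did not survive extraction): it is a simplified variant that reconstructs x from its k nearest bases with weights summing to one, and the name `local_code`, the parameters `k` and `reg`, and the toy bases are all assumptions.

```python
import numpy as np

def local_code(x, bases, k=2, reg=1e-4):
    """Code x using only its k nearest bases, with coefficients that sum to one."""
    d2 = ((bases - x) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]                  # indices of the k nearest bases
    z = bases[idx] - x                        # shift the neighbours so x is the origin
    C = z @ z.T                               # local covariance of the shifted neighbours
    C += reg * (np.trace(C) + 1e-12) * np.eye(k)   # small ridge term for numerical stability
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                              # enforce the sum-to-one constraint
    code = np.zeros(bases.shape[0])
    code[idx] = w
    return code

# Toy data (assumed): 4 bases in 2-D and one data point between the first two.
bases = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
x = np.array([0.5, 0.0])
gamma = local_code(x, bases, k=2)
reconstruction = gamma @ bases
```

Because the coefficients sum to one and reconstruct x from nearby bases, a function that is linear in the code behaves like a locally linear function of x.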

Function Approximation based on SVC: data points, cluster centers; piecewise local linear (first-order); local tangent. Zhou, Yu, Zhang, and Huang, ECCV 10
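
One way to read the zero-order and local-tangent terms is as a block code around the nearest cluster center, sketched below. Hard assignment to a single center, the constant `s`, the function name `super_vector_code`, and the toy centers are assumptions for illustration and may differ from the formulation in the cited paper.

```python
import numpy as np

def super_vector_code(x, centers, s=1.0):
    """Block code: only the block of the nearest center is nonzero, holding [s, x - center]."""
    K, d = centers.shape
    k = int(((centers - x) ** 2).sum(axis=1).argmin())   # nearest cluster center
    code = np.zeros(K * (d + 1))
    start = k * (d + 1)
    code[start] = s                                      # zero-order term
    code[start + 1:start + 1 + d] = x - centers[k]       # first-order (local tangent) term
    return code

# Toy data (assumed): 2 cluster centers in 2-D.
centers = np.array([[0.0, 0.0], [4.0, 4.0]])
x = np.array([3.5, 4.5])
phi = super_vector_code(x, centers)
```

A linear function of `phi` then equals a constant plus a linear term in (x - center) within each cluster, which is the piecewise local linear approximation named on the slide.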

PASCAL VOC Challenge 2009: No.1 for 18 of 20 categories. We used only HOG feature on gray images. Table columns: Classes, Ours, Best of Other Teams, Difference

ImageNet Challenge 2010 (1.4 million images, 1000 classes, top5 hit rate): VQ + Intersection Kernel: ~40%; Various Coding Methods + Linear SVM: 64%~73%; 50% classification accuracy

Hierarchical sparse coding (Yu, Lin, & Lafferty, CVPR 11): Conv. Filtering -> Pooling -> Conv. Filtering -> Pooling. Learning from unlabeled data

A two-layer sparse coding formulation

MNIST results -- classification. HSC vs. CNN: HSC provides even better performance than CNN; more amazingly, HSC learns features in an unsupervised manner!

MNIST results -- effect of hierarchical learning. Comparing the Fisher score of HSC and SC: discriminative power is significantly improved by HSC, although HSC is unsupervised coding.

MNIST results -- learned codebook. One dimension in the second layer: invariance to translation, rotation, and deformation

Caltech101 results -- classification. Learned descriptor: performs slightly better than SIFT + SC

Caltech101 results -- learned codebook. First layer bases: very much like edge detectors.

Conclusion and Future Work

