# Beyond The Euclidean Distance: Creating effective visual codebooks using the histogram intersection kernel



**Slide 1: Beyond the Euclidean distance: Creating effective visual codebooks using the histogram intersection kernel**

- Authors: Jianxin Wu and James Rehg (Georgia Institute of Technology)
- Presenter: Shao-Chuan Wang
**Slide 2: Beyond the Euclidean distance**

Key ideas:

- Use the histogram intersection kernel (HIK) to create the visual codebook, because most descriptors are histogram-based features.
- Kernel k-means (using HIK)
- One-class SVM (using HIK)

Conclusions:

- One-class SVM with HIK performs best.
- K-median is a compromise (comparable with HIK k-means).
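For concreteness, the histogram intersection kernel on two histograms is simply the sum of element-wise minima; a minimal NumPy sketch (illustrative, not the authors' code):

```python
import numpy as np

def hik(x, y):
    """Histogram intersection kernel: K(x, y) = sum_j min(x_j, y_j)."""
    return np.minimum(x, y).sum()

# Two L1-normalized 3-bin histograms.
h1 = np.array([0.2, 0.5, 0.3])
h2 = np.array([0.4, 0.4, 0.2])
k = hik(h1, h2)  # min per bin: 0.2, 0.4, 0.2
```

For L1-normalized histograms the value lies in [0, 1] and reaches 1 only when the two histograms coincide, so it directly measures bin-wise overlap.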
**Slide 3: Background: Bag of Visual Words**

- Codebook construction (find D): clustering-based, such as k-means (the focus of this paper)
- Assignment of descriptors to visual words (find α), subject to some constraints; hard assignment partitions the feature space into a Voronoi diagram
- Pooling (sum pooling to construct histograms)
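The standard Euclidean baseline for the assignment and pooling steps can be sketched as follows (toy data; the codebook here is hypothetical):

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Baseline BoVW encoding: hard-assign each descriptor to its nearest
    codeword under Euclidean distance, then sum-pool into a histogram."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    words = d2.argmin(axis=1)            # alpha: one codeword per descriptor
    return np.bincount(words, minlength=len(codebook))

# Toy data: three 2-D descriptors, a hypothetical 2-word codebook.
descs = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]])
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
hist = bovw_histogram(descs, codebook)   # word counts per codebook entry
```

The paper's point is precisely that this Euclidean nearest-codeword step is a poor fit for histogram features, motivating the HIK-based alternatives below.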
**Slide 4: Kernel K-means (1/2)**

At iteration t:

- Find the nearest centroid among the K centroids.
- Update each centroid by averaging its newly assigned atoms.
**Slide 5: Kernel K-means (2/2)**

In kernel space the centroid mᵢ of cluster πᵢ is implicit; the squared distance from φ(x) to it expands as

‖φ(x) − mᵢ‖² = K(x, x) − (2/|πᵢ|) Σ_{j∈πᵢ} K(x, xⱼ) + (1/|πᵢ|²) Σ_{j,k∈πᵢ} K(xⱼ, xₖ)   (1)
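A minimal kernel k-means sketch with HIK (plain NumPy, full O(n²) Gram matrix; a didactic sketch, not the paper's fast implementation):

```python
import numpy as np

def hik(x, y):
    return np.minimum(x, y).sum()

def kernel_kmeans(X, k, iters=20, seed=0):
    """Kernel k-means with the histogram intersection kernel.
    Centroids stay implicit; distances use the standard kernel expansion
    K(x,x) - 2*mean_j K(x,x_j) + mean_{j,l} K(x_j,x_l) over each cluster."""
    n = len(X)
    K = np.array([[hik(a, b) for b in X] for a in X])
    labels = np.random.default_rng(seed).integers(0, k, n)
    for _ in range(iters):
        new = np.empty(n, dtype=int)
        for i in range(n):
            d = np.full(k, np.inf)
            for c in range(k):
                idx = np.flatnonzero(labels == c)
                if idx.size:
                    d[c] = K[i, i] - 2 * K[i, idx].mean() + K[np.ix_(idx, idx)].mean()
            new[i] = int(d.argmin())
        if np.array_equal(new, labels):
            break
        labels = new
    return labels
```

Because only kernel values are used, the same loop works for any positive-definite kernel; swapping `hik` for a dot product recovers ordinary k-means.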
**Slide 6: Contribution 1: fast evaluation of HIK**

- Following Maji et al. (2008), transform the features from ℝ₊ᵈ to ℕᵈ; the evaluation of (1) can then be reduced to O(d).
- The per-cluster sums are pre-computed into a lookup table.
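The trick can be sketched as follows: once histogram entries are non-negative integers bounded by some maximum value, the per-cluster sum Σ_{x∈π} min(v, x_j) depends only on the pair (j, v), so it can be tabulated once per cluster and each query then costs O(d) lookups. Variable names here are illustrative, not from the paper:

```python
import numpy as np

def build_table(members, vmax):
    """T[j, v] = sum over cluster members x of min(v, x[j]).
    members: (n, d) integer histograms; vmax: largest feature value."""
    d = members.shape[1]
    T = np.zeros((d, vmax + 1), dtype=np.int64)
    for v in range(vmax + 1):
        T[:, v] = np.minimum(members, v).sum(axis=0)
    return T

def hik_sum_fast(q, T):
    """Sum of HIK(q, x) over all cluster members, in O(d) lookups."""
    return int(sum(T[j, q[j]] for j in range(len(q))))
```

Building the table costs O(n·d·vmax), but it is amortized: the same table serves every distance evaluation against that cluster inside each kernel k-means iteration.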
**Slide 7: Contribution 2: Encoding via One-class SVM**

Example one-class SVMs in 2D using a Gaussian kernel (two parameter settings shown):

- γ = 0.01, C = 2000
- γ = 0.1, C = 2000
**Slide 8: Contribution 2: Encoding via One-class SVM**

- Use kernel k-means (with HIK) to create a codebook of size K.
- Train K one-class SVMs, one per cluster.
- Assign the visual word according to the maximum response among the K SVMs (α: Lagrange multiplier).
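The encoding rule can be sketched as follows: each cluster c has a trained one-class SVM with support vectors xᵢ, Lagrange multipliers αᵢ, and offset ρ, and a descriptor q is assigned to the word whose decision value f_c(q) = Σᵢ αᵢ K(q, xᵢ) − ρ is largest. Training is omitted here, and the support vectors and multipliers below are hypothetical:

```python
import numpy as np

def hik(x, y):
    return np.minimum(x, y).sum()

def svm_response(q, svs, alphas, rho):
    """One-class SVM decision value f(q) = sum_i alpha_i * K(q, x_i) - rho."""
    return sum(a * hik(q, sv) for a, sv in zip(alphas, svs)) - rho

def encode(q, models):
    """Assign q the index of the one-class SVM with the maximum response."""
    return int(np.argmax([svm_response(q, *m) for m in models]))

# Hypothetical two-word codebook: one support vector, alpha=1, rho=0 per model.
models = [([np.array([1.0, 0.0])], [1.0], 0.0),
          ([np.array([0.0, 1.0])], [1.0], 0.0)]
word = encode(np.array([0.9, 0.1]), models)  # responds most to the first model
```

Compared with nearest-centroid assignment, the learned boundary lets each word claim a region shaped by its cluster's density rather than a fixed Voronoi cell.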
**Slide 9: Contribution 3: Comparison with the K-median Codebook**

K-median clustering:

- Find the nearest centroid using the L1 distance.
- Update each centroid to the median of its newly assigned atoms: the median is the minimizer of the problem min_z Σᵢ ‖xᵢ − z‖₁.
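A minimal k-median sketch (plain NumPy; illustrative, not the authors' implementation). The coordinate-wise median solves the per-cluster L1 minimization above, which is why the update step uses `np.median`:

```python
import numpy as np

def kmedian(X, k, iters=20, seed=0):
    """K-median clustering: L1 assignment + coordinate-wise median update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=-1)  # L1
        labels = d.argmin(axis=1)
        for c in range(k):
            pts = X[labels == c]
            if len(pts):
                centers[c] = np.median(pts, axis=0)  # minimizes L1 cost
    return centers, labels
```

Structurally this is k-means with the squared-Euclidean distance and mean swapped for the L1 distance and median, which is why the slide calls it a compromise.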
**Slide 10: Some engineering details**

- Pyramid overlapping pooling strategy: 31 sub-windows ⇒ a 31K-dimensional vector (K = codebook size).
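The mechanics can be sketched as follows. Note the paper uses an overlapping layout with 31 sub-windows; this sketch uses a plain non-overlapping pyramid (1 + 4 + 16 = 21 sub-windows) only to show the pooling-and-concatenation idea:

```python
import numpy as np

def pyramid_pool(word_map, K, grids=(1, 2, 4)):
    """Concatenate per-sub-window visual-word histograms.
    word_map: 2-D array of visual-word indices in [0, K)."""
    H, W = word_map.shape
    feats = []
    for g in grids:
        for r in range(g):
            for c in range(g):
                sub = word_map[r * H // g:(r + 1) * H // g,
                               c * W // g:(c + 1) * W // g]
                feats.append(np.bincount(sub.ravel(), minlength=K))
    return np.concatenate(feats)  # length K * (1 + 4 + 16) for the defaults
```

Each sub-window contributes its own K-bin histogram, so finer grids preserve coarse spatial layout that a single global histogram would discard.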
**Slide 11: Some engineering details**

- Concatenation with the Sobel-filtered image ⇒ 31K × 2 = 62K-dimensional image representation. (Pictures from Wikipedia.)
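For reference, a minimal horizontal Sobel filter in plain NumPy; the kernel values are the standard Sobel x-kernel, and the paper's pipeline runs the same feature extraction on this filtered image before concatenating:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])  # standard horizontal-gradient kernel

def sobel_x(img):
    """Sobel x-response over the valid interior (borders left at zero)."""
    out = np.zeros(img.shape, dtype=float)
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = (SOBEL_X * img[i - 1:i + 2, j - 1:j + 2]).sum()
    return out
```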
**Slide 12: Some engineering details**

- SIFT features for Caltech 101, CENTRIST for the other datasets.
- Codebook size K = 200; pyramid levels L = 0, 1, 2.
- One-vs-one SVM for the smaller datasets, BSVM for Caltech 101.
- Random train/test splitting is repeated 5 times.
**Slide 13: Results: Caltech 101**

Legend: B / no B: with/without Sobel concatenation; s: grid step size of dense SIFT extraction; oc_svm: one-class SVM encoding; k_HI: histogram intersection kernel.
**Slide 14: Results: Scene 15**

Legend: B / no B: with/without Sobel concatenation; s: grid step size of dense SIFT extraction; oc_svm: one-class SVM encoding; k_HI: histogram intersection kernel.
**Slide 15: Conclusions**

- The HIK visual codebook improves classification accuracy.
- K-median is a compromise between k-means and HIK k-means.
- One-class SVM encoding helps build a more compact representation.
- Open question from the results: is a smaller step size always better?