Beyond The Euclidean Distance: Creating effective visual codebooks using the histogram intersection kernel

  1. Beyond the Euclidean distance: Creating effective visual codebooks using the histogram intersection kernel
     Authors: Jianxin Wu and James Rehg (Georgia Institute of Technology)
     Presenter: Shao-Chuan Wang
  2. Beyond the Euclidean distance
     Key ideas:
       Use the histogram intersection kernel (HIK) to create the visual codebook, because most descriptors are histogram-based features
       Kernel K-means (using HIK)
       One-class SVM (using HIK)
     Conclusions:
       One-class SVM with HIK performs best
       K-median is a compromise (comparable with HIK K-means)
  3. Background: Bag of Visual Words
     Codebook construction (find D)
       Clustering-based, such as k-means
     Assignment of descriptors to visual words (find alpha)
     Pooling (sum pooling to construct histograms)
     [Figure labels: "← focus of this paper"; "Voronoi diagram"; "Subject to some constraints"]
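The three stages on this slide can be sketched in a few lines. This is a minimal illustration, assuming hard assignment under squared Euclidean distance (the baseline the talk moves away from); the function names and the toy codebook are hypothetical.

```python
def hard_assign(descriptor, codebook):
    """Return the index of the nearest visual word (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sqdist(descriptor, codebook[k]))

def bow_histogram(descriptors, codebook):
    """Sum pooling: count how many descriptors fall on each visual word."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[hard_assign(d, codebook)] += 1
    return hist

# Toy data: a 2-word codebook and three 2-D descriptors (illustrative only).
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [9.0, 11.0], [0.0, 2.0]]
hist = bow_histogram(descriptors, codebook)
```

Hard assignment partitions descriptor space into the Voronoi cells of the codebook words, which is what the "Voronoi diagram" label on the slide refers to.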
  4. Kernel K-means (1/2)
     Iteration t:
       Find the nearest centroid among the K centroids:
       Update the centroids by averaging the newly assigned atoms
  5. Kernel K-means (2/2)
     (1)
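The equations on the two kernel K-means slides are not part of the transcript. As a hedged sketch, the code below uses the standard kernel K-means identity, in which the squared feature-space distance from a point x to the mean of cluster C is K(x, x) - (2/|C|) Σ_j K(x, x_j) + (1/|C|²) Σ_j Σ_k K(x_j, x_k); its notation may differ from the slide's equation (1). The function names and the toy data are assumptions made for illustration.

```python
def hik(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))

def kernel_kmeans(points, init_labels, k, kernel=hik, max_iter=20):
    """Kernel K-means; centroids live implicitly in feature space."""
    n = len(points)
    labels = list(init_labels)
    gram = [[kernel(points[i], points[j]) for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        clusters = [[j for j in range(n) if labels[j] == c] for c in range(k)]
        # Third term of the distance: depends only on the cluster, not on x.
        self_term = [
            sum(gram[a][b] for a in m for b in m) / len(m) ** 2 if m else None
            for m in clusters
        ]
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # skip empty clusters
                cross = sum(gram[i][j] for j in members) / len(members)
                d = gram[i][i] - 2.0 * cross + self_term[c]
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
    return labels

# Toy histograms: two peaked on the first bin, two on the second.
points = [[5, 0], [4, 1], [0, 5], [1, 4]]
labels = kernel_kmeans(points, init_labels=[0, 0, 1, 0], k=2)
```

Like ordinary k-means, this converges only to a local optimum, so the result depends on the initial labels.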
  6. Contribution 1: fast evaluation of HIK
     Based on (Maji et al. 2008): by transforming features from R^d_+ into N^d, the evaluation of (1) can be reduced to O(d)
     => pre-compute a lookup table
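One way to see why integer-valued features make a lookup table possible: the sum of HIK values between a query x and all members of a cluster separates over dimensions, so for each dimension and each possible integer value the partial sum can be stored in advance. The sketch below illustrates that idea only; the table layout, the function names, and the bound `vmax` on feature values are assumptions, not necessarily the paper's exact scheme.

```python
def build_table(members, vmax):
    """table[j][v] = sum over cluster members m of min(v, m[j]), for v in 0..vmax."""
    d = len(members[0])
    return [
        [sum(min(v, m[j]) for m in members) for v in range(vmax + 1)]
        for j in range(d)
    ]

def kernel_sum(x, table):
    """Sum of HIK(x, m) over all members, in O(d) lookups regardless of cluster size."""
    return sum(table[j][x[j]] for j in range(len(x)))

# Toy cluster with integer features in 0..5 (illustrative only).
members = [[5, 0], [4, 1]]
table = build_table(members, vmax=5)
query = [1, 4]
fast = kernel_sum(query, table)
direct = sum(sum(min(a, b) for a, b in zip(query, m)) for m in members)
```

The table costs O(d · vmax) memory per cluster and has to be rebuilt whenever cluster membership changes, in exchange for a query cost that no longer depends on the number of members.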
  7. Contribution 2: Encoding via One-class SVM
     Example one-class SVM in 2D using a Gaussian kernel:
       [Figure panels: Gamma = 0.01, C = 2000; Gamma = 0.1, C = 2000]
  8. Contribution 2: Encoding via One-class SVM
     Use kernel K-means (with HIK) to create a codebook of size K.
     Train K one-class SVMs, one per cluster.
     Assign each descriptor to the word whose SVM gives the maximum response among the K SVMs.
     [symbol]: Lagrangian multiplier
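The encoding rule on this slide can be written out as follows. The slide's own equation is not in the transcript, so this uses the standard one-class SVM decision function; the symbols (x_{ij} for the j-th training descriptor of cluster i, α_{ij} for its Lagrange multiplier, ρ_i for the offset, κ_HI for the histogram intersection kernel) are assumed notation, and whether the paper keeps the offset term is likewise an assumption.

```latex
f_i(x) = \sum_{j} \alpha_{ij}\, \kappa_{HI}(x_{ij}, x) - \rho_i,
\qquad
\mathrm{word}(x) = \arg\max_{1 \le i \le K} f_i(x)
```

Only support vectors have nonzero α_{ij}, so each f_i is determined by a subset of the cluster's descriptors rather than by the cluster mean.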
  9. Contribution 3: Comparison with K-median Codebook
     K-median clustering:
       Find the nearest centroid using the L1 distance
       Update each centroid by taking the median of its newly assigned atoms.
     The median is the minimizer of the following optimization problem:
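The optimization problem referred to on this slide is not in the transcript; the standard fact is that, in each dimension, the median m minimizes Σ_j |x_j - m|, so the per-dimension median minimizes the summed L1 distance to a cluster's members. A minimal sketch of one K-median iteration under that reading follows; the function names and toy data are hypothetical.

```python
from statistics import median

def l1(a, b):
    """L1 (Manhattan) distance between two vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))

def kmedian_step(points, centroids):
    """One K-median iteration: L1 assignment, then per-dimension median update."""
    labels = [
        min(range(len(centroids)), key=lambda c: l1(p, centroids[c]))
        for p in points
    ]
    new_centroids = []
    for c in range(len(centroids)):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            # zip(*members) iterates over dimensions (columns).
            new_centroids.append([median(col) for col in zip(*members)])
        else:
            new_centroids.append(list(centroids[c]))  # keep an empty cluster's centroid
    return labels, new_centroids

points = [[0, 0], [1, 0], [8, 1], [9, 9], [10, 10], [11, 10]]
labels, centroids = kmedian_step(points, [[0, 0], [10, 10]])
```

In this toy run the first cluster contains the outlying point [8, 1], yet its median centroid stays at [1, 0]; the mean would be pulled to [3, 1/3].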
  10. Some engineering details
      Pyramid overlapping pooling strategy
      31 subwindows => 31K-dimensional vector (31 × K, where K is the codebook size)
  11. Some engineering details
      Concatenation of the Sobel image
      [Figure: pictures from Wikipedia]
      => 31K × 2 = 62K-dimensional image representation
  12. Some engineering details
      SIFT for Caltech, CENTRIST for the others
      Codebook size K = 200
      Pyramid levels L = 0, 1, 2
      One-vs-one SVM for the smaller datasets; BSVM for Caltech 101
      Random splitting is repeated 5 times.
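The dimension counts on the engineering slides can be checked with a little arithmetic. The sketch below assumes that pyramid level l contributes a 2^l × 2^l grid of windows plus (2^l - 1)² shifted, overlapping windows; that layout is an assumption chosen because it reproduces the 31 subwindows quoted on the slides for levels 0, 1, 2. The function name is hypothetical.

```python
def num_subwindows(max_level):
    """Count pooling windows for levels 0..max_level (assumed overlapping layout)."""
    total = 0
    for level in range(max_level + 1):
        grid = 2 ** level
        total += grid * grid + (grid - 1) ** 2  # regular grid + shifted windows
    return total

K = 200                       # codebook size from the slides
windows = num_subwindows(2)   # levels L = 0, 1, 2
dim = windows * K             # "31K": 31 histograms of K bins each
dim_with_sobel = 2 * dim      # "62K": original image plus Sobel image
```

With K = 200 this gives 6,200 dimensions per image, or 12,400 when the Sobel image is concatenated; the "K" in "31K" and "62K" is the codebook size, not a factor of one thousand.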
  13. Results: Caltech 101
      B, not B: with or without concatenation of the Sobel image
      s: grid step size of dense SIFT extraction
      oc_{svm}: one-class SVM encoding
      k_{HI}: using the histogram intersection kernel
  14. Results: Scene 15
      B, not B: with or without concatenation of the Sobel image
      s: grid step size of dense SIFT extraction
      oc_{svm}: one-class SVM encoding
      k_{HI}: using the histogram intersection kernel
  15. Conclusions
      HIK visual codebook improves classification accuracy.
      K-median is a compromise between k-means and HIK.
      One-class SVM encoding helps build a more compact representation.
      Smaller step size is better?