- 1. Object Detection Using a Max-Margin Hough Transform. Subhransu Maji and Jitendra Malik, University of California at Berkeley, Berkeley, CA 94720. CVPR 2009, Miami, Florida
- 2. Overview <ul><li>Overview of probabilistic Hough transform </li></ul><ul><li>Learning framework </li></ul><ul><li>Experiments </li></ul><ul><li>Summary </li></ul>
- 3. Our Approach: Hough Transform <ul><li>Popular for detecting parameterized shapes </li></ul><ul><ul><li>Hough’59, Duda&Hart’72, Ballard’81,… </li></ul></ul><ul><li>Local parts vote for object pose </li></ul><ul><li>Complexity : # parts * # votes </li></ul><ul><ul><li>Can be significantly lower than brute force search over pose (for example sliding window detectors) </li></ul></ul>
- 4. Generalized to object detection Learning <ul><li>Learn appearance codebook </li></ul><ul><ul><li>Cluster over interest points on training images </li></ul></ul><ul><li>Use Hough space voting to find objects </li></ul><ul><ul><li>Lowe’99, Leibe et al.’04,’08, Opelt&Pinz’08 </li></ul></ul><ul><li>Implicit Shape Model </li></ul><ul><ul><li>Leibe et al.’04,’08 </li></ul></ul><ul><li>Learn spatial distributions </li></ul><ul><ul><li>Match codebook to training images </li></ul></ul><ul><ul><li>Record matching positions on object </li></ul></ul><ul><ul><li>Centroid is given </li></ul></ul>[Figure: spatial occurrence distributions over (x, y, s)]
- 5. Detection Pipeline. Stages: Interest Points (e.g. SIFT, GB, local patches), Matched Codebook Entries (KD tree), Probabilistic Voting. Figure source: B. Leibe, A. Leonardis, and B. Schiele, “Combined object categorization and segmentation with an implicit shape model”, 2004
- 6. Probabilistic Hough Transform <ul><li>C – Codebook </li></ul><ul><li>f – features, l – locations </li></ul>Equation labels: Detection Score, Position Posterior, Codeword Match, Codeword Likelihood
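The voting step named on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `hough_vote`, its argument layout, and the uniform position posterior over recorded offsets are all assumptions made for the example. Each matched codeword casts votes for the object center, and each vote is scaled by the position posterior, the codeword match probability, and a per-codeword weight.

```python
import numpy as np

def hough_vote(locations, matches, offsets, weights, shape):
    """Accumulate weighted votes for the object center (illustrative sketch).

    locations: (J, 2) integer (row, col) positions l_j of the features
    matches:   list of J lists of (codeword index i, p(C_i | f_j)) pairs
    offsets:   dict i -> (K_i, 2) integer offsets from part to object center,
               as recorded on training images
    weights:   (N,) per-codeword weight (stands in for the codeword likelihood)
    shape:     (H, W) of the vote accumulator
    """
    acc = np.zeros(shape)
    for (r, c), matched in zip(locations, matches):
        for i, p_match in matched:
            votes = offsets[i]
            # Assumption: position posterior is uniform over recorded offsets.
            p_pos = 1.0 / len(votes)
            for dr, dc in votes:
                rr, cc = r + dr, c + dc
                if 0 <= rr < shape[0] and 0 <= cc < shape[1]:
                    acc[rr, cc] += p_pos * p_match * weights[i]
    return acc

# Two features whose codewords both point at the cell (3, 3).
locations = np.array([[2, 2], [4, 4]])
matches = [[(0, 1.0)], [(1, 1.0)]]
offsets = {0: np.array([[1, 1]]), 1: np.array([[-1, -1]])}
weights = np.array([0.5, 0.5])
acc = hough_vote(locations, matches, offsets, weights, (6, 6))
peak = np.unravel_index(np.argmax(acc), acc.shape)
```

The cost of the loop is the number of matched parts times the number of votes per part, which is the complexity quoted earlier in the talk.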
- 7. Learning Feature Weights <ul><li>Given : </li></ul><ul><ul><li>Appearance Codebook, C </li></ul></ul><ul><ul><li>Posterior distribution of object center for each codeword P(x|…) </li></ul></ul><ul><li>To Do : </li></ul><ul><ul><li>Learn codebook weights such that the Hough transform detector works well (i.e. better detection rates) </li></ul></ul><ul><li>Contributions : </li></ul><ul><ul><li>Show that these weights can be learned optimally using a max-margin framework. </li></ul></ul><ul><ul><li>Demonstrate that this leads to improved accuracy on various datasets </li></ul></ul>
- 8. Learning Feature Weights: First Try <ul><li>Naïve Bayes weights: </li></ul><ul><li>Encourages relatively rare parts </li></ul><ul><li>However, rare parts may not be good predictors of the object location </li></ul><ul><li>Need to jointly consider both priors and distribution of location centers. </li></ul>
- 9. Learning Feature Weights: Second Try <ul><li>Location invariance assumption </li></ul><ul><li>Overall score is linear given the matched codebook entries </li></ul>Equation labels: Position Posterior, Codeword Match, Codeword Likelihood, Activations, Feature Weights
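The equations on these slides did not survive extraction. A form consistent with the surviving labels (position posterior, codeword match, codeword likelihood, activations, feature weights) is sketched below. The symbols $S$, $w_i$ and $a_i$ are assumed names chosen for this note, so treat it as a reconstruction rather than the slide's exact notation.

```latex
% Detection score: every feature f_j at location l_j votes through each codeword C_i
S(O, x) = \sum_{j} \sum_{i}
    \underbrace{p(x \mid O, C_i, l_j)}_{\text{position posterior}}
    \,\underbrace{p(C_i \mid f_j)}_{\text{codeword match}}
    \,\underbrace{p(O \mid C_i, l_j)}_{\text{codeword likelihood}}

% Location invariance assumption: p(O \mid C_i, l_j) = p(O \mid C_i) =: w_i
S(O, x) = \sum_{i} w_i \, a_i(x),
\qquad
a_i(x) = \sum_{j} p(x \mid O, C_i, l_j)\, p(C_i \mid f_j)
```

Under the invariance assumption the score is linear in the weights $w_i$, with the activations $a_i(x)$ fixed by the codebook matches. This linearity is what makes a max-margin formulation applicable.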
- 10. Max-Margin Training <ul><li>Training: </li></ul><ul><li>Construct dictionary </li></ul><ul><li>Record codeword distributions on training examples </li></ul><ul><li>Compute “a” vectors on positive and negative training examples </li></ul><ul><li>Learn codebook weights using max-margin training </li></ul>Step annotations: Standard ISM model (Leibe et al.’04); Our Contribution. Equation labels: class label {+1,-1}, activations, non negative
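The last training step can be illustrated with a small solver. The sketch below assumes that the “non negative” label refers to a constraint on the weights, and it uses a projected subgradient method on a soft-margin hinge objective. That solver is a stand-in chosen for brevity, not the optimizer used by the authors. The function name `train_max_margin` and the toy data are hypothetical.

```python
import numpy as np

def train_max_margin(A, y, C=1.0, lr=0.01, epochs=2000):
    """Soft-margin linear classifier with non-negative weights (sketch).

    Minimizes 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . A_i + b))
    subject to w >= 0, using projected subgradient descent.

    A: (n, d) activation vectors, one row per training example
    y: (n,) class labels in {+1, -1}
    """
    n, d = A.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (A @ w + b)
        viol = margins < 1  # examples inside the margin or misclassified
        grad_w = w - C * (y[viol, None] * A[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        # Gradient step, then projection onto the constraint set w >= 0.
        w = np.maximum(w - lr * grad_w, 0.0)
        b -= lr * grad_b
    return w, b

# Toy data: codeword 0 fires on positives, codeword 1 on negatives.
A = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_max_margin(A, y)
pred = np.sign(A @ w + b)
```

On this toy set the codeword that fires on background receives zero weight, while the codeword that fires on the object receives a positive one.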
- 11. Experiment Datasets <ul><li>ETHZ Shape Dataset (Ferrari et al., ECCV 2006): 255 images over 5 classes (Apple logo, Bottle, Giraffe, Mug, Swan) </li></ul><ul><li>UIUC Single Scale Cars Dataset (Agarwal & Roth, ECCV 2002): 1050 training, 170 test images </li></ul><ul><li>INRIA Horse Dataset (Jurie & Ferrari): 170 positive + 170 negative images (50 + 50 for training) </li></ul>
- 12. Experimental Results <ul><li>Hough transform details </li></ul><ul><ul><li>Interest points: Geometric Blur descriptors at sparse sample of edges (Berg&Malik’01) </li></ul></ul><ul><ul><li>Codebook constructed using k-means </li></ul></ul><ul><ul><li>Voting over position and aspect ratio </li></ul></ul><ul><ul><li>Search over scales </li></ul></ul><ul><li>Correct detections (PASCAL criterion) </li></ul>
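The PASCAL criterion counts a detection as correct when the intersection-over-union of the predicted and ground-truth boxes exceeds 0.5. A minimal sketch follows, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates. The function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_correct_detection(pred, truth, threshold=0.5):
    """PASCAL-style check: overlap must exceed the threshold."""
    return iou(pred, truth) > threshold

half_shifted = iou((0, 0, 10, 10), (5, 0, 15, 10))
tight = is_correct_detection((0, 0, 10, 10), (0, 0, 10, 8))
disjoint = is_correct_detection((0, 0, 10, 10), (20, 20, 30, 30))
```

A box shifted by half its width overlaps the ground truth with an IoU of one third, so it would not count as a correct detection.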
- 13. Learned Weights (ETHZ shape). Panels: Naïve Bayes, Max-Margin, Important Parts. Color scale: blue (low) to dark red (high). Annotation: influenced by clutter (rare structures)
- 14. Learned Weights (UIUC cars). Panels: Naïve Bayes, Max-Margin, Important Parts. Color scale: blue (low) to dark red (high)
- 15. Learned Weights (INRIA horses). Panels: Naïve Bayes, Max-Margin, Important Parts. Color scale: blue (low) to dark red (high)
- 16. Detection Results (ETHZ dataset) Recall @ 1.0 False Positives Per Window
- 17. Detection Results (INRIA Horses) Our Work
- 18. Detection Results (UIUC Cars) Our Work
- 19. Hough Voting + Verification Classifier (ETHZ Shape Dataset): Recall @ 0.3 False Positives Per Image. IKSVM was run on top 30 windows + local search. Compared methods: KAS (Ferrari et al., PAMI’08), TPS-RPM (Ferrari et al., CVPR’07). Annotations: better fitting bounding box; implicit sampling over aspect-ratio
- 20. Hough Voting + Verification Classifier IKSVM was run on top 30 windows + local search Our Work
- 21. Hough Voting + Verification Classifier (UIUC Single Scale Car Dataset): IKSVM was run on top 10 windows + local search. Annotation: 1.7% improvement
- 22. Summary <ul><li>Hough transform based detectors offer good detection performance and speed. </li></ul><ul><li>To get better performance one may learn </li></ul><ul><ul><li>Discriminative dictionaries (two talks ago, Gall et al.’09) </li></ul></ul><ul><ul><li>Weights on codewords (our work) </li></ul></ul><ul><li>Our approach directly optimizes detection performance using a max-margin formulation </li></ul><ul><li>Any weak predictor of object center can be used in this framework </li></ul><ul><ul><li>E.g. Regions (one talk ago, Gu et al. CVPR’09) </li></ul></ul>
- 23. Acknowledgements <ul><li>Work partially supported by: </li></ul><ul><li>ARO MURI W911NF-06-1-0076 and ONR MURI N00014-06-1-0734 </li></ul><ul><li>Computer Vision Group @ UC Berkeley </li></ul>Thank You. Questions?
- 24. Backup Slide : Toy Example Rare but poor localization Rare and good localization

