Iccv2009 recognition and learning object categories p2 c01 - recognizing a large number of object classes

1.
Large Scale Recognitionand Retrieval

2.
What does theworld look like?Scaling to billions of imagesObject Recognition for large-scale searchFocus on scaling rather than understanding imageHigh level image statistics

3.
Content-Based Image RetrievalVarietyof simple/hand-designed cues:Color and/or Texture histograms, Shape, PCA, etc.Various distance metricsEarth Movers Distance (Rubner et al. ‘98)QBIC from IBM (1999)Blobworld, Carson et al. 2002

4.
Some vision techniquesfor large scale recognitionEfficient matching methodsPyramid Match KernelLearning to compare imagesMetrics for retrievalLearning compact descriptors

5.

6.
Matching features incategory-levelrecognition

7.
Comparing sets oflocal featuresPrevious strategies: Match features individually, vote on small sets to verify [Schmid, Lowe, Tuytelaars et al.]Explicit search for one-to-one correspondences [Rubner et al., Belongie et al., Gold & Rangarajan, Wallraven & Caputo, Berg et al., Zhang et al.,…]Bag-of-words: Compare frequencies of prototype features [Csurka et al., Sivic & Zisserman, Lazebnik & Ponce]Slide credit: Kristen Grauman

8.
Pyramid match kerneloptimalpartial matching[Grauman & Darrell, ICCV 2005]Optimal match: O(m3)Pyramid match: O(mL)m = # featuresL = # levels in pyramidSlide credit: Kristen Grauman

9.
Pyramid match: mainideaFeature space partitions serve to “match” the local descriptors within successively wider regions.descriptor spaceSlide credit: Kristen Grauman

10.
Pyramid match: mainideaHistogram intersection counts number of possible matches at a given partitioning.Slide credit: Kristen Grauman

11.
Computing the partialmatchingEarth Mover’s Distance[Rubner, Tomasi, Guibas 1998]Hungarian method[Kuhn, 1955]Greedy matching …Pyramid match [Grauman and Darrell, ICCV 2005]for sets with features of dimension

12.
Recognition on theETH-80ComplexityKernelMatch [Wallraven et al.]Pyramid matchRecognition accuracy (%)Testing time (s)Mean number of features per set (m)Mean number of features per set (m)Slide credit: Kristen Grauman

13.
Pyramid match kernel:examples ofextensions and applications by other groupsSpatial Pyramid Match KernelLazebnik, Schmid, Ponce, 2006.Dual-space Pyramid Matching Hu et al., 2007. Representing Shape with a Pyramid Kernel Bosch & Zisserman, 2007.Scene recognitionShape representationMedical image classificationSlide credit: Kristen GraumanL=1L=2L=0

14.
Pyramid match kernel:examples of extensions and applications by other groupswavesit downFrom Omnidirectional Images to Hierarchical Localization, Murillo et al. 2007.Single View Human Action Recognition using Key Pose Matching, Lv & Nevatia, 2007.Spatio-temporal Pyramid Matching for Sports Videos, Choi et al., 2008.Action recognitionVideo indexingRobot localizationSlide : Kristen Grauman

15.

16.
Learning how tocompare imagesExploit (dis)similarity constraints to construct more useful distance function

17.
Number of existingtechniques for metric learningdissimilarsimilar[Weinberger et al. 2004, Hertz et al. 2004, Frome et al. 2007, Varma & Ray 2007, Kumar et al. 2007]

18.
Problem-specific knowledgeExample sourcesof similarity constraintsFully labeled image databasesPartially labeled image databases

19.
Locality Sensitive Hashing(LSH)Gionis, A. & Indyk, P. & Motwani, R. (1999)Take random projections of data

20.
Quantize each projectionwith few bits1010Descriptor in high D space10110

21.
Fast Image Searchfor Learned MetricsJain, Kulis, & Grauman, CVPR 2008 Learn a Malhanobis metric for LSHh( ) = h( )h( ) ≠h( )Less likely to split pairs like those with similarity constraintMore likely to split pairs like those with dissimilarity constraint Slide : Kristen Grauman

22.
Results: Flickr datasetQuerytime: Error rate18 classes, 5400 images

23.
Categorize scene basedon nearest exemplars

24.
Base metric: Ling& Soatto’s Proximity Distribution Kernel (PDK)slower search faster search30% of data 2% of dataSlide : Kristen Grauman

25.
Results: Flickr datasetQuerytime: Error rate18 classes, 5400 images

26.
Categorize scene basedon nearest exemplars

27.
Base metric: Ling& Soatto’s Proximity Distribution Kernel (PDK)slower search faster search30% of data 2% of data Slide : Kristen Grauman

28.

29.
Semantic Hashing[Salakhutdinov &Hinton, 2007] for text documentsQuery ImageSemantic HashFunctionAddress SpaceBinary codeImages in databaseQuery addressSemantically similar imagesQuite differentto a (conventional)randomizing hash

30.
Exploring different choicesof semantic hash functionTorralba, Fergus, Weiss, CVPR 2008<10μsBinary codeImage 13. RBM2. BoostSSC1.LSHRetrieved images<1msSemantic HashQuery ImageGist descriptorCompute Gist~1ms (in Matlab)

31.
Learn mappingNeighborhood ComponentsAnalysis [Goldberger et al., 2004] Adjust model parameters to move:Points of SAME class closerPoints of DIFFERENT class awayPoints in code space

32.
LabelMe retrieval comparison32-bitlearned codes do as well as 512-dim real-valued input descriptorLearning methods outperform LSH% of 50 true neighbors in retrieval set0 2,000 10,000 20,0000Size of retrieval set

33.
Review: constructing agood metric from dataLearn the metric from training dataTwo approaches that do this:

34.
Jain, Kulis, &Grauman, CVPR 2008: Learn Malhanobis distance for LSH.

35.
Torralba, Fergus, Weiss,CVPR 2008: Directly learn mapping from image to binary code.

36.
Use Hamming distance(binary codes) for speed

37.
Learning metric reallyhelps over plain LSH

38.
Learning only appliedto metric, not representation

Iccv2009 recognition and learning object categories p2 c01 - recognizing a large number of object classes

More Related Content

What's hot

Viewers also liked

Similar to Iccv2009 recognition and learning object categories p2 c01 - recognizing a large number of object classes

More from zukun

Recently uploaded

Iccv2009 recognition and learning object categories p2 c01 - recognizing a large number of object classes

Editor's Notes