Li Fei-Fei, Princeton Rob Fergus, MIT Antonio Torralba, MIT CVPR 2007 Minneapolis, Short Course, June 17 Recognizing and Learning  Object Categories: Year 2007
Agenda Introduction Bag-of-words models Part-based models Discriminative methods Segmentation and recognition Datasets & Conclusions
 
perceptible vision material thing
Plato said… Ordinary objects are classified together if they `participate' in the same abstract Form, such as the Form of a Human or the Form of Quartz. Forms are proper subjects of philosophical investigation, for they have the highest degree of reality. Ordinary objects, such as humans, trees, and stones, have a lower degree of reality than the Forms. Fictions, shadows, and the like have a still lower degree of reality than ordinary objects and so are not proper subjects of philosophical enquiry.
Bruegel, 1564
How many object categories are there? ~10,000 to 30,000 Biederman 1987
So what does object recognition involve?
Verification: is that a lamp?
Detection: are there people?
Identification: is that Potala Palace?
Object categorization mountain building tree banner vendor people street lamp
Scene and context categorization outdoor city …
Computational photography
Assisted driving Lane detection Pedestrian and car detection Collision warning systems with adaptive cruise control,  Lane departure warning systems,  Rear object detection systems,   meters meters Ped Ped Car
Improving online search Query: STREET Organizing photo collections
Challenges 1: view point variation Michelangelo 1475-1564
Challenges 2: illumination slide credit: S. Ullman
Challenges 3: occlusion Magritte, 1957
Challenges 4: scale
Challenges 5: deformation Xu, Beihong 1943
Challenges 6: background clutter Klimt, 1913
History: single object recognition
History: single object recognition Lowe, et al. 1999, 2003 Mahamud and Herbert, 2000 Ferrari, Tuytelaars, and Van Gool, 2004 Rothganger, Lazebnik, and Ponce, 2004 Moreels and Perona, 2005 …
Challenges 7: intra-class variation
History: early object categorization
Turk and Pentland, 1991 Belhumeur, Hespanha, & Kriegman, 1997 Schneiderman & Kanade 2004 Viola and Jones, 2000 Amit and Geman, 1999 LeCun et al. 1998 Belongie and Malik, 2002 Schneiderman & Kanade, 2004 Argawal and Roth, 2002 Poggio et al. 1993
~10,000 to 30,000
Object categorization:  the statistical viewpoint vs. Bayes rule: posterior ratio likelihood ratio prior ratio
Object categorization:  the statistical viewpoint posterior ratio likelihood ratio prior ratio Discriminative methods model posterior Generative methods model likelihood and prior
Discriminative Direct modeling of  Zebra Non-zebra Decision boundary
Generative Model  and  Middle  Low High Middle Low
Three main issues Representation How to represent an object category Learning How to form the classifier, given training data Recognition How the classifier is to be used on novel data
Representation Generative / discriminative / hybrid
Representation Generative / discriminative / hybrid Appearance only or location and appearance
Representation Generative / discriminative / hybrid Appearance only or location and appearance Invariances View point Illumination Occlusion Scale Deformation Clutter etc.
Representation Generative / discriminative / hybrid Appearance only or location and appearance invariances Part-based or global w/sub-window
Representation Generative / discriminative / hybrid Appearance only or location and appearance invariances Parts or global w/sub-window Use set of features or each pixel in image
Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning Learning
Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) Methods of training: generative vs. discriminative Learning
Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Learning Contains a motorbike
Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Batch/incremental (on category and image level; user-feedback )  Learning
Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Batch/incremental (on category and image level; user-feedback )  Training images: Issue of overfitting Negative images for discriminative methods Priors Learning
Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Batch/incremental (on category and image level; user-feedback )  Training images: Issue of overfitting Negative images for discriminative methods Priors Learning
Scale / orientation range to search over  Speed Context Recognition
Hoiem, Efros, Herbert, 2006
OBJECTS ANIMALS INANIMATE PLANTS MAN-MADE NATURAL VERTEBRATE … .. MAMMALS BIRDS GROUSE BOAR TAPIR CAMERA

Cvpr2007 object category recognition p0 - introduction

  • 1.
    Li Fei-Fei, PrincetonRob Fergus, MIT Antonio Torralba, MIT CVPR 2007 Minneapolis, Short Course, June 17 Recognizing and Learning Object Categories: Year 2007
  • 2.
    Agenda Introduction Bag-of-wordsmodels Part-based models Discriminative methods Segmentation and recognition Datasets & Conclusions
  • 3.
  • 4.
  • 5.
    Plato said… Ordinaryobjects are classified together if they `participate' in the same abstract Form, such as the Form of a Human or the Form of Quartz. Forms are proper subjects of philosophical investigation, for they have the highest degree of reality. Ordinary objects, such as humans, trees, and stones, have a lower degree of reality than the Forms. Fictions, shadows, and the like have a still lower degree of reality than ordinary objects and so are not proper subjects of philosophical enquiry.
  • 6.
  • 7.
    How many objectcategories are there? ~10,000 to 30,000 Biederman 1987
  • 8.
    So what doesobject recognition involve?
  • 9.
  • 10.
  • 11.
  • 12.
    Object categorization mountainbuilding tree banner vendor people street lamp
  • 13.
    Scene and contextcategorization outdoor city …
  • 14.
  • 15.
    Assisted driving Lanedetection Pedestrian and car detection Collision warning systems with adaptive cruise control, Lane departure warning systems, Rear object detection systems, meters meters Ped Ped Car
  • 16.
    Improving online searchQuery: STREET Organizing photo collections
  • 17.
    Challenges 1: viewpoint variation Michelangelo 1475-1564
  • 18.
    Challenges 2: illuminationslide credit: S. Ullman
  • 19.
  • 20.
  • 21.
    Challenges 5: deformationXu, Beihong 1943
  • 22.
    Challenges 6: backgroundclutter Klimt, 1913
  • 23.
  • 24.
    History: single objectrecognition Lowe, et al. 1999, 2003 Mahamud and Herbert, 2000 Ferrari, Tuytelaars, and Van Gool, 2004 Rothganger, Lazebnik, and Ponce, 2004 Moreels and Perona, 2005 …
  • 25.
  • 26.
    History: early objectcategorization
  • 27.
    Turk and Pentland,1991 Belhumeur, Hespanha, & Kriegman, 1997 Schneiderman & Kanade 2004 Viola and Jones, 2000 Amit and Geman, 1999 LeCun et al. 1998 Belongie and Malik, 2002 Schneiderman & Kanade, 2004 Argawal and Roth, 2002 Poggio et al. 1993
  • 28.
  • 29.
    Object categorization: the statistical viewpoint vs. Bayes rule: posterior ratio likelihood ratio prior ratio
  • 30.
    Object categorization: the statistical viewpoint posterior ratio likelihood ratio prior ratio Discriminative methods model posterior Generative methods model likelihood and prior
  • 31.
    Discriminative Direct modelingof Zebra Non-zebra Decision boundary
  • 32.
    Generative Model and Middle  Low High Middle Low
  • 33.
    Three main issuesRepresentation How to represent an object category Learning How to form the classifier, given training data Recognition How the classifier is to be used on novel data
  • 34.
    Representation Generative /discriminative / hybrid
  • 35.
    Representation Generative /discriminative / hybrid Appearance only or location and appearance
  • 36.
    Representation Generative /discriminative / hybrid Appearance only or location and appearance Invariances View point Illumination Occlusion Scale Deformation Clutter etc.
  • 37.
    Representation Generative /discriminative / hybrid Appearance only or location and appearance invariances Part-based or global w/sub-window
  • 38.
    Representation Generative /discriminative / hybrid Appearance only or location and appearance invariances Parts or global w/sub-window Use set of features or each pixel in image
  • 39.
    Unclear how tomodel categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning Learning
  • 40.
    Unclear how tomodel categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) Methods of training: generative vs. discriminative Learning
  • 41.
    Unclear how tomodel categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Learning Contains a motorbike
  • 42.
    Unclear how tomodel categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Batch/incremental (on category and image level; user-feedback ) Learning
  • 43.
    Unclear how tomodel categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Batch/incremental (on category and image level; user-feedback ) Training images: Issue of overfitting Negative images for discriminative methods Priors Learning
  • 44.
    Unclear how tomodel categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.) Level of supervision Manual segmentation; bounding box; image labels; noisy labels Batch/incremental (on category and image level; user-feedback ) Training images: Issue of overfitting Negative images for discriminative methods Priors Learning
  • 45.
    Scale / orientationrange to search over Speed Context Recognition
  • 46.
  • 47.
    OBJECTS ANIMALS INANIMATEPLANTS MAN-MADE NATURAL VERTEBRATE … .. MAMMALS BIRDS GROUSE BOAR TAPIR CAMERA