Describing People: A Poselet-based approach to attribute classification
 

Usage Rights: CC Attribution-ShareAlike License

Presentation Transcript

  • Describing People: A Poselet-Based Approach to Attribute Classification. Lubomir Bourdev (1,2), Subhransu Maji (1), Jitendra Malik (1). (1) EECS, U.C. Berkeley; (2) Adobe Systems Inc.
  • Goal: Extract attributes from images of people
  • Who has long hair?
  • Who has short pants?
  • Male or female?
  • Prior work on poselets and on attributes
  • Prior work on Poselets
      • Introduced by [Bourdev and Malik, ICCV09]
      • Detection with poselets [Bourdev et al, ECCV10]
      • Applications
          • Segmentation [Brox et al, ECCV10] [Maire et al, ICCV11]
          • Actions [Yang et al, CVPR10] [Maji et al, CVPR11] [Yao et al, ICCV11]
          • Human parsing [Wang et al, CVPR11]
          • Semantic contours [Hariharan et al, ICCV11]
          • Subordinate level categorization [Farrell et al, ICCV11]
  • Prior work on Attributes
      • Attributes as intermediate parts
      • Image retrieval with attributes
      • Discovering attributes from text
      • Attributes and actions
      • Discovering attributes from images
      • Active learning with attributes
      • Attributes from motion capture
      • Attributes of people
      • Joint learning of classes & attributes
      • Gender attribute
      • Cited work: [Cottrell and Metcalfe, NIPS90] [Golomb et al, NIPS90] [Moghaddam & Yang, PAMI02] [Ferrari & Zisserman, NIPS07] [Kumar et al, ECCV08] [Gallagher and Chen, CVPR08] [Cao et al, ACM08] [Lampert et al, CVPR09] [Farhadi et al, CVPR09] [Wang et al, BMVC09] [Wang and Forsyth, ICCV09] [Kumar et al, ICCV09] [Farhadi et al, CVPR10] [Berg et al, ECCV10] [Wang and Mori, ECCV10] [Sigal et al, ECCV10] [Branson et al, ECCV10] [Hwang et al, CVPR11] [Parikh and Grauman, CVPR11] [Douze et al, CVPR11] [Kovashka et al, ICCV11] [Liu et al, CVPR11] [Qiu et al, ICCV11] [Yao et al, ICCV11] [Dhar et al, CVPR11] [Parikh and Grauman, ICCV11] [Siddiquie et al, CVPR11]
  • Poselets for Attribute Classification
  • Male or female?
  • Gender recognition is easier if we factor out the pose
  • Poselets [Bourdev & Malik ICCV09]
  • Poselets: Examples may differ visually but have common semantics
  • How do we train a poselet?
  • Finding correspondences at training time: Given part of a human pose, how do we find a similar pose configuration in the training set?
  • Finding correspondences at training time: We use keypoints to annotate the joints, eyes, nose, etc. of people (figure labels: Left Shoulder, Left Hip)
  • Finding correspondences at training time: Residual Error
  • Training poselet classifiers
      1. Given a seed patch
      2. Find the closest patch for every other person
      3. Sort them by residual error
      4. Threshold them
    Residual errors shown for the example patches: 0.15, 0.20, 0.10, 0.85, 0.15, 0.35
  • Training poselet classifiers
      1. Given a seed patch
      2. Find the closest patch for every other person
      3. Sort them by residual error
      4. Threshold them
      5. Use them as positive training examples to train a linear SVM with HOG features
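The selection of positive examples (steps 2 to 4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides do not say how the residual error is computed, so the sketch assumes keypoints are compared after removing translation and scale. The function names and the threshold value are hypothetical, and step 5 (HOG features and the linear SVM) is left out.

```python
import math

def normalize(keypoints):
    """Center keypoints on their mean and scale them to unit RMS radius."""
    n = len(keypoints)
    cx = sum(x for x, _ in keypoints) / n
    cy = sum(y for _, y in keypoints) / n
    centered = [(x - cx, y - cy) for x, y in keypoints]
    rms = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    if rms == 0:
        return centered
    return [(x / rms, y / rms) for x, y in centered]

def residual_error(seed, candidate):
    """Mean squared distance between corresponding normalized keypoints.

    Assumption: both lists hold the same keypoints in the same order.
    """
    a, b = normalize(seed), normalize(candidate)
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b)) / len(a)

def select_positives(seed, candidates, threshold):
    """Sort candidates by residual error and keep those below the threshold.

    Returns candidate indices, best match first.
    """
    scored = sorted((residual_error(seed, c), i) for i, c in enumerate(candidates))
    return [i for err, i in scored if err < threshold]

# Hypothetical keypoints (e.g. two shoulders and a hip) for a seed patch
# and three other people.
seed = [(0, 0), (2, 0), (1, 3)]
candidates = [
    [(10, 10), (14, 10), (12, 16)],  # same configuration, shifted and scaled
    [(0, 0), (2, 0), (1, -3)],       # vertically flipped configuration
    [(0, 0), (2, 0), (1.2, 3)],      # slightly different configuration
]
positives = select_positives(seed, candidates, threshold=0.05)
```

Here the flipped configuration has a large residual and is rejected, while the other two patches would be kept as positive training examples.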
  • Attribute Classification Algorithm at Test Time
  • Goal: Extract attributes of this person
  • Goal: Extract attributes of this person. Input: target person bounds; bounds of other nearby people
  • Step 1: Detect poselet activations [Bourdev et al, ECCV10]
  • Step 2: Cluster the activations [Bourdev et al, ECCV10]
  • Step 3: Predict person bounds [Bourdev et al, ECCV10]
  • Step 4: Identify the correct cluster (max-flow in bipartite graph)
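Step 4 can be illustrated with a small matching routine. With unit capacities, max-flow on a bipartite graph is equivalent to maximum bipartite matching, which the sketch below computes with augmenting paths. The edge criterion (intersection-over-union above a hypothetical `min_iou`) and all names are assumptions made for illustration; the slides do not specify how the graph is built.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def match_clusters(given_bounds, predicted_bounds, min_iou=0.3):
    """Return {given index: predicted index} for a maximum matching.

    An edge joins a given person box and a predicted cluster box when
    their overlap is at least min_iou (an assumed criterion).
    """
    edges = [[j for j, p in enumerate(predicted_bounds) if iou(g, p) >= min_iou]
             for g in given_bounds]
    owner = {}  # predicted index -> given index currently matched to it

    def augment(i, seen):
        # Try to match given box i, re-routing earlier matches if needed.
        for j in edges[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in owner or augment(owner[j], seen):
                owner[j] = i
                return True
        return False

    for i in range(len(given_bounds)):
        augment(i, set())
    return {i: j for j, i in owner.items()}

# Hypothetical boxes: index 0 of `given` is the target person.
given = [(0, 0, 10, 10), (5, 0, 15, 10)]
predicted = [(2, 0, 12, 10), (-2, 0, 8, 10)]
matching = match_clusters(given, predicted)
```

In this example the target person overlaps both predicted boxes, but the nearby person overlaps only the first one, so the matching assigns the second predicted cluster to the target.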
  • Start with its poselet activations (diagram: Poselet Activations)
  • Features
      • Pyramid HOG
      • LAB histogram
      • Skin features
          • Hands-skin
          • Legs-skin
    Figure panels: poselet patch, skin mask, arms mask, B .* C (diagram: Poselet Activations → Features)
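The “B .* C” label in the figure denotes an element-wise product of two masks. A minimal sketch of how such a product could give a hands-skin response is shown below. The nested-list mask representation, the function names, and the choice of the mean as a summary are assumptions, not details from the slides.

```python
def elementwise_product(mask_b, mask_c):
    """Multiply two equally sized 2D masks cell by cell (the '.*' operation)."""
    return [[b * c for b, c in zip(row_b, row_c)]
            for row_b, row_c in zip(mask_b, mask_c)]

def mean_response(mask):
    """Average value of a 2D mask, used here as a single summary feature."""
    values = [v for row in mask for v in row]
    return sum(values) / len(values)

# Hypothetical 2x2 masks for one poselet patch.
skin_mask = [[0.5, 0.25],
             [1.0, 0.0]]   # per-pixel skin likelihood
arms_mask = [[1, 0],
             [1, 1]]       # 1 where the poselet expects an arm
hands_skin = elementwise_product(skin_mask, arms_mask)
feature = mean_response(hands_skin)
```

Skin pixels outside the expected arm region are zeroed out, so only skin where an arm is expected contributes to the feature.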
  • Attribute Classification Overview: Poselet Activations → Features → Poselet-level Attribute Classifiers
  • Attribute Classification Overview: Poselet Activations → Features → Poselet-level Attribute Classifiers → Person-level Attribute Classifiers
  • Attribute Classification Overview: Poselet Activations → Features → Poselet-level Attribute Classifiers → Person-level Attribute Classifiers → Context-level Attribute Classifiers
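The three classifier levels can be read as a stacked pipeline in which each level consumes the scores of the level below. The sketch below only illustrates that data flow. The slides give no formulas, so the linear form of each level, the logistic squashing, and every name and weight are assumptions.

```python
import math

def sigmoid(z):
    """Map a real-valued score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def poselet_level_score(features, weights, bias):
    """Linear score for one attribute from one poselet activation's features."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def person_level_score(poselet_scores, poselet_weights, bias):
    """Combine per-poselet scores for one attribute of one person.

    poselet_scores maps poselet id -> score; poselets that did not fire
    are simply absent and contribute nothing.
    """
    total = bias
    for poselet_id, score in poselet_scores.items():
        total += poselet_weights.get(poselet_id, 0.0) * score
    return sigmoid(total)

def context_level_score(person_scores, context_weights, bias):
    """Re-score one attribute from the person-level scores of all attributes."""
    total = bias + sum(context_weights.get(name, 0.0) * s
                       for name, s in person_scores.items())
    return sigmoid(total)

# Hypothetical numbers for a "long hair" classifier.
s1 = poselet_level_score([1.0, 2.0], [0.5, 0.25], bias=0.0)    # poselet 7
s2 = poselet_level_score([0.0, 1.0], [0.5, -1.0], bias=0.0)    # poselet 12
long_hair = person_level_score({7: s1, 12: s2}, {7: 1.0, 12: 0.5}, bias=0.0)
final = context_level_score({"long_hair": long_hair, "female": 0.9},
                            {"long_hair": 2.0, "female": 1.0}, bias=-1.5)
```

The context level lets correlated attributes inform each other, for example a confident "female" score raising a borderline "long hair" score under these hypothetical weights.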
  • Results
  • Our dataset
      • Source: VOC 2010 trainval for Person + H3D
      • ~8000 annotations (4000 train + 4000 test)
      • 9 binary attributes specified by 5 independent annotators via AMT
      • Ground truth label: If 4 of the 5 agree
      • Dataset will be made publicly available
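The agreement rule for ground truth labels can be sketched as a small voting function. It assumes that "4 of the 5 agree" means at least four matching votes and that an attribute without such agreement is left unlabeled; the function name and the use of `None` for the unlabeled case are illustrative choices.

```python
def ground_truth_label(votes, min_agree=4):
    """Aggregate binary annotator votes into a ground truth label.

    votes: list of booleans, one per annotator.
    Returns True or False when at least min_agree annotators agree,
    otherwise None (no ground truth for this attribute).
    """
    positives = sum(1 for v in votes if v)
    negatives = len(votes) - positives
    if positives >= min_agree:
        return True
    if negatives >= min_agree:
        return False
    return None

# Hypothetical votes from 5 annotators for "wears glasses".
clear_yes = ground_truth_label([True, True, True, True, False])
clear_no = ground_truth_label([False, False, False, False, False])
ambiguous = ground_truth_label([True, True, True, False, False])
```

A 3-to-2 split yields no label, so ambiguous cases would not count for or against a classifier.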
  • Visual search on our test set: “Wears hat”, “Female”
  • “Has long hair”, “Wears glasses”
  • “Wears shorts”, “Has long sleeves”
  • “Doesn’t have long sleeves”
  • Our baseline
      • Canny-modulated HOG with SPM kernel [Lazebnik et al, CVPR06]
      • To help the baseline, we trained a separate SPM for each of four viewpoints: full view, head zoom, upper body, legs
      • For each attribute we pick the best SPM as our baseline
  • Precision/recall on our test set (plot legend: label frequency, SPM, no context, full model)
  • State-of-the-art Gender Recognition
      • We outperform Cognitec (top-notch face recognizer)
      • We outperform any gender recognizer based on frontal faces (are there others?)
          • 61% of our test examples have frontal faces.
          • Even with perfect classification of frontal faces, max AP=80.5% vs. our AP of 82.4%
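For readers unfamiliar with the AP figures quoted above, average precision summarizes a precision/recall curve in one number. The sketch below uses the plain non-interpolated definition (mean of the precision at each correctly retrieved positive). The slides do not state which AP variant was used, so this is an assumption, and the scores and labels are made up.

```python
def average_precision(scores, labels):
    """Non-interpolated average precision of a ranked list.

    scores: classifier confidences; labels: matching booleans
    (True for a positive example). Returns 0.0 if there are no positives.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)  # precision at this recall point
    return sum(precisions) / len(precisions) if precisions else 0.0

# Hypothetical ranking: positives at ranks 1 and 3.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
perfect = average_precision([0.9, 0.8, 0.1], [True, True, False])
```

In the first example the precisions at the two positives are 1/1 and 2/3, so AP is their mean, 5/6.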
  • Confusions
      • Men most confused as women (figure label: long hair)
      • Women most confused as men (figure labels: baseball hat, hair hidden)
  • Confusions
      • Non-T-shirt most confused to be T-shirt (figure label: annotation errors)
      • Short pants most confused to be long pants (figure labels: Are these pants short?, wrong person, occlusion)
  • Best poselets per attribute: Gender, Long Hair, Wears glasses
  • We can describe a picture of a person: “A woman with long hair, glasses and long pants” (??)
  • Conclusion
  • How poselets help in high-level vision
      • The image is a complex function of the viewpoint, pose, appearance, etc.
      • Poselets decouple pose and camera view from appearance
  • Google “poselets” to get:
      • The set of published poselet papers
      • H3D data set + Matlab tools
      • Java3D annotation tool + video tutorial
      • Matlab code to detect people using poselets
      • Our latest trained poselets
  • Poselets website: http://eecs.berkeley.edu/~lbourdev/poselets
      • The set of published poselet papers
      • H3D data set + Matlab tools
      • Java3D annotation tool + video tutorial
      • Matlab code to detect people using poselets
      • Our latest trained poselets
    [Overlaid figure: generated descriptions of people, including “A computer vision professor who likes machine learning”, with one example marked “Failure mode”; the remaining captions are garbled in the transcript.]