1. The PASCAL Visual Object Classes (VOC) Dataset and Challenge
   Andrew Zisserman, Visual Geometry Group, University of Oxford
2. The PASCAL VOC Challenge
   • Challenge in visual object recognition funded by the PASCAL network of excellence
   • Publicly available dataset of annotated images
   • Main competitions in classification (is there an X in this image?), detection (where are the X's?), and segmentation (which pixels belong to X?)
   • The competition has run each year since 2006
   • Organized by Mark Everingham, Luc Van Gool, Chris Williams, John Winn and Andrew Zisserman
3. Dataset Content
   • 20 classes: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, TV/monitor
   • Real images downloaded from Flickr, not filtered for "quality"
   • Complex scenes with variation in scale, pose, lighting, occlusion, truncation, ...
4. Annotation
   • Complete annotation of all objects
   • Annotated in one session with written guidelines
   • Attributes recorded per object (shown on the example image):
     – Occluded: object is significantly occluded within the bounding box (BB)
     – Truncated: object extends beyond the BB
     – Difficult: not scored in evaluation
     – Pose: e.g. facing left
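These per-object labels are distributed as one XML file per image alongside the JPEGs. As a rough, unofficial illustration (the official devkit is MATLAB), the sketch below reads one annotation file with Python's standard library; the exact tag set varies slightly across years (e.g. an occluded flag is not present in early releases), and the function name and the path in the usage comment are just placeholders.

import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_path):
    """Read one PASCAL VOC annotation file and return a list of object records.

    Assumes the usual VOC XML layout: one <object> element per annotated
    instance, with <name>, <pose>, <truncated>, <difficult> (and, in later
    releases, <occluded>) flags plus a <bndbox>.
    """
    root = ET.parse(xml_path).getroot()
    objects = []
    for obj in root.findall("object"):
        bb = obj.find("bndbox")
        objects.append({
            "name": obj.findtext("name"),
            "pose": obj.findtext("pose", default="Unspecified"),
            "truncated": obj.findtext("truncated", default="0") == "1",
            "difficult": obj.findtext("difficult", default="0") == "1",
            "occluded": obj.findtext("occluded", default="0") == "1",
            "bbox": tuple(int(float(bb.findtext(tag)))
                          for tag in ("xmin", "ymin", "xmax", "ymax")),
        })
    return objects

# Hypothetical usage:
# for o in parse_voc_annotation("VOCdevkit/VOC2010/Annotations/2008_000002.xml"):
#     print(o["name"], o["bbox"], "difficult" if o["difficult"] else "")

Objects flagged as difficult are excluded from scoring, which is why that flag is worth carrying through any evaluation code.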
5. Examples
   Aeroplane, Bicycle, Bird, Boat, Bottle, Bus, Car, Cat, Chair, Cow
6. Examples
   Dining Table, Dog, Horse, Motorbike, Person, Potted Plant, Sheep, Sofa, Train, TV/Monitor
7. Dataset Statistics – VOC 2010

               Training    Testing
     Images    10,103      9,637
     Objects   23,374      22,992

   • Minimum of ~500 training objects per category (~1,700 cars, 1,500 dogs, 7,000 people)
   • Approximately equal distribution across training and test sets
   • Data augmented each year
8. Design choices: Things we got right …
   1. Standard method of assessment
   • Train/validation/test splits given
   • Standard evaluation protocol – AP per class
   • Software supplied
     – Includes baseline classifier/detector/segmenter
     – Runs from training to generating the PR curve and AP on validation or test data out of the box
   • Has meant that results on VOC can be consistently compared in publications
   • "Best Practice" webpage on how the training/validation/test data should be used for the challenge
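The AP measure itself is the area under an interpolated precision/recall curve computed from the ranked outputs for one class. The official scoring code is the MATLAB devkit; what follows is only a minimal Python sketch of the interpolated AP used from VOC2010 onward (earlier years used an 11-point approximation), with voc_ap as an illustrative name.

import numpy as np

def voc_ap(recall, precision):
    """Interpolated average precision from a PR curve (minimal sketch).

    recall, precision: arrays ordered by decreasing confidence. Precision is
    replaced by its running maximum taken from the right (the interpolated
    envelope) and AP is the area under that envelope.
    """
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Make precision monotonically non-increasing.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas where recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

For the detection task, a prediction counts as a true positive before this step only if its bounding-box overlap with an unmatched ground-truth box exceeds 0.5, the usual VOC criterion.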
9. Design choices: Things we got right …
   2. Evaluation of test data
   Three possibilities:
   1. Release test data and annotation (most liberal) and participants assess performance themselves
      – Cons: open to abuse
   2. Release test data, but withhold test annotation – participants submit results and organizers assess performance (use an evaluation server)
   3. No release of test data – participants submit software and the organizers run it and assess performance
10. Design choices: Things we got right …
    3. Augmentation of dataset each year
    • VOC2010 around 40% increase in size over VOC2009

                  Training           Testing
       Images     10,103 (7,054)     9,637 (6,650)
       Objects    23,374 (17,218)    22,992 (16,829)
       (VOC2009 counts shown in brackets)

    • Has prevented overfitting on the data
    • 2008/9 datasets retained as a subset of 2010
      – Assignments to training/test sets maintained
      – So progress can be measured from 2008 to 2010
11. Detection Challenge: Progress 2008-2010
    [Bar chart: maximum AP (%) per class for the best 2008, 2009 and 2010 methods, across all 20 classes]
    • Results on the 2008 data improve for the best 2009 and 2010 methods for all categories, by over 100% for some categories
      – Caveat: Better methods or more training data?
12. Design choices: Things we got right …
    4. Equivalence bands
    • Uses the Friedman/Nemenyi ANOVA significance test
    • Makes explicit that many methods are just slight perturbations of each other
    • More research needed in this area on obtaining equivalence classes for ranking (rather than classification)
    • Plan to include banner/header results on the evaluation server (to aid comparisons for reviewing etc.)
13. Statistical Significance
    • Friedman/Nemenyi analysis
      – Compare differences in mean rank of methods over classes using a non-parametric version of ANOVA
      – Mean rank must differ by at least 5.4 to be considered significant (p = 0.05)
    [Chart: mean rank over the classes for each submitted method, e.g. NUSPSL_KERNELREGFUSING, NEC_V1_HOGLBP_NONLIN_SVMDET, UVA_BW_NEWCOLOURSIFT, XRCE_IFV, …, down to chance]
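As a sketch of the analysis behind these bands (not the organizers' exact script), the Friedman test asks whether the methods' per-class AP rankings differ at all, and the Nemenyi critical difference then defines how far apart two mean ranks must be before the methods are treated as distinguishable. The 5.4 threshold quoted above depends on the number of methods and classes in that year's table; equivalence_analysis and q_alpha below are illustrative names, with q_alpha taken from a studentized-range table (e.g. Demšar 2006).

import numpy as np
from scipy.stats import friedmanchisquare, rankdata

def equivalence_analysis(ap_table, q_alpha):
    """Friedman test plus Nemenyi-style critical difference on an AP table.

    ap_table: (n_classes, n_methods) array of per-class AP scores.
    q_alpha:  Nemenyi critical value for n_methods at the chosen significance
              level (from a studentized-range table), supplied by the caller.
    """
    n_classes, n_methods = ap_table.shape
    # Friedman test: are the methods distinguishable at all?
    stat, p_value = friedmanchisquare(*[ap_table[:, m] for m in range(n_methods)])
    # Rank the methods within each class (rank 1 = highest AP; ties averaged).
    ranks = np.vstack([rankdata(-ap_table[c]) for c in range(n_classes)])
    mean_ranks = ranks.mean(axis=0)
    # Methods whose mean ranks differ by less than cd fall in one equivalence band.
    cd = q_alpha * np.sqrt(n_methods * (n_methods + 1) / (6.0 * n_classes))
    return stat, p_value, mean_ranks, cd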
14. The future … Action Classification Taster Challenge – 2010 on
    • Given the bounding box of a person, predict whether they are performing a given action (e.g. Playing Instrument? Reading?)
    • Developed with Ivan Laptev
    • Strong participation from the start (11 methods, 8 groups)
15. Ten Action Classes
    Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking, Jumping
    • 2011: added Jumping + an 'other' class
16. Person Layout Taster Challenge – 2009 on
    • Given the bounding box of a person, predict the positions of head, hands and feet
    • Aim: to encourage research on more detailed image interpretation
    [Example images with head, hands and feet annotated]
    • 2009: no participation
    • 2010: relaxed success criteria, 2 participants
    • Assessment method still not satisfactory?
17. Futures
    2012:
    • Last official year of PASCAL2
    • May introduce a new taster competition, e.g.
      – Materials
      – Actions in video
    • Open to suggestions
