PFDet: 2nd Place Solutions to Open Images Competition

Presented at Kaggle Tokyo Meetup #5. Our solution to Open Images - Object Detection Track. See also: https://connpass.com/event/105298/

Published in: Data & Analytics

  1. Takuya Akiba*, Tommi Kerola*, Yusuke Niitani*, Toru Ogawa*, Shotaro Sano*, and Shuji Suzuki* (*: equal contribution). PFDet: 2nd Place Solution to Open Images Competition
  2. iwiwi, Kaggle Tokyo Meetup #2
  3. Shotaro Sano (@g_votte)
  4. Task: Object Detection. Dataset: 500 classes, 1.7M images, 12M bounding boxes. Period: July 3 to August 30, 2018.
  5. Public LB, Private LB
  6. Object Detection: Bounding Boxes
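  7. Evaluation metric for bounding boxes: mean Average Precision, the mean of per-class Average Precision. A prediction counts as a TP when its IoU with a ground-truth box is >= 0.5. Predictions sorted by confidence (Confidence, TP/FP, precision@k): 0.9, TP, 100%; 0.8, TP, 100%; 0.75, FP, 66%; 0.7, TP, 75%; 0.67, FP, 60%; … Average Precision is computed from these precision values.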
  8. Object Detection: 2007-2012, 2014-, 2018
  9. Object Detection. MS COCO: 80 classes, 0.12M images. Open Images: 500 classes, 1.7M images, an increase of more than 14x in the number of images.
  10. Faster RCNN
  11. Bounding Box & Bounding Box
  12. &
  13. Bounding Box (NMS/NMW)
  14. PFDet
  15. Architecture: Faster RCNN + Feature Pyramid Networks + SE-ResNeXt + Pyramid Scene Parsing + Context Head + Non-maximum Weighted (NMW) suppression. Training Method: SGD + Sigmoid Loss + Cosine Annealing + Co-occurrence Loss
  16. Faster RCNN, SE-ResNeXt, Feature Pyramid Networks, Expert Models, Sigmoid Loss, Cosine Annealing, Co-occurrence Loss
  17. SE-ResNeXt, Feature Pyramid Networks, Expert Models
  18. SE-ResNeXt
  19. SE-ResNeXt: ResNeXt + SENet
  20. Feature Pyramid Networks
  21. Feature Pyramid Networks • ResNet (& upsampling) feature map • Region Proposal Network
  22. Expert Models. Pressure Cooker: 17 images. Person: 800k images. 238 classes appear in <1000 images. Image from https://storage.googleapis.com/openimages/web/factsfigures.html
  23. Expert Models
  24. Suppression: BB & class (x6), Bounding Box
  25. Model 1, Model 2, … → Concat → Suppression
  26. Sigmoid Loss
  27. Sigmoid Loss • E.g., Bounding Box: Football, Ball
  28. SoftmaxLoss: FC → Softmax over (volleyball score, football score, ball score) → Cross Entropy: football or ball. SigmoidLoss: FC → one Sigmoid per class (volleyball score, football score, ball score) → one Cross Entropy per class: football and ball.
  29. Cosine Annealing, Co-occurrence Loss
  30. Cosine Annealing
  31. Co-occurrence Loss
  32. Co-occurrence Loss
  33. Bounding Box
  34. Co-occurrence Loss. Ignore: Face, Arm. Negative: Car, ...
  35. +22.7 AP improvement on “Human Parts”; +9.2 AP improvement on average over 47 part classes
  36. Faster RCNN, SE-ResNeXt, Feature Pyramid Networks, Expert Models, Sigmoid Loss, Cosine Annealing, Co-occurrence Loss
  37. Single Best Model
  38. Ensemble
  39. MS COCO: 80 classes, 0.12M images. Open Images: 500 classes, 1.7M images, an increase of more than 14x in the number of images.
  40. HW: MN-1, V100 (32GB) x512, InfiniBand • Best single model: 33 • 83% (vs. 8 GPUs). SW: ChainerMN
  41. Multi-node Batch Normalization • BN, ghost BN – mean, var • segmentation, detection: multi-node BN – BN mean, var • multi-node BN
  42. • Faster R-CNN + FPN + etc. • Backbone: SE-ResNeXt (vs. NASNet, etc.) • LR Schedule: Cosine (vs. step, poly, etc.) • Suppression: NMW (vs. NMS, Soft NMS, etc.) • PSP, Context head • Co-occurrence loss • Multi-node BN, Linear LR scaling, warmup • Expert models
  43. References. (1) Tech report: Takuya Akiba, Tommi Kerola, Yusuke Niitani, Toru Ogawa, Shotaro Sano, Shuji Suzuki. PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track. https://arxiv.org/abs/1809.00778 (2) Co-occurrence loss tech report: Yusuke Niitani, Takuya Akiba, Tommi Kerola, Toru Ogawa, Shotaro Sano, Shuji Suzuki. Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects. https://arxiv.org/abs/1811.10862
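
Slide 7 lists predictions sorted by confidence together with their TP/FP label and precision@k. As a minimal sketch, that column can be reproduced in Python as shown here. The function names are hypothetical, and the assumption that the class has exactly three ground-truth boxes is made up for illustration; the official challenge evaluation has additional rules that this sketch does not cover.

```python
def precision_at_k(is_tp):
    """Return precision@k for k = 1..len(is_tp).

    `is_tp` holds one TP/FP flag per prediction, sorted by descending confidence.
    """
    precisions = []
    tp_count = 0
    for k, flag in enumerate(is_tp, start=1):
        tp_count += int(flag)
        precisions.append(tp_count / k)
    return precisions


def average_precision(is_tp, num_ground_truth):
    """Average the precision values taken at the rank of each true positive.

    This is one common (non-interpolated) definition of Average Precision.
    """
    precisions = precision_at_k(is_tp)
    hit_precisions = [p for p, flag in zip(precisions, is_tp) if flag]
    return sum(hit_precisions) / num_ground_truth


# (confidence, is true positive) rows taken from slide 7.
rows = [(0.9, True), (0.8, True), (0.75, False), (0.7, True), (0.67, False)]
flags = [is_tp for _, is_tp in rows]

precisions = precision_at_k(flags)
# Assumption for illustration only: the class has 3 ground-truth boxes in total.
ap = average_precision(flags, num_ground_truth=3)

for (confidence, is_tp), p in zip(rows, precisions):
    print(f"{confidence:.2f}  {'TP' if is_tp else 'FP'}  {p:.0%}")
print(f"AP = {ap:.3f}")
```

Mean Average Precision is then the mean of this per-class value over all classes.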
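
Slide 42 lists a cosine LR schedule together with linear LR scaling and warmup. One common way to combine the three is sketched below. The function name, the linear shape of the warmup, and every numeric value are assumptions for illustration and are not the settings used by PFDet.

```python
import math


def learning_rate(step, total_steps, warmup_steps, base_lr, batch_size, base_batch_size):
    """Learning rate at `step` with linear scaling, linear warmup, and cosine annealing."""
    # Linear LR scaling: the peak learning rate grows in proportion to the batch size.
    peak_lr = base_lr * batch_size / base_batch_size
    if step < warmup_steps:
        # Warmup: ramp linearly up to the peak learning rate.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine annealing: decay from the peak to zero along half a cosine period.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical values: 100 steps, 10 warmup steps, batch size scaled from 8 to 512.
schedule = [
    learning_rate(step, total_steps=100, warmup_steps=10,
                  base_lr=0.01, batch_size=512, base_batch_size=8)
    for step in range(101)
]
print(schedule[0], max(schedule), schedule[-1])
```

The schedule rises during warmup, reaches its peak when warmup ends, and then decays smoothly to zero at the final step.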

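
Slide 28 contrasts SoftmaxLoss (football or ball) with SigmoidLoss (football and ball). The sketch below, using made-up scores and hypothetical function names, illustrates the difference for a box that is both a football and a ball: softmax forces the two correct classes to compete, so its loss cannot fall below about log 2 when they tie, while per-class sigmoid losses can all approach zero at once.

```python
import math


def softmax_cross_entropy(scores, target_index):
    """Cross entropy after a softmax over all classes (exactly one target class)."""
    max_score = max(scores)
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_sum_exp - scores[target_index]


def sigmoid_cross_entropy(scores, targets):
    """Sum of independent binary cross entropies, one sigmoid per class."""
    total = 0.0
    for score, target in zip(scores, targets):
        # Numerically stable form of -[t*log(sigmoid(s)) + (1-t)*log(1-sigmoid(s))].
        total += max(score, 0.0) - score * target + math.log1p(math.exp(-abs(score)))
    return total


# Made-up scores in the order (volleyball, football, ball) for a box showing a football.
scores = [-4.0, 4.0, 4.0]

# Softmax can only name one class as the target, here football.
softmax_loss = softmax_cross_entropy(scores, target_index=1)
# Sigmoid can mark both football and ball as positive.
sigmoid_loss = sigmoid_cross_entropy(scores, targets=[0, 1, 1])

print(f"softmax loss: {softmax_loss:.4f}")
print(f"sigmoid loss: {sigmoid_loss:.4f}")
```

With these scores the model is confident about both correct labels, yet only the sigmoid formulation rewards it with a loss close to zero.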