Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Faster R-CNN:
Towards Real-Time Object Detection with
Region Proposal Networks
Shaoqing Ren, Kaiming He, Ross Girshick, Ji...
1. Introduction
Object Detection
3
Object Detection: Previously...
DPM. P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object Detection with Discri...
Object Detection: Previously...
R-CNN. Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014, June). Rich feature hier...
Object Detection: Limitations
Selective Search CPMC
MCG
Object Proposal computation is the bottleneck in
current state of ...
Faster R-CNN: Motivation
Selective Search CPMC
MCG
Replace the usage of external Object Proposals
with a Region Proposal N...
2. Methodology
Faster R-CNN: Overview
Conv
Layer 5
Conv
layers
RPN RPN Proposals
RPN Proposals
Class probabilities
RoI pooling layer
FC l...
Faster R-CNN: Overview
Conv
Layer 5
Conv
layers
RPN RPN Proposals
RPN Proposals
Class probabilities
RoI pooling layer
FC l...
Region Proposal Network (RPN)
11
Objectness scores
Bounding Box Regression
In practice, k = 9 (3 different scales and 3 as...
RPN: Loss Function
12
Predicted probability of being an object for anchor i
i = anchor index in minibatch
Coordinates of t...
RPN: Positive/Negative Samples
13
An anchor is labeled as positive if:
(a) the anchor is the one with highest IoU overlap ...
Faster R-CNN: Overview
Conv
Layer 5
Conv
layers
RPN RPN Proposals
RPN Proposals
Class probabilities
RoI pooling layer
FC l...
Object Detection Network
15
Fast R-CNN
Object Detection Network: Loss
16
*From Fast R-CNN
Predicted class scores
True class scores
True box coordinates
Predicted...
Fast R-CNN: Positive/Negative Samples
17
Positive samples are defined as those whose IoU overlap with a
ground-truth bound...
Faster R-CNN: Training
18
Conv
Layer 5
Conv
layers RPN RPN Proposals
RPN Proposals
Class probabilities
RoI pooling layer
F...
Faster R-CNN: 4-step training
Conv
Layer 5
Conv
layers
RPN RPN Proposals
19
Step 1: Train RPN initialized with an ImageNet...
Faster R-CNN: 4-step training
Conv
Layer 5
Conv
layers
RPN Proposals
(learned in 1)
Class probabilities
20
Step 2: Train F...
Faster R-CNN: 4-step training
Conv
Layer 5
Conv
layers
RPN RPN Proposals
21
Step 3: The model trained in 2 is used to init...
Faster R-CNN: 4-step training
Conv
Layer 5
Conv
layers
RPN Proposals
(learned in 3)
Class probabilities
22
Step 4: Fine tu...
3. Experiments
Experiments: CNN Architectures
24
VGG-16: Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks
for large...
Experiments: Datasets
25
Experiments I: VOC 2007 & ZF
26
Comparison between Fast R-CNN trained with external object proposals
(SS: Selective Search...
Experiments I: VOC 2007 & ZF
27
Experiments I: VOC 2007 & ZF
28
Experiments I: VOC 2007 & ZF
29
Experiments II
30
Detection Accuracy
Timing (ms)
Experiments III
31
Experiments IV
32
One-Stage Detection:
1) Directly Refine and Classify Sliding Window locations
Two-Stage Proposal + Detec...
Experiments V: MS COCO (arXiv)
33
Qualitative Results
34
4. Summary
Summary
36
● Region Proposal Network sharing convolutional features
with Object Detection Network makes region generation
...
Summary
37
● Faster R-CNN is the basis of the winners of COCO and
ILSVRC 2015 object detection competitions [1].
● RPN is ...
Thank you !
Questions?
Upcoming SlideShare
Loading in …5
×

Faster R-CNN: Towards real-time object detection with region proposal networks (UPC Reading Group)

18,720 views

Published on

Slides by Amaia Salvador at the UPC Computer Vision Reading Group.

Source document on GDocs with clickable links:
https://docs.google.com/presentation/d/1jDTyKTNfZBfMl8OHANZJaYxsXTqGCHMVeMeBe5o1EL0/edit?usp=sharing

Based on the original work:

Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards real-time object detection with region proposal networks." In Advances in Neural Information Processing Systems, pp. 91-99. 2015.

Published in: Technology
  • Dear Sir- Madame, Please let me know how I can be of help to you? Kind Regards, Fredrick. Fredrick Anold Director - Consultant idealtropical.com +1 813-713-0615 Hello, It is my pleasure to reach out to you with the following offer of appointment. Shenergy (Group) Company Limited, are currently seeking Reputable Company/Individual that can act as their Company Representative/Account Receivable Agent in Canada and in USA (Intermediary between Shenergy (Group) Co, Ltd and its clients in the Northern America region). If interested, kindly indicate your interest by responding directly to the Company's Deputy General Manager and Director details below: Xu Weiquan Deputy General Manager and Director Shenergy (Group) Co, Ltd E-mail: representative-department@shenergygroup.com.cn http://www.shenergy.com.cn/ Note: The job will only take few minutes of your time daily. You can Earn Extra income while doing your normal Job/Business. Management Shenergy (Group) CSS, Company Limited
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Faster R-CNN: Towards real-time object detection with region proposal networks (UPC Reading Group)

  1. 1. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun [paper@NIPS15][arXiv][python][matlab][slides by R. Girshick] Slides by Amaia Salvador [GDoc] Computer Vision Reading Group (01/03/2016)
  2. 2. 1. Introduction
  3. 3. Object Detection 3
  4. 4. Object Detection: Previously... DPM. P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan. Object Detection with Discriminatively Trained Part Based Models. In IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, Sep. 2010 DPM 4 Hand-crafted features + Sliding Window
  5. 5. Object Detection: Previously... R-CNN. Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014, June). Rich feature hierarchies for accurate object detection and semantic segmentation. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on(pp. 580-587). IEEE. SPPnet. He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep convolutional networks for visual recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 37(9), 1904-1916. Fast R-CNN. Girshick, R. (2015). Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1440-1448). R-CNN SPPnet Fast R-CNN 5 CNN features + Object Proposals
  6. 6. Object Detection: Limitations Selective Search CPMC MCG Object Proposal computation is the bottleneck in current state of the art object detection systems Selective Search. Van de Sande, K. E., Uijlings, J. R., Gevers, T., & Smeulders, A. W. (2011, November). Segmentation as selective search for object recognition. InComputer Vision (ICCV), 2011 IEEE International Conference on (pp. 1879-1886). IEEE. CPMC. Carreira, J., & Sminchisescu, C. (2010, June). Constrained parametric min-cuts for automatic object segmentation. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on (pp. 3241-3248). IEEE. MCG. Arbeláez, P., Pont-Tuset, J., Barron, J., Marques, F., & Malik, J. (2014). Multiscale combinatorial grouping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 328-335). 6
  7. 7. Faster R-CNN: Motivation Selective Search CPMC MCG Replace the usage of external Object Proposals with a Region Proposal Network (RPN). 7
  8. 8. 2. Methodology
  9. 9. Faster R-CNN: Overview Conv Layer 5 Conv layers RPN RPN Proposals RPN Proposals Class probabilities RoI pooling layer FC layers Class scores 9
  10. 10. Faster R-CNN: Overview Conv Layer 5 Conv layers RPN RPN Proposals RPN Proposals Class probabilities RoI pooling layer FC layers Class scores 10
  11. 11. Region Proposal Network (RPN) 11 Objectness scores Bounding Box Regression In practice, k = 9 (3 different scales and 3 aspect ratios)
  12. 12. RPN: Loss Function 12 Predicted probability of being an object for anchor i i = anchor index in minibatch Coordinates of the predicted bounding box for anchor i Ground truth objectness label True box coordinates Ncls = Number of anchors in minibatch (~ 256) Nreg = Number of anchor locations ( ~ 2400) Log loss Smooth L1 loss In practice = 10, so that both terms are roughly equally balanced
  13. 13. RPN: Positive/Negative Samples 13 An anchor is labeled as positive if: (a) the anchor is the one with highest IoU overlap with a ground-truth box (b) the anchor has an IoU overlap with a ground-truth box higher than 0.7 Negative labels are assigned to anchors with IoU lower than 0.3 for all ground-truth boxes. 50%/50% ratio of positive/negative anchors in a minibatch.
  14. 14. Faster R-CNN: Overview Conv Layer 5 Conv layers RPN RPN Proposals RPN Proposals Class probabilities RoI pooling layer FC layers Class scores 14
  15. 15. Object Detection Network 15 Fast R-CNN
  16. 16. Object Detection Network: Loss 16 *From Fast R-CNN Predicted class scores True class scores True box coordinates Predicted box coordinates Log loss Smooth L1 loss
  17. 17. Fast R-CNN: Positive/Negative Samples 17 Positive samples are defined as those whose IoU overlap with a ground-truth bounding box is > 0.5. Negative examples are sampled from those that have a maximum IoU overlap with ground truth in the interval [0.1, 0.5). 25%/75% ratio for positive/negative samples in a minibatch. *From Fast R-CNN
  18. 18. Faster R-CNN: Training 18 Conv Layer 5 Conv layers RPN RPN Proposals RPN Proposals Class probabilities RoI pooling layer FC layers Class scores 4-step training to share features for RPN and Fast R-CNN
  19. 19. Faster R-CNN: 4-step training Conv Layer 5 Conv layers RPN RPN Proposals 19 Step 1: Train RPN initialized with an ImageNet pre-trained model. ImageNet weights (fine tuned)
  20. 20. Faster R-CNN: 4-step training Conv Layer 5 Conv layers RPN Proposals (learned in 1) Class probabilities 20 Step 2: Train Fast R-CNN with learned RPN proposals. ImageNet weights (fine tuned)
  21. 21. Faster R-CNN: 4-step training Conv Layer 5 Conv layers RPN RPN Proposals 21 Step 3: The model trained in 2 is used to initialize RPN and train again. Weights from Step 2 (fixed)
  22. 22. Faster R-CNN: 4-step training Conv Layer 5 Conv layers RPN Proposals (learned in 3) Class probabilities 22 Step 4: Fine tune FC layers of Fast R-CNN using same shared convolutional layers as in 3. Weights from Step 2&3 (fixed)
  23. 23. 3. Experiments
  24. 24. Experiments: CNN Architectures 24 VGG-16: Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. ZF: Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer vision–ECCV 2014 (pp. 818-833). Springer International Publishing.
  25. 25. Experiments: Datasets 25
  26. 26. Experiments I: VOC 2007 & ZF 26 Comparison between Fast R-CNN trained with external object proposals (SS: Selective Search, EB: EdgeBoxes) with Faster R-CNN
  27. 27. Experiments I: VOC 2007 & ZF 27
  28. 28. Experiments I: VOC 2007 & ZF 28
  29. 29. Experiments I: VOC 2007 & ZF 29
  30. 30. Experiments II 30 Detection Accuracy Timing (ms)
  31. 31. Experiments III 31
  32. 32. Experiments IV 32 One-Stage Detection: 1) Directly Refine and Classify Sliding Window locations Two-Stage Proposal + Detection: 1) Learn Object Proposals 2) Refine and classify Object Proposals
  33. 33. Experiments V: MS COCO (arXiv) 33
  34. 34. Qualitative Results 34
  35. 35. 4. Summary
  36. 36. Summary 36 ● Region Proposal Network sharing convolutional features with Object Detection Network makes region generation step nearly cost-free. ● Quality of proposals is improved with RPN wrt SS and EB. ● Object Detection system at 5-17 fps.
  37. 37. Summary 37 ● Faster R-CNN is the basis of the winners of COCO and ILSVRC 2015 object detection competitions [1]. ● RPN is also used in the winning entries of ILSVRC 2015 localization [1] and COCO 2015 segmentation competitions [2]. [1] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” arXiv:1512.03385, 2015. [2] J. Dai, K. He, and J. Sun, “Instance-aware semantic segmentation via multi-task network cascades,” arXiv:1512.04412, 2015.
  38. 38. Thank you ! Questions?

×