Faster R-CNN: Towards
Real-Time Object Detection
Microsoft Research, NIPS 2015
Presenter: Andy Tsai
Key Idea: Region Proposal Net (RPN) layer
[Diagram: conv layers, conv5, predict BBoxes, classify objects]
Testing Time: Faster R-CNN
[Diagram: conv layers, conv5, predict BBoxes, classify objects]
Cost: 1*ConvTime + 1*SmallNet + 1000*FcTime
Testing Time: R-CNN
[Diagram: object proposals, conv layers, conv5]
Cost: 1000*ConvTime + 1000*FcTime
Testing Time: Fast R-CNN
[Diagram: conv layers, conv5, object proposals, ROI pooling]
Cost: 1*ConvTime + 1000*FcTime
Fast, Accurate Object Detection
- Fastest region proposal method: Edge Boxes [4 fps, 1000 proposals]
- Testing stage:
  Model                      Time
  Edge Boxes + R-CNN         0.25 sec + 1000*ConvTime + 1000*FcTime
  Edge Boxes + Fast R-CNN    0.25 sec + 1*ConvTime + 1000*FcTime
  Faster R-CNN               1*ConvTime + 1000*FcTime
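The cost model in the table above can be sketched as a small function. This is only an illustration: the function name and the example values for ConvTime and FcTime are placeholders, not measurements from the talk; the 0.25 sec proposal time and the 1000 proposals come from the Edge Boxes figures above.

```python
def estimated_test_time(model, conv_time, fc_time, n_proposals=1000,
                        proposal_time=0.25, small_net_time=0.0):
    """Per-image test time in seconds under the slide's cost model.

    conv_time / fc_time are hypothetical per-pass costs supplied by the caller.
    """
    if model == "rcnn":
        # External proposals, then conv + fc once per proposal.
        return proposal_time + n_proposals * conv_time + n_proposals * fc_time
    if model == "fast_rcnn":
        # External proposals, conv once per image, fc once per proposal.
        return proposal_time + conv_time + n_proposals * fc_time
    if model == "faster_rcnn":
        # Proposals come from a small net on top of the shared conv features.
        return conv_time + small_net_time + n_proposals * fc_time
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    for name in ("rcnn", "fast_rcnn", "faster_rcnn"):
        # 0.1 s and 0.001 s are made-up values for illustration only.
        print(name, estimated_test_time(name, conv_time=0.1, fc_time=0.001))
```

With any positive ConvTime, the R-CNN term 1000*ConvTime dominates, which is the point the three "Testing Time" slides make.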
RPN layer
RPN layer
RPN layer
Anchor regression
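A minimal sketch of anchors and anchor regression, assuming k = 9 anchors per sliding position (3 scales x 3 aspect ratios) and a conv5 stride of 16 pixels. These defaults and all function names are assumptions for illustration; the slides themselves only name the idea. Boxes are (x1, y1, x2, y2), and the regression targets are offsets of a ground-truth box relative to its anchor.

```python
import numpy as np


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (feat_h * feat_w * k, 4) anchors in image coordinates."""
    base = []
    for s in scales:
        for r in ratios:                      # r = height / width, area = s**2
            w, h = s / np.sqrt(r), s * np.sqrt(r)
            base.append([-w / 2, -h / 2, w / 2, h / 2])
    base = np.array(base)                     # (k, 4), centred on the origin
    xs = (np.arange(feat_w) + 0.5) * stride   # window centres in image pixels
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)              # each (feat_h, feat_w)
    centres = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (centres + base[None]).reshape(-1, 4)


def _to_cxcywh(boxes):
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    return boxes[:, 0] + 0.5 * w, boxes[:, 1] + 0.5 * h, w, h


def encode(anchors, gt):
    """Regression targets (tx, ty, tw, th) of gt boxes relative to anchors."""
    xa, ya, wa, ha = _to_cxcywh(anchors)
    x, y, w, h = _to_cxcywh(gt)
    return np.stack([(x - xa) / wa, (y - ya) / ha,
                     np.log(w / wa), np.log(h / ha)], axis=1)


def decode(anchors, t):
    """Inverse of encode: apply predicted offsets to anchors."""
    xa, ya, wa, ha = _to_cxcywh(anchors)
    x, y = t[:, 0] * wa + xa, t[:, 1] * ha + ya
    w, h = np.exp(t[:, 2]) * wa, np.exp(t[:, 3]) * ha
    return np.stack([x - w / 2, y - h / 2, x + w / 2, y + h / 2], axis=1)
```

The number of anchors is k times the number of sliding positions on the conv5 feature map, i.e. feat_h * feat_w * k.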
RPN: Loss Function
- 2-class softmax cross-entropy loss
- Discriminative training (anchor label p_i*):
  - p_i* = 1 if IoU with a ground-truth box > 0.7
  - p_i* = 0 if IoU < 0.3
  - otherwise, the anchor does not contribute to the loss
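The labelling rule above can be sketched as follows, assuming boxes in (x1, y1, x2, y2) format and using -1 to mark anchors that are ignored. The function names and the -1 convention are illustrative choices, and the sketch implements only the two thresholds listed on the slide.

```python
import numpy as np


def iou_matrix(anchors, gt_boxes):
    """Pairwise IoU between (N, 4) anchors and (M, 4) ground-truth boxes."""
    x1 = np.maximum(anchors[:, None, 0], gt_boxes[None, :, 0])
    y1 = np.maximum(anchors[:, None, 1], gt_boxes[None, :, 1])
    x2 = np.minimum(anchors[:, None, 2], gt_boxes[None, :, 2])
    y2 = np.minimum(anchors[:, None, 3], gt_boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    return inter / (area_a[:, None] + area_g[None, :] - inter)


def assign_labels(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """1 = object, 0 = background, -1 = ignored (no contribution to the loss)."""
    max_iou = iou_matrix(anchors, gt_boxes).max(axis=1)
    labels = np.full(len(anchors), -1, dtype=np.int64)
    labels[max_iou < neg_thresh] = 0
    labels[max_iou > pos_thresh] = 1
    return labels


if __name__ == "__main__":
    anchors = np.array([[0, 0, 10, 10], [0, 0, 10, 5], [20, 20, 30, 30]], dtype=float)
    gt = np.array([[0, 0, 10, 10]], dtype=float)
    print(assign_labels(anchors, gt))  # IoUs are 1.0, 0.5 and 0.0
```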
RPN: Loss Function
Only positive samples contribute to the regression loss
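A rough NumPy sketch of a two-term RPN loss: softmax cross entropy over the non-ignored anchors plus a regression term that is masked so only positive anchors count. The smooth L1 penalty, the weight `lam`, and the normalisation by the number of positives are simplifying assumptions for illustration, not details stated on the slide. Labels use 1 = positive, 0 = negative, -1 = ignored.

```python
import numpy as np


def smooth_l1(x):
    """Quadratic near zero, linear elsewhere."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x ** 2, ax - 0.5)


def rpn_loss(cls_logits, labels, reg_pred, reg_target, lam=1.0):
    """cls_logits: (N, 2); labels: (N,) int in {1, 0, -1}; reg_*: (N, 4).

    Assumes at least one non-ignored anchor.
    """
    valid = labels >= 0          # ignored anchors contribute to neither term
    pos = labels == 1            # only positives contribute to regression

    # 2-class softmax cross entropy, computed in a numerically stable way.
    z = cls_logits[valid]
    z = z - z.max(axis=1, keepdims=True)
    log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    cls_loss = -log_prob[np.arange(len(z)), labels[valid]].mean()

    # Regression term, masked by the positive label.
    n_pos = max(int(pos.sum()), 1)
    reg_loss = smooth_l1(reg_pred[pos] - reg_target[pos]).sum() / n_pos

    return cls_loss + lam * reg_loss
```

Changing `reg_target` for a negative or ignored anchor leaves the returned value unchanged, which is the masking behaviour the slide describes.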
How to train Faster R-CNN?
- Balanced sampling (neg. vs. pos. = 1:1)
- Joint training is almost impossible (the shared conv layers are updated alternately)
- 4-step training!
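Balanced sampling could look like the sketch below, assuming a mini-batch of 256 anchors per image and padding with negatives when an image has fewer positives than half the batch. The batch size, the padding rule, and the function name are assumptions for illustration. Labels use 1 = positive, 0 = negative, -1 = ignored.

```python
import numpy as np


def sample_minibatch(labels, batch_size=256, rng=None):
    """Keep at most batch_size anchors at up to 1:1 pos:neg; ignore the rest."""
    rng = np.random.default_rng() if rng is None else rng
    labels = labels.copy()
    pos_idx = rng.permutation(np.flatnonzero(labels == 1))
    neg_idx = rng.permutation(np.flatnonzero(labels == 0))

    n_pos = min(len(pos_idx), batch_size // 2)
    # If positives are scarce, fill the remaining slots with negatives.
    n_neg = min(len(neg_idx), batch_size - n_pos)

    # Anchors beyond the quota are marked as ignored for this iteration.
    labels[pos_idx[n_pos:]] = -1
    labels[neg_idx[n_neg:]] = -1
    return labels
```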
How to train Faster R-CNN?
1. Train the RPN, initialized from an ImageNet pre-trained model
2. Train Fast R-CNN using the proposals generated in step 1 [no parameter sharing yet]
3. Initialize from the conv layers trained in step 2, fix the shared conv layers, and update only the RPN
4. Keep the shared conv layers fixed and fine-tune the fc layers
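The four steps can be written down as a small schedule that records, for each stage, where the conv layers are initialized from and which parts are updated. This is only a descriptive sketch: the `Stage` class and field names are hypothetical, and two entries are assumptions not stated on the slide (that step 2 also starts from ImageNet weights, and that step 4 uses proposals from the step 3 RPN).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    init_conv_from: str              # source of the conv weights
    trainable: Tuple[str, ...]       # parts updated in this stage
    proposals_from: Optional[str] = None


STAGES = [
    Stage("1: train RPN", init_conv_from="imagenet",
          trainable=("conv", "rpn")),
    # Assumption: the detector is also initialized from ImageNet weights.
    Stage("2: train Fast R-CNN", init_conv_from="imagenet",
          trainable=("conv", "fc"), proposals_from="stage 1 RPN"),
    Stage("3: re-train RPN", init_conv_from="stage 2",
          trainable=("rpn",)),
    # Assumption: proposals come from the RPN updated in stage 3.
    Stage("4: fine-tune Fast R-CNN", init_conv_from="stage 2",
          trainable=("fc",), proposals_from="stage 3 RPN"),
]


def conv_is_shared(stage):
    """The conv layers are shared (and frozen) once they stop being updated."""
    return "conv" not in stage.trainable


if __name__ == "__main__":
    for s in STAGES:
        print(s.name, "| conv frozen:", conv_is_shared(s))
```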
Result - mAP
- VOC 2007, ZF ConvNet
Result - mAP
Using VGG ConvNet, Fast R-CNN vs. Faster R-CNN
Result - Time
VOC 2007, Fast R-CNN vs. Faster R-CNN
70.0%
73.2%
59.9%
Official Leaderboard Score
Code available on GitHub
MATLAB (faster-rcnn) and Python (py-faster-rcnn) versions
Thank you :)

Editor's Notes

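  • #9 Why not use an fc layer? Because there are many such sliding windows, and each one has to use the same set of weights.
  • #10 Why not use an fc layer? Because there are many such sliding windows, and each one has to use the same set of weights. How many proposals are produced? k times the number of positions the window can slide to on the conv5 feature map; typically about 6k proposals.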