RWTH AACHEN
Media Informatics
Hojun Lim
1
Fast R-CNN
Paper review session 1
■ Comparison: R-CNN vs Fast R-CNN
■ Image Pyramid
■ Scale Invariance (Multi-scale)
■ Truncated SVD for replacing weights of FC layers
■ Performance Metric: Pascal VOC 2012 vs COCO
Outline
2
Comparison: R-CNN vs Fast R-CNN
3
■ R-CNN
□ Architecture
□ Classification
□ Regression (localization)
-> BBox encoding: reduces the answer space.
It can be reduced further by the variance trick (see the sketch below)
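A minimal NumPy sketch of this encoding, assuming the usual center/size parameterization with log-scaled width and height; the per-coordinate std values used for the 'variance trick' are illustrative assumptions, not values from the paper.

    import numpy as np

    def encode_bbox(proposal, gt, stds=(0.1, 0.1, 0.2, 0.2)):
        # proposal, gt: (center_x, center_y, width, height)
        px, py, pw, ph = proposal
        gx, gy, gw, gh = gt
        t = np.array([(gx - px) / pw,      # centre shift, in units of proposal width
                      (gy - py) / ph,      # centre shift, in units of proposal height
                      np.log(gw / pw),     # log scale change in width
                      np.log(gh / ph)])    # log scale change in height
        return t / np.array(stds)          # 'variance trick': whiten the regression targets

    # example: a proposal that is slightly off-centre and too small
    print(encode_bbox(proposal=(50, 50, 20, 20), gt=(55, 52, 30, 24)))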
Comparison: R-CNN vs Fast R-CNN
4
■ R-CNN
□ Drawbacks
□ Multi-stage training pipeline
(1) Fine-tune ConvNet on object proposals with log loss
(2) Fit SVMs to ConvNet features (replacing the softmax classifier)
(3) Learn bounding-box regressors
□ Training is expensive
□ A ConvNet forward pass for each warped region proposal (no shared computation)
□ Object detection is slow
Comparison: R-CNN vs Fast R-CNN
5
■ Fast R-CNN
□ Architecture
□ Single-stage training pipeline, combining:
(1) Log loss (classification)
(2) Smooth L1 (= Huber loss with delta = 1) for box regression
□ Multi-task loss for each RoI (see the sketch below):
L(p, u, t^u, v) = L_cls(p, u) + λ · [u ≥ 1] · L_loc(t^u, v)
The indicator [u ≥ 1] switches off the box loss for background RoIs (u = 0)
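A minimal NumPy sketch of the loss above for a single RoI; the function names and toy numbers are illustrative, not from the released implementation.

    import numpy as np

    def smooth_l1(x):
        # Smooth L1 (Huber with delta = 1): quadratic near zero, linear in the tails
        x = np.abs(x)
        return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)

    def multi_task_loss(p, u, t_u, v, lam=1.0):
        # p: softmax class probabilities (K+1,), u: true class (0 = background)
        # t_u: predicted box offsets for class u (4,), v: regression targets (4,)
        l_cls = -np.log(p[u])                  # log loss on the true class
        indicator = 1.0 if u >= 1 else 0.0     # [u >= 1]: background RoIs get no box loss
        l_loc = np.sum(smooth_l1(t_u - v))
        return l_cls + lam * indicator * l_loc

    # toy example: a foreground RoI (u = 3) with a slightly wrong box
    p = np.array([0.05, 0.05, 0.1, 0.7, 0.1])
    print(multi_task_loss(p, u=3,
                          t_u=np.array([0.1, -0.2, 0.05, 0.0]),
                          v=np.zeros(4)))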
Comparison: R-CNN vs Fast R-CNN
6
■ Fast R-CNN
□ Improvements
□ Feed the whole image through the ConvNet once (computation shared across RoIs)
□ RoI Pooling (no warping)
Backprop of RoI pooling (figure)
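A rough single-channel, forward-only sketch of RoI max pooling, assuming the 7x7 output grid used with the VGG-16 head; bin-edge rounding is simplified relative to the real implementation. In backprop, each output cell routes its gradient to the argmax location inside its bin, so one feature-map cell can receive gradients from several overlapping RoIs.

    import numpy as np

    def roi_max_pool(feature_map, roi, out_size=(7, 7)):
        # feature_map: 2-D conv features; roi: (x1, y1, x2, y2) in feature-map coords
        x1, y1, x2, y2 = roi
        oh, ow = out_size
        ys = np.linspace(y1, y2 + 1, oh + 1).astype(int)   # bin edges along y
        xs = np.linspace(x1, x2 + 1, ow + 1).astype(int)   # bin edges along x
        out = np.empty(out_size)
        for i in range(oh):
            for j in range(ow):
                ybin = slice(ys[i], max(ys[i + 1], ys[i] + 1))   # never let a bin be empty
                xbin = slice(xs[j], max(xs[j + 1], xs[j] + 1))
                out[i, j] = feature_map[ybin, xbin].max()
        return out

    fmap = np.random.rand(32, 32)
    print(roi_max_pool(fmap, roi=(4, 6, 20, 25)).shape)   # -> (7, 7)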
Comparison: R-CNN vs Fast R-CNN
7
■ Fast R-CNN
□ Limitations
□ The complete architecture depends on an external
region-proposal algorithm
□ A fixed number of RoIs (N = 64) has to be sampled
from each image during training
(see the sampling sketch after this slide's bullets)
□ Hard negative mining via IoU thresholds:
25% positives: IoU in [0.5, 1]
75% negatives: IoU in [0.1, 0.5)
□ Weakly addressed multi-scale invariance
□ Brute force (single fixed image resolution)
□ Image Pyramid: expensive
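A sketch of the fixed per-image RoI sampling mentioned above (N = 64, 25% positive / 75% negative by IoU); this is an illustrative simplification, not the paper's exact sampling code.

    import numpy as np

    def sample_rois(ious, batch_size=64, fg_fraction=0.25, rng=np.random):
        # ious: max IoU of each proposal with any ground-truth box
        fg_inds = np.where(ious >= 0.5)[0]                   # positives: IoU in [0.5, 1]
        bg_inds = np.where((ious >= 0.1) & (ious < 0.5))[0]  # negatives: IoU in [0.1, 0.5)
        n_fg = min(int(batch_size * fg_fraction), len(fg_inds))
        n_bg = min(batch_size - n_fg, len(bg_inds))
        keep = np.concatenate([rng.choice(fg_inds, n_fg, replace=False),
                               rng.choice(bg_inds, n_bg, replace=False)])
        labels = np.concatenate([np.ones(n_fg), np.zeros(n_bg)])   # 1 = fg, 0 = bg
        return keep, labels

    idx, lab = sample_rois(np.random.rand(2000))   # fake proposal IoUs for one image
    print(len(idx), lab.mean())                    # ~64 RoIs, ~25% positives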
Image Pyramid
8
[1] Image Pyramid (Gaussian Pyramid)
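A minimal grayscale sketch of a Gaussian pyramid, assuming a separable 5-tap binomial kernel as the Gaussian approximation; a real implementation would more likely call cv2.pyrDown.

    import numpy as np

    def gaussian_pyramid(img, levels=4):
        k = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
        k /= k.sum()                                 # 5-tap binomial ~ Gaussian
        pyramid = [img.astype(float)]
        for _ in range(levels - 1):
            cur = pyramid[-1]
            # separable blur: rows first, then columns (zero-padded at the borders)
            blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, cur)
            blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, blurred)
            pyramid.append(blurred[::2, ::2])        # drop every second row and column
        return pyramid

    levels = gaussian_pyramid(np.random.rand(64, 64))
    print([p.shape for p in levels])   # (64, 64), (32, 32), (16, 16), (8, 8)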
Scale invariance
9
Scale invariance
10
Truncated SVD
11
Truncated SVD
12
Q. Why is it helpful to reduce the number of parameters?
A. Suppose (n, d, r) = (100, 100, 2): the full FC weight matrix has n · d = 10,000 parameters,
while the rank-r factorization, implemented as two smaller FC layers, has only r · (n + d) = 400
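A minimal NumPy sketch of the idea: factor the FC weight matrix W with a truncated SVD and implement it as two thinner FC layers, as Fast R-CNN does at test time; the random data here is only for illustration.

    import numpy as np

    n, d, r = 100, 100, 2
    W = np.random.randn(n, d)                 # original FC weights (n*d = 10,000 parameters)
    b = np.random.randn(n)

    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    layer1 = np.diag(S[:r]) @ Vt[:r]          # (r x d): first thin FC layer, no bias
    layer2 = U[:, :r]                         # (n x r): second thin FC layer, keeps the bias

    x = np.random.randn(d)                    # an input feature vector
    y_full = W @ x + b                        # original layer
    y_svd = layer2 @ (layer1 @ x) + b         # compressed: r*(n + d) = 400 parameters

    print(W.size, layer1.size + layer2.size)  # 10000 vs 400
    print(np.abs(y_full - y_svd).max())       # approximation error of the rank-2 factorization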
Truncated SVD
13
2D dataset example
3D dataset example
Q&A
Thank you !
14

Editor's Notes

  • #2 MA-INF 2307 - Lab Vision
  • #3 Main idea of the paper (same as in the first talk) - review your goals - present & discuss your results - comment on your own implementation (what was available, what had to be done, what were the difficulties) -> 1. data preprocessing (parsing JSON), managing two independent projects, making the code work in general (since we have many variables here: FDA_mode, round, thresholding, and so on) - conclusion (e.g. strengths/weaknesses of the paper, potential future work) -> lack of time (for training) -> in fact, the self-supervised setting should include a multi-band average, but there was not enough time to do it; since this is where the main performance gain came from in FDA, an improvement is also expected for Intra.
  • #10 Explain the meaning of 'Domain adaptation': adapting a model trained with annotated samples from one distribution (source) to operate on a different (target) distribution for which no annotations are given. Our method does not require any training to perform the domain alignment, just a simple Fourier transform and its inverse. Despite its simplicity, it achieves state-of-the-art performance on the current benchmarks when integrated into a relatively standard semantic segmentation model. Much research has been proposed for 'domain adaptation'; however, state-of-the-art methods are complex.
  • #12 ⟨a|b⟩ = ⟨a| |b⟩ = a^{T}b (bra-ket notation for the inner product)