Cascade R-CNN: Delving into High Quality Object Detection
2022/4/6, Changjin Lee
Introduction
● A tricky challenge in object detection
○ A detector trained with a low IoU threshold produces noisy bounding boxes
○ A detector trained with a high IoU threshold (weirdly) shows degraded performance
[Figure: low IoU threshold -> noisy bboxes; high IoU threshold -> high quality bboxes, crossed out (X)]
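A minimal sketch of the IoU test behind these thresholds, assuming axis-aligned boxes in (x1, y1, x2, y2) format. The names `iou` and `is_positive` are illustrative and not taken from the paper's code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when the boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_positive(proposal, gt_box, iou_threshold):
    """A proposal counts as a positive example if its IoU with GT reaches the threshold."""
    return iou(proposal, gt_box) >= iou_threshold


gt = (0, 0, 10, 10)
proposal = (2, 0, 12, 10)               # GT shifted 2 units to the right
print(round(iou(proposal, gt), 3))      # 0.667
print(is_positive(proposal, gt, 0.5))   # True
print(is_positive(proposal, gt, 0.7))   # False
```

The same box is a positive at threshold 0.5 and a negative at 0.7, which is why the choice of threshold changes what the detector is trained on.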
Why does a high IoU threshold lead to worse performance?
● Overfitting - The number of positive examples vanishes exponentially when training with a high IoU threshold
○ A higher threshold is pickier about bounding box quality, so fewer proposals count as positives
● Mismatch between the IoUs for which the detector is optimal and those of the input hypotheses during inference
○ Suppose a detector is trained with an IoU threshold of 0.5 but is then evaluated in a competition where the criterion is 0.7: there’s a mismatch.
[Figure: proposals with IoUs 0.6, 0.6, 0.69, 0.8, 0.9, 0.6, 0.68 judged at IoU threshold 0.7. Detector: “But I was trained with IoU 0.5..” / “It’s time to test with IoU 0.7!”]
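To make the vanishing positives concrete, here is a small sketch that counts how many of the seven IoU values from the figure survive at each threshold. The helper name `count_positives` is illustrative only.

```python
def count_positives(ious, iou_threshold):
    """Number of proposals whose IoU with GT reaches the threshold."""
    return sum(1 for value in ious if value >= iou_threshold)


# IoU values taken from the figure on this slide
figure_ious = [0.6, 0.6, 0.69, 0.8, 0.9, 0.6, 0.68]

print(count_positives(figure_ious, 0.5))  # 7
print(count_positives(figure_ious, 0.7))  # 2
```

Moving the threshold from 0.5 to 0.7 drops the positives from 7 to 2 on this tiny sample, and fewer positives make overfitting more likely.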
Motivations of Cascade R-CNN
[1] A detector optimized at a single IoU level is not necessarily optimal at other levels
[2] A detector can only have high quality predictions if presented with high quality proposals (e.g. from
RPN)
[Figure: bbox IoU with GT for proposals from the RPN]
Motivations of Cascade R-CNN
● Just increasing the IoU threshold doesn’t solve the problem.
1. The distribution of bounding box quality from a proposal network is heavily imbalanced towards low quality. If you increase the IoU threshold, a lot of positive examples are wiped out, resulting in overfitting.
2. High quality detectors are only optimal for high quality proposals. A large distribution gap between the RPN and the detection head leads to a mismatch.
[Figure: RPN -> Detection Head, with high quality and low quality proposals; goal: mAP70]
Cascade R-CNN
● Cascade R-CNN “stages” are trained sequentially with increasing IoU thresholds, using the output of one stage to train the next, so each stage is more selective against close false positives
○ Let each detector handle a single IoU threshold!
● The output of a detector is a “good distribution” for training the next, higher quality detector
● The same cascade procedure is also applied at inference
[Figure: Stage 1 (IoU 0.5) -> Stage 2 (IoU 0.6) -> Stage 3 (IoU 0.7)]
● H0: RPN (proposal network)
● H1: Detection Head
● C: Classification score
● B: Bounding box predictions
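The resampling idea can be sketched with a toy cascade. This is not the paper's implementation: `toy_regressor` is a hypothetical stand-in for a learned bbox regressor (it cheats by moving each coordinate halfway toward the ground truth), and the boxes are made up for illustration. Only the stage thresholds 0.5, 0.6, 0.7 come from the slide.

```python
STAGE_THRESHOLDS = (0.5, 0.6, 0.7)  # one IoU threshold per stage, as on the slide


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def toy_regressor(box, gt_box, step=0.5):
    """Hypothetical regressor: moves every coordinate a fraction `step` toward GT."""
    return tuple(c + step * (g - c) for c, g in zip(box, gt_box))


def run_cascade(proposals, gt_box, thresholds=STAGE_THRESHOLDS):
    """Count positives at each stage's threshold, then refine boxes for the next stage."""
    boxes = list(proposals)
    positives_per_stage = []
    for threshold in thresholds:
        positives = sum(1 for b in boxes if iou(b, gt_box) >= threshold)
        positives_per_stage.append(positives)
        # The refined output of this stage is the input distribution of the next one
        boxes = [toy_regressor(b, gt_box) for b in boxes]
    return boxes, positives_per_stage


gt = (0.0, 0.0, 10.0, 10.0)
# Made-up proposals: the GT box shifted right by 1, 2, 3, 4 and 6 units
proposals = [(s, 0.0, 10.0 + s, 10.0) for s in (1.0, 2.0, 3.0, 4.0, 6.0)]

_, positives_per_stage = run_cascade(proposals, gt)
direct_at_07 = sum(1 for b in proposals if iou(b, gt) >= 0.7)

print(positives_per_stage)  # [3, 4, 5]
print(direct_at_07)         # 1
```

In this toy setting, thresholding the raw proposals at 0.7 leaves a single positive, while the cascade still has positives at every stage because each stage sees boxes already refined by the previous one. A real regressor is learned and far less ideal, so the numbers here only illustrate the trend.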
Summary of Target Problems
[1] A detector optimized at a single IoU level is not necessarily optimal at other levels
[2] A detector can only have high quality predictions if presented with high quality proposals (e.g. from RPN) -> large distribution gap between RPN and Detection Head
[3] Overfitting - Vanishing positive examples
[4] Mismatch between the IoUs for which the detector is optimal and those of the input hypotheses during inference
[Figure: the four problems above, annotated on the cascade Stage 1 (IoU 0.5) -> Stage 2 (IoU 0.6) -> Stage 3 (IoU 0.7)]
● Training: each stage resamples the proposal distribution toward higher quality
● Inference: the same cascade is applied
Performance
References
● https://arxiv.org/abs/1712.00726
● https://deep-learning-study.tistory.com/605
