[CVPR 22] Context-rich Minority Oversampling for Long-tailed Classification

The Majority Can Help the Minority:
Context-rich Minority Oversampling for Long-tailed Classification
Seulki Park1, Byeongho Heo2, Sangdoo Yun2, Jin Young Choi1, Youngkyu Hong2
1Seoul National University, 2NAVER AI Lab
Poster ID: 159a & Poster Time: Jun. 22, 10:00-12:30

Long-tailed Classification
Many real-world datasets exhibit a long-tailed distribution.
✓ The model trained on such imbalanced data tends to overfit the majority classes.
✓ That is, the model performs poorly on minority classes.
Problem Definition:
● Input: Long-tailed (imbalanced) training data and uniformly distributed (balanced) test data.
● Goal: Train a robust model that generalizes well to balanced test data.
[Figure: examples of long-tailed data. Faces (Zhang et al., 2017), Places (Wang et al., 2017), Species (Van Horn et al., 2018), Actions (Zhang et al., 2019)]
* Images by authors.

Previous Oversampling Methods
1. Random Oversampling (ROS)
◦ A simple and straightforward method that repeatedly oversamples the minority classes.
◦ However, this may intensify the overfitting problem [1].
2. Synthetic Minority Over-sampling Technique (SMOTE), 2002
◦ Oversamples minority samples by interpolating between existing minority samples and their nearest minority neighbors.
◦ However, it is difficult to apply to end-to-end training and large-scale image datasets because of the high computational complexity of calculating the K-nearest neighbors for every sample.
[1] Deep imbalanced attribute classification using visual attention aggregation, ECCV, 2018.
SMOTE Figure from: http://www.incodom.kr/SMOTE

3. Generative Adversarial Minority Oversampling (GAMO), 2019
◦ Produces new minority samples by training a convex generator, inspired by the success of generative adversarial networks (GANs).
◦ However, the generator is difficult to train (mode collapse) and adds training cost.
4. MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition, 2021
◦ Uses implicit semantic data augmentation (ISDA) algorithm [1].
◦ However, this meta-learning-based method requires an additional balanced validation set, and hundreds and thousands of iterations for training (high training cost).
[1] Implicit semantic data augmentation for deep networks, NeurIPS, 2019.
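
To make the interpolation step of SMOTE (method 2 above) concrete, here is a minimal NumPy sketch. It is an illustration under assumptions, not the reference SMOTE implementation: the function name `smote_like_sample`, the parameter `k`, and the brute-force neighbor search are choices made for this example. The full distance computation per sample also hints at why the method scales poorly to large image datasets.

```python
import numpy as np

def smote_like_sample(minority, k=3, rng=None):
    """Create one synthetic point between a random minority sample and one of
    its k nearest minority neighbors (needs at least two minority samples)."""
    rng = np.random.default_rng() if rng is None else rng
    minority = np.asarray(minority, dtype=float)
    i = rng.integers(len(minority))
    # Brute-force distances from sample i to every minority sample.
    dists = np.linalg.norm(minority - minority[i], axis=1)
    # Drop the closest entry: it is sample i itself (assuming no duplicates).
    neighbors = np.argsort(dists)[1:k + 1]
    j = rng.choice(neighbors)
    # Pick a random position on the segment between the two points.
    t = rng.random()
    return minority[i] + t * (minority[j] - minority[i])
```

Because the new point is a convex combination of two existing minority samples, it stays inside the region already covered by the minority class, which is one way to read the "context-limited" limitation listed for simple methods.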

Limitation of previous methods:
1) Simple methods generate only context-limited images (e.g., ROS, SMOTE).
- Limited improvement, especially when the imbalance is severe.
2) Recent methods require additional, expensive training cost (e.g., GAMO, MetaSAug).
- E.g., training a generator, an additional balanced validation set, longer training epochs.
→ We need a ‘simple & context-rich’ oversampling method!

Motivation
Q. How can we generate diverse ‘context-rich minority samples’ from a long-tailed distribution?
A. Let’s pay attention to the characteristics of long-tailed distributions.
Key Observations:
✓ Majority class samples are data-rich and information-rich!
→ Let’s use the rich information of the majority samples to generate new minority samples.
Key Idea
- We can use the rich major-class images as the background for the newly created minor-class images.

Proposed Method: Context-rich Minority Oversampling (CMO)
Recap: CutMix (Yun et al., 2019)
- A simple but effective data augmentation method used in many visual tasks.
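$$\tilde{x} = \mathbf{M} \odot x_b + (\mathbf{1} - \mathbf{M}) \odot x_f, \qquad \tilde{y} = \lambda y_b + (1 - \lambda) y_f$$
$$(x_b, y_b),\ (x_f, y_f) \sim P$$
$\mathbf{M} \in \{0, 1\}^{W \times H}$: a binary mask
→ Designed for a class-balanced dataset.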
Naively using CutMix generates more samples of the majority classes.
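
The CutMix recap above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the official CutMix code: the names `rand_bbox` and `cutmix` and the `alpha` parameter are assumptions for this example. Drawing λ from a Beta(α, α) distribution and using a box whose side ratio is sqrt(1 − λ) follow the CutMix paper and are not stated on the slide.

```python
import numpy as np

def rand_bbox(height, width, lam, rng):
    """Sample a box covering roughly a (1 - lam) fraction of the image."""
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = rng.integers(height), rng.integers(width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return y1, y2, x1, x2

def cutmix(x_b, y_b, x_f, y_f, rng=None, alpha=1.0):
    """Paste a patch of the foreground x_f onto the background x_b.
    Images are (H, W, C) arrays; labels are one-hot vectors."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    h, w = x_b.shape[:2]
    y1, y2, x1, x2 = rand_bbox(h, w, lam, rng)
    mask = np.ones((h, w, 1))      # M = 1 keeps the background pixel
    mask[y1:y2, x1:x2] = 0.0       # M = 0 inside the pasted patch
    x_mix = mask * x_b + (1.0 - mask) * x_f
    # Recompute lambda from the real patch area, since the box may be clipped.
    lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
    y_mix = lam * y_b + (1.0 - lam) * y_f
    return x_mix, y_mix
```

Since both images come from the same distribution P here, a long-tailed training set makes both the background and the pasted patch mostly majority-class images.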
Context-rich Minority Oversampling (CMO)
- For an imbalanced dataset, we use different distributions for background and foreground images.
$$(x_b, y_b) \sim P, \qquad (x_f, y_f) \sim Q$$
$Q$: minor-class-weighted distribution.
[Figure: comparison of CMO with CutMix (Yun et al., 2019)]

Proposed Method: Minor-class-weighted distribution
How to design minor-class-weighted sampling strategies?
- Re-weighting methods provide ways to assign appropriate weights to samples.
- Commonly used sampling strategies assign a weight inversely proportional to the class frequency [1, 2] or to the effective number [3].
- $n_k$: the number of samples in the $k$-th class; $C$: the total number of classes.
- The generalized sampling probability for the $k$-th class can be defined by
$$q(r, k) = \frac{(1/n_k)^r}{\sum_{k'=1}^{C} (1/n_{k'})^r}$$
- As $r$ increases, the weight of the minor class becomes increasingly larger than that of the major class.
- The effective number [3] is defined as
$$E(k) = \frac{1 - \beta^{n_k}}{1 - \beta}$$
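
The two formulas above are easy to evaluate directly. The following pure-Python sketch does so for a hypothetical three-class dataset; the function names, the class counts, and the default `beta=0.999` are assumptions made for illustration and do not come from the slides.

```python
def sampling_probs(class_counts, r=1.0):
    """q(r, k) = (1/n_k)^r / sum_k' (1/n_k')^r, returned for every class k."""
    weights = [(1.0 / n) ** r for n in class_counts]
    total = sum(weights)
    return [w / total for w in weights]

def effective_number(n_k, beta=0.999):
    """E(k) = (1 - beta^n_k) / (1 - beta), for 0 <= beta < 1."""
    return (1.0 - beta ** n_k) / (1.0 - beta)

counts = [500, 50, 5]  # hypothetical long-tailed class sizes
for r in (0, 1, 2):
    print(r, [round(q, 3) for q in sampling_probs(counts, r)])
```

With r = 0 every class receives the same value of q, and increasing r shifts more and more of the probability mass toward the smallest class, which matches the remark about r above.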
[1] Learning deep representation for imbalanced classification, CVPR, 2016.
[2] Exploring the limits of weakly supervised pretraining, ECCV, 2018.
[3] Class-balanced loss based on effective number of samples, CVPR, 2019.
[Figure: data distribution; panels: Original, r = 0, r = 1, r = 2]

Proposed Method: Algorithm
Algorithm
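
The algorithm on this slide was shown as a figure and is not reproduced in the text. As a stand-in, here is a rough NumPy sketch of one CMO-style batch based on the definitions given earlier: backgrounds are drawn from the original distribution P, foregrounds from the minor-class-weighted distribution Q, and the two are mixed as in CutMix. Several points are assumptions of this sketch and not taken from the authors' code: the function name `cmo_batch`, applying the class weight (1/n_k)^r as a per-sample weight, drawing λ uniformly, and the box-sampling rule (borrowed from CutMix).

```python
import numpy as np

def cmo_batch(images, labels, class_counts, batch_size=4, r=1.0, rng=None):
    """Build one CMO-style batch from images (N, H, W, C) and integer labels (N,)."""
    rng = np.random.default_rng() if rng is None else rng
    n, h, w = images.shape[:3]
    onehot = np.eye(len(class_counts))
    # Q: weight each sample by (1 / n_k)^r of its class, then normalize.
    weights = (1.0 / np.asarray(class_counts, dtype=float)[labels]) ** r
    q = weights / weights.sum()
    bg_idx = rng.integers(n, size=batch_size)        # (x_b, y_b) ~ P
    fg_idx = rng.choice(n, size=batch_size, p=q)     # (x_f, y_f) ~ Q
    mixed_x, mixed_y = [], []
    for b, f in zip(bg_idx, fg_idx):
        lam = rng.uniform()
        ratio = np.sqrt(1.0 - lam)
        ch, cw = int(h * ratio), int(w * ratio)
        cy, cx = rng.integers(h), rng.integers(w)
        y1, y2 = max(cy - ch // 2, 0), min(cy + ch // 2, h)
        x1, x2 = max(cx - cw // 2, 0), min(cx + cw // 2, w)
        x = images[b].astype(float)                  # astype returns a copy
        # Paste the (mostly minority) foreground patch onto the background.
        x[y1:y2, x1:x2] = images[f][y1:y2, x1:x2]
        # Label weight of the background = fraction of pixels it still covers.
        lam = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)
        mixed_x.append(x)
        mixed_y.append(lam * onehot[labels[b]] + (1.0 - lam) * onehot[labels[f]])
    return np.stack(mixed_x), np.stack(mixed_y)
```

The only change relative to plain CutMix is where the foreground index comes from, which fits the claim that the method adds little computational cost.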

Experimental Results
1. Datasets
◦ Synthetic data:
▪ CIFAR-100-LT (100 classes), ImageNet-LT (1,000 classes)
◦ Real-world data:
▪ iNaturalist 2018 (8,142 classes)
※ Imbalance ratio: the ratio between the number of samples in the most frequent class and in the least frequent class.
2. Evaluation metrics
◦ Top-1 accuracy
◦ Accuracy on three disjoint class subsets, split by number of training samples (Many: > 100, Med: 20 to 100, Few: < 20) [1]
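
The metrics above can be computed as in the following pure-Python sketch. The function names and the toy labels are hypothetical, and averaging over test samples inside each group (rather than over classes) is an assumption; on a balanced test set the two coincide.

```python
def imbalance_ratio(train_counts):
    """Size of the most frequent class divided by the least frequent one."""
    return max(train_counts) / min(train_counts)

def split_accuracy(y_true, y_pred, train_counts):
    """Top-1 accuracy overall and on the Many / Med / Few class groups."""
    def group(k):
        n = train_counts[k]
        return "many" if n > 100 else "few" if n < 20 else "med"

    hits = {"all": [], "many": [], "med": [], "few": []}
    for t, p in zip(y_true, y_pred):
        correct = 1.0 if t == p else 0.0
        hits["all"].append(correct)
        hits[group(t)].append(correct)
    # Empty groups are reported as None instead of dividing by zero.
    return {g: (sum(v) / len(v) if v else None) for g, v in hits.items()}
```

Here `train_counts[k]` is the number of training samples of class k, so the grouping depends only on the training set while the accuracy is measured on test predictions.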
[1] Large-scale long-tailed recognition in an open world, CVPR, 2019.

Long-tailed classification benchmarks (ImageNet-LT)
1. Comparison with state-of-the-art methods
2. Comparison with oversampling methods
3. Results with more training epochs

Analysis
1. Impact of different Q distributions
2. Using different augmentation methods
3. Variants of CMO
4. Generated Images

Conclusion
✓ We propose a novel context-rich minority oversampling method that leverages the rich context of the majority classes as background images.
✓ It requires little additional computational cost and can be easily integrated into existing methods.
✓ It is simple yet effective and achieves state-of-the-art performance.
✓ We empirically demonstrate the effectiveness of the proposed oversampling method through extensive experiments and ablation studies.

Thank you!
Contact: seulki.park@snu.ac.kr
Code: https://github.com/naver-ai/cmo
