2. Dealing with Imbalanced Classes
Resampling Dataset
Over-Sampling (when the dataset has tens of thousands of records or fewer)
Under-Sampling (when the dataset has tens or hundreds of thousands of records or more)
Hybrid of Over-sampling and Under-sampling.
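The resampling options above can be sketched with plain random sampling. This is a minimal illustration using only the Python standard library; the function names `random_oversample` and `random_undersample` and the toy 90/10 dataset are assumptions made for this example, not part of the original material.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect rows under their class label."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample with replacement to fill the gap to the majority size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Drop randomly chosen rows until every class matches the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample without replacement to keep a subset of each class.
        X_out.extend(rng.sample(rows, target))
        y_out.extend([label] * target)
    return X_out, y_out


# Toy data (hypothetical): 90 majority rows (label 0), 10 minority rows (label 1).
X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
print(Counter(y_over))
print(Counter(y_under))
```

A hybrid approach would apply both: under-sample the majority class part of the way and over-sample the minority class to meet it.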
Change performance metric
• Confusion Matrix, Precision, Recall, F-score, etc.
• Cohen's Kappa, ROC analysis
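To see why accuracy alone misleads on imbalanced data, the sketch below computes the confusion-matrix counts and the derived metrics for a classifier that always predicts the majority class. The helper names `confusion_counts` and `summarize` and the 95/5 toy labels are assumptions for illustration; ROC analysis is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summarize(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1 and Cohen's kappa from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Kappa compares observed agreement with the agreement expected by chance.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


# Toy labels (hypothetical): 5 positives among 100 samples,
# scored against a model that always predicts the majority class.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
metrics = summarize(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.95 while recall, F1 and kappa are all 0, which is the gap these alternative metrics are meant to expose.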
I. First, SMOTE chooses a random minority-class observation a.
II. Among the k nearest minority-class neighbors of a, an instance b
is selected at random.
III. Finally, a new sample x is created by randomly interpolating
between the two samples:
x = a + (b - a) * rand(0, 1)
where rand(0, 1) generates a random number between 0 and 1.
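The three steps above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not a production implementation: the function name `smote_sample`, the use of Euclidean distance, the brute-force neighbor search, and the toy points are all assumptions made for the example.

```python
import random


def smote_sample(minority, k=5, rng=None):
    """Create one synthetic point from a list of minority-class feature vectors.

    Assumes at least two minority points and Euclidean distance.
    """
    rng = rng or random.Random()
    # Step I: choose a random minority observation a.
    a = rng.choice(minority)
    # Step II: pick b at random among the k nearest minority neighbors of a.
    others = [p for p in minority if p is not a]
    others.sort(key=lambda p: sum((pi - ai) ** 2 for pi, ai in zip(p, a)))
    b = rng.choice(others[:k])
    # Step III: interpolate, x = a + (b - a) * rand(0, 1).
    gap = rng.random()
    return [ai + (bi - ai) * gap for ai, bi in zip(a, b)]


# Toy minority points (hypothetical), two features each.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [3.0, 3.0]]
rng = random.Random(42)
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(5)]
for x in synthetic:
    print(x)
```

Because each new point lies on the line segment between two existing minority points, the synthetic samples stay inside the region the minority class already occupies.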
Synthetic Minority Oversampling Technique (SMOTE)