A classifier can predict the class labels of new data after training. In real-world data sets the proportion of class labels is often imbalanced, and imbalanced data makes training a classifier difficult. This is the case for the Real-Time Bidding (RTB) framework in online advertising, and there are several ways to deal with the problem and improve the classifier's performance.
2. Table of contents
1. Introduction
2. Methods
2.1 Re-sampling
2.2 Cost-sensitive learning
3. Tools in practice
4. References
3. Introduction
A classifier can predict the class labels of new data after training. In real-world data sets the proportion of class labels is often imbalanced, and imbalanced data makes training a classifier difficult. This is the case for the Real-Time Bidding (RTB) framework in online advertising, and there are several ways to deal with the problem and improve the classifier's performance.
4. Methods: Re-sampling
Re-sampling can deal with imbalanced data by balancing the proportion of class labels:
• Under-sampling the majority class
• Over-sampling the minority class
• Combining over- and under-sampling
• Creating ensembles of balanced sets
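The first two strategies can be sketched in a few lines of plain Python. This is a minimal illustration (libraries such as imbalanced-learn provide production-ready versions of all four strategies); the function names and the 90/10 toy data are made up for the example.

```python
import random

def under_sample(majority, minority, seed=0):
    """Randomly drop majority examples until both classes are the same size."""
    rng = random.Random(seed)
    return rng.sample(list(majority), len(minority)) + list(minority)

def over_sample(majority, minority, seed=0):
    """Randomly duplicate minority examples until both classes are the same size."""
    rng = random.Random(seed)
    extra = rng.choices(list(minority), k=len(majority) - len(minority))
    return list(majority) + list(minority) + extra

majority = [("x%d" % i, 0) for i in range(90)]   # 90 negative examples
minority = [("x%d" % i, 1) for i in range(10)]   # 10 positive examples

balanced_under = under_sample(majority, minority)  # 20 examples, 10 of each class
balanced_over = over_sample(majority, minority)    # 180 examples, 90 of each class
```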
5. Methods: Calibration after re-sampling
There are several ways to calibrate the output probability of a classifier after re-sampling:
• Isotonic regression
  minimize Σ_i w_i (y_i − ŷ_i)²
  subject to ŷ_min = ŷ_1 ≤ ŷ_2 ≤ … ≤ ŷ_n = ŷ_max
• Calibration factor for negative under-sampling
  q = p / (p + (1 − p) / w)
  • q: calibrated probability
  • p: prediction in the under-sampled space
  • w: under-sampling rate
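Both calibration methods are small enough to sketch directly. The isotonic fit below is a minimal pool-adjacent-violators implementation of the weighted least-squares problem above (scikit-learn's IsotonicRegression is the practical choice), and `calibrate` is the under-sampling correction formula; all function names here are illustrative.

```python
def isotonic_fit(y, w=None):
    """Pool Adjacent Violators: weighted least-squares fit of y that is
    non-decreasing in the input order (minimizes sum_i w_i (y_i - yhat_i)^2)."""
    w = w or [1.0] * len(y)
    blocks = []  # each block: [sum of weights, sum of w*y, number of points]
    for yi, wi in zip(y, w):
        blocks.append([wi, wi * yi, 1])
        # merge while the previous block's mean exceeds the last block's mean
        while len(blocks) > 1 and blocks[-2][1] / blocks[-2][0] > blocks[-1][1] / blocks[-1][0]:
            w2, s2, c2 = blocks.pop()
            blocks[-1][0] += w2
            blocks[-1][1] += s2
            blocks[-1][2] += c2
    fitted = []
    for wb, sb, cb in blocks:
        fitted.extend([sb / wb] * cb)  # every point in a block gets the block mean
    return fitted

def calibrate(p, w):
    """Map a prediction p made in the negative-under-sampled space back to a
    calibrated probability q = p / (p + (1 - p)/w), where w is the
    under-sampling rate (e.g. w = 0.1 means 10% of negatives were kept)."""
    return p / (p + (1.0 - p) / w)

fit = isotonic_fit([1.0, 3.0, 2.0])  # the 3, 2 violation is pooled to 2.5, 2.5
q = calibrate(0.5, 0.1)              # deflates 0.5 back to ~0.091
```

With w = 1 (no under-sampling) `calibrate` leaves the prediction unchanged; the smaller w is, the more the raw prediction is deflated, undoing the inflation caused by dropping negatives.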
6. Methods: Calibration after re-sampling
• Probability calibration should be done on new data not used for model fitting
• Logistic regression returns well-calibrated predictions by default, as it directly optimizes log-loss
7. Cost-sensitive learning
                  Actual negative   Actual positive
Predict negative  C(0, 0)           C(0, 1)
Predict positive  C(1, 0)           C(1, 1)
• Cost-sensitive learning takes the misclassification costs into consideration
• R(i|x) = Σ_j P(j|x) C(i, j)
  • the expected cost R(i|x) of classifying an instance x into class i, where C(i, j) is the cost of classifying an instance of actual class j into class i (1 = positive, 0 = negative)
• The classifier will classify an instance x into the positive class if and only if:
  P(0|x) C(1, 0) ≤ P(1|x) C(0, 1), assuming C(0, 0) = C(1, 1) = 0
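The expected-cost rule above can be made concrete with a short sketch. The cost values and function names are made up for illustration; the logic is exactly R(i|x) = Σ_j P(j|x) C(i, j) with the minimum-cost class chosen.

```python
def expected_cost(i, p_pos, C):
    """R(i|x) = sum_j P(j|x) * C(i, j), with class 1 = positive, 0 = negative."""
    p = {1: p_pos, 0: 1.0 - p_pos}
    return sum(p[j] * C[(i, j)] for j in (0, 1))

def predict(p_pos, C):
    """Choose the class with the lower expected cost (positive on ties)."""
    return 1 if expected_cost(1, p_pos, C) <= expected_cost(0, p_pos, C) else 0

# Illustrative costs: a false positive costs 1, a false negative costs 5,
# and correct predictions are free (C(0,0) = C(1,1) = 0).
C = {(0, 0): 0, (0, 1): 5, (1, 0): 1, (1, 1): 0}

label = predict(0.2, C)  # R(1|x) = 0.8*1 = 0.8 < R(0|x) = 0.2*5 = 1.0, so positive
```

Note that with these costs an instance is labeled positive even though P(1|x) = 0.2 < 0.5: the high false-negative cost shifts the decision toward the rare class.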
8. Cost-sensitive learning types: Thresholding
• The thresholding method modifies the threshold (0.5 by default) used to label the class, taking the costs into account:
  p* = C(1, 0) / (C(1, 0) + C(0, 1))
• with threshold p*, the classifier classifies an instance x as positive if P(1|x) ≥ p*
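A one-line sketch of the threshold, with illustrative costs (false positive cost 1, false negative cost 5); the function name is made up for the example. The thresholded decision agrees with the minimum-expected-cost decision from the previous slide.

```python
def cost_threshold(fp_cost, fn_cost):
    """p* = C(1,0) / (C(1,0) + C(0,1)); classify positive when P(1|x) >= p*."""
    return fp_cost / (fp_cost + fn_cost)

p_star = cost_threshold(1, 5)  # 1/6: expensive false negatives lower the bar
is_positive = 0.2 >= p_star    # P(1|x) = 0.2 clears the lowered threshold
```

With equal costs the threshold falls back to the default 0.5, so thresholding strictly generalizes the usual decision rule.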
9. Cost-sensitive learning types: Sampling
• The re-sampling methods described above can be considered a form of cost-sensitive learning
• Positive and negative examples are sampled in the ratio:
  p(1) FN : p(0) FP
• p(1) and p(0) are the prior probabilities of positive and negative examples in the original training set, and FN = C(0, 1) and FP = C(1, 0) are the misclassification costs
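The ratio is just two products; a tiny sketch with illustrative numbers (10% positives, false negatives five times as costly as false positives; the function name is made up):

```python
def cost_sampling_ratio(p_pos, fn_cost, fp_cost):
    """Return the positive : negative sampling ratio p(1)*FN : p(0)*FP."""
    return p_pos * fn_cost, (1.0 - p_pos) * fp_cost

pos_share, neg_share = cost_sampling_ratio(0.1, 5, 1)
# 0.5 : 0.9 -- the costly rare class is sampled far more heavily than
# its 1 : 9 prior ratio would suggest.
```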
10. Cost-sensitive learning types: Weighting
• The weighting method assigns a normalized weight to each instance according to the misclassification costs
• This can be considered a variant of the sampling method, since an example with a high weight (a rare class with a high cost) can be viewed as a duplicated example, hence sampling
• Unlike the sampling method, the weighting method can utilize all of the data
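A minimal sketch of cost-based instance weighting, assuming the same illustrative costs as before (false negatives three times as costly here; the function name is made up). Note that every instance keeps a nonzero weight, which is the "utilizes all data" point above.

```python
def instance_weights(labels, fn_cost, fp_cost):
    """Assign each instance a weight proportional to the cost of
    misclassifying it, normalized so the weights sum to 1."""
    raw = [fn_cost if y == 1 else fp_cost for y in labels]
    total = sum(raw)
    return [r / total for r in raw]

# One positive among three negatives, FN cost 3, FP cost 1:
# the single positive carries half of the total weight.
weights = instance_weights([1, 0, 0, 0], 3, 1)
```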
11. Tools in practice: XGBoost
• Balance the positive and negative weights via scale_pos_weight if you only care about the ranking order of your predictions
  • typically set to sum(negative/majority samples) / sum(positive/minority samples)
• Use AUC for evaluation; Utility [Chapelle O 2015] can also be considered as a metric in RTB
• If you care about predicting the right probability, you cannot re-balance the data
  • setting the parameter max_delta_step to a finite number (like 1) will help convergence
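The advice above translates into a parameter dictionary like the one below. scale_pos_weight, max_delta_step, objective, and eval_metric are real XGBoost parameter names; the class counts are made up for illustration, and the actual training call (omitted) requires the xgboost package.

```python
# Counts from a hypothetical imbalanced training set (1% positives).
n_negative = 990_000   # majority class
n_positive = 10_000    # minority class

params = {
    "objective": "binary:logistic",
    "eval_metric": "auc",  # ranking-quality metric, robust to class imbalance
    # Only if ranking order matters -- this re-balances the classes and
    # therefore distorts the predicted probabilities:
    "scale_pos_weight": n_negative / n_positive,
    # Helps convergence on heavily imbalanced data:
    "max_delta_step": 1,
}
# If calibrated probabilities are needed instead, drop scale_pos_weight
# (leave it at its default of 1) and do not re-balance the data.
```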
12. References
• Chapelle O. (2015). Offline Evaluation of Response Prediction in Online Advertising Auctions.
• Chen T. et al. (2016). XGBoost.
• He X. et al. (2014). Practical Lessons from Predicting Clicks on Ads at Facebook.
• Ling C. et al. (2008). Cost-Sensitive Learning and the Class Imbalance Problem.
• Vasile F. et al. (2016). Cost-Sensitive Learning for Utility Optimization in Online Advertising Auctions.