An Overview of Adaptive Boosting – AdaBoost
Presented By
Kato Mivule
Dr. Manohar Mareboyana, Professor
Data Mining - Spring 2013
Computer Science Department
Bowie State University
OUTLINE
• Introduction
• How AdaBoost Works
• The experiment
• Results
• Conclusion and Discussion
Adaptive Boosting – AdaBoost
• Adaptive Boosting (AdaBoost) was proposed by Freund and Schapire (1995).
• AdaBoost is a machine learning classifier that, over several iterations, combines weak learners to generate a new learner with improved performance.
• AdaBoost is adaptive in that with each iteration a new weak learner is added to the ensemble and the instance weights are re-tuned, with priority given to data misclassified in prior iterations (see the sketch after these bullets).
• AdaBoost is less vulnerable to over-fitting than many classifiers, but it is sensitive to noisy data and outliers.
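
To make the weight update concrete, here is a minimal sketch of the boosting loop in Python. It assumes numpy arrays with binary labels coded as -1/+1 and borrows scikit-learn's depth-one tree as the stump; the names adaboost_fit and adaboost_predict are illustrative, not part of any library.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, n_rounds=3):
        # AdaBoost.M1 sketch for numpy arrays with labels y in {-1, +1}.
        n = len(y)
        w = np.full(n, 1.0 / n)                   # start with uniform instance weights
        stumps, alphas = [], []
        for _ in range(n_rounds):
            stump = DecisionTreeClassifier(max_depth=1)   # weak learner: one split
            stump.fit(X, y, sample_weight=w)
            pred = stump.predict(X)
            err = w[pred != y].sum()              # weighted training error (w sums to 1)
            if err >= 0.5:                        # no better than chance: stop boosting
                break
            err = max(err, 1e-12)                 # guard against a perfect stump
            alpha = 0.5 * np.log((1 - err) / err)   # model weight: lower error, larger vote
            w *= np.exp(-alpha * y * pred)        # up-weight misclassified instances
            w /= w.sum()                          # renormalize to a distribution
            stumps.append(stump)
            alphas.append(alpha)
        return stumps, alphas

    def adaboost_predict(stumps, alphas, X):
        # Final classifier: sign of the alpha-weighted vote over all stumps.
        votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
        return np.sign(votes)

Run with n_rounds = 3, this mirrors the three weighted inner models in the RapidMiner output shown later.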
AdaBoost Fit Ensemble
An overview of the AdaBoost Fit Ensemble procedure.
How AdaBoost Works – Weak Learners
• Decision Stump
• For this overview, we chose the Decision Stump as our weak learner.
• A Decision Stump is a decision tree with only a single split.
• The resulting tree can be used to classify unseen (untrained) instances.
• Each leaf node holds a class label.
• Each non-leaf node is a decision node.
How AdaBoost Works – Weak Learners
• How a Decision Stump chooses the best attribute (a sketch follows this list):
• Information gain: the attribute with the highest information gain is chosen.
• Gain ratio: information gain normalized by the intrinsic information of the split.
• Gini index: the split with the lowest Gini impurity is chosen.
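
As an illustration of the information-gain criterion, the sketch below searches every attribute and candidate threshold for the single split with the highest gain. It is a simplified stand-in, not RapidMiner's Decision Stump implementation.

    import numpy as np

    def entropy(y):
        # Shannon entropy of a label vector.
        _, counts = np.unique(y, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def best_stump_split(X, y):
        # Try every attribute and candidate threshold; keep the split
        # with the highest information gain (largest drop in entropy).
        base = entropy(y)
        best = (None, None, -1.0)          # (attribute index, threshold, gain)
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                left, right = y[X[:, j] <= t], y[X[:, j] > t]
                if len(left) == 0 or len(right) == 0:
                    continue
                after = (len(left) * entropy(left)
                         + len(right) * entropy(right)) / len(y)
                gain = base - after
                if gain > best[2]:
                    best = (j, t, gain)
        return best

Gain ratio divides this gain by the intrinsic information of the split, and the Gini index swaps entropy for Gini impurity; the search itself is the same.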
AdaBoost – the experiment
• For illustration purposes, we utilized RapidMiner's AdaBoost functionality.
• We used the UCI Breast Cancer Wisconsin dataset, with 643 data points.
• We employed 10-fold cross-validation.
AdaBoost – the experiment
• We used RapidMiner’s Decision Stump as our weak learner (an analogous Python setup is sketched below).
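
For readers without RapidMiner, an analogous experiment can be sketched with scikit-learn. Note the assumptions: scikit-learn ships the UCI Breast Cancer Wisconsin (Diagnostic) set, a close relative of the Original set used here, so the numbers will not match the RapidMiner run exactly.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)    # stand-in for the UCI Original set

    # AdaBoostClassifier's default base estimator is a depth-1 tree,
    # i.e. a decision stump; three rounds mirror the model shown later.
    model = AdaBoostClassifier(n_estimators=3)

    scores = cross_val_score(model, X, y, cv=10)  # 10-fold cross-validation
    print("mean accuracy: %.4f" % scores.mean())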
AdaBoost – Results
Generated AdaBoost Model
The following AdaBoost Model was generated:
AdaBoost (prediction model for label Class)
Number of inner models: 3
Embedded model #0 (weight: 2.582):
Uniformity of Cell > 3.500: 4 {2=11, 4=202}
Uniformity of Cell ≤ 3.500: 2 {2=433, 4=37}
Embedded model #1 (weight: 1.352):
Uniformity of Cell Shape > 1.500: 4 {2=100, 4=237}
Uniformity of Cell Shape ≤ 1.500: 2 {2=344, 4=2}
Embedded model #2 (weight: 1.016):
Clump Thickness > 8.500: 4 {2=0, 4=83}
Clump Thickness ≤ 8.500: 2 {2=444, 4=156}
AdaBoost – Results
• AdaBoost using Decision Stumps – classification accuracy of 93.12%.
• Decision Stump without AdaBoost – classification accuracy of 92.97%.
AdaBoost – Results
• AdaBoost Confusion Matrix – classification accuracy of 93.12%.
• Decision Stump Confusion Matrix – classification accuracy of 92.97% (a sketch of the computation follows).
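
The confusion matrices themselves were produced in RapidMiner and are not reproduced here; a comparable matrix can be sketched in Python from out-of-fold predictions, under the same dataset caveat as above.

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import cross_val_predict

    X, y = load_breast_cancer(return_X_y=True)
    model = AdaBoostClassifier(n_estimators=3)

    # Out-of-fold predictions from 10-fold CV, so every instance is
    # scored by a model that never saw it during training.
    pred = cross_val_predict(model, X, y, cv=10)
    print(confusion_matrix(y, pred))   # rows: true class; columns: predicted class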
AdaBoost – Results
The Receiver Operating Characteristic (ROC):
• The ROC plots the false positive rate (1 − specificity) on the X-axis: the probability of predicting target = 1 when its true value is 0.
• The true positive rate (sensitivity) is on the Y-axis: the probability of predicting target = 1 when its true value is 1.
• In the ideal case, the curve rises quickly toward the top-left, indicating that the model makes correct predictions at a low false positive rate.
Area Under the Curve (AUC):
• AUC is the probability that the classifier ranks a randomly chosen positive instance higher than a randomly chosen negative instance.
• The AUC summarizes the overall performance of the classifier.
• A higher AUC indicates better performance.
• An AUC of 0.50 indicates random performance.
• An AUC of 1.00 indicates perfect performance (a short computation sketch follows the plot captions below).
The ROC/AUC plot for AdaBoost – with an AUC of 0.975.
The ROC/AUC plot for the Decision Stump – with an AUC of 0.911.
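
A sketch of how such an ROC curve and AUC can be computed in Python, again on the stand-in scikit-learn dataset rather than the exact data used above:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.metrics import roc_auc_score, roc_curve
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    model = AdaBoostClassifier(n_estimators=3).fit(X_tr, y_tr)
    score = model.predict_proba(X_te)[:, 1]       # predicted P(positive class)

    fpr, tpr, _ = roc_curve(y_te, score)          # points along the ROC curve
    print("AUC: %.3f" % roc_auc_score(y_te, score))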
CONCLUSION
• As the preliminary results show, AdaBoost performs better than a single Decision Stump, both in classification accuracy (93.12% vs. 92.97%) and in AUC (0.975 vs. 0.911).
• However, much of AdaBoost's success depends largely on fine-tuning the parameters of the classifier and on the weak learner that is chosen.
References
1. Y. Freund and R. E. Schapire, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting," Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119-139, Aug. 1997.
2. T. G. Dietterich, "Ensemble Methods in Machine Learning," Lecture Notes in Computer Science, vol. 1857, pp. 1-15, 2000.
3. K. Mivule, C. Turner, and S.-Y. Ji, "Towards a Differential Privacy and Utility Preserving Machine Learning Classifier," Procedia Computer Science, vol. 12, pp. 176-181, 2012.
4. T. Fawcett, "An Introduction to ROC Analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.
5. K. Bache and M. Lichman, "Breast Cancer Wisconsin (Original) Data Set," UCI Machine Learning Repository, University of California, School of Information and Computer Science, Irvine, CA, 2013.
6. MATLAB, "AdaBoost - MATLAB." Online. Accessed: May 3, 2013. Available: http://www.mathworks.com/discovery/adaboost.html
7. MATLAB, "Ensemble Methods :: Nonparametric Supervised Learning (Statistics Toolbox)." Online. Accessed: May 3, 2013. Available: http://www.mathworks.com/help/toolbox/stats/bsvjye9.html#bsvjyi5
8. "Model Evaluation – Classification (ROC Charts)." Online. Accessed: May 3, 2013. Available: http://chem-eng.utoronto.ca/~datamining/dmc/model_evaluation_c.htm
9. MedCalc, "ROC Curve Analysis in MedCalc." Online. Accessed: May 3, 2013. Available: http://www.medcalc.org/manual/roc-curves.php
THANK YOU.
Contact: kmivule at gmail dot com
