Understanding GBM and XGBoost in Scikit-Learn
1
Ensemble
• An ensemble method creates many learners (classifiers or regressors) and learns a new hypothesis
by combining them
• Combining many learners can lead to more reliable predictions than any single learner
(see the sketch below)
(Most ensemble methods use many learners built with the same algorithm)
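As a minimal illustration of the idea, here is a sketch (scikit-learn assumed) that combines three learners by majority vote; the dataset and learner choices are arbitrary:

# Minimal sketch (scikit-learn assumed): three learners combined by majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
ensemble = VotingClassifier(estimators=[
    ('lr', LogisticRegression(max_iter=1000)),
    ('knn', KNeighborsClassifier()),
    ('dt', DecisionTreeClassifier(random_state=0)),
])  # voting='hard' by default: the majority vote of the three learners wins
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))  # combined prediction for the first five samples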
Bagging and Boosting
2
• Bagging assigns bootstrap-sampled training data to multiple classifiers and predicts by voting on or averaging
their results
 RandomForest
• Boosting trains many weak learners in sequence; each learner repeatedly corrects the previous errors by updating
weights
 AdaBoost
 Gradient Boost
 eXtreme Gradient Boost (XGBoost)
• Generally speaking, boosting gives better prediction performance, but it takes more time and has a somewhat
higher chance of overfitting (a comparison sketch follows this list)
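A quick comparison sketch under the same assumptions (scikit-learn, a synthetic dataset), putting a bagging model (RandomForest) and a boosting model (AdaBoost) side by side:

# Sketch: bagging (RandomForest) vs. boosting (AdaBoost) on the same synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)
bagging = RandomForestClassifier(n_estimators=100, random_state=0)  # bootstrap-sampled trees, votes averaged
boosting = AdaBoostClassifier(n_estimators=100, random_state=0)     # weak learners added sequentially, errors re-weighted
print('bagging :', cross_val_score(bagging, X, y, cv=5).mean())
print('boosting:', cross_val_score(boosting, X, y, cv=5).mean())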
3
AdaBoost
[Figure: a toy dataset of +/- points is fit over five steps; three weak classifiers (Classification 1, 2, 3) are trained in turn, each step re-weighting the points the previous classifier got wrong, and the final prediction combines classifications 1, 2, and 3.]
4
[Figure: three weak learners with weights 0.3, 0.5, and 0.8 are combined into the final prediction.]
AdaBoost
The weak learners are combined through their updated weights. For example,
the first learner gets weight 0.3, the second 0.5, and the third 0.8;
their weighted outputs are then summed to produce the final prediction
(see the sketch below).
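A toy sketch of that weighted vote, using the slide's example weights; the arrays h1, h2, h3 are hypothetical stand-ins for the three weak learners' +/-1 predictions:

import numpy as np

# Toy sketch of AdaBoost's weighted vote; h1, h2, h3 are hypothetical +/-1
# outputs of the three weak learners on five samples.
h1 = np.array([ 1,  1, -1, -1, -1])
h2 = np.array([ 1, -1, -1,  1,  1])
h3 = np.array([ 1, -1, -1,  1, -1])
weights = [0.3, 0.5, 0.8]                     # the slide's example learner weights
combined = weights[0]*h1 + weights[1]*h2 + weights[2]*h3
final_prediction = np.sign(combined)          # final label = sign of the weighted sum
print(final_prediction)                       # [ 1. -1. -1.  1. -1.]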
5
GBM(Gradient Boosting Machine)
GBM is similar to AdaBoost.
The major difference is how the weights are updated:
GBM does this more systematically, using gradient descent.
But it takes a lot of time, because each weak learner's
weights are updated serially (a minimal sketch follows).
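A minimal GBM sketch with scikit-learn's GradientBoostingClassifier (the dataset choice is arbitrary):

# Minimal GBM sketch with scikit-learn's GradientBoostingClassifier.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=0)
gbm.fit(X_train, y_train)            # trees are fit one after another on the loss gradient
print(gbm.score(X_test, y_test))     # test accuracy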
Advantages of XGBoost
6
eXtreme Gradient Boost
XGBoost
eXcellent prediction performance
Faster execution time compared with GBM
• CPU parallel processing enabled
Various enhancements
• Regularization
• Tree pruning
Various utilities
• Early stopping
• Built-in cross validation (sketched below)
• Built-in null value handling
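A sketch of the built-in cross validation via xgb.cv, assuming the xgboost package is installed; the parameter values are illustrative:

# Sketch of the built-in cross validation, xgb.cv (xgboost assumed installed).
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
dtrain = xgb.DMatrix(data=X, label=y)
params = {'objective': 'binary:logistic', 'eval_metric': 'logloss'}
cv_results = xgb.cv(params, dtrain, num_boost_round=100, nfold=5,
                    early_stopping_rounds=10)   # early stopping works inside cv as well
print(cv_results.tail())                        # per-round train/test logloss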
XGBoost implementations in Python
7
C/C++ Native Module: XGBoost was initially
written in C/C++
Python Wrapper: a Python package that calls
the native C/C++ core and has its own Python
API and library
Scikit-Learn Wrapper: integrated with the
scikit-learn framework
• Training and prediction via fit( ) and
predict( ) methods, like any other
classifier in scikit-learn
• Works seamlessly with other scikit-learn
modules such as GridSearchCV (see the
sketch below)
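A sketch of that integration, assuming xgboost and scikit-learn are installed; the grid values are arbitrary:

# Sketch: the scikit-learn wrapper plugs into GridSearchCV like any other estimator.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
param_grid = {'max_depth': [3, 5], 'learning_rate': [0.05, 0.1]}   # arbitrary grid
grid = GridSearchCV(XGBClassifier(n_estimators=100), param_grid, cv=3)
grid.fit(X, y)                 # fit()/predict(), just like other scikit-learn classifiers
print(grid.best_params_)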
XGBoost Python Wrapper vs XGBoost Scikit-learn Wrapper
8
Category: modules
Python Wrapper: import xgboost as xgb
Scikit-learn Wrapper: from xgboost import XGBClassifier

Category: training and test datasets
Python Wrapper: the DMatrix class is needed
train = xgb.DMatrix(data=X_train, label=y_train)
To create a DMatrix object, the feature dataset and the label dataset are passed as parameters
Scikit-learn Wrapper: numpy or pandas data is used directly

Category: training API
Python Wrapper: xgb_model = xgb.train( ); xgb.train() returns the trained model
Scikit-learn Wrapper: XGBClassifier.fit( )

Category: prediction API
Python Wrapper: xgb_model.predict( ); predict( ) is called on the model returned by xgb.train(), and
the returned value is not a direct class prediction but the predicted probability
Scikit-learn Wrapper: XGBClassifier.predict( ), which returns class predictions directly

Category: feature importance visualization
Python Wrapper: plot_importance( )
Scikit-learn Wrapper: plot_importance( )

(A side-by-side code sketch follows.)
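A side-by-side sketch of the same train/predict flow in both wrappers (xgboost assumed installed; the dataset choice is arbitrary):

# Side-by-side sketch: the same train/predict flow in both wrappers.
import xgboost as xgb
from xgboost import XGBClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Python wrapper: DMatrix objects plus xgb.train(); predict() returns probabilities.
dtrain = xgb.DMatrix(data=X_train, label=y_train)
dtest = xgb.DMatrix(data=X_test, label=y_test)
booster = xgb.train({'objective': 'binary:logistic'}, dtrain, num_boost_round=100)
probs = booster.predict(dtest)              # probabilities, not class labels
labels_pw = (probs > 0.5).astype(int)       # threshold them yourself

# Scikit-learn wrapper: fit()/predict() on numpy arrays; predict() returns labels.
clf = XGBClassifier(n_estimators=100)
clf.fit(X_train, y_train)
labels_sk = clf.predict(X_test)             # direct class predictions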
Hyperparameters of the Python wrapper and scikit-learn wrapper
9
Python Wrapper | Scikit-Learn Wrapper | Description
eta | learning_rate | Same parameter as GBM's learning_rate: the learning rate applied when updating
weights across boosting iterations. Normally set between 0 and 1. Default is 0.3 in the
Python wrapper and 0.1 in the scikit-learn wrapper.
num_boost_round | n_estimators | Same parameter as n_estimators in scikit-learn ensembles: the number
of weak learners (iteration count).
min_child_weight | min_child_weight | Similar to a decision tree's min_samples_leaf. Used against overfitting.
max_depth | max_depth | Same as max_depth in a decision tree: maximum tree depth.
subsample | subsample | Same parameter as subsample in GBM: sets the sampling fraction to keep trees
from growing too large and overfitting. With subsample=0.5, half of the data is used to build
each tree. Values from 0 to 1 are allowed, but 0.5 to 1 is typical.
10
Python Wrapper | Scikit-Learn Wrapper | Description
lambda | reg_lambda | L2 regularization term. Default is 1. The bigger the value, the stronger the
regularization. Used against overfitting.
alpha | reg_alpha | L1 regularization term. Default is 0. The bigger the value, the stronger the
regularization. Used against overfitting.
colsample_bytree | colsample_bytree | Similar to max_features in GBM: samples the features used to
build each tree. Helps against overfitting when there are many features.
Comparison of the XGBoost Python wrapper and scikit-learn wrapper hyperparameters (a sketch of
equivalent settings in both wrappers follows)
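A sketch showing the same settings expressed in each wrapper's naming; all values are illustrative, not recommendations:

# Sketch: the same hyperparameters in each wrapper's naming (values illustrative).
from xgboost import XGBClassifier

# Python wrapper: parameters go into a dict handed to xgb.train().
params = {
    'eta': 0.1,                # learning_rate in the scikit-learn wrapper
    'max_depth': 5,
    'min_child_weight': 1,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'lambda': 1,               # reg_lambda (L2)
    'alpha': 0,                # reg_alpha (L1)
}
# booster = xgb.train(params, dtrain, num_boost_round=200)  # num_boost_round ~ n_estimators

# Scikit-learn wrapper: the same settings as constructor arguments.
clf = XGBClassifier(learning_rate=0.1, n_estimators=200, max_depth=5,
                    min_child_weight=1, subsample=0.8, colsample_bytree=0.8,
                    reg_lambda=1, reg_alpha=0)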
11
XGBoost Early stopping
XGBoost can stop its iterations before reaching the designated count if the cost is not reduced
during a specified number of consecutive early-stopping rounds.
Used wisely in the hyperparameter tuning process, it can reduce tuning time.
If you set the early-stopping value too small, training can finish before the model is properly optimized.
Main parameters for early stopping (a sketch follows the list)
• early_stopping_rounds : the number of rounds after which training stops if the loss metric has not improved
• eval_metric : the cost evaluation metric
• eval_set : the evaluation dataset used to check whether the cost is decreasing
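A sketch of early stopping with the Python wrapper (the scikit-learn wrapper accepts the same names, though exactly where they are passed varies across xgboost versions):

# Sketch of early stopping with the Python wrapper (xgboost assumed installed).
import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)

booster = xgb.train(
    {'objective': 'binary:logistic', 'eval_metric': 'logloss'},  # eval_metric: the cost metric
    dtrain,
    num_boost_round=400,                # the designated iteration count
    evals=[(dval, 'validation')],       # plays the role of eval_set
    early_stopping_rounds=50,           # stop if logloss hasn't improved for 50 rounds
)
print(booster.best_iteration)           # the round at which the cost was lowest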
12
XGBoost Wrap up
• XGBoost (and LightGBM) are among the most widely used ensemble methods, especially among Kagglers
• It can improve prediction performance compared with GBM, though not by a rocket-boosted margin
• It executes faster than GBM and supports parallel processing across multiple CPU cores
• Hyperparameter tuning is difficult because there are so many parameters, but you don't have to obsess over it, as
drastic performance improvements from tuning are rare in XGBoost
• XGBoost is no golden compass, but it is widely used in many applications, especially classification and regression