Gradient Boosting in Practice:
XGBoost and LightGBM
Gabriel Cypriano
source: http://blog.kaggle.com/2017/01/23/a-kaggle-master-explains-gradient-boosting
Decision Trees review
source: https://intelligentjava.wordpress.com/2015/04/28/machine-learning-decision-tree
Overfitting review
source: https://en.wikipedia.org/wiki/Overfitting
Overfitting with Decision Trees
Gradient Boosting
source: https://blog.bigml.com/2017/03/14/introduction-to-boosted-trees
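For reference, the standard stage-wise formulation behind this slide (not from the talk itself); here eta is the learning_rate that appears later in the tuning slide:

F_0(x) = \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma)
r_{im} = -\left[ \frac{\partial L(y_i, F(x_i))}{\partial F(x_i)} \right]_{F = F_{m-1}}
F_m(x) = F_{m-1}(x) + \eta \, h_m(x), \quad \text{where the tree } h_m \text{ is fit to the residuals } r_{im}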
Regularization review
source: https://www.r-bloggers.com/an-attempt-to-understand-boosting-algorithms
Regularization review
Ridge (L2) Lasso (L1)
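The Ridge/Lasso review maps directly onto XGBoost's regularized objective (as given in the XGBoost paper and docs): lambda is reg_lambda (L2/Ridge), alpha is reg_alpha (L1/Lasso), and gamma penalizes the number of leaves T:

\text{Obj} = \sum_i l(y_i, \hat{y}_i) + \sum_k \Omega(f_k),
\qquad \Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2 + \alpha \sum_{j=1}^{T} |w_j|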
XGBoost
Feature Engineering (or lack thereof)
● OK with outliers
● OK with non-standardized features
● OK with collinear features
● OK with NaN values
Feature Engineering (or lack thereof)
● Got NaN’s?
○ set them to -999
○ set missing=-999
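A minimal sketch of both options on toy data (the data and sentinel value here are illustrative):

import numpy as np
from xgboost import XGBClassifier

X = np.array([[1.0, np.nan], [2.0, 3.0], [np.nan, 0.5], [4.0, 1.0]])
y = np.array([0, 1, 0, 1])

# Option 1: leave the NaNs in; XGBoost learns a default branch for them.
clf = XGBClassifier(n_estimators=10)
clf.fit(X, y)

# Option 2: encode missing values as a sentinel and declare it via `missing`.
X_filled = np.where(np.isnan(X), -999, X)
clf = XGBClassifier(n_estimators=10, missing=-999)
clf.fit(X_filled, y)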
XGBoost Parameter Tuning
● n_estimators
● max_depth
● learning_rate
● reg_lambda
● reg_alpha
● subsample
● colsample_bytree
● gamma
yes, it’s combinatorial
XGBoost Parameter Tuning
RandomizedSearchCV and GridSearchCV to the rescue.
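A sketch of RandomizedSearchCV over the parameters listed above; the distributions are illustrative choices, not recommendations from the talk:

from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

param_dist = {
    'n_estimators': randint(100, 1000),
    'max_depth': randint(3, 10),
    'learning_rate': uniform(0.01, 0.3),
    'reg_lambda': uniform(0.0, 10.0),
    'reg_alpha': uniform(0.0, 1.0),
    'subsample': uniform(0.5, 0.5),        # samples from [0.5, 1.0]
    'colsample_bytree': uniform(0.5, 0.5),
    'gamma': uniform(0.0, 5.0),
}
search = RandomizedSearchCV(XGBClassifier(), param_dist,
                            n_iter=50, cv=3, scoring='roc_auc')
# search.fit(X, y)  # X, y: your training data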
XGBoost Parameter Tuning
How not to do grid search: one exhaustive grid over every parameter (3 * 2 * 15 * 3 = 270 models).
XGBoost Parameter Tuning
A better way: fix some parameters and search a few at a time (1 * 1 * 3 * 5 * 3 = 45 models).
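One way to arrive at those 45 candidates; which parameters get which values is an assumption here, since the slide only gives the counts:

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# 1 * 1 * 3 * 5 * 3 = 45 models: n_estimators and learning_rate fixed,
# three values of max_depth, five of reg_lambda, three of subsample.
param_grid = {
    'max_depth': [3, 5, 7],
    'reg_lambda': [0.0, 0.1, 1.0, 10.0, 100.0],
    'subsample': [0.6, 0.8, 1.0],
}
grid = GridSearchCV(XGBClassifier(n_estimators=200, learning_rate=0.1),
                    param_grid, cv=3, scoring='roc_auc')
# grid.fit(X, y)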
Evaluation Metric
XGBoost Ensembles
VotingClassifier with voting='soft' for combining multiple XGBoost models and optimizing for multiple metrics.
metric      ensemble   xgb_auc   xgb_precision   xgb_log_loss
AUC         0.84       0.84      0.76            0.84
Precision   0.71       0.44      0.94            0.71
Log Loss    0.42       0.49      0.48            0.38
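A sketch of that soft-voting ensemble; the three base models stand in for copies tuned separately for AUC, precision, and log loss (the hyperparameters here are placeholders):

from sklearn.ensemble import VotingClassifier
from xgboost import XGBClassifier

xgb_auc = XGBClassifier(max_depth=5, learning_rate=0.1)
xgb_precision = XGBClassifier(max_depth=3, learning_rate=0.05)
xgb_log_loss = XGBClassifier(max_depth=7, learning_rate=0.1)

ensemble = VotingClassifier(
    estimators=[('auc', xgb_auc),
                ('precision', xgb_precision),
                ('log_loss', xgb_log_loss)],
    voting='soft',  # average predicted class probabilities
)
# ensemble.fit(X, y)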
LightGBM
LightGBM and its advantages
● OK with NaN values
● OK with categorical features
● Faster training than XGBoost
● Often better results
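A minimal sketch of the first two points, on made-up data: NaNs and pandas category columns go in as-is, with no imputation or one-hot encoding:

import numpy as np
import pandas as pd
import lightgbm as lgb

df = pd.DataFrame({
    'age': [25.0, np.nan, 40.0, 33.0],
    'city': pd.Categorical(['SP', 'RJ', 'SP', 'BH']),
})
y = [0, 1, 0, 1]

clf = lgb.LGBMClassifier(n_estimators=10, min_child_samples=1)
clf.fit(df, y)  # the category dtype is detected automatically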
Trees: Categorical features vs One-Hot-Encoded features
source: https://medium.com/data-design/visiting-categorical-features-and-encoding-in-decision-trees-53400fa65931
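The gist of the figure, in code: one-hot encoding explodes a categorical column into binary columns that a tree can only split one level at a time, while a native categorical column lets LightGBM split on sets of levels:

import pandas as pd

s = pd.Series(['SP', 'RJ', 'SP', 'BH'], name='city')

one_hot = pd.get_dummies(s)     # three 0/1 columns, one split per level
native = s.astype('category')   # one column; LightGBM can split on level subsets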
Tree Growth Strategies
XGBoost: level-wise (depth-wise) growth
LightGBM: leaf-wise (best-first) growth
source: https://www.analyticsvidhya.com/blog/2017/06/which-algorithm-takes-the-crown-light-gbm-vs-xgboost
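The parameters that express each strategy (the values here are illustrative defaults):

from xgboost import XGBClassifier
import lightgbm as lgb

# XGBoost grows level-wise, so tree size is bounded by depth.
xgb_model = XGBClassifier(max_depth=6)

# LightGBM grows leaf-wise, so tree size is bounded by the leaf count;
# capping max_depth as well helps keep leaf-wise trees from overfitting.
lgb_model = lgb.LGBMClassifier(num_leaves=31, max_depth=-1)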
The good
● Often yields good results
● Reduced need for feature engineering
● Fast to train a single model
● Good choice if you only have one shot at the problem
● GPU support
● Scikit-learn API
● Great for ensembling and optimizing for multiple metrics
The bad
● Too many parameters
● Slow to tune parameters
● GPU config can be tough (try Docker)
● No GPU support in the scikit-learn API (XGBoost)
Thanks!
gabrielcs.me
vagas.creditas.com.br
somostera.com
