zekeLabs
Ensemble Methods
Learning made Simpler !
www.zekeLabs.com
Agenda
● Introduction to Ensemble Methods
● Families of ensemble methods
● Random Forest
● AdaBoost
● Gradient Tree Boosting
● Voting Classifier
● XGBoost
Introduction to Ensemble Methods
● Ensemble methods are techniques that create multiple models and then
combine them to produce improved results.
● Usually boosts accuracy compared to the base models themselves
● Very popular in competitions
Families of ensemble methods
● Bagging
● Boosting
● Voting
Bagging
● Building multiple models (typically of the same type) from different
subsamples of the training dataset.
● Picks multiple samples from the same training dataset; the number of samples is
configured via n_estimators
● Trains one model for each picked sample.
● Final prediction is a function of the predictions of all models (see the sketch below).
● Examples: BaggingClassifier/Regressor, RandomForestClassifier/Regressor, ExtraTreesClassifier/Regressor
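A minimal bagging sketch with scikit-learn's BaggingClassifier; the synthetic dataset, train/test split, and hyperparameter values are illustrative assumptions, not part of the slides:

```python
# Minimal bagging sketch: n_estimators bootstrap samples, one tree per sample.
# (Illustrative settings; scikit-learn >= 1.2 uses `estimator`,
# older versions use `base_estimator`.)
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # base model trained on each subsample
    n_estimators=50,                     # number of bootstrap samples / models
    random_state=42,
)
bagging.fit(X_train, y_train)
# Final prediction is a vote over the 50 trees.
print("Bagging accuracy:", bagging.score(X_test, y_test))
```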
RandomForest
● Used for both classification & regression
● Samples of the training dataset are taken with replacement
● Models are trained using the subsamples
● Final result is a function of the results of all participating models
● Reduces the variance of the base learning method
● Usually the base algorithm is a Decision Tree
● At each split only a random subset of features is considered, which reduces correlation between trees (see the sketch below)
RandomForest
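A minimal RandomForest sketch in scikit-learn; the synthetic data and hyperparameters below are illustrative assumptions. max_features limits how many features each split may consider, which is what decorrelates the trees:

```python
# Minimal RandomForest sketch (illustrative data and hyperparameters).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # number of bootstrap samples / trees
    max_features="sqrt",   # random feature subset per split -> less correlated trees
    random_state=0,
)
# Averaging many low-correlation trees reduces the variance of a single tree.
print("Mean CV accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```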
Boosting
● Building multiple models (typically of the same type), each of which learns to
fix the prediction errors of a prior model in the chain.
● Creates a strong predictor using weak learners
● This is done by building a model from the training data, then creating a
second model that attempts to correct the errors of the first model (toy illustration below).
● Models are added until the training set is predicted perfectly or a maximum
number of models is reached.
● Examples: AdaBoost, GradientBoostingTree, XGBoost
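A toy two-step illustration of the error-correcting chain on an assumed synthetic regression problem; real boosting libraries additionally reweight or shrink each stage:

```python
# Toy illustration of boosting: the second tree is fit to the residual
# errors of the first, and their predictions are summed.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

first = DecisionTreeRegressor(max_depth=2).fit(X, y)
residuals = y - first.predict(X)                    # errors of the first model
second = DecisionTreeRegressor(max_depth=2).fit(X, residuals)

combined = first.predict(X) + second.predict(X)     # chain of two weak learners
print("MSE first tree only:", np.mean((y - first.predict(X)) ** 2))
print("MSE first + second :", np.mean((y - combined) ** 2))
```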
AdaBoost
● Suited for binary classification
● Steps are as follows (see the sketch below):
- Train a weak learner on the training data
- Increase the weights of misclassified data points
- Data points with increased weights have a higher chance of being picked for the next
model's training
- The final prediction is a function of all the participating models
AdaBoost
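A minimal AdaBoost sketch with scikit-learn; the decision-stump weak learner and other settings are illustrative assumptions (older scikit-learn versions use base_estimator instead of estimator):

```python
# Minimal AdaBoost sketch: weak learners are refit with higher weight on
# previously misclassified points at each round (illustrative settings).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # classic weak learner: a stump
    n_estimators=100,
    random_state=1,
)
ada.fit(X_train, y_train)
# Final prediction is a weighted vote of all 100 weak learners.
print("AdaBoost accuracy:", ada.score(X_test, y_test))
```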
Gradient Boosting Tree
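A minimal gradient tree boosting sketch using scikit-learn's GradientBoostingClassifier; the data and hyperparameters are illustrative assumptions:

```python
# Minimal gradient tree boosting sketch: each new tree fits the gradient of
# the loss of the current ensemble (illustrative settings).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

gbt = GradientBoostingClassifier(
    n_estimators=100,    # number of boosting stages (trees)
    learning_rate=0.1,   # shrinks each tree's contribution
    max_depth=3,
)
gbt.fit(X_train, y_train)
print("GBT accuracy:", gbt.score(X_test, y_test))
```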
XGBoost
● Advanced implementation of the Gradient Boosting algorithm
● Regularized boosting to prevent overfitting
● In-built mechanism for handling missing values
● Can continue training from an already trained model
● In-built cross-validation (see the sketch below)
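A minimal XGBoost sketch; it assumes the xgboost package is installed and uses illustrative parameter values. reg_lambda is one of the regularization knobs mentioned above, NaNs in the feature matrix are handled natively by the tree learner, and xgb.cv / the xgb_model argument to fit cover the built-in cross-validation and continued training:

```python
# Minimal XGBoost sketch (assumes the xgboost package; illustrative values).
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

clf = xgb.XGBClassifier(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=4,
    reg_lambda=1.0,   # L2 regularization on leaf weights (regularized boosting)
)
clf.fit(X_train, y_train)   # missing values in X are routed by a learned default direction
print("XGBoost accuracy:", clf.score(X_test, y_test))
```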
Voting
● Building multiple models (typically of differing types) and using a simple majority or
weighted majority as the prediction
● Participating learning algorithms can be SVM, K-Nearest Neighbors classifiers, Logistic
Regression, or bagging/boosting methods.
● Here the participating learners should be strong.
● Weights can be assigned to the different algorithms (see the sketch below).
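A minimal VotingClassifier sketch combining a few strong learners; the estimator choices, soft voting, and weights are illustrative assumptions:

```python
# Minimal voting sketch: heterogeneous strong learners combined by
# (weighted) probability averaging; "hard" voting would use majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=4)

voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=4)),
        ("svm", SVC(probability=True, random_state=4)),
    ],
    voting="soft",        # average predicted probabilities
    weights=[1, 2, 1],    # optional per-algorithm weights
)
print("Voting CV accuracy:", cross_val_score(voter, X, y, cv=5).mean())
```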
Application
Thank You !!!
Visit : www.zekeLabs.com for more details
THANK YOU
Let us know how we can help your organization upskill its
employees to stay updated in the ever-evolving IT industry.
Get in touch:
www.zekeLabs.com | +91-8095465880 | info@zekeLabs.com
