zekeLabs
Ensemble Methods
Learning made Simpler !
www.zekeLabs.com
Agenda
● Introduction to Ensemble Methods
● Families of ensemble methods
● Random Forest
● AdaBoost
● Gradient Tree Boosting
● Voting Classifier
● XGBoost
Introduction to Ensemble Methods
● Ensemble methods are techniques that create multiple models and then
combine them to produce improved results.
● Usually improves accuracy compared to the base models themselves
● Very popular in competitions
Families of ensemble methods
Bagging Boosting Voting
Types of Ensemble Methods
Bagging
● Building multiple models (typically of the same type) from different
subsamples of the training dataset.
● Picks multiple samples from the same training dataset; the number of samples is
configured via n_estimators
● Trains one model on each picked sample.
● Final prediction is a function of the predictions of all models.
BaggingClassifier/Regressor RandomForestClassifier/Regressor ExtraTreesClassifier/Regressor
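A minimal sketch of bagging with scikit-learn on synthetic data; the base estimator defaults to a decision tree, and the parameters below are illustrative assumptions:

# Bagging: bootstrap samples of the training set, one model per sample,
# predictions combined across all models.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# n_estimators controls how many bootstrap samples (and models) are built
bagging = BaggingClassifier(n_estimators=50, random_state=42)
bagging.fit(X_train, y_train)
print("Bagging accuracy:", bagging.score(X_test, y_test))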
RandomForest
● Used for both classification & regression
● Samples of the training dataset are taken with replacement
● Models are trained using the subsamples
● Final result is a function of the results of all participating models
● Reduces the variance of the base learning method
● Usually the base algorithm is a Decision Tree
● A random subset of features is considered at each split, which reduces correlation between trees
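A minimal RandomForestClassifier sketch on synthetic data; hyperparameters are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_features limits the features considered at each split,
# which de-correlates the individual trees.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)
print("Random forest accuracy:", forest.score(X_test, y_test))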
RandomForest
Boosting
● Building multiple models (typically of the same type) each of which learns to
fix the prediction errors of a prior model in the chain.
● Creates a strong predictor from weak learners
● This is done by building a model from the training data, then creating a
second model that attempts to correct the errors from the first model.
● Models are added until the training set is predicted perfectly or a maximum
number of models is reached.
AdaBoost GradientBoostingTree XGBoost
AdaBoost
● Originally designed for binary (two-class) classification
● Steps are as follows
- Train weak learner on training data
- Increase the weights of misclassified data points
- Higher-weight data points have a higher chance of being picked for the next model's
training
- Final prediction is a function of all the participating models
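A hedged AdaBoostClassifier sketch on synthetic data (the base learner defaults to a one-level decision stump; parameters are illustrative assumptions):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each new weak learner focuses on the examples the previous ones misclassified.
ada = AdaBoostClassifier(n_estimators=100, learning_rate=0.5, random_state=0)
ada.fit(X_train, y_train)
print("AdaBoost accuracy:", ada.score(X_test, y_test))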
AdaBoost
Gradient Boosting Tree
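A short GradientBoostingClassifier sketch on synthetic data; each new tree fits the residual errors (loss gradient) of the current ensemble. Hyperparameters are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# learning_rate shrinks each tree's contribution; more trees compensate.
gbt = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbt.fit(X_train, y_train)
print("Gradient boosting accuracy:", gbt.score(X_test, y_test))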
XGBoost
● Advanced implementation of Gradient Boosting Algorithm
● Regularized Boosting to prevent overfitting
● In-built mechanism of handling missing values
● Can resume training from an already trained model
● In-built cross validation
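A sketch using the xgboost package's scikit-learn wrapper (assuming xgboost is installed; parameters are illustrative assumptions):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X[::50, 0] = np.nan  # XGBoost learns a default split direction for missing values
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(
    n_estimators=200,
    learning_rate=0.1,
    reg_lambda=1.0,  # built-in L2 regularization on leaf weights
)
model.fit(X_train, y_train)
print("XGBoost accuracy:", model.score(X_test, y_test))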
Voting
● Building multiple models (typically of differing types) and using a simple or
weighted majority vote as the prediction
● Participating learning algorithms can be SVM, k-nearest neighbors, logistic
regression, or bagging/boosting methods.
● Participating learners should individually be strong.
● Weights can be assigned to different algorithms.
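A soft-voting sketch combining logistic regression, k-nearest neighbors, and a random forest; weights and parameters are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",      # average predicted probabilities instead of hard votes
    weights=[1, 1, 2],  # weighted majority: the forest counts double
)
voter.fit(X_train, y_train)
print("Voting ensemble accuracy:", voter.score(X_test, y_test))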
Application
Thank You !!!
Visit : www.zekeLabs.com for more details
THANK YOU
Let us know how can we help your organization to Upskill the
employees to stay updated in the ever-evolving IT Industry.
Get in touch:
www.zekeLabs.com | +91-8095465880 | info@zekeLabs.com
