Boosting
Why is boosting used?
Given a dataset of cat and dog images, you are asked to build a model
that classifies these images into two separate classes. Consider the
following rules:
Rule 1: If the image has pointy ears, then it is a cat.
Rule 2: If the image has cat-shaped eyes, then it is a cat.
Rule 3: If the image has bigger limbs, then it is a dog.
Rule 4: If the image has sharp claws, then it is a cat.
Rule 5: If the image has a wider mouth structure, then it is a dog.
• Using just one of these rules to classify an image does not make sense.
• For example: given a cat image as input, we need to classify it as cat
or dog.
• Suppose the cat is of a breed that has bigger limbs.
• Rule 3 sees the bigger limbs and classifies the image as a dog.
• So each rule, applied individually to an image, will not give you an
accurate result.
• All the rules need to be applied to the test image before predicting
the output.
• Each of these rules is individually called a weak learner, because on
its own it is not strong enough to classify an image as cat or dog.
• A prediction based on a single rule will often be wrong.
• You cannot take just one feature into consideration to classify the
images (cat/dog).
• So, to make the predictions more accurate, we combine the predictions
from these weak learners using a majority vote or a weighted average,
as in the sketch below. The combined model is a strong learner.
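As a minimal sketch (the feature record and the exact rule encodings below are invented for illustration), each rule can be written as a weak classifier and their votes combined by majority:

```python
from collections import Counter

# Hypothetical feature record for one image; the fields mirror the
# five illustrative rules above.
image = {"pointy_ears": True, "cat_shaped_eyes": True,
         "big_limbs": True, "sharp_claws": True, "wide_mouth": False}

# Each weak learner implements one rule and votes "cat" or "dog".
weak_learners = [
    lambda x: "cat" if x["pointy_ears"] else "dog",      # Rule 1
    lambda x: "cat" if x["cat_shaped_eyes"] else "dog",  # Rule 2
    lambda x: "dog" if x["big_limbs"] else "cat",        # Rule 3
    lambda x: "cat" if x["sharp_claws"] else "dog",      # Rule 4
    lambda x: "dog" if x["wide_mouth"] else "cat",       # Rule 5
]

# Strong learner: majority vote over all five weak predictions.
votes = [learner(image) for learner in weak_learners]
prediction = Counter(votes).most_common(1)[0][0]
print(votes, "->", prediction)  # ['cat', 'cat', 'dog', 'cat', 'cat'] -> cat
```

Note how Rule 3 alone would misclassify this big-limbed cat, but the majority vote still gets the right answer.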
[Figure: a training image and a test image; all five weak learners vote on the test image, and the majority vote predicts that the image is a cat.]
What is Boosting?
• Boosting is an ensemble learning technique that uses a set of
Machine Learning algorithms to combine weak learners into a strong
learner, in order to increase the accuracy of the model.
• Boosting is a sequential ensemble method: the weak learners are
produced one after another during the training phase.
• The performance of the model is improved by assigning higher
weightage to the samples the previous learner misclassified.
• The entire dataset is fed into the first learner, which makes predictions.
• The learner misclassifies some data points.
• More attention is then paid to those misclassified data points.
• Boosting thus proceeds sequentially, updating the sample weights
depending on which samples were misclassified; the sketch below shows
this in practice.
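As a quick illustration of this sequential training (the synthetic dataset and hyperparameters are invented for this sketch), scikit-learn's AdaBoostClassifier grows an ensemble of weak learners, reweighting samples after each round:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic two-class data standing in for the cat/dog features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 weak learners trained sequentially; by default each is a depth-1
# decision stump, and each round upweights the samples the previous
# rounds misclassified.
model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```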
How does the boosting algorithm work?
• The basic principle behind the boosting algorithm is to generate
multiple weak learners and combine their predictions to form one
strong rule.
[Figure: decision stumps over several iterations, with misclassified data points highlighted.]
Explanation
• Step 1: The base algorithm reads the data and assigns equal weights to
all the data points.
• It then analyzes the data and tries to draw a decision stump.
• “A decision stump is a single-level decision tree that tries to classify
the data points.”
• The base learner then checks for false predictions.
• In the first image, two squares are classified correctly, but the other
three squares fall on the wrong side of the stump, meaning they are
misclassified.
• Step 2: Assign higher weightage to these misclassified samples.
• In the second image, the three misclassified squares carry higher
weight, shown by their increased size.
• The higher weight ensures these samples are classified correctly in the
next iteration.
• Step 3: This process is repeated until class A is separated from
class B; the sketch below implements these weight updates from scratch.
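A minimal from-scratch sketch of Steps 1-3, using AdaBoost-style weight updates (labels are ±1, the toy data is invented, and the stump search is kept deliberately simple):

```python
import numpy as np

def train_stump(X, y, w):
    """Find the best single-feature threshold split under sample weights w."""
    best = None
    for j in range(X.shape[1]):              # each feature
        for thr in np.unique(X[:, j]):       # each candidate threshold
            for sign in (1, -1):             # each polarity
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = w[pred != y].sum()     # weighted error
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost(X, y, rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)                  # Step 1: equal weights for all points
    stumps = []
    for _ in range(rounds):
        err, j, thr, sign = train_stump(X, y, w)
        err = max(err, 1e-10)                # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # this stump's say in the vote
        pred = sign * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)       # Step 2: raise the weight of
        w /= w.sum()                         #         misclassified samples
        stumps.append((alpha, j, thr, sign)) # Step 3: repeat with new weights
    return stumps

def predict(stumps, X):
    # Weighted vote of all stumps; the sign of the sum is the class.
    score = sum(a * s * np.where(X[:, j] <= t, 1, -1)
                for a, j, t, s in stumps)
    return np.sign(score)

# Toy two-feature data with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
stumps = adaboost(X, y, rounds=10)
print("training accuracy:", np.mean(predict(stumps, X) == y))
```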
Flow Diagram of Boosting
Types of Boosting
AdaBoost
Gradient Boosting
XGBoost
LightGBM
CatBoost
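For reference, a sketch of how each variant is typically instantiated (this assumes the scikit-learn, xgboost, lightgbm, and catboost packages are installed; all hyperparameters are left at their defaults):

```python
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

models = {
    "AdaBoost":          AdaBoostClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(),
    "XGBoost":           XGBClassifier(),
    "LightGBM":          LGBMClassifier(),
    "CatBoost":          CatBoostClassifier(verbose=0),  # silence per-iteration logs
}
# Every model exposes the same fit/predict interface, e.g.:
#   models["XGBoost"].fit(X_train, y_train)
#   models["XGBoost"].predict(X_test)
```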
Adaptive Boosting (AdaBoost)
