Ensemble Learning
Bagging & Boosting
OMega TechEd
Ensemble Learning
Ensemble learning is one of the most powerful machine learning techniques: it combines the outputs of two or more models (weak learners) to solve a given problem. For example, a Random Forest is an ensemble of many decision trees. Ensemble learning is primarily used to improve model performance on tasks such as classification, prediction, and function approximation.
Definition: "An ensemble model is a machine learning model that combines the predictions from two or more models."
Two common ensemble methods are:
1. Bagging
2. Boosting
Bagging
Bagging (bootstrap aggregating) is an ensemble modeling method, primarily used to solve supervised machine learning problems. It is generally completed in two steps:
Bootstrapping: a random sampling method that draws samples from the data with replacement (replacement means an individual data point can be sampled multiple times). Each random sample is fed to a base model, and a base learning algorithm is run on that sample to complete the learning process.
Aggregation: the outputs of all the base models are combined into a single aggregate prediction with greater accuracy and reduced variance.
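As a quick illustration of bootstrapping, a sample with replacement can be drawn with NumPy; the ten-point toy dataset below is an assumption for demonstration only. Note how some points repeat while others are left out:

import numpy as np

rng = np.random.default_rng(seed=0)
data = np.arange(10)                    # toy dataset: 10 data points

# Bootstrap sample: same size as the original, drawn with replacement,
# so an individual point may appear several times (or not at all).
bootstrap_sample = rng.choice(data, size=len(data), replace=True)
print(bootstrap_sample)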
Steps of Bagging
1. We have an initial training dataset containing n instances.
2. We create m subsets of data from the training set. Each subset is drawn from the initial dataset with replacement and contains N sample points, so a specific data point can be sampled more than once.
3. For each subset of data, we train the corresponding weak learner independently. These models are homogeneous, meaning that they are all of the same type.
4. Each model makes a prediction.
5. The predictions are aggregated into a single prediction, using either majority (max) voting for classification or averaging for regression. A minimal sketch of these steps follows.
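The following from-scratch sketch walks through steps 1-5, assuming scikit-learn is available and using shallow decision trees as the homogeneous weak learners; the synthetic dataset and all parameters are illustrative, not prescribed by the slides:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(seed=42)
X, y = make_classification(n_samples=200, n_features=10, random_state=42)  # step 1

m = 25                                        # number of subsets / models
models = []
for _ in range(m):
    idx = rng.integers(0, len(X), size=len(X))        # step 2: bootstrap subset
    tree = DecisionTreeClassifier(max_depth=3)        # step 3: homogeneous weak learner
    models.append(tree.fit(X[idx], y[idx]))

votes = np.stack([mdl.predict(X) for mdl in models])  # step 4: each model predicts
bagged = (votes.mean(axis=0) >= 0.5).astype(int)      # step 5: majority vote
print("training accuracy:", (bagged == y).mean())

In practice, scikit-learn's BaggingClassifier (and RandomForestClassifier) package this same procedure in a single estimator.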
Steps of Bagging
[Diagram: the original dataset is bootstrapped into subsets D1-D4; models M1-M4 are trained on the corresponding subsets, and their outputs are merged into a combined prediction.]
Boosting
Boosting is an ensemble method that enables each member to learn from the preceding member's mistakes and make better predictions. Unlike bagging, boosting arranges all the weak base learners sequentially, so that each one can learn from the errors of its predecessor. In this way, the collection of weak learners is turned into a strong learner, giving a predictive model with significantly improved performance.
Steps of Boosting
1. We sample m subsets from the initial training dataset.
2. Using the first subset, we train the first weak learner.
3. We test the trained weak learner on the training data. As a result of the testing, some data points will be predicted incorrectly.
4. Each wrongly predicted data point is emphasized in the second subset of data, and the subset is updated accordingly.
5. Using this updated subset, we train and test the second weak learner.
6. We continue with the following subsets until the total number of subsets is reached.
7. We now have the overall prediction. It has already been aggregated at each step, so there is no need to compute it separately. A minimal sketch follows this list.
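The sketch below follows the spirit of these steps using AdaBoost-style reweighting (a standard boosting scheme; an assumption here, since the slides do not name one), with decision stumps as the weak learners; the dataset and round count are illustrative:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
y = 2 * y - 1                              # relabel classes to {-1, +1}

n = len(X)
w = np.full(n, 1.0 / n)                    # start with uniform weights
learners, alphas = [], []

for _ in range(20):                        # m = 20 boosting rounds
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y, sample_weight=w)       # steps 2/5: train on weighted data
    pred = stump.predict(X)                # step 3: test on the training data
    err = w[pred != y].sum()               # weighted error of this learner
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))
    w *= np.exp(-alpha * y * pred)         # step 4: upweight the mistakes
    w /= w.sum()
    learners.append(stump)
    alphas.append(alpha)

# Step 7: the overall prediction is the sign of the alpha-weighted sum
F = sum(a * l.predict(X) for a, l in zip(alphas, learners))
print("training accuracy:", (np.sign(F) == y).mean())

scikit-learn's AdaBoostClassifier implements this reweighting scheme directly.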
Steps of Boosting
[Diagram: the training set is sampled into subsets 1-4; a weak learner is trained on each subset in sequence, and the results are combined into the overall prediction.]
Thank you
