2. Ensemble Learning
Ensemble learning is one of the most powerful machine learning techniques.
It combines the output of two or more models (weak learners) to solve a
particular problem. For example, a Random Forest is an ensemble of many
decision trees. Ensemble learning is primarily used to improve model
performance on tasks such as classification, prediction, and function
approximation.
Definition: "An ensemble model is a machine learning model that
combines the predictions from two or more models."
1. Bagging
2. Boosting
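The combining step can be as simple as a majority vote over the individual models' predictions. A minimal sketch (the function name here is illustrative, not from any library):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class predictions from several models by majority vote."""
    # predictions: one predicted label per model, for a single instance
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical models disagree; the ensemble takes the majority class.
print(majority_vote([1, 0, 1]))  # -> 1
```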
OMega TechEd
3. Bagging
Bagging is an ensemble modeling method primarily used for supervised
machine learning problems. It is generally carried out in two steps:
Bootstrapping: a random sampling method that draws samples from the data
with replacement (replacement means an individual data point can be
sampled multiple times). Random samples of the data are drawn, and a base
learning algorithm is trained on each sample to complete the learning
process.
Aggregation: the outputs of all base models are combined and, based on
them, a single aggregate prediction is made with greater accuracy and
reduced variance.
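Bootstrapping can be sketched in a few lines: draw a sample the same size as the data, with replacement, so some points repeat and others are left out. The helper name is illustrative:

```python
import random

def bootstrap_sample(data, rng=random):
    """Draw a bootstrap sample: same size as the data, with replacement."""
    return [rng.choice(data) for _ in range(len(data))]

data = list(range(10))
sample = bootstrap_sample(data)
# With replacement, some points typically appear more than once
# and others are left out of this sample entirely.
print(sorted(sample))
```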
4. Steps of Bagging
1. We start with an initial training dataset containing n instances.
2. We create m subsets of the training set. Each subset contains n sample
points drawn from the initial dataset with replacement. This means that a
specific data point can be sampled more than once.
3. For each subset of data, we train the corresponding weak learner
independently. These models are homogeneous, meaning that they are all of
the same type.
4. Each model makes a prediction.
5. The predictions are aggregated into a single prediction, using either
max voting (for classification) or averaging (for regression).
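The five steps above can be sketched end to end on toy 1-D data. The `ThresholdStump` weak learner and the function names are illustrative inventions for this sketch, not a standard API:

```python
import random
from collections import Counter

class ThresholdStump:
    """Toy weak learner for 1-D data: predicts 1 if x >= a learned threshold."""
    def fit(self, xs, ys):
        ones = [x for x, y in zip(xs, ys) if y == 1]
        zeros = [x for x, y in zip(xs, ys) if y == 0]
        if not ones or not zeros:
            # Degenerate bootstrap sample with one class: always predict it.
            self.threshold = float("-inf") if ys[0] == 1 else float("inf")
        else:
            self.threshold = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
        return self

    def predict(self, x):
        return 1 if x >= self.threshold else 0

def bag(xs, ys, n_models=5, rng=random):
    """Steps 1-3: draw bootstrap subsets, train one weak learner per subset."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sampled with replacement
        models.append(ThresholdStump().fit([xs[i] for i in idx],
                                           [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Steps 4-5: each model predicts, then max voting aggregates."""
    votes = [m.predict(x) for m in models]
    return Counter(votes).most_common(1)[0][0]

xs = [float(i) for i in range(20)]  # toy 1-D data
ys = [0] * 10 + [1] * 10            # class 0 below 10, class 1 from 10 up
models = bag(xs, ys)
print(predict_bagged(models, 20.0))  # -> 1
print(predict_bagged(models, 0.0))   # -> 0
```

In practice a library implementation such as scikit-learn's `BaggingClassifier` would be used instead of hand-rolled stumps; the sketch only mirrors the steps listed above.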
6. Boosting
Boosting is an ensemble method in which each member learns from the
preceding member's mistakes and makes better predictions. Unlike bagging,
in boosting all weak base learners are arranged sequentially, so that each
can learn from the mistakes of its predecessor. In this way, the weak
learners are combined into a strong learner, yielding a predictive model
with significantly improved performance.
7. Steps of Boosting
1. We sample m subsets from the initial training dataset.
2. Using the first subset, we train the first weak learner.
3. We test the trained weak learner on the training data. Some data points
will be predicted incorrectly.
4. Each incorrectly predicted data point is added to the second subset,
updating it so that the next learner focuses on these mistakes.
5. Using this updated subset, we train and test the second weak learner.
6. We continue in this way until the total number of subsets is reached.
7. We now have the overall prediction. The predictions are aggregated at
each step, so no separate calculation is needed.
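The classic boosting algorithm AdaBoost implements the "learn from mistakes" idea above by reweighting misclassified points rather than literally moving them between subsets. A minimal sketch on 1-D data with decision stumps (function names are illustrative; this is a simplified version, not a production implementation):

```python
import math

def stump_fit(xs, ys, w):
    """Find the weighted-error-minimizing stump; labels ys are in {-1, +1}."""
    best = None
    for t in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= t else -polarity for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best  # (weighted error, threshold, polarity)

def stump_predict(t, polarity, x):
    return polarity if x >= t else -polarity

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    w = [1.0 / n] * n           # start with uniform weights
    ensemble = []
    for _ in range(rounds):
        err, t, pol = stump_fit(xs, ys, w)
        err = max(err, 1e-10)   # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, pol))
        # Reweight: misclassified points gain influence in the next round.
        w = [wi * math.exp(-alpha * y * stump_predict(t, pol, x))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Aggregate: sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(t, pol, x) for a, t, pol in ensemble)
    return 1 if score >= 0 else -1

xs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
ys = [-1, -1, -1, 1, 1, 1]
model = adaboost(xs, ys)
print(predict(model, 9.0))  # -> 1
print(predict(model, 2.0))  # -> -1
```

Note the contrast with bagging: the learners are trained sequentially and weighted by `alpha`, rather than trained independently and given equal votes.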
8. Steps of Boosting
[Diagram: the training set is split into Subsets 1-4; a weak learner is trained on each subset in sequence, and their outputs are combined into the overall prediction.]
9. Thank you