EASWARI ENGINEERING COLLEGE
(AUTONOMOUS)
RAMAPURAM, CHENNAI – 600 089
BACHELOR OF ENGINEERING in COMPUTER SCIENCE AND ENGINEERING
191CSC701T – Data Science
Group 1:
Raghul V – 3106201040105
Prakash S – 310620104098
Pramadeish SM – 310620104099
Prithvi S – 310620104102
Joel Thomas Joe – 310620104064
Nithish S – 310620104094
Ensemble Methods
AGENDA
• INTRODUCTION
• CATEGORIES OF ENSEMBLE METHODS
• MAIN TYPES OF ENSEMBLE METHODS
• HOW THESE TYPES WORK
• ADVANTAGES AND DISADVANTAGES OF USING ENSEMBLE METHODS
INTRODUCTION
Why do we use ensemble methods?
o Ensemble learning improves machine learning results by combining several models.
o This approach produces better predictive performance than any single constituent model.
o Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), decrease bias (boosting), or improve predictions (stacking).
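As a first illustration, here is a minimal sketch (assuming scikit-learn and a synthetic dataset) of the core idea: combining several models can beat any single one of them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A single model as the baseline.
single = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Three different learners combined; a majority vote decides each prediction.
ensemble = VotingClassifier(estimators=[
    ("tree", DecisionTreeClassifier(random_state=42)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
]).fit(X_train, y_train)

print("single tree:", single.score(X_test, y_test))
print("ensemble  :", ensemble.score(X_test, y_test))
```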
CATEGORIES OF ENSEMBLE METHODS
o Sequential ensemble techniques generate base learners in a sequence, e.g., Adaptive Boosting (AdaBoost).
o Sequential generation makes the base learners depend on one another. Model performance then improves by assigning higher weights to the training examples that earlier learners misclassified.
o In parallel ensemble techniques, base learners are generated in parallel, e.g., random forest.
o These methods rely on the parallel generation of base learners to encourage independence between them. Because the learners' errors are largely independent, averaging their predictions significantly reduces the overall error.
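The two categories map directly onto familiar library classes. A minimal sketch (assuming scikit-learn) comparing one sequential ensemble (AdaBoost) and one parallel ensemble (random forest) on the same synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)

# Sequential: each tree is fit after the last, on a reweighted view of
# the data that emphasizes previously misclassified examples.
sequential = AdaBoostClassifier(n_estimators=50, random_state=0)

# Parallel: trees are grown independently on bootstrap samples, so their
# errors are largely uncorrelated and averaging reduces variance.
parallel = RandomForestClassifier(n_estimators=50, random_state=0)

for name, model in [("AdaBoost", sequential), ("Random forest", parallel)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```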
Main Types of Ensemble Methods
Bagging
• Bagging, short for bootstrap aggregating, is mainly applied in classification and regression.
• It increases model accuracy, typically using decision trees as base learners, by reducing variance to a large extent. The reduction in variance increases accuracy and curbs overfitting, which is a challenge for many predictive models.
Boosting
• Boosting is an ensemble technique that learns from previous predictors' mistakes to make better predictions in the future.
• The technique combines several weak base learners to form one strong learner, significantly improving the predictive performance of models.
Stacking
• Stacking, another ensemble method, is often referred to as stacked generalization.
• This technique works by training a meta-learner to combine the predictions of several other learning algorithms.
How Bagging Works?
• Consider a dataset D with many rows and columns.
• Consider base learners (models) M1, M2, …, Mn to be trained on D.
• Each model receives its own sample of D: D′, D′′, and so on.
• If D has n records, each sample also contains n records, drawn by row sampling with replacement; model 1 is trained on the first such sample.
• The next model is given its own sample, again drawn with replacement.
• For example, if model M1 receives records (A, B), model M2 might receive (B, C); record B is repeated across samples.
• After training, new test data is passed to every model for prediction.
• Below, we consider this method for a binary classifier, starting with a small sampling sketch.
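The row sampling with replacement described above can be written in a few lines. A minimal sketch using NumPy (the record labels A–E are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
records = np.array(["A", "B", "C", "D", "E"])  # dataset D with n = 5 rows

# Each base learner gets its own bootstrap sample: n rows drawn from D
# with replacement, so some rows repeat and others are left out.
for i in range(3):
    sample = rng.choice(records, size=len(records), replace=True)
    print(f"sample for model M{i + 1}:", sample)
```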
• Now suppose we pass new test data through the trained models.
• Each model outputs 1 or 0, since we are considering a binary classifier.
• A voting classifier then takes the majority value (here, 1) as the final output.
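A minimal sketch (assuming scikit-learn, whose BaggingClassifier defaults to decision-tree base learners) showing the individual 0/1 votes and the majority output:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=300, random_state=1)

# bootstrap=True (the default) gives each tree its own row sample
# drawn with replacement, as described above.
bag = BaggingClassifier(n_estimators=5, random_state=1).fit(X, y)

x_new = X[:1]                                    # one new test record
votes = [m.predict(x_new)[0] for m in bag.estimators_]
print("individual votes:", votes)                # five 0/1 predictions
print("majority output :", bag.predict(x_new)[0])
```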
How Boosting Works?
• Consider a dataset with records.
• Consider base learners (models) M1, M2, …, Mn.
• Records are passed to a base learner, and the model is trained on them.
• After training, we pass the records through the model again to see how well it performed, i.e., which records it classified correctly.
• The records are passed to model M1; suppose two of them (marked in red in the slide's figure) are classified incorrectly. The next model, M2, is then created sequentially, and only those two records are passed on to it.
• If M2 also gets some records wrong, its errors are passed on to M3, and so on.
• This continues until the specified number of learners is reached (or the errors are corrected).
• In this way, the boosting technique turns weak learners into a strong learner, as the sketch below shows.
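A minimal sketch (assuming scikit-learn) of this sequential scheme. AdaBoost builds its learners one after another, upweighting the records the previous learners misclassified; the library's default base learner is a depth-1 decision tree, i.e., a weak learner:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=400, random_state=2)

boost = AdaBoostClassifier(n_estimators=10, random_state=2).fit(X, y)

# staged_score reports training accuracy as each weak learner is added,
# showing the sequence turning into a strong learner overall.
for i, score in enumerate(boost.staged_score(X, y), start=1):
    print(f"after {i} learners: {score:.3f}")
```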
How Stacking Works?
• Stacking uses a heterogeneous set of models (strong and weak learners together), whereas the other methods use a homogeneous set (all strong or all weak learners).
• A second-level model, the meta-model, combines the base learners' predictions.
How stacking works with a meta-model:
• Say we have 100 records of training data.
• Base models trained on 80% of the data make predictions on the remaining 20%, and those predictions become the meta-model's training input.
• Here the base learners we take as a group are:
• Logistic regression
• SVM
• Neural networks
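A minimal sketch (assuming scikit-learn) stacking exactly this heterogeneous group, with logistic regression reused as the meta-model:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=3)

stack = StackingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC()),
        ("nn", MLPClassifier(max_iter=1000, random_state=3)),
    ],
    final_estimator=LogisticRegression(),  # the meta-model
    cv=5,  # the meta-model trains on fold-wise base-model predictions
).fit(X, y)

print("stacked accuracy:", stack.score(X, y))
```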
• Instead of a single split, we can take a k-fold approach on the training data (typically around 75% of the records), creating k buckets.
• Each base model is trained on k−1 buckets and predicts on the one held-out bucket; repeating this over all buckets gives the out-of-fold predictions on which the meta-model is trained, as sketched below.
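A minimal sketch of the k-fold bucket idea, built by hand with cross_val_predict (assuming scikit-learn and two illustrative base models):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=4)
base_models = [LogisticRegression(max_iter=1000), SVC()]

# Column i holds model i's out-of-fold predictions: each value comes
# from a model trained on the other k-1 buckets of a 5-fold split.
meta_features = np.column_stack(
    [cross_val_predict(m, X, y, cv=5) for m in base_models]
)

# The meta-model is trained on the base models' predictions, not on X.
meta_model = LogisticRegression().fit(meta_features, y)
print("meta-model accuracy:", meta_model.score(meta_features, y))
```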
Advantages and Disadvantages of Using Ensemble Methods
Advantages of Ensemble Methods:
• Improved Predictive Performance
• Reduction of Overfitting
• Robustness to Noisy Data
• Handles Different Data Types
• Versatility in Model Selection
• Increased Generalization
• Flexibility in Model Combination
Disadvantages of Ensemble Methods:
• Increased Complexity
• Computationally Intensive
• Longer Training Times
• Difficulty in Interpretation
• Decreased Transparency
• Possibility of Overfitting
• Reduced Intuitiveness
REFERENCES
https://corporatefinanceinstitute.com/resources/data-science/ensemble-methods/
https://machinelearningmastery.com/tour-of-ensemble-learning-algorithms/
https://en.wikipedia.org/wiki/Ensemble_learning
https://www.analyticsvidhya.com/blog/2023/01/ensemble-learning-methods-bagging-boosting-and-stacking/
THANK YOU
