Ensemble Methods in Machine Learning
1. EASWARI ENGINEERING COLLEGE
(AUTONOMOUS)
RAMAPURAM, CHENNAI – 600 089
BACHELOR OF ENGINEERING in COMPUTER SCIENCE AND ENGINEERING
191CSC701T – Data Science
Group 1:
Raghul V – 3106201040105
Prakash S – 310620104098
Pramadeish SM – 310620104099
Prithvi S – 310620104102
Joel Thomas Joe – 310620104064
Nithish S – 310620104094
Ensemble Methods
2. AGENDA
• INTRODUCTION
• CATEGORIES OF ENSEMBLE METHODS
• MAIN TYPES OF ENSEMBLE METHODS
• HOW THESE TYPES WORK
• ADVANTAGES AND DISADVANTAGES OF USING ENSEMBLE METHODS
3. INTRODUCTION
Why do we use ensemble methods?
o Ensemble learning improves machine learning results by combining several models.
o This approach produces better predictive performance than any single constituent model.
o Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), decrease bias (boosting), or improve predictions (stacking).
4. CATEGORIES OF ENSEMBLE METHODS
o Sequential ensemble techniques generate base learners in sequence, e.g., Adaptive Boosting (AdaBoost).
o Sequential generation exploits the dependence between base learners: performance improves because each new learner assigns higher weights to the examples its predecessors misclassified.
o In parallel ensemble techniques, base learners are generated in parallel, e.g., random forest.
o Parallel methods exploit the independence between base learners: averaging many independently trained learners significantly reduces the error.
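The claim that averaging independent learners reduces error can be checked numerically. Below is a minimal pure-Python sketch in which a hypothetical noisy predictor stands in for a trained base learner; averaging 25 independent copies shrinks the prediction variance by roughly the factor 25 that theory predicts for independent errors:

```python
import random
import statistics

random.seed(0)

# Each "base learner" predicts the true value 10.0 plus independent noise.
def noisy_prediction(true_value=10.0, noise=2.0):
    return true_value + random.gauss(0, noise)

# Compare single-learner predictions with averages of 25 independent learners.
singles = [noisy_prediction() for _ in range(1000)]
averages = [statistics.mean(noisy_prediction() for _ in range(25))
            for _ in range(1000)]

var_single = statistics.pvariance(singles)
var_average = statistics.pvariance(averages)
print(var_single, var_average)  # the averaged variance is far smaller
```

This is exactly the mechanism random forest relies on: the more decorrelated the trees, the more the averaging step suppresses their individual errors.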
5. MAIN TYPES OF ENSEMBLE METHODS
Bagging
• Bagging, short for bootstrap aggregating, is mainly applied in classification and regression.
• It increases model accuracy, commonly with decision trees as base learners, by reducing variance to a large extent. The reduction in variance counteracts overfitting, which is a challenge for many predictive models.
Boosting
• Boosting is an ensemble technique that learns from previous predictors' mistakes to make better predictions in the future.
• The technique combines several weak base learners into one strong learner, significantly improving the predictive power of the model.
Stacking
• Stacking, another ensemble method, is often referred to as stacked generalization.
• This technique works by training a meta-algorithm on the predictions of several other learning algorithms.
6. HOW BAGGING WORKS
• Consider a dataset D with many rows and columns.
• Consider n models, or base learners, M1, M2, …, Mn for dataset D.
• Each model receives its own sample of D: D′, D″, etc.
• If D has n records, we draw a sample of n records with replacement and provide it to model M1.
• Similarly, the next model gets its own row sample drawn with replacement.
• For example, model M1 might receive records (A, B) and model M2 records (B, C), where B is repeated.
• After training, new test data is passed to every model for prediction.
• Next, we consider this method for a binary classifier model.
7. HOW BAGGING WORKS
• Suppose we pass new test data through all the trained models.
• Each model outputs 1 or 0, since we are considering a binary classifier.
• The voting classifier then takes the majority class (here, 1) as the output.
8. HOW BOOSTING WORKS
• Consider a dataset with records.
• Consider models M1, M2, …, Mn, or base learners.
• A sample of the data is passed to the first base learner, which is trained on it.
• After training, we pass the records through the base learner and see how that particular model performed.
9. HOW BOOSTING WORKS
• The records are passed to model M1; suppose the 2 records marked in red are incorrectly classified. The next model is then created sequentially, and those 2 records are emphasized for model M2.
• If M2 also misclassifies some records, its errors are passed on to M3 in the same way.
• This continues until a specified number of learners is reached or the errors stop improving.
• In this way, the boosting technique turns weak learners into a strong learner.
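The sequential reweighting described above follows the AdaBoost pattern mentioned on the categories slide. Below is a minimal pure-Python sketch under stated assumptions: a hypothetical 1-D dataset whose single noisy record plays the role of the red misclassified records, and threshold stumps of the form "predict +1 when x > t" as the weak learners:

```python
import math

# Toy 1-D dataset with labels in {-1, +1}. The underlying rule is "x > 5",
# except x = 3 is a noisy record that no single stump can classify
# correctly together with the rest.
data = [(0, -1), (1, -1), (2, -1), (3, +1), (4, -1),
        (5, -1), (6, +1), (7, +1), (8, +1), (9, +1)]

def stump_error(t, weights):
    """Weighted error of the rule: predict +1 when x > t, else -1."""
    return sum(w for (x, y), w in zip(data, weights)
               if (1 if x > t else -1) != y)

weights = [1.0 / len(data)] * len(data)  # start with uniform record weights
ensemble = []                            # list of (threshold, alpha) pairs

for _ in range(5):
    # Fit the weak learner that minimizes the current weighted error.
    t = min(range(-1, 10), key=lambda t: stump_error(t, weights))
    err = stump_error(t, weights)
    alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
    ensemble.append((t, alpha))
    # Reweight: misclassified records get higher weight for the next model.
    weights = [w * math.exp(-alpha * y * (1 if x > t else -1))
               for (x, y), w in zip(data, weights)]
    total = sum(weights)
    weights = [w / total for w in weights]

def predict(x):
    """Weighted vote of all weak learners."""
    score = sum(a * (1 if x > t else -1) for t, a in ensemble)
    return 1 if score > 0 else -1
```

After a few rounds, the hard-to-classify record carries the largest weight, which is exactly the "pass the errors to the next model" behavior the slide describes.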
10. HOW STACKING WORKS
• Stacking is a heterogeneous method (it can combine strong and weak learners of different types), whereas the other methods are homogeneous (all base learners of the same type, whether strong or weak).

How does stacking work with a meta-model?
• Suppose we have 100 records of training data.
• The base learners are trained on 80% of the data; their predictions on the remaining 20% are used to train the meta-model.
• Here the base learners are:
  • Logistic regression
  • SVM
  • Neural networks
• The meta-model is trained on this group of predictions.
11. HOW STACKING WORKS
• Alternatively, we can take a k-fold approach on the training data (typically 75% of the records), splitting it into k buckets.
• The meta-model is then trained on out-of-fold predictions: for each bucket, the base learners are trained on the other k−1 buckets and predict on the held-out bucket.
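The holdout scheme from the previous slides can be sketched in pure Python. The two fixed threshold rules below are hypothetical stand-ins for the trained base learners (logistic regression, SVM, neural networks), and the meta-model is a simple majority-label lookup over base-prediction patterns:

```python
import random
from collections import Counter

random.seed(2)

# Toy 2-D binary dataset: class 1 when x0 + x1 > 1.
points = [(random.random(), random.random()) for _ in range(100)]
data = [((x0, x1), int(x0 + x1 > 1)) for x0, x1 in points]

train, holdout = data[:80], data[80:]  # the 80% / 20% split from the slide

# Two heterogeneous base learners (hypothetical simple rules):
def learner_a(x):  # looks only at the first feature
    return int(x[0] > 0.5)

def learner_b(x):  # looks only at the second feature
    return int(x[1] > 0.5)

# Level-1 features: base-learner predictions on the held-out 20%.
meta_X = [(learner_a(x), learner_b(x)) for x, _ in holdout]
meta_y = [y for _, y in holdout]

# Meta-model: the majority true label seen for each prediction pattern.
pattern_votes = {}
for feats, y in zip(meta_X, meta_y):
    pattern_votes.setdefault(feats, Counter())[y] += 1
meta_model = {f: c.most_common(1)[0][0] for f, c in pattern_votes.items()}

def stacked_predict(x):
    feats = (learner_a(x), learner_b(x))
    return meta_model.get(feats, feats[0])  # fall back to learner_a
```

The k-fold variant on this slide replaces the single 80/20 split with k rotations, so every record eventually contributes an out-of-fold prediction to the meta-model's training set.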
12. ADVANTAGES AND DISADVANTAGES OF USING ENSEMBLE METHODS

Advantages of Ensemble Methods  | Disadvantages of Ensemble Methods
Improved predictive performance | Increased complexity
Reduction of overfitting        | Computationally intensive
Robustness to noisy data        | Longer training times
Handles different data types    | Difficulty in interpretation
Versatility in model selection  | Decreased transparency
Increased generalization        | Possibility of overfitting
Flexibility in model combination| Reduced intuitiveness