ENSEMBLE HYBRID FEATURE SELECTION TECHNIQUE
Name: Disha Sinha
Semester: 6th
Year: 3rd
Section: B
University Roll Number: 10900117090
CONTENTS
➢ Introduction
➢ Feature Selection
➢ Feature Selection vs Dimensionality Reduction
➢ Types of Feature Selection Techniques
➢ Ensembles
➢ Why Ensembles?
➢ Types of Ensemble Methods
➢ Applications of Ensembles
➢ Conclusion
INTRODUCTION
● As the amount of stored information increases, our ability to make use of it does not grow proportionally.
● In high-dimensional datasets, redundant features and the sheer number of dimensions make a learning method take a significant amount of time, and the performance of the model decreases.
● Hence, we use feature selection techniques to select a subset of relevant and non-redundant features.
FEATURE SELECTION
● Feature selection is used to select a subset of relevant and non-redundant features
from a large feature space.
● In many applications of machine learning and pattern recognition, feature selection
is used to select an optimal feature subset to train the learning model.
● The main objectives of feature selection are:
➢ to improve predictive accuracy
➢ to remove redundant features and
➢ to reduce time consumption during analysis.
FEATURE SELECTION VS
DIMENSIONALITY REDUCTION
➢ Feature selection simply selects or excludes given features without transforming them.
➢ Dimensionality reduction transforms the features into a lower-dimensional space.
TYPES OF FEATURE SELECTION
TECHNIQUES
➢ Filter Methods
➢ Wrapper Methods
➢ Embedded Methods
➢ Hybrid Methods
1. Filter Methods
● Filter methods select a subset of features from a dataset without using any machine
learning algorithm.
● An example is eliminating features with null values.
● Filter-based feature selection methods are typically faster, but classifier accuracy is not guaranteed.
● The selected features can be used in any machine learning algorithm.
● They are computationally inexpensive.
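For illustration, here is a minimal filter-style sketch in Python with scikit-learn; the dataset, the variance threshold and k=10 are assumptions chosen for the example, not part of the original slides.

    # A minimal sketch of filter-based feature selection (assumed example data).
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import VarianceThreshold, SelectKBest, f_classif

    X, y = load_breast_cancer(return_X_y=True)

    # Step 1: drop near-constant features (no learning algorithm involved).
    vt = VarianceThreshold(threshold=0.01)
    X_reduced = vt.fit_transform(X)

    # Step 2: keep the k features with the highest ANOVA F-score w.r.t. the label.
    selector = SelectKBest(score_func=f_classif, k=10)
    X_selected = selector.fit_transform(X_reduced, y)
    print(X.shape, "->", X_selected.shape)

Because no classifier is trained here, the same selected features can then be fed into any downstream learning algorithm.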
2. Wrapper Methods
● Wrapper methods select a subset of features by evaluating candidate subsets with a machine learning algorithm: a search is performed through the space of possible feature subsets, and each subset is scored by the performance of the given algorithm.
● Wrapper methods can give higher classification accuracy than filter methods for particular classifiers, but they are computationally more expensive.
● They detect interactions between variables.
● They find the optimal feature subset for the desired machine learning algorithm.
● Examples: Forward Selection, Backward Elimination, Stepwise Selection.
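As a sketch of forward selection, scikit-learn's SequentialFeatureSelector greedily adds the feature that most improves cross-validated performance; the estimator and target subset size below are assumptions for the example.

    # A minimal sketch of wrapper-based (forward) selection; the dataset and
    # the wrapped estimator are assumptions for the example.
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_breast_cancer(return_X_y=True)

    # Greedy forward search: repeatedly add the feature that most improves
    # the cross-validated accuracy of the wrapped classifier.
    sfs = SequentialFeatureSelector(
        KNeighborsClassifier(n_neighbors=5),
        n_features_to_select=10,
        direction="forward",
        cv=5,
    )
    sfs.fit(X, y)
    print("selected feature indices:", sfs.get_support(indices=True))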
3. Embedded Methods
● Performs feature selection during the process of training
● Specific to the applied learning algorithm.
● A learning algorithm takes advantage of its own variable selection process and
performs feature selection and classification/regression at the same time.
● They take interactions between features into consideration, as wrapper methods do.
● They are faster than wrapper methods and more accurate than filter methods.
● They find the feature subset for the algorithm being trained.
● They are much less prone to overfitting.
● Examples: Lasso, Elastic Net.
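For illustration, a minimal embedded-selection sketch using Lasso, whose L1 penalty drives uninformative coefficients exactly to zero during training; the dataset and alpha value are assumptions for the example.

    # A minimal sketch of embedded selection via L1 regularization (Lasso).
    import numpy as np
    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import Lasso

    X, y = load_diabetes(return_X_y=True)

    # Selection happens as a side effect of training: zeroed coefficients
    # correspond to dropped features.
    lasso = Lasso(alpha=0.5).fit(X, y)
    selected = np.flatnonzero(lasso.coef_)
    print("kept features:", selected)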
4. Hybrid Methods
● Combinations of all of the other feature selection methods - filter, wrapper and
embedded methods.
● The exact combination is left to the engineer.
● An area with high scope for research.
● High performance and accuracy.
● Better computational complexity than wrapper methods.
● Yields models that are more flexible and robust against high-dimensional data.
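For illustration, a minimal hybrid sketch that chains a cheap filter stage with a wrapper stage in one pipeline; the dataset, stage sizes and estimators are assumptions for the example.

    # A minimal sketch of a hybrid approach: filter first, then wrapper.
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectKBest, f_classif, RFE
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    X, y = load_breast_cancer(return_X_y=True)

    hybrid = Pipeline([
        # Filter stage: cheaply shrink the search space to 15 features.
        ("filter", SelectKBest(score_func=f_classif, k=15)),
        # Wrapper stage: recursive feature elimination with a classifier.
        ("wrapper", RFE(LogisticRegression(max_iter=5000), n_features_to_select=8)),
        ("clf", LogisticRegression(max_iter=5000)),
    ])
    hybrid.fit(X, y)
    print("accuracy on training data:", hybrid.score(X, y))

The filter stage keeps the wrapper's expensive search tractable, which is why hybrid methods can approach wrapper accuracy at lower computational cost.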
ENSEMBLES
● For a given dataset, different feature selection algorithms may select different subsets of features, and hence the results obtained may differ in accuracy. So we use ensemble-based feature selection methods to select a stable feature set, as sketched below.
● Ensembles are sets of learning machines that combine their decisions, or their
learning algorithms, or different views of data, or other specific characteristics to
achieve more reliable and accurate predictions in supervised and unsupervised
learning problems.
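As a sketch of ensemble-based feature selection, one simple scheme runs several selectors and keeps the features chosen by a majority; the particular selectors, dataset and vote threshold below are assumptions for the example.

    # A minimal sketch of ensemble feature selection by majority vote.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_selection import (SelectKBest, SelectFromModel,
                                           f_classif, mutual_info_classif)

    X, y = load_breast_cancer(return_X_y=True)
    k = 10

    votes = np.zeros(X.shape[1], dtype=int)
    selectors = [
        SelectKBest(f_classif, k=k),
        SelectKBest(mutual_info_classif, k=k),
        SelectFromModel(RandomForestClassifier(random_state=0),
                        max_features=k, threshold=-np.inf),
    ]
    for s in selectors:
        votes += s.fit(X, y).get_support().astype(int)

    # Stable set: features selected by at least 2 of the 3 selectors.
    stable = np.flatnonzero(votes >= 2)
    print("stable feature indices:", stable)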
WHY ENSEMBLES ?
● A combination of learning algorithms is not guaranteed to outperform the single best learning algorithm, but on average it gives more accurate results on unseen data samples than a single learning algorithm.
● Ensembles enlarge the margins of large-margin classifiers like SVM in order to
classify data points accurately.
● Ensembles can reduce both bias and variance of the error.
WHY ENSEMBLES ?
● A rigorous mathematical treatment, starting from the "representativeness" of the examples used in machine learning problems, leads to the design of ensembles of weak classifiers whose accuracy is governed by the law of large numbers.
● Predictive performances of single models have been improved by the ensemble
methodology in several application fields, such as information security, astronomy
and astrophysics, geography and remote sensing, image retrieval, finance, medicine
etc.
TYPES OF ENSEMBLE METHODS
➢ Bayes Optimal Classifier
➢ Bootstrap Aggregating (Bagging)
➢ Boosting
➢ Bayesian Model Averaging
➢ Bayesian Model Combination
➢ Bucket of Models
➢ Stacking
1. Bayes Optimal Classifier
● Classification technique.
● Ensemble of all hypotheses in the hypothesis space.
● The naive Bayes optimal classifier is a version of this that assumes the features are conditionally independent given the class.
● Each hypothesis is given a vote proportional to the probability that the training
dataset would be sampled from a system if that hypothesis were true.
● Vote of each hypothesis is multiplied by the prior probability of that hypothesis.
1. Bayes Optimal Classifier
● Equation: y = argmax over cj ∈ C of ∑ over hi ∈ H of P(cj | hi) P(T | hi) P(hi)
where
y : the predicted class
C : the set of all possible classes
hi ∈ H : a hypothesis in the hypothesis space H
P : probability
T : the training data.
● By Bayes' theorem: P(hi | T) ∝ P(T | hi) P(hi)
● Hence,
y = argmax over cj ∈ C of ∑ over hi ∈ H of P(cj | hi) P(hi | T)
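As a worked toy example of this vote (all probabilities below are made up purely for illustration), the sum can be computed directly:

    # Bayes optimal vote over three hypotheses and two classes.
    import numpy as np

    P_c_given_h = np.array([[0.9, 0.1],   # P(c | h1)
                            [0.6, 0.4],   # P(c | h2)
                            [0.2, 0.8]])  # P(c | h3)
    P_T_given_h = np.array([0.5, 0.3, 0.1])  # likelihood of the training data
    P_h = np.array([1/3, 1/3, 1/3])          # prior over hypotheses

    # Posterior-weighted vote: sum_i P(c | h_i) P(T | h_i) P(h_i), then argmax.
    scores = (P_c_given_h * (P_T_given_h * P_h)[:, None]).sum(axis=0)
    print("class scores:", scores, "-> predicted class:", np.argmax(scores))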
2. Bootstrap Aggregating (Bagging)
● Each model in the ensemble votes with equal weight.
● Trains each model in the ensemble using a randomly drawn subset of the training
set to promote model variance.
● It is a general procedure that can be used to reduce the variance of algorithms that have high variance, such as decision trees, e.g. classification and regression trees (CART).
● As an example, the random forest algorithm combines random decision trees with
bagging to achieve very high classification accuracy.
2. Bootstrap Aggregating (Bagging)
Algorithm:
Assume a dataset with 1000 instances and the CART algorithm applied to it.
Bagging of the CART algorithm would work as follows:
➢ Create many (e.g. 100) random sub-samples of the dataset with replacement.
➢ Train a CART model on each sample.
➢ Given a new dataset, calculate the average prediction from each model; for classification, we take the most frequently predicted class.
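A minimal sketch of the above procedure using scikit-learn's BaggingClassifier; the synthetic dataset is an assumption for the example, while the 100 trees follow the numbers above.

    # Bagging 100 CART-style trees, each trained on a bootstrap sample;
    # predictions are combined by majority vote.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                                bootstrap=True, random_state=0)
    print("CV accuracy:", cross_val_score(bagging, X, y, cv=5).mean())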
3. Boosting
● Incrementally builds an ensemble by training each new model instance to
emphasize the training instances that previous models mis-classified.
● Often more accurate than bagging, but also more prone to over-fitting the training data.
● Most common algorithm: AdaBoost
● Most boosting algorithms consist of iteratively learning weak classifiers with
respect to a distribution and adding them to a final strong classifier.
● While adding, they are weighted in a way that is related to the weak learners'
accuracy.
● After a weak learner is added, the data weights are re-adjusted by re-weighting
which leads to misclassified input data gaining higher weight and correctly
classified data losing weight.
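For illustration, a minimal AdaBoost sketch; the dataset, the depth-1 trees (decision stumps, the classic weak learners) and the number of rounds are assumptions for the example.

    # Boosting: each round re-weights the training data so the next weak
    # learner focuses on previously misclassified points.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    boosting = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                  n_estimators=50, learning_rate=1.0,
                                  random_state=0).fit(X, y)
    print("training accuracy:", boosting.score(X, y))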
4. Bayesian Model Averaging
● An ensemble technique that seeks to approximate the Bayes optimal classifier by
sampling hypotheses from the hypothesis space, and combining them using Bayes'
law.
● Hypotheses are typically sampled using a Monte Carlo sampling technique
such as MCMC.
● Gibbs sampling may be used to draw hypotheses that are representative of the
distribution P(T|H).
● Under certain circumstances, when hypotheses are drawn in this manner and averaged according to Bayes' law, this technique has an expected error bounded above by twice the expected error of the Bayes optimal classifier.
5. Bayesian Model Combination
● An algorithmic correction to Bayesian model averaging (BMA).
● Instead of sampling each model in the ensemble individually, it samples from the
space of possible ensembles. This helps in overcoming the tendency of BMA to
converge toward giving all of the weight to a single model.
● Yields better results than BMA, but is computationally more expensive.
6. Bucket of Models
● An ensemble technique in which a model selection algorithm is used to choose the
best model for each problem.
● When tested with only one problem, a bucket of models can produce no better
results than the best model in the set, but when evaluated across many problems, it
will typically produce much better results, on average, than any model in the set.
● Most common approach used for model-selection : cross-validation selection
● Gating is a generalization of Cross-Validation Selection. It involves training another
learning model (or often a perceptron) to decide which of the models in the bucket
is best-suited to solve the problem.
6. Bucket of Models
Pseudo-code:
For each model m in the bucket:
    Do c times (where 'c' is some constant):
        Randomly divide the training dataset into two datasets, A and B
        Train m with A
        Test m with B
Select the model that obtains the highest average score
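A runnable version of this pseudo-code using cross-validation selection; the three candidate models and the dataset are assumptions for the example.

    # Bucket of models: score every candidate with the same repeated
    # train/test protocol and keep the one with the highest average score.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    bucket = {
        "logreg": LogisticRegression(max_iter=5000),
        "tree": DecisionTreeClassifier(random_state=0),
        "knn": KNeighborsClassifier(),
    }
    scores = {name: cross_val_score(m, X, y, cv=5).mean()
              for name, m in bucket.items()}
    best = max(scores, key=scores.get)
    print("best model:", best, scores)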
7. Stacking
● Involves training a learning algorithm to combine the predictions of several other
learning algorithms.
● First, all of the other algorithms are trained using the available data, then a
combiner algorithm is trained to make a final prediction using all the predictions of
the other algorithms as additional inputs.
● In practice, a logistic regression model is often used as the combiner.
● Successfully used on both supervised learning tasks (regression, classification and
distance learning) and unsupervised learning (density estimation).
● It has also been used to estimate Bagging's error rate.
● Reportedly out-performs Bayesian model averaging.
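A minimal stacking sketch with scikit-learn, using logistic regression as the combiner as noted above; the base learners and the dataset are assumptions for the example.

    # Stacking: base-model predictions are generated out-of-fold and fed
    # as inputs to a logistic regression combiner.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    stack = StackingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("svm", SVC(probability=True, random_state=0))],
        final_estimator=LogisticRegression(max_iter=5000),  # the combiner
        cv=5,
    )
    stack.fit(X, y)
    print("training accuracy:", stack.score(X, y))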
APPLICATIONS OF ENSEMBLES
➢ image classification
➢ fingerprint classification
➢ weather forecasting
➢ text categorization
➢ image segmentation
➢ visual tracking
➢ change detection in image analysis
➢ protein fold pattern recognition
➢ cancer classification
➢ pedestrian recognition or detection
➢ prediction of software quality
➢ face recognition
APPLICATIONS OF ENSEMBLES
➢ email filtering
➢ prediction of students’ performance
➢ medical image analysis
➢ churn prediction
➢ malware detection
➢ intrusion detection
➢ emotion detection
➢ sentiment analysis
➢ prediction of air quality
➢ land cover mapping
CONCLUSION
● The extent to which the ensemble implementation outperforms the simple version
of a given algorithm is strongly dependent on the intrinsic stability of the algorithm
itself, with larger gains in robustness for the least stable methods.
● It is worth highlighting that even selection methods that are quite different from each other tend to exhibit similar performance, in terms of both accuracy and stability, when used in their ensemble versions.
● As a future line of research, it could be interesting to explore the full potential of
hybrid ensemble approaches, where diversity is injected both at the data level and at
the algorithmic level. This might open the way to the definition of more flexible
selection strategies which leverage multiple heuristics while reducing the degree of
dependence on the specific composition of the training data.
THANK YOU