Introduction to Data Science
Lab 8
(Feature Selection)
Feature Selection
There are three main approaches to feature selection:
• Filter Methods
• Wrapper Methods
• Embedded Methods
Univariate Selection: Filter Method
• The filter method ranks each feature based on some univariate metric and then selects the highest-ranking features. Common criteria include:
  • Variance
  • No duplicate columns
  • High correlation
  • Chi-square test
  • Mutual information between each independent variable and the target
• The filter method looks at individual features to identify their relative importance.
• A feature may not be useful on its own but may be an important influencer when combined with other features. Filter methods may miss such features.
Filter Method: Variance
Remove features whose variance is zero, i.e., every row has the same value; such features carry no information for the model.
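A minimal sketch using sklearn's VarianceThreshold; the DataFrame and column names here are hypothetical:

    import pandas as pd
    from sklearn.feature_selection import VarianceThreshold

    # Hypothetical data: 'const' has zero variance (the same value in every row)
    df = pd.DataFrame({
        "const": [1, 1, 1, 1],
        "age":   [23, 45, 31, 52],
        "score": [0.5, 0.9, 0.3, 0.7],
    })

    # threshold=0 (the default) removes only zero-variance features
    selector = VarianceThreshold(threshold=0)
    selector.fit(df)
    print(list(df.columns[selector.get_support()]))  # ['age', 'score']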
Filter Method: No duplicate Columns
Remove any column that is an exact duplicate of another column, since it adds no new information.
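One way to do this in pandas, shown as a sketch with hypothetical column names (transposing lets duplicated() compare whole columns):

    import pandas as pd

    # Hypothetical data: column 'b' is an exact copy of column 'a'
    df = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3], "c": [4, 5, 6]})

    # Keep the first occurrence of each distinct column
    deduped = df.loc[:, ~df.T.duplicated()]
    print(list(deduped.columns))  # ['a', 'c']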
Filter Method: Correlation
• When two or more features are mutually correlated, they convey redundant information to the model, so only one of the correlated features should be retained to reduce the number of features.
• If independent features are correlated with the dependent feature (target variable), you don't need to remove them.
• If two independent features are correlated with each other by 80% or 90%, drop one of them and train the model with the remaining features (see the sketch below).
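A minimal sketch of dropping one feature from each highly correlated pair; the drop_correlated helper, the 0.9 threshold, and the feature names are all illustrative assumptions:

    import numpy as np
    import pandas as pd

    def drop_correlated(df, threshold=0.9):
        """Drop one feature from every pair whose |correlation| exceeds threshold."""
        corr = df.corr().abs()
        # Keep only the upper triangle so each pair is considered once
        upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
        to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
        return df.drop(columns=to_drop)

    # Hypothetical features: x2 is almost a copy of x1
    rng = np.random.default_rng(0)
    x1 = rng.normal(size=100)
    X = pd.DataFrame({"x1": x1, "x2": x1 * 1.01, "x3": rng.normal(size=100)})
    print(list(drop_correlated(X).columns))  # ['x1', 'x3']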
Chi-Square Calculation: An Example
χ² (chi-square) is calculated as follows; the numbers in parentheses are the expected counts, computed from the row and column totals of the contingency table:

                              Play chess    Not play chess    Sum (row)
    Like science fiction       250 (90)        200 (360)          450
    Not like science fiction    50 (210)      1000 (840)         1050
    Sum (col.)                 300            1200               1500

χ² = (250 - 90)²/90 + (50 - 210)²/210 + (200 - 360)²/360 + (1000 - 840)²/840
   = 284.44 + 121.90 + 71.11 + 30.48
   = 507.93

Since 507.93 is far greater than 10.828 (the chi-square critical value at the 0.001 significance level with 1 degree of freedom), like_science_fiction and play_chess are correlated in this group.
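The same statistic can be checked with scipy; correction=False disables Yates' continuity correction so the result matches the hand calculation above:

    from scipy.stats import chi2_contingency

    # Observed counts from the contingency table above
    observed = [[250, 200],    # like science fiction
                [50, 1000]]    # not like science fiction

    chi2, p, dof, expected = chi2_contingency(observed, correction=False)
    print(f"{chi2:.1f}")  # 507.9
    print(expected)       # [[ 90. 360.] [210. 840.]]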
Filter Method: Chi Square Test & Mutual Information
For the chi-square test, the data must be categorical: encode categorical values as non-negative integers before computing the statistic.
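A minimal sketch using sklearn's SelectKBest with both scoring functions; the iris dataset is used purely as an illustration (its features are already non-negative, as chi2 requires — string-valued categorical features would first need integer encoding, e.g. with OrdinalEncoder):

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

    X, y = load_iris(return_X_y=True)

    # Keep the 2 features with the highest chi-square score
    X_chi2 = SelectKBest(score_func=chi2, k=2).fit_transform(X, y)

    # The same selection driven by mutual information instead
    X_mi = SelectKBest(score_func=mutual_info_classif, k=2).fit_transform(X, y)

    print(X_chi2.shape, X_mi.shape)  # (150, 2) (150, 2)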
Wrapper Method
• The feature selection process is based on a specific machine learning algorithm that is to be applied to a particular dataset.
• It follows a greedy search approach, evaluating combinations of features against the evaluation criterion.
• Wrapper methods include the following types:
  • Forward Elimination
  • Backward Elimination
Forward Elimination
The procedure starts with an empty set of features.
The best of the original features is determined and added to the
reduced set.
At each subsequent iteration, the best of the remaining original
attributes is added to the set.
SequentialFeatureSelector from sklearn can be used for forward elimination by setting the parameter direction='forward', as sketched below.
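A minimal sketch; the LogisticRegression estimator, the iris data, and n_features_to_select=2 are illustrative choices, not part of the lab:

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)

    # Greedily add one feature at a time until 2 are selected,
    # scoring each candidate subset with cross-validation
    sfs = SequentialFeatureSelector(
        LogisticRegression(max_iter=1000),
        n_features_to_select=2,
        direction="forward",
    )
    sfs.fit(X, y)
    print(sfs.get_support())  # boolean mask of the selected features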
Forward Elimination
ExtraTreesClassifier from sklearn can also be used for feature selection, by ranking features on their impurity-based importance scores and keeping the top-ranked ones.
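A sketch of importance-based selection with ExtraTreesClassifier; pairing it with SelectFromModel is one common pattern, not necessarily the one used in the lab:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import ExtraTreesClassifier
    from sklearn.feature_selection import SelectFromModel

    X, y = load_iris(return_X_y=True)

    # Fit the tree ensemble and inspect impurity-based importances
    model = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(model.feature_importances_)

    # Keep features whose importance exceeds the mean importance (the default)
    X_reduced = SelectFromModel(model, prefit=True).transform(X)
    print(X_reduced.shape)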
Backward Elimination
The procedure starts with the full set of features.
At each step, it removes the worst attribute remaining in the set.
SequentialFeatureSelector from sklearn can be used for backward feature elimination by setting the parameter direction='backward', as sketched below.
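The same sketch as for forward elimination, with the direction flipped (estimator and feature count are again illustrative):

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)

    # Start from all features and greedily drop the worst one at each step
    sfs = SequentialFeatureSelector(
        LogisticRegression(max_iter=1000),
        n_features_to_select=2,
        direction="backward",
    )
    sfs.fit(X, y)
    print(sfs.get_support())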
Comparison
Filter
• Filter methods do not use a machine learning model to determine whether a feature is good or bad.
• Filter methods are much faster than wrapper methods because they do not involve training models.
• Filter methods may fail to find the best subset of features when there is not enough data to model the statistical correlation of the features.
Wrapper
• Wrapper methods train a machine learning model on the features to decide whether each one is essential to the final model.
• Wrapper methods are computationally costly, and for massive datasets they are probably not the most effective feature selection method to consider.
• Wrapper methods can always provide the best subset of features because of their exhaustive nature.
• Using features from wrapper methods in your final machine learning model can lead to overfitting, since the wrapper has already trained models with those features, which can affect the model's true generalization power.
