Email: Aakanksha.Jain@poornima.edu.in
Dimension Reduction Techniques
By: MS. AAKANKSHA JAIN
Feature Selection based Dimension Reduction
Content
• What is dimensionality reduction?
• Feature Selection and Feature Extraction
• Techniques to achieve dimension reduction
• Backward feature elimination and Forward feature selection technique
• Hands-on session on feature selection
• Why dimension reduction is important?
• Basic understanding of feature selection
• Python implementation on JupyterLab
What is Dimensionality Reduction?
Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data.
Source: Internet
Curse of Dimensionality
[Figure: data pipeline diagram with stages labelled Internet resources, data ingestion, data storage, heterogeneous data, data collection for ML model, data pre-processing, feature engineering, and data for model training]
Technique to achieve Dimension Reduction

Feature extraction: finds a set of new features (i.e., through some mapping f()) from the existing features. The mapping f() could be linear or non-linear.

$$
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
\;\xrightarrow{\;f(\mathbf{x})\;}\;
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_K \end{bmatrix},
\qquad K \ll N
$$

Feature selection: chooses a subset of the original features.

$$
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
\;\longrightarrow\;
\mathbf{y} = \begin{bmatrix} x_{i_1} \\ x_{i_2} \\ \vdots \\ x_{i_K} \end{bmatrix},
\qquad K \ll N
$$
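The difference between the two routes can be sketched in a few lines of NumPy. This is only an illustration under assumptions not taken from the slides: the data are random, the selected indices are a hypothetical choice, and PCA (computed via SVD) stands in for one possible linear mapping f().

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))   # 100 samples, N = 6 original features (random toy data)
K = 2                           # target dimensionality, K < N

# Feature selection: keep K of the original columns unchanged.
selected_idx = [0, 3]           # hypothetical indices i_1, i_2
X_selected = X[:, selected_idx]

# Feature extraction: build K new features with a linear mapping f(x) = W^T x.
# Here W holds the top-K principal directions (PCA via SVD of the centred data).
X_centred = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centred, full_matrices=False)
W = Vt[:K].T                    # shape (N, K)
X_extracted = X_centred @ W

print(X_selected.shape, X_extracted.shape)
```

Both results have K columns, but the selected columns are still original, interpretable features, while the extracted ones are mixtures of all N inputs.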
Feature Selection Techniques
Embedded Method
Features are selected during model training, combining the qualities of the Filter and Wrapper methods.
WRAPPER Method
Selects the combination of features that produces the best model result.
FILTER Method
Features are selected according to their scores on various statistical tests.
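As a small illustration of the filter idea, the sketch below ranks features by the absolute Pearson correlation with the target and keeps the top one. The synthetic data, the feature names, and the choice of correlation as the statistical score are all assumptions made for the example; other tests (chi-square, ANOVA F-score, mutual information) fit the same pattern.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
informative = rng.normal(size=n)                 # drives the target
noise = rng.normal(size=n)                       # unrelated to the target
y = 3.0 * informative + rng.normal(scale=0.1, size=n)

X = np.column_stack([informative, noise])
feature_names = ["informative", "noise"]         # hypothetical names

# Score each feature independently of any model: |Pearson r| with the target.
scores = {
    name: abs(np.corrcoef(X[:, j], y)[0, 1])
    for j, name in enumerate(feature_names)
}

# Keep the k best-scoring features.
k = 1
ranked = sorted(scores, key=scores.get, reverse=True)
top_k = ranked[:k]
print(scores, top_k)
```

No model is trained here, which is what separates a filter method from the wrapper approach used in the rest of the deck.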
Backward Feature Elimination
[Figure: flow diagram with boxes labelled Complete Dataset, All Features, Dependent Variable, Iterative Learning, Feature removal, and Select Most Significant Feature]
• Initially we start with all the features
• Iterative checking of the significance of each feature
• Checking the impact on model performance after removal
• Keeping the most significant features
Backward Feature Elimination
Assumptions:
• There are no missing values in the dataset
• The variance of every variable is high
• The correlation between the independent variables is low
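These three assumptions can be checked quickly with pandas before running the elimination. The sketch below uses a few rows copied from the hands-on table that appears later in the deck; the thresholds (near-zero variance, absolute correlation above 0.9) are illustrative assumptions, not values given in the slides.

```python
import pandas as pd

# A few rows from the hands-on dataset (temp, humidity, windspeed columns only).
df = pd.DataFrame({
    "temp": [9.84, 9.02, 13.12, 15.58, 17.22],
    "humidity": [81, 80, 76, 76, 77],
    "windspeed": [0.0, 0.0, 0.0, 16.9979, 19.0012],
})

# Assumption 1: no missing values.
missing_per_column = df.isna().sum()
has_missing = bool(missing_per_column.any())

# Assumption 2: no (near-)constant variables. The 1e-8 cutoff is an assumption.
variances = df.var()
low_variance = variances[variances < 1e-8].index.tolist()

# Assumption 3: independent variables are not strongly correlated.
# The 0.9 cutoff is an assumption.
corr = df.corr().abs()
high_corr_pairs = [
    (a, b)
    for i, a in enumerate(corr.columns)
    for b in corr.columns[i + 1:]
    if corr.loc[a, b] > 0.9
]

print(has_missing, low_variance, high_corr_pairs)
```

Any column listed in `low_variance`, or any pair listed in `high_corr_pairs`, is worth inspecting before trusting the elimination results.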
Backward Feature Elimination
Step I:
To perform backward feature elimination, first train the model using all n variables.
Step II:
Next, calculate the performance of the model.
Accuracy: 92%
Step III:
Next, eliminate one variable (Calories_brunt) and train the model with the remaining n-1 variables.
Accuracy: 90%
Step IV:
Again, eliminate a different variable instead (Gender) and train the model with the remaining n-1 variables.
Accuracy: 91.6%
Step V:
Again, eliminate a different variable instead (Play_Sport?) and train the model with the remaining n-1 variables.
Accuracy: 88%
Step VI:
When every variable has been tried, identify the variable whose elimination has the least impact on the model's performance and drop it. Here, removing Gender costs the least accuracy (92% to 91.6%), so Gender is the one to eliminate.
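One round of the steps above can be sketched as follows. To stay self-contained, the sketch swaps the slides' classification accuracy for the training R² of an ordinary least-squares fit on synthetic regression data; the feature names "a", "b", "c" and the helper `r2_score_lstsq` are hypothetical.

```python
import numpy as np

def r2_score_lstsq(X, y):
    """Fit ordinary least squares with an intercept and return the training R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return 1.0 - residual.var() / y.var()

rng = np.random.default_rng(0)
n = 300
names = ["a", "b", "c"]                     # hypothetical feature names
X = rng.normal(size=(n, 3))
# Only "a" and "b" influence the target; "c" is irrelevant by construction.
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Steps I-II: train on all features and record the baseline score.
baseline = r2_score_lstsq(X, y)

# Steps III-V: leave out each feature in turn and retrain on the other n-1.
drop_scores = {}
for j, name in enumerate(names):
    X_reduced = np.delete(X, j, axis=1)
    drop_scores[name] = r2_score_lstsq(X_reduced, y)

# Step VI: the feature whose removal hurts the score least is eliminated.
to_eliminate = max(drop_scores, key=drop_scores.get)
print(baseline, drop_scores, to_eliminate)
```

In practice the score would usually come from a held-out set or cross-validation rather than the training data, and the round is repeated on the reduced feature set.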
Hands-on
ID     season  holiday  workingday  weather  temp   humidity  windspeed  count
AB101  1       0        0           1        9.84   81        0          16
AB102  1       0        0           1        9.02   80        0          40
AB103  1       0        0           1        9.02   80        0          32
AB104  1       0        0           1        9.84   75        0          13
AB105  1       0        0           1        9.84   75        0          1
AB106  1       0        0           2        9.84   75        6.0032     1
AB107  1       0        0           1        9.02   80        0          2
AB108  1       0        0           1        8.2    86        0          3
AB109  1       0        0           1        9.84   75        0          8
AB110  1       0        0           1        13.12  76        0          14
AB111  1       0        0           1        15.58  76        16.9979    36
AB112  1       0        0           1        14.76  81        19.0012    56
AB113  1       0        0           1        17.22  77        19.0012    84
Python Code
# import the libraries
import pandas as pd
# read the file
data = pd.read_csv('backward_feature_elimination.csv')
# first 5 rows of the data
data.head()
# shape of the data (rows, columns)
data.shape
# split into features X and target y; 'ID' is an identifier, not a predictor
X = data.drop(['ID', 'count'], axis=1)
y = data['count']
# check the shapes
X.shape, y.shape
# install mlxtend (run in a notebook cell)
!pip install mlxtend
Python Code
# import the libraries
from mlxtend.feature_selection import SequentialFeatureSelector as sfs
from sklearn.linear_model import LinearRegression
# set the parameters for Backward Feature Elimination:
# forward=False removes features one at a time until k_features=4 remain
lreg = LinearRegression()
sfs1 = sfs(lreg, k_features=4, forward=False, verbose=1, scoring='neg_mean_squared_error')
# apply Backward Feature Elimination
sfs1 = sfs1.fit(X, y)
# check the selected features
feat_names = list(sfs1.k_feature_names_)
print(feat_names)
# build the new dataframe; .copy() avoids pandas' SettingWithCopyWarning
new_data = data[feat_names].copy()
new_data['count'] = data['count']
Python Code
# first five rows of the new data
new_data.head()
# shape of new and original data
new_data.shape, data.shape
Congratulations!!
We have successfully implemented Backward feature elimination
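For readers who want to see roughly what a sequential selector does internally, here is a NumPy-only sketch of the greedy loop. It is a simplification under stated assumptions, not mlxtend's implementation: candidates are scored by negative training MSE of a least-squares fit, whereas mlxtend cross-validates each candidate subset by default. The function names `neg_mse` and `sequential_select` and the synthetic data are hypothetical.

```python
import numpy as np

def neg_mse(X, y):
    """Negative training MSE of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return -float(np.mean((y - A @ coef) ** 2))

def sequential_select(X, y, k, forward=False):
    """Greedily pick k column indices, removing (backward) or adding (forward)
    one feature per round, always keeping the best-scoring candidate subset."""
    n_features = X.shape[1]
    if not 0 < k <= n_features:
        raise ValueError("k must be between 1 and the number of features")
    all_idx = list(range(n_features))
    current = [] if forward else all_idx.copy()
    while len(current) != k:
        if forward:
            candidates = [sorted(current + [j]) for j in all_idx if j not in current]
        else:
            candidates = [[i for i in current if i != j] for j in current]
        current = max(candidates, key=lambda idx: neg_mse(X[:, idx], y))
    return current

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
# Only columns 0, 2 and 5 influence the target in this toy setup.
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + 1.5 * X[:, 5] + rng.normal(scale=0.1, size=400)

print(sequential_select(X, y, k=3, forward=False))
print(sequential_select(X, y, k=3, forward=True))
```

Setting `forward=True` gives the forward feature selection named in the Content slide: it starts from an empty set and adds the most helpful feature each round instead of removing the least helpful one.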
Practical Implementation:
[Screenshots of the notebook session, from the sample dataset through to the final output]
THANK YOU
AAKANKSHA.JAIN@POORNIMA.EDU.IN
