Feature Engineering
Making an Algorithm Understand the
Data
SAURABH WANI
--/--/2019
How does a Machine learning
model work?
The Data
Why do we need to refine the
datasets?
What is Feature Engineering?
The process of extracting features from raw data is called
Feature Engineering
Methods for Engineering the Features
1. Imputation
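Imputation means filling in missing values so that the algorithm can use the affected rows. A minimal sketch, assuming missing entries are represented as None and that the median of the observed values is an acceptable fill value (the function name and the sample data are illustrative, not from the slides):

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical 'age' column with two missing entries.
ages = [25, None, 31, 40, None, 28]
imputed_ages = impute_median(ages)
print(imputed_ages)
```

The median is used here because it is less sensitive to extreme values than the mean; the mean or the most frequent value are other common choices.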
2. Coping with Outliers
• Outliers can seriously distort summary statistics when our
interest is in the majority of the data
• Sometimes they are helpful, or even the most essential
element of the model
e.g. anomaly detection
• In the majority of cases we have to get rid of the outliers
• This can be done by eliminating the data points that lie
many standard deviations away from the mean
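One common way to apply the standard deviation idea is a z-score filter: measure how many standard deviations each point lies from the mean and drop the points beyond a threshold. The sketch below assumes a single numeric column; the function name, the threshold, and the sample data are illustrative choices.

```python
from statistics import mean, pstdev


def remove_outliers(values, max_z=3.0):
    """Keep only the points within max_z standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so there is nothing to remove.
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= max_z]


# Hypothetical incomes (in thousands) with one extreme value.
incomes = [42, 45, 47, 50, 52, 55, 48, 51, 46, 900]

# With only 10 points no z-score can exceed 3, so a lower
# threshold is assumed here for the demonstration.
cleaned = remove_outliers(incomes, max_z=2.0)
print(cleaned)
```

The threshold is a judgment call that depends on the sample size and on how the data are distributed.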
3. Binning
• Binning groups together values that fall in a similar range
• It helps convert numeric feature values into categorical feature values
Age Group       Category
12 and under    Children
13-19           Teenagers
20-59           Adults
60 and over     Elderly
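The age table can be sketched as a small lookup function. Treating 12 as Children and 60 as Elderly is an assumption about where the boundaries fall, and the function name is illustrative.

```python
def age_group(age):
    """Map a numeric age to one of the four categories in the table."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age <= 12:
        return "Children"
    if age <= 19:
        return "Teenagers"
    if age <= 59:
        return "Adults"
    return "Elderly"


# Hypothetical ages, including the boundary values.
ages = [5, 12, 13, 19, 20, 59, 60, 75]
groups = [age_group(a) for a in ages]
print(groups)
```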
4. Encoding Techniques
• Label Encoding
• One-Hot Encoding
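The two techniques can be contrasted on one small column. Label encoding assigns each category an integer, while one-hot encoding creates one 0/1 column per category. This is a plain-Python sketch; the function names, the alphabetical ordering of categories, and the sample data are assumptions made for illustration.

```python
def label_encode(values):
    """Map each distinct category to an integer (alphabetical order)."""
    categories = sorted(set(values))
    mapping = {category: index for index, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Return one 0/1 indicator column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == category else 0 for category in categories] for v in values]
    return rows, categories


# Hypothetical categorical feature.
colors = ["red", "green", "blue", "green"]

labels, mapping = label_encode(colors)
one_hot, columns = one_hot_encode(colors)
print(labels)   # integer code per row
print(columns)  # column order of the one-hot rows
print(one_hot)
```

Label encoding implies an order among the categories that may not exist, which is one reason one-hot encoding is often preferred for unordered categories.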
5. Feature Split
• Sometimes a feature contains information that cannot be used as it is.
• Feature splitting is used when that information can be split directly into two or
more features without advanced engineering
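A typical case is a full-name column that can be split into separate name features. The sketch below assumes names are separated by spaces and that everything after the first token belongs to the last name; the function and field names are hypothetical.

```python
def split_full_name(full_name):
    """Split 'First Last' into two separate features."""
    first, _, rest = full_name.strip().partition(" ")
    return {"first_name": first, "last_name": rest.strip()}


# Hypothetical raw feature values.
names = ["Ada Lovelace", "Alan Turing"]
split_names = [split_full_name(n) for n in names]
print(split_names)
```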
6. Scaling
• From a Machine Learning point of view, numeric features should generally be in a similar
range
• Otherwise it becomes difficult for many algorithms to fit those features.
• To avoid this issue, an engineering technique called Scaling is used
• There are many mathematical methods for scaling; one of the most popular is
‘Normalization’
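Normalization is read here as min-max scaling, which is an assumption since the slide does not give a formula. Each value x is mapped to (x - min) / (max - min), so the scaled feature lies between 0 and 1. The function name and sample data are illustrative.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no range to scale; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical salary feature on a much larger scale than, say, age.
salaries = [30000, 45000, 60000, 90000]
scaled = min_max_scale(salaries)
print(scaled)
```

When scaling is applied before model training, the minimum and maximum are usually taken from the training data only and then reused for new data.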
7. Date Extraction
• Dates are one of the most common problems in raw data
• Dates cannot be understood by the algorithm in their
raw form
• Relevant information can be extracted from dates
according to the requirement and the format of the date (this
can be tricky)
E.g. 1. Number of days between start and end date
2. Extracting Day, Month and Year into separate
columns, etc.
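Both examples can be sketched with Python's standard datetime module. The sketch assumes the dates arrive as ISO-formatted strings (YYYY-MM-DD); other formats would need their own parsing step. The function and key names are hypothetical.

```python
from datetime import date


def date_features(start, end):
    """Derive numeric features from a pair of ISO-formatted date strings."""
    start_date = date.fromisoformat(start)
    end_date = date.fromisoformat(end)
    return {
        "duration_days": (end_date - start_date).days,
        "start_day": start_date.day,
        "start_month": start_date.month,
        "start_year": start_date.year,
    }


# Hypothetical start and end dates.
features = date_features("2019-01-15", "2019-03-01")
print(features)
```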
Who is a good Data Scientist?
How does Feature Engineering differentiate
between a good Data Scientist and
a bad Data Scientist?
Thank You
