SlideShare a Scribd company logo
EXPLOLATORY DATA ANALYSIS
Davis David
Data Scientist at ParrotAI
CONTENT:
1. Introduction to EDA
2. Importance of EDA
3. Data Types
4. Python Packages for EDA
5. List of Graphs
6. Practical EDA
1.INTRODUCTION TO EDA
 Exploratory Data Analysis refers to the critical process of performing
initial investigations on data so as to discover patterns, to spot
anomalies, to test hypothesis and to check assumptions with the help
of summary statistics and graphical representations.
It is a good practice to understand the data first and try to gather as
many insights from it.
2. IMPORTANCE OF EDA
Identifying the most important variables/features in your dataset.
Testing a hypothesis or checking assumptions related to the dataset.
To check the quality of data for further processing and cleaning.
Deliver data-driven insights to business stakeholders.
Verify expected relationships actually exist in the data.
To find unexpected structure or insights in the data.
Two Categories of Data
 Structured Data types
Example: csv file, excel file, database file
 Unstructured Data types
Examples: Images, videos, audio,
Data Types
Structured Data Types
Categorical - This is any data that isn’t a number.
 Ordinal - have a set of order e.g. rating happiness on a scale of 1-10.
 Binary - have only two values .e.g. Male or Female
 Nominal - no set of order e.g. Countries
Numerical – Data inform of numbers
 Continuous - numbers that don’t have a logical end to them e.g heights
 Discrete - have a logical end to them e.g. days in the month
Python Packages for EDA
1.Bar Chart
2. Pie Chart
3.Histogram
4.Scatter Plot
5. Heatmap
6. Box Plot
7. Line Plot
8. Violin Plot
9.Bubble Plot
10. 3D Scatter Plot
Exploratory data analysis with Python

More Related Content

What's hot

Data Preprocessing
Data PreprocessingData Preprocessing
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
bhavesh lande
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data Science
Jason Geng
 
Data preprocessing PPT
Data preprocessing PPTData preprocessing PPT
Data preprocessing PPT
ANUSUYA T K
 
Data cleansing
Data cleansingData cleansing
Data cleansing
kunaljain1701
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
gokulprasath06
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Srishti44
 
Introduction to data analytics
Introduction to data analyticsIntroduction to data analytics
Introduction to data analytics
SSaudia
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
mrizwan969
 
Data reduction
Data reductionData reduction
Data reduction
kalavathisugan
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, data
Salah Amean
 
Introduction to Data Visualization
Introduction to Data VisualizationIntroduction to Data Visualization
Introduction to Data Visualization
Stephen Tracy
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
Jason Rodrigues
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Marina Santini
 
Outlier analysis and anomaly detection
Outlier analysis and anomaly detectionOutlier analysis and anomaly detection
Outlier analysis and anomaly detection
ShantanuDeosthale
 
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
Data Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessingData Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessing
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
Salah Amean
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data Mining
Iffat Firozy
 
Privacy, security and ethics in data science
Privacy, security and ethics in data sciencePrivacy, security and ethics in data science
Privacy, security and ethics in data science
Nikolaos Vasiloglou
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
Saad Elbeleidy
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
YashwantGahlot1
 

What's hot (20)

Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data Science
 
Data preprocessing PPT
Data preprocessing PPTData preprocessing PPT
Data preprocessing PPT
 
Data cleansing
Data cleansingData cleansing
Data cleansing
 
Exploratory data analysis
Exploratory data analysisExploratory data analysis
Exploratory data analysis
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Introduction to data analytics
Introduction to data analyticsIntroduction to data analytics
Introduction to data analytics
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
 
Data reduction
Data reductionData reduction
Data reduction
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, data
 
Introduction to Data Visualization
Introduction to Data VisualizationIntroduction to Data Visualization
Introduction to Data Visualization
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 
Outlier analysis and anomaly detection
Outlier analysis and anomaly detectionOutlier analysis and anomaly detection
Outlier analysis and anomaly detection
 
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
Data Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessingData Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessing
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
 
Data Preprocessing || Data Mining
Data Preprocessing || Data MiningData Preprocessing || Data Mining
Data Preprocessing || Data Mining
 
Privacy, security and ethics in data science
Privacy, security and ethics in data sciencePrivacy, security and ethics in data science
Privacy, security and ethics in data science
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 

Similar to Exploratory data analysis with Python

Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdfExploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
JamieDornan2
 
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdfExploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
JamieDornan2
 
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdfExploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
StephenAmell4
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
neelakandan2001kpm
 
Ds
DsDs
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profiling
Shailja Khurana
 
Exploratory Data Analysis_ Uncovering Patterns in Data.pdf
Exploratory Data Analysis_ Uncovering Patterns in Data.pdfExploratory Data Analysis_ Uncovering Patterns in Data.pdf
Exploratory Data Analysis_ Uncovering Patterns in Data.pdf
archijain931
 
Qiagram
QiagramQiagram
Qiagram
jwppz
 
Master Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdf
Master Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdfMaster Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdf
Master Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdf
proitbridgePvtLtd
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
Sukirti Garg
 
Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)
Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)
Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)
Gianluca Tarasconi
 
data wrangling (1).pptx kjhiukjhknjbnkjh
data wrangling (1).pptx kjhiukjhknjbnkjhdata wrangling (1).pptx kjhiukjhknjbnkjh
data wrangling (1).pptx kjhiukjhknjbnkjh
VISHALMARWADE1
 
Technical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdfTechnical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdf
Shristi Shrestha
 
Accelerate Data Discovery
Accelerate Data Discovery   Accelerate Data Discovery
Accelerate Data Discovery
Attivio
 
Data analytics Course for Beginners (1).pptx
Data analytics Course for Beginners (1).pptxData analytics Course for Beginners (1).pptx
Data analytics Course for Beginners (1).pptx
SumitAgarwal65690
 
Data management: documentation and metadata
Data management: documentation and metadataData management: documentation and metadata
Data management: documentation and metadata
Statistics Specialist
 
Introduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdfIntroduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdf
AbdulrahimShaibuIssa
 
Data science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi PeriasamyData science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi Periasamy
Peter Kua
 
Database
DatabaseDatabase
Database
wwaqas2007
 
Database
DatabaseDatabase
Database
Vaibhav Bajaj
 

Similar to Exploratory data analysis with Python (20)

Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdfExploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
 
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdfExploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
 
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdfExploratory Data Analysis - A Comprehensive Guide to EDA.pdf
Exploratory Data Analysis - A Comprehensive Guide to EDA.pdf
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
 
Ds
DsDs
Ds
 
Data quality and data profiling
Data quality and data profilingData quality and data profiling
Data quality and data profiling
 
Exploratory Data Analysis_ Uncovering Patterns in Data.pdf
Exploratory Data Analysis_ Uncovering Patterns in Data.pdfExploratory Data Analysis_ Uncovering Patterns in Data.pdf
Exploratory Data Analysis_ Uncovering Patterns in Data.pdf
 
Qiagram
QiagramQiagram
Qiagram
 
Master Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdf
Master Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdfMaster Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdf
Master Data Analyst Course in Bangalore with ProITBridge's Expert Course.pdf
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)
Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)
Of Unicorns, Yetis, and Error-Free Datasets (or what is data quality?)
 
data wrangling (1).pptx kjhiukjhknjbnkjh
data wrangling (1).pptx kjhiukjhknjbnkjhdata wrangling (1).pptx kjhiukjhknjbnkjh
data wrangling (1).pptx kjhiukjhknjbnkjh
 
Technical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdfTechnical Documentation 101 for Data Engineers.pdf
Technical Documentation 101 for Data Engineers.pdf
 
Accelerate Data Discovery
Accelerate Data Discovery   Accelerate Data Discovery
Accelerate Data Discovery
 
Data analytics Course for Beginners (1).pptx
Data analytics Course for Beginners (1).pptxData analytics Course for Beginners (1).pptx
Data analytics Course for Beginners (1).pptx
 
Data management: documentation and metadata
Data management: documentation and metadataData management: documentation and metadata
Data management: documentation and metadata
 
Introduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdfIntroduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdf
 
Data science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi PeriasamyData science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi Periasamy
 
Database
DatabaseDatabase
Database
 
Database
DatabaseDatabase
Database
 

Recently uploaded

Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 

Recently uploaded (20)

Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 

Exploratory data analysis with Python

  • 1. EXPLOLATORY DATA ANALYSIS Davis David Data Scientist at ParrotAI
  • 2. CONTENT: 1. Introduction to EDA 2. Importance of EDA 3. Data Types 4. Python Packages for EDA 5. List of Graphs 6. Practical EDA
  • 3. 1.INTRODUCTION TO EDA  Exploratory Data Analysis refers to the critical process of performing initial investigations on data so as to discover patterns, to spot anomalies, to test hypothesis and to check assumptions with the help of summary statistics and graphical representations. It is a good practice to understand the data first and try to gather as many insights from it.
  • 4. 2. IMPORTANCE OF EDA Identifying the most important variables/features in your dataset. Testing a hypothesis or checking assumptions related to the dataset. To check the quality of data for further processing and cleaning. Deliver data-driven insights to business stakeholders. Verify expected relationships actually exist in the data. To find unexpected structure or insights in the data.
  • 5.
  • 6.
  • 7.
  • 8. Two Categories of Data  Structured Data types Example: csv file, excel file, database file  Unstructured Data types Examples: Images, videos, audio,
  • 10. Structured Data Types Categorical - This is any data that isn’t a number.  Ordinal - have a set of order e.g. rating happiness on a scale of 1-10.  Binary - have only two values .e.g. Male or Female  Nominal - no set of order e.g. Countries Numerical – Data inform of numbers  Continuous - numbers that don’t have a logical end to them e.g heights  Discrete - have a logical end to them e.g. days in the month