SlideShare a Scribd company logo
1 of 19
0
AI/ML APPLICATIONS
Exploratory Data Analysis
Prof. Rahul Borate
Table of Contents
• Understand the ML best practice and project roadmap
• Identify the data source(s) and Data Collection
• Machine Learning process
• Exploratory Data Analysis(EDA)
1
Understand the ML best practice and project roadmap
• When a customer wants to
implement ML(Machine
Learning) for the identified
business problem(s) after
multiple discussions along
with the following
stakeholders from both sides
– Business, Architect,
Infrastructure, Operations,
and others.
2
Identify the data source(s) and Data Collection
• Organization’s key
application(s) – it would be
Internal or External
application or web-sites
• It would be streaming data
from the web
(Twitter/Facebook – any
Social media)
• Once you’re comfortable
with the available data, you
can start work on the rest of
the Machine Learning
process model.
3
Machine Learning process
4
Machine Learning process
In the data preparation, EDA gets most of the effort and unavoidable
steps
5
What is Exploratory Data
Analysis
6
EDA is an approach for data analysis using variety
of techniques to gain insights about the data.
• Cleaning and preprocessing
• Statistical Analysis
• Visualization for trend analysis,
anomaly detection, outlier
detection (and removal).
Basic steps in any
exploratory data
analysis:
Importance of EDA
Understanding the given dataset and helps clean up the given dataset.
It gives you a clear picture of the features and the relationships between them.
Discover errors, outliers, and missing values in the data.
Identify patterns by visualizing data in graphs such as bar graphs, scatter plots,
heatmaps and histograms.
7
EDA using Pandas
Import data into workplace(Jupyter notebook, Google colab, Python IDE)
Descriptive statistics
Removal of nulls
Visualization
8
1. Packages and data import
• Step 1 : Import pandas to the workplace.
• “Import pandas”
• Step 2 : Read data/dataset into Pandas dataframe. Different input
formats include:
• Excel : read_excel
• CSV: read_csv
• JSON: read_json
• HTML and many more
9
2. Descriptive
Stats (Pandas)
• Used to make preliminary assessments about the population
distribution of the variable.
• Commonly used statistics:
1. Central tendency :
• Mean – The average value of all the data points. :
dataframe.mean()
• Median – The middle value when all the data points are
put in an ordered list: dataframe.median()
• Mode – The data point which occurs the most in the
dataset :dataframe.mode()
2. Spread : It is the measure of how far the datapoints are away
from the mean or median
• Variance - The variance is the mean of the squares of the
individual deviations: dataframe.var()
• Standard deviation - The standard deviation is the square
root of the variance:dataframe.std()
3. Skewness: It is a measure of asymmetry: dataframe.skew()
Descriptive
Stats (contd.)
Other methods to get a quick look on the data:
• Describe() : Summarizes the central tendency,
dispersion and shape of a dataset’s distribution,
excluding NaN values.
• Syntax: pandas.dataframe.describe()
• Info() :Prints a concise summary of the
dataframe. This method prints information
about a dataframe including the index dtype
and columns, non-null values and memory
usage.
• Syntax: pandas.dataframe.info()
3. Null values
12
Detecting
Detecting Null-
values:
•Isnull(): It is used as an
alias for dataframe.isna().
This function returns the
dataframe with boolean
values indicating missing
values.
•Syntax :
dataframe.isnull()
Handling
Handling null values:
•Dropping the rows with
null values: dropna()
function is used to delete
rows or columns with
null values.
•Replacing missing values:
fillna() function can fill
the missing values with a
special value value like
mean or median.
4. Visualization
• Univariate: Looking at one variable/column at a time
• Bar-graph
• Histograms
• Boxplot
• Multivariate : Looking at relationship between two or more variables
• Scatter plots
• Pie plots
• Heatmaps(seaborn)
13
Bar-Graph,Histogram
and Boxplot
• Bar graph: A bar plot is a plot that presents
data with rectangular bars with lengths
proportional to the values that they represent.
• Boxplot : Depicts numerical data graphically
through their quartiles. The box extends from
the Q1 to Q3 quartile values of the data, with
a line at the median (Q2).
• Histogram: A histogram is a representation of
the distribution of data.
14
Scatterplot, Pieplot
• Scatterplot : Shows the data as a collection of points.
• Syntax: dataframe.plot.scatter(x = 'x_column_name', y = 'y_columnn_name’)
• Pie plot : Proportional representation of the numerical data in a column.
• Syntax: dataframe.plot.pie(y=‘column_name’)
15
Outlier detection
• An outlier is a point or set of data points that lie away from the rest of
the data values of the dataset..
• Outliers are easily identified by visualizing the data.
• For e.g.
• In a boxplot, the data points which lie outside the upper and lower bound can
be considered as outliers
• In a scatterplot, the data points which lie outside the groups of datapoints can
be considered as outliers
16
Outlier removal
• Calculate the IQR as follows:
Calculate the first and third quartile (Q1 and Q3)
Calculate the interquartile range, IQR = Q3-Q1
Find the lower bound which is Q1*1.5
Find the upper bound which is Q3*1.5
Replace the data points which lie outside this range.
They can be replaced by mean or median.
17
Hope now you have some idea, let’s implement all these using
the Automobile – Predictive Analysis dataset.
18
Hands on Demonstration

More Related Content

Similar to EDA.pptx

DFDs_and_Algorithms.pptx
DFDs_and_Algorithms.pptxDFDs_and_Algorithms.pptx
DFDs_and_Algorithms.pptx
AliyahAli19
 
ds 1 Introduction to Data Structures.ppt
ds 1 Introduction to Data Structures.pptds 1 Introduction to Data Structures.ppt
ds 1 Introduction to Data Structures.ppt
AlliVinay1
 
Excel and research
Excel and researchExcel and research
Excel and research
Nursing Path
 

Similar to EDA.pptx (20)

1.1 introduction to Data Structures.ppt
1.1 introduction to Data Structures.ppt1.1 introduction to Data Structures.ppt
1.1 introduction to Data Structures.ppt
 
EDA
EDAEDA
EDA
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data Science
 
ch2 DS.pptx
ch2 DS.pptxch2 DS.pptx
ch2 DS.pptx
 
Unit 2 - Data Manipulation with R.pptx
Unit 2 - Data Manipulation with R.pptxUnit 2 - Data Manipulation with R.pptx
Unit 2 - Data Manipulation with R.pptx
 
DFDs_and_Algorithms.pptx
DFDs_and_Algorithms.pptxDFDs_and_Algorithms.pptx
DFDs_and_Algorithms.pptx
 
M. FLORENCE DAYANA/DATABASE MANAGEMENT SYSYTEM
M. FLORENCE DAYANA/DATABASE MANAGEMENT SYSYTEMM. FLORENCE DAYANA/DATABASE MANAGEMENT SYSYTEM
M. FLORENCE DAYANA/DATABASE MANAGEMENT SYSYTEM
 
SQLBits Module 2 RStats Introduction to R and Statistics
SQLBits Module 2 RStats Introduction to R and StatisticsSQLBits Module 2 RStats Introduction to R and Statistics
SQLBits Module 2 RStats Introduction to R and Statistics
 
Data Science 1.pdf
Data Science 1.pdfData Science 1.pdf
Data Science 1.pdf
 
Exploratory Data Analysis (EDA) .pptx
Exploratory Data Analysis (EDA) .pptxExploratory Data Analysis (EDA) .pptx
Exploratory Data Analysis (EDA) .pptx
 
ANALYSIS OF DATA (2).pptx
ANALYSIS OF DATA (2).pptxANALYSIS OF DATA (2).pptx
ANALYSIS OF DATA (2).pptx
 
ds 1 Introduction to Data Structures.ppt
ds 1 Introduction to Data Structures.pptds 1 Introduction to Data Structures.ppt
ds 1 Introduction to Data Structures.ppt
 
Data science guide
Data science guideData science guide
Data science guide
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL Server
 
II B.Sc IT DATA STRUCTURES.pptx
II B.Sc IT DATA STRUCTURES.pptxII B.Sc IT DATA STRUCTURES.pptx
II B.Sc IT DATA STRUCTURES.pptx
 
Data visualization
Data visualizationData visualization
Data visualization
 
Lecture2 (1).ppt
Lecture2 (1).pptLecture2 (1).ppt
Lecture2 (1).ppt
 
Excel and research
Excel and researchExcel and research
Excel and research
 
Unit 2_ Descriptive Analytics for MBA .pptx
Unit 2_ Descriptive Analytics for MBA .pptxUnit 2_ Descriptive Analytics for MBA .pptx
Unit 2_ Descriptive Analytics for MBA .pptx
 
Data Science and Machine learning-Lect01.pdf
Data Science and Machine learning-Lect01.pdfData Science and Machine learning-Lect01.pdf
Data Science and Machine learning-Lect01.pdf
 

More from Rahul Borate

Unit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptxUnit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptx
Rahul Borate
 
Unit 3_Data Center Design in storage.pptx
Unit  3_Data Center Design in storage.pptxUnit  3_Data Center Design in storage.pptx
Unit 3_Data Center Design in storage.pptx
Rahul Borate
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
Rahul Borate
 
Practice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptxPractice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptx
Rahul Borate
 

More from Rahul Borate (20)

PigHive presentation and hive impor.pptx
PigHive presentation and hive impor.pptxPigHive presentation and hive impor.pptx
PigHive presentation and hive impor.pptx
 
Unit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptxUnit 4_Introduction to Server Farms.pptx
Unit 4_Introduction to Server Farms.pptx
 
Unit 3_Data Center Design in storage.pptx
Unit  3_Data Center Design in storage.pptxUnit  3_Data Center Design in storage.pptx
Unit 3_Data Center Design in storage.pptx
 
Fundamentals of storage Unit III Backup and Recovery.ppt
Fundamentals of storage Unit III Backup and Recovery.pptFundamentals of storage Unit III Backup and Recovery.ppt
Fundamentals of storage Unit III Backup and Recovery.ppt
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
Confusion Matrix.pptx
Confusion Matrix.pptxConfusion Matrix.pptx
Confusion Matrix.pptx
 
Unit 4 SVM and AVR.ppt
Unit 4 SVM and AVR.pptUnit 4 SVM and AVR.ppt
Unit 4 SVM and AVR.ppt
 
Unit I Fundamentals of Cloud Computing.pptx
Unit I Fundamentals of Cloud Computing.pptxUnit I Fundamentals of Cloud Computing.pptx
Unit I Fundamentals of Cloud Computing.pptx
 
Unit II Cloud Delivery Models.pptx
Unit II Cloud Delivery Models.pptxUnit II Cloud Delivery Models.pptx
Unit II Cloud Delivery Models.pptx
 
QQ Plot.pptx
QQ Plot.pptxQQ Plot.pptx
QQ Plot.pptx
 
Module III MachineLearningSparkML.pptx
Module III MachineLearningSparkML.pptxModule III MachineLearningSparkML.pptx
Module III MachineLearningSparkML.pptx
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
 
2.2 Logit and Probit.pptx
2.2 Logit and Probit.pptx2.2 Logit and Probit.pptx
2.2 Logit and Probit.pptx
 
UNIT I Streaming Data & Architectures.pptx
UNIT I Streaming Data & Architectures.pptxUNIT I Streaming Data & Architectures.pptx
UNIT I Streaming Data & Architectures.pptx
 
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptxUNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
 
Practice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptxPractice_Exercises_Files_and_Exceptions.pptx
Practice_Exercises_Files_and_Exceptions.pptx
 
Practice_Exercises_Data_Structures.pptx
Practice_Exercises_Data_Structures.pptxPractice_Exercises_Data_Structures.pptx
Practice_Exercises_Data_Structures.pptx
 
Practice_Exercises_Control_Flow.pptx
Practice_Exercises_Control_Flow.pptxPractice_Exercises_Control_Flow.pptx
Practice_Exercises_Control_Flow.pptx
 
blog creation.pdf
blog creation.pdfblog creation.pdf
blog creation.pdf
 
Chapter I.pptx
Chapter I.pptxChapter I.pptx
Chapter I.pptx
 

Recently uploaded

The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Recently uploaded (20)

Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 

EDA.pptx

  • 1. 0 AI/ML APPLICATIONS Exploratory Data Analysis Prof. Rahul Borate
  • 2. Table of Contents • Understand the ML best practice and project roadmap • Identify the data source(s) and Data Collection • Machine Learning process • Exploratory Data Analysis(EDA) 1
  • 3. Understand the ML best practice and project roadmap • When a customer wants to implement ML(Machine Learning) for the identified business problem(s) after multiple discussions along with the following stakeholders from both sides – Business, Architect, Infrastructure, Operations, and others. 2
  • 4. Identify the data source(s) and Data Collection • Organization’s key application(s) – it would be Internal or External application or web-sites • It would be streaming data from the web (Twitter/Facebook – any Social media) • Once you’re comfortable with the available data, you can start work on the rest of the Machine Learning process model. 3
  • 6. Machine Learning process In the data preparation, EDA gets most of the effort and unavoidable steps 5
  • 7. What is Exploratory Data Analysis 6 EDA is an approach for data analysis using variety of techniques to gain insights about the data. • Cleaning and preprocessing • Statistical Analysis • Visualization for trend analysis, anomaly detection, outlier detection (and removal). Basic steps in any exploratory data analysis:
  • 8. Importance of EDA Understanding the given dataset and helps clean up the given dataset. It gives you a clear picture of the features and the relationships between them. Discover errors, outliers, and missing values in the data. Identify patterns by visualizing data in graphs such as bar graphs, scatter plots, heatmaps and histograms. 7
  • 9. EDA using Pandas Import data into workplace(Jupyter notebook, Google colab, Python IDE) Descriptive statistics Removal of nulls Visualization 8
  • 10. 1. Packages and data import • Step 1 : Import pandas to the workplace. • “Import pandas” • Step 2 : Read data/dataset into Pandas dataframe. Different input formats include: • Excel : read_excel • CSV: read_csv • JSON: read_json • HTML and many more 9
  • 11. 2. Descriptive Stats (Pandas) • Used to make preliminary assessments about the population distribution of the variable. • Commonly used statistics: 1. Central tendency : • Mean – The average value of all the data points. : dataframe.mean() • Median – The middle value when all the data points are put in an ordered list: dataframe.median() • Mode – The data point which occurs the most in the dataset :dataframe.mode() 2. Spread : It is the measure of how far the datapoints are away from the mean or median • Variance - The variance is the mean of the squares of the individual deviations: dataframe.var() • Standard deviation - The standard deviation is the square root of the variance:dataframe.std() 3. Skewness: It is a measure of asymmetry: dataframe.skew()
  • 12. Descriptive Stats (contd.) Other methods to get a quick look on the data: • Describe() : Summarizes the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values. • Syntax: pandas.dataframe.describe() • Info() :Prints a concise summary of the dataframe. This method prints information about a dataframe including the index dtype and columns, non-null values and memory usage. • Syntax: pandas.dataframe.info()
  • 13. 3. Null values 12 Detecting Detecting Null- values: •Isnull(): It is used as an alias for dataframe.isna(). This function returns the dataframe with boolean values indicating missing values. •Syntax : dataframe.isnull() Handling Handling null values: •Dropping the rows with null values: dropna() function is used to delete rows or columns with null values. •Replacing missing values: fillna() function can fill the missing values with a special value value like mean or median.
  • 14. 4. Visualization • Univariate: Looking at one variable/column at a time • Bar-graph • Histograms • Boxplot • Multivariate : Looking at relationship between two or more variables • Scatter plots • Pie plots • Heatmaps(seaborn) 13
  • 15. Bar-Graph,Histogram and Boxplot • Bar graph: A bar plot is a plot that presents data with rectangular bars with lengths proportional to the values that they represent. • Boxplot : Depicts numerical data graphically through their quartiles. The box extends from the Q1 to Q3 quartile values of the data, with a line at the median (Q2). • Histogram: A histogram is a representation of the distribution of data. 14
  • 16. Scatterplot, Pieplot • Scatterplot : Shows the data as a collection of points. • Syntax: dataframe.plot.scatter(x = 'x_column_name', y = 'y_columnn_name’) • Pie plot : Proportional representation of the numerical data in a column. • Syntax: dataframe.plot.pie(y=‘column_name’) 15
  • 17. Outlier detection • An outlier is a point or set of data points that lie away from the rest of the data values of the dataset.. • Outliers are easily identified by visualizing the data. • For e.g. • In a boxplot, the data points which lie outside the upper and lower bound can be considered as outliers • In a scatterplot, the data points which lie outside the groups of datapoints can be considered as outliers 16
  • 18. Outlier removal • Calculate the IQR as follows: Calculate the first and third quartile (Q1 and Q3) Calculate the interquartile range, IQR = Q3-Q1 Find the lower bound which is Q1*1.5 Find the upper bound which is Q3*1.5 Replace the data points which lie outside this range. They can be replaced by mean or median. 17
  • 19. Hope now you have some idea, let’s implement all these using the Automobile – Predictive Analysis dataset. 18 Hands on Demonstration