Introduction to
Basic Statistical Concepts
Statistics is a branch of mathematics that deals with the collection,
organization, analysis, interpretation, and presentation of data. It is used in
various fields such as business, economics, sociology, and more.
Understanding statistical concepts is essential for making informed decisions
and drawing meaningful conclusions.
Descriptive Statistics
Mean, Median, Mode
Descriptive statistics are methods used
to summarize and describe data. They include
measures of central tendency such as
mean, median, and mode.
Variability Measures
Descriptive statistics also include measures
of variability, which provide insights into the
spread and dispersion of data.
Measures of Central Tendency
1 Mean
The mean is the average of a set of numbers and is calculated by summing all the
numbers and then dividing by the count of numbers.
2 Median
The median is the middle value when the numbers are arranged in ascending order. It
represents the central tendency of the data.
3 Mode
The mode is the value that appears most frequently in a set of data. It indicates the
most common observation.
Measures of Variability
1 Range
The range is the difference between
the highest and lowest values in a
dataset. It provides a simple measure
of variability.
2 Variance
Variance is the average of the
squared differences between each
point in a dataset and the mean. It
shows how much the data points are
spread out.
3 Standard Deviation
Standard deviation is a measure of the amount of variation or dispersion of a set of
values. It is the square root of the variance.
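Written out, and assuming a dataset of N values x_1, ..., x_N with mean μ (symbols introduced here for illustration, not taken from the slides), the population form of these definitions is:

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```

When the values are a sample used to estimate a population's variance, the sum is commonly divided by N - 1 instead of N.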
Inferential Statistics
Definition
Inferential statistics involves using data from a
sample to make predictions or inferences about
a population.
Applications
It is used to estimate how likely an
outcome is, or how precisely a population
value can be estimated from a sample.
Hypothesis Testing
Formulate Hypothesis
The first step involves
stating a clear hypothesis
that you want to test based
on existing knowledge or
observations.
Collect Data
After formulating the
hypothesis, data is collected
to test and analyze the
validity of the hypothesis.
Analyze Results
The results are statistically
analyzed to decide whether
to reject or fail to reject
the null hypothesis.
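As one possible illustration of the three steps, here is a minimal sketch of a two-sided one-sample z-test using only the Python standard library. The measurements, the hypothesized mean `mu0`, the known standard deviation `sigma`, and the significance level `alpha` are all assumed values chosen for the example; the helper name `z_test_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0, assuming sigma is known."""
    n = len(sample)
    # Standardized distance between the sample mean and the hypothesized mean.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Probability of a statistic at least this extreme if H0 were true.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Step 1: formulate the hypothesis (H0: mean is 50). Assumed values.
mu0, sigma, alpha = 50, 3, 0.05
# Step 2: collect data (hypothetical measurements).
sample = [52, 48, 55, 51, 53, 49, 54, 50, 56, 52]
# Step 3: analyze the results.
z, p = z_test_p_value(sample, mu0, sigma)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p = {p:.4f}: {decision}")
```

A z-test is only one of many tests; when the population standard deviation is unknown, a t-test is the more usual choice.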
Confidence Intervals
1 Upper Bound
The upper bound of a confidence
interval is the high end of the
interval: the largest parameter value
still considered plausible at the
chosen confidence level. It is not a
hard maximum.
2 Lower Bound
The lower bound of a confidence
interval is the low end of the
interval: the smallest parameter value
still considered plausible at the
chosen confidence level. It is not a
hard minimum.
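To make the two bounds concrete, the following is a minimal sketch of a normal-approximation confidence interval for a mean, using only the standard library. The sample values and the helper name `mean_confidence_interval` are hypothetical, and the 95% level is an assumed choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    center = mean(sample)
    # Standard error of the mean, from the sample standard deviation.
    std_err = stdev(sample) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * std_err
    return center - margin, center + margin


sample = [52, 48, 55, 51, 53, 49, 54, 50, 56, 52]  # hypothetical data
lower, upper = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

For small samples like this one, a t-distribution quantile would give a somewhat wider interval; the normal quantile is used here only to keep the sketch dependency-free.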
Types of Data
1 Nominal Data
Nominal data represents categories
without any order or sequence.
Examples include gender, colors, and
names.
2 Ordinal Data
Ordinal data represents categories
with a specific order or rank.
Examples include education levels
and survey ratings.
Sampling Methods
Simple Random Sampling
Every member of the population has an
equal chance of being selected.
Stratified Sampling
The population is divided into subgroups,
and samples are then randomly selected
from each subgroup.
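The two methods can be sketched with Python's `random` module. The population of (region, id) pairs, the 10% sampling fraction, and the helper names are assumptions made for illustration.

```python
import random
from collections import defaultdict


def simple_random_sample(population, k, rng):
    """Draw k distinct members; every member is equally likely to be chosen."""
    return rng.sample(population, k)


def stratified_sample(population, key, fraction, rng):
    """Group members by key(item), then sample the same fraction of each group."""
    strata = defaultdict(list)
    for item in population:
        strata[key(item)].append(item)
    sample = []
    for members in strata.values():
        # Take at least one member from every subgroup.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Hypothetical population: 80 urban and 20 rural members.
population = [("urban", i) for i in range(80)] + [("rural", i) for i in range(20)]

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, key=lambda p: p[0], fraction=0.1, rng=rng)
print(len(srs), sum(1 for region, _ in strat if region == "rural"))
```

A simple random sample of 10 may contain few or no rural members by chance, whereas the stratified sample always includes each subgroup in proportion to its size.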
Common Statistical Distributions
Normal Distribution
The bell-shaped curve
represents a symmetrical
distribution with most values
clustered around the mean.
Binomial Distribution
It represents the number of
successes in a fixed number
of independent trials with the
same probability of success in
each trial.
Poisson Distribution
It models the number of
independent events occurring in a
fixed interval of time or space,
given a constant average rate.
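A short sketch of how probabilities under these three distributions can be computed with the standard library. The specific parameters (a standard normal, 10 fair coin flips, an average rate of 4 events) are assumed examples, and the helper names are hypothetical.

```python
from math import comb, exp, factorial
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k events in an interval whose average event count is lam)."""
    return lam**k * exp(-lam) / factorial(k)


standard_normal = NormalDist(mu=0, sigma=1)
# Share of values within one standard deviation of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"Normal:   P(-1 <= Z <= 1)           = {within_one_sd:.4f}")
print(f"Binomial: P(3 heads in 10 flips)    = {binomial_pmf(3, 10, 0.5):.4f}")
print(f"Poisson:  P(2 events | mean rate 4) = {poisson_pmf(2, 4):.4f}")
```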
Introduction to
Descriptive Statistics
Using Python
Descriptive statistics is a branch of statistics that involves the collection,
analysis, interpretation, and presentation of data. Its primary focus is on
summarizing and describing the main features of a dataset, providing a
comprehensive and meaningful overview.
Purpose and Goals
Summarization:
Condensing large amounts of data
into key insights.
Exploration:
Identifying patterns, trends, and
outliers within the data.
Communication:
Presenting findings in a clear and understandable manner to
facilitate decision-making.
Types of Descriptive Statistics
Measures of Central Tendency
Provide a central or typical value
in a dataset.
Common measures include:
• Mean: Average of all values.
• Median: Middle value in a
sorted dataset.
• Mode: Most frequently
occurring value.
Measures of Dispersion
Indicate the spread or variability of
the data.
Common measures include:
• Range: Difference between the
maximum and minimum values.
• Variance: Average of the
squared differences from the
mean.
• Standard Deviation: Square
root of the variance.
Measures of Shape
Describe the distribution or shape of
the data.
Common measures include:
• Skewness: Indicates the
asymmetry of the data distribution.
• Kurtosis: Measures the
"tailedness" or sharpness of the
data distribution.
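The central tendency and dispersion measures have direct equivalents in Python's `statistics` module, but the shape measures do not, so here is a minimal sketch of moment-based skewness and kurtosis. The helper names and the right-skewed dataset are hypothetical; other definitions exist (for example bias-corrected or "excess" kurtosis), so library results may differ.

```python
from statistics import fmean


def central_moment(data, order):
    """Average of (x - mean) ** order over the dataset."""
    m = fmean(data)
    return fmean([(x - m) ** order for x in data])


def skewness(data):
    """Third standardized moment: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Fourth standardized moment (not excess kurtosis); larger means heavier tails."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


symmetric = [10, 15, 20, 25, 30]
right_skewed = [1, 2, 2, 3, 12]  # hypothetical data with one large value

for name, data in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    print(f"{name}: skewness = {skewness(data):.3f}, kurtosis = {kurtosis(data):.3f}")
```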
Measures of Central Tendency
Mean
The mean is the average of a set of numbers, calculated by adding all the numbers together and then dividing by the count of numbers.
Consider the following dataset:
[10, 15, 20, 25, 30]
• (10 + 15 + 20 + 25 + 30) / 5 = 20
Measures of Central Tendency
Median
The median is the middle value of a data set when it is ordered from least to greatest. It represents the 50th
percentile of the data.
Consider the following dataset:
[10, 15, 20, 25, 30]
• The middle (third) value of the sorted data is 20, the same as the mean.
Measures of Central Tendency
Mode
The mode is the value that appears most frequently in a given data set. It's the most common observation in the
data.
Consider the following dataset:
[10, 15, 20, 25, 30]
• Every value appears exactly once, so there is no mode in this case.
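The three worked examples can be reproduced with Python's standard `statistics` module. This is a small sketch; the helper `modes_or_none` is hypothetical and encodes the convention that a dataset in which no value repeats has no mode.

```python
from collections import Counter
from statistics import mean, median


def modes_or_none(data):
    """Return the most frequent value(s), or an empty list if no value repeats."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: treat as "no mode"
    return [value for value, count in counts.items() if count == top]


data = [10, 15, 20, 25, 30]
data_mean = mean(data)
data_median = median(data)
data_modes = modes_or_none(data)

print("mean:", data_mean)
print("median:", data_median)
print("modes:", data_modes if data_modes else "no mode")
```

Note that `statistics.multimode` follows a different convention: when every value occurs once, it returns all of them rather than an empty list.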
Measures of Dispersion
Range
The range is the difference
between the largest and the
smallest values within a
dataset. It provides a simple
measure of variability.
Variance
Variance is the average of the
squared differences between each
point in a dataset and the mean. It
indicates the spread of the
data.
Standard Deviation
The standard deviation is a
measure of the amount of
variation or dispersion of a set
of values. It is the square root
of the variance.
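As a quick sketch, these three measures can be computed with the standard library for the dataset [10, 15, 20, 25, 30] used in the earlier slides. Both the population form (divide by N) and the sample form (divide by N - 1) are shown, since the slides do not say which one is intended.

```python
from statistics import pstdev, pvariance, stdev, variance

data = [10, 15, 20, 25, 30]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Population variance and standard deviation (divide by N).
pop_var = pvariance(data)
pop_sd = pstdev(data)

# Sample variance and standard deviation (divide by N - 1).
sample_var = variance(data)
sample_sd = stdev(data)

print(f"range = {data_range}")
print(f"population: variance = {pop_var:.2f}, sd = {pop_sd:.2f}")
print(f"sample:     variance = {sample_var:.2f}, sd = {sample_sd:.2f}")
```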
Measures of Dispersion
Example:
Indicate the spread or variability of the data.
Consider two datasets:
• Dataset A: [5,5,5,5,5]
• Dataset B: [0,10,5,10,0]
• Both datasets have the same mean (5), but Dataset B has higher dispersion.
Measures of Shape
Example:
Describe the distribution or shape of the data.
Consider two datasets:
• Dataset C: [10,15,20,25,30]
• Dataset D: [10,10,20,30,30]
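• Both datasets have the same mean and median (20), and both are symmetric about 20, so neither is skewed.
They differ in shape: Dataset C is evenly spread, while Dataset D piles its values at the extremes (10 and 30),
which gives it two modes and a lower kurtosis.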
Interquartile Range (IQR)
1 Definition
The interquartile range (IQR) is a measure of statistical dispersion, or how
spread out the values in a dataset are.
2 Calculation
It is calculated as the difference between the third quartile (Q3) and the first
quartile (Q1) in a dataset.
Interquartile Range (IQR)
IQR = Q3 – Q1
• Interquartile range is the amount of spread in the middle 50% of a dataset.
• In other words, it is the distance between the first quartile (Q1) and the third quartile (Q3).
How to Find IQR?
Here's how to find the IQR:
Step 1: Put the data in order from least to greatest.
Step 2: Find the median. If the number of data points is odd, the median is the middle data point. If the number of data points is
even, the median is the average of the middle two data points.
Step 3: Find the first quartile (Q1). The first quartile is the median of the data points to the left of the median in the ordered list.
Step 4: Find the third quartile (Q3). The third quartile is the median of the data points to the right of the median in the ordered
list.
Step 5: Calculate the IQR by subtracting Q1 from Q3.
Find the IQR of these scores:
1,3,3,3,4,4,4,6,6
Step 1: The data is already in order.
Step 2: Find the median. There are 9 scores, so the median is the middle score.
The median is 4.
Step 3: Find Q1, which is the median of the data to the left of the median.
There is an even number of data points to the left of the median, so we need the average of
the middle two data points.
1,3,3,3
Q1 = (3+3)/2 = 3
The first Quartile (Q1) is 3.
Step 4: Find Q3, which is the median of the data to the right of the median.
There is an even number of data points to the right of the median, so we need the average of
the middle two data points.
4,4,6,6
Q3 = (4+6)/2 = 5
The Third Quartile (Q3) is 5.
Step 5: Calculate the IQR.
IQR = Q3 - Q1
= 5 – 3
= 2
The IQR is 2 points.
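The five steps can be sketched in Python as follows. The function name `iqr_median_of_halves` is hypothetical, and it implements only the "median of each half" convention described above (excluding the median itself when the number of data points is odd).

```python
from statistics import median


def iqr_median_of_halves(values):
    """Return (Q1, Q3, IQR) using the median-of-halves method; needs 2+ values."""
    ordered = sorted(values)                  # Step 1: order the data
    n = len(ordered)
    half = n // 2                             # Step 2: locate the median position
    lower = ordered[:half]                    # points to the left of the median
    # For odd n, skip the median itself; for even n, the upper half starts at half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q1 = median(lower)                        # Step 3: first quartile
    q3 = median(upper)                        # Step 4: third quartile
    return q1, q3, q3 - q1                    # Step 5: IQR = Q3 - Q1


scores = [1, 3, 3, 3, 4, 4, 4, 6, 6]
q1, q3, iqr = iqr_median_of_halves(scores)
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

Other quartile conventions exist. Library functions such as `statistics.quantiles` interpolate between data points and may return slightly different quartiles for the same data.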

Basic Statistical Concepts in Machine Learning.pptx
