2. MEASURE OF VARIABILITY
Descriptive Statistics: Statistical Analysis
• A measure of variability is a summary statistic that represents the amount of dispersion in a dataset.
• The terms variability, spread, and dispersion are synonyms; all refer to how spread out a distribution is.
4. Descriptive Statistics: Statistical Analysis
MEASURE OF CENTRAL TENDENCY
• Indicates the approximate center of a distribution
• Tells you where most of your points lie
MEASURE OF VARIABILITY
• Describes the spread of the data
• Determines how well you can generalize results from the sample to your population
5. Why does variability matter?
• Variability matters because the amount of variability determines how well
you can generalize results from the sample to your population.
• Low dispersion indicates that the data points tend to be clustered tightly
around the center. Low variability is ideal because it means that you can
better predict information about the population based on sample data.
• High dispersion signifies that the data points tend to fall further from the
center. High variability means that the values are less consistent, so it's
harder to make predictions.
• Data sets can have the same central tendency but different levels of
variability or vice versa. If you know only the central tendency or the
variability, you can’t say anything about the other aspect. Both of them
together give you a complete picture of your data.
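As a minimal sketch of this point, the snippet below compares two small datasets that share the same mean but have very different spreads. The values in `tight` and `wide` are hypothetical and chosen only for illustration; they do not come from the slides.

```python
# Hypothetical data: two small datasets with the same center but different spread.
tight = [9, 10, 11]
wide = [0, 10, 20]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)


# Both means are 10, but the ranges differ by a factor of ten.
print(mean(tight), value_range(tight))
print(mean(wide), value_range(wide))
```

Knowing only that the mean is 10 says nothing about which of the two datasets you have; the measure of spread supplies the missing half of the picture.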
7. RANGE
• The range tells you the spread of your data from the lowest to the highest value in the distribution.
• The range (R) of a dataset is the difference between the highest value (HV) and the lowest value (LV) in the dataset.
R = HV – LV
8. RANGE
Sample: Find and compare the range of each dataset.
DATASET 1 DATASET 2
26 23
33 46
21 39
38 11
25 25
20 16
29 32
22 52
34 19
• For Dataset 1
• Highest Value (HV) = 38
• Lowest Value (LV) = 20
R = HV – LV
R = 38 – 20
R = 18
• For Dataset 2
• Highest Value (HV) = 52
• Lowest Value (LV) = 11
R = HV – LV
R = 52 – 11
R = 41
• Dataset 2 has a broader range and, hence,
more variability than dataset 1.
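The same comparison can be sketched in Python. The helper name `value_range` is an illustrative choice, not something defined in the slides; the data are the two datasets from the table above.

```python
dataset_1 = [26, 33, 21, 38, 25, 20, 29, 22, 34]
dataset_2 = [23, 46, 39, 11, 25, 16, 32, 52, 19]


def value_range(values):
    """Range R = highest value (HV) - lowest value (LV)."""
    return max(values) - min(values)


range_1 = value_range(dataset_1)  # 38 - 20
range_2 = value_range(dataset_2)  # 52 - 11
print(range_1)  # 18
print(range_2)  # 41
```

Because the range uses only the two extreme values, a single unusual observation can change it substantially.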
9. INTERQUARTILE RANGE
• The interquartile range is the range of the middle half of a distribution or data set.
• The interquartile range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1) of the data set.
IQR = Q3 – Q1
10. INTERQUARTILE RANGE
Sample: Find the interquartile range of the given dataset.
2, 3, 5, 7, 11, 13, 17, 19, 23, 29
Solution:
Number of values (n) = 10
Median (M) = (11 + 13) / 2 = 12 = Q2
11. INTERQUARTILE RANGE
Now split the data into two parts: the lower half to find Q1 and the upper half to find Q3.
2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First Part: 2, 3, 5, 7, 11
Q1 = 5
Second Part: 13, 17, 19, 23, 29
Q3 = 19
IQR = Q3 – Q1
IQR = 19 – 5
IQR = 14
12. Interquartile Range (IQR)
• Given: The first ten prime numbers are:
• 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
• This is already in increasing order.
• Here the number of values = 10
• 10 is an even number. Therefore, the median is the mean of the two middle values, 11 and 13.
• That is Q2 = (11 + 13)/2 = 24/2 = 12.
• Now we have to get two parts i.e. lower half to find Q1 and the upper half to find Q3.
• Q1 part: 2, 3, 5, 7, 11
• Here the number of values = 5
• 5 is an odd number. Therefore, Q1 is the middle (third) value, that is Q1 = 5
• Q3 part: 13, 17, 19, 23, 29
• Here the number of values = 5
• 5 is an odd number. Therefore, Q3 is the middle (third) value, that is Q3 = 19
• Subtracting Q1 from Q3 gives 19 – 5 = 14
• Therefore, 14 is the interquartile range value.
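The median-of-halves procedure walked through above can be sketched as follows. The function names are illustrative. The slides only show an even number of values; leaving the middle value out of both halves when the count is odd is an assumption, and other quartile conventions (including those used by some libraries) can give slightly different results.

```python
def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def iqr_by_halves(values):
    """Return (Q1, Q3, IQR) using the median of the lower and upper halves."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # lower half, used for Q1
    upper = data[-half:]  # upper half, used for Q3 (assumption: middle value dropped when n is odd)
    q1 = median(lower)
    q3 = median(upper)
    return q1, q3, q3 - q1


primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
q1, q3, iqr = iqr_by_halves(primes)
print(q1, q3, iqr)  # 5 19 14
```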
13. Standard Deviation
• The standard deviation is the average amount of variability in your
dataset.
• It tells you, on average, how far each score lies from the mean. The
larger the standard deviation, the more variable the data set is.
Variance
• In statistics, variance measures variability from the average or mean.
• Variance is the average squared difference of the values from the mean
• Because the calculations use the squared differences, the variance is in squared units rather than the original units of the data.
• While higher values of the variance indicate greater variability, there is no intuitive interpretation for specific values. Despite this limitation, various statistical tests use the variance in their calculations.
14. Standard Deviation
• The standard deviation is the average amount of variability in your dataset.
• It tells you, on average, how far each score lies from the mean.
Note: The larger the standard deviation, the more variable the data set is.
15. Variance
• In statistics, variance measures variability from the average or mean.
• Variance is the average squared difference of the values from the mean.
Note: Higher values of variance indicate greater variability in a data set.
17. Standard Deviation and Variance
Sample: Find the standard deviation and variance of the following scores on an exam.
92, 95, 85, 80, 75, 50
Solution:
μ = (92 + 95 + 85 + 80 + 75 + 50) / 6 = 477 / 6 = 79.5
The value of the mean is 79.5.
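19. Standard Deviation and Variance
The sum of the squared deviations from the mean is equal to 1317.50.
Variance
σ² = (sum of the squares) / (number of data values, n)
σ² = 1317.50 / 6
σ² ≈ 219.58
Standard Deviation
σ = √219.58
σ ≈ 14.82
Therefore, treating the six scores as the entire population, the variance is approx. 219.58 sq. units, and the standard deviation is approx. 14.82.
Note: Dividing by n – 1 = 5 instead, as in the sample formula, gives a variance of 1317.50 / 5 = 263.5 sq. units and a standard deviation of √263.5 ≈ 16.2.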
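The arithmetic for the exam scores can be checked with a short sketch. It computes the sum of squared deviations once and then divides by both n and n – 1, since the two divisors give different results; the variable names are illustrative.

```python
import math

scores = [92, 95, 85, 80, 75, 50]
n = len(scores)

mean = sum(scores) / n                                  # 79.5
sum_of_squares = sum((x - mean) ** 2 for x in scores)   # 1317.5

# Population formulas divide by n.
population_variance = sum_of_squares / n
population_sd = math.sqrt(population_variance)

# Sample formulas divide by n - 1.
sample_variance = sum_of_squares / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(round(population_variance, 2), round(population_sd, 2))  # 219.58 14.82
print(round(sample_variance, 2), round(sample_sd, 2))          # 263.5 16.23
```

The values 263.5 and 16.2 correspond to the n – 1 divisor, while dividing by n = 6 gives roughly 219.58 and 14.82.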
21. Standard deviation formula for populations
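• If you have data from the entire population, use the population
standard deviation formula:
σ = √( Σ(xi – μ)² / N )
• Here xi is each value, μ is the population mean, and N is the number of values in the population.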
22. Standard deviation formula for populations
• Example: Population standard deviation
• Four friends were comparing their scores on a recent essay.
• Calculate the standard deviation of their scores:
6, 2, 3, 1.
• Solution:
• MEAN = (6 + 2 + 3 + 1) / 4 = 12 / 4 = 3
• The mean is 3 points.
23. Standard deviation formula for populations
Score (xi)    Deviation (xi – mean)    Squared Deviation (xi – mean)²
6             6 – 3 = 3                9
2             2 – 3 = –1               1
3             3 – 3 = 0                0
1             1 – 3 = –2               4
Sum                                    14
• Sqrt of (14 / 4) = approx. 1.87
• The population standard deviation is approximately 1.87.
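A minimal sketch of the population calculation for the four essay scores follows. The helper `population_sd` is an illustrative name; `statistics.pstdev` from the Python standard library is used only as a cross-check.

```python
import math
import statistics


def population_sd(values):
    """Population standard deviation: divide the sum of squared deviations by N."""
    n = len(values)
    mu = sum(values) / n
    squared_deviations = [(x - mu) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)


essay_scores = [6, 2, 3, 1]
sigma = population_sd(essay_scores)
print(round(sigma, 2))                              # 1.87
print(round(statistics.pstdev(essay_scores), 2))    # 1.87, cross-check
```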
24. Standard deviation formula for samples
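• If you have data from a sample, use the sample standard deviation
formula:
s = √( Σ(xi – x̄)² / (n – 1) )
• Here xi is each value, x̄ is the sample mean, and n is the number of values in the sample.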
25. Standard deviation formula for samples
• Example: Sample standard deviation
• A sample of 4 students was taken to see how many pencils they were
carrying.
• Calculate the sample standard deviation of their responses:
2, 2, 5, 7
Solution:
MEAN=(2+2+5+7)/4 = 4
The sample mean is 4 pencils
26. Standard deviation formula for samples
Pencils (xi)    Deviation (xi – mean)    Squared Deviation (xi – mean)²
2               2 – 4 = –2               4
2               2 – 4 = –2               4
5               5 – 4 = 1                1
7               7 – 4 = 3                9
Sum                                      18
• Sqrt of (18 / (4 – 1)) = approx. 2.45
• The sample standard deviation is approximately 2.45.
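The sample calculation for the pencil counts can be sketched the same way, this time dividing by n – 1. The helper `sample_sd` is an illustrative name; `statistics.stdev` from the standard library serves as a cross-check.

```python
import math
import statistics


def sample_sd(values):
    """Sample standard deviation: divide the sum of squared deviations by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


pencils = [2, 2, 5, 7]
s = sample_sd(pencils)
print(round(s, 2))                           # 2.45
print(round(statistics.stdev(pencils), 2))   # 2.45, cross-check
```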
27. Population variance
The formula for the variance of an entire population is the following:
σ² = Σ(xi – μ)² / N
Sample variance
To use a sample to estimate the variance for a population, use the following formula:
s² = Σ(xi – x̄)² / (n – 1)
Using the population equation with sample data tends to underestimate the variability. Because it's usually impossible to measure an entire population, statisticians use the equation for sample variances much more frequently.
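The underestimation can be illustrated with a small exhaustive check rather than a proof. The sketch below treats the four essay scores from the earlier example as a complete population, enumerates every ordered sample of size 2 drawn with replacement, and averages the two variance formulas over all of them. The sampling scheme and the function name `variance` are assumptions made for illustration.

```python
from itertools import product

population = [6, 2, 3, 1]


def variance(values, divisor_offset):
    """Sum of squared deviations divided by (n - divisor_offset)."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - divisor_offset)


population_variance = variance(population, 0)  # divide by N

# All 16 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

avg_divide_by_n = sum(variance(s, 0) for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)

print(population_variance)        # 3.5
print(avg_divide_by_n)            # 1.75
print(avg_divide_by_n_minus_1)    # 3.5
```

On average, dividing by n yields only half of the true variance here, while dividing by n – 1 recovers it.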
29. Measures of Variability
• A high school teacher at a small private school assigns trigonometry practice problems to be worked online. Students must use a password to access the problems, and the log-in and log-off times are automatically recorded for the teacher. At the end of the week, the teacher examines the amount of time each student spent working the assigned problems. The data are provided below in minutes.
• Find the range, standard deviation, and variance for the data below.
• What does this information tell you about the variability of students' length of time on the computer solving trigonometry problems? Is it homogeneous or heterogeneous?
X      X²
49     2401
48     2304
43     1849
39     1521
34     1156
33     1089
28     784
27     729
25     625
25     625
22     484
22     484
22     484
20     400
15     225
ΣX = 452     ΣX² = 15160
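One way to work the exercise is sketched below, using the column totals ΣX and ΣX² with the computational form of the variance, (ΣX² – (ΣX)² / n) / (n – 1). Treating the 15 students as a sample (dividing by n – 1) is an assumption; it is the choice that reproduces the standard deviation of approximately 10.5 minutes quoted in the discussion of this exercise.

```python
import math

minutes = [49, 48, 43, 39, 34, 33, 28, 27, 25, 25, 22, 22, 22, 20, 15]
n = len(minutes)

sum_x = sum(minutes)                    # total of the X column
sum_x2 = sum(x * x for x in minutes)    # total of the X² column

value_range = max(minutes) - min(minutes)
mean = sum_x / n

# Computational formula for the sample variance (assumption: n - 1 divisor).
sample_variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(value_range)                  # 34
print(round(mean, 1))               # 30.1
print(round(sample_variance, 2))    # 109.98
print(round(sample_sd, 1))          # 10.5
```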
31. • What does this information tell you about the variability of
students' length of time on the computer solving trigonometry
problems?
• It depends on how you look at the question. Relative to the mean of approximately 30 minutes, a standard deviation of approximately 10.5 minutes seems rather large. But in the broader context, a standard deviation of approximately 10.5 minutes of study time is not all that big.