Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Chapter 3: Describing, Exploring, and Comparing Data
3.2: Measures of Variation
2. Chapter 3: Describing, Exploring, and Comparing Data
3.1 Measures of Center
3.2 Measures of Variation
3.3 Measures of Relative Standing and Boxplots
Objectives:
1. Summarize data, using measures of central tendency, such as the mean, median, mode,
and midrange.
2. Describe data, using measures of variation, such as the range, variance, and standard
deviation.
3. Identify the position of a data value in a data set, using various measures of position,
such as percentiles, deciles, and quartiles.
4. Use the techniques of exploratory data analysis, including boxplots and five-number
summaries, to discover various aspects of data.
3. Key Concept: Variation is the single most important topic in statistics.
Three important measures of variation: Range, Standard Deviation, & Variance.
3.2 Measures of Variation
1. Range = Max − Min
2. Variance
3. Standard Deviation
4. Coefficient of Variation: \( \text{CVAR} = \frac{s}{\bar{x}} \cdot 100\% \), or \( \text{CV} = \frac{\sigma}{\mu} \cdot 100\% \). Use CVAR to compare variability when the units are different.
5. Chebyshev's Theorem: \( 1 - \frac{1}{k^2} \)
6. Empirical Rule (Normal)
7. Range Rule of Thumb for Understanding Standard Deviation: \( s \approx \frac{\text{Range}}{4} \) and \( \mu \pm 2\sigma \)
4. Example 1: Two brands of outdoor paint are tested to see how long each will last before fading. The results (in months) for a sample of 6 cans of each brand are shown.
a. Find the mean and range of each group.
b. Which brand would you buy?
Brand A   Brand B
10        35
60        45
50        30
30        35
40        40
20        25
Brand A: \( \bar{x} = \frac{\sum x}{n} = \frac{210}{6} = 35 \), \( R = 60 - 10 = 50 \)
Brand B: \( \bar{x} = \frac{\sum x}{n} = \frac{210}{6} = 35 \), \( R = 45 - 25 = 20 \)
The mean for both brands is the same, but the range for Brand A is much greater than the range for Brand B. Which brand would you buy?
Formulas: \( R = \text{Max} - \text{Min} \), \( \bar{x} = \frac{\sum x}{n} \), \( \mu = \frac{\sum x}{N} \), \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
Range = Maximum data value − Minimum data value
The range uses only the maximum and the minimum data values, so it is very sensitive to extreme values. Therefore, the range is not resistant: it does not take every value into account and does not truly reflect the variation among all of the data values.
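The Example 1 arithmetic can be checked with a short Python sketch. The helper names `mean` and `value_range` are illustrative choices, not part of the slides; the data are the paint results from the table above.

```python
# Paint-life data (months) from Example 1.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


def value_range(values):
    """Range: maximum data value minus minimum data value."""
    return max(values) - min(values)


for name, data in (("Brand A", brand_a), ("Brand B", brand_b)):
    print(f"{name}: mean = {mean(data)}, range = {value_range(data)}")
```

Both brands report the same mean, so only the range separates them here, which is the point of the example.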
5. Variance & Standard Deviation
The variance is the average of the squares of the distance each value is from the mean.
The standard deviation is the square root of the variance.
The standard deviation is a measure of how spread out your data are and how much data values deviate from the mean.
Notation
s = sample standard deviation
σ = population standard deviation
Usage & properties:
1. To determine the spread of the data.
2. To determine the consistency of a variable.
3. To determine the number of data values that fall within a specified interval in a distribution (Chebyshev's Theorem).
4. Used in inferential statistics.
5. The value of the standard deviation s is never negative. It is zero only when all of the data values are exactly the same.
6. Larger values of s indicate greater amounts of variation.
6. Variance & Standard Deviation
Population Variance: \( \sigma^2 = \frac{\sum (x - \mu)^2}{N} \)
Population Standard Deviation: \( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)
Sample Variance: \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)} \)
Sample Standard Deviation: \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}} \)
1. The standard deviation is affected by outliers.
2. The units of the standard deviation are the same as the units of the original data values.
3. The sample standard deviation s is a biased estimator of the population standard deviation σ, which means that values of the sample standard deviation s do not center around the value of σ.
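The two forms of the sample variance formula (the definition and the shortcut) should give the same number. The sketch below checks this on the Brand A paint data, treated here as a sample purely for illustration; the function names are assumptions of this sketch.

```python
import math


def sample_variance_definition(xs):
    """s^2 = sum((x - x_bar)^2) / (n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """s^2 = (n * sum(x^2) - (sum(x))^2) / (n * (n - 1))."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


data = [10, 60, 50, 30, 40, 20]  # Brand A values, treated as a sample here

v_def = sample_variance_definition(data)
v_short = sample_variance_shortcut(data)
print(v_def, v_short, math.sqrt(v_def))
```

The shortcut form avoids computing the mean first, which is convenient by hand, although it can lose precision in floating point when the values are large relative to their spread.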
TI Calculator: How to enter data
1. Stat
2. Edit
3. Highlight & Clear
4. Type in your data in L1, ...
TI Calculator: Mean, SD, 5-number summary
1. Stat
2. Calc
3. Select 1 for 1 variable
4. Type: L1 (second 1)
5. Scroll down for 5-number summary
7. Example 2
Given the data speeds (Mbps): 38.5, 55.6, 22.4, 14.1, 23.1.
a. Find the range of these data speeds (Mbps).
b. Find the standard deviation.
a. Range = Max − Min = 55.6 − 14.1 = 41.5 Mbps
b. \( \bar{x} = \frac{\sum x}{n} = \frac{38.5 + 55.6 + 22.4 + 14.1 + 23.1}{5} = \frac{153.7}{5} = 30.74 \) Mbps
\( s = \sqrt{\frac{(38.5 - 30.74)^2 + (55.6 - 30.74)^2 + (22.4 - 30.74)^2 + (14.1 - 30.74)^2 + (23.1 - 30.74)^2}{5 - 1}} = \sqrt{\frac{1083.0520}{4}} = 16.45 \) Mbps
Or: \( s = \sqrt{\frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}} = \sqrt{\frac{5(5807.79) - 153.7^2}{5(5 - 1)}} = \sqrt{\frac{5415.26}{20}} = 16.45 \) Mbps
Formulas: \( R = \text{Max} - \text{Min} \), \( \bar{x} = \frac{\sum x}{n} \), \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
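As a cross-check on Example 2, Python's standard `statistics` module can compute the same quantities. This is only a verification sketch; `statistics.stdev` uses the sample formula with division by n − 1.

```python
import statistics

speeds = [38.5, 55.6, 22.4, 14.1, 23.1]  # data speeds in Mbps from Example 2

data_range = max(speeds) - min(speeds)
x_bar = statistics.mean(speeds)
s = statistics.stdev(speeds)  # sample standard deviation (divides by n - 1)

print(f"range = {data_range:.1f} Mbps")
print(f"mean  = {x_bar:.2f} Mbps")
print(f"s     = {s:.2f} Mbps")
```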
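8. Example 3: Find the variance and standard deviation for the population data set for Brand A paint: 10, 60, 50, 30, 40, 20.
\( \mu = \frac{10 + 60 + 50 + 30 + 40 + 20}{6} = 35 \)
Months, X   µ    X − µ   (X − µ)²
10          35   −25     625
60          35   25      625
50          35   15      225
30          35   −5      25
40          35   5       25
20          35   −15     225
Sum                      1750
\( \sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1750}{6} = 291.7 \)
\( \sigma = \sqrt{\frac{1750}{6}} = 17.1 \) months
Formulas: \( R = \text{Max} - \text{Min} \), \( \bar{x} = \frac{\sum x}{n} \), \( \mu = \frac{\sum x}{N} \), \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \), \( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)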
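The deviation table in Example 3 can be rebuilt row by row. The sketch below prints each column and then divides the sum of squares by N, since the data are treated as a population; the variable names are illustrative.

```python
import math

months = [10, 60, 50, 30, 40, 20]  # Brand A, treated as a population
N = len(months)
mu = sum(months) / N

print(f"{'X':>4} {'mu':>6} {'X - mu':>8} {'(X - mu)^2':>11}")
sum_sq = 0.0
for x in months:
    dev = x - mu              # deviation from the population mean
    sum_sq += dev ** 2        # running sum of squared deviations
    print(f"{x:>4} {mu:>6.1f} {dev:>8.1f} {dev ** 2:>11.1f}")

variance = sum_sq / N         # population variance: divide by N, not N - 1
std_dev = math.sqrt(variance)
print(f"sum of squares = {sum_sq:.0f}")
print(f"variance = {variance:.1f}, standard deviation = {std_dev:.1f} months")
```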
9. Example 4: Find the variance and standard deviation for the amount of European auto sales for a sample of 6 years. The data are in millions of dollars.
11.2, 11.9, 12.0, 12.8, 13.4, 14.3
X      X²
11.2   125.44
11.9   141.61
12.0   144.00
12.8   163.84
13.4   179.56
14.3   204.49
Sum: 75.6   958.94
\( s^2 = \frac{n \sum X^2 - \left(\sum X\right)^2}{n(n - 1)} = \frac{6(958.94) - 75.6^2}{6(5)} = 1.28 \)
\( s = \sqrt{1.28} = 1.13 \)
Formulas: \( R = \text{Max} - \text{Min} \), \( \bar{x} = \frac{\sum x}{n} \), \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \), \( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)
10. Range Rule of Thumb for Understanding Standard Deviation
The range rule of thumb is a crude but simple tool for understanding and interpreting standard deviation. The vast majority (such as 95%) of sample values lie within 2 standard deviations of the mean.
Unusual:
Significantly low values are µ − 2σ or lower.
Significantly high values are µ + 2σ or higher.
Usual:
Values not significant are between (µ − 2σ) and (µ + 2σ).
Range Rule of Thumb for Estimating a Value of the Standard Deviation
To roughly estimate the standard deviation from a collection of known sample data (when the distribution is unimodal and approximately symmetric), use: \( s \approx \frac{\text{Range}}{4} \)
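The range rule of thumb can be sketched as three small helpers. The function names are assumptions of this sketch, and the mean of 100 and standard deviation of 15 are illustrative inputs only.

```python
def usual_range(mean, sd):
    """Limits separating usual values from significant ones: mean ± 2 sd."""
    return mean - 2 * sd, mean + 2 * sd


def classify(value, mean, sd):
    """Label a value using the range rule of thumb."""
    low, high = usual_range(mean, sd)
    if value <= low:
        return "significantly low"
    if value >= high:
        return "significantly high"
    return "not significant"


def estimate_sd(data_range):
    """Crude estimate of the standard deviation: range / 4."""
    return data_range / 4


print(usual_range(100, 15))
print(classify(135, 100, 15))
print(classify(110, 100, 15))
print(estimate_sd(12))
```

Because the rule is crude, the estimate from `estimate_sd` is best read as a rough order of magnitude rather than a replacement for the formula.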
11. The Empirical Rule
The empirical rule states that for data sets having a distribution that is approximately bell-shaped, the following properties apply.
• About 68% of all values fall within 1 standard deviation of the mean.
• About 95% of all values fall within 2 standard deviations of the mean.
• About 99.7% of all values fall within 3 standard deviations of the mean.
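The 68%, 95%, and 99.7% figures are rounded values for an exactly normal distribution. Under that assumption (real bell-shaped data only approximate it), the fraction within k standard deviations equals erf(k/√2), which the following sketch evaluates with the standard `math` module.

```python
import math


def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {normal_fraction_within(k):.4f}")
```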
12. Example 5: IQ scores have a bell-shaped distribution with a mean of 100 and a standard deviation of 15. What percentage of IQ scores are between 70 and 130?
130 − 100 = 30 and 100 − 70 = 30
\( \frac{30}{\sigma} = \frac{30}{15} = 2 \)
The empirical rule: About 95% of all IQ scores are between 70 and 130.
Example 6: Use the Range Rule of Thumb to approximate the lowest value and the highest value in a data set where \( \bar{x} = 10 \) and \( R = 12 \). (µ ± 2σ)
\( s \approx \frac{R}{4} = \frac{12}{4} = 3 \)
\( \bar{x} \pm 2s = 10 \pm 2(3) \), so Low = 4 and High = 16
Formulas: \( R = \text{Max} - \text{Min} \), \( \bar{x} = \frac{\sum x}{n} \), \( \mu = \frac{\sum x}{N} \), \( s \approx \frac{R}{4} \), \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \), \( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)
13. Chebyshev's Theorem
The proportion of values from any data set that fall within k standard deviations of the mean will be at least \( 1 - \frac{1}{k^2} \), where k is a number greater than 1 (k is not necessarily an integer).
Number of standard deviations, k   Minimum proportion within k standard deviations   Minimum percentage within k standard deviations
2                                  1 − 1/4 = 3/4                                     75%
3                                  1 − 1/9 = 8/9                                     88.89%
4                                  1 − 1/16 = 15/16                                  93.75%
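The table rows follow directly from the bound 1 − 1/k². A minimal sketch, with a hypothetical helper name, that reproduces them:

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_min_proportion(k):.2%}")
```

The guard on k reflects the theorem's condition; for k ≤ 1 the bound is zero or negative and says nothing useful.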
Example 7: The mean price of houses in a certain neighborhood is $50,000, and the standard deviation is $10,000. Find the price range for which at least 75% of the houses will sell.
µ = $50,000, σ = $10,000
Chebyshev's Theorem states that at least 75% of a data set will fall within 2 standard deviations of the mean.
50,000 − 2(10,000) = 30,000
50,000 + 2(10,000) = 70,000
At least 75% of the houses will sell for between $30,000 and $70,000.
Why Chebyshev's Theorem? The shape of the distribution is not given, so the empirical rule cannot be assumed to apply.
14. Example 8: A survey of local companies found that the mean amount of travel allowance for executives was $0.25 per mile. The standard deviation was $0.02. Using Chebyshev's theorem, find the minimum percentage of the data values that will fall between $0.20 and $0.30.
µ = 0.25, σ = 0.02
(0.30 − 0.25) / 0.02 = 2.5
(0.25 − 0.20) / 0.02 = 2.5
k = 2.5
\( 1 - \frac{1}{k^2} = 1 - \frac{1}{2.5^2} = 0.84 = 84\% \)
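Examples 7 and 8 both start from an interval and work back to k. The sketch below does that conversion. Taking the smaller of the two distances when the interval is not symmetric about the mean is a conservative choice made for this sketch, not something the slides specify.

```python
def chebyshev_for_interval(mean, sd, low, high):
    """Return (k, minimum proportion) for values between low and high.

    k is the smaller distance from the mean, in standard deviations,
    so the bound stays valid if the interval is asymmetric.
    """
    k = min((mean - low) / sd, (high - mean) / sd)
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return k, 1 - 1 / k ** 2


k7, p7 = chebyshev_for_interval(50_000, 10_000, 30_000, 70_000)
print(f"Example 7: k = {k7:.1f}, at least {p7:.0%}")

k8, p8 = chebyshev_for_interval(0.25, 0.02, 0.20, 0.30)
print(f"Example 8: k = {k8:.1f}, at least {p8:.0%}")
```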
15. Comparing Variation in Different Samples or Populations
Coefficient of Variation
The coefficient of variation (or CV) for a set of nonnegative sample or population data is the standard deviation divided by the mean, expressed as a percentage. It describes the standard deviation relative to the mean and is given by the following:
Sample: \( \text{CV} = \frac{s}{\bar{x}} \cdot 100\% \)
Population: \( \text{CV} = \frac{\sigma}{\mu} \cdot 100\% \)
Use CVAR to compare standard deviations when the units are different.
16. Example 9: The mean of the number of sales of cars over a 3-month period is 87, and the standard deviation is 5. The mean of the commissions is $5225, and the standard deviation is $773. Compare the variations of the two.
\( \text{CV} = \frac{\sigma}{\mu} \cdot 100\% \)
Sales: \( \text{CVAR} = \frac{5}{87} \cdot 100\% = 5.7\% \)
Commissions: \( \text{CVAR} = \frac{773}{5225} \cdot 100\% = 14.8\% \)
Commissions are more variable than sales.
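The comparison in Example 9 can be reproduced with a one-line formula. The function name is an illustrative choice, and the zero-mean guard is an addition of this sketch because the ratio is undefined there.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation relative to the mean, as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


cv_sales = coefficient_of_variation(5, 87)
cv_commissions = coefficient_of_variation(773, 5225)

print(f"Sales:       {cv_sales:.1f}%")
print(f"Commissions: {cv_commissions:.1f}%")
```

Because the CV has no units, the count of cars sold and the dollar commissions can be compared directly even though their standard deviations cannot.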
17. Properties of Variance
The units of the variance are the squares of the units of the original data values.
The value of the variance can increase dramatically with the inclusion of outliers. (The variance is not resistant.)
The value of the variance is never negative. It is zero only when all of the data values are the same number.
The sample variance s² is an unbiased estimator of the population variance σ².
Why Divide by (n − 1)?
There are only n − 1 values that can be assigned without constraint. With a given mean, we can use any numbers for the first n − 1 values, but the last value will then be automatically determined.
With division by n − 1, sample variances s² tend to center around the value of the population variance σ²; with division by n, sample variances s² tend to underestimate the value of the population variance σ².
https://www.khanacademy.org/math/ap-statistics/summarizing-quantitative-data-ap/more-standard-deviation/v/another-simulation-giving-evidence-that-n-1-gives-us-an-unbiased-estimate-of-variance
18. Biased and Unbiased Estimators
The sample standard deviation s is a biased estimator of the population standard deviation σ, which means that values of the sample standard deviation s do not tend to center around the value of the population standard deviation σ.
The sample variance s² is an unbiased estimator of the population variance σ², which means that values of s² tend to center around the value of σ² instead of systematically tending to overestimate or underestimate σ².
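These claims can be illustrated by listing every possible sample from a tiny population. The population {4, 5, 9} below is hypothetical and chosen only to keep the arithmetic small. The sketch enumerates all samples of size 2 drawn with replacement and averages the resulting statistics.

```python
import math
from itertools import product

population = [4, 5, 9]  # hypothetical population, for illustration only
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance
sigma = math.sqrt(sigma2)


def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over the given divisor."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample) / divisor


n = 2
samples = list(product(population, repeat=n))  # all 9 samples, with replacement

mean_s2_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)
mean_s2_n = sum(variance(s, n) for s in samples) / len(samples)
mean_s = sum(math.sqrt(variance(s, n - 1)) for s in samples) / len(samples)

print(f"population variance       = {sigma2:.4f}")
print(f"mean of s^2 (divide n-1)  = {mean_s2_n_minus_1:.4f}")
print(f"mean of s^2 (divide n)    = {mean_s2_n:.4f}")
print(f"population sd = {sigma:.4f}, mean of s = {mean_s:.4f}")
```

In this enumeration the variances computed with n − 1 average out to the population variance, those computed with n fall short of it, and the sample standard deviations average below σ, matching the biased/unbiased statements above.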
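Recall for grouped data: m is the midpoint of a class and f is the class frequency.
\( \mu = \frac{\sum m f}{N} \), \( \sigma^2 = \frac{\sum m^2 f - \frac{\left(\sum m f\right)^2}{N}}{N} \Rightarrow \sigma = \sqrt{\frac{\sum m^2 f}{N} - \mu^2} \)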