2. Variability is a measure of how
different scores are from one
another within a set of data.
Synonyms: spread, dispersion.
Does the amount of variability
(spread, dispersion) make a
difference? Do we care?
How could we measure the
amount of variability?
Bookhaven by waffler at http://www.flickr.com/photos/adrian_s/23441729/
3. Julian and Delia ask for help
Their mean quiz score is the same:
M = 15 (out of 25, on 20 quizzes)
Do we know enough to help them?
Let’s look at the actual scores for each student
4. Julian’s Quiz Scores (Mean = 15.0)
16 14 15 14 16 14 14 17 14 13 15 15 16 15 18 15 14 16 14 15
Best score = 18
Worst score = 13
Range = 18.5 - 12.5 = 6 (using real limits)
Range of middle 50% is IQR = 16.5 - 13.5 = 3
Scores are pretty similar across all 20 quizzes
5. Delia’s Quiz Scores (Mean = 15.0)
15 22 10 11 16 13 20 13 17 8 18 16 12 14 9 19 18 13 21 15
Best score = 22
Worst score = 8
Range = 22.5 - 7.5 = 15 (using real limits)
Range of middle 50% is IQR = 18.5 - 12.5 = 6
Scores seem to differ quite a bit from quiz to quiz
6. Range: Distance from highest number to
lowest number (may require real limits)
Interquartile Range: Distance from 25% point
to 75% point (range of middle 50% of scores)
• Quartiles: the 25th, 50th (median), and 75th percentiles
• The 25th and 75th percentiles are the midpoints (medians) of the lower and upper halves of the scores
• A quartile may be an exact score, or may fall between two scores, using the same rules as the median.
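The two definitions above can be tried on the quiz scores from the earlier slides. The following is a minimal Python sketch, not part of the original deck: the function names are made up for illustration, the range applies real limits (0.5 beyond the highest and lowest score), and the quartiles are taken as the medians of the lower and upper halves. The IQR here is computed without real limits, so it can come out smaller than a figure that adds them.

```python
from statistics import median

def real_limit_range(scores):
    """Range using real limits: (highest + 0.5) minus (lowest - 0.5)."""
    return (max(scores) + 0.5) - (min(scores) - 0.5)

def quartiles(scores):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two scores. With an odd number of scores the
    middle score is left out of both halves.
    """
    ordered = sorted(scores)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2 :]
    return median(lower_half), median(upper_half)

julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]
delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]

for name, scores in (("Julian", julian), ("Delia", delia)):
    q1, q3 = quartiles(scores)
    print(name, "range:", real_limit_range(scores),
          "Q1:", q1, "Q3:", q3, "IQR (no real limits):", q3 - q1)
```

Both students have the same mean, but Delia's range and quartile spread are more than twice Julian's.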
7. Essentials of Statistics for Behavioral Science, 6th Edition by Frederick Gravetter and Larry Wallnau Copyright
2008 Wadsworth Publishing, a division of Thomson Learning. All rights reserved.
8. Boxplot: developed by John Tukey to display central tendency & variability efficiently
Box = middle 50% of cases
Top = 75th percentile
Bottom = 25th percentile
Height = Interquartile Range IQR
Line inside the box = MEDIAN
If the line is not centered, the data are not perfectly symmetric.
Lines (“whiskers”) extend to the minimum and maximum values that fall within 1.5 IQR of the edges of the box
10. Outliers: Beyond 1.5 IQR
from edge of box
Extremes: More than 3
IQRs from edge of box
http://web.anglia.ac.uk/numbers/common_folder/graphics/fig6_single_box.jpg
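The boxplot rules (box from the 25th to the 75th percentile, whiskers to the most extreme values within 1.5 IQR of the box, outliers beyond 1.5 IQR, extremes beyond 3 IQRs) can be sketched in Python. This is an illustrative sketch only: `box_summary` is a made-up name, the scores are hypothetical, and the quartiles are taken as medians of the lower and upper halves, which is one of several conventions and may not match a given plotting package.

```python
from statistics import median

def box_summary(scores):
    """Summarize the pieces of a boxplot for a list of scores."""
    ordered = sorted(scores)
    n = len(ordered)
    q1 = median(ordered[: n // 2])          # bottom of the box
    q3 = median(ordered[(n + 1) // 2 :])    # top of the box
    iqr = q3 - q1                           # height of the box
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr
    inside = [x for x in ordered if inner_low <= x <= inner_high]
    return {
        "q1": q1,
        "median": median(ordered),
        "q3": q3,
        "iqr": iqr,
        # Whiskers stop at the last scores within 1.5 IQR of the box.
        "whiskers": (min(inside), max(inside)),
        # Outliers: beyond 1.5 IQR but no more than 3 IQRs from the box.
        "outliers": [x for x in ordered
                     if outer_low <= x < inner_low or inner_high < x <= outer_high],
        # Extremes: more than 3 IQRs from the box.
        "extremes": [x for x in ordered if x < outer_low or x > outer_high],
    }

# Hypothetical scores chosen so that one outlier and one extreme appear.
hypothetical_scores = [10, 12, 13, 14, 15, 16, 17, 18, 30, 45]
summary = box_summary(hypothetical_scores)
for key, value in summary.items():
    print(key, value)
```

With these made-up scores the box runs from 13 to 18, so 30 falls in the outlier band and 45 falls in the extreme band.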
12. Average deviation: Sum of deviations of scores from M, divided by N = $\sum (X - M) / N$
13. Average deviation: Sum of deviations of scores from M, divided by N = $\sum (X - M) / N$
DOESN’T WORK: the positive and negative deviations cancel, so $\sum (X - M)$ ALWAYS EQUALS 0.
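A quick check makes the problem concrete. The sketch below (illustrative only, using Julian's quiz scores from the earlier slide) subtracts the mean from every score and adds up the deviations.

```python
julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]

mean = sum(julian) / len(julian)

# Deviation of each score from the mean: negative below M, positive above M.
deviations = [x - mean for x in julian]

total = sum(deviations)
average_deviation = total / len(julian)

print("Mean:", mean)
print("Deviations:", deviations)
print("Sum of deviations:", total)
print("Average deviation:", average_deviation)
```

The scores below the mean contribute exactly as much negative deviation as the scores above it contribute positive deviation, so the total carries no information about spread.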
14. Square the deviations so all values are positive.
Sum of Squares (SS) is used in most inferential statistics.
SS is in the numerator of the fraction for both Variance and Standard Deviation.
Name | X | Mean | (X − Mean) | (X − Mean)²
--- | --- | --- | --- | ---
Jim | 48 | 138.75 | -90.75 | 8235.5625
Orlend | 27 | 138.75 | -111.75 | 12488.0625
Ellen | 189 | 138.75 | 50.25 | 2525.0625
Steve | 136 | 138.75 | -2.75 | 7.5625
Jose | 250 | 138.75 | 111.25 | 12376.5625
Tabia | 218 | 138.75 | 79.25 | 6280.5625
Kaleb | 151 | 138.75 | 12.25 | 150.0625
Lisa | 201 | 138.75 | 62.25 | 3875.0625
Pavlik | 78 | 138.75 | -60.75 | 3690.5625
Kris | 163 | 138.75 | 24.25 | 588.0625
Emma | 106 | 138.75 | -32.75 | 1072.5625
Michael | 98 | 138.75 | -40.75 | 1660.5625
 | Mean = 138.75 | | Sum = 0 | Sum = 52950.25
Called SS or “Sum of Squares” which
means Sum of Squared Deviations from the Mean
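The table can be reproduced in a few lines of Python. This is a minimal sketch of the definitional method (subtract the mean, square, add up), using the twelve scores listed above; the variable names are illustrative.

```python
scores = {
    "Jim": 48, "Orlend": 27, "Ellen": 189, "Steve": 136,
    "Jose": 250, "Tabia": 218, "Kaleb": 151, "Lisa": 201,
    "Pavlik": 78, "Kris": 163, "Emma": 106, "Michael": 98,
}

mean = sum(scores.values()) / len(scores)

sum_of_deviations = 0.0
ss = 0.0
for name, x in scores.items():
    deviation = x - mean           # (X - Mean)
    squared = deviation ** 2       # (X - Mean) squared, never negative
    sum_of_deviations += deviation
    ss += squared
    print(f"{name:8s} {x:4d} {deviation:9.2f} {squared:12.4f}")

print("Mean:", mean)
print("Sum of deviations:", sum_of_deviations)
print("SS (sum of squared deviations):", ss)
```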
16. Delia’s Scores
X X²
15 225
22 484
10 100
11 121
16 256
13 169
20 400
13 169
17 289
8 64
18 324
16 256
12 144
14 196
9 81
19 361
18 324
13 169
21 441
15 225
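Sum: ΣX = 300, ΣX² = 4798
N = 20
$SS = \sum X^2 - \frac{(\sum X)^2}{N} = 4798 - \frac{300^2}{20} = 298$
Variance = SS / (N − 1) = 298 / 19 = 15.684
Std Dev = √15.684 = 3.960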
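The computational formula, SS = ΣX² − (ΣX)² / N, needs only the two column totals and N. The sketch below (illustrative function names, Delia's scores as listed) computes SS that way and checks it against the definitional method of summing squared deviations from the mean.

```python
def ss_computational(scores):
    """SS from the column totals: sum of X squared minus (sum of X) squared over N."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x * x for x in scores)
    return sum_x_squared - sum_x ** 2 / n

def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]

print("Sum of X:", sum(delia))
print("Sum of X squared:", sum(x * x for x in delia))
print("SS (computational):", ss_computational(delia))
print("SS (definitional):", ss_definitional(delia))
```

The two methods are algebraically identical; the computational form is simply easier to do by hand when the mean is not a whole number.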
18. Range: Distance from highest number to
lowest number (may require real limits)
Interquartile Range: Distance from 25% point
to 75% point (range of middle 50% of scores)
Average deviation: Sum deviations from M,
divide by N. Doesn’t work. Always = 0.
Variance: Average of squared deviation scores
$\sigma^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{SS}{n}$ (here $\bar{X}$ is the mean, M)
19. Variance: Average of squared deviation scores.
$\sigma^2 = \frac{\sum (X - M)^2}{n} = \frac{SS}{n}$
Standard Deviation: Square root of Variance
$\sigma = \sqrt{\frac{\sum (X - M)^2}{n}} = \sqrt{\frac{SS}{n}}$
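As a rough illustration of the two definitions on this slide, the sketch below computes the variance as SS divided by n and the standard deviation as its square root, for both students' quiz scores. The function name is made up, and dividing by n treats each set of 20 quizzes as the whole population of interest.

```python
import math

def variance_and_sd(scores):
    """Return (variance, standard deviation) using SS / n."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)   # sum of squared deviations
    variance = ss / n                           # average squared deviation
    return variance, math.sqrt(variance)        # SD is back in score units

julian = [16, 14, 15, 14, 16, 14, 14, 17, 14, 13,
          15, 15, 16, 15, 18, 15, 14, 16, 14, 15]
delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]

for name, scores in (("Julian", julian), ("Delia", delia)):
    variance, sd = variance_and_sd(scores)
    print(f"{name}: variance = {variance:.3f}, SD = {sd:.3f}")
```

Taking the square root returns the measure to the original units (quiz points), which is why the standard deviation is usually the easier of the two to interpret.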
20. STANDARD DEVIATION
FOR A SAMPLE
The data we use are from a
randomly selected sample
Numerator of fraction is
Sum of Squares (SS)
Denominator of fraction is n – 1
Symbols: s or SD
Excel function: STDEV
STANDARD DEVIATION
FOR A POPULATION
The data we use are from all
members of population
Numerator of fraction is
Sum of Squares (SS)
Denominator of fraction is n
Symbol: σ (lowercase Greek sigma, the Greek letter for s)
Excel function: STDEVP
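Sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{SS}{n - 1}}$
Population: $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{SS}{n}}$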
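Outside Excel, the same sample and population distinction shows up in Python's standard library: `statistics.stdev` divides SS by n − 1 and `statistics.pstdev` divides SS by n. The sketch below applies both to Delia's quiz scores; whether those 20 quizzes should count as a sample or a population is a judgment call that this example does not settle.

```python
import statistics

delia = [15, 22, 10, 11, 16, 13, 20, 13, 17, 8,
         18, 16, 12, 14, 9, 19, 18, 13, 21, 15]

# Sample standard deviation: SS / (n - 1) under the square root.
s = statistics.stdev(delia)

# Population standard deviation: SS / n under the square root.
sigma = statistics.pstdev(delia)

print(f"Sample SD (n - 1):  {s:.3f}")
print(f"Population SD (n):  {sigma:.3f}")
```

The sample version is always slightly larger because it divides the same SS by a smaller number; the gap shrinks as n grows.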
21. Range and Interquartile Range use only a few scores.
Standard deviation and variance use all scores.
Measure of Variability | Can be used with … | Best or most commonly used for …
--- | --- | ---
Percentages | Any type of data | Categorical / Nominal data
Range or Interquartile Range | Data with order (low to high) and equal intervals (Interval or Ratio data) | Open-ended categories; indeterminate values; extreme or skewed values
Boxplot | Interval or Ratio data | Good for graphs; comparing variability in two or more groups (Ch 8-14)
Standard Deviation or Variance | Interval or Ratio data with no open-ended or indeterminate values | Any situation in which it can be appropriately computed; inferential statistics