5
ANOVA: Analyzing Differences
in Multiple Groups
Learning Objectives
After reading this chapter, you should be able to:
• Describe the similarities and differences between t-tests and ANOVA.
• Explain how ANOVA can help address some of the problems and limitations associated with t-tests.
• Use ANOVA to analyze multiple group differences.
• Use post hoc tests to pinpoint group differences.
• Determine the practical importance of statistically significant findings using effect sizes with eta-squared.
tan81004_05_c05_103-134.indd 103 2/22/13 4:28 PM
CHAPTER 5Section 5.1 From t-Test to ANOVA

Chapter Overview
5.1 From t-Test to ANOVA
The ANOVA Advantage
Repeated Testing and Type I Error
5.2 One-Way ANOVA
Variance Between and Within
The Statistical Hypotheses
Measuring Data Variability in the ANOVA
Calculating Sums of Squares
Interpreting the Sums of Squares
The F Ratio
The ANOVA Table
Interpreting the F Ratio
Locating Significant Differences
Determining Practical Importance
5.3 Requirements for the One-Way ANOVA
Comparing ANOVA and the Independent t
One-Way ANOVA on Excel
5.4 Another One-Way ANOVA
Chapter Summary
Introduction
During the early part of the 20th century, R. A. Fisher worked at an agricultural research station in rural southern England. In his work analyzing the effect of pesticides and fertilizers on results like crop yield, he was stymied by the limitations in Gosset’s independent samples t-test, which allowed him to compare just two samples at a time. In the effort to develop a more comprehensive approach, Fisher created a statistical method he called analysis of variance, often referred to by its acronym, ANOVA, which allows for making multiple comparisons at the same time using relatively small samples.
5.1 From t-Test to ANOVA
The process for completing an independent samples t-test in Chapter 4 illustrated a number of things. The calculated t value, for example, is a score based on a ratio, one determined by dividing the variability between the two groups (M1 − M2) by the variability within the two groups, which is what the standard error of the difference (SEd) measures. So both the numerator and the denominator of the t-ratio are measures of data variability, albeit from different sources. The difference between the means is variability attributed primarily to the independent variable, which is the group to which individual subjects belong. The variability in the denominator is variability for reasons that are unexplained: error variance in the language of statistics.
In his method, ANOVA, Fisher also embraced this pattern of comparing between-groups variance to within-groups variance. He calculated the variance statistics differently, as we shall see, but he followed Gosset’s pattern of a ratio of between-groups variance compared to within.
The ANOVA Advantage
ANOVA and the t-test answer the same question: are differences between groups statistically significant? Why bother with another test that answers the same question that the t-test answers? Suppose a utility company wants to compare the per customer consumption of natural gas in three areas: a, b, and c. Why not answer the question by performing three t-tests as follows?
Test 1 compares area a to area b.
Test 2 compares area b to area c.
Test 3 compares area a to area c.
Although the three tests involve all possible comparisons, there are two problems. The first is that the thought of all possible comparisons becomes strained when there are many groups involved in the analysis. If there were five different areas in the natural gas consumption example and they were labeled a through e, note the number of comparisons needed to cover all possible combinations:
1. a to b
2. a to c
3. a to d
4. a to e
5. b to c
6. b to d
7. b to e
8. c to d
9. c to e
10. d to e
Even if the sheer number of comparisons were the only problem, conducting 10 tests in order to cover all possible combinations would strain the patience of most analysts.
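The list of comparisons above can be generated rather than written out by hand. The short Python sketch below is only an illustration (the function name pairwise_comparisons is made up for this example); it enumerates every unordered pair of groups and confirms that the count follows k(k − 1)/2 for k groups.

```python
from itertools import combinations
from math import comb


def pairwise_comparisons(groups):
    """Return every unordered pair of groups, one pair per t-test."""
    return list(combinations(groups, 2))


areas = ["a", "b", "c", "d", "e"]
pairs = pairwise_comparisons(areas)

# Print the comparisons in the same order as the numbered list in the text.
for number, (first, second) in enumerate(pairs, start=1):
    print(f"{number}. {first} to {second}")

# The count follows k(k - 1)/2, which math.comb(k, 2) computes directly.
print(len(pairs), comb(len(areas), 2))
```

With five areas the loop prints the same 10 pairs listed above; with three areas it would print the 3 tests from the first example.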
There is another advantage to ANOVA over t-tests. The t-test accommodates just one independent variable. Area “a” can be compared to area “b,” or women can be compared to men, or Republicans to Democrats, but there is no way to combine IVs and compare, for example, female Republicans to male Democrats, or gas consumption by single-family residences in area “a” to consumption by multifamily residences in area “b.” Factorial ANOVA, which is covered in Chapter 6, besides handling any number of groups, will also accommodate any number of independent variables, as long as they are categorical variables.
Key Terms: Analysis of variance is a test of significant differences among two or more independent groups, when the IV is nominal and the DV is interval or ratio.
Repeated Testing and Type I Error
But there is another problem and it is more insidious. Recall that the potential for type I error is indicated by the level at which the test is conducted. When testing at p = .05, a true null hypothesis will be rejected in error, a type I error, an average of 5% of the time. However, that level of error is based on the assumption that each test is completely independent; it assumes that every time a statistical test is completed, it is conducted with fresh data. If statistical testing is done repeatedly with the same data, the potential for type I error doesn’t remain fixed at .05 (or whatever the level of the testing is), but rather grows. In fact, if 10 tests are conducted in succession with the same data, as with the groups labeled a, b, c, d, and e above, and each finding is significant, by the time the 10th test is completed the potential for alpha error is about .40! The potential for alpha error for any number of repeated tests can be calculated, although we will not bother to do it here. The point is that after 10 statistically significant findings, the probability of at least one type I error is roughly 4 in 10, or 40%! The accuracy in determining significance is reduced nearly to the level of probability offered by a coin flip (the probability of obtaining heads or tails in a coin flip is 50%).
The foregoing brings us to this: multiple t-tests with the same data are not an option.
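For readers curious where a figure near .40 comes from, one common approximation treats the repeated tests as if they were independent, so that the chance of at least one type I error in c tests at level alpha is 1 − (1 − alpha)^c. The sketch below uses that assumption; tests run on the same data are not truly independent, so the result is a rough guide rather than an exact rate, and the function name familywise_error is only illustrative.

```python
def familywise_error(alpha, tests):
    """Probability of at least one type I error across `tests` tests.

    Assumes the tests behave as if independent, which is only an
    approximation when the same data are reused.
    """
    return 1 - (1 - alpha) ** tests


# 1 test, the 3 tests for areas a-c, and the 10 tests for areas a-e.
for tests in (1, 3, 10):
    print(tests, round(familywise_error(0.05, tests), 3))
```

Under this approximation the error potential is .05 for a single test, about .14 for three tests, and about .40 for ten.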
5.2 One-Way ANOVA
Analysis of variance allows for one test to make comparisons between any number of groups so that there is just one probability for alpha error. In the example above, the five groups can be compared for statistically significant differences in the same analysis. The result will indicate whether there are significant differences in natural gas consumption anywhere among the several groups.
Here, the focus is on ANOVA in its simplest form, the procedure called one-way ANOVA, with the “one” indicating just one independent variable. In that regard this form of ANOVA is similar to the independent samples t-test. The difference is that the IV in ANOVA can have two or more categories. In theory, there is no upper limit to the number.
Variance Between and Within
Fisher’s test is based on the understanding that when several subjects are measured on some characteristic, the scores can vary for some combination of two reasons, either because of the impact of the independent variable (their group membership) or because of the error variance that stems from other, uncontrolled influences.
Key Terms: Factorial ANOVA is ANOVA with more than one independent variable. One-way ANOVA involves just one IV.
Review Question A: When successive tests with the same data indicate significance, what happens to the probability of type I error?
The F ratio that is calculated as a test statistic in ANOVA is a measure including the IV effect, divided by a measure that is entirely error variance. When F meets or exceeds a critical value, it indicates that the effect of the independent variable is great enough that the difference between at least two of the groups is not random. When the F ratio is smaller than a critical value, it indicates that differences between groups that can be attributed to the independent variable are not significant compared to the error variance.
Three groups of the same size selected from one population might be represented by the three distributions in Figure 5.1. They do not have exactly the same means because of sampling error. Even randomly selected samples are rarely identical, but they were all drawn from a common population.
Figure 5.1: Three groups drawn from the same population
The range within each of the three groups reflects the fact that even people in the same group will often differ regarding whatever is measured. For example, someone interested in analyzing job satisfaction among workers in the same industry will find that job satisfaction varies, even among people of the same gender, the same age, and the same level of experience. The differences within the group indicate the error variance.
The issue in analysis of variance is whether the different manifestations of the IV create enough variability between the groups that the ratio of between-groups variance to within-groups variance exceeds a critical value. In other words, do the multiple samples still represent populations with the same mean? Alternatively, the IV, also sometimes called the “grouping variable,” can be a particular intervention or treatment. For example, if three groups of workers are offered three different incentives, do these different incentives affect job satisfaction differently (Figure 5.2), so that the three groups no longer represent populations with the same means?
Figure 5.2: Three groups after a treatment or an intervention
The within-groups variability in these three distributions is the same as it was in the distributions in Figure 5.1. It is the between-groups variability that has changed. More particularly, it’s the difference between the group means that has changed. Although there was some between-groups variability before the treatment, it was the effect of sampling variability. After the treatment the differences between means are much greater. F will indicate whether the differences are great enough to be statistically significant.
The Statistical Hypotheses
For the t-test the null hypothesis was written \( H_0: \mu_1 = \mu_2 \), indicating that the two samples involved were drawn from populations with the same means. For a one-way ANOVA with three groups, the null hypothesis indicates that three samples represent populations with the same means:
\( H_0: \mu_1 = \mu_2 = \mu_3 \)
For the alternate hypothesis, however, there is not just one possible alternative. Each of the following outcomes might occur:
a. \( H_A: \mu_1 \neq \mu_2 = \mu_3 \): Sample 1 represents a population with a mean value different from the means of the populations represented by samples 2 and 3.
b. \( H_A: \mu_1 = \mu_2 \neq \mu_3 \): Samples 1 and 2 represent populations with a mean value different from the mean of the population represented by sample 3.
c. \( H_A: \mu_1 = \mu_3 \neq \mu_2 \): Samples 1 and 3 represent populations with a mean value different from the mean of the population represented by sample 2.
d. \( H_A: \mu_1 \neq \mu_2 \neq \mu_3 \): All three samples represent populations with different means.
In the job satisfaction example, maybe two of the incentives, pay raises and end-of-year bonus, had similar effects on job satisfaction, while the third incentive, say additional vacation time, had little or no effect. Since the several possible alternative outcomes multiply rapidly when the number of groups increases, a more general alternate hypothesis is given. Either the groups involved come from populations with the same means, or at least one does not. The alternate to the null hypothesis is simply stated:
\( H_A \): not so
Review Question B: If there were four groups involved in a one-way ANOVA, how many possible pairs of groups are there?
Measuring Data Variability in the ANOVA
There are several statistics that indicate data variability. So far in the book we have used each of the following:
• the standard deviation (s),
• the variance (s²),
• the standard error of the mean (SEM),
• the standard error of the difference (SEd),
• the range (R).
Analysis of variance adds one more measure of data variability, the sum of squares (SS), which for the one-way ANOVA has three forms. There is the sum of squares total, SStot, which is all variability from all sources. The sum of squares between, SSbet, measures the effect of the IV, the “grouping variable” or the “treatment effect.” The sum of squares within, SSwith, or the SSerror, is a measure entirely of error variance.
A. The sum of squares total, SStot, is the sum of the squared differences between each score in all groups and the mean of all data.
Formula 5.1 \( SS_{tot} = \sum (x - M_G)^2 \)
Where
x = each score in all groups
MG = the mean of all data, the “grand” mean
To calculate SStot,
1. Sum all scores from all groups and divide by the number of scores to determine the grand mean, MG.
2. Subtract MG from each score (x) in each group, and then square the difference: \( (x - M_G)^2 \).
3. Sum the squared differences: \( \sum (x - M_G)^2 \).
B. The sum of squares between, SSbet, is the sum of the squared differences between the means of the groups and the mean of all the data, times the number in each group.
Formula 5.2 \( SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b + (M_c - M_G)^2 n_c \)
Where
Ma = the mean of the scores in the first group, “a”
MG = the same grand mean used in SStot
na = the number of scores in the first group, “a”
To calculate SSbet,
1. Determine the mean for each group, Ma, Mb, and so on.
2. Subtract MG from each sample mean and square the difference, \( (M_a - M_G)^2 \).
3. Multiply the squared differences by the number in the group, \( (M_a - M_G)^2 n_a \).
4. Repeat for each group.
5. Sum the results across groups.
C. The sum of squares within, SSwith, is the sum of the squared differences between individuals in the groups and the particular group mean.
Formula 5.3 \( SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 + \sum (x_c - M_c)^2 \)
Where
SSwith = the sum of squares within
xa refers to each of the individual scores in group “a”
Ma = the score mean in group “a”
To calculate SSwith:
1. From each score in each group:
a. Subtract the mean of the group.
b. Square the difference.
c. Sum the squared differences within each group.
2. Repeat this for each group.
3. Sum the results across the groups.
Key Terms: Data variability in ANOVA is measured by sum of squares. Total sum of squares indicates all data variability. Sum of squares between includes the variance related to the IV. Sum of squares within is error variance.
SStot consists of the SSbet and the SSwith, so it follows that:
Formula 5.4 \( SS_{tot} = SS_{bet} + SS_{with} \)
This means that once two of the three have been determined, the third can be calculated by subtraction. For example, SStot − SSbet = SSwith. Although we will determine the SSwith this way in an example below, it is a good idea to be cautious in following this approach because if there is an error in calculating either SStot or SSbet, it is perpetuated by using subtraction to determine SSwith.

Calculating Sums of Squares
Suppose that the service manager at a local auto dealership would like to find out the particular price for oil changes that will bring in the most customers. Coupons are offered in the local newspaper for $30 oil changes in March, $25 oil changes in April, and $20 oil changes in May. In this example, the monetary value of the coupon is the IV, and the number of oil changes bought is the DV. The question is whether price is related to the number of oil changes. The number of oil changes bought on four successive Fridays in each of the three months is as follows:
March: 3, 4, 4, 3
April: 6, 6, 7, 8
May: 6, 7, 7, 9
Recall that both SStot and SSbet require the mean values. For MG, verify that for all three groups, Σx = 70 and N = 12, so MG = 70/12 = 5.833.
For March, Ma = 14/4 = 3.50.
For April, Mb = 27/4 = 6.750.
For May, Mc = 29/4 = 7.250.
The calculations for sum of squares total and sum of squares within are fairly extensive and are in Tables 5.1 and 5.2 respectively. Those for sum of squares between are briefer and are presented in text.
Verify that SStot = 41.668.
Table 5.1: Calculating the sum of squares total, SStot
\( SS_{tot} = \sum (x - M_G)^2 \)
MG = 5.833
For March ($30 Coupon):
x − M                     (x − M)²
3 − 5.833 = −2.833         8.026
4 − 5.833 = −1.833         3.360
4 − 5.833 = −1.833         3.360
3 − 5.833 = −2.833         8.026
For April ($25 Coupon):
x − M                     (x − M)²
6 − 5.833 = .167            .028
6 − 5.833 = .167            .028
7 − 5.833 = 1.167          1.362
8 − 5.833 = 2.167          4.696
For May ($20 Coupon):
x − M                     (x − M)²
6 − 5.833 = .167            .028
7 − 5.833 = 1.167          1.362
7 − 5.833 = 1.167          1.362
9 − 5.833 = 3.167         10.030
SStot = 41.668
For SSbet,
\( SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b + (M_c - M_G)^2 n_c \)
\( = (3.5 - 5.833)^2(4) + (6.75 - 5.833)^2(4) + (7.25 - 5.833)^2(4) \)
\( = 21.772 + 3.364 + 8.032 \)
\( = 33.168 \)
For the error term, sum of squares within, verify that SSwith = 8.504.
Table 5.2: Calculating the sum of squares within, SSwith
\( SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 + \sum (x_c - M_c)^2 \)
March: 3, 4, 4, 3     Ma = 3.50
April: 6, 6, 7, 8     Mb = 6.750
May: 6, 7, 7, 9       Mc = 7.250
For March ($30 Coupon):
x − M                     (x − M)²
3 − 3.50 = −.50             .250
4 − 3.50 = .50              .250
4 − 3.50 = .50              .250
3 − 3.50 = −.50             .250
For April ($25 Coupon):
x − M                     (x − M)²
6 − 6.750 = −.750           .563
6 − 6.750 = −.750           .563
7 − 6.750 = .250            .063
8 − 6.750 = 1.250          1.563
For May ($20 Coupon):
x − M                     (x − M)²
6 − 7.250 = −1.250         1.563
7 − 7.250 = −.250           .063
7 − 7.250 = −.250           .063
9 − 7.250 = 1.750          3.063
SSwith = 8.504
Since SStot = SSbet + SSwith, you can now check your results for accuracy. For the oil-change example we have,
8.504 + 33.168 = 41.672
In the initial calculation, SStot = 41.668. The difference of .004 is due to number rounding and is relatively unimportant.
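The hand calculations in Tables 5.1 and 5.2 can be cross-checked with a few lines of Python. This is a minimal sketch that follows the three sum-of-squares definitions directly; the function and variable names are illustrative and not part of any statistics package.

```python
def mean(values):
    """Arithmetic mean of a list of scores."""
    return sum(values) / len(values)


def sums_of_squares(groups):
    """Return (SStot, SSbet, SSwith) for a dict of group name -> scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    # SStot: every score against the grand mean.
    ss_tot = sum((x - grand_mean) ** 2 for x in all_scores)
    # SSbet: each group mean against the grand mean, weighted by group size.
    ss_bet = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    # SSwith: every score against its own group mean.
    ss_with = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
    return ss_tot, ss_bet, ss_with


oil_changes = {
    "March": [3, 4, 4, 3],
    "April": [6, 6, 7, 8],
    "May": [6, 7, 7, 9],
}

ss_tot, ss_bet, ss_with = sums_of_squares(oil_changes)
print(round(ss_tot, 3), round(ss_bet, 3), round(ss_with, 3))
```

Because the code carries full precision instead of rounding each squared difference to three decimals, its values differ from the tables in the third decimal place (SSwith comes out as exactly 8.5, for example), and SSbet plus SSwith matches SStot with no rounding gap.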
The sums of squares values can never be negative, which should make sense since there’s no such thing as negative variability. Because they are the sums of squared differences, all SS values must be zero or greater. The smallest value for a sum of squares is zero, which occurs when all the scores in the calculation have the same value. Squaring the differences between individual scores and group means is not a procedure unique to ANOVA. Recall when the standard deviation was calculated back in Chapter 1. At the heart of the standard deviation calculation are those repetitive x − M differences for each score in the sample, which were then squared and summed. In addition, the denominator in the standard deviation calculation was n − 1, which should look suspiciously like a degrees of freedom value.
Interpreting the Sums of Squares
The different sums of squares values are measures of data variability, and in that regard they are like the standard deviation and other measures of data variability from earlier chapters. But there is also an important difference between SS and the other statistics. Although they measure data variability, the SS values also reflect the number of scores involved in the calculation, n. Because sums of squares are in fact the sum of squared values, the more values there are, the larger the SS value becomes. With the standard deviation often the opposite occurs. Because the majority of scores in most distributions are near the mean, adding values often shrinks the value of the standard deviation. This cannot happen with the sum of squares. An additional score can never decrease an SS value, and it increases the value unless the score falls exactly at the mean.
This characteristic makes the sum of squares difficult to interpret. A large SS value can indicate that individual scores tend to be highly variable, or that there are many scores in the set, or both. Fisher’s answer to this was to transform each sum of squares value into a mean measure of variability by dividing each SS by its own particular degrees of freedom; SS ÷ df creates the mean square (MS). The df for the one-way ANOVA are as follows:
• dftot = N − 1, where N is the number of subjects in all groups
• dfbet = k − 1, where k is the number of groups
• dfwith, or dferror, = N − k
Just as SSbet + SSwith = SStot, the sum of dfbet and dfwith will equal dftot. Although there is an MS value associated with both the SSbet and the SSwith (SSerror) in the one-way ANOVA, there is no mean square total calculated. A mean level of overall variability would be of no value when answering questions about the ratio of between-groups to within-groups variability.
Review Question C: In the independent samples t-test, the measure of within group variability is the standard error of the difference. What’s the equivalent for ANOVA?
Key Terms: The mean square provides a mean measure of data variability. It’s determined by dividing the respective sum of squares value by its degrees of freedom.
The F Ratio
The mean squares for between and within make up the F ratio, the test statistic in ANOVA.
Formula 5.5 \( F = MS_{bet}/MS_{with} \)
Dividing the MSbet by the MSwith to determine F makes it clear that the test statistic is based on how the effect of the IV, which is the grouping variable or the treatment effect (in the MSbet), compares to error (MSwith). This comparison is illustrated in Figure 5.3, where the between-groups variance is illustrated by comparing the distance from the mean of the first distribution to the mean of the second distribution, the “A” variance, to the variances within the groups, the “B” and “C” variances.
Figure 5.3: The F-ratio: comparing variance between (A) groups to variance within (B + C)
To be statistically significant the MSbet/MSwith ratio must be greater than 1.0; the between-groups variance must be greater than within-groups variance. How much greater is indicated by a critical value discussed below.
The ANOVA Table
Using the oil change example, the sums of squares and the degrees of freedom are as follows:
dftot = N − 1. Since N = 12, dftot = 11
dfbet = k − 1. Since k = 3 groups (or treatments), dfbet = 2
dfwith = N − k. With N = 12 and k = 3, dfwith = 9
The MS values, which are SS for between and within divided by their df, are as follows:
MSbet = SSbet / dfbet = 33.168/2 = 16.584
MSwith = SSwith / dfwith = 8.504/9 = .945
The F value, which is the MSbet/MSwith, is:
16.584/.945 = 17.549
The ANOVA results are often presented in a table such as the one below:
Source      SS        df    MS        F
Total       41.672    11
Between     33.168     2    16.584    17.549
Within       8.504     9      .945
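The steps from sums of squares to F can be sketched in Python as follows. The function name anova_table and its dictionary keys are illustrative only; the inputs are the rounded SS values from the text.

```python
def anova_table(ss_bet, ss_with, n_total, k):
    """Build the one-way ANOVA quantities from the two sums of squares.

    n_total is the number of scores in all groups; k is the number of groups.
    """
    df_bet = k - 1
    df_with = n_total - k
    ms_bet = ss_bet / df_bet      # mean square between
    ms_with = ss_with / df_with   # mean square within (error)
    return {
        "df_tot": n_total - 1,
        "df_bet": df_bet,
        "df_with": df_with,
        "ms_bet": ms_bet,
        "ms_with": ms_with,
        "F": ms_bet / ms_with,
    }


table = anova_table(ss_bet=33.168, ss_with=8.504, n_total=12, k=3)
for name, value in table.items():
    print(name, round(value, 3))
```

The F printed here is about 17.55. It differs from the 17.549 in the table only in the third decimal place, because the hand calculation rounds MSwith to .945 before dividing.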
Interpreting the F Ratio
The larger F is, the more likely it is to be statistically significant, but how large is large enough? Here F = 17.549, which means that MSbet is 17.549 times greater than MSwith. To determine statistical significance, it is compared to the values in the Critical Values of F, Table 5.3, where the values are indexed to degrees of freedom for the problem.
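Critical values of F normally come from a table or from statistical software. For the special case of 2 numerator degrees of freedom (three groups, as in the oil change example), the F distribution happens to have a simple closed-form tail probability, so the lookup can be cross-checked with the standard library alone. The sketch below relies on that special case and does not apply to other numerator df; the function names are illustrative.

```python
def f_tail_probability_df1_2(f_value, df_with):
    """P(F >= f_value) for an F distribution with 2 and df_with df.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f_value / df_with) ** (-df_with / 2)


def f_critical_df1_2(alpha, df_with):
    """Critical F for 2 and df_with df at the given alpha (numerator df = 2 only)."""
    return (df_with / 2) * (alpha ** (-2 / df_with) - 1)


# Oil change example: dfbet = 2, dfwith = 9.
print(round(f_critical_df1_2(0.05, 9), 2))  # 4.26
print(round(f_critical_df1_2(0.01, 9), 2))  # 8.02
print(f_tail_probability_df1_2(17.549, 9) < 0.01)  # True
```

The two critical values agree with the 2 and 9 df entries in Table 5.3, and the tail probability for F = 17.549 falls well below .01.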
Table 5.3: The critical values of F
Columns give the df for the numerator; rows give the df for the denominator. In each pair of rows, the first row gives the critical value for p = .05 and the second row (boldface in the original) gives the critical value for p = .01.
df denom. p        1      2      3      4      5      6      7      8      9     10
2         .05  18.51  19.00  19.16  19.25  19.30  19.33  19.35  19.37  19.38  19.40
          .01  98.49  99.01  99.17  99.25  99.30  99.33  99.36  99.38  99.39  99.40
3         .05  10.13   9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79
          .01  34.12  30.82  29.46  28.71  28.24  27.91  27.67  27.49  27.34  27.23
4         .05   7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96
          .01  21.20  18.00  16.69  15.98  15.52  15.21  14.98  14.80  14.66  14.55
5         .05   6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74
          .01  16.26  13.27  12.06  11.39  10.97  10.67  10.46  10.29  10.16  10.05
6         .05   5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06
          .01  13.75  10.92   9.78   9.15   8.75   8.47   8.26   8.10   7.98   7.87
7         .05   5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64
          .01  12.25   9.55   8.45   7.85   7.46   7.19   6.99   6.84   6.72   6.62
8         .05   5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35
          .01  11.26   8.65   7.59   7.01   6.63   6.37   6.18   6.03   5.91   5.81
9         .05   5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14
          .01  10.56   8.02   6.99   6.42   6.06   5.80   5.61   5.47   5.35   5.26
10        .05   4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98
          .01  10.04   7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
11        .05   4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85
          .01   9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
12        .05   4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75
          .01   9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
13        .05   4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67
          .01   9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
14        .05   4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60
          .01   8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
15        .05   4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54
          .01   8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
16        .05   4.49   3.63   3.24   3.01   2.85   2.74   2.66   2.59   2.54   2.49
          .01   8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
17        .05   4.45   3.59   3.20   2.96   2.81   2.70   2.61   2.55   2.49   2.45
          .01   8.40   6.11   5.19   4.67   4.34   4.10   3.93   3.79   3.68   3.59
18        .05   4.41   3.55   3.16   2.93   2.77   2.66   2.58   2.51   2.46   2.41
          .01   8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
19        .05   4.38   3.52   3.13   2.90   2.74   2.63   2.54   2.48   2.42   2.38
          .01   8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
20        .05   4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35
          .01   8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
21        .05   4.32   3.47   3.07   2.84   2.68   2.57   2.49   2.42   2.37   2.32
          .01   8.02   5.78   4.87   4.37   4.04   3.81   3.64   3.51   3.40   3.31
22        .05   4.30   3.44   3.05   2.82   2.66   2.55   2.46   2.40   2.34   2.30
          .01   7.95   5.72   4.82   4.31   3.99   3.76   3.59   3.45   3.35   3.26
23        .05   4.28   3.42   3.03   2.80   2.64   2.53   2.44   2.37   2.32   2.27
          .01   7.88   5.66   4.76   4.26   3.94   3.71   3.54   3.41   3.30   3.21
24        .05   4.26   3.40   3.01   2.78   2.62   2.51   2.42   2.36   2.30   2.25
          .01   7.82   5.61   4.72   4.22   3.90   3.67   3.50   3.36   3.26   3.17
25        .05   4.24   3.39   2.99   2.76   2.60   2.49   2.40   2.34   2.28   2.24
          .01   7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
26        .05   4.23   3.37   2.98   2.74   2.59   2.47   2.39   2.32   2.27   2.22
          .01   7.72   5.53   4.64   4.14   3.82   3.59   3.42   3.29   3.18   3.09
27        .05   4.21   3.35   2.96   2.73   2.57   2.46   2.37   2.31   2.25   2.20
          .01   7.68   5.49   4.60   4.11   3.78   3.56   3.39   3.26   3.15   3.06
28        .05   4.20   3.34   2.95   2.71   2.56   2.45   2.36   2.29   2.24   2.19
          .01   7.64   5.45   4.57   4.07   3.75   3.53   3.36   3.23   3.12   3.03
29        .05   4.18   3.33   2.93   2.70   2.55   2.43   2.35   2.28   2.22   2.18
          .01   7.60   5.42   4.54   4.04   3.73   3.50   3.33   3.20   3.09   3.00
30        .05   4.17   3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16
          .01   7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
Source: Critical Values of F. (2011). Retrieved from http://faculty.vassar.edu/lowry/apx_d.html
As with the t-test, as degrees of freedom increase, the critical values decline. Unlike t, F has two df values, one for the MSbet, the other for the MSwith. Using them together identifies the critical value from the table.
• In Table 5.3, the critical value is identified by first moving across the top of the table to the value of dfbet, since dfbet is the df for the numerator of the F ratio. In this example, dfbet = 2.
• The second step is moving down the left side of the table to the value of dfwith, the df for the denominator of the F ratio. In this example, dfwith = 9.
• The intersection of the 2 across the top and 9 along the left side of the table leads to two critical values: the first (plain print) is for p = .05, and the second (boldface) is the value for testing at p = .01.
• The critical value when testing at p = .05 for 2 and 9 degrees of freedom is 4.26.
The fact that the calculated value for F is larger than the table value indicates that the difference in the number of oil changes is probably related to the price charged, or to say it the other way around, the differences are probably not just an artifact of sampling variability. The service manager can reject H0.
Locating Significant Differences
When there were only two groups involved in an independent samples t-test, it was relatively easy to interpret a significant t. It indicates that the two groups represent populations with different means. A significant F in an ANOVA with more than two groups is not so straightforward. It indicates that at least one group is significantly different from at least one other group in the study, but unless there are only two groups in the ANOVA, it is not clear which group is significantly different from which. If the null hypothesis is rejected, there are a number of possibilities, as we noted earlier.
To further pinpoint which pairs of groups are significantly different from each other, post hoc tests (a Latin expression that means “after this”) are conducted following a significant F. There are many post hoc tests, each with particular strengths, but one of the more common, and one of the easier to calculate, is called Tukey’s HSD (for “honestly significant difference”). The Tukey’s formula (5.6) produces a value which represents the smallest difference between the means of any two samples in a significant ANOVA that can be statistically significant:
Formula 5.6 \( HSD = x\sqrt{MS_{with}/n} \)
Where
x = a table value determined by the number of groups (k) in the problem and the degrees of freedom within (dfwith) from the ANOVA table
MSwith = the value from the ANOVA table
n = the number in one group when group sizes are equal. (When group sizes are unequal, a different HSD value is calculated for each pair of groups.)
Key Terms: The post hoc test provides a way to determine which group is significantly different from which when there are more than two groups in a test.
Table 5.4: Critical values for Tukey’s HSD
Columns give k, the number of treatments; rows give the df for the error term. In each pair of rows, the first (smaller) value is for p = .05 and the second is for p = .01.
df error  p        2      3      4      5      6      7      8      9     10
5         .05   3.64   4.60   5.22   5.67   6.03   6.33   6.58   6.80   6.99
          .01   5.70   6.98   7.80   8.42   8.91   9.32   9.67   9.97  10.24
6         .05   3.46   4.34   4.90   5.30   5.63   5.90   6.12   6.32   6.49
          .01   5.24   6.33   7.03   7.56   7.97   8.32   8.61   8.87   9.10
7         .05   3.34   4.16   4.68   5.06   5.36   5.61   5.82   6.00   6.16
          .01   4.95   5.92   6.54   7.01   7.37   7.68   7.94   8.17   8.37
8         .05   3.26   4.04   4.53   4.89   5.17   5.40   5.60   5.77   5.92
          .01   4.75   5.64   6.20   6.62   6.96   7.24   7.47   7.68   7.86
9         .05   3.20   3.95   4.41   4.76   5.02   5.24   5.43   5.59   5.74
          .01   4.60   5.43   5.96   6.35   6.66   6.91   7.13   7.33   7.49
10        .05   3.15   3.88   4.33   4.65   4.91   5.12   5.30   5.46   5.60
          .01   4.48   5.27   5.77   6.14   6.43   6.67   6.87   7.05   7.21
11        .05   3.11   3.82   4.26   4.57   4.82   5.03   5.20   5.35   5.49
          .01   4.39   5.15   5.62   5.97   6.25   6.48   6.67   6.84   6.99
12        .05   3.08   3.77   4.20   4.51   4.75   4.95   5.12   5.27   5.39
          .01   4.32   5.05   5.50   5.84   6.10   6.32   6.51   6.67   6.81
13        .05   3.06   3.73   4.15   4.45   4.69   4.88   5.05   5.19   5.32
          .01   4.26   4.96   5.40   5.73   5.98   6.19   6.37   6.53   6.67
14        .05   3.03   3.70   4.11   4.41   4.64   4.83   4.99   5.13   5.25
          .01   4.21   4.89   5.32   5.63   5.88   6.08   6.26   6.41   6.54
15        .05   3.01   3.67   4.08   4.37   4.59   4.78   4.94   5.08   5.20
          .01   4.17   4.84   5.25   5.56   5.80   5.99   6.16   6.31   6.44
16        .05   3.00   3.65   4.05   4.33   4.56   4.74   4.90   5.03   5.15
          .01   4.13   4.79   5.19   5.49   5.72   5.92   6.08   6.22   6.35
17        .05   2.98   3.63   4.02   4.30   4.52   4.70   4.86   4.99   5.11
          .01   4.10   4.74   5.14   5.43   5.66   5.85   6.01   6.15   6.27
18        .05   2.97   3.61   4.00   4.28   4.49   4.67   4.82   4.96   5.07
          .01   4.07   4.70   5.09   5.38   5.60   5.79   5.94   6.08   6.20
19        .05   2.96   3.59   3.98   4.25   4.47   4.65   4.79   4.92   5.04
          .01   4.05   4.67   5.05   5.33   5.55   5.73   5.89   6.02   6.14
20        .05   2.95   3.58   3.96   4.23   4.45   4.62   4.77   4.90   5.01
          .01   4.02   4.64   5.02   5.29   5.51   5.69   5.84   5.97   6.09
24        .05   2.92   3.53   3.90   4.17   4.37   4.54   4.68   4.81   4.92
          .01   3.96   4.55   4.91   5.17   5.37   5.54   5.69   5.81   5.92
30        .05   2.89   3.49   3.85   4.10   4.30   4.46   4.60   4.72   4.82
          .01   3.89   4.45   4.80   5.05   5.24   5.40   5.54   5.65   5.76
40        .05   2.86   3.44   3.79   4.04   4.23   4.39   4.52   4.63   4.73
          .01   3.82   4.37   4.70   4.93   5.11   5.26   5.39   5.50   5.60
Source: Tukey’s HSD Critical Values. (2011). Retrieved from http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html
Computing HSD:
1. From Table 5.4, locate the value of x by moving across the top of the table to the number of groups (k = 3, not k − 1, which was the dfbet), and then down the left side to the within degrees of freedom (dfwith = 9). The intersecting values are 3.95 and 5.43. The smaller of the two is the value when p = .05.
2. The calculation is 3.95 times the square root of .945 (the MSwith) divided by 4, the number in any one group (n).
$HSD = x\sqrt{MS_{with}/n} = 3.95\sqrt{.945/4} = 1.920$
This value indicates the minimum difference there must be between the means of any two groups for the groups to be significantly different. The sign of the difference does not matter; it is the absolute value we need.
The mean number of oil changes in each of the three months was as follows:
Ma = 3.50 (March, when oil changes cost $30)
Mb = 6.750 (April, $25)
Mc = 7.250 (May, $20)
Now, comparing each pair of groups:
The first month minus the second month:
• Ma − Mb = 3.5 − 6.75 = −3.25. The absolute difference, 3.25, exceeds the HSD value of 1.92 and is significant. Changing prices from $30 to $25 is associated with a significant increase in the number of oil changes.
The first month minus the third month:
• Ma − Mc = 3.5 − 7.25 = −3.75. The absolute difference, 3.75, exceeds 1.92 and is significant. Changing prices from $30 to $20 is associated with a significant increase in the number of oil changes.
The second month minus the third month:
• Mb − Mc = 6.75 − 7.25 = −.5. The absolute difference, .5, is less than 1.92 and is not significant. Changing prices from $25 to $20 is not associated with a significant increase in the number of oil changes.
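The HSD comparison can also be scripted. The following is a minimal sketch in Python, assuming the values reported for the oil change example (x = 3.95, MSwith = .945, n = 4, and the three monthly means); the function names `tukey_hsd` and `compare_pairs` are illustrative and are not part of any statistics library.

```python
from itertools import combinations
from math import sqrt

# Values reported in the text for the oil change example.
x = 3.95            # Table 5.4 value for k = 3 groups, df_with = 9, p = .05
ms_within = 0.945   # MS_with from the ANOVA table
n = 4               # number of scores in any one group
means = {"March ($30)": 3.50, "April ($25)": 6.75, "May ($20)": 7.25}


def tukey_hsd(x, ms_within, n):
    """Return the minimum absolute mean difference that is significant."""
    return x * sqrt(ms_within / n)


def compare_pairs(means, hsd):
    """Compare every pair of group means against the HSD value."""
    results = []
    for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
        diff = abs(mean_a - mean_b)  # the sign does not matter
        results.append((name_a, name_b, diff, diff >= hsd))
    return results


hsd = tukey_hsd(x, ms_within, n)
results = compare_pairs(means, hsd)

print(f"HSD = {hsd:.3f}")
for name_a, name_b, diff, significant in results:
    label = "significant" if significant else "not significant"
    print(f"{name_a} vs. {name_b}: difference = {diff:.3f} ({label})")
```

Because the loop runs over every possible pair, the same sketch works unchanged when more groups are added, provided the table value x is looked up again for the new k and dfwith.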
When there are several groups involved, it is more helpful to
create a table that indicates
the differences between all possible pairs of means, as Table 5.5
does. Using a matrix such
as the one below to summarize the difference between each pair
of means makes it easier
to interpret the HSD value. The shaded cells are blank because
they either represent a dif-
ference between a group mean and itself (which would be zero),
or a redundant difference
that has already been presented in the opposite side of the
“diagonal” of the matrix. For
example, the absolute difference between March and April is the
same as the absolute dif-
ference between April and March.
Table 5.5: Presenting Tukey’s HSD results in a table
$HSD = x\sqrt{MS_{with}/n} = 3.95\sqrt{.945/4} = 1.920$
Any difference between pairs of means of 1.920 or greater is a statistically significant difference.

                          March ($30)    April ($25)      May ($20)
                          M = 3.50       M = 6.750        M = 7.250
March ($30)  M = 3.50                    Diff = 3.250*    Diff = 3.750*
April ($25)  M = 6.750                                    Diff = .50
May ($20)    M = 7.250
Mean differences marked with an asterisk (*) are statistically significant.
Determining Practical Importance
When F is significant, there are two additional questions to
address. One we just took
up when we used the post hoc test to determine which groups
are significantly differ-
ent from which other groups. In the oil change example above,
the results indicate to
the service manager that lowering the price from $30 to $25 or
to $20 made a significant

difference, but lowering the price from $25 to $20 did not. In
this case, $25 seems to be
the optimal price point. Note that in the calculations above the mean differences were negative, indicating that the number of oil changes sold was indeed lower when the price was higher (Ma < Mb < Mc), which is consistent with what was expected. However, based on the results above, the additional half an oil change sold, on average, is not a statistically significant gain, so it does not justify lowering the price from $25 to $20, at least from a statistical standpoint.
From an economic standpoint, the manager may think that a $5 sacrifice is worthwhile if it will bring in at least an additional half an oil change. However, the statistically savvy manager would understand that this additional half an oil change could have occurred by chance, since the result was not statistically significant. An accounting-savvy manager would also take into consideration the impact of lowering the price to $20 on revenues, as well as the costs of and profit margin on oil changes, before making a final business decision.
The other question in an analysis of variance problem is about the importance of the significant result. Recall that a statistically significant outcome is one that probably did not occur by chance, but statistical significance does not mean that the outcome is necessarily important. For t-tests, Cohen’s d was the statistic we used to determine the importance of an outcome. The effect size we will use for analysis of variance is called eta-squared ($\eta^2$). Like Cohen’s d, it addresses the issue of practical importance by answering the question: How much of the difference between the groups can be attributed to the independent variable?
In the oil change example, the analysis was of whether the number of oil changes sold is affected by the price of the oil change. The effect size statistic indicates how much of that increase in sales is due to the price reduction, and by inference, how much is due to other factors.
The formula for eta-squared is as follows:
Formula 5.7 $\eta^2 = SS_{bet}/SS_{tot}$
Note that it involves values that are already available from the ANOVA table. The eta-squared statistic is a very straightforward ratio of between-groups variability (SSbet) to total variability (SStot). For the oil changes problem, SSbet = 33.168 and SStot = 41.672.
$\eta^2 = SS_{bet}/SS_{tot} = 33.168/41.672 = .796$
The value indicates that about 80% of the variability in the number of oil changes across the price points can be attributed to the price reduction alone. The balance is due to other factors not accounted for in the analysis. They might include such variables as the weather (maybe the weather was adverse during March and that affected sales), special sales on other products or services that might have brought in additional customers, and so on.
Among measures of effect size, eta-squared is one of the more liberal. It is specific to the particular sample and should not be used to predict what the effect size might be for some new analysis.
If there were no error variance, all the differences in the scores would be attributable to the independent variable and the sums of squares for between and for total would have the same values. In that instance, the effect size would be 1.0. With human subjects there is always error variance; scores fluctuate for reasons other than the IV, but it’s important to know that 1.0 is the “upper bound” for this effect size. The lower bound is 0, of course: none of the variance is explained. But an $\eta^2 = 0$ is also unlikely since the only time the effect size is calculated is when F is significant, and that can only happen when the effect of the IV is great enough that the ratio of MSbet to MSwith exceeds the critical value.
Key Terms: Eta-squared indicates the proportion of the difference between groups’ scores that can be explained by the independent variable.
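Eta-squared for the oil change problem is a single division, so a short sketch is enough to reproduce it. The Python below assumes only the two sums of squares reported in the text; the function name `eta_squared` is illustrative.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variability associated with the IV."""
    if ss_total <= 0:
        raise ValueError("ss_total must be positive")
    return ss_between / ss_total


# Sums of squares reported for the oil change example.
ss_between = 33.168
ss_total = 41.672

effect = eta_squared(ss_between, ss_total)
unexplained = 1 - effect  # share left to factors outside the analysis

print(f"eta-squared = {effect:.3f}")
print(f"unexplained = {unexplained:.3f}")
```

The second value makes the "balance due to other factors" explicit: whatever eta-squared does not account for is error variance.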
5.3 Requirements for the One-Way ANOVA
Any statistical test is based on a number of assumptions related
to the nature of the data. In the case of the one-way ANOVA,
the “one” indicates an important condition.
• This particular test can accommodate just one independent
variable.
That one variable can have any number of categories, but there
can be just one IV. In the
number of oil changes example, the IV was the price of the oil
change. The test can accom-
modate any number of prices over time and test their effects on
the number of oil changes
sold, but it cannot factor in a second IV such as the day of
the week. In that regard, it is like
the independent samples t-test that also accommodates just one
IV, but in the case of the
independent samples t-test, the IV is limited to precisely two
categories.

• The categories of the IV must be independent.
Also like the independent samples t-test, the subjects in the
groups must be separate
from each other. The groups cannot include the same subjects
measured multiple times,
although there is a variation of ANOVA that will accommodate
repeated measures; this
will be discussed in Chapter 7.
• The IV must be measured on a nominal scale.
The IV in ANOVA must be treated as a categorical variable
since the analysis is whether
there are significant differences in the value of the dependent
variable (number of oil
changes in our example) for the different categories. Strictly
speaking, the categories of the
independent variable can involve data of any scale. In the oil
change example, the catego-
ries were defined by the price of the oil change which is a ratio
scale, continuous variable.
But the independent variable must be treated as though it was
categorical. It would have
been impractical for the service manager to run promotions at
each possible price point to
determine the exact value that would bring in the highest
number of customers. Instead,
three “treatments” were selected, which in this case were the
three specific price points of
$20, $25, and $30. For this reason, ANOVA was the right
choice for this analysis. ANOVA is
applicable any time the data can be classified and grouped
based on the IV. Had the man-
ager more incrementally varied the price over a long period of
time and observed changes

in sales, another type of analysis (e.g., correlation or regression,
discussed in chapters to
come), would have been a better choice for the data.
• The DV must be measured on an interval or ratio scale.
The DV in the example was the number of oil changes, a ratio
scale variable.
• The groups in the analysis must be similarly distributed.
The technical description for this is that there must be
homogeneity of variance. It means
that, for example, the groups all have reasonably similar
standard deviations.
• Finally, using ANOVA assumes that the samples are drawn
from a normally
distributed population.
Although the above are requirements, Fisher’s pro-
cedure can tolerate a certain amount of deviation
from these requirements. In the cryptic language
of statistics, the test is quite “robust,” particularly
where minor variations from data normality and
homogeneity of variance are concerned.
Comparing ANOVA and the Independent t

Checking the assumptions associated with the independent samples t-test in Chapter 4 indicates that Gosset’s test and Fisher’s ANOVA share several assumptions. Although they employ distinct statistics (the sum of squares within instead of the standard error of the difference, for example), the one-way ANOVA truly is an extension of the t-test. This can be illustrated by completing ANOVA and the independent t-test for the same data.
Suppose that the human resources manager of an organization
would like to assess the
differences in work-life balance of the employees in the
marketing department versus
those in the production department. The dependent variable he
selects is the amount of
work completed after hours at home per week. In this example,
the IV or grouping vari-
able is department (marketing versus production). The data are
as follows:
Marketing: 3, 4, 5, 7, 7, 9, 11, 12
Production: 0, 1, 3, 3, 4, 5, 7, 7
Review Question D: How are the mean square values calculated in an ANOVA?
Key Terms: Homogeneity of variance is a condition for ANOVA. It indicates that measures for all groups in the analysis are distributed similarly.
Calculating some of the basic statistics yields the following:

              M      s       SEM     SEd     MG
Marketing:    7.25   3.240   1.146
                                     1.458   5.50
Production:   3.75   2.550   .901

First the t-test:
$t = \frac{M_1 - M_2}{SE_d} = \frac{7.25 - 3.75}{1.458} = 2.401$; $t_{.05(14)} = 2.145$
The difference is significant. Those in marketing (M1) take
significantly more work home
than those in production (M2). The human resources manager
can conclude that employ-
ees in the marketing department are more likely to experience
work-life conflict.
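The t-test above can be checked with a few lines of Python. This is a sketch under one stated assumption: the standard error of the difference is taken as the square root of the summed squared standard errors of the mean, which reproduces the SEd of 1.458 reported above. The function name `independent_t` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Work completed at home per week, from the text.
marketing = [3, 4, 5, 7, 7, 9, 11, 12]
production = [0, 1, 3, 3, 4, 5, 7, 7]


def independent_t(group_1, group_2):
    """Return (t, SEd) for two independent samples.

    Assumption: SEd = sqrt(SEM1^2 + SEM2^2), which matches the
    value reported in the text for these equal-sized groups.
    """
    sem_1 = stdev(group_1) / sqrt(len(group_1))
    sem_2 = stdev(group_2) / sqrt(len(group_2))
    se_d = sqrt(sem_1 ** 2 + sem_2 ** 2)
    t = (mean(group_1) - mean(group_2)) / se_d
    return t, se_d


t_value, se_d = independent_t(marketing, production)
t_critical = 2.145  # t.05(14), as given in the text

print(f"SEd = {se_d:.3f}, t = {t_value:.3f}")
print("significant" if abs(t_value) > t_critical else "not significant")
```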
Now the ANOVA:
$SS_{tot} = \sum (x - M_G)^2 = 168$
• Verify that the result of subtracting MG from each score in both groups, squaring the differences, and summing the squares = 168.
$SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b$
• This one isn’t too lengthy to do here:
$(7.25 - 5.50)^2(8) + (3.75 - 5.50)^2(8) = 24.5 + 24.5 = 49$
$SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 = 119$
• Verify that the result of subtracting the group means from each score in the particular group, squaring the differences, and summing the squares = 119.
• Check that SSwith + SSbet = SStot: 119 + 49 = 168.

Source     SS     df    MS     F
Total      168    15
Between    49     1     49     5.765; F.05(1,14) = 4.60
Within     119    14    8.5

Like the t-test, ANOVA indicates that the amount of work completed at home is significantly different for the two groups. Both tests reached the same conclusion about whether the result is significant, but the kinship between the two procedures involves more than coming to the same conclusion.
• Note that the calculated value of t = 2.401, and the calculated value of F = 5.765.
• If the value of t is squared, it equals the value of F: $2.401^2 = 5.765$.
• The same is true for the critical values:
a. $t_{.05(14)} = 2.145$
b. $F_{.05(1,14)} = 4.60$
c. $2.145^2 = 4.60$
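The relationship between t and F can be confirmed numerically. The sketch below computes the ANOVA table for the marketing and production data from the sums-of-squares definitions used in this chapter and then takes the square root of F; the function name `one_way_anova` and the dictionary keys are illustrative choices.

```python
from math import sqrt
from statistics import mean

marketing = [3, 4, 5, 7, 7, 9, 11, 12]
production = [0, 1, 3, 3, 4, 5, 7, 7]


def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA table as a dictionary."""
    scores = [x for group in groups for x in group]
    grand_mean = mean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f": ms_between / ms_within,
    }


table = one_way_anova([marketing, production])
t_from_f = sqrt(table["f"])  # with two groups, the square root of F equals |t|

print(f"F = {table['f']:.3f}, square root of F = {t_from_f:.3f}")
```

The square-root shortcut holds only when there are exactly two groups (dfbet = 1); with three or more groups there is no single t to compare against.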
When there are two groups, comparing t-test results to the one-
way ANOVA makes it clear
that the two tests are equivalent. There is more calculation in
the ANOVA, so the tendency
is to use the t-test for two groups, but the point is that the two tests
are consistent.
One-Way ANOVA on Excel

Excel’s Analysis ToolPak includes an ANOVA procedure for
PC users. To illustrate its use,
suppose that a home builder is approached by a customer who
wants to move in as soon
as possible. The customer chooses three home designs that she
likes and asks the home
builder: “These three designs take approximately the same time
to build, right?” To com-
pare the three designs on speed of completion, the builder
randomly selects 10 homes that
he built in the past based on each design. The data for the
number of days to build each
home are as follows:
Design A: 33, 35, 38, 39, 42, 44, 44, 47, 50, 52
Design B: 27, 36, 37, 37, 39, 39, 41, 42, 45, 46
Design C: 22, 24, 25, 27, 28, 28, 29, 31, 33, 34
1. First create the data file in Excel. Enter the names of the
designs in cells A1, B1,
and C1.
2. In the columns below those labels, enter the number of days,
beginning in
cell A2 for Design A, B2 for Design B, and C2 for Design C.
Once the data are
Once the data are
entered and checked for accuracy, continue with the following
steps.
3. Click the Data tab at the top of the page.
4. At the extreme right, choose Data Analysis.
5. In the Analysis Tools window select Anova: Single Factor and click OK.
6. Indicate where the data are located in the Input Range. In the example here, the range is A2:C11.
7. Note that the default is Grouped by Columns. If
the data are arrayed along rows instead of col-
umns, this would be changed.
Because we designated A2 instead of A1 as the point where the
data begin, there is no
need to indicate that labels are in the first row.
8. Select Output Range and enter a cell location where you wish
the display of
the output to begin—for example, A13.
9. Click OK.
Review Question E:
For a two-group test
of significant differ-
ences, how will the t
value compare to the
F value?
If column A is widened to make it easier to read the output, and
the decimal values are set
to 3, the result is the screen-shot in Figure 5.4.
Figure 5.4: ANOVA on Excel

Below the data set Excel produces two tables. The first provides
descriptive statistics.
The second table looks very much like the longhand table of
results for the number of oil
changes example, except that:
• The figures for total follow those for between and within
instead of preceding
them.
• The column titled “P-value” indicates the probability that an F
of this magni-
tude could have occurred by chance.
Before the display was changed to 3 decimals, the “P-value” was 4.32E-06. The “E-06” is scientific notation. It is a shorthand way to say that the actual value is p = .0000043, that is, 4.3 with the decimal moved 6 places to the left. The probability is far below the p = .05 standard for statistical significance. The P-value indicates the probability that one population distribution could contain all these differences in speed of construction. It is extremely unlikely, which is to say that the differences are statistically significant. At least one pair (Designs A&B, Designs A&C, Designs B&C) involves designs with significantly different completion times. Finding out which pair(s) is/are different would require additional post hoc tests, of course, using Tukey’s HSD, which can easily be calculated using the output from Figure 5.4.
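For readers working outside Excel, the two output tables can be approximated in Python. This is a sketch using only the standard library, so it stops at the F ratio; obtaining the P-value requires the F distribution, which the standard library does not provide (if SciPy happens to be installed, `scipy.stats.f_oneway` reports both F and p). The variable names are illustrative.

```python
from statistics import mean, variance

# Days to build each home, from the text.
days_to_build = {
    "Design A": [33, 35, 38, 39, 42, 44, 44, 47, 50, 52],
    "Design B": [27, 36, 37, 37, 39, 39, 41, 42, 45, 46],
    "Design C": [22, 24, 25, 27, 28, 28, 29, 31, 33, 34],
}

# First table: descriptive statistics for each group.
summary = {
    name: {
        "count": len(days),
        "sum": sum(days),
        "average": mean(days),
        "variance": variance(days),  # sample variance (n - 1)
    }
    for name, days in days_to_build.items()
}

# Second table: the ANOVA itself, built from the summary rows.
all_days = [d for days in days_to_build.values() for d in days]
grand_mean = mean(all_days)
ss_between = sum(
    row["count"] * (row["average"] - grand_mean) ** 2 for row in summary.values()
)
ss_within = sum((row["count"] - 1) * row["variance"] for row in summary.values())
df_between = len(days_to_build) - 1
df_within = len(all_days) - len(days_to_build)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

for name, row in summary.items():
    print(f"{name}: n = {row['count']}, M = {row['average']:.3f}, "
          f"variance = {row['variance']:.3f}")
print(f"Between: SS = {ss_between:.3f}, df = {df_between}, MS = {ms_between:.3f}")
print(f"Within:  SS = {ss_within:.3f}, df = {df_within}, MS = {ms_within:.3f}")
print(f"F = {f_ratio:.3f}")
```

Computing SSwith from the group variances, as done here, gives the same result as summing squared deviations score by score, because each variance is that group's sum of squares divided by n − 1.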
5.4 Another One-Way ANOVA
To reinforce what has been learned, consider one more example.
The manager of a machine tool company has three major clients.
The question is whether sales to these
three significantly differ over a three-month period. The sales
totals in thousands of dol-
lars are as follows for the period:
Client 1: 23.5, 14.3, 11.0, 17.0
Client 2: 36.6, 14.7, 19.0, 14.0
Client 3: 20.1, 22.7, 27.4, 16.6
The relevant means are as follows:
M1 = 16.450
M2 = 21.075
M3 = 21.700
MG = 19.742
$SS_{tot} = \sum (x - M_G)^2 = (23.5 - 19.742)^2 + (14.3 - 19.742)^2 + \ldots + (16.6 - 19.742)^2 = 548.209$
$SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b + (M_c - M_G)^2 n_c$
$= (16.450 - 19.742)^2(4) + (21.075 - 19.742)^2(4) + (21.700 - 19.742)^2(4) = 65.792$
$SS_{with} = SS_{tot} - SS_{bet} = 548.209 - 65.792 = 482.417$. This reflects the fact that SSwith is what is left over once SSbet is removed from SStot.

Source     SS         df    MS        F
Total      548.209    11
Between    65.792     2     32.896    .614; F.05(2,9) = 4.26
Within     482.417    9     53.602
The difference between sales to the three clients is not statistically significant. There appears to be some difference, comparing the mean sales of the first client to those of the others, but there is so much within-group variance that the differences between clients must be attributed to sampling variability. The differences within the groups (MSwith) overwhelm the differences between (MSbet) in the F ratio.
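The claim that SSwith is what is left over once SSbet is removed from SStot can be checked by computing SSwith both ways. The sketch below does so for the client sales data, using only the figures given in this section; the variable names are illustrative.

```python
from statistics import mean

# Sales totals in thousands of dollars, from the text.
clients = {
    "Client 1": [23.5, 14.3, 11.0, 17.0],
    "Client 2": [36.6, 14.7, 19.0, 14.0],
    "Client 3": [20.1, 22.7, 27.4, 16.6],
}

scores = [x for sales in clients.values() for x in sales]
grand_mean = mean(scores)

ss_total = sum((x - grand_mean) ** 2 for x in scores)
ss_between = sum(
    len(sales) * (mean(sales) - grand_mean) ** 2 for sales in clients.values()
)

# SS_with two ways: by subtraction, and directly from each group's own mean.
ss_within_by_subtraction = ss_total - ss_between
ss_within_direct = sum(
    (x - mean(sales)) ** 2 for sales in clients.values() for x in sales
)

df_between = len(clients) - 1
df_within = len(scores) - len(clients)
f_ratio = (ss_between / df_between) / (ss_within_direct / df_within)
f_critical = 4.26  # F.05(2,9), as given in the text

print(f"SS_tot = {ss_total:.3f}, SS_bet = {ss_between:.3f}")
print(f"SS_with = {ss_within_direct:.4f} (direct), "
      f"{ss_within_by_subtraction:.4f} (by subtraction)")
print(f"F = {f_ratio:.3f}; significant: {f_ratio > f_critical}")
```

Agreement between the two SSwith values is a useful arithmetic check when working a problem by hand, since an error in either SStot or SSbet will show up as a mismatch.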
Chapter Summary
This chapter is the natural extension of Chapters 3 and 4. Like the z-test and the t-tests, analysis of variance (ANOVA) is a test of significant differences. With each procedure, whether z, t, or F, the test statistic is a ratio of the differences between groups to the differences within.
Like the independent samples t-test, in ANOVA the IV is
nominal and the DV is interval
or ratio. Both require that groups be independent, and both
procedures are limited to one
independent variable that defines the groups and indicates
which subjects are receiving
which treatment or condition. The reason for moving to
ANOVA was the need to conduct
comparisons of multiple groups without running multiple tests
with the same data, which
can increase the probability of type I error. Analysis of variance
allows any number of
groups to be compared at a time, with just one test. When t and
F are calculated for a two-
group test, both tests reach the same conclusion. More specifically, $t^2 = F$ (Objectives 1, 2, and 3).
Multiple groups introduced a problem not present in the t-test,
however. When there are
more than two groups and the F is significant, it is not apparent
which group(s) is/are
significantly different from which. That problem was solved by
calculating a post hoc test,
Tukey’s HSD (Objective 4).
Knowing that a result is statistically significant indicates that
an outcome is probably not
random. It does not establish the importance of the outcome,
however. As we did with the t-test, the question of the importance of a significant outcome was addressed by calculating an effect size. In general terms, the eta-squared value
answers the same question
answered with Cohen’s d for the t-test results: How important is
the effect? The added
dimension is that the statistic indicates the proportion of the
variance in scores that is
explained by the independent variable (Objective 5).
Answers to the Review Questions
A. With successive significance tests on the same data, the probability of type I (alpha) error increases with each test.
B. With four groups in an analysis there are 6 possible different pairs. To determine this, take the total number of groups times the total number of groups minus one, and divide the product by 2: (4 × 3)/2 = 6.
C. The equivalent of the standard error of the difference in the t-test is the MSwith in ANOVA. Both measure within-group variance.
D. The mean square (MS) values in ANOVA are determined by dividing the SS values by their degrees of freedom.
E. $t^2 = F$
Chapter Formulas
Formula 5.1 $SS_{tot} = \sum (x - M_G)^2$
The total sum of squares; the total of all variance from all sources in an ANOVA problem.
Formula 5.2 $SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b + (M_c - M_G)^2 n_c$
The sum of squares between is a measure of how much particular groups differ from the mean of all the data. It measures the effect of the IV, the “grouping variable” or the “treatment effect.”
Formula 5.3 $SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 + \sum (x_c - M_c)^2$
The sum of squares within is a measure of how much individuals within a group differ from the mean of their sample when exposed to the same level of the IV(s). It’s a measure of error variance.
Formula 5.4 $SS_{tot} = SS_{bet} + SS_{with}$
Formula 5.5 $F = MS_{bet}/MS_{with}$
The F statistic in ANOVA.
Formula 5.6 $HSD = x\sqrt{MS_{with}/n}$
Tukey’s HSD is a post hoc test used to determine which groups in an ANOVA are significantly different from each other.
Formula 5.7 $\eta^2 = SS_{bet}/SS_{tot}$
Eta-squared is an estimate of effect size. It suggests the proportion of the difference in scores between significantly different groups that can be explained by the independent variable.
Management Application Exercises
1. A fleet of cabs required servicing in a particular month that
took them out of
service for 3.5, 3.8, 4.2, 4.5, 4.7, 5.3, 6.0, and 7.5 hours. What is the sum of squares
value for these data?
2. Identify the symbol or statistic in a one-way ANOVA that
does the following:
a. The statistic that indicates the mean amount of difference
between groups
b. The symbol that indicates the total number of participants
c. The symbol that indicates the number of groups
d. The mean amount of uncontrolled variability

3. There is an advertised special on smart phones at two outlets
at different locations.
The sales data for eight successive days are as follows:
Outlet A: 13, 14, 16, 16, 17, 18, 18, 18
Outlet B: 11, 12, 12, 14, 14, 14, 14, 16
Complete the problem as an ANOVA. Is the location difference
statistically
significant?
4. Complete problem 3 as an independent t-test and demonstrate
the relationship
between $t^2$ and F.
5. If Outlet C offers a free data plan in connection with
purchases of smart phones
and the sales for 8 days are 14, 17, 19, 19, 21, 22, 25, and 27,
how do sales from this
outlet compare to those in Outlets A and B in item 3?
6. A labor specialist evaluates the number of hours 8 employees work the week before Thanksgiving in each of a bakery (M = 44.5), an electronics store (M = 36.2), and a grocery outlet (M = 40.0). If MSwith = 24.50 and dfwith = 21, which group(s) is/are significantly different from which?
7. A courier service is comparing the number of parcels
delivered on each of 5 work-
days in a major city one, two, and three weeks after opening its
doors.
1 week: 0, 5, 7, 8, 8
2 weeks: 3, 5, 12, 16, 17
3 weeks: 11, 15, 16, 19, 22
a. Is F significant?
b. Which week(s) is/are different from which?
c. What does the effect size indicate?
8. Regarding item 7:
a. What’s the IV?
b. What’s the scale of the IV?
c. What’s the DV?
d. What’s the scale of the DV?
9. If a shift manager is comparing the number of sick days taken
by people in four
departments:
a. What will be the number of degrees of freedom for between?
b. If there are six people in each department, what will be the
degrees of free-
dom for within?
10. The manager of an agency providing temporary employees
to city offices is ana-
lyzing the number of days temporary hires typically work in
different types of
positions. The data are as follows:
Legal clerical: 2, 1, 4, 4, 2, 5, 6

Accounting firms: 3, 6, 4, 5, 5, 7, 8
Insurance: 5, 4, 7, 9, 9, 8, 11
a. Are there significant differences in the length of time temps
work in the differ-
ent industries?
b. How much of the difference can be explained by the
industry?
c. Which groups are significantly different from which?
Key Terms
• Analysis of variance is the name given to Fisher’s test that
allows one to detect
significant differences among any number of groups.
• Error variance refers to variability in a measure unrelated to
the variables being
analyzed.
• One-way ANOVA is ANOVA with one independent variable.
• Factorial ANOVA is ANOVA with more than one independent
variable, more than
one factor.
• Sum of squares is the variance measure in analysis of
variance. It is literally the sum
of squared deviations between a set of scores and its mean. Sum
of squares total is
total variance from all sources. Sum of squares between is the
variability related to
the independent variable. Sum of squares within is variability stemming from different responses from individuals in the same group. It is exclusively error variance.
• Mean square is the sum of squares divided by its degrees of
freedom. This division
allows the mean square to reflect the average amount of
variability from a source.
• Post hoc tests are conducted after a significant ANOVA, or some similar test, to identify which among multiple possible differences are statistically significant.
• Eta-squared is a measure of effect size for ANOVA. It
estimates the amount of vari-
ability in the DV explained by the IV.
• When there is homogeneity of variance, multiple groups of
data are distributed
similarly.