iStockphotoThinkstockchapter 6Analysis of Variance (A.docx

iStockphoto/Thinkstock
chapter 6
Analysis of Variance (ANOVA)
Learning Objectives
After reading this chapter, you will be able to. . .
1. explain why it is a mistake to analyze the differences
between more than two groups with
multiple t-tests.
2. relate sum of squares to other measures of data variability.
3. compare and contrast t-test with ANOVA.
4. demonstrate how to determine which group is significant in
an ANOVA with more than
two groups.
5. explain the use of eta-squared in ANOVA.
6. present statistics based on ANOVA results in APA format.
7. interpret results and draw conclusions of ANOVA.
8. discuss nonparametric Kruskal-Wallis H-test compared to the
ANOVA.
CN

CO_LO
CO_TX
CO_NL
CT
CO_CRD
suk85842_06_c06.indd 183 10/23/13 1:40 PM
CHAPTER 6Section 6.1 One-Way Analysis of Variance
Ronald. A. Fisher was present at the creation of modern
statistical analysis. During the early part of the 20th century,
Fisher worked at an agricultural research station in rural
southern England. In his work analyzing the effect of pesticides
and fertilizers on crop
yields, he was stymied by the limitations in Gosset’s
independent t-test, which allowed
him to compare only one pair of samples at a time. In the effort
to develop a more com-
prehensive approach, Fisher created analysis of variance
(ANOVA).
Like Gosset, he felt that his work was important enough to
publish, and like Gosset in his
effort to publish t-test, Fisher had opposition. In Fisher’s case,
the opposition came from
a fellow statistician, Karl Pearson. This is the same man who
created the first department
of statistical analysis at University College, London. In

Chapters 9 and 11 you will study
some of Pearson’s work with correlations as well as Spearman
rho (r) and Chi-square (x2),
which are the analysis of categorical (nominal and ordinal) data.
Pearson also founded
what is probably the most prominent journal for statisticians,
Biometrika. Pearson was an
advocate of making one comparison at a time and of using the
largest groups possible to
make those comparisons.
When Fisher submitted his work to Pearson’s journal with
procedures suggesting that
samples can be small and many comparisons can be made in the
same analysis, Pear-
son rejected the manuscript. So began a long and increasingly
acrimonious relationship
between two men who would become giants in the field of
statistical analysis and end up
in the same department at University College. Interestingly,
Gosset also gravitated to the
department and managed to get along with both of them.
Fisher’s contributions affect more than this chapter. Besides the
development of the
ANOVA, the concept of statistical significance is his as well as
hypothesis testing discussed
in Chapter 5. Note that although a ubiquitous phenomenon,
significance testing itself is
not always accepted by other statisticians. One such adversary
is William [Bill] Kruskal,
who consequently derived the nonparametric version of the
ANOVA—the Kruskal-Wallis
H-test, which is discussed in this chapter. Despite these
philosophical and statistical dif-
ferences, R. A. Fisher made an enormous contribution to the

field of quantitative analysis,
as did his nemesis, Karl Pearson with additional statistical
contributions by William Sealy
Gosset and Bill Kruskal.
6.1 One-Way Analysis of Variance
In any experiment, scores and measurements vary for many
reasons. If a researcher is interested in whether children will
emulate the videotaped behavior of adults whom
they have watched, any differences in the children’s behavior
from before they see the
adults to after are attributed primarily to the adults’ behaviors.
But even if all of the chil-
dren watch with equal attentiveness, it is likely there will be
differences in their behaviors
H1
TX_DC
BLF
TX
BL
BLL
suk85842_06_c06.indd 184 10/23/13 1:40 PM
after the video. Some of those differences might stem from age

differences among the chil-
dren. Perhaps the amount of exposure children otherwise have
to television will prompt
differences in their behavior. Probably differences in their
background experiences will
also affect the way they behave.
In an analysis of how behavior changes as a result of watching
the video, the independent
variable (IV) is whether or not the children have seen the video.
Changes in their behavior,
the dependent variable (DV), reflect the effect of the IV, but
they also reflect all the other
factors that prompt the children to behave differently. An IV is
also referred to as a factor,
particularly in procedures that involve more than one IV.
Behavior changes that are not
related to the IV reflect the presence of error variance attributed
by other factors known
as confounding variables.
When researchers work with human subjects, some level of
error variance is inescap-
able. Even under tightly controlled conditions where all
members of a sample receive
exactly the same treatment, the subjects are unlikely to respond
the same way. There are
just too many confounding variables that also affect their
behavior. Fisher ’s approach
was to calculate the total variability in a problem and then
analyze it, thus the name
analysis of variance.
Any number of IVs can be included in an ANOVA. Here, we are
interested primarily in
ANOVA in its simplest form, a procedure called one-way

ANOVA. The “one” in one-
way ANOVA indicates that there is just one IV in this model. In
that regard, one-way
ANOVA is similar to the independent-samples t-test discussed
in Chapter 5. Both tests
have one IV and one DV. The difference is that the independent
t-test
allows for an IV with just two groups, but the IV in ANOVA
can be any
number of groups generally more than two. In other words, a
one-way
ANOVA with just two groups is the same as an independent-
samples
t-test where the statistic calculated in ANOVA, F is equal to t2;
this is
addressed and illustrated in Section 6.5.
The ANOVA Advantage
The ANOVA and the t-test both answer the same question: Are
there significant differ-
ences between groups? So why bother with another test when
we have the t-test? Suppose
someone has developed a group therapy program for people with
anger management
problems and the question is, are there significant differences in
the behavior of clients
who spend (a) 8, (b) 16, and (c) 24 hours in therapy over a
period of weeks? Why not
answer the question by performing three t-tests as follows?
1. Compare the 8-hour group to the 16-hour group.
A What does the

“one” in one-way
ANOVA refer to?
Try It!
suk85842_06_c06.indd 185 10/23/13 1:40 PM
The Problem of Multiple Comparisons
These three tests represent all possible comparisons, but there
are two problems with this
approach. First, all possible comparisons is a good deal more
manageable if there are three
groups than if there are, say, five groups. If there were five
groups, labeled a through e,
note the number of comparisons needed to cover all possible
comparisons:
1. a to b
2. a to c
3. a to d
4. a to e
5. b to c
6. b to d
7. b to e
8. c to d
9. c to e
10. d to e
All possible comparisons among three tests involve 10 tests as
seen above to cover all the

combinations of tests.
Family-Wise Error
The other problem is an issue of inflated error in hypothesis
testing when doing multiple
tests known as family-wise error. Recall that the potential for
type I error (a) is deter-
mined by the level at which the test is conducted. At a 5 .05,
any significant finding will
result in a type I error an average of 5% of the time. However,
that level of error assumes
that each test is conducted with new data thereby increasing the
family-wise error rate
(FWER). Specifically, if statistical testing is done repeatedly
with the same data, the poten-
tial for type I error does not remain fixed at .05 (or whatever
the level of the testing), but
grows. In fact, if 10 tests are conducted in succession with the
same data as with groups
labeled a, b, c, d, and e mentioned earlier, and each finding is
significant, by the time the
10th test is completed, the potential for alpha error is FWER 5
.40 or a 40% error prob-
ability, as the following procedure illustrates:
P
a
5 1 2 (1 2 pa)n
Where
Pa 5 the probability of alpha error overall
pa 5 the probability of alpha error for the initial significant
finding

n 5 the number of tests conducted where the result was
significant
P
a
5 1 2 (1 2 .052 10
5 1 2 .599
FWER 5 .401
The probability of a type I error at this point is 4 in 10 or 40%!
suk85842_06_c06.indd 186 10/23/13 1:40 PM
The business of raising the (1 2 pa) difference to the 10th power
(or however many com-
parisons there are) is not only tedious, but the more important
problem is that the prob-
ability of a type I error does not remain fixed when there are
successive significant results
with the same data. Therefore, using multiple t-tests is never a
good option.
In the end, running one test in an overall ANOVA will control
for inflated FWER. An
ANOVA is therefore termed an omnibus test, as it will test the
overall significance of the
research model based on the differences between sample means.
It will not tell you which
two means are significantly different, which is why follow-up
post hoc comparisons are

executed. These concepts will be discussed in further detail
throughout the chapter.
The Variance in Analysis of Variance (ANOVA)
To analyze variance, Fisher began by calculating total
variability from all sources. He
recognized that when scores vary in a research study, they do so
for two reasons. They
vary because the independent variable (the “treatment”) has had
an effect, and they vary
because of factors beyond the control of the researcher,
producing the error variance
referred to earlier.
The test statistic in ANOVA is the F ratio (named for Fisher),
which is treatment variance
(variance that can be explained by the IV on the DV) divided by
error variance (variance
that cannot be explained due to confounding variables on the
DV). When F is large, it indi-
cates that the difference between at least two of the groups in
the analysis is not random
and that there are significant differences between at least two
group means. When the F
ratio is small (close to a value of 1), it indicates that the IV has
not had enough impact to
overcome error variability, and the differences between groups
are not significant. We will
return to the F ratio when we discuss Formula 6.4.
Variance Between and Within Groups
If three groups of the same size are all selected from one
population, they could be repre-
sented by three distributions, as shown in Figure 6.1. They do

not have exactly the same
mean, but that is because even when they are selected from the
same population, samples
are rarely identical. Those initial differences between sample
means indicate some degree
of sampling error.
Figure 6.1: Three groups drawn from the same population
suk85842_06_c06.indd 187 10/23/13 1:40 PM
The reason that each of the three distributions has width is that
there are differences within
each of the groups. Even if the sample means were the same,
individuals selected to the
same sample will rarely manifest precisely the same level of
whatever is measured. If a
population is identified—for example, a population of the
academically gifted—and a
sample is drawn from that population, the individuals in the
sample will not all have the
same level of ability. Because they are all members of the
population of the academically
gifted, they will probably all be higher than the norm for
academic ability, but there will
still be differences in the subjects’ academic ability within the
sample.
These differences within are sources of error variance.
The treatment effect is indicated in how the IV affects the way
the DV

is manifested. For example, three groups of subjects are
administered
different levels of a mild stimulant (the IV) to see the effect on
level
of attentiveness. The issue in ANOVA is whether the IV, the
treat-
ment, creates enough additional between-groups variability to
exceed
any error variance. Ultimately, the question is whether, as a
result of
the treatment, the samples still represent populations with the
same
mean, or whether, as is suggested by the distributions in Figure
6.2,
they may represent populations with different means.
Figure 6.2: Three groups after the treatment
The within-groups variability in these three distributions is the
same as it was in the dis-
tributions in Figure 6.1. It is the between-groups variability that
has changed in Figure 6.2.
More particularly, it is the difference between the group means
that has changed. Although
there was some between-groups variability before the treatment,
it was comparatively
minor and probably reflected sampling variability. After the
treatment, the differences
between means are much greater. What F indicates is whether
group differences are great
enough to be statistically significant not due to chance.
The Statistical Hypotheses in One-Way ANOVA
The hypotheses are very much like they were for the
independent t-test, except that they

accommodate more groups. For the t-test, the null hypothesis is
written H0: m1 5 m2. It
indicates that the two samples involved were drawn from
populations with the same
means. For a one-way ANOVA with three groups, the null
hypothesis has this form:
H0: m1 5 m2 5 m3
B If a psychologist
is interested in
the impact that 1 hour,
5 hours, or 10 hours of
therapy have on client
behavior, how are
behavior differences
related to gender
explained?
Try It!
suk85842_06_c06.indd 188 10/23/13 1:40 PM
It indicates that the three samples were drawn from populations
with the same means.
Things have to change for the alternate hypothesis, however,
because with three groups,
there is not just one possible alternative. Note that each of the
following is possible:
a. Ha: m1 ? m2 5 m3

Sample 1 represents a population with a mean value different
from the
mean of the population represented by Samples 2 and 3.
b. Ha: m1 5 m2 ? m3
Samples 1 and 2 represent a population with a mean value
different
from the mean of the population represented by Sample 3.
c. Ha: m1 5 m3 ? m2
Samples 1 and 3 represent a population with a mean value
different
from the population represented by Sample 2.
d. Ha: m1 ? m2 ? m3
All three samples represent populations with different means.
Because the several possible alternative outcomes multiply rap-
idly when the number of groups increases, a more general
alternate
hypothesis is given. Either all the groups involved come from
popu-
lations with the same means, or at least one of them does not.
So the
form of the alternate hypothesis for an ANOVA with any
number of
groups is simply
Ha: At least one of the means is different from the other
means.
Also remember that all the hypotheses are either nondirectional,
in that there is no predic-
tion of which sample mean will be higher than the others:
Nondirectional alternative hypothesis: Ha: m1 ? m2 ? m3

or directional, in that there is a prediction of which sample
mean will be higher than the
other means. As seen below for the directional alternative
hypothesis, there is a prediction
that m3 will be higher than m2 that is higher than m1.
Directional alternative hypothesis: Ha: m1 , m2 , m3
As a researcher, it is important to consider the value of
prediction in terms of a one-tailed
test versus no prediction in a two-tailed test as discussed in
Chapter 5.
Measuring Data Variability in the One-Way ANOVA
We have discussed several different measures of data variability
to this point, including
the standard deviation (s), the variance (s2), the standard error
of the mean (SEM), the
standard error of the difference (SEd), and the range. For
ANOVA, Fisher added one more,
C How many
t-tests would
it take to make all
possible comparisons
in a procedure with six
groups?
Try It!
suk85842_06_c06.indd 189 10/23/13 1:40 PM

the sum of squares (SS). The sum of squares is the sum of the
squared differences between
scores and one of several mean values. In ANOVA,
• One sum-of-squares value involves the differences between
individual scores and
the mean of all the scores in all the groups (the grand mean).
This is the called the
sum of squares total (SStot) because it measures all variability
from all sources.
• A second sum-of-squares value indicates the difference
between the means of
the individual groups and the grand mean. This is the sum of
squares between
(SSbet). It measures the effect of the IV, the treatment effect, as
well any differ-
ences that existed between the groups before the study began.
• A third sum-of-squares value measures the difference between
scores in the sam-
ples and the means of their sample. These sum of squares within
(SSwith) values
reflect the differences in the way subjects respond to the same
stimulus. Because
this value is entirely error variance, it is also called the sum of
squares error (SSerr)
or the sum of squares residual (SSres).
All Variability From All Sources: The Sum of Squares Total
(SStot )
There are multiple formulas for SStot. They all provide the
same answer, but some make

more sense to look at than others. Formula 6.1 makes it clear
that at the heart of SStot is
the difference between each individual score (x) and the mean
of all scores, or the grand
mean, for which the notation is MG.
SStot 5 a (x 2 MG )2 Formula 6.1
Where
x 5 each score in all groups
MG 5 the mean of all data from all groups, the grand mean
To calculate SStot, follow these steps:
1. Sum all scores from all groups and divide by the number of
scores to determine
the grand mean, MG.
2. Subtract MG from each score (x) in each group, and then
square the difference:
(x 2 MG)
2
3. Sum all the squared differences: a (x 2 MG)
2
The Treatment Effect: The Sum of Squares Between (SSbet )
The between-groups variance, the sum of squares between
(SSbet), contains the variability
due to the independent variable, the treatment effect. It will also
contain any initial differ-
ences between the groups, which of course is error variance. For

three groups labeled a,
b, and c, the formula is
SSbet 5 (Ma 2 MG )
2na 1 (Mb 2 MG)
2nb 1 (Mc 2 MG )
2nc Formula 6.2
suk85842_06_c06.indd 190 10/23/13 1:40 PM
Where
Ma 5 the mean of the scores in the first group (a)
MG 5 the same grand mean used in SStot
na 5 the number of scores in the first group (a)
To calculate SSbet, follow these steps:
1. Determine the mean for each group: Ma, Mb, and so on.
2. Subtract MG from each sample mean and square the
difference: (Ma 2 MG)
2
3. Multiply the squared differences by the number in the group:
(Ma 2 MG)
2na
4. Repeat for each group.
5. Sum ( a ) the results across groups.

The value that results from Formula 6.2 represents the
differences between groups and the
mean of all the data.
The Error Term: The Sum of Squares Within (SSwith )
When a group receives the same treatment but individuals
within the group respond dif-
ferently, their differences constitute error—unexplained
variability. Maybe subjects’ age
differences are the cause, or perhaps the circumstances of their
family lives, but for some
reason not analyzed in the particular study, subjects in the same
group often respond dif-
ferently to the same stimulus. The amount of this unexplained
variance within the groups
is calculated with the SSwith, for which we have Formula 6.3:
SSwith 5 a (xa 2 Ma)2 1 a (xb 2 Mb)2 1 a (xc 2 Mc )2 Formula
6.3
Where
SSwith 5 the sum of squares within
xa 5 each of the individual scores in Group a
Ma 5 the score mean in Group a
To calculate SSwith, follow these steps:
1. Take the mean for each of the groups; these are available
from
calculating the SSbet earlier.

2. From each score in each group,
a. subtract the mean of the group,
b. square the difference, and
c. sum the squared differences within each group.
3. Repeat this for each group.
4. Sum the results across the groups.
D When will the
sum-of-squares
values be negative?
Try It!
suk85842_06_c06.indd 191 10/23/13 1:40 PM
The SSwith (or the SSerr) measures the degree to which scores
vary due to factors not con-
trolled in the study, fluctuations that constitute error variance.
Because the SStot consists of the SSbet and the SSwith, once
the SStot and the SSbet are known,
the SSwith can be determined by subtraction:
SStot 2 SSbet 5 SSwith
However, there are two reasons not to determine the SSwith by
simple subtraction. First, if
there is an error in the SSbet, it is only perpetuated with the
subtraction. Second, calculat-
ing the value with Formula 6.3 helps clarify that what is being
determined is a measure

of how much variation in scores there is within each group. For
the few problems done
entirely by hand, we will take the “high road” and use the
conceptual formula.
Conceptual formulas (6.1, 6.2, and 6.3) clarify the logic
involved, but in the case of analysis
of variance, they also require a good deal of tiresome
subtracting and then squaring of
numbers. To minimize the tedium, the data sets here are all
relatively small. When larger
studies are done by hand, people often shift to the “calculation
formulas” for simpler
arithmetic, but there is a sacrifice to clarity. Happily, you will
seldom ever find yourself
doing manual ANOVA calculations, and after a few simple
longhand problems, this chap-
ter will explain how you can utilize Excel or SPSS for help with
the larger data sets.
Calculating the Sums of Squares
A researcher is interested in the level of social isolation people
feel in small towns (a),
suburbs (b), and cities (c). Participants randomly selected from
each of those three settings
take the Assessment List of Nonnormal Environments
(ALONE), for which the following
scores are available:
a. 3, 4, 4, 3
b. 6, 6, 7, 8
c. 6, 7, 7, 9
We know we are going to need the mean of all the data (MG) as
well as the mean for each

group (Ma, Mb, Mc ), so we will start there. Verify that
a x 5 70 and N 5 12, so that MG 5 5.833.
For the small-town subjects,
a xa 5 14 and na 5 4, so Ma 5 3.50.
For the suburban subjects,
a xb 5 27 and nb 5 4, so Mb 5 6.750.
suk85842_06_c06.indd 192 10/23/13 1:40 PM
For the city subjects,
a xc 5 29 and nc 5 4, so Mc 5 7.250.
For the sum of squares total, the formula is
SStot 5 a (x 2 MG)2.
SStot 5 41.67
The calculations are in Table 6.1.
Table 6.1: Calculating the sum of squares total (SStot )
SStot 5 a (x 2 MG)
2, MG 5 5.833
For the Town Data
x 2 M 1x 2 M2 2
3 2 5.833 5 22.833 8.026

4 2 5.833 5 21.833 3.360
4 2 5.833 5 21.833 3.360
3 2 5.833 5 22.833 8.026
For the Suburb Data
x 2 M 1x 2 M2 2
6 2 5.833 5 0.167 0.028
6 2 5.833 5 0.167 0.028
7 2 5.833 5 1.167 1.362
8 2 5.833 5 2.167 4.696
For the City Data
x 2 M 1x 2 M2 2
6 2 5.833 5 0.167 0.028
7 2 5.833 5 1.167 1.362
7 2 5.833 5 1.167 1.362
9 2 5.833 5 3.167 10.030
SStot 5 41.668
suk85842_06_c06.indd 193 10/23/13 1:40 PM

For the sum of squares between, the formula is
SSbet 5 (Ma 2 MG )
2na 1 (Mb 2 MG)
2nb 1 (Mc 2 MG )
2nc
The SSbet involves three groups rather than the 12 individuals
required for SStot. The SSbet
is as follows:
SSbet 5 (Ma 2 MG )
2na 1 (Mb 2 MG)
2nb 1 (Mc 2 MG )
2nc
5 (3.5 2 5.833)2(4) 1 (6.75 2 5.833)2(4) 1 (7.25 2 5.833)2(4)
5 21.772 1 3.364 1 8.032
5 33.17
The SSwith indicates the error variance by determining the
differences between individual scores in a group and their
means. The formula is
SSwith 5 a (xa 2 Ma )2 1 a (xb 2 Mb)2 1 a (xc 2 Mc)2
SSwith 5 8.50
The calculations are in Table 6.2.

Because we calculated the SSwith directly instead of
determining it by subtraction, we can
now check for accuracy by adding its value to the SSbet. If the
calculations are correct,
SSwith 1 SSbet 5 SStot. For the isolation example, we have
8.504 1 33.168 5 41.67
In the initial calculation, SStot 5 41.67. The difference of .004
is round-off difference and is
unimportant.
Although they were not called sums of squares, we have
calculated an equivalent statis-
tic since Chapter 1. At the heart of the standard deviation
calculation is those repetitive
x 2 M differences for each score in the sample. The difference
values are then squared and
summed much as they are for calculating SSwith and SStot.
Further, the denominator in the
standard deviation calculation is n 2 1, which should look
suspiciously like some of the
degrees of freedom values we will discuss in the next section.
Interpreting the Sums of Squares
The different sums-of-squares values are measures of data
variability, which makes them
like the standard deviation, variance measures, the standard
error of the mean, and so on.
But there is an important difference between SS and the other
statistics. In addition to data
variability, the magnitude of the SS value reflects the number of
scores included. Because
sums of squares are in fact the sum of squared values, the more

values there are, the larger
E What will
SStot 2 SSwith yield?
Try It!
suk85842_06_c06.indd 194 10/23/13 1:40 PM
the value becomes. With statistics like the standard deviation,
adding more values near
the mean of the distribution actually shrinks its value. But this
cannot happen with the
sum of squares. Additional scores, whatever their value, will
almost always increase the
sum-of-squares.
Table 6.2: Calculating the sum of squares within (SSwith )
SSwith 5 a 1xa 2 Ma 2 2 1 a 1xb 2 Mb 2 2 1 a 1xc 2 Mc 2 2
3, 4, 4, 3
6, 6, 7, 8
6, 7, 7, 9
Ma 5 3.50, Mb 5 6.750, Mc 5 7.250
For the Town Data
x 2 M 1x 2 M2 2

3 2 3.50 5 20.50 0.250
4 2 3.50 5 0.50 0.250
4 2 3.50 5 0.50 0.250
3 2 3.50 5 20.50 0.250
For the Suburb Data
x 2 M 1x 2 M2 2
6 2 6.750 5 20.750 0.563
6 2 6.750 5 20.750 0.563
7 2 6.750 5 0.250 0.063
8 2 6.750 5 1.250 1.563
For the City Data
x 2 M 1x 2 M2 2
6 2 7.250 5 21.250 1.563
7 2 7.250 5 20.250 0.063
7 2 7.250 5 20.250 0.063
9 2 7.250 5 1.750 3.063
SSwith 5 8.504
suk85842_06_c06.indd 195 10/23/13 1:40 PM

This characteristic makes the sum of squares difficult to
interpret. What constitutes much
or little variability depends not just on how much difference
there is between the scores
and the mean to which they are compared but also on how many
scores there are. Fisher
turned the sum-of-squares values into a “mean measure of
variability” by dividing each
sum-of-squares value by its degrees of freedom. The SS 4 df
operation creates what is
called the mean square (MS).
In the one-way ANOVA, there is a MS value associated with
both the SSbet and the SSwith
(SSerr). There is no mean squares total given in the table, but if
this were to be calculated, it
is the total variance (SSbet 1 SSwith) divided by the entire data
set as a single sample minus
one (N 2 1). Dividing the SStot by its degrees of freedom (N 2
1) would provide a mean
level of overall variability, but that would not help answer
questions about the ratio of
between-groups variance to within-groups variance.
The degrees of freedom for each of the sums of squares
calculated for the one-way ANOVA
are as follows:
• Degrees of freedom total 5 N 2 1, where N is the total number
of scores
• Degrees of freedom for between (dfbet) 5 k 2 1, where k is the

number of groups
SSbet 4 dfbet 5 MSbet
• Degrees of freedom for within (dfwith) 5 N 2 k
SSwith 4 dfwith 5 MSwith
Although there is no MStot, we need the sum of squares for
total (SStot) and the degrees of
freedom for total (dftot) because they provide an accuracy
check:
a. The sums of squares between and within should equal total
sum of squares:
SSbet 1 SSwith 5 SStot
b. The sum of degrees of freedom between and within should
equal degrees of
freedom total:
dfbet 1 dfwith 5 dftot
Remembering these relationships can help reveal errors. In
other words, the concept of
error is unexplained or unsystematic variance within groups
(SSwith)
that is considered
variance not caused by experimental manipulation, as opposed
to explained or systematic
variance due to experimental variance between groups (SSbet).
The F Ratio

The mean squares for between and within are the components of
F, and the F ratio is the
test statistic in ANOVA. As noted earlier in this chapter, the F
is a ratio:
F 5
MSbet
MSwith
Formula 6.4
suk85842_06_c06.indd 196 10/23/13 1:40 PM
The issue is whether the MSbet, which contains the treatment
effect and some error, is
substantially greater than the MSwith, which contains only
error. This is illustrated in
Figure 6.3 by comparing the distance from the mean of the first
distribution to the mean
of the second distribution, the A variance, to the B and C
variances, which indicate the
differences within groups.
If the MSbet/MSwith ratio is large—it must be substantially
greater than 1—the difference
between groups is likely to be significant. When that ratio is
small (close to 1), F is likely
to be nonsignificant. How large F must be to be significant
depends on the degrees of free-
dom for the problem, just as it did for the t-tests.
Figure 6.3: The F-ratio: comparing variance between groups (A)

to
variance within groups (B 1 C)
The ANOVA Table
With the sums of squares and the degrees of freedom for the
different values in hand, the
ANOVA results are presented in a table often referred to as a
source table, indicating the
sources of variability that indicates
• the source of the variance,
• the sums-of-squares values,
• the degrees of freedom
for total degrees of freedom, dftot 5 N 2 1 (because N 5 12 dftot
5 11),
for between degrees of freedom, dfbet 5 k 2 1(because k, the
number of groups,
5 3 dfbet 5 k 2 1, dfbet 5 2),
for within degrees of freedom, df 5 N 2 k (because N 5 12 and k
5 3, dfwith 5 9,
• the mean square values, which are SS/df, and
• the F value, which is the MSbet/MSwith.
B C
A
suk85842_06_c06.indd 197 10/23/13 1:40 PM

For the social isolation problem, the ANOVA table is
Source SS df MS F
Between 33.17 2 16.58 17.55
Within 8.50 9 .95
Total 41.67 11
The table makes it easy to check some of the results for
accuracy. Check that
SSbet 1 SSwith 5 SStot
Also verify that
dfbet 1 dfwith 5 dftot
In the course of checking results, note the sums-of-squares
values can never be negative.
Because the SS values are literally sums of squares, a negative
number indicates a calcula-
tion error somewhere because there is no such thing as negative
variability (Chapter 1).
The smallest a sum-of-squares value can be is 0, and this can
happen only if all scores in
the sum-of-squares calculation have the same value.
Understanding F
The larger F is, the more likely it is to be statistically
significant, but how large is large
enough? In the preceding ANOVA table, F 5 17.551, which

seems like a comparatively
large value.
• The fact that F is determined by dividing MSbet by MSwith
indicates that whatever
the value of F is indicates the number of times MSbet is greater
than MSwith.
• Here MSbet is 17.551 times greater than MSwith, which seems
promising, but to be
sure, it must be compared to a value from the critical values of
F (Table 6.3, which
is repeated in the Appendix as Table C).
As with the t-test, as degrees of freedom increase, the critical
values decline. The difference
is that with F two df values are involved: one for the MSbet and
the other for the MSwith.
• In Table 6.3 (also Table C in the Appendix), the critical value
is identified by
moving across the top of the table to the dfbet (the df
numerator) and then moving
down that column to the dfwith (the df denominator). According
to the social isola-
tion test ANOVA table above, these are
the dfbet 5 2 and
the dfwith 5 9.
suk85842_06_c06.indd 198 10/23/13 1:40 PM

• The intersection of the 2 at the top and the 9 along the left
side of the table leads
to two critical values, one in regular type, which is for a 5 .05
and is the default,
and one in bold type, which is the value for testing at the
critical a 5 .01.
• The critical value when testing at p 5 .05 is 4.26.
• The critical value indicates that any ANOVA test with 2 and 9
df that has an F
value equal to or greater than 4.26 is statistically significant.
The social isolation differences between the three groups are
probably
not due to sampling variability. The statistical decision is to
reject H0.
The relatively large value of F—it is more than four times the
critical
value—indicates that of the differences in social isolation, much
more
of it is probably related to where respondents live than to the
amount
that is error variance.
Table 6.3: The critical values of F
Values in regular type indicate the critical value for p = .05.
Values in bold type indicate the critical value for p = .01.
df denominator df numerator
1 2 3 4 5 6 7 8 9 10
2 18.51

98.49
19.00
99.01
19.16
99.17
19.25
99.25
19.30
99.30
19.33
99.33
19.35
99.36
19.37
99.38
19.38
99.39
19.40
99.40
3 10.13
34.12
9.55
30.82
9.28

29.46
9.12
28.71
9.01
28.24
8.94
27.91
8.89
27.67
8.85
27.49
8.81
27.34
8.79
27.23
4 7.71
21.20
6.94
18.00
6.59
16.69
6.39
15.98
6.26

15.52
6.16
15.21
6.09
14.98
6.04
14.80
6.00
14.66
5.96
14.55
5 6.61
16.26
5.79
13.27
5.41
12.06
5.19
11.39
5.05
10.97
4.95
10.67
4.88

10.46
4.82
10.29
4.77
10.16
4.74
10.05
6 5.99
13.75
5.14
10.92
4.76
9.78
4.53
9.15
4.39
8.75
4.28
8.47
4.21
8.26
4.15
8.10
4.10

7.98
4.06
7.87
7 5.59
12.25
4.74
9.55
4.35
8.45
4.12
7.85
3.97
7.46
3.87
7.19
3.79
6.99
3.73
6.84
3.68
6.72
3.64
6.62
8 5.32

11.26
4.46
8.65
4.07
7.59
3.84
7.01
3.69
6.63
3.58
6.37
3.50
6.18
3.44
6.03
3.39
5.91
3.35
5.81
9 5.12
10.56
4.26
8.02
3.86

6.99
3.63
6.42
3.48
6.06
3.37
5.80
3.29
5.61
3.23
5.47
3.18
5.35
3.14
5.26
10 4.96
10.04
4.10
7.56
3.71
6.55
3.48
5.99
3.33

5.64
3.22
5.39
3.14
5.20
3.07
5.06
3.02
4.94
2.98
4.85
11 4.84
9.65
3.98
7.21
3.59
6.22
3.36
5.67
3.20
5.32
3.09
5.07
3.01

4.89
2.95
4.74
2.90
4.63
2.85
4.54
12 4.75
9.33
3.89
6.93
3.49
5.95
3.26
5.41
3.11
5.06
3
4.82
2.91
4.64
2.85
4.50
2.80

4.39
2.75
4.30
13 4.67
9.07
3.81
6.70
3.41
5.74
3.18
5.21
3.03
4.86
2.92
4.62
2.83
4.44
2.77
4.30
2.71
4.19
2.67
4.10
F If the F in an

ANOVA is 4.0 and
the MSwith 5 2.0, what
will be the value of MSbet?
Try It!
(continued)
suk85842_06_c06.indd 199 10/23/13 1:40 PM
Table 6.3: The critical values of F (continued)
Values in regular type indicate the critical value for p = .05.
Values in bold type indicate the critical value for p = .01.
df denominator df numerator
1 2 3 4 5 6 7 8 9 10
14 4.60
8.86
3.74
6.51
3.34
5.56
3.11
5.04
2.96

4.69
2.85
4.46
2.76
4.28
2.70
4.14
2.65
4.03
2.60
3.94
15 4.54
8.68
3.68
6.36
3.29
5.42
3.06
4.89
2.90
4.56
2.79
4.32
2.71

4.14
2.64
4.00
2.59
3.89
2.54
3.80
16 4.49
8.53
3.63
6.23
3.24
5.29
3.01
4.77
2.85
4.44
2.74
4.20
2.66
4.03
2.59
3.89
2.54

3.78
2.49
3.69
17 4.45
8.40
3.59
6.11
3.20
5.19
2.96
4.67
2.81
4.34
2.70
4.10
2.61
3.93
2.55
3.79
2.49
3.68
2.45
3.59
18 4.41

8.29
3.55
6.01
3.16
5.09
2.93
4.58
2.77
4.25
2.66
4.01
2.58
3.84
2.51
3.71
2.46
3.60
2.41
3.51
19 4.38
8.18
3.52
5.93
3.13

5.01
2.90
4.50
2.74
4.17
2.63
3.94
2.54
3.77
2.48
3.63
2.42
3.52
2.38
3.43
20 4.35
8.10
3.49
5.85
3.10
4.94
2.87
4.43
2.71

4.10
2.60
3.87
2.51
3.70
2.45
3.56
2.39
3.46
2.35
3.37
21 4.32
8.02
3.47
5.78
3.07
4.87
2.84
4.37
2.68
4.04
2.57
3.81
2.49

3.64
2.42
3.51
2.37
3.40
2.32
3.31
22 4.30
7.95
3.44
5.72
3.05
4.82
2.82
4.31
2.66
3.99
2.55
3.76
2.46
3.59
2.40
3.45
2.34

3.35
2.30
3.26
23 4.28
7.88
3.42
5.66
3.03
4.76
2.80
4.26
2.64
3.94
2.53
3.71
2.44
3.54
2.37
3.41
2.32
3.30
2.27
3.21
24 4.26

7.82
3.40
5.61
3.01
4.72
2.78
4.22
2.62
3.90
2.51
3.67
2.42
3.50
2.36
3.36
2.30
3.26
2.25
3.17
25 4.24
7.77
3.39
5.57
2.99

4.68
2.76
4.18
2.60
3.85
2.49
3.63
2.40
3.46
2.34
3.32
2.28
3.22
2.24
3.13
26 4.23
7.72
3.37
5.53
2.98
4.64
2.74
4.14
2.59

3.82
2.47
3.59
2.39
3.42
2.32
3.29
2.27
3.18
2.22
3.09
27 4.21
7.68
3.35
5.49
2.96
4.60
2.73
4.11
2.57
3.78
2.46
3.56
2.37

3.39
2.31
3.26
2.25
3.15
2.20
3.06
28 4.20
7.64
3.34
5.45
2.95
4.57
2.71
4.07
2.56
3.75
2.45
3.53
2.36
3.36
2.29
3.23
2.24

3.12
2.19
3.03
29 4.18
7.60
3.33
5.42
2.93
4.54
2.70
4.04
2.55
3.73
2.43
3.50
2.35
3.33
2.28
3.20
2.22
3.09
2.18
3.00
30 4.17

7.56
3.32
5.39
2.92
4.51
2.69
4.02
2.53
3.70
2.42
3.47
2.33
3.30
2.27
3.17
2.21
3.07
2.16
2.98
Source: Richard Lowry. file://localhost/www.vassarstats.net.
Retrieved from http/::vassarstats.net:textbook:apx_d.html
suk85842_06_c06.indd 200 10/23/13 1:40 PM
file://localhost/www.vassarstats.net
http/::vassarstats.net:textbook:apx_d.html

CHAPTER 6Section 6.2 Identifying the Difference: Post Hoc
Tests and Tukey’s HSD
6.2 Identifying the Difference: Post Hoc Tests and Tukey’s HSD
A significant t from an independent t-test provides for a simpler
interpretation than a significant F from an ANOVA with three
or more groups can provide. A significant
t indicates that the two groups probably belong to populations
with different means. A
significant F indicates that at least one group is significantly
different from at least one
other group in the study, but unless there are only two groups in
the ANOVA, it is not
clear which group is significantly different from which. If the
null hypothesis is rejected,
there are a number of possible alternatives, as we noted when
we listed all the possible
HA outcomes earlier.
The point of a post hoc test (an “after this” test) conducted
following an ANOVA is to
determine which groups are significantly different from each
other. So when F is signifi-
cant, a post hoc test is the next step. Statisticians debate the
practice of whether to run a
post hoc if F is not significant, as there may be instances in
which the overall F will be
nonsignificant yet the post hoc tests detect a significant
difference between two groups.
With the ease of running the analysis in Excel or SPSS,
researchers may run post hoc tests
to determine whether there are significant differences in means
between pairs of groups.

In the latter case, a planned comparison is most prudent for
specific detection of mean dif-
ferences. Whether planned comparison or post hoc, the
determination should be based on
the purpose of the study. If the goal is to test the null
hypotheses that the means are not
significantly different, then a significant omnibus F is
appropriate. On the other hand, if
there are specific instances of detecting differences between
means, then the F result is not
necessary and going straight to the post hoc tests will be
apropos as in a planned compari-
son between means.
There are many post hoc tests that are used for different
purposes and based on their own
assumptions and calculations (18 of them in SPSS, named after
their respective authors).
Each of them has particular strengths, but one of the more
common in the psychological
disciplines, and also one of the easiest to calculate, is John
Tukey’s HSD test, for “honestly
significant difference.”
Many statisticians use the terms liberal and conservative to
describe post hoc tests. A liberal
test is one in which there is a greater chance of finding a
significant difference between
means but a higher chance of a type I error. Fisher’s least
significant difference (LSD) test
is an example of a liberal test. These are seldom used for the
very concern of committing a
type I error. Conversely, a conservative post hoc has a lower
chance of finding a significant
difference between means but also a lower chance of a type I
error. One such conservative

test is Bonferroni’s post hoc. By their very conservative nature,
these post hoc tests are
more widely used.
Formula 6.5 produces a value that is the smallest difference
between the means of any two
samples that can be statistically significant:
HSD 5 x Å
MSwith
n
Formula 6.5
suk85842_06_c06.indd 201 10/23/13 1:40 PM
Where
x 5 a table value indexed to the number of groups (k) in the
problem and
the degrees of freedom within (dfwith) from the ANOVA table
MSwith 5 the value from the ANOVA table
n 5 the number in one group when group sizes are equal.
In order to compute Tukey’s HSD, follow these steps:
1. From Table 6.4 locate the value of x by mov-
ing across the top of the table to the number
of groups/treatments (k 5 3), and then down

the left side for the within degrees of freedom
(dfwith 5 9). The intersecting values are 3.95 and
5.43. The smaller of the two is the value when
p 5 .05, as it was in our test. The post hoc test is
always conducted at the same probability level
as the ANOVA. In this case, it is p 5 .05.
2. The calculation is 3.95 times the result of the
square root of .945 (the MSwith) divided by 4 (n).
3.95 Å
.954
4
5 1.920
3. This value is the minimum difference between the means of
two significantly dif-
ferent samples. The sign of the difference does not matter; it is
the absolute value
we need.
The means for social isolation in the three groups are the
following:
Ma 5 3.50 for small town respondents
Mb 5 6.750 for suburban respondents
Mc 5 7.250 for city respondents
Small towns minus suburbs:
Ma 2 Mb 5 3.50 2 6.75 5 23.25—this difference exceeds 1.92
and is significant.

Small towns minus cities:
Ma 2 Mc 5 3.50 2 7.25 5 23.75—this difference exceeds 1.92
and is significant.
Suburbs minus cities:
Mb 2 Mc 5 6.75 2 7.25 5 20.50—this difference is less than
1.92 and is not
significant.
When several groups are involved, sometimes it is helpful to
create a table that presents
all the differences between pairs of means. Table 6.5, which is
repeated in the Appendix as
Table D, is the Tukey’s HSD results for the social isolation
problem.
Formula 6.5 is used
when group sizes are
equal. However, there is an
alternate formula for unequal
group sizes for the more
adventurous:
HSD 5 Å a
MSwith
2 b a
1
n1
1
1
n2
b

with a separate HSD value
completed for each pair of
means in the problem.
Try It!
suk85842_06_c06.indd 202 10/23/13 1:40 PM
Table 6.4: Tukey’s HSD critical values: q (Alpha, k, df )
* The critical value for q corresponding to alpha = .05 (top) and
alpha = .01 (bottom)
df k 5 Number of Treatments
2 3 4 5 6 7 8 9 10
5 3.64
5.70
4.60
6.98
5.22
7.80
5.67
8.42
6.03

8.91
6.33
9.32
6.58
9.67
6.80
9.97
6.99
10.24
6 3.46
5.24
4.34
6.33
4.90
7.03
5.30
7.56
5.63
7.97
5.90
8.32
6.12
8.61
6.32

8.87
6.49
9.10
7 3.34
4.95
4.16
5.92
4.68
6.54
5.06
7.01
5.36
7.37
5.61
7.68
5.82
7.94
6.00
8.17
6.16
8.37
8 3.26
4.75
4.04

5.64
4.53
6.20
4.89
6.62
5.17
6.96
5.40
7.24
5.60
7.47
5.77
7.68
5.92
7.86
9 3.20
4.60
3.95
5.43
4.41
5.96
4.76
6.35
5.02

6.66
5.24
6.91
5.43
7.13
5.59
7.33
5.74
7.49
10 3.15
4.48
3.88
5.27
4.33
5.77
4.65
6.14
4.91
6.43
5.12
6.67
5.30
6.87
5.46

7.05
5.60
7.21
11 3.11
4.39
3.82
5.15
4.26
5.62
4.57
5.97
4.82
6.25
5.03
6.48
5.20
6.67
5.35
6.84
5.49
6.99
12 3.08
4.32
3.77

5.05
4.20
5.50
4.51
5.84
4.75
6.10
4.95
6.32
5.12
6.51
5.27
6.67
5.39
6.81
13 3.06
4.26
3.73
4.96
4.15
5.40
4.45
5.73
4.69

5.98
4.88
6.19
5.05
6.37
5.19
6.53
5.32
6.67
14 3.03
4.21
3.70
4.89
4.11
5.32
4.41
5.63
4.64
5.88
4.83
6.08
4.99
6.26
5.13

6.41
5.25
6.54
15 3.01
4.17
3.67
4.84
4.08
5.25
4.37
5.56
4.59
5.80
4.78
5.99
4.94
6.16
5.08
6.31
5.20
6.44
16 3.00
4.13
3.65

4.79
4.05
5.19
4.33
5.49
4.56
5.72
4.74
5.92
4.90
6.08
5.03
6.22
5.15
6.35
17 2.98
4.10
3.63
4.74
4.02
5.14
4.30
5.43
4.52

5.66
4.70
5.85
4.86
6.01
4.99
6.15
5.11
6.27
18 2.97
4.07
3.61
4.70
4.00
5.09
4.28
5.38
4.49
5.60
4.67
5.79
4.82
5.94
4.96

6.08
5.07
6.20
19 2.96
4.05
3.59
4.67
3.98
5.05
4.25
5.33
4.47
5.55
4.65
5.73
4.79
5.89
4.92
6.02
5.04
6.14
20 2.95
4.02
3.58

4.64
3.96
5.02
4.23
5.29
4.45
5.51
4.62
5.69
4.77
5.84
4.90
5.97
5.01
6.09
24 2.92
3.96
3.53
4.55
3.90
4.91
4.17
5.17
4.37

5.37
4.54
5.54
4.68
5.69
4.81
5.81
4.92
5.92
30 2.89
3.89
3.49
4.45
3.85
4.80
4.10
5.05
4.30
5.24
4.46
5.40
4.60
5.54
4.72

5.65
4.82
5.76
40 2.86
3.82
3.44
4.37
3.79
4.70
4.04
4.93
4.23
5.11
4.39
5.26
4.52
5.39
4.63
5.50
4.73
5.60
Source: Tukey’s HSD critical values (n.d.). Retrieved from
http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html.
suk85842_06_c06.indd 203 10/23/13 1:40 PM

http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html
Table 6.5: Presenting Tukey’s HSD results in a table
HSD 5 x Å
MSwith
n
5 3.95 Å
.954
4
5 1.920 (x2)
Any difference between pairs of means 1.920 or greater is a
statistically significant difference.
Mean differences in orange are statistically significant.
Small towns
M 5 3.500
Suburbs
M 5 6.750
Cities
M 5 7.250
Small towns
M 5 3.500

Diff 5 3.250 Diff 5 3.750
Suburbs
M 5 6.750
Diff 5 0.500
Cities
M 5 7.250
The values entered in the cells in Table 6.5 indicate the
differences between each pair of
means in the study. Comparing the mean scores from each of the
three groups indicates
that the respondents from small towns expressed a significantly
lower level of social
isolation than those in either the suburbs or cities. Comparing
the mean scores from the
suburban and city groups indicates that social isolation scores
are higher in the city, but
the difference is not large enough to be statistically significant.
The significant F from the ANOVA indicated that at least one
group had a significantly
different level of social isolation from at least one other group,
but that is all a significant F
can reveal. The result does not indicate which group is
significantly different from which
other group, unless there are only two groups. The post hoc test
indicates which pairs of
groups are significantly different from each other. Table 6.5 is
an example of how to illus-
trate the significant and the nonsignificant differences. One
caveat in using Tukey’s HSD
is that there is an assumption of equality of variances
(homogeneity) between groups

based on Levene’s test. This assumption applies here as well.
Suppose there is a violation
of homogeneity. In that instance, an adjusted post hoc that
accounts for inequality of vari-
ances (or heterogeneity) will need to be employed. To
implement this in SPSS for instance
there are four options under the Equal Variances Not Assumed
heading when conducting
a post hoc for ANOVA. One of these approaches is the Games-
Howell post hoc, which is
executed by checking that box in SPSS Post Hoc tests tab for
ANOVA.
suk85842_06_c06.indd 204 10/23/13 1:40 PM
Apply It!
ANOVA and Product Development
A product development specialist in a major computer company
decides that
it would be a significant improvement to keyboards if they were
designed to
fit the shape of human hands. Instead of being flat, the new
keyboard would
curve like the surface of a football. Before the company
executives are willing to expend the
resources necessary to produce and distribute such a product,
they need to know whether
it will sell and what the most comfortable curvature of the

keyboard would be.
The company produces prototypes for four different keyboards,
labeled Prototype A through
D (see Table 6.6). Prototype A is a standard flat keyboard, and
the others each have varying
amounts of curve. Everything else about the keyboards is the
same, so this is a one-way
ANOVA. Forty different users are randomly assigned to test one
of the four keyboards and
rank them in comfort on a 100-point scale. The results are
shown below Figure 6.4.
Table 6.6: Prototype A–D data set
Prototype A Prototype B Prototype C Prototype D
49 57 77 65
57 53 82 61
73 69 77 73
68 65 85 81
65 61 93 89
62 73 79 77
61 57 73 81
45 69 89 77
53 73 82 69
61 77 85 77

Next, the test results are analyzed in Excel, which produces the
information in Figure 6.4.
(continued)
suk85842_06_c06.indd 205 10/23/13 1:40 PM
Apply It! (continued)
Figure 6.4: Excel results of comparison means and ANOVA of
prototypes
The null hypothesis is that there is no difference among the four
keyboards. From Table
6.6, we see that the F value is 16.72, which is larger than the
critical value of F 5 2.87 at
the critical a 5 .05. Therefore the null hypothesis is rejected at p
, .05. At least one of the
prototypes is significantly different from at least one other
prototype.
Because there is a significant F, the marketers next compute
HSD:
HSD 5 x Å
MSwith
n
Where

x 5 3.81 (based on k 5 4, dfwith 5 36, and p 5 .05)
MSwith 5 61.07, the value from the ANOVA table
n 5 10, the number in one group when group sizes are equal
HSD 5 9.42
(continued)
Groups
Prototype A
Prototype B
Prototype C
Prototype D
Count
10
10
10
10
Sum
594
654
822

750
Average
59.4
65.4
82.2
75.0
Variance
73.82
65.60
36.40
68.44
Summary
Source of
Variation
Between
Groups
Within
Groups
Total

SS
3063.6
2198.4
5262
df
3
36
39
MS
1021.20
61.07
F
16.72
p-value
5.71E–07
Fcrit
2.87
ANOVA

suk85842_06_c06.indd 206 10/23/13 1:40 PM
CHAPTER 6Section 6.3 Determining the Results’ Practical
Importance
This value is the minimum difference between the means of two
significantly different sam-
ples. The difference in means between the groups is shown
below:
A 2 B 5 26.0
A 2 C 5 222.8
A 2 D 5 215.6
B 2 C 5 216.8
B 2 D 5 29.6
C 2 D 5 7.2
The differences in comfort between Prototypes A-B and C-D are
not statistically significant,
because the absolute values are less than the Tukey’s HSD value
of 9.42. However, the differ-
ences in comfort between the remaining prototypes are
statistically significant.
Based on analysis of the one-way ANOVA, the marketing team
decides to produce and sell
the keyboard configuration of Prototype C. This had the highest
mean comfort level and will
be a significant improvement over existing keyboards.
Apply It! boxes written by Shawn Murphy

6.3 Determining the Results’ Practical Importance
Three questions can come up in an ANOVA. The second and
third questions depend upon
the answer to the first:
1. Are any of the differences statistically significant? The
answer depends upon
how the calculated F value compares to the critical value from
the table.
2. If the F is significant, which groups are significantly
different from each other?
That question is answered by completing a post hoc test such as
Tukey’s HSD.
3. If F is significant, how important is the result? The answer
comes by calculating an
effect size.
After addressing the first two questions, we now turn our
attention to the third question,
effect size. With the t-test in Chapter 5, Cohen’s d answered the
question about how impor-
tant the result was. Several effect-size statistics have been used
to explain the importance
of a significant ANOVA result. Omega squared (v2) and partial-
eta-squared (partial-h2)
are both quite common in the social science research literature,
but the one we will use is
called eta-squared (H2). The Greek letter eta (h pronounced like
“ate a” as in “ate a grape”)
is the equivalent of the letter h. Because some of the variance in
scores is unexplained and
is therefore error variance, eta-squared answers this question:
How much of the score
variance can be attributed to the independent variable?

suk85842_06_c06.indd 207 10/23/13 1:40 PM
Importance
In the social isolation problem, the question was whether
residents of small towns, subur-
ban areas, and cities differ in the amount of social isolation they
indicate. The respondents’
location is the IV. Eta-squared estimates how much of the
difference in social isolation is
related to where respondents live.
There are only two values involved in the h2 calculation, both
retrievable from the ANOVA
table. Formula 6.6 shows the eta-squared calculation:
h2 5
SSbet
SStot
Formula 6.6
Eta-squared is the ratio of between-groups variability to total
variability. If there was no
error variance, all variance would be due to the independent
variable, and the sums of
squares for between-groups variability and for total variability
would have the same val-
ues; the effect size would be 1.0. With human subjects, this
never happens because scores
fluctuate for reasons other than the IV, but it is important to
know that 1.0 is the “upper

bound” for this effect size. The lower bound is 0, of course—
none of the variance is
explained. But we also never see eta-squared values of 0
because the only time the effect
size is calculated is when F is significant, and that can only
happen when the effect of the
IV is great enough that the ratio of MSbet to MSwith exceeds
the critical value.
For the social isolation problem, SSbet 5 33.168 and SStot 5
41.672, so
h2 5
33.168
41.672
5 0.796.
According to this data, about 80% (79.6% to be exact) of the
variance in social isolation scores is related to whether the
respondent lives in a small town, a suburb, or a city. (Note
that this amount of variance is unrealistically high, which
can happen when numbers are contrived.)
G If the F in
ANOVA is not
significant, should the
post hoc test or the
effect-size calculation be
made?
Try It!
Apply It!
Using ANOVA to Test Effectiveness

A pharmaceutical company has developed a new medicine to
treat a skin condition. This medi-
cine has been proven effective in previous tests, but now the
company is trying to decide the
best method to deliver the medicine. The options are
1. pills that are taken orally,
2. a cream that is rubbed into the affected area, or
3. drops that are placed on the affected area.
(continued)
suk85842_06_c06.indd 208 10/23/13 1:40 PM
Importance
To test the application methods, the company uses 24 volunteers
who suffer
from this skin condition. Each of the volunteers is randomly
assigned to one of
the three treatment methods. Note that each volunteer tests only
one of the
delivery methods. This satisfies the requirement that the
categories of the IV
must be independent. This is a one-way ANOVA test with the
delivery method
being the only independent variable.
To evaluate the effectiveness of each delivery method, three

different dermatologists exam-
ine each patient after the course of treatment. They then rate the
skin condition on a scale
of 1 through 20, with 20 being a total absence of the condition.
The scores from the three
doctors are then averaged.
The null hypothesis is that all three delivery methods are
equally effective:
H0: mpills 5 mcream 5 mdrops
The null hypothesis indicates that the three treatments were
drawn from populations with
the same mean. The alternate hypothesis for the ANOVA test is
Ha: mpills ? mcream ? mdrops
Data from the trial is shown in Table 6.7.
Table 6.7: Data from trial of skin treatment conditions
Pills Cream Drops
14 18 13
13 15 15
19 16 16
18 18 15
15 17 14
16 13 17

12 17 13
12 18 16
(continued)
suk85842_06_c06.indd 209 10/23/13 1:40 PM
Importance
Figure 6.5: Analysis of the data that was performed in Excel
Figure 6.5 shows the value for F is 1.72, which is less than the
Fcrit value of 3.47 when testing
at p 5 .05. Therefore, the null hypothesis is not rejected. We
cannot say that the different
delivery methods come from populations with different means.
Looking at the p value gen-
erated by Excel, we see that there is a 20% probability that a
difference in means this large
could have occurred by chance alone. Because the null
hypothesis is not rejected, there is
no need to perform either a Tukey’s HSD test or an h2
calculation.
The pharmaceutical company decides to offer the medicine as a
cream because this is gen-
erally their preferred delivery method. The ANOVA test has
assured them that this is the
correct choice, and that neither of the two alternate methods
provided a more effective

delivery option. In other words, the alternative hypothesis is not
correct.
Apply It! boxes written by Shawn Murphy
Groups
Pills
Cream
Drops
Count
8
8
8
Sum
119
132
119
Average
14.88
16.50
14.88

Variance
6.98
3.14
2.13
Summary
Source of
Variation
Between
Groups
Within
Groups
Total
SS
14.08
85.75
99.83
df
2
21

23
MS
7.04
4.08
F
1.72
p-value
0.20
Fcrit
3.47
ANOVA
suk85842_06_c06.indd 210 10/23/13 1:40 PM
CHAPTER 6Section 6.4 Conditions for the One-Way ANOVA
6.4 Conditions for the One-Way ANOVA
As we saw with the t-tests, any statistical test requires that
certain conditions (also referred
to as assumptions) are met. The conditions might be
characteristics such as the scale of the
data, the way the data is distributed, the relationships between
the groups in the analysis,
and so on. In the case of the one-way ANOVA, the name

indicates one of the conditions.
• This particular test can accommodate just one independent
variable.
• That one variable can have any number of categories, but there
can be just one IV.
In the example of small-town, suburban, and city isolation, the
IV was the loca-
tion of the respondents’ residence. We might have added more
categories such as
small-town, semirural, small town, large town, suburbs of small
cities, suburbs of
large cities, and so on, all of which relate to the respondents’
place of residence,
but like the independent t-test, there is no way to add another
variable, such as
the respondents’ gender, in a one-way ANOVA.
• The categories of the IV must be independent.
• Like the independent t-test, the groups involved must be
independent. Those
who are members of one group cannot also be members of
another group
involved in the same analysis.
• The IV must be nominal scale.
• Because the IV must be nominal scale, sometimes data of
some other scale is
reduced to categorical data to complete the analysis. If someone
is interested
in whether there are differences in social isolation related to
age, age must be
changed from ratio to nominal data prior to the analysis. Rather

than using each
person’s age in years as the independent variable, ages are
grouped into catego-
ries such as 20s, 30s, and so on. This is not ideal, because by
reducing ratio data
to nominal or even ordinal scale, the differences in social
isolation between, for
example, 20- and 29-year-olds are lost.
• The DV must be interval or ratio scale.
• Technically, social isolation would need to be measured with
something like the
number of verbal exchanges that one has daily with neighbors or
co-workers,
rather than asking on a scale of 1–10 to indicate how isolated
one feels, which is
probably an example of ordinal data.
• The groups in the analysis must be similarly distributed. The
technical descrip-
tion for this similarity of distribution is homogeneity of
variance. For example,
this condition means that the groups should all have reasonably
similar standard
deviations. This was discussed in Chapter 5 where the Levene’s
test is used to
test equality of variances.
• Finally, using ANOVA assumes that the samples are drawn
from a normally dis-
tributed population.
It may seem difficult to meet all these conditions. However,
keep in mind that normality
and homogeneity of variance in particular represent ideals more

than practical necessities.
As it turns out, Fisher’s procedure can tolerate a certain amount
of deviation from these
requirements; this test is quite robust.
suk85842_06_c06.indd 211 10/23/13 1:40 PM
CHAPTER 6Section 6.5 ANOVA and the Independent t-Test
6.5 ANOVA and the Independent t-Test
The one-way ANOVA and the independent t-test share several
assumptions, although they employ distinct statistics, in that the
sums of squares is used for ANOVA and the
standard error of the difference is used for the t-test. For
example, both tests will lead the
analyst to the same conclusion. This consistency can be
illustrated by completing ANOVA
and the independent t-test for the same data.
Suppose an industrial psychologist is interested in how people
from two separate divi-
sions of a company differ in their work habits. The dependent
variable is the amount of
work completed after-hours at home per week for supervisors in
marketing versus super-
visors in manufacturing. The data is as follows:
Marketing: 3, 4, 5, 7, 7, 9, 11, 12
Manufacturing: 0, 1, 3, 3, 4, 5, 7, 7
Calculating some of the basic statistics yields the following:

M s SEM SEd MG
Marketing: 7.25 3.240 1.146
1.458 5.50
Manufacturing: 3.75 2.550 0.901
First, the t-test:
t 5
M1 2 M2
SEd
5
7.25 2 3.75
1.458
5 2.401; t.05(14) 5 2.145
The difference is significant. Those in marketing (M1) take
significantly more work home
than those in manufacturing (M2).
Now the ANOVA:
• SStot 5 a (x 2 MG)2 5 168
• Verify that the result of subtracting MG from each score in
both groups, squaring
the differences, and summing the square 5 168.
• SSbet 5 (Ma 2 MG)
2na 1 (Mb 2 MG)
2nb

• This one is not too lengthy to do here: (7.25 2 5.50)2(8) 1
(3.75 2 5.50)2(8)
5 24.5 1 24.5 5 49.
• SSwith 5 (xa 2 Ma)
2 1 (xb 2 Mb)
2
• Verify that the result of subtracting the group means from
each score in the par-
ticular group, squaring the differences, and summing the
squares 5 119.
• Check that SSwith 1 SSbet 5 SStot : 119 1 49 5 168.
suk85842_06_c06.indd 212 10/23/13 1:40 PM
CHAPTER 6Section 6.6 Completing ANOVA with Excel
Source SS df MS F Fcrit
Between 49 1 49 5.765 F.05(1,14) 5 4.60
Within 119 14 8.5
Total 168 15
Like the t-test, ANOVA indicates that the difference in the
amount of work completed at
home is significantly different for the two groups, so at least
both tests draw the same
conclusion about whether the result is significant, but there is

more similarity than this.
• Note that the calculated value of t 5 2.401, and the calculated
value of F 5 5.765.
• If the value of t is squared, it equals the value of F.
2.4012 5 5.765.
• The same is true for the critical values:
t.05(14) 5 2.145
F.05(1,14) 5 4.60
2.1452 5 4.60
Gosset’s and Fisher’s tests draw exactly equivalent conclusions
when there are two
groups. The ANOVA tends to be more work, and researchers
ordinarily use the t-test for
two groups, but the point is that the two tests are entirely
consistent.
6.6 Completing ANOVA with Excel
The ANOVA by longhand involves enough calculated means,
subtractions, squaring of differences, and so on that doing an
ANOVA on Excel is beneficial. A researcher
is comparing the level of optimism indicated by people in
different vocations during an
economic recession. The data is from laborers, clerical staff in
professional offices, and the
professionals in those offices. The data for the three groups
follows:
Laborers: 33, 35, 38, 39, 42, 44, 44, 47, 50, 52

Clerical staff: 27, 36, 37, 37, 39, 39, 41, 42, 45, 46
Professionals: 22, 24, 25, 27, 28, 28, 29, 31, 33, 34
1. Create the data file in Excel. Enter Laborers, Clerical staff,
and Professionals in
cells A1, B1, and C1, respectively.
2. In the columns below those labels, enter the optimism scores,
beginning in cell
A2 for the laborers, B2 for the clerical workers, and C2 for the
professionals. Once
the data is entered and checked for accuracy, proceed with the
following steps.
3. Click the Data tab at the top of the page.
H What is
the relationship
between the values
of t and F if both are
performed for the same
two-group test?
Try It!
suk85842_06_c06.indd 213 10/23/13 1:40 PM
CHAPTER 6Section 6.6 Completing ANOVA with Excel
4. At the extreme right, choose Data Analysis.
5. In the Analysis Tools window, select ANOVA Single Factor
and click OK.

6. Indicate where the data is located in the Input Range. In the
example here, the
range is A2:C11.
7. Note that the default is “Grouped by Columns.” If the data is
arrayed along rows
instead of columns, this would need to be changed.
Because we designated A2 instead of A1 as the point where the
data begins, there is no
need to indicate that labels are in the first row.
8. Select Output Range and enter a cell location where you wish
the display of the
output to begin. In the example in Figure 6.6, the location is
A13.
9. Click OK.
Widen column A to make the output easier to read. It will look
like the screenshot in
Figure 6.6.
Figure 6.6: Performing an ANOVA on Excel
suk85842_06_c06.indd 214 10/23/13 1:40 PM
CHAPTER 6Section 6.7 Presenting Results
As you have already seen in the two Apply It! boxes, the results
appear in two tables.
The first provides descriptive statistics. The second table looks
like the longhand table of

results for the social isolation example, except that
• the figures shown for the total follow those for between and
within instead of
preceding them, and
• the P-value column indicates the probability that an F of this
magnitude could
have occurred by chance.
Note that the P value is 4.31E-06. The “E-06” is scientific
notation. It is a shorthand way of
indicating that the actual value is p 5 .00000431, or 4.31 with
the decimal moved six deci-
mals to the left, or negative exponent to the sixth power The
probability easily exceeds the
p 5 .05 standard for statistical significance.
6.7 Presenting Results
The previous analyses all used Excel, so we will now shift to
using SPSS for the execution of these steps and the
interpretation of the results. We will first use the
data in Table 6.7 and then proceed with actual data gathered
from published research.
You will see that we use the same steps regardless of the sample
size, and that using
technology like Excel and SPSS makes hand calculations
unnecessary. While hand cal-
culations are instructive, they are also laborious and more prone
to errors, especially
with large data sets.
SPSS Example 1: Steps for ANOVA
After setting up the data in SPSS as seen in Figure 6.7 (data

from Table 6.7), the steps in
executing this analysis are as follows:
Analyze S Compare Means S One-Way ANOVA. Place
Treatment into the Factor box
and Skin Condition into the Dependent List. Click Post Hoc on
the left and check Tukey
and Games-Howell; then click Options and check Descriptive
and Homogeneity of
variance test. Click Continue and OK. (Note that the three
treatment groups in the data
set (Figure 6.7) are numerically coded: Pills 5 1, Creams 5 2,
and Drops 5 3.)
suk85842_06_c06.indd 215 10/23/13 1:40 PM
Figure 6.7: Data set in SPSS
suk85842_06_c06.indd 216 10/23/13 1:40 PM
Figure 6.8: SPSS output from trial of skin treatment conditions
Levene Statistic
1.822
df1

2
df2
21
Sig.
.186
Test of Homogeneity of Variances
SkinCondition
ANOVA
SkinCondition
Between
Groups
Within
Groups
Total
Sum of
Squares
14.083
85.750
99.833
df
2

21
23
Mean
Square
7.042
4.083
F
1.724
Sig.
0.203
Descriptives
SkinCondition
Pills
Creams
Total
Drops
N
8
8

8
24
16.50
14.88
14.88
15.42
Std. DeviationMean Std. Error
1.773
1.458
2.642
2.083
.627
.515
.934
.425
15.02
13.66
12.67

14.54
17.98
16.09
17.08
16.30
13
13
12
12
18
17
19
19
95% Confidence
Interval for Mean
Minimum
Lower
Bound
Upper

Bound
Maximum
Multiple Comparisons
Dependent Variable: SkinCondition
Tukey
HSD
Games-
Howell
Pills
Creams
Drops
Pills
Creams
Drops
(I)
Treatment
1.625
�1.625
1.625
.000

�1.625
1.625
.000
1.625
�1.625
�1.625
.000
.000
1.010
1.010
1.010
1.010
1.010
1.125
1.010
.811
1.125
.811

1.067
1.067
.264
.264
.264
1.000
.264
.350
1.000
.149
.350
.149
1.000
1.000
�.92
�4.17
�.92
�2.55

�4.17
�1.37
�2.55
�.51
�4.62
�3.76
�2.89
�2.89
4.17
.92
4.17
2.55
.92
4.62
2.55
3.76
1.37
.51

2.89
2.89
(J)
Treatment
Mean
Difference
(I-J)
95% Confidence
Interval
Std.
Error
Lower
Bound
Upper
Bound
Sig.
Creams
Drops
Pills
Drops
Pills

Creams
Creams
Drops
Pills
Drops
Pills
Creams
suk85842_06_c06.indd 217 10/23/13 1:40 PM
As seen in the SPSS output (Figure 6.8), the ANOVA results are
the same as when exe-
cuted in Excel earlier in the chapter. Here SPSS allows
execution of the ANOVA including
descriptive statistics, tests of homogeneity of variance, post hoc
tests, and a line graphs—
all simultaneously executed using the SPSS steps outlined
earlier. The results begin with
the Descriptives table where you can see that each group has an
even number of partici-
pants (n 5 8). Here you can see difference in the means with
Pills (M 5 16.50) highest of
the three treatments. The Test of Homogeneity of Variance
shows a favorable result in that
it is not significant (p . .05), specifically p 5 .186. This
indicates that there is no significant

difference in the variance of the three treatments indicating
equal variances. As you recall
in earlier chapters, if there is inequality of variance across
groups an adjustment is needed
to compare groups. Next, the ANOVA table shows a
nonsignificant F statistic, p 5 .203. At
this stage since F is not significant we do not need to interpret
the post hoc tests as there
will be no significance between groups. As noted earlier in the
chapter, this is a debat-
able topic in that with the ease of running post hoc tests, the
analyst can easily look at the
results of these regardless of the F statistic result. Findings may
indicate significant differ-
ences between any two groups even though there is a
nonsignificant F. This is often a rar-
ity and you can clearly see from the example that none of the
post hoc tests is significant
between groups.
SPSS Example 2: Steps for ANOVA
Using public data about higher education and housing from Pew
Research (2010),
Social and Demographic Trends, the steps in executing this
analysis are as follows:
Analyze S Compare Means S One-Way ANOVA. Place schl
(currently enrolled in school)
into the Factor box and age into the Dependent List. Click Post
Hoc on the left and check
Tukey and Games-Howell; then click Options and check
Descriptive, Homogeneity of
variance test, and Means plot. Click Continue and OK.
suk85842_06_c06.indd 218 10/23/13 1:40 PM

Figure 6.9: SPSS output from Pew research social and
demographic trends
(2010) education data set
Levene Statistic
44.884
df1
5
df2
1692
Sig.
.000
Test of Homogeneity of Variances
AGE What is your age?
ANOVA
Between
Groups
Within
Groups

Total
Sum of
Squares
72748.597
272338.706
345087.303
df
5
1692
1697
Mean
Square
14549.719
160.957
F
90.395
Sig.
.000
Descriptives

Std. DeviationMean Std. Error
18 43
18 58
18 64
18 64
18 64
22 34
19.64
24.81
36.03
31.38
42.05
29.33
38.82
33
212
33
81

1336
3
1698
4.801
8.433
15.503
10.692
13.399
6.429
14.260
.836
.579
2.699
1.188
.367
3.712
.346
17.93

23.66
30.53
29.02
41.33
13.36
38.14
21.34
25.95
41.53
33.75
42.77
45.30
39.49 18 64
95% Confidence
Interval for Mean
Minimum
Lower
Bound
Upper

Bound
Maximum
Yes, in
High School
Yes, in Technical,
trade, or
vocational school
Yes, in College
(undergraduate,
including 2-year
colleges)
Yes, in Graduate
School
No
Don’t know/
Refused (VOL.)
Total
N
suk85842_06_c06.indd 219 10/23/13 1:40 PM
demographic trends

(2010) education data set (continued)
Dependent Variable: AGE What is your age?
Tukey
HSD
(I)
SCHL Are you
currently enrolled
in school?
(J)
SCHL Are you
currently enrolled in
school?
Mean
Difference
(I-J)
95% Confidence
Interval
Std.
Error
Lower
Bound
Upper
Bound

Sig.
Yes, in High School
Yes, in High School
Yes, in Technical, trade, or
vocational school
Yes, in College
(undergraduate, including
2-year colleges)
Yes, in College
2-year colleges)
Yes, in College
2-year colleges)
No
Don’t know/Refused (VOL.)
Yes, in Graduate School
vocational school
Yes, in High School
vocational school

Yes, in College
2-year colleges)
Yes, in High School
vocational school
No
No
No
Yes, in College
2-year colleges)
Yes, in High School

vocational school
No
�16.394*
�5.170
�11.746*
�22.417*
�9.697
16.394*
11.224*
4.648
�6.023
6.697
5.170
�11.224*
�6.576*
�17.247*
�4.527

11.746*
�4.648
6.576*
�10.670*
2.049
22.417*
6.023
17.247*
12.720
10.670*
9.697
�6.697
4.527
�2.049
�12.720
3.123
2.374
2.620

2.236
7.650
3.123
2.374
2.620
2.236
7.650
2.374
2.374
1.657
.938
7.376
2.620
2.620
1.657
1.452
7.459
2.236

2.236
.938
7.333
1.452
7.650
7.650
7.376
7.459
7.333
.000
.249
.000
.000
.803
.000
.000
.483
.077

.952
.249
.000
.001
.000
.990
.000
.483
.001
.000
1.000
.000
.077
.000
.509
.000
.803
.952

.990
1.000
.509
�25.30
�11.94
�19.22
�28.79
�31.52
7.48
4.45
�2.83
�12.40
�15.13
�1.60
�18.00
�11.30
�19.92
�25.57

4.27
�12.12
1.85
�14.81
�19.23
16.04
�.36
14.57
�8.20
6.53
�12.13
�28.52
�16.52
�23.33
�33.64
�7.48
1.60
�4.27

�16.04
12.13
25.30
18.00
12.12
.36
28.52
11.94
�4.45
�1.85
�14.57
16.52
19.22
2.83
11.30
�6.53
23.33
28.79

12.40
19.92
33.64
14.81
31.52
15.13
25.57
19.23
8.20
* The mean difference is significant at the 0.05 level.
Yes, in College
(undergraduate,
including 2-year
colleges)
Yes, in
Graduate School
No
Yes, in
High School
Yes, in Technical,
trade, or
vocational school

Don’t know/
Refused (VOL.)
suk85842_06_c06.indd 220 10/23/13 1:40 PM
demographic trends
(2010) education data set (continued)
Source: Data from Pew Research: Social and Demographic
Trends. (2011). Higher Education/Housing. Retrieved from
http://www
.pewsocialtrends.org/category/datasets/.
Dependent Variable: AGE What is your age?
Games-
Howell
(I)
SCHL Are you
currently enrolled
in school?
(J)
SCHL Are you
currently enrolled in
school?

Mean
Difference
(I-J)
95% Confidence
Interval
Std.
Error
Lower
Bound
Upper
Bound
Sig.
Yes, in High School
Yes, in High School
vocational school
Yes, in College
2-year colleges)
Yes, in College
2-year colleges)
Yes, in College

2-year colleges)
No
vocational school
Yes, in High School
vocational school
Yes, in College
2-year colleges)
Yes, in High School
vocational school
No
No

No
Yes, in College
(undergraduate,
including 2-year
colleges)
Yes, in
Graduate School
No
Yes, in
High School
Yes, in Technical,
trade, or
vocational school
Yes, in College
2-year colleges)
Yes, in High School
vocational school

No
Don’t know/
Refused (VOL.)
* The mean difference is significant at the 0.05 level.
�16.394*
�5.170*
�11.746*
�22.417*
�9.697
16.394*
11.224*
4.648
�6.023
6.697
5.170*
�11.224*
�6.576*
�17.247*

�4.527
11.746*
�4.648
6.576*
�10.670*
2.049
22.417*
6.023
17.247*
12.720
10.670*
9.697
�6.697
4.527
�2.049
�12.720
2.825
1.017

1.453
.913
3.805
2.825
2.760
2.949
2.724
4.589
1.017
2.760
1.322
.685
3.757
1.453
2.949
1.322
1.243
3.897

.913
2.724
.685
3.730
1.243
3.805
4.589
3.757
3.897
3.730
.000
.000
.000
.000
.374
.000
.003
.618

.260
.701
.000
.003
.000
.000
.814
.000
.618
.000
.000
.990
.000
.260
.000
.247
.000
.374

.701
.814
.990
.247
�24.87
�8.15
�15.96
�25.13
�38.01
7.92
2.91
�4.13
�14.25
�13.62
2.19
�19.54
�10.40
�19.21

�34.04
7.53
�13.42
2.75
�14.29
�24.33
19.70
�2.21
15.28
�17.54
7.05
�18.62
�27.02
�24.99
�28.43
�42.98
�7.92
�2.19

�7.53
�19.70
18.62
24.87
19.54
13.42
2.21
27.02
8.15
�2.91
�2.75
�15.28
24.99
15.96
4.13
10.40
�7.05
28.43

25.13
14.25
19.21
42.98
14.29
38.01
13.62
34.04
24.33
17.54
suk85842_06_c06.indd 221 10/23/13 1:40 PM
http://www.pewsocialtrends.org/category/datasets/
Figure 6.10: SPSS output graph from Pew research social and
demographic
trends (2010) education data set
Source: Data from Pew Research: Social and Demographic
Trends. (2011). Higher Education/Housing. Retrieved from
http://www

.pewsocialtrends.org/category/datasets/.
The Descriptives table in Figures 6.9 and 6.10 show that each
group has an unequal number
of participants with No (not in school) participants with n 5
1,336 and the highest mean
age (M 5 42.05). The Test of Homogeneity of Variance shows a
nonfavorable result in that
it is significant (p , .05). This indicates that there is a
significant difference in the variance
of the six education groups indicating unequal variances (or
heterogeneity of variance).
Next, the ANOVA table indicates a significant F statistic (p ,
.05). To determine which
of the group comparisons is significant using a post hoc test
when there is a violation of
homogeneity, equal variance will not be assumed. Therefore, we
will interpret the Equal
variances not assumed post hoc tests, which is Games-Howell.
Here, the Don’t Know/
Refused group is not significant with any of the other education
groups. You can also
see significant difference with several groups such as Yes, in
High School and Yes, in
Technical, trade, or vocational school. All comparisons can be
made in a similar manner
based on the significance value using the Multiple Comparisons
table. The line graph or
means plot shows the mean age of each group with the No group
having the highest mean
age and the Yes, in High School group having the lowest.
suk85842_06_c06.indd 222 10/23/13 1:40 PM

CHAPTER 6Section 6.9 Nonparametric Test: Kruskal-Wallis H-
Test
6.8 Interpreting Results
Though you should refer to the most recent edition of the APA
manual for specific detail on formatting statistics, the following
may be used as a quick guide in presenting the
statistics covered in this chapter.
Table 6.8: Guide to APA formatting of F statistic results
Abbreviation or Term Description
F F test statistic score
h2 Eta-squared: an effect size
v2 Omega-squared: an effect size
HSD Honestly significant difference: a Tukey’s post hoc test
SS Sum of squares
MS Mean square
Source: Publication Manual of the American Psychological
Association, 6th edition. © 2009 American Psychological
Association,
pp. 119–122.
Note that all of the terms in Table 6.8 are italicized, while HSD
is not. The following are
some examples of how to present results using these

abbreviations, though you may use
different combinations of results.
Using the data from SPSS Example 2, Figures 6.9 and 6.10, we
could present the results in
the following way:
• The overall difference between treatment and skin condition
was not signifi-
cantly different F(2,21) 5 1.724, p 5 .203. (Note that the df is
listed for both the
between- and within-group lines in the ANOVA table.)
• The overall difference between school and age was
significantly different
F(5,1692) 5 90.39, p , .05.
• The No [school] group were significantly older (M 5 42.05,
SD 5 13.39) than the
Yes, in High School group (M 5 19.64, SD 5 4.80), the Yes, In
College. . . group
(M 5 24.81, SD 5 8.43), and the Yes, in Graduate School group
(M 5 31.38,
SD 5 10.69), whereas there were no significant differences with
the Yes, in Tech-
nical, trade. . . group (M 5 36.03, SD 5 15.50), and the Don’t
Know/Refused
group (M 5 29.33, SD 5 6.43).
6.9 Nonparametric Test: Kruskal-Wallis H-test
The one-way ANOVA nonparametric equivalent is the Kruskal-
Wallis H-test, also known as the Kruskal Wallis ANOVA. Like
the Mann-Whitney U-test, the Kruskal-
Wallis H-test is based on ranked (ordinal) data. It is used as an
alternative to its para-

metric counterpart when violations of assumptions have
occurred. In fact, Kruskal was
suk85842_06_c06.indd 223 10/23/13 1:40 PM
CHAPTER 6
not a proponent of significance testing, as Bradburn (2007)
has quoted him as saying, “I am thinking these days about
the many senses in which relative importance gets consid-
ered. Of these senses, some seem reasonable and others not
so. Statistical significance is low on my ordering.” That said,
his derived equivalent of a parametric technique is very
apropos.
As in the Mann-Whitney U-test, the rank of each group is
determined and then summed. The H is calculated as a pro-
portion of the summed ranks divided by their respective
sample sizes.
H 5
12
N1N 1 12 a a
1Tg 2 2
ng
b 2 31N 1 12 Formula
6.7
Where
N 5 total sample size

Tg 5 sum of ranks across
ng 5 sample of group
To illustrate the calculation of the H-test, we will use the same
data from Table 6.7 with a
few modifications as seen in Table 6.9. Here the initial rank is
to rank all the values across
treatments with 1 being the lowest rank. If there are tied ranks,
then an average of the ranks
is taken. For instance in the Pills column, the two values of 12
have an initial rank of 1 and
2. The average of them is 1.5 as seen in the Rank column. The
same is true for values of 13,
where there are four ranks with the average rank of 4.5, and so
on with the other ties. Once
all of these are complete, then the ranks are summed as seen in
the last row of the table.
Table 6.9: Data from trial of skin treatment conditions
Pills Initial
Rank
Rank Cream Initial
Rank
Rank Drops Initial
Rank
Rank
14 7 7.5 18 21 21.5 13 3 4.5
13 4 4.5 15 10 10.5 15 11 10.5

19 24 24 16 14 14.5 16 15 14.5
18 20 21.5 18 22 21.5 15 12 10.5
15 9 10.5 17 17 18 14 8 7.5
16 13 14.5 13 5 4.5 17 19 18
12 1 1.5 17 18 18 13 6 4.5
12 2 1.5 18 23 21.5 16 16 14.5
85.5 130 84.5
There are several
websites that will
help in these calculations.
One well-used statistical
calculator for various
analyses, such as the
Kruskal-Wallis H-test,
can be done using the
resources available at the
VassarStats website via
the link provided below.
Use the data provided in
this chapter section to see
if you get the same results.
http://vassarstats.net
/index.html
Try It!
Section 6.9 Nonparametric Test: Kruskal-Wallis H-Test

suk85842_06_c06.indd 224 10/23/13 1:40 PM
http://vassarstats.net/index.html
http://vassarstats.net/index.html
Test
Next each of the summed ranks are divided by their respective
sample sizes, completing
Formula 6.7.
H 5
12
24124 1 12 1 c
185.52 2
8
1
11302 2
8
1
184.52 2
8
d 2 3124 1 12
H 5 12 3 17,310.252 1 116,9002 1 17,140.252 4 2 3124 1 12
H 5 0.022 (913.78 1 2,112.50 1 892.53) 2 69
H 5 0.022 (3,968.31) 2 69

H 5 18.30
The H statistic approximates a chi-square (x2) distribution,
which will be discussed in
Chapter 11, based on k 2 1 degrees of freedom where k is the
number of comparison
groups. The chi-square distribution table in Table 6.10 has the
critical values based on the
degrees of freedom, that is N 2 1 5 23. Therefore, using the
table, the x2critical 5 35.17 at
the a 5 .05 level. As noted our x2observed value above 18.30 is
less than this x
2
critical 5 35.17
value meaning that there is no significant difference between
groups. As noted in the
ANOVA conducted earlier in the chapter, it was expected that a
nonsignificant outcome
would occur. Nonparametric tests are more conservative
compared to parametric ones in
that there is a lower probability of finding a significant outcome
compared to its paramet-
ric counterpart. This also leads to a lower probability of a type I
error.
Table 6.10: Chi-square distribution
Area to the right of critical value
Degrees of
freedom
0.99 0.975 0.95 0.90 0.10 0.05 0.025 0.01

1 — 0.001 0.004 0.016 2.706 3.841 5.024 6.635
2 0.020 0.051 0.103 0.211 4.605 5.991 7.378 9.210
3 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.345
4 0.297 0.484 0.711 1.064 7.779 9.488 11.143 13.277
5 0.554 0.831 1.145 1.610 9.236 11.071 12.833 15.086
6 0.872 1.237 1.635 2.204 10.645 12.592 14.449 16.812
7 1.239 1.690 2.167 2.833 12.017 14.067 16.013 18.475
8 1.646 2.180 2.733 3.490 13.362 15.507 17.535 20.090
9 2.088 2.700 3.325 4.168 14.684 16.919 19.023 21.666
10 2.558 3.247 3.940 4.865 15.987 18.307 20.483 23.209
11 3.053 3.816 4.575 5.578 17.275 19.675 21.920 24.725
12 3.571 4.404 5.226 6.304 18.549 21.026 23.337 26.217
(continued)
suk85842_06_c06.indd 225 10/23/13 1:40 PM
Test
Table 6.10: Chi-square distribution (continued)

Area to the right of critical value
Degrees of
freedom
0.99 0.975 0.95 0.90 0.10 0.05 0.025 0.01
13 4.107 5.009 5.892 7.042 19.812 22.362 24.736 27.688
14 4.660 5.629 6.571 7.790 21.064 23.685 26.119 29.141
15 5.229 6.262 7.261 8.547 22.307 24.996 27.488 30.578
16 5.812 6.908 7.962 9.312 23.542 26.296 28.845 32.000
17 6.408 7.564 8.672 10.085 24.769 27.587 30.191 33.409
18 7.015 8.231 9.390 10.865 25.989 28.869 31.526 34.805
19 7.633 8.907 10.117 11.651 27.204 30.144 32.852 36.191
20 8.260 9.591 10.851 12.443 28.412 31.410 34.170 37.566
21 8.897 10.283 11.591 13.240 29.615 32.671 35.479 38.932
22 9.542 10.982 12.338 14.042 30.813 33.924 36.781 40.289
23 10.196 11.689 13.091 14.848 32.007 35.172 38.076 41.638
24 10.856 12.401 13.848 15.659 33.196 36.415 39.364 42.980
25 11.524 13.120 14.611 16.473 34.382 37.652 40.646 44.314
26 12.198 13.844 15.379 17.292 35.563 38.885 41.923 45.642
27 12.879 14.573 16.151 18.114 36.741 40.113 43.194 46.963

28 13.565 15.308 16.928 18.939 37.916 41.337 44.461 48.278
29 14.257 16.047 17.708 19.768 39.087 42.557 45.722 49.588
30 14.954 16.791 18.493 20.599 40.256 43.773 46.979 50.892
As you will see in the next section, when this analysis is
performed in SPSS a x2 value is
given and not an H value per se.
SPSS Steps for the Kuskal-Wallis H-test
Reexamining the data set used Figure 6.6, but rearranging the
data as depicted in Figure
6.11, the employee groups (Position) are categorically coded
with 1 5 Laborers, 2 5 Cleri-
cal, and 3 5 Professional. To execute, go to Analyze S
Nonparametric Tests S Legacy
Dialogs S K Independent Samples. As shown in Figure 6.12,
input Optimism (DV) into
the Test Variable List box and Position (IV) into the Grouping
Variable box, then click
the Define Range button just below to input the range of codes
for the Position variable—
this will be 1 and 3 for the minimum and maximum codes,
respectively. Then click OK.
suk85842_06_c06.indd 226 10/23/13 1:40 PM
Test
Figure 6.11: Data set in SPSS

Figure 6.12: The Kruskal-Wallis H-test steps in SPSS
suk85842_06_c06.indd 227 10/23/13 1:40 PM
test
Interpreting Results
The output in Figure 6.13 shows the results of the Kuskal-
Wallis H-test. The x2 value in the
Test Statistics table shows a result as KW—x2(3) 5 7.17, p ,
.05 as there is an overall sta-
tistical difference in optimism amongst the three employee
groups. This can be seen in the
Ranks table where Laborers’ mean ranks (MR 5 21.80) is
highest compared to the lowest
by Professionals (MR 5 6.30). Consequently, the post hoc tests
are not readily available as
they were for the ANOVA, so follow-up Mann-Whitney U or
Wilcoxon rank-sum tests of
all possible combinations will have to be performed (see
Chapter 5 for these procedures).
The conclusion to these results would read as follows:
Based on the Kruskal-Wallis H-test there is a significant
difference in the
level of optimism of the three groups (KW—x2(3) 5 17.17, p ,
.05). Labor-
ers reported the highest level of optimism (MR 5 21.80)
followed by Cleri-
cal positions (MR 5 18.40), and then Professionals (MR 5 6.30),

which
reported the lowest level of optimism.
Figure 6.13: The Kruskal-Wallis H-test output
Groups
Optimism
Laborers
Clerical
Professionals
Total
Position N
10
10
10
30
Mean Rank
21.80
18.40
6.30
Ranks

Chi-Square 17.166
2df
Asymp. Sig. .000
Test Statisticsa,b
Optimism
a. Kruskal Wallis Test
b. Grouping Variable: Position
suk85842_06_c06.indd 228 10/23/13 1:40 PM
CHAPTER 6Summary
Summary
This chapter is the natural extension of Chapters 4 and 5. Like
the z- and t-tests, analysis
of variance is a test of significant differences. Also like the z-
and t-tests, the IV in ANOVA
is nominal and the DV is interval or ratio. With each
procedure—whether z, t, or F—the
test statistic is a ratio of the differences between groups to the
differences within groups
(Objective 3).
There are differences between ANOVA and the earlier
procedures, of course. The vari-
ance statistics are sums of squares and mean squares values. But
perhaps the most impor-
tant difference is that ANOVA can accommodate any number of

groups (Objectives 2 and
3). Remember that trying to deal with multiple groups in a t-test
introduces the problem
of mounting type I error when repeated analyses with the same
data indicate statistical
significance. One-way ANOVA lifts the limitation of a one-
pair-at-a-time comparison
(Objective 1).
The other side of multiple comparisons, however, is the
difficulty of determining which
comparisons are statistically significant when F is significant.
This problem is solved
with the post hoc test. In this chapter, we used Tukey’s HSD
(Objective 4). There are
other post hoc tests, each having their strengths and drawbacks,
but HSD is one of the
most widely used.
Years ago, the emphasis in the scholarly literature was on
whether a result was statisti-
cally significant. Today, the focus is on measuring the effect
size of a significant result, a
statistic that in the case of analysis of variance can indicate how
much of the variability
in the dependent variable can be attributed to the effect of the
independent variable. We
answered that question with eta-squared (h2). But neither the
post hoc test nor eta-squared
is relevant if the F is not significant (Objective 5). Then, further
ANOVAs were executed in
SPSS, and the results were presented (Objective 6) in APA
format and interpreted accord-
ingly (Objective 7). Finally, the nonparametric equivalent of
ANOVA, Kruskal-Wallis
H-test, was discussed as an alternative method and compared to

its parametric equivalent,
the ANOVA. The same data set was used to compare outcomes.
In addition, an appropri-
ate example in SPSS was provided (Objective 8).
The independent t-test and the one-way ANOVA both require
that groups be indepen-
dent. What if they are not? What if we wish to measure one
group twice over time, or
perhaps more than twice? Such dependent-groups procedures are
the focus of Chapter 7.
Rather than different thinking, it is more of an elaboration of
familiar concepts. For this
reason, consider reviewing Chapter 5 and the independent t-test
discussion before start-
ing Chapter 7.
The one-way ANOVA dramatically broadens the kinds of
questions the researcher can
ask. The procedures in Chapter 7 for nonindependent groups
represent the next incre-
mental step.
suk85842_06_c06.indd 229 10/23/13 1:40 PM
CHAPTER 6Chapter Exercises
Key Terms
analysis of variance Fisher’s test that
allows one to detect significant differences
among any number of groups. The acro-
nym is ANOVA.

error variance The variability in a measure
unrelated to the variables being analyzed.
Eta-squared A measure of effect size for
ANOVA. It estimates the amount of vari-
ability in the DV explained by the IV.
F ratio The test statistic calculated in an
analysis of variance problem. It is the ratio
of the variance between the groups to the
variance within the groups.
factor Refers to an IV, particularly in pro-
cedures that involve more than one.
family-wise error An inflated type I error
rate in hypothesis testing when doing mul-
tiple tests with the assumption of different
sets of data. Specifically, when comparing
multiple groups in dyad combinations
using a series of t-tests instead of executing
one omnibus ANOVA.
homogeneity of variance When multiple
groups of data are distributed similarly.
mean square The sum of squares divided
by its degrees of freedom. This division
allows the mean square to reflect a mean, or
average, amount of variability from a source.
omnibus test A test of the overall sig-
nificance of the model based on difference
between sample means when there are
more than two groups to compare. The test
will not tell you which two means are sig-

nificantly different, which is why follow-up
post hoc comparisons are executed.
one-way ANOVA The ANOVA in its sim-
plest form, this model has only one inde-
pendent variable.
post hoc test A test conducted after a sig-
nificant ANOVA or some similar test that
identifies which among multiple possibili-
ties is statistically significant.
sum of squares (SS) The variance measure
in analysis of variance. They are literally
the sum of squared deviations between a
set of scores and their mean.
sum of squares between The variability
related to the independent variable and any
measurement error that may occur.
sum of squares total Total variance from
all sources.
sum of squares within Variability stem-
ming from different responses from indi-
viduals in the same group. It is exclusively
error variance. Is also referred to as the sum
of squares error or the sum of squares residual.
Chapter Exercises
Answers to Try It! Questions
The answers to all Try It! questions introduced in this chapter
are provided below.

A. The “one” in one-way ANOVA refers to the fact that this test
accommodates just
one independent variable.
B. There is no gender variable in the analysis and consequently,
gender-related
variance emerges as error variance. The same would be true for
any variability
in scores stemming from any variable not being analyzed in the
study.
suk85842_06_c06.indd 230 10/23/13 1:40 PM
CHAPTER 6Chapter Exercises
C. It would take 15 comparisons! The answer is the number of
groups (6)
times the number of groups minus 1 (5), with the product
divided by 2:
6 3 5 5 30/2 5 15.
D. The only way SS values can be negative is if there has been a
calculation error.
Because the values are all squared values, if they have any
value other than 0,
they have to be positive.
E. The difference between SStot and SSwith is the SSbet.
F. If F 5 4 and MSwith 5 2, then MSbet 5 8 because F 5
MSbet 4 MSwith.
G. The answer is neither. If F is not significant, there is no
question of which group
is significantly different from which other group because any

variability may be
nothing more than sampling variability. By the same token,
there is no effect to
calculate because, as far as we know, the IV does not have any
effect on the DV.
H. F 5 t2
Review Questions
The answers to the odd-numbered items can be found in the
answers appendix.
1. Several people selected at random are given a story problem
to solve. They take
3.5, 3.8, 4.2, 4.5, 4.7, 5.3, 6.0, and 7.5 minutes. What is the
total sum of squares for
this data?
2. Identify the following symbols and statistics in a one-way
ANOVA:
a. The statistic that indicates the mean amount of difference
between groups.
b. The symbol that indicates the total number of participants.
c. The symbol that indicates the number of groups.
d. The mean amount of uncontrolled variability.
3. The theory is that there are differences by gender in
manifested aggression. With
data from Measuring Expressed Aggression Numbers (MEAN),
a researcher has
the following:
Males: 13, 14, 16, 16, 17, 18, 18, 18
Females: 11, 12, 12, 14, 14, 14, 14, 16

iStockphotoThinkstockchapter 6Analysis of Variance (A.docx

iStockphotoThinkstockchapter 6Analysis of Variance (A.docx

Recommended

Recommended

More Related Content

Similar to iStockphotoThinkstockchapter 6Analysis of Variance (A.docx

Similar to iStockphotoThinkstockchapter 6Analysis of Variance (A.docx (19)

More from vrickens

More from vrickens (20)

Recently uploaded

Recently uploaded (20)

iStockphotoThinkstockchapter 6Analysis of Variance (A.docx