ANOVA Interpretation Set 1
Study this scenario and ANOVA table, then answer the
questions in the assignment instructions.
A researcher wants to compare the efficacy of three different
techniques for memorizing
information. They are repetition, imagery, and mnemonics. The
researcher randomly assigns
participants to one of the techniques. Each group is instructed
in their assigned memory
technique and given a document to memorize within a set time
period. Later, a test about the
document is given to all participants. The scores are collected
and analyzed using a one-way
ANOVA. Here is the ANOVA table with the results:
Source     SS         df   MS        F       p
Between    114.3111    2   57.1556   19.74   <.0001
Within     121.6      42    2.8952
Total      235.9111   44
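The entries in this table are linked by simple arithmetic: each MS is its SS divided by its df, and F is the ratio of the two MS values. The short sketch below rechecks those relationships from the numbers in the table; the variable names are illustrative only and are not part of the assignment.

```python
# Values copied from the ANOVA table above.
ss_between, df_between = 114.3111, 2
ss_within, df_within = 121.6, 42

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the between-groups mean square divided by the within-groups mean square.
f_ratio = ms_between / ms_within

# The Between and Within rows should add up to the Total row.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"MS between = {ms_between:.4f}, MS within = {ms_within:.4f}")
print(f"F = {f_ratio:.2f}, SS total = {ss_total:.4f}, df total = {df_total}")
```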

9/10/2019 Print
https://content.ashford.edu/print/AUPSY325.16.1?sections=ch6,
ch6sec1,ch6sec2,ch6sec3,ch6sec4,ch6sec5,ch6sec6,ch6sec7,ch6
sec8,ch6summary,ch7,ch7sec1,ch7s… 1/76
Chapter Learning Objectives
After reading this chapter, you should be able to do the
following:
1. Explain why it is a mistake to analyze the differences
between more than two groups with
multiple t tests.
2. Relate sum of squares to other measures of data variability.
3. Compare and contrast t test with analysis of variance
(ANOVA).
4. Demonstrate how to determine significant differences among
groups in an ANOVA with more
than two groups.
5. Explain the use of eta squared in ANOVA.
6 Analysis of Variance

Introduction
From one point of view at least, R. A. Fisher was present at the
creation of modern statistical analysis. During
the early part of the 20th century, Fisher worked at an
agricultural research station in rural southern England.
Analyzing the effect of pesticides and fertilizers on crop yields,
he was stymied by independent t tests that
allowed him to compare only two samples at a time. In the
effort to accommodate more comparisons, Fisher
created analysis of variance (ANOVA).
Like William Gosset, Fisher felt that his work was important
enough to publish, and like Gosset, he met
opposition. Fisher’s came in the form of a fellow statistician,
Karl Pearson. Pearson founded the first department
of statistical analysis in the world at University College,
London. He also began publication of what is—for
statisticians at least—perhaps the most influential journal in the
field, Biometrika. The crux of the initial conflict
between Fisher and Pearson was the latter’s commitment to
making one comparison at a time, with the largest
groups possible.
When Fisher submitted his work to Pearson’s journal,
suggesting that samples can be small and many
comparisons can be made in the same analysis, Pearson rejected
the manuscript. So began a long and
increasingly acrimonious relationship between two men who
became giants in the field of statistical analysis and
who nonetheless ended up in the same department at University
College. Gosset also gravitated to the
department but managed to get along with both of them. Joined
a little later by Charles Spearman, collectively
these men made enormous contributions to quantitative research
and laid the foundation for modern statistical
analysis.
Try It!: #1
To what does the one in one-way ANOVA refer?
If a researcher is analyzing how children’s
behavior changes as a result of watching a
video, the independent variable (IV) is
whether the children have viewed the video.
A change in behavior is the dependent
variable (DV), but any behavior changes
other than those stemming from the IV
reflect the presence of error variance.
6.1 One-Way Analysis of Variance
In an experiment, measurements can vary for a variety of
reasons. A study to determine whether children will
emulate the adult behavior observed in a video recording
attributes any behavioral differences between the children who
saw the recording and those who did not to viewing the recording. The
independent variable (IV) is whether the children
have seen the video. Although changes in behavior (the DV)
show the IV’s effect, they can also reflect a variety
of other factors. Perhaps differences in age among the children
prompt behavioral differences, or maybe variety
in their background experiences prompts them to interpret what
they see differently. Changes in the subjects’
behavior not stemming from the IV constitute what is called
error variance.
When researchers work with human subjects, some level of
error variance is inescapable. Even under tightly
controlled conditions where all members of a sample receive
exactly the same treatment, the subjects are
unlikely to respond identically because subjects are complex
enough that factors besides the IV are involved.
Fisher’s approach was to measure all the variability in a
problem and then analyze it, thus the name analysis of
variance.
Any number of IVs can be included in an ANOVA.
Initially, we are interested in the simplest form of the
test, one-way ANOVA. The “one” in one-way
ANOVA refers to the number of independent
variables, and in that regard, one-way ANOVA is
similar to the independent t test. Both employ just one
IV. The difference is that in the independent t test the
IV has just two groups, or levels, and ANOVA can
accommodate any number of groups more than one.
ANOVA Advantage

The ANOVA and the t test both answer the same question: Are
there significant differences between groups? When one
sample is compared to a population (in the study of whether
social science students study significantly different numbers of
hours than do all university students), we used the one-sample
t test. When two groups are involved (in the study of whether
problem-solving measures differ for married people than for
divorced people), we used the independent t test. If the study
involves more than two groups (for example, whether working
rural, semirural, suburban, and urban adults completed
significantly different numbers of years of post-secondary
education), why not just conduct multiple t tests?
Suppose someone develops a group-therapy program for
people with anger management problems. The research
question is: Are there significant differences in the behavior of
clients who spend (a) 8, (b) 16, and (c) 24 hours in therapy
over a period of weeks? In theory, we could answer the
question by performing three t tests as follows:
1. Compare the 8-hour group to the 16-hour group.
2. Compare the 8-hour group to the 24-hour group.
3. Compare the 16-hour group to the 24-hour group.
The Problem of Multiple Comparisons
The three tests enumerated above represent all possible
comparisons, but this approach presents two problems.
First, all possible comparisons are a good deal more manageable
with three groups than, say, five groups. With
five groups (labeled a through e) the number of comparisons
needed to cover all possible comparisons increases
to 10, as Figure 6.1 shows. As the number of comparisons to
make increases, the number of tests required
quickly becomes unwieldy.
Figure 6.1 Comparisons needed for five groups
Comparing Group A to Group B is comparison 1. Comparing
Group D to Group E would be the
tenth comparison necessary to make all possible comparisons.
The second problem with using t tests to make all possible
comparisons is more subtle. Recall that the potential
for type I error (α) is determined by the level at which the test
is conducted. At p = 0.05, any significant finding
will result in a type I error an average of 5% of the time.
However, the error probability is based on the
assumption that each test is entirely independent, which means
that each analysis is based on data collected from
new subjects in a separate analysis. If statistical testing is
performed repeatedly with the same data, the potential
for type I error does not remain fixed at 0.05 (or whatever level
was selected), but grows. In fact, if 10 tests are
conducted in succession with the same data as with groups
labeled a, b, c, d, and e above, and each finding is
significant, by the time the 10th test is completed, the potential
for alpha error grows to 0.40 (see Sprinthall,
2011, for how to perform the calculation). Using multiple t tests
is therefore not a good option.
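Both problems can be put into numbers. The sketch below counts the pairwise comparisons needed for k groups and estimates how the chance of at least one type I error grows across m tests. The growth formula used here, 1 − (1 − α)^m, is a common approximation that treats the tests as independent; it is an assumption of this sketch and is not necessarily the calculation given in Sprinthall (2011). The function names are illustrative.

```python
from math import comb


def pairwise_comparisons(k):
    """Number of distinct pairs among k groups: k(k - 1) / 2."""
    return comb(k, 2)


def familywise_error(alpha, m):
    """Approximate chance of at least one type I error across m tests,
    treating the tests as independent (an assumption of this sketch)."""
    return 1 - (1 - alpha) ** m


for k in (3, 5):
    m = pairwise_comparisons(k)
    print(f"{k} groups: {m} comparisons, "
          f"approximate alpha = {familywise_error(0.05, m):.2f}")
```

With five groups the sketch gives 10 comparisons and an approximate error rate of 0.40, which agrees with the figures quoted above.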
Variance in Analysis of Variance

When scores in a study vary, there are two potential
explanations: the effect of the independent variable (the
“treatment”) and the influence of factors not controlled by the
researcher. This latter source of variability is the
error variance mentioned earlier.
The test statistic in ANOVA is called the F ratio (named for
Fisher). The F ratio is treatment variance divided by
error variance. As was the case with the t ratio, a large F ratio
indicates that the difference among groups in the
analysis is not random. When the F ratio is small and not
significant, it means the IV has not had enough impact
to overcome error variability.
Variance Among and Within Groups
If three groups of the same size are all selected from one
population, they could be represented by the three
distributions in Figure 6.2. They do not have exactly the same
mean, but that is because even when they are
selected from the same population, samples are rarely identical.
Those initial differences among sample means
indicate some degree of sampling error.
The reason that each of the three distributions has width is that
differences exist within each of the groups. Even
if the sample means were the same, individuals selected for the
same sample will rarely manifest precisely the

same level of whatever is measured. If a population is
identified—for example, a population of the academically
gifted—and a sample is drawn from that population, the
individuals in the sample will not all have the same
level of ability despite the fact that all are gifted students. The
subjects’ academic ability within the sample will
still likely have differences. These differences within are the
evidence of error variance.
The treatment effect is represented in how the IV affects what is
measured, the DV. For example, three groups of
subjects are administered different levels of a mild stimulant
(the IV) to see the effect on level of attentiveness.
The subsequent analysis will indicate whether the samples still
represent populations with the same mean, or
whether, as is suggested by the distributions in Figure 6.3, they
represent unique populations.
The within-groups variability in these three distributions is the
same as it was in the distributions in Figure 6.2.
It is the among-groups variability that makes Figure 6.3
different. More specifically, the difference between the
group means is what has changed. Although some of the
difference remains from the initial sampling variability,
differences between the sample means after the treatment are
much greater. F allows us to determine whether
those differences are statistically significant.
Figure 6.2: Three groups drawn from the same population
A sample of three groups from the same population will have
similar—but not identical—
distributions, where differences among sample means are a
result of sampling error.
Figure 6.3: Three groups after the treatment
Once a treatment has been applied to sample groups from the
same population, differences
between sample means greatly increase.
Try It!: #2
How many t tests would it take to make all
possible pairs of comparisons in a procedure
with six groups?
The Statistical Hypotheses in One-Way ANOVA
The statistical hypotheses are very much like they were for the
independent t test, except that they accommodate
more groups. For the t test, the null hypothesis is written
H0: µ1 = µ2
It indicates that the two samples involved were drawn from
populations with the same mean. For a one-way
ANOVA with three groups, the null hypothesis has this form:
H0: µ1 = µ2 = µ3
It indicates that the three samples were drawn from populations
with the same mean.

Things have to change for the alternate hypothesis, however,
because three groups do not have just one possible
alternative. Note that each of the following is possible:
a. HA: µ1 ≠ µ2 = µ3 Sample 1 represents a population with a
mean value different from the mean of the
population represented by Samples 2 and 3.
b. HA: µ1 = µ2 ≠ µ3 Samples 1 and 2 represent a population
with a mean value different from the mean of
the population represented by Sample 3.
c. HA: µ1 = µ3 ≠ µ2 Samples 1 and 3 represent a population
with a mean value different from the
population represented by Sample 2.
d. HA: µ1 ≠ µ2 ≠ µ3
All three samples represent populations with different means.
Because the several possible alternative outcomes
multiply rapidly when the number of groups
increases, a more general alternate hypothesis is
given. Either all the groups involved come from
populations with the same means, or at least one of
them does not. So the form of the alternate hypothesis
for an ANOVA with any number of groups is simply
HA: not so.
Measuring Data Variability in the One-Way ANOVA
We have discussed several different measures of data variability
to this point, including the standard deviation
(s), the variance ($s^2$), the standard error of the mean (SEM), the
standard error of the difference (SEd), and the
range (R). Analysis of variance presents a new measure of data
variability called the sum of squares (SS). As
the name suggests, it is the sum of the squared values. In the
ANOVA, SS is the sum of the squares of the
differences between scores and means.
One sum-of-squares value involves the differences between
individual scores and the mean of all the
scores in all the groups. This is called the sum of squares
total (SStot) because it measures all
variability from all sources.
A second sum-of-squares value indicates the difference between
the means of the individual groups and
the mean of all the data. This is the sum of squares between
(SSbet). It measures the effect of the IV,
the treatment effect, as well as any differences between the groups
and the mean of all the data preceding
the study.
A third sum-of-squares value measures the difference between
scores in the samples and the means of
those samples. These sum of squares within (SSwith) values
reflect the differences among the subjects
in a group, including differences in the way subjects respond to
the same stimulus. Because this measure
is entirely error variance, it is also called the sum of squares
error (SSerr).
All Variability from All Sources: Sum of Squares Total (SStot)

An example to follow will explore the issue of differences in
the levels of social isolation people in small towns
feel compared to people in suburban areas, as well as people in
urban areas. The SStot will be the amount of
variability people experience—manifested by the difference in
social isolation measures—in all three
circumstances: small towns, suburban areas, and urban areas.
There are multiple formulas for SStot. Although they all
provide the same answer, some make the underlying logic clearer,
while others are easier to use when straightforward calculation
is the issue. The heart of SStot
is the difference between each individual score (x) and the mean
of all scores, called the “grand” mean (MG). In
the example to come, MG is the mean of all social isolation
measures from people in all three groups. The
formula we will use to calculate SStot follows.
Formula 6.1
$SS_{tot} = \sum (x - M_G)^2$
where
x = each score in all groups
MG = the mean of all data from all groups, the “grand” mean
To calculate SStot, follow these steps:
1. Sum all scores from all groups and divide by the number of
scores to determine the grand mean, MG.
2. Subtract MG from each score (x) in each group, and then
square the difference: $(x - M_G)^2$
3. Sum all the squared differences: $\sum (x - M_G)^2$
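The three steps can be sketched in a few lines of Python. The scores are the social isolation data from the worked example later in this section; the function name `ss_total` is illustrative. Because the code keeps full precision rather than rounding each deviation to three decimals, its result (about 41.667) differs slightly from a hand calculation with rounded values.

```python
# Social isolation scores from the worked example in this section.
town = [3, 4, 4, 3]
suburb = [6, 6, 7, 8]
city = [6, 7, 7, 9]


def ss_total(groups):
    """Sum of squared deviations of every score from the grand mean."""
    scores = [x for group in groups for x in group]
    grand_mean = sum(scores) / len(scores)               # step 1
    squared = [(x - grand_mean) ** 2 for x in scores]    # step 2
    return sum(squared)                                  # step 3


print(f"SStot = {ss_total([town, suburb, city]):.3f}")
```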
The Treatment Effect: Sum of Squares Between (SSbet)
In the example we are using, SSbet reflects the differences in social
isolation between small-town, suburban, and urban
groups. SSbet contains the variability due to the independent
variable, or what is often called the treatment effect,
in spite of the fact that it is not something that the researcher
can manipulate in this instance. It will also contain
any initial differences between the groups, which of course
represent error variance. Notice in Formula 6.2 that
SSbet is based on the square of the differences between the
individual group means and the grand mean, times the
number in each group. For three groups labeled A, B, and C, the
formula is below.
Formula 6.2
$SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b + (M_c - M_G)^2 n_c$
where
Ma = the mean of the scores in the first group (a)
MG = the same grand mean used in SStot
na = the number of scores in the first group (a)

To calculate SSbet, follow these steps:
1. Determine the mean for each group: Ma, Mb, and so on.
2. Subtract MG from each sample mean and square the
difference: $(M_a - M_G)^2$.
3. Multiply the squared differences by the number in each
group: $(M_a - M_G)^2 n_a$.
4. Repeat for each group.
5. Sum (∑) the results across groups.
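A minimal sketch of these steps follows, again using the social isolation scores from the worked example later in this section. The function name `ss_between` is illustrative. The unrounded result is about 33.167, slightly different from a hand calculation that rounds the means to three decimals.

```python
# Social isolation scores from the worked example in this section.
town = [3, 4, 4, 3]
suburb = [6, 6, 7, 8]
city = [6, 7, 7, 9]


def ss_between(groups):
    """Squared distance of each group mean from the grand mean,
    weighted by the number of scores in that group, summed over groups."""
    scores = [x for group in groups for x in group]
    grand_mean = sum(scores) / len(scores)
    total = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)                   # step 1
        total += (group_mean - grand_mean) ** 2 * len(group)   # steps 2-3
    return total                                               # steps 4-5


print(f"SSbet = {ss_between([town, suburb, city]):.3f}")
```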
The Error Term: Sum of Squares Within
When a group receives the same treatment but individuals
within the group respond differently, their differences
constitute error—unexplained variability. These differences can
spring from any uncontrolled variable. Since the
only thing controlled in one-way ANOVA is the independent
variable, variance from any other source is error
variance. In the example, not all people in any group are likely
to manifest precisely the same level of social
isolation. The differences within the groups are measured in the
SSwith, the formula for which follows.
Formula 6.3
$SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 + \sum (x_c - M_c)^2$
where
SSwith = the sum of squares within
xa = each of the individual scores in Group a
Ma = the score mean in Group a

To calculate SSwith, follow these steps:
1. Retrieve the mean (used for the SSbet earlier) for each of the
groups.
2. Subtract the individual group mean (Ma for the Group A
mean) from each score in the group (xa for
Group A).
3. Square the difference between each score in each group and
its mean.
4. Sum the squared differences for each group.
5. Repeat for each group.
6. Sum the results across the groups.
Try It!: #3
When will sum-of-squares values be negative?
People may experience differences in social
isolation when they live in small towns
instead of suburbs of large cities.
The SSwith (or the SSerr) measures the fluctuations in subjects’
scores that are error variance.
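The within-groups calculation can be sketched the same way, using the social isolation scores from the worked example later in this section. The function name `ss_within` is illustrative. With unrounded arithmetic the result is exactly 8.5; a hand calculation that rounds each squared difference to three decimals will differ slightly.

```python
# Social isolation scores from the worked example in this section.
town = [3, 4, 4, 3]
suburb = [6, 6, 7, 8]
city = [6, 7, 7, 9]


def ss_within(groups):
    """Sum of squared deviations of each score from its own group mean."""
    total = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        total += sum((x - group_mean) ** 2 for x in group)
    return total


print(f"SSwith = {ss_within([town, suburb, city]):.3f}")
```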

All variability in the data (SStot) is either SSbet or SSwith. As
a result, if two of three are known, the third can be
determined easily. If we calculate SStot and SSbet, the SSwith
can be determined by subtraction:
SStot − SSbet = SSwith
The difficulty with this approach, however, is that any
calculation error in SStot or SSbet is perpetuated in
SSwith/SSerror. The other value of using Formula 6.3
is that, like the two preceding formulas, it helps to
clarify that what is being determined is how much
score variability is within each group. For the few
problems done entirely by hand, we will take the
“high road” and use Formula 6.3.
To minimize the tedium, the data sets here are relatively small.
When researchers complete larger studies by
hand, they often shift to the alternate “calculation formulas” for
simpler arithmetic, but in so doing can sacrifice
clarity. Happily, ANOVA is one of the procedures that Excel
performs, and after a few simple longhand
problems, we can lean on the computer for help with larger data
sets.
Calculating the Sums of Squares
Consider the example we have been using: A researcher is
interested in the level of social isolation people feel in small
towns (a), suburbs (b), and cities (c). Participants randomly
selected from each of those three settings take the Assessment
List of Non-normal Environments (ALONE), for which the
following scores are available:
a. 3, 4, 4, 3
b. 6, 6, 7, 8

c. 6, 7, 7, 9
We know we will need the mean of all the data (MG) as well as
the mean for each group (Ma, Mb, Mc), so we will start there.
Verify that
∑x = 70 and N = 12, so MG = 5.833.
For the small-town subjects,
∑xa = 14 and na = 4, so Ma = 3.50.
For the suburban subjects,
∑xb = 27 and nb = 4, so Mb = 6.750.
For the city subjects,
∑xc = 29 and nc = 4, so Mc = 7.250.
For the sum-of-squares total, the formula is
$SS_{tot} = \sum (x - M_G)^2 = 41.668$
The calculations are listed in Table 6.1.

Table 6.1: Calculating the sum of squares total (SStot)
$SS_{tot} = \sum (x - M_G)^2$, where MG = 5.833
Group    x − MG                 (x − MG)^2
Town     3 − 5.833 = −2.833      8.026
Town     4 − 5.833 = −1.833      3.360
Town     4 − 5.833 = −1.833      3.360
Town     3 − 5.833 = −2.833      8.026
Suburb   6 − 5.833 = 0.167       0.028
Suburb   6 − 5.833 = 0.167       0.028
Suburb   7 − 5.833 = 1.167       1.362
Suburb   8 − 5.833 = 2.167       4.696
City     6 − 5.833 = 0.167       0.028
City     7 − 5.833 = 1.167       1.362
City     7 − 5.833 = 1.167       1.362
City     9 − 5.833 = 3.167      10.030
SStot = 41.668
For the sum of squares between, the formula is
$SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b + (M_c - M_G)^2 n_c$
The SSbet for the three groups is as follows:
$SS_{bet} = (3.5 - 5.833)^2(4) + (6.75 - 5.833)^2(4) + (7.25 - 5.833)^2(4)$
$= 21.772 + 3.364 + 8.032$
$= 33.168$
The SSwith indicates the error variance by determining the
differences between individual scores in a group and
their means. The formula is
$SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 + \sum (x_c - M_c)^2 = 8.504$
Table 6.2 lists the calculations for SSwith.
Table 6.2: Calculating the sum of squares within (SSwith)
$SS_{with} = \sum (x_a - M_a)^2 + \sum (x_b - M_b)^2 + \sum (x_c - M_c)^2$
Town scores: 3, 4, 4, 3
Suburb scores: 6, 6, 7, 8
City scores: 6, 7, 7, 9
Ma = 3.50, Mb = 6.750, Mc = 7.250
Group    x − M                   (x − M)^2
Town     3 − 3.50 = −0.50        0.250
Town     4 − 3.50 = 0.50         0.250
Town     4 − 3.50 = 0.50         0.250
Town     3 − 3.50 = −0.50        0.250
Suburb   6 − 6.750 = −0.750      0.563
Suburb   6 − 6.750 = −0.750      0.563
Suburb   7 − 6.750 = 0.250       0.063
Suburb   8 − 6.750 = 1.250       1.563
City     6 − 7.250 = −1.250      1.563
City     7 − 7.250 = −0.250      0.063
City     7 − 7.250 = −0.250      0.063
City     9 − 7.250 = 1.750       3.063
SSwith = 8.504
Because we calculated the SSwith directly instead of
determining it by subtraction, we can now check for
accuracy by adding its value to the SSbet. If the calculations are
correct, SSwith + SSbet = SStot. For the isolation
example, 8.504 + 33.168 = 41.672.
The calculation of SStot earlier found SStot = 41.668. The
difference between that value and the SStot that we
determined by adding SSbet to SSwith is just 0.004. That result
is due to differences from rounding and is
unimportant.
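One way to confirm that the 0.004 gap comes only from rounding is to repeat the calculation with exact fractions. The sketch below is an illustration rather than part of the chapter's hand method; it uses Python's standard `fractions` module and the same social isolation scores, and the variable names are illustrative.

```python
from fractions import Fraction

# Social isolation scores: town, suburb, city.
groups = [[3, 4, 4, 3], [6, 6, 7, 8], [6, 7, 7, 9]]

scores = [Fraction(x) for group in groups for x in group]
grand_mean = sum(scores) / len(scores)


def group_mean(group):
    """Exact mean of one group as a fraction."""
    return Fraction(sum(group), len(group))


ss_tot = sum((x - grand_mean) ** 2 for x in scores)
ss_bet = sum(len(g) * (group_mean(g) - grand_mean) ** 2 for g in groups)
ss_with = sum((x - group_mean(g)) ** 2 for g in groups for x in g)

# With exact arithmetic the two parts add up to the total with no gap.
print(ss_bet + ss_with == ss_tot)
print(float(ss_tot), float(ss_bet), float(ss_with))
```

With no rounding at any step, SSbet + SSwith equals SStot exactly, so the small discrepancy in the hand calculation reflects rounding to three decimals and nothing else.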
Try It!: #4
What will SStot − SSwith yield?
We calculated equivalent statistics as early as Chapter
1, although we did not term them sums of squares. At
the heart of the standard deviation calculation are
those repetitive x − M differences for each score in
the sample. The difference values are then squared
and summed, much as they are when calculating
SSwith and SStot. Incidentally, the denominator in the
standard deviation calculation is n − 1, which should
look suspiciously like some of the degrees of freedom
values we will discuss in the next section.
Interpreting the Sums of Squares
The different sums-of-squares values are measures of data
variability, which makes them like the standard
deviation, variance measures, the standard error of the mean,
and so on. Also like the other measures of
variability, SS values can never be negative. But between SS
and the other statistics is an important difference. In
addition to data variability, the magnitude of the SS value
reflects the number of scores involved. Because sums
of squares are in fact the sum of squared values, the more
values there are, the larger the value becomes. With
statistics like the standard deviation, if more values are added
near the mean of the distribution, s actually
shrinks. This cannot happen with the sum of squares. Additional
scores can never decrease the sum-of-squares value, and any
score that does not fall exactly at the mean increases it.
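The contrast between s and SS can be seen with a small made-up data set. The scores below are hypothetical and chosen only for illustration; the sketch uses Python's standard `statistics` module, and `sum_of_squares` is an illustrative helper name.

```python
from statistics import mean, stdev


def sum_of_squares(scores):
    """Sum of squared deviations from the mean of the scores."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)


# Hypothetical scores with a mean of 5.
original = [2, 4, 6, 8]
# Two more scores added close to the mean; the mean stays at 5.
extended = original + [4.5, 5.5]

print(f"s:  {stdev(original):.3f} -> {stdev(extended):.3f}")
print(f"SS: {sum_of_squares(original):.2f} -> {sum_of_squares(extended):.2f}")
```

Adding the two scores near the mean makes the standard deviation smaller while the sum of squares becomes larger.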
The fact that large SS values can result from large amounts of
variability or relatively large numbers of scores
makes them difficult to interpret. The SS values become easier
to gauge if they become mean, or average,
variability measures. Fisher transformed sums-of-squares
variability measures into mean, or average, variability
measures by dividing each sum-of-squares value by its degrees
of freedom. The SS ÷ df operation creates what is
called the mean square (MS).
In the one-way ANOVA, an MS value is associated with both
the SSbet and the SSwith (SSerr). There is no mean-
squares total. Dividing the SStot by its degrees of freedom
provides a mean level of overall variability, but since
the analysis is based on how between-groups variability
compares to within-groups variance, mean total
variability would not be helpful.
The degrees of freedom for each of the sums of squares
calculated for the one-way ANOVA are as follows:
Though we do not calculate a mean measure of total variability,
degrees of freedom total allows us to
check the other df values for accuracy later; dftot is N − 1,
where N is the total number of scores.
Degrees of freedom for between (dfbet) is k − 1, where k is the
number of groups: SSbet ÷ dfbet = MSbet
Degrees of freedom for within (dfwith) is N − k, the total number of
scores minus the number of groups: SSwith ÷ dfwith = MSwith
a. The sums of squares between and within should equal total
sum of squares, as noted earlier:
SSbet + SSwith = SStot
b. Likewise, sum of degrees of freedom between and within
should equal degrees of freedom total:
dfbet + dfwith = dftot
The F Ratio
The mean squares for between and within groups are the
components of F, the test statistic in ANOVA:
Formula 6.4
F = MSbet/MSwith
This formula allows one to determine whether the average
treatment effect—MSbet—is substantially greater than
the average measure of error variance—MSwith. Figure 6.4
illustrates the F ratio, which compares the distance
from the mean of the first distribution to the mean of the second
distribution (the A variance) with the B and C
variances, which indicate the differences within groups.
If the MSbet / MSwith ratio is large—it must be substantially
greater than 1.0—the difference between groups is
likely to be significant. When that ratio is small, F is likely to
be nonsignificant. How large F must be to be
significant depends on the degrees of freedom for the problem,
just as it did for the t tests.
Figure 6.4: The F ratio: comparing variance between groups (A) to
variance within groups (B + C)
The F ratio compares the distance from the mean of the first
distribution to the mean of the second distribution (the A
variance) with the B and C variances, which indicate the differences
within groups.
The ANOVA Table
The results of ANOVA analysis are summarized in a table that
indicates
the source of the variance,
the sums-of-squares values,
the degrees of freedom,
the mean square values, and
F.
The total number of scores (N) is 12, so degrees of freedom
total (dftot) = N − 1 = 12 − 1 = 11. The number of
groups (k) is 3 and between degrees of freedom (dfbet) = k − 1,
so dfbet = 2. Within degrees of freedom (dfwith)
are N − k = 12 − 3 = 9.
Recall that MSbet = SSbet/dfbet and MSwith = SSwith/dfwith.
We do not calculate MStot. Table 6.3 shows the
ANOVA table for the social isolation problem.
Try It!: #5
If the F in an ANOVA is 4.0 and the MSwith =
2.0, what will be the value of MSbet?
Table 6.3: ANOVA table for social isolation problem
Source     SS       df   MS       F
Total      41.672   11
Between    33.168    2   16.584   17.551
Within      8.504    9    0.945
Verify that SSbet + SSwith = SStot, and dfbet + dfwith = dftot.
The smallest value an SS can have is 0, which occurs
if all scores have the same value. Otherwise, the SS and MS
values will always be positive.
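The entries in Table 6.3 can be rebuilt from the two sums of squares, the total number of scores, and the number of groups. The sketch below shows one way to do that; the function name `anova_table` and the dictionary keys are illustrative choices, not a standard interface.

```python
def anova_table(ss_between, ss_within, n_total, k_groups):
    """Build the one-way ANOVA table entries from the sums of squares."""
    df_between = k_groups - 1          # k - 1
    df_within = n_total - k_groups     # N - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "SS_total": ss_between + ss_within,
        "df_total": df_between + df_within,   # equals N - 1
        "df_between": df_between,
        "df_within": df_within,
        "MS_between": ms_between,
        "MS_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hand-calculated values from the social isolation example.
table = anova_table(33.168, 8.504, n_total=12, k_groups=3)
for name, value in table.items():
    print(f"{name:>10}: {value:.3f}")
```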
Understanding F
The larger F is, the more likely it is to be statistically
significant, but how large is large enough? In the ANOVA
table above, F = 17.551.

Because F is determined by dividing MSbet by MSwith,
the value of F indicates how many
times greater MSbet is than MSwith. Here, MSbet is
17.551 times greater than MSwith, which seems
promising; to be sure, however, it must be compared to a value
from the critical values of F (Table 6.4; Table B.3
in Appendix B).
As with the t test, as degrees of freedom increase, the critical
values decline. The difference between t and F is
that F has two df values, one for the MSbet, the other for the
MSwith. In Table 6.4, the critical value is at the
intersection of dfbet across the top of the table and dfwith down
the left side. For the social isolation problem,
these are 2 (k − 1) across the top and 9 (N − k) down the left
side.
The value in regular type at the intersection of 2 and 9 is 4.26
and is the critical value when testing at p = 0.05.
The value in bold type is for testing at p = 0.01.
The critical value indicates that any ANOVA test with 2 and 9
df that has an F value equal to or greater
than 4.26 is statistically significant.
The social isolation differences among the three groups are
probably not due to sampling variability.
The statistical decision is to reject H0.
The relatively large value of F—it is more than four times the
critical value—indicates that the differences in
social isolation are affected by where respondents live. The
amount of within-group variability, the error
variance, is small relative to the treatment effect.
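The decision rule itself is a single comparison. The sketch below encodes it for the social isolation problem only; the dictionary holds just the two critical values quoted above for 2 and 9 degrees of freedom, and the names `CRITICAL_F` and `is_significant` are illustrative.

```python
# Critical values of F for (df numerator, df denominator) = (2, 9),
# copied from the table of critical values: 4.26 at .05 and 8.02 at .01.
CRITICAL_F = {(2, 9): {0.05: 4.26, 0.01: 8.02}}


def is_significant(f_value, df_num, df_den, alpha=0.05):
    """Reject H0 when the calculated F equals or exceeds the critical value."""
    return f_value >= CRITICAL_F[(df_num, df_den)][alpha]


f_value = 17.551
print(is_significant(f_value, 2, 9))                # tested at .05
print(is_significant(f_value, 2, 9, alpha=0.01))    # tested at .01
print(f"{f_value / CRITICAL_F[(2, 9)][0.05]:.2f} times the .05 critical value")
```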
Table 6.4 provides the critical values of F for a
variety of research scenarios. When computer
software completes ANOVA, the answer it generates
typically provides the exact probability that a
specified value of F could have occurred by chance.
Using the most common standard, when that
probability is 0.05 or less, the result is statistically
significant. Performing calculations by hand without
statistical software, however, requires the additional
step of comparing F to the critical value to determine
statistical significance. When the calculated value is the same
as, or larger than, the table value, it is statistically
significant.
Table 6.4: The critical values of F
Rows: df denominator. Columns: df numerator.
df den.   p        1      2      3      4      5      6      7      8      9     10
 2       .05   18.51  19.00  19.16  19.25  19.30  19.33  19.35  19.37  19.38  19.40
         .01   98.49  99.01  99.17  99.25  99.30  99.33  99.36  99.38  99.39  99.40
 3       .05   10.13   9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79
         .01   34.12  30.82  29.46  28.71  28.24  27.67  27.49  27.49  27.34  27.23
 4       .05    7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96
         .01   21.20  18.00  16.69  15.98  15.52  15.21  14.98  14.80  14.66  14.55
 5       .05    6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74
         .01   16.26  13.27  12.06  11.39  10.97  10.67  10.46  10.29  10.16  10.05
 6       .05    5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06
         .01   13.75  10.92   9.78   9.15   8.75   8.47   8.26   8.10   7.98   7.87
 7       .05    5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64
         .01   12.25   9.55   8.45   7.85   7.46   7.19   6.99   6.72   6.72   6.62
 8       .05    5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.64
         .01   11.26   8.65   7.59   7.01   6.63   6.37   6.18   6.03   5.91   6.62
 9       .05    5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14
         .01   10.56   8.02   6.99   6.42   6.06   5.80   5.61   5.47   5.35   5.26
10       .05    4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98
         .01   10.04   7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
11       .05    4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85
         .01    9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
12       .05    4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75
         .01    9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
13       .05    4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67
         .01    9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
14       .05    4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60
         .01    8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
15       .05    4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54
         .01    8.68   6.36   5.24   4.89   4.56   4.32   4.14   4.00   3.89   3.80
16       .05    4.49   3.63   3.24   3.01   2.85   2.74   2.66   2.59   2.54   2.49
         .01    8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
17       .05    4.45   3.59   3.20   2.96   2.81   2.70   2.61   2.55   2.49   2.45
         .01    8.40   6.11   5.19   4.67   4.34   4.10   3.93   3.79   3.68   3.59
18       .05    4.41   3.55   3.16   2.93   2.77   2.66   2.58   2.51   2.46   2.41
         .01    8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
19       .05    4.38   3.52   3.13   2.90   2.74   2.63   2.54   2.48   2.42   2.38
         .01    8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
20       .05    4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35
         .01    8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
21       .05    4.32   3.47   3.07   2.84   2.68   2.57   2.49   2.42   2.37   2.32
         .01    8.02   5.78   4.87   4.37   4.04   3.81   3.64   3.51   3.40   3.31
22       .05    4.30   3.44   3.05   2.82   2.66   2.55   2.46   2.40   2.34   2.30
         .01    7.95   5.72   4.82   4.31   3.99   3.76   3.59   3.45   3.35   3.26
23       .05    4.28   3.42   3.03   2.80   2.64   2.53   2.44   2.37   2.32   2.27
         .01    7.88   5.66   4.76   4.26   3.94   3.71   3.54   3.41   3.30   3.21
24       .05    4.26   3.40   3.01   2.78   2.62   2.51   2.42   2.36   2.30   2.25
         .01    7.82   5.61   4.72   4.22   3.90   3.67   3.50   3.36   3.26   3.17
25       .05    4.24   3.39   2.99   2.76   2.60   2.49   2.40   2.34   2.28   2.24
         .01    7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
26       .05    4.21   3.35   2.96   2.74   2.59   2.47   2.39   2.32   2.27   2.22
         .01    7.68   5.49   4.60   4.14   3.82   3.59   3.42   3.29   3.18   3.09
27       .05    4.21   3.35   2.96   2.73   2.57   2.46   2.37   2.31   2.25   2.20
         .01    7.68   5.49   4.60   4.11   3.78   3.56   3.39   3.26   3.15   3.06
28       .05    4.20   3.34   2.95   2.71   2.56   2.45   2.36   2.29   2.24   2.19
         .01    7.64   5.45   4.57   4.07   3.75   3.53   3.36   3.23   3.12   3.03
29       .05    4.18   3.33   2.93   2.70   2.55   2.43   2.35   2.28   2.22   2.18
         .01    7.60   5.42   4.54   4.04   3.73   3.50   3.33   3.20   3.09   3.00
30       .05    4.17   3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16
         .01    7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
For each df denominator, the first row (regular type in the original table) gives the critical value for p = .05;
the second row (bold type in the original table) gives the critical value for p = .01.
Source: Critical values of F. (n.d.). Retrieved from
http://faculty.vassar.edu/lowry/apx_d.html
(http://faculty.vassar.edu/lowry/apx_d.html)
http://faculty.vassar.edu/lowry/apx_d.html
6.2 Locating the Difference: Post Hoc Tests and Honestly Significant Difference (HSD)
When a t test is statistically significant, only one explanation of
the difference is possible: the first group probably belongs to a
different population than the second group. Things are not so simple
when there are more than two groups. A significant F indicates that
at least one group is significantly different from at least one other
group in the study, but unless the ANOVA considers only two groups,
the significant result could arise from any of several patterns of
differences, as we noted when we listed all the possible HA outcomes
earlier.
The point of a post hoc test, an “after this” test conducted
following an ANOVA, is to determine which groups
are significantly different from which. When F is significant, a
post hoc test is the next step.
There are many post hoc tests. Each of them has particular
strengths, but one of the more common, and also one
of the easier to calculate, is one John Tukey developed called
HSD, for “honestly significant difference.”
Formula 6.5 produces a value that is the smallest difference
between the means of any two samples that can be
statistically significant:
Formula 6.5
$HSD = x\sqrt{\dfrac{MS_{with}}{n}}$
where
x = a table value indexed to the number of groups (k) in the
problem and the degrees of
freedom within (dfwith) from the ANOVA table
MSwith = the value from the ANOVA table
n = the number in any group when the group sizes are equal
As long as the number in all samples is the same, the value from
Formula 6.5 will indicate the minimum difference between the means
of any two groups that can be statistically significant. An alternate
formula for HSD may be used when group sizes are unequal:
Formula 6.6
$HSD_{(n_1, n_2)} = x\sqrt{\dfrac{MS_{with}}{2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
The notation in this formula indicates that the HSD value is for
the group-1-to-group-2 comparison (n1, n2). When sample sizes are
unequal, a separate HSD value must be computed for each pair of
sample means in the problem.
To compute HSD for equal sample sizes, follow these steps:
1. From Table 6.5, locate the value of x by moving across the
top of the table to the number of
groups/treatments (k = 3), and then down the left side for the
within degrees of freedom (dfwith = 9). The
intersecting values for 3 and 9 are 3.95 and 5.43. The smaller of
the two is the value when p = 0.05. The
post hoc test is always conducted at the same probability level
as the ANOVA, p = 0.05 in this case.
2. The calculation is 3.95 times the result of the square root of
0.945 (the MSwith) divided by 4 (n):
$HSD = 3.95\sqrt{0.945 / 4} = 1.92$
This value is the minimum absolute difference between two sample
means that can be statistically significant. The means for social
isolation in the three groups are as follows:
Ma = 3.50 for small town respondents
Mb = 6.750 for suburban respondents
Mc = 7.250 for city respondents
To compare small towns to suburbs, the procedure is as follows:
Ma − Mb = 3.50 − 6.75 = −3.25.
The absolute value of this difference exceeds 1.92, so it is
significant.
To compare small towns to cities, note that
Ma − Mc = 3.50 − 7.25 = −3.75.
The absolute value of this difference exceeds 1.92, so it is
significant.
To compare suburbs to cities,
Mb − Mc = 6.75 − 7.25 = −0.50.
The absolute value of this difference is less than 1.92, so it is
not significant.
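The HSD arithmetic for the social isolation problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names hsd_equal_n and hsd_unequal_n are hypothetical, the table value x = 3.95 is copied from Table 6.5 rather than computed, and the unequal-n function assumes the usual Tukey-Kramer form for Formula 6.6.

```python
import math

def hsd_equal_n(x, ms_within, n):
    """Formula 6.5: smallest significant difference when every group has n scores."""
    return x * math.sqrt(ms_within / n)

def hsd_unequal_n(x, ms_within, n1, n2):
    """Formula 6.6 (assumed Tukey-Kramer form) for one pair of unequal groups."""
    return x * math.sqrt((ms_within / 2) * (1 / n1 + 1 / n2))

# Social isolation example: k = 3 groups, df within = 9, p = 0.05.
x = 3.95            # table value copied from Table 6.5
ms_within = 0.945   # MSwith from the ANOVA table
n = 4               # number of scores in each group

hsd = hsd_equal_n(x, ms_within, n)
means = {"small town": 3.50, "suburb": 6.75, "city": 7.25}

# Compare the absolute difference of each pair of means with HSD.
results = {}
for a, b in [("small town", "suburb"), ("small town", "city"), ("suburb", "city")]:
    diff = abs(means[a] - means[b])
    results[(a, b)] = diff >= hsd
    print(f"{a} vs {b}: |diff| = {diff:.2f}, significant = {results[(a, b)]}")
print(f"HSD = {hsd:.2f}")
```

When the two group sizes passed to hsd_unequal_n are equal, it returns the same value as hsd_equal_n, which is a quick way to check that the two formulas agree.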
When several groups are involved, sometimes it is helpful to
create a table that presents all the differences
between pairs of means. Table 6.6 repeats the HSD results for
the social isolation problem.
Table 6.5: Tukey’s HSD critical values: q (alpha, k, df)
Columns give k, the number of treatments (2 to 10); rows give df.
df  alpha      2      3      4      5      6      7      8      9     10
5   0.05    3.64   4.60   5.22   5.67   6.03   6.33   6.58   6.80   6.99
5   0.01    5.70   6.98   7.80   8.42   8.91   9.32   9.67   9.97  10.24
6   0.05    3.46   4.34   4.90   5.30   5.63   5.90   6.12   6.32   6.49
6   0.01    5.24   6.33   7.03   7.56   7.97   8.32   8.61   8.87   9.10
7   0.05    3.34   4.16   4.68   5.06   5.36   5.61   5.82   6.00   6.16
7   0.01    4.95   5.92   6.54   7.01   7.37   7.68   7.94   8.17   8.37
8   0.05    3.26   4.04   4.53   4.89   5.17   5.40   5.60   5.77   5.92
8   0.01    4.75   5.64   6.20   6.62   6.96   7.24   7.47   7.68   7.86
9   0.05    3.20   3.95   4.41   4.76   5.02   5.24   5.43   5.59   5.74
9   0.01    4.60   5.43   5.96   6.35   6.66   6.91   7.13   7.33   7.49
10  0.05    3.15   3.88   4.33   4.65   4.91   5.12   5.30   5.46   5.60
10  0.01    4.48   5.27   5.77   6.14   6.43   6.67   6.87   7.05   7.21
11  0.05    3.11   3.82   4.26   4.57   4.82   5.03   5.20   5.35   5.49
11  0.01    4.39   5.15   5.62   5.97   6.25   6.48   6.67   6.84   6.99
12  0.05    3.08   3.77   4.20   4.51   4.75   4.95   5.12   5.27   5.39
12  0.01    4.32   5.05   5.50   5.84   6.10   6.32   6.51   6.67   6.81
13  0.05    3.06   3.73   4.15   4.45   4.69   4.88   5.05   5.19   5.32
13  0.01    4.26   4.96   5.40   5.73   5.98   6.19   6.37   6.53   6.67
14  0.05    3.03   3.70   4.11   4.41   4.64   4.83   4.99   5.13   5.25
14  0.01    4.21   4.89   5.32   5.63   5.88   6.08   6.26   6.41   6.54
15  0.05    3.01   3.67   4.08   4.37   4.59   4.78   4.94   5.08   5.20
15  0.01    4.17   4.84   5.25   5.56   5.80   5.99   6.16   6.31   6.44
16  0.05    3.00   3.65   4.05   4.33   4.56   4.74   4.90   5.03   5.15
16  0.01    4.13   4.79   5.19   5.49   5.72   5.92   6.08   6.22   6.35
17  0.05    2.98   3.63   4.01   4.30   4.52   4.70   4.86   4.99   5.11
17  0.01    4.10   4.74   5.14   5.43   5.66   5.85   6.01   6.15   6.27
18  0.05    2.97   3.61   4.00   4.28   4.49   4.67   4.82   4.96   5.07
18  0.01    4.07   4.70   5.09   5.38   5.60   5.79   5.94   6.08   6.20
19  0.05    2.96   3.59   3.98   4.25   4.47   4.65   4.79   4.92   5.04
19  0.01    4.05   4.67   5.05   5.33   5.55   5.73   5.89   6.02   6.14
20  0.05    2.95   3.58   3.96   4.23   4.45   4.62   4.77   4.90   5.01
20  0.01    4.02   4.64   5.02   5.29   5.51   5.69   5.84   5.97   6.09
24  0.05    2.92   3.53   3.90   4.17   4.37   4.54   4.68   4.81   4.92
24  0.01    3.96   4.55   4.91   5.17   5.37   5.54   5.69   5.81   5.92
30  0.05    2.89   3.49   3.85   4.10   4.30   4.46   4.60   4.72   4.82
30  0.01    3.89   4.45   4.80   5.05   5.24   5.40   5.54   5.65   5.76
40  0.05    2.86   3.44   3.79   4.04   4.23   4.39   4.52   4.63   4.73
40  0.01    3.82   4.37   4.70   4.93   5.11   5.26   5.39   5.50   5.60
*The critical values for q corresponding to alpha = 0.05 (top)
and alpha = 0.01 (bottom)
Source: Tukey’s HSD critical values (n.d.). Retrieved from
http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html
Table 6.6: Presenting Tukey’s HSD results in a table
Any difference between pairs of means 1.920 or greater is a
statistically significant difference.
Group                    Small towns    Suburbs        Cities
(mean)                   M = 3.500      M = 6.750      M = 7.250
Small towns, M = 3.500   -              Diff = 3.250   Diff = 3.750
Suburbs, M = 6.750       -              -              Diff = 0.500
Cities, M = 7.250        -              -              -
The mean differences of 3.250 and 3.750 are statistically
significant.
The values in the cells in Table 6.6 indicate the results of the
post hoc test for differences between each pair of
means in the study. Results indicate that the respondents from
small towns expressed a significantly lower level
of social isolation than those in either the suburbs or cities.
Results from the suburban and city groups indicate
that social isolation scores are higher in the city than in the
suburbs, but the difference is not large enough to be
statistically significant.
Using Excel to complete ANOVA makes it easier to calculate the means,
differences, and other values of data from studies such as the level
of optimism indicated by people in different vocations during a
recession.
6.3 Completing ANOVA with Excel
The ANOVA by longhand involves enough calculated means,
subtractions, squaring of differences, and so on that letting
Excel do the ANOVA work can be very helpful. Consider the
following example: A researcher is comparing the level of
optimism indicated by people in different vocations during an
economic recession. The data are from laborers, clerical staff
in professional offices, and the professionals in those offices.
The optimism scores for the individuals in the three groups are
as follows:
Laborers: 33, 35, 38, 39, 42, 44, 44, 47, 50, 52
Clerical staff: 27, 36, 37, 37, 39, 39, 41, 42, 45, 46
Professionals: 22, 24, 25, 27, 28, 28, 29, 31, 33, 34
1. First create the data file in Excel. Enter “Laborers,”
“Clerical staff,” and “Professionals” in cells A1, B1,
and C1 respectively.
2. In the columns below those labels, enter the optimism scores,
beginning in cell A2 for the laborers, B2
for the clerical workers, and C2 for the professionals. After
entering the data and checking for accuracy,
proceed with the following steps.
3. Click the Data tab at the top of the page.
4. On the far right, choose Data Analysis.
5. In the Analysis Tools window, select ANOVA Single Factor
and click OK.
6. Indicate where the data are located in the Input Range. In the
example here, the range is A2:C11.
7. Note that the default setting is “Grouped by Columns.” If the
data are arrayed along rows instead of
columns, change the setting. Because we designated A2 instead
of A1 as the point where the data begin,
there is no need to indicate that labels are in the first row.
8. Select Output Range and enter a cell location where you wish
the display of the output to begin. In the
example in Figure 6.5, the output results are located in A13.
9. Click OK.
Widen column A to make the output easier to read. The result
resembles the screenshot in Figure 6.5.
Figure 6.5: ANOVA in Excel
Results of ANOVA performed using Excel
Source: Microsoft Excel. Used with permission from Microsoft.
Results appear in two tables. The first provides descriptive
statistics. The second table looks like the longhand table we created
earlier, except that the column titled “P-value” indicates the
probability that an F of this magnitude or larger could have occurred
by chance if the null hypothesis were true.
Note that the P-value is 4.31E-06. The “E-06” is scientific
notation, a shorthand way of indicating that the actual value is
p = 0.00000431, or 4.31 with the decimal point moved 6 places to the
left. This probability is far below 0.05, so the result easily meets
the p = 0.05 standard for statistical significance.
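For readers who want to check the Excel output by hand, the same sums of squares can be reproduced with a short Python sketch. It is a minimal illustration, not the procedure Excel uses internally: the function name one_way_anova is hypothetical, and the sketch stops at F rather than computing the P-value, which requires the F distribution (a library routine such as scipy.stats.f_oneway reports it directly).

```python
def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA table for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Between: squared distance of each group mean from the grand mean, weighted by n.
    ss_bet = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within: squared distance of each score from its own group mean.
    ss_with = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_bet = len(groups) - 1
    df_with = len(all_scores) - len(groups)
    ms_bet = ss_bet / df_bet
    ms_with = ss_with / df_with
    return {"SS_bet": ss_bet, "SS_with": ss_with, "SS_tot": ss_bet + ss_with,
            "df_bet": df_bet, "df_with": df_with,
            "MS_bet": ms_bet, "MS_with": ms_with, "F": ms_bet / ms_with}

# Optimism scores from the example above.
laborers = [33, 35, 38, 39, 42, 44, 44, 47, 50, 52]
clerical = [27, 36, 37, 37, 39, 39, 41, 42, 45, 46]
professionals = [22, 24, 25, 27, 28, 28, 29, 31, 33, 34]

table = one_way_anova([laborers, clerical, professionals])
for name, value in table.items():
    print(f"{name}: {value:.4f}")

# Excel's scientific notation parses directly as a number.
p_value = float("4.31E-06")
print(f"P-value written out: {p_value:.8f}")
```

Because SStot is built here as SSbet plus SSwith, comparing it with a direct calculation from the grand mean is a useful check on the arithmetic.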
Apply It!
Analysis of Variance and Problem-Solving Ability
A psychological services organization is interested in how long
a group of randomly selected university
graduates will persist in a series of cognitive tasks they are
asked to complete when the environment is
varied. Forty graduate students are recruited from a state
university and told that they are to evaluate the
effectiveness of a series of spatial relations tasks that may be
included in a test of academic aptitude. The
students are asked to complete a series of tasks, after which
they will be asked to evaluate the tasks. What
is actually being measured is how long subjects will persist in
these tasks when environmental conditions
vary. Group 1’s treatment is recorded hip-hop in the background.
Group 2 performs tasks with a newscast in the background. Group 3 has
classical music in the background, and Group 4 experiences a no-noise
environment. The dependent variable is how many minutes subjects
persist before stopping to take a break. Table 6.7 displays the
measured results.
Table 6.7: Results of task persistence under varied background
conditions
1: Hip-hop 2: Newscast 3: Classical music 4: No noise
49 57 77 65
57 53 82 61
73 69 77 73
68 65 85 81
65 61 93 89
62 73 79 77
61 57 73 81
45 69 89 77
53 73 82 69
61 77 85 77
Next, the test results are analyzed in Excel, which produces the
information displayed in Table 6.8.
Table 6.8: Excel analysis of task persistence results

Summary
Group Count Sum Average Variance
1: Hip-hop 10 594 59.4 73.82
2: Newscast 10 654 65.4 65.60
3: Classical music 10 822 82.2 36.40
4: No noise 10 750 75.0 68.44
ANOVA
Source of variation SS df MS F P-value Fcrit
Between groups 3063.6 3 1021.1 16.72 5.71E-07 2.87
Within groups 2198.4 36 61.07
Total 5262.0 39
The research organization first asks: Is there a significant
difference? The null hypothesis states that there is no difference in
how long respondents persist, that the background differences are
unrelated to persistence. The calculated value from the Excel
procedure is F = 16.72. That value is larger than the critical value
of F0.05(3,36) = 2.87, so the null hypothesis is rejected. Those in at
least one of the groups work a significantly different amount of time
before stopping than those in at least one other group.
The significant F prompts a second question: Which group(s)
is/are significantly different from which
other(s)? Answering that question requires the post hoc test.
x = 3.81 (based on k = 4, dfwith = 36, and p = 0.05)
MSwith = 61.07, the value from the ANOVA table
n = 10, the number in one group when group sizes are equal
$HSD = 3.81\sqrt{61.07 / 10} = 9.42$
This value is the minimum difference between the means of two
significantly different samples. The
difference in means between the groups appears below:
A − B = −6.0
A − C = −22.8
A − D = −15.6
B − C = −16.8
B − D = −9.6
C − D = 7.2
Table 6.9 makes these differences a little easier to interpret.
The in-cell values are the differences between the respective pairs
of means:
Table 6.9: Mean differences between pairs of groups in task
persistence
Group                           A. Hip-hop   B. Newscast   C. Classical music   D. No noise
(mean)                          M1 = 59.4    M2 = 65.4     M3 = 82.2            M4 = 75.0
1: Hip-hop, M1 = 59.4           -            6.0           22.8                 15.6
2: Newscast, M2 = 65.4          -            -             16.8                 9.6
3: Classical music, M3 = 82.2   -            -             -                    7.2
4: No noise, M4 = 75.0          -            -             -                    -
The differences in the amount of time respondents work before
stopping to rest are not significant between environments A and B or
between C and D; the absolute values of those differences do not
exceed the HSD value of 9.42. The other four comparisons (A to C, A to
D, B to C, and B to D) are all statistically significant.
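The pairwise comparisons just described can be generated mechanically. The following Python sketch is an illustration only: it takes the four group means and the HSD value of 9.42 as given, and the variable names are hypothetical.

```python
from itertools import combinations

# Group means from the Excel summary and the HSD value computed in the text.
means = {
    "A. Hip-hop": 59.4,
    "B. Newscast": 65.4,
    "C. Classical music": 82.2,
    "D. No noise": 75.0,
}
hsd = 9.42

significant = []
not_significant = []
# combinations() yields every pair of groups exactly once.
for (name1, m1), (name2, m2) in combinations(means.items(), 2):
    diff = abs(m1 - m2)
    target = significant if diff >= hsd else not_significant
    target.append((name1, name2, round(diff, 1)))

print("Significant:", significant)
print("Not significant:", not_significant)
```

With k groups there are k(k − 1)/2 pairs, so four groups produce the six comparisons listed above.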
The data indicate that those with hip-hop as background noise
tended to work the least amount of time
before stopping, and those with the classical music background
persisted the longest, but that much
would have been evident from just the mean scores. The one-
way ANOVA completed with Excel
indicates that at least some of the differences are statistically
significant, rather than random; the type of
background noise is associated with consistent differences in
work-time. The post hoc test makes it clear that two comparisons
show no significant difference: between classical music and no
background sound, and between hip-hop and the newscast.
Apply It! boxes written by Shawn Murphy
Try It!: #6
If the F in ANOVA is not significant, should the
post hoc test be completed?
In a study of social isolation
based on where people live (i.e.,
the respondents’ location, such as
a busy city) what is the
independent variable (IV)? What
is the dependent variable (DV)?
6.4 Determining the Practical Importance of Results
Potentially, three central questions could be associated with an
analysis of variance. Whether questions 2 and 3 are addressed depends
upon the answer to question 1:
1. Are any of the differences statistically significant? The answer
depends upon how the calculated F value compares to the critical
value from the table.
2. If the F is significant, which groups are significantly different
from each other? That question is answered by a post hoc test such
as Tukey’s HSD.
3. If F is significant, how important is the result? The question is
answered by an effect-size calculation.
If F is not statistically significant, questions 2 and 3 are
nonissues.
After addressing the first two questions, we now turn our attention
to the third question, effect size. With the t test in Chapter 5,
omega-squared answered the question about how important the result
was. There are similar measures for analysis of variance, and in
fact, several effect-size statistics have been used to explain the
importance of a significant ANOVA result. Omega-squared (ω²) and
partial eta-squared (η²) (where the Greek letter eta [η] is
pronounced like “ate a” as in “ate a grape”) are both quite common in
social-science research literature. Both effect-size statistics are
demonstrated here: omega-squared to be consistent with Chapter 5,
and eta-squared because it is easy to calculate and quite common in
the literature. Both statistics answer the same question: Because
some of the variance in scores is unexplained, in other words error
variance, how much of the score variance can be attributed to the
independent variable which, in this recent example, is the background
environment? The difference between the statistics is that
omega-squared answers the question for the population of all such
problems, while the eta-squared result is specific to the particular
data set.
In the social isolation problem, the question was whether residents
of small towns, suburban areas, and cities differ in their measures
of social isolation. The respondents’ location is the IV. Eta-squared
estimates how much of the difference in social isolation is related
to where respondents live.
The η2 calculation involves only two values, both retrievable
from the
ANOVA table. Formula 6.7 shows the eta-squared calculation:
Formula 6.7
$\eta^2 = \dfrac{SS_{bet}}{SS_{tot}}$
The formula indicates that eta-squared is the ratio of between-
groups variability to total variability. If there were
no error variance, all variance would be due to the independent
variable, and the sums of squares for between-
groups variability and for total variability would have the same
values; the effect size would be 1.0. With human
subjects, this effect-size result never happens because scores
always fluctuate for reasons other than the IV, but it
is important to know that 1.0 is the upper limit for this effect
size and for omega-squared as well. The lower limit
is 0, of course—none of the variance is explained. But we also
never see eta-squared values of 0 because the
only time the effect size is calculated is when F is significant,
and that can only happen when the effect of the IV
is great enough that the ratio of MSbet to MSwith exceeds the
critical value; some variance will always be
explained.
For the social isolation problem, SSbet = 33.168 and SStot =
41.672, so
$\eta^2 = 33.168 / 41.672 = 0.796$
According to these data, about 80% of the variance in social
isolation scores relates to whether the respondent
lives in a small town, a suburb, or a city. Note that this amount
of variance is unrealistically high, which can
happen when numbers are contrived.
Omega-squared takes a slightly more conservative approach to effect
sizes and will always have a lower value than eta-squared. The
formula for omega-squared is:
Formula 6.8
$\omega^2 = \dfrac{SS_{bet} - (df_{bet})(MS_{with})}{SS_{tot} + MS_{with}}$
Compared to η2, the numerator is reduced by the value of the df
between times MSwith, and the denominator is increased by MSwith, so
that it becomes SStot plus MSwith. The error term plays a more
prominent part in this effect size than in η2, thus the more
conservative value. Completing the calculations for ω2 yields the
following:
The omega-squared value indicates that about 69% of the
variability in social isolation can be explained by where the subject
lives. This value is about 10 percentage points lower than the
eta-squared value. The advantage to using omega-squared is that the
researcher can say, “in all situations
where social isolation is studied as a function of
where the subject lives, the location of the subject’s home will
explain about 69% of the variance.” On the other
hand, when using eta-squared, the researcher is limited to
saying, “in this instance, the location of the subject’s
home explained about 79% of the variance in social isolation.”
Those statements indicate the difference between
being able to generalize compared to being restricted to the
present situation.
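Both effect sizes need only values that already appear in an ANOVA table, so they are easy to sketch in Python. The example below is an illustration under stated assumptions: the function names are hypothetical, omega-squared follows the verbal description of Formula 6.8 given above, and the inputs are taken from the task-persistence ANOVA in Table 6.8 because that table prints all four required values.

```python
def eta_squared(ss_bet, ss_tot):
    """Formula 6.7: proportion of total variability that lies between groups."""
    return ss_bet / ss_tot

def omega_squared(ss_bet, ss_tot, df_bet, ms_with):
    """Formula 6.8 as described in the text: a more conservative estimate."""
    return (ss_bet - df_bet * ms_with) / (ss_tot + ms_with)

# Values from the task-persistence ANOVA (Table 6.8).
ss_bet = 3063.6
ss_tot = 5262.0
df_bet = 3
ms_with = 61.07

eta2 = eta_squared(ss_bet, ss_tot)
omega2 = omega_squared(ss_bet, ss_tot, df_bet, ms_with)
print(f"eta-squared = {eta2:.3f}, omega-squared = {omega2:.3f}")
```

As the text leads one to expect, the omega-squared result is the smaller of the two, because the error term is subtracted from the numerator and added to the denominator.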
Apply It!
Using ANOVA to Test Effectiveness
A researcher is interested in the relative impact that tangible
reinforcers and verbal reinforcers have on behavior. The researcher,
who describes the study only as an examination of human behavior,
solicits the help of university students. The researcher makes a
series of presentations on the growth of the psychological sciences
with an invitation to listeners to ask questions or make comments
whenever they wish. The three levels of the independent variable are
as follows:
1. no response to students’ interjections, except to answer
their questions
2. a tangible reinforcer—a small piece of candy—offered after
each comment/question
3. verbal praise offered for each verbal interjection
The volunteers are randomly divided into three groups of eight
each and asked to report for the
presentations, to which students are invited to respond. Note
that there are three independent groups:
Those who participate are members of only one group. The
three options described represent the three
levels of a single independent variable, the presenter’s response
to comments or questions by the
subjects. The dependent variable is the number of interjections
by subjects over the course of the
presentations.
The null hypothesis (H0: µ1 = µ2 = µ3) maintains that response
rates will not vary from group to group,
that in terms of verbal comments, the three groups belong to the
same population. The alternate
hypothesis (HA: not so) maintains that non-random differences
will occur between groups—that, as a
result of the treatment, at least one group will belong to some
other population of responders.
Each subject’s number of responses during the experiment is
indicated in Table 6.10.
Table 6.10: Number of responses given three different levels of
reinforcer
No response Tangible reinforcers Verbal reinforcers
14 18 13
13 15 15
19 16 16
18 18 15
15 17 14
16 13 17
12 17 13
12 18 16
Completing the analysis with Excel yields the following
summary (Table 6.11), with descriptive statistics
first:
Table 6.11: Summary of Excel analysis for the reinforcer study
Group Count Sum Average Variance
No Response 8 119 14.875 6.982143
Tangible Reinf. 8 132 16.500 3.142857
Verbal Reinf. 8 119 14.875 2.125000
ANOVA
Source of variation SS df MS F P-value Fcrit
Between groups 14.0833333 2 7.041666667 1.72449 0.202565 3.4668
Within groups 85.75 21 4.083333333
With F = 1.72, a value less than the critical value of
F0.05(2,21) = 3.47, the results are not statistically significant.
The statistical decision is to “fail to reject” H0. Note that the p
value reported in the results is the probability that an F at least
this large could have occurred by chance if the null hypothesis were
true. In this instance, there is a 0.20 probability (1 chance in 5)
that an F value this large (1.72) could occur by chance in a
population of responders. That p value would need to be p ≤ 0.05 in
order for the value of F to be statistically significant. There are
differences between the groups, certainly, but those differences are
small enough to be explained by sampling variability rather than by
the effect of the independent variable.
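The decision rule used here, comparing the calculated F with the critical value, can be written as a small Python sketch. The function name f_decision is hypothetical, and the critical value is simply the Fcrit figure reported in the Excel output rather than something the code looks up.

```python
def f_decision(ss_bet, df_bet, ss_with, df_with, f_crit):
    """Build F from the SS and df columns and compare it with the critical value."""
    ms_bet = ss_bet / df_bet
    ms_with = ss_with / df_with
    f_value = ms_bet / ms_with
    # A calculated F equal to or larger than the table value is significant.
    decision = "reject H0" if f_value >= f_crit else "fail to reject H0"
    return f_value, decision

# Reinforcer study (Table 6.11); 3.4668 is the Fcrit value reported by Excel.
f_value, decision = f_decision(14.0833333, 2, 85.75, 21, 3.4668)
print(f"F = {f_value:.2f}: {decision}")
```

Because the decision here is to fail to reject H0, neither the post hoc test nor an effect-size calculation would follow.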
Apply It! boxes written by Shawn Murphy

6.5 Conditions for the One-Way ANOVA
As we saw with the t tests, any statistical test requires that
certain conditions be met. The conditions might
include characteristics such as the scale of the data, the way the
data are distributed, the relationships between
the groups in the analysis, and so on. In the case of the one-way
ANOVA, the name indicates one of the
conditions. Conditions for the one-way ANOVA include the
following:
The one-way ANOVA test can accommodate just one independent
variable. That one variable can have any number of categories, but
the analysis can include only one IV. In the example of rural,
suburban, and city isolation, the IV was the location of the
respondents’ residence. We might have added more categories, such as
rural, semirural, small town, large town, suburbs of small cities,
suburbs of large cities, and so on (all of which relate to the
respondents’ residence), but as with the independent t test, we
cannot add another variable, such as the respondents’ gender, in a
one-way ANOVA.
The categories of the IV must be independent.
The groups involved must be independent. Those who are
members of one group cannot also be
members of another group involved in the same analysis.
The IV must be nominal scale. Because the IV must be nominal scale,
sometimes data of some other scale are reduced to categorical data to
complete the analysis. If someone wants to know whether differences
in social isolation are related to age, age must be changed from
ratio to nominal data prior to the analysis. Rather than using each
person’s age in years as the independent variable, ages are grouped
into categories such as 20s, 30s, and so on. Grouping by category is
not ideal, because by reducing ratio data to nominal or even ordinal
scale, the differences in social isolation between 20- and
29-year-olds, for example, are lost.
The DV must be interval or ratio scale. Technically, social
isolation would need to be measured with
something like the number of verbal exchanges that a subject
has daily with neighbors or co-workers,
rather than using a scale of 1–10 to indicate the level of
isolation, which is probably an example of
ordinal data.
The groups in the analysis must be similarly distributed, that is,
showing homogeneity of variance, a
concept discussed in Chapter 5. It means that the groups should
all have reasonably similar standard
deviations, for example.
Finally, using ANOVA assumes that the samples are drawn from
a normally distributed population.
To meet all these conditions may seem difficult. Keep in mind,
however, that normality and homogeneity of
variance in particular represent ideals more than practical
necessities. As it turns out, Fisher’s procedure can
tolerate a certain amount of deviation from these requirements,
which is to say that this test is quite robust. In
extreme cases, for example, when calculated skewness or
kurtosis values reach ±2.0, ANOVA would probably
be inappropriate. Absent that, the researcher can probably
safely proceed.
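A rough screening of these conditions can be sketched in Python. This is only one possible check, offered under stated assumptions: the functions use simple moment-based formulas for skewness and kurtosis (spreadsheet and statistics packages often apply small-sample adjustments and will report slightly different values), the function names are hypothetical, and the optimism scores from the Excel example serve as sample data.

```python
import statistics

def skewness(scores):
    """Moment-based skewness: 0 for a perfectly symmetric set of scores."""
    m = statistics.fmean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / len(scores)
    m3 = sum((x - m) ** 3 for x in scores) / len(scores)
    return m3 / m2 ** 1.5

def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    m = statistics.fmean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / len(scores)
    m4 = sum((x - m) ** 4 for x in scores) / len(scores)
    return m4 / m2 ** 2 - 3

groups = {
    "Laborers": [33, 35, 38, 39, 42, 44, 44, 47, 50, 52],
    "Clerical staff": [27, 36, 37, 37, 39, 39, 41, 42, 45, 46],
    "Professionals": [22, 24, 25, 27, 28, 28, 29, 31, 33, 34],
}

for name, scores in groups.items():
    sd = statistics.stdev(scores)  # sample standard deviation
    print(f"{name}: SD = {sd:.2f}, skew = {skewness(scores):.2f}, "
          f"kurtosis = {excess_kurtosis(scores):.2f}")
```

Reasonably similar standard deviations across groups speak to homogeneity of variance, and skewness or kurtosis values well inside ±2.0 suggest the departure from normality is tolerable.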

6.6 ANOVA and the Independent t Test
The one-way ANOVA and the independent t test share several
assumptions although they employ distinct
statistics—the sums of squares for ANOVA and the standard
error of the difference for the t test, for example.
When two groups are involved, both tests will produce the same
result, however. This consistency can be
illustrated by completing ANOVA and the independent t test for
the same data.
Suppose an industrial psychologist is interested in how people
from two separate divisions of a company differ
in their work habits. The dependent variable is the amount of
work completed after hours at home, per week, for
supervisors in marketing versus supervisors in manufacturing.
The data follow:
Marketing: 3, 4, 5, 7, 7, 9, 11, 12
Manufacturing: 0, 1, 3, 3, 4, 5, 7, 7
Calculating some of the basic statistics yields the results listed
in Table 6.12.
Table 6.12: Statistical results for work habits study
Group          M     s      SEM    SEd    MG
Marketing      7.25  3.240  1.146  1.458  5.50
Manufacturing  3.75  2.550  0.901
(SEd and MG are single values shared by both groups.)
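First, the t test gives
$t = \dfrac{M_1 - M_2}{SE_d} = \dfrac{7.25 - 3.75}{1.458} = 2.401$,
which exceeds the critical value t0.05(14) = 2.145.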
The difference is significant. Those in marketing (M1) take
significantly more work home than those in
manufacturing (M2).
The ANOVA test proceeds as follows:
For all variability from all sources (SStot), verify that the result
of subtracting MG from each score in both groups, squaring the
differences, and summing the squares = 168:
$SS_{tot} = \sum (x - M_G)^2 = 168$
For the SSbet, verify that subtracting the grand mean from each
group mean, squaring the difference, multiplying each result by the
number in the particular group, and summing the products = 49:
$SS_{bet} = (M_a - M_G)^2 n_a + (M_b - M_G)^2 n_b = (7.25 - 5.50)^2(8) + (3.75 - 5.50)^2(8) = 24.5 + 24.5 = 49$
For the SSwith, take each group mean from each score in the group,
square the difference, and then sum the squared differences as
follows to verify that SSwith = 119:
$SS_{with} = (x_{a1} - M_a)^2 + \ldots + (x_{a8} - M_a)^2 + (x_{b1} - M_b)^2 + \ldots + (x_{b8} - M_b)^2 = 119$
Try It!: #7
What is the relationship between the values of t and F if both are
performed for the same two-group test?
Table 6.13 summarizes the results.
Table 6.13: ANOVA results for work habit study
Source SS df MS F Fcrit
Total 168 15
Between 49 1 49 5.765 F0.05(1,14) = 4.60
Within 119 14 8.5
Like the t test, ANOVA indicates that the amount of work completed
at home is significantly different for the two groups, so both tests
draw the same conclusion: statistical significance. Even so, more is
involved than just the statistical decision to reject H0.
Consider the following:

Note that the calculated value of t = 2.401 and the calculated
value of F = 5.765.
If the value of t is squared, it equals the value of F: 2.401² =
5.765.
The same is true for the critical values:
t0.05(14) = 2.145, and 2.145² = 4.60
F0.05(1,14) = 4.60
Gosset’s and Fisher’s tests draw exactly equivalent conclusions
when two groups are tested. The ANOVA tends
to be more work, so people ordinarily use the t test for two
groups, but both tests are entirely consistent.
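The equivalence of the two tests can be verified numerically. The Python sketch below is a minimal illustration using the marketing and manufacturing scores; it assumes equal group sizes, computes the standard error of the difference from the two standard errors of the mean, and uses variable names that are hypothetical.

```python
import math
import statistics

marketing = [3, 4, 5, 7, 7, 9, 11, 12]
manufacturing = [0, 1, 3, 3, 4, 5, 7, 7]

# Independent t test with equal group sizes: t = (M1 - M2) / SEd.
m1 = statistics.fmean(marketing)
m2 = statistics.fmean(manufacturing)
sem1 = statistics.stdev(marketing) / math.sqrt(len(marketing))
sem2 = statistics.stdev(manufacturing) / math.sqrt(len(manufacturing))
se_d = math.sqrt(sem1 ** 2 + sem2 ** 2)
t = (m1 - m2) / se_d

# One-way ANOVA on the same two groups.
scores = marketing + manufacturing
grand_mean = statistics.fmean(scores)
ss_bet = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2
             for g in (marketing, manufacturing))
ss_with = sum((x - statistics.fmean(g)) ** 2
              for g in (marketing, manufacturing) for x in g)
df_bet = 1
df_with = len(scores) - 2
f_value = (ss_bet / df_bet) / (ss_with / df_with)

print(f"t = {t:.3f}, t squared = {t ** 2:.3f}, F = {f_value:.3f}")
```

The same relationship holds for the critical values, which is why the two tests always reach the same decision when only two groups are compared.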
6.7 The Factorial ANOVA
In the language of statistics, a factor is an independent variable,
and a factorial ANOVA is an ANOVA that
includes multiple IVs. We noted that fluctuations in the DV
scores not explained by the IV emerge as error
variance. In the t-test/ANOVA example above, any differences
in the amount of work taken home not related to
the division between marketing and manufacturing—
differences in workers’ seniority, for example—become
part of SSwith and then the MSwith error. As long as a t test or a
one-way ANOVA is used, the researcher cannot account for any
differences in work taken home that are not associated with whether
the subject is from marketing or manufacturing, or whatever IV is
selected. There can only be one independent variable.
The factorial ANOVA contains multiple IVs. Each one can
account for its portion of variability in the DV,
thereby reducing what would otherwise become part of the error
variance. As long as the researcher has
measures for each variable, the number of IVs has no theoretical
limit. Each one is treated as we treated the
SSbet: for each IV, a sum-of-squares value is calculated and
divided by its degrees of freedom to produce a mean
square. Each mean square is divided by the same MSwith value
to produce F so that there are separate F values
for each IV.
The associated benefit of adding more IVs to the analysis is that
the researcher can more accurately reflect the
complexity inherent in human behavior. One variable rarely
explains behavior in any comprehensive way.
Including more IVs is often a more informative view of why DV
scores vary. It also usually contributes to a more
powerful test. Recall from Chapter 4 that power refers to the
likelihood of detecting significance. Because
assigning what would otherwise be error variance to the
appropriate IV reduces the error term, factorial
ANOVAs are often more likely to produce significant F values
than one-way ANOVAs; they are often more
powerful tests.
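A small numeric sketch may make the power argument concrete. Every sum of squares below is hypothetical, the design is assumed to be a balanced 2 × 2 layout (so the sums of squares add up to the total), and the variable names are illustrative only.

```python
# Hypothetical values for a balanced 2 x 2 design with 40 participants.
n_total = 40
ss_total = 200.0
ss_a = 40.0    # variability explained by the first IV (two levels)
ss_b = 60.0    # variability explained by the second IV (two levels)
ss_axb = 4.0   # variability explained by the interaction


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Mean square for the effect divided by the mean square for error."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# One-way ANOVA: only the first IV is analyzed, so everything else is error.
ss_with_one_way = ss_total - ss_a
df_with_one_way = n_total - 2
f_one_way = f_ratio(ss_a, 1, ss_with_one_way, df_with_one_way)

# Factorial ANOVA: the second IV and the interaction claim their share,
# leaving a smaller error term (and slightly fewer error df).
ss_with_factorial = ss_total - ss_a - ss_b - ss_axb
df_with_factorial = n_total - 4
f_factorial = f_ratio(ss_a, 1, ss_with_factorial, df_with_factorial)

print(f"F for the first IV, one-way:   {f_one_way:.2f}")
print(f"F for the first IV, factorial: {f_factorial:.2f}")
```

The sum of squares for the first IV is identical in both analyses; only the error term shrinks, and that is what raises F.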
In addition, IVs in combination sometimes affect the DV
differently than they do when they are isolated, a
concept called an interaction. The factorial ANOVA also
calculates F values for these interactions. If a
researcher wanted to examine the impact that marital status and
college graduation have on subjects’ optimism
about the economy, data would be gathered on subjects’ marital
status (married or not married) and their college
education (graduated or did not graduate). Then SS values, MS
values, and F ratios would be calculated for
marital status,
college education, and
the two IVs in combination, the interaction of the factors.
In the manufacturing versus marketing example, gender
and department might interact so that females in
marketing respond differently than females in manufacturing.
The factorial ANOVA has not been included in this text, but it
is not difficult to understand. The procedures
involved in calculating a factorial ANOVA are more numerous,
but they are not more complicated than the one-
way ANOVA. Excel accommodates ANOVA problems with up
to two independent variables.
6.8 Writing Up Statistics
Any time a researcher has multiple groups or levels of a
nominal scale variable (ethnic groups, occupation type,
country of origin, preferred language) and the question is about
their differences on some interval or ratio scale
variable (income, aptitude, number of days sober, number of
parking violations), the question can be analyzed
using some form of ANOVA. Because it is a test that provides
tremendous flexibility, it is well represented in
research literature.
To examine whether a language is completely forgotten when
exposure to that language is severed in early
childhood, Bowers, Mattys, and Gage (2009) compared the
performance of subjects with no memory of
exposure to a foreign language in their early childhood to other
subjects with no exposure when the language is
encountered in adulthood. They compared the performance with
phonemes of the forgotten language (the DV) by
those exposed to Hindi (one group of the IV) or Zulu (a second
group of the IV) to the performance of adults of
the same age who had no exposure to either language (a third
group of the IV). They found that those with the
early Hindi or Zulu exposure learned those languages
significantly more quickly as adults.
Butler, Zaromb, Lyle, and Roediger III (2009) used ANOVA to
examine the impact that viewing film clips in
connection with text reading has on student recall of facts when
some of the film facts are inconsistent with text
material. This experiment was a factorial ANOVA with two IVs.
One independent variable had to do with the
mode of presentation: text alone, film alone, or film and
text combined. A second IV had to do with
whether students received a general warning, a specific
warning, or no warning that the film might be
inconsistent with some elements of the text. The DV was the
proportion of correct responses students made to
questions about the content. Butler et al. found that learner
recall improved when film and text were combined
and when subjects received specific warnings about possible
misinformation. When the film facts were
inconsistent with the text material, receiving a warning
explained 37% of the variance in the proportion of
correct responses. The type of presentation explained 23% of
the variance.
Summary and Resources
Chapter Summary
This chapter is the natural extension of Chapters 4 and 5. Like
the z test and the t test, analysis of variance is a
test of significant differences. Also like the z test and t test, the
IV in ANOVA is nominal, and the DV is interval
or ratio. With each procedure—whether z, t, or F—the test
statistic is a ratio of the differences between groups to
the differences within groups (Objective 3).
ANOVA and the earlier procedures do differ, of course. The
variance statistics are sums of squares and mean
square values. But perhaps the most important difference is
that ANOVA can accommodate any number of
groups (Objectives 2 and 3). Remember that trying to deal with
multiple groups in a t test introduces the problem
of increasing type I error when repeated analyses with the same
data indicate statistical significance. One-way
ANOVA lifts the limitation of a one-pair-at-a-time comparison
(Objective 1).

The other side of multiple comparisons, however, is the
difficulty of determining which comparisons are
statistically significant when F is significant. This problem is
solved with the post hoc test. This chapter used
Tukey’s HSD (Objective 4). There are other post hoc tests, each
with its strengths and drawbacks, but HSD is
one of the more widely used.
Years ago, the emphasis in scholarly literature was on whether a
result was statistically significant. Today, the
focus is on measuring the effect size of a significant result, a
statistic that in the case of analysis of variance can
indicate how much of the variability in the dependent variable
can be attributed to the effect of the independent
variable. We answered that question with eta squared (η²). But
neither the post hoc test nor eta squared is
relevant if the F is not significant (Objective 5).
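As a worked illustration, the F ratio and eta squared can be recomputed from the sums of squares in the memorization-technique ANOVA table at the start of this document. This is a sketch only: it assumes eta squared is SSbet divided by SStot, consistent with the description above, and the variable names are illustrative.

```python
# Values copied from the ANOVA table in the memorization-technique scenario.
ss_between, df_between = 114.3111, 2
ss_within, df_within = 121.6, 42

ss_total = ss_between + ss_within

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the two mean squares.
f_ratio = ms_between / ms_within

# Eta squared: the proportion of total variability tied to the IV.
eta_squared = ss_between / ss_total

print(f"SS total    = {ss_total:.4f}")
print(f"F           = {f_ratio:.2f}")
print(f"eta squared = {eta_squared:.3f}")
```

The recomputed F agrees with the 19.74 reported in that table, and eta squared comes out a little under one half, meaning that roughly half of the variability in test scores is associated with the memorization technique.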
The independent t test and the one-way ANOVA both require
that groups be independent. What if they are not?
What if we wish to measure one group twice over time, or
perhaps more than twice? Such dependent group
procedures are the focus of Chapter 7, which will provide an
elaboration of familiar concepts. For this reason,
consider reviewing Chapter 5 and the independent t-test
discussion before starting Chapter 7.
The one-way ANOVA dramatically broadens the kinds of
questions the researcher can ask. The procedures in
Chapter 7 for non-independent groups represent the next
incremental step.
Key Terms
analysis of variance (ANOVA)
Name given to Fisher’s test allowing a research study to detect
significant differences among any number of
groups.
error variance
Variability in a measure stemming from a source other than the
variables introduced into the analysis.
eta squared
A measure of effect size for ANOVA. It estimates the amount of
variability in the DV explained by the IV.
factor
An alternate name for an independent variable, particularly in
procedures that involve more than one.
factorial ANOVA
An ANOVA with more than one IV.
F ratio
The test statistic calculated in an analysis of variance problem.
It is the ratio of the variance between the
groups to the variance within the groups.
interaction
Occurs when the combined effect of multiple independent
variables differs from the effect of the variables acting
independently.
mean square
The sum of squares divided by the relevant degrees of freedom.
This division allows the mean square to reflect
a mean, or average, amount of variability from a source.
one-way ANOVA
Simplest variance analysis, involving only one independent
variable. Similar to the t test.
post hoc test
A test conducted after a significant ANOVA or some similar
test that identifies which among multiple
possibilities is statistically significant.
sum of squares
The variance measure in analysis of variance. It is the sum of
the squared deviations between a set of scores
and their mean.
sum of squares between
The variability related to the independent variable and any
measurement error that may occur.
sum of squares error
Another name for the sum of squares within because it refers to
the differences after treatment within the same
group, all of which constitute error variance.
sum of squares total
Total variance from all sources.
sum of squares within
Variability stemming from different responses from individuals
in the same group. Because all the individuals
in a particular group receive the same treatment, differences
among them constitute error variance.
Review Questions
Answers to the odd-numbered questions are provided in
Appendix A.
1. Several people selected at random are given a story problem
to solve. They take 3.5, 3.8, 4.2, 4.5, 4.7,
5.3, 6.0, and 7.5 minutes. What is the total sum of squares for
these data?
2. Identify the following symbols and statistics in a one-way
ANOVA:
a. The statistic that indicates the mean amount of difference
between groups.
b. The symbol that indicates the total number of participants.
c. The symbol that indicates the number of groups.
d. The mean amount of uncontrolled variability.
3. A study theorizes that manifested aggression differs by
gender. A researcher finds the following data
from Measuring Expressed Aggression Numbers (MEAN):
Males: 13, 14, 16, 16, 17, 18, 18, 18
Females: 11, 12, 12, 14, 14, 14, 14, 16
Complete the problem as an ANOVA. Is the difference
statistically significant?
4. Complete Question 3 as an independent t test, and
demonstrate the relationship between t² and F.
a. Is there an advantage to completing the problem as an
ANOVA?
b. If there were three groups, why not just complete three t tests
to answer questions about
significance?
5. Even with a significant F, a two-group ANOVA never needs a
post hoc test. Why not?
6. A researcher completes an ANOVA in which the number of
years of education completed is analyzed by
ethnic group. If η² = 0.36, how should that be interpreted?
7. Three groups of clients involved in a program for substance
abuse attend weekly sessions for 8 weeks,
12 weeks, and 16 weeks. The DV is the number of drug-free
days.
8 weeks: 0, 5, 7, 8, 8
12 weeks: 3, 5, 12, 16, 17
16 weeks: 11, 15, 16, 19, 22
a. Is F significant?
b. What is the location of the significant difference?
c. What does the effect size indicate?
8. For Question 7, answer the following:
a. What is the IV?
b. What is the scale of the IV?
c. What is the DV?
d. What is the scale of the DV?
9. For an ANOVA problem, k = 4 and n = 8.
If SSbet = 24.0
and SSwith = 72
a. What is F?
b. Is the result significant?
10. Consider this partially completed ANOVA table:
Source   SS   df   MS   F    Fcrit
Between  __   2    __   __   __
Within   63   __   3
Total    94   __
a. What must be the value of N − k?
b. What must be the value of k?
c. What must be the value of N?
d. What must the SSbet be?
e. Determine the MSbet.
f. Determine F.
g. What is Fcrit?
Answers to Try It! Questions
1. The one in one-way ANOVA refers to the fact that this test
accommodates just one independent
variable. One-way ANOVA contrasts with factorial ANOVA,
which can include any number of IVs.
2. A t test with six groups would need 15 comparisons. The
answer is the number of groups (6) times the
number of groups minus 1 (5), with the product divided by 2:
(6 × 5) ÷ 2 = 30 ÷ 2 = 15.
3. The only way SS values can be negative is if there has been a
calculation error. Because the values are
all squared values, if they have any value other than 0, they
must be positive.
4. The difference between SStot and SSwith is the SSbet.
5. If F = 4 and MSwith = 2, then MSbet must equal 8 because F =
MSbet ÷ MSwith.
6. The answer is neither. If F is not significant, there is no
question of which group is significantly different
from which other group because any variability may be nothing
more than sampling variability. By the
same token, there is no effect to calculate because, as far as we
know, the IV does not have any effect on
the DV.
7. t² = F
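The counting rule in answer 2 generalizes to any number of groups. The short sketch below (the function name is illustrative) applies the groups-times-groups-minus-1-over-2 rule and cross-checks it against Python's built-in combinations function.

```python
from math import comb


def pairwise_comparisons(k):
    """Number of two-group comparisons needed to cover k groups."""
    return k * (k - 1) // 2


for k in (2, 3, 4, 6):
    # comb(k, 2) counts the ways to choose 2 groups out of k.
    print(k, "groups:", pairwise_comparisons(k), "comparisons", comb(k, 2))
```

The count grows quickly with the number of groups, which is part of why repeated t tests become impractical as well as error prone.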
Chapter Learning Objectives
After reading this chapter, you should be able to do the
following:
1. Explain how initial between-groups differences affect t test
or analysis of variance.
2. Compare the independent t test to the dependent-groups t
test.
3. Complete a dependent-groups t test.
4. Explain what “power” means in statistical testing.
5. Compare the one-way ANOVA to the within-subjects F.
6. Complete a within-subjects F.
7 Repeated Measures Designs for Interval Data
Introduction
Tests of significant difference, such as the t test and analysis of
variance, take two basic forms, depending upon
the independence of the groups. Up to this point, the text has
focused only on independent-groups tests: tests
where those in one group cannot also be subjects in other
groups. However, dependent-groups procedures, in
which the same group is used multiple times, offer some
advantages.
This chapter focuses on the dependent-groups equivalents of the
independent t test and the one-way ANOVA.
Although they answer the same questions as their independent-
groups equivalents (are there significant
differences between groups?), under particular circumstances
these tests can do so more efficiently and with
more statistical power.

Try It!: #1
If the size of the group affects the size of the
standard deviation, what then is the relationship
between sample size and error in a t test?
7.1 Reconsidering the t and F Ratios
The scores produced in both the independent t and the one-way
ANOVA are ratios. In the case of the t test, the
ratio is the result of dividing the difference between the means
of the groups by the standard error of the
difference:
t = (M1 − M2) ÷ SEd
With ANOVA, the F ratio is the mean square between (MSbet)
divided by the mean square within (MSwith):
F = MSbet ÷ MSwith
With either t or F, the denominator in the ratio reflects how
much scores vary within (rather than between) the
groups of subjects involved in the study. These differences are
easy to see in the way the standard error of the
difference is calculated for a t test. When group sizes are equal,
recall that the formula is
SEd = √(SEM1² + SEM2²)
with
SEM = s ÷ √n
and s, of course, a measure of score variation in any group.

So the standard error of the difference is based on the standard
error of the mean, which in turn is based on the
standard deviation. Therefore, within-group score variance in a t test
has its root in the standard deviation for each
group of scores. If we reverse the order and work from the
standard deviation back to the standard error of the
difference, we note the following:
When scores vary substantially in a group,
the result is a large standard deviation.
When the standard deviation is relatively
large, the standard error of the mean must
likewise be large because the standard
deviation is the numerator in the formula for
SEM.
A large standard error of the mean results in
a large standard error of the difference
because that statistic is the square root of the sum of the
squared standard errors of the mean.
When the standard error of the difference is large, the
difference between the means has to be
correspondingly larger for the result to be statistically
significant. The table of critical values indicates
that no t ratio (the ratio of the differences between the means
and the standard error of the difference)
less than 1.96 to 1 is going to be significant, and even that
value requires an infinite sample size.
Error Variance
[Photo: In a study of the impact of substance abuse
programs on addicts’ behavior, confounding
variables could include ethnic background,
age, or social class.]
The point of the preceding discussion is that the value of t in
the t test—and for F in an ANOVA—is greatly
affected by the amount of variability within the groups
involved. Other factors being equal, when the variability
within the groups is extensive, the values of t and F are
diminished and less likely to be statistically significant
than when groups have relatively little variability within them.
These differences within groups stem from differences in the
way individuals within the samples react to
whatever treatment is the independent variable; different people
respond differently to the same stimulus. These
differences represent error variance—the outcome whenever
scores differ for reasons not related to the IV.
But within-group differences are not the only source of error
variance in the calculation of t and F. Both the t test
and ANOVA assume that the groups involved are equivalent
before the independent variable is introduced. In a t
test where the impact of relaxation therapy on clients’ anxiety is
the issue, the test assumes that before the
therapy is introduced, the treatment group, which receives the
therapy, and the control group, which does not, both
begin with equivalent levels of anxiety. That assumption is the
key to attributing any differences after the
treatment to the therapy, the IV.

Confounding Variables
In comparisons like the one studying the effects of relaxation
therapy, the initial equivalence of the groups can be uncertain,
however. What if the groups had differences in anxiety before
the therapy was introduced? The employment circumstances of
each group might differ, and perhaps those threatened with
unemployment are more anxious than the others. What if age-
related differences exist between groups? These other
influences that are not controlled in an experiment are
sometimes called confounding variables.
A psychologist who wants to examine the impact that a
substance abuse program has on addicts’ behavior might set up
a study as follows. Two groups of the same number of addicts
are selected, and one group participates in the substance-abuse
program. After the program, the psychologist measures the
level of substance abuse in both groups to observe any
differences.
The problem is that the presence or absence of the program is
not the only thing that might prompt subjects to
respond differently. Perhaps subjects’ background experiences
are different. Perhaps ethnic-group, age, or social-
class differences play a role. If any of those differences affect
substance-abuse behavior, the researcher can
potentially confuse the influence of those factors with the
impact of the substance-abuse program (the IV). If
those other differences are not controlled and affect the
dependent variable, they contribute to error variance.
Error variance exists any time dependent-variable (DV) scores
fluctuate for reasons unrelated to the IV.
Thus, the variability within groups reflects error variance, and
any difference between groups that is not related
to the IV represents error variance. A statistically significant
result requires that the score variance from the
independent variable be substantially greater than the error
variance. The factor(s) the researcher controls must
contribute more to score values than the factors that remain
uncontrolled.
Try It!: #2
How does the use of random selection enable us
to control error variance in statistical testing?
Try It!: #3
How do the before/after t test and the matched-
pairs t test differ?
7.2 Dependent-Groups Designs
Ideally, any before-the-treatment differences between the
groups in a study will be minimal. Recall that random
selection entails every member of a population having an equal
chance of being selected. The logic behind
random selection dictates that when groups are randomly drawn
from the same population, they will differ only
by chance; as sample size increases, probabilities suggest that
they become increasingly similar in their characteristics
to the population. No sample, however, can represent the
population with complete fidelity, and sometimes the
chance differences affect the way subjects respond to the IV.
One way researchers reduce error variance is to adopt
what are called dependent-groups designs. The
independent t test and the one-way ANOVA required
independent groups. Members of one group could not
also be members of other groups in the same study.
But in the case of the t test, if the same group is
measured, exposed to a treatment, and then measured
again, the study controls an important source of error
variance. Using the same group twice makes the initial
equivalence of the two groups no longer a concern. Other
aspects being equal, any score difference between the first and
second measure should indicate only the impact
of the independent variable.
The Dependent-Samples t Tests
One dependent-groups test where the same group is measured
twice is called the before/after t test. An
alternative is called the matched-pairs t test, where each
participant in the first group is matched to someone in
the second group who has a similar characteristic. The
before/after t test and the matched-pairs t test both have
the same objective—to control the error variance that is due to
initial between-groups differences. Following are
examples of each test.
The before/after design: A researcher is interested in the impact
that positive reinforcement has on
employees’ sales productivity. Besides the sales commission,
the researcher introduces a rewards
program that can result in increased vacation time. The
researcher gauges sales productivity for a
month, introduces the rewards program, and gauges sales
productivity during the second month for the
same people.
The matched-pairs design: A school counselor is interested in
the impact that verbal reinforcement has
on students’ reading achievement. To eliminate between-groups
differences, the counselor selects 30
students for the treatment group and matches each student in the
treatment group to someone in a control
group who has a similar reading score on a standardized test.
The counselor then introduces the verbal
reinforcement program to those in the treatment group for a
specified period of time and then compares
the performance of students in the two groups.
Although the two tests are set up differently, both
calculate the t statistic the same way. The differences
between the two approaches are conceptual, not
mathematical. They have the same purpose—to
control between-groups score variation stemming
from nonrelevant factors.
Calculating t in a Dependent-Groups Design
The dependent-groups t may be calculated using several
methods. Each method takes into account the
relationship between the two sets of scores. One approach is to
calculate the correlation between the two sets of
scores and then to use the strength of the correlation as a
mechanism for determining between-groups error
variance: the higher the correlation between the two sets of
scores, the lower the error variance. Because this text
has yet to discuss correlation, for now we will use a t statistic
that employs “difference scores.” The different
approaches yield the same answer.
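The claim that both approaches agree can be checked directly. The sketch below rests on assumptions: the before and after scores are hypothetical, the names are illustrative, and the correlation-based standard error uses the standard identity that the variance of a difference equals the two variances summed minus twice the correlation times the product of the standard deviations.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after scores for five subjects.
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 15, 13]
n = len(before)

# Approach 1: difference scores.
d = [b - a for b, a in zip(before, after)]
t_diff = mean(d) / (stdev(d) / sqrt(n))


def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = mean(x), mean(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ssx = sum((xi - mx) ** 2 for xi in x)
    ssy = sum((yi - my) ** 2 for yi in y)
    return sp / sqrt(ssx * ssy)


# Approach 2: adjust the standard error of the difference by the correlation.
r = pearson_r(before, after)
sem_b = stdev(before) / sqrt(n)
sem_a = stdev(after) / sqrt(n)
se_corr = sqrt(sem_b ** 2 + sem_a ** 2 - 2 * r * sem_b * sem_a)
t_corr = (mean(before) - mean(after)) / se_corr

print(f"r = {r:.3f}, t (difference scores) = {t_diff:.4f}, t (correlation) = {t_corr:.4f}")
```

The subtraction inside the square root is where a higher correlation shrinks the error term.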
The distribution of difference scores came up in Chapter 5 when
the independent t test was introduced. Recall that
the point of that distribution is to determine the point at which
the difference between a pair of sample means
(M1 − M2) is so great that the most probable explanation is that
the samples came from different populations.
Dependent-groups tests use that same distribution, but rather
than the difference between the means of the two
groups (M1 − M2), the numerator in the t ratio is the mean of
the differences between each pair of scores. If that
mean is sufficiently different from the mean of the population
of difference scores (which, recall, is 0), the t
value is statistically significant; the first set of measures
belongs to a different population than the second set of
measures. That may seem odd since in a before/after test, both
sets of measures come from the same subjects,
but the explanation is that those subjects’ responses (the DV)
were altered by the impact of the independent
variable; their responses are now different.
The denominator in the t ratio is another standard error of the
mean value, but in this case, it is the standard error
of the mean of the difference scores. The researcher checks for
significance using the same criteria as for the
independent t:
A critical value from the t table, determined by degrees of
freedom, defines the point at which the
calculated t value is statistically significant.
The degrees of freedom are the number of pairs of scores minus
1 (n − 1).
The dependent-groups t test statistic uses this formula:
Formula 7.1
t = Md ÷ SEMd
where
Md = the mean of the difference scores
SEMd = the standard error of the mean for the difference scores
The steps for completing the test are as follows:
1. From the two scores for each subject, subtract the second
from the first to determine the difference
score, d, for each pair.
2. Determine the mean of the d scores:
Md = ∑d ÷ n
3. Calculate the standard deviation of the d values, sd.
4. Calculate the standard error of the mean for the difference
scores, SEMd, by dividing sd by the square
root of the number of pairs of scores:
SEMd = sd ÷ √n
5. Divide Md by SEMd, the standard error of the mean for the
difference scores:
t = Md ÷ SEMd
Figure 7.1 depicts these steps.
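The five steps can also be sketched in code. The scores below are the ten pairs of seminar question counts from Problem 7.1 in this section. The variable names are illustrative, and the critical value assumes a two-tailed test at p = .05 with 9 degrees of freedom, taken from a standard t table rather than from the text.

```python
from math import sqrt
from statistics import mean, stdev

# Questions asked by each of the ten students (Problem 7.1).
seminar_1 = [1, 0, 3, 0, 2, 1, 3, 2, 1, 2]
seminar_2 = [3, 2, 4, 0, 3, 1, 5, 4, 3, 1]

# Step 1: difference score for each pair (first minus second).
d = [first - second for first, second in zip(seminar_1, seminar_2)]

# Step 2: mean of the difference scores.
m_d = mean(d)

# Step 3: standard deviation of the difference scores.
s_d = stdev(d)

# Step 4: standard error of the mean for the difference scores.
n_pairs = len(d)
sem_d = s_d / sqrt(n_pairs)

# Step 5: the t ratio.
t_value = m_d / sem_d

# Degrees of freedom are the number of pairs minus 1.
df = n_pairs - 1
T_CRIT = 2.262  # assumed: two-tailed, p = .05, df = 9, from a standard t table
significant = abs(t_value) > T_CRIT

print(f"sum of d = {sum(d)}, Md = {m_d:.2f}, sd = {s_d:.4f}, SEMd = {sem_d:.4f}")
print(f"t({df}) = {t_value:.3f}, significant: {significant}")
```

The sign of t is negative only because the second seminar's counts were subtracted from the first; it is the absolute value that is compared with the critical value.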
The following is an example of a dependent-measures t test: A
psychologist
is investigating the impact that verbal reinforcement has on the
number of
questions university students ask in a seminar. Ten upper-level
students
participate in two seminars where a presentation is followed by
students’
questions. In the first seminar, the instructor provides no
feedback after a
student asks the presenter a question. In the second seminar, the
instructor
offers feedback—such as “That’s an excellent question” or
“Very interesting
question” or “Yes, that had occurred to me as well”—after each
question.
Is there a significant difference between the number of
questions students
ask in the first seminar compared to the number of questions
students ask in
the second seminar? Problem 7.1 shows the number of questions
asked by
each student in both seminars and the solution to the problem.
Figure 7.1: Steps for calculating the before/after t test
Problem 7.1: Calculating the before/after t test
Student  Seminar 1  Seminar 2  d
1        1          3          −2
2        0          2          −2
3        3          4          −1
4        0          0          0
5        2          3          −1
6        1          1          0
7        3          5          −2
8        2          4          −2
9        1          3          −2
10       2          1          1
∑d = −11
1. Determine the difference between each pair of scores, d,
using subtraction.
