8 Correlation
Chapter Learning Objectives
After reading this chapter, you should be able to do the
following:
1. Explain the hypothesis of association.
2. Interpret the correlation coefficient.
3. List the Pearson correlation requirements.
4. Describe what the coefficient of determination explains.
5. Explain the variables involved in the point-biserial
correlation.
6. Describe the applications for the Spearman correlation.
tan82773_08_ch08_227-262.indd 227 3/3/16 12:33 PM
© 2016 Bridgepoint Education, Inc. All rights reserved. Not for
resale or redistribution.
Section 8.1 The Hypothesis of Association
Introduction
Correlation, the concept of a relationship or dependence
between variables, transcends statisti-
cal analysis. Cloudy days are related to (correlated with) cooler
temperatures. Natural disasters
are related to declines in the stock market. An impending test is
related to the need to study, and
grinding noises in the engine compartment of a car are usually
related to repair bills.
Some relationships are stronger than others, so statistical
procedures have been developed
to quantify, or numerically gauge, the strength of the
relationship between two variables. The
numerical indicators are called correlation coefficients, and one
of the most common is the
Pearson correlation coefficient, which indicates the strength of
the relationship between
interval- or ratio-scale variables. The name Pearson refers to
Karl Pearson, whose impact not
just on studying correlation but on statistical analysis generally
may be greater than that of
any other individual.
In the early years of the 20th century, Pearson founded the first
department of statistical analy-
sis at University College London. Under Pearson’s direction,
the department attracted, among
others, William Sealy Gosset of t test fame; Ronald Fisher, who
produced analysis of variance;
and Charles Spearman, for whom an alternative correlation
coefficient is named, as well as an
elegant statistical procedure based on correlation called factor
analysis. To put it succinctly, it is
difficult to overstate the impact that Pearson had on the
evolution of statistical analysis.
A man of fierce independence, Pearson centered his education at
Cambridge on religion and
philosophy rather than mathematics. As a student of religion, he
sued the university over the
compulsory chapel attendance required of all undergraduates.
Winning his suit brought a
change to university rules—after which Pearson chose to attend
chapel. His graduate work (in
Germany) emphasized literature, and it is a testimony to his
extraordinary breadth of talent
that his greatest contributions would be in statistical analysis.
Pearson was a contemporary
of Einstein, who sought a grand theory that would unite all of
physics. Pearson tried to do the
same with mathematics. That both men were disappointed in
these efforts should not detract
from what they did accomplish. Although Pearson’s associations
with his colleagues were not
always harmonious, he and the others who found an academic
home in his department virtu-
ally defined modern quantitative analysis. Whether or not they
realize it, almost all of those
who crunch numbers for any length of time rely on their work.
8.1 The Hypothesis of Association
Previous chapters concentrated on tests of significant
difference. The z test, the t tests, analy-
sis of variance, and the repeated-measures designs test the
differences between groups. They
all fall under a general assumption referred to as the hypothesis
of difference. But some
kinds of analyses do not involve questions about whether there
are significant differences
between groups.
If a psychologist asks about the relationship between birth order
and achievement motiva-
tion among siblings or about the connection between the amount
of time children read and
their school grades, the subject of research concerns
relationships rather than differences.
Those questions call for procedures connected to the hypothesis
of association, and when
results are statistically significant, it means that the
relationship, rather than the difference, is
unlikely to be a random occurrence.
Correlation versus Cause
Before pursuing correlation, researchers must
make a distinction between correlation and cause.
That two characteristics co-vary, or vary
together, does not mean that one necessarily
causes the other. Although there may be a causal
relationship, researchers usually cannot determine
one just by studying the correlation. One of the
author’s statistics professors explained the risk of
confusing correlation with cause this way: A person
drinks for three successive nights. The first night, the
drink is scotch and water, the second bourbon and
water, and on the third, vodka and water. Each morn-
ing after is accompanied by a hangover. Because the
water is common to each experience, water must be
the cause.
A classic study demonstrates, among other things, a correlation
between the sale of ice cream
by vendors on city streets and burglaries in the same city.
Someone rushing to judgment
about cause might wish to curb ice cream sales or check the
criminal records of ice cream
vendors to reduce the number of burglaries. Such an individual
does not recognize that hot-
ter weather—and the open windows that result—probably drive
both ice cream sales and
burglaries. It is not unusual for some third variable to explain
an association between a first
and a second. Although correlation values provide some
evidence for causation, correlation
alone is rarely sufficient to demonstrate cause.
Scatterplots
Breaking down the word correlation—co-relation—makes its
meaning clear: the variables
are related. The evidence for the relationship is that the
characteristics co-vary. As the level of
one variable changes, the other changes as well because both
variables contain some of the
same information. The higher the correlation, the more common
information they contain.
A researcher gathers verbal ability and intelligence scores for
12 subjects and presents them
in Table 8.1. Note that the first participant has a verbal ability
score of 20 and an intelligence
score of 80. Scanning the two rows of data, we can see that as
the values of one score increase,
so do those of the other. In other words, there appears to be a
positive correlation between
the two scores. The relationship is easier to see in the
scatterplot. A scatterplot is a graph plotting the values
of one variable along the horizontal axis and the other
variable along the vertical axis, using dots to indicate
the intersection of each pair of values. Figure 8.1 shows
an Excel-generated scatterplot of the verbal ability/
intelligence data.
Try It!: #1
How many raw scores does a single point
on a scatterplot represent?
Table 8.1: Results of a study comparing verbal ability and
intelligence
Participant 1 2 3 4 5 6 7 8 9 10 11 12
Verbal ability: 20 35 42 48 55 60 63 66 72 76 78 85
Intelligence: 80 95 90 100 100 100 110 115 120 115 110 125
Figure 8.1: The relationship between verbal ability and
intelligence
In the Figure 8.1 scatterplot, intelligence scores are plotted
along the vertical, or y, axis and
the verbal ability scores are plotted along the horizontal, or x,
axis. Each diamond-shaped
point in the graph, then, represents an intelligence score and a
verbal ability score.
The plot verifies what our cursory view of the two rows of data
in the table suggested: A posi-
tive correlation exists between measures of intelligence and
those of verbal ability. The gen-
eral trend is from lower left to upper right. As the value of one
variable increases, the value of
the other tends to do likewise. The incline is not dramatic, but
the graph shows a general rise
in the data points.
Less-than-Perfect Relationships
The relationship certainly is not perfect. The fourth, fifth, and
sixth participants all have the
same level of intelligence but different levels of verbal ability.
The same is true of participants
8 and 10, as well as participants 7 and 11. Still, there is a
general lower-left to upper-right
relationship, which might be expected. Brighter people often
have more complex language
patterns, something suggested by higher verbal-ability scores.
It also is not surprising that the relationship between
intelligence and verbal ability is less
than perfect. An extensive vocabulary alone is no guarantee of
an unusually high intelligence
score. Perhaps the individual is just an avid reader. At the other
end of the spectrum, not all
highly intelligent people excel verbally.
The exceptions point to the fact that people are very complex.
Human behavior is rarely
explained by one or two variables. Although intelligence is
related to verbal aptitude, so are
a number of other variables: how much the individual reads,
how easily the individual is dis-
tracted, how much experience the person has had, and so on.
One of the reasons researchers
calculate correlation values is to determine the level of
agreement when the relationships are
not perfect, as they rarely are with people.
The issue the hypothesis of association seeks to resolve is not
whether the relationship is
perfect—because it would be extremely rare if it were—but
rather, whether the relationship
is statistically significant. Statistically significant correlations
produce correlation values that
tend to reemerge every time new data are gathered for the
variables and the strength of the
correlation re-calculated.
Although perfect correlations are rare when dealing with
people, that is not necessarily the
case elsewhere. Mathematicians, for example, enjoy the
stability of perfect relationships; the
formula for the area of a circle, $A = \pi r^2$ (where the area is
found by multiplying the value of pi
by the square of the radius), works for circles of any size
because a perfect relationship exists
between a circle’s radius and its area.
Still, even imperfect correlations, such as those related to
human-subjects research, can be
very important. If health professionals know a correlation, even
a weak one, exists between
exposure to secondhand smoke and the later development of
respiratory problems, they can
warn against such exposure. In that particular instance, by the
way, the research supports the
causal assumption. If educators know there is a correlation
between how much homework
students do and their success on a high school exit exam,
educators can encourage students to
complete more assignments. The instructors expect that pass
rates will rise as a consequence.
In the case of homework and exit exam scores, however, a
causal relationship is not as clear.
Perhaps people who have a higher level of academic
achievement do more homework and
have higher exit exam scores. That suggests the academic
achievement is the causal element
rather than the homework. Maybe the increased homework is the
manifestation of that other
variable, academic achievement, or perhaps parental
involvement is the causal factor—stu-
dents whose parents are directly involved in their schooling do
more homework and prepare
for their exit exams with greater care.
The Amount of Scatter
The amount of scatter in a scatterplot, the degree to which the
points in the scatterplot stray
from a straight line, suggests weakness in the correlation.
Scatterplots graphed for strong
correlations have very little scatter. The points appear to line
up.
What Correlations Provide
Calculating a correlation involves quantifying the strength of
the relationship between the
variables involved. Correlation values, or coefficients, range
from −1.0 to +1.0. Correlation
values of either −1.0 or +1.0 indicate perfect relationships. With
positive correlations, as the
value of one variable increases, so does the value of the other—
more verbal reinforcement of
subjects in a test of problem-solving ability is probably
associated with more effort expended
by the subject. With negative correlations, as the value
of one variable increases, the other decreases—more
involvement with video-gaming while a text passage
is read to subjects is probably associated with lower
retention of the details of the text passage; as the value
of one increases, the value of the other declines. A cor-
relation of 0 indicates no relationship—fluctuations in
the value of one variable are unrelated to changes in the
value of the other. Values whose absolute value is less than
1.0 but greater than 0 indicate imperfect relationships, with
the strength of the relationship
declining as the value approaches 0.
Correlating two variables does not require that they both
measure the same characteristic
or even that they both be gathered from the same subjects.
Often, entirely different kinds
of things are correlated. The example of secondhand smoke and
respiratory issues involves
two completely different variables, but the strength of the
relationship between them can be
calculated nevertheless. As long as the two variables can be
quantified—reduced to a num-
ber—the strength of any relationship can be determined.
Requirements for the Pearson Correlation
Researchers may employ any of several different correlation
procedures. The appropriate
procedure for a particular problem is determined by
characteristics such as the scale and
normality of the data involved. The Pearson correlation, for
example, requires variables of
either interval or ratio scale. Nominal or ordinal scale data can
be correlated as well, but they
involve other correlation procedures. In addition to interval or
ratio data, the Pearson corre-
lation also requires the following:
• In their populations, the characteristics are assumed to be
normally distributed. Normal distributions are seldom clearly reflected
in relatively small samples, but
researchers must have reason to believe that the samples come
from populations
that are normal.
• The populations from which the samples come must be
similarly distributed.
• The two samples are assumed to be randomly selected
from their populations.
• The relationship between the variables must be linear; it
remains constant throughout
their ranges.
Recall that normality is indicated when the standard deviation is
about one-sixth of the range,
the measures of central tendency all have about the same value,
and so on (Chapter 2). The
way data are distributed in the scatterplot also suggests the
normality of the two variables
involved in a correlation. When both variables are normal, the
points in the plot will be dis-
tributed from left to right, with the frequency of the points
gradually increasing toward the
middle of the graph and then gradually decreasing to the right
extreme. If the relationship is
positive (example A in Figure 8.2), the scatter is generally from
lower left to upper right. If
it is negative (example B in Figure 8.2), the graph follows a
pattern from upper left to lower
right. If the variables have no correlation (example C in Figure
8.2), the points fall into a circular
pattern in the middle of the graph with the greatest density
at the circle’s center. The
greater frequency in the middle of the circle
reflects the fact that most of the data in any
normal distribution occur near the middle of
the distribution. (The pattern in our example
does not look circular because so few data are
present.)
Try It!: #2
If two variables are normally distributed
but uncorrelated, what pattern will their
data points make in a scatterplot?
The similar-distribution requirement does
not mean that the standard deviations should
be the same. That is not likely to happen
unless both variables are measured along the
same range. It means that the standard devia-
tions should account for similar proportions
of their respective ranges.
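As a rough illustration of the similar-distribution requirement, the following Python sketch compares each variable’s standard deviation with its range for the Table 8.1 data. The helper name `sd_to_range_ratio` is hypothetical, and the sample standard deviation (with n − 1 in the denominator) is assumed.

```python
from statistics import stdev

# Raw scores from Table 8.1
verbal = [20, 35, 42, 48, 55, 60, 63, 66, 72, 76, 78, 85]
intelligence = [80, 95, 90, 100, 100, 100, 110, 115, 120, 115, 110, 125]

def sd_to_range_ratio(scores):
    """Sample standard deviation expressed as a proportion of the range."""
    return stdev(scores) / (max(scores) - min(scores))

verbal_ratio = sd_to_range_ratio(verbal)
intelligence_ratio = sd_to_range_ratio(intelligence)

# Prints approximately 0.296 0.292
print(round(verbal_ratio, 3), round(intelligence_ratio, 3))
```

The two standard deviations differ (about 19.3 versus 13.1 points), but each accounts for nearly the same proportion of its own range, which is the sense of "similarly distributed" described above.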
The strength of a correlation is affected by
range attenuation. When the range of scores
in either variable is artificially abbreviated,
the correlation value will be artificially low.
Range attenuation can be indicated by a stan-
dard deviation that is substantially smaller
than we know it to be in the population. If
we were correlating intelligence scores with
reading comprehension, and the intelligence
scores have a standard deviation of 8 points
when we know that the population standard
deviation is 15 points, we can expect any resulting correlation
value to be artificially low. One
of the advantages of random selection is that random samples of
a reasonable size tend to
mirror their populations reasonably well. Range restriction
problems are much less likely to
occur with randomly selected samples.
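The effect of range attenuation can be sketched with the Table 8.1 data. The example below is only an illustration under an assumed restriction (keeping participants 4 through 8, whose verbal-ability scores fall between 48 and 66); the function name `pearson_r` is a hypothetical helper, not something defined in this chapter.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from raw-score sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Raw scores from Table 8.1
verbal = [20, 35, 42, 48, 55, 60, 63, 66, 72, 76, 78, 85]
intelligence = [80, 95, 90, 100, 100, 100, 110, 115, 120, 115, 110, 125]

full_r = pearson_r(verbal, intelligence)

# Assumed restriction for illustration: only the middle of the verbal range
restricted_r = pearson_r(verbal[3:8], intelligence[3:8])

# Prints approximately 0.938 0.798
print(round(full_r, 3), round(restricted_r, 3))
```

With the full range of scores the correlation is about 0.94; with the abbreviated range it drops to about 0.80, even though the underlying relationship has not changed.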
Linear and Nonlinear Correlations
When the relationship between two variables is linear, it means
that the degree to which they
change in concert with each other is the same throughout their
ranges; if it is low and posi-
tive, it is low and positive at low levels of both variables and at
higher levels of both variables.
Some correlations, however, are not linear. Consider the
correlation between anxiety and the
quality of a musician’s performance. In that instance, a little
anxiety is probably a good thing.
It prompts the individual to prepare for the performance by
practicing, studying the music
carefully, asking others for feedback, and so on. Without
anxiety, the musician might not make
the necessary preparations. It seems likely that, at least in the
early going, the quality of the
performance improves as anxiety increases.
Figure 8.2: Scatterplots for positive,
negative, and zero correlations (panels: A. a positive correlation;
B. a negative correlation; C. a zero correlation)
But it is possible that if anxiety continues to increase, the
individual’s performance may reach
a plateau and then begin to diminish. The musician may become
so anxious that concentration
is difficult and performance declines, with more anxiety
actually depreciating the quality
of the music. These conditions describe a relationship that is
curvilinear. It is illustrated in
Table 8.2, where anxiety is gauged as a function of someone’s
increasing pulse rate in beats
per minute. The quality of the musician’s performance is
represented by the judgment of a
trained observer, with higher values indicating a more virtuoso
performance. If scores were
awarded every 5 minutes during a 65-minute performance, the
data are as follows:
Table 8.2: Study results of anxiety versus quality of a
musician’s performance
Anxiety 52 54 58 62 64 67 72 73 75 78 82 86 88
Performance quality 3 5 6 6 8 8 9 7 5 5 4 3 1
Figure 8.3 shows the scatterplot illustrating the relationship
between the musician’s anxiety
and the quality of the musician’s performance.
Figure 8.3: The relationship between performance quality and
anxiety
Try It!: #3
What impact does range attenuation have
on a correlation?
Initially, there is a positive relationship between anxiety and the
quality of the music. The
first few pairs of data have points that rise from left to right.
However, the positive relationship
becomes negative when performance begins to diminish as
anxiety increases. Viewed as a
whole, the correlation is curvilinear. After performance
reaches the judge’s high of 9, more anxiety is not asso-
ciated with better music.
The scatterplot also reveals some of the danger associated
with range restriction. If someone collected data so
that only the first six pairs of scores made up the sample,
those scores would provide very different indicators of the relationship
between anxiety and perfor-
mance than the last six pairs of scores. The first part of the
distribution makes the relation-
ship look linear and positive. The latter part of the data makes
the relationship look linear but
negative. An accurate picture of the relationship requires data
throughout the entire ranges
of the two variables.
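The point about partial ranges can be checked numerically with the Table 8.2 data. The sketch below assumes a hypothetical helper, `pearson_r`, and simply computes the correlation three times: for the first six pairs, for the last six pairs, and for all 13 pairs.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from raw-score sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from Table 8.2
anxiety = [52, 54, 58, 62, 64, 67, 72, 73, 75, 78, 82, 86, 88]
quality = [3, 5, 6, 6, 8, 8, 9, 7, 5, 5, 4, 3, 1]

first_six = pearson_r(anxiety[:6], quality[:6])
last_six = pearson_r(anxiety[-6:], quality[-6:])
overall = pearson_r(anxiety, quality)

# Prints approximately 0.94 -0.95 -0.35
print(round(first_six, 2), round(last_six, 2), round(overall, 2))
```

The two halves give strong correlations of opposite sign, while the Pearson value for the whole data set is weak. A linear coefficient applied to a curvilinear relationship understates how closely the two variables are tied.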
Understanding Correlation Values
It is important not to confuse the sign of the correlation (+ or −)
with its strength. A correlation
of −0.50 contains the same amount of information about
the two variables as does a
correlation of +0.50. The sign makes a great deal of difference
in how the relationship is interpreted,
but it has nothing to do with the strength of the
relationship. With positive correlations,
as the value of one variable increases, so does the value of
the other. When correlations
are negative, increasing values of one variable are associated
with decreasing values of the
other.
Earlier we noted that different scales of data require different
types of correlation proce-
dures. The number of variables involved also dictates the need
for different correlation
procedures:
• Bivariate correlations indicate the relationship between
two variables. For exam-
ple, the correlation between intelligence and verbal aptitude is a
bivariate correla-
tion. This chapter focuses on bivariate correlations.
• Multiple correlation gauges the relationship between one
variable and a combina-
tion of others. For example, the correlation between a combined
reading compre-
hension and vocabulary measure with an analytical-ability
measure would indicate
how well reading comprehension and vocabulary ability,
combined, correlate with
analytical ability.
• Canonical correlation measures the relationship between
two groups of variables.
For example, determining how a combination of reading
comprehension and vocab-
ulary ability and a combination of analytical ability and
problem-solving ability
relate calls for a canonical correlation.
• Partial correlation measures the relationship between two
variables after neu-
tralizing the influence of some third variable on both of the first
two. For example,
a correlation of analytical ability with problem-solving ability,
with the influence of
age controlled in both of the other variables, eliminates age
differences as a factor
in the resulting correlation. In effect, a partial correlation would
be the correlation
of analytical ability with problem-solving ability as if all
subjects were the same
age.
• Semipartial correlation gauges the relationship between
two variables after neu-
tralizing the influence of a third on either of the first two. For
example, a correlation
of intelligence with verbal aptitude, with age differences
controlled in the verbal-
aptitude variable, is a semipartial correlation. Age would not be
controlled in the
intelligence variable. (This makes some sense since intelligence
is often argued to be
a stable variable across age differences in the individual.)
Only the bivariate correlations are covered here. The others are
beyond the scope of this book
but are described here very simply, so that the reader has a
sense of where bivariate correla-
tions fit into the broader discussion of these procedures.
8.2 Calculating the Pearson Correlation
Formally called the Pearson product-moment correlation
coefficient, the Pearson correlation,
or—because its symbol is typically a lowercase r—“Pearson’s
r,” is probably the most often
calculated of any correlation value. Thumbing through statistics
books and glancing at online
sources reveals several formulas. All provide the same answer,
but some are easier to com-
plete than others. Visually, at least, Formula 8.1 is probably
simplest:
Formula 8.1
$$r_{xy} = \frac{\sum \left[(z_x)(z_y)\right]}{n - 1}$$
Note that the r symbol has x and y subscripts. These indicate
that the procedure correlates
two variables designated x and y. Which variable is assigned x
and which y is unimportant,
since correlation does not presume that the x variable causes y,
for example. Formula 8.1
indicates that if the x and y scores are transformed into z scores
(Formula 3.1: $z = \frac{x - M}{s}$), the
value of $r_{xy}$ (the correlation value) is the sum of the products
of the x and y z scores for each
participant, divided by one less than the number of participants
in the data group (the number of pairs, not the number
of individual scores).
The n − 1 signifies that this is a correlation formula for sample,
rather than population, data.
It is the same adjustment for sample data made with the
standard deviation calculation in
Chapter 1. Formula 8.1 can be used to calculate the correlation
value of the verbal ability
and intelligence scores from the earlier example. Calculating
the equivalent verbal-ability and
intelligence z values with Formula 3.1 produces the z values for
the original raw scores listed
in Table 8.3.
Here, each pair of z scores is multiplied and the products
summed:
$$(-1.991 \times -1.902) + (-1.212 \times -0.761) + \cdots + (1.385 \times 1.522) = 10.313$$
This provides the numerator to be used in the formula
$$r_{xy} = \frac{\sum \left[(z_x)(z_y)\right]}{n - 1}$$
Then, for the denominator, n (the number of pairs of scores) =
12, so n − 1 = 11. Therefore,
substituting these values into the above equation gives
$$r_{xy} = \frac{10.313}{11} = 0.938$$
Table 8.3: z values
Verbal ability (x): −1.991 −1.212 −0.848 −0.537 −0.173 0.087 0.242 0.398 0.710 0.917 1.021 1.385
Intelligence (y): −1.902 −0.761 −1.141 −0.380 −0.380 −0.380 0.380 0.761 1.141 0.761 0.380 1.522
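The z-score calculation above can be reproduced with a short Python sketch. It assumes the sample standard deviation (n − 1 in the denominator) when converting to z scores; the helper name `z_scores` is hypothetical.

```python
from statistics import mean, stdev

# Raw scores from Table 8.1
verbal = [20, 35, 42, 48, 55, 60, 63, 66, 72, 76, 78, 85]
intelligence = [80, 95, 90, 100, 100, 100, 110, 115, 120, 115, 110, 125]

def z_scores(scores):
    """Convert raw scores to z scores using the sample standard deviation."""
    m, s = mean(scores), stdev(scores)
    return [(x - m) / s for x in scores]

z_x = z_scores(verbal)
z_y = z_scores(intelligence)

# Formula 8.1: sum the products of paired z scores, then divide by n - 1
sum_of_products = sum(zx * zy for zx, zy in zip(z_x, z_y))
r_xy = sum_of_products / (len(verbal) - 1)

# Prints approximately 10.313 0.938
print(round(sum_of_products, 3), round(r_xy, 3))
```

The first z scores produced this way (about −1.991 and −1.902) match the first column of Table 8.3.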
With a maximum possible correlation value of 1.0, $r_{xy} = 0.938$
indicates a strong relationship
between verbal ability and intelligence, something that is
reflected in the fact that many intel-
ligence tests include subtests of verbal ability.
Although Formula 8.1 is visually simple, the need to transform
everything into z scores before
calculating $r_{xy}$ makes the calculations time consuming and
tedious when they are done
by hand. Formula 8.2, the
formula we will use, turns out to be
the formula programmed into many hand-calculators. It is
visually more complex but much
easier to execute:
Formula 8.2
$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$
where
x = one of the scores in each pair, as above in the z score
formula.
y = the other score in the pair.
n = the number of participants (the number of pairs of scores).
$\sum xy$ indicates that each pair of scores is multiplied and then the
products for each pair
summed. The resulting value is the “sum of the cross-products.”
$\sum x^2$ indicates that each x score is squared, and then the squares
summed.
$(\sum x)^2$ indicates that the original x scores are totaled, and then
the total is squared.
$\sum y^2$ indicates that each y score is squared, and then the squares
summed.
$(\sum y)^2$ indicates that the original y scores are totaled, and then
the total is squared.
The formula is not as daunting as it appears. The
process will become familiar after a few problems.
Probably Excel or a hand-calculator with a built-in
correlation function will perform most of the statis-
tical “heavy-lifting,” but it is helpful to prepare for
that occasional time when there is no computer and
the calculator has no correlation function.
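For readers who wonder why Formulas 8.1 and 8.2 give the same answer, the following sketch traces one path from the first to the second. The symbols $M_x$, $M_y$, $s_x$, and $s_y$ (the means and sample standard deviations of x and y) are notation assumed here for the derivation; they do not appear in the formulas above.

$$
\begin{aligned}
r_{xy} &= \frac{\sum \left[(z_x)(z_y)\right]}{n - 1}
        = \frac{\sum (x - M_x)(y - M_y)}{(n - 1)\, s_x s_y} \\
       &= \frac{\sum (x - M_x)(y - M_y)}{\sqrt{\sum (x - M_x)^2 \, \sum (y - M_y)^2}} \\
       &= \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]\left[\sum y^2 - \frac{(\sum y)^2}{n}\right]}} \\
       &= \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
\end{aligned}
$$

The second line uses $s_x = \sqrt{\sum (x - M_x)^2 / (n - 1)}$, so the two factors of $\sqrt{n - 1}$ cancel the $n - 1$ in the denominator. The last line multiplies the numerator and the denominator by n.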
A Correlation Example
A researcher is duplicating a classic experiment by
psychologist E. L. Thorndike. The experiment relates
to Thorndike’s Law of Effect, which maintains that
behaviors followed by a satisfying state of affairs will
likely be repeated. In the experiment, the researcher
sets up a cage equipped with a door that opens if a
cat placed in the cage bats a string suspended inside
the cage. According to the law of effect, if batting the string is
followed by something satisfying,
that behavior should occur more frequently in future trials than
other behaviors. A hungry cat
is placed in the cage and food placed outside where it is
inaccessible from the inside of the cage.
Data comprise the several trials and the elapsed time, in
minutes, before the cat releases itself.
This experiment is repeated 10 times over as many days. Table
8.4 lists the data.
Table 8.4: Experimental results from cat behavioral study
Trial number 1 2 3 4 5 6 7 8 9 10
Elapsed time 5.0 5.5 4.75 4.5 4.25 3.5 2.75 2.0 1.0 0.25
Figure 8.4 shows the scatterplot for these data, which suggests
that the relationship is prob-
ably negative and quite strong.
Figure 8.4: The relationship between number of trials and
elapsed time
The correlation value checks both conclusions. To determine the
correlation, we use
Formula 8.2:
$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$
The number of trials (n) = 10. The researcher can then verify
that
$\sum xy = 137.25$
$\sum x^2 = 385$
$(\sum x)^2 = (55)^2$
$\sum y^2 = 141$
$(\sum y)^2 = (33.5)^2$
Substituting the relevant values gives
$$
\begin{aligned}
r_{xy} &= \frac{10(137.25) - (55)(33.5)}{\sqrt{\left[10(385) - (55)^2\right]\left[10(141) - (33.5)^2\right]}} \\
       &= \frac{1372.5 - 1842.5}{\sqrt{(3850 - 3025)(1410 - 1122.25)}} \\
       &= \frac{-470}{\sqrt{825 \times 287.75}} \\
       &= -0.965
\end{aligned}
$$
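The same substitution can be checked in a few lines of Python. This is a minimal sketch of Formula 8.2 applied to the Table 8.4 data; the variable names are chosen here for illustration.

```python
from math import sqrt

# Data from Table 8.4
trials = list(range(1, 11))  # x: trial number 1 through 10
elapsed = [5.0, 5.5, 4.75, 4.5, 4.25, 3.5, 2.75, 2.0, 1.0, 0.25]  # y: minutes

n = len(trials)
sum_x = sum(trials)
sum_y = sum(elapsed)
sum_xy = sum(x * y for x, y in zip(trials, elapsed))
sum_x2 = sum(x * x for x in trials)
sum_y2 = sum(y * y for y in elapsed)

# Formula 8.2
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r_xy = numerator / denominator

print(sum_xy, sum_x2, sum_y2)   # 137.25 385 141.0
print(round(r_xy, 3))           # approximately -0.965
```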
Interpreting Results
The relationship is indeed negative, and because the maximum
correlation is ±1.0, the relationship
is also very strong. Neither of those conclusions
indicates whether the result is sta-
tistically significant, however. As with z, t, and F, significance
is determined by comparing
the calculated value to the table value indicated by the relevant
degrees of freedom and the
selected level of probability. A calculated correlation value whose
absolute value is at least as
large as the table value is one that probably did not occur by chance. For the
Pearson correlation, the values are
in Table 8.5 (see also Table B.5 in Appendix B).
Like the t and F values, the correct critical value for r is
determined by degrees of freedom
and by the level of probability the researcher selects. The
degrees of freedom for a Pearson
correlation are the number of pairs of data, minus 2. Be careful
not to confuse the number of
pairs with the number of scores.
The probability values in Table 8.5 indicate the absolute value
that the calculated $r_{xy}$ must
reach to be confident that the correlation did not occur by
chance. The level of confidence in
that conclusion is indicated by the columns for p = 0.1, p =
0.05, and p = 0.01. To have some
practice interpreting the values, note the following:
• If a correlation were calculated for $n = 7$ pairs of data (which means that $df = 5$) and the result was $r_{xy} = \pm 0.669$, there is 1 chance in 10, or in other words $p = 0.1$, that the correlation occurred by chance. A chance, or random, correlation means that if new data were collected and the $r_{xy}$ value calculated a second time, it would probably be less than the table value.
• If the researcher wants more assurance against a random correlation, $r_{xy} = \pm 0.754$ (also with 5 degrees of freedom) will occur by chance just 5 times in 100 ($p = 0.05$) and $r_{xy} = \pm 0.875$ will occur by chance just 1 time in 100 ($p = 0.01$).
Researchers most commonly settle on $p = 0.05$ or $p = 0.01$. The $p = 0.1$ level occurs in statistical tables less often because in most research settings, a one-in-ten chance of a random correlation is too great. No one wants to conclude that a correlation is statistically significant when there is too much chance that the finding will not hold up under further investigation. In exploratory or descriptive research when there is little prior research on which to rely, however, investigators will sometimes relax the probability to $p = 0.1$.
Table 8.5: The critical values of $r_{xy}$ (the lowest statistically significant correlation for the specified probability)
Number of xy pairs (n)   df (n − 2)   p = 0.10   p = 0.05   p = 0.01
3 1 0.988 0.997 1.000
4 2 0.900 0.950 0.990
5 3 0.805 0.878 0.959
6 4 0.729 0.811 0.917
7 5 0.669 0.754 0.875
8 6 0.621 0.707 0.834
9 7 0.582 0.666 0.798
10 8 0.549 0.632 0.765
11 9 0.521 0.602 0.735
12 10 0.497 0.576 0.708
13 11 0.476 0.553 0.684
14 12 0.458 0.532 0.661
15 13 0.441 0.514 0.641
16 14 0.426 0.497 0.623
17 15 0.412 0.482 0.606
18 16 0.400 0.468 0.590
19 17 0.389 0.456 0.575
20 18 0.378 0.444 0.561
21 19 0.369 0.433 0.549
22 20 0.360 0.423 0.537
23 21 0.352 0.413 0.526
24 22 0.344 0.404 0.515
25 23 0.337 0.396 0.505
Source: Brighton Webs Ltd. (2006). Critical values of correlation coefficient (R). Statistics for Energy and the Environment. Retrieved from https://web.archive.org/web/20110117193722/http://www.brighton-webs.co.uk/tables/critical_values_r.asp
The Relationship Between Degrees of Freedom and Significance
Even with a correlation value as extreme as −0.965, checking the table for significance is important. In both the t test and ANOVA, the magnitude of the critical values declines as degrees of freedom (and sample size) increase. It is the same with correlation, but here the decline in critical values is more dramatic. Note from the table, for example, that if $n = 3$ (and therefore $df = 1$), the correlation would need to be at least $r_{xy} = 0.997$ (nearly perfect) to be statistically significant. The related point is that with only three pairs of data, the potential for a random relationship that looks significant is very high. At the other extreme, if $n = 25$ (so that $df = 23$), a correlation of just $r_{xy} = 0.396$ is statistically significant. That much data bears a much lower potential for an accidental (random) relationship.
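The table check described above can be sketched as a lookup. The dictionary below copies a handful of $p = 0.05$ entries from Table 8.5. The names `CRITICAL_R_05` and `is_significant` are illustrative, not part of any library.

```python
# Critical values of r at p = 0.05, copied from Table 8.5 for selected df.
CRITICAL_R_05 = {1: 0.997, 5: 0.754, 8: 0.632, 13: 0.514, 14: 0.497, 23: 0.396}

def is_significant(r, n_pairs, table=CRITICAL_R_05):
    """Return True when |r| reaches the critical value for df = n_pairs - 2."""
    df = n_pairs - 2
    return abs(r) >= table[df]

# The same correlation can be significant with 25 pairs but not with 10.
print(is_significant(-0.965, 10))  # trials and elapsed time, df = 8
print(is_significant(0.396, 25))   # df = 23
print(is_significant(0.396, 10))   # df = 8
```

The comparison uses the absolute value of r because the table lists magnitudes only, and the sign plays no part in the significance decision.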
The Statistical Hypotheses
The null and alternate hypotheses for correlation reflect the fact that we have moved away from the hypothesis of difference. The null hypothesis is that no relationship between the variables exists. Symbolically, it is written $H_0: \rho = 0$.
The symbol $\rho$ is the Greek letter rho (as in “row” your boat) and the population equivalent of $r$. So the null hypothesis states that the correlation equals 0. More specifically, it means that there is no statistically significant relationship. The alternate hypothesis states that the correlation does not equal 0, that a statistically significant relationship will emerge each time data are collected and the relationship calculated: $H_A: \rho \neq 0$.
The Coefficient of Determination
One of our important recurring themes is the distinction
between statistical significance and
practical importance. Determining practical importance was the
reason for omega-squared
and eta-squared calculations for significant t test and ANOVA
results, respectively.
Effect sizes take on particular importance with correlation because with large samples, relatively small correlations can be statistically significant. The effect size corresponding to the Pearson correlation is the coefficient of determination ($r_{xy}^2$). As the notation suggests, the coefficient of determination is the square of the correlation coefficient. Squaring the correlation indicates how much of the variance in y is explained by x (or vice versa since correlation does not assume cause).
In the problem about number of trials and elapsed time, $r_{xy} = -0.965$, so $r_{xy}^2 = 0.931$. For that problem, the coefficient of determination is interpreted this way: the number of trials can explain about 93% of the variance in time elapsed, which would be a very important finding with implications for many kinds of performance tasks, except that the numbers were contrived.
The Interpretive Value of $r_{xy}^2$
The coefficient of determination can also indicate how unimportant some low correlations are, even when they are statistically significant. For example, with 23 degrees of freedom, a correlation of $r_{xy} = 0.396$ is statistically significant. The coefficient of determination for that value is $r_{xy}^2 = 0.157$. One variable in such a relationship explains just 16% of the variance in the other. The other 84% of the variability is related to other factors.
When the variables describe the behavior of people, small coefficients of determination do not surprise us because they are part of human subjects’ complexity. Very few individual variables will explain large proportions of human behavior.
Sometimes, however, even low correlations and low $r_{xy}^2$ values are important. If research revealed that the correlation between the age of first exposure to illegal narcotics and the development of an addiction was $r_{xy} = -0.3$, that value (note the negative correlation) indicates that the younger subjects are at first exposure, the more likely they are to develop an addiction. The resulting $r_{xy}^2$ value would be just 0.09. But even if just 9% of the variance in addiction is explained by age at first exposure, within the context of human complexity that would be considered important. Practical importance is a function of consequences.
Comparing Correlation Values
In isolation, correlation coefficients can be difficult to interpret because correlation strength does not increase or decrease in consistent increments. The change from $r_{xy} = 0.2$ to $r_{xy} = 0.3$ is a less dramatic increase in strength than the increase from $r_{xy} = 0.75$ to $r_{xy} = 0.85$, for example. Although the Pearson r requires equal interval data, in the coefficients that are the result, an increase in correlation strength of 0.1 reflects a very different change from 0.8 to 0.9 than it does from 0.2 to 0.3. It takes a much stronger increase in the relationship to increase by 0.1 in the upper ranges of correlation values than in the lower ranges, something suggested by the distance between tenths in this number line:
$r_{xy}$ = 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Squaring the correlation coefficient makes the intervals consistent. A change in the coefficient of determination from 0.35 to 0.5, for example, represents the same increase in proportion of variance explained as an increase from 0.7 to 0.85, as the line suggests:
$r_{xy}^2$ = 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
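A quick calculation makes the uneven spacing concrete. The sketch below squares the endpoints of the two ranges mentioned in the text and compares the gain in variance explained. The helper name `variance_explained_gain` is illustrative.

```python
def variance_explained_gain(r_low, r_high):
    """Increase in the coefficient of determination between two correlations."""
    return r_high ** 2 - r_low ** 2

low_range = variance_explained_gain(0.2, 0.3)     # 0.09 - 0.04
high_range = variance_explained_gain(0.75, 0.85)  # 0.7225 - 0.5625

print(round(low_range, 4), round(high_range, 4))
```

The same 0.1 step in r adds about 5 percentage points of explained variance in the lower range and about 16 in the upper range, which is why the squared values are the easier ones to compare.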
Another Correlation Problem
A foundation interested in what prompts contributions to charitable causes retains a consultant. Noting that age varies with donation, the consultant gathers the data in Table 8.6 and generates the values in Problem 8.1.
Table 8.6: Data on charity donations
Donor: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Age: 25 27 32 32 35 38 43 45 45 47 48 52 63 65 66
Amount: 20 20 35 25 100 50 75 45 100 150 100 200 50 100 125
Problem 8.1: The Pearson correlation for contributor’s age and contribution amount
Donor’s age ($x$)   $x^2$   Contribution amount ($y$)   $y^2$   $xy$
25 625 20 400 500
27 729 20 400 540
32 1,024 35 1,225 1,120
32 1,024 25 625 800
35 1,225 100 10,000 3,500
38 1,444 50 2,500 1,900
43 1,849 75 5,625 3,225
45 2,025 45 2,025 2,025
45 2,025 100 10,000 4,500
47 2,209 150 22,500 7,050
48 2,304 100 10,000 4,800
52 2,704 200 40,000 10,400
63 3,969 50 2,500 3,150
65 4,225 100 10,000 6,500
66 4,356 125 15,625 8,250
$\sum x = 663$   $\sum x^2 = 31{,}737$   $\sum y = 1{,}195$   $\sum y^2 = 133{,}425$   $\sum xy = 58{,}260$
The correlation of the donor’s age and the contribution amount is calculated as follows:

$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{15(58{,}260) - (663)(1{,}195)}{\sqrt{[15(31{,}737) - (663)^2][15(133{,}425) - (1{,}195)^2]}} = \frac{81{,}615}{\sqrt{(36{,}486)(573{,}350)}} = 0.564$$
• The critical value at $p = 0.05$ and 13 df ($r_{0.05(13)}$) is 0.514.
• Because $r_{xy} > r_{0.05(13)}$, the correlation is statistically significant.
• The coefficient of determination is $r_{xy}^2 = 0.318$, which indicates that age can explain about 32% of the variability in donation amount.
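Problem 8.1 can also be reproduced from the raw data in Table 8.6. The sketch below uses the deviation-score form of the Pearson formula, which is algebraically equivalent to Formula 8.2 but is not the form the chapter works by hand. The function name `pearson_r_deviation` is illustrative.

```python
from math import sqrt

ages = [25, 27, 32, 32, 35, 38, 43, 45, 45, 47, 48, 52, 63, 65, 66]
amounts = [20, 20, 35, 25, 100, 50, 75, 45, 100, 150, 100, 200, 50, 100, 125]

def pearson_r_deviation(xs, ys):
    """Pearson r from deviation scores; algebraically equal to Formula 8.2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    return cross / sqrt(sum(dx * dx for dx in dev_x) * sum(dy * dy for dy in dev_y))

r = pearson_r_deviation(ages, amounts)
r_squared = r ** 2                 # coefficient of determination
significant = abs(r) > 0.514       # critical value for df = 13, p = 0.05 (Table 8.5)

print(round(r, 3), round(r_squared, 3), significant)
```

Agreement between this result and the hand calculation is a useful check on the column sums, because an error in any squared value would make the two disagree.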
Problem 8.1 suggests some of the hazard in rushing to judgment about cause from correlation data. While we might be tempted to reduce the problem to “older people contribute more to charity than younger people,” other factors are probably at work, not the least of which is that age likely correlates with income as well. Perhaps it is not age that explains contribution amount so much as income. The correlation value, while instructive and important, indicates only how variables co-vary, not necessarily why the variables involved vary.
8.3 Correlating Data When One Variable Is Dichotomous
If the consultant had asked how the donation amount and the donor’s gender relate, Pearson still provides the answer, but the procedure becomes a point-biserial correlation. The word point refers to the continuous variable, the amount of money donated in this example. The word biserial refers to the other variable, which has only two levels. The required change is coding the gender variable in a way that reflects its dichotomy: as either 0 or 1. Whether females or males are coded 0 will not affect the strength of the coefficient.
The point-biserial correlation has a number of applications. Questions about the relationship between marital status and income, between public versus private school students and achievement, or between Republicans’ and Democrats’ optimism are all questions that could be analyzed with point-biserial correlation.
In point-biserial correlations, which level is coded 0 and which 1 affects only the sign of the coefficient. We will need to be careful when interpreting the result. If donors 3, 5, 6, 7, 9, 10, 11, and 14 are female, and if females are coded 1 and males 0, the researcher obtains the data in Table 8.7.
Table 8.7: Data on charity donations by donor type (gender)
Donor (x) 0 0 1 0 1 1 1 0 1 1 1 0 0 1 0
Amount (y) 20 20 35 25 100 50 75 45 100 150 100 200 50 100 125
Calculating the Point-Biserial Correlation
The amounts donated (the y values) remain the same from the age/donor problem (Problem 8.1, where $\sum y = 1{,}195$ and $\sum y^2 = 133{,}425$). The other values must be recalculated, although that task becomes much simpler with gender (x) recoded to 1s and 0s. Table 8.8 lists those results.
Try It!: #4
What is the relationship between degrees
of freedom and statistical significance in
correlation?
Table 8.8: Point-biserial correlation results
Gender ($x$)   $x^2$   Amount ($y$)   $y^2$   $xy$
0 0 20 400 0
0 0 20 400 0
1 1 35 1,225 35
0 0 25 625 0
1 1 100 10,000 100
1 1 50 2,500 50
1 1 75 5,625 75
0 0 45 2,025 0
1 1 100 10,000 100
1 1 150 22,500 150
1 1 100 10,000 100
0 0 200 40,000 0
0 0 50 2,500 0
1 1 100 10,000 100
0 0 125 15,625 0
$\sum x = 8$   $\sum x^2 = 8$   $\sum y = 1{,}195$   $\sum y^2 = 133{,}425$   $\sum xy = 710$
Return to Formula 8.2, in which

$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$

Substituting in the values from Table 8.8 gives

$$r_{xy} = \frac{15(710) - (8)(1{,}195)}{\sqrt{[15(8) - (8)^2][15(133{,}425) - (1{,}195)^2]}} = 0.19$$
Still testing at $p = 0.05$ and with the degrees of freedom still $df = 13$, from Table 8.5 the critical value is still $r_{0.05(13)} = 0.514$. Therefore the statistical decision will be to fail to reject $H_0$. The relationship between the donor’s gender and the amount contributed is not statistically significant. The $r_{xy} = 0.19$ result is probably a random correlation that is unlikely to reach the critical value from the table in any new analysis with new subjects.
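Because the point-biserial correlation is simply Formula 8.2 applied to a 0/1 variable, the same routine handles it. The sketch below recomputes the result from Table 8.7 and then swaps the coding to show that only the sign changes. The function and variable names are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Raw-score Pearson correlation (Formula 8.2)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )

# Table 8.7: females coded 1, males coded 0.
female_coded_1 = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0]
amounts = [20, 20, 35, 25, 100, 50, 75, 45, 100, 150, 100, 200, 50, 100, 125]

# Reverse the coding so that males are 1 and females are 0.
male_coded_1 = [1 - g for g in female_coded_1]

r_female = pearson_r(female_coded_1, amounts)
r_male = pearson_r(male_coded_1, amounts)
print(round(r_female, 2), round(r_male, 2))
```

The magnitude is identical under either coding, so the significance decision against the table value does not depend on which group is labeled 1.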
The interpretation of the point-biserial correlation is the same as it is for conventional Pearson correlations, except that the sign of the coefficient is a function only of which level is coded 1. If male donors had been coded with 1s, the correlation would have been negative, $r_{xy} = -0.19$. Consider a few more applications for the point-biserial correlation:
• What is the relationship between whether or not a parent
earned a college degree
and the child’s grades?
• How is whether or not a student is a native speaker of
English related to the
student’s test score?
• What is the correlation between blue-collar/white-collar
jobs and the amount of
leisure time?
If both variables are dichotomous, another bivariate correlation
is involved. It is called the phi
coefficient, discussed in Chapter 10.
Degrees of Significance?
At $r_{xy} = 0.19$ and a table value of $r_{0.05(13)} = 0.514$, the correlation value is not significant. If the value had been $r_{xy} = 0.50$, and this correlation value represented some relationship calculated for your senior thesis, would it be appropriate to refer to it as “almost significant” or “nearly significant”? It is not uncommon to see such qualifiers even in the published literature, but significance decisions should be treated the same way as dichotomous variables. Only two outcomes are possible: The correlation is significant or it is not significant. To try to make a statement about the nearness to an alternative outcome undermines the principle behind significance testing. Only two hypotheses for significance exist, and the outcome is couched in terms of one or the other.
8.4 The Pearson Correlation in Excel
A psychologist is interested in determining the relationship
between risk-taking and success
solving novel problems. Having devised the Inventory Risk
Survey Catalog (the I-RiSC), the
psychologist gauges the willingness of a group of 16-year-olds
to do the unconventional and
then provides a series of word problems with which the
participants are unfamiliar. Scores on
the I-RiSC and the problems for 10 participants are listed in
Table 8.9.
Table 8.9: Risk-taking and problem-solving success data
I-RiSC: 2 7 4 5 1 8 7 9 3 6
Problems: 14 17 14 16 12 17 16 17 15 15
To complete the problem in Excel, it is best to set up the data in
two columns. Two rows also
will work, but parallel columns are visually simpler.
1. Create a label in cell A1 for “I-RiSC” and in cell B1
“ProbSolv” so that
the I-RiSC data appear in cells A2 to A11
and the ProbSolv data appear in B2 to B11.
2. From the Home tab at the top of the page click Data, and then
Data Analysis at the
far right.
3. Select Correlation, which is the second option in the window.
4. In the Input Range window enter A2:B11, which indicates the cells where the data are found. Note that the default groups the data in columns. (Change the default if entering the data in rows.) Had the “Labels in First Row” box been checked, Excel would have treated the first row in each column (A2 and B2 because that is what is designated) as labels rather than data. Our adjustment for the labels was made by indicating that the data begin in A2 rather than A1.
5. Enter a cell value below or to the right of the last data entry
for the Output Range so
that the results do not overwrite the scores—either cell A12 or
below, or to the right
of column B.
6. Click OK.
The results appear in a box called a correlation matrix (see Table 8.10). The intersection of column 1 and column 2 indicates how well the data in column 1 (the Excel A column, where I-RiSC data are located) correlate with the data in column 2 (the Excel B column, which contains the problem-solving scores).
Table 8.10: Correlation matrix
Column 1 Column 2
Column 1 1 0.904203
Column 2 0.904203 1
The result of the analysis is a Pearson correlation of $r_{xy} = 0.904$. The 1s in the diagonal indicate that each variable correlates perfectly with itself ($r_{xy} = 1.0$), of course. Note that the output does not indicate whether the calculated value is statistically significant, which makes a check of the critical values table necessary. Table 8.5 indicates that $r_{0.05(8)} = 0.632$. The relationship between risk-taking and problem solving is statistically significant. Were these data not contrived, it would be quite important to know that about 82% ($r_{xy}^2 = 0.904203^2 \approx 0.818$) of the variance in problem-solving success is explained by whatever the I-RiSC measures, ostensibly the subject’s willingness to be unconventional.
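For readers working outside Excel, the matrix in Table 8.10 can be rebuilt in a few lines. This is a minimal sketch in plain Python rather than a reproduction of what Excel does internally. The names `pearson_r` and `correlation_matrix` are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Raw-score Pearson correlation (Formula 8.2)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )

def correlation_matrix(columns):
    """Correlate every column with every other column, as in Table 8.10."""
    return [[pearson_r(a, b) for b in columns] for a in columns]

# Table 8.9 data.
i_risc = [2, 7, 4, 5, 1, 8, 7, 9, 3, 6]
prob_solv = [14, 17, 14, 16, 12, 17, 16, 17, 15, 15]

matrix = correlation_matrix([i_risc, prob_solv])
for row in matrix:
    print([round(value, 6) for value in row])
```

As with the Excel output, the matrix reports the coefficient only, so the comparison with the critical value from Table 8.5 remains a separate step.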
Apply It!
Investigating the Correlation between Crime and Unemployment
A law enforcement analyst is interested in any link between crime and unemployment as a guide to allocating crime-prevention funds. Specifically, she would like to know whether murders and property crimes correlate with the unemployment rate.
The analyst obtains the murder and property-crime rates for her state for the 16 years from 1990 to 2005 from the FBI Uniform Crime Reports (rates are per 100,000 inhabitants). She then consults the Bureau of Labor Statistics for the unemployment rate in the state for the same period. The analyst will compute the Pearson correlation between murder rate and unemployment and then between property-crime rate and unemployment. Table 8.11 shows the data.
Table 8.11: Murder rate, property crime, and unemployment
Year   Murder rate (per 100,000 people)   Property crime rate (per 100,000 people)   Unemployment percentage
1990 7.1 4462 5.6
1991 6.7 5092 6.8
1992 6.4 4801 7.5
1993 6.4 4662 6.9
1994 6.2 4678 6.1
1995 5.7 4460 5.6
1996 5.8 4438 5.4
1997 5.4 4279 4.9
1998 6.1 4040 4.5
1999 5.5 3852 4.2
2000 5.1 3592 4.0
2001 4.9 3456 4.7
2002 4.3 3412 5.8
2003 4.2 3289 6.0
2004 4.7 3168 5.5
2005 5.0 3081 5.1
The Excel results indicate the following:
• The correlation between murder rate and unemployment is $r_{xy} = 0.386$.
• Comparing the murder rate/unemployment rate correlation to the critical value from Table 8.5 ($r_{0.05(14)} = 0.497$) indicates that the calculated correlation is not statistically significant at $p = 0.05$.
• The analyst fails to reject the null hypothesis, $H_0: \rho = 0$.
• The property crime rate and unemployment correlation is $r_{xy} = 0.551$.
• Comparing the calculated value to the critical value from Table 8.5 (the same $r_{0.05(14)} = 0.497$, since df are unchanged) indicates that this correlation is statistically significant at $p = 0.05$.
• The analyst rejects the null hypothesis, $H_0: \rho = 0$.
• The coefficient of determination for this relationship is $r_{xy}^2 = 0.551^2 \approx 0.303$. About 30% of the variance in the property crime rate can be explained by the unemployment rate.
Although the $r_{xy}^2$ indicates that about 30% of the variance in property crime is explained by variations in unemployment, the analyst will want to be careful about making the conceptual leap to a causal conclusion. “Explained by” isn’t the same as “caused by.” To reiterate the point, perhaps something else explains both crime rate and unemployment. Perhaps underfunded public schooling prompts an unusually high dropout rate from school. The consequently undereducated population has more difficulty securing stable employment. Perhaps state budget cuts have been disproportionately imposed on police agencies, and with fewer officers on the street, crime rises. In other words, the simplest explanation might not be the most accurate. A statistically significant correlation is not where the analysis ends.
Apply It! boxes written by Shawn Murphy
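The two correlations reported in the box can be reproduced from Table 8.11 without Excel. The sketch below is one way to do so. The helper name `pearson_r` is illustrative, and the critical value 0.497 is taken from Table 8.5 for $df = 14$.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from deviation scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Table 8.11, 1990 through 2005.
murder = [7.1, 6.7, 6.4, 6.4, 6.2, 5.7, 5.8, 5.4,
          6.1, 5.5, 5.1, 4.9, 4.3, 4.2, 4.7, 5.0]
property_crime = [4462, 5092, 4801, 4662, 4678, 4460, 4438, 4279,
                  4040, 3852, 3592, 3456, 3412, 3289, 3168, 3081]
unemployment = [5.6, 6.8, 7.5, 6.9, 6.1, 5.6, 5.4, 4.9,
                4.5, 4.2, 4.0, 4.7, 5.8, 6.0, 5.5, 5.1]

CRITICAL_R = 0.497  # Table 8.5, p = 0.05, df = 16 - 2 = 14

r_murder = pearson_r(murder, unemployment)
r_property = pearson_r(property_crime, unemployment)

for label, r in [("murder", r_murder), ("property crime", r_property)]:
    decision = "reject H0" if abs(r) >= CRITICAL_R else "fail to reject H0"
    print(f"{label}: r = {r:.3f}, {decision}")
```

Both correlations share the same critical value because they are based on the same 16 years, which is why one clears the threshold and the other does not.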
8.5 Spearman’s Rho
The Pearson correlation requires that both variables must be at
least interval scale. The point-
biserial correlation requires that one variable must be at least
interval scale, and the other
must be a variable with only two levels.
Neither of these correlations is helpful when the data are ordinal scale, which describes much of the data that psychologists and other social scientists encounter. Nearly everyone who goes to the mall or answers the telephone has been asked to take a survey, particularly if it happens to be an election year. Survey data are usually ordinal scale. It is common for the questionnaires to have a Likert-type format, where a statement is read and the respondents are asked the degree to which they agree with the statement by selecting from a range of choices such as:
• Strongly agree
• Agree
• Neither agree nor disagree
• Disagree
• Strongly disagree
Although surveyors commonly code the responses (strongly agree = 1, agree = 2, and so on) and then calculate means and standard deviations for all respondents, those statistics assume that the data are at least interval scale. Survey data rarely are. The Likert types of responses are essentially rankings. A response of “strongly agree” is more positive than “agree” but precisely how much more is not clear. Besides, one respondent’s “disagree” may be another respondent’s “strongly disagree.” These data are more safely treated as ordinal scale responses.
Correlating Ordinal, or Mixed Ordinal/Interval Data
In addition to survey data, ordinal scale characterizes other common data, such as class rankings and percentile scores. Sometimes the variables investigators might wish to correlate have mixed scales. For example, a researcher wants to correlate subjects’ income (ratio scale data) with their optimism (usually gauged with a Likert-type survey and so ordinal scale). Along with the ordinal variable, the income variable is often not normally distributed. The lack of normality in both the ratio variable and the ordinal scale variable rules out a Pearson’s correlation.
Charles Spearman, Pearson’s colleague at University College London, developed a tremendously flexible correlation procedure. It accommodates two variables in a correlation procedure, provided the variables fit any of the following:
• Both are ordinal scale.
• One variable is ordinal scale and one is interval or ratio
scale.
• Two variables are interval or ratio scale, but one or both
fail to meet the Pearson
correlation requirement for normality.
The procedure is Spearman’s rho, symbolized by $\rho$. Spearman’s rho is a nonparametric procedure, which means that it makes no assumptions about parameters; it means that $\rho$ will accommodate data when there are reasons to suspect that the data are not normally distributed. The formula, which requires that the scores for each variable be independently ranked, is as follows:
Formula 8.3

$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where
$d$ = the difference between the rankings for the two variables
$n$ = the number of pairs of data
The formula’s 1s and 6 are constant values, used every time a Spearman’s correlation is calculated.
Following are the steps to calculating a Spearman’s rho:
1. Rank the scores for both variables separately.
2. For each pair of rankings, subtract the second ranking in the pair from the first to produce a difference score, $d$.
3. Square each of the $d$ values for $d^2$.
4. Sum the $d^2$ values for $\sum d^2$.
5. Solve for $\rho$.
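Steps 2 through 5 translate directly into code once the rankings are in hand. The sketch below assumes the two lists already contain ranks (step 1). The five pairs of rankings are hypothetical values chosen for illustration and do not come from the chapter.

```python
def spearman_rho_from_ranks(ranks_1, ranks_2):
    """Formula 8.3: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(ranks_1)
    # Steps 2-4: difference scores, squared, then summed.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    # Step 5: solve for rho.
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Hypothetical rankings for five pairs (not from the chapter).
first = [1, 2, 3, 4, 5]
second = [2, 1, 3, 5, 4]

rho = spearman_rho_from_ranks(first, second)
print(round(rho, 3))
```

Here the squared differences sum to 4, so the formula gives 1 − 24/120.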
Ranking Tied Scores
The ranking procedure must follow rules. If some of the scores for one of the variables have multiples, all must receive the same ranking. If someone were ranking the following values, for example:
3, 5, 6, 6, 7, 8, 8, 8, 9, 10
ranking the values from smallest to largest produces the following values:
1, 2, 3.5, 3.5, 5, 7, 7, 7, 9, 10.
The smallest value, 3, was ranked “1,” the 5 was ranked “2,” and so on. The two 6s and the three 8s were handled as follows:
• Because the two 6s are rankings 3 and 4, those two values are added and divided by the number of them (2), which results in 3.5 ([3 + 4] ÷ 2). After both 6s are ranked 3.5 (for places 3 and 4) the next value in the data set, 7, is ranked 5.
• The 8s are all ranked 7 ([6 + 7 + 8] ÷ 3), after which the next value, the 9, is ranked 9.
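The tie rule above amounts to giving each tied score the mean of the places it occupies. A minimal sketch follows. The function name `average_ranks` is illustrative.

```python
def average_ranks(scores):
    """Rank from smallest to largest; tied scores share the mean of their places."""
    ordered = sorted(scores)
    rank_of = {}
    for value in set(ordered):
        first_place = ordered.index(value) + 1               # first place the value occupies
        last_place = first_place + ordered.count(value) - 1  # last place it occupies
        rank_of[value] = (first_place + last_place) / 2      # mean of consecutive places
    return [rank_of[score] for score in scores]

ranked = average_ranks([3, 5, 6, 6, 7, 8, 8, 8, 9, 10])
print(ranked)
```

Averaging the first and last place gives the same result as adding all the tied places and dividing by their count, because the places are consecutive.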
An Example
Suppose the data ranked above measure emotional stability, a variable thought to correlate negatively with stress. If those data are collected for career military service personnel assigned to combat areas, and age data are added for 10 subjects, Table 8.12 might be the result.
Table 8.12: Emotional stability and age data
Emotional stability Age
3 26
5 25
6 32
6 35
7 35
8 34
8 37
8 40
9 42
10 39
Calculations for a Spearman’s rho solution, based on the information in Problem 8.2, give

$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(24.5)}{10(10^2 - 1)} = 0.852$$
Table 8.13 lists the critical values for Spearman’s rho (Table B.6 in Appendix B). There are no degrees of freedom for this procedure. The correct critical value for rho is indicated by the number of data pairs. Note that for $p = 0.05$ and 10 pairs, $\rho_{0.05(10)} = 0.648$. The relationship between emotional stability and age among service personnel assigned to combat zones is statistically significant; therefore, we reject $H_0$.
Try It!: #5
Spearman’s rho requires data of what
scale?
Table 8.13: The critical values for Spearman’s rho
Number of pairs of scores   p = 0.05   p = 0.01
5 1.0 (none)
6 0.886 1.0
7 0.786 0.929
8 0.738 0.881
9 0.683 0.833
10 0.648 0.794
12 0.591 0.777
14 0.544 0.715
16 0.506 0.665
18 0.475 0.625
20 0.450 0.591
22 0.428 0.562
24 0.409 0.537
26 0.392 0.515
28 0.377 0.496
30 0.364 0.478
Source: University of Sussex. (n.d.). Critical values of
Spearman’s rho (two-tailed). Retrieved
from www.sussex.ac.uk/Users/grahamh/RM1web/Rhotable.htm
Problem 8.2: The Spearman’s rho correlation: emotional stability and age among service personnel
1. Ranking the scores produces $\rho_1$ for emotional stability and $\rho_2$ for age.
2. The $d$ score is the difference between the two rankings.
3. The square of the difference score is $d^2$.
Emotional stability   Age   $\rho_1$   $\rho_2$   $d$ ($\rho_1 - \rho_2$)   $d^2$
3 26 1 2 −1 1
5 25 2 1 1 1
6 32 3.5 3 0.5 0.25
6 35 3.5 5.5 −2 4
7 35 5 5.5 −0.5 0.25
8 34 7 4 3 9
8 37 7 7 0 0
8 40 7 9 −2 4
9 42 9 10 −1 1
10 39 10 8 2 4
$\sum d^2 = 24.50$
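The whole of Problem 8.2, from raw scores to rho, can be sketched in one short script. It applies the chapter's Formula 8.3 directly to tie-averaged ranks. Statistical packages may handle ties differently, so small discrepancies with library output would not be surprising. The helper names are illustrative.

```python
def tied_ranks(scores):
    """Smallest score gets rank 1; tied scores share the mean of their places."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 + (ordered.count(s) - 1) / 2 for s in scores]

def spearman_rho(xs, ys):
    """Rank both variables separately, then apply Formula 8.3."""
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(tied_ranks(xs), tied_ranks(ys)))
    return sum_d2, 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Table 8.12 data.
stability = [3, 5, 6, 6, 7, 8, 8, 8, 9, 10]
age = [26, 25, 32, 35, 35, 34, 37, 40, 42, 39]

sum_d2, rho = spearman_rho(stability, age)
significant = rho >= 0.648  # Table 8.13, p = 0.05, 10 pairs
print(sum_d2, round(rho, 3), significant)
```

Returning the sum of squared differences alongside rho makes it easy to compare the intermediate value with the last row of the table.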
Apply It!
Exploring the Correlation between
Job Satisfaction and Commute Times
As part of the justification for allowing workers
to work at home part-time, the human resources
director for a large firm intends to investigate
any correlation between job satisfaction and
average commute time for employees. The
director asks ten randomly selected employees
to fill out a job-satisfaction questionnaire with
the following responses to a series of questions:
Response Score
• very satisfied (vs) 1
• somewhat satisfied (ss) 2
• somewhat dissatisfied (sd) 3
• very dissatisfied (vd) 4
The employees were also asked to indicate their average one-
way commute time in minutes.
Recognizing that job satisfaction responses will be ordinal
scale, the HR director opts for
Spearman’s rho. The data and the difference scores are shown in
Table 8.14.
Table 8.14: Spearman’s rho data for the correlation between job satisfaction and commute time
Commute time (minutes)   Commute rank   Job satisfaction total   Satisfaction rank   Difference   Difference squared
2 1 10 2 −1 1
7 2 14 5 −3 9
11 3 10 2 1 1
15 4 14 5 −1 1
17 5 10 2 3 9
23 6 14 5 1 1
28 7 17 7.5 −0.5 0.25
32 8 22 9.5 −1.5 2.25
36 9 22 9.5 −0.5 0.25
40 10 17 7.5 2.5 6.25
From the table, the sum of the differences is
∑d2 5 1 1 9 1 1 1 1 1 9 11 1 0.25 1 2.25 1 0.25 1 6.25 5 31
Digital Vision/Photodisc/Thinkstock
(continued)
tan82773_08_ch08_227-262.indd 253 3/3/16 12:35 PM
© 2016 Bridgepoint Education, Inc. All rights reserved. Not for
resale or redistribution.
Section 8.5 Spearman’s Rho
Direction of the Ranking
In the study of emotional stability and age for service
personnel, the least stable value received the ranking of
1, and the most stable a ranking of 10, while the young-
est subject received the age ranking of 1. In terms of
the value of the statistic, it would not have mattered
whether the rankings go from lowest to highest, or
from highest to lowest, as long as both variables are
ranked the same way. We could have ranked the most
emotionally stable 1 and the oldest 1, and the coeffi-
cient would have come out the same. If we reversed just one of
them, however, the correlation
would appear to be negative.
Summary of Spearman’s Rho
Spearman’s correlation provides flexibility to the analyst. As
long as some evidence of a rela-
tionship exists, correlations can be calculated for any
combination of ordinal, interval, and
ratio variables. But of course so much latitude requires some
sacrifice, and it is statistical
power. In the course of ranking values, the amount of difference
between any two data points
is lost. When the ages of the service personnel were ranked,
• the 25-year-old was 1,
• the 26-year-old was 2,
• and the 32-year-old was 3.
Once ranked, the fact that from the first to the second ranking is
a one-year difference and
from the second to the third ranking is a six-year difference is
lost. Pearson’s r retains those
differences. When both correlations are calculated for the same data, their
coefficients usually differ little, but a Pearson correlation will sometimes be
statistically significant when Spearman’s is not. Note the comparison of
critical values at p = 0.05 shown in Table 8.15.

Try It!: #6
For 10 students, grade averages and rank in class are correlated. How will the
resulting coefficient be affected if the highest ranked student is given the
lowest value (1) versus the highest value (10)?

(continued)
For n = 10, the Spearman’s rho formula is
$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(31)}{10(10^2 - 1)} = 0.812$$
For p = 0.05 and 10 pairs of data, the critical value is
$r_{s\,0.05(10)} = 0.648$. Because 0.812 exceeds 0.648, the relationship between
job satisfaction and average commute time is statistically significant. Those
who commute the least time have the highest levels of job satisfaction. Perhaps
the attitudes of those who have the lowest levels of job satisfaction (those who
have the longest commutes) will improve if they are required to commute less
often because they can sometimes work from home.
Apply It! boxes written by Shawn Murphy
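The arithmetic in the Apply It! example can be checked with a short Python sketch. The squared differences and the critical value of 0.648 are taken from the example; the function name `spearman_rho_from_d2` is an assumption made for this illustration.

```python
def spearman_rho_from_d2(squared_diffs):
    """Spearman's rho from the squared rank differences, one per pair."""
    n = len(squared_diffs)
    return 1 - 6 * sum(squared_diffs) / (n * (n ** 2 - 1))


# Squared rank differences for the 10 pairs in the Apply It! example.
d2 = [1, 9, 1, 1, 9, 1, 0.25, 2.25, 0.25, 6.25]

rho = spearman_rho_from_d2(d2)
critical_value = 0.648  # p = 0.05, 10 pairs, as given in the example

# The coefficient is significant when its size reaches the critical value.
significant = abs(rho) >= critical_value

print(sum(d2), round(rho, 3), significant)
```

The sum of the squared differences is 31, rho rounds to 0.812, and the comparison with the critical value confirms the significance decision in the example.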
Table 8.15: Comparison of Pearson and Spearman critical values

No. pairs    Pearson critical value*    Spearman critical value
5            0.878                      1.000
6            0.811                      0.886
10           0.632                      0.648

*For df = number of pairs − 2
In the examples above, the value required for significance with
a Spearman correlation is
higher than that required for a Pearson correlation.
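A brief sketch shows how this difference can change a significance decision. The critical values are copied from Table 8.15; the coefficient of 0.64 is hypothetical, and `is_significant` is a name invented for this illustration.

```python
# Critical values at p = 0.05, copied from Table 8.15 (keyed by number of pairs).
CRITICAL_VALUES = {
    5: {"pearson": 0.878, "spearman": 1.000},
    6: {"pearson": 0.811, "spearman": 0.886},
    10: {"pearson": 0.632, "spearman": 0.648},
}


def is_significant(coefficient, n_pairs, procedure):
    """Compare the size of a coefficient with its tabled critical value."""
    return abs(coefficient) >= CRITICAL_VALUES[n_pairs][procedure]


coefficient = 0.64  # hypothetical value obtained from 10 pairs of data

pearson_result = is_significant(coefficient, 10, "pearson")
spearman_result = is_significant(coefficient, 10, "spearman")

print(pearson_result, spearman_result)
```

With 10 pairs, a coefficient of 0.64 clears the Pearson threshold of 0.632 but falls short of the Spearman threshold of 0.648.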
Another limitation of the Spearman correlation is that we cannot square the
Spearman value to determine the proportion of variance in y explained by x.
Spearman’s rho has no equivalent of $r_{xy}^2$. When the data do not meet the
Pearson requirements, however, the researcher has no choice. When the data do
meet the requirements, a Pearson’s r is usually preferable to Spearman’s rho.
Correlation in Research
Correlation procedures answer enough of the questions that interest researchers
and consumers of research that the procedures pervade research literature.
Arroyo (2015) examined the correlation between work engagement and internal
self-concept. Arroyo found that people tend to engage in the work they do to
earn a living, not for the external rewards, but for the work’s own sake; their
work is intrinsically satisfying.
Ceci and Kumar (2015), meanwhile, asked whether happiness correlates with
creative capacity. They found no significant correlation but did find a
significant correlation between creative capacity and intrinsic motivation,
suggesting that those with the greatest creative capacity are probably those
who are most internally driven to create. The researchers’ approach to
quantifying happiness is also a matter of interest, since it is often a
challenge to find a way to quantify something so subjective.
Summary and Resources
Chapter Summary
Many of the questions researchers and scholars ask deal with
the relationships between
variables. To accommodate them, the discussion in this chapter
shifted to statistical
procedures that reflect the hypothesis of association (Objective
1). Three of the many
correlation procedures that respond to the hypothesis of
association are the Pearson
correlation, the point-biserial correlation, and Spearman’s rho.
In each case, possible
values range from –1.0 to +1.0, and all their coefficients are
interpreted the same
way. Positive correlations indicate that as the values in one
variable increase, the values
in the other also increase. Negative correlations indicate that as
one increases, the
other decreases. The sign of the coefficient, however, is
unrelated to its strength
(Objective 2).
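The separation of sign and strength can be made concrete with a small sketch. The two coefficients are hypothetical, and `describe` is a name invented for this illustration: direction comes from the sign, and strength from the absolute value.

```python
def describe(r):
    """Split a correlation coefficient into its direction and its strength."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, abs(r)


coefficients = [0.50, -0.80]  # hypothetical correlation values

# The stronger relationship is the one farther from zero, whatever its sign.
strongest = max(coefficients, key=abs)

print(describe(0.50), describe(-0.80), strongest)
```

Here the negative coefficient describes the stronger relationship, because 0.80 is farther from zero than 0.50.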
The differences among the correlation procedures in this chapter are in the
kinds of variables they accommodate. The Pearson correlation requires interval
or ratio variables that are normally and similarly distributed (Objective 3). A
special application of Pearson, the point-biserial correlation, requires an
interval/ratio variable and a second variable that has only two manifestations,
or a dichotomously scored variable (Objective 5). Spearman’s rho accommodates
any combination of ordinal, interval, or ratio variables (Objective 6). Because
the data used in a Pearson correlation contain more information than the
rankings that make up the data for Spearman’s approach, the Pearson value
provides more information about the nature of the relationship between the
variables. This is evident in the fact that the Pearson value can be squared to
produce the coefficient of determination. The $r_{xy}^2$ value indicates the
proportion of the variance in one variable that can be explained by changes in
the other (Objective 4). Spearman values have no equivalent of this statistic.
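The relationship among these procedures can be sketched in Python. All data below are hypothetical, and `pearson_r` uses one common definitional form of the formula (sum of cross-products over the square root of the product of the sums of squares), which may be laid out differently from the computational formula used earlier in the chapter. The point-biserial case is simply the same function applied when one variable is coded 0 or 1.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation from deviation scores (definitional form)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(ss_x * ss_y)


# Hypothetical interval/ratio data for five people.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
r_squared = r ** 2  # coefficient of determination

# Point-biserial: the same calculation with a dichotomous variable coded 0/1.
employed = [0, 0, 1, 1, 1]
r_pb = pearson_r(employed, scores)

print(round(r, 3), round(r_squared, 3), round(r_pb, 3))
```

Squaring the Pearson value gives the proportion of variance shared by the two variables; no comparable step is applied to a Spearman value.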
When two variables share information, they are correlated. The amount of one
explained by the other is what the $r_{xy}^2$ value, the coefficient of
determination, indicates. This concept provides a foundation for regression,
which is the focus of Chapter 9. Regression uses what is known about the
relationship between x and y to predict the value of y from a value of x. It
involves calculations and thinking with which you are already familiar, so work
the end-of-chapter problems, reread any of the sections in Chapter 8, and
prepare for Chapter 9.
Key Terms
bivariate correlations Include all procedures that test for significant
relationships between two variables.
canonical correlation Measures the relationship between two groups of
variables.
coefficient of determination Indicates the proportion of one variable in a
Pearson correlation that can be explained by the other.
correlation matrix A box in which the variables involved are listed in rows as
well as in columns, and each variable is correlated with all variables,
including itself.
hypothesis of association The umbrella term for significance tests that analyze
the correlation between or among variables.
hypothesis of difference The umbrella term for significance tests that analyze
the differences between groups.
linear Describes a relationship between two variables whose strength is
consistent throughout their ranges. With curvilinear relationships, the
strength and sometimes even the nature of the relationship (positive or
negative) changes depending upon where in the variables’ ranges they are
measured.
multiple correlation Gauges the strength of the relationship between one
variable and two or more other variables.
nonparametric Tests for data that do not meet the usual normality requirements.
More technically, a test in which there is no interest in population
parameters.
partial correlation Measures the relationship between two variables,
controlling for the influence of a third in both of the first two.
Pearson correlation coefficient Indicates the strength of the relationship
between interval- or ratio-scale variables.
point-biserial correlation A special application of the Pearson correlation for
those instances where one of the variables, such as gender or marital status,
has just two manifestations.
range attenuation Occurs when a variable is not measured throughout its entire
range. Attenuated range artificially reduces the strength of any resulting
correlation value.
scatterplot A graph representing two variables, one on the horizontal axis, the
other on the vertical axis. Each point in the graph indicates the measure of
both variables for one individual.
semi-partial correlation Gauges the relationship between two variables,
controlling for a third in just one of the first two.
Spearman’s rho A correlation procedure for two ordinal variables, one ordinal
and one interval/ratio variable, or two interval or ratio variables that fail
to meet Pearson correlation requirements for normality.
Review Questions
Answers to the odd-numbered questions are provided in
Appendix A.
1. What values indicate the strongest and weakest values for a
Pearson’s r?
2. What is the equivalent in a Pearson correlation for $\eta^2$?
3. What are the requirements for calculating Pearson’s r?
4. What is “range attenuation,” and how does it affect
correlation values for linear
relationships?
5. A university counselor gathers data on students’ grades and
whether or not they
are employed. What statistical procedure will gauge that
relationship?
6. What procedure will indicate whether there is a significant
relationship
between sales representatives’ sales rank and their attitudes
about the product
they sell?
7. a. What procedure will gauge the relationship between
university students’ grade
averages and their scores on, for example, a statistics test?
b. What statistic will indicate the proportion of the students’
test scores that is a
function of their GPA?
8. A forensic psychologist gathers data on the average time of
night juveniles go to bed
and whether or not they have an arrest record.
a. What procedure will allow the psychologist to evaluate the
relationship between
those two variables?
b. What is the resulting coefficient?
c. How much of the variability in arrest records can be explained by
what time the juvenile goes to bed?
Juvenile Retire Arrest
1 9.0 No
2 9.5 No
3 11.0 Yes
4 11.5 Yes
5 10.0 Yes
6 9.75 No
7 10.0 No
8 10.25 Yes
9. A group of consumers has just taken two surveys on (a) their
attitude about
the economy and (b) their attitude about those in government. In
both, higher
scores mean more optimism. The data are ordinal scale. Are the
two attitudes
related?
Consumer Economy Government
1 15 10
2 5 4
3 16 11
4 10 8
5 11 13
6 3 4
7 12 10
8 11 8
9 10 7
10 14 9
10. A group of students has been told that reading will help
them in a test of verbal
ability required by the university they wish to attend. The x
variable indicates the
minutes per day spent reading. The y variable represents
students’ scores on
the test.
Student Minutes (x) Score (y)
1 15 57
2 80 84
3 0 60
4 75 92
5 30 65
6 10 60
7 22 75
8 15 68
a. Is the relationship statistically significant?
b. How much of the variance in test scores can be explained by
differences in the
amount of time spent reading?
11. A district psychologist is working with developmentally
disabled students in a
special education setting and is curious about the relationship
between students’
persistence on puzzle tasks (measured in the number of minutes
they remain on
task) and their number of absences from class.
Student Persist Absent
1 12 3
2 4 3
3 15 5
4 18 7
5 12 1
6 5 4
7 8 3
8 9 4
Is the relationship between persistence and attendance
statistically significant at p = 0.05?
12. An employer wishes to analyze the relationship between stress and job
performance. Stress is reflected by systolic blood pressure. Job performance is
measured in the number of sales per day.
a. What is the appropriate correlation procedure?
b. Is the relationship statistically significant?
Employee Sales Blood pressure
1 1 150
2 4 140
3 3 140
4 6 110
5 2 140
6 4 130
7 0 160
8 3 110
9 5 120
10 7 160
13. An industrial psychologist is determining the relationship between workers’
willingness to embrace new manufacturing procedures, gauged with a dogmatism
scale (higher scores indicate greater dogmatism), and their level of job
satisfaction (higher scores indicate greater satisfaction). The satisfaction
data are at least ordinal scale.
a. What is the relationship?
b. What is the null hypothesis?
c. Do you reject or fail to reject the null hypothesis?
d. What is the relationship between dogmatism and job
satisfaction?
e. Is the correlation statistically significant?
Worker Dogmatism Satisfaction
1 8 4
2 4 12
3 3 14
4 5 15
5 7 5
6 2 14
7 3 15
8 1 15
Answers to Try It! Questions
1. A single point in a scatterplot represents two raw scores, one
for x and one for y.
2. If the two variables are normally distributed but uncorrelated, their
combined scatterplot will be circular with greatest density in the middle of
the plot because of the tendency for most of the data to fall in the middle of
either distribution.
3. Range attenuation diminishes the strength of the correlation value in linear
relationships. It produces an artificially low correlation coefficient.
4. As degrees of freedom increase, the correlation value required to reach
significance diminishes.
5. Spearman’s rho accommodates variables that have any
combination of ordinal,
interval, or ratio scale.
6. The coefficient would indicate that the higher the ranking,
the lower the GPA. If a
ranking of 1 is “best,” the best (highest) GPA must also receive
a class ranking of 1.
Otherwise, the relationship looks negative when it is not.
ARTX 435 The Fashion Consumer
Name: Date:
I Want That! How We All Became Shoppers- Discussion
Questions
1. How do we use objects to define our identity?
2. What does the author mean when he writes that objects are
“repositories of magic”?
3. The author writes, “The catalogue told Jane about the
moment in which she was living.” What is meant by this? What
resources do consumers use today to learn about the moment in
which they are living?
4. What does the author mean when he describes “just looking”
as a form of “domestic due diligence”?
5. What is included in the “buyosphere”?

Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 

Recently uploaded (20)

ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 

2278CorrelationAnrodphotoiStockThinkstockChapter.docx

  • 1. 227 8Correlation Anrodphoto/iStock/Thinkstock Chapter Learning Objectives After reading this chapter, you should be able to do the following: 1. Explain the hypothesis of association. 2. Interpret the correlation coefficient. 3. List the Pearson correlation requirements. 4. Describe what the coefficient of determination explains. 5. Explain the variables involved in the point-biserial correlation. 6. Describe the applications for the Spearman correlation. tan82773_08_ch08_227-262.indd 227 3/3/16 12:33 PM © 2016 Bridgepoint Education, Inc. All rights reserved. Not for resale or redistribution. Section 8.1 The Hypothesis of Association
  • 2. Introduction Correlation, the concept of a relationship or dependence between variables, transcends statistical analysis. Cloudy days are related to (correlated with) cooler temperatures. Natural disasters are related to declines in the stock market. An impending test is related to the need to study, and grinding noises in the engine compartment of a car are usually related to repair bills. Some relationships are stronger than others, so statistical procedures have been developed to quantify, or numerically gauge, the strength of the relationship between two variables. The numerical indicators are called correlation coefficients, and one of the most common is the Pearson correlation coefficient, which indicates the strength of the relationship between interval- or ratio-scale variables. The name Pearson refers to Karl Pearson, whose impact not just on studying correlation but on statistical analysis generally may be greater than that of any other individual. In the early years of the 20th century, Pearson founded the first department of statistical analysis at University College London. Under Pearson’s direction, the department attracted, among others, William Sealy Gosset of t test fame; Ronald Fisher, who produced analysis of variance; and Charles Spearman, for whom an alternative correlation coefficient is named and who developed factor analysis, an elegant statistical procedure based on correlation. To put it succinctly, it is difficult to overstate the impact that Pearson had on the evolution of statistical analysis.
  • 3. A man of fierce independence, Pearson centered his education at Cambridge on religion and philosophy rather than mathematics. As a student of religion, he sued the university over the compulsory chapel attendance required of all undergraduates. Winning his suit brought a change to university rules, after which Pearson chose to attend chapel. His graduate work (in Germany) emphasized literature, and it is a testimony to his extraordinary breadth of talent that his greatest contributions would be in statistical analysis. Pearson was a contemporary of Einstein, who sought a grand theory that would unite all of physics. Pearson tried to do the same with mathematics. That both men were disappointed in these efforts should not detract from what they did accomplish. Although Pearson’s associations with his colleagues were not always harmonious, he and the others who found an academic home in his department virtually defined modern quantitative analysis. Whether or not they realize it, almost all of those who crunch numbers for any length of time rely on their work. 8.1 The Hypothesis of Association Previous chapters concentrated on tests of significant difference. The z test, the t tests, analysis of variance, and the repeated-measures designs test the differences between groups. They all fall under a general assumption referred to as the hypothesis of difference. But some kinds of analyses do not involve questions about whether there are significant differences between groups.
  • 4. If a psychologist asks about the relationship between birth order and achievement motivation among siblings or about the connection between the amount of time children read and their school grades, the subject of research concerns relationships rather than differences. Those questions call for procedures connected to the hypothesis of association, and when results are statistically significant, it means that the relationship, rather than the difference, is unlikely to be a random occurrence. Correlation versus Cause Before pursuing correlation, researchers must make a distinction between correlation and cause. That two characteristics co-vary, or vary together, does not mean that one necessarily causes the other. Although there may be a causal relationship, researchers usually cannot determine one just by studying the correlation. One of the author’s statistics professors explained the risk of confusing correlation with cause this way: A person drinks for three successive nights. The first night, the drink is scotch and water, the second bourbon and water, and on the third, vodka and water. Each morning after is accompanied by a hangover. Because the
  • 5. water is common to each experience, water must be the cause. A classic study demonstrates, among other things, a correlation between the sale of ice cream by vendors on city streets and burglaries in the same city. Someone rushing to judgment about cause might wish to curb ice cream sales or check the criminal records of ice cream vendors to reduce the number of burglaries. Such an individual does not recognize that hotter weather, and the open windows that result, probably drive both ice cream sales and burglaries. It is not unusual for some third variable to explain an association between a first and a second. Although correlation values provide some evidence for causation, correlation alone is rarely sufficient to demonstrate cause. Scatterplots Breaking down the word correlation (co-relation) makes its meaning clear: the variables are related. The evidence for the relationship is that the characteristics co-vary. As the level of one variable changes, the other changes as well because both variables contain some of the same information. The higher the correlation, the more common information they contain. A researcher gathers verbal ability and intelligence scores for 12 subjects and presents them in Table 8.1. Note that the first participant has a verbal ability score of 20 and an intelligence score of 80. Scanning the two rows of data, we can see that as the values of one score increase, so do those of the other. In other words, there appears to be a
  • 6. positive correlation between the two scores. The relationship is easier to see in the scatterplot. A scatterplot is a graph plotting the values of one variable along the horizontal axis and the other variable along the vertical axis, using dots to indicate the intersection of each pair of values. Figure 8.1 shows an Excel-generated scatterplot of the verbal ability/intelligence data. [Photo caption: As the classic study involving ice cream sales and burglaries shows us, it is important to make a distinction between correlation and cause.] Try It!: #1 How many raw scores does a single point on a scatterplot represent? Table 8.1: Results of a study comparing verbal ability and intelligence Participant: 1 2 3 4 5 6 7 8 9 10 11 12 Verbal ability: 20 35 42 48 55 60 63 66 72 76 78 85
  • 7. Intelligence: 80 95 90 100 100 100 110 115 120 115 110 125 Figure 8.1: The relationship between verbal ability and intelligence In the Figure 8.1 scatterplot, intelligence scores are plotted along the vertical, or y, axis and the verbal ability scores are plotted along the horizontal, or x, axis. Each diamond-shaped point in the graph, then, represents an intelligence score and a verbal ability score. The plot verifies what our cursory view of the two rows of data in the table suggested: A positive correlation exists between measures of intelligence and those of verbal ability. The general trend is from lower left to upper right. As the value of one variable increases, the value of the other tends to do likewise. The incline is not dramatic, but the graph shows a general rise in the data points. Less-than-Perfect Relationships The relationship certainly is not perfect. The fourth, fifth, and sixth participants all have the same level of intelligence but different levels of verbal ability. The same is true of participants 8 and 10, as well as participants 7 and 11. Still, there is a general lower-left to upper-right relationship, which might be expected. Brighter people often have more complex language patterns, something suggested by higher verbal-ability scores. It also is not surprising that the relationship between intelligence and verbal ability is less than perfect. An extensive vocabulary alone is no guarantee of
  • 8. an unusually high intelligence score. Perhaps the individual is just an avid reader. At the other end of the spectrum, not all highly intelligent people excel verbally. The exceptions point to the fact that people are very complex. Human behavior is rarely explained by one or two variables. Although intelligence is related to verbal aptitude, so are a number of other variables: how much the individual reads, how easily the individual is distracted, how much experience the person has had, and so on. One of the reasons researchers calculate correlation values is to determine the level of agreement when the relationships are not perfect, as they rarely are with people. The issue the hypothesis of association seeks to resolve is not whether the relationship is perfect (because it would be extremely rare if it were) but rather whether the relationship is statistically significant. Statistically significant correlations produce correlation values that tend to reemerge every time new data are gathered for the variables and the strength of the correlation is recalculated.
  • 9. Although perfect correlations are rare when dealing with people, that is not necessarily the case elsewhere. Mathematicians, for example, enjoy the stability of perfect relationships; the formula for the area of a circle, $A = \pi r^2$ (where the area is found by multiplying the value of pi by the square of the radius), works for circles of any size because a perfect relationship exists between a circle’s radius and its area. Still, even imperfect correlations, such as those related to human-subjects research, can be very important. If health professionals know a correlation, even a weak one, exists between exposure to secondhand smoke and the later development of respiratory problems, they can warn against such exposure. In that particular instance, by the way, the research supports the causal assumption. If educators know there is a correlation between how much homework students do and their success on a high school exit exam, educators can encourage students to complete more assignments. The instructors expect that pass rates will rise as a consequence. In the case of homework and exit exam scores, however, a causal relationship is not as clear. Perhaps people who have a higher level of academic achievement do more homework and have higher exit exam scores. That suggests the academic achievement is the causal element rather than the homework. Maybe the increased homework is the manifestation of that other variable, academic achievement, or perhaps parental involvement is the causal factor: students whose parents are directly involved in their schooling do more homework and prepare for their exit exams with greater care.
  • 10. The Amount of Scatter The amount of scatter in a scatterplot, the degree to which the points in the scatterplot stray from a straight line, suggests weakness in the correlation. Scatterplots graphed for strong correlations have very little scatter. The points appear to line up. What Correlations Provide Calculating a correlation involves quantifying the strength of the relationship between the variables involved. Correlation values, or coefficients, range from -1.0 to +1.0. Correlation values of either -1.0 or +1.0 indicate perfect relationships. With positive correlations, as the value of one variable increases, so does the value of the other; more verbal reinforcement of subjects in a test of problem-solving ability is probably associated with more effort expended by the subject. With negative correlations, as the value of one variable increases, the other decreases: more
  • 11. involvement with video-gaming while a text passage is read to subjects is probably associated with lower retention of the details of the text passage; as the value of one increases, the value of the other declines. A correlation of 0 indicates no relationship: fluctuations in the value of one variable are unrelated to changes in the value of the other. Values whose absolute value is less than 1.0 but greater than 0 indicate imperfect relationships, with the strength of the relationship declining as the value approaches 0. Correlating two variables does not require that they both measure the same characteristic or even that they both be gathered from the same subjects. Often, entirely different kinds of things are correlated. The example of secondhand smoke and respiratory issues involves two completely different variables, but the strength of the relationship between them can be calculated nevertheless. As long as the two variables can be quantified (reduced to a number), the strength of any relationship can be determined. Requirements for the Pearson Correlation Researchers may employ any of several different correlation procedures. The appropriate procedure for a particular problem is determined by characteristics such as the scale and normality of the data involved. The Pearson correlation, for example, requires variables of either interval or ratio scale. Nominal or ordinal scale data can be correlated as well, but they involve other correlation procedures. In addition to interval or ratio data, the Pearson correlation also requires the following:
  • 12. • In their populations, the characteristics are assumed to be normally distributed. Normal distributions are rarely evident in relatively small samples, but researchers must have reason to believe that the samples come from populations that are normal. • The distributions from which the samples come must be similarly distributed. • The two samples are assumed to be randomly selected from their populations. • The relationship between the variables must be linear; it remains constant throughout their ranges. Recall that normality is indicated when the standard deviation is about one-sixth of the range, the measures of central tendency all have about the same value, and so on (Chapter 2). The way data are distributed in the scatterplot also suggests the normality of the two variables involved in a correlation. When both variables are normal, the points in the plot will be distributed from left to right, with the frequency of the points gradually increasing toward the middle of the graph and then gradually decreasing to the right extreme. If the relationship is positive (example A in Figure 8.2), the scatter is generally from lower left to upper right. If it is negative (example B in Figure 8.2), the graph follows a pattern from upper left to lower right. If the variables have no correlation (example C in Figure 8.2), the points fall into a circular pattern in the middle of the graph with the greatest density at the circle’s center.
  • 13. The greater frequency in the middle of the circle reflects the fact that most of the data in any normal distribution occur near the middle of the distribution. (The pattern in our example does not look circular because so few data are present.) Try It!: #2 If two variables are normally distributed but uncorrelated, what pattern will their data points make in a scatterplot?
  • 14. The similar-distribution requirement does not mean that the standard deviations should be the same. That is not likely to happen unless both variables are measured over the same range. It means that the standard deviations should account for similar proportions of their respective ranges. The strength of a correlation is affected by range attenuation. When the range of scores in either variable is artificially abbreviated, the correlation value will be artificially low. Range attenuation can be indicated by a standard deviation that is substantially smaller than we know it to be in the population.
  • 15. If we were correlating intelligence scores with reading comprehension, and the intelligence scores have a standard deviation of 8 points when we know that the population standard deviation is 15 points, we can expect any resulting correlation value to be artificially low. One of the advantages of random selection is that random samples of a reasonable size tend to mirror their populations reasonably well. Range restriction problems are much less likely to occur with randomly selected samples. Linear and Nonlinear Correlations When the relationship between two variables is linear, it means that the degree to which they change in concert with each other is the same throughout their ranges; if the correlation is low and positive, it is low and positive at both low and high levels of the two variables. Some correlations, however, are not linear. Consider the correlation between anxiety and the quality of a musician’s performance. In that instance, a little anxiety is probably a good thing. It prompts the individual to prepare for the performance by practicing, studying the music carefully, asking others for feedback, and so on. Without anxiety, the musician might not make the necessary preparations. It seems likely that, at least in the early going, the quality of the performance improves as anxiety increases. But it is possible that if anxiety continues to increase, the individual’s performance may reach
  • 16. a plateau and then begin to diminish. The musician may become so anxious that concentration is difficult and performance declines, with more anxiety actually degrading the quality of the music. These conditions describe a relationship that is curvilinear. It is illustrated in Table 8.2, where anxiety is gauged by pulse rate in beats per minute. [Figure 8.2: Scatterplots for positive, negative, and zero correlations. Panels: A. A positive correlation; B. A negative correlation; C. A zero correlation.]
  • 17. The quality of the musician’s performance is represented by the judgment of a trained observer, with higher values indicating a more virtuoso performance. With scores awarded every 5 minutes during a 65-minute performance, the data are as follows: Table 8.2: Study results of anxiety versus quality of a musician’s performance Anxiety: 52 54 58 62 64 67 72 73 75 78 82 86 88 Performance quality: 3 5 6 6 8 8 9 7 5 5 4 3 1 Figure 8.3 shows the scatterplot illustrating the relationship
  • 18. between the musician’s anxiety and the quality of the musician’s performance. Figure 8.3: The relationship between performance quality and anxiety Try It!: #3 What impact does range attenuation have on a correlation? Initially, there is a positive relationship between anxiety and the quality of the music. The first few pairs of data have points that rise from left to right. However, the positive relationship becomes negative when performance begins to diminish as anxiety increases. Viewed as a whole, the correlation is curvilinear. After performance reaches the judge’s high of 9, more anxiety is not associated with better music. The scatterplot also reveals some of the danger associated with range restriction. If someone collected data so that only the first six pairs of scores made up the sample, those scores would provide a very different indication of the relationship between anxiety and performance than the last six pairs of scores.
  • 19. The first part of the distribution makes the relationship look linear and positive. The latter part of the data makes the relationship look linear but negative. An accurate picture of the relationship requires data throughout the entire ranges of the two variables. Understanding Correlation Values It is important not to confuse the sign of the correlation (+ or -) with its strength. A correlation of -0.50 contains the same amount of information about the two variables as does a correlation of +0.50. The sign makes a great deal of difference in how the relationship is interpreted, but it has nothing to do with the strength of the relationship. With positive correlations, as the value of one variable increases so does the value of the other. When correlations are negative, increasing values of one variable are associated with decreasing values of the other. Earlier we noted that different scales of data require different types of correlation procedures. The number of variables involved also dictates the need for different correlation procedures: • Bivariate correlations indicate the relationship between two variables. For example, the correlation between intelligence and verbal aptitude is a bivariate correlation. This chapter focuses on bivariate correlations. • Multiple correlation gauges the relationship between one
  • 20. variable and a combination of others. For example, the correlation between a combined reading comprehension and vocabulary measure with an analytical-ability measure would indicate how well reading comprehension and vocabulary ability, combined, correlate with analytical ability. • Canonical correlation measures the relationship between two groups of variables. For example, determining how a combination of reading comprehension and vocabulary ability and a combination of analytical ability and problem-solving ability relate calls for a canonical correlation. • Partial correlation measures the relationship between two variables after neutralizing the influence of some third variable on both of the first two. For example, a correlation of analytical ability with problem-solving ability, with the influence of age controlled in both of the other variables, eliminates age differences as a factor in the resulting correlation. In effect, a partial correlation would be the correlation of analytical ability with problem-solving ability as if all subjects were the same age. • Semipartial correlation gauges the relationship between two variables after neutralizing the influence of a third on either of the first two. For example, a correlation of intelligence with verbal aptitude, with age differences
  • 21. controlled in the verbal-aptitude variable, is a semipartial correlation. Age would not be controlled in the intelligence variable. (This makes some sense since intelligence is often argued to be a stable variable across age differences in the individual.) Only the bivariate correlations are covered here. The others are beyond the scope of this book but are described here very simply, so that the reader has a sense of where bivariate correlations fit into the broader discussion of these procedures. 8.2 Calculating the Pearson Correlation Formally called the Pearson product-moment correlation coefficient, the Pearson correlation (or “Pearson’s r,” because its symbol is typically a lowercase r) is probably the most often calculated of any correlation value. Thumbing through statistics books and glancing at online sources reveals several formulas. All provide the same answer, but some are easier to complete than others. Visually, at least, Formula 8.1 is probably simplest: Formula 8.1
  • 22. $r_{xy} = \frac{\sum[(z_x)(z_y)]}{n - 1}$ Note that the r symbol has x and y subscripts. These indicate that the procedure correlates two variables designated x and y. Which variable is assigned x and which y is unimportant, since correlation does not presume that the x variable causes y, for example. Formula 8.1 indicates that if the x and y scores are transformed into z scores (Formula 3.1: $z = \frac{x - M}{s}$), the value of $r_{xy}$ (the correlation value) is the sum of the products of the x and y z scores for each participant, divided by the number of participants in the data group (rather than the number of scores), minus 1. The $n - 1$ signifies that this is a correlation formula for sample, rather than population, data. It is the same adjustment for sample data made with the standard deviation calculation in Chapter 1. Formula 8.1 can be used to calculate the correlation value of the verbal ability and intelligence scores from the earlier example. Calculating the equivalent verbal-ability and intelligence z values with Formula 3.1 produces the z values for the original raw scores listed in Table 8.3. Here, each pair of z scores is multiplied and the products summed: $(-1.991 \times -1.902) + (-1.212 \times -0.761) + \dots + (1.385 \times 1.522) =$
  • 23. 10.313 This provides the numerator to be used in the formula rxy 5 ∑[(zx)(zy)] n 2 1 Then, for the denominator, n (the number of pairs of scores) 5 12, so n 2 1 5 11. Therefore, substituting these values into the above equation gives rxy 5 10.313 11 5 0.938 Table 8.3: z values Verbal ability (x) 21.991 21.212 20.848 20.537 20.173 0.087 0.242 0.398 0.710 0.917 1.021 1.385 Intelli- gence (y) 21.902 20.761 21.141 20.380 20.380 20.380 0.380 0.761 1.141 0.761 0.380 1.522 tan82773_08_ch08_227-262.indd 236 3/3/16 12:34 PM © 2016 Bridgepoint Education, Inc. All rights reserved. Not for
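The Formula 8.1 calculation can be sketched in a few lines of Python. This is only an illustration: the variable names are invented here, and the z scores are typed in from Table 8.3 rather than recomputed from raw scores, so the result may differ from the printed 10.313 and 0.938 in the third decimal place because of rounding.

```python
# z scores copied from Table 8.3 (x = verbal ability, y = intelligence)
z_x = [-1.991, -1.212, -0.848, -0.537, -0.173, 0.087,
       0.242, 0.398, 0.710, 0.917, 1.021, 1.385]
z_y = [-1.902, -0.761, -1.141, -0.380, -0.380, -0.380,
       0.380, 0.761, 1.141, 0.761, 0.380, 1.522]

n = len(z_x)  # number of pairs of scores, not the number of individual scores

# Numerator of Formula 8.1: the sum of the products of paired z scores
sum_of_products = sum(zx * zy for zx, zy in zip(z_x, z_y))

# Formula 8.1: divide by n - 1 because these are sample data
r_xy = sum_of_products / (n - 1)

print(f"n = {n}")
print(f"sum of z-score products = {sum_of_products:.3f}")
print(f"r_xy = {r_xy:.3f}")
```

Working from z scores keeps the arithmetic transparent, which is why Formula 8.1 is described as the visually simplest version.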
  • 24. With a maximum possible correlation value of 1.0, $r_{xy} = 0.938$ indicates a strong relationship between verbal ability and intelligence, something that is reflected in the fact that many intelligence tests include subtests of verbal ability.

Although Formula 8.1 is visually simple, the need to transform everything into z scores before calculating $r_{xy}$ makes the calculations too time consuming and tedious to complete by hand. Formula 8.2, the formula we will use, turns out to be the formula programmed into many hand-calculators. It is visually more complex but much easier to execute:

Formula 8.2

$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$

where

    x = one of the scores in each pair, as above in the z score formula.
    y = the other score in the pair.
    n = the number of participants (the number of pairs of scores).

  • 25. $\sum xy$ indicates that each pair of scores is multiplied and then the products for each pair summed. The resulting value is the "sum of the cross-products."
    $\sum x^2$ indicates that each x score is squared, and then the squares summed.
    $(\sum x)^2$ indicates that the original x scores are totaled, and then the total is squared.
    $\sum y^2$ indicates that each y score is squared, and then the squares summed.
    $(\sum y)^2$ indicates that the original y scores are totaled, and then the total is squared.

The formula is not as daunting as it appears. The process will become familiar after a few problems. Probably Excel or a hand-calculator with a built-in correlation function will perform most of the statistical "heavy-lifting," but it is helpful to prepare for that occasional time when there is no computer and the calculator has no correlation function.

A Correlation Example

A researcher is duplicating a classic experiment by psychologist E. L. Thorndike. The experiment relates to Thorndike's Law of Effect, which maintains that behaviors followed by a satisfying state of affairs will likely be repeated. In the experiment, the researcher sets up a cage equipped with a door that opens if a cat placed in the cage bats a string suspended inside the cage.

  • 26. Photo caption: Thorndike's Law of Effect maintains that behaviors followed by satisfaction are likely to be repeated. A hungry cat will learn to bat a suspended string if that action is followed by food.

According to the law of effect, if batting the string is followed by something satisfying, that behavior should occur more frequently in future trials than other behaviors. A hungry cat is placed in the cage and food is placed outside, where it is inaccessible from the inside of the cage. The data are the trial number and the elapsed time, in minutes, before the cat releases itself. The experiment is repeated 10 times over as many days. Table 8.4 lists the data.

Table 8.4: Experimental results from cat behavioral study

    Trial number    Elapsed time (minutes)
    1               5.0
    2               5.5
    3               4.75
    4               4.5
    5               4.25
    6               3.5
    7               2.75
    8               2.0
    9               1.0
    10              0.25

Figure 8.4 shows the scatterplot for these data, which suggests that the relationship is probably negative and quite strong.
  • 27. Figure 8.4: The relationship between number of trials and elapsed time

The correlation value checks both conclusions. To determine the correlation, we use Formula 8.2:

$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$

The number of trials (n) = 10. The researcher can then verify that

    $\sum xy = 137.25$
    $\sum x^2 = 385$
    $(\sum x)^2 = (55)^2$
    $\sum y^2 = 141$
    $(\sum y)^2 = (33.5)^2$

  • 28. Substituting the relevant values gives

$$r_{xy} = \frac{10(137.25) - (55)(33.5)}{\sqrt{[10(385) - (55)^2][10(141) - (33.5)^2]}} = \frac{1372.5 - 1842.5}{\sqrt{(3850 - 3025)(1410 - 1122.25)}} = \frac{-470}{\sqrt{(825)(287.75)}} = -0.965$$

Interpreting Results

The relationship is indeed negative, and because the maximum correlation is ±1.0, the relationship is also very strong. Neither of those conclusions indicates whether the result is statistically significant, however. As with z, t, and F, significance is determined by comparing the calculated value to the table value indicated by the relevant degrees of freedom and the selected level of probability. A calculated correlation whose absolute value is at least as large as the table value is one that probably did not occur by chance. For the Pearson correlation, the values are in Table 8.5 (see also Table B.5 in Appendix B).

Like the t and F values, the correct critical value for r is determined by degrees of freedom and by the level of probability the researcher selects. The degrees of freedom for a Pearson correlation are the number of pairs of data, minus 2. Be careful not to confuse the number of pairs with the number of scores.

  • 29. The entries in Table 8.5 indicate the absolute value that the calculated $r_{xy}$ must reach to be confident that the correlation did not occur by chance. The level of confidence in that conclusion is indicated by the columns for p = 0.1, p = 0.05, and p = 0.01. To have some practice interpreting the values, note the following:

    • If a correlation were calculated for n = 7 pairs of data (which means that df = 5) and the result was $r_{xy} = \pm 0.669$, there is 1 chance in 10, or in other words p = 0.1, that the correlation occurred by chance. A chance, or random, correlation means that if new data were collected and the $r_{xy}$ value calculated a second time, it would probably be less than the table value.
    • If the researcher wants more assurance against a random correlation, $r_{xy} = \pm 0.754$ (also with 5 degrees of freedom) will occur by chance just 5 times in 100 (p = 0.05) and $r_{xy} = \pm 0.875$ will occur by chance just 1 time in 100 (p = 0.01).
  • 30. Researchers most commonly settle on p = 0.05 or 0.01. The p = 0.1 level occurs in statistical tables less often because in most research settings, a one-in-ten chance of a random correlation is too great. No one wants to conclude that a correlation is statistically significant when there is too much chance that the finding will not hold up under further investigation. In exploratory or descriptive research when there is little prior research on which to rely, however, sometimes investigators will relax the probability to p = 0.1.

Table 8.5: The critical values of $r_{xy}$ (lowest statistically significant correlation for the specified probability)

    Number of xy pairs (n)    df (n - 2)    p = 0.10    p = 0.05    p = 0.01
    3                         1             0.988       0.997       1.000
    4                         2             0.900       0.950       0.990
    5                         3             0.805       0.878       0.959
    6                         4             0.729       0.811       0.917
    7                         5             0.669       0.754       0.875
    8                         6             0.621       0.707       0.834
    9                         7             0.582       0.666       0.798
    10                        8             0.549       0.632       0.765
    11                        9             0.521       0.602       0.735
    12                        10            0.497       0.576       0.708
    13                        11            0.476       0.553       0.684
    14                        12            0.458       0.532       0.661
    15                        13            0.441       0.514       0.641
    16                        14            0.426       0.497       0.623
    17                        15            0.412       0.482       0.606
    18                        16            0.400       0.468       0.590
    19                        17            0.389       0.456       0.575
    20                        18            0.378       0.444       0.561
    21                        19            0.369       0.433       0.549
    22                        20            0.360       0.423       0.537
    23                        21            0.352       0.413       0.526
    24                        22            0.344       0.404       0.515
    25                        23            0.337       0.396       0.505

  • 31. Source: Brighton Webs Ltd. (2006). Critical values of correlation coefficient (R). Statistics for Energy and the Environment. Retrieved from https://web.archive.org/web/20110117193722/http://www.brighton-webs.co.uk/tables/critical_values_r.asp

  • 32. The Relationship Between Degrees of Freedom and Significance

Even with a correlation value as extreme as -0.965, checking the table for significance is important. In both the t test and ANOVA, the magnitude of the critical values declines as degrees of freedom (and sample size) increase. It is the same with correlation, but here the decline in critical values is more dramatic. Note from the table, for example, that if n = 3 (and therefore df = 1), the correlation would need to be at least $r_{xy} = 0.997$ (nearly perfect) to be statistically significant at p = 0.05. The related point is that with only three pairs of data, the potential for a random relationship that looks significant is very high. At the other extreme, if n = 25 (so that df = 23), a correlation of just $r_{xy} = 0.396$ is statistically significant. That much data bears a much lower potential for an accidental (random) relationship.
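As a rough sketch of how the raw-score formula (Formula 8.2) and the table check fit together, the Python below recomputes the correlation for the cat data in Table 8.4 and compares it with a critical value. The function name `pearson_r` and the dictionary `CRITICAL_R_05` are illustrative choices made here, and the dictionary holds only a handful of p = 0.05 entries copied from Table 8.5.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from raw scores, following the terms of Formula 8.2."""
    n = len(x)                                  # number of pairs
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))   # sum of the cross-products
    sum_x2 = sum(a * a for a in x)              # each x squared, then summed
    sum_y2 = sum(b * b for b in y)              # each y squared, then summed
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from Table 8.4
trials = list(range(1, 11))
elapsed = [5.0, 5.5, 4.75, 4.5, 4.25, 3.5, 2.75, 2.0, 1.0, 0.25]

# A few p = 0.05 critical values copied from Table 8.5, keyed by df
CRITICAL_R_05 = {1: 0.997, 5: 0.754, 8: 0.632, 13: 0.514, 23: 0.396}

r = pearson_r(trials, elapsed)
df = len(trials) - 2                            # pairs minus 2
significant = abs(r) >= CRITICAL_R_05[df]       # compare the absolute value

print(f"r = {r:.3f}, df = {df}, critical value = {CRITICAL_R_05[df]}")
print("statistically significant at p = 0.05" if significant
      else "not statistically significant at p = 0.05")
```

The comparison uses the absolute value of r because the table lists magnitudes only; the sign describes the direction of the relationship, not its significance.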
  • 33. The Statistical Hypotheses

The null and alternate hypotheses for correlation reflect the fact that we have moved away from the hypothesis of difference. The null hypothesis is that no relationship between the variables exists. Symbolically, it is written:

$$H_0: \rho = 0$$

The symbol ρ is the Greek letter rho (as in "row" your boat) and the population equivalent of r. So the null hypothesis states that the correlation equals 0. More specifically, it means that there is no statistically significant relationship. The alternate hypothesis states that the correlation does not equal 0, that a statistically significant relationship will emerge each time data are collected and the relationship calculated:

$$H_A: \rho \neq 0$$

The Coefficient of Determination

One of our important recurring themes is the distinction between statistical significance and practical importance. Determining practical importance was the reason for omega-squared and eta-squared calculations for significant t test and ANOVA results, respectively.

Effect sizes take on particular importance with correlation because with large samples, relatively small correlations can be statistically significant. The effect size corresponding to the Pearson correlation is the coefficient of determination ($r_{xy}^2$). As the notation suggests, the coefficient of determination is the square of the correlation coefficient. Squaring the correlation indicates how much of the variance in y is explained by x (or vice versa, since correlation does not assume cause).

  • 34. In the problem about number of trials and elapsed time, $r_{xy} = -0.965$, so $r_{xy}^2 = 0.931$.

For that problem, the coefficient of determination is interpreted this way: the number of trials can explain about 93% of the variance in time elapsed, which would be a very important finding with implications for many kinds of performance tasks, except that the numbers were contrived.

The Interpretive Value of $r_{xy}^2$

The coefficient of determination can also indicate how unimportant some low correlations are, even when they are statistically significant. For example, with 23 degrees of freedom, a correlation of $r_{xy} = 0.396$ is statistically significant. The coefficient of determination for that value is $r_{xy}^2 = 0.157$. One variable in such a relationship explains just 16% of the variance in the other. The other 84% of the variability is related to other factors.
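The arithmetic behind these two interpretations is simple enough to check directly. The sketch below is illustrative only; the helper name `coefficient_of_determination` is made up here, and the two r values are the ones quoted in the text.

```python
def coefficient_of_determination(r):
    """Square the correlation to get the proportion of variance explained."""
    return r ** 2

# (description, r) pairs quoted in the text
examples = [
    ("trials vs. elapsed time", -0.965),
    ("smallest significant r at df = 23", 0.396),
]

for label, r in examples:
    r2 = coefficient_of_determination(r)
    print(f"{label}: r = {r:+.3f}, r^2 = {r2:.3f}, "
          f"explained = {r2:.0%}, unexplained = {1 - r2:.0%}")
```

Because squaring removes the sign, a negative correlation and a positive correlation of the same magnitude explain the same proportion of variance.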
  • 35. When the variables describe the behavior of people, small coefficients of determination do not surprise us because they reflect the complexity of human subjects. Very few individual variables will explain large proportions of human behavior.

Sometimes, however, even low correlations and low $r_{xy}^2$ values are important. If research revealed that the correlation between the age of first exposure to illegal narcotics and the development of an addiction was $r_{xy} = -0.3$, that value (note the negative correlation) indicates that the younger subjects are at first exposure, the more likely they are to develop an addiction. The resulting $r_{xy}^2$ value would be just 0.09. But even if just 9% of the variance in addiction is explained by age at first exposure, within the context of human complexity that would be considered important. Practical importance is a function of consequences.

Comparing Correlation Values

In isolation, correlation coefficients can be difficult to interpret because correlation strength does not increase or decrease in consistent increments. The change from $r_{xy} = 0.2$ to $r_{xy} = 0.3$ is a less dramatic increase in strength than the increase from $r_{xy} = 0.75$ to $r_{xy} = 0.85$, for example. Although the Pearson r requires equal-interval data, the resulting coefficients are not themselves equally spaced: an increase of 0.1 reflects a very different change from 0.8 to 0.9 than it does from 0.2 to 0.3. In terms of variance explained, moving from 0.2 to 0.3 raises $r_{xy}^2$ from 0.04 to 0.09, whereas moving from 0.8 to 0.9 raises it from 0.64 to 0.81. It takes a much stronger increase in the relationship to increase by 0.1 in the upper ranges of correlation values than in the lower ranges, something suggested by the distance between tenths in this number line:

    $r_{xy}$ = 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

  • 36. Squaring the correlation coefficient makes the intervals consistent. A change in the coefficient of determination from 0.35 to 0.5, for example, represents the same increase in proportion of variance explained as an increase from 0.7 to 0.85, as the line suggests:

    $r_{xy}^2$ = 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Another Correlation Problem

A foundation interested in what prompts contributions to charitable causes retains a consultant. Noting that donation amounts vary with age, the consultant gathers the data in Table 8.6 and generates the values in Problem 8.1.

Table 8.6: Data on charity donations

    Donor    Age    Amount
    1        25     20
    2        27     20
    3        32     35
    4        32     25
    5        35     100
    6        38     50
    7        43     75
    8        45     45
    9        45     100
    10       47     150
    11       48     100
    12       52     200
    13       63     50
    14       65     100
    15       66     125
  • 37. Problem 8.1: The Pearson correlation for contributor's age and contribution amount

    Donor's age (x)    x^2       Contribution amount (y)    y^2        xy
    25                 625       20                         400        500
    27                 729       20                         400        540
    32                 1,024     35                         1,225      1,120
    32                 1,024     25                         625        800
    35                 1,225     100                        10,000     3,500
    38                 1,444     50                         2,500      1,900
    43                 1,849     75                         5,625      3,225
    45                 2,025     45                         2,025      2,025
    45                 2,025     100                        10,000     4,500
    47                 2,209     150                        22,500     7,050
    48                 2,304     100                        10,000     4,800
    52                 2,704     200                        40,000     10,400
    63                 3,969     50                         2,500      3,150
    65                 4,225     100                        10,000     6,500
    66                 4,356     125                        15,625     8,250

    $\sum x = 663$    $\sum x^2 = 31{,}737$    $\sum y = 1{,}195$    $\sum y^2 = 133{,}425$    $\sum xy = 58{,}260$

  • 38. The correlation of the donor's age and the contribution amount is calculated as follows:

$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{15(58{,}260) - (663)(1{,}195)}{\sqrt{[15(31{,}737) - (663)^2][15(133{,}425) - (1{,}195)^2]}} = \frac{81{,}615}{\sqrt{(36{,}486)(573{,}350)}} = 0.564$$

    • The critical value at p = 0.05 and 13 df ($r_{0.05(13)}$) is 0.514.
    • Because $r_{xy} > r_{0.05(13)}$, the correlation is statistically significant.
    • The coefficient of determination ($r_{xy}^2$) = 0.318, which indicates that age can explain about 32% of the variability in donation amount.
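The column sums in Problem 8.1 are where hand calculations most often go wrong, so it can help to rebuild them programmatically. The following Python is a sketch under that assumption; the variable names are invented here and the data are typed in from Table 8.6.

```python
from math import sqrt

# Data from Table 8.6
ages = [25, 27, 32, 32, 35, 38, 43, 45, 45, 47, 48, 52, 63, 65, 66]
amounts = [20, 20, 35, 25, 100, 50, 75, 45, 100, 150, 100, 200, 50, 100, 125]

n = len(ages)
sum_x = sum(ages)
sum_y = sum(amounts)
sum_x2 = sum(x * x for x in ages)               # the x^2 column total
sum_y2 = sum(y * y for y in amounts)            # the y^2 column total
sum_xy = sum(x * y for x, y in zip(ages, amounts))  # the xy column total

# Formula 8.2
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
r_squared = r ** 2

print(f"sums: x = {sum_x}, x^2 = {sum_x2}, y = {sum_y}, y^2 = {sum_y2}, xy = {sum_xy}")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print("exceeds the 0.514 critical value (p = 0.05, df = 13):", r > 0.514)
```

Printing the intermediate sums alongside r makes it easy to compare each column total with the hand-worked table before trusting the final coefficient.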
  • 39. Problem 8.1 suggests some of the hazard in rushing to judgment about cause from correlation data. While we might be tempted to reduce the problem to "older people contribute more to charity than younger people," other factors are probably at work, not the least of which is that age likely correlates with income as well. Perhaps it is not age that explains contribution amount so much as income. The correlation value, while instructive and important, indicates only how variables co-vary, not necessarily why the variables involved vary.

8.3 Correlating Data When One Variable Is Dichotomous

If the consultant had asked how the donation amount and the donor's gender relate, Pearson still provides the answer, but the procedure becomes a point-biserial correlation. The word point refers to the continuous variable, the amount of money donated in this example. The word biserial refers to the other variable, which has only two levels. The required change is coding the gender variable in a way that reflects its dichotomy: as either 0 or 1. Which of females or males are coded 0 and which 1 will not affect the strength of the coefficient.

The point-biserial correlation has a number of applications. Questions about the relationship between marital status and income, between public versus private school students and achievement, or between Republicans' and Democrats' optimism are all questions that could be analyzed with point-biserial correlation.

  • 40. In point-biserial correlations, which level is coded 0 and which 1 affects only the sign of the coefficient. We will need to be careful when interpreting the result.

If donors 3, 5, 6, 7, 9, 10, 11, and 14 are female, and if females are coded 1 and males 0, the researcher obtains the data in Table 8.7.

Table 8.7: Data on charity donations by donor type (gender)

    Donor (x; 1 = female, 0 = male)    Amount (y)
    0                                  20
    0                                  20
    1                                  35
    0                                  25
    1                                  100
    1                                  50
    1                                  75
    0                                  45
    1                                  100
    1                                  150
    1                                  100
    0                                  200
    0                                  50
    1                                  100
    0                                  125

Calculating the Point-Biserial Correlation

The amounts donated (the y values) remain the same from the age/donation problem (Problem 8.1, where $\sum y = 1{,}195$ and $\sum y^2 = 133{,}425$). The other values must be recalculated, although that task becomes much simpler with gender (x) recoded to 1s and 0s. Table 8.8 lists those results.

Try It!: #4 What is the relationship between degrees of freedom and statistical significance in correlation?
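Because the point-biserial correlation is the Pearson formula applied to a 0/1 variable, the same raw-score calculation can be reused unchanged. The Python below is a minimal sketch of that idea using the coding from Table 8.7; the function name `pearson_r` and the list names are illustrative. It also flips the coding to show the effect on the sign.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from raw scores (Formula 8.2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Table 8.7: gender coded 1 = female, 0 = male, with the donation amounts
female_coded_1 = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0]
amounts = [20, 20, 35, 25, 100, 50, 75, 45, 100, 150, 100, 200, 50, 100, 125]

# Reverse the coding so that males are 1 and females are 0
male_coded_1 = [1 - code for code in female_coded_1]

r_female_1 = pearson_r(female_coded_1, amounts)
r_male_1 = pearson_r(male_coded_1, amounts)

print(f"females coded 1: r = {r_female_1:.2f}")
print(f"males coded 1:   r = {r_male_1:.2f}")
```

The two coefficients have the same magnitude and opposite signs, which is why the interpretation has to state which group was coded 1.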
  • 41. Table 8.8: Point-biserial correlation results

    Gender (x)    x^2    Amount (y)    y^2        xy
    0             0      20            400        0
    0             0      20            400        0
    1             1      35            1,225      35
    0             0      25            625        0
    1             1      100           10,000     100
    1             1      50            2,500      50
    1             1      75            5,625      75
    0             0      45            2,025      0
    1             1      100           10,000     100
    1             1      150           22,500     150
    1             1      100           10,000     100
    0             0      200           40,000     0
    0             0      50            2,500      0
    1             1      100           10,000     100
    0             0      125           15,625     0

    $\sum x = 8$    $\sum x^2 = 8$    $\sum y = 1{,}195$    $\sum y^2 = 133{,}425$    $\sum xy = 710$

Return to Formula 8.2, in which

$$r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}}$$

  • 42. Substituting in the values from Table 8.8 gives

$$r_{xy} = \frac{15(710) - (8)(1{,}195)}{\sqrt{[15(8) - (8)^2][15(133{,}425) - (1{,}195)^2]}} = 0.19$$

Still testing at p = 0.05 and with the degrees of freedom still df = 13, from Table 8.5 the critical value is still $r_{0.05(13)} = 0.514$. Therefore the statistical decision will be to fail to reject $H_0$. The relationship between the donor's gender and the amount contributed is not statistically significant. The $r_{xy} = 0.19$ result is probably a random correlation that is unlikely to reach the critical value from the table in any new analysis with new subjects.

The interpretation of the point-biserial correlation is the same as it is for conventional Pearson correlations, except that the sign of the coefficient is a function only of which level is coded 1. If male donors had been coded with 1s, the correlation would have been negative, $r_{xy} = -0.19$. Consider a few more applications for the point-biserial correlation:

    • What is the relationship between whether or not a parent earned a college degree and the child's grades?
    • How is whether or not a student is a native speaker of English related to the student's test score?
    • What is the correlation between blue-collar/white-collar jobs and the amount of leisure time?

  • 43. If both variables are dichotomous, another bivariate correlation is involved. It is called the phi coefficient, discussed in Chapter 10.

Degrees of Significance?

At $r_{xy} = 0.19$ and a table value of $r_{0.05(13)} = 0.514$, the correlation value is not significant. If the value had been $r_{xy} = 0.50$, and this correlation value represented some relationship calculated for your senior thesis, would it be appropriate to refer to it as "almost significant" or "nearly significant"? It is not uncommon to see such qualifiers even in the published literature, but significance decisions should be treated the same way as dichotomous variables. Only two outcomes are possible: The correlation is significant or it is not significant. To try to make a statement about the nearness to an alternative outcome undermines the principle behind significance testing. Only two hypotheses for significance exist, and the outcome is couched in terms of one or the other.

8.4 The Pearson Correlation in Excel

A psychologist is interested in determining the relationship
  • 44. between risk-taking and success in solving novel problems. Having devised the Inventory Risk Survey Catalog (the I-RiSC), the psychologist gauges the willingness of a group of 16-year-olds to do the unconventional and then provides a series of word problems with which the participants are unfamiliar. Scores on the I-RiSC and the problems for 10 participants are listed in Table 8.9.

Table 8.9: Risk-taking and problem-solving success data

    I-RiSC    Problems
    2         14
    7         17
    4         14
    5         16
    1         12
    8         17
    7         16
    9         17
    3         15
    6         15

To complete the problem in Excel, it is best to set up the data in two columns. Two rows also will work, but parallel columns are visually simpler.

    1. Create a label in cell A1 for "I-RiSC" and in cell B1 "ProbSolv" so that the I-RiSC data appear in cells A2 to A11 and the ProbSolv data appear in B2 to B11.
    2. From the Home tab at the top of the page click Data, and then Data Analysis at the far right.
    3. Select Correlation, which is the second option in the window.
    4. In the Input Range window enter A2:B11, which indicates the cells where the data are found. Note that the default groups the data in columns. (Change the default if entering the data in rows.) Had the "Labels in First Row" box been checked, Excel would have treated the first row in each column (A2 and B2 because that is what is designated) as labels rather than data. Our adjustment for the labels was made by indicating that the data begin in A2 rather than A1.
    5. Enter a cell value below or to the right of the last data entry for the Output Range so that the results do not overwrite the scores: either cell A12 or below, or to the right of column B.
    6. Click OK.

  • 45. The results appear in a box called a correlation matrix (see Table 8.10). The intersection of column 1 and column 2 indicates how well the data in column 1 (the Excel A column, where I-RiSC data are located) correlate with the data in column 2 (the Excel B column, which contains the problem-solving scores).

Table 8.10: Correlation matrix

                Column 1    Column 2
    Column 1    1           0.904203
    Column 2    0.904203    1

  • 46. The result of the analysis is a Pearson correlation of $r_{xy} = 0.904$. The 1s in the diagonal indicate that each variable correlates perfectly with itself ($r_{xy} = 1.0$), of course. Note that the output does not indicate whether the calculated value is statistically significant, which makes a check of the critical values table necessary. Table 8.5 indicates that $r_{0.05(8)} = 0.632$. Because 0.904 exceeds that value, the relationship between risk-taking and problem solving is statistically significant. Were these data not contrived, it would be quite important to know that about 82% ($r_{xy}^2 = 0.904^2 = 0.818$) of problem-solving success is explained by whatever the I-RiSC measures, ostensibly the subject's willingness to be unconventional.

Apply It! Investigating the Correlation between Crime and Unemployment

A law enforcement analyst is interested in any link between crime and unemployment as a guide to allocating crime-prevention funds. Specifically, she would like to know whether murders and property crimes correlate with the unemployment rate. The analyst obtains the murder and property-crime rates for her state for the 16 years from 1990 to 2005 from the FBI Uniform Crime Reports (rates are per 100,000 inhabitants). She then consults the Bureau of Labor Statistics for the unemployment rate in the state for the
  • 47. same period. The analyst will compute the Pearson correlation between murder rate and unemployment and then between property-crime rate and unemployment. Table 8.11 shows the data.

Table 8.11: Murder rate, property crime, and unemployment

    Year    Murder rate (per 100,000 people)    Property crime rate (per 100,000 people)    Unemployment percentage
    1990    7.1                                 4462                                        5.6
    1991    6.7                                 5092                                        6.8
    1992    6.4                                 4801                                        7.5
    1993    6.4                                 4662                                        6.9
    1994    6.2                                 4678                                        6.1
    1995    5.7                                 4460                                        5.6
    1996    5.8                                 4438                                        5.4
    1997    5.4                                 4279                                        4.9
    1998    6.1                                 4040                                        4.5
    1999    5.5                                 3852                                        4.2
    2000    5.1                                 3592                                        4.0
    2001    4.9                                 3456                                        4.7
    2002    4.3                                 3412                                        5.8
    2003    4.2                                 3289                                        6.0
    2004    4.7                                 3168                                        5.5
    2005    5.0                                 3081                                        5.1

  • 48. The Excel results indicate the following:

    • The correlation between murder rate and unemployment is $r_{xy} = 0.386$.
    • Comparing the murder rate/unemployment rate correlation to the critical value from Table 8.5 ($r_{0.05(14)} = 0.497$) indicates that the calculated correlation is not statistically significant at p = 0.05.
    • The analyst fails to reject the null hypothesis, $\rho = 0$.
    • The property crimes rate and unemployment correlation is $r_{xy} = 0.551$.
    • Comparing the calculated value to the critical value from Table 8.5 (the same $r_{0.05(14)} = 0.497$, since df are unchanged) indicates that this correlation is statistically significant at p = 0.05.
    • The analyst rejects the null hypothesis, $\rho = 0$.
    • The coefficient of determination for this relationship is $r_{xy}^2 = 0.551^2 = 0.303$. About 30% of the variance in the property crime rate can be explained by the unemployment rate.

  • 49. Although the $r_{xy}^2$ indicates that about 30% of property crime is explained by variations in unemployment, the analyst will want to be careful about making the conceptual leap to a causal conclusion. "Explained by" isn't the same as "caused by." To reiterate the point, perhaps something else explains both crime rate and unemployment. Perhaps underfunded public schooling prompts an unusually high dropout rate from school. The consequently undereducated population has more difficulty securing stable employment. Perhaps state budget cuts have been disproportionately imposed on police agencies, and with fewer officers on the street, crime rises. In other words, the simplest explanation might not be the most accurate. A statistically significant correlation is not where the analysis ends.

Apply It! boxes written by Shawn Murphy
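For readers working outside Excel, the two correlations in this box can be reproduced with a short script. The Python below is a sketch only: `pearson_r` is an illustrative helper written for this example, the data are typed in from Table 8.11, and the 0.497 critical value is the p = 0.05, df = 14 entry from Table 8.5.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from raw scores (Formula 8.2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Table 8.11, 1990 through 2005
murder = [7.1, 6.7, 6.4, 6.4, 6.2, 5.7, 5.8, 5.4,
          6.1, 5.5, 5.1, 4.9, 4.3, 4.2, 4.7, 5.0]
property_crime = [4462, 5092, 4801, 4662, 4678, 4460, 4438, 4279,
                  4040, 3852, 3592, 3456, 3412, 3289, 3168, 3081]
unemployment = [5.6, 6.8, 7.5, 6.9, 6.1, 5.6, 5.4, 4.9,
                4.5, 4.2, 4.0, 4.7, 5.8, 6.0, 5.5, 5.1]

CRITICAL_R = 0.497  # Table 8.5: p = 0.05, df = 16 - 2 = 14

r_murder = pearson_r(murder, unemployment)
r_property = pearson_r(property_crime, unemployment)

for label, r in [("murder rate", r_murder), ("property crime rate", r_property)]:
    verdict = "significant" if abs(r) >= CRITICAL_R else "not significant"
    print(f"{label} vs. unemployment: r = {r:.3f}, r^2 = {r ** 2:.3f} ({verdict})")
```

A script like this reports the coefficient but, like the Excel output, says nothing about why the variables move together; the caution about causal conclusions applies equally.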
  • 50. 8.5 Spearman's Rho

The Pearson correlation requires that both variables be at least interval scale. The point-biserial correlation requires that one variable be at least interval scale and that the other be a variable with only two levels.

Neither of these correlations is helpful when the data are ordinal scale, which describes much of the data that psychologists and other social scientists encounter. Nearly everyone who goes to the mall or answers the telephone has been asked to take a survey, particularly if it happens to be an election year. Survey data are usually ordinal scale. It is common for the questionnaires to have a Likert-type format, where a statement is read and the respondents are asked the degree to which they agree with the statement by selecting from a range of choices such as:

    • Strongly agree
    • Agree
    • Neither agree nor disagree
    • Disagree
    • Strongly disagree

Although surveyors commonly code the responses (strongly agree = 1, agree = 2, and so on) and then calculate means and standard deviations for all respondents, those statistics assume that the data are at least interval scale. Survey data rarely are. The Likert types of responses are essentially rankings. A response of "strongly agree" is more positive than "agree" but precisely how much more is not clear. Besides, one respondent's "disagree" may be another respondent's "strongly disagree." These data are more safely treated as ordinal scale responses.

  • 51. Correlating Ordinal, or Mixed Ordinal/Interval Data

In addition to survey data, ordinal scale characterizes other common data, such as class rankings and percentile scores. Sometimes the variables investigators might wish to correlate have mixed scales. For example, a researcher wants to correlate subjects' income (ratio scale data) with their optimism (usually gauged with a Likert-type survey and so ordinal scale). Along with the ordinal variable, the income variable is often not normally distributed. The lack of normality in both the ratio variable and the ordinal scale variable rules out a Pearson's correlation.

Charles Spearman, Pearson's colleague at University College London, developed a tremendously flexible correlation procedure. It accommodates two variables in a correlation procedure, provided the variables fit any of the following:

    • Both are ordinal scale.
    • One variable is ordinal scale and one is interval or ratio scale.
    • Two variables are interval or ratio scale, but one or both fail to meet the Pearson correlation requirement for normality.
  • 52. The procedure is Spearman's rho, symbolized by ρ. Spearman's rho is a nonparametric procedure, which means that it makes no assumptions about parameters; it means that ρ will accommodate data when there are reasons to suspect that the data are not normally distributed. The formula, which requires that the scores for each variable be independently ranked, is as follows:

Formula 8.3

$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where

    d = the difference between the rankings for the two variables
    n = the number of pairs of data

The formula's 1s and 6 are constant values, used every time a Spearman's correlation is calculated.
  • 53. Following are the steps to calculating a Spearman’s rho: 1. Rank the scores for both variables separately. 2. For each pair of rankings, subtract the second ranking in the pair from the first to produce a difference score, d. 3. Square each of the d values for d2. 4. Sum the d2 values for ∑d2. 5. Solve for ρ. Ranking Tied Scores The ranking procedure must follow rules. If some of the scores for one of the variables have multiples, all must receive the same ranking. If someone were ranking the following values, for example: 3, 5, 6, 6, 7, 8, 8, 8, 9, 10 ranking the values from smallest to largest produces the following values: 1, 2, 3.5, 3.5, 5, 7, 7, 7, 9, 10. The smallest value, 3, was ranked “1,” the 5 was ranked “2,” and so on. The two 6s and the three 8s were handled as follows: • Because the two 6s are rankings 3 and 4, those two values are added and divided by the number of them (2), which results in 3.5 ([3 1 4] 4 2). After both 6s are ranked 3.5 (for places 3 and 4) the next value in the data set, 7, is ranked 5.
  • 54. • The 8s are all ranked 7 ([6 + 7 + 8] ÷ 3), after which the next value, the 9, is ranked 9.

An Example

Suppose the data ranked above measure emotional stability, a variable thought to correlate negatively with stress. If those data are collected for career military service personnel assigned to combat areas, and age data are added for the 10 subjects, Table 8.12 might be the result.

Table 8.12: Emotional stability and age data
Emotional stability    Age
 3                     26
 5                     25
 6                     32
 6                     35
 7                     35
 8                     34
 8                     37
 8                     40
 9                     42
10                     39

  • 55. Calculations for a Spearman’s rho solution, based on the information in Problem 8.2, give
$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(24.5)}{10(10^2 - 1)} = 0.852$$
Table 8.13 lists the critical values for Spearman’s rho (Table B.6 in Appendix B). There are no degrees of freedom for this procedure. The correct critical value for rho is indicated by the number of data pairs. Note that for p = 0.05 and 10 pairs, $\rho_{.05(10)} = 0.648$. Because the calculated value of 0.852 exceeds that critical value, the relationship between emotional stability and age among service personnel assigned to combat zones is statistically significant; therefore, we reject H0.

Try It!: #5
Spearman’s rho requires data of what scale?

  • 56. Table 8.13: The critical values for Spearman’s rho
Number of pairs of scores    p = 0.05    p = 0.01
 5                           1.000
 6                           0.886       1.000
 7                           0.786       0.929
 8                           0.738       0.881
 9                           0.683       0.833
10                           0.648       0.794
12                           0.591       0.777
14                           0.544       0.715
16                           0.506       0.665
18                           0.475       0.625
20                           0.450       0.591
22                           0.428       0.562
24                           0.409       0.537
26                           0.392       0.515
28                           0.377       0.496
30                           0.364       0.478
Source: University of Sussex. (n.d.). Critical values of Spearman’s rho (two-tailed). Retrieved from www.sussex.ac.uk/Users/grahamh/RM1web/Rhotable.htm
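The decision rule used with Table 8.13 (compare the calculated ρ with the tabled critical value for the number of pairs) can be sketched in Python. This is only an illustration under stated assumptions: the names `CRITICAL_RHO_05` and `is_significant` are hypothetical, only the p = 0.05 column of the table is copied in, and a calculated value is treated as significant when its absolute value meets or exceeds the tabled value.

```python
# Two-tailed critical values of Spearman's rho at p = 0.05,
# copied from the p = 0.05 column of Table 8.13 (key = number of pairs).
CRITICAL_RHO_05 = {
    5: 1.000, 6: 0.886, 7: 0.786, 8: 0.738, 9: 0.683, 10: 0.648,
    12: 0.591, 14: 0.544, 16: 0.506, 18: 0.475, 20: 0.450, 22: 0.428,
    24: 0.409, 26: 0.392, 28: 0.377, 30: 0.364,
}


def is_significant(rho, n_pairs):
    """Return True when |rho| meets or exceeds the tabled critical value."""
    if n_pairs not in CRITICAL_RHO_05:
        # The table lists only some sample sizes; no interpolation is attempted.
        raise KeyError(f"Table 8.13 has no entry for {n_pairs} pairs")
    return abs(rho) >= CRITICAL_RHO_05[n_pairs]


# Emotional stability and age example: rho = 0.852 with 10 pairs.
print(is_significant(0.852, 10))
```

Keying the lookup on the number of pairs, not on degrees of freedom, mirrors the way the table is read for this procedure.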
  • 57. Problem 8.2: The Spearman’s rho correlation: emotional stability and age among service personnel
1. Ranking the scores produces $\rho_1$ for emotional stability and $\rho_2$ for age.
2. The d score is the difference between the two rankings.
3. The square of the difference score is $d^2$.

Emotional stability    Age    ρ1     ρ2     d (ρ1 − ρ2)    d²
 3                     26     1      2      -1             1
 5                     25     2      1       1             1
 6                     32     3.5    3       0.5           0.25
 6                     35     3.5    5.5    -2             4
 7                     35     5      5.5    -0.5           0.25
 8                     34     7      4       3             9
 8                     37     7      7       0             0
 8                     40     7      9      -2             4
 9                     42     9      10     -1             1
10                     39     10     8       2             4
                                                   ∑d² = 24.50
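The ranking rule for tied scores and Formula 8.3 can be checked against the emotional stability and age data with a short Python sketch. It is not part of the original text; the function names `average_ranks` and `spearman_rho` are hypothetical, and the code applies Formula 8.3 exactly as written, tied ranks included.

```python
def average_ranks(scores):
    """Rank scores from smallest to largest; tied scores share the mean of their places."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # Places are 1-based (pos + 1 through end + 1); tied scores share their mean.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Formula 8.3: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). Returns (rho, sum_d2)."""
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    n = len(x)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2


stability = [3, 5, 6, 6, 7, 8, 8, 8, 9, 10]
age = [26, 25, 32, 35, 35, 34, 37, 40, 42, 39]

rho, sum_d2 = spearman_rho(stability, age)
print(average_ranks(stability))  # [1.0, 2.0, 3.5, 3.5, 5.0, 7.0, 7.0, 7.0, 9.0, 10.0]
print(sum_d2, round(rho, 3))     # 24.5 0.852
```

Each variable is ranked on its own before any differences are taken, which is why the two 35-year-olds share the age ranking 5.5.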
  • 58. Apply It! Exploring the Correlation Between Job Satisfaction and Commute Times

As part of the justification for allowing workers to work at home part-time, the human resources director for a large firm intends to investigate any correlation between job satisfaction and average commute time for employees. The director asks 10 randomly selected employees to fill out a job-satisfaction questionnaire. Each response to a series of questions is scored as follows:

Response                        Score
• very satisfied (vs)           1
• somewhat satisfied (ss)       2
• somewhat dissatisfied (sd)    3
• very dissatisfied (vd)        4

The employees were also asked to indicate their average one-way commute time in minutes. Recognizing that job satisfaction responses will be ordinal scale, the HR director opts for Spearman’s rho. The data and the difference scores are shown in Table 8.14.

  • 59. Table 8.14: Spearman’s rho data for the correlation between job satisfaction and commute time
Commute time (minutes)    Commute rank    Job satisfaction total    Satisfaction rank    Difference    Difference squared
 2                         1              10                        2                    -1            1
 7                         2              14                        5                    -3            9
11                         3              10                        2                     1            1
15                         4              14                        5                    -1            1
17                         5              10                        2                     3            9
23                         6              14                        5                     1            1
28                         7              17                        7.5                  -0.5          0.25
32                         8              22                        9.5                  -1.5          2.25
36                         9              22                        9.5                  -0.5          0.25
40                        10              17                        7.5                   2.5          6.25

From the table, the sum of the squared differences is
$$\sum d^2 = 1 + 9 + 1 + 1 + 9 + 1 + 0.25 + 2.25 + 0.25 + 6.25 = 31$$
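The rank and difference columns of Table 8.14 can be rebuilt from the raw commute times and satisfaction totals with a few lines of Python. This sketch is an illustration only: `rank_of` is a hypothetical helper that uses a counting shortcut (the number of smaller scores, plus half of one more than the number of equal scores), which gives the same averaged rankings as the add-and-divide rule for tied scores.

```python
def rank_of(value, scores):
    """Average rank of value within scores: (# smaller) + (# equal + 1) / 2."""
    smaller = sum(s < value for s in scores)
    equal = sum(s == value for s in scores)
    return smaller + (equal + 1) / 2


commute = [2, 7, 11, 15, 17, 23, 28, 32, 36, 40]
satisfaction = [10, 14, 10, 14, 10, 14, 17, 22, 22, 17]

sum_d2 = 0.0
for c, s in zip(commute, satisfaction):
    d = rank_of(c, commute) - rank_of(s, satisfaction)
    sum_d2 += d ** 2
    # One row of Table 8.14: time, commute rank, total, satisfaction rank, d, d squared.
    print(c, rank_of(c, commute), s, rank_of(s, satisfaction), d, d ** 2)

n = len(commute)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))  # Formula 8.3
print(sum_d2, round(rho, 3))               # 31.0 0.812
```

Because a questionnaire score of 1 means "very satisfied," a higher satisfaction total (and so a higher satisfaction rank) reflects lower satisfaction, which matters when the sign of the coefficient is interpreted.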
  • 60. Apply It! (continued)

For n = 10, the Spearman’s rho formula is
$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(31)}{10(10^2 - 1)} = 0.812$$
For p = 0.05 and 10 pairs of data, the critical value is $\rho_{.05(10)} = 0.648$. The relationship between job satisfaction and average commute time is statistically significant. Because higher questionnaire totals indicate greater dissatisfaction, the positive coefficient means that those who commute the least time have the highest levels of job satisfaction. Perhaps the attitudes of those who have the lowest levels of job satisfaction (those who have the longest commutes) will improve if they are required to commute less often because they can sometimes work from home.

Apply It! boxes written by Shawn Murphy

Direction of the Ranking

In the study of emotional stability and age for service personnel, the least stable value received the ranking of 1 and the most stable a ranking of 10, while the youngest subject received the age ranking of 1. In terms of the value of the statistic, it would not have mattered whether the rankings went from lowest to highest or from highest to lowest, as long as both variables were ranked the same way. We could have ranked the most emotionally stable 1 and the oldest 1, and the coefficient would have come out the same. If we reversed just one of them, however, the correlation would appear to be negative.

Try It!: #6
For 10 students, grade averages and rank in class are correlated. How will the resulting coefficient be affected if the highest ranked student is given the lowest value (1) versus the highest value (10)?

  • 61. Summary of Spearman’s Rho

Spearman’s correlation provides flexibility to the analyst. As long as some evidence of a relationship exists, correlations can be calculated for any combination of ordinal, interval, and ratio variables. So much latitude requires a sacrifice, however, and that sacrifice is statistical power. In the course of ranking values, the amount of difference between any two data points is lost. When the ages of the service personnel were ranked,
• the 25-year-old was 1,
• the 26-year-old was 2,
• and the 32-year-old was 3.
Once the ages are ranked, the fact that the first and second rankings are one year apart while the second and third are six years apart is lost. Pearson’s r retains those differences. When both correlations are calculated for the same data, their coefficients usually differ little, but a Pearson correlation will sometimes be statistically significant when Spearman’s is not. Note the comparison of critical values at p = 0.05 shown in Table 8.15.

  • 62. Table 8.15: Comparison of Pearson and Spearman critical values
No. pairs    Pearson critical value*    Spearman critical value
 5           0.878                      1.000
 6           0.811                      0.886
10           0.632                      0.648
*for df = number of pairs − 2

In the examples above, the value required for significance with a Spearman correlation is higher than that required for a Pearson correlation. Another limitation of the Spearman correlation is that we cannot square the Spearman value to determine the proportion of variance in y explained by x. Spearman’s rho has no equivalent of $r_{xy}^2$.

When the data do not meet the Pearson requirements, however, the researcher has no choice. When the data do meet the requirements, a Pearson’s r is usually preferable to Spearman’s rho.

  • 63. Correlation in Research

Correlation procedures answer enough of the questions that interest researchers and consumers of research that the procedures pervade research literature. Arroyo (2015) examined the correlation between work engagement and internal self-concept. Arroyo found that people tend to engage in the work they do to earn a living, not for the external rewards, but for the work’s own sake; their work is intrinsically satisfying.

Ceci and Kumar (2015), meanwhile, asked whether happiness correlates with creative capacity. They found no significant correlation but did find a significant correlation between creative capacity and intrinsic motivation, suggesting that those
  • 64. with the greatest creative capacity are probably those who are most internally driven to create. The researchers’ approach to quantifying happiness is also a matter of interest, since it is often a challenge to find a way to quantify something so subjective.

Summary and Resources

Chapter Summary

Many of the questions researchers and scholars ask deal with the relationships between variables. To accommodate them, the discussion in this chapter shifted to statistical procedures that reflect the hypothesis of association (Objective 1). Three of the many correlation procedures that respond to the hypothesis of association are the Pearson correlation, the point-biserial correlation, and Spearman’s rho. In each case, possible values range from −1.0 to +1.0, and all their coefficients are interpreted the same way. Positive correlations indicate that as the values in one variable increase, the values in the other also increase. Negative correlations indicate that as one increases, the
  • 65. other decreases. The sign of the coefficient, however, is unrelated to its strength (Objective 2).

The differences among the correlation procedures in this chapter are in the kinds of variables they accommodate. The Pearson correlation requires interval or ratio variables that are normally and similarly distributed (Objective 3). A special application of Pearson, the point-biserial correlation, requires an interval/ratio variable and a second variable that has only two manifestations, or a dichotomously scored variable (Objective 5). Spearman’s rho accommodates any combination of ordinal, interval, or ratio variables (Objective 6).

Because the data used in a Pearson correlation contain more information than the rankings that make up the data for Spearman’s approach, the Pearson value provides more information about the nature of the relationship between the variables. This is evident in the fact that the Pearson value can be squared to produce the coefficient of determination. The $r_{xy}^2$ value indicates the proportion of variance in one variable that can be explained by changes in the other (Objective 4). Spearman values have no equivalent of this statistic.

When two variables share information, they are correlated. The amount of one explained by the other is what the $r_{xy}^2$ value, the coefficient of determination, indicates. This concept provides a foundation for regression, which is the focus of Chapter 9. Regression
  • 66. allows what is known of y from analyzing x to be used to predict the value of y from a value of x. It involves calculations and thinking with which you are already familiar, so work the end-of-chapter problems, reread any of the sections in Chapter 8, and prepare for Chapter 9.

Key Terms

bivariate correlations: Include all procedures that test for significant relationships between two variables.
canonical correlation: Measures the relationship between two groups of variables.
coefficient of determination: Indicates the proportion of one variable in a Pearson correlation that can be explained by the other.
correlation matrix: A box in which the variables involved are listed in rows as well as in columns, and each variable is correlated with all variables, including itself.
hypothesis of association: The umbrella term for significance tests that analyze the correlation between or among variables.
hypothesis of difference: The umbrella term for significance tests that analyze the differences between groups.
linear: Describes a relationship between two variables whose strength is consistent throughout their ranges. With curvilinear relationships, the strength and sometimes even the nature of the relationship (positive or negative) changes depending upon where in the variables’ ranges they are measured.
  • 67. multiple correlation: Gauges the strength of the relationship between one variable and two or more other variables.
nonparametric: Tests for data that do not meet the usual normality requirements. More technically, a test in which there is no interest in population parameters.
partial correlation: Measures the relationship between two variables, controlling for the influence of a third in both of the first two.
Pearson correlation coefficient: Indicates the strength of the relationship between interval- or ratio-scale variables.
point-biserial correlation: A special application of the Pearson correlation for those instances where one of the variables, such as gender or marital status, has just two
  • 68. manifestations.
range attenuation: Occurs when a variable is not measured throughout its entire range. Attenuated range artificially reduces the strength of any resulting correlation value.
scatterplot: A graph representing two variables, one on the horizontal axis, the other on the vertical axis. Each point in the graph indicates the measure of both variables for one individual.
semi-partial correlation: Gauges the relationship between two variables, controlling for a third in just one of the first two.
Spearman’s rho: A correlation procedure for two ordinal variables, for one ordinal and one interval/ratio variable, or for two interval or ratio variables that fail to meet Pearson correlation requirements for normality.

Review Questions

Answers to the odd-numbered questions are provided in Appendix A.

1. What values indicate the strongest and weakest values for a Pearson’s r?
2. What is the equivalent in a Pearson correlation for $\eta^2$?
3. What are the requirements for calculating Pearson’s r?
4. What is “range attenuation,” and how does it affect correlation values for linear relationships?
  • 69. 5. A university counselor gathers data on students’ grades and whether or not they are employed. What statistical procedure will gauge that relationship?
6. What procedure will indicate whether there is a significant relationship between sales representatives’ sales rank and their attitudes about the product they sell?
7. a. What procedure will gauge the relationship between university students’ grade averages and their scores on, for example, a statistics test?
   b. What statistic will indicate the proportion of the students’ test scores that is a function of their GPA?
8. A forensic psychologist gathers data on the average time of night juveniles go to bed and whether or not they have an arrest record.
   a. What procedure will allow the psychologist to evaluate the relationship between those two variables?
   b. What is the resulting coefficient?
   c. How much of the variability in arrest records can be explained by what time the juvenile goes to bed?

Juvenile    Retire    Arrest
1            9.0      No
2            9.5      No
3           11.0      Yes
4           11.5      Yes
5           10.0      Yes
6            9.75     No
7           10.0      No
8           10.25     Yes

  • 70. 9. A group of consumers has just taken two surveys on (a) their attitude about the economy and (b) their attitude about those in government. In both, higher scores mean more optimism. The data are ordinal scale. Are the two attitudes related?

Consumer    Economy    Government
 1          15         10
 2           5          4
 3          16         11
 4          10          8
 5          11         13
 6           3          4
 7          12         10
 8          11          8
 9          10          7
10          14          9

  • 71. 10. A group of students has been told that reading will help them in a test of verbal ability required by the university they wish to attend. The x variable indicates the minutes per day spent reading. The y variable represents students’ scores on the test.

Student    Minutes (x)    Score (y)
1          15             57
2          80             84
3           0             60
4          75             92
5          30             65
6          10             60
7          22             75
8          15             68

   a. Is the relationship statistically significant?
   b. How much of the variance in test scores can be explained by differences in the amount of time spent reading?

  • 72. 11. A district psychologist is working with developmentally disabled students in a special education setting and is curious about the relationship between students’ persistence on puzzle tasks (measured in the number of minutes they remain on task) and their number of absences from class.

Student    Persist    Absent
1          12         3
2           4         3
3          15         5
4          18         7
5          12         1
6           5         4
7           8         3
8           9         4

Is the relationship between persistence and attendance statistically significant at p = 0.05?

  • 73. 12. An employer wishes to analyze the relationship between stress and job performance. Stress is reflected by systolic blood pressure. Job performance is measured in the number of sales per day.
   a. What is the appropriate correlation procedure?
  • 74.    b. Is the relationship statistically significant?

Employee    Sales    Blood pressure
 1          1        150
 2          4        140
 3          3        140
 4          6        110
 5          2        140
 6          4        130
 7          0        160
 8          3        110
 9          5        120
10          7        160

  • 75. 13. An industrial psychologist is determining the relationship between workers’ willingness to embrace new manufacturing procedures, gauged with a dogmatism scale (higher scores indicate greater dogmatism), and their level of job satisfaction (higher scores indicate greater satisfaction). The satisfaction data are at least ordinal scale.
   a. What is the relationship?
   b. What is the null hypothesis?
   c. Do you reject or fail to reject the null hypothesis?
   d. What is the relationship between dogmatism and job satisfaction?
   e. Is the correlation statistically significant?

Worker    Dogmatism    Satisfaction
1         8             4
2         4            12
3         3            14
4         5            15
5         7             5
6         2            14
7         3            15
8         1            15

  • 76. Answers to Try It! Questions

1. A single point in a scatterplot represents two raw scores, one for x and one for y.
2. If the two variables are normally distributed but uncorrelated, their combined scatterplot will be circular with greatest density in the middle of the plot because of the tendency for most of the data to fall in the middle of either distribution.
3. Range attenuation diminishes the strength of the correlation value in linear relationships. It produces an artificially low correlation coefficient.
4. As degrees of freedom increase, the correlation value required to reach significance diminishes.
5. Spearman’s rho accommodates variables that have any combination of ordinal, interval, or ratio scale.
6. The coefficient would indicate that the higher the ranking, the lower the GPA. If a ranking of 1 is “best,” the best (highest) GPA must also receive a class ranking of 1. Otherwise, the relationship looks negative when it is not.
  • 77. ARTX 435 The Fashion Consumer
Name:
Date:

I Want That! How We All Became Shoppers - Discussion Questions

1. How do we use objects to define our identity?
2. What does the author mean when he writes that objects are “repositories of magic”?
3. The author writes, “The catalogue told Jane about the moment in which she was living.” What is meant by this? What resources do consumers use today to learn about the moment in which they are living?
4. What does the author mean when he describes “just looking” as a form of “domestic due diligence”?
  • 78. 5. What is included in the “buyosphere”?