2. Gravetter & Wallnau (as cited in text)
• There is only one group of scores: Test scores
• Evaluating the mean of test scores will not tell us
anything
• We are now evaluating the relationship between two variables
• How is one variable related to the other?
Do students who finish exams quickly get higher grades than students who take the entire test time to finish?
4. Correlation
• Observation of two variables as they exist naturally
• No manipulation of a variable; no “treatment conditions”
Is one variable related to the other?
5. The Characteristics of a Relationship
• The correlation describes three characteristics of the
relationship between two variables:
1. The direction (“+” or “-”)
Positive (same direction)
Negative (opposite direction)
6. The Characteristics of a Relationship
• The correlation describes three characteristics of the
relationship between two variables:
1. The direction
2. The form (linear vs. curvilinear)
7. The Characteristics of a Relationship
• The correlation describes three characteristics of the
relationship between two variables:
1. The direction
2. The form
3. The strength
• 0.01 (very weak) to .99 (very strong)
• 1.00 = perfect correlation
• 0.00 = no correlation
Strong positive correlation
Weak negative correlation
9. The Pearson Correlation (r)
a.k.a. the Pearson product-moment correlation
• Measures the degree and the direction of the linear
relationship between two variables
• It is also a ratio:
$$ r = \frac{\text{degree to which X and Y vary together}}{\text{degree to which X and Y vary separately}} $$
OR
$$ r = \frac{\text{the covariability of X and Y}}{\text{the variability of X and Y separately}} $$
10. Requirements for the Pearson r
• Each individual in the sample must have two scores, X
and Y
• All scores must be numerical values from an interval or
ratio scale of measurement
11. The Linear Relationship
• We are trying to determine whether a change in one variable (X) is accompanied by a corresponding change in the other variable (Y).
Perfect linear relationship
• X and Y vary together
• Covariability of X and Y together is the same as the variability of X and Y separately
$$ r = \frac{1}{1} = 1 $$
No linear relationship
• X and Y have no relationship at all
• Covariability of X and Y together is zero and different from the variability of X and Y separately
$$ r = \frac{0}{1} = 0 $$
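The two extremes can be checked numerically. The sketch below is only an illustration: the helper name `pearson_r` and both data sets are hypothetical, and r is computed as the covariability (sum of products of deviations) divided by the square root of the product of the two separate variabilities (sums of squared deviations). The second data set follows a curve, so it shows that r = 0 means no linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson r as covariability of X and Y over their separate variability."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariability: sum of products of deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Variability of each variable: sum of squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: Y = 2X + 1, a perfect linear relationship.
x_line = [1, 2, 3, 4, 5]
y_line = [3, 5, 7, 9, 11]

# Hypothetical data: Y = X squared over a symmetric range, so the
# products of deviations cancel and there is no linear trend.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [4, 1, 0, 1, 4]

r_line = pearson_r(x_line, y_line)
r_curve = pearson_r(x_curve, y_curve)
print(round(r_line, 3), round(r_curve, 3))
```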
12. We’ve been working with SS
• The sum of squared
deviations of each score
from the mean
Now we will work with SP
• The sum of the products
of deviations of each pair
of scores from the mean
The Sum of Products of Deviations (SP)
$$ SP = \sum XY - \frac{\sum X \sum Y}{n} $$
Conceptually, these are essentially the same. If we write out the formula for SS, we get:
$$ SS = \sum X^2 - \frac{(\sum X)^2}{n} $$
which is the same as:
$$ SS = \sum XX - \frac{\sum X \sum X}{n} $$
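To see that SS is just SP applied to a variable paired with itself, here is a small Python sketch. The function names and the four scores are hypothetical, chosen only for illustration; each function returns both the definitional value (from deviations) and the computational value (from the totals) so the two can be compared.

```python
def sum_of_squares(xs):
    """Return SS two ways: from deviations and from the computational formula."""
    n = len(xs)
    mean = sum(xs) / n
    definitional = sum((x - mean) ** 2 for x in xs)
    computational = sum(x * x for x in xs) - sum(xs) ** 2 / n
    return definitional, computational

def sum_of_products(xs, ys):
    """Return SP two ways: from deviations and from the computational formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    definitional = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    computational = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    return definitional, computational

scores = [2, 4, 6, 8]  # hypothetical scores, not from the slides

ss_def, ss_comp = sum_of_squares(scores)
# Pairing the scores with themselves turns SP into SS.
sp_def, sp_comp = sum_of_products(scores, scores)
print(ss_def, ss_comp, sp_def, sp_comp)
```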
13. Calculating SP
Person   X    Y    XY
A        1    3    3
B        2    6    12
C        4    4    16
D        5    7    35
∑X=12  ∑Y=20  ∑XY=66
• We start with a set of X
scores and Y scores for
each individual
• Calculate the product of
XY (X multiplied by Y)
• Substitute the totals in
the formula:
$$ SP = \sum XY - \frac{\sum X \sum Y}{n} = 66 - \frac{12(20)}{4} = 66 - \frac{240}{4} = 66 - 60 = 6 $$
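The same calculation can be sketched in Python using the X and Y scores for persons A to D on this slide. The variable names are illustrative only.

```python
# Scores for persons A, B, C, D from the slide.
xs = [1, 2, 4, 5]
ys = [3, 6, 4, 7]
n = len(xs)

sum_x = sum(xs)                              # total of X scores
sum_y = sum(ys)                              # total of Y scores
sum_xy = sum(x * y for x, y in zip(xs, ys))  # total of the XY products

# Computational formula: SP = sum(XY) - (sum(X) * sum(Y)) / n
sp = sum_xy - (sum_x * sum_y) / n
print(sum_x, sum_y, sum_xy, sp)
```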
14. Calculating the Pearson-correlation
Covariability of X and Y = SP
Variability of X = SS_X
Variability of Y = SS_Y
Therefore,
$$ r = \frac{\text{the covariability of X and Y}}{\text{the variability of X and Y separately}} = \frac{SP}{\sqrt{SS_X SS_Y}} $$
15. Let’s Calculate r! (Ex.15.3 p.517)
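Original scores (X, Y), squared scores (X², Y²), and products (XY):
Person   X    Y    X²    Y²    XY
A        0    2    0     4     0
B        10   6    100   36    60
C        4    2    16    4     8
D        8    4    64    16    32
E        8    6    64    36    48
∑X=30  ∑Y=20  ∑X²=244  ∑Y²=96  ∑XY=148
$$ r = \frac{SP}{\sqrt{SS_X SS_Y}} $$
$$ SS_X = \sum X^2 - \frac{(\sum X)^2}{n} = 244 - \frac{(30)^2}{5} = 244 - \frac{900}{5} = 244 - 180 = 64 $$
$$ SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n} = 96 - \frac{(20)^2}{5} = 96 - \frac{400}{5} = 96 - 80 = 16 $$
$$ SP = \sum XY - \frac{\sum X \sum Y}{n} = 148 - \frac{30(20)}{5} = 148 - \frac{600}{5} = 148 - 120 = 28 $$
SS_X = 64, SS_Y = 16, SP = 28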
16. Let’s Calculate r! (Ex.15.3 p.517)
Totals for persons A to E (scores as on the previous slide):
∑X=30  ∑Y=20  ∑X²=244  ∑Y²=96  ∑XY=148
SS_X = 64
SS_Y = 16
SP = 28
$$ r = \frac{SP}{\sqrt{SS_X SS_Y}} = \frac{28}{\sqrt{64(16)}} = \frac{28}{\sqrt{1024}} = \frac{28}{32} = 0.875 $$
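As a check on the arithmetic, the whole example can be run in a few lines of Python. This is a sketch using the scores for persons A to E; the variable names are illustrative. Note that the products are 0 + 60 + 8 + 32 + 48, so ∑XY = 148.

```python
import math

# Scores for persons A, B, C, D, E from the example.
xs = [0, 10, 4, 8, 8]
ys = [2, 6, 2, 4, 6]
n = len(xs)

# Computational formulas for SS and SP.
ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

# Pearson r = SP / sqrt(SS_X * SS_Y)
r = sp / math.sqrt(ss_x * ss_y)
print(ss_x, ss_y, sp, r)
```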
18. Why do we use correlations?
1. Prediction
• SAT scores and college GPA
2. Validity
• Is my new IQ test a valid measure of intelligence?
• Scores on my new IQ test should correlate strongly with scores on an established IQ test
3. Reliability
• Does my new IQ test provide stable, consistent measurements
over time?
4. Theory Verification
• We can test a prediction of a theory
• Is brain size really related to intelligence?
19. How do we interpret correlations?
1. Correlation does not imply causation!
• Higher family income does not cause better grades
• Higher SAT scores do not cause better college GPA
2. Range of scores matters
• When you have a restricted range of scores, proceed carefully!
• Correlation between IQ and creativity completed at Purchase:
• Many performing arts majors = higher creativity scores
• All college students = higher IQ scores
3. Outliers matter
• This is why examining the scatter plot before running analyses is
so important (see next slide)
4. Correlation does not mean proportion
• Correlation coefficient ≠ Coefficient of determination
20. Outliers can dramatically influence the correlation coefficient
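A quick way to see this is to compute r with and without a single extreme point. The data below are hypothetical, made up purely for illustration, and `pearson_r` is an illustrative helper, not a function from any particular library.

```python
import math

def pearson_r(xs, ys):
    """Pearson r = SP / sqrt(SS_X * SS_Y), using the computational formulas."""
    n = len(xs)
    sp = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    ss_x = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_y = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical scores with little linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 4, 2, 3]
r_without = pearson_r(xs, ys)

# Add one hypothetical outlier far from the rest of the scores.
r_with = pearson_r(xs + [15], ys + [15])

print(round(r_without, 3), round(r_with, 3))
```

With these made-up scores, one added point is enough to move a weak correlation to a very strong one, which is why the scatter plot should be examined first.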
21. Coefficient of Determination (r²)
a.k.a. “r-squared”
• Measures the proportion of variability in one variable that
can be determined from the relationship with the other
variable
• Calculates the size and strength of the correlation
• It is, simply, your Pearson r, squared.
r²
Get it?
It’s called “r-squared” because it is r-squared!
So, if r = 0.875, then r² = 0.875² = .766
22. The relationship between r and r²
• r² tells us how much of the variability in one score can be determined by the other
• The stronger the correlation, the larger the proportion explained
r = 0, r² = 0
r = 0.60, r² = 0.36
r = 1.00, r² = 1.00
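The pairs of values above, plus the r = 0.875 from the worked example, can be reproduced with a one-line function. This is a minimal sketch; the function name is illustrative.

```python
def coefficient_of_determination(r):
    """r squared: the proportion of variability accounted for by the relationship."""
    return r ** 2

for r in (0.0, 0.60, 1.00, 0.875):
    r_squared = coefficient_of_determination(r)
    print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```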
24. The Hypothesis
The question we are asking is if there is a correlation between two variables
ρ = “rho”, pronounced “row”
Null hypotheses:
• There is no correlation: H0: ρ = 0
• The correlation is not positive: H0: ρ ≤ 0
• The correlation is not negative: H0: ρ ≥ 0
Alternative hypotheses:
• There is a correlation: H1: ρ ≠ 0
• The correlation is positive: H1: ρ > 0
• The correlation is negative: H1: ρ < 0
25. Critical values for the Pearson r
Table B.6 in Appendix B in your textbook (p.709)
             Level of significance for one-tailed test
             .05     .025    .01     .005
             Level of significance for two-tailed test
df = n – 2   .10     .05     .02     .01
1            .988    .997    .9995   .9999
2            .900    .950    .980    .990
3            .805    .878    .934    .959
4            .729    .811    .882    .917
5            .669    .754    .833    .874
6            .622    .707    .789    .834
7            .582    .666    .750    .798
8            .549    .632    .716    .765
*to be significant, the sample correlation, r, must be greater than or equal to the critical value in the table
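The table lookup can be sketched in Python as follows. The dictionary holds only the rows shown on this slide (df = 1 to 8), and the function names are illustrative. Two assumptions are made here: the size of r is compared with the critical value using its absolute value, and for a one-tailed test the sign of r is assumed to be in the predicted direction.

```python
# Critical values from the slide, keyed by df = n - 2.
# Columns: one-tailed .05/.025/.01/.005, i.e. two-tailed .10/.05/.02/.01.
CRITICAL_R = {
    1: (.988, .997, .9995, .9999),
    2: (.900, .950, .980, .990),
    3: (.805, .878, .934, .959),
    4: (.729, .811, .882, .917),
    5: (.669, .754, .833, .874),
    6: (.622, .707, .789, .834),
    7: (.582, .666, .750, .798),
    8: (.549, .632, .716, .765),
}
ONE_TAILED = (.05, .025, .01, .005)
TWO_TAILED = (.10, .05, .02, .01)

def critical_value(n, alpha=.05, tails=2):
    """Look up the critical r for sample size n (df = n - 2)."""
    df = n - 2
    levels = TWO_TAILED if tails == 2 else ONE_TAILED
    column = levels.index(alpha)   # raises ValueError for an unlisted alpha
    return CRITICAL_R[df][column]  # raises KeyError for df outside 1 to 8

def is_significant(r, n, alpha=.05, tails=2):
    """Assumption: compare |r| with the critical value."""
    return abs(r) >= critical_value(n, alpha, tails)

# r = 0.875 with n = 5 gives df = 3.
two_tailed = is_significant(0.875, 5, alpha=.05, tails=2)
one_tailed = is_significant(0.875, 5, alpha=.05, tails=1)
print(critical_value(5, .05, 2), two_tailed)
print(critical_value(5, .05, 1), one_tailed)
```

With only five pairs of scores, r = 0.875 falls just short of the two-tailed .05 critical value of .878 but exceeds the one-tailed .05 critical value of .805.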
26. Reporting Correlations
In words:
• A correlation for the data revealed a significant
relationship between (name variable X) and (name
variable Y), r = 0.65, n = 30, p < .01, two tails.
A correlation matrix for several variables:
            Education   Age     IQ
Income      .65*        .41**   .27
Education   ---         .11     .38**
Age         ---         ---     -.02
n = 30
*p < .05, two tails
**p < .01, two tails
The text has this
backwards!
27. Partial Correlations
Measures the relationship between two variables while
controlling the influence of a third variable by holding it
constant.
• In a situation with three variables (X, Y, Z), compute three
Pearson correlations:
• rxy measuring the correlation between X and Y
• rxz measuring the correlation between X and Z
• ryz measuring the correlation between Y and Z
• Then we can compute the partial correlation (df = n - 3):
$$ r_{XY \cdot Z} = \frac{r_{XY} - (r_{XZ} r_{YZ})}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}} $$
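The partial correlation formula translates directly into a short function. This is a sketch only: the function name is illustrative and the three input correlations are hypothetical values, not data from the slides.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z held constant."""
    numerator = r_xy - (r_xz * r_yz)
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical correlations among three variables.
r_xy, r_xz, r_yz = 0.50, 0.60, 0.70
r_xy_z = partial_correlation(r_xy, r_xz, r_yz)
print(round(r_xy_z, 3))

# If Z is unrelated to both X and Y, controlling for it changes nothing.
unchanged = partial_correlation(0.50, 0.0, 0.0)
print(unchanged)
```

In this made-up case the X-Y correlation of 0.50 shrinks considerably once Z is held constant, because Z is strongly related to both X and Y.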
29. There are alternatives.
• Unfortunately, we do not have the time to cover these in
this class:
• The Spearman Correlation
• If one of your variables is on an ordinal scale
• When the relationship between variables is non-linear
• The Point-Biserial Correlation
• If one of your variables is a dichotomous variable (e.g., gender)
• The Phi-Coefficient
• If both of your variables are dichotomous variables