2. Events and associations
Dark skies -> Rains
Eclipse of the sun -> Darkness during daytime
Horn of the ambulance -> Someone is sick/dead
3. Events and variables
Predictability
Consistent correlations
Association between two events
Pattern
In the real world we try to
predict and also detect
correlations.
5. Correlation: definition
Correlation refers to whether the nature of the association between two (or
more) variables is positive, negative, or zero.
The values of a correlation range between -1.00 and +1.00
(+, -, 0)
6. Correlation values/Magnitude
There can be “very strong” correlation: ±.80 to 1.00
There can be “strong” correlation: ±.60 to .79
There can be “moderate” correlation: ±.40 to .59
There can be “weak” correlation: ±.20 to .39
There can be “very weak” correlation: ±.00 to .19
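The verbal labels above can be written as a small lookup. This is a minimal sketch in Python; the function name `describe_strength` is hypothetical (it is not from the textbook), and it assumes that a value falling between two listed bands (for example .795) belongs to the lower band.

```python
def describe_strength(r):
    """Return the verbal label for the magnitude of a correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    size = abs(r)  # the sign gives the direction, not the strength
    if size >= 0.80:
        return "very strong"
    if size >= 0.60:
        return "strong"
    if size >= 0.40:
        return "moderate"
    if size >= 0.20:
        return "weak"
    return "very weak"


print(describe_strength(-0.45))  # moderate
```

Because only the absolute value is inspected, r = -.45 and r = +.45 receive the same label.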
7. History
David Hume
- Championed association between
variables
- In psychology and philosophy
- The link between cause and effect
was the most important principle
governing association of ideas
Sir Francis Galton
- The creator of the first correlation
coefficient
- Curious about heredity
- Chose the letter r to represent his
“index of correlation”
- “two variable organs are said to be
co-related when the variation of
one is accompanied on the average
by more or less variation of the
other, and in the same direction.”
Karl O. Pearson
- Enlarged the mathematical
background and precision of the
index of correlation.
- Pearson product-moment
correlation coefficient, or Pearson r
(the population value is written with the Greek letter rho, ρ)
- Moment = mean
8. Correlation and causation
Correlation does not imply causation
“Unless we actively intervene into a situation by manipulating variables and
measuring their effect on one another, we cannot assume that we know the
causal order involved.” (Dunn, p. 207)
10. The Third Variable Problem
[Diagram: variables X and Y, with a third variable Z linked to both]
The association between X and Y is caused
by a third (unknown) variable, Z.
11. The Pearson correlation coefficient
Pearson r enables investigators to assess the nature of the association between two
variables, X and Y.
Correlations, then, are based on pairs of variables, and each pair is based on the
responses of one person.
In a conventional correlational relationship, X is the independent variable (predictor)
and Y is the dependent variable (criterion or measure).
X and Y can represent any number of psychological tests and measures, behaviors, or
rating scales, but variables must be based on an interval or a ratio scale.
12. The Pearson r
The Pearson r, a correlation coefficient, is a
statistic that quantifies the extent to which two
variables X and Y are associated, and whether the
direction of their association is positive, negative,
or zero.
13. Extraversion scores (X) and interaction behavior (Y)

Participant   Extraversion score (X)   Interaction behavior (Y)
1             20                       8
2             5                        2
3             18                       10
4             6                        3
5             19                       8
6             3                        4
7             4                        3
8             3                        2
9             17                       7
10            18                       9

Table 6.1 Extraversion Scores and Interaction Behavior in a Hypothetical Validation Study
Extraversion: higher scores indicate a higher level of extraversion (range: 1 to 20).
Interaction: higher numbers indicate more social contacts made in the 30-minute get-acquainted session (range: 0 to 12).
Extraverts/Introverts
14. Direction of relationship
Positive correlation
A positive correlation is one where, as
the value of X increases, the
corresponding value of Y also
increases.
Similarly, a positive correlation exists
when, as the value of X decreases, the
value of Y also decreases.
Negative correlation
A negative correlation identifies an
inverse relationship between variables
X and Y – as the value of one
increases, the other necessarily
decreases.
Inverse relationship between two
variables.
Zero correlation
A zero correlation indicates that there
is no pattern or predictive relationship
between the behavior of variables X
and Y.
No discernible pattern of covariation –
how things vary together – between
two variables
15. Scatter plot
A scatter plot is a particular graph used to present correlational data. Each
point in a scatter plot represents the intersection of an X value with its
corresponding Y value. See your textbook (Dunn, pp. 214 ff.).
16. The Pearson r’s relation to Z scores
$$ r = \frac{\sum z_X z_Y}{N} $$
where each z score for X is multiplied by the corresponding z score for Y, the
products are summed, and the sum is divided by the number of XY pairs.
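The z-score formula can be checked against the extraversion (X) and interaction (Y) data from the hypothetical validation study. This is a minimal sketch in Python; the helper name `z_scores` is an assumption made for illustration. It also assumes that the z scores are formed with the population standard deviation (dividing by N), which is what makes dividing the summed products by N give the Pearson r.

```python
import math

# Extraversion (X) and interaction (Y) scores for the 10 participants
x = [20, 5, 18, 6, 19, 3, 4, 3, 17, 18]
y = [8, 2, 10, 3, 8, 4, 3, 2, 7, 9]


def z_scores(values):
    """Convert raw scores to z scores using the population SD (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


zx = z_scores(x)
zy = z_scores(y)

# Multiply paired z scores, sum the products, divide by the number of pairs
r_z = sum(a * b for a, b in zip(zx, zy)) / len(x)
print(round(r_z, 2))  # 0.94
```

If the z scores were instead formed with the sample standard deviation (dividing by N - 1), the summed products would have to be divided by N - 1 to give the same r.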
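17. Sum of the squares
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$
In general, the sum of squares is
$$ SS = \sum X^2 - \frac{(\sum X)^2}{N} $$
so that, for X and Y,
$$ SS_X = \sum X^2 - \frac{(\sum X)^2}{N} $$
$$ SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{N} $$
The calculating formula for the covariance of X and Y is:
$$ COV_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{N} $$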
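19. Using the computational formula
$$
\begin{aligned}
r &= \frac{\sum XY - \frac{(\sum X)(\sum Y)}{N}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{N}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{N}\right]}} \\
  &= \frac{831 - \frac{(113)(56)}{10}}{\sqrt{\left[1793 - \frac{(113)^2}{10}\right]\left[400 - \frac{(56)^2}{10}\right]}} \\
  &= \frac{831 - \frac{6{,}328}{10}}{\sqrt{\left[1793 - \frac{12{,}769}{10}\right]\left[400 - \frac{3{,}136}{10}\right]}} \\
  &= \frac{831 - 632.8}{\sqrt{[1793 - 1{,}276.9][400 - 313.6]}} \\
  &= \frac{198.20}{\sqrt{[516.10][86.4]}} \\
  &= \frac{198.20}{\sqrt{44{,}591.04}} \\
  &= \frac{198.20}{211.1659} \\
  &= +.9386 \cong +.94
\end{aligned}
$$
The convention is to round the
correlation coefficient to two decimal
places, thus r = +.94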
Table 6.3 Step-by-step calculations for the Pearson r (Raw Score Method) using
Data from Table 6.2
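The raw-score calculation can be reproduced step by step from the ten pairs of extraversion (X) and interaction (Y) scores. This is a minimal sketch in Python; the variable names are chosen for illustration and are not from the textbook.

```python
import math

# Extraversion (X) and interaction (Y) scores for the 10 participants
x = [20, 5, 18, 6, 19, 3, 4, 3, 17, 18]
y = [8, 2, 10, 3, 8, 4, 3, 2, 7, 9]
n = len(x)

# Sums needed by the raw-score (computational) formula
sum_x = sum(x)                             # 113
sum_y = sum(y)                             # 56
sum_xy = sum(a * b for a, b in zip(x, y))  # 831
sum_x2 = sum(a * a for a in x)             # 1793
sum_y2 = sum(b * b for b in y)             # 400

numerator = sum_xy - (sum_x * sum_y) / n   # 831 - 632.8
ss_x = sum_x2 - sum_x ** 2 / n             # 1793 - 1276.9
ss_y = sum_y2 - sum_y ** 2 / n             # 400 - 313.6

r_raw = numerator / math.sqrt(ss_x * ss_y)
print(round(r_raw, 2))  # 0.94
```

Keeping the intermediate sums in named variables makes each one easy to compare with the corresponding step of the hand calculation.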
21. Computational formula for the mean deviation method
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{SS_X \cdot SS_Y}} $$
This approach relies on the sums of squares and the covariance of X and Y.
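The mean deviation method can be sketched in the same way: subtract each variable's mean from its scores, then combine the deviations. This is a minimal Python sketch using the ten extraversion (X) and interaction (Y) pairs; the names `sp`, `ss_x`, and `ss_y` are assumptions made for illustration.

```python
import math

# Extraversion (X) and interaction (Y) scores for the 10 participants
x = [20, 5, 18, 6, 19, 3, 4, 3, 17, 18]
y = [8, 2, 10, 3, 8, 4, 3, 2, 7, 9]

mean_x = sum(x) / len(x)  # 11.3
mean_y = sum(y) / len(y)  # 5.6

# Sum of the cross-products of the deviations (numerator of r)
sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Sums of squared deviations for X and for Y
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((b - mean_y) ** 2 for b in y)

r_dev = sp / math.sqrt(ss_x * ss_y)
print(round(r_dev, 2))  # 0.94
```

The deviation-based quantities match the raw-score ones (about 198.2, 516.1, and 86.4), so both methods give the same r.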
23. Determining predictive accuracy
1. We are interested in the percentage of variance in one variable of a correlation
that can be described by the other member of the pair.
2. This variability is found by squaring the r value to obtain r². This statistic is
known as the coefficient of determination.
24. Coefficient of determination
In the personality-social behavior
example, an r value of +.94 gives an
r² value of +.88:
$$ r^2 = (r)^2 = (.94)^2 = +.88 $$
Interpretation
88% of the variance or change in
social behavior (Y) can be predicted
from the participants’ introverted or
extraverted personalities (X), i.e., from
the X and Y relationship.
The coefficient of determination (r²)
indicates the proportion of variance or
change in one variable that can be accounted
for by another variable.
25. Coefficient of non-determination
$$ k = 1 - r^2 = 1 - .88 = +.12 $$
Interpretation
12% of the social interaction that took
place in the lab can be explained by
factors other than the extraverted or
introverted personalities of the
study’s participants.
r² and k always sum to 1.00:
(+.88) + (+.12) = 1.00
Symbolized by “k”
The coefficient of non-determination (k, or
1 - r²) indicates the proportion of variance or
change in one variable that cannot be
accounted for by another variable.
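The two coefficients follow directly from r. This is a minimal sketch in Python, assuming the rounded value r = +.94 from the extraversion example and rounding each result to two decimal places, as in the worked figures.

```python
r = 0.94  # Pearson r from the extraversion example, rounded to two places

# Coefficient of determination: proportion of variance accounted for
r_squared = round(r ** 2, 2)  # 0.8836 rounds to 0.88

# Coefficient of non-determination: proportion not accounted for
k = round(1 - r_squared, 2)   # 0.12

print(r_squared, k)
```

Since k is defined as 1 - r², the two values always add up to 1.00.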