This document discusses correlation and the Pearson correlation coefficient (r). It investigates the linear association between body weight and plasma volume in 8 subjects. The correlation coefficient (r) between weight and plasma volume is calculated to be 0.76, indicating a strong positive correlation. A t-test shows this correlation is statistically significant. Values of r range from -1 to 1, where higher positive or negative values indicate stronger linear relationships.
2. CorrelationCorrelation
The aim is to investigate the linearThe aim is to investigate the linear
association between two continuousassociation between two continuous
variables.variables.
Correlation:Correlation:
Measures the closeness of theMeasures the closeness of the
association.association.
4. Correlation, cont.Correlation, cont.
In general, high plasma volume tend to beIn general, high plasma volume tend to be
associated with high weight. This association isassociated with high weight. This association is
measured by Pearson correlation coefficient (r)measured by Pearson correlation coefficient (r)
6. Scatter diagramof plasma volume and bodyweight
showing linear regression line
y = 0.0434x + 0.0998
R2
= 0.5818
0
1
2
3
4
45 55 65 75 85
Body weight (Kg)
Plasmavolume(liter)
7. ResultsResults
N = 8N = 8
∑∑x = 535 Mean x = 66.875x = 535 Mean x = 66.875
∑∑xx22
==35983.535983.5
∑∑y = 24.02 Mean y = 3.0025y = 24.02 Mean y = 3.0025
∑∑yy22
= 72.798= 72.798
∑∑xy = 1615.295xy = 1615.295
r =r = 76.0
)6780.0)(38.205(
96.8
=
8. Significance testSignificance test
A t test is used to testA t test is used to test
whether (r) iswhether (r) is
significantly differsignificantly differ
from zero ie whetherfrom zero ie whether
the observedthe observed
correlation couldcorrelation could
simply due to chance.simply due to chance.
t =t = 2
1
2
r
n
r
−
−
10. Significance test, cont.Significance test, cont.
Note: A significance level is a function ofNote: A significance level is a function of
both the size of the correlation coefficientboth the size of the correlation coefficient
and number of observations.and number of observations.
A weak correlation may therefore beA weak correlation may therefore be
statistically significant if based on a largestatistically significant if based on a large
number of observations, while a strongnumber of observations, while a strong
correlation may fail to achieve significancecorrelation may fail to achieve significance
if there are only a few observations.if there are only a few observations.
11. Interpretation of rInterpretation of r
r is always a No. between minus 1 andr is always a No. between minus 1 and
plus 1.plus 1.
r is positive if x and y tend to be high orr is positive if x and y tend to be high or
low together, and the larger its value, thelow together, and the larger its value, the
closer the association.closer the association.
r is negative if high value of y tend to gor is negative if high value of y tend to go
with low values of x and vice versa.with low values of x and vice versa.
12. r valuer value InterpretationInterpretation
ZeroZero No correlationNo correlation
0<r<10<r<1 Imperfect (direct) positiveImperfect (direct) positive
correlationcorrelation
11 Perfect (direct) positivePerfect (direct) positive
correlationcorrelation
-1 < r < 0-1 < r < 0 Imperfect negativeImperfect negative
(inverse) correlation(inverse) correlation
r = -1r = -1 Perfect negative (inverse)Perfect negative (inverse)
correlationcorrelation
13. Guideline values for interpretationGuideline values for interpretation
the value of (r)the value of (r)
rr Degree ofDegree of
associationassociation
± 1± 1 PerfectPerfect
± 0.7 - ± 1± 0.7 - ± 1 StrongStrong
± 0.4 - ± 0.7± 0.4 - ± 0.7 ModerateModerate
< 0.4< 0.4 WeakWeak
0.00.0 No associationNo association
14. NotesNotes
r measures only linear relationship.r measures only linear relationship.
When there is a strong non-linearWhen there is a strong non-linear
relationship, r = zero.relationship, r = zero.
So we have to draw a scatter diagram firstSo we have to draw a scatter diagram first
to identify non-linear relationship.to identify non-linear relationship.
If you calculate r without examining theIf you calculate r without examining the
data, then you will miss a strong but nondata, then you will miss a strong but non
linear relationship.linear relationship.
15. The coefficient of determination (rThe coefficient of determination (r22
))
When r = 0.58, then r2 = 0.34.When r = 0.58, then r2 = 0.34.
This means that 34% of the variation in theThis means that 34% of the variation in the
values of (y) may be accounted for byvalues of (y) may be accounted for by
knowing values of (X) or vice versa.knowing values of (X) or vice versa.