correlation ;.pptx

UNIT: XVI
CORRELATION
Mrs. D. Melba Sahaya Sweety, RN, RM
PhD Nursing, MSc Nursing (Pediatric Nursing), B.Sc Nursing
Associate Professor
Department of Pediatric Nursing
Enam Nursing College, Savar,
Bangladesh.

Correlation is a bivariate analysis that measures the strength
of association between two variables and the direction of the relationship. In
terms of the strength of the relationship, the value of the correlation coefficient varies
between +1 and -1. A value of ±1 indicates a perfect degree of association between
the two variables. As the correlation coefficient moves towards 0, the
relationship between the two variables becomes weaker. The direction of the
relationship is indicated by the sign of the coefficient: a + sign indicates a positive
relationship and a - sign indicates a negative relationship.

“Correlation analysis deals with the association between two
or more variables.”
- Simpson & Kafka
“Correlation analysis attempts to determine the degree of relationship
between variables.”
- Ya-Lun Chou
“Correlation is an analysis of covariation between two or more variables.”
- A.M. Tuttle

Types of Correlation
Based on the direction of change of variables:
• Positive Correlation
• Negative Correlation
Based upon the number of variables studied:
• Simple Correlation
• Multiple Correlation
• Partial Correlation
• Total Correlation
Based upon the constancy of the ratio of change between the variables:
• Linear Correlation
• Non-Linear Correlation

Positive Correlation
The movement of variables in the same direction is
known as positive or direct correlation. That is, the values of
both variables either increase together or decrease
together. Example: When income rises, so does
consumption, and when income falls, consumption does too.
Perfect Positive Correlation : +1
Strong Positive Correlation : +0.8 to +1
Medium Positive Correlation : +0.5
Low Positive Correlation : +0.2

Negative Correlation
The movement of variables in opposite
directions is referred to as negative or inverse correlation.
That is, as the value of one variable rises, the value of the other
falls, or vice versa. Example: Height above sea level and
temperature are negatively associated. It
gets colder (temperature decreases) as you climb a mountain
(elevation increases).
Perfect Negative Correlation : -1
Strong Negative Correlation : -0.8 to -1
Medium Negative Correlation : -0.5
Low Negative Correlation : -0.2

Simple Correlation
The study of only two variables is referred to as
simple correlation. The use of fertilizers and rice
production is an example of a simple correlation, as rice
production is dependent on fertilizer use.
Multiple Correlation
Multiple correlation is defined as the study of three or
more variables at the same time. For instance, the
relationship of rice production to both fertilizers and pesticides.

Partial Correlation
In partial correlation, the relationship between
two or more variables is studied, but only one dependent
and one independent variable are considered while keeping
all others constant. For instance, the relationship between rice
production and fertilizers, excluding the effect of rainfall,
pesticides, and natural manures.
Total Correlation
The total correlation is found by taking all the variables into account.

Linear Correlation
The correlation is said
to be linear when the change in one
variable bears a constant ratio to
the change in the other.
Non-Linear Correlation
When the change in one variable does not bear a
constant ratio to the change in the other
variable, the correlation is non-linear. It is
otherwise known as curvilinear correlation.
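The constant-ratio idea can be illustrated with a minimal Python sketch. The function names `change_ratios` and `has_constant_ratio`, the tolerance, and the sample data are illustrative assumptions rather than part of the original slides.

```python
def change_ratios(xs, ys):
    """Ratio (change in y) / (change in x) for each pair of consecutive points."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]


def has_constant_ratio(xs, ys, tol=1e-9):
    """True when every consecutive change ratio equals the first one (within tol)."""
    ratios = change_ratios(xs, ys)
    return all(abs(r - ratios[0]) <= tol for r in ratios)


xs = [1, 2, 3, 4, 5]
linear_ys = [2 * x + 1 for x in xs]   # y changes by 2 for every unit change in x
curved_ys = [x ** 2 for x in xs]      # the change in y grows as x grows

print(has_constant_ratio(xs, linear_ys))  # True  -> linear
print(has_constant_ratio(xs, curved_ys))  # False -> non-linear (curvilinear)
```

Real data rarely give an exactly constant ratio, so in practice a scatter diagram is usually the first check.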

• The degree of correlation can be known
from the coefficient of correlation (r)
Degree of Correlation          | Positive       | Negative
Perfect Correlation            | +1             | -1
Strong high degree Correlation | +0.9           | -0.9
High degree Correlation        | +0.9 to +0.75  | -0.9 to -0.75
Moderate degree Correlation    | +0.25 to +0.75 | -0.25 to -0.75
Low degree Correlation         | 0 to +0.25     | 0 to -0.25
No Correlation                 | 0              | 0
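As a rough sketch, the bands in the table above can be turned into a small Python lookup. The function name `degree_of_correlation` is hypothetical, and the handling of values that fall exactly on a boundary (placed in the higher band) is an assumption, since the table does not say which band owns the boundary.

```python
def degree_of_correlation(r):
    """Describe r using the bands from the degree-of-correlation table."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "No correlation"
    direction = "positive" if r > 0 else "negative"
    # Assumption: a value exactly on a boundary is placed in the higher band.
    if size == 1:
        degree = "Perfect"
    elif size >= 0.9:
        degree = "Strong high degree"
    elif size >= 0.75:
        degree = "High degree"
    elif size >= 0.25:
        degree = "Moderate degree"
    else:
        degree = "Low degree"
    return f"{degree} {direction} correlation"


print(degree_of_correlation(0.95))   # Strong high degree positive correlation
print(degree_of_correlation(-0.5))   # Moderate degree negative correlation
```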

Methods to find correlation are
1. Scatter diagram
2. Karl Pearson’s product moment correlation coefficient: ‘r’
3. Spearman’s rank correlation coefficient: ‘ρ’
4. Yule’s coefficient of association: ‘Q’
5. Kendall rank correlation: ‘τ’
6. Point-biserial correlation: ‘r_pb’
7. Rank-biserial correlation: ‘r_rb’
8. Biserial correlation: ‘r_b’
9. Phi, contingency coefficient: ‘r_φ’

Correlation Coefficient | Level of Measurement
Karl Pearson’s product moment correlation coefficient: ‘r’ | Both variables interval
Spearman’s rank correlation coefficient: ‘ρ’; Kendall rank correlation: ‘τ’ | Both variables ordinal
Phi, contingency coefficient: ‘r_φ’ | Both variables nominal
Point-biserial correlation: ‘r_pb’ | One variable interval and the other dichotomous
Rank-biserial correlation: ‘r_rb’ | One variable ordinal and the other nominal
Biserial correlation: ‘r_b’ | Interval data against ordinal data, but ordinal data with an underlying continuity but measured

Based on the distribution and type of
relationship, correlations can be interpreted in two
categories as follows.
A. Parametric Correlation
B. Non-Parametric Correlation
Description | Parametric Correlation | Non-Parametric Correlation
Methods / Metrics | Karl Pearson’s correlation | Spearman’s and Kendall’s correlation (used interchangeably)
Assumption: level of measurement | Must be interval or ratio level | Interval or ratio level, or ordinal
Assumption: relationship | Variables form a linear relationship (positive or negative) | Variables form a monotonic relationship
Assumption: distribution | Both variables shall follow an approximately normal distribution | Distribution free; variables may form a skewed or uniform distribution
Characteristics | Both variables move at a constant ratio and follow linear correlation (e.g., y = mx) | Variables do not move at a constant ratio and do not follow linear correlation; instead, they follow an exponential, curve, parabola, etc. (e.g., y = ax + bx^2, a = b^2)
Impacted by / Sensitive to | Outliers must be handled, as they greatly affect the correlation | Robust; mitigates the effect of outliers
Range | -1 ≤ r ≤ 1 | -1 ≤ r ≤ 1

The Pearson correlation coefficient (also known
as the “product-moment correlation coefficient”) measures the linear
association between two variables. It always takes on a value between -1 and 1.
ASSUMPTIONS:
• 1. Level of Measurement: The two variables should be measured at
the interval or ratio level.
• 2. Linear Relationship: There should exist a linear relationship between the two variables.
• 3. Normality: Both variables should be roughly normally distributed.
• 4. Related Pairs: Each observation in the dataset should have a pair of values.
• 5. No Outliers: There should be no extreme outliers in the dataset.
• 6. Homoscedasticity: means ‘equal variances’; the spread of one variable should be roughly the same across the values of the other.

Example 1: Find the correlation coefficient of the
following data.
x    9   8   7   6   5   4   3   2   1
y   15  16  14  13  11  12  10   8   9

X       Y     X²    Y²     XY
9      15     81    225    135
8      16     64    256    128
7      14     49    196     98
6      13     36    169     78
5      11     25    121     55
4      12     16    144     48
3      10      9    100     30
2       8      4     64     16
1       9      1     81      9
Total:
45    108    285   1356    597

$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$
$$r = \frac{9(597) - (45)(108)}{\sqrt{[9(285) - (45)^2][9(1356) - (108)^2]}}$$
$$r = \frac{5373 - 4860}{\sqrt{[2565 - 2025][12204 - 11664]}}$$
$$r = \frac{513}{\sqrt{[540][540]}} = \frac{513}{\sqrt{291600}}$$
$$r = \frac{513}{540}$$
r = 0.95. Hence it is a strong
positive correlation.

Example 2: The following data give the price of a product and
the corresponding quantity of supply. Calculate the Karl Pearson
correlation coefficient.
x   2   4   6   8  10
y   9   7   5   3   1

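X       Y     X²    Y²     XY
2       9      4     81     18
4       7     16     49     28
6       5     36     25     30
8       3     64      9     24
10      1    100      1     10
Total:
30     25    220    165    110

$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$
$$r = \frac{5(110) - (30)(25)}{\sqrt{[5(220) - (30)^2][5(165) - (25)^2]}}$$
$$r = \frac{550 - 750}{\sqrt{[1100 - 900][825 - 625]}}$$
$$r = \frac{-200}{\sqrt{[200][200]}} = \frac{-200}{\sqrt{40000}}$$
$$r = \frac{-200}{200}$$
r = -1. Hence it is a perfect negative correlation.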
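The two worked examples can be checked with a short Python sketch of the same raw-sum formula. The function name `pearson_r` is an illustrative choice, not something defined in the slides, and only the standard library is assumed.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the raw sums used in the hand calculation."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Example 1 data
x1 = [9, 8, 7, 6, 5, 4, 3, 2, 1]
y1 = [15, 16, 14, 13, 11, 12, 10, 8, 9]
print(round(pearson_r(x1, y1), 2))  # 0.95

# Example 2 data
x2 = [2, 4, 6, 8, 10]
y2 = [9, 7, 5, 3, 1]
print(pearson_r(x2, y2))  # -1.0
```

The function does not guard against a zero denominator, which occurs when either variable is constant.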

Merits
• This method indicates the presence or absence
of correlation between two variables and gives
the exact degree of their correlation.
• We can also ascertain the direction of the
correlation: positive or negative.
• This method has many algebraic properties for
which the calculation of the coefficient of
correlation, and other related factors, are made
easy.
Demerits
• It is more difficult to calculate than other
methods of calculation.
• It is much affected by the values of the
extreme items.
• It cannot show cause and effect (what variables
control what).
• It is based on a large number of assumptions,
viz. linear relationship, cause and effect
relationship, etc., which may not always hold
good.

This method of determining correlation was
propounded by Prof. Spearman in 1904. The Spearman
rank-order correlation is the nonparametric version of
the Pearson product-moment correlation.
Spearman’s correlation coefficient (ρ, also signified by r_s) measures the
strength and direction of association between two ranked variables.
A Spearman rank correlation is a number between -1 and +1 that
indicates to what extent two variables are monotonically related.
To understand Spearman’s correlation, it is necessary to know what a
monotonic function is.

A monotonic function is one that either never
increases or never decreases as its independent
variable increases.
Monotonically increasing - as the x variable increases the y
variable never decreases;
Monotonically decreasing - as the x variable increases the y
variable never increases;
Not monotonic - as the x variable increases the y variable
sometimes decreases and sometimes increases.
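A minimal Python sketch of these three cases follows. The function name `monotonic_type` is hypothetical, and the y values are assumed to be listed in order of increasing x.

```python
def monotonic_type(ys):
    """Classify y values that are already ordered by increasing x."""
    steps = [b - a for a, b in zip(ys, ys[1:])]
    if all(step >= 0 for step in steps):
        # y never decreases (a completely flat series also lands here)
        return "monotonically increasing"
    if all(step <= 0 for step in steps):
        # y never increases
        return "monotonically decreasing"
    return "not monotonic"


print(monotonic_type([1, 2, 2, 5]))   # monotonically increasing
print(monotonic_type([9, 7, 7, 1]))   # monotonically decreasing
print(monotonic_type([1, 3, 2, 4]))   # not monotonic
```

Ties are allowed in the first two cases because the definition only requires that y never moves in the opposite direction.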

ASSUMPTIONS:
• Random sample (a truly random sample representative of one population of interest).
• A monotonic association exists between the two variables.
• Variables are at least ordinal (ordinal, interval, or ratio; no nominal data such as blood
type).
• Data contain paired samples: each case needs both an x and a y value; if a value is
missing, the row must be deleted.
• Independence of observations: each pair of observations should be
independent of the other pairs (for example, no brother and sister, and no subject with
multiple entries).
• The variables do not have to be sampled from a normal distribution.

Example 1: The ranks obtained by 10
students in two classes are given below.
Find Spearman’s rank correlation
coefficient (ranks given).
Class A    1   2   5   4   1   9  10   6   8   3
Class B    5   6   1   7  10   8   4   9   3   2

X      Y     d = (x - y)   d²
1      5        -4         16
2      6        -4         16
5      1         4         16
4      7        -3          9
1     10        -9         81
9      8         1          1
10     4         6         36
6      9        -3          9
8      3         5         25
3      2         1          1
Total                     210

$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
$$\rho = 1 - \frac{6(210)}{10(10^2 - 1)}$$
$$\rho = 1 - \frac{1260}{10(99)}$$
$$\rho = 1 - \frac{1260}{990}$$
$$\rho = 1 - 1.27$$
ρ = -0.27. Negatively correlated.
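The same calculation can be sketched in Python for ranks that are already given. The function name `spearman_from_ranks` is an illustrative assumption, and the data are the Class A and Class B ranks exactly as listed in the example.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rho from two lists of ranks: 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


class_a = [1, 2, 5, 4, 1, 9, 10, 6, 8, 3]
class_b = [5, 6, 1, 7, 10, 8, 4, 9, 3, 2]
print(round(spearman_from_ranks(class_a, class_b), 2))  # -0.27
```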

Example 2: Find Spearman’s rank
correlation coefficient for the following
values of x and y (ranks not given).
x   50  60  50  60  80  50  80  40  70
y   30  60  40  50  60  30  70  50  60

X     Y     R1     R2     d = R1 - R2    d²
50    30    7      8.5      -1.5         2.25
60    60    4.5    3         1.5         2.25
50    40    7      7         0           0
60    50    4.5    5.5      -1           1
80    60    1.5    3        -1.5         2.25
50    30    7      8.5      -1.5         2.25
80    70    1.5    1         0.5         0.25
40    50    9      5.5       3.5        12.25
70    60    3      3         0           0
Total                                   22.5

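$$\rho = 1 - \frac{6\left[22.5 + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(3^3 - 3) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(3^3 - 3)\right]}{9(9^2 - 1)}$$
$$\rho = 1 - \frac{6\left[22.5 + \frac{1}{12}(8 - 2) + \frac{1}{12}(8 - 2) + \frac{1}{12}(27 - 3) + \frac{1}{12}(8 - 2) + \frac{1}{12}(8 - 2) + \frac{1}{12}(27 - 3)\right]}{9(81 - 1)}$$
$$\rho = 1 - \frac{6\left[22.5 + \frac{1}{12}(6) + \frac{1}{12}(6) + \frac{1}{12}(24) + \frac{1}{12}(6) + \frac{1}{12}(6) + \frac{1}{12}(24)\right]}{9(80)}$$
$$\rho = 1 - \frac{6\left[22.5 + 0.5 + 0.5 + 2 + 0.5 + 0.5 + 2\right]}{720}$$
$$\rho = 1 - \frac{6(28.5)}{720}$$
$$\rho = 1 - \frac{171}{720}$$
$$\rho = 1 - 0.2375$$
$$\rho = 0.7625$$
Hence there is a high degree of positive
correlation between x and y.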
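The ranking and tie correction in this example can be sketched in Python as follows. The helper names (`average_ranks`, `tie_correction`, `spearman_with_ties`) are hypothetical. The sketch assumes rank 1 goes to the largest value and that each group of m tied values adds (m³ - m)/12 to Σd², which matches the hand calculation above.

```python
from collections import Counter


def average_ranks(values):
    """Rank values with 1 = largest; tied values share the average of their ranks."""
    ordered = sorted(values, reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    counts = Counter(values)
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    """Spearman's rho using the tie-corrected sum of squared rank differences."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    corrected = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - (6 * corrected) / (n * (n ** 2 - 1))


x = [50, 60, 50, 60, 80, 50, 80, 40, 70]
y = [30, 60, 40, 50, 60, 30, 70, 50, 60]
print(average_ranks(x))                     # [7.0, 4.5, 7.0, 4.5, 1.5, 7.0, 1.5, 9.0, 3.0]
print(round(spearman_with_ties(x, y), 4))   # 0.7625
```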

T-Test for the Sample Correlation Coefficient
• Example 1: The correlation coefficient between height and weight of 20 girls is
0.2. Is there any relation between these two variables in the population?
Solution:
Hypothesis:
Null Hypothesis: There is no relation between the two variables (H0: ρ = 0)
Alternative Hypothesis: There is a relation between the two variables (H1: ρ ≠ 0)

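Test statistic: n = 20, r = 0.2
$$t = \frac{r}{\sqrt{1 - r^2}} \times \sqrt{n - 2}$$
$$t = \frac{0.2}{\sqrt{1 - (0.2)^2}} \times \sqrt{20 - 2}$$
$$t = \frac{0.2}{\sqrt{1 - 0.04}} \times \sqrt{18}$$
$$t = \frac{0.2}{\sqrt{0.96}} \times 4.24$$
$$t = \frac{0.2}{0.98} \times 4.24$$
$$t = 0.2040 \times 4.24$$
$$t = 0.86496$$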

Degrees of freedom = n - 2
= 20 - 2
= 18
t(0.05, 18) = 2.101
tcal < ttab
Thus we fail to reject the null hypothesis, so there is no significant relation between the two
variables (H0: ρ = 0).
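The test statistic can be reproduced with a short Python sketch. The function name `t_statistic` is illustrative. The sketch assumes the usual n - 2 degrees of freedom for this test and takes the two-tailed 5% critical value for 18 degrees of freedom (2.101) from a standard t table rather than computing it.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2) for testing H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_cal = t_statistic(0.2, 20)
t_tab = 2.101  # assumed table value: two-tailed, alpha = 0.05, df = 18

print(round(t_cal, 3))      # 0.866
print(abs(t_cal) < t_tab)   # True -> the null hypothesis is not rejected
```

The unrounded statistic is about 0.866, which differs slightly from the hand-calculated figure only because the hand calculation rounds at intermediate steps.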

Merits
• It is easy to compute.
• It is easy to understand.
• This method can be used to carry out correlation
analysis for variables that are not numerical. We
can study the relationships between qualitative
variables such as beauty, intelligence, honesty,
efficiency, and so on.
• Spearman’s formula is the only formula to be
used for finding the correlation coefficient if we
are dealing with qualitative characteristics which
cannot be measured quantitatively but can be
arranged serially.
Demerits
• Only non-linear data can be computed.
• When the data have higher values, it is difficult to compute.
• A large computational time is required when the
number of pairs of values of two variables exceeds 30.
In such cases, assigning ranks to each of the numerical
values is a very time-consuming and tedious process.
• This method cannot be applied to measure the
association between two variables whose distribution
is given in the form of a grouped frequency
distribution.
