Simple Correlation: Karl Pearson’s Correlation Coefficient and Spearman’s Rank Correlation
The document discusses correlation and different correlation coefficients. It defines correlation as a linear relationship between two variables and explains that correlation coefficients measure the strength and direction of this relationship. It describes Karl Pearson's correlation coefficient and Spearman's rank correlation coefficient as common methods to determine correlation. It provides examples of calculating correlation using these methods and discusses limitations of correlation analysis.
Department of Economics
Introduction
Correlation analysis shows us how to determine both the nature and the strength of the relationship between two variables. The word correlation is made of co- (meaning "together") and relation. One of the best statistical tests out there, in my opinion, is the correlation. Correlation is a mutual relationship between two variables; more precisely, it is a linear association between two random variables.
Objectives
After going through this unit, you will be able to:
Understand the concept of correlation;
Define correlation, Karl Pearson’s coefficient of correlation and Spearman’s rank correlation;
Describe the merits and demerits of correlation;
Solve some practical problems of correlation in different series with different methods.
Correlation
A correlation is a linear relationship between two variables; it measures the linear association between them.
Definition
According to King, “If it is proved true that in a large number of instances two variables tend always to fluctuate in the same or in opposite directions we consider that the fact is established that a relationship exists. This relationship is called Correlation”
Types of Correlation
Type 1: On the basis of degree of correlation
Positive
Negative
No correlation
Perfect
Type 2: On the basis of number of variables
Simple
Multiple
Partial
Type 3: On the basis of linearity
Linear
Non-linear
Degree of Correlation
1. Perfect correlation
2. Absence of correlation
3. Limited degrees of correlation
• High degree
• Moderate degree
• Low degree

Degree      Positive           Negative
Perfect     +1                 -1
High        +0.75 to +1        -0.75 to -1
Moderate    +0.25 to +0.75     -0.25 to -0.75
Low         0 to +0.25         0 to -0.25
Absence     0                  0
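The table above can be read as a simple lookup rule. The sketch below is one possible way to write it in Python; the function name `degree_of_correlation` is made up for illustration, and because the ranges in the table share their end points (0.25 and 0.75), the sketch assumes a boundary value belongs to the higher degree.

```python
def degree_of_correlation(r: float) -> str:
    """Label a correlation coefficient r using the ranges in the table.

    Assumption (not stated in the table): a boundary value such as 0.75
    is assigned to the higher of the two degrees it touches.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0.0:
        return "absence of correlation"
    direction = "positive" if r > 0 else "negative"
    if size == 1.0:
        degree = "perfect"
    elif size >= 0.75:
        degree = "high"
    elif size >= 0.25:
        degree = "moderate"
    else:
        degree = "low"
    return f"{degree} {direction} correlation"


for value in (1.0, 0.8, -0.5, 0.1, 0.0):
    print(value, "->", degree_of_correlation(value))
```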
Methods of determining Correlation
1. Scatter Diagram Method
2. Graphic Method
3. Karl Pearson’s Coefficient of Correlation Method
4. Spearman’s Rank Correlation Method
5. Concurrent Deviation Method
6. Method of Least Squares
We will be talking about four methods ……..
Scatter Diagram Method
The relationship between any two variables can be portrayed graphically on an x- and y-axis. Each subject i has a pair of scores (xᵢ, yᵢ). When the scores for an entire sample are plotted, the result is called a scatter plot.
Direction of the relationship
Variables can be positively or negatively correlated.
Positive correlation: As the value of one variable increases, the value of the other variable also increases.
Negative correlation: As the value of one variable increases, the value of the other variable decreases.
No correlation: When the dots on a scatter diagram are scattered and do not reveal any trend, there is an absence of correlation between the two variables.
Karl Pearson’s correlation coefficient
There are many kinds of correlation coefficients, but the most commonly used measure of correlation is Pearson’s correlation coefficient (r).
This method is treated as the best method because it indicates both the strength and the direction of correlation in a single numerical measure.
Pearson’s r ranges from -1 to +1.
The sign indicates the direction.
The numerical value indicates the strength.
Perfect correlation: -1 or +1
No correlation: 0
A correlation of zero indicates that the values are not linearly related.
Calculation of Karl Pearson’s Coefficient of Correlation
To calculate Karl Pearson’s coefficient of correlation, the covariance of the two variables is ascertained first. This absolute measure is then converted into a coefficient by dividing it by the product of the standard deviations of the two variables. This ratio of the covariance to the product of the standard deviations is called Karl Pearson’s coefficient of correlation.
Symbolically,
$$r = \frac{\sum d_x d_y}{N \sigma_x \sigma_y} = \frac{\text{Covariance of } X \text{ and } Y}{\sqrt{\text{Variance of } X \times \text{Variance of } Y}}$$
where $d_x = X - \bar{X}$ and $d_y = Y - \bar{Y}$ are the deviations of the values from their means.
This is the original formula of Karl Pearson’s coefficient of correlation.
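As a rough illustration of the formula above, the following Python sketch computes r as the covariance divided by the product of the standard deviations. The function name `pearson_from_covariance` and the two small data sets are invented for illustration only; the covariance and variances divide by N, which is what the formula above assumes.

```python
import math


def pearson_from_covariance(x, y):
    """Pearson's r as covariance / (sigma_x * sigma_y), dividing by N throughout."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances of the two series (population form, divisor N).
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov_xy / math.sqrt(var_x * var_y)


# Illustrative data only (not from the text): exact straight-line relationships.
print(pearson_from_covariance([1, 2, 3, 4], [2, 4, 6, 8]))  # rising line
print(pearson_from_covariance([1, 2, 3, 4], [8, 6, 4, 2]))  # falling line
```

With exact straight-line data like this, the two calls should give the perfect values +1 and -1 respectively.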
Calculation of Karl Pearson’s Coefficient of Correlation
Direct Method for Individual Series
1. Find the means of the x and y variables.
2. Take the deviations of the x variable from its mean, i.e. $d_x = X - \bar{X}$.
3. Take the deviations of the y variable from its mean, i.e. $d_y = Y - \bar{Y}$.
4. Multiply the corresponding deviations of the x and y variables with each other and sum the products, i.e. $\sum d_x d_y$. This goes in the last column of the table.
5. Find the squares of the deviations of the x and y variables separately and sum them, i.e. $\sum d_x^2$ and $\sum d_y^2$.
6. Ascertain the standard deviations of both variables with the help of the following formulae:
$$\sigma_x = \sqrt{\frac{\sum d_x^2}{N}}, \qquad \sigma_y = \sqrt{\frac{\sum d_y^2}{N}}$$
7. In the end, the coefficient of correlation is obtained by using the following formula:
$$r = \frac{\sum d_x d_y}{N \sigma_x \sigma_y} \quad \text{or} \quad r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}}$$
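The two forms of the final formula agree with each other. A short derivation, using only the standard deviation formulae given in the steps above, is sketched here:

```latex
\begin{aligned}
N \sigma_x \sigma_y
  &= N \sqrt{\frac{\sum d_x^2}{N}} \, \sqrt{\frac{\sum d_y^2}{N}} \\
  &= N \cdot \frac{1}{N} \sqrt{\sum d_x^2 \times \sum d_y^2} \\
  &= \sqrt{\sum d_x^2 \times \sum d_y^2}
\end{aligned}
```

So dividing $\sum d_x d_y$ by either denominator gives the same r, and the second form saves the separate calculation of the two standard deviations.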
Direct method
Example: Calculate Karl Pearson’s coefficient of correlation from the following data:
Age of husbands   23 27 28 28 29 30 31 33 35 36
Age of wives      18 20 22 27 21 29 27 29 28 29
X-series
$$\bar{X} = \frac{\sum X}{N} = \frac{300}{10} = 30, \qquad \sigma_x = \sqrt{\frac{\sum d_x^2}{N}} = \sqrt{\frac{138}{10}} = 3.71$$
Y-series
$$\bar{Y} = \frac{\sum Y}{N} = \frac{250}{10} = 25, \qquad \sigma_y = \sqrt{\frac{\sum d_y^2}{N}} = \sqrt{\frac{164}{10}} = 4.05$$
Direct Method
Age of husbands: X = age (years), dx = X - X̄ = deviation from X̄ = 30, dx² = square of deviation
Age of wives: Y = age (years), dy = Y - Ȳ = deviation from Ȳ = 25, dy² = square of deviation
dx·dy = product of dx and dy

X        dx     dx²     Y        dy     dy²     dx·dy
23       -7     49      18       -7     49      49
27       -3     9       20       -5     25      15
28       -2     4       22       -3     9       6
28       -2     4       27       2      4       -4
29       -1     1       21       -4     16      4
30       0      0       29       4      16      0
31       1      1       27       2      4       2
33       3      9       29       4      16      12
35       5      25      28       3      9       15
36       6      36      29       4      16      24
Total           138                     164     123
N = 10          Σdx²                    Σdy²    Σdx·dy

$$r = \frac{\sum d_x d_y}{N \sigma_x \sigma_y} = \frac{123}{10 \times 3.71 \times 4.05} = +0.82$$
$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} = \frac{123}{\sqrt{138 \times 164}} = +0.82$$
It indicates a high degree of positive correlation between the ages of husbands and wives.
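The worked example can be checked with a few lines of Python. This is only a sketch of the direct method as described in this unit; the function name `direct_method` and the dictionary keys are chosen here for illustration, while the two data lists are the ages from the example.

```python
import math

husbands = [23, 27, 28, 28, 29, 30, 31, 33, 35, 36]  # X-series from the example
wives = [18, 20, 22, 27, 21, 29, 27, 29, 28, 29]     # Y-series from the example


def direct_method(x, y):
    """Direct method: deviations are taken from the actual means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [value - mean_x for value in x]
    dy = [value - mean_y for value in y]
    sum_dx2 = sum(d * d for d in dx)                # total of the dx² column
    sum_dy2 = sum(d * d for d in dy)                # total of the dy² column
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))   # total of the dx·dy column
    r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sum_dx2": sum_dx2,
        "sum_dy2": sum_dy2,
        "sum_dxdy": sum_dxdy,
        "r": r,
    }


result = direct_method(husbands, wives)
for name, value in result.items():
    print(name, "=", round(value, 2))
```

The printed column totals should match the table (138, 164 and 123), and r should round to the +0.82 obtained by hand.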
Short Cut Method for Individual Series (this method is used if the arithmetic mean is not a whole number)
1. In the x and y variables, suitable and convenient values are taken as assumed (arbitrary) means, $A_x$ and $A_y$.
2. In both series, the deviations of the values from the assumed means are taken: $d_x = X - A_x$ and $d_y = Y - A_y$.
3. The deviations taken as above are totalled: $\sum d_x$ and $\sum d_y$.
4. Corresponding deviations of the two variables are multiplied with each other and the products are totalled: $\sum d_x d_y$.
5. The deviations taken from the assumed means are squared and totalled: $\sum d_x^2$ and $\sum d_y^2$.
6. In the end, the coefficient of correlation is obtained by using the following formula:
$$r = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N \sum d_x^2 - \left(\sum d_x\right)^2} \times \sqrt{N \sum d_y^2 - \left(\sum d_y\right)^2}}$$
Short Cut Method
Example: Calculate Karl Pearson’s coefficient of correlation from the following data:
Series A 112 114 108 124 145 150 119 125 147 150
Series B 200 190 214 187 170 170 210 190 180 180
Short Cut Method
Series A: X = item value, dx = deviation from 125, dx² = square of deviation
Series B: Y = item value, dy = deviation from 190, dy² = square of deviation
dx·dy = product of dx and dy

X        dx     dx²     Y        dy     dy²     dx·dy
112      -13    169     200      10     100     -130
114      -11    121     190      0      0       0
108      -17    289     214      24     576     -408
124      -1     1       187      -3     9       3
145      20     400     170      -20    400     -400
150      25     625     170      -20    400     -500
119      -6     36      210      20     400     -120
125      0      0       190      0      0       0
147      22     484     180      -10    100     -220
150      25     625     180      -10    100     -250
Total
1294     44     2750    1891     -9     2085    -2025
ΣX       Σdx    Σdx²    ΣY       Σdy    Σdy²    Σdx·dy

$$r = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N \sum d_x^2 - \left(\sum d_x\right)^2} \times \sqrt{N \sum d_y^2 - \left(\sum d_y\right)^2}}$$
$$= \frac{10 \times (-2025) - (44 \times (-9))}{\sqrt{25564 \times 20769}} = -0.86$$
It indicates a high degree of negative correlation between Series A and Series B.
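The short-cut calculation can also be sketched in Python. The function name `short_cut_r` is made up for illustration; the data are Series A and Series B from the example, with 125 and 190 as the assumed means. The second call uses a different, arbitrarily chosen pair of assumed means to illustrate that the choice of assumed mean should not change r.

```python
import math

series_a = [112, 114, 108, 124, 145, 150, 119, 125, 147, 150]
series_b = [200, 190, 214, 187, 170, 170, 210, 190, 180, 180]


def short_cut_r(x, y, assumed_x, assumed_y):
    """Short-cut method: deviations are taken from assumed means."""
    n = len(x)
    dx = [value - assumed_x for value in x]
    dy = [value - assumed_y for value in y]
    sum_dx = sum(dx)
    sum_dy = sum(dy)
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator


r = short_cut_r(series_a, series_b, 125, 190)        # assumed means from the example
r_other = short_cut_r(series_a, series_b, 130, 185)  # arbitrary alternative choice
print(round(r, 2), round(r_other, 2))
```

Both values should round to the -0.86 found by hand, because the correction terms $\sum d_x \sum d_y$, $(\sum d_x)^2$ and $(\sum d_y)^2$ compensate for the gap between the assumed and the actual means.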
Spearman’s Rank Difference Method
Spearman’s rank correlation coefficient is a technique which can be used to summarize the strength and direction (negative or positive) of a relationship between two variables.
The result will always be between +1 and -1.
Calculation of Spearman’s Rank Difference Method
Method - calculating the coefficient
1. Create a table from your data.
2. Rank the two data sets. Ranking is achieved by giving the rank '1' to the biggest number in a column, '2' to the second biggest value and so on. The smallest value in the column gets the lowest rank.
3. Find the difference in the ranks (D): this is the difference between the ranks of the two values on each row of the table.
4. Square the differences (D²) to remove negative values, and then sum them (ΣD²).
5. Lastly, the following formula is used to calculate the coefficient of correlation by the rank differences method:
$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
Example: From the following data, compute the coefficient of correlation by the rank differences method:
X series   85 91 56 72 95 76 89 51 59 90
Y series   18.3 20.8 16.9 15.7 19.2 18.1 17.5 14.9 18.9 15.4
D = rank difference, D² = square of rank difference

Series X   Rank X   Series Y   Rank Y   D      D²
85         5        18.3       4        1      1
91         2        20.8       1        1      1
56         9        16.9       7        2      4
72         7        15.7       8        -1     1
95         1        19.2       2        -1     1
76         6        18.1       5        1      1
89         4        17.5       6        -2     4
51         10       14.9       10       0      0
59         8        18.9       3        5      25
90         3        15.4       9        -6     36
N = 10                                  ΣD = 0 ΣD² = 74

$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 74}{10(10^2 - 1)} = 1 - \frac{444}{990} = +0.55$$
There is a moderate degree of positive rank correlation between the X and Y series.
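The rank difference calculation can be reproduced with the short Python sketch below. The helper names `descending_ranks` and `spearman_rho` are invented for illustration, and the sketch assumes there are no tied values in either series (true for this example); tied values need extra handling that this unit does not cover.

```python
x_series = [85, 91, 56, 72, 95, 76, 89, 51, 59, 90]
y_series = [18.3, 20.8, 16.9, 15.7, 19.2, 18.1, 17.5, 14.9, 18.9, 15.4]


def descending_ranks(values):
    """Give rank 1 to the biggest value, 2 to the next biggest, and so on.

    Assumption: no two values are equal.
    """
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled in this sketch")
    ordered = sorted(values, reverse=True)
    return [ordered.index(value) + 1 for value in values]


def spearman_rho(x, y):
    """Return (rho, sum of squared rank differences) by the rank difference method."""
    n = len(x)
    rank_x = descending_ranks(x)
    rank_y = descending_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    rho = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))
    return rho, sum_d2


rho, sum_d2 = spearman_rho(x_series, y_series)
print("Rank X:", descending_ranks(x_series))
print("Rank Y:", descending_ranks(y_series))
print("Sum of D squared:", sum_d2)
print("rho:", round(rho, 2))
```

The ranks and the total of D² should agree with the table, and rho should round to +0.55.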
Limitations of Correlation
Although correlation is a powerful tool, there are some limitations in using it:
1. Correlation does not tell us everything about the data. Means and standard deviations continue to be important.
2. The data may be described by a curve more complicated than a straight line, but this will not show up in the calculation of r.
3. Just because two sets of data are correlated, it doesn't mean that one is the cause of the other.
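The second limitation can be illustrated with a small Python sketch. The data here are invented for illustration: y is exactly the square of x, so the two variables are perfectly related by a curve, yet r works out to zero because the relationship is not a straight line. The function name `pearson_r` is also just a label chosen for this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [value - mean_x for value in x]
    dy = [value - mean_y for value in y]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / math.sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))


# Illustrative data only: a perfect curve, y = x squared, symmetric about x = 0.
x_values = [-3, -2, -1, 0, 1, 2, 3]
y_values = [value ** 2 for value in x_values]

r_curve = pearson_r(x_values, y_values)
print(r_curve)
```

A scatter diagram of these points would show the U-shaped pattern at once, which is one reason to look at the plot as well as the coefficient.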
Unit End Questions
1. Explain the meaning and significance of the concept of correlation.
2. What is correlation? Distinguish between positive and negative correlation. What are the maximum and minimum values of the coefficient of correlation?
3. Write short notes on the following:
Simple and Multiple correlation
Linear and Curvilinear correlation
Rank correlation
4. Calculate the coefficient of correlation from the following data:
X 65 72 78 77 76 77 80
Y 36 40 38 37 39 41 40