ECONOMICS
Basic Statistics
Dr Rekha Choudhary
Department of Economics
Jai Narain Vyas University, Jodhpur, Rajasthan
Simple Correlation: Karl Pearson’s Correlation Coefficient and Spearman’s Rank Correlation
Introduction
Correlation analysis shows us how to determine both the nature and the strength of
the relationship between two variables. The word "correlation" is made of co- (meaning
"together") and relation. One of the best statistical tests out there, in my opinion, is
the correlation. Correlation is a mutual relationship between two variables; more
specifically, it is a linear association between two random variables.
Objectives
After going through this unit, you will be able to:
• Understand the concept of correlation;
• Define correlation, Karl Pearson’s coefficient of correlation and Spearman’s rank correlation;
• Describe the merits and demerits of correlation;
• Solve some practical problems of correlation in different series with different methods.
Correlation
A correlation is a linear relationship between two variables: it measures the
linear association between them.
Definition
According to King, “If it is proved true that in a large number of
instances two variables tend always to fluctuate in the same or in
opposite directions we consider that the fact is established that a
relationship exists. This relationship is called Correlation.”
Types of Correlation
Type 1: On the basis of direction and degree of correlation
• Positive
• Negative
• No correlation
• Perfect
Type 2: On the basis of linearity
• Linear
• Non-linear
Type 3: On the basis of number of variables
• Simple
• Multiple
• Partial
Degree of Correlation
1. Perfect Correlation
2. Absence of Correlation
3. Limited degrees of Correlation
• High degree
• Moderate degree
• Low degree

Degree      Positive           Negative
Perfect     +1                 -1
High        +0.75 to +1        -0.75 to -1
Moderate    +0.25 to +0.75     -0.25 to -0.75
Low         0 to +0.25         0 to -0.25
Absence     0                  0
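The degree table above can be read as a simple lookup. The following is a minimal sketch, not part of the original slides: the function name `degree_of_correlation` is hypothetical, and because the table's bands share their end points, the sketch assumes that 0.25 and 0.75 belong to the higher band.

```python
def degree_of_correlation(r: float) -> str:
    """Label a correlation coefficient using the degree table.

    Assumption (not stated in the slides): the shared end points
    0.25 and 0.75 are assigned to the higher band.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "absence of correlation"
    if size == 1:
        label = "perfect"
    elif size >= 0.75:
        label = "high"
    elif size >= 0.25:
        label = "moderate"
    else:
        label = "low"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction} correlation"


print(degree_of_correlation(0.82))   # high positive correlation
print(degree_of_correlation(-0.86))  # high negative correlation
print(degree_of_correlation(0.55))   # moderate positive correlation
```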
Methods of Determining Correlation
1. Scatter Diagram Method
2. Graphic Method
3. Karl Pearson’s Coefficient of Correlation Method
4. Spearman’s Rank Correlation Method
5. Concurrent Deviation Method
6. Method of Least Squares
We will be talking about four methods.
Scatter Diagram Method
The relationship between any two variables can be portrayed graphically on
an x- and y-axis. Each subject i has a pair of scores (x_i, y_i). When the scores
for an entire sample are plotted, the result is called a scatter plot.
Direction of the relationship
Variables can be positively or
negatively correlated.
Positive correlation: as the value of one variable increases, the value of the
other variable also increases.
Negative correlation: as the value of one variable increases, the value of the
other variable decreases.
No correlation: when the dots on a scatter diagram are scattered and do not
reveal any trend, there is an absence of correlation between the two variables.
Karl Pearson’s correlation coefficient
There are many kinds of correlation coefficients, but the most commonly used
measure of correlation is Pearson’s correlation coefficient (r).
This method is treated as the best method because it indicates the degree and
direction of correlation in terms of a satisfactory numerical measurement.
• The Pearson r ranges from -1 to +1.
• The sign indicates the direction.
• The numerical value indicates the strength.
• Perfect correlation: -1 or +1
• No correlation: 0
• A correlation of zero indicates that the values are not linearly related.
Calculation of Karl Pearson’s Coefficient of Correlation
To calculate Karl Pearson’s coefficient of correlation, first the covariance is
ascertained. This absolute measure is then converted into a coefficient by
dividing it by the product of the standard deviations of the two variables. This ratio
of the covariance to the product of the standard deviations is called Karl Pearson’s
coefficient of correlation.
Symbolically,
$$ r = \frac{\sum d_x d_y}{N \sigma_x \sigma_y} = \frac{\text{Covariance of } X \text{ and } Y}{\sqrt{\text{Variance of } X \times \text{Variance of } Y}} $$
where $d_x = X - \bar{X}$ and $d_y = Y - \bar{Y}$ are the deviations from the means.
This is the original formula of Karl Pearson’s coefficient of correlation.
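The ratio described above (covariance divided by the product of the standard deviations) can be sketched in a few lines. This is an illustrative sketch only: the function name `pearson_r` and the `hours`/`score` data are made up for the example, and the sketch assumes population formulas (division by N), as in the slide.

```python
import math


def pearson_r(x, y):
    """Pearson's r as covariance divided by the product of standard deviations."""
    n = len(x)
    if n != len(y) or n == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance: mean of the products of the deviations.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Population standard deviations (divide by N, as in the formula).
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return cov_xy / (sd_x * sd_y)


hours = [1, 2, 3, 4, 5]    # made-up illustrative data
score = [2, 4, 6, 8, 10]   # exactly 2 * hours, so r should be +1
print(round(pearson_r(hours, score), 4))        # 1.0
print(round(pearson_r(hours, score[::-1]), 4))  # -1.0
```

Reversing one series flips the sign but not the size of r, which matches the reading of sign as direction and magnitude as strength.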
Direct Method for Individual Series
• Find the mean of the x and y variables.
• Take the deviations of the x variable from its mean, i.e. $d_x = X - \bar{X}$.
• Take the deviations of the y variable from its mean, i.e. $d_y = Y - \bar{Y}$.
• Multiply the corresponding deviations of the x and y variables with each other and
sum them, i.e. $\sum d_x d_y$. This goes in the last column of the table.
• Find the squares of the deviations of the x and y variables separately and
sum them, i.e. $\sum d_x^2$ and $\sum d_y^2$.
• Ascertain the standard deviation of both variables with the help of the following
formulae:
$$ \sigma_x = \sqrt{\frac{\sum d_x^2}{N}}, \qquad \sigma_y = \sqrt{\frac{\sum d_y^2}{N}} $$
• In the end, the coefficient of correlation is obtained by using the following formula:
$$ r = \frac{\sum d_x d_y}{N \sigma_x \sigma_y} \quad \text{or} \quad r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} $$
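The two forms of the direct-method formula give the same value. A short derivation, using only the standard-deviation formulae from the steps above, shows why the factor N cancels:

```latex
\begin{aligned}
N \sigma_x \sigma_y
  &= N \sqrt{\frac{\sum d_x^2}{N}} \sqrt{\frac{\sum d_y^2}{N}}
   = N \cdot \frac{\sqrt{\sum d_x^2 \times \sum d_y^2}}{N}
   = \sqrt{\sum d_x^2 \times \sum d_y^2}, \\
\text{so}\quad
r &= \frac{\sum d_x d_y}{N \sigma_x \sigma_y}
   = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} .
\end{aligned}
```

The second form is often more convenient by hand because it avoids rounding the two standard deviations before dividing.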
Direct Method
Example: Calculate Karl Pearson’s coefficient of correlation from the following data.

Age of husbands   23 27 28 28 29 30 31 33 35 36
Age of wives      18 20 22 27 21 29 27 29 28 29

X-series:
$$ \bar{X} = \frac{\sum X}{N} = \frac{300}{10} = 30, \qquad \sigma_x = \sqrt{\frac{\sum d_x^2}{N}} = \sqrt{\frac{138}{10}} = 3.71 $$
Y-series:
$$ \bar{Y} = \frac{\sum Y}{N} = \frac{250}{10} = 25, \qquad \sigma_y = \sqrt{\frac{\sum d_y^2}{N}} = \sqrt{\frac{164}{10}} = 4.05 $$
Direct Method

Age of husbands (X)               Age of wives (Y)                  Product
X      dx = X - 30    dx²         Y      dy = Y - 25    dy²         dx·dy
23     -7             49          18     -7             49          49
27     -3             9           20     -5             25          15
28     -2             4           22     -3             9           6
28     -2             4           27     2              4           -4
29     -1             1           21     -4             16          4
30     0              0           29     4              16          0
31     1              1           27     2              4           2
33     3              9           29     4              16          12
35     5              25          28     3              9           15
36     6              36          29     4              16          24
Total (N = 10)        Σdx² = 138                        Σdy² = 164  Σdx·dy = 123

$$ r = \frac{\sum d_x d_y}{N \sigma_x \sigma_y} = \frac{123}{10 \times 3.71 \times 4.05} = +0.82 $$
$$ r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} = \frac{123}{\sqrt{138 \times 164}} = +0.82 $$
This indicates a high degree of positive correlation between the ages of
husbands and wives.
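The worked table above can be checked with a short script. This is a sketch for verification only; the variable names are chosen for the example, and the data are the husband and wife ages from the slide.

```python
import math

husbands = [23, 27, 28, 28, 29, 30, 31, 33, 35, 36]
wives = [18, 20, 22, 27, 21, 29, 27, 29, 28, 29]

n = len(husbands)
mean_x = sum(husbands) / n   # 30.0
mean_y = sum(wives) / n      # 25.0

# Deviations from the actual means, as in the direct method.
dx = [x - mean_x for x in husbands]
dy = [y - mean_y for y in wives]

sum_dx2 = sum(d * d for d in dx)                # column total of dx squared
sum_dy2 = sum(d * d for d in dy)                # column total of dy squared
sum_dxdy = sum(a * b for a, b in zip(dx, dy))   # column total of dx * dy

r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)
print(sum_dx2, sum_dy2, sum_dxdy)  # 138.0 164.0 123.0
print(round(r, 2))                 # 0.82
```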
Short Cut Method for Individual Series (used when the arithmetic mean is not a whole
number)
• For the x and y variables, choose suitable and convenient values as assumed
(arbitrary) means, $A_x$ and $A_y$.
• In both series, take the deviations of the values from the assumed means
($d_x = X - A_x$ and $d_y = Y - A_y$).
• Total these deviations to get $\sum d_x$ and $\sum d_y$.
• Multiply the corresponding deviations of the two variables with each other and
total the products to get $\sum d_x d_y$.
• Square the deviations from the assumed means and total them to get $\sum d_x^2$ and $\sum d_y^2$.
• In the end, the coefficient of correlation is obtained by using the following formula:
$$ r = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N \sum d_x^2 - (\sum d_x)^2} \times \sqrt{N \sum d_y^2 - (\sum d_y)^2}} $$
Short Cut Method
Example: Calculate Karl Pearson’s Coefficient of correlation from the following
data-
Series A 112 114 108 124 145 150 119 125 147 150
Series B 200 190 214 187 170 170 210 190 180 180
Short Cut Method

Series A                          Series B                          Product
X      dx = X - 125   dx²         Y      dy = Y - 190   dy²         dx·dy
112    -13            169         200    10             100         -130
114    -11            121         190    0              0           0
108    -17            289         214    24             576         -408
124    -1             1           187    -3             9           3
145    20             400         170    -20            400         -400
150    25             625         170    -20            400         -500
119    -6             36          210    20             400         -120
125    0              0           190    0              0           0
147    22             484         180    -10            100         -220
150    25             625         180    -10            100         -250
Total: ΣX = 1294, Σdx = 44, Σdx² = 2750, ΣY = 1891, Σdy = -9, Σdy² = 2085, Σdx·dy = -2025

$$ r = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N \sum d_x^2 - (\sum d_x)^2} \times \sqrt{N \sum d_y^2 - (\sum d_y)^2}} $$
$$ = \frac{10 \times (-2025) - (44 \times (-9))}{\sqrt{10 \times 2750 - 44^2} \times \sqrt{10 \times 2085 - (-9)^2}} = \frac{-19854}{\sqrt{25564 \times 20769}} = -0.86 $$
This indicates a high degree of negative correlation between Series A
and Series B.
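The short cut calculation above can be reproduced with a small function. This is a sketch under stated assumptions: the function name `pearson_r_assumed_mean` is hypothetical, and the second call uses assumed means (100 and 200) that are not in the slides, chosen only to illustrate that the result does not depend on which assumed means are picked.

```python
import math


def pearson_r_assumed_mean(x, y, a_x, a_y):
    """Short cut method: deviations are taken from assumed means a_x and a_y."""
    n = len(x)
    dx = [v - a_x for v in x]
    dy = [v - a_y for v in y]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator


series_a = [112, 114, 108, 124, 145, 150, 119, 125, 147, 150]
series_b = [200, 190, 214, 187, 170, 170, 210, 190, 180, 180]

r_slide = pearson_r_assumed_mean(series_a, series_b, 125, 190)  # means used in the table
r_other = pearson_r_assumed_mean(series_a, series_b, 100, 200)  # hypothetical alternative
print(round(r_slide, 2))               # -0.86
print(math.isclose(r_slide, r_other))  # True
```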
Spearman’s Rank Difference Method
Spearman’s Rank correlation coefficient is a technique which
can be used to summarize the strength and direction (negative
or positive) of a relationship between two variables.
The result will always be between 1 and minus 1.
Calculation of Spearman’s Rank Difference Method
Method: calculating the coefficient
• Create a table from your data.
• Rank the two data sets. Ranking is achieved by giving rank 1 to the biggest
number in a column, rank 2 to the second biggest value, and so on. The smallest value
in the column gets the lowest rank.
• Find the difference in the ranks (D): this is the difference between the ranks of the two
values on each row of the table.
• Square the differences (D²) to remove negative values, and then sum them (ΣD²).
• Lastly, the following formula is used to calculate the coefficient of correlation by
the rank differences method:
$$ \rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} $$
Example: From the following data, compute the coefficient of correlation by the rank
differences method.

X series   85 91 56 72 95 76 89 51 59 90
Y series   18.3 20.8 16.9 15.7 19.2 18.1 17.5 14.9 18.9 15.4

Series X            Series Y            Rank difference   Square of rank difference
X      Rank X       Y       Rank Y      D                 D²
85     5            18.3    4           1                 1
91     2            20.8    1           1                 1
56     9            16.9    7           2                 4
72     7            15.7    8           -1                1
95     1            19.2    2           -1                1
76     6            18.1    5           1                 1
89     4            17.5    6           -2                4
51     10           14.9    10          0                 0
59     8            18.9    3           5                 25
90     3            15.4    9           -6                36
N = 10                                  ΣD = 0            ΣD² = 74

$$ \rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 74}{10(10^2 - 1)} = 1 - \frac{444}{990} = +0.55 $$
There is a moderate degree of positive rank correlation between the X and Y
series.
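The rank difference calculation above can be sketched as follows. The helper names `ranks_descending` and `spearman_rho` are hypothetical, and the sketch assumes there are no tied values (as in this example); tied values would need average ranks, which the slides do not cover.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value; assumes there are no tied values."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need average ranks, not handled in this sketch")
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Return (rho, sum of squared rank differences) for untied data."""
    n = len(x)
    rank_x = ranks_descending(x)
    rank_y = ranks_descending(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1)), sum_d2


x_series = [85, 91, 56, 72, 95, 76, 89, 51, 59, 90]
y_series = [18.3, 20.8, 16.9, 15.7, 19.2, 18.1, 17.5, 14.9, 18.9, 15.4]

rho, sum_d2 = spearman_rho(x_series, y_series)
print(ranks_descending(x_series))  # [5, 2, 9, 7, 1, 6, 4, 10, 8, 3]
print(sum_d2)                      # 74
print(round(rho, 2))               # 0.55
```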
Limitations of Correlation
Although correlation is a powerful tool, there are some limitations in using it:
1. Correlation does not tell us everything about the data. Means
and standard deviations continue to be important.
2. The data may be described by a curve more complicated than a straight
line, but this will not show up in the calculation of r.
3. Just because two sets of data are correlated, it doesn't mean that one is the
cause of the other.
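The second limitation can be illustrated with a small made-up data set (not from the slides): here y is completely determined by x through a curve, yet Pearson's r comes out as zero because the relationship is not linear.

```python
import math

x = [-3, -2, -1, 0, 1, 2, 3]   # made-up illustrative data
y = [v * v for v in x]         # y = x squared: a perfect but non-linear relationship

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sum_dxdy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sum_dx2 = sum((a - mean_x) ** 2 for a in x)
sum_dy2 = sum((b - mean_y) ** 2 for b in y)

r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)
print(round(r, 10))  # 0.0
```

A scatter diagram of these points would show the curve immediately, which is one reason to plot the data before relying on r.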
Unit End Questions
1. Explain the meaning and significance of the concept of correlation.
2. What is correlation? Distinguish between positive and negative
correlation. What are the maximum and minimum values of the coefficient of
correlation?
3. Write short notes on the following:
• Simple and multiple correlation
• Linear and curvilinear correlation
• Rank correlation
4. Calculate the coefficient of correlation from the following data:
X 65 72 78 77 76 77 80
Y 36 40 38 37 39 41 40
