Galton invented the concept of correlation to measure how variables vary together. Pearson later formalized this method. Correlation measures the strength and direction of association between two variables on a scale of -1 to 1. A value of 0 indicates no association, 1 indicates perfect positive association, and -1 indicates perfect negative association. Correlation is widely used to study relationships between variables in fields such as business, psychology, and engineering.
4. HISTORY
GALTON:
Obsessed with measurement
Tried to measure everything
from the weather to female
beauty
Invented correlation and
regression
KARL PEARSON
formalized Galton's method
invented method
5. CORRELATION
Measure of the degree to
which any two variables
vary together.
or
Simultaneously variation
of variables in some
direction.
e.g., iron bar
6. INTRODUCTION
Association b/w two variables
Nature & strength of
relationship b/w two variables
Both random variables
Lies b/w +1 & -1
0 = No relationship b/w
variables
-1 = Perfect Negative
correlation
+1 = Perfect positive
correlation
14. SPEARMAN'S RANK
Actual measurement of objects/individualsActual measurement of objects/individuals
not availablenot available
Accurate assesment is not possibleAccurate assesment is not possible
Arranged in orderArranged in order
Ordered arrangement: RankingOrdered arrangement: Ranking
Order Given to object: RanksOrder Given to object: Ranks
Correlation blw two sets X & Y: RankCorrelation blw two sets X & Y: Rank
correlationcorrelation
15. PROCEDURE
Rank values of X from 1-nRank values of X from 1-n
n: # of pairs of values of X & Yn: # of pairs of values of X & Y
Rank Y from 1-nRank Y from 1-n
Compute value of 'di' by Xi - YiCompute value of 'di' by Xi - Yi
Square each di & compute ∑diSquare each di & compute ∑di22
Apply formula;Apply formula; 2
s 2
6 (di)
r 1
n(n 1)
= −
−
∑
16. EXAMPLE
In a study of the relationship between level education andIn a study of the relationship between level education and
income the following data was obtained. Find the relationshipincome the following data was obtained. Find the relationship
between them and comment.between them and comment.
sample
numbers
level education
(X)
Income
(Y)
A Preparatory.Preparatory. 25
B Primary.Primary. 10
C University.University. 8
D secondarysecondary 10
E secondarysecondary 15
F illiterateilliterate 50
G University.University. 60
17. X Y rank
X
rank
Y
di di2
A Preparato
ry
25 5 3 2 4
B Primary 10 6 5.5 0.5 0.25
C University 8 1.5 7 -5.5 30.25
D secondary 10 3.5 5.5 -2 4
E secondary 15 3.5 4 -0.5 0.25
F illiterate 50 7 2 5 25
G university 60 1.5 1 0.5 0.25
∑ di2
=64
18. Conclusion:Conclusion:
There is an indirect weak correlationThere is an indirect weak correlation
between level of education and income.between level of education and income.
1.0
)48(7
646
1 −=
×
−=sr
21. • POSITIVE - both
either increase or
decrease
• NEGATIVE - one
increase while other
decrease
• NO - no correlation
• PERFECT - both
variables are
independents
22. EXAMPLES
+ive Relationships
• WAter consumption
& temperature
• Study times &
grades
-ive Relationships
• Alcohol
consumption &
driving ability
• Price & Quantity
demanded
26. • SIMPLE - 1 independent & 1 dependent
variable
• MULTIPLE - 1 dep & more than 1 indep
variable
• PARTIAL - 1 dep & more than 1 indep
variable bt only 1 indep variable is considered
while other const
27. COEFFICIENT OF
CORRELATION
Measure of the strength of linear relationship
b/w two variables.
Represented by 'r'
'r' lies b/w +1 & -1
-1 ≤ r ≤ +1
+ive sign = +ive linear correlation
-ive sign = -ive linear correlation
28. MATHEMATICALLY 'r'
2 2 2 2
( )( )
[ ( ) ][ ( ) ]
n xy x y
r
n x x n y y
−
=
− −
∑ ∑ ∑
∑ ∑ ∑ ∑
31. CORRELATIION: LINEAR
RELATIONSHIPS
0
20
40
60
80
100
120
140
160
180
0 50 100 150 200 250
Drug A (dose in mg)
SymptomIndex
0
20
40
60
80
100
120
140
160
0 50 100 150 200 250
Drug B (dose in mg)
SymptomIndex
Srong Relationship → Good linear fit
Points clustered closely around a line show a
strong correlation. The line is a good
predictor (good fit) with the data. The more
spread out the points, the weaker the
correlation, and the less good the fit. The line
is a REGRESSSION line (Y = bX + a)
32. r : shows relationship b/w variables either
+ive or -ive
r2
: shows % of variation by best fit line
33. Example:Example:
A sample of 6 children was selected, data about their ageA sample of 6 children was selected, data about their age
in years and weight in kilograms was recorded as shownin years and weight in kilograms was recorded as shown
in the following table . It is required to find thein the following table . It is required to find the
correlation between age and weight.correlation between age and weight.
serial # Age X
I.V(years)
Weight Y
D.V(Kg)
1 7 12
2 6 8
3 8 12
4 5 10
5 6 11
6 9 13
35. r = 0.759r = 0.759
strong direct correlationstrong direct correlation
2 2
41 66
461
6r
(41) (66)
291 . 742
6 6
×
−
=
− −
2 2
2 2
x y
xy
nr
( x) ( y)
x . y
n n
−
=
− − ÷ ÷ ÷ ÷
∑ ∑∑
∑ ∑∑ ∑
36. APPLICATIONS
Estimating & improving.,
Seasonal sales for departmental stores
Quantity demanded & production
Motivating tools for employees
Cost of products demanded
accuracy of estimations for demands for sails
Inflation & real wage
Oil exploration
37. Moreover, Radar system is field where
correlation is vehicle to map distance
&
in communication, for instance in digital
receivers.