Power point presentationCORRELATION.pptx

 It deals with the association between two or
more variables.
 It attempts to determine the degree of
relationship between variables.
 It is the covariation between two or more
variables.
 If a change in one variable is accompanied by a
change in the second variable, the variables are
said to be correlated.

 It is the statistical technique to find out
how variables are inter-related.
 For example : Correlation between
smoking and lung cancer.
 The coefficient of correlation was given by Karl Pearson.
 It is represented by symbol ‘r’.

There are three important ways for
classification of correlation:
1. Positive or negative correlation
2. Simple or multiple correlation
3. Linear or Non-linear correlation

 Positive correlation: both variables vary in the
same direction, i.e. as one variable increases the
other also increases (or both decrease together).
 Example:
X Y
10 15
12 20
15 22
18 25
20 37
X Y
80 50
70 44
60 30
40 20
30 10

 Negative correlation: the variables vary in opposite
directions, i.e. as one variable increases the other
decreases, or vice versa.
X Y
20 40
30 30
40 22
60 15
80 10
X Y
100 10
90 20
60 30
40 40
30 50
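The direction of correlation in the tables above can be checked with a short sketch. It uses the sign of the covariation (the sum of the products of the deviations of X and Y from their means); the function names `covariation` and `direction` are illustrative choices, not part of the slides.

```python
def covariation(xs, ys):
    """Sum of products of deviations of X and Y from their means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def direction(xs, ys):
    """Classify the direction of correlation from the sign of the covariation."""
    c = covariation(xs, ys)
    if c > 0:
        return "positive"
    if c < 0:
        return "negative"
    return "none"


# First table of each example from the slides.
print(direction([10, 12, 15, 18, 20], [15, 20, 22, 25, 37]))  # positive
print(direction([20, 30, 40, 60, 80], [40, 30, 22, 15, 10]))  # negative
```

The sign only gives the direction; it says nothing about how strong the relationship is.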

 Simple correlation
 Multiple correlation

 When only two variables are studied it
is a problem of simple correlation.
 When three or more variables are
studied it is a problem of multiple
correlation.

 Linear correlation
 Non-linear correlation

 If the amount of change in one variable tends to
bear a constant ratio to the amount of change in
the other variable, it is said to be a linear
correlation.
[Graph: Y plotted against X, points on a straight line]

X Y
10 70
20 140
30 210
40 280
50 350
60 420

 Correlation is non-linear or curvilinear
if the amount of change in one variable does
not bear a constant ratio to the amount of
change in the other variable.

X Y
10 70
20 120
30 120
40 200
50 220
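The "constant ratio" test can be applied directly to the two tables above. The sketch below computes the ratio of the change in Y to the change in X between consecutive rows; the helper names `change_ratios` and `is_linear` are assumptions made for illustration.

```python
def change_ratios(xs, ys):
    """Ratio of the change in Y to the change in X between consecutive rows."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]


def is_linear(xs, ys, tol=1e-9):
    """True when every consecutive change ratio equals the first one."""
    ratios = change_ratios(xs, ys)
    return all(abs(r - ratios[0]) <= tol for r in ratios)


linear_x = [10, 20, 30, 40, 50, 60]
linear_y = [70, 140, 210, 280, 350, 420]
curved_x = [10, 20, 30, 40, 50]
curved_y = [70, 120, 120, 200, 220]

print(change_ratios(linear_x, linear_y))  # [7.0, 7.0, 7.0, 7.0, 7.0]
print(change_ratios(curved_x, curved_y))  # [5.0, 0.0, 8.0, 2.0]
print(is_linear(linear_x, linear_y), is_linear(curved_x, curved_y))  # True False
```

In the first table every step of 10 in X adds 70 to Y, so the ratio stays at 7; in the second table the ratio changes from row to row.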

There are various methods to ascertain
whether two variables are correlated or not:
 Scatter diagram method
 Karl Pearson’s coefficient of correlation
 Spearman’s Rank correlation method

 It is the simplest method to ascertain
whether two variables are related or not.
 In this method the given data are plotted as
dots: for each pair of X and Y values we
plot one dot, and thus obtain as many points
as the number of observations.

 The greater the scatter of the plotted points on
the chart, the weaker the relationship
between the two variables.
 If all the points lie on a straight line
rising from the lower left corner to the
upper right corner, the correlation is said to be
perfectly positive (r = +1).
 On the other hand, if all the points lie on a
straight line falling from the upper left corner
to the lower right corner, the correlation is said
to be perfectly negative (r = -1).

 If the points are widely scattered over the
diagram, it indicates very little relationship
between the variables, whether negative or
positive.
 If the plotted points lie on the diagram in a
haphazard manner, it shows the absence of any
relationship between the variables.

X Y
2 6
3 5
5 7
6 8
8 12
9 11
a) Make a scatter diagram
b) Is there any correlation present between the
variables X and Y?
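For part (a), a rough scatter diagram can be drawn without any plotting library. The sketch below prints one `*` per (X, Y) pair on a character grid; `text_scatter` is a hypothetical helper and assumes small positive integer values, as in this exercise.

```python
def text_scatter(xs, ys):
    """Return a character-grid scatter diagram with one '*' per (x, y) pair.

    Assumes small positive integers, as in the exercise data.
    """
    width, height = max(xs), max(ys)
    grid = [[" "] * (width + 1) for _ in range(height + 1)]
    for x, y in zip(xs, ys):
        grid[y][x] = "*"
    # Emit the largest Y value first so the Y axis points upward.
    rows = [f"{y:2d} |" + "".join(grid[y][1:]) for y in range(height, 0, -1)]
    rows.append("   +" + "-" * width)
    return "\n".join(rows)


xs = [2, 3, 5, 6, 8, 9]
ys = [6, 5, 7, 8, 12, 11]
print(text_scatter(xs, ys))
```

For part (b), the dots run broadly from the lower left to the upper right, which suggests a positive correlation between X and Y, though the diagram alone does not give its exact degree.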

 It is a simple and non-mathematical method of
studying correlation between the variables.
 It is easily understood, and a rough idea can be
quickly formed as to whether the variables are
related or not.
 It is not influenced by the size of extreme items,
whereas most of the mathematical methods of
finding correlation are influenced by extreme
values.

 By applying this method we can get idea
about the direction of correlation and also
whether it is high or low. But we cannot
establish the exact degree of correlation
between the variables.

 It is popularly known as ‘Pearson’s
coefficient of correlation’ and is most
widely used in practice.
 It is denoted by symbol ‘r’.

$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$
The value of the coefficient of correlation obtained by
the above formula always lies between -1 and
+1.

 If r = +1, there is perfect positive
correlation between the variables.
 If r = -1, there is perfect negative
correlation between the variables.
 If r = 0, there is no linear relationship
between the variables.
Therefore the value of r always ranges
from -1 to +1; its magnitude can never exceed 1.

 Perfect positive correlation: the points lie exactly on
a rising straight line, so the two variables are fully
correlated with each other (r = +1).
 Perfect negative correlation: the points lie exactly on
a falling straight line, so the two variables are fully
correlated in opposite directions (r = -1).
 Partial positive correlation:
In this case 0 < r < 1
 Partial negative correlation:
In this case -1 < r < 0
 No linear relationship: r = 0

 It summarizes in a single figure not only the degree
of correlation but also its direction.

 The correlation coefficient always assumes a
linear relationship, whether or not that
assumption is correct.
 Great care must be exercised in interpreting
the value of the coefficient.
 The value of the coefficient is unduly affected
by extreme items.
 It is a time-consuming method.
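The first limitation can be illustrated with a small sketch. The data below are hypothetical (not from the slides): Y is exactly X squared, so the two variables are perfectly related, yet the product-moment formula gives r = 0 because the relationship is not linear. The function name `pearson_r` is an illustrative choice.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment coefficient from raw sums, as in the slide formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical data: a perfect but curvilinear relationship, Y = X squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
print(pearson_r(xs, ys))  # 0.0
```

A value of r near zero therefore rules out only a linear relationship, which is why a scatter diagram is worth drawing before interpreting the coefficient.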

X Y
9 15
8 16
7 14
6 13
5 11
4 12
3 10
2 8
1 9
Calculate Karl Pearson’s correlation coefficient
for the given data
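The exercise above can be worked through with a short sketch of the raw-sums formula for r. The function name `pearson_r` is an assumption made for illustration; the intermediate sums are noted in comments so they can be checked by hand.

```python
from math import sqrt


def pearson_r(xs, ys):
    """r = (N*SXY - SX*SY) / sqrt([N*SX2 - SX^2] * [N*SY2 - SY^2])."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs = [9, 8, 7, 6, 5, 4, 3, 2, 1]
ys = [15, 16, 14, 13, 11, 12, 10, 8, 9]

# By hand: N = 9, sum X = 45, sum Y = 108, sum XY = 597,
# sum X^2 = 285, sum Y^2 = 1356, so r = 513 / sqrt(540 * 540).
print(round(pearson_r(xs, ys), 4))  # 0.95
```

The result, r = 0.95, indicates a high degree of positive correlation for this data.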

X Y
52 65
53 68
42 43
60 38
45 77
41 48
37 35
38 30
25 25
27 50

 Karl Pearson’s correlation method is
based on the assumption that the population
being studied is normally distributed.
 When it is known that the population is not
normally distributed, there is a need for a measure
of correlation that involves no assumption about
the parameters of the population.
 The rank correlation method was developed by the
British psychologist Charles Edward Spearman in 1904.

 It is also called ‘Spearman’s rank correlation
coefficient’.
 It is represented by the symbol ‘R’.

$$R = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$

where D is the difference between the two ranks of
each item and N is the number of items.

 The value is interpreted in the same way as that of
Karl Pearson’s correlation coefficient.
 R ranges from -1 to +1.
 When R = +1 there is complete agreement in the
order of the ranks: the two rankings are identical.
 When R = -1 the two rankings are exactly
reversed: the ranks run in completely opposite
directions.

 In rank correlation we may have two types of
problems:
1. When ranks are given (just apply the formula)
2. When ranks are not given (calculate the ranks first)

 Two ladies were asked to rank 7 different lipsticks.
The ranks given by them are:
Calculate Spearman’s coefficient of correlation.
X Y
2 1
1 3
4 2
3 4
5 5
7 6
6 7
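When the ranks are already given, as in the lipstick exercise above, the formula can be applied directly. The following is a minimal sketch; `spearman_from_ranks` is a name chosen for illustration.

```python
def spearman_from_ranks(rank_x, rank_y):
    """R = 1 - 6 * sum(D^2) / (N * (N^2 - 1)), D = difference in ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


rank_x = [2, 1, 4, 3, 5, 7, 6]
rank_y = [1, 3, 2, 4, 5, 6, 7]

# By hand: D = 1, -2, 2, -1, 0, 1, -1, so sum D^2 = 12
# and R = 1 - 72 / 336.
print(round(spearman_from_ranks(rank_x, rank_y), 4))  # 0.7857
```

R is about 0.79, so the two rankings agree to a fairly high degree.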

X Y
6 3
5 8
3 4
10 9
2 1
4 6
9 10
7 7
8 5
1 2
Calculate Spearman’s correlation coefficient

 When we are given the actual data and not
the ranks it is necessary to assign the
ranks.

Price of tea in Rs. (X)   Price of coffee in Rs. (Y)
75 120
88 134
95 150
70 115
60 110
80 140
81 142
50 100
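For the tea and coffee prices above, ranks have to be assigned before the formula is applied. The sketch below gives rank 1 to the largest value; that convention is an assumption (ranking from the smallest gives the same R, provided both series are ranked the same way). The helper names are illustrative, and `assign_ranks` assumes there are no tied values.

```python
def assign_ranks(values):
    """Rank 1 goes to the largest value; assumes no tied values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_from_ranks(rank_x, rank_y):
    """R = 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


tea = [75, 88, 95, 70, 60, 80, 81, 50]
coffee = [120, 134, 150, 115, 110, 140, 142, 100]

rank_tea = assign_ranks(tea)        # [5, 2, 1, 6, 7, 4, 3, 8]
rank_coffee = assign_ranks(coffee)  # [5, 4, 1, 6, 7, 3, 2, 8]

# Sum D^2 = 6, so R = 1 - 36 / 504.
print(round(spearman_from_ranks(rank_tea, rank_coffee), 4))  # 0.9286
```

R is about 0.93, a high positive rank correlation between the two prices.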

 In some cases it may be necessary to rank two or
more individual entries as equal.
 Each tied entry is given the average of the ranks the
entries would otherwise have occupied.
 Suppose rank 3 is to be shared by 2 entries:
each gets (3 + 4) / 2 = 3.5,
and the next rank (4) is skipped.
 If three entries share rank 3:
each gets (3 + 4 + 5) / 3 = 4,
and the next two ranks are skipped.

 In addition, $\frac{1}{12}(m^3 - m)$ is added to $\sum D^2$ for each
repeated value, where m is the number of times that
value is repeated.

X Y
50 110
55 110
65 115
50 125
55 140
60 115
50 130
65 120
70 115
75 160
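The tied-rank procedure can be sketched for the table above. The code gives each tied value the average of the positions it occupies, then adds (m^3 - m)/12 to the sum of D^2 for every repeated value in either series. The function names are illustrative, and giving rank 1 to the largest value is an assumed convention.

```python
from collections import Counter


def average_ranks(values):
    """Rank 1 to the largest value; tied values share the average rank."""
    positions = {}
    for pos, v in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value repeated m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    """Rank correlation with the tie correction added to sum(D^2)."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    sum_d2 += tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


xs = [50, 55, 65, 50, 55, 60, 50, 65, 70, 75]
ys = [110, 110, 115, 125, 140, 115, 130, 120, 115, 160]

# By hand: sum D^2 = 134, corrections = 0.5 + 0.5 + 2 + 2 + 0.5 = 5.5,
# so R = 1 - 6 * 139.5 / 990.
print(round(spearman_with_ties(xs, ys), 4))  # 0.1545
```

R is about 0.15, a weak positive rank correlation once the ties are accounted for.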

 It is simple to understand and easier to apply
than Karl Pearson’s method.
 It can also be used where only ranks are
given and no actual data are available.
 When the number of values exceeds 30
the calculations become difficult.
