Properties of Correlation Coefficient
Property 1 - Limits for Correlation Coefficient
The Pearsonian correlation coefficient cannot exceed 1 in absolute value. In other words, it lies
between -1 and +1. Symbolically: -1 ≤ r ≤ 1. r = +1 implies perfect positive correlation between the
variables.
Property 2 - Correlation Coefficient is independent of the change of origin and scale.
Mathematically, if the variables x and y are transformed to the new variables u and v by
the change of origin and scale, viz.,
u = (x - A)/h and v = (y - B)/k,
where A, B, h and k are constants with h > 0 and k > 0, then the correlation coefficient between
x and y is the same as the correlation coefficient between u and v, i.e.,
r(x, y) = r(u, v), or rxy = ruv
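The invariance can be checked numerically. The following is a minimal sketch and not part of the original notes: the data values, the constants A = 30, h = 10, B = 29, k = 2 and the helper name pearson_r are all assumptions chosen for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data (made up for this sketch).
x = [10, 20, 30, 40, 50]
y = [12, 25, 29, 46, 48]

# Change of origin and scale with A = 30, h = 10, B = 29, k = 2 (h, k > 0).
u = [(xi - 30) / 10 for xi in x]
v = [(yi - 29) / 2 for yi in y]

r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)
print(r_xy, r_uv)  # the two values agree up to rounding error
```

Working with u and v instead of x and y is what makes the deviation-based formulas convenient for hand calculation: the numbers become smaller while r is unchanged.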
Property 3 - Two independent variables are uncorrelated but the converse is not true.
Remarks: Uncorrelatedness should not be confused with independence. rxy = 0,
i.e., zero correlation between the variables x and y, simply implies the absence of any linear
(straight-line) relationship between them. They may, however, be related in some other form
(other than a straight line), e.g., quadratic (as we have seen in the above example), logarithmic or
trigonometric.
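A small sketch of the quadratic case, assuming the illustrative relation y = x² on x values symmetric about zero (the data and the helper name pearson_r are assumptions, not taken from the notes). Here y is completely determined by x, yet the linear correlation vanishes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# x is symmetric about 0 and y depends exactly on x through y = x^2.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)  # zero: no linear relationship, although y is a function of x
```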
Property 4 - If the variables x and y are connected by the linear equation ax + by + c = 0, the
correlation coefficient between x and y is (+1) if the signs of a and b are different and (-1) if the
signs of a and b are alike.
Interpretation of r
The following general points may be borne in mind while interpreting an observed value of the
correlation coefficient r:
If r = -1, there is perfect negative correlation between the variables. In this case the scatter
diagram will again be a straight line.
If r = 0, the variables are uncorrelated; in other words, there is no linear (straight-line) relationship
between the variables. However, r = 0 does not imply that the variables are independent.
For other values of r lying between +1 and -1 there are no set guidelines for its interpretation.
The most we can conclude is that the nearer the numerical value of r is to 1, the closer is the
relationship between the variables, and the nearer the value of r is to 0, the less close is the
relationship between them. One should be very careful in interpreting the value of r, as it is often
misinterpreted.
The reliability or significance of the value of the correlation coefficient depends on a number of
factors. One way of testing the significance of r is to find its probable error, which takes into
account the size of the sample in addition to the value of r.
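The notes do not state the formula, so the following sketch assumes the usual textbook definition, P.E.(r) = 0.6745 (1 - r²)/√n, together with the conventional rule of thumb that r is treated as significant when it exceeds six times its probable error and as not significant when it is smaller than the probable error. The function names and the sample values are hypothetical.

```python
import math

def probable_error(r, n):
    """Probable error of r, assuming P.E. = 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def interpret(r, n):
    """Rule-of-thumb reading of r against its probable error (an assumption)."""
    pe = probable_error(r, n)
    if abs(r) > 6 * pe:
        return "significant"
    if abs(r) < pe:
        return "not significant"
    return "no definite conclusion"

# Hypothetical values: the same r can be read differently as n changes.
for r, n in [(0.8, 10), (0.1, 10), (0.4, 10), (0.4, 100)]:
    print(r, n, round(probable_error(r, n), 4), interpret(r, n))
```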
Another, more useful measure for interpreting the value of r is the coefficient of determination, r².
It shows that the closeness of the relationship between two variables is not proportional
to r.
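A brief illustration, taking the coefficient of determination to be r² (its usual definition): doubling r from 0.4 to 0.8 multiplies the coefficient of determination by four, so r itself is not a proportional measure of closeness. The function name and the two r values are chosen only for illustration.

```python
import math

def coefficient_of_determination(r):
    """Coefficient of determination, taken here as the square of r."""
    return r ** 2

low = coefficient_of_determination(0.4)
high = coefficient_of_determination(0.8)

# r doubles, but the coefficient of determination becomes four times as large.
print(round(low, 2), round(high, 2), round(high / low, 2))
```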
In total the Properties are:
o Limits for Correlation Coefficient.
o Independent of the change of origin & scale.
o Two independent variables are uncorrelated but the converse is not true.
o If variables x & y are connected by a linear equation ax+by+c=0, then the correlation coefficient
between x & y is (+1) if the signs of a, b are different & (-1) if the signs of a, b are alike.
Important Formulas:
r = [nΣdx.dy - Σdx.Σdy] / [√{nΣdx² - (Σdx)²} * √{nΣdy² - (Σdy)²}]
r = Σxy / √[Σx².Σy²]
r = [Cov (x,y)] / [SD (x)*SD (y)]
The choice of formula depends on the situation. The following problems are solved using
different formulas. Note that, irrespective of the formula used, the answer remains the
same.
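That the three formulas agree can be checked on a small data set. This is a sketch under stated assumptions: the data are made up, dx and dy are read as deviations from arbitrarily chosen assumed means (5 and 4 here), x and y in the second formula are read as deviations from the arithmetic means, and Cov and SD are computed with divisor n.

```python
import math

# Illustrative data (made up for this sketch).
x = [2, 4, 6, 8, 10]
y = [1, 3, 4, 8, 9]
n = len(x)

# Formula 1: deviations dx, dy from assumed means (5 and 4, chosen arbitrarily).
dx = [xi - 5 for xi in x]
dy = [yi - 4 for yi in y]
num = n * sum(a * b for a, b in zip(dx, dy)) - sum(dx) * sum(dy)
den = (math.sqrt(n * sum(a * a for a in dx) - sum(dx) ** 2)
       * math.sqrt(n * sum(b * b for b in dy) - sum(dy) ** 2))
r1 = num / den

# Formula 2: deviations from the actual arithmetic means.
mean_x = sum(x) / n
mean_y = sum(y) / n
X = [xi - mean_x for xi in x]
Y = [yi - mean_y for yi in y]
r2 = sum(a * b for a, b in zip(X, Y)) / math.sqrt(
    sum(a * a for a in X) * sum(b * b for b in Y))

# Formula 3: covariance divided by the product of standard deviations.
cov = sum(a * b for a, b in zip(X, Y)) / n
sd_x = math.sqrt(sum(a * a for a in X) / n)
sd_y = math.sqrt(sum(b * b for b in Y) / n)
r3 = cov / (sd_x * sd_y)

print(r1, r2, r3)  # all three agree up to rounding error
```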
Problem No 4:
From the following data calculate “n”: correlation coefficient = 0.8; summation of product of
deviations (Σxy) = 60; SD of y = 2.5; summation of x² (Σx²) = 90, where x & y are the deviations
from their arithmetic means.
Answer:
r = [Cov (x,y)] / [SD (x)*SD (y)]
0.8 = [(1/n)*60] / [{√(90/n)}*(2.5)]
Squaring both sides:
0.8*0.8 = [(1/n)*(1/n)*60*60] / [(90/n)*2.5*2.5]
0.8*0.8*2.5*2.5*90 = (1/n)*60*60
360 = 3600/n
n = 10
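The result can be verified with a few lines of code. Squaring the formula gives r² = (Σxy)² / (n * Σx² * SD(y)²), so n = (Σxy)² / (r² * Σx² * SD(y)²). The variable names below are assumptions introduced for this check.

```python
import math

# Data given in the problem.
r = 0.8
sum_xy = 60    # sum of products of deviations
sd_y = 2.5     # standard deviation of y
sum_x2 = 90    # sum of squared deviations of x

# Solve the squared formula for n.
n = sum_xy ** 2 / (r ** 2 * sum_x2 * sd_y ** 2)
n_rounded = round(n)

# Substitute n back into r = Cov(x, y) / (SD(x) * SD(y)) as a check.
r_check = (sum_xy / n_rounded) / (math.sqrt(sum_x2 / n_rounded) * sd_y)

print(n_rounded, r_check)
```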
Problem 5:
A computer, while calculating the correlation coefficient between x & y from 25 pairs of
observations, obtained the following sums: Summation X is 125, Summation X² is 650; Summation Y
is 100, Summation Y² is 460; Summation of XY is 508. Later it was observed that two pairs of
observations were taken as (6, 14) and (8, 6) instead of (8, 12) and (6, 8). Prove that the correct
correlation coefficient is 0.67.
Answer:
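First correct each sum by deducting the wrong values [(6, 14) and (8, 6)] and adding the correct
values [(8, 12) and (6, 8)]:
ΣX = 125 - 6 - 8 + 8 + 6 = 125
ΣX² = 650 - 36 - 64 + 64 + 36 = 650
ΣY = 100 - 14 - 6 + 12 + 8 = 100
ΣY² = 460 - 196 - 36 + 144 + 64 = 436
ΣXY = 508 - 84 - 48 + 96 + 48 = 520
Now apply the corrected sums in the formula:
r = [nΣXY - ΣX.ΣY] / [√{nΣX² - (ΣX)²} * √{nΣY² - (ΣY)²}]
r = [25*520 - 125*100] / [√(25*650 - 125*125) * √(25*436 - 100*100)]
r = 500 / [√625 * √900] = 500 / (25*30) = 2/3 = 0.67 (approx.)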
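The correction can also be carried out mechanically. The sketch below starts from the sums given in the problem, removes the contribution of the wrongly recorded pairs, adds the correct pairs, and then applies the raw-sums formula for r. The variable names are assumptions introduced for this check.

```python
import math

n = 25
# Sums as originally (wrongly) computed.
sum_x, sum_x2 = 125, 650
sum_y, sum_y2 = 100, 460
sum_xy = 508

wrong_pairs = [(6, 14), (8, 6)]
correct_pairs = [(8, 12), (6, 8)]

# Deduct the wrong observations from every sum.
for x, y in wrong_pairs:
    sum_x -= x
    sum_x2 -= x * x
    sum_y -= y
    sum_y2 -= y * y
    sum_xy -= x * y

# Add the correct observations to every sum.
for x, y in correct_pairs:
    sum_x += x
    sum_x2 += x * x
    sum_y += y
    sum_y2 += y * y
    sum_xy += x * y

# r = [n*SXY - SX*SY] / sqrt([n*SX2 - SX^2] * [n*SY2 - SY^2])
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_x2, sum_y, sum_y2, sum_xy, round(r, 2))
```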
Problem 6:
If the relation between two random variables x & y is: 2x+3y=4, then the correlation coefficient
is:
Answer:
-1 (by the property for linearly related variables: here a = 2 and b = 3 have the same sign)
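The sign rule can be confirmed numerically. In this sketch the x values are arbitrary, y is obtained by solving ax + by + c = 0 for y, and the helper names pearson_r and r_for_line are hypothetical. The equation 2x + 3y = 4 corresponds to a = 2, b = 3, c = -4.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_for_line(a, b, c, xs):
    """r between x and y when a*x + b*y + c = 0 (a and b must be non-zero)."""
    ys = [-(a * x + c) / b for x in xs]
    return pearson_r(xs, ys)

xs = [0, 1, 2, 3, 4, 5]  # arbitrary x values

r_same_sign = r_for_line(2, 3, -4, xs)    # 2x + 3y = 4: a and b alike
r_diff_sign = r_for_line(2, -3, -4, xs)   # 2x - 3y = 4: a and b different

print(r_same_sign, r_diff_sign)  # close to -1 and +1 respectively
```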