Stats 2
Transcript

  • 1. Properties of Correlation Coefficient
    Property 1 - Limits for Correlation Coefficient. The Pearsonian correlation coefficient cannot exceed 1 numerically. In other words, it lies between -1 and +1. Symbolically: -1 ≤ r ≤ 1. r = +1 implies perfect positive correlation between the variables.
    Property 2 - The correlation coefficient is independent of the change of origin and scale. Mathematically, if X and Y are transformed to new variables U and V by a change of origin and scale, viz. u = (x - A)/h and v = (y - B)/k, where A, B, h, k are constants with h > 0 and k > 0, then the correlation coefficient between x and y is the same as the correlation coefficient between u and v, i.e., r(x, y) = r(u, v), or rxy = ruv.
    Property 3 - Two independent variables are uncorrelated, but the converse is not true. Remarks: uncorrelation should not be confused with independence. rxy = 0, i.e., uncorrelation between the variables x and y, simply implies the absence of any linear (straight line) relationship between them. They may, however, be related in some other form (other than a straight line), e.g., quadratic (as we have seen in the above example), logarithmic or trigonometric.
    Property 5 - If the variables x and y are connected by a linear equation ax + by + c = 0, the correlation coefficient between x and y is (+1) if the signs of a and b are different and (-1) if the signs of a and b are alike.
    Interpretation of r. The following general points may be borne in mind while interpreting an observed value of the correlation coefficient r:
    If r = -1, there is perfect negative correlation between the variables. In this case the scatter diagram will again be a straight line.
    If r = 0, the variables are uncorrelated; in other words, there is no linear (straight line) relationship between the variables. However, r = 0 does not imply that the variables are independent.
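Properties 2 and 3 can be checked numerically. The following is a minimal Python sketch, not part of the slides: the helper name pearson_r, the sample data and the constants A, h, B, k are all illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the arithmetic means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data (assumed, not taken from the slides).
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

# Property 2: u = (x - A)/h, v = (y - B)/k with h > 0, k > 0 leaves r unchanged.
A, h, B, k = 3, 2, 1, 5
u = [(xi - A) / h for xi in x]
v = [(yi - B) / k for yi in y]
r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)

# Property 3 (converse fails): y = x^2 on symmetric x values is completely
# determined by x, yet the linear correlation is zero.
xq = [-2, -1, 0, 1, 2]
yq = [xi ** 2 for xi in xq]
r_quadratic = pearson_r(xq, yq)

print(r_xy, r_uv, r_quadratic)
```

The two printed values of r for (x, y) and (u, v) agree up to rounding, while the quadratic pair gives r = 0 even though y depends entirely on x.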
  • 2. For other values of r lying between +1 and -1 there are no set guidelines for its interpretation. The most we can conclude is that the nearer the numerical value of r is to 1, the closer is the relationship between the variables, and the nearer the value of r is to 0, the less close is the relationship between them. One should be very careful in interpreting the value of r, as it is often misinterpreted. The reliability or significance of the value of the correlation depends on a number of factors. One way of testing the significance of r is to find its probable error, which, in addition to the value of r, also takes into account the size of the sample. Another, more useful measure for interpreting the value of r is the coefficient of determination. It is observed there that the closeness of the relationship between two variables is not proportional to r.
    In total the Properties are:
    o Limits for Correlation Coefficient.
    o Independent of the change of origin & scale.
    o Two independent variables are uncorrelated but the converse is not true.
    o If the variables x & y are connected by a linear equation ax + by + c = 0, then the correlation coefficient between x & y is (+1) if the signs of a, b are different & (-1) if the signs of a, b are alike.
    Important Formulas:
    r = Σxy / √[Σx² · Σy²], where x and y are deviations from the arithmetic means
    r = [nΣdx·dy - Σdx·Σdy] / √{[nΣdx² - (Σdx)²] · [nΣdy² - (Σdy)²]}, where dx and dy are deviations from assumed means
    r = Cov(x, y) / [SD(x) · SD(y)]
    The choice of formula depends on the situation. Following are some problems which are solved using different formulas. We can notice that, irrespective of the formula, the answer remains the same.
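The two interpretation aids mentioned above can be sketched in Python. The slides do not give either formula, so both are assumptions here: the coefficient of determination is taken as r², and the probable error uses the classical textbook expression P.E.(r) = 0.6745 × (1 - r²)/√n. The function names are illustrative.

```python
import math

def coefficient_of_determination(r):
    """r squared: the share of variation accounted for by the linear relation."""
    return r * r

def probable_error(r, n):
    """Classical textbook formula (assumed): 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)

# Doubling r from 0.4 to 0.8 quadruples r^2, which is one way to see that
# closeness of relationship is not proportional to r.
for r in (0.4, 0.8):
    print(r, coefficient_of_determination(r))

# Probable error for r = 0.8 from a sample of n = 10 pairs.
pe = probable_error(0.8, 10)
print(pe)
```

A larger sample shrinks the probable error for the same r, which is how this measure brings sample size into the interpretation.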
  • 3. Problems 1, 2 and 3 are solved with different formulas for the same data.

    Problem 1: deviations from the arithmetic means (X̄ = 65, Ȳ = 66)

       X     Y   x = X - X̄   y = Y - Ȳ     x²     y²     xy
      39    47      -26         -19        676    361    494
      65    53        0         -13          0    169      0
      62    58       -3          -8          9     64     24
      90    86       25          20        625    400    500
      82    62       17          -4        289     16    -68
      75    68       10           2        100      4     20
      25    60      -40          -6       1600     36    240
      98    91       33          25       1089    625    825
      36    51      -29         -15        841    225    435
      78    84       13          18        169    324    234
     650   660        0           0       5398   2224   2704

    r = Σxy / √[Σx² · Σy²] = 2704 / √[5398 × 2224] = 0.7804

    Problem 2: deviations from assumed means (A = 70 for X, B = 60 for Y)

       X     Y   dx = X - A   dy = Y - B    dx²    dy²   dxdy
      39    47      -31          -13        961    169    403
      65    53       -5           -7         25     49     35
      62    58       -8           -2         64      4     16
      90    86       20           26        400    676    520
      82    62       12            2        144      4     24
      75    68        5            8         25     64     40
      25    60      -45            0       2025      0      0
      98    91       28           31        784    961    868
      36    51      -34           -9       1156     81    306
      78    84        8           24         64    576    192
     650   660      -50           60       5648   2584   2404

    r = [nΣdxdy - Σdx·Σdy] / √{[nΣdx² - (Σdx)²] · [nΣdy² - (Σdy)²]}
      = [10 × 2404 - (-50)(60)] / √{[10 × 5648 - (-50)²] · [10 × 2584 - (60)²]}
      = 0.78

    Problem 3: raw values

       X     Y      X²      Y²      XY
      39    47    1521    2209    1833
      65    53    4225    2809    3445
      62    58    3844    3364    3596
      90    86    8100    7396    7740
      82    62    6724    3844    5084
      75    68    5625    4624    5100
      25    60     625    3600    1500
      98    91    9604    8281    8918
      36    51    1296    2601    1836
      78    84    6084    7056    6552
     650   660   47648   45784   45604

    r = [nΣXY - ΣX·ΣY] / √{[nΣX² - (ΣX)²] · [nΣY² - (ΣY)²]}
      = [10 × 45604 - 650 × 660] / √{[10 × 47648 - (650)²] · [10 × 45784 - (660)²]}
      = 0.7804
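The claim that all three formulas give the same answer for this data can be checked with a short Python sketch. The function names are illustrative assumptions; the data are the X and Y columns from the slide, and the assumed means 70 and 60 are the ones implied by the dx and dy columns.

```python
import math

X = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
Y = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]

def r_mean_deviations(X, Y):
    """r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), deviations taken from the means."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    x = [a - mx for a in X]
    y = [b - my for b in Y]
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

def r_assumed_means(X, Y, A, B):
    """Shortcut formula with deviations dx = X - A, dy = Y - B."""
    n = len(X)
    dx = [a - A for a in X]
    dy = [b - B for b in Y]
    num = n * sum(a * b for a, b in zip(dx, dy)) - sum(dx) * sum(dy)
    den = math.sqrt((n * sum(a * a for a in dx) - sum(dx) ** 2)
                    * (n * sum(b * b for b in dy) - sum(dy) ** 2))
    return num / den

def r_raw_sums(X, Y):
    """Raw-value formula; the same as assumed means with A = B = 0."""
    return r_assumed_means(X, Y, 0, 0)

r1 = r_mean_deviations(X, Y)
r2 = r_assumed_means(X, Y, 70, 60)
r3 = r_raw_sums(X, Y)
print(round(r1, 4), round(r2, 4), round(r3, 4))
```

The three values agree, which is Property 2 at work: shifting the origin of X and Y changes the intermediate sums but not r.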
  • 4. Problem No 4: From the following data calculate "n": correlation coefficient = 0.8; sum of products of deviations, Σxy = 60; SD of y = 2.5; Σx² = 90. Here x & y are the deviations from their arithmetic means.
    Answer:
    r = Cov(x, y) / [SD(x) · SD(y)]
    0.8 = [(1/n)(60)] / [√(90/n) × 2.5]
    Squaring both sides: 0.8 × 0.8 = [(1/n)(1/n) × 60 × 60] / [(90/n) × 2.5 × 2.5]
    0.8 × 0.8 × 2.5 × 2.5 × (90/n) = (1/n)(1/n) × 60 × 60
    360/n = 3600/n²
    n = 10
    Problem 5: A computer, while calculating the correlation coefficient between x & y from 25 pairs of observations, obtained the following sums: ΣX = 125, ΣX² = 650, ΣY = 100, ΣY² = 460, ΣXY = 508. It was later observed that two pairs of observations had been taken as (6, 14) and (8, 6) instead of (8, 12) and (6, 8). Prove that the correct correlation coefficient is 0.67.
    Answer: First correct each sum by deducting the wrong values [(6, 14) and (8, 6)] and adding the correct values [(8, 12) and (6, 8)]:
    ΣX = 125 - 6 - 8 + 8 + 6 = 125
    ΣY = 100 - 14 - 6 + 12 + 8 = 100
    ΣX² = 650 - 36 - 64 + 64 + 36 = 650
    ΣY² = 460 - 196 - 36 + 144 + 64 = 436
    ΣXY = 508 - 84 - 48 + 96 + 48 = 520
    Now apply them in the formula:
    r = [25 × 520 - 125 × 100] / √{[25 × 650 - (125)²] · [25 × 436 - (100)²]} = 500 / √[625 × 900] = 500/750 = 2/3 ≈ 0.67
    Problem 6:
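Both answers can be verified with a few lines of Python. This is a sketch with illustrative variable names; the rearrangement used for Problem 4, n = (Σxy)² / (r² · Σx² · SD(y)²), follows from squaring r = (Σxy/n) / [√(Σx²/n) · SD(y)].

```python
import math

# Problem 4: solve for n from r, sum(xy), SD(y) and sum(x^2).
r, sum_xy, sd_y, sum_x2 = 0.8, 60, 2.5, 90
n4 = sum_xy ** 2 / (r ** 2 * sum_x2 * sd_y ** 2)
print(round(n4))

# Problem 5: correct the sums, then apply the raw-value formula.
n = 25
sx, sx2, sy, sy2, sxy = 125, 650, 100, 460, 508
wrong_pairs = [(6, 14), (8, 6)]
right_pairs = [(8, 12), (6, 8)]

for x, y in wrong_pairs:      # deduct the wrongly recorded observations
    sx, sy = sx - x, sy - y
    sx2, sy2 = sx2 - x * x, sy2 - y * y
    sxy -= x * y
for x, y in right_pairs:      # add the correct observations
    sx, sy = sx + x, sy + y
    sx2, sy2 = sx2 + x * x, sy2 + y * y
    sxy += x * y

r5 = (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
print(round(r5, 2))
```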
  • 5. If the relation between two random variables x & y is 2x + 3y = 4, then the correlation coefficient is:
    Answer: -1 (by Property 5, since the coefficients a = 2 and b = 3 have like signs)
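The sign rule can be confirmed numerically. In the Python sketch below, the helper names and the x values are illustrative assumptions; points are generated exactly on the line ax + by + c = 0 and r is computed from them.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the arithmetic means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_on_line(a, b, c, xs):
    """r for points lying exactly on ax + by + c = 0 (requires b != 0)."""
    ys = [-(a * x + c) / b for x in xs]
    return pearson_r(xs, ys)

xs = [0, 1, 2, 5, 8]                     # arbitrary x values (assumed)
r_like = r_on_line(2, 3, -4, xs)         # 2x + 3y = 4: a and b have like signs
r_unlike = r_on_line(2, -3, -4, xs)      # 2x - 3y = 4: a and b have different signs
print(round(r_like, 6), round(r_unlike, 6))
```

With like signs the line slopes downward, so r is -1; with different signs it slopes upward, so r is +1.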