Correlation by Neeraj Bhandari (Surkhet, Nepal)

  1. 1. CORRELATION Correlation is a statistical measure of the relationship between two variables such that a change in one variable is accompanied by a change in the other; such variables are said to be correlated. Correlation analysis is thus a mathematical tool used to measure the degree to which one variable is linearly related to another.
  2. 2. DIRECT OR POSITIVE CORRELATION If an increase (or decrease) in one variable is accompanied by a corresponding increase (or decrease) in the other, the correlation is said to be direct or positive. INVERSE OR NEGATIVE CORRELATION If an increase (or decrease) in one variable is accompanied by a corresponding decrease (or increase) in the other, the correlation is said to be inverse or negative.
  3. 3. For example, the correlation between income and expenditure is positive, while the correlation between the volume and pressure of a perfect gas is negative.
  4. 4. LINEAR CORRELATION A relation in which the changes in the values of the two variables bear a constant ratio to each other is called linear correlation (or perfect correlation). NON LINEAR CORRELATION A relation in which the changes in the values of the two variables do not bear a constant ratio to each other is called non linear correlation.
  5. 5. Karl Pearson’s Coefficient of Correlation- The correlation coefficient between two variables x and y is denoted by r(x,y) and is a numerical measure of the linear relationship between them.
     $$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{n\,\sigma_x\,\sigma_y}$$
     where r = correlation coefficient between x and y, $\sigma_x$ = standard deviation of x, $\sigma_y$ = standard deviation of y, n = no. of observations, and $\bar{x}$, $\bar{y}$ are the means of x and y.
  6. 6. Properties of coefficient of correlation-
     (i) It is a measure of the degree of correlation.
     (ii) The value of r(x,y) lies between -1 and 1.
     (iii) If r = 1, then the correlation is perfect positive.
     (iv) If r = -1, then the correlation is perfect negative.
     (v) If r = 0, then the variables are uncorrelated, i.e. there is no linear correlation. (Independent variables have r = 0, but r = 0 alone does not guarantee independence.)
  7. 7. (vi) The correlation coefficient is independent of change of origin and scale. If X and Y are random variables and a, b, c, d are any numbers with a ≠ 0, c ≠ 0 and a, c of the same sign, then r(aX+b, cY+d) = r(X,Y). If a and c have opposite signs, only the sign changes: r(aX+b, cY+d) = -r(X,Y).
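  8. 8. Example:- Calculate the correlation coefficient of the following heights (in inches) of fathers (X) and their sons (Y):
     X : 65 66 67 67 68 69 70 72
     Y : 67 68 65 68 72 72 69 71
  9. 9. Solution:

     | X | Y | $X^2$ | $Y^2$ | XY |
     |---|---|---|---|---|
     | 65 | 67 | 4225 | 4489 | 4355 |
     | 66 | 68 | 4356 | 4624 | 4488 |
     | 67 | 65 | 4489 | 4225 | 4355 |
     | 67 | 68 | 4489 | 4624 | 4556 |
     | 68 | 72 | 4624 | 5184 | 4896 |
     | 69 | 72 | 4761 | 5184 | 4968 |
     | 70 | 69 | 4900 | 4761 | 4830 |
     | 72 | 71 | 5184 | 5041 | 5112 |
     | Total = 544 | 552 | 37028 | 38132 | 37560 |

  10. 10. $\bar{X} = \frac{\sum X}{n} = \frac{544}{8} = 68$, $\bar{Y} = \frac{\sum Y}{n} = \frac{552}{8} = 69$
     $$r(X,Y) = \frac{\frac{1}{n}\sum XY - \bar{X}\,\bar{Y}}{\sqrt{\left(\frac{1}{n}\sum X^2 - \bar{X}^2\right)\left(\frac{1}{n}\sum Y^2 - \bar{Y}^2\right)}} = \frac{4695 - 4692}{\sqrt{(4628.5 - 4624)(4766.5 - 4761)}} = \frac{3}{\sqrt{24.75}}$$
     On putting in all the values, we get r = 0.603.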
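  11. 11. SOLUTION: SHORT-CUT METHOD-

     | X | Y | U = X-68 | V = Y-69 | $U^2$ | $V^2$ | UV |
     |---|---|---|---|---|---|---|
     | 65 | 67 | -3 | -2 | 9 | 4 | 6 |
     | 66 | 68 | -2 | -1 | 4 | 1 | 2 |
     | 67 | 65 | -1 | -4 | 1 | 16 | 4 |
     | 67 | 68 | -1 | -1 | 1 | 1 | 1 |
     | 68 | 72 | 0 | 3 | 0 | 9 | 0 |
     | 69 | 72 | 1 | 3 | 1 | 9 | 3 |
     | 70 | 69 | 2 | 0 | 4 | 0 | 0 |
     | 72 | 71 | 4 | 2 | 16 | 4 | 8 |
     | Total | | 0 | 0 | 36 | 44 | 24 |

  12. 12. $\bar{U} = \frac{\sum U}{n} = 0$, $\bar{V} = \frac{\sum V}{n} = 0$
     $$r(U,V) = \frac{\sum UV}{\sqrt{\sum U^2 \, \sum V^2}} = \frac{24}{\sqrt{36 \times 44}}$$
     On putting in all the values we get r(U,V) = 0.603, which equals r(X,Y) because r is unaffected by a change of origin.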
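The two routes above (raw sums and the short-cut method with shifted origins) can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `pearson_r` is a label chosen here for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances in the "mean of products minus product of means" form.
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    return cov / math.sqrt(var_x * var_y)

fathers = [65, 66, 67, 67, 68, 69, 70, 72]
sons = [67, 68, 65, 68, 72, 72, 69, 71]

r_direct = pearson_r(fathers, sons)

# Short-cut method: shift the origin to 68 and 69 before computing r.
u = [x - 68 for x in fathers]
v = [y - 69 for y in sons]
r_shifted = pearson_r(u, v)

print(round(r_direct, 3), round(r_shifted, 3))  # 0.603 0.603
```

Both calls give the same value, which illustrates that r does not depend on the choice of origin.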
  13. 13. RANK CORRELATION- Let $(x_i, y_i)$, i = 1, 2, 3, ..., n, be the ranks of n individuals in the group for the characteristics A and B respectively. The co-efficient of correlation between the ranks is called the rank correlation co-efficient between the characteristics A and B for that group of individuals.
     $$r = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
     where $d_i = x_i - y_i$ denotes the difference in ranks of the ith individual.
  14. 14. EXAMPLE- Compute the rank correlation co-efficient for the following data-
     Person          : A B C D E F G H I J
     Rank in Maths   : 9 10 6 5 7 2 4 8 1 3
     Rank in Physics : 1 2 3 4 5 6 7 8 9 10
  15. 15. Solution:

     | Person | $R_1$ | $R_2$ | $d = R_1 - R_2$ | $d^2$ |
     |---|---|---|---|---|
     | A | 9 | 1 | 8 | 64 |
     | B | 10 | 2 | 8 | 64 |
     | C | 6 | 3 | 3 | 9 |
     | D | 5 | 4 | 1 | 1 |
     | E | 7 | 5 | 2 | 4 |
     | F | 2 | 6 | -4 | 16 |
     | G | 4 | 7 | -3 | 9 |
     | H | 8 | 8 | 0 | 0 |
     | I | 1 | 9 | -8 | 64 |
     | J | 3 | 10 | -7 | 49 |
     | TOTAL | | | | 280 |

  16. 16. $$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 280}{10(100 - 1)} = 1 - 1.697 = -0.697$$
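As a quick check of the rank correlation example above, here is a small Python sketch (not from the slides; `spearman_from_ranks` is a hypothetical helper name) that applies the formula $1 - 6\sum d^2 / (n(n^2-1))$ directly to the two rank lists. It assumes there are no tied ranks.

```python
def spearman_from_ranks(rank_a, rank_b):
    """Rank correlation from two lists of untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

maths = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
physics = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

rho = spearman_from_ranks(maths, physics)
print(round(rho, 3))  # -0.697
```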
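  17. 17. Repeated Ranks
     If a value is repeated m times, each of the tied items is given the average of the ranks they would otherwise occupy, and $\frac{1}{12}m(m^2 - 1)$ is added to $\sum d^2$ for each such group of ties:
     $$r = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}m_1(m_1^2 - 1) + \frac{1}{12}m_2(m_2^2 - 1) + \dots + \frac{1}{12}m_k(m_k^2 - 1)\right]}{n(n^2 - 1)}$$
     Example : Obtain the rank correlation co-efficient for the following data:
     X : 68 64 75 50 64 80 75 40 55 64
     Y : 62 58 68 45 81 60 68 48 50 70
  18. 18. Solution:

     | X | Y | Rank in X | Rank in Y | d = x-y | $d^2$ |
     |---|---|---|---|---|---|
     | 68 | 62 | 4 | 5 | -1 | 1 |
     | 64 | 58 | 6 | 7 | -1 | 1 |
     | 75 | 68 | 2.5 | 3.5 | -1 | 1 |
     | 50 | 45 | 9 | 10 | -1 | 1 |
     | 64 | 81 | 6 | 1 | 5 | 25 |
     | 80 | 60 | 1 | 6 | -5 | 25 |
     | 75 | 68 | 2.5 | 3.5 | -1 | 1 |
     | 40 | 48 | 10 | 9 | 1 | 1 |
     | 55 | 50 | 8 | 8 | 0 | 0 |
     | 64 | 70 | 6 | 2 | 4 | 16 |
     | Total | | | | 0 | 72 |

     In X, 75 occurs twice and 64 occurs three times; in Y, 68 occurs twice. So $m_1 = 2$, $m_2 = 3$, $m_3 = 2$ and
     $$r = 1 - \frac{6\left[72 + \frac{1}{12}\,2(2^2 - 1) + \frac{1}{12}\,3(3^2 - 1) + \frac{1}{12}\,2(2^2 - 1)\right]}{10(10^2 - 1)} = 1 - \frac{6 \times 75}{990} = \frac{6}{11} = 0.545$$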
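The repeated-ranks procedure can be sketched in Python as follows. This is an illustrative sketch rather than the slides' own method of working: the helper names `average_ranks`, `tie_correction` and `spearman_with_ties` are invented here, and ranks are assigned in descending order (largest value gets rank 1), as in the table above.

```python
from collections import Counter

def average_ranks(values):
    """Rank in descending order; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, val in enumerate(ordered, start=1):
        positions.setdefault(val, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value repeated m > 1 times."""
    counts = Counter(values)
    return sum(m * (m * m - 1) / 12 for m in counts.values() if m > 1)

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    corrected = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * corrected / (n * (n * n - 1))

x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]

r_ties = spearman_with_ties(x, y)
print(round(r_ties, 3))  # 0.545
```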
  19. 19. Regression Analysis The term regression means some sort of functional relationship between two or more variables. Regression measures the nature and extent of correlation. Regression is the estimation or prediction of unknown values of one variable from known values of another variable.
  20. 20. CURVE OF REGRESSION AND REGRESSION EQUATION If two variates x and y are correlated, then the scatter diagram will be more or less concentrated round a curve. This curve is called the curve of regression. The mathematical equation of the regression curve is called regression equation.
  21. 21. LINEAR REGRESSION When the points of the scatter diagram concentrate round a straight line, the regression is called linear and this straight line is known as the line of regression.
  22. 22. LINES OF REGRESSION In the case of n pairs (x,y), we can take either x or y as the independent variable and the other as the dependent variable. Either of the two may be estimated for given values of the other. Thus if we want to estimate y for given values of x, we shall have a regression equation of the form y = a + bx, called the regression line of y on x. And if we wish to estimate x from given values of y, we shall have a regression line of the form x = A + By, called the regression line of x on y. Thus, in general, we always have two lines of regression.
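  23. 23. LINE OF REGRESSION OF Y ON X:
     $$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
  24. 24. WHERE $b_{yx}$ IS THE REGRESSION CO-EFFICIENT.
     $$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$
  25. 25. LINE OF REGRESSION OF X ON Y:
     $$x - \bar{x} = b_{xy}\,(y - \bar{y})$$
  26. 26. Where $b_{xy}$ is the regression co-efficient.
     $$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - \left(\sum y\right)^2}$$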
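  27. 27. Theorem :- The correlation co-efficient is the geometric mean of the regression co-efficients. The co-efficients of regression are
     $$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} \quad \text{and} \quad b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$$
     Then geometric mean $= \sqrt{b_{yx}\, b_{xy}} = \sqrt{r\,\frac{\sigma_y}{\sigma_x} \cdot r\,\frac{\sigma_x}{\sigma_y}} = \sqrt{r^2} = |r|$, the magnitude of the co-efficient of correlation. The sign of r is the common sign of $b_{yx}$ and $b_{xy}$.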
  28. 28. EXAMPLE- Find the line of regression of y on x for the data given below:
     X : 1.53 1.78 2.60 2.95 3.42
     Y : 33.50 36.30 40 45.80 53.50
  29. 29. Solution:

     | x | y | $x^2$ | xy |
     |---|---|---|---|
     | 1.53 | 33.50 | 2.3409 | 51.255 |
     | 1.78 | 36.30 | 3.1684 | 64.614 |
     | 2.60 | 40.00 | 6.76 | 104 |
     | 2.95 | 45.80 | 8.7025 | 135.11 |
     | 3.42 | 53.50 | 11.6964 | 182.97 |
     | $\sum x = 12.28$ | $\sum y = 209.1$ | $\sum x^2 = 32.67$ | $\sum xy = 537.95$ |

  30. 30. Here n = 5 and
     $$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = 9.726$$
     Then the line of regression of y on x, $y - \bar{y} = b_{yx}\,(x - \bar{x})$, becomes
     y = 17.932 + 9.726x
     which is the required line of regression of y on x.
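The regression line in this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper name `regression_y_on_x` is invented here, and the last x-value is taken as 3.42, the value used in the solution table and in the total $\sum x = 12.28$.

```python
def regression_y_on_x(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # b_yx
    intercept = sy / n - slope * sx / n                  # a = mean_y - b_yx * mean_x
    return intercept, slope

x = [1.53, 1.78, 2.60, 2.95, 3.42]
y = [33.50, 36.30, 40.00, 45.80, 53.50]

a, b = regression_y_on_x(x, y)
print(f"y = {a:.2f} + {b:.2f}x")
```

Working from unrounded sums, the slope and intercept agree with the slide's 9.726 and 17.932 to about two decimal places; the small differences come from rounding.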
  31. 31. Question: For 10 observations on price (x) and supply (y), the following data were obtained:
     $$\sum x = 130, \quad \sum y = 220, \quad \sum x^2 = 2288, \quad \sum y^2 = 5506, \quad \sum xy = 3467$$
     Obtain the two lines of regression and estimate the supply when the price is 16 units.
  32. 32. Solution: Here n = 10, $\bar{x} = \frac{\sum x}{n} = 13$ and $\bar{y} = \frac{\sum y}{n} = 22$.
     Regression coefficient of y on x:
     $$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{34670 - 28600}{22880 - 16900} = 1.015$$
     The regression line of y on x, $y - \bar{y} = b_{yx}\,(x - \bar{x})$, is
     y = 1.015x + 8.805
  33. 33. Since we are to estimate supply (y) when price (x) is given, we use the regression line of y on x. When x = 16 units, y = 1.015(16) + 8.805 = 25.045.
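The same calculation can be done directly from the summary totals. The Python sketch below is illustrative only; the variable names are chosen here. It also evaluates the regression coefficient of x on y, which the question asks for but the slides do not work out, so that value is the sketch's own output rather than a figure from the source.

```python
n = 10
sum_x, sum_y = 130, 220
sum_x2, sum_y2, sum_xy = 2288, 5506, 3467

mean_x, mean_y = sum_x / n, sum_y / n
numerator = n * sum_xy - sum_x * sum_y

# Regression coefficients of y on x and of x on y.
b_yx = numerator / (n * sum_x2 - sum_x ** 2)
b_xy = numerator / (n * sum_y2 - sum_y ** 2)

# Intercepts, from the fact that both lines pass through (mean_x, mean_y).
a_yx = mean_y - b_yx * mean_x
a_xy = mean_x - b_xy * mean_y

supply_at_16 = a_yx + b_yx * 16

print(f"y on x: y = {b_yx:.3f}x + {a_yx:.3f}")
print(f"x on y: x = {b_xy:.3f}y + {a_xy:.3f}")
print(f"estimated supply at price 16: {supply_at_16:.3f}")
```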
  34. 34. Ques:- From the following data, find the most likely value of y when x = 24: mean (x) = 18.1, mean (y) = 985.8, S.D. (x) = 2, S.D. (y) = 36.4, r = 0.58.
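The slides leave this question unanswered. One way to approach it, sketched below in Python, is to use the regression line of y on x with $b_{yx} = r\,\sigma_y/\sigma_x$; the function name `predict_y` is hypothetical, and the resulting figure is computed here rather than taken from the source.

```python
def predict_y(x, mean_x, mean_y, sd_x, sd_y, r):
    """Estimate y from x using the regression line of y on x."""
    b_yx = r * sd_y / sd_x                 # regression coefficient of y on x
    return mean_y + b_yx * (x - mean_x)    # y - mean_y = b_yx * (x - mean_x)

y_hat = predict_y(24, mean_x=18.1, mean_y=985.8, sd_x=2, sd_y=36.4, r=0.58)
print(round(y_hat, 2))  # 1048.08
```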
  35. 35. Ex. In a partially destroyed laboratory record of an analysis of correlation data, only the following results are legible:
     Variance of x = 9
     Regression equations: 8x - 10y + 66 = 0, 40x - 18y = 214.
     What were (a) the mean values of x and y, (b) the standard deviation of y, and (c) the coefficient of correlation between x and y?
  36. 36. (i) Since both the lines of regression pass through the point $(\bar{x}, \bar{y})$, therefore
     $$8\bar{x} - 10\bar{y} + 66 = 0, \qquad 40\bar{x} - 18\bar{y} - 214 = 0$$
     Solving these equations, $\bar{x} = 13$ and $\bar{y} = 17$.
     (ii) Variance of x $= \sigma_x^2 = 9$, so $\sigma_x = 3$. The equations of the lines of regression can be written as
     $$y = 0.8x + 6.6 \quad \text{and} \quad x = 0.45y + 5.35$$
     so $b_{yx} = 0.8$ and $b_{xy} = 0.45$. Then
     $$r = \sqrt{b_{yx}\, b_{xy}} = \sqrt{0.8 \times 0.45} = \sqrt{0.36} = 0.6$$
     and, since $b_{yx} = r\,\sigma_y/\sigma_x$,
     $$\sigma_y = \frac{b_{yx}\,\sigma_x}{r} = \frac{0.8 \times 3}{0.6} = 4$$
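The steps of this solution can be traced in a short Python sketch. It is not part of the slides: `solve_2x2` is a helper written here, and, as in the worked solution, the first equation is assumed to be the line of y on x and the second the line of x on y.

```python
import math

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# 8x - 10y = -66 and 40x - 18y = 214 intersect at the point of means.
mean_x, mean_y = solve_2x2(8, -10, -66, 40, -18, 214)

# Assumption: 8x - 10y + 66 = 0 is y on x, and 40x - 18y = 214 is x on y.
b_yx = 8 / 10     # from 10y = 8x + 66
b_xy = 18 / 40    # from 40x = 18y + 214

r = math.sqrt(b_yx * b_xy)      # positive root, since both coefficients are positive
sigma_x = math.sqrt(9)          # variance of x is 9
sigma_y = b_yx * sigma_x / r    # from b_yx = r * sigma_y / sigma_x

print(mean_x, mean_y, round(r, 3), round(sigma_y, 3))
```

The assignment of the two equations can be sanity-checked by confirming that the product of the two coefficients does not exceed 1; here it is 0.36.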
  37. 37. Ques. : If the regression co-efficients are 0.8 and 0.2, what would be the value of the co-efficient of correlation?
  38. 38. Ques.: The equations of the two lines of regression obtained in a correlation analysis of 60 observations are 5x = 6y + 24 and 1000y = 768x – 3608.
     (a) What is the co-efficient of correlation?
     (b) What are the mean values of x and y?
     (c) What is the ratio of the variances of x and y?
