
# Module 7

Published in: Technology, Education

### Module 7

#### UNIT 07: CORRELATION

#### Introduction

So far we have studied the characteristics of only one variable, such as weights, prices or sales. This type of study is called univariate analysis. If some relationship exists between two variables and we study it, the statistical analysis of such data is called bivariate analysis.

Determining the existence and extent of the relationship between two phenomena is one of the most important objectives of statistics. Further, the existence of a relationship between two or more variables enables us to predict future values. To carry out our analysis effectively, it often becomes necessary to observe and study the relationship between two measurable variables, such as price and demand, or yield and rainfall. The analysis of the relationship between two such variables is known as correlation.

Correlation, then, is a statistical technique which measures and analyses the degree of relationship between two measurable variables. In other words, the term indicates a relationship between variables in which a change in the value of one variable is accompanied by a change in the value of the other.

#### Definition

According to L. R. Connor, "If two or more quantities vary in sympathy, so that movements in one tend to be accompanied by corresponding movements in the other, then they are said to be correlated."

According to Croxton and Cowden, "When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship and expressing it in a brief formula is known as correlation."

According to A. M. Tuttle, "Correlation is an analysis of the co-variation between two or more variables."

Thus the association of any two variates is known as correlation. Correlation expresses the relationship or interdependence of two sets of variables in such a way that changes in the value of one variable are in sympathy with changes in the other. The coefficient of correlation is the numerical measure of the degree of correlation between two variables.

#### Cause and effect

Correlation is often read as a sign of a cause-and-effect relationship between the two distributions. For example, when we say that there is a relationship between price and demand, we mean that price is the cause and demand is the effect: as price increases, the quantity demanded decreases, and vice versa. It is generally assumed that when two variables are correlated, some cause-and-effect relationship exists between them.

It is possible, however, for two variables to be correlated statistically while being unrelated in practice. For example, there cannot be a cause-and-effect relationship between rainfall and the percentage of passes in an examination, even if the figures happen to be correlated. Such correlation is called spurious correlation, and it arises from chance.

#### Usefulness of correlation

Correlation is useful in the physical and social sciences. Its important uses are as follows.

1. Correlation is very useful to economists in studying the relationship between variables such as price and quantity demanded. It helps businessmen to estimate costs, sales, prices and other related variables.
2. Where variables show some kind of relationship, correlation analysis helps in measuring the degree of that relationship, for example between supply and demand.
3. The relation between variables can be verified and tested for significance with the help of correlation analysis.
4. The coefficient of correlation is a relative measure, so we can compare relationships between variables that are expressed in different units.
5. Sampling error can also be calculated.
6. Correlation is the basis for the concepts of regression and ratio of variation.
#### Types of correlation

Correlation is classified into the following types.

1. Positive and negative
2. Simple and multiple
3. Partial and total
4. Linear and non-linear

**1. Positive and negative.** The direction of variation of the variables determines whether correlation is positive or negative. Correlation is positive when the values of the two variables move in the same direction, so that an increase in one variable is associated with an increase in the other, and a decrease in one with a decrease in the other. Correlation is negative when the values move in opposite directions, so that an increase in one variable is associated with a decrease in the other, and vice versa.

**2. Simple and multiple.** When we study only two variables, the relationship is described as simple correlation. In multiple correlation we study more than two variables simultaneously, for example the relationship between the price, demand and supply of a commodity.

**3. Partial and total.** The study of two variables while excluding some other variables is called partial correlation. For example, we may study price and demand while eliminating the supply side. In total correlation all the factors are taken into account.

**4. Linear and non-linear.** Correlation is linear if the amount of change in one variable tends to bear a constant ratio to the amount of change in the other. If the ratio of change between the two variables is uniform, there is linear correlation between them. Correlation is non-linear if the amount of change in one variable does not bear a constant ratio to the amount of change in the other.

#### Methods of studying correlation

The commonly used methods for studying the correlation between two variables are:

1. Graphic method
   a) Scatter diagram
   b) Simple graph
2. Mathematical method: Karl Pearson's coefficient of correlation

**1. a) Scatter diagram.** This is the simplest way of studying the correlation between two distributions. The given data are plotted on graph paper as dots, with the x variable on the horizontal axis and the y variable on the vertical axis. The scatter of the resulting points shows the type of correlation. The following diagrams illustrate the degree and direction of relationship.

[Figure: three scatter diagrams. Diagram 1: positive correlation. Diagram 2: negative correlation. Diagram 3: no correlation.]

Diagram 1 indicates positive correlation, as the values of the two variables move in the same direction. Diagram 2 indicates negative correlation, as the values of the two variables move in opposite directions. Diagram 3 indicates no correlation.
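The direction that a scatter diagram shows, and the constant-ratio test for linearity described above, can also be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `direction` and `is_linear` and the sample price and demand figures are hypothetical, and the linearity test assumes that consecutive x values differ.

```python
from statistics import mean


def direction(xs, ys):
    """Classify the direction of correlation from the sign of the
    summed products of deviations from the two means."""
    x_bar, y_bar = mean(xs), mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "none"


def is_linear(xs, ys, tol=1e-9):
    """Return True when the change in y bears a constant ratio to the
    change in x between consecutive observations.
    Assumption: consecutive x values are never equal."""
    ratios = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(zip(xs, ys), zip(xs[1:], ys[1:]))
    ]
    return all(abs(r - ratios[0]) <= tol for r in ratios)


# Hypothetical data: demand falls by a fixed amount for each rise in price.
price = [10, 12, 14, 16]
demand = [40, 35, 30, 25]
print(direction(price, demand), is_linear(price, demand))
```

With the hypothetical figures, demand moves opposite to price by a fixed ratio, so the sketch reports a negative, linear relationship. A scatter diagram of the same points would show them falling along a downward-sloping straight line.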
**b) Simple graph.** In this method two curves, one representing the values of x and the other the values of y, are drawn on graph paper. If the two curves run parallel to each other in the same direction, either upward or downward, there is positive correlation. If the two curves run in opposite directions, the correlation is negative.

The above methods give only an approximate idea of the correlation. They do not give its exact size. The numerical value of correlation is obtained by applying the method suggested by Prof. Karl Pearson.

#### Mathematical method: Karl Pearson's coefficient of correlation

Karl Pearson, a great British biometrician and statistician, propounded the formula for calculating the coefficient of correlation. The formula is based on the arithmetic mean and the standard deviation, and it is the most widely used. It indicates whether the correlation is positive or negative. The answer lies between +1 and -1, and zero represents the absence of correlation. The coefficient is denoted by $r$, the symbol for the degree of correlation, and is obtained from the following formulas.

**1. When deviations are taken from the actual mean**

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}}$$

where

- $d_x$ = deviations of the x values from their mean, i.e. $(x - \bar{x})$
- $d_y$ = deviations of the y values from their mean, i.e. $(y - \bar{y})$

**2. When deviations are taken from an assumed mean**

$$r = \frac{\sum d_x d_y \times n - (\sum d_x \times \sum d_y)}{\sqrt{\sum d_x^2 \times n - (\sum d_x)^2} \times \sqrt{\sum d_y^2 \times n - (\sum d_y)^2}}$$

where

- $\sum d_x d_y$ = sum of the products of the deviations of the x and y series
- $\sum d_x$ = sum of the deviations taken from the assumed mean of the x series
- $\sum d_y$ = sum of the deviations taken from the assumed mean of the y series
- $\sum d_x^2$ = sum of the squares of the deviations of the x series
- $\sum d_y^2$ = sum of the squares of the deviations of the y series
- $n$ = number of pairs of observations
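The two formulas above can be sketched in code to show that they give the same coefficient whichever origin is chosen. This is an illustrative sketch only: the function names, the assumed means and the five data pairs are hypothetical and do not come from the notes.

```python
from math import sqrt


def pearson_actual_mean(xs, ys):
    """Coefficient of correlation with deviations taken from the actual means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / sqrt(sum_dx2 * sum_dy2)


def pearson_assumed_mean(xs, ys, assumed_x, assumed_y):
    """Coefficient of correlation with deviations taken from assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = sum_dxdy * n - sum_dx * sum_dy
    denominator = sqrt(sum_dx2 * n - sum_dx ** 2) * sqrt(sum_dy2 * n - sum_dy ** 2)
    return numerator / denominator


# Hypothetical data, chosen only to exercise both formulas.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_actual_mean(xs, ys), 4))
print(round(pearson_assumed_mean(xs, ys, assumed_x=2, assumed_y=5), 4))
```

Both calls print the same value (about 0.7746 for this hypothetical data), which reflects the fact that the coefficient does not depend on the origin from which deviations are measured.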
#### Calculation of the coefficient of correlation

Problems on ungrouped data (individual series)

**I. Taking deviations from the actual mean**

**Illustration 01.** Calculate the coefficient of correlation from the following data.

| X | 57 | 59 | 62 | 63 | 64 | 65 | 55 | 58 | 57 |
|---|---|---|---|---|---|---|---|---|---|
| Y | 113 | 117 | 126 | 126 | 130 | 129 | 111 | 116 | 112 |

Solution. Steps:

1. Calculate the actual means of the x and y series.
2. Take the deviations from the actual means and square them.
3. Find the products of the deviations and use the formula

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}}$$
Calculation of the actual means:

$$\bar{x} = \frac{\sum x}{n} = \frac{540}{9} = 60, \qquad \bar{y} = \frac{\sum y}{n} = \frac{1080}{9} = 120$$

| x | dx = x - 60 | dx² | y | dy = y - 120 | dy² | dx·dy |
|---|---|---|---|---|---|---|
| 57 | -3 | 9 | 113 | -7 | 49 | 21 |
| 59 | -1 | 1 | 117 | -3 | 9 | 3 |
| 62 | 2 | 4 | 126 | 6 | 36 | 12 |
| 63 | 3 | 9 | 126 | 6 | 36 | 18 |
| 64 | 4 | 16 | 130 | 10 | 100 | 40 |
| 65 | 5 | 25 | 129 | 9 | 81 | 45 |
| 55 | -5 | 25 | 111 | -9 | 81 | 45 |
| 58 | -2 | 4 | 116 | -4 | 16 | 8 |
| 57 | -3 | 9 | 112 | -8 | 64 | 24 |
| Σx = 540 | | Σdx² = 102 | Σy = 1080 | | Σdy² = 472 | Σdxdy = 216 |

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} = \frac{216}{\sqrt{102 \times 472}} = \frac{216}{\sqrt{48144}} = \frac{216}{219.42} = 0.9844$$

#### Probable error

Probable error is used to find out the reliability or significance of the value of Karl Pearson's coefficient of correlation. According to Horace Secrist, "The probable error of the coefficient of correlation is an amount which if added to or subtracted from the mean correlation coefficient produces amounts within which the chances are even that a coefficient of correlation from a series selected at random will fall."

The formula for calculating the probable error is

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$$

Functions of the probable error:

1. If the value of r is less than the probable error, the value of r is not at all significant.
2. If the value of r is more than six times the probable error, the value of r is significant (r > 6 P.E.).
3. If r is below 0.3, the correlation should not be treated as significant even when the probable error is small.
4. If r is high and the probable error is small, the correlation definitely exists.

Example: given r = 0.9844 and n = 9.

Solution:

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - (0.9844)^2}{\sqrt{9}} = 0.6745 \times \frac{1 - 0.9690}{3} = \frac{0.6745 \times 0.0310}{3} = 0.00697$$

Conclusion: the P.E. is very small and r is far more than six times the P.E., so there is a high degree of positive correlation.

**Illustration 02.** Find Karl Pearson's coefficient of correlation from the following data. Also calculate the probable error.

| Wages in Rs | 100 | 101 | 102 | 102 | 100 | 99 | 97 | 98 | 96 | 95 |
|---|---|---|---|---|---|---|---|---|---|---|
| Cost of living | 98 | 99 | 99 | 97 | 95 | 92 | 95 | 94 | 90 | 91 |
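Solution:

$$\bar{x} = \frac{\sum x}{n} = \frac{990}{10} = 99, \qquad \bar{y} = \frac{\sum y}{n} = \frac{950}{10} = 95$$

| Wages in Rs (x) | dx = x - 99 | dx² | Cost of living (y) | dy = y - 95 | dy² | dx·dy |
|---|---|---|---|---|---|---|
| 100 | 1 | 1 | 98 | 3 | 9 | 3 |
| 101 | 2 | 4 | 99 | 4 | 16 | 8 |
| 102 | 3 | 9 | 99 | 4 | 16 | 12 |
| 102 | 3 | 9 | 97 | 2 | 4 | 6 |
| 100 | 1 | 1 | 95 | 0 | 0 | 0 |
| 99 | 0 | 0 | 92 | -3 | 9 | 0 |
| 97 | -2 | 4 | 95 | 0 | 0 | 0 |
| 98 | -1 | 1 | 94 | -1 | 1 | 1 |
| 96 | -3 | 9 | 90 | -5 | 25 | 15 |
| 95 | -4 | 16 | 91 | -4 | 16 | 16 |
| Σx = 990 | 0 | Σdx² = 54 | Σy = 950 | 0 | Σdy² = 96 | Σdxdy = 61 |

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}} = \frac{61}{\sqrt{54 \times 96}} = \frac{61}{\sqrt{5184}} = \frac{61}{72} = 0.8472$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - (0.8472)^2}{\sqrt{10}} = \frac{0.6745 \times (1 - 0.7177)}{3.1623} = \frac{0.1904}{3.1623} = 0.0602$$

Since r = 0.8472 is more than 6 × P.E. = 0.3613, a high degree of positive correlation is present.

**II. Assumed mean method (deviations taken from an assumed mean)**

**Illustration 03.** Calculate the coefficient of correlation for the following data. Also calculate the probable error.

| Price (x) | 42 | 38 | 42 | 45 | 42 | 44 | 40 | 46 | 44 | 40 |
|---|---|---|---|---|---|---|---|---|---|---|
| Demand (y) | 26 | 40 | 29 | 27 | 30 | 27 | 35 | 25 | 26 | 30 |

Solution. Steps:

1. Select any value as the assumed mean of the x series and of the y series (here 42 and 27).
2. Take the deviations from the assumed means.
3. Square the deviations and find the products of the deviations.
4. Use the formula.

| x | dx = x - 42 | dx² | y | dy = y - 27 | dy² | dx·dy |
|---|---|---|---|---|---|---|
| 42 | 0 | 0 | 26 | -1 | 1 | 0 |
| 38 | -4 | 16 | 40 | 13 | 169 | -52 |
| 42 | 0 | 0 | 29 | 2 | 4 | 0 |
| 45 | 3 | 9 | 27 | 0 | 0 | 0 |
| 42 | 0 | 0 | 30 | 3 | 9 | 0 |
| 44 | 2 | 4 | 27 | 0 | 0 | 0 |
| 40 | -2 | 4 | 35 | 8 | 64 | -16 |
| 46 | 4 | 16 | 25 | -2 | 4 | -8 |
| 44 | 2 | 4 | 26 | -1 | 1 | -2 |
| 40 | -2 | 4 | 30 | 3 | 9 | -6 |
| | Σdx = 3 | Σdx² = 57 | | Σdy = 25 | Σdy² = 261 | Σdxdy = -84 |

$$r = \frac{\sum d_x d_y \times n - (\sum d_x \times \sum d_y)}{\sqrt{\sum d_x^2 \times n - (\sum d_x)^2} \times \sqrt{\sum d_y^2 \times n - (\sum d_y)^2}} = \frac{-84 \times 10 - (3 \times 25)}{\sqrt{57 \times 10 - (3)^2} \times \sqrt{261 \times 10 - (25)^2}} = \frac{-840 - 75}{\sqrt{570 - 9} \times \sqrt{2610 - 625}}$$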
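$$r = \frac{-915}{\sqrt{561} \times \sqrt{1985}} = \frac{-915}{1055.27} = -0.8671$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - (-0.8671)^2}{\sqrt{10}} = \frac{0.6745 \times (1 - 0.7518)}{3.1623} = \frac{0.6745 \times 0.2482}{3.1623} = \frac{0.1674}{3.1623} = 0.0529$$

**Illustration 04.** Calculate the coefficient of correlation between the marks obtained by ten students in accountancy and statistics.

| Student | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Accountancy (x) | 45 | 70 | 65 | 30 | 90 | 40 | 50 | 75 | 85 | 60 |
| Statistics (y) | 35 | 90 | 70 | 40 | 95 | 40 | 60 | 80 | 80 | 50 |

Solution. Steps:

1. Select any value as the assumed mean of the x series and of the y series (here 90 and 95).
2. Take the deviations from the assumed means.
3. Square the deviations and find the products of the deviations.
4. Use the formula.

| Student | x | dx = x - 90 | dx² | y | dy = y - 95 | dy² | dx·dy |
|---|---|---|---|---|---|---|---|
| 1 | 45 | -45 | 2025 | 35 | -60 | 3600 | 2700 |
| 2 | 70 | -20 | 400 | 90 | -5 | 25 | 100 |
| 3 | 65 | -25 | 625 | 70 | -25 | 625 | 625 |
| 4 | 30 | -60 | 3600 | 40 | -55 | 3025 | 3300 |
| 5 | 90 | 0 | 0 | 95 | 0 | 0 | 0 |
| 6 | 40 | -50 | 2500 | 40 | -55 | 3025 | 2750 |
| 7 | 50 | -40 | 1600 | 60 | -35 | 1225 | 1400 |
| 8 | 75 | -15 | 225 | 80 | -15 | 225 | 225 |
| 9 | 85 | -5 | 25 | 80 | -15 | 225 | 75 |
| 10 | 60 | -30 | 900 | 50 | -45 | 2025 | 1350 |
| | | Σdx = -290 | Σdx² = 11900 | | Σdy = -310 | Σdy² = 14000 | Σdxdy = 12525 |

$$r = \frac{\sum d_x d_y \times n - (\sum d_x \times \sum d_y)}{\sqrt{\sum d_x^2 \times n - (\sum d_x)^2} \times \sqrt{\sum d_y^2 \times n - (\sum d_y)^2}} = \frac{12525 \times 10 - (-290 \times -310)}{\sqrt{11900 \times 10 - (-290)^2} \times \sqrt{14000 \times 10 - (-310)^2}}$$

$$r = \frac{125250 - 89900}{\sqrt{119000 - 84100} \times \sqrt{140000 - 96100}} = \frac{35350}{\sqrt{34900} \times \sqrt{43900}} = \frac{35350}{39142.2} = 0.903$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - (0.903)^2}{\sqrt{10}} = \frac{0.6745 \times (1 - 0.8154)}{3.1623} = \frac{0.6745 \times 0.1846}{3.1623} = \frac{0.1245}{3.1623} = 0.0394$$

**Illustration 05.** Calculate Karl Pearson's coefficient of correlation between x and y. Also calculate the P.E.

| X | 58 | 43 | 41 | 39 | 43 | 46 | 43 | 45 | 41 | 47 | 45 | 44 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Y | 11 | 27 | 31 | 42 | 30 | 28 | 28 | 20 | 19 | 20 | 32 | 30 |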
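Solution. Deviations are taken from the assumed means 45 (x) and 27 (y). There are n = 12 pairs of observations.

| x | dx = x - 45 | dx² | y | dy = y - 27 | dy² | dx·dy |
|---|---|---|---|---|---|---|
| 58 | 13 | 169 | 11 | -16 | 256 | -208 |
| 43 | -2 | 4 | 27 | 0 | 0 | 0 |
| 41 | -4 | 16 | 31 | 4 | 16 | -16 |
| 39 | -6 | 36 | 42 | 15 | 225 | -90 |
| 43 | -2 | 4 | 30 | 3 | 9 | -6 |
| 46 | 1 | 1 | 28 | 1 | 1 | 1 |
| 43 | -2 | 4 | 28 | 1 | 1 | -2 |
| 45 | 0 | 0 | 20 | -7 | 49 | 0 |
| 41 | -4 | 16 | 19 | -8 | 64 | 32 |
| 47 | 2 | 4 | 20 | -7 | 49 | -14 |
| 45 | 0 | 0 | 32 | 5 | 25 | 0 |
| 44 | -1 | 1 | 30 | 3 | 9 | -3 |
| | Σdx = -5 | Σdx² = 255 | | Σdy = -6 | Σdy² = 704 | Σdxdy = -306 |

$$r = \frac{\sum d_x d_y \times n - (\sum d_x \times \sum d_y)}{\sqrt{\sum d_x^2 \times n - (\sum d_x)^2} \times \sqrt{\sum d_y^2 \times n - (\sum d_y)^2}} = \frac{-306 \times 12 - (-5 \times -6)}{\sqrt{255 \times 12 - (-5)^2} \times \sqrt{704 \times 12 - (-6)^2}}$$

$$r = \frac{-3672 - 30}{\sqrt{3060 - 25} \times \sqrt{8448 - 36}} = \frac{-3702}{55.09 \times 91.72} = \frac{-3702}{5052.76} = -0.7327$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - (-0.7327)^2}{\sqrt{12}} = \frac{0.6745 \times (1 - 0.5368)}{3.4641} = \frac{0.3124}{3.4641} = 0.0902$$

**Illustration 06.** Calculate Karl Pearson's coefficient of correlation between the ages of husbands and wives. Also calculate the P.E.

| Age of husband (x) | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Age of wife (y) | 17 | 24 | 28 | 32 | 35 | 38 | 42 | 51 | 56 | 60 | 62 |

Solution. Since r is independent of a change of origin and scale, we take dx = (x - 45)/5 and dy = y - 35.

| x | dx = (x - 45)/5 | dx² | y | dy = y - 35 | dy² | dx·dy |
|---|---|---|---|---|---|---|
| 20 | -5 | 25 | 17 | -18 | 324 | 90 |
| 25 | -4 | 16 | 24 | -11 | 121 | 44 |
| 30 | -3 | 9 | 28 | -7 | 49 | 21 |
| 35 | -2 | 4 | 32 | -3 | 9 | 6 |
| 40 | -1 | 1 | 35 | 0 | 0 | 0 |
| 45 | 0 | 0 | 38 | 3 | 9 | 0 |
| 50 | 1 | 1 | 42 | 7 | 49 | 7 |
| 55 | 2 | 4 | 51 | 16 | 256 | 32 |
| 60 | 3 | 9 | 56 | 21 | 441 | 63 |
| 65 | 4 | 16 | 60 | 25 | 625 | 100 |
| 70 | 5 | 25 | 62 | 27 | 729 | 135 |
| n = 11 | Σdx = 0 | Σdx² = 110 | | Σdy = 60 | Σdy² = 2612 | Σdxdy = 498 |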
$$r = \frac{\sum d_x d_y \times n - (\sum d_x \times \sum d_y)}{\sqrt{\sum d_x^2 \times n - (\sum d_x)^2} \times \sqrt{\sum d_y^2 \times n - (\sum d_y)^2}} = \frac{498 \times 11 - (0 \times 60)}{\sqrt{110 \times 11 - (0)^2} \times \sqrt{2612 \times 11 - (60)^2}}$$

$$r = \frac{5478 - 0}{\sqrt{1210 - 0} \times \sqrt{28732 - 3600}} = \frac{5478}{\sqrt{1210 \times 25132}} = \frac{5478}{5514.50} = 0.9934$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - (0.9934)^2}{\sqrt{11}} = \frac{0.6745 \times 0.0132}{3.3166} = 0.0027$$

Since r > 6 P.E., the result is significant.

#### Calculation of the coefficient of correlation in a bivariate frequency distribution

When the number of observations is very large, the data are classified into a two-way frequency distribution. The class intervals for y are placed in the column headings and those for x in the stubs (row headings). The formula for calculating the coefficient of correlation is

$$r = \frac{\sum f d_x d_y \times N - (\sum f d_x \times \sum f d_y)}{\sqrt{\sum f d_x^2 \times N - (\sum f d_x)^2} \times \sqrt{\sum f d_y^2 \times N - (\sum f d_y)^2}}$$

Steps:

1. Find the mid-points of the classes for the x and y variables.
2. Take the step deviations of the x variable (dx) and of the y variable (dy).
3. Multiply dx, dy and the frequency of each cell, and note the figure obtained in the corner of the cell.
4. Add up all the values so calculated to get the total Σfdxdy.
5. Find Σfdx and Σfdx², taking the deviations from the assumed mean.
6. Find Σfdy and Σfdy², taking the deviations from the assumed mean.
7. Write the formula and substitute the values.

**Illustration 07.** Calculate the coefficient of correlation between the marks obtained by a batch of 100 students in accountancy and statistics, as given below.

| Marks in statistics (x) | Accountancy (y): 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | Total |
|---|---|---|---|---|---|---|
| 15-25 | 5 | 9 | 3 | - | - | 17 |
| 25-35 | - | 10 | 25 | 2 | - | 37 |
| 35-45 | - | 1 | 12 | 2 | - | 15 |
| 45-55 | - | - | 4 | 16 | 5 | 25 |
| 55-65 | - | - | - | 4 | 2 | 6 |
| Total | 5 | 20 | 44 | 24 | 7 | 100 |

Solution:
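Take dx = (x - 40)/10 and dy = (y - 45)/10, using the class mid-values (A = 40, C = 10 for x; A = 45, C = 10 for y). The last column gives the row sum of the cell products f·dx·dy.

| x class | Mid x | dx | 20-30 (mid 25, dy = -2) | 30-40 (mid 35, dy = -1) | 40-50 (mid 45, dy = 0) | 50-60 (mid 55, dy = 1) | 60-70 (mid 65, dy = 2) | f | f·dx | f·dx² | f·dx·dy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 15-25 | 20 | -2 | 5 | 9 | 3 | - | - | 17 | -34 | 68 | 38 |
| 25-35 | 30 | -1 | - | 10 | 25 | 2 | - | 37 | -37 | 37 | 8 |
| 35-45 | 40 | 0 | - | 1 | 12 | 2 | - | 15 | 0 | 0 | 0 |
| 45-55 | 50 | 1 | - | - | 4 | 16 | 5 | 25 | 25 | 25 | 26 |
| 55-65 | 60 | 2 | - | - | - | 4 | 2 | 6 | 12 | 24 | 16 |
| Total f | | | 5 | 20 | 44 | 24 | 7 | N = 100 | Σfdx = -34 | Σfdx² = 154 | Σfdxdy = 88 |
| f·dy | | | -10 | -20 | 0 | 24 | 14 | Σfdy = 8 | | | |
| f·dy² | | | 20 | 20 | 0 | 24 | 28 | Σfdy² = 92 | | | |
| f·dx·dy | | | 20 | 28 | 0 | 22 | 18 | Σfdxdy = 88 | | | |

$$r = \frac{\sum f d_x d_y \times N - (\sum f d_x \times \sum f d_y)}{\sqrt{\sum f d_x^2 \times N - (\sum f d_x)^2} \times \sqrt{\sum f d_y^2 \times N - (\sum f d_y)^2}} = \frac{88 \times 100 - (-34 \times 8)}{\sqrt{154 \times 100 - (-34)^2} \times \sqrt{92 \times 100 - (8)^2}}$$

$$r = \frac{8800 + 272}{\sqrt{15400 - 1156} \times \sqrt{9200 - 64}} = \frac{9072}{\sqrt{14244} \times \sqrt{9136}} = \frac{9072}{119.35 \times 95.58} = \frac{9072}{11407.6} = 0.7953$$

**Illustration 08.** Calculate the coefficient of correlation between the ages of husbands and the ages of wives in the following bivariate frequency distribution. Find also its probable error and comment on the result.

| Age of husbands (x) | Age of wives (y): 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | Total |
|---|---|---|---|---|---|---|
| 15-25 | 6 | 3 | - | - | - | 9 |
| 25-35 | 3 | 16 | 10 | - | - | 29 |
| 35-45 | - | 10 | 15 | 7 | - | 32 |
| 45-55 | - | - | 7 | 10 | 4 | 21 |
| 55-65 | - | - | - | 4 | 5 | 9 |
| Total | 9 | 29 | 32 | 21 | 9 | 100 |

Solution: dx = (x - 40)/10, dy = (y - 35)/10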
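Mid-values (MV) are used, with A = 40, C = 10 for x and A = 35, C = 10 for y. The last column gives the row sum of the cell products f·dx·dy.

| x class | MV | dx | 10-20 (MV 15, dy = -2) | 20-30 (MV 25, dy = -1) | 30-40 (MV 35, dy = 0) | 40-50 (MV 45, dy = 1) | 50-60 (MV 55, dy = 2) | f | f·dx | f·dx² | f·dx·dy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 15-25 | 20 | -2 | 6 | 3 | - | - | - | 9 | -18 | 36 | 30 |
| 25-35 | 30 | -1 | 3 | 16 | 10 | - | - | 29 | -29 | 29 | 22 |
| 35-45 | 40 | 0 | - | 10 | 15 | 7 | - | 32 | 0 | 0 | 0 |
| 45-55 | 50 | 1 | - | - | 7 | 10 | 4 | 21 | 21 | 21 | 18 |
| 55-65 | 60 | 2 | - | - | - | 4 | 5 | 9 | 18 | 36 | 28 |
| Total f | | | 9 | 29 | 32 | 21 | 9 | N = 100 | Σfdx = -8 | Σfdx² = 122 | Σfdxdy = 98 |
| f·dy | | | -18 | -29 | 0 | 21 | 18 | Σfdy = -8 | | | |
| f·dy² | | | 36 | 29 | 0 | 21 | 36 | Σfdy² = 122 | | | |
| f·dx·dy | | | 30 | 22 | 0 | 18 | 28 | Σfdxdy = 98 | | | |

$$r = \frac{\sum f d_x d_y \times N - (\sum f d_x \times \sum f d_y)}{\sqrt{\sum f d_x^2 \times N - (\sum f d_x)^2} \times \sqrt{\sum f d_y^2 \times N - (\sum f d_y)^2}} = \frac{98 \times 100 - (-8 \times -8)}{\sqrt{122 \times 100 - (-8)^2} \times \sqrt{122 \times 100 - (-8)^2}}$$

$$r = \frac{9800 - 64}{\sqrt{12200 - 64} \times \sqrt{12200 - 64}} = \frac{9736}{\sqrt{12136 \times 12136}} = \frac{9736}{12136} = 0.8022$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{N}} = 0.6745 \times \frac{1 - (0.8022)^2}{\sqrt{100}} = \frac{0.6745 \times (1 - 0.6435)}{10} = \frac{0.6745 \times 0.3565}{10} = \frac{0.2404}{10} = 0.02404$$

6 × 0.02404 = 0.14424. Since r > 6 P.E., the correlation is significant.

**Illustration 09.** From the following table calculate Karl Pearson's coefficient of correlation between the marks obtained in accountancy and statistics. Also calculate the value of the probable error. (KU BBM)

| Marks in accountancy | Marks in statistics: 50-59 | 60-69 | 70-79 | 80-89 | 90-99 | Total |
|---|---|---|---|---|---|---|
| Below 60 | - | 6 | 7 | 6 | - | 19 |
| 60-64.9 | 5 | 8 | 10 | 4 | 5 | 32 |
| 65-69.9 | 8 | 6 | 8 | 6 | 1 | 29 |
| 70-74.9 | 7 | 12 | 15 | 10 | 5 | 49 |
| 75-79.9 | 10 | 8 | 12 | 3 | 4 | 37 |
| 80-84.9 | 5 | 4 | 13 | 5 | 5 | 32 |
| 85 & above | - | 6 | 10 | 6 | - | 22 |
| Total | 35 | 50 | 75 | 40 | 20 | 220 |

Solution: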
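Let x represent the marks in accountancy and y the marks in statistics. Take dx = (x - 72.45)/5 and dy = (y - 74.5)/10, using the class mid-values (MV). The open-ended classes are treated as 55-59.9 and 85-89.9. The last column gives the row sum of the cell products f·dx·dy.

| x class | MV | dx | 50-59 (MV 54.5, dy = -2) | 60-69 (MV 64.5, dy = -1) | 70-79 (MV 74.5, dy = 0) | 80-89 (MV 84.5, dy = 1) | 90-99 (MV 94.5, dy = 2) | f | f·dx | f·dx² | f·dx·dy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Below 60 | 57.45 | -3 | - | 6 | 7 | 6 | - | 19 | -57 | 171 | 0 |
| 60-64.9 | 62.45 | -2 | 5 | 8 | 10 | 4 | 5 | 32 | -64 | 128 | 8 |
| 65-69.9 | 67.45 | -1 | 8 | 6 | 8 | 6 | 1 | 29 | -29 | 29 | 14 |
| 70-74.9 | 72.45 | 0 | 7 | 12 | 15 | 10 | 5 | 49 | 0 | 0 | 0 |
| 75-79.9 | 77.45 | 1 | 10 | 8 | 12 | 3 | 4 | 37 | 37 | 37 | -17 |
| 80-84.9 | 82.45 | 2 | 5 | 4 | 13 | 5 | 5 | 32 | 64 | 128 | 2 |
| 85-89.9 | 87.45 | 3 | - | 6 | 10 | 6 | - | 22 | 66 | 198 | 0 |
| Total f | | | 35 | 50 | 75 | 40 | 20 | N = 220 | Σfdx = 17 | Σfdx² = 691 | Σfdxdy = 7 |
| f·dy | | | -70 | -50 | 0 | 40 | 40 | Σfdy = -40 | | | |
| f·dy² | | | 140 | 50 | 0 | 40 | 80 | Σfdy² = 310 | | | |
| f·dx·dy | | | -4 | 6 | 0 | -1 | 6 | Σfdxdy = 7 | | | |

$$r = \frac{\sum f d_x d_y \times N - (\sum f d_x \times \sum f d_y)}{\sqrt{\sum f d_x^2 \times N - (\sum f d_x)^2} \times \sqrt{\sum f d_y^2 \times N - (\sum f d_y)^2}} = \frac{7 \times 220 - (17 \times -40)}{\sqrt{691 \times 220 - (17)^2} \times \sqrt{310 \times 220 - (-40)^2}}$$

$$r = \frac{1540 + 680}{\sqrt{152020 - 289} \times \sqrt{68200 - 1600}} = \frac{2220}{\sqrt{151731} \times \sqrt{66600}} = \frac{2220}{389.53 \times 258.07} = \frac{2220}{100525} = 0.02208$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{N}} = 0.6745 \times \frac{1 - (0.02208)^2}{\sqrt{220}} = \frac{0.6745 \times (1 - 0.00049)}{14.83} = \frac{0.6745 \times 0.99951}{14.83} = 0.0455$$

Since r is less than the P.E., the correlation is not significant.

**Illustration 10.** Calculate from the following data:

a. the value of Karl Pearson's coefficient of correlation between salary in Rs and age in years;
b. its probable error, and interpret the result.

| Salary in Rs | Age in years: 25 | 30 | 35 | 40 | 45 | 50 | 55 | Total |
|---|---|---|---|---|---|---|---|---|
| Under 3000 | - | - | 2 | 4 | 2 | 3 | 2 | 13 |
| 3000-4999 | - | - | 5 | 6 | 6 | 2 | 3 | 22 |
| 5000-6999 | 1 | 6 | 8 | 10 | 4 | 1 | 4 | 34 |
| 7000-8999 | 8 | 8 | 6 | 8 | 5 | 5 | 3 | 43 |
| 9000-10999 | 3 | 5 | 6 | 5 | 3 | 4 | - | 26 |
| 11000-12999 | - | 5 | 4 | 3 | - | - | - | 12 |
| Total | 12 | 24 | 31 | 36 | 20 | 15 | 12 | 150 |

Solution: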
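dx = (x - 5999.5)/2000, dy = (y - 40)/5, where x is the salary class mid-value (MV) and y is the age. The class "Under 3000" is treated as 1000-2999. The last column gives the row sum of the cell products f·dx·dy.

| Salary class | MV | dx | 25 (dy = -3) | 30 (dy = -2) | 35 (dy = -1) | 40 (dy = 0) | 45 (dy = 1) | 50 (dy = 2) | 55 (dy = 3) | f | f·dx | f·dx² | f·dx·dy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1000-2999 | 1999.5 | -2 | - | - | 2 | 4 | 2 | 3 | 2 | 13 | -26 | 52 | -24 |
| 3000-4999 | 3999.5 | -1 | - | - | 5 | 6 | 6 | 2 | 3 | 22 | -22 | 22 | -14 |
| 5000-6999 | 5999.5 | 0 | 1 | 6 | 8 | 10 | 4 | 1 | 4 | 34 | 0 | 0 | 0 |
| 7000-8999 | 7999.5 | 1 | 8 | 8 | 6 | 8 | 5 | 5 | 3 | 43 | 43 | 43 | -22 |
| 9000-10999 | 9999.5 | 2 | 3 | 5 | 6 | 5 | 3 | 4 | - | 26 | 52 | 104 | -28 |
| 11000-12999 | 11999.5 | 3 | - | 5 | 4 | 3 | - | - | - | 12 | 36 | 108 | -42 |
| Total f | | | 12 | 24 | 31 | 36 | 20 | 15 | 12 | N = 150 | Σfdx = 83 | Σfdx² = 329 | Σfdxdy = -130 |
| f·dy | | | -36 | -48 | -31 | 0 | 20 | 30 | 36 | Σfdy = -29 | | | |
| f·dy² | | | 108 | 96 | 31 | 0 | 20 | 60 | 108 | Σfdy² = 423 | | | |
| f·dx·dy | | | -42 | -66 | -21 | 0 | 1 | 10 | -12 | Σfdxdy = -130 | | | |

$$r = \frac{\sum f d_x d_y \times N - (\sum f d_x \times \sum f d_y)}{\sqrt{\sum f d_x^2 \times N - (\sum f d_x)^2} \times \sqrt{\sum f d_y^2 \times N - (\sum f d_y)^2}} = \frac{-130 \times 150 - (83 \times -29)}{\sqrt{329 \times 150 - (83)^2} \times \sqrt{423 \times 150 - (-29)^2}}$$

$$r = \frac{-19500 + 2407}{\sqrt{49350 - 6889} \times \sqrt{63450 - 841}} = \frac{-17093}{\sqrt{42461} \times \sqrt{62609}} = \frac{-17093}{206.06 \times 250.22} = \frac{-17093}{51560.1} = -0.3315$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{N}} = 0.6745 \times \frac{1 - (-0.3315)^2}{\sqrt{150}} = \frac{0.6745 \times (1 - 0.1099)}{12.247} = \frac{0.6745 \times 0.8901}{12.247} = 0.0490$$

A negative correlation exists. Since the magnitude of r (0.3315) is more than 6 × P.E. = 0.294, it is significant.

**Illustration 11.** Calculate from the following data the value of Karl Pearson's coefficient of correlation between sales revenue and advertisement expenditure. Also calculate its probable error and interpret the result.

| Sales revenue in lakhs of Rs | Advertisement expenditure in 000 of Rs: 25-30 | 20-25 | 15-20 | 10-15 | 5-10 | Total |
|---|---|---|---|---|---|---|
| Under 125 | - | 2 | 5 | 3 | - | 10 |
| 125-174.9 | 5 | 6 | 10 | 3 | 3 | 27 |
| 175-224.9 | 4 | 4 | 20 | 4 | 4 | 36 |
| 225-274.9 | 6 | 4 | 9 | 2 | 2 | 23 |
| 275 & above | - | 2 | 1 | - | 1 | 4 |
| Total | 15 | 18 | 45 | 12 | 10 | 100 |

Solution: dx = (x - 199.95)/50, dy = (y - 17.5)/5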
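Mid-values (MV) are used, with A = 199.95, C = 50 for x and A = 17.5, C = 5 for y. The open-ended classes are treated as 75-124.9 and 275-324.9. The last column gives the row sum of the cell products f·dx·dy.

| x class | MV | dx | 25-30 (MV 27.5, dy = 2) | 20-25 (MV 22.5, dy = 1) | 15-20 (MV 17.5, dy = 0) | 10-15 (MV 12.5, dy = -1) | 5-10 (MV 7.5, dy = -2) | f | f·dx | f·dx² | f·dx·dy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 75-124.9 | 99.95 | -2 | - | 2 | 5 | 3 | - | 10 | -20 | 40 | 2 |
| 125-174.9 | 149.95 | -1 | 5 | 6 | 10 | 3 | 3 | 27 | -27 | 27 | -7 |
| 175-224.9 | 199.95 | 0 | 4 | 4 | 20 | 4 | 4 | 36 | 0 | 0 | 0 |
| 225-274.9 | 249.95 | 1 | 6 | 4 | 9 | 2 | 2 | 23 | 23 | 23 | 10 |
| 275-324.9 | 299.95 | 2 | - | 2 | 1 | - | 1 | 4 | 8 | 16 | 0 |
| Total f | | | 15 | 18 | 45 | 12 | 10 | N = 100 | Σfdx = -16 | Σfdx² = 106 | Σfdxdy = 5 |
| f·dy | | | 30 | 18 | 0 | -12 | -20 | Σfdy = 16 | | | |
| f·dy² | | | 60 | 18 | 0 | 12 | 40 | Σfdy² = 130 | | | |
| f·dx·dy | | | 2 | -2 | 0 | 7 | -2 | Σfdxdy = 5 | | | |

$$r = \frac{\sum f d_x d_y \times N - (\sum f d_x \times \sum f d_y)}{\sqrt{\sum f d_x^2 \times N - (\sum f d_x)^2} \times \sqrt{\sum f d_y^2 \times N - (\sum f d_y)^2}} = \frac{5 \times 100 - (-16 \times 16)}{\sqrt{106 \times 100 - (-16)^2} \times \sqrt{130 \times 100 - (16)^2}}$$

$$r = \frac{500 + 256}{\sqrt{10600 - 256} \times \sqrt{13000 - 256}} = \frac{756}{\sqrt{10344} \times \sqrt{12744}} = \frac{756}{101.71 \times 112.89} = \frac{756}{11481.5} = 0.0658$$

$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{N}} = 0.6745 \times \frac{1 - (0.0658)^2}{\sqrt{100}} = \frac{0.6745 \times (1 - 0.00433)}{10} = \frac{0.6745 \times 0.99567}{10} = 0.0672$$

Since r is less than the P.E., there is no significant correlation.

#### Terminal questions (5, 10 & 15 marks)

1. What is meant by correlation? What is it intended to measure?
2. What is a scatter diagram? How does it help us in studying correlation?
3. Briefly explain: a. positive and negative correlation; b. linear and non-linear correlation.
4. How do you interpret the value of correlation?
5. What is probable error? State its uses.

#### Practical problems

6. Calculate the value of the coefficient of correlation between price and supply. What is the probable error?

| Price | 8 | 10 | 15 | 17 | 20 | 22 | 24 | 25 |
|---|---|---|---|---|---|---|---|---|
| Supply | 25 | 30 | 32 | 35 | 37 | 40 | 42 | 45 |

[Answer: r = 0.98, P.E. = 0.009]

7. Compute Karl Pearson's coefficient of correlation between per capita national income and per capita consumer expenditure from the data given below.

| Per capita national income | 249 | 251 | 248 | 252 | 258 | 269 | 271 | 272 | 280 | 275 |
|---|---|---|---|---|---|---|---|---|---|---|
| Per capita consumer expenditure | 237 | 238 | 236 | 240 | 245 | 255 | 254 | 252 | 258 | 251 |

[Answer: r = 0.9675, P.E. = 0.01387]

8. Calculate the coefficient of correlation from the following data, and calculate its probable error.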
| X | 30 | 60 | 30 | 66 | 72 | 24 | 18 | 12 | 42 | 06 |
|---|---|---|---|---|---|---|---|---|---|---|
| Y | 06 | 36 | 12 | 48 | 30 | 06 | 24 | 36 | 30 | 12 |

[Answer: r = 0.575, P.E. = 0.14277]

9. Calculate Karl Pearson's coefficient of correlation between advertisement cost and sales as per the data given.

| Advertisement cost in 000 of Rs | 39 | 65 | 62 | 90 | 82 | 75 | 25 | 98 | 36 | 78 |
|---|---|---|---|---|---|---|---|---|---|---|
| Sales in lakhs of Rs | 47 | 53 | 58 | 86 | 62 | 68 | 60 | 91 | 51 | 84 |

[Answer: r = 0.7804, P.E. = 0.08345]

10. Calculate Karl Pearson's coefficient of correlation from the data given.

| X | 368 | 384 | 385 | 361 | 347 | 384 | 395 | 403 | 400 | 385 |
|---|---|---|---|---|---|---|---|---|---|---|
| Y | 22 | 21 | 24 | 20 | 22 | 26 | 26 | 29 | 28 | 27 |

[Answer: r = 0.79]

11. Compute the coefficient of correlation between dividends and prices of securities as given below.

| Security prices in Rs | Annual dividends in Rs: 6-8 | 8-10 | 10-12 | 12-14 | 14-16 | 16-18 | Total |
|---|---|---|---|---|---|---|---|
| 130-140 | - | - | 1 | 3 | 4 | 2 | 10 |
| 120-130 | - | 1 | 3 | 3 | 3 | 1 | 11 |
| 110-120 | - | 2 | 3 | 2 | - | - | 7 |
| 100-110 | - | 2 | 3 | 2 | - | - | 7 |
| 90-100 | 2 | 2 | 1 | 1 | - | - | 6 |
| 80-90 | 3 | 1 | 1 | - | - | - | 5 |
| 70-80 | 2 | 1 | - | - | - | - | 3 |
| Total | 7 | 8 | 11 | 12 | 9 | 3 | 50 |

[Answer: r = 0.71, P.E. = 0.0473]

12. Calculate Karl Pearson's coefficient of correlation from the following bivariate frequency distribution, and also calculate the probable error.

| Age of husbands in years | Age of wives in years: 23-30 | 30-37 | 37-44 | 44-51 | 51-58 | Total |
|---|---|---|---|---|---|---|
| 18-25 | 9 | 3 | - | - | - | 12 |
| 25-32 | - | 20 | 10 | 4 | - | 34 |
| 32-39 | - | - | 12 | 5 | 3 | 20 |
| 39-46 | - | - | 8 | 7 | 5 | 20 |
| 46-53 | - | - | - | 10 | 4 | 14 |
| Total | 9 | 23 | 30 | 26 | 12 | 100 |

[Answer: r = 0.596, P.E. = 0.04349]

13. Calculate Karl Pearson's coefficient of correlation between income and food expenditure. Also calculate the P.E.

| Food expenditure in percentage | Family income in Rs: 200-300 | 300-400 | 400-500 | 500-600 | 600-700 | Total |
|---|---|---|---|---|---|---|
| 10-15 | - | - | - | 3 | 7 | 10 |
| 15-20 | - | 4 | 9 | 4 | 3 | 20 |
| 20-25 | 7 | 6 | 12 | 5 | - | 30 |
| 25-30 | 3 | 10 | 19 | 8 | - | 40 |
| Total | 10 | 20 | 40 | 20 | 10 | 100 |

[Answer: r = -0.44]

14. Calculate the coefficient of correlation from the following data. Also calculate the probable error.

| x \ Y | 44.5-49.5 | 49.5-54.5 | 54.5-59.5 | 59.5-64.5 | 64.5-69.5 | Total |
|---|---|---|---|---|---|---|
| 54.5-59.5 | 3 | 4 | 2 | - | - | 9 |
| 59.5-64.5 | 4 | 8 | 8 | 2 | - | 22 |
| 64.5-69.5 | - | 7 | 12 | 8 | 4 | 31 |
| 69.5-74.5 | - | 3 | 8 | 8 | 5 | 24 |
| 74.5-79.5 | - | - | 3 | 5 | 6 | 14 |
| Total | 7 | 22 | 33 | 23 | 15 | 100 |

[Answer: r = 0.60734, P.E. = 0.04256]
15. The following table gives the number of students having different heights and weights. Find the coefficient of correlation and the probable error.

| Height in inches | Weight in pounds: 80-90 | 90-100 | 100-110 | 110-120 | 120-130 | Total |
|---|---|---|---|---|---|---|
| 50-55 | 1 | 3 | 7 | 5 | 2 | 18 |
| 55-60 | 2 | 4 | 10 | 7 | 4 | 27 |
| 60-65 | 1 | 5 | 12 | 10 | 7 | 35 |
| 65-70 | - | 3 | 8 | 6 | 3 | 20 |
| Total | 4 | 15 | 37 | 28 | 16 | 100 |

[Answer: r = 0.0945, P.E. = 0.0668]
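The bivariate frequency procedure and the probable-error formula used in the illustrations above can be sketched in code as a way of checking hand calculations. This is a minimal sketch under stated assumptions: the function names `grouped_pearson` and `probable_error` are hypothetical, blank cells are entered as 0, and the step deviations are supplied by the caller. The data are the cell frequencies of Illustration 07.

```python
from math import sqrt


def grouped_pearson(freq, dx, dy):
    """Coefficient of correlation for a two-way frequency table.

    freq[i][j] is the frequency of the cell in row i (step deviation dx[i])
    and column j (step deviation dy[j]). Returns (r, N)."""
    n = sum(sum(row) for row in freq)
    row_totals = [sum(row) for row in freq]
    col_totals = [sum(row[j] for row in freq) for j in range(len(dy))]

    sum_fdx = sum(f * d for f, d in zip(row_totals, dx))
    sum_fdx2 = sum(f * d * d for f, d in zip(row_totals, dx))
    sum_fdy = sum(f * d for f, d in zip(col_totals, dy))
    sum_fdy2 = sum(f * d * d for f, d in zip(col_totals, dy))
    # Each cell contributes frequency x dx x dy.
    sum_fdxdy = sum(
        f * dx[i] * dy[j]
        for i, row in enumerate(freq)
        for j, f in enumerate(row)
    )

    numerator = sum_fdxdy * n - sum_fdx * sum_fdy
    denominator = sqrt(sum_fdx2 * n - sum_fdx ** 2) * sqrt(sum_fdy2 * n - sum_fdy ** 2)
    return numerator / denominator, n


def probable_error(r, n):
    """Probable error of r: 0.6745 x (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / sqrt(n)


# Illustration 07: rows are statistics classes, columns are accountancy classes.
freq = [
    [5, 9, 3, 0, 0],
    [0, 10, 25, 2, 0],
    [0, 1, 12, 2, 0],
    [0, 0, 4, 16, 5],
    [0, 0, 0, 4, 2],
]
steps = [-2, -1, 0, 1, 2]

r, n = grouped_pearson(freq, steps, steps)
pe = probable_error(r, n)
print(round(r, 4), round(pe, 4), abs(r) > 6 * pe)
```

For Illustration 07 the sketch reproduces r of about 0.7953 with N = 100. The same two functions can be pointed at the practical problems by entering each table's cell frequencies and step deviations.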