
# Correlation continued


Published in: Education, Technology

### Correlation continued

1. ADDITIONAL INFORMATION. Correlation Analysis continued… Chapter 2
2. Examples of Correlation
    - Sugar consumption and level of activity of a person
    - Sales volume versus expenditures
    - Temperature and coffee sales
    - Price and demand
    - Production and plant capacity
    - Outdoor temperature and gas consumption
3. Characteristics of a Relationship
    1. The direction of a relationship
        - a. Positive
        - b. Negative
    2. The form of the relationship
        - a. Linear
        - b. Curved (e.g., mood levels and dosage)
    3. The degree of relationship
        - perfect positive
        - perfect negative
        - high degree of positive/negative correlation
        - low degree of positive/negative correlation
4. Where and Why Correlations are Used?
    1. Prediction, e.g., college admission with NCAE or HS grades; sales and population
    2. Validity, e.g., an employee performance evaluation should have tests on the skills, achievements and company contribution of an employee
    3. Reliability: it produces stable, consistent measurements
        - when reliability is high, the correlation between two
5. Correlation and Causation
    1. There is a direct cause-and-effect relationship between variables.
    2. There is a reverse cause-and-effect relationship between variables.
    3. The relationship between variables may be caused by a third variable.
    4. There may be a complexity of interrelationships among variables.
    5. The relationship may be coincidental.
6. Learning Check!
    1. For each of the following, indicate whether you would expect a positive or negative correlation. Justify.
        - a. Distance sprinted and recovery time
        - b. Sugar consumption and activity level for a group of children
        - c. Daily high temperature and daily energy consumption for 30 days in the summer
        - d. Daily high temperature and daily energy consumption for 30 days in the rainy season
7. (Learning Check, continued)
    2. The data points would be clustered more closely around a straight line for a correlation of -0.80 than for a correlation of +0.05. (True or False?)
    3. If the data points are tightly clustered together around a line that slopes down from left to right, then a good estimate of the correlation would be +0.90. (True or False?)
    4. A correlation can never be greater than +1.00. (True or False?)
8. PROBABLE ERROR AND COEFFICIENT OF CORRELATION
9. Probable Error (PE)

    It is a statistical device which measures the reliability and dependability of the value of the coefficient of correlation.

    $$PE = \frac{2}{3} \times \text{standard error} \quad \text{(or)} \quad PE = 0.6745 \times \text{standard error}$$
10. Standard Error (SE)

    $$SE = \frac{1 - r^2}{\sqrt{n}}, \qquad PE = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$$

    - If the value of r is less than the PE, there is no evidence of correlation.
    - If the value of r is more than six times the PE, the correlation is certain and significant.
    - By adding PE to and subtracting PE from the coefficient of correlation, we can find the upper and lower limits within which the population coefficient of correlation may be expected to lie.
11. Uses of PE
    1. PE is used to determine the limits within which the population coefficient of correlation may be expected to lie.
    2. It can be used to test whether the value of the correlation coefficient of a sample is significant with respect to that of the population.
12. If r = 0.6 and N = 64, find the PE and SE of the correlation coefficient. Also determine the limits of the population correlation coefficient.

    Sol: r = 0.6, N = 64

    $$SE = \frac{1 - r^2}{\sqrt{n}} = \frac{1 - 0.6^2}{\sqrt{64}} = \frac{1 - 0.36}{8} = \frac{0.64}{8} = 0.08$$

    $$PE = 0.6745 \times SE = 0.6745 \times 0.08 = 0.05396$$

    Limits of the population correlation coefficient = r ± PE = 0.6 ± 0.05396 = 0.54604 to 0.65396
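The arithmetic in this example can be checked with a short script. This is a minimal sketch that assumes the slide's formulas SE = (1 - r²)/√n and PE = 0.6745 × SE; the function names are illustrative and not part of the original slides.

```python
import math


def standard_error(r: float, n: int) -> float:
    """Standard error of the correlation coefficient: (1 - r^2) / sqrt(n)."""
    return (1 - r ** 2) / math.sqrt(n)


def probable_error(r: float, n: int) -> float:
    """Probable error: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)


def population_limits(r: float, n: int) -> tuple[float, float]:
    """Lower and upper limits r - PE and r + PE."""
    pe = probable_error(r, n)
    return r - pe, r + pe


# Values from the worked example: r = 0.6, N = 64
se = standard_error(0.6, 64)
pe = probable_error(0.6, 64)
lower, upper = population_limits(0.6, 64)
print(round(se, 5), round(pe, 5), round(lower, 5), round(upper, 5))
```

Here r = 0.6 is more than six times the PE (6 × 0.05396 ≈ 0.32), so by the rule stated on the Standard Error slide the correlation would be treated as significant.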
13. Qn. 2: r and PE have values 0.9 and 0.04 for two series. Find n.

    Sol:

    $$PE = 0.6745 \times \frac{1 - r^2}{\sqrt{n}} = 0.04$$

    $$\frac{1 - 0.9^2}{\sqrt{n}} = \frac{0.04}{0.6745} \;\Rightarrow\; \frac{1 - 0.81}{\sqrt{n}} = 0.0593 \;\Rightarrow\; \frac{0.19}{\sqrt{n}} = 0.0593$$

    $$\sqrt{n} = \frac{0.19}{0.0593} = 3.204, \qquad n = 3.204^2 = 10.266 \approx 10$$
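The same rearrangement can be sketched in code: solving PE = 0.6745 × (1 - r²)/√n for n. The helper name is hypothetical. Because no intermediate value is rounded, the decimals differ slightly from the hand calculation, but n still rounds to 10.

```python
def sample_size_from_pe(r: float, pe: float) -> float:
    """Solve PE = 0.6745 * (1 - r^2) / sqrt(n) for n."""
    root_n = 0.6745 * (1 - r ** 2) / pe  # this is sqrt(n)
    return root_n ** 2


# Values from the question: r = 0.9, PE = 0.04
n_exact = sample_size_from_pe(0.9, 0.04)
print(round(n_exact, 3), round(n_exact))
```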
14. COEFFICIENT OF DETERMINATION
15. Square of Coefficient of Correlation
    - Coefficient of Determination = $r^2$, the ratio of the explained variance to the total variance
    - Coefficient of Non-Determination = $K^2$, where $K^2 = 1 - r^2$
16. Illustrative Example
    - Calculate the coefficient of determination and non-determination if the coefficient of correlation is 0.8.
    - Coefficient of determination = $r^2 = 0.8^2 = 0.64$
    - Coefficient of non-determination = $K^2 = 1 - 0.8^2 = 1 - 0.64 = 0.36$
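To illustrate why r² is described as the ratio of explained variance to total variance, the sketch below fits a least-squares line to a small data set and compares the two quantities. The data and function names are made up for illustration and do not come from the slides.

```python
def sums_of_squares(xs, ys):
    """Return the means and the sums of squares/products about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(xs, ys):
    """Product-moment correlation coefficient."""
    _, _, sxx, syy, sxy = sums_of_squares(xs, ys)
    return sxy / (sxx * syy) ** 0.5


def explained_variance_ratio(xs, ys):
    """Variation of the fitted line about the mean of y, divided by total variation."""
    mean_x, mean_y, sxx, syy, sxy = sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * x for x in xs]
    explained = sum((f - mean_y) ** 2 for f in fitted)
    return explained / syy


# Hypothetical data, chosen only to keep the arithmetic small
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2          # coefficient of determination
k_squared = 1 - r_squared   # coefficient of non-determination
ratio = explained_variance_ratio(xs, ys)
print(round(r_squared, 4), round(ratio, 4), round(k_squared, 4))
```

For a straight-line fit the two numbers agree, which is the sense in which r² measures the share of variation accounted for by the relationship.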
17. - It is the most widely used algebraic method to measure the coefficient of correlation.
    - It gives a numerical value to express the relationship between variables.
    - It gives both the direction and the degree of relationship between variables.
    - It can be used for further algebraic treatment such as the coefficient of determination and non-determination.
    - It gives a single figure to explain the accurate degree of correlation between two variables.
18. - It is very difficult to compute the value of the coefficient of correlation.
    - It is very difficult to understand.
    - It requires a complicated mathematical calculation.
    - It takes more time.
    - It is unduly affected by extreme items.
    - It assumes a linear relationship between the variables, but in real-life situations this may not be so.
19. SPEARMAN’S RANK CORRELATION METHOD
20. This method was developed by Charles Edward Spearman in 1904. It gives the coefficient of correlation obtained from the ranks of the variables.

    Definition: $$R = 1 - \frac{6\sum D^2}{N^3 - N}$$
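21. Qn: Find the rank correlation between poverty and overcrowding from the information given below.

    | Town | A | B | C | D | E | F | G | H | I | J |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Poverty | 17 | 13 | 15 | 16 | 6 | 11 | 14 | 9 | 7 | 12 |
    | Overcrowding | 36 | 46 | 35 | 24 | 12 | 18 | 27 | 22 | 2 | 8 |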
22. Soln.

    $$R = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 44}{10^3 - 10} = 1 - \frac{264}{990} = 1 - 0.2667 = 0.7333$$
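The ranking step is not shown in the solution, so here is a minimal sketch of the whole calculation. It assumes rank 1 goes to the largest value in each series (the opposite convention gives the same R as long as it is used for both series) and that there are no tied values; the function names are illustrative.

```python
def ranks_descending(values):
    """Rank 1 = largest value. Assumes all values are distinct (no ties)."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman_rank_correlation(xs, ys):
    """Return (sum of D^2, R) using R = 1 - 6*sum(D^2) / (N^3 - N)."""
    n = len(xs)
    rank_x, rank_y = ranks_descending(xs), ranks_descending(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return sum_d2, 1 - 6 * sum_d2 / (n ** 3 - n)


# Data for towns A to J from the question
poverty = [17, 13, 15, 16, 6, 11, 14, 9, 7, 12]
overcrowding = [36, 46, 35, 24, 12, 18, 27, 22, 2, 8]

sum_d2, R = spearman_rank_correlation(poverty, overcrowding)
print(sum_d2, round(R, 4))  # 44 0.7333
```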
23. Qn: The following were the ranks given by three judges in a beauty contest. Determine which pair of judges has the nearest approach to common tastes in beauty.

    | Judge 1 | 1 | 6 | 5 | 10 | 3 | 2 | 4 | 9 | 7 | 8 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Judge 2 | 3 | 5 | 8 | 4 | 7 | 10 | 2 | 1 | 6 | 9 |
    | Judge 3 | 6 | 4 | 9 | 8 | 1 | 2 | 3 | 10 | 5 | 7 |
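24. Soln. Using $R = 1 - \frac{6\sum D^2}{N^3 - N}$ with $N^3 - N = 990$:
    - Rank correlation between I & II: $R = 1 - \frac{6 \times 200}{990} = 1 - 1.2121 = -0.2121$
    - Rank correlation between II & III: $R = 1 - \frac{6 \times 214}{990} = 1 - 1.297 = -0.297$
    - Rank correlation between I & III: $R = 1 - \frac{6 \times 60}{990} = 1 - 0.364 = 0.636$

    The coefficient is highest for Judges I and III, so this pair has the nearest approach to common tastes.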
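The three pairwise comparisons can be reproduced in a few lines. This sketch applies R = 1 - 6ΣD²/(N³ - N) directly to the ranks given in the question; the dictionary keys and function name are illustrative.

```python
from itertools import combinations


def rank_correlation(rank_a, rank_b):
    """Spearman's R from two lists of ranks (no ties)."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Ranks from the question
judges = {
    "I": [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "II": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "III": [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

# R for every pair of judges
results = {
    pair: rank_correlation(judges[pair[0]], judges[pair[1]])
    for pair in combinations(judges, 2)
}
best_pair = max(results, key=results.get)

for pair, value in results.items():
    print(pair, round(value, 4))
print("closest tastes:", best_pair)
```

Two of the three coefficients come out negative, since 6ΣD² exceeds N³ - N for those pairs; only Judges I and III show a positive rank correlation.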
25. Qn: The coefficient of rank correlation of the marks obtained by 10 students in statistics & English was 0.2. It was later discovered that the difference in ranks of one of the students was wrongly taken as 7 instead of 9. Find the correct result.

    $$R = 1 - \frac{6\sum D^2}{N^3 - N} = 0.2 \;\Rightarrow\; \frac{6\sum D^2}{990} = 1 - 0.2 = 0.8$$

    $$6\sum D^2 = 990 \times 0.8 = 792, \qquad \sum D^2 = \frac{792}{6} = 132$$

    Corrected $\sum D^2 = 132 - 7^2 + 9^2 = 164$
26. Correct

    $$R = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 164}{10^3 - 10} = 1 - \frac{984}{990} = 1 - 0.9939 = 0.0061$$
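The correction procedure (recover ΣD² from the reported R, swap the wrong squared difference for the right one, then recompute) can be sketched as follows. The function names are hypothetical.

```python
def sum_d2_from_r(r: float, n: int) -> float:
    """Invert R = 1 - 6*sum(D^2)/(N^3 - N) to recover sum(D^2)."""
    return (1 - r) * (n ** 3 - n) / 6


def corrected_rank_correlation(r_wrong, n, wrong_d, right_d):
    """Replace one wrongly recorded rank difference and recompute R."""
    sum_d2 = sum_d2_from_r(r_wrong, n) - wrong_d ** 2 + right_d ** 2
    return sum_d2, 1 - 6 * sum_d2 / (n ** 3 - n)


# Values from the question: R = 0.2, N = 10, difference 7 should have been 9
sum_d2_fixed, r_fixed = corrected_rank_correlation(0.2, 10, wrong_d=7, right_d=9)
print(round(sum_d2_fixed, 2), round(r_fixed, 4))
```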
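27. Qn: The coefficient of rank correlation of the marks obtained by a group of students in statistics & English was 0.8. If the sum of the squares of the differences in ranks is 33, find the number of students in the group.

    $$R = 1 - \frac{6\sum D^2}{N^3 - N} = 0.8 \;\Rightarrow\; 1 - 0.8 = \frac{6 \times 33}{N^3 - N}$$

    $$0.2 \times (N^3 - N) = 198 \;\Rightarrow\; N^3 - N = 990 \;\Rightarrow\; N = 10$$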
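28. Computation of Rank Correlation Coefficient when Ranks are Equal

    $$R = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m) + \cdots\right]}{N^3 - N}$$

    Where
    - D = difference of rank in the two series
    - N = total number of pairs
    - m = number of times each rank repeats (one correction term for each repeated rank)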
29. Qn: Obtain the rank correlation coefficient for the data:

    | X | 68 | 64 | 75 | 50 | 64 | 80 | 75 | 40 | 55 | 64 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Y | 62 | 58 | 68 | 45 | 81 | 60 | 68 | 48 | 50 | 70 |
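30. Ranks and rank differences:

    | x | y | R1 | R2 | D = R1 - R2 (magnitude) | D² |
    |---|---|---|---|---|---|
    | 68 | 62 | 4 | 5 | 1 | 1 |
    | 64 | 58 | 6 | 7 | 1 | 1 |
    | 75 | 68 | 2.5 | 3.5 | 1 | 1 |
    | 50 | 45 | 9 | 10 | 1 | 1 |
    | 64 | 81 | 6 | 1 | 5 | 25 |
    | 80 | 60 | 1 | 6 | 5 | 25 |
    | 75 | 68 | 2.5 | 3.5 | 1 | 1 |
    | 40 | 48 | 10 | 9 | 1 | 1 |
    | 55 | 50 | 8 | 8 | 0 | 0 |
    | 64 | 70 | 6 | 2 | 4 | 16 |
    | | | | | ∑D² | 72 |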
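The table stops at ∑D² = 72, so the sketch below carries the calculation through. It assumes the usual tie-adjusted form R = 1 - 6[∑D² + ∑(m³ - m)/12]/(N³ - N), with one correction term per repeated value, and it assigns tied values the average of the positions they occupy (rank 1 = largest). The function names are illustrative.

```python
from collections import Counter


def average_ranks(values):
    """Rank 1 = largest; tied values share the mean of the positions they occupy."""
    positions = {}
    for pos, v in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of (m^3 - m)/12 over every value that repeats m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def rank_correlation_with_ties(xs, ys):
    """Return (sum of D^2, total tie correction, R)."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(xs) + tie_correction(ys)
    return sum_d2, correction, 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)


# Data from the question
x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]

sum_d2_ties, correction, r_ties = rank_correlation_with_ties(x, y)
print(sum_d2_ties, correction, round(r_ties, 4))
```

In X, 75 occurs twice and 64 three times; in Y, 68 occurs twice. These give correction terms of 0.5, 2 and 0.5, so under this formula R = 1 - 6(72 + 3)/990 ≈ 0.5455.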
31. Merits of Rank Correlation Method
    - It is very simple to understand.
    - It can be applied to any type of data, i.e., quantitative and qualitative.
    - It is the only way of studying correlation between qualitative data such as honesty, beauty, etc.
    - As the sum of rank differences of the two qualitative data is always equal to zero, this
32. Demerits of Rank Correlations
    - The rank correlation coefficient is only an approximate measure, as the actual values are not used for calculations.
    - It is not convenient when the number of pairs (N) is large.
    - Further algebraic treatment is not possible.
    - A combined correlation coefficient of different series cannot be obtained as in the case of mean and standard deviation. In case of mean and standard
33. CONCURRENT DEVIATION METHOD
34. Under this method, we consider only the directions of deviations.
    - If the deviations of two variables are concurrent, they move in the same direction; otherwise they move in opposite directions.

    $$r = \pm\sqrt{\pm\frac{2C - N}{N}}$$

    Where N = number of pairs of signs and C = number of concurrent deviations (i.e., the number of '+' signs in the 'dxdy' column).
35. Steps
    1. Every value of the 'x' series is compared with its preceding value. An increase is shown by a '+' symbol and a decrease by '-'. This gives the 'dx' column.
    2. The above step is repeated for the 'y' series, and we get 'dy'.
    3. Multiply 'dx' by 'dy'; the product is shown in the next column, headed 'dxdy'.
    4. Take the total number of '+' signs in the 'dxdy' column. The '+' signs in the 'dxdy' column denote the concurrent deviations, and their count is indicated by 'C'.
36. Qn: Calculate the coefficient of correlation by the concurrent deviation method:

    | Year | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 |
    |---|---|---|---|---|---|---|---|---|---|
    | Supply | 160 | 164 | 172 | 182 | 166 | 170 | 178 | 192 | 186 |
    | Price | 292 | 280 | 260 | 234 | 266 | 254 | 230 | 190 | 200 |
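No solution slide follows this question, so here is a minimal sketch of the steps applied to the data. It assumes N is the number of sign pairs (one fewer than the number of years), that the sign outside the root matches the sign of (2C - N)/N, and that an unchanged value, which does not occur in this data, would not count as concurrent. The function names are illustrative.

```python
import math


def direction_signs(series):
    """+1 for an increase over the preceding value, -1 for a decrease, 0 for no change."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def concurrent_deviation_r(xs, ys):
    """Return (C, N, r) with r = +/- sqrt(+/- (2C - N) / N)."""
    dx, dy = direction_signs(xs), direction_signs(ys)
    n = len(dx)                                        # number of sign pairs
    c = sum(1 for a, b in zip(dx, dy) if a * b > 0)    # '+' entries in the dxdy column
    ratio = (2 * c - n) / n
    sign = 1 if ratio >= 0 else -1                     # same sign inside and outside the root
    return c, n, sign * math.sqrt(abs(ratio))


# Data from the question, 2003 to 2011
supply = [160, 164, 172, 182, 166, 170, 178, 192, 186]
price = [292, 280, 260, 234, 266, 254, 230, 190, 200]

C, N, r_cd = concurrent_deviation_r(supply, price)
print(C, N, r_cd)
```

Every year-on-year change in supply is matched by a change in price in the opposite direction, so C = 0 out of N = 8 and the method gives r = -1.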
37. Merits of concurrent deviation method:
    1. It is very easy to calculate the coefficient of correlation.
    2. The method is very simple to understand.
    3. When the number of items is very large, this method may be used to form a quick idea about the degree of relationship.
    4. This method is more suitable,
38. Demerits of concurrent deviation method:
    1. This method ignores the magnitude of changes, i.e., equal weight is given to small and big changes.
    2. The result obtained by this method is only a rough indicator of the presence or absence of correlation.
    3. Further algebraic treatment is not possible.
    4. Combined coefficient of concurrent deviation
39. Thank You!!!