# Topic 5 Covariance & Correlation.pptx

education

### Topic 5 Covariance & Correlation.pptx

1. COVARIANCE & CORRELATION
2. How to understand the relation between two variables?
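3. Scatter Diagram. Suppose we have two variables: the age of the rider and the speed of the motorcycle.
    - Age (years): 16, 22, 34, 45, 21, 24, 27, 53, 32, 24, 26, 29
    - Speed (km/hr): 62, 60, 45, 46, 50, 55, 54, 30, 36, 39, 48, 50
4. Scatter diagram of the same data, with each rider plotted as an (Age, Speed) point, e.g. (16, 62) and (22, 60):

    | Age | Speed |
    |---|---|
    | 16 | 62 |
    | 22 | 60 |
    | 34 | 45 |
    | 45 | 46 |
    | 21 | 50 |
    | 24 | 55 |
    | 27 | 54 |
    | 53 | 30 |
    | 32 | 36 |
    | 24 | 39 |
    | 26 | 48 |
    | 29 | 50 |
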
5. Covariance: Test of Relationship
    - Covariance is a measure of how much two random variables vary together.
    - It is similar to variance: variance tells you how a single variable varies, whereas covariance tells you how two variables vary together.
    - A positive covariance between two variables indicates a positive relationship: when one variable deviates from its mean in one direction, the other tends to deviate from its mean in the same direction.
    - A negative covariance indicates a negative relationship: the two variables tend to deviate from their means in opposite directions.
    - The problem with measuring the relationship between two variables by covariance is that it is not a standardized measure: its value depends on the units in which the variables are measured.
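
The point about covariance not being standardized can be illustrated with the age and speed data from the scatter-diagram slide. The following is a minimal sketch, not part of the original deck: `sample_covariance` is a hypothetical helper name, and dividing by n - 1 (the sample convention) is an assumption, since the transcript does not say which convention the slides use.

```python
# Age/speed data from the scatter-diagram slide.
age = [16, 22, 34, 45, 21, 24, 27, 53, 32, 24, 26, 29]
speed_kmh = [62, 60, 45, 46, 50, 55, 54, 30, 36, 39, 48, 50]


def sample_covariance(xs, ys):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


cov_kmh = sample_covariance(age, speed_kmh)

# The same speeds expressed in metres per second (1 km/hr = 1/3.6 m/s).
speed_ms = [s / 3.6 for s in speed_kmh]
cov_ms = sample_covariance(age, speed_ms)

print(f"cov(age, speed in km/hr) = {cov_kmh:.2f}")
print(f"cov(age, speed in m/s)   = {cov_ms:.2f}")
```

Both values are negative (older riders tend to ride more slowly), but the second is smaller in magnitude by a factor of 3.6 even though the underlying relationship is identical. This unit dependence is what the correlation coefficient removes.
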
6. Covariance & Correlation
    - Covariance is a measure of how much two random variables vary together around their means.
    - Correlation is the degree of inter-relatedness between two or more variables.
    - Correlation analysis is the process of finding the degree of relationship between two or more variables by applying statistical tools and techniques.
    - The term "correlation" was introduced by Francis Galton in the 1880s; Karl Pearson later developed the product-moment correlation coefficient.
    - The coefficient is denoted by r.
    - Its value lies between -1 and +1.
7. Formulas
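
The formulas themselves are not present in this transcript. As an assumption about what the slide showed, the standard textbook definitions of covariance and of the Pearson correlation coefficient for n paired observations are:

```latex
\operatorname{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})

r_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}
        = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
               {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \; \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Dividing by n - 1 instead of n gives the sample covariance; as long as the same divisor is used for the covariance and for both standard deviations, it cancels and r is unchanged.
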
8. Correlation. Correlation only measures the extent of the relationship between variables. Karl Pearson (1857–1936), a British biometrician, developed the formula for the correlation coefficient. The correlation coefficient between two variables X and Y is denoted by r(X,Y) or r_{X,Y}.
9. Coefficient of Correlation
    - The coefficient of correlation gives a mathematical value for measuring the strength of the linear relationship between two variables.
    - A linear relationship (or linear association) is a statistical term used to describe a straight-line relationship between two variables.
    - It can take values from -1 to 1.
    - r = +0.6: positive correlation
    - r = -0.7: negative correlation
    - Correlation is the concept of a linear relationship between two variables, whereas the correlation coefficient is the number that measures that relationship.
10. Assumptions of Pearson Product-Moment Correlation
    - Quantitative measure
    - Linearity: X and Y should be linearly related
    - Absence of outliers: there should be no outliers
    - Normality
    - Minimum of 30 observations
11. Application
    - Relationships between variables such as job satisfaction and turnover intention
    - Market risk and scrip price
    - Satisfaction and purchase intention
    - Yield of rice and rainfall
    - ICT use and learning effectiveness
12. Methods of Studying Correlation
    - Scatter diagram method
    - Karl Pearson coefficient of correlation
    - Spearman's rank correlation coefficient: a non-parametric measure of the correlation between two ordinal variables (rank correlation)
13. Types of Correlation
    - On the basis of degree of correlation: positive correlation, negative correlation
    - On the basis of number of variables: simple correlation, partial correlation, multiple correlation
    - On the basis of linearity: linear correlation, non-linear correlation
14. Correlation on the basis of number of variables
    - Simple correlation: sales and expense; income and consumption
    - Partial correlation: height, weight and age, i.e. the relationship between height and weight after accounting for the effect of the third variable, age
    - Multiple correlation: the effect of rainfall and temperature on the yield of wheat; sales with advertisement and number of salespersons
15. Types of Correlation: On the basis of degree of correlation
16. Linear and Non-Linear Correlation
17. [Figure: Height, Weight and Age, with the value 0.92 shown]
18. Correlation doesn't imply causation. The consumption of ice cream increases during the summer months, so there is a strong correlation between summer temperatures and ice-cream sales. In this particular example there is also a causal relationship, since hot summers do push ice-cream sales up. Ice-cream sales also have a strong correlation with shark attacks, yet shark attacks are clearly not caused by ice cream: both simply rise in summer. So there is no causation here, and we can see that correlation doesn't ALWAYS imply causation!
19. Problem 1) Calculate the Karl Pearson coefficient of correlation for the following data
20. Problem: Variance-Covariance Method. If the covariance between variables X and Y is 12.5 and the variances of X and Y are 16.4 and 13.8 respectively, find the coefficient of correlation between them.
21. Problem 2) If the covariance between variables X and Y is 12.5 and the variances of X and Y are 16.4 and 13.8, find the coefficient of correlation between them.
22. Problem: Find the coefficient of correlation between X and Y by the square of values method / product moment method
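
A worked solution is not included in the transcript. Assuming the usual variance-covariance identity, in which r is the covariance divided by the product of the two standard deviations, the numbers in this problem give:

```latex
r = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}
  = \frac{12.5}{\sqrt{16.4 \times 13.8}}
  = \frac{12.5}{\sqrt{226.32}}
  \approx \frac{12.5}{15.04}
  \approx 0.83
```

This indicates a strong positive linear relationship between X and Y.
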
23. Road Rash. The ages of 12 motorcycle riders and their average speeds while driving are as follows:
    - Age (years): 16, 22, 34, 45, 21, 24, 27, 53, 32, 24, 26, 29
    - Speed (km/hr): 62, 60, 45, 46, 50, 55, 54, 30, 36, 39, 48, 50

    The correlation coefficient is -0.7281.
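
The reported coefficient can be checked directly from the data. This is a minimal sketch using the standard deviation-from-the-mean form of the Pearson formula; `pearson_r` is a hypothetical helper name, not something from the slides.

```python
import math

age = [16, 22, 34, 45, 21, 24, 27, 53, 32, 24, 26, 29]
speed = [62, 60, 45, 46, 50, 55, 54, 30, 36, 39, 48, 50]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(age, speed)
print(f"r = {r:.3f}")
```

The result is about -0.728, in line with the value on the slide: a fairly strong negative relationship, meaning that older riders in this sample tend to ride more slowly.
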
24. Height and Weight

    | Weight (kg) | Height (cm) |
    |---|---|
    | 56 | 149 |
    | 78 | 164 |
    | 66 | 157 |
    | 63 | 155 |
    | 49 | 159 |
    | 48 | 156 |
    | 52 | 169 |
    | 50 | 163 |
    | 72 | 176 |
    | 79 | 177 |
    | 82 | 175 |
    | 48 | 144 |
    | 56 | 164 |
    | 68 | 169 |
    | 59 | 165 |

    The correlation coefficient is 0.6551. Interpretation: the correlation is positive and fairly high, i.e. height and weight are positively related. As the height of an individual increases, weight tends to increase as well, and vice versa.
25. Rank Correlation Coefficient. This correlation formula is used when the relation between two variables is studied in terms of the ranking of each case within each variable. It is mostly used when both variables relate to some attribute, and it is appropriate for data sets that are ordinal in nature. The range is [-1, 1].
26. Spearman's Rank Correlation
    - Spearman's rank-order correlation is the nonparametric version of the Karl Pearson coefficient of correlation.
    - Spearman's correlation coefficient (ρ, also written r_s) measures the strength and direction of association between two ranked variables.
    - This method is also called rank correlation. It works on the rankings of the observed scores of the variables (the variables can be on an ordinal, interval or ratio scale).
    - Ranks are assigned as 1 for the highest value, 2 for the second highest, then 3, 4, …, N.
27. Formula
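28. Relation between two attributes: Beauty and Intelligence. X represents beauty and Y represents intelligence. Are X and Y related?

    | Subject | Score X | Score Y | Rank X | Rank Y |
    |---|---|---|---|---|
    | A | 7.1 | 61 | 7 | 6 |
    | B | 7.4 | 53 | 6 | 7 |
    | C | 7.9 | 76 | 4 | 3 |
    | D | 6.3 | 43 | 9 | 9 |
    | E | 8.3 | 73 | 3 | 4 |
    | F | 9.6 | 77 | 1 | 2 |
    | G | 7.6 | 69 | 5 | 5 |
    | H | 8.8 | 81 | 2 | 1 |
    | I | 5.9 | 47 | 10 | 8 |
    | J | 6.6 | 36 | 8 | 10 |

    $$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

29. Computing the rank differences:

    | Subject | Rank X (RX) | Rank Y (RY) | di = RX - RY | di² |
    |---|---|---|---|---|
    | 1 | 7 | 6 | 1 | 1 |
    | 2 | 6 | 7 | -1 | 1 |
    | 3 | 4 | 3 | 1 | 1 |
    | 4 | 9 | 9 | 0 | 0 |
    | 5 | 3 | 4 | -1 | 1 |
    | 6 | 1 | 2 | -1 | 1 |
    | 7 | 5 | 5 | 0 | 0 |
    | 8 | 2 | 1 | 1 | 1 |
    | 9 | 10 | 8 | 2 | 4 |
    | 10 | 8 | 10 | -2 | 4 |

    Σdi² = 14

    $$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 14}{10\,(10^2 - 1)} = 1 - \frac{84}{990} \approx 0.92$$

    Rank correlation can be interpreted in the same fashion as the Karl Pearson correlation coefficient.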
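
The ranking and the formula can be reproduced from the raw scores. The sketch below assumes no tied scores (true for this data) and ranks the highest score as 1, following the convention stated earlier; `rank_descending` and `spearman_rho` are hypothetical helper names. Ranking the raw scores directly, it gets Σd² = 14 and ρ ≈ 0.92.

```python
# Scores for subjects A..J from the beauty/intelligence slide.
x = [7.1, 7.4, 7.9, 6.3, 8.3, 9.6, 7.6, 8.8, 5.9, 6.6]
y = [61, 53, 76, 43, 73, 77, 69, 81, 47, 36]


def rank_descending(values):
    """Rank 1 = highest value. Assumes there are no tied values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(xs)
    rx = rank_descending(xs)
    ry = rank_descending(ys)
    total_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * total_d2 / (n * (n ** 2 - 1))


rank_x = rank_descending(x)
rank_y = rank_descending(y)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
rho = spearman_rho(x, y)

print("Rank X:", rank_x)
print("Rank Y:", rank_y)
print("Sum of d^2:", sum_d2)
print(f"rho = {rho:.2f}")
```

The shortcut formula is only exact when there are no ties; with tied scores, average ranks and the Pearson formula applied to the ranks are the usual fallback.
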
30. Movie Ranking by 3 Critics. What's the correlation?

    | Year | Movie | TOI Critic | TOI User | IMDB |
    |---|---|---|---|---|
    | 2020 | Angrezi Medium | 3.5 | 3.5 | 7.3 |
    | 2020 | Baaghi 3 | 2.5 | 2.8 | 2 |
    | 2020 | Thappad | 4.5 | 4.1 | 6.9 |
    | 2020 | Shubh Mangal Zyada Saavdhan | 3.5 | 3.4 | 5.9 |
    | 2020 | Bhoot - Part 1 | 2.5 | 3 | 5.5 |
    | 2020 | Love Aaj Kal | 3 | 3.1 | 5 |
    | 2020 | Hacked | 2 | 2.5 | 4.2 |
    | 2020 | Shikara | 3 | 3.2 | 3.3 |
    | 2020 | Malang | 3.5 | 3.4 | 6.5 |
    | 2020 | Jawaani Janeman | 3.5 | 3.4 | 6.7 |
    | 2020 | Panga | 4 | 3.9 | 7 |
    | 2020 | Street Dancer 3D | 3.5 | 3.4 | 3.6 |
    | 2020 | Jai Mummy Di | 2 | 2.5 | 3.5 |
    | 2020 | Tanhaji - The Unsung Warrior | 4 | 4.3 | 7.7 |
    | 2020 | Chhapaak | 3.5 | 3.4 | 5 |
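
The slide leaves this as an exercise. One way to approach it, sketched here under assumptions rather than as the deck's own solution, is Spearman's rank correlation for each pair of rating sources. Because many ratings are tied, the shortcut formula with Σd² is not exact; this sketch instead assigns average ranks to ties and applies the Pearson formula to the ranks. All function names are hypothetical.

```python
import math
from itertools import combinations

# Ratings in the order the movies appear in the table.
ratings = {
    "TOI Critic": [3.5, 2.5, 4.5, 3.5, 2.5, 3, 2, 3, 3.5, 3.5, 4, 3.5, 2, 4, 3.5],
    "TOI User": [3.5, 2.8, 4.1, 3.4, 3, 3.1, 2.5, 3.2, 3.4, 3.4, 3.9, 3.4, 2.5, 4.3, 3.4],
    "IMDB": [7.3, 2, 6.9, 5.9, 5.5, 5, 4.2, 3.3, 6.5, 6.7, 7, 3.6, 3.5, 7.7, 5],
}


def average_ranks(values):
    """Rank 1 = highest; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman's rho as the Pearson correlation of tie-averaged ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


rhos = {}
for a, b in combinations(ratings, 2):
    rhos[(a, b)] = spearman_rho(ratings[a], ratings[b])
    print(f"{a} vs {b}: rho = {rhos[(a, b)]:.3f}")
```

Each pair of sources gets its own coefficient, which can then be read in the same way as any other correlation coefficient between -1 and +1.
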