Correlation analysis

Published in: Education, Technology

  1. Correlation and Regression - Aakriti Agarwal, Roll No. 13004, BMS 1A
  2. Correlation
     • Correlation refers to a statistical relationship between two random variables or two sets of data.
     • The correlation coefficient is denoted by 'r' and ranges from -1 to +1.
     • It indicates the direction and strength of the relationship between the two variables.
  3. Coefficient of Correlation
     The coefficient of correlation can be:
     • perfect negative: r = -1
     • strong negative: -1 < r < 0, with r closer to -1
     • weak negative: -1 < r < 0, with r closer to 0
     • uncorrelated (no linear relationship): r = 0
     • strong positive: 0 < r < 1, with r closer to 1
     • weak positive: 0 < r < 1, with r closer to 0
     • perfect positive: r = 1
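The categories above can be sketched as a small helper. The slide does not say where "closer to 1" ends and "closer to 0" begins, so the `cutoff` of 0.5 below is purely an assumption for illustration, as is the function name `describe_correlation`.

```python
def describe_correlation(r: float, cutoff: float = 0.5) -> str:
    """Label r with the slide's categories.

    cutoff is an assumed boundary between "weak" and "strong";
    the slides only say "closer to 0" or "closer to 1".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no linear correlation"
    sign = "positive" if r > 0 else "negative"
    if abs(r) == 1:
        return f"perfect {sign}"
    strength = "strong" if abs(r) >= cutoff else "weak"
    return f"{strength} {sign}"


print(describe_correlation(0.863399))
print(describe_correlation(-0.2))
```

Any real study would choose its own boundary, or avoid hard labels altogether.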
  4. Methods to Calculate the Correlation Coefficient
     • Karl Pearson
     • Spearman
  5. Karl Pearson
     $$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$
     n = number of pairs of observations
  6. Data for Calculation in MS Excel

     Year   Marketing Expenditure (in Rs. lakhs)   Sales (in unit lakhs)
     2001    8                                      9.1
     2002   10.5                                   10.1
     2003   11                                      9.3
     2004   12                                      9.9
     2005   12.9                                   11.3
     2006   13.5                                   10.9
     2007   11.6                                   11.6
     2008   10.9                                   12.5
     2009   13                                     14
     2010   14                                     14.5
     2011   15.3                                   15
     2012   16                                     15.6
     2013   17                                     16.2

     [Chart: expenditure in lakhs and sales in lakhs plotted against year]
  7. Pearson in MS Excel

     Year   x      y      x_i - x̄        (x_i - x̄)²   y_i - ȳ    (y_i - ȳ)²   (x_i - x̄)(y_i - ȳ)
     2001    8      9.1   -4.746153846   22.52598     -3.20769   10.28929     15.2242
     2002   10.5   10.1   -2.246153846    5.045207    -2.20769    4.873905     4.958817
     2003   11      9.3   -1.746153846    3.049053    -3.00769    9.046213     5.251893
     2004   12      9.9   -0.746153846    0.556746    -2.40769    5.796982     1.796509
     2005   12.9   11.3    0.153846154    0.023669    -1.00769    1.015444    -0.15503
     2006   13.5   10.9    0.753846154    0.568284    -1.40769    1.981598    -1.06118
     2007   11.6   11.6   -1.146153846    1.313669    -0.70769    0.500828     0.811124
     2008   10.9   12.5   -1.846153846    3.408284     0.192308   0.036982    -0.35503
     2009   13     14      0.253846154    0.064438     1.692308   2.863905     0.429586
     2010   14     14.5    1.253846154    1.57213      2.192308   4.806213     2.748817
     2011   15.3   15      2.553846154    6.52213      2.692308   7.248521     6.87574
     2012   16     15.6    3.253846154   10.58751      3.292308  10.83929     10.71266
     2013   17     16.2    4.253846154   18.09521      3.892308  15.15006     16.55728

     x̄ = 12.74615, ȳ = 12.30769
     Σ(x_i - x̄)² = 73.33231
     Σ(y_i - ȳ)² = 74.44923
     Σ(x_i - x̄)(y_i - ȳ) = 63.79538

     Excel formula: r = $H$16/($E$16*$G$16)^0.5, that is, Σ(x_i - x̄)(y_i - ȳ) divided by the square root of Σ(x_i - x̄)² × Σ(y_i - ȳ)²
     r = 0.863399
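The spreadsheet calculation above can be cross-checked outside Excel. The following is a minimal Python sketch using the same 13 pairs of observations; the function names `deviation_sums` and `pearson_r` are chosen here for illustration and do not come from the slides.

```python
from math import sqrt

# Data from the slides: marketing expenditure (x) and sales (y), 2001-2013.
x = [8, 10.5, 11, 12, 12.9, 13.5, 11.6, 10.9, 13, 14, 15.3, 16, 17]
y = [9.1, 10.1, 9.3, 9.9, 11.3, 10.9, 11.6, 12.5, 14, 14.5, 15, 15.6, 16.2]


def deviation_sums(xs, ys):
    """Return (sum of dx^2, sum of dy^2, sum of dx*dy) about the means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxx, syy, sxy


def pearson_r(xs, ys):
    """Pearson coefficient: sxy / sqrt(sxx * syy)."""
    sxx, syy, sxy = deviation_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


sxx, syy, sxy = deviation_sums(x, y)
print(round(sxx, 5), round(syy, 5), round(sxy, 5))
print(round(pearson_r(x, y), 6))
```

The three sums correspond to the column totals in the spreadsheet, so a mismatch in any one of them points to the column where an entry went wrong.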
  8. Spearman
     $$r = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}\sum_{i}\left(m_i^3 - m_i\right)\right]}{n^3 - n}$$
     m_i = number of times a tied observation is repeated
     D = Rank 1 - Rank 2
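As a rough illustration of the tie-corrected formula above, the sketch below assigns tied values the average of the ranks they occupy and adds (m³ - m)/12 for every group of m tied values in either series. The averaging rule and all function names are assumptions made for this example; the slides do not spell them out.

```python
from collections import Counter


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over groups of m equal values."""
    return sum(m ** 3 - m for m in Counter(values).values()) / 12


def spearman_tie_corrected(xs, ys):
    """Spearman coefficient using the tie-corrected formula."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)


# Hypothetical toy data with one tie in the first series.
print(spearman_tie_corrected([10, 20, 20, 30], [1, 2, 3, 4]))
```

When no values are tied, every m equals 1, the correction term vanishes, and the expression reduces to 1 - 6ΣD²/(n³ - n).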
  9. Spearman in MS Excel

     Year   x      y      Rank 1   Rank 2   D (R1 - R2)   D²
     2001    8      9.1    1        1        0             0
     2002   10.5   10.1    2        4       -2             4
     2003   11      9.3    4        2        2             4
     2004   12      9.9    6        3        3             9
     2005   12.9   11.3    7        6        1             1
     2006   13.5   10.9    9        5        4            16
     2007   11.6   11.6    5        7       -2             4
     2008   10.9   12.5    3        8       -5            25
     2009   13     14      8        9       -1             1
     2010   14     14.5   10       10        0             0
     2011   15.3   15     11       11        0             0
     2012   16     15.6   12       12        0             0
     2013   17     16.2   13       13        0             0

     ΣD² = 64, from =SUM(F3:F15)
     Excel formula: =1-(6*F17)/($A$1*($A$1^2-1)), where F17 holds ΣD² and $A$1 holds the number of pairs of observations
     R = 0.824176
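The rank table above can be reproduced with a few lines of Python. This sketch relies on the fact that neither series contains tied values, so a simple rank lookup is enough; the helper name `simple_ranks` is made up for this example.

```python
# Data from the slides: marketing expenditure (x) and sales (y), 2001-2013.
x = [8, 10.5, 11, 12, 12.9, 13.5, 11.6, 10.9, 13, 14, 15.3, 16, 17]
y = [9.1, 10.1, 9.3, 9.9, 11.3, 10.9, 11.6, 12.5, 14, 14.5, 15, 15.6, 16.2]


def simple_ranks(values):
    """Rank from 1 (smallest). Only valid when there are no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


rank_x = simple_ranks(x)
rank_y = simple_ranks(y)
d = [a - b for a, b in zip(rank_x, rank_y)]   # D = Rank 1 - Rank 2
sum_d2 = sum(v * v for v in d)                # sum of D squared
n = len(x)                                    # number of pairs
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_x)
print(rank_y)
print(sum_d2, round(r_s, 6))
```

Because there are no ties, the tie-correction term is zero and the short form 1 - 6ΣD²/(n(n² - 1)) applies directly.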
  10. Are Correlation and Causation the Same?
  11. Correlation ≠ Causation. If it were, these would be true...
  12. Practical Applications
  13. Practical Application
      Correlation is used in:
      • Business
      • Government
      • Education
      • Medicine
      • Agriculture
  14. Business
      • Correlation between marketing expenditure and sales volume (to measure the efficiency of the marketing department).
      • Correlation between the prices of two securities in the stock market.
      • Correlation between the price of a commodity and its supply (or demand).
  15. Government
      • Year-on-year correlation between revenue and expenditure (to forecast revenue based on expenditure).
      • A tool for formulating economic policies by correlating past trends.
      • A yardstick for measuring performance (correlation between planned and actual revenue).
  16. Education Models
      • Forecasting student intake into elementary education (correlation between birth-rate data and enrollment in elementary grades).
      • Forecasting student dropout at different levels of education (intermediate, graduate, postgraduate).
  17. Medicine
      • Identifying the after-effects of interactions between different medicines.
      • Estimating the best treatment where several methods are applicable (correlation between the results of individual treatments and the severity of the disease).
  18. Agriculture
      • Correlation between certain weather conditions and productivity.
      • Correlation between irrigation and productivity.
      • Correlation between price and production, or price and demand, to study the demand and supply pattern of crops in different seasons.
  19. Conclusion
      • Correlation is one of many effective ways of forecasting possible outcomes based on past observations.
      • However, other statistical methods also need to be applied to get a complete picture of the situation.
  20. Thank You
