Correlation and
Regression
-Aakriti Agarwal
Roll No. 13004
BMS 1A
Correlation

• Correlation refers to statistical relationships involving two random variables or sets of data
• The co...
Coefficient of Correlation
The coefficient of correlation can be:
perfectly negative
r=-1
strong negative
-1<r<0 and r clo...
Methods to calculate Correlation Coefficient
• Karl Pearson
• Spearman
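Karl Pearson
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
n - number of pairs of observations; $\bar{x}$ and $\bar{y}$ are the means of x and y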
Data for Calculation in MS Excel
Year: 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013
Sales (in Unit Lakhs): 9...
Pearson in MS Excel
Year   x      y      x_i - x̄        (x_i - x̄)^2   y_i - ȳ    (y_i - ȳ)^2   (x_i - x̄)(y_i - ȳ)
2001   8      9.1    -4.746153846   22.52598      -3.20769   10.28929      15.2242
2002   10.5   10.1   -2...
Spearman
$$r = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}\sum_{i}\left(m_i^3 - m_i\right)\right]}{n^3 - n}$$
m_i = number of times the i-th repeated value occurs (one term per group of tied ranks)
D=Rank 1- Rank...
Spearman in MS Excel
Year   x      y      Rank 1   Rank 2   D(R1-R2)   D^2
2001   8      9.1    1        1        0          0
2002   10.5   10.1   2        4        -2         4
20...
Are Correlation and Causation the same?
Correlation ≠ Causation

If it were, these would be true...
Practical Applications
Practical Application
Correlation is used in:
• Business
• Government
• Education
• Medicine
• Agriculture
Business
• Marketing Expenditure and Sales Volume correlation (to measure the efficiency of marketing department)
• Correl...
Government
• Year on Year Revenue and Expenditure Correlation (to forecast revenue based on expenditure)
• Tool in formula...
Education Models
• Forecasting of student input flows towards elementary education (Correlation between birth rate data and...
Medicine
• Finding out after effects of interactions between different medicines.
• Estimating the best treatment where va...
Agriculture
• Correlation between certain weather conditions and Productivity.
• Correlation between irrigating and Produc...
Conclusion
• Correlation is one of the many effective ways of forecasting and predicting possible outcomes based on past o...
Thank You
Correlation analysis

Published in: Education, Technology
Transcript of "Correlation analysis "

  1. 1. Correlation and Regression -Aakriti Agarwal Roll No. 13004 BMS 1A
  2. 2. Correlation
     • Correlation refers to statistical relationships involving two random variables or sets of data
     • The correlation coefficient is denoted by 'r' and ranges from -1 to +1
     • It indicates the direction and strength of the relationship between two variables
  3. 3. Coefficient of Correlation
     The coefficient of correlation can be:
     • perfectly negative: r = -1
     • strong negative: -1 < r < 0 and r closer to -1
     • weak negative: -1 < r < 0 and r closer to 0
     • no linear correlation: r = 0
     • strong positive: 0 < r < 1 and r closer to 1
     • weak positive: 0 < r < 1 and r closer to 0
     • perfectly positive: r = 1
  4. 4. Methods to calculate Correlation Coefficient
     • Karl Pearson
     • Spearman
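  5. 5. Karl Pearson
     $$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
     n - number of pairs of observations; $\bar{x}$ and $\bar{y}$ are the means of x and y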
  6. 6. Data for Calculation in MS Excel
     Year   Marketing Expenditure (in Rs. Lakhs)   Sales (in Unit Lakhs)
     2001   8                                      9.1
     2002   10.5                                   10.1
     2003   11                                     9.3
     2004   12                                     9.9
     2005   12.9                                   11.3
     2006   13.5                                   10.9
     2007   11.6                                   11.6
     2008   10.9                                   12.5
     2009   13                                     14
     2010   14                                     14.5
     2011   15.3                                   15
     2012   16                                     15.6
     2013   17                                     16.2
     [Chart: Expenditure in Lakhs and Sales in Lakhs (y-axis: Lakhs, 0 to 18) plotted against Year (x-axis: 2000 to 2015)]
  7. 7. Pearson in MS Excel
     Year   x      y      x_i - x̄        (x_i - x̄)^2   y_i - ȳ    (y_i - ȳ)^2   (x_i - x̄)(y_i - ȳ)
     2001   8      9.1    -4.746153846   22.52598      -3.20769   10.28929      15.2242
     2002   10.5   10.1   -2.246153846   5.045207      -2.20769   4.873905      4.958817
     2003   11     9.3    -1.746153846   3.049053      -3.00769   9.046213      5.251893
     2004   12     9.9    -0.746153846   0.556746      -2.40769   5.796982      1.796509
     2005   12.9   11.3   0.153846154    0.023669      -1.00769   1.015444      -0.15503
     2006   13.5   10.9   0.753846154    0.568284      -1.40769   1.981598      -1.06118
     2007   11.6   11.6   -1.146153846   1.313669      -0.70769   0.500828      0.811124
     2008   10.9   12.5   -1.846153846   3.408284      0.192308   0.036982      -0.35503
     2009   13     14     0.253846154    0.064438      1.692308   2.863905      0.429586
     2010   14     14.5   1.253846154    1.57213       2.192308   4.806213      2.748817
     2011   15.3   15     2.553846154    6.52213       2.692308   7.248521      6.87574
     2012   16     15.6   3.253846154    10.58751      3.292308   10.83929      10.71266
     2013   17     16.2   4.253846154    18.09521      3.892308   15.15006      16.55728
     Mean: x̄ = 12.74615, ȳ = 12.30769
     Sums: Σ(x_i - x̄)^2 = 73.33231, Σ(y_i - ȳ)^2 = 74.44923, Σ(x_i - x̄)(y_i - ȳ) = 63.79538
     r=$H$16/($E$16*$G$16)^0.5, that is, Σ(x_i - x̄)(y_i - ȳ) divided by the square root of Σ(x_i - x̄)^2 × Σ(y_i - ȳ)^2
     R = 0.863399
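The spreadsheet calculation on this slide can also be reproduced outside Excel. The following is a minimal Python sketch, not part of the original deck: the helper name `pearson_r` is hypothetical, and the data lists are copied from slide 6 (x = marketing expenditure, y = sales). It builds the same three totals as the spreadsheet and combines them the same way.

```python
import math

# Data copied from slide 6: x = marketing expenditure (Rs. lakhs),
# y = sales (unit lakhs), for the years 2001 to 2013.
x = [8, 10.5, 11, 12, 12.9, 13.5, 11.6, 10.9, 13, 14, 15.3, 16, 17]
y = [9.1, 10.1, 9.3, 9.9, 11.3, 10.9, 11.6, 12.5, 14, 14.5, 15, 15.6, 16.2]


def pearson_r(xs, ys):
    """Pearson coefficient from deviation sums (hypothetical helper name)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The three column totals used in the spreadsheet formula.
    sum_dx2 = sum((a - mean_x) ** 2 for a in xs)
    sum_dy2 = sum((b - mean_y) ** 2 for b in ys)
    sum_dxdy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

The result should agree with the slide's R = 0.863399 up to rounding. The sketch does not guard against a constant series, where the denominator would be zero.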
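  8. 8. Spearman
     $$r = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}\sum_{i}\left(m_i^3 - m_i\right)\right]}{n^3 - n}$$
     m_i = number of times the i-th repeated value occurs (one term per group of tied ranks)
     D = Rank 1 - Rank 2
     n = number of pairs of observations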
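A short worked step, added here as a reading aid rather than taken from the slides: when no value is repeated, there are no tied groups, so the correction term contributes nothing (equivalently, m_i = 1 gives m_i^3 - m_i = 0). The Spearman formula then reduces to the form used in the Excel formula on slide 9:

$$r = 1 - \frac{6\sum D^2}{n^3 - n} = 1 - \frac{6\sum D^2}{n\,(n^2 - 1)}$$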
  9. 9. Spearman in MS Excel
     Year   x      y      Rank 1   Rank 2   D(R1-R2)   D^2
     2001   8      9.1    1        1        0          0
     2002   10.5   10.1   2        4        -2         4
     2003   11     9.3    4        2        2          4
     2004   12     9.9    6        3        3          9
     2005   12.9   11.3   7        6        1          1
     2006   13.5   10.9   9        5        4          16
     2007   11.6   11.6   5        7        -2         4
     2008   10.9   12.5   3        8        -5         25
     2009   13     14     8        9        -1         1
     2010   14     14.5   10       10       0          0
     2011   15.3   15     11       11       0          0
     2012   16     15.6   12       12       0          0
     2013   17     16.2   13       13       0          0
     ΣD^2: =SUM(F3:F15) = 64
     r: =1-(6*F17)/($A$1*($A$1^2-1)), where F17 holds ΣD^2 and $A$1 holds the number of pairs of observations
     R = 0.824176
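The ranking and the final formula on this slide can be checked with a few lines of Python. This is a minimal sketch under stated assumptions, not the author's method: the helper names `ranks`, `sum_squared_rank_diffs` and `spearman_r` are hypothetical, rank 1 goes to the smallest value as in the table, and the code only handles data without repeated values (which holds for this data set), so the tie correction is not implemented.

```python
# Data copied from slide 6: x = marketing expenditure, y = sales, 2001 to 2013.
x = [8, 10.5, 11, 12, 12.9, 13.5, 11.6, 10.9, 13, 14, 15.3, 16, 17]
y = [9.1, 10.1, 9.3, 9.9, 11.3, 10.9, 11.6, 12.5, 14, 14.5, 15, 15.6, 16.2]


def ranks(values):
    """Rank 1 = smallest value. Assumes no repeated values."""
    if len(set(values)) != len(values):
        raise ValueError("repeated values need the tie-corrected formula")
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


def sum_squared_rank_diffs(xs, ys):
    """Sum of D^2, where D = Rank 1 - Rank 2 for each pair."""
    return sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))


def spearman_r(xs, ys):
    """Spearman coefficient without tie correction: 1 - 6*sum(D^2) / (n*(n^2 - 1))."""
    n = len(xs)
    return 1 - 6 * sum_squared_rank_diffs(xs, ys) / (n * (n ** 2 - 1))


print(sum_squared_rank_diffs(x, y), round(spearman_r(x, y), 6))
```

For the slide data this should reproduce ΣD^2 = 64 and R = 0.824176. Raising an error on repeated values is a deliberate choice so that the simplified formula is never applied silently to tied data.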
  10. 10. Are Correlation and Causation the same?
  11. 11. Correlation ≠ Causation If it were, these would be true...
  12. 12. Practical Applications
  13. 13. Practical Application
      Correlation is used in:
      • Business
      • Government
      • Education
      • Medicine
      • Agriculture
  14. 14. Business
      • Marketing Expenditure and Sales Volume correlation (to measure the efficiency of the marketing department)
      • Correlation between prices of two securities in the stock market
      • Correlation between the price of a commodity and its supply (or demand)
  15. 15. Government
      • Year on Year Revenue and Expenditure Correlation (to forecast revenue based on expenditure)
      • Tool in formulating various Economic Policies by correlating past trends
      • Yardstick to measure performance (Correlation between Planned and Actual Revenue)
  16. 16. Education Models
      • Forecasting student inflows into elementary education (Correlation between birth rate data and enrollment in elementary grades)
      • Forecasting flows of students who drop out at different levels of education (intermediate, graduate, postgraduate)
  17. 17. Medicine
      • Finding out the after-effects of interactions between different medicines
      • Estimating the best treatment where various methods are applicable (Correlation between individual treatments' results and severity of disease)
  18. 18. Agriculture
      • Correlation between certain weather conditions and Productivity
      • Correlation between irrigation and Productivity
      • Correlation between price and production, or price and demand, to study the demand-supply pattern of crops in different seasons
  19. 19. Conclusion
      • Correlation is one of many effective ways of forecasting and predicting possible outcomes based on past observations.
      • However, other statistical methods also need to be applied to get a complete picture of the situation.
  20. 20. Thank You