Correlation & Regression

Published in: Education

  1. Simple Regression/Correlation. Topics: (1) Scatterplot, (2) Simple (Linear) Regression, (3) Simple Non-Linear Regression. Data: country database website with information on the national characteristics of 160 countries, including measures of quality of life of the population (e.g. life expectancy and infant mortality) and measures of wealth (e.g. GNP).
  2. Scatterplot & Simple Linear Regression. Research question: Is there a relationship between population growth rate and level of urbanization (measured as the percent of the population living in urban areas)? This is a question of “whether or not there is,” as well as “how much.” Methods listed on the slide: T-tests and chi-square tests; correlation and regression analysis.
  3. Step 1, Scatterplot: Graphs/Scatter, then choose Simple.
  4. Step 1, Scatterplot: Graphs/Scatter, with popgrow on the Y axis and urb on the X axis.
  5. Step 1, Scatterplot: Graphs/Scatter. [Figure: scatterplot of POPGROW (Y axis) against URB (X axis).] Each dot represents a case/country.
  6. Chart/Options: To make changes to a chart, double-click the chart in the output window. The Chart Editor will appear.
  7. Chart/Options: the least-squares line.
  8. Chart/Axis
  9. Step 1, Scatterplot: Graphs/Scatter. [Figure: scatterplot of GROWTH (Y axis) against URB (X axis) with the least-squares line.] Use File/Print, or Export Chart to export the chart as a graphics file (e.g. .jpg) for Word, or Copy/Paste it into Word as a picture.
  10. Step 1, Scatterplot: Graphs/Scatter. Identify cases using the Point ID tool in the Chart Editor window. Select the button in the menu, then use the pointer to select the point in the upper right corner with high urbanization and a high growth rate (case number 46: United Arab Emirates).
  11. Format/Color or Format/Marker: While in the Chart Editor, click on a feature (e.g. the least-squares line or the dots). You can then change its color (using the button menu above), or right-click and select “properties window” to change the line or marker/dot type.
  12. Step 2, Simple Linear Regression: Analyze/Regression/Linear...
  13. Step 2, Simple Linear Regression: Dependent: popgrow. Independent(s): urb.
  14. Regression Results
      Variables Entered/Removed (dependent variable: POPGROW): Model 1; Variables Entered: URB (all requested variables entered); Variables Removed: none; Method: Enter.
      Model Summary (predictors: (Constant), URB): Model 1; R = .249; R Square = .062; Adjusted R Square = .057; Std. Error of the Estimate = 1.1537.
      r² = 0.06: about 6% of the variation in popgrow can be explained by variation in urb.
  15. Regression Results
      Coefficients (dependent variable: POPGROW):
        (Constant): B = 2.511; Std. Error = .220; t = 11.389; Sig. = .000
        URB: B = -1.25E-02; Std. Error = .004; Beta = -.249; t = -3.357; Sig. = .001
      ANOVA (predictors: (Constant), URB; dependent variable: POPGROW):
        Regression: Sum of Squares = 15.002; df = 1; Mean Square = 15.002; F = 11.271; Sig. = .001
        Residual: Sum of Squares = 226.272; df = 170; Mean Square = 1.331
        Total: Sum of Squares = 241.274; df = 171
      Sig. < 0.05 and |t| > 1.96: reject H0 (no linear relationship between growth and urbanization).
      Confidence interval on the slope for URB, b: -0.0125 ± 1.96 * 0.004 gives -0.0203 ≤ b ≤ -0.0047, which does not contain 0.
      There is a significant linear relationship.
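The interval and F statistic on slide 15 can be rechecked by hand. The sketch below is only an illustration: it uses the rounded values printed in the SPSS tables, assumes the large-sample critical value 1.96 that the slide uses, and all variable names are made up for this example.

```python
# Values as printed on the slide (rounded by SPSS), so results are approximate.
b = -0.0125   # unstandardized slope for URB
se = 0.004    # standard error of the slope
z = 1.96      # large-sample 95% critical value, as used on the slide

# 95% confidence interval: estimate plus or minus z times the standard error.
half_width = z * se
lower, upper = b - half_width, b + half_width
print(f"95% CI for the URB slope: [{lower:.5f}, {upper:.5f}]")
print("interval excludes 0:", not (lower <= 0.0 <= upper))

# ANOVA table: F is the regression mean square over the residual mean square.
ss_reg, df_reg = 15.002, 1
ss_res, df_res = 226.272, 170
f_stat = (ss_reg / df_reg) / (ss_res / df_res)
r_squared = ss_reg / (ss_reg + ss_res)
print(f"F = {f_stat:.3f}, R squared = {r_squared:.3f}")
# In simple regression, F equals the square of the slope's t statistic.
```

Because the printed standard error is rounded to three decimals, the interval endpoints are only accurate to about three decimals as well.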
  16. Analysis of Regression Results: There is a weak negative relationship between POPGROW and URB. Only 6% of the variation in growth rate is explained by variation in the level of urbanization (r² = 0.062). The linear model (refer to B in the Coefficients table): POPGROW = 2.511 - 0.0125 * URB. The predicted growth rate for the US (URB = 77.2%) is about 1.55% (2.511 - 0.0125 * 77.2), compared with the actual rate of 1%.
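The prediction on slide 16 is a direct substitution into the fitted line. A minimal sketch, assuming the coefficients printed in the Coefficients table; the function name `predict_popgrow` is hypothetical:

```python
def predict_popgrow(urb_percent, intercept=2.511, slope=-0.0125):
    """Predicted growth rate (%) from the fitted line POPGROW = 2.511 - 0.0125 * URB."""
    return intercept + slope * urb_percent

us_urb = 77.2       # percent urban for the US, from the slide
us_actual = 1.0     # actual growth rate (%) quoted on the slide

us_predicted = predict_popgrow(us_urb)
residual = us_actual - us_predicted   # actual minus predicted

print(f"predicted: {us_predicted:.3f}%, residual: {residual:.3f}")
```

The negative residual says the line overpredicts US growth, which is unsurprising for a model that explains only about 6% of the variation.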
  17. Step 1, Scatterplot, and Step 2, Simple Linear Regression. Research question: Does the average number of live births per female (FERTIL) also explain variation in growth rate?
  18. Step 1, Scatterplot: Graphs/Scatter, with popgrow on the Y axis and fertil on the X axis.
  19. Step 1, Scatterplot: Graphs/Scatter. [Figure: scatterplot of POPGROW (Y axis) against FERTIL (X axis).]
  20. Step 2, Simple Linear Regression: Dependent: popgrow. Independent(s): fertil.
  21. Regression Results
      Variables Entered/Removed (dependent variable: POPGROW): Model 1; Variables Entered: FERTIL (all requested variables entered); Variables Removed: none; Method: Enter.
      Model Summary (predictors: (Constant), FERTIL): Model 1; R = .740; R Square = .547; Adjusted R Square = .544; Std. Error of the Estimate = .7892.
      r² = 0.547: about 55% of the variation in popgrow can be explained by variation in fertil.
  22. Regression Results
      Coefficients (dependent variable: POPGROW):
        (Constant): B = .245; Std. Error = .132; t = 1.851; Sig. = .066
        FERTIL: B = .469; Std. Error = .033; Beta = .740; t = 14.027; Sig. = .000
      ANOVA (predictors: (Constant), FERTIL; dependent variable: POPGROW):
        Regression: Sum of Squares = 122.537; df = 1; Mean Square = 122.537; F = 196.759; Sig. = .000
        Residual: Sum of Squares = 101.513; df = 163; Mean Square = .623
        Total: Sum of Squares = 224.051; df = 164
      Sig. < 0.05 and t > 1.96: reject H0 (no linear relationship between growth and fertility).
      Confidence interval on the slope for FERTIL, b: 0.469 ± 1.96 * 0.033 gives 0.404 < b < 0.534, which does not contain 0.
      There is a significant linear relationship.
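The Model Summary figures for the FERTIL model follow from the ANOVA sums of squares. The sketch below recomputes them from the values printed on the slides; the variable names are illustrative, and the sample size is inferred as total df plus one.

```python
# ANOVA values as printed for the FERTIL model (rounded by SPSS).
ss_reg, df_reg = 122.537, 1
ss_res, df_res = 101.513, 163

ss_total = ss_reg + ss_res
n = df_reg + df_res + 1        # cases used in the fit, inferred from the df
k = df_reg                     # number of predictors

r_squared = ss_reg / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
f_stat = (ss_reg / df_reg) / (ss_res / df_res)
std_error_estimate = (ss_res / df_res) ** 0.5   # square root of residual mean square

print(f"R squared          = {r_squared:.3f}")
print(f"Adjusted R squared = {adj_r_squared:.3f}")
print(f"F                  = {f_stat:.2f}")
print(f"Std. error of est. = {std_error_estimate:.4f}")
```

The recomputed values should agree with the Model Summary and ANOVA tables up to rounding, which is a quick way to confirm that a flattened table was read correctly.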
  23. Analysis of Regression Results: There is a moderate positive relationship between POPGROW and FERTIL. About 55% of the variation in growth rate is explained by variation in fertility rate (r² = 0.547). The linear model (refer to B in the Coefficients table): POPGROW = 0.245 + 0.469 * FERTIL. The t-score and significance level indicate that the coefficient on FERTIL is significantly different from 0 (t = 14.027, Sig. = .000); the constant is not significant at the 0.05 level (t = 1.851, Sig. = .066). However, the relationship does not look linear on the scatterplot.
  24. Step 3, Simple Non-Linear Regression: Test alternative regression models for the relationship between popgrow and fertil using the Analyze/Regression/Curve Estimation procedure.
  25. Step 3, Simple Non-Linear Regression: Dependent: popgrow. Independent(s): fertil. Models: Linear, Quadratic, Logarithmic.
  26. Analysis Results
      LINEAR: popgrow = b0 + b1 * fertil; fitted: popgrow = .245 + .469 * fertil; r² = .547
      LOGARITHMIC: popgrow = b0 + b1 * log(fertil); fitted: popgrow = .045 + 1.67 * log(fertil); r² = .635
      QUADRATIC: popgrow = b0 + b1 * fertil + b2 * fertil²; fitted: popgrow = -1.337 + 1.508 * fertil - .132 * fertil²; r² = .655
      Curve Estimation output (independent: FERTIL):
        Dependent  Mth  Rsq   d.f.  F       Sigf  b0       b1      b2
        POPGROW    LIN  .547  163   196.76  .000  .2447    .4686
        POPGROW    LOG  .635  163   284.08  .000  .0445    1.6661
        POPGROW    QUA  .655  162   154.03  .000  -1.3371  1.5078  -.1315
      About 64% (logarithmic) to 66% (quadratic) of the variation in growth rates can be explained by variation in fertility, compared with only 55% using the linear model.
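To compare the three fitted curves from slide 26 at specific fertility levels, the equations can be evaluated directly. This is a sketch using the coefficients printed in the Curve Estimation output. It assumes that `log` in the logarithmic model is the natural logarithm, which is not stated on the slide, and the function names are hypothetical.

```python
import math

def linear(fertil):
    return 0.2447 + 0.4686 * fertil

def logarithmic(fertil):
    # Assumption: natural log; the slide only writes "log".
    return 0.0445 + 1.6661 * math.log(fertil)

def quadratic(fertil):
    return -1.3371 + 1.5078 * fertil - 0.1315 * fertil ** 2

# Predicted growth rate (%) under each model at a few fertility levels.
print("fertil  linear  log     quadratic")
for fertil in (1, 2, 4, 6, 8):
    print(f"{fertil:<7} {linear(fertil):<7.3f} "
          f"{logarithmic(fertil):<7.3f} {quadratic(fertil):<7.3f}")
```

The three models diverge most at the extremes of the fertility range, which is where the choice of functional form matters.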
  27. Analysis Results: [Figure: POPGROW (Y axis) against FERTIL (X axis), showing the observed values with the fitted linear, logarithmic, and quadratic curves.] Both the logarithmic and quadratic curves seem to represent the relationship between growth rate and fertility better than the line: growth rates increase at a decreasing rate with fertility.
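"Increase at a decreasing rate" can be made concrete by looking at the slope of each fitted curve. The sketch below differentiates the logarithmic and quadratic equations from slide 26 by hand. It assumes the natural log for the logarithmic model, and the helper names are illustrative only.

```python
# Coefficients from the Curve Estimation output on slide 26.
LOG_B1 = 1.6661
QUA_B1, QUA_B2 = 1.5078, -0.1315

def log_slope(fertil):
    """d(popgrow)/d(fertil) for popgrow = b0 + b1 * ln(fertil)."""
    return LOG_B1 / fertil

def quad_slope(fertil):
    """d(popgrow)/d(fertil) for popgrow = b0 + b1 * fertil + b2 * fertil**2."""
    return QUA_B1 + 2 * QUA_B2 * fertil

for fertil in (2, 4, 6, 8):
    print(f"fertil={fertil}: log slope={log_slope(fertil):.3f}, "
          f"quadratic slope={quad_slope(fertil):.3f}")

# The quadratic curve stops rising where its slope reaches zero.
quadratic_peak = -QUA_B1 / (2 * QUA_B2)
print(f"quadratic curve peaks near fertil = {quadratic_peak:.2f}")
```

Both slopes shrink as fertility rises. The quadratic slope turns negative past its peak, so that model predicts falling growth at the highest fertility levels, while the logarithmic curve keeps rising slowly. This is worth keeping in mind when choosing between two models with similar r².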
