Introduction to Multiple Linear Regression.pptx

Jan. 17, 2023
Data & Analytics

An introductory talk covering the basics of multiple linear regression.


  1. 1. B. Weaver: Introduction to Multiple Linear Regression 1 Introduction to Multiple Linear Regression Bruce Weaver Northern Ontario School of Medicine, Lakehead University Thunder Bay, Ontario
  2. 2. B. Weaver: Introduction to Multiple Linear Regression 2 Simple vs. Multiple Linear Regression • Simple Linear Regression  One predictor variable  Fits a straight line through a 2-D cloud of points • Multiple Linear Regression  Two or more predictor variables  Can fit a variety of different regression surfaces through 2-D, 3-D or even multi-dimensional clouds of points • Some examples are given on the following slides
  3. 3. B. Weaver: Introduction to Multiple Linear Regression 3 Example 1: Fitting a sheet of plywood through a 3-dimensional cloud of points X1 X2 Y The best-fitting “sheet of plywood” Statisticians would call it a regression plane, or more generally, a regression surface
  4. 4. B. Weaver: Introduction to Multiple Linear Regression 4 Example 2: Fitting straight lines for two (or more) groups Fit line for Group 1 Fit line for Group 2
  5. 5. B. Weaver: Introduction to Multiple Linear Regression 5 Example 3: Fitting curvilinear relationships • Multiple linear regression can fit curvilinear relationships for 1 or more groups
  6. 6. B. Weaver: Introduction to Multiple Linear Regression 6 Example 4: Fitting a curved sheet of plywood through a 3D cloud of points In this case, the sheet of plywood is curved along both X1 and X2 axes
  7. 7. B. Weaver: Introduction to Multiple Linear Regression 7 Example 5: Plywood with a Twist One might ask why models like this are not called Chubby Checker models.
  8. 8. B. Weaver: Introduction to Multiple Linear Regression 8 The KISS Principle • Because this is an introductory chapter, we will keep things (relatively) simple by restricting ourselves to models with:  Continuous predictor variables  First-order effects only • Example 1 (shown earlier) is one such model Example 1
  9. 9. B. Weaver: Introduction to Multiple Linear Regression 9 A Note on Terminology • In standard statistical terminology, univariate and multivariate describe the number of dependent variables, not the number of predictor variables • The terms used to describe the number of predictor variables are univariable and multivariable • People often describe one-predictor regression models as univariate, and multi-predictor regression models as multivariate – but this is not correct usage
  10. 10. B. Weaver: Introduction to Multiple Linear Regression 10 Partial and Semi-Partial Correlation
  11. 11. B. Weaver: Introduction to Multiple Linear Regression 11 Partial & Semi-partial Correlation • Before tackling multiple regression per se, we will look at partial and semi-partial correlation • They can be computed when you have 3 or more variables • They are related to multiple regression, and helpful for understanding some of the concepts • The following example uses Canadian occupational prestige data from John Fox’s 2008 book on applied regression
  12. 12. B. Weaver: Introduction to Multiple Linear Regression 12 Example using occupational prestige data Pearson r = .850 Years of Education and Job Prestige are probably both related to Income What is the correlation between Education and Job Prestige if we remove the effect of Income? Partial and semi-partial correlation allow us to ask questions like:
  13. 13. B. Weaver: Introduction to Multiple Linear Regression 13 Confirmation that both Education and Prestige are Related to Income
  14. 14. B. Weaver: Introduction to Multiple Linear Regression 14 • Partial correlation removes, or partials out the effect of the third variable from both of the other variables • Semi-partial correlation partials out the effect of the third variable from only one of the two variables Partial versus Semi-partial Also known as part correlation Further evidence of a plot to confuse students?
  15. 15. B. Weaver: Introduction to Multiple Linear Regression 15 • Partial correlation removes, or partials out the effect of the third variable from both of the other variables  What is the correlation between Education and Prestige if we partial out the effect of Income from both of them?  What is the correlation between Income and Prestige if we partial out the effect of Education from both of them? Examples of Partial Correlation Let’s take this one and see how it works conceptually.
  16. 16. B. Weaver: Introduction to Multiple Linear Regression 16 Conceptual Approach to Partial Correlation • We want the correlation between Education and Prestige with the effect of Income partialled out of both of them • In other words, we want the correlation between:  Variability in Education that is not explained by Income, and  Variability in Prestige that is not explained by Income We do? • Even though you may not yet know it, you have the tools you need to understand what that means conceptually
  17. 17. B. Weaver: Introduction to Multiple Linear Regression 17 Something you might remember from the chapter on simple linear regression: $(Y - \bar{Y}) = (Y' - \bar{Y}) + (Y - Y')$ • $(Y - \bar{Y})$: Raw score minus the mean of all Y-scores • $(Y' - \bar{Y})$: Predicted score minus the mean of all Y-scores; the part that IS accounted for by the relationship between X and Y; $SS_{Regression} = \sum (Y' - \bar{Y})^2$ • $(Y - Y')$: Raw score minus predicted score; the residual, NOT accounted for by the relationship between X and Y; $SS_{Residual} = \sum (Y - Y')^2$
  18. 18. B. Weaver: Introduction to Multiple Linear Regression 18 • Save the residuals – call them EEduc.Inc • The variance of these residuals is variance in Education that is not explained by Income Variability in Education that is Not Explained by Income • Let Y = Years of Education • Let X = Income • Perform simple linear regression
  19. 19. B. Weaver: Introduction to Multiple Linear Regression 19 • Save the residuals – call them EPrest.Inc • The variance of these residuals is variance in Job Prestige that is not explained by Income Variability in Job Prestige that is Not Explained by Income • Let Y = Job Prestige • Let X = Income • Perform simple linear regression
  20. 20. B. Weaver: Introduction to Multiple Linear Regression 20 • EEduc.Inc  variability in Education not explained by Income • EPrest.Inc  variability in Prestige not explained by Income • Compute Pearson r between EEduc.Inc and EPrest.Inc Partial Correlation as Simple Correlation Conceptually, the partial correlation is a simple (Pearson) correlation between two sets of residuals.
  21. 21. B. Weaver: Introduction to Multiple Linear Regression 21 Okay then…let’s do it * Let Y = Education, X = Income. * Save the residuals. REGRESSION /DEPENDENT education /METHOD=ENTER incomek /SAVE= resid (e_educ.inc) . * Let Y = Prestige, X = Income. * Save the residuals. REGRESSION /DEPENDENT prestige /METHOD=ENTER incomek /SAVE= resid (e_prest.inc) .
  22. 22. B. Weaver: Introduction to Multiple Linear Regression 22 Now Correlate EEduc.Inc with EPrest.Inc • The partial correlation between Education and Job Prestige with Income partialled out of both of them = .766 CORRELATE e_educ.inc WITH e_prest.inc .
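The same three steps (two simple regressions, save the residuals, correlate them) can be sketched outside SPSS. The Python below is a minimal illustration under stated assumptions, not the author's method: the ten data rows are invented for demonstration (they are not Fox's prestige data), and the helper names `pearson_r` and `residuals` are hypothetical. It also compares the result with the standard first-order partial correlation formula.

```python
from math import sqrt

# Hypothetical data for 10 occupations (NOT the Fox prestige data).
education = [8, 10, 12, 12, 14, 16, 16, 18, 20, 11]   # years
income = [12, 15, 20, 18, 30, 35, 50, 45, 60, 22]     # thousands of dollars
prestige = [25, 30, 40, 38, 50, 62, 65, 70, 80, 35]   # prestige score

def mean(v):
    return sum(v) / len(v)

def pearson_r(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residuals(y, x):
    """Residuals from the simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Steps 1 and 2: remove Income from Education and from Prestige.
e_educ_inc = residuals(education, income)
e_prest_inc = residuals(prestige, income)

# Step 3: the partial correlation is the Pearson r between the residuals.
r_partial = pearson_r(e_educ_inc, e_prest_inc)

# Cross-check with the first-order partial correlation formula.
r_pe = pearson_r(prestige, education)
r_pi = pearson_r(prestige, income)
r_ei = pearson_r(education, income)
r_formula = (r_pe - r_pi * r_ei) / sqrt((1 - r_pi ** 2) * (1 - r_ei ** 2))

print(round(r_partial, 6), round(r_formula, 6))
```

The two printed values should agree up to floating-point rounding, because the formula is just an algebraic shortcut for the residual-based calculation.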
  23. 23. B. Weaver: Introduction to Multiple Linear Regression 23 Verifying the Result with PARTIAL CORR In the SPSS GUI: Analyze  Correlate  Partial Compute the (partial) correlation between these variables While controlling for these variables Exit dialog via PASTE!
  24. 24. B. Weaver: Introduction to Multiple Linear Regression 24 The PARTIAL CORR syntax PARTIAL CORR prestige education BY incomek . PARTIAL CORR prestige incomek BY education . • Variables after the key word BY are partialled out • First command: Partial correlation between PRESTIGE and EDUCATION with INCOMEK partialled out of both • Second command: Partial correlation between PRESTIGE and INCOMEK with EDUCATION partialled out of both
  25. 25. B. Weaver: Introduction to Multiple Linear Regression 25 The SPSS Output (1) • The partial correlation between Prestige and Education with Income partialled out of both = .766 • The same value we obtained by the conceptual method Partial Corr
  26. 26. B. Weaver: Introduction to Multiple Linear Regression 26 The SPSS Output (2) • The partial correlation between Prestige and Income with Education partialled out of both = .521 Partial Corr
  27. 27. B. Weaver: Introduction to Multiple Linear Regression 27 • Partial correlation removes, or partials out the effect of a third variable from both of the other variables • Semi-partial (or part) correlation partials out the effect of a third variable from only one of the two variables Recall from earlier that…
  28. 28. B. Weaver: Introduction to Multiple Linear Regression 28 • The correlation between Prestige and Income with the effect of Education partialled out of Income only • The correlation between Prestige and Education with the effect of Income partialled out of Education only Examples of Semi-partial Correlation Let’s take this one and see how it works conceptually.
  29. 29. B. Weaver: Introduction to Multiple Linear Regression 29 Conceptual Approach to Semi-partial Correlation • We want the correlation between Prestige and Education with the effect of Income partialled out of Education only • In other words, we want the correlation between: Job Prestige, and Variability in Education that is not explained by Income We already have the EEduc.Inc residuals that give us this!
  30. 30. B. Weaver: Introduction to Multiple Linear Regression 30 • Save the residuals – call them EEduc.Inc • The variance of these residuals is variance in Education that is not explained by Income Recall from earlier… • Let Y = Years of Education • Let X = Income • Perform simple linear regression
  31. 31. B. Weaver: Introduction to Multiple Linear Regression 31 Semi-partial correlation with Income partialled out of Education • The semi-partial correlation between Education and Prestige with Income partialled out of Education = the simple Pearson correlation between EEduc.Inc and Prestige CORRELATE e_educ.inc with prestige .
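As a rough illustration of the same idea outside SPSS, the sketch below residualizes only Education on Income and then correlates those residuals with the untouched Prestige scores. The data are invented for demonstration (not Fox's prestige data) and the helper names are hypothetical; the last lines compare the result with the standard first-order semi-partial formula.

```python
from math import sqrt

# Hypothetical data for 10 occupations (NOT the Fox prestige data).
education = [8, 10, 12, 12, 14, 16, 16, 18, 20, 11]   # years
income = [12, 15, 20, 18, 30, 35, 50, 45, 60, 22]     # thousands of dollars
prestige = [25, 30, 40, 38, 50, 62, 65, 70, 80, 35]   # prestige score

def mean(v):
    return sum(v) / len(v)

def pearson_r(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residuals(y, x):
    """Residuals from the simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Income is partialled out of Education only; Prestige is left as is.
e_educ_inc = residuals(education, income)
r_semipartial = pearson_r(prestige, e_educ_inc)

# Cross-check with the first-order semi-partial (part) formula.
r_pe = pearson_r(prestige, education)
r_pi = pearson_r(prestige, income)
r_ei = pearson_r(education, income)
r_formula = (r_pe - r_pi * r_ei) / sqrt(1 - r_ei ** 2)

# The partial correlation divides by one extra term that is at most 1,
# so it can never be closer to 0 than the semi-partial.
r_partial = r_formula / sqrt(1 - r_pi ** 2)

print(round(r_semipartial, 6), round(r_formula, 6), round(r_partial, 6))
```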
  32. 32. B. Weaver: Introduction to Multiple Linear Regression 32 Computing Partial & Semi-Partial Correlations via the REGRESSION Command • As we saw earlier, SPSS has a PARTIAL CORR command • There is no comparable command for computing semi-partial correlations • However, both partial and semi-partial correlations can be computed via the REGRESSION command • The key is to add the ZPP option to the /STATISTICS sub-command
  33. 33. B. Weaver: Introduction to Multiple Linear Regression 33 The Syntax * Get partial & semi-partial correlations via REGRESSION . REGRESSION /STATISTICS ZPP /DEPENDENT prestige /METHOD=ENTER education incomek . • Partial correlations: X2 is partialled out of X1 and out of Y • Semi-partial: X2 is partialled out of X1, but not out of Y • The Y-variable cannot be partialled out of anything Zero-order, Partial, and Part correlations
  34. 34. B. Weaver: Introduction to Multiple Linear Regression 34 The Output Ordinary Pearson correlations with Job Prestige Score Partial correlations Semi-partial (or part) correlations with X1 partialled out of X2 or vice versa – but neither partialled out of Y
  35. 35. B. Weaver: Introduction to Multiple Linear Regression 35 A Formula for Partial Correlation • Let Y = Job Prestige Score • Let X1 = Years of Education • Let X2 = Income $r_{Y1.2} = \frac{r_{Y1} - r_{Y2}\, r_{12}}{\sqrt{(1 - r_{Y2}^2)(1 - r_{12}^2)}}$ You don’t need to memorize this formula – I’m just showing it to you because you might run into it in future.
  36. 36. B. Weaver: Introduction to Multiple Linear Regression 36 A Formula for Semi-Partial Correlation • Let Y = university mean grade • Let X1 = annual income • Let X2 = IQ $r_{Y(1.2)} = \frac{r_{Y1} - r_{Y2}\, r_{12}}{\sqrt{1 - r_{12}^2}}$ You don’t need to memorize this formula – I’m just showing it to you because you might run into it in future.
  37. 37. B. Weaver: Introduction to Multiple Linear Regression 37 First-order vs. higher order partial correlation • What we have considered thus far is first-order partial and semi-partial correlation • I.e., we partialled out the effect of one variable • In higher order partial and semi-partial correlation, we partial out the effects of two or more variables E.g., in second-order partial or semi-partial correlation, one partials out the effects of two other variables
  38. 38. B. Weaver: Introduction to Multiple Linear Regression 38 Example: Second-order partial correlation • Here is a formula for the second-order partial correlation between Y and X1, with X2 and X3 partialled out of both Y and X1 $r_{Y1.23} = \frac{r_{Y1.2} - r_{Y3.2}\, r_{13.2}}{\sqrt{(1 - r_{Y3.2}^2)(1 - r_{13.2}^2)}}$ You don’t need to memorize this formula – I’m just showing it to you because you might run into it in future.
  39. 39. B. Weaver: Introduction to Multiple Linear Regression 39 Back to Multiple Linear Regression
  40. 40. B. Weaver: Introduction to Multiple Linear Regression 40 The Multiple Regression Equation $Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_p X_p + E$ • $b_0$: Constant • $b_1, \dots, b_p$: Regression Coefficients • $X_1, \dots, X_p$: Explanatory Variables (aka predictor variables) • $E$: Residual • The terms $b_2 X_2 + \dots + b_p X_p$ are the part that distinguishes multiple linear regression from simple linear regression
  41. 41. B. Weaver: Introduction to Multiple Linear Regression 41 Least Squares Criterion • The least squares criterion still applies • The sum of squared residuals is a minimum: $\sum (Y - Y')^2$ = a minimum • Residuals are defined just as they are in simple linear regression
  42. 42. B. Weaver: Introduction to Multiple Linear Regression 42 Standardized Regression Equation (1) • When all variables are first converted to standard scores (i.e., z-scores), we get the standardized regression equation $Z_{Y'} = \beta_1 Z_1 + \beta_2 Z_2 + \dots + \beta_p Z_p$ • The Greek letter beta is used to represent coefficients instead of the Roman letter B • β0 is not shown because it = 0 Why does β0 = 0?
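One way to see why β0 = 0: the least-squares surface always passes through the point of means, and every z-scored variable has a mean of 0. The sketch below checks this numerically for a two-predictor model. It is only an illustration: the ten data rows are invented (they are not from lung.sav), and `fit_two_predictors` is a hypothetical helper that solves the normal equations for exactly two predictors.

```python
from math import sqrt

# Hypothetical data for 10 fathers (NOT taken from lung.sav).
height = [61, 64, 66, 67, 68, 69, 70, 72, 74, 76]             # inches
age = [59, 35, 48, 26, 52, 41, 30, 44, 38, 55]                # years
fev1 = [260, 350, 330, 420, 340, 400, 450, 430, 480, 440]     # made-up FEV1

def mean(v):
    return sum(v) / len(v)

def sd(v):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(v)
    return sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

def zscores(v):
    m, s = mean(v), sd(v)
    return [(a - m) / s for a in v]

def fit_two_predictors(y, x1, x2):
    """Least-squares b0, b1, b2 for y = b0 + b1*x1 + b2*x2 + e."""
    my, m1, m2 = mean(y), mean(x1), mean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Raw-score fit and z-score fit of the same model.
b0, b1, b2 = fit_two_predictors(fev1, height, age)
beta0, beta1, beta2 = fit_two_predictors(zscores(fev1), zscores(height), zscores(age))

print("raw:", round(b0, 3), round(b1, 3), round(b2, 3))
print("standardized:", round(beta0, 6), round(beta1, 3), round(beta2, 3))
```

In this sketch each beta-weight also equals the raw coefficient rescaled by the ratio of standard deviations, e.g. `beta1 == b1 * sd(height) / sd(fev1)` up to rounding.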
  43. 43. B. Weaver: Introduction to Multiple Linear Regression 43 Standardized Regression Equation (2) • Coefficients from standardized regression equation sometimes called beta-weights • Also called standardized regression coefficients, or standard partial regression coefficients. Standard because regression is performed on z-scores Partial because the effects of other predictor variables are partialled out (or controlled for)
  44. 44. B. Weaver: Introduction to Multiple Linear Regression 44 Standardized Regression Equation (3) • In other words, β1 is short for: β1.2,3,…p • Where p = the number of predictor variables in the multiple regression model • The effects of X2, X3, …Xp are controlled for, or partialled out
  45. 45. B. Weaver: Introduction to Multiple Linear Regression 45 Raw Regression Coefficients • For the raw-score regression equation, b1 is short for b1.2,3,…p • As in the standardized regression equation, the effects of X2, X3,…Xp are controlled for, or partialled out
  46. 46. B. Weaver: Introduction to Multiple Linear Regression 46 The Multiple Correlation Coefficient • Pearson r = simple correlation between two variables • The multiple correlation coefficient is R • Conceptually, R = the simple correlation between Y and Y′ for a multiple regression model with 2 or more predictor variables RY.1,2...p = rYY′
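The conceptual definition can be checked numerically: fit the model, compute the fitted values Y′, and correlate them with Y. The sketch below does this for a two-predictor model and compares R squared with 1 minus SS_Residual / SS_Total. The data are invented (not from lung.sav) and the helper names are hypothetical.

```python
from math import sqrt

# Hypothetical data for 10 fathers (NOT taken from lung.sav).
height = [61, 64, 66, 67, 68, 69, 70, 72, 74, 76]             # inches
age = [59, 35, 48, 26, 52, 41, 30, 44, 38, 55]                # years
fev1 = [260, 350, 330, 420, 340, 400, 450, 430, 480, 440]     # made-up FEV1

def mean(v):
    return sum(v) / len(v)

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_two_predictors(y, x1, x2):
    """Least-squares b0, b1, b2 for y = b0 + b1*x1 + b2*x2 + e."""
    my, m1, m2 = mean(y), mean(x1), mean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

b0, b1, b2 = fit_two_predictors(fev1, height, age)
fitted = [b0 + b1 * h + b2 * a for h, a in zip(height, age)]   # Y-prime

# R is the simple correlation between Y and Y-prime.
multiple_r = pearson_r(fev1, fitted)

# R squared from the sums of squares, for comparison.
y_bar = mean(fev1)
ss_total = sum((y - y_bar) ** 2 for y in fev1)
ss_residual = sum((y - f) ** 2 for y, f in zip(fev1, fitted))
r_squared = 1 - ss_residual / ss_total

print(round(multiple_r, 4), round(multiple_r ** 2, 4), round(r_squared, 4))
```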
  47. 47. B. Weaver: Introduction to Multiple Linear Regression 47 R is biased • “People often assume that if there is no relation between the criterion and the predictors, R should come out near 0.” • “In fact, the expected value of R2 for random data is p / (N – 1).” (Howell, 2007, p. 506) • E.g., if p = 5 predictors and N = 50 cases, the expected value of R2 = 5 / 49 = .102, not 0
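The upward bias is easy to see by simulation. The sketch below generates pure-noise data (independent normal Y, X1 and X2, so there is no true relationship) and averages the squared multiple correlation over many samples; under these conditions the average should land close to p / (N − 1) rather than 0. The choices p = 2, N = 20, 4000 replications and the random seed are arbitrary assumptions, and the squared multiple correlation is computed from the standard two-predictor expression in terms of pairwise correlations.

```python
import random
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_squared_two_predictors(y, x1, x2):
    """Squared multiple correlation for a two-predictor model."""
    ry1 = pearson_r(y, x1)
    ry2 = pearson_r(y, x2)
    r12 = pearson_r(x1, x2)
    return (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / (1 - r12 ** 2)

random.seed(2023)          # arbitrary seed, for reproducibility only
p, n, n_sims = 2, 20, 4000 # assumed settings for this illustration

total = 0.0
for _ in range(n_sims):
    # All three variables are independent noise: no real relationship.
    y = [random.gauss(0, 1) for _ in range(n)]
    x1 = [random.gauss(0, 1) for _ in range(n)]
    x2 = [random.gauss(0, 1) for _ in range(n)]
    total += r_squared_two_predictors(y, x1, x2)

mean_r2 = total / n_sims
expected = p / (n - 1)

print("simulated mean R squared:", round(mean_r2, 3))
print("p / (N - 1):", round(expected, 3))
```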
  48. 48. B. Weaver: Introduction to Multiple Linear Regression 48 R2 and Adjusted R2 • In simple linear regression r2 = the proportion of variance in Y that is accounted for (or fit by) the linear relationship between X and Y • In multiple regression R2 = the proportion of variance in Y that is accounted for by the linear combination of predictor variables X1, X2, …Xp • Adjusted R2 adjusts for the fact that R2 is biased high – how the adjustment works is explained in the chapter on simple linear regression
  49. 49. B. Weaver: Introduction to Multiple Linear Regression 49 An Example of Multiple Linear Regression with Two Predictor Variables
  50. 50. B. Weaver: Introduction to Multiple Linear Regression 50 • The SPSS data file used for the next example (lung.sav) can be downloaded here • Each record (or row, or case) has data for one family • The main variables we will look at are:  FAGE – Father’s age (in years)  FHEIGHT – Father’s height (in inches)  FFVC - Father’s FVC (i.e., forced vital capacity)  FFEV1 – Father’s FEV1 (i.e., forced expired volume in 1 second) Data for the Next Example These are common measures of lung function
  51. 51. B. Weaver: Introduction to Multiple Linear Regression 51 FVC and FEV1 • Two primary measures of lung function • FVC – forced vital capacity; “the volume of air that can forcibly be blown out after full inspiration, measured in litres.” • FEV1 – Forced expiratory volume in 1 second; “the maximum volume of air that [one] can forcibly blow out in the first second during the FVC manoeuvre, measured in liters.” Source: http://en.wikipedia.org/wiki/Spirometry#Forced_Vital_Capacity_.28FVC.29
  52. 52. B. Weaver: Introduction to Multiple Linear Regression 52 Multiple Linear Regression with Two Predictor Variables • The simplest form of multiple linear regression has two (continuous) predictor variables Y = b0 + b1X1 + b2X2 + E • For simple linear regression, we had the best-fitting straight line through a 2-D cloud of points (in a scatter-plot) • For this model, imagine the best fitting sheet of plywood (or plexiglass) through a 3-D cloud of points
  53. 53. B. Weaver: Introduction to Multiple Linear Regression 53 Variables in the Model • In our two-predictor model:  Y = Father’s FEV1  X1 = Father’s height in inches  X2 = Father’s age (in years) • Some descriptive statistics follow on the next slides
  54. 54. B. Weaver: Introduction to Multiple Linear Regression 54 Descriptive Stats on Y, X1 and X2 Y = FEV1 X1 = Height X2 = Age
  55. 55. B. Weaver: Introduction to Multiple Linear Regression 55 Bivariate relationship between X1 and Y FEV1 = -408.67 + 11.81 × Height + Error
  56. 56. B. Weaver: Introduction to Multiple Linear Regression 56 Bivariate relationship between X2 and Y FEV1 = 526.637 – 2.923 × Age + Error
  57. 57. B. Weaver: Introduction to Multiple Linear Regression 57 The 2 Simple Linear Regression Models X = Height X = Age What are the Pearson correlations of Height and Age with FEV1?
  58. 58. B. Weaver: Introduction to Multiple Linear Regression 58 The Partial & Semi-partial Correlations • Unlike what we saw in the earlier example (Prestige, Education, Income), the partial correlations are further away from 0 than the zero-order (Pearson) correlations • That happens in this case because one of the simple correlations is positive and the other is negative Semi-partial
  59. 59. B. Weaver: Introduction to Multiple Linear Regression 59 A 3-dimensional scatter-plot X1 X2 Y A 3-dimensional cloud of points
  60. 60. B. Weaver: Introduction to Multiple Linear Regression 60 The best-fitting regression plane X1 X2 Y Best fitting by the least squares criterion The sum of the squared errors in prediction is a minimum The best-fitting sheet of plywood
  61. 61. B. Weaver: Introduction to Multiple Linear Regression 61 Least Squares for Multiple Regression • Errors in prediction are measured vertically (i.e., along the Y-axis), as in simple linear regression • In a two-predictor model like this, a prediction error (or fitting error) is the vertical distance of the actual data point from the surface of the best-fitting sheet of plywood • As in simple linear regression, the sum of the squared errors in prediction is minimized
  62. 62. B. Weaver: Introduction to Multiple Linear Regression 62 Fitting the model with SPSS • In the GUI: Analyze  Regression  Linear • The syntax: REGRESSION /STATISTICS COEFF R ANOVA /DEPENDENT FFEV1 /METHOD=ENTER FHEIGHT FAGE .
  63. 63. B. Weaver: Introduction to Multiple Linear Regression 63 The Model Summary The multiple correlation of all predictors with Y The squared multiple correlation; equal to proportion of variability in Y that is accounted for by the linear combination of all predictors Discussed in the notes on simple linear regression Root mean square error (RMSE)
  64. 64. B. Weaver: Introduction to Multiple Linear Regression 64 The ANOVA Table $SS_{Total} = \sum (Y - \bar{Y})^2$
  65. 65. B. Weaver: Introduction to Multiple Linear Regression 65 The ANOVA Table $df_{Regression} = p$; $df_{Residual} = n - p - 1$; $df_{Total} = n - 1$
  66. 66. B. Weaver: Introduction to Multiple Linear Regression 66 The ANOVA Table $MS_{Regression} = SS_{Regression} / df_{Regression}$; $MS_{Residual} = SS_{Residual} / df_{Residual}$; $RMSE = \sqrt{MS_{Residual}} = \sqrt{2859.954} = 53.479$
  67. 67. B. Weaver: Introduction to Multiple Linear Regression 67 The ANOVA Table $F_{(p,\, n-p-1)} = MS_{Regression} / MS_{Residual}$ The Sig. column gives the p-value for the F-test
  68. 68. B. Weaver: Introduction to Multiple Linear Regression 68 The Regression Coefficients FEV1 = -276.075 + 11.440 × Height – 2.664 × Age + error b0 b1 b2 X1 X2 Y
  69. 69. B. Weaver: Introduction to Multiple Linear Regression 69 Y = the outcome (or dependent or criterion) variable X1 = first predictor variable X2 = second predictor variable b0 = the constant b1 = regression coefficient for X1 b2 = regression coefficient for X2 E = error in prediction, or residual Interpreting the Regression Equation The fitted value of Y when both predictor variables = 0. Change in the fitted value of Y for a one-unit increase in X1 while controlling for X2 Y = b0 + b1X1 + b2X2 + E Change in the fitted value of Y for a one-unit increase in X2 while controlling for X1
  70. 70. B. Weaver: Introduction to Multiple Linear Regression 70 Recap on b0, b1 and b2 • b0  the fitted value of Y when X1 and X2 both equal 0 • b1  the change in the fitted value of Y for a one-unit increase in X1 while controlling for X2 • b2  the change in the fitted value of Y for a one-unit increase in X2 while controlling for X1 Controlling for the other variables in the model is often described as holding them constant. That description works for first order effects only models, but will not work for some more complicated models.
  71. 71. B. Weaver: Introduction to Multiple Linear Regression 71 What do those coefficients mean? FEV1 = -276.075 + 11.440 × Height – 2.664 × Age + error b0 b1 b2 The fitted value of FEV1 when both Height and Age equal 0 The change in the fitted value of FEV1 when Height increases by one inch with Age held constant The change in the fitted value of FEV1 when Age increases by one year with Height held constant Impossible! Impossible!
  72. 72. B. Weaver: Introduction to Multiple Linear Regression 72 Can we do something to ensure that the constant is a possible value of Y? • Yes.  • We can centre the predictor variables on possible values before running the model • Centering a variable on some value just means subtracting that value from all cases • E.g., to centre Age on 50, just compute a new variable that is equal to Age minus 50
  73. 73. B. Weaver: Introduction to Multiple Linear Regression 73 What value should we use for centering? • Many authors recommend centering on the sample mean • There is nothing technically wrong with mean-centering • But the mean changes from sample to sample, which affects comparability of constants from one study to the next • One can centre on any possible value for the variable • We shall centre Height and Age on sensible values near their minima
  74. 74. B. Weaver: Introduction to Multiple Linear Regression 74 Centering the Variables • From the descriptive stats we saw earlier: Heights ranged from 61 to 76 inches – so centre on 60 Ages ranged from 26 to 59 years – so centre on 25 * Compute the centered variables. COMPUTE FHT60 = fheight - 60. /* Min was 61 . COMPUTE FAGE25 = fage - 25. /* Min was 26 . EXE.
  75. 75. B. Weaver: Introduction to Multiple Linear Regression 75 Height vs. Height centered on 60 inches When Height = 60, Centered Height = 0 Setting centered Height to 0 is equivalent to setting Height to 60
  76. 76. B. Weaver: Introduction to Multiple Linear Regression 76 Age vs. Age centered on 25 When Age = 25, Centered Age = 0 Setting centered Age to 0 is equivalent to setting Age to 25
  77. 77. B. Weaver: Introduction to Multiple Linear Regression 77 Run the Model Again using Centered Variables REGRESSION /STATISTICS COEFF OUTS CI(95) R ANOVA /DEPENDENT FFEV1 /METHOD=ENTER FHT60 FAGE25 . In place of the original variable FHEIGHT In place of the original variable FAGE
  78. 78. B. Weaver: Introduction to Multiple Linear Regression 78 The Model Summary From the original model From the model with centered variables Everything is identical.
  79. 79. B. Weaver: Introduction to Multiple Linear Regression 79 The ANOVA Summary Table From the original model From the model with centered variables Everything is identical.
  80. 80. B. Weaver: Introduction to Multiple Linear Regression 80 The Regression Coefficients From the original model From the model with centered variables The coefficients for Height and Age are unaffected by the centering. But the value of the constant changes—in the 2nd model, b0 = the fitted value of Y when Height = 60 and Age = 25.
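The new constant can be reproduced by hand from the original equation: it is simply the original model's fitted value at Height = 60 and Age = 25. The sketch below uses the rounded coefficients quoted on the earlier slide, so the result may differ in the last decimals from what SPSS prints; the function names are hypothetical.

```python
# Rounded coefficients from the original (uncentered) model on the slides.
b0 = -276.075      # constant
b_height = 11.440  # per inch of height
b_age = -2.664     # per year of age

def predict_raw(height, age):
    """Fitted FEV1 from the original model."""
    return b0 + b_height * height + b_age * age

# Constant of the centered model = fitted value at Height 60, Age 25.
b0_centered = predict_raw(60, 25)

def predict_centered(ht60, age25):
    """Fitted FEV1 from the centered model (same slopes, new constant)."""
    return b0_centered + b_height * ht60 + b_age * age25

# Both parameterizations give the same fitted value for any case,
# e.g. a hypothetical father who is 70 inches tall and 40 years old.
raw_fit = predict_raw(70, 40)
centered_fit = predict_centered(70 - 60, 40 - 25)

print(round(b0_centered, 3), round(raw_fit, 3), round(centered_fit, 3))
```

Centering only shifts where zero sits on each predictor's scale, which is why the slopes and the fitted values stay the same while the constant moves.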
  81. 81. B. Weaver: Introduction to Multiple Linear Regression 81 Summary on Centering of Variables • Centering the explanatory variables on possible values (e.g., the mean, or a value near the minimum) results in a constant term that represents a possible value of Y • All other aspects of the output are unaffected, including:  R, R2, Adjusted R2, & the Standard Error of Estimate  The ANOVA results  The coefficients for the explanatory variables, their standard errors, the t-tests on them, and their 95% confidence intervals* * In models with first order effects only (e.g., no interactions or polynomial terms)
  82. 82. B. Weaver: Introduction to Multiple Linear Regression 82 Centering & Interactions • If we had more time, we could explore how centering of predictor variables also facilitates interpretation of interactions (and polynomial terms) in regression models • Sadly, we do not have time to explore that fascinating topic • Those who crave more info are referred to the excellent book by Aiken & West (1991)
  83. 83. B. Weaver: Introduction to Multiple Linear Regression 83 Hierarchical Regression & Semi-partial Correlation Revisited
  84. 84. B. Weaver: Introduction to Multiple Linear Regression 84 Hierarchical Regression • Hierarchical regression is a very common and useful technique for regression models of all types • Rather than entering all of the predictor variables at once, you enter some of them on the first step, and then add one or more variables on the second step, and so on • An F-test on the change in R2 from one step to the next can be used to assess whether the fit of the model improved significantly Note that hierarchical regression is not the same thing as a hierarchical linear model (HLM). HLM is another name for a multilevel model.
  85. 85. B. Weaver: Introduction to Multiple Linear Regression 85 F-test on the change in R2 • N = the number of subjects • f = the number of predictors in the “fuller” of the 2 models • r = the number of predictors in the “reduced” model $F(f - r,\; N - f - 1) = \frac{(N - f - 1)(R_f^2 - R_r^2)}{(f - r)(1 - R_f^2)}$ H0: Variables added to the model do not improve the fit
  86. 86. B. Weaver: Introduction to Multiple Linear Regression 86 Enter Height first, then Age * Enter height, then age. REGRESSION /STATISTICS COEFF R ANOVA CHANGE /DEPENDENT ffev1 /METHOD=ENTER fht60 /METHOD=ENTER fage25 . Two ENTER sub-commands rather than one Show change in R2 with its F-test
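The formula is simple enough to compute directly. The sketch below wraps it in a hypothetical helper, `f_change`, and applies it to the R2 values quoted later in the slides for the Height-only model (.254) and the Height + Age model (.334). The sample size N = 150 is an assumption made purely for illustration, since this transcript does not state N, so the resulting F should not be read as the value in the SPSS output.

```python
def f_change(n, f, r, r2_full, r2_reduced):
    """F-test on the change in R squared between nested models.

    n          : number of subjects
    f          : number of predictors in the fuller model
    r          : number of predictors in the reduced model
    r2_full    : R squared of the fuller model
    r2_reduced : R squared of the reduced model
    Returns (F, numerator df, denominator df).
    """
    df1 = f - r
    df2 = n - f - 1
    f_stat = (df2 * (r2_full - r2_reduced)) / (df1 * (1 - r2_full))
    return f_stat, df1, df2

# Adding Age to a model that already contains Height.
# N = 150 is an ASSUMED sample size, used only for this illustration.
f_stat, df1, df2 = f_change(n=150, f=2, r=1, r2_full=0.334, r2_reduced=0.254)

print(round(f_stat, 3), df1, df2)
```

Converting F to a p-value needs the F distribution, which is outside the Python standard library; that step is left out here.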
  87. 87. B. Weaver: Introduction to Multiple Linear Regression 87 Model Summary With both variables entered simultaneously With Height entered on Step 1, and Age on Step 2 The squared semi-partial correlation between Age and FEV1 with Height partialled out of Age
  88. 88. B. Weaver: Introduction to Multiple Linear Regression 88 ANOVA Summary Table With both variables entered simultaneously
  89. 89. B. Weaver: Introduction to Multiple Linear Regression 89 ANOVA Summary Table With Height entered on Step 1, and Age on Step 2 Same ANOVA summary table as Model 1 on the last slide
  90. 90. B. Weaver: Introduction to Multiple Linear Regression 90 The Regression Coefficients With both variables entered simultaneously
  91. 91. B. Weaver: Introduction to Multiple Linear Regression 91 The Regression Coefficients With Height entered on Step 1, and Age on Step 2 Same as Model 1 on the last slide
  92. 92. B. Weaver: Introduction to Multiple Linear Regression 92 Enter Age first, then Height * Enter age, then height. REGRESSION /STATISTICS COEFF R ANOVA CHANGE /DEPENDENT ffev1 /METHOD=ENTER fage25 /METHOD=ENTER fht60 . Opposite order compared to last time
  93. 93. B. Weaver: Introduction to Multiple Linear Regression 93 Model Summary With Height entered on Step 1, and Age on Step 2 With Age entered on Step 1, and Height on Step 2 The squared semi-partial correlation between Height and FEV1 with Age partialled out of Height The squared semi-partial correlation between Age and FEV1 with Height partialled out of Age R2 Change for Model 2 = squared semi-partial correlation for the X-variable added in Model 2, with the X-variable from Model 1 partialled out.
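For the two-predictor case, the identity between the R2 change and the squared semi-partial correlation can be checked algebraically. The derivation below is a sketch that assumes the standard expression for the squared multiple correlation in terms of the three pairwise correlations (an expression not shown in the slides), with X1 entered on Step 1 and X2 added on Step 2.

```latex
\begin{align*}
R^2_{Y.12} &= \frac{r_{Y1}^2 + r_{Y2}^2 - 2\, r_{Y1}\, r_{Y2}\, r_{12}}{1 - r_{12}^2} \\[4pt]
R^2_{Y.12} - r_{Y1}^2
  &= \frac{r_{Y1}^2 + r_{Y2}^2 - 2\, r_{Y1}\, r_{Y2}\, r_{12} - r_{Y1}^2 \left(1 - r_{12}^2\right)}{1 - r_{12}^2} \\[4pt]
  &= \frac{r_{Y2}^2 - 2\, r_{Y1}\, r_{Y2}\, r_{12} + r_{Y1}^2\, r_{12}^2}{1 - r_{12}^2} \\[4pt]
  &= \frac{\left(r_{Y2} - r_{Y1}\, r_{12}\right)^2}{1 - r_{12}^2}
   = r_{Y(2.1)}^2
\end{align*}
```

The last expression is the square of the semi-partial formula with the roles of X1 and X2 swapped, i.e. X1 partialled out of X2 only.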
  94. 94. B. Weaver: Introduction to Multiple Linear Regression 94 ANOVA Summary Table With Height entered on Step 1, and Age on Step 2
  95. 95. B. Weaver: Introduction to Multiple Linear Regression 95 ANOVA Summary Table With Age entered on Step 1, and Height on Step 2
  96. 96. B. Weaver: Introduction to Multiple Linear Regression 96 The Regression Coefficients With Height entered on Step 1, and Age on Step 2 A large change in the coefficient for FHT60 when FAGE25 is added would be an indication of confounding.
  97. 97. B. Weaver: Introduction to Multiple Linear Regression 97 The Regression Coefficients With Age entered on Step 1, and Height on Step 2 A large change in the coefficient for FAGE25 when FHT60 is added would be an indication of confounding.
  98. 98. B. Weaver: Introduction to Multiple Linear Regression 98 Another way to obtain the change in R2
  99. 99. B. Weaver: Introduction to Multiple Linear Regression 99 Model Summaries Again With Height entered on Step 1, and Age on Step 2 With Age entered on Step 1, and Height on Step 2 When Age is added to a model containing Height When Height is added to a model containing Age
  100. 100. B. Weaver: Introduction to Multiple Linear Regression 100 Another way to obtain those results • For the SPSS REGRESSION command, the default method for adding variables is ENTER • This results in an ANOVA table that has just one overall F-test for all of the predictor variables taken as a group • Another method—which is not available through the GUI—is the TEST method • TEST allows the user to specify groupings of variables for which to compute F-tests
  101. 101. B. Weaver: Introduction to Multiple Linear Regression 101 Running our 2-predictor model with TEST * Two-predictor model with METHOD=TEST . REGRESSION /STATISTICS COEFF R ANOVA CHANGE /DEPENDENT ffev1 /METHOD=TEST (fht60) (fage25) . Compute an F-test for variable FHT60 Compute an F-test for variable FAGE25
  102. 102. B. Weaver: Introduction to Multiple Linear Regression 102 The ANOVA Summary Table with TEST The usual ANOVA summary table, just like we got when we used METHOD=ENTER. Separate F-tests for the variable groupings we specified. The same R2-change and F-tests we saw earlier in the Model Summaries Change in R2 when that variable grouping is removed from the full model
  103. 103. B. Weaver: Introduction to Multiple Linear Regression 103 Just to clarify… • If I put both variables in a single pair of brackets, like this: REGRESSION /STATISTICS COEFF R ANOVA CHANGE /DEPENDENT ffev1 /METHOD = TEST (fht60 fage25) • I will get one F-test with 2 degrees of freedom
  104. 104. B. Weaver: Introduction to Multiple Linear Regression 104 Output from second TEST example • When FHT60 and FAGE25 are enclosed in one set of parentheses, I get one subset test with df = 2 • Because there are only 2 variables in the model, that subset test is identical to the overall F-test for the model
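The subset F-tests that TEST reports can be sketched as a comparison of R2 for the full model against R2 for the model with one grouping removed. The Python below is an illustration under stated assumptions: the data are synthetic stand-ins (not the FEV1 data), and `r_squared` and `subset_f` are hypothetical helper names. It uses the standard formula F = ((R2 full - R2 reduced) / q) / ((1 - R2 full) / (n - p - 1)), where q is the number of variables in the grouping and p is the number of predictors in the full model.

```python
import numpy as np

def r_squared(X, y):
    """R-squared from an OLS fit of y on X (an intercept column is added)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def subset_f(X_full, keep_cols, y):
    """F statistic for removing every column of X_full not listed in keep_cols."""
    n, p = X_full.shape
    q = p - len(keep_cols)                      # numerator degrees of freedom
    r2_full = r_squared(X_full, y)
    # With nothing kept, the reduced model is intercept-only, so R-squared is 0.
    r2_reduced = r_squared(X_full[:, keep_cols], y) if keep_cols else 0.0
    f = ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / (n - p - 1))
    return f, q, n - p - 1

# Synthetic stand-ins for height, age and FEV1 (assumed values, not real data).
rng = np.random.default_rng(0)
n = 150
height = rng.normal(size=n)
age = 0.3 * height + rng.normal(size=n)
fev1 = 0.5 * height - 0.3 * age + rng.normal(size=n)
X = np.column_stack([height, age])

f_height, q1, df_resid = subset_f(X, [1], fev1)  # analogous to TEST (fht60)
f_both, q2, _ = subset_f(X, [], fev1)            # analogous to TEST (fht60 fage25)

# Overall F-test for the two-predictor model, computed directly from R-squared.
r2 = r_squared(X, fev1)
f_overall = (r2 / 2) / ((1 - r2) / (n - 3))
print(f"F for height alone: {f_height:.3f} on ({q1}, {df_resid}) df")
print(f"F for both: {f_both:.3f}; overall F: {f_overall:.3f}")
```

When both variables sit in one grouping, the subset F matches the overall F for the model, as the slide above notes.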
  105. 105. B. Weaver: Introduction to Multiple Linear Regression 105 Unique and Redundant Variance
  106. 106. B. Weaver: Introduction to Multiple Linear Regression 106 Comparing R2 values • FEV1 = b0 + b1×Height + E: R2 = 0.254 • FEV1 = b0 + b1×Age + E: R2 = 0.096 • Sum of those two R2 values = 0.350 • FEV1 = b0 + b1×Height + b2×Age + E: R2 = 0.334 • Why is R2 for the two-predictor model less than 0.350?
  107. 107. B. Weaver: Introduction to Multiple Linear Regression 107 Unique & Redundant Variance • Rectangle represents the total variance in Y • Left circle represents variance in Y that is accounted for by X1 • Right circle represents variance in Y that is accounted for by X2 Total Variance of Y (FEV1) = 1.000 Height (X1) Age (X2) Notice the overlap! Shared or redundant variance Non-overlapping bits represent unique variance
  108. 108. B. Weaver: Introduction to Multiple Linear Regression 108 Unique & Redundant Variance • A = area outside of the two circles • A = variance in Y that is not explained by the linear combination of X1 and X2 • A = 1 – R2 Y.12 = 1 - .334 = .666 Total Variance of Y (FEV1) = 1.000 Height (X1) Age (X2) A = .666
  109. 109. B. Weaver: Introduction to Multiple Linear Regression 109 Unique & Redundant Variance • B = variance in Y that is uniquely accounted for by X1 • C = variance in Y that is shared by X1 and X2 (redundant variance) • D = variance in Y that is uniquely accounted for by X2 • Diagram labels: Total Variance of Y (FEV1) = 1.000; circles for Height (X1) and Age (X2); areas A = .666, B, C, D
  110. 110. B. Weaver: Introduction to Multiple Linear Regression 110 Unique & Redundant Variance • B+C+D = R2 Y.12 = .334 • B+C = r2 Y.1 = .254 • C+D = r2 Y.2 = .096 • (B+C)+(C+D) = .350 • C = (B+C+C+D) – (B+C+D) = .350 - .334 = .016 • In general, C = (r2 Y.1 + r2 Y.2) – R2 Y.12 • Diagram labels: Total Variance of Y (FEV1) = 1.000; circles for Height (X1) and Age (X2); areas A = .666, B, C = .016, D
  111. 111. B. Weaver: Introduction to Multiple Linear Regression 111 Unique & Redundant Variance • B+C = r2 Y.1 = .254 • B = .254 – .016 = .238 • C+D = r2 Y.2 = .096 • D = .096 – .016 = .080 • Diagram labels: Total Variance of Y (FEV1) = 1.000; circles for Height (X1) and Age (X2); areas A = .666, B = .238, C = .016, D = .080
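The area arithmetic on the last few slides can be reproduced in a few lines of Python. The three R2 values are the ones reported on the slides; the variable names are only illustrative.

```python
# R-squared values reported on the slides.
r2_height = 0.254   # FEV1 on Height alone (B + C)
r2_age = 0.096      # FEV1 on Age alone (C + D)
r2_both = 0.334     # FEV1 on Height and Age (B + C + D)

a = 1 - r2_both                      # unexplained variance
c = (r2_height + r2_age) - r2_both   # shared (redundant) variance
b = r2_height - c                    # unique to Height
d = r2_age - c                       # unique to Age

print(round(a, 3), round(b, 3), round(c, 3), round(d, 3))  # 0.666 0.238 0.016 0.08
```

Note that `b` can equally be obtained as `r2_both - r2_age`, and `d` as `r2_both - r2_height`, which is why each unique area is also a change in R2.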
  112. 112. B. Weaver: Introduction to Multiple Linear Regression 112 How to Interpret Area B • B = the change in R2 when X1 is added to a model containing X2 • B = the change in R2 when X1 is removed from a model containing both X1 and X2 • B = the squared semi-partial correlation between X1 and Y with X2 partialled out of X1 • Diagram labels: Total Variance of Y (FEV1) = 1.000; circles for Height (X1) and Age (X2); areas A = .666, B = .238, C = .016, D = .080; callout on the Height circle: partialling X2 out of X1
  113. 113. B. Weaver: Introduction to Multiple Linear Regression 113 How to Interpret Area D • D = the change in R2 when X2 is added to a model containing X1 • D = the change in R2 when X2 is removed from a model containing both X1 and X2 • D = the squared semi-partial correlation between X2 and Y with X1 partialled out of X2 • Diagram labels: Total Variance of Y (FEV1) = 1.000; circles for Height (X1) and Age (X2); areas A = .666, B = .238, C = .016, D = .080; callout on the Age circle: partialling X1 out of X2
  114. 114. B. Weaver: Introduction to Multiple Linear Regression 114 Squared Partial Correlation between X1 and Y with X2 partialled out of both • To partial X2 out of both X1 and Y, cut out the entire X2 circle • Now total area = (A+B) = .666 + .238 = .904 • r2 Y1.2 = .238 / .904 = .263 • SQRT(.263) = .513 = the partial correlation computed by SPSS • Diagram labels: Total Variance of Y (FEV1) = 1.000; circles for Height (X1) and Age (X2); areas A = .666, B = .238, C = .016, D = .080; callout on the Age circle: cut it right out
  115. 115. B. Weaver: Introduction to Multiple Linear Regression 115 Squared Partial Correlation between X2 and Y with X1 partialled out of both • To partial X1 out of both X2 and Y, cut out the entire X1 circle • Now total area = (A+D) = .666 + .080 = .746 • r2 Y2.1 = .080 / .746 = .107 • SQRT(.107) = .327 • SPSS computes rY2.1 = -.326: the square root recovers only the magnitude, so the sign has to be taken from the SPSS output, and the small difference in the third decimal is presumably due to rounding of the areas • Diagram labels: Total Variance of Y (FEV1) = 1.000; circles for Height (X1) and Age (X2); areas A = .666, B = .238, C = .016, D = .080; callout on the Height circle: cut it right out
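The two "cut it right out" calculations follow the same pattern, which can be sketched as a small Python function. The areas are the rounded values from the slides; `squared_partial` is a hypothetical helper name.

```python
import math

# Rounded areas from the Venn diagram on the slides.
a, b, d = 0.666, 0.238, 0.080

def squared_partial(unique_area, residual_area):
    """Unique area as a share of what remains after the other circle is cut out."""
    return unique_area / (unique_area + residual_area)

r2_height = squared_partial(b, a)   # B / (A + B)
r2_age = squared_partial(d, a)      # D / (A + D)

print(round(r2_height, 3), round(math.sqrt(r2_height), 3))  # 0.263 0.513
print(round(r2_age, 3), round(math.sqrt(r2_age), 3))        # 0.107 0.327
```

The square roots give only the magnitude of each partial correlation; the sign comes from the direction of the relationship in the fitted model.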
  116. 116. B. Weaver: Introduction to Multiple Linear Regression 116 Summary of the Figure • A = 1 - R2 Y.12 = the residual variation in Y • B+C+D = R2 Y.12 = variation fitted by the model • B+C = r2 Y1 = r2 between Height and FEV1 • C+D = r2 Y2 = r2 between Age and FEV1 • B = r2 Y(1.2) = the square of the semi-partial correlation between Height and FEV1 with Age partialled out of Height • D = r2 Y(2.1) = the square of the semi-partial correlation between Age and FEV1 with Height partialled out of Age • B/(A+B) = r2 Y1.2 = the squared partial correlation between Height and FEV1 with Age partialled out of both Height and FEV1 • D/(A+D) = r2 Y2.1 = the squared partial correlation between Age and FEV1 with Height partialled out of both Age and FEV1
  117. 117. B. Weaver: Introduction to Multiple Linear Regression 117 Revisiting the squared multiple correlation • R2 Y.123...p = r2 Y1 + r2 Y(2.1) + r2 Y(3.12) + ...+ r2 Y(p.123...p-1) • r2 Y1 = square of the simple correlation between Y and X1 • r2 Y(2.1) = the squared semi-partial correlation between Y and X2, with X1 partialled out of X2 • r2 Y(3.12) = the squared semi-partial correlation between Y and X3, with both X1 and X2 partialled out of X3 • And so on…
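The decomposition on the slide above can be checked numerically. The sketch below uses synthetic data with three correlated predictors (an assumption for illustration; none of it comes from the FEV1 data), and the helpers `fit`, `r_squared` and `sq_semipartial` are hypothetical names. Each squared semi-partial correlation is computed by first removing the earlier predictors from the new predictor and then correlating what is left with Y.

```python
import numpy as np

def fit(X, y):
    """Fitted values from an OLS regression of y on X (intercept added)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef

def r_squared(X, y):
    resid = y - fit(X, y)
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def sq_semipartial(x_new, X_prev, y):
    """Squared correlation between y and the part of x_new not explained by X_prev."""
    x_resid = x_new - fit(X_prev, x_new)
    return np.corrcoef(x_resid, y)[0, 1] ** 2

# Synthetic, correlated predictors (assumed values for illustration).
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
x3 = 0.4 * x1 - 0.3 * x2 + rng.normal(size=n)
y = 1 + 0.6 * x1 - 0.4 * x2 + 0.2 * x3 + rng.normal(size=n)

terms = [
    np.corrcoef(x1, y)[0, 1] ** 2,                       # r2 Y1
    sq_semipartial(x2, x1[:, None], y),                  # r2 Y(2.1)
    sq_semipartial(x3, np.column_stack([x1, x2]), y),    # r2 Y(3.12)
]
r2_full = r_squared(np.column_stack([x1, x2, x3]), y)
print(f"sum of terms = {sum(terms):.6f}, R2 for the full model = {r2_full:.6f}")
```

The individual terms depend on the order in which the predictors are entered, but their sum does not.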
  118. 118. B. Weaver: Introduction to Multiple Linear Regression 118 Mutually independent predictors • If all predictors are mutually independent: No overlapping circles in the Venn diagram No effects to partial out, so… R2 Y.123...p = r2 Y1 + r2 Y2 + r2 Y3 + ... + r2 Yp • The squared multiple correlation = the sum of the squared simple correlations with Y
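The special case of mutually uncorrelated predictors can also be illustrated with a short Python sketch. The predictors here are constructed to be exactly uncorrelated in the sample by centring random columns and orthogonalising them with a QR decomposition; this construction and all of the numbers are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 3

# Centre random columns, then orthogonalise: the columns of Q have mean zero
# and are mutually orthogonal, so their sample correlations are zero.
Z = rng.normal(size=(n, p))
Z -= Z.mean(axis=0)
Q, _ = np.linalg.qr(Z)

y = 2 + Q @ np.array([3.0, -2.0, 1.0]) + 0.3 * rng.normal(size=n)

# Sum of the squared simple correlations between each predictor and y.
sum_r2 = sum(np.corrcoef(Q[:, j], y)[0, 1] ** 2 for j in range(p))

# R-squared from the model containing all three predictors.
design = np.column_stack([np.ones(n), Q])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ coef
r2_full = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

print(f"sum of squared simple correlations = {sum_r2:.6f}, R2 = {r2_full:.6f}")
```

With real data the predictors are rarely uncorrelated, so the sum of the squared simple correlations will usually differ from R2, as in the Height and Age example.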
  119. 119. B. Weaver: Introduction to Multiple Linear Regression 119 Adding a Third Variable to the Model
  120. 120. B. Weaver: Introduction to Multiple Linear Regression 120 Adding Father’s Weight to the Model • To illustrate and reinforce some of the concepts we’ve covered, let’s add Father’s Weight to the model • To ensure that the constant is interpretable, let’s centre Weight on a possible in-range value • Range is 121 to 245 lbs, so let’s centre on 125 lbs
  121. 121. B. Weaver: Introduction to Multiple Linear Regression 121 The Syntax * Range is 121 - 245, so centre on 125 . compute fwt125 = fweight - 125 . var lab fwt125 "Father's weight - 125 lbs" . REGRESSION /STATISTICS COEFF R ANOVA CHANGE /DEPENDENT ffev1 /METHOD=ENTER fht60 fage25 /METHOD=ENTER fwt125 . Height & Age entered first, Weight added on the second step.
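The effect of centring can be sketched in Python. This is an illustration on synthetic weights drawn from the 121 to 245 lb range mentioned above, not the actual data; `ols` is a hypothetical helper and the true coefficients are made up.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients (intercept first) for y regressed on the columns of X."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Synthetic weights within the range on the slide (assumed, for illustration).
rng = np.random.default_rng(3)
n = 150
weight = rng.uniform(121, 245, size=n)
y = 5.0 - 0.01 * weight + rng.normal(scale=0.5, size=n)

raw = ols(weight, y)             # constant = fitted Y at weight = 0 lbs
centred = ols(weight - 125, y)   # constant = fitted Y at weight = 125 lbs

print(f"slope (raw) = {raw[1]:.5f}, slope (centred) = {centred[1]:.5f}")
print(f"constant (raw) = {raw[0]:.3f}, constant (centred) = {centred[0]:.3f}")
```

The slope is unchanged by centring; only the constant moves, from the fitted value at an impossible weight of 0 lbs to the fitted value at 125 lbs.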
  122. 122. B. Weaver: Introduction to Multiple Linear Regression 122 The Model Summary • R2 changes from .334 (step 1) to .356 (step 2) • F-test on the change in R2 is statistically significant, p = .025 • Hard to say if the change in R2 (.023) is large enough to be practically important – knowledge of the research area is needed to answer that question
  123. 123. B. Weaver: Introduction to Multiple Linear Regression 123 The ANOVA Summary Table
  124. 124. B. Weaver: Introduction to Multiple Linear Regression 124 The Regression Coefficients • Does controlling for Weight have any effect on the coefficients for the other variables in the model? • Compare the t-test for Weight to the F-test for the change in R2 from Step 1 to Step 2
  125. 125. B. Weaver: Introduction to Multiple Linear Regression 125 The Model Summary • F(1, 146) = 5.122, p = .025 • On the previous slide, t(146 df) = -2.264, p = .025 • What is the relationship between t and F?
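For a test with one numerator degree of freedom, F equals t squared, and the values on the slide are close to that: (-2.264)² ≈ 5.126, against a reported F of 5.122. The Python sketch below checks the identity on synthetic data with three predictors and n = 150, so that the residual df is 146 as on the slide; everything else about the data is an assumption, and `fit` is a hypothetical helper.

```python
import numpy as np

def fit(X, y):
    """Return the design matrix, coefficients and residual sum of squares."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return design, coef, resid @ resid

# Synthetic data (assumed values for illustration).
rng = np.random.default_rng(4)
n = 150
X = rng.normal(size=(n, 3))
X[:, 2] += 0.5 * X[:, 0]            # make the third predictor correlated with the first
y = 1 + X @ np.array([0.5, -0.3, -0.2]) + rng.normal(size=n)

design, coef, sse_full = fit(X, y)          # all three predictors
_, _, sse_reduced = fit(X[:, :2], y)        # third predictor left out
df_resid = n - design.shape[1]              # 150 - 4 = 146

# F-test for the change when the third predictor is added (1 numerator df).
f_change = (sse_reduced - sse_full) / (sse_full / df_resid)

# t-test for the third predictor's coefficient in the full model.
mse = sse_full / df_resid
cov = mse * np.linalg.inv(design.T @ design)
t_last = coef[-1] / np.sqrt(cov[-1, -1])

print(f"t = {t_last:.4f}, t squared = {t_last ** 2:.4f}, F change = {f_change:.4f}")
```

The p-values agree for the same reason: the two tests are the same test expressed on different scales.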
  126. 126. B. Weaver: Introduction to Multiple Linear Regression 126 Interpreting the Coefficients From a Model with Two Explanatory Variables Y = b0 + b1X1 + b2X2 + E • Y = the outcome (or dependent or criterion) variable • X1 = first predictor variable • X2 = second predictor variable • b0 = the constant: the fitted value of Y when both predictor variables = 0 • b1 = regression coefficient for X1: change in the fitted value of Y for a one-unit increase in X1 while controlling for X2 • b2 = regression coefficient for X2: change in the fitted value of Y for a one-unit increase in X2 while controlling for X1 • E = error in prediction, or residual
  127. 127. B. Weaver: Introduction to Multiple Linear Regression 127 Interpreting the Coefficients From a Model with Three or More Explanatory Variables Y = b0 + b1X1 + b2X2 + ... + bpXp + E • Y = the outcome (or dependent or criterion) variable • X1 = first predictor variable • X2 = second predictor variable, etc • p = number of predictor variables • b0 = the constant: the fitted value of Y when all predictor variables = 0 • b1 = regression coefficient for X1 • b2 = regression coefficient for X2, etc • Each regression coefficient = change in the fitted value of Y for a one-unit increase in that X-variable while controlling for all other X-variables • E = error in prediction
  128. 128. B. Weaver: Introduction to Multiple Linear Regression 128 Hierarchical Regression using TEST • With the ENTER method, variables added on step 1 do not need to be repeated on step 2 • With the TEST method, every TEST sub-command must list all of the variables in the model at that point REGRESSION /STATISTICS COEFF R ANOVA CHANGE /DEPENDENT ffev1 /METHOD=TEST (fht60) (fage25) /METHOD=TEST (fht60) (fage25) (fwt125) . Age and Height must appear on both TEST sub-commands
  129. 129. B. Weaver: Introduction to Multiple Linear Regression 129 ANOVA Summary from TEST Method
  130. 130. B. Weaver: Introduction to Multiple Linear Regression 130 Semi-partial correlation again • R Square Change for FHT60 = squared semi-partial correlation between HEIGHT and FFEV1 with AGE and WEIGHT partialled out of HEIGHT • R Square Change for FAGE25 = squared semi-partial correlation between AGE and FFEV1 with HEIGHT and WEIGHT partialled out of AGE • R Square Change for FWT125 = squared semi-partial correlation between WEIGHT and FFEV1 with HEIGHT and AGE partialled out of WEIGHT
  131. 131. B. Weaver: Introduction to Multiple Linear Regression 131 Confirmation • Squared part correlations (Part squared): .245, .078, .023 • Apart from some rounding error, the squares of the semi-partial (or part) correlations shown here match the R-square Change values from the previous slide.
  132. 132. B. Weaver: Introduction to Multiple Linear Regression 132 Model Assumptions
  133. 133. B. Weaver: Introduction to Multiple Linear Regression 133 Model Assumptions • The assumptions (or restrictions) for OLS multiple linear regression are the same as for OLS simple linear regression • The errors are assumed to be independently and identically distributed, and to be normally distributed with mean = 0 and variance = σ2 • The conventional notation for that is i.i.d. N(0, σ2), where i.i.d. = independently & identically distributed, and N(0, σ2) = normally distributed with mean = 0 and variance = σ2
  134. 134. B. Weaver: Introduction to Multiple Linear Regression 134 Details not repeated here • Please review the relevant section of the simple linear regression chapter to get the details • Don’t forget the distinction between errors and residuals • A couple of options for residual plots:  Each explanatory variable in turn on the X-axis (i.e., one residual plot per explanatory variable)  A single residual plot with the fitted value of Y as the X-axis variable
  135. 135. B. Weaver: Introduction to Multiple Linear Regression 135 Residual Plots for our 3-predictor Model • Earlier, we ran a model with Height, Age, and Weight as predictors of FEV1 • The following slides show residual plots for each predictor variable in turn, plus a 4th plot with the fitted value of FEV1 plotted as the X-axis variable
  136. 136. B. Weaver: Introduction to Multiple Linear Regression 136 Residual Plot 1: X = Height, Y = Residual
  137. 137. B. Weaver: Introduction to Multiple Linear Regression 137 Residual Plot 2: X = Age, Y = Residual
  138. 138. B. Weaver: Introduction to Multiple Linear Regression 138 Residual Plot 3: X = Weight, Y = Residual
  139. 139. B. Weaver: Introduction to Multiple Linear Regression 139 Residual Plot 4: X = Fitted Value of FEV1, Y = Residual
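The data behind four residual plots like these could be assembled along the following lines. This is a sketch on synthetic stand-ins for Height, Age and Weight (assumed values, not the real data set); the plotting itself is left out, and each entry of `plot_data` simply holds the X-axis values and the residuals for one plot.

```python
import numpy as np

# Synthetic stand-ins for the three predictors (assumed, for illustration).
rng = np.random.default_rng(5)
n = 150
predictors = {
    "height": rng.normal(10, 3, size=n),
    "age": rng.normal(15, 7, size=n),
    "weight": rng.normal(55, 25, size=n),
}
X = np.column_stack(list(predictors.values()))
y = 3.0 + 0.10 * X[:, 0] - 0.03 * X[:, 1] - 0.005 * X[:, 2] + rng.normal(scale=0.5, size=n)

# Fit the three-predictor model and compute fitted values and residuals.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
fitted = design @ coef
resid = y - fitted

# One (x, residual) pair per plot: three predictor plots plus the fitted-value plot.
plot_data = {name: (x, resid) for name, x in predictors.items()}
plot_data["fitted"] = (fitted, resid)

for name, (x, r) in plot_data.items():
    print(name, "correlation with residuals:", round(float(np.corrcoef(x, r)[0, 1]), 6))
```

OLS residuals are uncorrelated with every predictor and with the fitted values by construction, so a residual plot is useful for spotting curvature or changing spread rather than a linear trend.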
  140. 140. B. Weaver: Introduction to Multiple Linear Regression 140 Regression Diagnostics
  141. 141. B. Weaver: Introduction to Multiple Linear Regression 141 Overlap with Simple Linear Regression • The Regression Diagnostics section from the notes on simple linear regression also applies to multiple linear regression • However, when there are two or more explanatory variables, multicollinearity is an additional potential problem • It is discussed in a separate (brief) chapter (but not in this course, unfortunately!) • See also Jerry Dallal’s nice note on it
  142. 142. B. Weaver: Introduction to Multiple Linear Regression 142 Linear in the Coefficients
  143. 143. B. Weaver: Introduction to Multiple Linear Regression 143 “Linear in the coefficients” • Linear regression is often described as linear in the coefficients • This means that OLS linear regression can be used to model non-linear (curvilinear) functional relationships, as long as the model is a weighted sum of terms with one coefficient per term • E.g., to model a quadratic (curvilinear) relationship between X and Y: Y = b0 + b1X + b2X^2
  144. 144. B. Weaver: Introduction to Multiple Linear Regression 144 Example of a Quadratic Relationship: The Yerkes-Dodson Law • Figure: curve plotted against Level of Arousal on the X-axis • Models like this are discussed in another chapter (but sadly, not in this course)
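An inverted-U relationship of this general shape can be fitted with ordinary least squares by adding X squared as a second column of the design matrix. The Python sketch below uses synthetic data with made-up coefficients; it is not a fit to any Yerkes-Dodson data.

```python
import numpy as np

# Synthetic inverted-U data (assumed coefficients: 1.0, 2.0, -0.2).
rng = np.random.default_rng(6)
x = np.linspace(0, 10, 101)
y = 1.0 + 2.0 * x - 0.2 * x ** 2 + rng.normal(scale=0.1, size=x.size)

# The model is curved in X but linear in b0, b1 and b2,
# so it is an ordinary multiple regression on the columns 1, X and X squared.
design = np.column_stack([np.ones_like(x), x, x ** 2])
b, *_ = np.linalg.lstsq(design, y, rcond=None)

# For a quadratic, the fitted curve peaks where the slope b1 + 2*b2*X is zero.
peak = -b[1] / (2 * b[2])
print(f"b0 = {b[0]:.3f}, b1 = {b[1]:.3f}, b2 = {b[2]:.3f}, peak at X = {peak:.2f}")
```

A negative estimate for `b2` is what gives the curve its inverted-U shape.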
  145. 145. B. Weaver: Introduction to Multiple Linear Regression 145 The End (Yes, really. No riveting Appendix this time.)

