1.
B. Weaver: Introduction to Multiple Linear Regression 1
Introduction to
Multiple Linear Regression
Bruce Weaver
Northern Ontario School of Medicine,
Lakehead University
Thunder Bay, Ontario
2.
Simple vs. Multiple Linear Regression
• Simple Linear Regression
One predictor variable
Fits a straight line through a 2-D cloud
of points
• Multiple Linear Regression
Two or more predictor variables
Can fit a variety of different regression surfaces through
2-D, 3-D or even multi-dimensional clouds of points
• Some examples are given on the following slides
3.
Example 1: Fitting a sheet of plywood
through a 3-dimensional cloud of points
[Figure: 3-D scatterplot with axes X1, X2 and Y]
The best-fitting
“sheet of plywood”
Statisticians would call
it a regression plane,
or more generally, a
regression surface
4.
Example 2: Fitting straight lines
for two (or more) groups
Fit line for
Group 1
Fit line for
Group 2
5.
Example 3: Fitting curvilinear relationships
• Multiple linear regression can fit curvilinear relationships for
1 or more groups
6.
Example 4: Fitting a curved sheet of
plywood through a 3D cloud of points
In this case, the
sheet of plywood is
curved along both
X1 and X2 axes
7.
Example 5: Plywood with a Twist
One might ask why
models like this are
not called Chubby
Checker models.
8.
The KISS Principle
• Because this is an introductory
chapter, we will keep things
(relatively) simple by restricting
ourselves to models with:
Continuous predictor variables
First-order effects only
• Example 1 (shown earlier) is
one such model
9.
A Note on Terminology
• In standard statistical terminology, univariate and
multivariate describe the number of dependent variables,
not the number of predictor variables
• The terms used to describe the number of predictor
variables are univariable and multivariable
• People often describe one-predictor regression models as univariate,
and multi-predictor regression models as multivariate
– but this is not correct usage
10.
Partial and Semi-Partial Correlation
11.
Partial & Semi-partial Correlation
• Before tackling multiple regression per se, we will look at
partial and semi-partial correlation
• They can be computed when you have 3 or more variables
• They are related to multiple regression, and helpful for
understanding some of the concepts
• The following example uses Canadian
occupational prestige data from John Fox’s
2008 book on applied regression
12.
Example using occupational prestige data
Pearson r = .850
Years of Education and
Job Prestige are probably
both related to Income
Partial and semi-partial
correlation allow us to
ask questions like:
What is the correlation
between Education and
Job Prestige if we remove
the effect of Income?
13.
Confirmation that both Education and
Prestige are Related to Income
14.
Partial versus Semi-partial
• Partial correlation removes, or partials out, the
effect of the third variable from both of the other
variables
• Semi-partial correlation (also known as part correlation)
partials out the effect of the third variable from only
one of the two variables
Further evidence of a plot
to confuse students?
15.
Examples of Partial Correlation
• Partial correlation removes, or partials out, the effect of the
third variable from both of the other variables
What is the correlation between Education and Prestige if we partial
out the effect of Income from both of them?
(Let’s take this one and see how it works conceptually.)
What is the correlation between Income and Prestige if we partial
out the effect of Education from both of them?
16.
Conceptual Approach to Partial Correlation
• We want the correlation between Education and Prestige
with the effect of Income partialled out of both of them
• In other words, we want the correlation between:
Variability in Education that is not explained by Income, and
Variability in Prestige that is not explained by Income
We do?
• Even though you may not yet know it, you
have the tools you need to understand what
that means conceptually
17.
Something you might remember from
the chapter on simple linear regression
$$(Y - \bar{Y}) = (Y' - \bar{Y}) + (Y - Y')$$
• $(Y - \bar{Y})$: raw score minus the mean of all Y-scores
• $(Y' - \bar{Y})$: predicted score minus the mean of all Y-scores. The part that IS accounted for by the relationship between X and Y
• $(Y - Y')$: raw score minus predicted score. Residual: NOT accounted for by the relationship between X and Y
$$SS_{Regression} = \sum (Y' - \bar{Y})^2 \qquad SS_{Residual} = \sum (Y - Y')^2$$
18.
Variability in Education that is Not
Explained by Income
• Let Y = Years of Education
• Let X = Income
• Perform simple linear regression
• Save the residuals – call them EEduc.Inc
• The variance of these residuals is variance in Education
that is not explained by Income
19.
Variability in Job Prestige that is
Not Explained by Income
• Let Y = Job Prestige
• Let X = Income
• Perform simple linear regression
• Save the residuals – call them EPrest.Inc
• The variance of these residuals is variance in Job Prestige
that is not explained by Income
20.
Partial Correlation as Simple Correlation
• EEduc.Inc = variability in Education not explained by Income
• EPrest.Inc = variability in Prestige not explained by Income
• Compute Pearson r between EEduc.Inc and EPrest.Inc
Conceptually, the partial correlation is a simple
(Pearson) correlation between two sets of residuals.
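The residual-based idea can be sketched outside SPSS as well. The following Python example is only an illustration: the helper functions `pearson_r` and `residuals` are hypothetical names, and the three small data lists are made-up toy values, not the Canadian occupational prestige data.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residuals(y, x):
    """Residuals from the simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical toy data (NOT the prestige data used in the slides).
income = [12, 15, 20, 22, 30, 35, 41, 50]
education = [9, 10, 12, 11, 14, 13, 16, 15]
prestige = [30, 38, 45, 40, 60, 52, 70, 66]

# Step 1: variability in Education not explained by Income.
e_educ_inc = residuals(education, income)
# Step 2: variability in Prestige not explained by Income.
e_prest_inc = residuals(prestige, income)
# Step 3: the partial correlation is the Pearson r between the two residual sets.
partial_r = pearson_r(e_educ_inc, e_prest_inc)
print(round(partial_r, 3))
```

The three steps mirror the slides: two simple regressions on Income, then an ordinary correlation between the saved residuals.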
21.
Okay then…let’s do it
* Let Y = Education, X = Income.
* Save the residuals.
REGRESSION
/DEPENDENT education
/METHOD=ENTER incomek
/SAVE= resid (e_educ.inc) .
* Let Y = Prestige, X = Income.
* Save the residuals.
REGRESSION
/DEPENDENT prestige
/METHOD=ENTER incomek
/SAVE= resid (e_prest.inc) .
22.
Now Correlate EEduc.Inc with EPrest.Inc
• The partial correlation between Education and Job Prestige
with Income partialled out of both of them = .766
CORRELATE e_educ.inc WITH e_prest.inc .
23.
Verifying the Result with PARTIAL CORR
In the SPSS GUI: Analyze > Correlate > Partial
Compute the (partial)
correlation between
these variables
While controlling for
these variables
Exit dialog via
PASTE!
24.
The PARTIAL CORR syntax
PARTIAL CORR prestige education BY incomek .
PARTIAL CORR prestige incomek BY education .
• Variables after the key word BY are partialled out
• First command: Partial correlation between PRESTIGE
and EDUCATION with INCOMEK partialled out of both
• Second command: Partial correlation between PRESTIGE
and INCOMEK with EDUCATION partialled out of both
25.
The SPSS Output (1)
• The partial correlation between Prestige and Education with
Income partialled out of both = .766
• The same value we obtained by the conceptual method
Partial Corr
26.
The SPSS Output (2)
• The partial correlation between Prestige and Income with
Education partialled out of both = .521
Partial Corr
27.
Recall from earlier that…
• Partial correlation removes, or partials out, the
effect of a third variable from both of the other
variables
• Semi-partial (or part) correlation partials out the
effect of a third variable from only one of the two
variables
28.
Examples of Semi-partial Correlation
• The correlation between Prestige and Income with the
effect of Education partialled out of Income only
• The correlation between Prestige and Education with the
effect of Income partialled out of Education only
(Let’s take this one and see how it works conceptually.)
29.
Conceptual Approach to Semi-partial Correlation
• We want the correlation between Prestige and
Education with the effect of Income partialled out
of Education only
• In other words, we want the correlation between:
Job Prestige, and
Variability in Education that is not explained by Income
We already have the EEduc.Inc residuals that give us this!
30.
Recall from earlier…
• Let Y = Years of Education
• Let X = Income
• Perform simple linear regression
• Save the residuals – call them EEduc.Inc
• The variance of these residuals is variance in Education
that is not explained by Income
31.
Semi-partial correlation with
Income partialled out of Education
The semi-partial correlation between Education and
Prestige with Income partialled out of Education
=
The simple Pearson correlation between
EEduc.Inc and Prestige
CORRELATE e_educ.inc with prestige .
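The same conceptual steps can be sketched in Python. As before, the helper names and the toy data are hypothetical and are not taken from the prestige data file; the only difference from the partial-correlation case is that Prestige is left as raw scores and only Education is residualized on Income.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def residuals(y, x):
    """Residuals from the simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Hypothetical toy data (NOT the prestige data used in the slides).
income = [12, 15, 20, 22, 30, 35, 41, 50]
education = [9, 10, 12, 11, 14, 13, 16, 15]
prestige = [30, 38, 45, 40, 60, 52, 70, 66]

# Income is partialled out of Education only; Prestige stays as raw scores.
e_educ_inc = residuals(education, income)
semi_partial_r = pearson_r(prestige, e_educ_inc)
print(round(semi_partial_r, 3))
```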
32.
Computing Partial & Semi-Partial
Correlations via the REGRESSION Command
• As we saw earlier, SPSS has a PARTIAL CORR command
• There is no comparable command for computing semi-partial
correlations
• However, both partial and semi-partial correlations can be
computed via the REGRESSION command
• The key is to add the ZPP option to the /STATISTICS
subcommand
33.
The Syntax
* Get partial & semi-partial correlations via REGRESSION .
REGRESSION
/STATISTICS ZPP
/DEPENDENT prestige
/METHOD=ENTER education incomek .
• Partial correlations: X2 is partialled out of X1 and out of Y
• Semi-partial: X2 is partialled out of X1, but not out of Y
• The Y-variable cannot be partialled out of anything
ZPP = Zero-order, Partial,
and Part correlations
34.
The Output
Ordinary Pearson
correlations with Job
Prestige Score
Partial correlations
Semi-partial (or part) correlations with X1 partialled out
of X2 or vice versa – but neither partialled out of Y
35.
A Formula for Partial Correlation
• Let Y = Job Prestige Score
• Let X1 = Years of Education
• Let X2 = Income
$$r_{Y1.2} = \frac{r_{Y1} - r_{Y2}\, r_{12}}{\sqrt{(1 - r_{Y2}^2)(1 - r_{12}^2)}}$$
You don’t need to memorize this formula – I’m just
showing it to you because you might run into it in future.
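For readers who want to try the formula, here is a minimal Python sketch. The function name `partial_r` is hypothetical, and the three input correlations are invented for illustration; they are not values from the prestige example.

```python
from math import sqrt

def partial_r(r_y1, r_y2, r_12):
    """First-order partial correlation r_Y1.2:
    X2 is partialled out of both Y and X1."""
    return (r_y1 - r_y2 * r_12) / sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))

# Hypothetical zero-order correlations, chosen for illustration only.
print(round(partial_r(r_y1=0.60, r_y2=0.50, r_12=0.40), 3))
```

If X2 is uncorrelated with both Y and X1, the function simply returns the zero-order correlation between Y and X1, as one would expect.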
36.
A Formula for Semi-Partial Correlation
• Let Y = university mean grade
• Let X1 = annual income
• Let X2 = IQ
$$r_{Y(1.2)} = \frac{r_{Y1} - r_{Y2}\, r_{12}}{\sqrt{1 - r_{12}^2}}$$
You don’t need to memorize this formula – I’m just
showing it to you because you might run into it in future.
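A small Python sketch of this formula follows. The function name `semi_partial_r` and the input correlations are hypothetical and are used only to show the arithmetic.

```python
from math import sqrt

def semi_partial_r(r_y1, r_y2, r_12):
    """First-order semi-partial (part) correlation r_Y(1.2):
    X2 is partialled out of X1 only, not out of Y."""
    return (r_y1 - r_y2 * r_12) / sqrt(1 - r_12 ** 2)

# Hypothetical zero-order correlations, chosen for illustration only.
print(round(semi_partial_r(r_y1=0.60, r_y2=0.50, r_12=0.40), 3))
```

The numerator is the same as in the partial correlation formula; only the denominator differs, because nothing is removed from Y.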
37.
First-order vs. higher order partial correlation
• What we have considered thus far is first-order
partial and semi-partial correlation
• I.e., we partialled out the effect of one variable
• In higher order partial and semi-partial correlation,
we partial out the effects of two or more variables
E.g., in second-order partial or semi-partial correlation,
one partials out the effects of two other variables
38.
Example: Second-order partial correlation
• Here is a formula for the second-order partial correlation
between Y and X1, with X2 and X3 partialled out of both Y
and X1
$$r_{Y1.23} = \frac{r_{Y1.2} - r_{Y3.2}\, r_{13.2}}{\sqrt{(1 - r_{Y3.2}^2)(1 - r_{13.2}^2)}}$$
You don’t need to memorize this formula – I’m just
showing it to you because you might run into it in future.
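The formula has the same shape as the first-order one, with first-order partial correlations in place of zero-order correlations. The Python sketch below builds it up that way; the function names and the six input correlations are hypothetical.

```python
from math import sqrt

def partial_r(r_ab, r_ac, r_bc):
    """First-order partial correlation between a and b with c partialled out."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def second_order_partial_r(r_y1, r_y2, r_y3, r_12, r_13, r_23):
    """Second-order partial correlation r_Y1.23."""
    r_y1_2 = partial_r(r_y1, r_y2, r_12)  # Y with X1, X2 partialled out
    r_y3_2 = partial_r(r_y3, r_y2, r_23)  # Y with X3, X2 partialled out
    r_13_2 = partial_r(r_13, r_12, r_23)  # X1 with X3, X2 partialled out
    # Same formula again, applied to the first-order partials.
    return partial_r(r_y1_2, r_y3_2, r_13_2)

# Hypothetical zero-order correlations, chosen for illustration only.
result = second_order_partial_r(r_y1=0.60, r_y2=0.50, r_y3=0.40,
                                r_12=0.40, r_13=0.30, r_23=0.20)
print(round(result, 3))
```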
39.
Back to Multiple Linear Regression
40.
The Multiple Regression Equation
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_p X_p + E$$
• $b_0$ = Constant
• $b_1, b_2, \dots, b_p$ = Regression Coefficients
• $X_1, X_2, \dots, X_p$ = Explanatory Variables (aka predictor variables)
• $E$ = Residual
• The terms $b_2 X_2 + \dots + b_p X_p$ are the part that distinguishes multiple
linear regression from simple linear regression
41.
Least Squares Criterion
• The least squares criterion still applies
• The sum of squared residuals is a minimum
$$\sum (Y - Y')^2 = \text{a minimum}$$
Residuals are defined just as they
are in simple linear regression
42.
Standardized Regression Equation (1)
• When all variables are first converted to standard scores
(i.e., z-scores), we get the standardized regression equation
$$Z'_Y = \beta_1 Z_1 + \beta_2 Z_2 + \dots + \beta_p Z_p$$
• The Greek letter beta is used to represent coefficients
instead of the Roman letter B
• β0 is not shown because it = 0
Why does β0 = 0?
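One way to answer that question, sketched here under the standard least-squares result that the fitted regression surface passes through the means of all the variables (a result the slides do not derive):

```latex
% Raw-score constant: the fitted surface passes through the point of means
b_0 = \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2 - \dots - b_p \bar{X}_p

% After conversion to z-scores, every variable has a mean of 0, so
\beta_0 = \bar{Z}_Y - \beta_1 \bar{Z}_1 - \dots - \beta_p \bar{Z}_p
        = 0 - 0 - \dots - 0 = 0
```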
43.
Standardized Regression Equation (2)
• Coefficients from the standardized regression equation
are sometimes called beta-weights
• Also called standardized regression coefficients, or
standard partial regression coefficients.
Standard because
regression is performed
on z-scores
Partial because the
effects of other predictor
variables are partialled
out (or controlled for)
44.
Standardized Regression Equation (3)
• In other words, β1 is short for:
β1.2,3,…p
• Where p = the number of predictor variables in the
multiple regression model
• The effects of X2, X3, …Xp are controlled for, or
partialled out
45.
Raw Regression Coefficients
• For the raw-score regression equation, b1 is short for:
b1.2,3,…p
• As in the standardized regression equation, the
effects of X2, X3,…Xp are controlled for, or
partialled out
46.
The Multiple Correlation Coefficient
• Pearson r = simple correlation between two variables
• The multiple correlation coefficient is R
• Conceptually, R = the simple correlation between Y and Y′
for a multiple regression model with 2 or more predictor
variables
RY.1,2...p = rYY′
47.
R is biased
• “People often assume that if there is no relation
between the criterion and the predictors, R should
come out near 0.”
• “In fact, the expected value of R for random data is
p / (N – 1).” (Howell, 2007, p. 506)
• E.g., if p = 5 predictors and N = 50 cases, the
expected value of R = 5 / 49 = .102, not 0
48.
R2 and Adjusted R2
• In simple linear regression r2 = the proportion of variance in
Y that is accounted for (or fit by) the linear relationship between
X and Y
• In multiple regression R2 = the proportion of variance in Y
that is accounted for by the linear combination of
predictor variables X1, X2, …Xp
• Adjusted R2 adjusts for the fact that R2 is biased high – how
the adjustment works is explained in the chapter on simple
linear regression
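For convenience, here is a short Python sketch of one commonly used form of the adjustment, 1 - (1 - R2)(N - 1)/(N - p - 1). The function name and the example values are hypothetical, and it is an assumption that this is the same form presented in the chapter on simple linear regression.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p predictors fitted to n cases.
    Uses the common form 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical values, for illustration only: R2 = .30, N = 50, p = 5.
print(round(adjusted_r2(r2=0.30, n=50, p=5), 3))
```

Because (n - 1)/(n - p - 1) is greater than 1 whenever p is at least 1, the adjusted value is never larger than R2, and the gap grows as predictors are added relative to the sample size.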
49.
An Example of Multiple Linear Regression
with Two Predictor Variables
50.
• The SPSS data file used for the next example (lung.sav) can
be downloaded here
• Each record (or row, or case) has data for one family
• The main variables we will look at are:
FAGE – Father’s age (in years)
FHEIGHT – Father’s height (in inches)
FFVC - Father’s FVC (i.e., forced vital capacity)
FFEV1 – Father’s FEV1 (i.e., forced expiratory volume in 1 second)
Data for the Next Example
These are common measures of lung function
51.
FVC and FEV1
• Two primary measures of lung function
• FVC – forced vital capacity; “the volume of air that can
forcibly be blown out after full inspiration, measured in litres.”
• FEV1 – Forced expiratory volume in 1 second; “the
maximum volume of air that [one] can forcibly blow out in the
first second during the FVC manoeuvre, measured in liters.”
Source: http://en.wikipedia.org/wiki/Spirometry#Forced_Vital_Capacity_.28FVC.29
52.
Multiple Linear Regression with
Two Predictor Variables
• The simplest form of multiple linear regression has two
(continuous) predictor variables
Y = b0 + b1X1 + b2X2 + E
• For simple linear regression, we had the best-fitting straight
line through a 2-D cloud of points (in a scatter-plot)
• For this model, imagine the best fitting
sheet of plywood (or plexiglass)
through a 3-D cloud of points
53.
Variables in the Model
• In our two-predictor model:
Y = Father’s FEV1
X1 = Father’s height in inches
X2 = Father’s age (in years)
• Some descriptive statistics follow on the next slides
54.
Descriptive Stats on Y, X1 and X2
Y = FEV1 X1 = Height X2 = Age
55.
Bivariate relationship between X1 and Y
FEV1 = -408.67 + 11.81 × Height + Error
56.
Bivariate relationship between X2 and Y
FEV1 = 526.637 – 2.923 × Age + Error
57.
The 2 Simple Linear Regression Models
X = Height
X = Age
What are the Pearson correlations
of Height and Age with FEV1?
58.
The Partial & Semi-partial Correlations
• Unlike what we saw in the earlier example (Prestige, Education, Income),
the partial correlations are further away from 0 than the zero-order
(Pearson) correlations
• That happens in this case because one of the simple correlations is
positive and the other is negative
Semi-partial
59.
A 3-dimensional scatter-plot
[Figure: 3-D scatterplot with axes X1, X2 and Y]
A 3-dimensional
cloud of points
60.
The best-fitting regression plane
[Figure: 3-D scatterplot with axes X1, X2 and Y, with the fitted plane]
Best fitting by the
least squares
criterion
The sum of the
squared errors in
prediction is a
minimum
The best-fitting
sheet of plywood
61.
Least Squares for Multiple Regression
• Errors in prediction are measured vertically (i.e.,
along the Y-axis), as in simple linear regression
• In a two-predictor model like this, a prediction error
(or fitting error) is the vertical distance of the actual
data point from the surface of the best-fitting sheet
of plywood
• As in simple linear regression, the sum of the
squared errors in prediction is minimized
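To make the least squares criterion concrete, here is a minimal Python sketch that fits a two-predictor model by solving the normal equations and then computes the sum of squared vertical errors. The function names and the small data set are hypothetical (the values are not from lung.sav), and solving the normal equations directly is just one simple way to obtain the least squares solution; it is not necessarily how SPSS computes it.

```python
def fit_two_predictor_model(x1, x2, y):
    """Return (b0, b1, b2) that minimise the sum of squared residuals."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * t for row, t in zip(rows, y)) for i in range(3)]
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, 3):
            factor = aug[k][col] / aug[col][col]
            aug[k] = [v - factor * p for v, p in zip(aug[k], aug[col])]
    # Back-substitution.
    coefs = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        known = sum(aug[i][j] * coefs[j] for j in range(i + 1, 3))
        coefs[i] = (aug[i][3] - known) / aug[i][i]
    return tuple(coefs)

def sum_squared_residuals(coefs, x1, x2, y):
    """Sum of squared vertical distances from the points to the plane."""
    b0, b1, b2 = coefs
    return sum((t - (b0 + b1 * a + b2 * b)) ** 2
               for a, b, t in zip(x1, x2, y))

# Hypothetical toy data (NOT the lung function data).
height = [62, 65, 68, 70, 72, 75, 66, 69]
age = [30, 45, 28, 50, 35, 40, 55, 33]
fev1 = [310, 300, 390, 340, 420, 430, 280, 385]

b0, b1, b2 = fit_two_predictor_model(height, age, fev1)
best = sum_squared_residuals((b0, b1, b2), height, age, fev1)
print(round(b0, 3), round(b1, 3), round(b2, 3), round(best, 3))
```

Nudging any of the three coefficients away from the fitted values and recomputing `sum_squared_residuals` gives a larger total, which is what "the sum of the squared errors in prediction is minimized" means in practice.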
62.
Fitting the model with SPSS
• In the GUI: Analyze > Regression > Linear
• The syntax:
REGRESSION
/STATISTICS COEFF R ANOVA
/DEPENDENT FFEV1
/METHOD=ENTER FHEIGHT FAGE .
63.
The Model Summary
The multiple correlation
of all predictors with Y
The squared multiple correlation; equal to the
proportion of variability in Y that is accounted
for by the linear combination of all predictors
Discussed in the notes on
simple linear regression
Root mean
square error
(RMSE)
64.
The ANOVA Table
$$SS_{Total} = \sum (Y - \bar{Y})^2$$
65.
The ANOVA Table
$$df_{Regression} = p \qquad df_{Residual} = n - p - 1 \qquad df_{Total} = n - 1$$
66.
The ANOVA Table
$$MS_{Regression} = \frac{SS_{Regression}}{df_{Regression}} \qquad MS_{Residual} = \frac{SS_{Residual}}{df_{Residual}}$$
$$RMSE = \sqrt{MS_{Residual}} = \sqrt{2859.954} = 53.479$$
67.
The ANOVA Table
$$F_{(p,\; n-p-1)} = \frac{MS_{Regression}}{MS_{Residual}}$$
The Sig. column
gives the p-value
for the F-test
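The arithmetic behind the ANOVA table can be sketched in a few lines of Python. The function name `anova_summary` and the sums of squares passed to it are hypothetical; only the final line uses a number from the slides (the residual mean square of 2859.954).

```python
from math import sqrt

def anova_summary(ss_regression, ss_residual, n, p):
    """Degrees of freedom, mean squares, F and RMSE for a regression
    model with p predictors fitted to n cases."""
    df_regression = p
    df_residual = n - p - 1
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "df_regression": df_regression,
        "df_residual": df_residual,
        "df_total": n - 1,
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "F": ms_regression / ms_residual,
        "rmse": sqrt(ms_residual),
    }

# Hypothetical sums of squares, for illustration only.
summary = anova_summary(ss_regression=200.0, ss_residual=300.0, n=33, p=2)
print(summary["F"], round(summary["rmse"], 3))

# RMSE from the residual mean square shown on the slides.
print(round(sqrt(2859.954), 3))  # 53.479
```

The p-value in the Sig. column comes from the F distribution with (p, n - p - 1) degrees of freedom; computing it needs a statistical library, so it is left out of this sketch.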
68.
The Regression Coefficients
FEV1 = -276.075 + 11.440 × Height – 2.664 × Age + error
b0 b1 b2
X1 X2
Y
69.
Interpreting the Regression Equation
Y = b0 + b1X1 + b2X2 + E
Y = the outcome (or dependent or criterion) variable
X1 = first predictor variable
X2 = second predictor variable
b0 = the constant: the fitted value of Y when
both predictor variables = 0
b1 = regression coefficient for X1: the change in the fitted
value of Y for a one-unit increase in X1 while
controlling for X2
b2 = regression coefficient for X2: the change in the fitted
value of Y for a one-unit increase in X2 while
controlling for X1
E = error in prediction, or residual
70.
Recap on b0, b1 and b2
• b0 = the fitted value of Y when X1 and X2 both equal 0
• b1 = the change in the fitted value of Y for a one-unit
increase in X1 while controlling for X2
• b2 = the change in the fitted value of Y for a one-unit
increase in X2 while controlling for X1
Controlling for the other variables in the model is
often described as holding them constant.
That description works for first order effects only models,
but will not work for some more complicated models.
71.
What do those coefficients mean?
FEV1 = -276.075 + 11.440 × Height – 2.664 × Age + error
• b0 = -276.075: the fitted value of FEV1 when both Height and
Age equal 0 (Height = 0: Impossible! Age = 0: Impossible!)
• b1 = 11.440: the change in the fitted value of FEV1 when
Height increases by one inch with Age held constant
• b2 = -2.664: the change in the fitted value of FEV1 when Age
increases by one year with Height held constant
72.
Can we do something to ensure that
the constant is a possible value of Y?
• Yes.
• We can centre the predictor variables on possible
values before running the model
• Centering a variable on some value just means
subtracting that value from all cases
• E.g., to centre Age on 50, just compute a new
variable that is equal to Age minus 50
73.
What value should we use for centering?
• Many authors recommend centering on the sample mean
• There is nothing technically wrong with mean-centering
• But the mean changes from sample to sample, which affects
comparability of constants from one study to the next
• One can centre on any possible value for the variable
• We shall centre Height and Age on sensible values near their
minima
74.
Centering the Variables
• From the descriptive stats we saw earlier:
Heights ranged from 61 to 76 inches – so centre on 60
Ages ranged from 26 to 59 years – so centre on 25
* Compute the centered variables.
COMPUTE FHT60 = fheight - 60. /* Min was 61 .
COMPUTE FAGE25 = fage - 25. /* Min was 26 .
EXE.
75.
Height vs. Height centered on 60 inches
When Height = 60,
Centered Height = 0
Setting centered Height
to 0 is equivalent to
setting Height to 60
76.
Age vs. Age centered on 25
When Age = 25,
Centered Age = 0
Setting centered Age
to 0 is equivalent to
setting Age to 25
77.
Run the Model Again using Centered Variables
REGRESSION
/STATISTICS COEFF OUTS CI(95) R ANOVA
/DEPENDENT FFEV1
/METHOD=ENTER FHT60 FAGE25 .
In place of the original
variable FHEIGHT
In place of the original
variable FAGE
78.
The Model Summary
From the original model
From the model with centered variables
Everything is identical.
79.
The ANOVA Summary Table
From the original model
From the model with centered variables
Everything is identical.
80.
The Regression Coefficients
From the original model
From the model with centered variables
The coefficients for Height and Age
are unaffected by the centering.
But the value of the constant changes—in the 2nd model,
b0 = the fitted value of Y when Height = 60 and Age = 25.
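The relationship between the two constants can be checked by hand. The Python sketch below uses the rounded coefficients from the original (uncentered) model, so the new constant it produces is approximate and may differ slightly in the last decimals from the SPSS output; the function names are hypothetical.

```python
# Rounded coefficients from the original model reported in the slides.
b0, b_height, b_age = -276.075, 11.440, -2.664

def fitted_fev1(height, age):
    """Fitted FEV1 from the original (uncentered) model."""
    return b0 + b_height * height + b_age * age

# After centering Height on 60 and Age on 25, the slopes are unchanged and
# the new constant is the fitted value at Height = 60, Age = 25.
b0_centered = fitted_fev1(60, 25)

def fitted_fev1_centered(fht60, fage25):
    """Fitted FEV1 from the model that uses the centered predictors."""
    return b0_centered + b_height * fht60 + b_age * fage25

print(round(b0_centered, 3))
# Same person, described two ways: Height 70 / Age 40 vs. FHT60 10 / FAGE25 15.
print(round(fitted_fev1(70, 40), 3), round(fitted_fev1_centered(10, 15), 3))
```

Both descriptions of the same person give the same fitted value, which is why everything except the constant is unaffected by the centering.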
81.
Summary on Centering of Variables
• Centering the explanatory variables on possible values (e.g.,
the mean, or a value near the minimum) results in a
constant term that represents a possible value of Y
• All other aspects of the output are unaffected, including:
R, R2, Adjusted R2, & the Standard Error of Estimate
The ANOVA results
The coefficients for the explanatory variables, their standard errors,
the t-tests on them, and their 95% confidence intervals*
* In models with first order effects only (e.g., no interactions or polynomial terms)
82.
Centering & Interactions
• If we had more time, we could
explore how centering of
predictor variables also
facilitates interpretation of
interactions (and polynomial
terms) in regression models
• Sadly, we do not have time to
explore that fascinating topic
• Those who crave more info are
referred to the excellent book
by Aiken & West (1991)
83.
Hierarchical Regression &
Semi-partial Correlation Revisited
84.
Hierarchical Regression
• Hierarchical regression is a very common and useful
technique for regression models of all types
• Rather than entering all of the predictor variables at once,
you enter some of them on the first step, and then add one
or more variables on the second step, and so on
• An F-test on the change in R2 from one step to the next can
be used to assess whether the fit of the model improved
significantly
Note that hierarchical regression is not the same thing as a hierarchical
linear model (HLM). HLM is another name for a multilevel model.
85.
F-test on the change in R2
• N = the number of subjects
• f = the number of predictors in the “fuller” of the 2 models
• r = the number of predictors in the “reduced” model
$$F_{(f-r,\; N-f-1)} = \frac{(N - f - 1)(R_f^2 - R_r^2)}{(f - r)(1 - R_f^2)}$$
H0: Variables added to the model do not improve the fit
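The formula is easy to compute directly. In the Python sketch below, the function name `f_change` and all of the example values (N, the numbers of predictors, and the two R2 values) are hypothetical and chosen only to show the arithmetic.

```python
def f_change(n, f, r, r2_full, r2_reduced):
    """F-test on the change in R-squared between a reduced model with
    r predictors and a fuller model with f predictors, fitted to n cases.
    Returns (F, df_numerator, df_denominator)."""
    df_num = f - r
    df_den = n - f - 1
    f_value = (df_den * (r2_full - r2_reduced)) / (df_num * (1 - r2_full))
    return f_value, df_num, df_den

# Hypothetical values: N = 100, one predictor added to a one-predictor model.
f_value, df_num, df_den = f_change(n=100, f=2, r=1,
                                   r2_full=0.30, r2_reduced=0.20)
print(round(f_value, 3), df_num, df_den)
```

If the added variables do not change R2 at all, the numerator is 0 and so is F, which matches the null hypothesis stated above.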
86.
Enter Height first, then Age
* Enter height, then age.
REGRESSION
/STATISTICS COEFF R ANOVA CHANGE
/DEPENDENT ffev1
/METHOD=ENTER fht60
/METHOD=ENTER fage25 .
Two ENTER subcommands
rather than one
Show change in R2
with its F-test
87.
Model Summary
With both variables entered simultaneously
With Height entered on Step 1, and Age on Step 2
The squared semi-partial correlation between
Age and FEV1 with Height partialled out of Age
88.
ANOVA Summary Table
With both variables entered simultaneously
89.
ANOVA Summary Table
With Height entered on Step 1, and Age on Step 2
Same ANOVA summary table
as Model 1 on the last slide
90.
The Regression Coefficients
With both variables entered simultaneously
91.
The Regression Coefficients
With Height entered on Step 1, and Age on Step 2
Same as Model 1
on the last slide
92.
Enter Age first, then Height
* Enter age, then height.
REGRESSION
/STATISTICS COEFF R ANOVA CHANGE
/DEPENDENT ffev1
/METHOD=ENTER fage25
/METHOD=ENTER fht60 .
Opposite order
compared to last time
93.
Model Summary
With Height entered on Step 1, and Age on Step 2
With Age entered on Step 1, and Height on Step 2
The squared semi-partial correlation between
Height and FEV1 with Age partialled out of Height
The squared semi-partial correlation between
Age and FEV1 with Height partialled out of Age
R2 Change for Model 2 = squared semi-partial correlation for the X-variable
added in Model 2, with the X-variable from Model 1 partialled out.
94.
ANOVA Summary Table
With Height entered on Step 1, and Age on Step 2
95.
ANOVA Summary Table
With Age entered on Step 1, and Height on Step 2
96.
The Regression Coefficients
With Height entered on Step 1, and Age on Step 2
A large change in the coefficient for FHT60 when FAGE25
is added would be an indication of confounding.
97.
The Regression Coefficients
With Age entered on Step 1, and Height on Step 2
A large change in the coefficient for FAGE25 when FHT60
is added would be an indication of confounding.
98.
Another way to obtain
the change in R2
99.
Model Summaries Again
With Height entered on Step 1, and Age on Step 2
With Age entered on Step 1, and Height on Step 2
When Age is added
to a model
containing Height
When Height is
added to a model
containing Age
100.
Another way to obtain those results
• For the SPSS REGRESSION command, the default method
for adding variables is ENTER
• This results in an ANOVA table that has just one overall
F-test for all of the predictor variables taken as a group
• Another method—which is not available through the GUI—is
the TEST method
• TEST allows the user to specify groupings of variables for
which to compute F-tests
101.
Running our 2-predictor model with TEST
* Two-predictor model with METHOD=TEST .
REGRESSION
/STATISTICS COEFF R ANOVA CHANGE
/DEPENDENT ffev1
/METHOD=TEST (fht60) (fage25) .
Compute an F-test
for variable FHT60
Compute an F-test
for variable FAGE25
102.
The ANOVA Summary Table with TEST
The usual ANOVA summary table, just like
we got when we used METHOD=ENTER.
Separate F-tests for the variable
groupings we specified.
The same R2-change and
F-tests we saw earlier in
the Model Summaries
Change in R2 when
that variable grouping
is removed from the
full model
103.
Just to clarify…
• If I put both variables in a single pair of brackets,
like this:
REGRESSION
/STATISTICS COEFF R ANOVA CHANGE
/DEPENDENT ffev1
/METHOD = TEST (fht60 fage25) .
• I will get one F-test with 2 degrees of freedom
104.
Output from second TEST example
• When FHT60 and FAGE25 are enclosed in one set of
parentheses, I get one subset test with df = 2
• Because there are only 2 variables in the model, that subset
test is identical to the overall F-test for the model
105.
Unique and Redundant Variance
106.
Comparing R2 values
Model                                   R2 value
FEV1 = b0 + b1×Height + E               0.254
FEV1 = b0 + b1×Age + E                  0.096
FEV1 = b0 + b1×Height + b2×Age + E      0.334
Sum of the R2 values for the two one-predictor models = 0.350
Why is R2 for the two-predictor
model less than 0.350?
107.
Unique & Redundant Variance
• Rectangle represents the
total variance in Y
• Left circle represents
variance in Y that is
accounted for by X1
• Right circle represents
variance in Y that is
accounted for by X2
[Figure: Venn diagram. The rectangle is the total variance of Y (FEV1) = 1.000; the two circles are Height (X1) and Age (X2). Notice the overlap: it is the shared or redundant variance. The non-overlapping bits represent unique variance.]
108.
Unique & Redundant Variance
• A = area outside of the
two circles
• A = variance in Y that is
not explained by the
linear combination of X1
and X2
• $A = 1 - R^2_{Y.12} = 1 - .334 = .666$
[Figure: Venn diagram with total variance of Y (FEV1) = 1.000, circles for Height (X1) and Age (X2), and the area outside both circles labelled A = .666.]
109.
Unique & Redundant Variance
• B = variance in Y that is
uniquely accounted for
by X1
• C = variance in Y that is
shared by X1 and X2
(redundant variance)
• D = variance in Y that is
uniquely accounted for
by X2
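[Figure: Venn diagram with total variance of Y (FEV1) = 1.000. A = .666 lies outside both circles; B is the part of the Height (X1) circle outside the overlap, C is the overlap, and D is the part of the Age (X2) circle outside the overlap.]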
110.
Unique & Redundant Variance
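• $B + C + D = R^2_{Y.12} = .334$
• $B + C = r^2_{Y.1} = .254$
• $C + D = r^2_{Y.2} = .096$
• $(B + C) + (C + D) = .350$
• $C = (B + C + C + D) - (B + C + D) = .350 - .334 = .016$
• In symbols: $C = (r^2_{Y.1} + r^2_{Y.2}) - R^2_{Y.12}$
[Figure: Venn diagram with total variance of Y (FEV1) = 1.000, circles for Height (X1) and Age (X2), and areas A = .666 and C = .016 filled in.]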
111.
Unique & Redundant Variance
• $B + C = r^2_{Y.1} = .254$
• $B = .254 - .016 = .238$
• $C + D = r^2_{Y.2} = .096$
• $D = .096 - .016 = .080$
[Figure: Venn diagram with total variance of Y (FEV1) = 1.000, circles for Height (X1) and Age (X2), and areas A = .666, B = .238, C = .016, D = .080.]
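The arithmetic for the four areas can be checked in a few lines. This is a minimal Python sketch using the rounded R2 values from these notes; the variable names are illustrative, not part of the original material.

```python
# Rounded R-squared values from the three models compared earlier.
r2_height = 0.254   # FEV1 on Height only  (B + C)
r2_age = 0.096      # FEV1 on Age only     (C + D)
r2_both = 0.334     # FEV1 on Height + Age (B + C + D)

area_a = 1.0 - r2_both                  # variance not explained by the model
area_c = (r2_height + r2_age) - r2_both # shared (redundant) variance
area_b = r2_height - area_c             # unique to Height
area_d = r2_age - area_c                # unique to Age

areas = {
    "A": round(area_a, 3),
    "B": round(area_b, 3),
    "C": round(area_c, 3),
    "D": round(area_d, 3),
}
print(areas)
```

The four areas add up to 1.000, the total variance of Y in the diagram.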
112.
How to Interpret Area B
• B = the change in R2
when X1 is added to a
model containing X2
• B = the change in R2
when X1 is removed
from a model containing
both X1 and X2
• B = the squared semi-
partial correlation
between X1 and Y with
X2 partialled out of X1
[Figure: Venn diagram with total variance of Y (FEV1) = 1.000, circles for Height (X1) and Age (X2), and areas A = .666, B = .238, C = .016, D = .080. Callout: partialling X2 out of X1.]
113.
How to Interpret Area D
• D = the change in R2
when X2 is added to a
model containing X1
• D = the change in R2
when X2 is removed
from a model containing
both X1 and X2
• D = the squared semi-
partial correlation
between X2 and Y with
X1 partialled out of X2
[Figure: Venn diagram with total variance of Y (FEV1) = 1.000, circles for Height (X1) and Age (X2), and areas A = .666, B = .238, C = .016, D = .080. Callout: partialling X1 out of X2.]
114.
Squared Partial Correlation between X1 and Y
with X2 partialled out of both
• To partial X2 out of both
X1 and Y, cut out the
entire X2 circle
• Now total area = (A+B)
= .666 + .238 = .904
• $r^2_{Y1.2} = .238 / .904 = .263$
• SQRT(.263) = .513 = the
partial correlation
computed by SPSS
[Figure: Venn diagram with areas A = .666, B = .238, C = .016, D = .080, and the Age (X2) circle cut right out.]
115.
Squared Partial Correlation between X2 and Y
with X1 partialled out of both
• To partial X1 out of both
X2 and Y, cut out the
entire X1 circle
• Now total area = (A+D)
= .666 + .080 = .746
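• $r^2_{Y2.1} = .080 / .746 = .107$
• SQRT(.107) = .327
• SPSS computes $r_{Y2.1} = -.326$: the square root of a squared correlation recovers only its magnitude, so the negative sign has to come from the SPSS output, and the difference of .001 is rounding error
[Figure: Venn diagram with areas A = .666, B = .238, C = .016, D = .080, and the Height (X1) circle cut right out.]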
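The two "cut it right out" calculations follow the same pattern, which the short Python sketch below makes explicit. The helper name `squared_partial` is hypothetical, and the areas are the rounded values from the Venn diagram, so the results are approximate.

```python
import math

# Rounded areas from the Venn diagram.
AREA_A = 0.666  # unexplained variance
AREA_B = 0.238  # unique to Height (X1)
AREA_D = 0.080  # unique to Age (X2)

def squared_partial(unique_area, unexplained_area):
    """Unique area as a share of what is left of Y once the other
    predictor's whole circle has been cut out."""
    return unique_area / (unique_area + unexplained_area)

pr2_height = squared_partial(AREA_B, AREA_A)  # B / (A + B)
pr2_age = squared_partial(AREA_D, AREA_A)     # D / (A + D)

# Square roots give the magnitudes of the partial correlations only.
print(round(pr2_height, 3), round(math.sqrt(pr2_height), 3))
print(round(pr2_age, 3), round(math.sqrt(pr2_age), 3))
```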
116.
Summary of the Figure
• $A = 1 - R^2_{Y.12}$ = the residual variation in Y
• $B + C + D = R^2_{Y.12}$ = variation fitted by the model
• $B + C = r^2_{Y1}$ = $r^2$ between Height and FEV1
• $C + D = r^2_{Y2}$ = $r^2$ between Age and FEV1
• $B = r^2_{Y(1.2)}$ = the square of the semi-partial correlation between Height and FEV1 with Age partialled out of Height
• $D = r^2_{Y(2.1)}$ = the square of the semi-partial correlation between Age and FEV1 with Height partialled out of Age
• $B/(A+B) = r^2_{Y1.2}$ = the squared partial correlation between Height and FEV1 with Age partialled out of both Height and FEV1
• $D/(A+D) = r^2_{Y2.1}$ = the squared partial correlation between Age and FEV1 with Height partialled out of both Age and FEV1
117.
Revisiting the squared multiple correlation
• $R^2_{Y.123 \ldots p} = r^2_{Y1} + r^2_{Y(2.1)} + r^2_{Y(3.12)} + \ldots + r^2_{Y(p.123 \ldots p-1)}$
• $r^2_{Y1}$ = square of the simple correlation between Y and X1
• $r^2_{Y(2.1)}$ = the squared semi-partial correlation between Y and X2, with X1 partialled out of X2
• $r^2_{Y(3.12)}$ = the squared semi-partial correlation between Y and X3, with both X1 and X2 partialled out of X3
• And so on…
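The decomposition can be checked numerically for the two-predictor case. The Python sketch below uses simulated data (not the FEV1 data), and all names and coefficient values in it are illustrative assumptions. It fits the two-predictor model by solving the normal equations on centred variables, then compares R2 with the sum of the squared simple correlation and the squared semi-partial correlation.

```python
import random

def centre(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def squared_corr(a, b):
    ca, cb = centre(a), centre(b)
    return dot(ca, cb) ** 2 / (dot(ca, ca) * dot(cb, cb))

# Simulated data with correlated predictors (hypothetical values).
random.seed(1)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
y = [1.0 + 0.8 * a - 0.4 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Two-predictor OLS via the normal equations on centred variables.
c1, c2, cy = centre(x1), centre(x2), centre(y)
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
fitted = [b1 * p + b2 * q for p, q in zip(c1, c2)]
r2_full = dot(fitted, fitted) / dot(cy, cy)  # SS regression / SS total

# Partial X1 out of X2 only, then correlate what is left with Y.
x2_resid = [q - (s12 / s11) * p for p, q in zip(c1, c2)]
r2_y1 = squared_corr(y, x1)
sr2_y2_given_1 = squared_corr(y, x2_resid)

print(round(r2_full, 6), round(r2_y1 + sr2_y2_given_1, 6))
```

The two printed numbers agree to within floating-point error, whichever predictor is entered first.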
118.
Mutually independent predictors
• If all predictors are mutually independent:
No overlapping circles in the Venn diagram
No effects to partial out, so…
$R^2_{Y.123 \ldots p} = r^2_{Y1} + r^2_{Y2} + r^2_{Y3} + \ldots + r^2_{Yp}$
• The squared multiple correlation = the sum of the
squared simple correlations with Y
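For two predictors, the effect of overlap can be seen from the standard formula that expresses R2 in terms of the three pairwise correlations. The Python sketch below is illustrative: the function name and the correlation values are hypothetical and are not taken from the FEV1 data.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation for a two-predictor model, computed
    from the simple correlations of Y with X1 and X2 (r_y1, r_y2) and
    the correlation between the predictors (r_12)."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

R_Y1, R_Y2 = 0.5, 0.3  # hypothetical simple correlations with Y
sum_of_squares = R_Y1 ** 2 + R_Y2 ** 2

# Uncorrelated predictors: nothing to partial out, R2 equals the sum.
r2_independent = r2_two_predictors(R_Y1, R_Y2, 0.0)

# Positively correlated predictors: some variance is shared, so R2
# falls short of the sum for these particular values.
r2_overlapping = r2_two_predictors(R_Y1, R_Y2, 0.4)

print(round(sum_of_squares, 3), round(r2_independent, 3), round(r2_overlapping, 3))
```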
119.
Adding a Third Variable to the Model
120.
Adding Father’s Weight to the Model
• To illustrate and reinforce some of the concepts we’ve
covered, let’s add Father’s Weight to the model
• To ensure that the constant is interpretable, let’s centre
Weight on a possible in-range value
• Range is 121 to 245 lbs, so let’s centre on 125 lbs
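To see what centring does and does not change, here is a minimal Python sketch with a single predictor. The weights and FEV1 values are made up for illustration (they only span the stated range of 121 to 245 lbs), and `fit_line` is a hypothetical helper. The same logic carries over to models with more predictors.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data, NOT the data set used in these notes.
weight_lbs = [121, 150, 175, 200, 245]
fev1 = [4.4, 4.1, 4.0, 3.7, 3.5]

b0_raw, b1_raw = fit_line(weight_lbs, fev1)
b0_centred, b1_centred = fit_line([w - 125 for w in weight_lbs], fev1)

# The slope is unchanged. The constant moves from the fitted value at
# 0 lbs (far outside the data) to the fitted value at 125 lbs.
print(round(b1_raw, 5), round(b1_centred, 5))
print(round(b0_raw, 3), round(b0_centred, 3))
```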
121.
The Syntax
* Range is 121 - 245, so centre on 125 .
compute fwt125 = fweight - 125 .
var lab fwt125 "Father's weight - 125 lbs" .
REGRESSION
/STATISTICS COEFF R ANOVA CHANGE
/DEPENDENT ffev1
/METHOD=ENTER fht60 fage25
/METHOD=ENTER fwt125 .
Height & Age entered
first, Weight added on
the second step.
122.
The Model Summary
• R2 changes from .334 (step 1) to .356 (step 2)
• F-test on the change in R2 is statistically significant, p = .025
• Hard to say if the change in R2 (.023) is large enough to be
practically important – knowledge of the research area is
needed to answer that question
123.
The ANOVA Summary Table
124.
The Regression Coefficients
• Does controlling for Weight have any
effect on the coefficients for the other
variables in the model?
Compare to F-test
for change in R2 from
Step 1 to Step 2
125.
The Model Summary
• F(1, 146) = 5.122, p = .025
• On the previous slide, t(146 df) = -2.264, p = .025
• What is the relationship between t and F?
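One way to explore the question is to square the t statistic. When a single variable is added on a step, the F-test on the change in R2 has 1 numerator degree of freedom and equals the square of the t-test for that variable's coefficient. The Python sketch below just does the arithmetic with the two reported values; the variable names are illustrative.

```python
# Values reported in these notes for adding Weight on step 2.
t_weight = -2.264   # t-test for the Weight coefficient, 146 df
f_change = 5.122    # F-test on the change in R2, df = (1, 146)

t_squared = t_weight ** 2
gap = abs(t_squared - f_change)

print(round(t_squared, 3), f_change, round(gap, 3))
```

The two values agree to about two decimal places; the small remaining gap presumably reflects rounding in the reported figures.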
126.
Interpreting the Coefficients From a
Model with Two Explanatory Variables
Y = b0 + b1X1 + b2X2 + E
Y = the outcome (or dependent or criterion) variable
X1 = first predictor variable
X2 = second predictor variable
b0 = the constant: the fitted value of Y when both predictor variables = 0
b1 = regression coefficient for X1: the change in the fitted value of Y for a one-unit increase in X1 while controlling for X2
b2 = regression coefficient for X2: the change in the fitted value of Y for a one-unit increase in X2 while controlling for X1
E = error in prediction, or residual
127.
Interpreting the Coefficients From a Model
with Three or More Explanatory Variables
Y = the outcome (or dependent or criterion) variable
X1 = first predictor variable
X2 = second predictor variable, etc
p = number of predictor variables
b0 = the constant
b1 = regression coefficient for X1
b2 = regression coefficient for X2, etc
E = error in prediction
$Y = b_0 + b_1 X_1 + b_2 X_2 + \ldots + b_p X_p + E$
The fitted value of Y when
all predictor variables = 0.
Change in the fitted
value of Y for one-unit
increase in that X-
variable while controlling
for all other X-variables
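These interpretations can be verified directly from the fitted equation. In the Python sketch below, the coefficients and predictor values are hypothetical (they are not the estimates from the FEV1 model), and `fitted_value` is an illustrative helper.

```python
def fitted_value(b0, coefs, xs):
    """Fitted Y for a first-order model: b0 + b1*X1 + ... + bp*Xp."""
    return b0 + sum(b * x for b, x in zip(coefs, xs))

# Hypothetical constant and coefficients for a three-predictor model.
b0 = 4.0
coefs = [0.10, -0.03, -0.005]

# The constant is the fitted value when every predictor equals 0.
at_zero = fitted_value(b0, coefs, [0, 0, 0])

# Increase X1 by one unit while holding X2 and X3 fixed.
before = fitted_value(b0, coefs, [8, 15, 50])
after = fitted_value(b0, coefs, [9, 15, 50])
change = after - before

print(at_zero, round(change, 6))
```

The change in the fitted value equals the coefficient for X1, whatever values the other predictors are held at.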
128.
Hierarchical Regression using TEST
• With the ENTER method, variables added on step 1 do not
need to be repeated on step 2
• With the TEST method, every TEST sub-command must list
all of the variables in the model at that point
REGRESSION
/STATISTICS COEFF R ANOVA CHANGE
/DEPENDENT ffev1
/METHOD=TEST (fht60) (fage25)
/METHOD=TEST (fht60) (fage25) (fwt125) .
Age and Height must
appear on both TEST
sub-commands
129.
ANOVA Summary from TEST Method
130.
Semi-partial correlation again
R Square Change for each variable grouping:
FHT60: squared semi-partial correlation between HEIGHT and FFEV1 with AGE and WEIGHT partialled out of HEIGHT
FAGE25: squared semi-partial correlation between AGE and FFEV1 with HEIGHT and WEIGHT partialled out of AGE
FWT125: squared semi-partial correlation between WEIGHT and FFEV1 with HEIGHT and AGE partialled out of WEIGHT
131.
Confirmation
Part2: .245, .078, .023
Apart from some rounding error, the squares of the
semi-partial (or part) correlations shown here match
the R-square Change values from the previous slide.
132.
Model Assumptions
133.
Model Assumptions
• The assumptions (or restrictions) for OLS multiple linear
regression are the same as for OLS simple linear regression
• The errors are assumed to be independently and
identically distributed, and to be normally distributed with
mean = 0 and variance = $\sigma^2$
• The conventional notation for that is as follows:
i.i.d. $N(0, \sigma^2)$
i.i.d. = independently & identically distributed
$N(0, \sigma^2)$ = normally distributed with mean = 0 and variance = $\sigma^2$
134.
Details not repeated here
• Please review the relevant section of the simple linear
regression chapter to get the details
• Don’t forget the distinction between errors and residuals
• A couple of options for residual plots:
Each explanatory variable in turn on the X-axis (i.e., one residual plot
per explanatory variable)
A single residual plot with the fitted value of Y as the X-axis variable
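The distinction between errors and residuals, and the data behind the two kinds of residual plot, can be illustrated with simulated data. The Python sketch below is a one-predictor example under stated assumptions: the true intercept, slope, and error standard deviation are made-up values, and the errors are drawn as i.i.d. normal with mean 0. It builds the point lists for the plots but does not draw them.

```python
import random

random.seed(42)
SIGMA = 1.0  # assumed standard deviation of the true errors
n = 100

x = [random.uniform(0, 10) for _ in range(n)]
errors = [random.gauss(0, SIGMA) for _ in range(n)]      # unobservable in practice
y = [2.0 + 0.5 * xi + ei for xi, ei in zip(x, errors)]   # hypothetical true model

# Ordinary least squares fit of y on x.
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # observable estimates of the errors

# Point lists for the two kinds of residual plot.
plot_vs_predictor = list(zip(x, residuals))
plot_vs_fitted = list(zip(fitted, residuals))

# Residuals average exactly zero by construction; the errors need not.
print(round(sum(errors) / n, 4), round(sum(residuals) / n, 4))
```

With several explanatory variables, one list like `plot_vs_predictor` would be built per variable, plus the single list against the fitted values.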
135.
Residual Plots for our 3-predictor Model
• Earlier, we ran a model with Height, Age, and
Weight as predictors of FEV1
• The following slides show residual plots for each
predictor variable in turn, plus a 4th plot with the
fitted value of FEV1 plotted as the X-axis variable
136.
Residual Plot 1: X = Height, Y = Residual
137.
Residual Plot 2: X = Age, Y = Residual
138.
Residual Plot 3: X = Weight, Y = Residual
139.
Residual Plot 4:
X = Fitted Value of FEV1, Y = Residual
140.
Regression Diagnostics
141.
Overlap with Simple Linear Regression
• The Regression Diagnostics section from the
notes on simple linear regression also applies to
multiple linear regression
• However, when there are two or more explanatory
variables, multicollinearity is an additional
potential problem
• It is discussed in a separate (brief) chapter (but not in this course, unfortunately!)
• See also Jerry Dallal’s nice note on it
142.
Linear in the Coefficients
143.
“Linear in the coefficients”
• Linear regression is often described as linear in the coefficients
• This means that OLS linear regression can be used to model
non-linear (curvilinear) functional relationships.
• E.g., to model a quadratic (curvilinear) relationship between
X and Y:
$Y = b_0 + b_1 X + b_2 X^2$
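To make "linear in the coefficients" concrete, the Python sketch below treats X and X squared as two ordinary predictors and fits them with the usual two-predictor least-squares equations. The data are generated without noise from made-up coefficients (1, 2, -0.5), and `fit_two_predictors` is a hypothetical helper, so this is an illustration rather than an analysis of real data.

```python
def fit_two_predictors(x1, x2, y):
    """OLS for Y = b0 + b1*X1 + b2*X2, via the normal equations
    on centred variables. Returns (b0, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * v for a, v in zip(c1, cy))
    s2y = sum(b * v for b, v in zip(c2, cy))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

# Noise-free quadratic data from hypothetical coefficients.
x = [0, 1, 2, 3, 4, 5, 6]
y = [1.0 + 2.0 * v - 0.5 * v ** 2 for v in x]

# X squared is simply entered as a second predictor.
x_squared = [v ** 2 for v in x]
b0, b1, b2 = fit_two_predictors(x, x_squared, y)

print(round(b0, 6), round(b1, 6), round(b2, 6))
```

The relationship between X and Y is curved, but the fitting problem is still linear because each coefficient multiplies a known column of numbers.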
144.
Example of a Quadratic Relationship
The Yerkes-Dodson Law
[Figure: curve illustrating the Yerkes-Dodson Law; X-axis: Level of Arousal]
Models like this are discussed in another chapter. But sadly, not in this course.
145.
The End
(Yes, really. No riveting Appendix this time.)