This document provides a tutorial on conducting and interpreting a multiple linear regression analysis in SPSS. It contains two sections - the first outlines the steps to specify a regression analysis in SPSS using sample data. The second section interprets example SPSS output, including descriptive statistics, bivariate correlations, model summary, ANOVA table, and coefficients output. It also provides a guide for writing up the results in APA style.
Multiple Linear Regression Tutorial
RSCH-8250 Advanced Quantitative Reasoning
Assignment and Tutorial Introduction
This tutorial is intended to assist RSCH-8250 students in completing the Week 6 application assignment. I recommend that you use this tutorial as your first line of instruction; then, if you have time, study the textbook chapter and other resources noted in the classroom.
3rd edition of Field textbook: Chapter 7 in the Field textbook, Smart Alex's Task #1 on p. 262, using the Supermodel.sav SPSS dataset.
4th edition of Field textbook: Chapter 8 in the Field textbook, Smart Alex's Task #4 on p. 355, using the Supermodel.sav SPSS dataset.
The objective of the exercise is to conduct and interpret a standard multiple regression, including assessment of multicollinearity.
The tutorial contains two sections. Section 1 provides step-by-step graphic user interface (GUI) screenshots for specifying the assignment in SPSS. If you follow the steps you will produce correct SPSS output. Section 2 presents and interprets output for a different set of variables and includes a results write-up guide and sample APA style tables (the variables and data in Section 2 are “made up” and do not reflect real research).
Section 1: SPSS Specification of the Assignment
The assignment asks you to regress the per day salary of models
(SALARY) on model’s age (AGE), number of
years having worked as a model (YEARS), and a rating of the
model’s attractiveness (BEAUTY). The
capitalized words are the respective variable names in the
Supermodel.sav SPSS dataset.
Open the dataset. The Variable View screenshot is shown below. There are four variables in the dataset, corresponding to the four variables described in the previous paragraph.

Click the boxes so that checkmarks appear for each of the elements as shown at left.
For the purposes of this assignment, there is no need to
examine the dialogue boxes for Plots, Save, Options, or
Bootstrap. Even though Field discusses some regression
diagnostics, these are (except for multicollinearity) beyond
the level of this course.
So, once you have specified the statistics at left, click the Continue button, which will return you to the Linear Regression dialogue; clicking the OK button will run the analysis and produce adequate output for the assignment. Example output is shown and interpreted in the next section.
Section 2: Annotated Example SPSS Output, Write Up Guide,
and Sample APA Tables
The example output shown below uses variables different from the Week 6 assignment. The purpose is to explain key elements of the output, point out what to focus on, and demonstrate how to interpret and report the results in APA statistical style.
The criterion (aka dependent variable, what we are trying to predict) is overall grade point average (GPA) of 9th grade students. The predictors (aka independent variables) are intelligence quotient (IQ), grade earned in an English course (ENGG), and a measure of attention deficit (ADDSC).
Descriptive Statistics
As shown in the descriptive statistics output (from the
DESCRIPTIVES procedure in SPSS), data had been
collected on 216 individuals. The minimum, maximum, mean,
and standard deviation of each variable are
provided. Reporting on the operationalization of each variable
and the observed values in the sample give the
reader insight into the variable being analyzed. For example: “Attention deficit was measured on a scale of 0 to 100, with higher scores indicating more pronounced attention deficit symptomatology. In the sample of 9th grade students …”

Bivariate Correlations
The correlations output shows the bivariate correlations and one-tailed p values of each pair of variables. There was a statistically significant inverse relationship between GPA and attention deficit score, r(214) = -.542, p < .001 one-tailed, indicating that
as attention deficit increased, GPA tended to
decrease. You can similarly report the other two bivariate
correlations with GPA, but keep in mind that these
are just descriptive because the focus is on the multiple
regression results. As FYI, the 214 in the parenthesis in
the example above is the df value. For correlations, the df value
is N – 2. The table below indicates that N = 216,
so df = 216 – 2 = 214.
Correlations

                            GPA     ENGG    ADDSC     IQ
Pearson Correlation  GPA    1.000    .746   -.542    .446
                     ENGG    .746   1.000   -.445    .283
                     ADDSC  -.542   -.445   1.000   -.629
                     IQ      .446    .283   -.629   1.000
Sig. (1-tailed)      GPA     .       .000    .000    .000
                     ENGG    .000    .       .000    .000
                     ADDSC   .000    .000    .       .000
                     IQ      .000    .000    .000    .
N                    GPA     216     216     216     216
                     ENGG    216     216     216     216
                     ADDSC   216     216     216     216
                     IQ      216     216     216     216
Regression Method
The output below simply informs us that all three variables were entered simultaneously, which is what had to happen because we had specified the “Enter” method. In the results write-up you just need to identify the method used, such as: “The purpose of the standard regression analysis was to examine the combined and relative effects of 9th grade students’ IQ, English grade, and attention deficit score in predicting overall GPA.”
The term standard regression means that all predictors were
entered simultaneously. Two other common
methods are statistical regression (aka stepwise regression) in
which variables enter according to level of
significance, and sequential regression (aka hierarchical
regression) in which the analyst decides and specifies
the order of entry of each variable.
Variables Entered/Removed (a)

Model   Variables Entered     Variables Removed   Method
1       IQ, ENGG, ADDSC (b)   .                   Enter

a. Dependent Variable: GPA
b. All requested variables entered.
Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .793 (a)  .629       .624                .51832

Change Statistics: R Square Change = .629, F Change = 119.836, df1 = 3, df2 = 212, Sig. F Change = .000

a. Predictors: (Constant), IQ, ENGG, ADDSC
ANOVA F Test of the Omnibus Regression
The ANOVA output provides the test of statistical significance of the regression. In this example, the combined effect of student’s IQ, English grade, and attention deficit score statistically significantly predicted overall GPA, F(3, 212) = 119.84, p < .001, R² = .63. The output shows .000 in the Sig column, but probability cannot be zero, so in such cases report p < .001 (which is APA style); do not report p = .000. To be clear, ignore Dr. Morrow’s reporting of p = .000 in her videos and, instead, follow APA style.
FYI for the inquisitive:
The regression sum of squares of 96.585 is the explained variance in GPA. The residual sum of squares of 56.955 is the variance in GPA that was not explained by the three predictors. The sum of these is the total sum of squares. If you divide the regression sum of squares by the total sum of squares you get the proportion of variance explained, which is R² = 96.585 ÷ 153.540 = .629. R, which in this example is .793, is the correlation between predicted GPA and actual GPA. That is, if you saved the predicted GPA scores from the regression analysis and then correlated those predicted scores with the original GPA scores we were predicting, the correlation would be .793. For multiple regression, an R value of .14 is considered a small effect, .36 a medium effect, and .51 a large effect; these correspond to R² values of .02, .13, and .26, respectively.
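The R² arithmetic above can be verified in a few lines of Python, using the sums of squares reported in the ANOVA output:

```python
import math

# Sums of squares reported in the ANOVA output
ss_regression = 96.585
ss_residual = 56.955
ss_total = ss_regression + ss_residual  # 153.540

# Proportion of variance explained: regression SS divided by total SS
r_squared = ss_regression / ss_total    # about .629

# R, the multiple correlation between predicted and actual GPA
r = math.sqrt(r_squared)                # about .793
```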
ANOVA (a)

Model          Sum of Squares    df    Mean Square      F       Sig.
1  Regression       96.585        3      32.195      119.836    .000
   Residual         56.955      212        .269
   Total           153.540      215

a. Dependent Variable: GPA
The zero-order correlation is the simple correlation between each predictor and the criterion. For example, the simple correlation between English grade and overall GPA is shown as .746, which is the same as was shown in the previous correlations output. The part correlation (aka semipartial correlation, which is the more common term in the literature, and is the term I will use hereafter) indexes the unique relationship between the predictor and criterion that none of the other predictors explains. That is, when the predictors are correlated, some of the variance in the criterion is explained by more than one predictor; the semipartial correlation filters out any shared explanation by predictors, leaving only each predictor’s unique contribution.
If predictors are correlated, the semipartial will always be smaller than the simple zero-order correlation (if predictors are uncorrelated, a rare event, the zero-order and semipartial will be equal). In this example, the semipartial correlation between English grade and overall GPA is .563, much less than the simple correlation of .746. The semipartial squared (sr²), a commonly reported effect size, is the proportion of variance in the criterion uniquely accounted for by the predictor; so, English grade uniquely accounted for 31.7% of the variance in overall GPA.
The interpretation of the partial correlation is not as
straightforward as the semipartial correlation. In addition to the
unique variance accounted
for, the partial correlation attributes to each predictor its
relative proportion of explained variance in the criterion that is
shared with other
predictors. When predictors are correlated, the partial
correlation will always be higher than the semipartial
correlation.
Relative Importance of Predictors. If interested in rank ordering the relative importance of each predictor, this is best done by using sr² or the absolute value of the semipartial correlations (Tabachnick & Fidell, 2007). In this example, the order of variable importance is English grade (sr² = .317), IQ (sr² = .017), then attention deficit score (sr² = .013).
Coefficients (a)

Model          B      Std. Error   Beta       t      Sig.   95% CI for B (Lower, Upper)   Zero-order   Partial    Part   Tolerance     VIF
1  (Constant)  .477     .580                  .823   .412        (-.666, 1.620)
   ENGG        .584     .043        .628    13.451   .000        ( .498,  .669)              .746        .679     .563      .802      1.247
   ADDSC      -.013     .005       -.156    -2.704   .007        (-.022, -.003)             -.542       -.183    -.113      .527      1.897

a. Dependent Variable: GPA
The unstandardized regression equation takes the form Y’ = B0 + B1X1 + B2X2 + B3X3, where Y’ is the predicted value of the criterion (aka dependent variable), B0 is the constant, and the numbered Bs and Xs represent each predictor. Contextualized to this example, the unstandardized equation is:
Overall GPA’ = .477 + .584(English grade) - .013(attention deficit score) + .011(IQ)
The equation can be used to predict overall GPA for specific
values of each predictor. For example, if a student had an
English grade of 2.7, an
attention deficit score of 65, and an IQ of 105, predicted overall
GPA would be: .477 + .584(2.7) - .013(65) + .011(105) = 2.36.
The standardized coefficient, β (pronounced beta), indexes the change in the criterion, in standard deviation units, for a 1-standard-deviation change in the predictor. For a 1-standard-deviation increase in attention deficit score, overall GPA is predicted to decrease by .156 standard deviations.
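The relationship between B and β can be written as a one-line function. The formula itself is standard (β = B × SD of predictor ÷ SD of criterion), but the SD values in the usage comment are illustrative only; the tutorial's output does not report them:

```python
def standardized_beta(b, sd_predictor, sd_criterion):
    # beta = B scaled by the ratio of predictor SD to criterion SD
    return b * sd_predictor / sd_criterion

# Illustrative numbers only (not from the SPSS output):
# an unstandardized B of 2.0 with SDs of 1.0 and 4.0 gives beta = 0.5
```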
Predictor Significance Tests. A t test is used to determine the statistical significance of each predictor. Technically, the t test determines if the B coefficient is different from 0. The t value is equal to the B coefficient divided by its standard error (SE). For English grade, t = .584 ÷ .043 = 13.58, which is within rounding error of the t value shown in the output (for the computation to be exact, the B and SE values would need to be known to several more decimal places than shown in the output). The t value is evaluated at the error degrees of freedom (df) value, which is N − k − 1, where k is the number of predictor variables. For IQ one might report: “While holding the effects of the other predictors constant, the effect of IQ on predicting overall GPA was statistically significant, t(212) = 3.16, p = .002, sr² = .017, uniquely accounting for 1.7% of the variance in overall GPA. For each 1-point increase in IQ, overall GPA was expected to increase .011 points (95% CI from .004 to .019).”
Similar statements should be made for the other predictors.
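The t and df computations can be checked in a few lines (values for the English grade predictor, from the Coefficients output):

```python
# t = B / SE for the English grade predictor
b, se = .584, .043
t = b / se          # about 13.58; the output shows 13.451 because the
                    # displayed B and SE values are rounded

# Error df for the t test: N - k - 1
N, k = 216, 3       # 216 cases, 3 predictors
df_error = N - k - 1  # 212, as in t(212)
```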
Collinearity. Multicollinearity exists if one predictor is highly predicted by the set of other predictors, which can be the case when it is highly correlated with just one of the other predictors in the set. In the last two columns of the Coefficients output (see previous page) the tolerance and variance inflation factor (VIF) values can be examined to assess multicollinearity. Tolerance values less than .1 and VIF values greater than 10 indicate a potential multicollinearity problem.
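Tolerance and VIF are reciprocals of one another, so either can be computed from the other. A quick check against the Coefficients output (small differences reflect rounding in the displayed tolerances):

```python
# Tolerance values from the Coefficients output
tolerances = {"ENGG": .802, "ADDSC": .527}

# VIF = 1 / tolerance
vif = {name: round(1 / tol, 3) for name, tol in tolerances.items()}

# Common rule of thumb: tolerance < .10 (equivalently VIF > 10)
# flags a potential multicollinearity problem
flagged = [name for name, tol in tolerances.items() if tol < .10]
```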
scale from 0 (F) to 4 (A)”, or “Satisfaction was measured on a 5-point Likert-type scale from 1 (not at all satisfied) to 5 (extremely satisfied).” Please pay attention to APA style for reporting scale anchors (see p. 91 and p. 105 in the 6th edition of the APA Manual).
Report descriptive statistics such as minimum, maximum, mean,
and standard deviation for each metric variable. For nominal
variables, report
percentage for each level of the variable, for example: “Of the
total sample (N = 150) there were 40 (26.7%) males and 110
(73.3%) females.”
Keep in mind that a sentence that includes information in
parentheticals must still be a sentence (and make sense) if the
parentheticals are
removed. For example, the one above without parentheticals is
still a sentence and makes sense: “Of the total sample there
were 40 males and
110 females.”
State the purpose of the analysis or provide the guiding research question(s). If you use research questions, do not craft them such that they can be answered with a yes or no. Instead, craft them so that they will have a quantitative answer. For example: “What is the strength and direction of relationship between X and Y?” or “What is the difference in group means on X between males and females?”
Present null and alternative hypothesis sets applicable to the
analysis. For regression there would be a hypothesis set for the
overall result (i.e.,
the combined effect of the predictors) and a hypothesis set for
each predictor while “controlling for” or “holding constant” the
effects of the
other predictors.
State assumptions or other considerations for the analysis, and
report the actual statistical result for relevant tests. For this
course, the only
regression consideration that needs to be presented and
discussed is for multicollinearity. Even if violated, you must
still report and interpret
the remaining results.
Report and interpret the overall regression results. Report and
interpret the results of each predictor. Be sure to include the
actual statistical
results in text—examples were provided within the annotated
output section of this tutorial. Don’t forget to interpret the