Multiple Regression Model and Equation
What we learned in SLR is also applicable in multiple regression.
The multiple regression model simply extends SLR to include
more than 1 independent variable.
Hence we augment our simple linear model to accommodate this:
y = β0 + β1x1 + β2x2 + ... + βpxp + ε
Additionally, since we still assume the expected value of ε to be
zero, we show the multiple regression equation as follows:
E(y) = β0 + β1x1 + β2x2 + ... + βpxp
Estimated Multiple Regression Equation
If β0, β1, ..., βp were known, the equation on the previous slide
could be used to compute the mean value of y at given values
of x1, x2, ..., xp.
But we don't know them, so we need estimates b0, b1, ..., bp.
Thus we arrive at the Estimated Regression Equation:
ŷ = b0 + b1x1 + b2x2 + ... + bpxp
Least Squares Method
To estimate our betas, the objective is the same as in SLR.
That is, we seek to minimize the squared differences between our actual
dependent variable (yi) and the prediction for that dependent
variable (ŷi).
Least Squares Criterion: min Σ(yi - ŷi)²
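In SLR, we had relatively easy formulas to obtain our estimates.
In multiple regression, this is not so easy, because it requires matrix algebra:
b = (X'X)⁻¹X'y
Here X holds the values of the independent variables (plus a column of 1s for
the intercept), y holds the observed values, and b holds b0, b1, ..., bp.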
So we rely on statistical computing packages to do this for us.
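As a rough sketch of what a computing package does behind the scenes, the NumPy snippet below solves the least squares problem for a tiny data set. The data values and the names x1, x2, y, X, and b are made up for illustration and do not come from the chapter.

```python
import numpy as np

# Hypothetical data: n = 6 observations, p = 2 independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([5.1, 5.9, 10.2, 10.8, 15.1, 16.0])

# Design matrix X: the leading column of ones produces the intercept b0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Solve the normal equations (X'X) b = X'y.
# Solving the system is numerically safer than forming the inverse of X'X.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values and the quantity that least squares minimizes.
y_hat = X @ b
sse = np.sum((y - y_hat) ** 2)

print("b0, b1, b2 =", b)
print("SSE =", sse)
```

In practice a package would typically use a more stable routine (for example np.linalg.lstsq) rather than the normal equations directly, but the estimates it reports are the same ones this formula describes.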
Interpretation of Coefficients (β) in Multiple Regression
Now that we have more than 1 independent variable, we must be aware of the consequences of
including several of them in the same model.
Notice from the example in Ch. 15 that a b1 estimate computed with 1 independent variable (SLR)
will generally NOT be the same when additional independent variables are added.
In SLR we interpreted b1 as an estimate of the change in y for a 1 unit change in x.
In multiple regression, bi is an estimate of the change in y for a 1 unit change in xi when all other
independent variables are held constant (at any fixed values, for example, when they are all 0).
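To make both points concrete, here is a small NumPy sketch using made-up data. The helper fit_ols and the data values are assumptions for illustration only. The data are built so that y = 1 + 2x1 + 3x2 exactly, with x1 and x2 correlated.

```python
import numpy as np

def fit_ols(X, y):
    """Return the least squares estimates for design matrix X (hypothetical helper)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

# Made-up data: x1 and x2 are correlated, and y is built with no error term.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 1.0 + 2.0 * x1 + 3.0 * x2

ones = np.ones_like(x1)
b_slr = fit_ols(np.column_stack([ones, x1]), y)        # y on x1 only
b_mult = fit_ols(np.column_stack([ones, x1, x2]), y)   # y on x1 and x2

print("b1 with x1 only:   ", b_slr[1])
print("b1 with x1 and x2: ", b_mult[1])

# Hold x2 constant at 4 (any fixed value works) and raise x1 from 2 to 3.
y_at_2 = b_mult @ np.array([1.0, 2.0, 4.0])
y_at_3 = b_mult @ np.array([1.0, 3.0, 4.0])
change = y_at_3 - y_at_2
print("Change in predicted y:", change)
```

The SLR slope absorbs part of the effect of the omitted x2, which is why it differs from the multiple regression b1. The predicted change for a 1 unit increase in x1 equals b1 no matter which value x2 is held at.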
Take note also that now we can easily throw in as many independent variables as we want.
This will increase (or at least never decrease) our explained variance and our R²... So this is good, right??? Wrong.
While this may improve how well the model fits our sample, it will also make our model increasingly complex.
A strong model achieves accurate prediction with the fewest variables.
In the coming sections we will look at additional measures for 'model parsimony'...that is models
that 'do the most with the least'.
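The claim that extra variables push R² up can be checked with a quick simulation. This is only a sketch: the simulated data, the seed, and the names r_squared and junk are assumptions for illustration. The variable junk is generated independently of y, so it has no real relationship with it.

```python
import numpy as np

def r_squared(X, y):
    """Fit by least squares and return R^2 = 1 - SSE/SST (hypothetical helper)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ b) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst

rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)
junk = rng.normal(size=n)  # generated independently of y

ones = np.ones(n)
r2_small = r_squared(np.column_stack([ones, x1]), y)
r2_big = r_squared(np.column_stack([ones, x1, junk]), y)

print("R^2 with x1 only:     ", r2_small)
print("R^2 with x1 and junk: ", r2_big)
```

Because the larger model can always set the extra coefficient to zero, its R² cannot fall below that of the smaller model, even though junk adds nothing meaningful.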
Our assumptions in multiple regression parallel those in
SLR. For emphasis, let's briefly review. (Also look at 15.11)
1. E(ε) = 0; Therefore E(y) = β0 + β1x1 + β2x2 + ... + βpxp
2. Var(ε) = σ² and is the same for all values of x; Therefore the
variance about the regression line also equals σ² and is the
same for all values of x.
3. The values of ε are independent; Therefore the value of ε
for any set of x values is not related to the value of ε for any other set of x
values.
4. ε is a normally distributed random variable; Therefore y is
also a normally distributed random variable.
Testing for Significance
In multiple regression, significance testing carries a slightly
different meaning than in SLR.
1. F-Test: Tests for a significant relationship between the
dependent variable and the set of all independent
variables. We refer to this as the test for overall significance.
2. If the F-Test shows overall significance, then we use the
t-test to check the significance of each of the independent
variables. We conduct a separate t-test on each of the independent
variables. We refer to the t-test as the test for individual
significance.
In multiple regression, the F-test checks whether all of the slope parameters are
equal to zero:
H0: β1 = β2 = ... = βp = 0
Ha: One or more of the parameters is not equal to zero.
Remember that F = MSR/MSE.
And in multiple regression:
MSR = SSR/p
MSE = SSE/(n - p - 1)
And we reject H0 if our p-value < α
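The F statistic can be computed by hand from the sums of squares. The sketch below uses a small made-up data set (the values and variable names are assumptions for illustration) and stops at the F statistic. A statistics package would convert it to a p-value using the F distribution with p and n - p - 1 degrees of freedom.

```python
import numpy as np

# Hypothetical data: n = 6 observations, p = 2 independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([5.1, 5.9, 10.2, 10.8, 15.1, 16.0])

n = len(y)
p = 2
X = np.column_stack([np.ones(n), x1, x2])

# Least squares estimates and fitted values.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

# Sums of squares: SST = SSR + SSE.
sst = np.sum((y - y.mean()) ** 2)
ssr = np.sum((y_hat - y.mean()) ** 2)
sse = np.sum((y - y_hat) ** 2)

# Mean squares and the F statistic.
msr = ssr / p
mse = sse / (n - p - 1)
F = msr / mse

print("SSR =", ssr, " SSE =", sse, " SST =", sst)
print("F =", F)
```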
For the t-test, remember that we test each parameter separately.
For any βi:
H0: βi = 0
Ha: βi ≠ 0
t = bi/sbi  (bi is the estimate of βi; sbi is the estimated standard error of bi)
And we reject H0 if our p-value < α
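The t statistics can be sketched the same way. This assumes the standard result that the estimated variances of the estimates are the diagonal entries of MSE times the inverse of X'X. The data and names are again made up for illustration, and the p-values (from a t distribution with n - p - 1 degrees of freedom) are left to a statistics package.

```python
import numpy as np

# Hypothetical data: n = 6 observations, p = 2 independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([5.1, 5.9, 10.2, 10.8, 15.1, 16.0])

n = len(y)
p = 2
X = np.column_stack([np.ones(n), x1, x2])

# Least squares estimates b0, b1, b2.
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# MSE estimates the error variance sigma^2.
sse = np.sum((y - X @ b) ** 2)
mse = sse / (n - p - 1)

# Standard errors: square roots of the diagonal of MSE * (X'X)^-1.
cov_b = mse * np.linalg.inv(X.T @ X)
s_b = np.sqrt(np.diag(cov_b))

# One t statistic per estimate.
t = b / s_b

for name, est, se, stat in zip(["b0", "b1", "b2"], b, s_b, t):
    print(f"{name}: estimate={est:.4f}  std.err={se:.4f}  t={stat:.3f}")
```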
Multicollinearity is essentially the correlation among independent variables.
We care about this because we want our independent variables to measure
significantly different things when predicting our dependent variable.
While in practice there is always some multicollinearity, we need to try to
reduce it as much as we can.
A simple test for multicollinearity is the sample correlation (rx1x2) for any
two independent variables.
As a rule of thumb, if the sample correlation is greater than .7 or less than -.7
for any two independent variables, we should take measures to reduce
multicollinearity, for example, removing one of the two
highly correlated variables from the model.
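This pairwise check is easy to automate. The sketch below uses three made-up independent variables (the names x1, x2, x3 and their values are assumptions for illustration) and flags any pair whose sample correlation exceeds .7 in absolute value.

```python
import numpy as np
from itertools import combinations

# Hypothetical independent variables, one array per variable.
predictors = {
    "x1": np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]),
    "x2": np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0]),
    "x3": np.array([6.0, 2.0, 5.0, 1.0, 4.0, 3.0]),
}

THRESHOLD = 0.7  # rule-of-thumb cutoff from the slide

flagged = []
for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
    # Sample correlation between the two variables.
    r = np.corrcoef(a, b)[0, 1]
    print(f"r({name_a}, {name_b}) = {r:.3f}")
    if abs(r) > THRESHOLD:
        flagged.append((name_a, name_b))

print("Pairs to review:", flagged)
```

Pairwise correlations only catch multicollinearity between two variables at a time, so this is a first screen rather than a complete diagnosis.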
That's it for Ch. 15.
Hope you have recovered from Mardi Gras by the next time I see you!