1. Multiple Regression Analysis - Inference
Võ Đức Hoàng Vũ
School of Economics
University of Economics HCMC
5/1/2007
Võ Đức Hoàng Vũ Multiple Regression Analysis - Inference
2. Assumption of the Classical Linear Model (CLM)
So far, we know that given the Gauss-Markov
assumptions, OLS is BLUE.
In order to do classical hypothesis testing, we need to add
another assumption (beyond the Gauss-Markov
assumptions)
Assume that u is independent of x1, . . . , xk and u is
normally distributed with zero mean and variance σ²:
u ∼ Normal(0, σ²)
3. CLM Assumptions (cont)
Under CLM, OLS is not only BLUE, but is the minimum
variance unbiased estimator.
We can summarize the population assumptions of CLM
as follows
y|x ∼ Normal(β0 + β1x1 + . . . + βkxk, σ²)
For now we simply assume normality, although it clearly
does not hold in every application
With large samples we will be able to drop the normality assumption
4. MLR Assumptions
Assumption MLR.1 Linear in Parameters
The model in the population can be written as
y = β0 + β1x1 + β2x2 + . . . + βkxk + u, (1)
where β0, β1, β2, . . . , βk are unknown parameters (constants)
of interest and u is an unobserved random error or
disturbance term.
5. MLR Assumptions
Assumption MLR.2 Random Sampling
We have a random sample of n observations,
{(xi1, xi2, . . . , xik, yi ) : i = 1, 2, . . . , n }, following the
population model in Assumption MLR.1.
6. MLR Assumptions
Assumption MLR.3 No Perfect Collinearity
In the sample (and therefore in the population), none of the
independent variables is constant, and there are no exact
linear relationships among the independent variables.
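
A minimal sketch of how MLR.3 could be checked in a sample, assuming NumPy is available. The helper name has_perfect_collinearity and the data are hypothetical, chosen only for illustration. The idea is that the design matrix (a column of ones plus the regressors) must have full column rank; a constant regressor or an exact linear relationship lowers the rank.

```python
import numpy as np

def has_perfect_collinearity(X):
    """Return True if the regressors, together with an intercept, are linearly dependent."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Prepend a column of ones for the intercept. A constant regressor is an
    # exact multiple of this column, so it is flagged as well.
    design = np.column_stack([np.ones(n), X])
    return bool(np.linalg.matrix_rank(design) < design.shape[1])

# Hypothetical data: x3 is an exact linear combination of x1 and x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
x3 = 2.0 * x1 - x2

print(has_perfect_collinearity(np.column_stack([x1, x2])))      # MLR.3 holds
print(has_perfect_collinearity(np.column_stack([x1, x2, x3])))  # MLR.3 fails
```

The rank test relies on a numerical tolerance, so it detects exact (or numerically exact) dependence only; high but imperfect correlation does not violate MLR.3.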
7. MLR Assumptions
Assumption MLR.4 Zero Conditional Mean
The error u has an expected value of zero given any
values of the independent variables. In other words,
E(u|x1, x2, . . . , xk) = 0 (2)
8. MLR Assumptions
Assumption MLR.5 Homoskedasticity
The error u has the same variance given any values of
the explanatory variables. In other words,
Var(u|x1, . . . , xk) = σ².
9. MLR Assumptions
Assumption MLR.6 Normality
The population error u is independent of the explanatory
variables x1, x2, . . . , xk and is normally distributed with
zero mean and variance σ²: u ∼ Normal(0, σ²).
10. The homoskedastic normal distribution with a single
explanatory variable
11. Normal Sampling Distributions
Under the CLM assumptions, conditional on the sample
values of the independent variables,
ˆβj ∼ N(βj, Var(ˆβj)), so that
(ˆβj − βj) / sd(ˆβj) ∼ N(0, 1)
ˆβj is distributed normally because it is a linear
combination of the errors
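
The result above can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original slides: the parameter values, sample size, and seed are hypothetical, and the true standard deviation is taken from the standard formula Var(ˆβ | X) = σ²(X'X)⁻¹. The regressors are drawn once and held fixed, which mirrors "conditional on the sample values of the independent variables".

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is reproducible
n, sigma = 50, 2.0                      # hypothetical sample size and error sd
beta = np.array([1.0, 0.5, -0.3])       # hypothetical (beta0, beta1, beta2)

# Draw the regressors once and keep them fixed across replications.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
XtX_inv = np.linalg.inv(X.T @ X)
sd_beta1 = sigma * np.sqrt(XtX_inv[1, 1])   # true sd of beta1_hat given X

reps = 5000
z = np.empty(reps)
for r in range(reps):
    u = rng.normal(0.0, sigma, size=n)      # normal errors, as in MLR.6
    y = X @ beta + u
    beta_hat = XtX_inv @ X.T @ y            # OLS estimate for this draw
    z[r] = (beta_hat[1] - beta[1]) / sd_beta1

# If the claim holds, z should look like draws from N(0, 1).
print("mean:", z.mean(), "sd:", z.std(), "share |z| > 1.96:", np.mean(np.abs(z) > 1.96))
```

The mean should be close to 0, the standard deviation close to 1, and roughly 5% of the standardized values should exceed 1.96 in absolute value.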
12. The t-test
Under the CLM assumptions,
(ˆβj − βj) / se(ˆβj) ∼ t_{n−k−1}
Note this is a t distribution (rather than a normal) because we
have to estimate σ² by ˆσ²
Note the degrees of freedom: n − k − 1
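
A minimal sketch of where se(ˆβj) and the n − k − 1 degrees of freedom come from, assuming NumPy. The function name ols_with_standard_errors and the four data points are hypothetical. It uses the standard estimator ˆσ² = SSR / (n − k − 1) and se(ˆβj) = sqrt(ˆσ² · [(X'X)⁻¹]jj), which the slides do not spell out.

```python
import numpy as np

def ols_with_standard_errors(X, y):
    """OLS with an intercept. Returns (beta_hat, se, df) with df = n - k - 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                               # k regressors, intercept excluded
    design = np.column_stack([np.ones(n), X])
    XtX_inv = np.linalg.inv(design.T @ design)
    beta_hat = XtX_inv @ design.T @ y
    residuals = y - design @ beta_hat
    df = n - k - 1                               # k slopes plus one intercept estimated
    sigma2_hat = residuals @ residuals / df      # unbiased estimator of sigma^2
    se = np.sqrt(sigma2_hat * np.diag(XtX_inv))
    return beta_hat, se, df

# Hypothetical data with one regressor (k = 1) and n = 4 observations.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 2.0, 5.0])

beta_hat, se, df = ols_with_standard_errors(X, y)
print("beta_hat:", beta_hat, "se:", se, "df:", df)
```

For these four points the slope and intercept are both 1.1, SSR is 2.7, and df is 2, so ˆσ² = 1.35 and se of the slope is sqrt(1.35 / 5).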
13. The t-test (cont)
Knowing the sampling distribution of the standardized
estimator allows us to carry out hypothesis tests
Start with a null hypothesis
For example, H0 : βj = 0
If we fail to reject the null, we treat xj as having no effect on y,
controlling for the other x's
To perform our test we first need to form "the" t statistic
for ˆβj: t_{ˆβj} ≡ ˆβj / se(ˆβj)
We will then use our t statistic along with a rejection rule
to determine whether to reject the null hypothesis, H0.
14. t-test: One-Sided Alternatives
Besides our null, H0, we need an alternative
hypothesis, H1, and a significance level.
H1 may be one-sided or two-sided
H1 : βj > 0 and H1 : βj < 0 are one-sided
H1 : βj ≠ 0 is a two-sided alternative.
If we want to have only a 5% probability of rejecting H0 if
it is really true, then we say our significance level is 5%.
15. t-test: One-Sided Alternatives (cont)
Having picked a significance level, α, we look up the
100(1 − α)th percentile in a t distribution with n − k − 1 df
and call this c, the critical value
For H1 : βj > 0, we reject the null hypothesis if the t statistic is
greater than the critical value
If the t statistic is less than the critical value then we fail
to reject the null.
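
The lookup of the critical value can be sketched with SciPy, assuming it is installed; scipy.stats.t.ppf returns the requested quantile of the t distribution. The function names and the example values (n = 30, k = 1, α = 0.05) are hypothetical.

```python
from scipy import stats

def one_sided_critical_value(alpha, n, k):
    """Critical value c: the 100(1 - alpha)th percentile of t with n - k - 1 df."""
    df = n - k - 1
    return float(stats.t.ppf(1.0 - alpha, df))

def reject_right_tailed(t_stat, alpha, n, k):
    """Rejection rule for H1: beta_j > 0. Reject H0 when t_stat exceeds c."""
    return bool(t_stat > one_sided_critical_value(alpha, n, k))

# Hypothetical setting: n = 30 observations, k = 1 regressor, so df = 28.
c = one_sided_critical_value(0.05, n=30, k=1)
print("critical value:", round(c, 3))              # about 1.701
print(reject_right_tailed(2.10, 0.05, n=30, k=1))  # t statistic above c
print(reject_right_tailed(1.20, 0.05, n=30, k=1))  # t statistic below c
```

With 28 degrees of freedom the 5% one-sided critical value is about 1.701, which matches the usual t table.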
17. One-sided vs Two-sided
Because the t distribution is symmetric, testing
H1 : βj < 0 is straightforward. The critical value is just
the negative of before.
We reject the null if the t statistic < −c, and if the t
statistic > −c then we fail to reject the null.
For a two-sided test, we set the critical value based on
α/2 and reject H0 : βj = 0 if the absolute value of the
t statistic > c
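
The three rejection rules can be collected in one helper. This is a sketch assuming SciPy is installed; the function name t_test_decision and the labels "two-sided", "greater", and "less" are hypothetical choices for this illustration.

```python
from scipy import stats

def t_test_decision(t_stat, df, alpha=0.05, alternative="two-sided"):
    """Return True if H0 is rejected at level alpha against the given alternative."""
    if alternative == "two-sided":
        c = stats.t.ppf(1.0 - alpha / 2.0, df)   # critical value based on alpha/2
        return bool(abs(t_stat) > c)
    c = stats.t.ppf(1.0 - alpha, df)             # one-sided critical value
    if alternative == "greater":                 # H1: beta_j > 0
        return bool(t_stat > c)
    if alternative == "less":                    # H1: beta_j < 0, uses -c by symmetry
        return bool(t_stat < -c)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

# Hypothetical t statistic of 1.9 with 28 degrees of freedom.
print(t_test_decision(1.9, 28, alternative="greater"))    # 1.9 > 1.701
print(t_test_decision(1.9, 28, alternative="two-sided"))  # 1.9 < 2.048
print(t_test_decision(-1.9, 28, alternative="less"))      # -1.9 < -1.701
```

The same t statistic can be significant against a one-sided alternative and insignificant against the two-sided one, because the two-sided critical value is larger.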
19. Summary for H0 : βj = 0
Unless otherwise stated, the alternative is assumed to be
two-sided
If we reject the null, we typically say "xj is statistically
significant at the α% level"
If we fail to reject the null, we typically say "xj is
statistically insignificant at the α% level"
20. Testing other hypotheses
A more general form of the t statistic recognizes that we
may want to test something like H0 : βj = aj
In this case, the appropriate t statistic is
t = (ˆβj − aj) / se(ˆβj),
where aj = 0 for the standard test.
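
As a worked illustration of the general formula, the sketch below computes the t statistic for an arbitrary hypothesized value. The function name t_statistic and the numbers (estimate 1.5, standard error 0.25) are hypothetical, not taken from any data set.

```python
def t_statistic(beta_hat_j, se_j, a_j=0.0):
    """t statistic for H0: beta_j = a_j. The default a_j = 0 gives the standard test."""
    if se_j <= 0:
        raise ValueError("standard error must be positive")
    return (beta_hat_j - a_j) / se_j

# Hypothetical estimate and standard error.
beta_hat_j, se_j = 1.5, 0.25

print(t_statistic(beta_hat_j, se_j))           # H0: beta_j = 0
print(t_statistic(beta_hat_j, se_j, a_j=1.0))  # H0: beta_j = 1
```

The estimate is six standard errors away from 0 but only two standard errors away from 1, so the evidence against H0 : βj = 1 is much weaker than the evidence against H0 : βj = 0.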