2. Introduction
Regression is a technique for modeling and analyzing the relationship between variables.
The focus is on a dependent variable, and one or more independent variables.
Simple Linear Regression means there is exactly one independent variable.
Regression, like other statistical modeling techniques, gives us the power to infer relationships and predict future outcomes.
An understanding of regression, and of the techniques used to validate your models, will provide you with a sound methodology to do just that.
3. Simple Linear Regression
As previously mentioned, simple linear regression means we have 1 dependent variable (y) and 1 independent variable (x).
To make a prediction about y using x, we need sample data (on both x and y) to estimate some additional terms, namely the parameters (β0 and β1) and an error term (ε).
The parameters β0 and β1 capture the variability in y that is explained by x.
The error term (ε) accounts for the unexplained variability.
Thus, the simple linear regression model is: y = β0 + β1x + ε
4. Estimating the Regression Equation
If we were fortunate enough to know the population parameters, we could use the equation on the previous slide to compute the mean.
Unfortunately for us, we must use sample data to estimate these parameters, and subsequently, we use different symbols to denote our estimated parameters:
ŷ = b0 + b1x
Note that we place a hat over y (pronounced y-hat) and use English letters to denote our estimated parameters.
We now have an equation whose graph is the "regression line".
ŷ is the point estimator of E(y), the mean.
b0 is the y-intercept
b1 is the slope
6. The Estimation Process for Simple Linear Regression
So how do we estimate b0 and b1?
To do this we use a method known as least squares.
In simple linear regression, finding b0 and b1 is relatively straightforward.
Equations 14.6 and 14.7 in your book show the procedure for b0 and
b1, respectively.
Once b0 and b1 are obtained, the estimated simple linear regression equation
will resemble the following:
ŷ = 60 + 5x
It is important to note that you will have a ŷi for every yi in the sample data set.
It is up to you to determine whether the differences between them are small enough to deem the equation an accurate predictor.
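To make the procedure concrete, here is a minimal Python sketch of the least squares calculation, assuming Equations 14.6 and 14.7 are the usual formulas b1 = ∑(xi - x̄)(yi - ȳ) / ∑(xi - x̄)² and b0 = ȳ - b1x̄. The data are illustrative, chosen so the fit lands exactly on the slide's example ŷ = 60 + 5x:

```python
# Least squares estimates of b0 and b1 (illustrative data).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Slope: b1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
    / sum((xi - x_bar) ** 2 for xi in x)
# Intercept: b0 = y_bar - b1 * x_bar
b0 = y_bar - b1 * x_bar

print(b0, b1)  # -> 60.0 5.0
```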
7. Coefficient of Determination
The Coefficient of Determination (r²) provides one measure to judge how well our
regression equation (for example: ŷ = 60 + 5x) fits the actual data.
Let's take some time to build up r² and learn some important terms along the way:
◆ Remember that we have an estimated dependent variable (ŷi) and an actual dependent variable (yi) for each observation.
◆ (yi - ŷi) is known as the ith residual.
◆ When we take (yi - ŷi), square it, and sum the squares, we get the Sum of Squares due to Error (SSE), hence SSE = ∑(yi - ŷi)².
◆ When we take (yi - y̅), square it, and sum the squares, we get the Total Sum of Squares (SST), hence SST = ∑(yi - y̅)².
◆ Lastly, when we take (ŷi - y̅), square it, and sum the squares, we get a measure of how much the estimated values on the regression line deviate from the actual mean.
◆ This is known as the Sum of Squares due to Regression: SSR = ∑(ŷi - y̅)².
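A short Python sketch of these three quantities, continuing with the same illustrative data and the fit from the earlier least squares sketch:

```python
# SSE, SSR, and SST for the fit above (illustrative data as before).
x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]
b0, b1 = 60.0, 5.0                      # least squares estimates
y_hat = [b0 + b1 * xi for xi in x]      # fitted values, one per observation
y_bar = sum(y) / len(y)

sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
sst = sum((yi - y_bar) ** 2 for yi in y)               # total

print(sse, ssr, sst)  # -> 1530.0 14200.0 15730.0
```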
8. Coefficient of Determination (cont.)
The relationship between SSR, SST, and SSE is one of the most important
facts to know in statistics.
SST = SSR + SSE
Now, if (yi - ŷi) = 0 for every observation, then SSE = 0, SST = SSR, and we have a perfect
fit of the data. In practice this is essentially never the case.
On the flip side, if SSR = 0 (so that SST = SSE), we have the worst possible fit, because
all of the variability sits in the error term, the unexplained portion of the equation.
Hence, to measure goodness of fit, we look at the ratio of SSR to SST:
r² = SSR/SST
This yields a value between 0 and 1.
r² can be interpreted as the percentage of the total sum of squares (SST) that can be
explained by using your estimated regression equation.
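Using the sums of squares from the previous sketch, a quick check of the identity and the resulting r² (values hard-coded here so the snippet stands alone):

```python
# r^2 from the sums of squares computed earlier.
sse, ssr, sst = 1530.0, 14200.0, 15730.0
assert sst == ssr + sse      # SST = SSR + SSE holds for least squares fits

r2 = ssr / sst
print(round(r2, 4))          # -> 0.9027, i.e. ~90% of SST is explained
```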
9. Correlation Coefficient
Denoted rxy, is a measure of the strength of the linear association between the
independent (x) and dependent variable (y).
rxy = (sign of b1) √r2
rxy always yields a value between (-1, +1).
A value of 1 indicates perfect positive linear relationship.
A value of -1 indicates perfect negative linear relationship.
A value of zero indicates no relationship.
In practice, rxy is used much less, as it is only a meaningful measure for linear
relationships.
r², by contrast, can be used to measure goodness of fit in both linear and nonlinear relationships.
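A one-line translation of the formula above (b1 and r² carry over from the earlier sketches):

```python
import math

b1, r2 = 5.0, 0.9027                      # from the earlier sketches
r_xy = math.copysign(math.sqrt(r2), b1)   # (sign of b1) * sqrt(r^2)
print(round(r_xy, 4))                     # -> 0.9501
```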
10. Estimating the Regression Equation
In this model, y can be thought of as having a distribution for each given value of x.
As we have learned in the past, a distribution has a mean or
expected value.
Thus the regression equation for the mean is as follows:
E(y) = β0 + β1x
Notice that to obtain the mean, we simply drop the error term (ε), the part of the
model that accounts for unexplained variability.
11. Model Assumptions
It is important to understand that r² is not enough to ensure we have an
appropriate regression equation.
There are numerous other tests and measures we must use.
All of these tests are based on assumptions about the error term (ε):
1. E(ε) = 0.
Implication: E(y) = β0 + β1x
2. The variance of ε, denoted by σ², is the same for all values of x.
Implication: The variance of y equals σ² and is the same for all values of x.
3. The values of ε are independent (uncorrelated).
Implication: The value of y for any x is not related to the value of y for any other x.
4. ε is a normally distributed random variable.
Implication: Because y is a linear function of ε, y is also normally distributed.
Table 14.14 in the text provides a complete explanation.
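The slides do not show how to check these assumptions, but an informal look at the residuals is a common first step. A minimal sketch using scipy, with the same illustrative data (shapiro is the Shapiro-Wilk normality test):

```python
# Informal checks of assumptions 1 and 4 via the residuals (illustrative data).
from scipy import stats

x = [2, 6, 8, 8, 12, 16, 20, 20, 22, 26]
y = [58, 105, 88, 118, 117, 137, 157, 169, 149, 202]
residuals = [yi - (60.0 + 5.0 * xi) for xi, yi in zip(x, y)]

print(sum(residuals))            # assumption 1: residuals sum to ~0
print(stats.shapiro(residuals))  # assumption 4: a large p-value gives no
                                 # evidence against normality
```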
12. Testing for Significance
In Simple Linear Regression, the mean or expected value of y is a linear
function of x (E(y) = β0 + β1x )
If the value of β1 = 0, then E(y) = β0 + 0x = β0.
Hence, in this case we can conclude x and y are not linearly related.
In the next couple of slides we offer a few tests: the t-test, an evaluation of the
confidence interval for β1, and the F-test.
Each of these tests is based on the following hypotheses:
H0: β1 = 0
Ha: β1 ≠ 0
This starts to tell us more about the appropriateness of our model.
13. Estimating σ²
As a precursor to running our tests, we need an estimate of σ².
Recall one of our key assumptions: the variance of ε also represents the
variance of y.
Also recall the deviations of y about the regression line are called residuals.
Hence we can call upon the SSE to calculate the Mean Square Error (MSE) as
an estimate of σ², which we will denote as s².
s² = MSE = SSE/(n - 2), where n is the sample size and (n - 2) is the
associated degrees of freedom.
Consequently, the standard error of the estimate is s = √MSE.
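In Python, with the SSE and sample size from the earlier sketches:

```python
import math

sse, n = 1530.0, 10      # from the sums-of-squares sketch
mse = sse / (n - 2)      # s^2 = MSE = SSE / (n - 2)
s = math.sqrt(mse)       # standard error of the estimate
print(mse, round(s, 4))  # -> 191.25 13.8293
```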
14. t Test
Remember we are testing the following: H0: β1 = 0; Ha: β1 ≠ 0
To do this we need information about the distribution of b1 (see figure 14.17);
specifically, we need the estimated standard deviation of b1, denoted sb1 (see figure 14.18).
Once we have sb1 we can find the test statistic t: t = b1/sb1.
And using the t-table and our well-known rejection rules:
Reject H0 if p-value ≤ α (or, equivalently, if |t| ≥ tα/2),
where tα/2 is based on a t-distribution with (n - 2) degrees of freedom.
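A sketch of the full test, assuming the book's sb1 is the usual s/√∑(xi - x̄)² (the slides only cite figure 14.18 for it); the numbers carry over from the earlier sketches:

```python
import math
from scipy import stats

b1, s, n = 5.0, 13.8293, 10         # slope, std. error of estimate, sample size
sum_sq_dev_x = 568.0                # sum((xi - x_bar)^2) for the earlier data

s_b1 = s / math.sqrt(sum_sq_dev_x)  # estimated standard deviation of b1
t = b1 / s_b1                       # test statistic for H0: beta1 = 0
p_value = 2 * stats.t.sf(abs(t), df=n - 2)

print(round(t, 2), p_value)         # -> 8.62 and a p-value far below 0.05
```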
15. Confidence Interval for β1
As an alternative to the t-test, we can check the confidence interval for β1.
We are essentially checking to see whether the confidence interval for β1 contains 0.
The form of the confidence interval is as follows:
b1 ± tα/2 · sb1
If this interval contains zero at the designated significance level, we cannot
reject the null hypothesis (H0).
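The same interval in Python, with tα/2 pulled from scipy (b1 and sb1 from the t-test sketch, α = 0.05 as an example level):

```python
from scipy import stats

b1, s_b1, n, alpha = 5.0, 0.5803, 10, 0.05     # values from the t-test sketch
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # t_{alpha/2}, df = n - 2

low, high = b1 - t_crit * s_b1, b1 + t_crit * s_b1
print(round(low, 3), round(high, 3))  # interval excludes 0 -> reject H0
```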
16. F-Test
Based on the F probability distribution (hence, using our F-table)
In simple linear regression, the F-test reaches the same conclusion as the t-test.
With more than one independent variable (multiple regression) ONLY the F-
test can be used to test for overall significance.
To arrive at the F-test statistic, we need the Mean Square due to Regression
(MSR).
MSR = SSR / (Number of Independent Variables)
F = MSR/MSE (Just like when we first learned ANOVA)
And using the F-table and our well-known rejection rules:
Reject H0 if p-value ≤ α (or, equivalently, if F ≥ Fα),
where Fα is based on an F-distribution with 1 degree of freedom (for SLR) in
the numerator and (n - 2) degrees of freedom in the denominator.
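A sketch of the F-test with the earlier numbers (one independent variable, so the numerator degrees of freedom is 1; note F = t² here, as expected in simple linear regression):

```python
from scipy import stats

ssr, sse, n, k = 14200.0, 1530.0, 10, 1  # k = number of independent variables
msr = ssr / k                            # Mean Square due to Regression
mse = sse / (n - 2)                      # Mean Square Error, as before

f = msr / mse                            # F = MSR / MSE
p_value = stats.f.sf(f, dfn=k, dfd=n - 2)
print(round(f, 2), p_value)              # -> 74.25 and a p-value far below 0.05
```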
17. Caution about the Interpretation of Significance Testing
Correlation is not causation!
Just because we reject H0 does not guarantee cause and effect; a causal claim
requires theoretical justification.
Furthermore, just because we can reject H0 does not
mean the relationship between x and y is linear.