1. Interpret Computer Printout 1
This lecture focuses only on how to interpret a computer printout.
In the course of interpretation, there is bound to be material not
yet covered in the lectures. We apply the results first, before we
learn how they come about.
11/27/13
2. Illustrative Example – Sales of Second Hand Cars
A used car dealer examines 100 used cars sold at an auction. He
studies how the auction selling price depends on the miles the car
has covered, that is, the odometer reading.
The computer printout is shown in the next slide. Examine the
printout carefully and answer the questions that follow.
3. Computer Printout 1
Regression Statistics (Observations 100)
----------------------------------------------------------------
Multiple R            0.8063
R Square              0.6501
Adjusted R Square     0.6466
Standard Error        151.6
----------------------------------------------------------------
ANOVA
              df        SS         MS        F   Significance F
----------------------------------------------------------------
Regression     1   4183528    4183528    182.1            0.000
Residual      98   2251362      22973
Total         99   6434890
----------------------------------------------------------------
            Coefficients   Standard Error   t Stat   P value
Intercept           6533            84.51    77.31    0.0000
Odometer         -0.0312           0.0023   -13.49    0.0000
4. Comments on the Printout 1
The multiple correlation is large, about 0.8, indicating a strong
linear relationship between the dependent variable and the
independent variable.
In the ANOVA table, the F test shows that the model as a whole is
valid: with Significance F = 0.000, we reject the hypothesis that
the coefficient of the regressor is zero.
The t tests for the intercept and for the odometer are both highly
significant (p value = 0.0000).
As a whole, there is a strong negative linear relationship between
the selling price and the odometer reading.
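The figures in Printout 1 are tied together by a few identities, and recomputing them is a quick way to check a reading of the table. The sketch below, in Python, uses only the SS and df values copied from the printout; the variable names are illustrative and do not come from any statistics package.

```python
import math

# Values copied from the ANOVA table of Printout 1.
ss_regression = 4183528
ss_residual = 2251362
ss_total = 6434890
df_regression = 1
df_residual = 98

# A mean square is a sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# R Square is the share of total variation explained by the regression.
r_square = ss_regression / ss_total
# Multiple R is the square root of R Square.
multiple_r = math.sqrt(r_square)
# F compares explained and unexplained variation per degree of freedom.
f_stat = ms_regression / ms_residual
# The standard error of the regression is the square root of MS residual.
standard_error = math.sqrt(ms_residual)

print(round(r_square, 4), round(multiple_r, 4))
print(round(f_stat, 1), round(standard_error, 1))
```

Rounded to the precision of the printout, these reproduce R Square, Multiple R, F and Standard Error as reported.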
5. Questions and Answers
1. Write down the simple linear regression equation for the used
   car dealer case.
2. Why does the analysis show a total of only 99 when there are
   supposed to be 100 observations?
3. How are the degrees of freedom calculated?
4. What do you understand by the SS and MS values for the residual?
5. How do you interpret the F test result?
6. What is the difference between adjusted R square and R square?
7. How are they related to the multiple correlation?
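Several of these questions turn on degrees of freedom. As a hedged sketch (the function names are illustrative), the Python below uses the usual conventions for a regression with n observations and k independent variables: total df = n - 1, regression df = k, residual df = n - k - 1, and adjusted R square = 1 - (SS residual / residual df) / (SS total / total df). The SS values are those of Printout 1.

```python
def degrees_of_freedom(n, k):
    """Return (regression df, residual df, total df) for n observations
    and k independent variables."""
    return k, n - k - 1, n - 1

def adjusted_r_square(ss_residual, ss_total, n, k):
    """Adjusted R square: R square penalised for the number of regressors."""
    _, df_residual, df_total = degrees_of_freedom(n, k)
    return 1 - (ss_residual / df_residual) / (ss_total / df_total)

# Printout 1: 100 observations, one independent variable (Odometer).
df_reg, df_res, df_tot = degrees_of_freedom(100, 1)
ms_residual = 2251362 / df_res          # MS = SS / df
adj_r2 = adjusted_r_square(2251362, 6434890, 100, 1)

print(df_reg, df_res, df_tot)
print(round(ms_residual), round(adj_r2, 4))
```

The total of 99 in the ANOVA table is therefore a degrees-of-freedom count, n - 1, not a count of observations.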
7. Addition of Qualitative Independent Variable
The car dealer believes that the color of the car is a factor in
determining the auction selling price.
He assigns a code of 1 for a white car, 2 for a silver car, and 3
for all other cars.
We run the regression as before, but include the color code as a
second independent variable.
The computer printout is shown in the next slide.
8. Multiple regression with the Codes for the color of the cars
Regression Statistics (Observations 100)
----------------------------------------------------------------
Multiple R            0.8095
R Square              0.6552
Adjusted R Square     0.6481
Standard Error        151.2
----------------------------------------------------------------
ANOVA
              df        SS         MS        F   Significance F
----------------------------------------------------------------
Regression     2   4216263    2108132    92.17            0.000
Residual      97   2218627      22872
Total         99   6434890
----------------------------------------------------------------
            Coefficients   Standard Error   t Stat   P value
Intercept           6580            92.96    70.79    0.0000
Odometer         -0.0313           0.0023   -13.56    0.0000
color             -21.67            18.11    -1.20    0.2345
9. Comment on the Printout 2
The regression result is much as before. The t test for color is
not significant (t = -1.20, p value = 0.2345), so the printout gives
no evidence of a relationship between color and price.
There are two reasons for this. First, the way the dealer assigned
the codes makes it impossible to detect the relationship.
Second, the dealer has treated color as quantitative data, when in
fact it is qualitative data.
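To see why the coding matters, consider what a single coefficient on the codes 1, 2, 3 implies. The sketch below (Python, with illustrative names) takes the color coefficient from Printout 2 and computes the price shift the fitted model attributes to each color. Because the code enters as a number, the model forces the step from white to silver to equal the step from silver to other colors, whatever the data say.

```python
# Codes assigned by the dealer, and the color coefficient from Printout 2.
COLOR_CODE = {"white": 1, "silver": 2, "other": 3}
B_COLOR = -21.67

def coded_color_shift(color):
    """Price shift that the coded model attributes to a given color."""
    return B_COLOR * COLOR_CODE[color]

# Differences between adjacent codes are forced to be identical.
white_to_silver = coded_color_shift("silver") - coded_color_shift("white")
silver_to_other = coded_color_shift("other") - coded_color_shift("silver")

print(round(white_to_silver, 2), round(silver_to_other, 2))
```

Both steps equal the color coefficient, so the coded model cannot represent a case where, say, silver cars sell for more than both white cars and other cars.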
10. The use of Dummy Variable in multiple regression model
The dealer failed in his attempt to include color in the regression
simply because the color codes, as he assigned them, form a
quantitative data set.
Instead of assigning codes freely, we restrict each code to two
values only, 1 or 0. A variable coded this way is called a dummy
variable.
So we define the first dummy variable for a white car and the
second dummy variable for a silver car as follows.
I1 = 1 (if the color is white)
I1 = 0 (if the color is not white)
I2 = 1 (if the color is silver)
I2 = 0 (if the color is not silver)
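As a small illustration of these definitions, the Python sketch below turns a color label into the pair (I1, I2). The function name and the text labels are assumptions made for the example; the slides only define the two indicators.

```python
def color_dummies(color):
    """Return (I1, I2) for a color label.

    I1 is 1 only for white cars and I2 is 1 only for silver cars;
    every other color maps to (0, 0).
    """
    label = color.strip().lower()
    i1 = 1 if label == "white" else 0
    i2 = 1 if label == "silver" else 0
    return i1, i2

for label in ("White", "silver", "red"):
    print(label, color_dummies(label))
```

Two dummy variables are enough for three color groups, because the third group is identified by both indicators being 0.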
11. Interpretation of the real meaning of dummy variable
1. I1 = 1, I2 = 0 indicates that the car is white.
2. I1 = 0, I2 = 1 indicates that the car is silver.
3. I1 = 0, I2 = 0 indicates that the car is neither white nor silver.
We run the multiple regression again with the two dummy variable
data series. The regression equation is shown below.
Price = β0 + β1 Odometer + β3 I1 + β4 I2
12. Regression Printout with Dummy Variables in the Regression
Regression Statistics (Observations 100)
----------------------------------------------------------------
Multiple R            0.8355
R Square              0.6980
Adjusted R Square     0.6886
Standard Error        142.3
----------------------------------------------------------------
ANOVA
              df        SS         MS        F   Significance F
----------------------------------------------------------------
Regression     3   4491749    1497250    73.97            0.000
Residual      96   1943141      20241
Total         99   6434890
----------------------------------------------------------------
            Coefficients   Standard Error   t Stat   P value
Intercept           6350            92.17    68.90    0.0000
Odometer         -0.0278           0.0024   -11.72    0.0000
I(1)               45.24            34.08     1.33    0.1876
I(2)               147.7            38.18     3.87    0.0002
13. Interpreting the Computer Printout
1. The intercept, or constant, gives price = 6350 when
   Odometer = I(1) = I(2) = 0.
2. For each additional mile on the odometer, the auction price
   decreases by 2.78 cents (RM 0.0278), for a car of a given color.
3. β3 = 45.2 indicates that a white car sells for RM 45.2 more than
   a car that is neither white nor silver, at the same odometer
   reading. β4 = 148 indicates that a silver car sells for RM 148
   more than a car that is neither white nor silver.
4. We write out the separate equations for each color of car in the
   next slide.
14. Equations for non white and non silver car, and white car
1. For a car that is neither white nor silver, the equation is
Price = β0 + β1 Odometer + β3 (0) + β4 (0)
which becomes
Price = 6350 − 0.0278 Odometer
2. For a white car, the equation becomes
Price = β0 + β1 Odometer + β3 (1) + β4 (0)
Price = 6350 − 0.0278 Odometer + 45.2
Price = 6395.2 − 0.0278 Odometer
15. Equations for silver car
3. For a silver car, the equation becomes
Price = β0 + β1 Odometer + β3 (0) + β4 (1)
Price = 6350 − 0.0278 Odometer + 148
Price = 6498 − 0.0278 Odometer
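The three equations share one slope and differ only in the intercept. As a sketch in Python, using the coefficients as rounded in the slides, the code below derives each intercept from the dummy values and predicts a price. The function names and the 36,000-mile reading are hypothetical and chosen only for illustration.

```python
# Estimates as rounded in the slides.
B0 = 6350.0     # intercept
B1 = -0.0278    # Odometer coefficient
B3 = 45.2       # coefficient of I1 (white)
B4 = 148.0      # coefficient of I2 (silver)

# Dummy values (I1, I2) for each color group.
DUMMIES = {"other": (0, 0), "white": (1, 0), "silver": (0, 1)}

def intercept_for(color):
    """Intercept of the price equation for a given color group."""
    i1, i2 = DUMMIES[color]
    return B0 + B3 * i1 + B4 * i2

def predicted_price(odometer, color):
    """Predicted auction price for an odometer reading and color group."""
    return intercept_for(color) + B1 * odometer

for color in ("other", "white", "silver"):
    # 36000 miles is a hypothetical reading, not a value from the data.
    print(color, round(intercept_for(color), 1),
          round(predicted_price(36000, color), 1))
```

At any given odometer reading, the predicted prices differ by exactly the dummy coefficients, which is why the lines are parallel.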
We can conduct a t test for the white car coefficient. The t
statistic given in the printout is 1.33 (p value = 0.1876), so we
cannot reject the null hypothesis that β3 is zero. We interpret this
as insufficient evidence that white cars sell at a different price
from cars that are neither white nor silver.
For the silver car coefficient, the t statistic is 3.87 (p value =
0.0002), so we conclude that there is a difference in price between
silver cars and cars that are neither white nor silver.
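The two tests can be retraced from the printout rows. In the Python sketch below, the t statistic is the coefficient divided by its standard error, and the decision compares the printed p value with a significance level. The 5% level is an assumption, since the slides do not state one, and the names are illustrative.

```python
ALPHA = 0.05  # assumed significance level; the slides do not state one

# (coefficient, standard error, p value) copied from the printout.
DUMMY_ROWS = {
    "I1 (white)": (45.24, 34.08, 0.1876),
    "I2 (silver)": (147.7, 38.18, 0.0002),
}

def t_statistic(coefficient, standard_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / standard_error

def reject_null(p_value, alpha=ALPHA):
    """True when the p value is below the chosen significance level."""
    return p_value < alpha

for name, (coef, se, p_value) in DUMMY_ROWS.items():
    print(name, round(t_statistic(coef, se), 2), reject_null(p_value))
```

Under this assumed level, the null hypothesis is rejected for the silver dummy but not for the white dummy, matching the reading above.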