Stata Application
1. Introduction to Linear Regression
2. Tests for Normality of Residuals
3. Tests for Heteroscedasticity
4. Tests for Multicollinearity
5. Tests for Autocorrelation
6. Tests for Model Specification
Linear Regression Analysis
3. Linear Regression
The regress command fits a linear regression.
The first variable after regress is always the
dependent (left-hand-side) variable. The
independent (right-hand-side) variables that we
choose to include in the model follow it.
4. Linear Regression
. clear
. use hs1, clear
. regress write read female
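      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   197) =   77.21
       Model |  7856.32118     2  3928.16059           Prob > F      =  0.0000
    Residual |  10022.5538   197  50.8759077           R-squared     =  0.4394
-------------+------------------------------           Adj R-squared =  0.4337
       Total |   17878.875   199   89.843593           Root MSE      =  7.1327

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        read |   .5658869   .0493849    11.46   0.000      .468496    .6632778
      female |   5.486894   1.014261     5.41   0.000      3.48669    7.487098
       _cons |   20.22837   2.713756     7.45   0.000     14.87663    25.58011
------------------------------------------------------------------------------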
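As a quick check on how the header statistics relate to the ANOVA table, the sketch below recomputes them from the sums of squares printed in the regress output. The scalar names are made up for illustration, and the reading score of 50 in the last line is a hypothetical input, not a value from the slides.

```stata
* Sums of squares and degrees of freedom copied from the regress output
scalar ss_model = 7856.32118
scalar ss_resid = 10022.5538
scalar ss_total = 17878.875
scalar df_model = 2
scalar df_resid = 197

* R-squared is the share of the total sum of squares explained by the model
display "R-squared = " %6.4f ss_model/ss_total

* F is the model mean square divided by the residual mean square
display "F(2, 197) = " %6.2f (ss_model/df_model)/(ss_resid/df_resid)

* Root MSE is the square root of the residual mean square
display "Root MSE  = " %6.4f sqrt(ss_resid/df_resid)

* Fitted value for a hypothetical female student (female = 1) with read = 50
display "Predicted write = " %6.2f 20.22837 + .5658869*50 + 5.486894*1
```

The first three lines should reproduce the 0.4394, 77.21, and 7.1327 shown in the output header.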
6. Tests for Normality of Residuals
We use the predict command with the resid
option to generate residuals and we name the
residuals r.
. predict r, resid
7. Tests for Normality of Residuals
Shapiro-Wilk W test for Normality
To check whether the residuals are normally
distributed, which is an important assumption
for inference in regression, we use the
Shapiro-Wilk W test for normal data.
. swilk r
                   Shapiro-Wilk W test for normal data

    Variable |    Obs       W           V         z       Prob>z
-------------+--------------------------------------------------
           r |    200    0.98714      1.919     1.499    0.06692
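The null hypothesis of the Shapiro-Wilk test is that the variable is normally distributed. With Prob>z = 0.06692, normality of the residuals is not rejected at the 5% level. The sketch below shows one way to read that decision from the stored result r(p), assuming hs1.dta (the dataset used in these slides) is in the working directory. The 0.05 cutoff is a conventional choice, not something the slides prescribe.

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear
quietly regress write read female

* Save the residuals under the name r, as in the slides
predict r, resid

* Shapiro-Wilk test; the p-value is left behind in r(p)
swilk r

if r(p) < 0.05 {
    display "Normality of the residuals is rejected at the 5% level"
}
else {
    display "Normality of the residuals is not rejected at the 5% level"
}
```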
10. Tests for Normality of Residuals
Normality of the residuals can also be checked
graphically. The kdensity command with the
normal option displays a kernel density plot of
the residuals with a normal density superimposed
on the graph.
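A minimal sketch of the call described here, again assuming hs1.dta is available and that the residuals are saved as r:

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear
quietly regress write read female
predict r, resid

* Kernel density of the residuals with a normal density overlaid
kdensity r, normal
```

If the two curves track each other closely, the plot gives no visual sign of non-normality.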
13. Tests for Normality of Residuals
The pnorm command produces a standardized
normal probability (P-P) plot. It is another
way to check whether the residuals from the
regression are normally distributed.
16. Tests for Normality of Residuals
The qnorm command produces a normal quantile
plot. It is yet another way to check whether
the residuals are normally distributed.
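The two plots can be drawn one after the other, as sketched below under the same assumptions (hs1.dta available, residuals saved as r). According to the Stata documentation, pnorm is more sensitive to departures from normality in the middle of the distribution, and qnorm to departures in the tails, so the plots complement each other.

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear
quietly regress write read female
predict r, resid

* Standardized normal probability (P-P) plot
pnorm r

* Quantiles of r against the quantiles of a normal distribution
qnorm r
```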
19. Summary of Tests for Normality of Residuals
swilk performs the Shapiro-Wilk W test for normality.
kdensity produces a kernel density plot with a normal distribution overlaid.
pnorm graphs a standardized normal probability (P-P) plot.
qnorm plots the quantiles of varname against the quantiles of a normal distribution.
21. Tests for Heteroscedasticity
One of the basic assumptions of ordinary least
squares regression is homogeneity of variance of
the residuals (homoscedasticity).
There are graphical and non-graphical methods
for detecting heteroscedasticity.
23. Tests for Heteroscedasticity
Cook-Weisberg test for heteroskedasticity
. hettest
Cook-Weisberg test for heteroskedasticity using fitted values of write
     Ho: Constant variance
         chi2(1)      =     5.79
         Prob > chi2  =   0.0161
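With Prob > chi2 = 0.0161, the null hypothesis of constant variance is rejected at the 5% level, so this output points to heteroscedasticity. In current Stata releases the same test is run as estat hettest, and its output is titled "Breusch-Pagan / Cook-Weisberg test". The sketch below uses that spelling and assumes hs1.dta is in the working directory. The 0.05 cutoff is a conventional choice.

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear
quietly regress write read female

* Current spelling of the hettest command shown in the slides
estat hettest

* The p-value of the test is left behind in r(p)
if r(p) < 0.05 {
    display "Constant variance is rejected at the 5% level"
}
else {
    display "Constant variance is not rejected at the 5% level"
}
```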
25. Tests for Heteroscedasticity
The rvfplot command graphs the residuals against
the fitted values. We use the yline(0) option to
put a reference line at y = 0.
. rvfplot, yline(0)
27. Summary of Tests for Heteroscedasticity
hettest performs the Cook-Weisberg test for heteroskedasticity.
rvfplot graphs a residual-versus-fitted plot.
29. Tests for Multicollinearity
Multicollinearity is a concern in multiple
regression not because it exists, but because of
its degree.
With severe multicollinearity, the estimates of
the regression coefficients become unstable and
their standard errors can become wildly inflated.
30. Tests for Multicollinearity
We can use the vif command after the regression to
check for multicollinearity.
vif stands for variance inflation factor.
. vif
    Variable |       VIF       1/VIF
-------------+----------------------
      female |      1.00    0.997182
        read |      1.00    0.997182
-------------+----------------------
    Mean VIF |      1.00
32. Tests for Multicollinearity
A variable whose VIF value is greater than 10 may
merit further investigation.
Tolerance, defined as 1/VIF, is used to check the
degree of collinearity. A tolerance value lower
than 0.1 is comparable to a VIF of 10.
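The VIF for a predictor is 1/(1 - R2), where R2 comes from regressing that predictor on all the other predictors, and the tolerance is 1 - R2. The sketch below computes both by hand for read and then calls the built-in command for comparison. It assumes hs1.dta is in the working directory. With only two predictors, the auxiliary regression of read on female has the same R2 as female on read, which is why both variables have the same tolerance in the vif table.

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear

* Auxiliary regression: one predictor on the remaining predictor(s)
quietly regress read female

* Tolerance = 1 - R2 and VIF = 1/(1 - R2) from the auxiliary regression
display "Tolerance for read = " %8.6f 1 - e(r2)
display "VIF for read       = " %8.2f 1/(1 - e(r2))

* Built-in version, run after the original model (estat vif is the
* current spelling of the vif command shown in the slides)
quietly regress write read female
estat vif
```

The hand-computed tolerance should match the 1/VIF column reported by the command.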
33. Tests for Multicollinearity
Summary of Tests for Multicollinearity
vif calculates the variance inflation factor for the
independent variables in the linear model.
35. Tests for Autocorrelation
. tsset id
time variable: id, 1 to 200
. dwstat
Durbin-Watson d-statistic( 3, 200) = 1.93992
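The Durbin-Watson d-statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, so 1.93992 gives little sign of it here. The statistic depends on the ordering of the observations, so it is only meaningful if the order of id reflects a real sequence. The sketch below reproduces the calculation by hand, assuming hs1.dta is available and id runs from 1 to 200 without gaps. The variable names uhat, num, and den are made up for illustration.

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear

* Declare id as the time variable, as in the slides
tsset id
quietly regress write read female

* estat dwatson is the current spelling of the dwstat command
estat dwatson

* Manual computation: sum of (u_t - u_{t-1})^2 over sum of u_t^2
predict double uhat, resid
generate double num = (uhat - L.uhat)^2
generate double den = uhat^2
quietly summarize num
scalar sum_num = r(sum)
quietly summarize den
display "Durbin-Watson d = " %7.5f sum_num/r(sum)
```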
36. Tests for Model Specification
A model specification error can occur when one or more
relevant variables are omitted from the model or one or
more irrelevant variables are included in the model.
37. Tests for Model Specification
There are several methods to detect
specification errors.
The linktest command performs a model
specification link test for single-equation models.
38. Tests for Model Specification
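. linktest

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  2,   197) =   79.86
       Model |  8005.11739     2  4002.55869           Prob > F      =  0.0000
    Residual |  9873.75761   197   50.120597           R-squared     =  0.4477
-------------+------------------------------           Adj R-squared =  0.4421
       Total |   17878.875   199   89.843593           Root MSE      =  7.0796

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        _hat |   2.807497   1.052071     2.67   0.008     .7327302    4.882264
      _hatsq |  -.0170281   .0098827    -1.72   0.086    -.0365176    .0024615
       _cons |  -47.29516   27.77544    -1.70   0.090    -102.0705    7.480201
------------------------------------------------------------------------------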
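The link test regresses the dependent variable on the fitted values (_hat) and their squares (_hatsq). If the model is correctly specified, _hatsq should have no explanatory power. Here its p-value is 0.086, so it is not significant at the 5% level. The sketch below carries out the same regression by hand, assuming hs1.dta is in the working directory. The variable names yhat and yhat_sq are made up for illustration.

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear
quietly regress write read female

* Linear prediction from the original model and its square
predict yhat, xb
generate yhat_sq = yhat^2

* Same regression that linktest runs on _hat and _hatsq
regress write yhat yhat_sq
```

The coefficient on yhat_sq plays the role of _hatsq in the linktest output.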
39. Tests for Model Specification
The ovtest command performs a regression
specification error test (RESET) for omitted
variables.
. ovtest
Ramsey RESET test using powers of the fitted values of write
       Ho:  model has no omitted variables
                 F(3, 194) =      1.95
                  Prob > F =      0.1233
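With Prob > F = 0.1233, the hypothesis of no omitted variables is not rejected at the 5% level. The F(3, 194) reflects three added terms (the second, third, and fourth powers of the fitted values) and 200 - 6 residual degrees of freedom. The sketch below is a hand-rolled version of the idea, assuming hs1.dta is in the working directory. Standardizing the fitted values before taking powers is a choice made here to limit numerical collinearity and may not match Stata's internal rescaling. The variable names are made up for illustration.

```stata
* Assumes hs1.dta from the slides is available in the working directory
use hs1, clear
quietly regress write read female

* Fitted values, standardized before taking powers (illustrative choice)
predict yhat, xb
quietly summarize yhat
generate double zhat = (yhat - r(mean))/r(sd)
generate double zhat2 = zhat^2
generate double zhat3 = zhat^3
generate double zhat4 = zhat^4

* Augmented regression, then a joint F test on the added powers
quietly regress write read female zhat2 zhat3 zhat4
test zhat2 zhat3 zhat4
```

A large p-value from the joint test is read the same way as the ovtest result.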
42. Tests for Model Specification
Summary of Tests for Model Specification
linktest performs a link test for model specification.
ovtest performs the regression specification error
test (RESET) for omitted variables.