This document analyses a data set of consumption, income, and liquid asset values for 38 observations.
(i) A regression model relates consumption to income and liquid assets. The very high R-squared and the variance inflation factor indicate potential multicollinearity between the predictors.
(ii) Regressions that exclude one predictor at a time still fit almost as well as the full model, which is consistent with highly correlated predictors.
(iii) Theil's measure of multicollinearity further supports the existence of multicollinearity within the data set.
ASSIGNMENT
Centre for Management Studies
University of Petroleum & Energy Studies
Submitted to: Prof. I. Krishnamurthy
Submitted by: Richa Pandey, MBA (Avm), R120108036
Solution
Fitting a regression to the given data set gives:
Regression Analysis: C1 versus C2, C3
The regression equation is
C1 = - 10.6 + 0.682 C2 + 0.373 C3
Predictor      Coef  SE Coef      T      P
Constant    -10.627    3.273  -3.25  0.003
C2          0.68166  0.07098   9.60  0.000
C3          0.37252  0.09656   3.86  0.000
S = 1.76348   R-Sq = 99.5%   R-Sq(adj) = 99.5%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       2  23165  11583  3724.45  0.000
Residual Error  35    109      3
Total           37  23274
Source  DF  Seq SS
C2       1   23119
C3       1      46
(i) The regression model is
Consumption = -10.6 + 0.682 Income + 0.373 Liquid asset.
(ii) R-square = 99.5%. Since R-square is very high, the problem of
multicollinearity may exist.
(iii) The standard error of the constant is 3.273, which is not high, so nothing
can be concluded from this observation.
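(iv) The t-values for Income (C2) and liquid asset (C3) are 9.60 and 3.86 respectively.
Both are greater than 2, so both coefficients are significant. The usual symptom of
multicollinearity is a high R-square together with insignificant t-values, so this
check does not confirm the problem, but it does not rule it out either.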
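The figures quoted in (ii) and (iv) can be re-derived from the printed output. The Python sketch below recomputes R-square from the sums of squares in the Analysis of Variance table and each t-value as Coef divided by SE Coef. The variable names are illustrative, and small differences from the printed figures would only reflect rounding in the output.

```python
# Values copied from the regression output above (rounded as printed).
ss_regression = 23165.0
ss_total = 23274.0

# R-square is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# (coefficient, standard error) pairs from the Predictor table.
coefficients = {
    "Constant": (-10.627, 3.273),
    "C2 (Income)": (0.68166, 0.07098),
    "C3 (Liquid asset)": (0.37252, 0.09656),
}

# Each t-value is the coefficient divided by its standard error.
t_values = {name: coef / se for name, (coef, se) in coefficients.items()}

print(f"R-Sq = {r_squared:.1%}")
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

The recomputed values agree with the printed R-Sq and T columns to the precision shown.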
Correlations: C2, C3
Pearson correlation of C2 and C3 = 0.988
P-Value = 0.000
The correlation between Income and liquid asset is very high, at 0.988. This is an indication of
the existence of multicollinearity.
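For reference, the Pearson correlation is the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. A minimal Python sketch follows. The `pearson` helper and the short sample lists are hypothetical illustrations, not the assignment's data set, which is not reproduced in this document.

```python
import math

def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustrative values (NOT the assignment's data set).
income = [100.0, 120.0, 150.0, 170.0, 200.0]
liquid_asset = [80.0, 92.0, 118.0, 130.0, 151.0]
r = pearson(income, liquid_asset)
print(f"r = {r:.3f}")
```

A value of r close to 1 in absolute terms, as with the 0.988 reported above, means the two predictors carry nearly the same information.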
After dropping the last observation, the new result is as follows:
Regression Analysis: C1 versus C2, C3
The regression equation is
C1 = - 12.2 + 0.657 C2 + 0.413 C3
Predictor      Coef  SE Coef      T      P
Constant    -12.185    3.396  -3.59  0.001
C2          0.65690  0.07193   9.13  0.000
C3          0.41272  0.09902   4.17  0.000
S = 1.73621   R-Sq = 99.5%   R-Sq(adj) = 99.5%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       2  21647  10824  3590.60  0.000
Residual Error  34    102      3
Total           36  21750
Source  DF  Seq SS
C2       1   21595
C3       1      52
After dropping the last observation, the new regression model is
Consumption = -12.185 + 0.657 Income + 0.413 Liquid asset.
There is not much change in the coefficients after dropping an observation, so we
cannot conclude anything from this observation.
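One way to put a number on "not much change" is the percentage change in each coefficient between the two fits. The sketch below uses the coefficients printed above. The `percent_change` helper is an illustrative name, and what counts as a large change remains a judgment call rather than a formal test.

```python
# Coefficients from the two regression outputs above.
full_fit = {"Constant": -10.627, "C2 (Income)": 0.68166, "C3 (Liquid asset)": 0.37252}
reduced_fit = {"Constant": -12.185, "C2 (Income)": 0.65690, "C3 (Liquid asset)": 0.41272}

def percent_change(old, new):
    """Relative change from old to new, as a percentage of |old|."""
    return 100.0 * (new - old) / abs(old)

# Compare each coefficient before and after dropping the last observation.
changes = {name: percent_change(full_fit[name], reduced_fit[name]) for name in full_fit}
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```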
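Variance Inflation Factor (VIF):
VIF(Income) = 1 / (1 - R_i^2), where R_i^2 is the R-square obtained by regressing Income (C2) on the other predictor, liquid asset (C3).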
Regression Analysis: C2 versus C3
The regression equation is
C2 = - 5.64 + 1.34 C3
Predictor     Coef  SE Coef      T      P
Constant    -5.638    7.628  -0.74  0.465
C3         1.34394  0.03528  38.09  0.000
S = 4.14102   R-Sq = 97.6%   R-Sq(adj) = 97.5%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       1  24884  24884  1451.15  0.000
Residual Error  36    617     17
Total           37  25502
Unusual Observations
Obs   C3       C2      Fit  SE Fit  Residual  St Resid
 13  208  263.000  273.364   0.726   -10.364    -2.54R
R denotes an observation with a large standardized residual.
VIF = 41.322
When the VIF is more than 10, the predictors are highly correlated. Here the VIF is well above 10, so Income and liquid asset are highly correlated.
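As a check on the arithmetic, the VIF can be recomputed from the regression of C2 on C3. The sketch below does so twice: once from the rounded R-Sq of 97.6%, and once from the sums of squares in the Analysis of Variance table. The `vif` function name is illustrative. Both results land close to the reported 41.322, which was presumably computed from unrounded values.

```python
def vif(r_squared):
    """Variance inflation factor given the R-square of the auxiliary regression."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)

# R-square of C2 on C3 as printed (rounded to 97.6%).
vif_rounded = vif(0.976)

# Same R-square rebuilt from the printed sums of squares: 1 - SSE / SST.
vif_from_ss = vif(1.0 - 617.0 / 25502.0)

print(f"VIF from R-Sq = 97.6%: {vif_rounded:.2f}")
print(f"VIF from SS table:     {vif_from_ss:.2f}")
```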
Theil's measure
Excluding Income
Regression Analysis: C1 versus C3
The regression equation is
C1 = - 14.5 + 1.29 C3
Predictor      Coef  SE Coef      T      P
Constant    -14.470    6.107  -2.37  0.023
C3          1.28863  0.02825  45.62  0.000
S = 3.31535   R-Sq = 98.3%   R-Sq(adj) = 98.3%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       1  22878  22878  2081.45  0.000
Residual Error  36    396     11
Total           37  23274
Unusual Observations
Obs   C3       C1      Fit  SE Fit  Residual  St Resid
 34  238  299.500  292.739   0.844     6.761     2.11R
R denotes an observation with a large standardized residual.
Excluding liquid asset
Regression Analysis: C1 versus C2
The regression equation is
C1 = - 7.16 + 0.952 C2
Predictor     Coef  SE Coef      T      P
Constant    -7.160    3.705  -1.93  0.061
C2         0.95213  0.01300  73.25  0.000
S = 2.07583   R-Sq = 99.3%   R-Sq(adj) = 99.3%
Analysis of Variance
Source          DF     SS     MS        F      P
Regression       1  23119  23119  5365.14  0.000
Residual Error  36    155      4
Total           37  23274
Unusual Observations
Obs   C2       C1      Fit  SE Fit  Residual  St Resid
 13  263  248.700  243.252   0.432     5.448     2.68R
R denotes an observation with a large standardized residual.
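Theil's measure m is the R-square of the full model minus the sum of the incremental contributions of each predictor, where each contribution is the full R-square minus the R-square obtained when that predictor is excluded. Here the full R-square is 0.9953, the R-square excluding Income is 0.9829, and the R-square excluding liquid asset is 0.9933.
m = 0.9953 - ((0.9953 - 0.9829) + (0.9953 - 0.9933))
= 0.9809
Since m is not equal to zero, we can say that multicollinearity exists.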
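The calculation of m can be sketched in Python as follows. The `theil_measure` function and its parameter names are illustrative, and the three R-square values are the ones used in the calculation above.

```python
def theil_measure(r2_full, r2_without):
    """Theil's multicollinearity measure.

    r2_full: R-square of the regression on all predictors.
    r2_without: R-squares of the regressions that each omit one predictor.
    """
    # Incremental contribution of each predictor to the full R-square.
    incremental = [r2_full - r2 for r2 in r2_without]
    return r2_full - sum(incremental)

# Full model, then the models excluding Income and excluding liquid asset.
m = theil_measure(0.9953, [0.9829, 0.9933])
print(f"m = {m:.4f}")
```

A value of m near zero would suggest that the predictors contribute almost independently. The value obtained here is far from zero.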