THE METHOD USED IS OLS WITH PER CAPITA GDP AS THE EXPLAINED VARIABLE AND INFLATION, INVESTMENT RATIO, LIFE EXPECTANCY AND LITERACY RATE AS EXPLANATORY VARIABLES. We now provide a brief explanation of the explanatory variables-- ANALYSIS
REGRESSION RESULTS
Dependent Variable: PCI
Method: Least Squares
Sample: 1 20
Included observations: 20

Variable    Coefficient    Std. Error    t-Statistic    Prob.
I           -38.4496       13.3138       -2.888         0.0113
INV         -392.582       98.1917       -3.9981        0.0012
LE          571.6545       119.109       4.7994         0.0002
LIT         156.2885       54.5576       2.8647         0.0118
C           -31508         6827.77       -4.6147        0.0003

R-squared             0.883177    Mean dependent var       10923
Adjusted R-squared    0.852024    S.D. dependent var       8981.6
S.E. of regression    3455.021    Akaike info criterion    19.345
Sum squared resid     1.79E+08    Schwarz criterion        19.594
Log likelihood        -188.454    F-statistic              28.35
Durbin-Watson stat    2.4797      Prob(F-statistic)        0.000001
INTERPRETATION OF THE RESULTS <ul><li>The independent variables I, INV, LE and LIT are all significant at the 5% level, as can be seen from the respective t-statistics and the associated probabilities. </li></ul><ul><li>The significant F-statistic (Prob. = 0.000001) and the high adjusted R² (0.85) indicate that the explanatory variables together significantly explain the explained variable, PCI. In other words, the model has a good fit. </li></ul>
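As a rough cross-check of the fit statistics quoted above, the adjusted R² and the overall F-statistic can be recomputed from the reported R² alone. The sketch below assumes the textbook formulas for a regression with an intercept; the function names are illustrative and are not part of the original output.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k slope coefficients (plus intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """F-statistic for the joint significance of the k slope coefficients."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# Values reported for the first regression: R-squared, 20 observations, 4 regressors.
r2_full, n_full, k_slopes = 0.883177, 20, 4

adj_r2 = adjusted_r2(r2_full, n_full, k_slopes)
f_stat = overall_f(r2_full, n_full, k_slopes)

print(f"adjusted R-squared = {adj_r2:.6f}")
print(f"F-statistic        = {f_stat:.2f}")
```

Both values should agree with the reported 0.852024 and 28.35 up to rounding of the inputs.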
RESIDUAL ANALYSIS THE RESIDUAL TABLE AND GRAPH INDICATE THE PRESENCE OF SOME OUTLIERS. OBSERVATIONS 3, 4 & 13 HAVE HIGHER RESIDUALS THAN THE OTHERS. THESE COUNTRIES ARE SRI LANKA, USA AND JAPAN RESPECTIVELY. ALTHOUGH ANALYSING THE RESIDUALS IS NOT A VERY GOOD INDICATOR OF OUTLIERS, LOOKING AT THE INDIVIDUAL OBSERVATIONS SUGGESTS THAT THESE COULD BE OUTLIERS: SRI LANKA HAS HIGH LITERACY AND LIFE EXPECTANCY RATES IN SPITE OF BEING A LOW PER CAPITA INCOME COUNTRY, AND THE REMAINING TWO HAVE A HIGH PER CAPITA INCOME COMPARED TO THE OTHER COUNTRIES. THESE DIFFERENCES COULD BE DUE TO THE DIFFERENT NATIONAL POLICIES THAT THEIR GOVERNMENTS FOLLOW. WE WILL RUN ANOTHER REGRESSION AFTER DELETING THESE OBSERVATIONS.
REGRESSION RESULTS AFTER DELETING THE OUTLIERS
Dependent Variable: PCI
Method: Least Squares
Sample: 1 17
Included observations: 17

Variable    Coefficient    Std. Error    t-Statistic    Prob.
I           -36.57134      6.271489      -5.831365      0.0001
INV         -360.7367      47.65338      -7.570013      0.0000
LE          553.5307       56.83341      9.739529       0.0000
LIT         149.2246       25.52452      5.846322       0.0001
C           -30431.88      3279.835      -9.278479      0.0000

R-squared             0.972098    Mean dependent var       9835.118
Adjusted R-squared    0.962798    S.D. dependent var       8295.732
S.E. of regression    1600.071    Akaike info criterion    17.83341
Sum squared resid     30722718    Schwarz criterion        18.07847
Log likelihood        -146.584    F-statistic              104.5204
Durbin-Watson stat    1.23126     Prob(F-statistic)        0.00000
COMPARISON BETWEEN THE TWO EQUATIONS

            Coefficient               Std. Error             Prob. of t-statistic
Variable    NEW          OLD          NEW        OLD         NEW       OLD
C           -30431.88    -31508.03    3279.84    6827.771    0.0000    0.0003
I           -36.57134    -38.44955    6.27149    13.31378    0.0001    0.0113
INV         -360.7367    -392.5823    47.6534    98.19166    0.0000    0.0012
LE          553.5307     571.6545     56.8334    119.1089    0.0000    0.0002
LIT         149.2246     156.2885     25.5245    54.55759    0.0001    0.0118

OLD R-squared = 0.883177
NEW R-squared = 0.972098
INTERPRETATION OF THE RESULTS <ul><li>There is a marked improvement in the t-statistics of all the independent variables, as is evident from the fall in both the standard errors and the associated probabilities. Thus the explanatory variables attain greater power to explain the dependent variable, PCI, after the outliers are removed. </li></ul><ul><li>The value of R² is also higher in the new model than in the original one, indicating that the new model has a better fit. </li></ul><ul><li>The above observations support our choice of outliers and their deletion. </li></ul>
WE WILL NOW CHECK WHETHER OUR OLS ESTIMATES ARE VALID BY TESTING WHETHER SOME OF THE STANDARD CLRM ASSUMPTIONS ARE VIOLATED. FOR THIS WE CARRY OUT A FEW TESTING EXERCISES.
Estimated Equation: PCI = - 30431.88104 - 36.57134273*I - 360.736706*INV + 553.5306553*LE + 149.2245611*LIT
Thus, while Inflation and Investment are inversely related to PCI, Life Expectancy and Literacy Rate influence it positively.
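To illustrate how the estimated equation is used, the sketch below computes the fitted PCI and the residual for one country. The coefficients are copied from the estimated equation; the inputs are India's values as listed in the sorted data table of this presentation. The function and variable names are illustrative assumptions.

```python
# Coefficients copied from the estimated equation (17-observation regression).
COEF = {
    "C": -30431.88104,
    "I": -36.57134273,
    "INV": -360.736706,
    "LE": 553.5306553,
    "LIT": 149.2245611,
}


def predict_pci(inflation, investment, life_exp, literacy):
    """Fitted per capita income from the estimated linear equation."""
    return (
        COEF["C"]
        + COEF["I"] * inflation
        + COEF["INV"] * investment
        + COEF["LE"] * life_exp
        + COEF["LIT"] * literacy
    )


# India's row from the sorted data table: PCI, I, INV, LE, LIT.
india = {"pci": 1348.0, "inflation": 9.8, "investment": 25.0,
         "life_exp": 61.3, "literacy": 51.2}

fitted = predict_pci(india["inflation"], india["investment"],
                     india["life_exp"], india["literacy"])
residual = india["pci"] - fitted  # actual minus fitted

print(f"fitted PCI = {fitted:.2f}, residual = {residual:.2f}")
```

The signs of the coefficients can be read off directly: raising `inflation` or `investment` lowers the fitted value, while raising `life_exp` or `literacy` increases it.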
JARQUE-BERA NORMALITY TEST ON THE RESIDUALS This test checks the normality of the disturbance terms. The null hypothesis is that the error terms u are N(0, σ²). The test is in fact a joint test of the null hypothesis that the skewness E(u³)/σ³ is zero and the kurtosis E(u⁴)/σ⁴ is equal to 3, which holds if the u's are N(0, σ²) distributed. Under the null hypothesis the test statistic (for large n) has a χ² distribution with 2 degrees of freedom.
The Null Hypothesis cannot be rejected at either the 5% or the 1% level of significance, so the residuals are consistent with a normal distribution. However, the Jarque-Bera test is an asymptotic test and our sample size is only 17, so the validity of the test is suspect.
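The Jarque-Bera statistic can be computed by hand from the sample skewness and kurtosis of the residuals. The sketch below uses the textbook form JB = n/6 · (S² + (K − 3)²/4); statistical packages may apply a slightly different degrees-of-freedom scaling. The residual values are hypothetical and serve only to show the mechanics; they are not the residuals of the model above.

```python
def jarque_bera(residuals):
    """Return (skewness, kurtosis, JB statistic) using population moments."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb


# Hypothetical residuals, for illustration only.
example_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt, jb = jarque_bera(example_residuals)

# Chi-square critical values with 2 degrees of freedom.
CRIT_5PCT, CRIT_1PCT = 5.99, 9.21
print(f"S = {skew:.3f}, K = {kurt:.3f}, JB = {jb:.4f}")
print("reject normality at 5%:", jb > CRIT_5PCT)
print("reject normality at 1%:", jb > CRIT_1PCT)
```

With only a handful of observations the χ² approximation is poor, which is the same small-sample caveat noted above for n = 17.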
THE RAMSEY RESET TEST The Ramsey RESET test is a test of functional specification: it checks for any functional mis-specification. As suggested by Ramsey, the original regression is augmented with powers of the fitted values of the dependent variable, namely ŷ², ŷ³, …, and the null hypothesis of correct specification is tested by checking whether the coefficients on these added terms are significant or not. Here the test has been carried out with the number of fitted terms set to 1, so that only ŷ² is added.
RESULTS We can see that the coefficient on the added term FITTED^2 is not significantly different from zero, as indicated by its probability value (0.6532). Thus we cannot reject the null hypothesis that our regression has a linear specification.

F-statistic             0.213322    Probability    0.653176
Log likelihood ratio    0.326523    Probability    0.567714

Dependent Variable: PCI
Method: Least Squares
Sample: 1 17
Included observations: 17

Variable    Coefficient    Std. Error    t-Statistic    Prob.
I           -25.19922      25.46244      -0.989662      0.3436
INV         -252.5063      239.4612      -1.054477      0.3143
LE          436.2553       260.6334      1.673827       0.1223
LIT         117.1873       74.22027      1.578913       0.1427
C           -24374.78      13546.16      -1.799387      0.0994
FITTED^2    1.01E-05       2.18E-05      0.461868       0.6532

R-squared             0.972629    Mean dependent var       9835.118
Adjusted R-squared    0.960188    S.D. dependent var       8295.732
S.E. of regression    1655.246    Akaike info criterion    17.93185
Sum squared resid     30138250    Schwarz criterion        18.22593
Log likelihood        -146.4207   F-statistic              78.17742
Durbin-Watson stat    1.147307    Prob(F-statistic)        0.00000
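The RESET F-statistic can be reproduced from the two sums of squared residuals reported in this presentation: 30722718 for the 17-observation regression without the added term and 30138250 for the augmented regression. The sketch below uses the standard restricted-versus-unrestricted F formula; the function name is an illustrative assumption.

```python
def restriction_f(rss_restricted, rss_unrestricted, n, k_unrestricted, q):
    """F-statistic for q restrictions, given restricted and unrestricted RSS."""
    numerator = (rss_restricted - rss_unrestricted) / q
    denominator = rss_unrestricted / (n - k_unrestricted)
    return numerator / denominator


# Reported values: original model (5 coefficients) vs. model with FITTED^2 added.
rss_original = 30722718.0
rss_augmented = 30138250.0
n_obs = 17
k_augmented = 6   # I, INV, LE, LIT, C and FITTED^2
q_added = 1       # one fitted term

f_reset = restriction_f(rss_original, rss_augmented, n_obs, k_augmented, q_added)
print(f"RESET F(1, {n_obs - k_augmented}) = {f_reset:.6f}")
```

With a single added term, this F-statistic equals the square of the t-statistic on FITTED^2 (0.461868² ≈ 0.2133), which is why both carry the same probability value.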
WHITE HETEROSKEDASTICITY TEST Heteroskedasticity refers to the situation in which the variance of the error term in the regression equation is not constant but varies with the independent variables. In the presence of heteroskedasticity, the Ordinary Least Squares estimates, although still unbiased, are no longer efficient. We use the WHITE HETEROSKEDASTICITY TEST to detect heteroskedasticity: one simply computes an auxiliary regression of the squared OLS residuals on a constant and all nonredundant variables in the set consisting of the regressors, their squares and their cross products.
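The set of auxiliary regressors described above can be enumerated mechanically. The sketch below lists the levels, squares and cross products for the four explanatory variables of this model and counts them; this count is the number of degrees of freedom of the χ² distribution used for the nR² statistic. The variable names follow the regression output, and the code itself is only an illustration.

```python
from itertools import combinations_with_replacement

regressors = ["I", "INV", "LE", "LIT"]

# Levels of the original regressors.
levels = list(regressors)

# Squares (a == b) and cross products (a != b), each pair counted once.
second_order = [f"{a}*{b}" for a, b in combinations_with_replacement(regressors, 2)]

aux_terms = levels + second_order
white_df = len(aux_terms)  # regressors in the auxiliary regression, excluding the constant

print(aux_terms)
print("degrees of freedom:", white_df)
```

With 17 observations, an auxiliary regression with this many terms plus a constant leaves very few residual degrees of freedom, so the test should be read with caution in a sample of this size.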
RESULTS <ul><li>We see that the test statistic nR² = 10.44798 is less than the 5% critical value χ²0.05(14) = 23.685. So the null hypothesis of homoskedasticity cannot be rejected, i.e. there is no evidence of heteroskedasticity in the data. </li></ul><ul><li>ALSO, THE P-VALUE CORRESPONDING TO THE F-STATISTIC SUGGESTS THAT WE CANNOT REJECT THE NULL HYPOTHESIS THAT THERE IS NO HETEROSKEDASTICITY IN THE DATA. </li></ul><ul><li>THE WHITE HETEROSKEDASTICITY TEST HAS VERY LOW POWER, SO WE SHALL VERIFY THE RESULT USING THE GOLDFELD-QUANDT TEST. </li></ul><ul><li>FOR THIS PURPOSE, THE DATA WERE ARRANGED IN ASCENDING ORDER OF THE LIFE EXPECTANCY VARIABLE. WE THEN RAN TWO SEPARATE REGRESSIONS ON THE FIRST AND THE LAST SIX OBSERVATIONS. </li></ul>
SORTED DATA

Country       GDP PC     Inflation    Investment ratio    Life Expectancy    Literacy Rates
              (PCI)      (I)          (INV)               (LE)               (LIT)
Zimbabwe      2196.0     20.9         22.0                49.0               84.7
Kenya         1404.0     13.0         19.0                53.6               77.0
Bangladesh    1331.0     6.4          17.0                56.4               37.3
India         1348.0     9.8          25.0                61.3               51.2
Pakistan      2154.0     9.2          19.0                62.3               37.1
Indonesia     3740.0     8.8          38.0                63.5               83.2
Russia        4828.0     148.9        25.0                65.7               98.7
China         2604.0     9.3          40.0                68.9               80.9
Thailand      7104.0     5.0          43.0                69.5               93.5
Korea         10656.0    6.7          37.0                71.5               97.9
Argentina     8937.0     255.6        18.0                72.4               96.0
Germany       19675.0    3.0          21.0                76.3               99.0
UK            18620.0    5.1          16.0                76.7               99.0
Norway        21346.0    3.0          23.0                77.5               99.0
Australia     19285.0    3.7          23.0                78.1               99.0
France        20510.0    2.8          18.0                78.7               99.0
Canada        21459.0    2.9          19.0                79.0               99.0
RESULTS OF GOLDFELD-QUANDT TEST <ul><li>We divide the entire set of 17 observations into 3 groups consisting of the first 6, the next 5 and the last 6 observations respectively. </li></ul><ul><li>We run two separate regressions on the first and the last 6 observations and note the values of RSS thus obtained. </li></ul><ul><li>The RSS from the first group comes out to be 1151046. </li></ul><ul><li>The RSS from the last group comes out to be 2016013. </li></ul><ul><li>We form an F-statistic as the ratio of the two residual sums of squares, each divided by its degrees of freedom: F = (2016013 / 1) / (1151046 / 1) ≈ 1.75. Each regression uses 6 observations to estimate 5 coefficients, leaving 1 degree of freedom. </li></ul><ul><li>This follows an F-distribution with (1, 1) degrees of freedom. </li></ul><ul><li>From the statistical table under F distribution, we see that: </li></ul><ul><li>F(1,1; 0.05) = 161 and F(1,1; 0.01) = 4052 </li></ul><ul><li>Thus the value of the F-statistic obtained is less than the tabular value at both the 5% and 1% levels of significance. We therefore do not reject the Null Hypothesis of Homoskedasticity (constant variance) at either level, i.e., there is no evidence of Heteroskedasticity in our data set. With only one degree of freedom in each regression, however, the test has very little power. </li></ul>
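The Goldfeld-Quandt statistic is the ratio of the two residual variances. A minimal sketch using the RSS values reported above follows; the function name is an illustrative assumption, and the critical values are the tabulated ones quoted on this slide.

```python
def goldfeld_quandt_f(rss_low, rss_high, n_low, n_high, k):
    """Ratio of residual variances from the two sub-sample regressions.

    Returns the F-statistic with its numerator and denominator degrees of freedom.
    """
    df_low = n_low - k
    df_high = n_high - k
    f_value = (rss_high / df_high) / (rss_low / df_low)
    return f_value, df_high, df_low


# First and last 6 observations, 5 estimated coefficients in each regression.
f_gq, df_num, df_den = goldfeld_quandt_f(
    rss_low=1151046.0, rss_high=2016013.0, n_low=6, n_high=6, k=5
)

# Tabulated critical values for F(1, 1), as quoted above.
F_CRIT_5PCT, F_CRIT_1PCT = 161.0, 4052.0

print(f"F({df_num}, {df_den}) = {f_gq:.4f}")
print("reject homoskedasticity at 5%:", f_gq > F_CRIT_5PCT)
print("reject homoskedasticity at 1%:", f_gq > F_CRIT_1PCT)
```

Because each sub-sample regression has a single degree of freedom, the critical values are extremely large, so a failure to reject here is weak evidence at best.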
PARAMETER STABILITY TESTS
THE CUSUM TEST: SINCE THE CUMULATIVE SUM LIES INSIDE THE AREA BETWEEN THE TWO CRITICAL LINES, WE CAN SAY THAT THE PARAMETERS ARE CONSTANT IN TERMS OF INTERCEPT.
THE CUSUMSQ TEST: HERE WE CAN SAY THAT THE PARAMETERS ARE CONSTANT IN TERMS OF VARIANCE.
THE RECURSIVE RESIDUALS TEST: THE RESIDUALS ARE INSIDE THE STANDARD ERROR BANDS. THIS SUGGESTS THAT THE PARAMETERS ARE STABLE.
CHOW FORECAST TEST <ul><li>One of the most important criteria for an estimated equation is that it should be relevant for data outside the sample used in the estimation, i.e. parameter constancy. To examine this we use the Chow Forecast Test, which is one of the most useful tests of predictive accuracy. </li></ul><ul><li>The procedure is to divide the data set into n₁ observations to be used for estimation and n₂ observations for testing. The null hypothesis for this test is parameter constancy. We use an F-statistic with degrees of freedom n₂ and (n₁ − k), where k is the number of estimated coefficients. </li></ul><ul><li>In the model under consideration: </li></ul><ul><li>n₁ = 14 </li></ul><ul><li>n₂ = 3 </li></ul>
Chow Forecast Test: Forecast from 15 to 17

F-statistic             1.476388    Probability    0.285454
Log likelihood ratio    6.80347     Probability    0.078433

Dependent Variable: PCI
Method: Least Squares
Sample: 1 14
Included observations: 14

Variable    Coefficient    Std. Error    t-Statistic    Prob.
I           -34.69875      6.000098      -5.783031      0.0003
INV         -394.0349      58.12046      -6.779625      0.0001
LE          534.8819       54.46534      9.820593       0.0000
LIT         144.0358       24.30871      5.925275       0.0002
C           -28332.41      3331.343      -8.504801      0.0000

R-squared             0.978415    Mean dependent var       9149.357
Adjusted R-squared    0.968821    S.D. dependent var       8565.987
S.E. of regression    1512.535    Akaike info criterion    17.75341
Sum squared resid     20589850    Schwarz criterion        17.98165
Log likelihood        -119.2739   F-statistic              101.9883
Durbin-Watson stat    1.396911    Prob(F-statistic)        0.0000
From the statistical table under F distribution, we see that: F(3,9; 0.05) = 3.86 and F(3,9; 0.01) = 6.99. The Chow F-statistic obtained above is F(3,9) = 1.476388. Thus the value of the F-statistic obtained is less than the tabular value at both the 5% and 1% levels of significance. We therefore do not reject the Null Hypothesis of parameter constancy at either level.
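The Chow forecast F-statistic can be reproduced from the sums of squared residuals reported in this presentation: 30722718 for the full 17-observation regression and 20589850 for the regression on the first 14 observations. The sketch below applies the usual predictive-failure formula; the function name is an illustrative assumption.

```python
def chow_forecast_f(rss_full, rss_estimation, n1, n2, k):
    """Chow forecast (predictive failure) F-statistic with (n2, n1 - k) d.f."""
    numerator = (rss_full - rss_estimation) / n2
    denominator = rss_estimation / (n1 - k)
    return numerator / denominator


rss_all_17 = 30722718.0    # regression on all 17 observations
rss_first_14 = 20589850.0  # regression on observations 1 to 14
n1, n2, k = 14, 3, 5

f_chow = chow_forecast_f(rss_all_17, rss_first_14, n1, n2, k)

# Tabulated critical values for F(3, 9), as quoted above.
F_CRIT_5PCT, F_CRIT_1PCT = 3.86, 6.99

print(f"Chow forecast F({n2}, {n1 - k}) = {f_chow:.6f}")
print("reject parameter constancy at 5%:", f_chow > F_CRIT_5PCT)
print("reject parameter constancy at 1%:", f_chow > F_CRIT_1PCT)
```

The result should match the reported 1.476388 up to rounding in the printed RSS values.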
MULTICOLLINEARITY <ul><li>In order to check whether there is any problem of correlation among the explanatory variables, i.e., whether our model suffers from multicollinearity, we use the Variance Inflating Factor (VIF), where VIF = 1 / (1 − R²). </li></ul><ul><li>We regress each explanatory variable on the others; these are referred to as Auxiliary Regressions. R² is the squared multiple correlation coefficient obtained from each of these regressions. If the VIF corresponding to any auxiliary regression is greater than 10, we say that the model has a severe multicollinearity problem, and a VIF between 2 and 10 implies moderate multicollinearity. The general solution suggested is to drop the corresponding regressor. </li></ul>
We have obtained the following results:

Regressors    VIF
I             1.123
INV           1.105
LEX           1.781
LIT           1.911

From these results we see that the VIFs of all the regressors are below 2. Thus we conclude that the model is free from any serious multicollinearity problem; in other words, the explanatory variables are not strongly correlated with one another.
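Since VIF = 1 / (1 − R²), each reported VIF can be converted back into the R² of its auxiliary regression, which is often easier to interpret. The sketch below does this for the four reported values and applies the thresholds stated above (greater than 10 severe, between 2 and 10 moderate). The helper names and the label "low" for values below 2 are illustrative assumptions.

```python
# VIF values as reported in the table (LEX is the life expectancy variable).
reported_vif = {"I": 1.123, "INV": 1.105, "LEX": 1.781, "LIT": 1.911}


def vif_from_r2(r2):
    """Variance inflating factor implied by an auxiliary-regression R-squared."""
    return 1.0 / (1.0 - r2)


def r2_from_vif(vif):
    """Auxiliary-regression R-squared implied by a variance inflating factor."""
    return 1.0 - 1.0 / vif


def classify(vif):
    """Apply the rule of thumb used in this presentation."""
    if vif > 10:
        return "severe"
    if vif >= 2:
        return "moderate"
    return "low"


aux_r2 = {name: r2_from_vif(v) for name, v in reported_vif.items()}
labels = {name: classify(v) for name, v in reported_vif.items()}

for name in reported_vif:
    print(f"{name:4s} VIF = {reported_vif[name]:.3f}  "
          f"implied R2 = {aux_r2[name]:.3f}  ({labels[name]})")
```

The implied R² for LIT is a little under 0.5, so literacy is correlated with the other regressors to some degree, but not enough to cross the threshold for moderate multicollinearity.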
CONCLUSION <ul><li>We conclude that the behavior of Per Capita Income (per capita GDP) can indeed be expressed as a linear regression on the annual average rate of Inflation, the Investment Ratio, Life Expectancy at birth and the Adult Literacy Rate. </li></ul>
Bibliography <ul><li>Theoretical Exposition: </li></ul><ul><li>Class Notes of Prof. Samarjit Das </li></ul><ul><li>Econometric Methods: J. Johnston & J. DiNardo </li></ul><ul><li>Econometric Analysis: William H. Greene </li></ul><ul><li>Software: </li></ul><ul><li>EViews 3 </li></ul><ul><li>SPSS 10 </li></ul><ul><li>Data Source: </li></ul><ul><li>(i) UNDP, Human Development Report (1994) </li></ul><ul><li>(ii) World Bank, World Development Report (1994) </li></ul>