Unemployment Rate a function of Population and Economic conditions


Published in: Business

Unemployment Rate a function of Population and Economic conditions<br />
By: Divya Mishra<br />
School of Management<br />
Purdue University<br />
Introduction<br />
The objective of the project is to study the relationship between the unemployment rate, the economy and the population in the United States of America. The unemployment rate is defined as the ratio of the number of people without a job to the total labor force of the country. Economic conditions are measured by the Gross Domestic Product (GDP), the market value of all goods and services produced in a country over a specific period, typically a fiscal year. GDP is used to judge the growth rate of the economy and the overall economic health of the nation concerned. Population is the number of people inhabiting a country. Both the economy and the population affect the unemployment rate in any country. When the economy is good, the unemployment rate is expected to be low, and vice versa. Similarly, when the population is small, jobs are expected to be easier to find, so the unemployment rate is expected to be low, and vice versa. Factors other than population and GDP also affect unemployment. Unemployment is a researchable social problem, both in the global economy and in the workplace.<br />
Methodology<br />
The project is multivariate, i.e. it involves more than two variables. Regression analysis is used to study the relationship between the unemployment rate, GDP and population. The dependent variable (Y) is the unemployment rate, and GDP and population are the independent variables (X1 and X2). The data are yearly, from 1950 to 2009. The unemployment rate is expressed in percent, GDP in billions of dollars and population in millions.
The project aims to study the relationship between the variables, multicollinearity between the independent variables, and autocorrelation and heteroscedasticity in the residuals. Thus the objective of the project can be reduced to:<br />
Unemployment = f(GDP, Population).<br />
The regression equation representing the objective is: Y = β0 + β1(X1) + β2(X2) + ε<br />
The logarithmic transformation used to linearize the model is: ln(Y) = ln(β0) + β1 ln(X1) + β2 ln(X2) + ln(ε)<br />
β0 = y intercept when X1 and X2 are zero<br />
β1 = GDP coefficient<br />
β2 = Population coefficient<br />
ε = error or residual<br />
<table>
<tr><th colspan="2">Regression Statistics</th></tr>
<tr><td>Multiple R</td><td>0.33135</td></tr>
<tr><td>R Square</td><td>0.10979</td></tr>
<tr><td>Adjusted R Square</td><td>0.07856</td></tr>
<tr><td>Standard Error</td><td>0.2588</td></tr>
<tr><td>Observations</td><td>60</td></tr>
</table>
<table>
<tr><th>ANOVA</th><th>df</th><th>SS</th><th>MS</th><th>F</th></tr>
<tr><td>Regression</td><td>2</td><td>0.47087</td><td>0.23543</td><td>3.51509</td></tr>
<tr><td>Residual</td><td>57</td><td>3.81776</td><td>0.06698</td><td></td></tr>
<tr><td>Total</td><td>59</td><td>4.28863</td><td></td><td></td></tr>
</table>
<table>
<tr><th></th><th>Coefficients</th><th>Standard Error</th><th>t Stat</th><th>P-value</th></tr>
<tr><td>Intercept</td><td>-1.7325</td><td>20.8762</td><td>-0.083</td><td>0.93415</td></tr>
<tr><td>ln(Gdp)</td><td>0.04445</td><td>0.17739</td><td>0.25057</td><td>0.80305</td></tr>
<tr><td>ln(POP)</td><td>0.16171</td><td>1.1525</td><td>0.14032</td><td>0.88891</td></tr>
</table><br />
HYPOTHESIS<br />
The global utility of the model is tested with the F test, as follows:<br />
H0: β1 = β2 = 0<br />
H1: at least one of β1 and β2 is not equal to zero<br />
From the Excel output, F (computed) = 3.52, with a significance F (p-value) of 0.0363, which is below the 0.05 level. Thus the conclusion is to reject H0, i.e. at least one of β1 and β2 is not equal to zero. Hence the model is significant at the 5% level. Neither coefficient is individually significant, however (p-values of 0.80 and 0.89), a pattern that is consistent with the multicollinearity examined in the analysis section.<br />
The coefficient of determination for the model (R^2) is 11%, which means that 11% of the variation in the unemployment rate can be explained by GDP and population.<br />
The complete model: ln(UE) = -1.7325 + 0.04445 ln(Gdp) + 0.16171 ln(POP)<br />
ANALYSIS OF MODEL AND RESULTS<br />
The model is analyzed to check for multicollinearity, autocorrelation and heteroscedasticity.<br />
I) Multicollinearity<br />
Multicollinearity exists when two or more of the independent variables used in the regression are moderately or highly correlated.
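Before turning to the multicollinearity checks, the global F test from the HYPOTHESIS section can be reproduced from the ANOVA entries alone. The following is a minimal Python sketch, not the Excel procedure used in the project. The function names are illustrative, and the p-value shortcut relies on a closed form of the F distribution that holds only when the numerator has 2 degrees of freedom, as it does here.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
SS_REGRESSION, DF_REGRESSION = 0.47087, 2
SS_RESIDUAL, DF_RESIDUAL = 3.81776, 57


def f_statistic(ss_reg, df_reg, ss_res, df_res):
    """Ratio of the regression mean square to the residual mean square."""
    return (ss_reg / df_reg) / (ss_res / df_res)


def f_pvalue_numerator_df2(f_value, df_res):
    """Upper-tail probability of F(2, df_res); valid only for 2 numerator df."""
    return (1.0 + 2.0 * f_value / df_res) ** (-df_res / 2.0)


# R^2 is the share of the total sum of squares explained by the regression.
r_squared = SS_REGRESSION / (SS_REGRESSION + SS_RESIDUAL)
f_value = f_statistic(SS_REGRESSION, DF_REGRESSION, SS_RESIDUAL, DF_RESIDUAL)
p_value = f_pvalue_numerator_df2(f_value, DF_RESIDUAL)

print(f"R^2 = {r_squared:.4f}, F = {f_value:.3f}, p = {p_value:.4f}")
print("reject H0 at the 5% level:", p_value < 0.05)
```

The printed values should agree, up to rounding, with the R Square, F and significance figures in the regression output.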
In the project, multicollinearity is tested by the following methods:<br />
<ul><li>Correlation Matrix: The matrix gives the correlation coefficients between pairs of variables in the model, including the pair of independent variables.</li></ul>
<table>
<tr><th></th><th>ln(UE)</th><th>ln(Gdp)</th><th>ln(POP)</th></tr>
<tr><td>ln(UE)</td><td>1</td><td></td><td></td></tr>
<tr><td>ln(Gdp)</td><td>0.33089</td><td>1</td><td></td></tr>
<tr><td>ln(POP)</td><td>0.32987</td><td>0.98913</td><td>1</td></tr>
</table><br />
The correlation coefficient between ln(Gdp) and ln(POP) is approximately 0.99, which is almost equal to 1 and implies a very high correlation between these two independent variables.<br />
<ul><li>Variance Inflation Factor: The VIF is a more formal method for detecting multicollinearity, and it is computed for each individual β parameter. If VIF > 5, multicollinearity is taken to be present. The formula for VIF is VIF = 1/(1 - Ri^2), where Ri^2 is the coefficient of determination obtained by regressing the i-th independent variable on the other independent variables.</li></ul>
VIF (β1) = 1/(1 - 0.978376) = 46.2439 and VIF (β2) = 1/(1 - 0.978376) = 46.2439<br />
Since the VIF for both βs is greater than 5, multicollinearity is present. This implies that population and GDP are highly correlated.<br />
II) Autocorrelation<br />
Autocorrelation is defined as the correlation between the residuals at different points in time. Error terms are autocorrelated (or serially correlated) if the values lagged one or more time periods are not independent of one another.<br />
EFFECT OF RESIDUAL CORRELATION<br />
When autocorrelation is present, the least squares estimators (the estimated βs) are not efficient. While they are still linear and unbiased, they are no longer the best linear unbiased estimators (BLUE).<br />
Test for autocorrelation<br />
Autocorrelation can be tested in the following ways:<br />
<ul><li>Residual Plot: The residual plot appears to show that in some time periods a positive error term is followed by a negative error term and so on, which suggests that autocorrelation is present.</li></ul>
[Figure: residuals plotted against time]<br />
<ul><li>Durbin-Watson test: The Durbin-Watson test checks for evidence of first-order autocorrelation. The statement of hypothesis is as follows:</li></ul>
H0: ρ = 0 {no residual correlation}<br />
HA: ρ ≠ 0 {evidence of residual correlation}<br />
The Durbin-Watson test statistic, where e_t is the residual at time t and n is the number of observations:<br />
d = Σ(t=2..n) (e_t - e_(t-1))^2 / Σ(t=1..n) (e_t)^2, with 0 ≤ d ≤ 4<br />
Decision guideline for the DW test:<br />
<table>
<tr><th>Range of d</th><th>Conclusion</th></tr>
<tr><td>0 to dL</td><td>Positive Autocorrelation</td></tr>
<tr><td>dL to dU</td><td>Inconclusive</td></tr>
<tr><td>dU to (4-dU)</td><td>No Autocorrelation</td></tr>
<tr><td>(4-dU) to (4-dL)</td><td>Inconclusive</td></tr>
<tr><td>(4-dL) to 4</td><td>Negative Autocorrelation</td></tr>
</table><br />
Here d = 2.1964/3.81776 = 0.57532, and since 0 < 0.57532 < dL, this implies positive autocorrelation.
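The VIF and Durbin-Watson calculations above can be sketched in plain Python. This is an illustration rather than the spreadsheet procedure used in the project: the function names are hypothetical, the short residual list is made up purely to exercise the formula, and the bounds dL = 1.51 and dU = 1.65 are assumptions (approximate 5% table values for 60 observations and two regressors) because the project does not report them. They should be checked against a published Durbin-Watson table.

```python
def vif(r_squared_i):
    """Variance inflation factor, 1 / (1 - Ri^2)."""
    return 1.0 / (1.0 - r_squared_i)


def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """Apply the five-region decision guideline for the DW statistic."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"


# Ri^2 between ln(Gdp) and ln(POP), as used in the VIF calculation above.
print("VIF:", round(vif(0.978376), 2))

# Made-up residuals, only to show the statistic being computed.
print("d (toy data):", round(durbin_watson([1.0, 2.0, 1.0, -1.0]), 3))

# d as reported in the project; the bounds are assumed table values.
print("decision:", dw_decision(0.57532, d_lower=1.51, d_upper=1.65))
```

With only two independent variables, the Ri^2 for each is simply the squared correlation between them, which is why both VIF values coincide.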
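The null hypothesis H0 of no residual correlation is therefore rejected.<br />
GLS technique for autocorrelation<br />
The value of ρ = 0.72<br />
From the spreadsheet:<br />
<table>
<tr><th colspan="2">Regression Statistics</th></tr>
<tr><td>Multiple R</td><td>0.9433328</td></tr>
<tr><td>R Square</td><td>0.8898767</td></tr>
<tr><td>Adjusted R Square</td><td>0.8684688</td></tr>
<tr><td>Standard Error</td><td>0.1816878</td></tr>
<tr><td>Observations</td><td>60</td></tr>
</table>
<table>
<tr><th>ANOVA</th><th>df</th><th>SS</th><th>MS</th><th>F</th></tr>
<tr><td>Regression</td><td>3</td><td>15.20466182</td><td>5.068221</td><td>153.5338526</td></tr>
<tr><td>Residual</td><td>57</td><td>1.881595295</td><td>0.03301</td><td></td></tr>
<tr><td>Total</td><td>60</td><td>17.08625711</td><td></td><td></td></tr>
</table>
<table>
<tr><th></th><th>Coefficients</th><th>Standard Error</th><th>t Stat</th><th>P-value</th></tr>
<tr><td>Intercept</td><td>0</td><td>#N/A</td><td>#N/A</td><td>#N/A</td></tr>
<tr><td>int</td><td>-41.153789</td><td>41.51590677</td><td>-0.99128</td><td>0.325739557</td></tr>
<tr><td>GDP*</td><td>-0.2924132</td><td>0.364129182</td><td>-0.80305</td><td>0.425283034</td></tr>
<tr><td>POP*</td><td>2.3414267</td><td>2.296082777</td><td>1.019748</td><td>0.312156609</td></tr>
</table><br />
The GLS model is: ln(UE) = -41.153 - 0.2924 ln(GDP) + 2.3414 ln(POP)<br />
III) Heteroscedasticity<br />
Heteroscedasticity is the problem of unequal variance of the error term. It is checked in two ways in the project.<br />
<ul><li>Graphical examination of the residuals: The graphical examination of the residuals does not clearly indicate the presence of heteroscedasticity.</li>
<li>Goldfeld-Quandt Test: Test the equality of the variances of two subsamples using the F statistic. Divide the larger MSE by the smaller MSE to compute F. If the computed F value is greater than the critical F value, then heteroscedasticity is present.</li></ul>
<table>
<tr><th>ANOVA (first subsample)</th><th>df</th><th>SS</th><th>MS</th></tr>
<tr><td>Regression</td><td>2</td><td>0.676991</td><td>0.338495</td></tr>
<tr><td>Residual</td><td>27</td><td>0.940567</td><td>0.034836</td></tr>
<tr><td>Total</td><td>29</td><td>1.617558</td><td></td></tr>
</table>
<table>
<tr><th>ANOVA (second subsample)</th><th>df</th><th>SS</th><th>MS</th></tr>
<tr><td>Regression</td><td>2</td><td>0.5429</td><td>0.27145</td></tr>
<tr><td>Residual</td><td>27</td><td>1.59935</td><td>0.059235</td></tr>
<tr><td>Total</td><td>29</td><td>2.142249</td><td></td></tr>
</table><br />
F (computed) = 0.059235/0.034836 = 1.70041 and F (critical) at df (28, 28) = 1.84<br />
Since 1.70041 < 1.84, there is no evidence of heteroscedasticity.<br />
CONCLUSION<br />
On the basis of the regression analysis, we find that the model is statistically significant overall, although it explains only a small share of the variation in the unemployment rate. Multicollinearity is present between the independent variables. There is positive autocorrelation in the residuals. There is no heteroscedasticity in the data.<br />
LIMITATIONS<br />
The coefficient of determination R^2 is only 11%, which means that only 11% of the variation in the unemployment rate can be explained by GDP and population. The model should include more independent variables so that its goodness of fit, i.e. R^2, can be increased.<br />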
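The Goldfeld-Quandt comparison from the heteroscedasticity section can be reproduced from the two residual mean squares. This is a minimal Python sketch with an illustrative function name. The critical value 1.84 is taken as reported in the project rather than recomputed.

```python
def goldfeld_quandt_f(mse_first, mse_second):
    """Larger residual mean square divided by the smaller one."""
    larger = max(mse_first, mse_second)
    smaller = min(mse_first, mse_second)
    return larger / smaller


# Residual mean squares from the two subsample ANOVA tables.
MSE_FIRST = 0.034836
MSE_SECOND = 0.059235

# Critical F value as reported in the project (not recomputed here).
F_CRITICAL = 1.84

gq_f = goldfeld_quandt_f(MSE_FIRST, MSE_SECOND)
heteroscedastic = gq_f > F_CRITICAL

print(f"F (computed) = {gq_f:.4f}")
print("heteroscedasticity indicated:", heteroscedastic)
```

Taking the larger mean square as the numerator keeps the ratio at or above 1, so only the upper tail of the F distribution needs to be consulted.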