Group5

EMPIRICAL ASSIGNMENT
REPORT
SUBMITTED TO
PROF. MANDIRA SARMA
IN
ADVANCED ECONOMETRICS COURSE
BY
GROUP NUMBER-5
DEEPAK (5556), KRITIKA GUPTA (34198) & SONAL
AGGARWAL (10789)
M.A. II YEAR, MONSOON SEMESTER 2016
(CENTRE FOR INTERNATIONAL TRADE AND
DEVELOPMENT, JAWAHARLAL NEHRU UNIVERSITY, NEW
DELHI)

It has long been recognized that growing urbanisation, transportation and the GDP of a country
influence its carbon dioxide (CO2) emissions. This is an important issue in need of further research.
AIM: The purpose of this report is to address the following four questions through econometric
analysis of real cross-section data:
• Whether CO2 emission level in a country can be explained by its urbanization and vehicle
density?
• Do rich countries emit more CO2 vis-à-vis their non-rich counterparts?
• Do we have a continent specific effect on CO2 emissions?
• Do we have a combined effect of income level and continent location on CO2 emissions?
DATA: For our study, we have used a sample dataset of 209 countries across the world,
consisting of information on the following measures:
• Gross Domestic Product (GDP) per capita, measured in USD.
• CO2 emissions, measured in metric tons per capita.
• Urban population, expressed as a % of total population.
• Vehicles per kilometre of road.
In addition, we have constructed two dummy variables. Our first dummy (rich dummy)
categorizes the countries into rich and non-rich categories based on the World Development
Report classification, which puts all countries with GDP per capita equal to or greater than
$11116 into the “High income” category; we have identified this category as the rich cohort in our
analysis. Our second dummy (continent dummy) categorizes the countries according to the
continents to which they belong.
DEFINITIONS:
o Urbanization: It is a population shift from rural to urban areas, "the gradual increase in
the proportion of people living in urban areas", and the ways in which each society
adapts to the change. It is predominantly the process by which towns and cities are
formed and become larger as more people begin living and working in central areas.
o Vehicle Density: The average number of vehicles that occupy one mile or one km of
road space, expressed in vehicles per mile or per km.

METHODOLOGY: For all our econometric analysis, we have used the data analysis and
statistical software STATA. For presentation of our results, we have used Microsoft Word. For
some background on the estimated relationships, we have relied on internet sources.
A GOOD MODEL: The following are the attributes that an ideal econometric model should have.
In our study, we have attempted to be as close to an ideal model as possible. A good
model should:
• Satisfy all the classical linear regression model assumptions, i.e. it should be linear in
parameters; errors should have zero mean and shouldn't be correlated with any explanatory
variable (no endogeneity); errors should be homoskedastic and not autocorrelated; explanatory
variables should not have perfect multicollinearity; and errors should be normally
distributed. All our regressions are linear in parameters. We checked for endogeneity and
multicollinearity using correlation matrices, for homoskedasticity using the Breusch-Pagan
test, and for normality of errors using the Jarque-Bera normality test and by constructing
distribution functions for the estimated errors.
• Not have any omitted variable. We have checked this by using the Ramsey
omitted variable test in STATA.
• Have the correct functional form, such that the explanatory variables explain a good amount
of the variation in the explained variable (i.e. R² > 0.5).
• Explain a lot with a little, i.e. it shouldn't have irrelevant variables. We
have taken care of this by examining the marginal changes in R² values obtained by adding
variables.
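As a rough illustration of the normality check listed above, the Jarque-Bera statistic combines the sample skewness S and kurtosis K as JB = (n/6)(S² + (K - 3)²/4). The sketch below is a plain Python version written only for illustration: the function name and the toy residuals are assumptions, and the results in this report come from STATA, not from this code.

```python
def jarque_bera(residuals):
    """Return (skewness, kurtosis, JB statistic) for a list of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments using the 1/n convention
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return skewness, kurtosis, jb


# Hypothetical residuals, not taken from the report's data
toy_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
s, k, jb = jarque_bera(toy_residuals)
print(s, k, jb)
```

Under the null hypothesis of normality, JB is approximately chi-square distributed with 2 degrees of freedom, so a large p-value means normality is not rejected.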
FUNCTIONAL FORMS USED:
Our choice of functional form was based on graphical analysis as well as results from previous
studies. A two-way scatterplot between CO2 emissions and urbanization yielded a monotonic,
upward-sloping convex graph (an exponential-like relationship), while a two-way
scatterplot between CO2 emissions and vehicle density yielded a monotonic, upward-sloping
concave graph. This motivated the use of a log-log specification for our model. Double-log
specifications straightened the two scatterplots very nicely and also reduced the standard
deviations of our variables significantly, thereby mitigating the problem of
outliers in our dataset.
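One way to see why a double-log transformation can straighten both a convex and a concave scatterplot is to assume, purely for illustration, a power-law relationship between a variable y and a regressor x. The symbols A and β below are hypothetical and are not estimates from this report.

```latex
y = A x^{\beta}
\quad\Longrightarrow\quad
\log y = \log A + \beta \log x,
\qquad
\frac{d^{2} y}{d x^{2}} = A \beta (\beta - 1) x^{\beta - 2}
```

For A > 0 and x > 0, the curve is convex in levels when β > 1 and concave when 0 < β < 1, yet in both cases log y is a straight line in log x with slope β.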

A two-way scatterplot between CO2 emissions and GDP per capita yields an inverted-U
relationship between them. This is in accord with the well-known
“Environmental Kuznets curve” theory. Some previous estimates of this relationship suggest
that logGDP per capita and squared-logGDP per capita explain significant variations in logCO2
emissions. So, the most appropriate functional form for all our regressions is the log-log
specification.
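The inverted-U shape mentioned above can be written as a quadratic in logGDP. The following is only a sketch of the standard algebra, with hypothetical coefficients a, b and c that are not estimated anywhere in this report.

```latex
\log(\mathrm{CO_2}) = a + b \log(\mathrm{GDP}) + c\,[\log(\mathrm{GDP})]^{2},
\qquad
\frac{\partial \log(\mathrm{CO_2})}{\partial \log(\mathrm{GDP})}
  = b + 2c \log(\mathrm{GDP}) = 0
\;\Longrightarrow\;
\log(\mathrm{GDP}^{*}) = -\frac{b}{2c}
```

An inverted U requires b > 0 and c < 0, in which case emissions rise with income up to the turning point GDP* and fall beyond it.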
Summary Statistics:
The means of log(CO2 emissions), log(Urban Population), log(Vehicle Density), log(GDP) and
[log(Vehicle Density)]² are 0.65, 3.90, 3.40, 7.68 and 12.63 respectively. Their respective
standard deviations are 1.71, 0.54, 1.02, 1.60 and 7.21. Hence, our variables are not highly
dispersed. From the Skewness and Kurtosis Test for Normality, the measures for skewness for our
variables are 0.0033, 0.0000, 0.6559, 0.4903 and 0.019 respectively (which are all close to 0).
Hence our data is approximately normally distributed. The measures for kurtosis are 0.2213,
0.7923, 0.5616, 0.0000 and 0.93 respectively (which are all close to 0). Since our data doesn't
suffer much from kurtosis (except for [log(Vehicle Density)]²), it doesn't produce more
outliers than the normal distribution.(see Tables A,B)

Question 1: Whether CO2 emission level in a country can be explained by its urbanization
and vehicle density?
In this question our primary focus is to find out if there is an effect of the level of urbanization
and vehicle density of a country on the level of its CO2 emissions.
BACKGROUND
Urbanization is a demographic indicator that increases urban density and influences household
energy consumption patterns. Growing energy consumption has been singled out as an important
factor having the most adverse impact on the environment.
Transportation/Vehicles play an equally significant role in explaining CO2 emissions. A growth
in the number of vehicles not only affects mobility, but may also increase the content of carbon
monoxide (CO), carbon dioxide (CO2), and other pollutants.
In summary, the existing literature shows a positive impact of urbanization and vehicle density
on CO2 emissions.
MODEL
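$$\log(\mathrm{CO_2})_i = \beta_0 + \beta_1 \log(\text{urban population})_i + \beta_2 \log(\text{vehicle density})_i + u_i$$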
Our dependent variable is log(CO2) emissions and our independent variables are log(urban
population) and log(vehicle density). Intuitively and from our theoretical analysis, we expect the two
Beta-coefficients to be positive. Since the specification is log-log, the beta-coefficient for any
explanatory variable is an elasticity: it shows the % increase in CO2 emissions consequent upon
a 1% increase in the value of that explanatory variable.
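The elasticity reading of a log-log coefficient is a small-change approximation. The sketch below compares it with the exact change implied by log(y) = a + β log(x), using the two coefficient values reported in the regression results of this report; the function name is an assumption made for illustration.

```python
def pct_change_in_y(elasticity, pct_change_in_x):
    """Exact % change in y implied by log(y) = a + elasticity * log(x)."""
    growth_factor = (1.0 + pct_change_in_x / 100.0) ** elasticity
    return 100.0 * (growth_factor - 1.0)


# Elasticities as reported in the regression results of this report
beta_urban = 1.69305
beta_vehicle = 0.3391488

small_urban = pct_change_in_y(beta_urban, 1.0)      # close to beta_urban itself
small_vehicle = pct_change_in_y(beta_vehicle, 1.0)  # close to beta_vehicle itself
large_urban = pct_change_in_y(beta_urban, 10.0)     # approximation is looser here
print(small_urban, small_vehicle, large_urban)
```

For a 1% change the exact figure is very close to the coefficient itself, while for larger changes (such as 10%) the simple "β times the % change" rule becomes less accurate.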
Regression results
• Beta-coefficients for both our explanatory variables are indeed positive, implying
positive effects of urbanization and vehicle density on the level of CO2 emissions.
β1=1.69305 implies that a 1% increase in urban population leads to a 1.69% increase in the
level of CO2 emissions; β2=0.3391488 implies that a 1% increase in vehicle density leads
to a 0.34% increase in the level of CO2 emissions.(see Table 1.1)
• P-values for β1 and β2 are less than 0.05, so we reject the null hypotheses (β1=0 and β2=0) and
conclude that urbanization and vehicle density have a significant impact on CO2
emissions at the 5% level of significance.

• Since Prob>F=0.0000, the F-statistic is statistically significant, and we conclude that our
independent variables jointly explain significant variations in our dependent variable. Also, an
R² value=0.5370 indicates that our explanatory variables jointly explain 53.7% of the variation in
the level of CO2 emissions. This shows that our model has a high explanatory power and
thus is well fitted.
• Mean of estimated errors from our model equals -3.04e-09≈0. So, E(u)=0.(see Table 1.2)
• Correlation matrix between predicted error terms and all explanatory variables shows that all
correlation coefficients between errors and explanatory variables are zero, indicating no
endogeneity.(see Table 1.3a)
• Correlation matrix between explanatory variables of our model shows that we don't have any
perfect or high pairwise correlation. Also, the mean VIF is 1.5, so we don't have
multicollinearity between explanatory variables.(see Table 1.3b)
• Breusch-Pagan Test shows P>χ²=0.4923 (against a null hypothesis of constant variance),
indicating that disturbance terms in our model are homoskedastic.(see Table 2.4)
• Ramsey test shows that Prob>F=0.1084 (against a null hypothesis of no omitted variable),
indicating that our model has no omitted variable bias.(see Table 2.5)
• Jarque-Bera test shows that χ²=0.6788 (against a null hypothesis of normality), indicating
that error terms in our model are normally distributed.(see Table 2.6)(see Figure1)
Since the model satisfies all classical linear regression model assumptions, it is an appropriate
model.
Conclusion: Urbanization and vehicle density do have a significant impact on CO2
emissions. Hence, our results are consistent with theoretical literature.
Remark:
GDP per capita is an important variable in explaining variations in CO2 emissions. However,
variations in GDP and logGDP per capita are highly and significantly explained by variations in
log(urbanpopulation) and log(vehicledensity) (R²=59.7%, and both explanatory variables have
significant impacts). Also, logGDP per capita is highly correlated (rGDP,up=0.73, rGDP,vd=0.61)
with the latter two variables individually. Hence we have avoided using the three explanatory
variables together, even though together they explain the variations in logCO2 emissions much
better than the model without GDP or logGDP.

Question 2: Do rich countries emit more CO2 vis-à-vis their non-rich counterparts?
In this question our primary focus is on the effects of the prosperity of a country, if any, on the level
of its CO2 emissions. GDP is an important factor affecting the level of CO2 emissions. As income
levels increase, people tend to consume more fuel, buy more cars and fly more, thereby increasing
the level of emissions in a country. So intuitively, we would expect the rich countries to emit
comparatively more than others.
BACKGROUND
According to the UN report, the world's richest countries are now increasingly outsourcing their
emissions in the form of imports of manufactured electronic goods from China and other
emerging economies. According to a draft of the latest report on Climate Change, emissions of
carbon dioxide and the other greenhouse gases have risen over the years. Much of that rise was
due to the burning of coal, which was used to run factories in China and other rising economies
that produce goods for US and European consumers. Studies reveal that in 2011,
China (a developing economy) ranked as the world's largest emitter, followed by the United States,
India, Russia, and Japan.
The Environmental Kuznets Curve (EKC) shows that various indicators of environmental
degradation tend to get worse as modern economic growth occurs until average income reaches a
certain point over the course of development. Here, the differential impact of being rich would
depend upon the relative position of the rich and the non-rich on our EKC.

In summary, the existing literature shows that both developed (rich) and developing nations have a
significant impact on the level of CO2 emissions. However, the contribution of developing nations is
comparatively higher, so theoretical analysis suggests that rich countries emit less than others.
MODEL
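$$\begin{aligned}
\log(\mathrm{CO_2})_i = {} & \beta_0 + \beta_1 \log(\text{urban population})_i + \beta_2 \log(\text{vehicle density})_i \\
& + \beta_3 [\log(\text{vehicle density})_i]^2 + \beta_4 D_i + u_i
\end{aligned}$$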
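Our dependent variable here is log(CO2) emissions. Independent variables are log(urban
population), log(vehicledensity), its squared term [log(vehicledensity)]² and a rich dummy
variable Di, defined as:
Di = 1 if country i is rich (GDP per capita >= $11116), and Di = 0 otherwise.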
The World Bank uses GDP per capita to classify economies as low income (GDP per capita <= $905),
middle income (GDP per capita between $906 and $11115) or high income (GDP per capita >= $11116).
Low-income and middle-income economies are collectively referred to as developing economies
and high-income ones as developed economies, which we categorize as the rich cohort in
our analysis.
Remark: We have not included [log(urban population)]² and the cross product of log(urban
population) and log(vehicle density) since their beta-coefficients are statistically insignificant.
The intercept β0 represents the average logCO2 emissions of non-rich countries, ceteris paribus.
The coefficient on the rich dummy, β4, is a 'differential intercept coefficient' which shows the
difference between the average CO2 emissions (in log terms)
of rich and non-rich countries, ceteris paribus.
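Because the dependent variable is in logs, a differential intercept coefficient can be translated into a percentage difference in the level of emissions with the standard transformation 100·(exp(coefficient) - 1). The sketch below applies it to the rich-dummy coefficient reported in the regression results of this report; the function name is an assumption made for illustration.

```python
import math


def dummy_effect_in_percent(coefficient):
    """% difference in the level of y when a dummy switches from 0 to 1
    in a model whose dependent variable is log(y)."""
    return 100.0 * (math.exp(coefficient) - 1.0)


# Differential intercept for rich countries as reported in this report
rich_coefficient = 0.66639
rich_effect = dummy_effect_in_percent(rich_coefficient)
print(round(rich_effect, 1))
```

Read this way, the coefficient suggests that rich countries emit roughly 95% more CO2 per capita than non-rich countries, holding the other regressors fixed.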
Regression results
• Here the intercept β0 = -7.82 < 0, implying that average CO2 emissions of non-rich countries
are less than 1 (a negative log corresponds to a level below 1). The coefficient on the rich dummy,
β4 = 0.66639 > 0, so rich countries have a positive differential impact on the average level of CO2
emissions (in log terms) vis-à-vis non-rich countries. This is in line with our
intuition.(see Table 2.1)
• Since the P-value for β0 is less than 0.05 and that for β4 (=0.008) is also less than 0.05, we
reject the null hypothesis (β4=0) and conclude that rich countries do have a significant
differential impact on average CO2 emissions at the 5% level of significance.

• Since Prob>F=0.000 implies a statistically significant F-statistic, we can conclude that
independent variables jointly explain significant variations in our dependent variable. Also, an
R² value=0.6756 indicates that all our explanatory variables, including the rich dummy,
jointly explain 67.56% of the variation in the level of CO2 emissions. This shows that our model
has a high explanatory power and thus is well fitted.
• Mean of estimated errors from our model equals 1.31e-10≈0. So, E(u)=0.(see Table 2.2)
• Correlation matrix between predicted error terms and all explanatory variables including
dummies shows that all correlation coefficients between errors and explanatory variables are
zero, indicating no endogeneity.(see Table 2.3a)
• Correlation matrix between explanatory variables of our model shows that we don't have any
perfect correlation. Also, the mean VIF is 17.94, so we don't have perfect multicollinearity
between explanatory variables.(see Table 2.3b)
• Breusch-Pagan Test shows P>χ²=0.2209 (against a null hypothesis of constant variance),
indicating that disturbance terms in our model are homoskedastic.
A TRADE-OFF BETWEEN MULTICOLLINEARITY AND OMITTED VARIABLE
BIAS
In our model, there is a high (but not perfect) correlation between log(vehicle density) and its
squared term. If we omit either one of them as a solution, we run into a problem of
misspecification (omitted variable bias). So, there is a trade-off. Hence, we are tolerating high (not
perfect) multicollinearity between log(vehicle density) and its squared term.
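The reason a variable and its own square tend to be highly collinear can be seen in a small sketch. It uses hypothetical values for log(vehicle density), not the report's data, and the two-regressor shortcut VIF = 1/(1 - r²), which holds exactly only when these are the only two regressors; the function names are assumptions made for illustration.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def vif_two_regressors(r):
    """VIF of a regressor that has exactly one other regressor: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical log(vehicle density) values, not the report's data
log_vd = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
log_vd_squared = [v ** 2 for v in log_vd]

r = pearson_r(log_vd, log_vd_squared)
vif = vif_two_regressors(r)
print(round(r, 3), round(vif, 1))
```

Over a strictly positive range, the squared term is almost a linear function of the original variable, so the correlation is close to 1 and the VIF is large even though the collinearity is not perfect.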
Since the model satisfies all classical linear regression model assumptions, it is an appropriate
model.
Conclusion: Rich countries do emit more CO2 than their non-rich counterparts.
So our results are consistent with our intuition but inconsistent with theoretical predictions (EKC).

Question 3: Do we have a continent specific effect on CO2 emissions?
To address this question, we have constructed a continent dummy to categorize countries in our
sample according to the continents to which they belong. In our sample, we have countries
belonging to 6 continents- Africa, Asia, Europe, North America, Oceania, and South America.
Since we have countries from 6 continents, we have constructed 5 dummies.
Theoretical analyses of CO2 emissions from different countries tell us that African countries
emit the least. Hence, we have taken Africa to serve as the base category for our analysis.
MODEL
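$$\begin{aligned}
\log(\mathrm{CO_2})_i = {} & \beta_0 + \beta_1 \log(\text{urban population})_i + \beta_2 \log(\text{vehicle density})_i \\
& + \beta_3 D_{1i} + \beta_4 D_{2i} + \beta_5 D_{3i} + \beta_6 D_{4i} + \beta_7 D_{5i} + u_i
\end{aligned}$$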
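where, for j = 1, ..., 5,
Dji = 1 if country i belongs to the continent represented by dummy j, and Dji = 0 otherwise,
with one dummy each for Asia, Europe, North America, Oceania and South America (Africa is the
base category).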
The constant term (β0) in our model shows the average logCO2 emissions of Africa. The coefficients
on our dummies are the 'differential intercept coefficients', which show the differential average
logCO2 emissions of the continent associated with a particular dummy vis-à-vis the average
logCO2 emissions of Africa. From theoretical predictions, we expect the least average logCO2
emissions from Africa.
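To move from log terms back to levels, the intercept and the differential intercepts can be exponentiated. The sketch below uses the coefficient values reported in the regression results of this report; note that the implied levels correspond to the other regressors being held at a log value of zero, so they are best read as relative positions of the continents rather than observed averages.

```python
import math

# Coefficients as reported in the regression results of this report
intercept_africa = -5.916  # average log(CO2) for the base category
differential = {"Europe": 1.04819, "Oceania": 1.3495}

# Level implied by the base-category intercept
africa_level = math.exp(intercept_africa)

# Level implied for a continent = exp(intercept + differential intercept)
implied_levels = {
    name: math.exp(intercept_africa + coef) for name, coef in differential.items()
}
print(round(africa_level, 4), implied_levels)
```

The base-category figure reproduces the value of about 0.0027 metric tonnes per capita quoted for Africa in the results.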

Regression results
• We have a negative constant term (= -5.916), which shows that average CO2 emissions
are less than 1 for Africa. Beta-coefficients for all our dummies except that for South
America are positive, showing that countries belonging to every continent except South
America emit on average more CO2 (in log terms) than the average emissions
from Africa. In our sample, South America (with CO2 emissions=0.002 metric tonnes per
capita) emits the least and is very closely followed by Africa (with CO2
emissions=0.0027 metric tonnes per capita). Going from the highest to the lowest emitter,
we have the following sequence: Oceania, Europe, Asia, North America, Africa and
South America.(see Table 3.1)
• P-values are less than 0.05 for the constant term β0 and for the beta-coefficients associated with
the dummies representing Europe and Oceania, indicating that the differential impacts on average
logCO2 emissions for these continents vis-à-vis Africa are indeed statistically significant at the
5% level of significance. Countries belonging to these two continents do emit on average
significantly more CO2 vis-à-vis Africa. The differential impact of Asia on emissions is
significant only at the 10% level of significance. The differential impact of South America is
not significant, and hence we treat Africa as the least emitter. This is in line with our
theoretical predictions that African countries emit the least.
Interpretation of Beta-coefficients: The constant term representing the average logCO2
emissions for Africa is -5.916 (i.e. average CO2 emissions<1), while the coefficients on the
dummies representing Europe and Oceania are 1.04819 and 1.3495 respectively, showing that the
latter two emit on average more logCO2 by 1.04819 and 1.3495 units respectively vis-à-vis
the average logCO2 emissions from Africa.
• Since Prob>F=0.000 implies a statistically significant F-statistic, we can conclude that
independent variables jointly explain significant variations in our dependent variable.
Also, an R² value=0.68 indicates that all our explanatory variables, including dummies for
continents, jointly explain 68% of the variation in the level of CO2 emissions. This shows that
our model has a high explanatory power and thus is well fitted.
• Mean of estimated errors from our model equals -2.18e-09≈0. So, E(u)=0.(see Table 3.2)
• Correlation matrix between predicted error terms and all explanatory variables including
dummies shows that all correlation coefficients between errors and explanatory variables
are zero, indicating no endogeneity.(see Table 3.3a)

• Correlation matrix between explanatory variables of our model shows that we don't have
any perfect or high pairwise correlation. Also, the mean VIF is 2.14, so we don't have
multicollinearity between explanatory variables.(see Table 3.3b)
• Breusch-Pagan Test shows P>χ²=0.4801 (against a null hypothesis of constant variance),
indicating that disturbance terms in our model are homoskedastic.
• Ramsey test shows that Prob>F=0.0803 (against a null hypothesis of no omitted
variable), indicating that our model has no omitted variable bias.(see Table 3.5)
• Jarque-Bera test (against a null hypothesis of normality) indicates
that error terms in our model are normally distributed.(see Table 3.6)(see Figure 3)
Since the model satisfies all classical linear regression model assumptions, it is an appropriate
model.
Conclusion: We do have a continent specific effect on CO2 emissions. Further, our results
are consistent with theoretical predictions.

Question 4: Do we have a combined effect of income level and continent location on CO2
emissions?
To address this question, we have constructed 5 interaction dummies. Some significant
coefficients on interaction dummies would point towards the presence of a combined effect of
income and continent location on CO2 emissions.
MODEL
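$$\begin{aligned}
\log(\mathrm{CO_2})_i = {} & \beta_0 + \beta_1 \log(\text{urban population})_i + \beta_2 \log(\text{vehicle density})_i \\
& + \beta_3 D_{1i} + \beta_4 D_{2i} + \beta_5 D_{3i} + \beta_6 D_{4i} + \beta_7 D_{5i} \\
& + \beta_8 (\mathrm{GDP})_i + \beta_9 (\mathrm{GDP})_i D_{1i} + \beta_{10} (\mathrm{GDP})_i D_{2i} \\
& + \beta_{11} (\mathrm{GDP})_i D_{3i} + \beta_{12} (\mathrm{GDP})_i D_{4i} \\
& + \beta_{13} (\mathrm{GDP})_i D_{5i} + u_i
\end{aligned}$$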
where, for j = 1, ..., 5,
(GDP)Dji = GDP per capita of country i if country i belongs to the continent represented by
dummy j, and (GDP)Dji = 0 otherwise.
The coefficient on the income variable GDP per capita (β8) shows that a 1 unit increase in GDP per
capita increases average CO2 emissions by approximately 100*β8 % for our base category Africa.
Coefficients on our interaction dummies are 'differential slope coefficients', which show the
differential impact of a unit increase in income on average CO2 emissions (in %) for countries
belonging to the continent represented by a particular interaction dummy vis-à-vis the impact of
a unit increase in GDP per capita on average CO2 emissions (in %) for the African continent.
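The way a differential slope coefficient combines with the base slope can be sketched with simple arithmetic. The values below are the coefficients reported in the regression results of this report; the variable names are assumptions made for illustration.

```python
# Coefficients as reported in the regression results of this report
gdp_slope_africa = 0.000428
interaction = {
    "Asia": -0.0004321,
    "Europe": -0.0004167,
    "North America": -0.0003575,
}

# Net slope for a continent = base slope + differential slope
net_slope = {name: gdp_slope_africa + coef for name, coef in interaction.items()}

# Approximate % change in CO2 per one-unit rise in GDP per capita
pct_effect = {name: 100.0 * slope for name, slope in net_slope.items()}

for name in net_slope:
    print(name, net_slope[name], pct_effect[name])
```

A negative interaction coefficient that is larger in absolute value than the base slope, as for Asia, turns the net slope negative; a smaller one, as for Europe and North America, only flattens it.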

Regression results
• Beta-coefficients for all our interaction dummies are negative, but the Beta-coefficient
associated with the income variable GDP per capita (=0.00043) is positive. Negative coefficients on
all interaction dummies show that in all continents other than Africa, a unit increase in GDP
per capita increases logCO2 emissions by less than the increase in logCO2 emissions brought
about by a unit increase in GDP per capita in Africa.
Beta-coefficients for all interaction dummies except that for Asia are smaller in absolute terms
than the Beta-coefficient for GDP per capita. This indicates that for all continents except
Asia, we still have a positive relationship between GDP per capita and logCO2 emissions,
while for Asia we have a negative relationship between the two, i.e. a unit
increase in GDP per capita brings about a reduction in logCO2 emissions.(see Table 4.1)
• P-values are less than 0.05 for the beta-coefficients associated with the interaction dummies for
Asia, Europe and North America. This shows that the differential impact of a unit increase
in GDP per capita on average logCO2 emissions for these continents vis-à-vis the impact of a
unit increase in GDP per capita on average logCO2 emissions for Africa is statistically
significant at the 5% level of significance. Hence we do observe a combined effect of income
and continent location on CO2 emissions.
Interpretation of Beta-coefficients: The Beta-coefficient for GDP per capita is 0.000428, while
those for the interaction dummies corresponding to Asia, Europe and North
America are -0.0004321, -0.0004167 and -0.0003575 respectively, showing that a unit
increase in GDP per capita brings about a 0.000004 (i.e. 0.000428-0.0004321) unit decrease,
a 0.000011 (i.e. 0.000428-0.0004167) unit increase, and a 0.000071 (i.e. 0.000428-0.0003575)
unit increase in average logCO2 emissions in Asia, Europe and North America
respectively, compared to a 0.000428 unit increase in logCO2 emissions in Africa.
• Since Prob>F=0.000 implies a statistically significant F-statistic, we can conclude that
independent variables in our model jointly explain significant variations in our dependent
variable. Also, an R² value=0.78 indicates that all our explanatory variables, including dummies
for continents and interaction dummies, jointly explain 78% of the variation in the level of CO2
emissions. This shows that our model has a high explanatory power and thus is well
fitted.
• Mean of predicted errors from our model equals 9.57e-10≈0. So, E(u)=0.(see Table 4.2)

• Correlation matrix between error term and all explanatory variables including dummies
shows that all correlation coefficients between errors and explanatory variables are zero,
indicating no endogeneity.(see Table 4.3a)
• Breusch-Pagan Test shows P>χ²=0.9078 (against a null hypothesis of constant variance),
indicating that disturbance terms in our model are homoskedastic.
A TRADE-OFF BETWEEN MULTICOLLINEARITY AND OMITTED VARIABLE
BIAS
Correlation matrix of coefficients of regression shows that we have high (but not perfect)
correlation among explanatory variables. Also, mean VIF=113.13.(see Table 4.3b) This
problem arises because variations in GDP per capita are highly explained by variations in
log(vehicledensity) and log(urbanpopulation).
As a remedial measure, dropping the latter two variables does solve the issue of
multicollinearity but gives rise to an omitted variable bias, which necessitates the addition of
more explanatory variables. But then their inclusion brings back high multicollinearity.
Hence, there exists a trade-off here.
Dropping variables as a remedy for multicollinearity sometimes worsens the situation, as
noted in D. Gujarati. In our case, it brings an omitted variable bias and significantly reduces
the explanatory power (R²) of our model to 0.69 from 0.78. It is then suggested to tolerate
multicollinearity as long as it is not perfect, as in our case, and stick with a model with no omitted
variable. We have multicollinearity in our model, which can lead to insignificant t-ratios. So
while our model says that the interaction dummies corresponding to Oceania and South
America don't have statistically significant impacts, this might not be true in reality.
Since, the model satisfies all classical Linear Regression Model assumptions, it is an
appropriate model.
Conclusion: We do have a combined effect of income and continent location on CO2
emissions.
