The authors would like to thank Agnieszka Postępska and Jiadi Chen for their support and help throughout the writing of this
paper.
The determinants of the mortality rate across US counties
Group 1
Leo Acklin, Xinyu Wei, Lingjing Zhu, Xi Zhang, Yuyang Gu
Department of Economics
Georgetown University
Abstract
This study attempts to estimate the relationship between the mortality rate and income inequality
five years earlier within counties of the United States using ordinary least squares (OLS).
The independent variables include the Gini Index, along with race and educational attainment variables to control for
socioeconomic differences across counties; the dependent variable is the mortality
rate per 100,000. The results imply that the Gini Index within a county is statistically significant
for the mortality rate, as are completion of a high school diploma and of
associate, bachelor and graduate degrees. In addition, all race variables except Black are negatively
correlated with the mortality rate.
Introduction:
The relationship between income and mortality rate has been a contentious topic for years.
The basic intuition is that income and wealth should have a positive effect on life
expectancy. Such a relationship does exist: a research report from The World Bank has shown
that people who earn more than $5,000/month have a 25% longer life expectancy than those
who earn less than $5,000. But once earnings exceed $5,000/month, the effect of income
on the mortality rate is likely to vanish; that is to say, income alone may not explain what
causes the mortality rate to fluctuate (Ichiro Kawachi, 1997). Over time, more and more
researchers have proposed variables that can directly influence the mortality rate and have gradually
focused less on economic variables than earlier scholars did. Instead, they examine the relationship
between the mortality rate and race, education and geographical factors, which provides more
ways to improve the original model.
Inspired by previous research, in our model we use the age-adjusted mortality rate in order to remove
the influence of deaths from “natural causes”. As John Maynard Keynes famously said,
“In the long run we are all dead”. We focus on causes of death other
than the passage of time.
We had several choices for measuring income inequality, including the Gini Index, the Hoover
Index and the Atkinson Index. We decided to use the Gini Index because it is the measure most commonly used by
previous researchers, including this year’s Nobel Prize winner, Angus Deaton, and, more
importantly, because data on it are easy to gather. We also noticed that there has been a simultaneous
increase in the Gini Index and the mortality rate since 2006, so we expect there may be some causal
relationship between them.
The ultimate goal of this paper is to find out whether there is any causal effect between them, so
that the government can adjust policies to reduce the mortality rate and increase
social welfare.
Why Should We Expect a Relationship?
Although it may be difficult to use economic theory to explain a potential relationship between
the Gini Index and mortality rate, if we think about how inequality affects people the association
becomes easier to understand. Kawachi highlights how inequality can increase stress physiologically
and be harmful to overall health through a number of different channels (Ichiro Kawachi, 1997). A
connection between the mortality rate and some factor becomes more straightforward when an
increase in that factor is harmful to overall health.
The breakdown of social cohesion and its role in reducing personal well-being is a
subject Kawachi focuses on. This is the notion that pockets of affluence and poverty
encourage dysfunction among the population (Ichiro Kawachi, 1997). Melvin Tumin
speculated that this is because inequalities in social rewards encourage hostility,
suspicion and mistrust among the segments of society (Tumin, 1953). Clearly these community
attributes are not healthy for members of those communities. If we look at the health benefits of
a cohesive society the connection becomes even clearer. Several studies (Syme, 1979) (JS House,
1988) have concluded that socially integrated people live longer. This may be because of
the extra support that comes from having a well-connected group of people, which could be
emotional or financial.
A real-world example of these effects was found by Wilkinson. During both world wars in
Britain a greater sense of community was created because of less inequality among citizens, and
this was coupled with a large increase in life expectancy. On the other side of the coin is the
town of Roseto, PA. Originally an Italian-American area with very few health problems, it underwent a rapid
economic expansion during the 1960s that created a large divide between the rich and the poor, and
the number of deaths from coronary disease rose significantly (Wilkinson, 1996).
Literature review
It has long been hypothesized that there is a negative relationship between mortality and
socioeconomic status: better-off people live longer (Adler, 1993). In the public health and
sociological literatures, “socioeconomic status” is often treated as the fundamental
factor explaining the mortality rate. Though assets, occupation, education and income are all
available variables, we chose income as the basic variable in our model, because it is the most direct
indicator of people’s wealth and data on it are easy to find.
Despite many empirical studies, the current literature has provided mixed results. Many scholars are
now doubtful about the effect of income on the mortality rate. Scholars have observed that
income loses its power to explain the fatality rate once income has risen to a certain level
(Hanushek, 1996); rather, inequality should be considered the major variable when adding an
economic variable to the model. Robust evidence on inequality and health comes from the US:
Kaplan et al. demonstrated a correlation between age-adjusted mortality and various measures of
income inequality across the U.S. states in 1990 (Greg Brown, 1990). Adding inequality to a
model is necessary when we try to approximate a real experiment. What we should keep in
mind is that inequality works through aggregation. The relationship between mortality risk and income is
nonlinear at the individual level, so that when we compare average mortality across populations,
the distribution of income will play a role. This version of the hypothesis has come to be labeled
a “statistical artefact”.
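The aggregation argument can be stated compactly. As a sketch, assume that individual mortality risk is a convex, decreasing function m(y) of income y; the symbols m and y and the convexity assumption are introduced here for illustration and are not taken from the cited studies. Jensen's inequality then gives:

```latex
\[
\mathbb{E}\left[ m(y) \right] \;\ge\; m\left( \mathbb{E}[y] \right)
\]
```

Under this assumption, of two populations with the same mean income, the one with the more dispersed income distribution has the higher average mortality, even if inequality has no direct effect on any individual.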
Although socioeconomic status is well known to influence mortality patterns in the United States,
few studies have examined the effect of other factors on mortality. Albano added education as an independent
variable and found that educational attainment was strongly and inversely associated with death
from all cancers combined in black and white men and in white women. Loeb and Bound also
provided useful analysis of the effects of education on the mortality rate (Susanna Loeb, 1995).
Deaton and Lubotsky create dummy variables for no high school, some college, and college
graduate and post-graduate, with high school graduate being the omitted dummy variable (Angus
Deaton, 2003). This inspired us to include education in our model. Education is often ignored
by scholars when building such models, but intuitively, people with more education care more
about their health. What’s more, higher education often means higher compensation, so
education has a positive effect on income.
A more popular idea is to use spatial inequality factors to uncover the real causes of the
mortality rate, which gives us ways to examine mortality in relation to the unequal distribution of
resources across space.
Analyzing factors like race and education that are often tied to geographical location will give us
hints for building a more complete model to explain the mortality rate. More often, research has been largely
descriptive or has offered some evidence of correlations with county mortality rates.
Hui Zheng used a spatial model to illustrate key social variables, so as to avoid being unable to
explain the mechanisms of influence (which may cause methodological and conceptual problems).
He used a Moran statistic to test whether there is any autocorrelation across locations. Under positive spatial autocorrelation, a large value
at a location tends to be surrounded by large neighboring values; that is to say, a county with a
high mortality rate tends to be surrounded by counties with high mortality rates. Having found the
autocorrelation, Zheng built a spatially autoregressive model. What's more, Zheng concluded
that traditional linear regression misspecifies the possible relationship between county ecological
factors and mortality rates unless the inherent spatial dimension is considered (Zheng, 2012).
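The Moran statistic mentioned above can be illustrated with a short computation. This is a minimal sketch, not Zheng's implementation: the function name `morans_i`, the four hypothetical counties and the binary contiguity weights are all assumptions made for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] > 0 marks locations i and j as neighbours (assumed
    binary contiguity weights; other weighting schemes are possible).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


# Four hypothetical counties arranged in a line: 0 - 1 - 2 - 3
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# High-mortality counties next to each other: positive autocorrelation
clustered = morans_i([10, 10, 1, 1], weights)
# High and low values alternating: negative autocorrelation
alternating = morans_i([10, 1, 10, 1], weights)
```

A positive value indicates that similar mortality rates cluster in space, which is the pattern that motivates a spatial model over plain linear regression.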
Also, Yang and Jensen explored these arguments empirically using county-level data, which they
note has yielded weaker results in previous studies because of spatial dependence bias. This bias
is minimized by using a Bayesian spatial approach. They contribute to the literature in this area
in two fundamental ways. The first was by taking a decidedly spatial perspective and employing
Exploratory Spatial Data Analysis (ESDA), spatial regression modeling and a Bayesian approach
to better handle problems of spatial autocorrelation inherent to ecological data. The second was
by advancing and, to the extent possible, operationalizing a conceptual model that links inequality
and mortality through a psychosocial pathway and an underinvestment pathway (Tse-Chuan
Yang, 2011). The fraction of the population that is black is also ‘a significant risk factor’ for the
mortality rate, for both whites and blacks separately (Kaplan, 1996).
Also, an article by the AMC (Analytical Methods Committee, 2001) has improved the
method. Where outliers are illegitimately included in the data, it is only common sense that those
data points should be removed. Practically, we can address outliers by using the ‘robust’
option in Stata when the data are normally distributed (Lewis, 1996).
We initiated our project after noticing that few scholars have examined the simultaneous effect of
socioeconomic status, education and race; most writers have focused on economic inequality factors.
Although many studies have shown that the death rate and inequality are highly correlated, there are
still some uncertainties. Some researchers found that deprivation and social capital partly, but
not completely, account for why inequality is positively associated with mortality. We would like
to bring these factors together in order to paint as precise a picture as possible.
A more sophisticated approach is to track inequality over an individual’s lifetime through survey
data. This enables scholars to treat inequality as person-specific and time-varying, but
unfortunately this method is well beyond our time and budget limitations. We find OLS is
also plausible: the conclusions we get from an OLS model are more straightforward, but there are drawbacks;
for example, we cannot eliminate geographical correlation.
We tried to make our paper more precise by using county data instead of state data; county
data can help us find finer-grained relationships. What’s more, after reading the literature, we
find that inequality factors do not affect the mortality rate immediately; their effect tends to lag by some
years, so we use inequality data from five years earlier to explain current mortality. Zheng
concludes the risk starts five years later, which is why we chose this lag (Tse-Chuan Yang,
2011).
Data Source
In order to examine the model accurately, data have been collected from reliable sources. Each
group of variables is described in greater detail below.
Mortality
The mortality rates were calculated from the Compressed Mortality Files (CMF) maintained by
the National Center for Health Statistics (NCHS). To remove the effect of age, age-adjusted
mortality rates were used in the analysis (NCHS 2013).
Income Inequality
The well-known Gini coefficient ranges from 0 (total equality; everyone has the same income) to
1 (total inequality; one person receives all the income). The Gini coefficient was employed to
measure inequality within a county and was calculated from the household income data in the
2005–2009 American Community Survey (ACS) estimates (US Census Bureau 2010). As the
effect of income inequality on the mortality rate operates with a time lag (Hui 2012), we use the 2008
Gini index instead of the 2013 Gini index.
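How the index maps an income distribution onto the 0-to-1 scale can be illustrated with a short computation. The sketch below is not the Census Bureau's procedure for the ACS estimates; it assumes a plain list of household incomes, and the function name `gini` and the sample incomes are hypothetical.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where i = 1..n is the rank of x_i in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical four-household examples
equal = gini([30000, 30000, 30000, 30000])   # everyone has the same income
concentrated = gini([0, 0, 0, 120000])       # one household holds all income
```

With only four households the largest attainable value is (n - 1)/n = 0.75; the upper bound approaches 1 as the population grows.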
Race/ethnicity structure
The race/ethnicity structure of a county was captured by four variables. Three measure the race structure: the proportion of
American Indian or Alaska Native, the proportion of Asian or Pacific Islander, and the
proportion of Black or African American. The
ethnicity structure is measured by the proportion of Hispanics. Because there is some overlap
between race and ethnicity, e.g. a Black person can also be Hispanic, the sum of these four
variables may be larger than one. These were extracted from the U.S. Census Bureau, 2013 American
Community Survey.
Education
The educational attainment variables are the percentages of the county population over the age
of 25 that have the following as their highest level of education: high school graduate, some
college, an associate degree, a bachelor degree and a graduate degree. The omitted category is
non-high school graduates. These variables were obtained from the U.S. Census Bureau, 2013
American Community Survey.
Income
The household median income was extracted from the U.S. Census Bureau, 2013 American
Community Survey.
The strength of prevailing evidence varies by geographic scale, with state-level analyses more
supportive than county-level findings (Wilkinson and Pickett 2006). Inequality is better captured
with state-level than with county-level data (Wilkinson and Pickett 2006, 2009) because income
distributions are generally wider within states than counties. Regardless, the inconsistent
evidence by scale of analysis suggests that data aggregation bias may plague studies in this area
(Deaton and Lubotsky 2003). Recently, it has been argued that ‘county’ is a more appropriate
analytic unit than ‘state’ as it accounts for the heterogeneity within a state, which helps to
study spatial inequality in detail, and has more relevant implications for localities (Lobao et al.
2007). Accordingly, this study analyzes county-level data. After dropping observations with missing data,
which are missing at random, the sample in this paper consists of 739 county-level units.
The descriptive statistics of the variables are shown in Table 1.
The Gini coefficient ranges from 0 (total equality; everyone has the same income) to 1
(total inequality; one person receives all the income). The average Gini coefficient is
0.4365. Cass County, MO, has the smallest Gini coefficient, 0.3460, which means Cass County
is the most equal county in the sample. New York County, NY, is the most unequal county, with the
largest Gini coefficient, 0.6050. The average household median income is $29,970.90.
Income is lowest in Isabella County, MI, at $11,879, and highest in Arlington County,
VA, at $60,405.
The average mortality rate is 0.0076. The minimum mortality rate is 0.0046, in
Collier County, FL, and the maximum is 0.0113, in Baltimore city, MD.
The standard deviation of the mortality rate is 0.0011, which shows that the variation in mortality
rate among counties is substantial. This is one of the reasons we are interested in
the mortality problem.
There are 51 counties that have no American Indian or Alaska Native residents. McKinley
County, NM, has the largest share of American Indian or Alaska Native people, at 0.747.
The mean of Indian is 0.0109. The mean of Black is 0.1105. Gallatin County, MT, has only 0.1%
Black or African American people relative to its population. The county with the highest
fraction of Black residents is Hinds County, MS, at 0.7080, which means 70.8% of people in
Hinds County are Black or African American. The minimum proportion of Asian or
Pacific Islander is 0.001. In Honolulu County, HI, 42.6% of the population is Asian or Pacific Islander.
The average of Asian among counties is 0.0317. The range of Hispanic is large, at 0.948, and the
average is 0.1176. Muskingum County, OH, has the smallest fraction of Hispanic residents, 0.005,
and Webb County, TX, has the highest, 0.9530. The
coefficients of variation (standard deviation divided by mean) of the race/ethnicity
variables are all larger than 100%, which shows that the variation in race/ethnicity
proportions among counties is large. We also found that all race/ethnicity variables are
positively skewed (shown by histograms in the appendix). The influence of the outliers cannot be
removed by using robust estimators (AMC 2001), and the log transformation does not work
because some observations have a value of 0. Therefore, we drop these outliers according to the IQR rule
(Tukey 1977), which is discussed in further detail below. However, dropping observations loses some
information, so in what follows this paper compares the results with
and without these outliers.
The average percentage of people whose highest education level is high school is 29.03%.
The minimum of highschool is 8.3% and the maximum is 51.6%. The average
percentage of people whose highest education level is some college is 22.09%. The minimum
percentage is 9.2%, and the maximum is 35.6%. The average percentage of people whose
highest education level is an associate degree is 8.55%. The minimum is 3.1% and the maximum is 16.3%.
The average percentage of people whose highest education level is a bachelor degree is
17.71%. The minimum is 5.66% and the maximum is 39.4%. The average of percentage of people,
whose highest education level is a graduate degree, is 10.59%. The minimum is 1.9% and the
maximum is 39.1%.
The relationship between mortality rate and income inequality is shown in Appendix B. From the
figure we found that mortality rate and income inequality have a non-linear relationship.
Below some critical value of the Gini index, the mortality rate increases as income
inequality increases. This is consistent with our expectation. However, above that critical point, the mortality
rate decreases as income inequality increases. This part drew our attention, and
we attempt to explain this result in our paper.
As Appendix B shows, the Gini index and income also have a non-linear relationship. At
first, income falls as the Gini index grows; after a critical point, income increases as the Gini
index grows (Deaton 2001).
It has been argued that the inequality-mortality association is unique to the US because
inequality is a proxy for the race and social structure of this country (Deaton and Lubotsky 2003;
Ross et al. 2000). As minority groups in the US are more likely to be impoverished and live in
socially disorganized areas, the inequality-mortality relationship may be attributed to social and
political structure. Hence, controlling for these factors should eliminate the inequality-mortality
relationship (Deaton 2001, 2003). For example, African Americans suffer greater socioeconomic
deprivation and often reside in high-poverty environments marked by social disorganization,
which together create health disparities (Kawachi and Kennedy 1997b; Williams et al. 2010).
Therefore the race structure is another important factor. The relationship between mortality rate
and the fraction of Black or African American residents is shown in Appendix B. As we can see,
with or without the outliers, the two variables have a linear relationship: the mortality rate goes
up as the fraction of Black or African American people increases. Appendix B also shows
that if we drop the outliers, the mortality rate and the fraction of Asian have a negative relationship.
According to State Health Facts, Asian-Americans have a longer life expectancy than
other races. This may be due to their lifestyle. In Appendix B, without the outliers the
mortality rate and the fraction of Indian have no obvious relationship. The scatterplot of mortality
rate and fraction of Hispanic without the outliers shows that these two variables have a slightly
negative relationship.
We think that education will also influence mortality rates (Sparks 2010). First, people
who get more education tend to take more care of their health. Second, higher education
is expected to bring higher income. Combining these two effects, we expect education and the mortality rate
to have a negative relationship. The scatterplots for the education levels show that a county
with a larger share of less-educated population, such as those with only a high school education, has a higher mortality
rate, and a county with a larger share of highly educated population, such as bachelor and graduate degree holders,
has a lower mortality rate.
The correlation matrices also show the relationships between variables. Black, high school
education level, and college education level have a positive relationship with mortality rate, while
Asian, Hispanic, ln(income), and the bachelor and graduate education levels have a negative relationship
with mortality rate. In addition, ln(income) has a negative relationship with inequality.
Model
The determinants of the mortality rate will be estimated using OLS with robust standard errors.
The original model is specified as below:
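\[
\text{Mortality} = \beta_0 + \beta_1\,\text{gini} + \beta_2\,\text{gini}^2 + \beta_3\,\text{Black} + \beta_4\,\text{Asian} + \beta_5\,\text{Indian} + \beta_6\,\text{Hispanic} + \beta_7\,\text{highschool} + \beta_8\,\text{college} + \beta_9\,\text{associates} + \beta_{10}\,\text{bachelors} + \beta_{11}\,\text{graduate} + \beta_{12}\ln(\text{income}) + \varepsilon
\]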
mortality: age-adjusted mortality rate
gini and gini^2: Gini coefficient and its square
This is our main variable of interest and is one of a number of measures of inequality available.
We decided on the Gini coefficient as it was readily available for each county in the US. We’re
including the gini^2 variable as we believe the relationship between mortality and inequality is
not linear. This is highlighted in the plots of mortality rate and Gini index.
Race/ethnicity variables
Black: the proportion of black or African Americans per county
Asian: the proportion of Asian or Pacific Islander per county
Indian: the proportion of American Indian or Alaska Native per county
Hispanic: the proportion of Hispanics per county
Here we’re accounting for the different racial compositions that are seen throughout the counties.
Variables for the populations of blacks and whites are very common empirically, but we decided
to add other races to our model. The reason is that we think there may be
differences between races other than Blacks and Whites that affect the mortality rate differently.
For example, there could be lifestyle choices that are predominantly found in Asian or Hispanic
cultures that could have a different relationship to mortality than the Black or Indian proportions.
To investigate these differences we have included these variables.
Level of highest education attainment variables
Highschool: high school completion is the highest level of education
College: completed some college education but did not complete a degree requirement
Associates: completed an associate's degree
Bachelor: completed a bachelor’s degree
Graduate: completed a graduate degree
These variables are also consistent with the empirical literature, specifically Deaton and Lubotsky
(2003). However, the associates and graduate variables have been added in an attempt to
improve on their findings and see whether these attainments have any explanatory power.
ln(income): natural log of median household income in dollars
We’re using income here as more of a control variable, as explained further below, but it
should also help us distinguish between a rise in inequality due to poorer people getting poorer and
one due to richer people getting richer. We’ve taken the natural log of income because, in
preliminary regressions using income in levels, the coefficient was very small and difficult to interpret.
Our estimation method is OLS. We chose OLS for two reasons. First, much of the literature
we read uses OLS. Second, the properties of the data are suitable for OLS. We explain the
properties of our data, potential problems and solutions in the following part.
The following are the main causes of bias and inconsistency in the coefficient estimator and
the variance estimator in OLS.
1. Omitted Variable
An omitted variable is a variable that is left out of the model but is correlated with both the dependent variable and an included independent
variable; it causes bias in the coefficient estimator. There will always be omitted
variables. We consider only those that we think are significant to our model.
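The direction of this bias follows from the standard omitted-variable formula. As a sketch for a single included regressor x and a single omitted variable z (the symbols x, z and the coefficients \gamma are introduced here for illustration and do not appear in our model):

```latex
\begin{align*}
y &= \gamma_0 + \gamma_1 x + \gamma_2 z + u, \\
\operatorname{plim}\,\tilde{\gamma}_1 &= \gamma_1 + \gamma_2\,\frac{\operatorname{Cov}(x, z)}{\operatorname{Var}(x)}
\end{align*}
```

Here \tilde{\gamma}_1 is the OLS slope from regressing y on x alone. The bias term vanishes only if the omitted variable has no effect on y (\gamma_2 = 0) or is uncorrelated with x, and its sign is the product of the signs of \gamma_2 and Cov(x, z).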
An example of a variable that intuitively has a direct effect on mortality is health
status. In practice this is a very difficult variable to measure, which is why it is left out of our
analysis; this could cause omitted variable bias. Health status is also correlated
with income: the higher the income, the healthier a person is, so we would expect a negative
relationship between income and mortality rate. This comes from being able to spend more
income on healthcare as well as from healthier lifestyle choices. Education attainment would also be
correlated with health status, as more education is believed to make people care more
about their health.
The other possibly omitted variable is pollution. Intuitively, a higher level of pollution
can cause various types of disease and increase the mortality rate. The Environmental Kuznets
Curve is an inverted U-shaped curve describing the relationship between environmental
degradation and income. Specifically, in developing countries with a low national income,
environmental conditions deteriorate as income grows. In developed countries, however, the
level of pollution decreases as income goes up. This is understandable because developing
countries like China have to sacrifice their natural environment in exchange for
industrialization, while in developed countries like the US, more income means
more money to deal with the pollution produced earlier in the process of economic
development. Overall, pollution has a positive correlation with mortality rate and a negative
correlation with income.
2. Heteroskedasticity
After plotting the residuals from a preliminary regression (Appendix D)
and examining their spread, we concluded that heteroskedasticity is present.
We use OLS to get unbiased and consistent coefficient estimates and use White’s
heteroskedasticity-robust standard errors for statistical inference. The source of the
heteroskedasticity could be the non-normally distributed data in our race variables and the
outliers mentioned below. A number of counties have a very large proportion of a given race in
their population, which means the data are skewed (Appendix A).
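The estimation strategy described above, OLS point estimates combined with White's heteroskedasticity-robust standard errors, can be sketched as follows. This is an illustration on simulated data, not our regression: the function name `ols_white`, the simulated variables and the use of NumPy are assumptions, and the code computes the unscaled (HC0) form of the estimator.

```python
import numpy as np


def ols_white(X, y):
    """OLS coefficients with White (HC0) robust standard errors.

    X must already contain a column of ones for the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1  X' diag(e^2) X  (X'X)^-1.
    # Statistical packages often add a small-sample scaling such as
    # n / (n - k); that scaling is omitted here.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))


# Simulated data (hypothetical): the noise variance grows with x,
# so the errors are heteroskedastic by construction.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=500)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1 + x, size=500)

# Intercept, x and x squared, mirroring the gini / gini^2 structure
X = np.column_stack([np.ones_like(x), x, x ** 2])
beta, robust_se = ols_white(X, y)
```

The coefficient estimates are identical to ordinary OLS; only the standard errors, and therefore the t-statistics, change.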
3. Outliers
We made histograms (Appendix A) for each variable and found that, except for the race/ethnicity
variables, the variables are all approximately normally distributed. According to the
literature, we can use robust estimation to deal with the outlier problem for variables that are normally
distributed, so we keep all data for those variables. For the race/ethnicity variables, we
can use the IQR rule to drop outliers.
IQR rule:
Q1 is the first quartile, the value larger than one quarter of the data. Q3 is the third quartile, the value larger than three quarters of the data.
IQR (interquartile range) = Q3 - Q1. Outliers are defined as anything below Q1 - 1.5 IQR or
above Q3 + 1.5 IQR.
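The rule above can be sketched in a few lines. The function names `iqr_bounds` and `drop_outliers` and the sample shares are hypothetical, and the quartile convention (the default of Python's `statistics.quantiles`) is an assumption; other software may compute Q1 and Q3 slightly differently.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the observations that lie inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical race-share data: positively skewed, with one very large value
shares = [0.01, 0.02, 0.02, 0.03, 0.04, 0.05, 0.06, 0.75]
kept = drop_outliers(shares)
```

In this example only the share of 0.75 lies above the upper fence, so it is the single observation dropped.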
If we drop all the outliers, we would lose a lot of data. Since we have no evidence that those
outliers are illegitimately included in the data, it may not be a wise choice to drop so much data.
The bias caused by deleting data may be bigger than the bias caused by outliers.
When discussing results, we will run two separate regressions on the dataset, one with outliers
and one without.
4. Serial correlation
We use cross-sectional data rather than time-series data. Therefore we don’t need to worry about
serial correlation over time. There may be some geographical correlation that comes naturally from
using location-based variables, as previously discussed.
5. Multicollinearity
Looking at the correlation matrix (Appendix), a number of variables have a
significant correlation, most notably the education attainment variables. During preliminary
regressions these variables were statistically significant, which suggests multicollinearity is
not an issue. Because these variables are used predominantly as control variables, collinearity
among them should not be a problem.
Results
Our results are listed in Table 2 in Appendix C. Both Gini and Gini^2 are significant at the 0.1%
level, as are all of the race/ethnicity variables apart from Indian. Of the education
attainment variables, all but college are also significant at the 0.1% level. ln(income) has
an unexpected sign but is not statistically different from zero at the 95% confidence
level.
Evaluating the marginal effect of Gini at its mean of 0.436, we obtain a change in mortality of 0.00409, or
409 people per 100,000, per unit change in the Gini Index. Because the Gini Index is bounded between 0 and 1, a
0.01 increase around the mean corresponds to a change in the mortality rate of roughly 4 people per 100,000.
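Because the model includes both gini and its square, the value reported above is a derivative evaluated at the sample mean rather than a single coefficient. As a sketch in the notation of the model specification (the symbol \Delta for a small change is introduced here for illustration):

```latex
\begin{align*}
\frac{\partial\,\text{Mortality}}{\partial\,\text{gini}} &= \beta_1 + 2\beta_2\,\text{gini}, \\
\Delta\text{Mortality} &\approx \left( \beta_1 + 2\beta_2\,\text{gini} \right) \Delta\text{gini}
\end{align*}
```

The marginal effect therefore depends on the level of inequality at which it is evaluated, and the linear approximation in the second line holds only for small changes in gini.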
Black is the only race variable that does not show a negative sign, which supports our
hypothesis that different races have different effects on mortality. A positive sign
on Black suggests that an increase in the proportion of the black population within a county
increases the mortality rate; Deaton and Lubotsky (2003) also found a positive sign for their
fraction-black variables. A reason for this could be that average incomes among the black
population are negatively correlated with the presence of blacks, whereas the average income of
whites is greater with a larger fraction of blacks (Deaton and Lubotsky, 2003). This may also
explain the insignificance of our income variable as well as its near-zero coefficient. As
mentioned above, according to State Health Facts, Asian-Americans and Hispanic-Americans
have a longer life expectancy than other races. There is a theory known as the Hispanic
Paradox: Hispanics tend to have better health habits and stronger networks of social
support from the community (Paola Scommegna, 2013). Intuitively, Asians may have a similar
attitude to living and lifestyle. That could be one of the reasons why the coefficients on the
Asian and Hispanic variables are negative. The coefficient on Asian becomes insignificant in the
second regression, perhaps because we lose useful information after dropping outliers.
Each level of completed education is significant and exhibits a negative sign. Only college (some
college without a degree) is not significant, which suggests that finishing an education level
matters. The coefficient on bachelor is -0.0115, which means that a one-percentage-point increase in
the share of people who complete a bachelor’s degree is associated with a decrease in the mortality
rate of about 0.000115, or roughly 11.5 people per 100,000, holding the other variables constant.
Using Allegheny County, PA as an example, 1% of the population that has either only completed high
school or attended some college is 5,615 people.
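A short worked example may help with the units here. The education variables are shares between 0 and 1 and mortality is a proportion, so a coefficient has to be multiplied by the change in the share and then by 100,000 to read it as deaths per 100,000. The helper below is a hypothetical illustration using the Table 2 coefficient on bachelor (with outliers) and the Table 1 mean mortality rate.

```python
def effect_per_100k(coefficient, change_in_share):
    """Change in deaths per 100,000 implied by a change in a population share."""
    return coefficient * change_in_share * 100_000


beta_bachelor = -0.0115   # Table 2, with outliers
one_point = 0.01          # one percentage point of the county population

bachelor_effect = effect_per_100k(beta_bachelor, one_point)
mean_mortality_per_100k = 0.0076 * 100_000  # Table 1 mean

print(round(bachelor_effect, 1))          # -11.5
print(round(mean_mortality_per_100k, 1))  # 760.0
```

Set against a mean mortality of about 760 per 100,000, a one-point rise in the bachelor share corresponds to a reduction of about 1.5% of the average rate under these estimates.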
The second regression excludes the outliers identified using the process described above.
The Indian variable switches sign from negative to positive but is still not statistically
significant. College is the only variable that gains significance, at the 5% level. Asian
is the only variable to lose significance. ln(income) is still not statistically significant.
Conclusions
A positive sign on Gini and a negative sign on Gini² mean that as the inequality of a county
increases, its marginal effect on mortality decreases. This might not make immediate sense, but it
becomes clearer if we think about how inequality can change within a county. For example, if a
county’s inequality increases because some people become wealthier and therefore healthier, the
decrease in the mortality rate could be greater than the increase that the larger inequality creates.
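As a rough check on this shape, the level of Gini at which the fitted quadratic stops rising can be computed as minus the coefficient on Gini divided by twice the coefficient on Gini squared. The sketch below applies this to the Table 2 point estimates. The function name is an assumption made for illustration, and the turning points are simple arithmetic on the reported coefficients rather than results estimated in this paper.

```python
def turning_point(beta_linear, beta_squared):
    """Value of x where beta_linear * x + beta_squared * x**2 reaches its extremum."""
    return -beta_linear / (2 * beta_squared)


tp_with = turning_point(0.0494, -0.0519)     # Table 2, with outliers
tp_without = turning_point(0.0642, -0.0704)  # Table 2, without outliers

print(round(tp_with, 3), round(tp_without, 3))  # 0.476 0.456
```

Under these point estimates the fitted curve rises with Gini up to roughly 0.48 (with outliers) and falls beyond it. The sample mean of 0.4365 lies below both turning points, which is consistent with a positive marginal effect at the mean.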
Another way of thinking about this issue is in terms of the omitted variable bias caused by the
public healthcare system. Intuitively, people who have health insurance are more likely to
receive good care when they are sick, and people with a low income are more likely to receive
public healthcare for free. Ross et al. (2000) used cross-sectional data to compare the relationship
between income inequality and mortality in the US and Canada, and found no significant association
between the two in Canada. This is, as they concluded, mainly because health care resources in
Canada are “publicly funded and universally available” (Ross et al., 2000).
In summary, when income inequality becomes extreme, the government is very likely to intervene
and provide healthcare to those who cannot afford medical care. This reduces the marginal effect
of income inequality on the mortality rate when the Gini Index is high, producing a non-linear
relationship between them.
Based on these conclusions, we also find that our model suffers from the absence of controls for the
geographic location of each county, since health care policies differ across states. The
Commonwealth Fund rates states like Hawaii and Vermont as the best states for health care, and
states like Texas and Mississippi as the worst (Wingfield, 2007). Furthermore, there is a positive
relationship between the wealth of a state and its public health care, which may cause upward
bias in the coefficient on our income variable.
If we used panel data and ran a fixed effects (FE) regression, we would be able to eliminate the
bias caused by time-invariant differences across geographic locations. Limited by our knowledge and
time, we were not able to do that in this paper. A future improvement to our model would be to use
panel data in our regressions.
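To illustrate why a fixed effects regression would help, the sketch below simulates a small panel in which each county has an unobserved, time-invariant effect that is correlated with the regressor. It then compares pooled OLS with the within (demeaned) estimator. Everything here is a hypothetical assumption: the data are synthetic, there is a single regressor, and the function names are ours. It is not the model estimated in this paper.

```python
import random
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the 'within' transformation)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def ols_slope(x, y):
    """OLS slope of y on a single regressor x, with an intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(0)
TRUE_BETA = 0.5
county, x, y = [], [], []
for c in range(200):                  # 200 synthetic counties
    effect = random.gauss(0.0, 1.0)   # unobserved, time-invariant county effect
    for _ in range(4):                # 4 years per county
        xi = 0.8 * effect + random.gauss(0.0, 1.0)  # regressor correlated with the effect
        county.append(c)
        x.append(xi)
        y.append(TRUE_BETA * xi + effect + random.gauss(0.0, 0.1))

beta_pooled = ols_slope(x, y)
beta_fe = ols_slope(demean_by_group(x, county), demean_by_group(y, county))
print(f"pooled OLS: {beta_pooled:.3f}, fixed effects: {beta_fe:.3f}")
```

In this simulation the pooled estimate absorbs the county effect and lies well above the true slope of 0.5, while the within estimator recovers it closely. Demeaning removes only influences that are constant over time within a county.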
References
Adler, N. E. (1993). Socioeconomic inequalities in health: no easy solution. Journal of the
American Medical Association, p. 269.
Analytical Methods Committee. (2001). AMC Technical Brief. Retrieved 12 1, 2015
Deaton, A., & Lubotsky, D. (2003). Mortality, inequality and race in American cities and states.
Social Science & Medicine, pp. 1139–1153.
Berkman, L. F., & Syme, S. L. (1979). Social networks, host resistance, and mortality: A
nine-year follow-up study of Alameda County residents. American Journal of
Epidemiology, 109(2), 186-204.
Wingfield, B. (2007). The Best and Worst States For Health Care. Forbes.
Greg Brown, M. P.-T.-Q. (1990). Regression of Coronary Artery Disease as a Result of Intensive
Lipid-Lowering Therapy in Men with High Levels of Apolipoprotein B. New England Journal
of Medicine, pp. 1289-1298.
Hanushek, E. A. (1996). Aggregation and the estimated effects of school resources. Review of
Economics and Statistics, 78(4), pp. 611–627.
House, J. S., Landis, K. R., & Umberson, D. (1988). Social relationships and health. Science
(New York, N.Y.), 241(4865), 540-545.
Ichiro Kawachi, B. P. (1997). The relationship of income inequality to mortality: Does the choice
of indicator matter? Social Science & Medicine, pp. 1121–1127.
Jessica D. Albano, E. W. (2007). Cancer Mortality in the United States by Education Level and
Race. Medicine & Health, pp. 1384-1394.
JS House, K. L. (1988). Social Relationships and Health. Science, pp. 204-208.
Kaplan. (1996). People and places: contrasting perspective on the association between social
class and health. International Journal of Health Services, pp. 507-519.
Kaplan, G. E. (1996). Inequality in income and mortality in the United States: analysis of
mortality and potential. British Medical Journal, p. 312.
Kawachi, I., Colditz, G. A., Ascherio, A., Rimm, E. B., Giovannucci, E., Stampfer, M. J., et al.
(1996). A prospective study of social networks in relation to total mortality and
cardiovascular disease in men in the USA. Journal of Epidemiology and Community
Health, 50(3), 245-251.
Lewis, V. B. (1996). Outliers in statistical data. International Journal of Forecasting, pp. 175-176.
Lobao, L. M., & Hooks, G. (2007). Advancing the sociology of spatial inequality. The Sociology
of Spatial Inequality, 29.
Ross, N. A., Wolfson, M. C., Dunn, J. R., Berthelot, J. M., Kaplan, G. A., & Lynch, J. W. (2000).
Relation between income inequality and mortality in Canada and in the United States:
Cross sectional assessment using census data and vital statistics. BMJ (Clinical Research
Ed.), 320(7239), 898-902.
Sparks, P. J., & Sparks, C. S. (2010). An application of spatially autoregressive models to the
study of US county mortality rates. Population, Space and Place, 16(6), 465-481.
Susanna Loeb, J. B. (1995). The Effect of Measured School Inputs on Academic Achievement:
Evidence from the 1920s, 1930s and 1940s Birth Cohorts. NBER Working Paper, pp.
653-64.
Syme, L. F. (1979). Social Networkers, Host Resistance, and Mortality: A Nine-Year Follow-up
Study of Alameda County Residents. American Journal of Epidemiology, pp. 186-204.
Tse-Chuan Yang, L. J. (2011). Social Capital and Human Mortality: Explaining the Rural. Rural
Sociology, pp. 347–374.
Tumin, M. M. (1953). Some Principles of Stratification: A Critical Analysis. American
Sociological Review, pp. 387-394.
Wilkinson, R. G. (2002). Unhealthy societies: The afflictions of inequality. Routledge.
Wilkinson, R. G. (2006). The impact of inequality. Social Research, 711-732.
Wilkinson, R. G., & Pickett, K. E. (2006). Income inequality and population health: A review
and explanation of the evidence. Social Science & Medicine, 62(7), 1768-1784.
Williams, D. R., Mohammed, S. A., Leavell, J., & Collins, C. (2010). Race, socioeconomic
status, and health: Complexities, ongoing challenges, and research opportunities. Annals
of the New York Academy of Sciences, 1186(1), 69-101.
Zheng, H. (2012). Do people die from income inequality of a decade ago? Social Science &
Medicine, pp. 36-45.
Appendix A
Table 1 Summary Statistics
Variable                                                                      Mean        Std. dev.
Gini coefficient                                                              0.4365      (0.0375)
age-adjusted mortality rate                                                   0.0076      (0.0011)
the proportion of American Indian or Alaska Native per county                 0.0109      (0.0483)
the proportion of black or African Americans per county                       0.1105      (0.1234)
the proportion of Asian or Pacific Islander per county                        0.0317      (0.0414)
the proportion of Hispanics per county                                        0.1176      (0.1335)
completed high school as their highest level of education                     0.2903      (0.0662)
completed some college education but did not complete a degree requirement    0.2209      (0.0396)
completed an associate's degree                                               0.0855      (0.0203)
completed a bachelor’s degree                                                 0.1771      (0.0566)
completed a graduate degree                                                   0.1059      (0.0502)
median household income (dollars)                                             29970.8900  (6368.3280)
Number of observations                                                        739
Standard deviations in parentheses
Histograms of all variables and normal distribution plot.
1. Variables that are approximately normally distributed:
2. Variables that are not normally distributed
Appendix B
1. Correlation matrix
with outliers:
without outliers:
           mort      gini      gini2     black     indian    asian     hisp      college   associate bachelor  graduate  lnincome
mort       1
gini       0.0302    1
gini2      0.0177    0.9982    1
black      0.3141    0.2634    0.2653    1
indian     -0.0309   -0.0344   -0.0367   -0.093    1
asian      -0.4794   0.1968    0.201     0.1393    0.0153    1
hisp       -0.2174   0.1957    0.2014    0.1276    0.2094    0.2297    1
college    0.1837    -0.1912   -0.1878   0.0954    0.3115    -0.1741   0.1458    1
associate  -0.1365   -0.2628   -0.2676   -0.1913   0.1443    -0.0901   -0.1624   0.1038    1
bachelor   -0.6748   0.136     0.1417    -0.0632   -0.0152   0.6151    0.0735    -0.1638   -0.1241   1
graduate   -0.6067   0.3168    0.3244    -0.0319   -0.0864   0.6605    0.0233    -0.3609   -0.2066   0.7597    1
lnincome   -0.4149   -0.3287   -0.3231   -0.0526   -0.0988   0.3205    0.046     -0.1565   0.0054    0.5159    0.3085    1
2. Scatterplots:
The left-hand panels are the scatterplots with outliers and the right-hand panels are the scatterplots without outliers.
[Scatterplots with fitted values: mort against lnincome, Gini Index, black, asian, indian, hisp and highschool, and Gini Index against lnincome.]
Appendix C
Regression results
Table 2 Regression Table
Dependent variable: mortality rate
                                                                              with outliers     without outliers
Gini coefficient                                                              0.0494***         0.0642***
                                                                              (0.0112)          (0.0141)
Gini coefficient squared                                                      -0.0519***        -0.0704***
                                                                              (0.0128)          (0.0162)
the proportion of black or African Americans per county                       0.00179***        0.00250***
                                                                              (0.000239)        (0.000364)
the proportion of American Indian or Alaska Native per county                 -0.000591         0.00714
                                                                              (0.000456)        (0.0114)
the proportion of Asian or Pacific Islander per county                        -0.00340***       -0.000921
                                                                              (0.000485)        (0.00217)
the proportion of Hispanics per county                                        -0.00400***       -0.00584***
                                                                              (0.000330)        (0.000601)
completed high school as their highest level of education                     -0.00536***       -0.00669***
                                                                              (0.00122)         (0.00139)
completed some college education but did not complete a degree requirement    -0.00196          -0.00299*
                                                                              (0.00112)         (0.00137)
completed an associate's degree                                               -0.0157***        -0.0168***
                                                                              (0.00147)         (0.00171)
completed a bachelor’s degree                                                 -0.0115***        -0.0122***
                                                                              (0.00111)         (0.00128)
completed a graduate degree                                                   -0.0104***        -0.0123***
                                                                              (0.00118)         (0.00135)
natural log of median household income (dollars)                              0.0000368         0.0000254
                                                                              (0.000168)        (0.000195)
constant                                                                      0.00245           0.000670
                                                                              (0.00323)         (0.00398)
N                                                                             739               508
R²                                                                            0.705             0.701
adj. R²                                                                       0.700             0.694
Robust standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
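The significance stars can be roughly reproduced from the reported coefficients and robust standard errors. The sketch below divides each coefficient by its standard error and converts the ratio to a two-sided p-value. It uses a standard normal approximation, which is our assumption here: the original software most likely used a t distribution, although with 739 and 508 observations the difference is small. The function names are hypothetical.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value for coef / se under a standard normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


def stars(coef, se):
    """Map the p-value to the star convention stated under Table 2."""
    p = two_sided_p(coef, se)
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# (coefficient, robust standard error) pairs copied from Table 2.
rows = {
    "gini, with outliers": (0.0494, 0.0112),
    "college, with outliers": (-0.00196, 0.00112),
    "college, without outliers": (-0.00299, 0.00137),
    "asian, without outliers": (-0.000921, 0.00217),
}
for name, (coef, se) in rows.items():
    print(f"{name}: t = {coef / se:.2f} {stars(coef, se)}")
```

For these four rows the approximation gives the same stars as the table: three stars on Gini, none on college with outliers, one star on college without outliers, and none on Asian without outliers.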
Appendix D
residual graph:
[Residual plot: mort (vertical axis) against residuals (horizontal axis).]

Researchreport

  • 1. The authors would like to thank Agnieszka Postępska and Jiadi Chen for their support and help throughout the writing on this paper. The determinants of the mortality rate across US counties Group 1 Leo Acklin, Xinyu Wei, Lingjing Zhu, Xi Zhang, Yuyang Gu Department of Economics Georgetown University Abstract This studies attempts to estimate the relationship between mortality rate and income inequality five years previous within counties of the United States using ordinary least squares (OLS). Independent variables include Gini Index, race and education attainment variables to control for the various socioeconomic differences across counties, while the dependent variable is mortality rate per 100,000. The results imply the Gini Index within a county has a statistical significance on the mortality rate, along with the completion of a High School diploma and degrees of Associate, Bachelor and Graduate levels. Besides, all races except black are negatively correlated with mortality rate.
  • 2. The determinants of the mortality rate across US counties 2 Introduction: The relationship between income and mortality rate has been a contentious topic for years, people insist on the basic intuition: income, wealth should have a positive effect on life expectancy. In fact, a relationship does exist, a research report from The World Bank has shown that people who earn more than $5,000/month will have a 25% longer life expectancy than those who earn less than $5,000. But when earning is higher than $5,000/month, the effect of income to mortality rate is likely to be eliminated, that is to say, income alone may not justify what causes the fluctuation of mortality rate (Ichiro Kawachi, 1997). As time goes by, more and more researchers give potential variables that can directly influence the mortality rate, and gradually focus less on economic variables than classical scholars, rather, they examine the relationship between mortality rate, races, education and geographical factors, which provide more possibilities to improve the original model. Inspired by previous research, in our model we use age-adjusted mortality rate in order to avoid the disturbance from the death of “natural causes”. There is a famous saying in economics by John Maynard Keynes, “In the long run we are all dead”. We focus on causes of death other than time flowing. We had several choices in the measurement of income inequality, including Gini Index, Hoover Index and Atkinson Index. We decided to use Gini Index because it is most commonly used by previous researchers including this year’s Nobel Prize winner, Angus Deaton, and more importantly, it is easy for us to gather data. We also noticed that there has been a simultaneous increase in Gini Index and mortality rate since 2006, so we expect there are some causal relationship between them. 
The ultimate goal of this paper is to find out whether there is any causal effect between them, so that the government can adjust policies to reduce the mortality rate and increase social welfare.
Why Should We Expect a Relationship?
Although it may be difficult to use economic theory to explain a potential relationship between the Gini Index and the mortality rate, the association becomes easier to understand if we think about how inequality affects people. Kawachi highlights how inequality can increase physiological stress and be harmful to overall health through a number of different channels (Ichiro Kawachi, 1997). A connection between the mortality rate and some factor becomes more straightforward when an increase in that factor is harmful to overall health.
The breakdown of social cohesion and its role in reducing personal well-being is a subject Kawachi focuses on. This is the notion that pockets of affluence and poverty encourage dysfunction among the population (Ichiro Kawachi, 1997). Melvin Tumin speculated that this is because inequalities in social rewards encourage hostility, suspicion and mistrust among the segments of society (Tumin, 1953). Clearly these community attributes are not healthy for members of those communities.
If we look at the health benefits of a cohesive society, the connection becomes even clearer. Several studies (Syme, 1979) (JS House, 1988) have concluded that socially integrated people live longer. This may be because of the extra support, emotional or financial, that comes from having a well-connected group of people. Real-world examples of these effects were found by Wilkinson. In Britain during both world wars, a greater sense of community was created by lower inequality among citizens, and this was coupled with a large increase in life expectancy. On the other side of the coin is the town of Roseto, PA. Originally an Italian-American area with very few health problems, it saw a rapid economic expansion during the 1960s that created a large divide between the rich and the poor, and the number of deaths from coronary disease rose significantly (Wilkinson, 1996).
Literature review
It has long been hypothesized that there is a negative relationship between mortality and socioeconomic status: better-off people live longer (Adler, 1993).
In the public health and sociological literatures, “socioeconomic status” is generally considered the fundamental factor explaining the mortality rate. Though assets, occupation, education and income are all available measures, we chose income as the basic variable in our model because it is the most direct index of people’s wealth and because the data are easy to find. Despite many empirical studies, the current literature has provided mixed results. Many scholars are now hesitant about the adverse impact of income on the mortality rate. Scholars have observed that income becomes ineffective in explaining the mortality rate once it has risen to a certain level (Hanushek, 1996); rather, inequality should be considered the major variable when adding an economic variable to the model. Robust evidence on inequality and health comes from the US: Kaplan et al. demonstrated a correlation between age-adjusted mortality and various measures of income inequality across the U.S. states in 1990 (Greg Brown, 1990).
Adding inequality to the model is necessary when we try to approximate a real experiment. We should keep in mind that inequality works through aggregation. The relationship between mortality risk and income is nonlinear at the individual level, so when we compare average mortality across populations, the distribution of income plays a role. This version of the hypothesis has come to be labeled a “statistical artefact”.
Although socioeconomic status is well known to influence mortality patterns in the United States, few studies have examined the effect of other factors on mortality. Albano added education as an independent variable and found that educational attainment was strongly and inversely associated with death from all cancers combined in black and white men and in white women. Loeb and Bound also provided a useful analysis of the effects of education on the mortality rate (Susanna Loeb, 1995). Deaton and Lubotsky created dummy variables for no high school, some college, and college graduate and post-graduate, with high school graduate as the omitted category (Angus Deaton, 2003). This inspired us to include education in our model. Education is often ignored by scholars when building such models, but intuitively, people with more education care more about their health; moreover, higher education often means higher compensation, so education has a positive effect on income.
A more popular idea is to use spatial inequality factors to uncover the real causes of the mortality rate, which gives us ways to inspect how the mortality rate relates to the unequal distribution of resources across space.
Analyzing factors such as race and education, which are often tied to geographical location, gives us hints for building a more complete model to explain the mortality rate. More often, researchers have used large descriptive analyses or offered some evidence of correlations with county mortality rates. Hui Zheng used a spatial model to illustrate key social variables, so as to avoid being unable to explain the mechanisms of influence (which may cause methodological and conceptual problems). He used a Moran statistic to test for autocorrelation across locations. A large value at a location tends to be surrounded by large neighboring values; that is to say, if a county with a high mortality rate is surrounded by counties with high mortality rates, there is high local autocorrelation at that location. Having found the autocorrelation, Zheng built a spatially autoregressive model. Moreover, Zheng concluded that traditional linear regression misspecifies the possible relationship between county ecological factors and mortality rates unless the inherent spatial dimension is considered (Zheng, 2012).
Yang and Jensen also explored these arguments empirically using county-level data, which they note has yielded weaker results in previous studies because of spatial dependence bias. This bias is minimized by using a Bayesian spatial approach. They contribute to the literature in this area in two fundamental ways. The first was taking a decidedly spatial perspective and employing Exploratory Spatial Data Analysis (ESDA), spatial regression modeling and a Bayesian approach to better handle the problems of spatial autocorrelation inherent in ecological data. The second was advancing and, to the extent possible, operationalizing a conceptual model that links inequality and mortality through a psychosocial pathway and an underinvestment pathway (Tse-Chuan Yang, 2011).
The fraction of the population that is black is also ‘a significant risk factor’ for the mortality rate, for both whites and blacks separately (Kaplan, 1996).
An article by the AMC (Analytical Methods Committee, 2001) has also improved the method. Where outliers are illegitimately included in the data, it is only common sense that those data points should be removed. In practice, we can deal with outliers by using the ‘robust’ option in Stata when the data are normally distributed (Lewis, 1996).
We initiated our project after noticing that few scholars have examined the simultaneous effect of socioeconomic status, education and race; most writers focused on economic inequality factors. Although many studies have shown that the death rate and inequality are highly correlated, some uncertainties remain.
Some researchers found that deprivation and social capital partly, but not completely, account for why inequality is positively associated with mortality. We would like to bring these factors together in order to draw as precise a picture as possible. A more sophisticated approach is to track inequality over an individual’s lifetime through survey data. This enables scholars to treat inequality as person-specific and time-varying, but unfortunately this method is well beyond our time and budget limitations. We find that OLS is also plausible: the conclusions from an OLS model are more straightforward, but there are drawbacks as well; for example, we cannot eliminate geographical correlation. We tried to make our paper more precise by using county data instead of state data, since county data can help us find finer-grained relationships. Moreover, after reviewing the literature, we find that inequality factors do not affect the mortality rate immediately; their effect tends to lag by some years, so we use inequality data from five years earlier to explain current mortality. Zheng concludes that the risk starts five years later, which is why we chose this lag (Zheng, 2012).
Data Source
In order to estimate the model accurately, data were collected from reliable sources. Each group of variables is described in greater detail below.
Mortality
The mortality rates were calculated from the Compressed Mortality Files (CMF) maintained by the National Center for Health Statistics (NCHS). In order to remove the effect of age, age-adjusted mortality rates were used in the analysis (NCHS 2013).
Income Inequality
The well-known Gini coefficient ranges from 0 (total equality; everyone has the same income) to 1 (complete inequality; one person receives all the income). The Gini coefficient was employed to measure inequality within a county and was calculated from the household income data in the 2005–2009 American Community Survey (ACS) estimates (US Census Bureau 2010). As the effect of income inequality on the mortality rate operates with a time lag (Zheng 2012), we use the 2008 Gini index instead of the 2013 Gini index.
Race/ethnicity structure
The race/ethnicity structure of a county was captured by four variables. Three measure the race structure: the proportion of American Indian or Alaska Native, the proportion of Asian or Pacific Islander, and the proportion of Black or African American. The ethnicity structure is measured by the proportion of Hispanics. Because there is some overlap between race and ethnicity (e.g. a black person can also be Hispanic), the sum of these four variables may be larger than one. These were extracted from the U.S. Census Bureau, 2013 American Community Survey.
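The Gini coefficient described under Income Inequality above can be illustrated with a small calculation. The sketch below is a toy version on a list of individual incomes using the standard rank-weighted formula; it is not the Census Bureau's procedure, which works from ACS household income data, and the function name `gini` and the sample incomes are assumptions made for illustration.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = everyone equal)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where x_i are incomes in ascending order and i runs from 1 to n.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical incomes, not taken from the paper's data.
equal_town = [30000, 30000, 30000, 30000]
unequal_town = [0, 0, 0, 120000]
print(gini(equal_town), gini(unequal_town))
```

With a finite sample of n incomes, the most unequal case (one person holds everything) gives (n - 1)/n rather than exactly 1, which approaches 1 as n grows.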
Education
The educational attainment variables are the percentages of the county population over the age of 25 whose highest level of education is: high school graduate, some college, an associate degree, a bachelor’s degree or a graduate degree. The omitted category is non-high school graduate. These variables were obtained from the U.S. Census Bureau, 2013 American Community Survey.
Income
The median household income was extracted from the U.S. Census Bureau, 2013 American Community Survey.
The strength of prevailing evidence varies by geographic scale, with state-level analyses more supportive than county-level findings (Wilkinson and Pickett 2006). Inequality is better captured with state-level than with county-level data (Wilkinson and Pickett 2006, 2009) because income distributions are generally wider within states than within counties. Regardless, the inconsistent evidence by scale of analysis suggests that data aggregation bias may plague studies in this area (Deaton and Lubotsky 2003). Recently, it has been argued that the county is a more appropriate analytic unit than the state, as it accounts for the heterogeneity within a state, which helps to study spatial inequality in detail, and has more relevant implications for localities (Lobao et al. 2007). Accordingly, this study analyzes county-level data. After dropping observations with missing data, which are missing at random, the sample in this paper consists of 739 county-level units.
The descriptive statistics of the variables are shown in Table 1. The average Gini coefficient is 0.4365. Cass County, MO, has the smallest Gini coefficient, 0.3460, which makes it the most equal county. New York County, NY, is the most unequal county, with the largest Gini coefficient, 0.6050. The average of median household income is $29,970.90. Income is lowest in Isabella County, MI, at $11,879, and highest in Arlington County, VA, at $60,405. The average mortality rate is 0.0076. The minimum mortality rate is 0.0046, in Collier County, FL, and the maximum is 0.0113, in Baltimore city, MD.
The standard deviation of the mortality rate is 0.0011, which shows that the variation in the mortality rate among counties is substantial. This is one of the reasons why we are interested in the mortality problem.
There are 51 counties that have no American Indian or Alaska Native residents. McKinley County, NM, has the largest proportion of American Indian or Alaska Native people, at 0.747. The mean of indian is 0.0109. The mean of black is 0.1105. In Gallatin County, MT, Black or African American people make up only 0.1% of the population. The county with the highest fraction of Black residents is Hinds County, MS, at 0.7080, meaning that 70.8% of people in Hinds County are Black or African American. The minimum proportion of Asian or Pacific Islander is 0.001. In Honolulu County, HI, 42.6% of the population is Asian or Pacific Islander. The average of Asian among counties is 0.0317. The range of Hispanic is large, at 0.948. The average is 0.1176. Muskingum County, OH, has the smallest fraction of Hispanic, 0.005, and Webb County, TX, has the highest, 0.9530.
The coefficients of variation (standard deviation divided by mean) of the race/ethnicity variables are all larger than 100%, which shows that the variation in race/ethnicity proportions among counties is large. We also found that all race/ethnicity variables are positively skewed (shown by the histograms in the appendix). The influence of the outliers cannot be canceled by using robust estimators (AMC 2001), and the log transformation does not work because some observations have a value of 0. Therefore, we drop these outliers according to the IQR rule (Tukey 1977), which is discussed in further detail below. However, dropping observations loses some information, so in what follows this paper compares the results with and without these outliers.
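The coefficient of variation used above is simply the standard deviation divided by the mean. As a minimal sketch, assuming the sample standard deviation and a made-up list of county shares (neither is stated in the paper), it could be computed as follows; the helper name `coefficient_of_variation` is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean


# Hypothetical shares of one group across five counties:
# mostly small values with a single very large one (positively skewed).
shares = [0.00, 0.01, 0.02, 0.05, 0.70]
cv = coefficient_of_variation(shares)
print(f"coefficient of variation: {cv:.0%}")
```

A value above 100% means the standard deviation exceeds the mean, which is typical of proportions that are near zero in most counties and very high in a few.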
The average percentage of people whose highest education level is high school is 29.03%; the minimum of highschool is 8.3% and the maximum is 51.6%. The average percentage of people whose highest education level is some college is 22.09%; the minimum is 9.2% and the maximum is 35.6%. For an associate degree, the average is 8.55%, the minimum 3.1% and the maximum 16.3%. For a bachelor’s degree, the average is 17.71%, the minimum 5.66% and the maximum 39.4%. For a graduate degree, the average is 10.59%, the minimum 1.9% and the maximum 39.1%.
The relationship between the mortality rate and income inequality is shown in Appendix B. From the figure we found that the mortality rate and income inequality have a non-linear relationship. Below some critical value of the Gini index, the mortality rate increases as income inequality increases. This is consistent with our expectation. Above that critical point, however, the mortality rate decreases as income inequality increases. This part drew our attention, and we attempt to explain this result in our paper. As Appendix B shows, the Gini index and income also have a non-linear relationship: at first, income falls as the Gini index grows; after a critical point, income increases as the Gini index grows (Deaton 2001).
It has been argued that the inequality-mortality association is unique to the US because inequality is a proxy for the race and social structure of this country (Deaton and Lubotsky 2003; Ross et al. 2000). As minority groups in the US are more likely to be impoverished and live in socially disorganized areas, the inequality-mortality relationship may be attributed to social and political structure. Hence, controlling for these factors should eliminate the inequality-mortality relationship (Deaton 2001, 2003). For example, African Americans suffer greater socioeconomic deprivation and often reside in high-poverty environments marked by social disorganization, which together create health disparities (Kawachi and Kennedy 1997b; Williams et al. 2010). Therefore the race structure is another important factor.
The relationship between the mortality rate and the fraction of Black or African American residents is shown in Appendix B. As we can see, both with and without the outliers, the two variables have a linear relationship.
The mortality rate goes up as the fraction of Black or African American people increases. Appendix B also shows that, if we drop the outliers, the mortality rate and the fraction of Asian have a negative relationship. According to State Health Facts, Asian-Americans have a longer life expectancy than other races, which may be due to their lifestyle. In Appendix B, without the outliers, the mortality rate and the fraction of Indian do not have an obvious relationship. The scatterplot of the mortality rate against the fraction of Hispanic without the outliers shows that these two variables have a slightly negative relationship.
We think that education also influences mortality rates (Sparks 2010). First, people with more education tend to take more care of their health. Second, higher education is associated with higher expected income. Combining these two effects, we expect education and the mortality rate to have a negative relationship. In the scatterplots for education level, counties with a larger share of the population at a low education level, such as high school, have higher mortality rates, and counties with a larger share at a high education level, such as bachelor’s and graduate degrees, have lower mortality rates.
The correlation matrices also show the relationships between the variables. Black, high school education level and college education level have a positive relationship with the mortality rate. Asian, Hispanic, ln(income), and bachelor and graduate education levels have a negative relationship with the mortality rate. In addition, ln(income) has a negative relationship with inequality.
Model
The determinants of the mortality rate are estimated using OLS with robust standard errors. The original model is specified as below:
$$\text{Mortality} = \beta_0 + \beta_1\,\text{gini} + \beta_2\,\text{gini}^2 + \beta_3\,\text{Black} + \beta_4\,\text{Asian} + \beta_5\,\text{Indian} + \beta_6\,\text{Hispanic} + \beta_7\,\text{highschool} + \beta_8\,\text{college} + \beta_9\,\text{associates} + \beta_{10}\,\text{bachelors} + \beta_{11}\,\text{graduate} + \beta_{12}\ln(\text{income}) + \varepsilon$$
mortality: age-adjusted mortality rate
gini and gini2: the Gini coefficient and its square
This is our main variable of interest and is one of a number of measures of inequality available. We decided on the Gini coefficient as it was readily available for each county in the US. We include the Gini2 variable because we believe the relationship between mortality and inequality is not linear. This is highlighted in the plots of the mortality rate against the Gini index.
Race/ethnicity variables
Black: the proportion of Black or African Americans per county
Asian: the proportion of Asian or Pacific Islander per county
Indian: the proportion of American Indian or Alaska Native per county
Hispanic: the proportion of Hispanics per county
Here we account for the different racial compositions seen across counties. Variables for the black and white populations are very common empirically, but we decided to add other races to our model. The reasoning is that there may be differences between races, beyond those between Blacks and Whites, that affect the mortality rate differently. For example, there could be lifestyle choices predominantly found in Asian or Hispanic cultures that have a different relationship to mortality than the Black or Indian proportions. To investigate these differences we have included these variables.
Level of highest education attainment variables
Highschool: high school is the highest level of education completed
College: completed some college education but did not complete a degree
Associates: completed an associate’s degree
Bachelor: completed a bachelor’s degree
Graduate: completed a graduate degree
These variables are also consistent with the empirical literature, specifically Deaton and Lubotsky (2003), although the associates and graduate variables have been added in an attempt to improve on their findings and to see whether these attainments have any explanatory power.
ln(income): natural log of median household income in dollars
We use income here mainly as a control variable, as explained further below, but it should also help us distinguish between a rise in inequality due to poorer people getting poorer and one due to richer people getting richer. We took the natural log of income because, in preliminary regressions using income in levels, the coefficient was very small and difficult to interpret.
Our estimation method is OLS. We chose OLS for two reasons. First, much of the literature we read uses OLS. Second, the properties of the data are suitable for OLS.
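The estimation approach named above, OLS with heteroskedasticity-robust standard errors, can be sketched in a few lines. This is a minimal illustration rather than the authors' code: it uses only two of the model's regressors (gini and its square), the data are synthetic and hypothetical, and the standard errors are the plain White (HC0) sandwich form, whereas statistical packages often apply a small-sample scaling on top of it. The helper names `solve` and `ols_white` are introduced here for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols_white(X, y):
    """Return OLS coefficients and White (HC0) robust standard errors.

    X is a list of rows; each row should start with 1.0 for the intercept.
    """
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    # Columns of (X'X)^-1, obtained by solving against each unit vector.
    inv_cols = [solve(xtx, [float(i == j) for i in range(k)]) for j in range(k)]
    # "Meat" of the sandwich estimator: X' diag(e^2) X.
    meat = [[sum(e * e * r[i] * r[j] for r, e in zip(X, resid)) for j in range(k)]
            for i in range(k)]
    se = []
    for i in range(k):
        col = inv_cols[i]  # i-th column (= i-th row, by symmetry) of (X'X)^-1
        var = sum(col[a] * meat[a][b] * col[b] for a in range(k) for b in range(k))
        se.append(max(var, 0.0) ** 0.5)
    return beta, se


# Hypothetical toy data: mortality as a quadratic in gini plus a deterministic
# disturbance whose size grows with gini (so the errors are heteroskedastic).
gini = [0.35 + 0.25 * i / 59 for i in range(60)]
mortality = [0.002 + 0.03 * g - 0.03 * g * g + (0.0002 * g if i % 2 == 0 else -0.0002 * g)
             for i, g in enumerate(gini)]
rows = [[1.0, g, g * g] for g in gini]
beta, se = ols_white(rows, mortality)
for name, b, s in zip(["const", "gini", "gini2"], beta, se):
    print(f"{name:>6}: coef = {b: .5f}, robust se = {s:.5f}")
```

The coefficient estimates are the ordinary OLS estimates; only the standard errors change, which is why robust inference leaves the point estimates in a results table untouched.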
We explain the properties of our data, potential problems and solutions in the following part. The following are the main causes of bias and inconsistency in the OLS estimators of the coefficients and of the variance.
1. Omitted Variable
An omitted variable is a variable that is correlated with both the dependent and the independent variables but is left out of the model; it causes bias in the coefficient estimator. There will always be omitted variables, so we should only consider the ones that we think are significant to our model. An example of a variable that intuitively has a direct relationship with mortality is health status. In practice this is a very difficult variable to measure, which is why it is left out of our analysis, potentially causing omitted variable bias. Health status is also correlated with income: the higher the income, the healthier a person is, so we would expect a negative relationship between income and the mortality rate. This comes from being able to spend a larger amount of income on healthcare as well as from healthier lifestyle choices. Education attainment would also be correlated with health status, as more education is believed to make people care more about their health.
The other possibly omitted variable is pollution. By common sense, a higher level of pollution can cause various types of diseases and increase the mortality rate. The Environmental Kuznets Curve is an inverted U-shaped curve describing the relationship between environmental degradation and income. Specifically, in developing countries with a low national income, environmental conditions deteriorate as income grows. In developed countries, however, the level of pollution decreases as income goes up. This is understandable because developing countries like China have to sacrifice their natural environment in exchange for industrialization, while in developed countries like the US, more income means more money to deal with the pollution produced earlier in the process of economic development. Overall, pollution has a positive correlation with the mortality rate and a negative correlation with income.
2. Heteroskedasticity
We plotted the residuals from a preliminary regression (Appendix D) and, judging from the residual plot, concluded that heteroskedasticity is present. We use OLS to obtain an unbiased and consistent estimator of the coefficients and use White’s heteroskedasticity-robust standard errors for statistical inference. The source of the heteroskedasticity could be the non-normally distributed data in our race variables and the outliers mentioned below. A number of counties have a very large proportion of a single race in their population, which means the data are skewed (Appendix A).
3. Outliers
We made histograms (Appendix A) for each variable and found that, except for the race/ethnicity variables, the variables are all approximately normally distributed. According to the literature, we can use robust estimation to deal with the outlier problem for variables that are normally distributed, so we keep all data for those variables. For the race/ethnicity variables, we can use the IQR rule to drop outliers.
IQR rule: Q1 is the value that is larger than one quarter of the data, and Q3 is the value that is larger than three quarters of the data. IQR (inter-quartile range) = Q3 - Q1. Outliers are defined as anything below Q1 - 1.5 IQR or above Q3 + 1.5 IQR.
If we drop all the outliers, we lose a lot of data. Since we have no evidence that those outliers are illegitimately included in the data, it may not be wise to drop so much data. The bias caused by deleting data may be bigger than the bias caused by the outliers. When discussing results, we therefore run two separate regressions on the dataset, one with outliers and one without.
4. Serial correlation
We use cross-sectional data rather than time-series data, so we do not need to worry about serial correlation over time. There may be some geographical correlation that comes naturally from using location-based variables, as previously discussed.
5. Multicollinearity
Looking at the correlation matrix (Appendix), a number of variables are significantly correlated, most notably the education attainment variables. In preliminary regressions these variables were statistically significant, which suggests that multicollinearity is not an issue. Since these variables are predominantly used as controls, collinearity should not be a problem.
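The IQR rule described under Outliers can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: quartile conventions differ between statistical packages, and here the default method of Python's `statistics.quantiles` is used; the function names and the sample values are hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the observations that lie within the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical values: eight ordinary observations and one extreme one.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
kept = drop_outliers(sample)
print(f"kept {len(kept)} of {len(sample)} observations")
```

Running the regression once on the full sample and once on the filtered sample, as the paper does, shows how sensitive the estimates are to the observations outside the fences.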
Results
Our results are listed in Table 2 in Appendix C. Both Gini and Gini2 are significant at the 0.1% level, as are all of the race/ethnicity variables apart from Indian. Among the education attainment variables, all but college are also significant at the 0.1% level. ln(income) has an unexpected sign but is not statistically significantly different from zero at the 95% confidence level. Evaluating Gini at its mean of 0.436, we obtain a marginal effect on mortality of 0.00409, or 409 people per 100,000, per unit change in the Gini Index. Since the index ranges only from 0 to 1, a small change around the mean scales this figure proportionally; for example, a 0.01 increase corresponds to about 4 more deaths per 100,000.
Black is the only race variable that does not show a negative sign, which suggests that our hypothesis that different races have different effects on mortality is reasonable. A positive sign on Black suggests that an increase in the proportion of the black population within a county increases the mortality rate; Deaton and Lubotsky (2003) also found a positive sign for their fraction-black variables. A reason for this could be that average incomes among the black population are negatively correlated with the presence of blacks, whereas the average income of whites is greater with a larger fraction of blacks (Deaton and Lubotsky, 2003). This may also explain the insignificance of our income variable as well as its near-zero coefficient.
As mentioned above, according to State Health Facts, Asian-Americans and Hispanic-Americans have a longer life expectancy than other races. There is a theory known as the Hispanic Paradox: Hispanics tend to have better health habits and stronger networks of social support from the community (Paola Scommegna, 2013). Intuitively, Asians may share similar attitudes and lifestyles. That could be one of the reasons why the coefficients on the Asian and Hispanic variables are negative.
The coefficient on Asian becomes insignificant in the second regression. That may be because we lose useful information after dropping outliers.
Each level of completed education is significant and exhibits a negative sign. Only college is not significant, which implies that finishing an education level matters. The coefficient on bachelor is -0.011, which means that a 1% increase in the number of people who complete a bachelor’s degree decreases the mortality rate by 0.011, or 1,100 people per 100,000, holding the other variables constant. Using Allegheny County, PA, as an example, 1% of the population that has either only completed high school or attended some college is 5,615 people.
The second set of regression results excludes the outliers identified using the process described above. The Indian variable switches sign from negative to positive but is still not statistically significant. College is the only variable that gains significance, at the 5% level. Asian is the only variable to lose significance. ln(income) is still not statistically significant.
Conclusions
A positive sign on Gini and a negative sign on Gini2 mean that, as the inequality of a county increases, its marginal effect on mortality decreases. This might not make immediate sense, but it becomes clearer if we think about how inequality can change within a county. For example, if a county’s inequality increases because people become wealthier and therefore healthier, the decrease in the mortality rate could be greater than the increase that the larger inequality creates.
Another way of thinking about this issue concerns the omitted variable bias caused by a public healthcare system. Intuitively, people who have health insurance are more likely to receive good care when they are sick, and people with a low income are more likely to receive public healthcare for free. Ross (2000) used cross-sectional data to compare the relationship between income inequality and the mortality rate in the US and Canada, and found no significant association between the two in Canada. This is, as they concluded, mainly because health care resources in Canada are “publicly funded and universally available” (Nancy A Ross, 2000). In summary, when income inequality becomes extreme, the government is very likely to intervene and provide healthcare to those who cannot afford medical care. This reduces the marginal effect of income inequality on the mortality rate when the Gini Index is high, producing a non-linear relationship between them.
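The sign pattern discussed above can be made explicit with a short derivation. This is a sketch based only on the model specification in the Model section; the symbol $g^{*}$ for the turning point is introduced here for illustration and does not appear in the paper, and no coefficient values from Table 2 are assumed.

```latex
\begin{align*}
\frac{\partial\,\text{Mortality}}{\partial\,\text{gini}} &= \beta_1 + 2\beta_2\,\text{gini} \\
g^{*} &= -\frac{\beta_1}{2\beta_2}
\quad \text{(the value at which } \beta_1 + 2\beta_2\, g^{*} = 0\text{)}
\end{align*}
```

With $\beta_1 > 0$ and $\beta_2 < 0$, the marginal effect of inequality on mortality is positive below $g^{*}$, zero at $g^{*}$ and negative above it, which matches the pattern described for the plots in Appendix B.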
Based on these conclusions, we also find that our model suffers from the absence of the geographic conditions of each county, since health care policies differ across states. The Commonwealth Fund rates states like Hawaii and Vermont as the best states for health care, and states like Texas and Mississippi as the worst (Forbes, 2007). Furthermore, there is a positive relationship between the wealth of a state and its public health care, which may cause upward bias in the coefficient on our income variable.
If we used panel data and ran a fixed-effects (FE) regression, we would be able to eliminate the bias caused by different geographic locations. Limited by our knowledge and time constraints, we were not able to do that in this paper. If we want to improve our model later on, we can use panel data in our regressions.
  • 20. Appendix A

    Table 1: Summary Statistics

    Variable                                                                    Mean (SD)
    Gini coefficient                                                            0.4365 (0.0375)
    Age-adjusted mortality rate                                                 0.0076 (0.0011)
    Proportion of American Indian or Alaska Native per county                   0.0109 (0.0483)
    Proportion of black or African Americans per county                         0.1105 (0.1234)
    Proportion of Asian or Pacific Islander per county                          0.0317 (0.0414)
    Proportion of Hispanics per county                                          0.1176 (0.1335)
    Completed high school as their highest level of education                   0.2903 (0.0662)
    Completed some college education but did not complete a degree requirement  0.2209 (0.0396)
    Completed an associate's degree                                             0.0855 (0.0203)
    Completed a bachelor's degree                                               0.1771 (0.0566)
    Completed a graduate degree                                                 0.1059 (0.0502)
    Median household dollars                                                    29970.8900 (6368.3280)
    Number of observations                                                      739

    Standard deviations in parentheses.
  • 21. Histograms of all variables and normal distribution plot. 1. Variables that are approximately normally distributed:
  • 22. 2. Variables that are not normally distributed:
  • 23. Appendix B

    1. Correlation matrices with outliers and without outliers (one matrix is recoverable):

              mort      gini      gini2     black     indian    asian     hisp      college   associate bachelor  graduate  lnincome
    mort      1
    gini      0.0302    1
    gini2     0.0177    0.9982    1
    black     0.3141    0.2634    0.2653    1
    indian    -0.0309   -0.0344   -0.0367   -0.093    1
    asian     -0.4794   0.1968    0.201     0.1393    0.0153    1
    hisp      -0.2174   0.1957    0.2014    0.1276    0.2094    0.2297    1
    college   0.1837    -0.1912   -0.1878   0.0954    0.3115    -0.1741   0.1458    1
    associate -0.1365   -0.2628   -0.2676   -0.1913   0.1443    -0.0901   -0.1624   0.1038    1
    bachelor  -0.6748   0.136     0.1417    -0.0632   -0.0152   0.6151    0.0735    -0.1638   -0.1241   1
    graduate  -0.6067   0.3168    0.3244    -0.0319   -0.0864   0.6605    0.0233    -0.3609   -0.2066   0.7597    1
    lnincome  -0.4149   -0.3287   -0.3231   -0.0526   -0.0988   0.3205    0.046     -0.1565   0.0054    0.5159    0.3085    1

    2. Scatterplots: the left-hand panels show the scatterplots with outliers and the right-hand panels show the scatterplots without outliers.
  • 24. [Scatterplots with fitted values, each shown with and without outliers: mort against lnincome, mort against the Gini Index estimate, the Gini Index estimate against lnincome, and mort against black.]
  • 25. [Scatterplots with fitted values, each shown with and without outliers: mort against asian, mort against indian, mort against hisp, and mort against highschool.]
  • 27. Appendix C: Regression results

    Table 2: Regression Table (dependent variable: mortality rate)

    Variable                                                                    With outliers           Without outliers
    Gini coefficient                                                            0.0494*** (0.0112)      0.0642*** (0.0141)
    Gini coefficient squared                                                    -0.0519*** (0.0128)     -0.0704*** (0.0162)
    Proportion of black or African Americans per county                         0.00179*** (0.000239)   0.00250*** (0.000364)
    Proportion of American Indian or Alaska Native per county                   -0.000591 (0.000456)    0.00714 (0.0114)
    Proportion of Asian or Pacific Islander per county                          -0.00340*** (0.000485)  -0.000921 (0.00217)
    Proportion of Hispanics per county                                          -0.00400*** (0.000330)  -0.00584*** (0.000601)
    Completed high school as their highest level of education                   -0.00536*** (0.00122)   -0.00669*** (0.00139)
    Completed some college education but did not complete a degree requirement  -0.00196 (0.00112)      -0.00299* (0.00137)
    Completed an associate's degree                                             -0.0157*** (0.00147)    -0.0168*** (0.00171)
    Completed a bachelor's degree                                               -0.0115*** (0.00111)    -0.0122*** (0.00128)
    Completed a graduate degree                                                 -0.0104*** (0.00118)    -0.0123*** (0.00135)
    Natural log of median household dollars                                     0.0000368 (0.000168)    0.0000254 (0.000195)
    Constant                                                                    0.00245 (0.00323)       0.000670 (0.00398)
    N                                                                           739                     508
    R²                                                                          0.705                   0.701
    Adj. R²                                                                     0.700                   0.694

    Robust standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001
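Because Table 2 includes both the Gini coefficient and its square, one way to read the two estimates together is through the implied turning point of the quadratic, which is the Gini value at which the fitted marginal effect on mortality changes sign. The sketch below is an illustrative reading and not a calculation reported in the paper. It assumes the specification mort = b1·gini + b2·gini² + (other terms), so the marginal effect is b1 + 2·b2·gini and the turning point is −b1 / (2·b2). The function names are hypothetical; the coefficients are taken from Table 2 and the sample mean of the Gini coefficient from Table 1.

```python
def turning_point(b_linear, b_squared):
    """Gini value at which b_linear + 2 * b_squared * gini equals zero."""
    return -b_linear / (2.0 * b_squared)

def marginal_effect(b_linear, b_squared, gini):
    """Derivative of b_linear * gini + b_squared * gini**2 with respect to gini."""
    return b_linear + 2.0 * b_squared * gini

# Point estimates from Table 2 (Gini coefficient and Gini coefficient squared).
tp_with = turning_point(0.0494, -0.0519)      # sample with outliers
tp_without = turning_point(0.0642, -0.0704)   # sample without outliers

# Marginal effect evaluated at the sample mean Gini from Table 1 (0.4365, N = 739).
effect_at_mean = marginal_effect(0.0494, -0.0519, 0.4365)

print(round(tp_with, 3), round(tp_without, 3), effect_at_mean > 0)
```

Under these assumptions the implied turning points are roughly 0.476 with outliers and 0.456 without, both above the sample mean Gini of 0.4365, so the fitted marginal effect of inequality on mortality is positive at the mean. The calculation uses point estimates only and ignores their standard errors.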
  • 28. Appendix D: Residual graph. [Scatterplot of mort against the residuals.]