The document analyzes factors that may contribute to poverty rates across US states using statistical analysis. It finds correlations between: obesity and hypertension rates; unemployment and food expenditures; and poverty rates and obesity rates. A multiple linear regression model predicts poverty rate based on health, economic and dietary variables. The analysis concludes some variables like unemployment significantly impact poverty rates, and targeting poverty could reduce obesity. Sources of error in the analysis are also discussed.
2. Introduction
Statistics is a powerful analytical tool that can be applied to social issues. Poverty is a
growing, self-perpetuating problem in the United States; those affected can remain entrenched
in the cycle of poverty over several generations. We examined the factors we believe
entrench people in poverty by analyzing the relationship between living habits, overall
well-being, and poverty. We used each of the fifty U.S. states as our unit of analysis, with data
from 2012. We attempted to answer the following questions:
1. Is there a correlation between obesity and hypertension across states?
2. Is there a difference in food expenditure between states above and below the national
average unemployment rate?
3. Is there a correlation between the poverty rate and the obesity rate by state?
4. Does the poverty rate vary by region?
We checked the assumptions of independence, linearity, equal variance,
and normality for each dependent and independent variable.
3. Variables and Summary Statistics
Variable                                                  n   Mean    Variance   Std. dev.  Std. err.  Median  Range   Min     Max
Unemployment Rate (% by state)                            50  7.34    2.935      1.713      0.2423     7.45    8.1     3.1     11.2
Fast food (% users in last 30 days by state)              50  83.46   0.08188    0.2861     0.04047    83.42   1.45    82.79   84.24
Hypertension (% by state)                                 50  27.64   9.754      3.123      0.4417     27.2    14.3    20.5    34.8
Total Food Expenditures ($/household)                     50  48,480  7,716,000  2,778      392.8      48,550  14,040  41,950  56,000
Unemployment coded (0 = above national avg., 1 = below)   50  n/a     n/a        n/a        n/a        n/a     n/a     n/a     n/a
Obesity Rate (% by state)                                 50  29.38   10.73      3.276      0.4633     29.65   14.6    21.3    35.9
Poverty Rate (% by state)                                 50  14.04   11.04      3.322      0.4698     13.65   13.6    8.6     22.2
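The columns of the summary table are tied together by two identities: the standard deviation is the square root of the variance, and the standard error of the mean is the standard deviation divided by √n. A minimal Python sketch, assuming only the rounded values printed in the table (the names `rows` and `is_consistent` and the 0.1% tolerance are illustrative choices, not part of the original analysis), checks that the table is internally consistent:

```python
import math

# (variable, n, variance, std. dev., std. err.) copied from the summary table
rows = [
    ("Unemployment Rate", 50, 2.935, 1.713, 0.2423),
    ("Fast food", 50, 0.08188, 0.2861, 0.04047),
    ("Hypertension", 50, 9.754, 3.123, 0.4417),
    ("Total Food Expenditures", 50, 7_716_000, 2778, 392.8),
    ("Obesity Rate", 50, 10.73, 3.276, 0.4633),
    ("Poverty Rate", 50, 11.04, 3.322, 0.4698),
]

def is_consistent(n, variance, std_dev, std_err, rel_tol=1e-3):
    """True if std_dev ~ sqrt(variance) and std_err ~ std_dev / sqrt(n)."""
    sd_ok = math.isclose(math.sqrt(variance), std_dev, rel_tol=rel_tol)
    se_ok = math.isclose(std_dev / math.sqrt(n), std_err, rel_tol=rel_tol)
    return sd_ok and se_ok

results = {name: is_consistent(n, var, sd, se) for name, n, var, sd, se in rows}
for name, ok in results.items():
    print(f"{name}: {'consistent' if ok else 'check rounding'}")
```

The tolerance only absorbs the rounding of the printed values; a row flagged as inconsistent would point to a transcription error rather than a statistical problem.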
4. Is there a correlation between obesity and hypertension
across states?
● H0: β1 = 0
● Ha: β1 ≠ 0
● We are setting α=0.05
● P-value of <0.0001
● There is a positive correlation between obesity rate and the
percentage of people within a state living with hypertension.
● Slope of ≈0.74 indicates that a 1 percentage point increase in a state's
hypertension rate is associated with a 0.74 percentage point increase in its obesity rate.
● 49.44% of the variability in obesity rate by state can be explained
by the variability in hypertension rate across states.
● Conclusion: With a P-value of <0.0001, less than α=0.05, we reject
the null hypothesis. We have sufficient evidence to say that there is a
correlation between obesity and hypertension across states.
Simple linear regression results:
Dependent Variable: Obesity Rate
Independent Variable: Hypertension (%)
Obesity Rate = 8.9891311 + 0.73763908 Hypertension (%)
Sample size: 50
R (correlation coefficient) = 0.70316441
R-sq = 0.49444019
Estimate of error standard deviation: 2.353631

Parameter  Estimate  Alternative  DF  T-Stat  P-value
Intercept  8.99      ≠ 0          48  3.00    0.0042
Slope      0.74      ≠ 0          48  6.85    <0.0001
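In simple linear regression the slope, intercept, R-sq, and slope t-statistic can all be recovered from the correlation coefficient and the summary statistics. The sketch below is an illustration under that assumption, using the rounded means and standard deviations from the summary table (the variable names are ours, not the software's):

```python
import math

# Values reported on the slides (rounded as printed)
n = 50
r = 0.70316441                   # correlation coefficient
mean_hyp, sd_hyp = 27.64, 3.123  # hypertension (%), the predictor
mean_ob, sd_ob = 29.38, 3.276    # obesity rate (%), the response

slope = r * sd_ob / sd_hyp                # b1 = r * s_y / s_x
intercept = mean_ob - slope * mean_hyp    # line passes through the means
r_squared = r ** 2
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)  # df = n - 2 = 48

print(f"slope={slope:.3f} intercept={intercept:.2f} "
      f"R-sq={r_squared:.4f} t={t_stat:.2f}")
```

Because the inputs are rounded, the results should land close to, but not exactly on, the reported estimates of 0.74, 8.99, 0.4944, and 6.85.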
5. Is there a difference in food expenditure between states above
and below the national average unemployment rate?
● We chose a two-sample t-test to compare mean food
expenditure between states with high and low
unemployment rates.
● We created a dummy variable coding each state's
unemployment rate relative to the national average:
○ 32 states are 1 (below national avg. of
8.07%)
○ 18 states are 0 (above national avg. of
8.07%)
● We are setting α=0.05
● P-value of 0.047
● Conclusion: With a p-value of 0.047, less than
α=0.05, we reject H0 and conclude there is
sufficient evidence to say that mean total food
expenditure differs between states above and
below the national average unemployment rate.
Hypothesis test results:
μ1: Mean of Total Food Expenditures ($/household) where "Unemployment Above (0)"=0
μ2: Mean of Total Food Expenditures ($/household) where "Unemployment Above (0)"=1
μ1 - μ2: Difference between two means
H0: μ1 - μ2 = 0
HA: μ1 - μ2 ≠ 0
(without pooled variances)

Difference  Sample Diff.  Std. Err.  DF  T-Stat  P-value
μ1 - μ2     95060.60      46581.90   50  2.04    0.047
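The test statistic in this output is the sample difference divided by its standard error, and the p-value is the two-tailed area of a Student's t distribution with the reported degrees of freedom. The following sketch reproduces both from the printed values. To stay within the standard library it integrates the t density numerically with Simpson's rule; the helper `t_two_sided_p` is our own illustration, not the routine the statistics package used.

```python
import math

def t_two_sided_p(t_stat, df, steps=10_000):
    """Two-sided p-value for Student's t via Simpson's rule on the density."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t_stat)
    h = upper / steps                  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                      # P(0 < T < |t|)
    return 2 * (0.5 - area)

# Values reported in the hypothesis-test output
sample_diff = 95060.60
std_err = 46581.90
df = 50

t_stat = sample_diff / std_err
p_value = t_two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.3f}")
```

With a statistics library available, the same tail area would normally come from its t-distribution survival function instead of a hand-rolled integral.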
6. Is there correlation between the poverty rate and
obesity rate by state?
● We chose to run a linear regression t-test to test this question.
● H0: β1 = 0
● Ha: β1 ≠ 0
● We are setting α=0.05
● Predicted Poverty Rate = -0.50 + 0.49 Obesity Rate
● P-value of 0.0003
● Conclusion: Because our p-value is 0.0003, less than α=0.05, we
reject the null hypothesis and say there is enough evidence of
a positive, moderately strong (R = 0.49), significant
relationship between the poverty rate within a state and the
obesity rate within that state.
Simple linear regression results:
Dependent Variable: Poverty Rate
Independent Variable: Obesity Rate
Poverty Rate = -0.50 + 0.49 Obesity Rate
Sample size: 50
R (correlation coefficient) = 0.49
R-sq = 0.24
Estimate of error standard deviation: 2.93

Parameter  Estimate  Std. Err.  Alternative  DF  T-Stat  P-value
Intercept  -0.50     3.78       ≠ 0          48  -0.13   0.90
Slope      0.49      0.13       ≠ 0          48  3.87    0.0003
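The fitted line Poverty Rate = -0.50 + 0.49 Obesity Rate can be used directly for prediction. As a sanity check, a least-squares line passes through the point of means, so plugging in the mean obesity rate from the summary table should return roughly the mean poverty rate. A minimal sketch, assuming the rounded coefficients shown on the slide (the function name is illustrative):

```python
# Rounded coefficients from the simple linear regression output
INTERCEPT = -0.50
SLOPE = 0.49

def predict_poverty_rate(obesity_rate):
    """Predicted state poverty rate (%) for a given obesity rate (%)."""
    return INTERCEPT + SLOPE * obesity_rate

mean_obesity = 29.38   # from the summary statistics table
mean_poverty = 14.04   # from the summary statistics table

at_mean = predict_poverty_rate(mean_obesity)
print(f"predicted at mean obesity: {at_mean:.2f}% "
      f"(observed mean {mean_poverty}%)")

# A 1 percentage point rise in obesity shifts the prediction by the slope
delta = predict_poverty_rate(30.0) - predict_poverty_rate(29.0)
print(f"change per percentage point: {delta:.2f}")
```

The small gap between the prediction at the mean and the observed mean comes from rounding the coefficients to two decimals.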
7. Does poverty rate vary based on region?
● We tested the popular stereotype that the South is more impoverished
than the North by comparing Southern and Northeastern states with a
two-sample test of means.
● South: Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana,
Mississippi, North Carolina, South Carolina, Tennessee, Virginia, West
Virginia
● Northeast: Connecticut, Delaware, Maine, Maryland, Massachusetts,
New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island,
Vermont
● Southern average poverty rate: μ1 = 17.39%
● Northeastern average poverty rate: μ2 = 11.83%
● H0: μsouth - μnortheast = 0; the Northeast and South have the same average
poverty rate
● HA: μsouth - μnortheast > 0; the South has a higher average poverty rate
than the Northeast
● α=0.05
● P-value of <0.0001
● Conclusion: With a p-value that is effectively 0, less than α=0.05, we
reject the null hypothesis and conclude that there is enough evidence to
say that the stereotype is true: the South has a higher poverty rate than
the Northeast.
Possible Shortcoming: We took the regional means of
proportions of each state. While the poverty rate, a
proportion, normalizes for the population, averaging
the various proportions does not take into account
differences in population size throughout the regions.
We attempted to create our own regional proportion
using the data that we did have.
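To make the shortcoming concrete, the sketch below contrasts an unweighted mean of state rates with a population-weighted mean. The two states, their rates, and their populations are hypothetical round numbers chosen for illustration only; they are not taken from the data set.

```python
def unweighted_mean(rates):
    """Simple average of state-level rates, every state counted equally."""
    return sum(rates) / len(rates)

def population_weighted_mean(rates, populations):
    """Regional rate in which each state counts in proportion to its population."""
    total = sum(populations)
    return sum(r * p for r, p in zip(rates, populations)) / total

# Hypothetical two-state region, NOT taken from the data set
rates = [10.0, 20.0]                  # poverty rate (%) per state
populations = [9_000_000, 1_000_000]  # residents per state

simple = unweighted_mean(rates)
weighted = population_weighted_mean(rates, populations)
print(f"unweighted: {simple:.1f}%, population-weighted: {weighted:.1f}%")
```

When a small state has an unusually high or low rate, the unweighted mean moves further from the regional rate that residents actually experience.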
Hypothesis test results:
μ1: Mean of South
μ2: Mean of Northeast
μ1 - μ2: Difference between two means

Difference  n1  n2  Sample diff.  Std. err.  Z-stat  P-value
μ1 - μ2     12  11  5.56          1.05       5.29    <0.0001
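The reported Z-statistic is the difference in regional means divided by its standard error, and the one-sided p-value is the upper-tail area of the standard normal distribution. A short sketch using only the values printed above (the normal approximation is taken from the output's Z-stat column; the variable names are ours):

```python
import math

mean_south = 17.39       # average poverty rate (%), South
mean_northeast = 11.83   # average poverty rate (%), Northeast
std_err = 1.05           # standard error of the difference, as reported

diff = mean_south - mean_northeast
z_stat = diff / std_err
# Upper-tail probability of the standard normal: P(Z > z)
p_value = 0.5 * math.erfc(z_stat / math.sqrt(2))
print(f"diff = {diff:.2f}, z = {z_stat:.2f}, one-sided p = {p_value:.1e}")
```

The computed z differs from the reported 5.29 only in the last digit, which is what rounding the standard error to two decimals would produce.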
8. Multiple Linear Regression Predicting Poverty Rate
Predicted Poverty Rate = -61.67 + 0.30(Obesity) -0.0004(Total Food Expenditure) + 0.30(Hypertension) + 0.72(Fast food) + 0.77(Unemployment Rate)
Summary of fit:
Root MSE: 2.46
R-squared: 0.51
R-squared (adjusted): 0.45
● Hypertension and obesity have the same regression coefficient
(0.3): holding the other variables constant, each 1 percentage
point increase in either rate is associated with a 0.3 percentage
point increase in the poverty rate.
● Total food expenditure has a weak negative coefficient: for
every added dollar of household food expenditure, the predicted
state poverty rate decreases by 0.0004 percentage points.
● Fast food and unemployment rate have the largest
coefficients (> 0.7), although only the unemployment
coefficient is statistically significant.
○ For every 1 percentage point increase in fast food users
in the last 30 days, the predicted poverty rate increases
by 0.72 percentage points (p = 0.72).
○ For every 1 percentage point increase in the state
unemployment rate, the predicted poverty rate increases
by 0.766 percentage points.
● Model strength: moderately strong
● About 51% of the variability in the poverty rate is accounted
for by the variables listed in the multiple linear regression
table below (R-squared).
● About 45% of the variability is accounted for after adjusting
for the number of variables and the sample size (adjusted
R-squared).
● Significant variables:
○ 95% confidence level: Unemployment Rate (0.0018)
○ 90% confidence level: Unemployment Rate (0.0018),
Obesity Rate (0.07); Hypertension (0.10) is borderline
Source  DF  SS     MS     F-stat  P-value
Model   5   275.3  55.05  9.121   <0.0001
Error   44  265.6  6.036
Total   49  540.8

Parameter                              Estimate  T-Stat  P-value
Intercept                              -61.67    -0.414  0.68
Obesity Rate (%)                       0.3       1.84    0.07
Total Food Expenditures ($/household)  -0.0004   -0.803  0.43
Hypertension (%)                       0.3       1.677   0.10
Fast food (% users in last 30 days)    0.72      0.39    0.72
Unemployment Rate (%)                  0.766     3.319   0.0018
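The summary-of-fit figures follow from the ANOVA table: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, Root MSE is the square root of the error mean square, and R-squared is the model share of the total sum of squares. The sketch below recomputes them from the rounded table entries (variable names are illustrative):

```python
import math

# ANOVA table values as reported
ss_model, ss_error, ss_total = 275.3, 265.6, 540.8
df_model, df_error = 5, 44
n = 50

ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_stat = ms_model / ms_error
root_mse = math.sqrt(ms_error)
r_squared = ss_model / ss_total
# Adjusted R-squared penalizes the number of predictors (df_model = 5)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - df_model - 1)

print(f"F = {f_stat:.3f}, Root MSE = {root_mse:.2f}, "
      f"R-sq = {r_squared:.2f}, adj. R-sq = {adj_r_squared:.2f}")
```

Agreement with the reported 9.121, 2.46, 0.51, and 0.45 confirms that the fit summary and the ANOVA table describe the same model.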
9. Analysis & Conclusion
● What did we learn?
○ Obesity and hypertension rates are highly correlated across states. This is consistent with people who have high
blood pressure being more likely to be obese, although state-level data cannot confirm the individual-level link.
○ The relationship between food expenditure and unemployment rate is significant. Mean annual household food
expenditure differs between states above and below the national average unemployment rate.
○ Poverty and obesity rates have a significant, positive, and moderately strong relationship. This correlation suggests
that targeting poverty may be a good approach to preventing obesity, though it does not establish causation.
○ Being born in the South, which an individual cannot control, is associated with a higher likelihood of living in poverty.
○ The social factors we examined are only part of the explanation for why poverty exists as it does, and continues to
cycle in our society. Other factors, including issues of race, marginalization, lack of education, access to healthcare,
etc. also likely contribute to poverty.
● In our analysis, potential sources of error are:
○ Multicollinearity: food expenditure and fast food usage (in the last thirty days) had a strong, positive correlation of
0.718, which can inflate the standard errors of their coefficients and make the model's estimates less reliable.
○ We used a small data set of 50 state-level observations. In the real world, policy makers and NGOs use much larger
data sets, which might yield a different result.
○ Inequality within states: a greater number of high-income earners in a state likely raises overall averages, making
problems seem less severe than they are.
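One common way to gauge the multicollinearity concern is the variance inflation factor (VIF). The sketch below is a simplification: it uses only the pairwise correlation of 0.718 between food expenditure and fast food usage, whereas a full VIF would regress each predictor on all the others, which requires the underlying data. The function name is our own.

```python
def pairwise_vif(r):
    """VIF when a predictor's only correlate is one other predictor: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r ** 2)

r_food_fastfood = 0.718           # correlation reported on the slide
vif = pairwise_vif(r_food_fastfood)
se_inflation = vif ** 0.5         # factor by which the standard error grows

print(f"VIF = {vif:.2f}, standard errors inflated by about {se_inflation:.2f}x")
```

Under this two-predictor assumption the inflation is modest; widely used rules of thumb treat VIF values above roughly 5 to 10 as a serious problem, but the full-model VIFs could be higher than this pairwise figure.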
10. Works Cited
Centers for Disease Control and Prevention (CDC). (2014). Data, Trends and Maps: Obesity Prevalence Maps. Retrieved from http://www.cdc.gov/obesity/data/table-adults.html.
U.S. Department of Labor, Bureau of Labor Statistics. (2012). Unemployment Rates for States Annual Average Rankings, Year: 2012. Retrieved from http://www.bls.gov/lau/lastrk12.htm.
U.S. Census Bureau. (2012). Annual Estimates of the Resident Population for the United States, Regions, States, and Puerto Rico: April 1, 2010 to July 1, 2012. Retrieved from https://www.census.gov/popest/data/state/totals/2012/.
Yoon SS, Burt V, Louis T, Carroll MD. (2012). Hypertension among adults in the United States, 2009–2010. NCHS data brief, no 107. Hyattsville, MD: National Center for Health Statistics.
U.S. Census Bureau. (2013, September). Poverty: 2000 to 2012. Retrieved from https://www.census.gov/prod/2013pubs/acsbr12-01.pdf.
Easy Analytic Software, Inc. (EASI). (2009-03-01). Consumer Behavior - 2013: Food - Fast Food Restaurants | Socioeconomic Indicator: Fast Food & Drive-In Restaurants: Visited In Last 6 Months: Total Category, 2013. Data-Planet™ Statistical Datasets by Conquest Systems, Inc. [Data-file]. Dataset-ID: 050-301-009
Easy Analytic Software, Inc. (EASI). (2009-03-01). Consumer Expenditures - 2013: Expenditures - Food | Socioeconomic Indicator: Total annual expenditures, 2013. Data-Planet™ Statistical Datasets by Conquest Systems, Inc. [Data-file]. Dataset-ID: 050-302-001