CDC Data Analytics Project

Data analysis of CDC data on various risk factors and the effect they have on insurance premiums in the Midwest vs. the Northeast.

Transcript

  • 1. INSURANCE PREMIUMS CONTINUE TO RISE IN THE NORTHEAST COMPARED TO THE MIDWEST
    • Why? What did our research uncover? And what’s to blame?
    • Presented by: Andrew Kim
    • April 20, 2011
  • 2.
    • FACT:
    • Insurance premiums are higher for those living in the Northeast compared to those living in the Midwest.
    • When asked to explain the discrepancy, insurance companies had this to say:
        • “Those residing in the Northeast compared to those residing in the Midwest live a more risky lifestyle in terms of alcohol consumption, driving habits, etc. As such their premiums unfortunately are going up while those living in the Midwest are not.”
  • 3. OBJECTIVE: Through extensive research, we plan to test whether insurance companies are telling us the truth: that Northeasterners live a riskier lifestyle than Midwesterners.
    • RELEVANT MATERIALS:
      • Centers for Disease Control and Prevention’s “Behavioral Risk Factor Surveillance System” survey (2000)
      • 2000 U.S. Census
      • 2008 U.S. Census
  • 4. DATA ANALYSIS:
    • COLLECTION METHODS
      • Using survey results from the CDC, we compiled a spreadsheet of risky-behavior percentages.
      • We divided the states into their respective regions as defined by the U.S. Census.
      • This failed to account for a state’s size, so we weighted each metric by the state’s share of its region’s population (State pop. / Region pop.).
      • Not satisfied, we also tested for poverty levels and ages (<5 & >65) using the 2008 Census.
    • CONFIDENCE INTERVALS
      • Using this data, we calculated the differences in percentages between the two regions with 95% confidence.
      • Using Drake Direct’s Plan-Analyzer application, we calculated confidence intervals for all risky-behavior metrics and determined whether each difference was statistically significant enough to be a determining factor in calculating insurance premiums.
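The population weighting described on this slide can be sketched in a few lines. This is a minimal illustration, not the spreadsheet the presenter used: the function name `weighted_rate` is hypothetical, and the inputs are the Northeast population (POP) and smoking (SMK) figures from the deck's data slide.

```python
# Population-weighted regional rate: each state's survey percentage is
# multiplied by its share of the regional population, then summed.
# Figures are the Northeast POP and SMK values from the deck's data slide.
northeast_smk = {
    "Connecticut": (3405607, 22.0),
    "Maine": (1274915, 27.0),
    "Massachusetts": (6349119, 24.0),
    "New Hampshire": (1235791, 22.0),
    "New York": (18976811, 23.0),
    "Pennsylvania": (12281071, 24.0),
    "Rhode Island": (1048315, 26.0),
    "Vermont": (608821, 22.0),
}


def weighted_rate(states):
    """Return the population-weighted percentage for a region.

    `states` maps a state name to (population, survey percentage).
    """
    region_pop = sum(pop for pop, _ in states.values())
    return sum(rate * pop / region_pop for pop, rate in states.values())


regional_smk = weighted_rate(northeast_smk)
print(f"Northeast SMK WTD: {regional_smk:.3f}%")  # prints 23.479%
```

The result matches the SMK WTD regional total reported for the Northeast, which suggests the deck's weighted columns were built this way.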
  • 5. DATA: Risky Behavior in Northeast, USA (Survey Results, %s)
    Key: SMK = Smokers, WEI = Overweight, SED = Sedentary Lifestyle, ACT = No Leisure Time, ALC = Binge Drinking, DWI = Drinking and Driving, SEA = No Seatbelt Use, POV = Living below Poverty level, AGE = Ages (<5 + >65), WTD = weighted by state share of regional population, N = survey sample size.

    | STATE | Connecticut | Maine | Massachusetts | New Hampshire | New York | Pennsylvania | Rhode Island | Vermont | Total |
    |---|---|---|---|---|---|---|---|---|---|
    | POP | 3405607 | 1274915 | 6349119 | 1235791 | 18976811 | 12281071 | 1048315 | 608821 | 45180450 |
    | POP WTD | 8% | 3% | 14% | 3% | 42% | 27% | 2% | 1% | 100% |
    | SMK | 22% | 27% | 24% | 22% | 23% | 24% | 26% | 22% | |
    | SMK WTD | 1.658% | 0.762% | 3.373% | 0.602% | 9.661% | 6.524% | 0.603% | 0.296% | 23.479% |
    | WEI | 23% | 24% | 19% | 21% | 20% | 25% | 22% | 20% | |
    | WEI WTD | 1.734% | 0.677% | 2.670% | 0.574% | 8.400% | 6.796% | 0.510% | 0.270% | 21.631% |
    | SED | 52% | 60% | 50% | 47% | 63% | 55% | 55% | 51% | |
    | SED WTD | 3.920% | 1.693% | 7.026% | 1.286% | 26.461% | 14.950% | 1.276% | 0.687% | 57.300% |
    | ACT | 26% | 36% | 23% | 20% | 33% | 27% | 26% | 25% | |
    | ACT WTD | 1.960% | 1.016% | 3.232% | 0.547% | 13.861% | 7.339% | 0.603% | 0.337% | 28.895% |
    | ALC | 17% | 10% | 18% | 16% | 12% | 18% | 18% | 21% | |
    | ALC WTD | 1.281% | 0.282% | 2.530% | 0.438% | 5.040% | 4.893% | 0.418% | 0.283% | 15.164% |
    | DWI | 3% | 1% | 3% | 2% | 1% | 3% | 2% | 4% | |
    | DWI WTD | 0.226% | 0.028% | 0.422% | 0.055% | 0.420% | 0.815% | 0.046% | 0.054% | 2.066% |
    | SEA | 23% | 41% | 46% | 40% | 20% | 26% | 49% | 34% | |
    | SEA WTD | 1.734% | 1.157% | 6.464% | 1.094% | 8.400% | 7.067% | 1.137% | 0.458% | 27.512% |
    | POV 2008 | 9.1% | 12.6% | 10.1% | 7.8% | 13.7% | 12.1% | 12.1% | 10.4% | |
    | POV WTD | 0.686% | 0.356% | 1.419% | 0.213% | 5.754% | 3.289% | 0.281% | 0.140% | 12.138% |
    | AGE 2009 (<5 + >65) | 19.9% | 21.0% | 19.5% | 19.1% | 19.7% | 21.3% | 20.0% | 19.7% | |
    | AGE WTD | 1.500% | 0.593% | 2.740% | 0.522% | 8.274% | 5.790% | 0.464% | 0.265% | 20.149% |
    | N | 998 | 1,001 | 993 | 1,000 | 999 | 1,000 | 998 | 998 | 7,987 |
  • 6. DATA: Risky Behavior in Midwest, USA (Survey Results, %s)

    | STATE | Illinois | Indiana | Iowa | Michigan | Minnesota | Missouri | Nebraska | North Dakota | Ohio | South Dakota | Wisconsin | Total |
    |---|---|---|---|---|---|---|---|---|---|---|---|---|
    | POP | 12419658 | 6080520 | 2926380 | 9938492 | 4919492 | 5596684 | 1711265 | 642195 | 11353150 | 754835 | 5363708 | 61706379 |
    | POP WTD | 20.127% | 9.854% | 4.742% | 16.106% | 7.972% | 9.070% | 2.773% | 1.041% | 18.399% | 1.223% | 8.692% | 100.000% |
    | SMK | 24% | 27% | 22% | 29% | 21% | 26% | 23% | 20% | 26% | 21% | 25% | |
    | SMK WTD | 4.830% | 2.661% | 1.043% | 4.671% | 1.674% | 2.358% | 0.638% | 0.208% | 4.784% | 0.257% | 2.173% | 25.297% |
    | WEI | 21% | 26% | 25% | 26% | 21% | 23% | 24% | 23% | 23% | 23% | 23% | |
    | WEI WTD | 4.227% | 2.562% | 1.186% | 4.188% | 1.674% | 2.086% | 0.666% | 0.239% | 4.232% | 0.281% | 1.999% | 23.339% |
    | SED | 60% | 61% | 61% | 59% | 55% | 61% | 55% | 56% | 69% | 57% | 54% | |
    | SED WTD | 12.076% | 6.011% | 2.893% | 9.503% | 4.385% | 5.533% | 1.525% | 0.583% | 12.695% | 0.697% | 4.694% | 60.594% |
    | ACT | 32% | 27% | 34% | 32% | 25% | 33% | 25% | 27% | 33% | 29% | 25% | |
    | ACT WTD | 6.441% | 2.661% | 1.612% | 5.154% | 1.993% | 2.993% | 0.693% | 0.281% | 6.072% | 0.355% | 2.173% | 30.427% |
    | ALC | 16% | 13% | 13% | 18% | 21% | 16% | 17% | 17% | 9% | 16% | 27% | |
    | ALC WTD | 3.220% | 1.281% | 0.617% | 2.899% | 1.674% | 1.451% | 0.471% | 0.177% | 1.656% | 0.196% | 2.347% | 15.989% |
    | DWI | 4% | 3% | 3% | 3% | 3% | 3% | 5% | 4% | 3% | 4% | 6% | |
    | DWI WTD | 0.805% | 0.296% | 0.142% | 0.483% | 0.239% | 0.272% | 0.139% | 0.042% | 0.552% | 0.049% | 0.522% | 3.540% |
    | SEA | 29% | 28% | 24% | 21% | 24% | 27% | 51% | 60% | 24% | 57% | 29% | |
    | SEA WTD | 5.837% | 2.759% | 1.138% | 3.382% | 1.913% | 2.449% | 1.414% | 0.624% | 4.416% | 0.697% | 2.521% | 27.151% |
    | POV 2008 | 12.2% | 12.9% | 11.4% | 14.4% | 9.6% | 13.5% | 10.8% | 11.5% | 13.3% | 12.7% | 10.5% | |
    | POV WTD | 2.455% | 1.271% | 0.541% | 2.319% | 0.765% | 1.224% | 0.300% | 0.120% | 2.447% | 0.155% | 0.913% | 12.511% |
    | AGE 2009 (<5 + >65) | 19.3% | 19.8% | 21.6% | 19.6% | 19.6% | 20.4% | 20.9% | 21.4% | 20.3% | 21.8% | 19.9% | |
    | AGE WTD | 3.885% | 1.951% | 1.024% | 3.157% | 1.563% | 1.850% | 0.580% | 0.223% | 3.735% | 0.267% | 1.730% | 19.963% |
    | N | 1,001 | 1,000 | 1,000 | 991 | 990 | 1,000 | 1,002 | 999 | 998 | 1,002 | 1,000 | 10,983 |
  • 7. Simple Linear Regression Model:
  • 8. Simple Linear Regression Model:
  • 9. CONFIDENCE INTERVALS:
    • Percentage differences between Northeast (Control) and Midwest (Test), with 95% confidence:
      • Smokers: (.5850, 3.0550)
      • Overweight: (.5095, 2.9105)
      • Sedentary Lifestyle: (1.8716, 4.7084)
      • Leisure Time: (.2253, 2.8548)
      • Binge Drinking: (-.2133, 1.8733)
      • DWI: (1.0042, 1.9358)
      • Seatbelt Use: (-1.6449, .9249)
    • According to our findings, the differences in Seatbelt Use and Binge Drinking are not statistically significant, since both intervals contain zero.
    • Percentage differences between Northeast (Control) and Midwest (Test), with no error associated:
      • Living below Poverty level: .37
      • Ages (<5 & >65): -.19
    • According to our findings, a larger share of Midwesterners live below the poverty level.
    • According to our findings, a larger share of Northeasterners are under the age of 5 or over the age of 65.
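The deck does not say how the Plan-Analyzer application computes its intervals. As a rough check, the sketch below uses the standard normal-approximation interval for a difference of two proportions, which is an assumption on my part. The function name `diff_ci` is hypothetical; the inputs are the weighted regional percentages and the total sample sizes (N) from the two data slides.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(p_control, n_control, p_test, n_test, confidence=0.95):
    """Normal-approximation CI for (p_test - p_control), as proportions."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    diff = p_test - p_control
    se = sqrt(p_control * (1 - p_control) / n_control
              + p_test * (1 - p_test) / n_test)
    return diff - z * se, diff + z * se


NE_N, MW_N = 7987, 10983  # total survey respondents per region

# Smokers: Northeast 23.479% (control) vs. Midwest 25.297% (test)
smk_low, smk_high = (100 * x for x in diff_ci(0.23479, NE_N, 0.25297, MW_N))

# Seatbelt Use: Northeast 27.512% vs. Midwest 27.151%
sea_low, sea_high = (100 * x for x in diff_ci(0.27512, NE_N, 0.27151, MW_N))

print(f"Smokers:  ({smk_low:.3f}, {smk_high:.3f})")
print(f"Seatbelt: ({sea_low:.3f}, {sea_high:.3f})")
print("Seatbelt interval contains zero:", sea_low < 0 < sea_high)
```

Under these assumptions the intervals land within a few thousandths of a percentage point of the values on the slide, and the Seatbelt Use interval straddles zero, which is why that difference is reported as not statistically significant.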
  • 10.
    • FINDINGS:
    • We found that the following Risk Behavior Factors are higher in the Midwest compared to the Northeast:
      • Smoking
      • Overweight
      • Sedentary Lifestyle
      • No Leisure Time
      • Drinking And Driving
      • Live Below Poverty Line
    | Factor | Northeast | Midwest | % Difference |
    |---|---|---|---|
    | SMK WTD | 23.479% | 25.297% | 1.818% |
    | WEI WTD | 21.631% | 23.339% | 1.708% |
    | SED WTD | 57.300% | 60.594% | 3.294% |
    | ACT WTD | 28.895% | 30.427% | 1.532% |
    | DWI WTD | 2.066% | 3.540% | 1.474% |
    | POV WTD | 12.138% | 12.511% | 0.373% |
  • 11.
    • FINDINGS:
    • The remaining two Risk Behavior Factors show no statistically significant difference between the regions:
      • Binge Drinking (Northeast 15.164%, Midwest 15.989%)
      • No Seatbelt Use (Northeast 27.512%, Midwest 27.151%)
    • We also determined that a larger share of the population is under the age of 5 or over the age of 65 in the Northeast (20.149% vs. 19.963%). This means that there are more eligible insurance holders to be covered in the Midwest than in the Northeast.
    • ASSUMPTION:
    • We assume that if insurance companies want to keep total premiums per state the same across the country, they will charge a higher rate for those in the Northeast (Pop.: 45M+) compared to the Midwest (Pop.: 61M+).
  • 12. GOODNESS OF FIT TEST:
    • Using the Northeast weighted percentages as our expected values and the Midwest weighted percentages as our observed values, we conducted Chi-Squared Goodness of Fit tests to determine whether to reject or fail to reject the null hypothesis (for statistically significant data).
      • H0: Risk Behaviors for Midwest are equal to Northeast.
      • H1: Risk Behaviors for Midwest are not equal to Northeast.
    • TS = 14755801
    • 1 - CHIDIST(14755801, 5) = 1 - 0.00000000000000000000 = 1
    • RESULT:
      • We reject H0, which states that Risk Behaviors for Midwest are equal to Northeast.
      • We accept H1, which states that Risk Behaviors for Midwest are not equal to Northeast.
      • Insurance companies are wrong in assessing that those living in the Northeast engage in a “riskier” lifestyle than those living in the Midwest. According to our data and analysis, this is not true.
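The slide does not show how the test statistic was computed. One reading that comes very close to the reported value, and it is only an assumption, is that each weighted percentage was converted to a head count by multiplying by its own region's population before applying the usual formula, the sum of (observed - expected)^2 / expected. The sketch below uses the six factors from the findings table and the regional population totals; the helper name `chi_squared` is hypothetical.

```python
# Weighted regional percentages (Northeast, Midwest) from the findings table.
factors = {
    "SMK": (23.479, 25.297),
    "WEI": (21.631, 23.339),
    "SED": (57.300, 60.594),
    "ACT": (28.895, 30.427),
    "DWI": (2.066, 3.540),
    "POV": (12.138, 12.511),
}
NE_POP, MW_POP = 45180450, 61706379  # regional totals from the data slides


def chi_squared(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# ASSUMPTION: percentages are scaled to counts by each region's own population.
expected = [ne / 100 * NE_POP for ne, _ in factors.values()]
observed = [mw / 100 * MW_POP for _, mw in factors.values()]

ts = chi_squared(observed, expected)
df = len(factors) - 1  # six categories give 5 degrees of freedom

print(f"TS = {ts:,.0f} with {df} degrees of freedom")
```

With these inputs the statistic lands within a fraction of a percent of the reported 14755801. Because the two regions have different populations, the observed and expected counts do not sum to the same total under this reading, so much of the statistic's size reflects the population gap as well as the behavior gap.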
  • 13. CONCLUSION:
    • We can say with close to 100% certainty that Risk Behaviors are not equal across the different regions of the United States, and we therefore conclude that the insurance company we investigated gave us a false statement.
    • “Based on our data and analysis, we assume that insurance companies are unfairly charging policy holders higher premiums in the Northeast compared to those in the Midwest.”
    • A reason for this may be that insurance companies are using different metrics to measure their policy holders. They may give each risk behavior a different weight, which would change the results considerably. They may also take into account that there are fewer eligible policy holders in the Northeast (as suggested by our Age Metric research).
    • Without a more detailed explanation, our search ends here.