Big Data in Health: The Global Burden of Disease Study
Upcoming SlideShare
Loading in...5

Big Data in Health: The Global Burden of Disease Study






Total Views
Views on SlideShare
Embed Views



2 Embeds 133 129 4


Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
Post Comment
Edit your comment

Big Data in Health: The Global Burden of Disease Study Big Data in Health: The Global Burden of Disease Study Presentation Transcript

  • 1 Big Data in Health: The Global Burden of Disease Study Peter Speyer @peterspeyer March 29, 2014
  • Big data in health 2 •Surveys •Censuses •Disease registries •Vital registration •Verbal autopsy •Mortuaries/ burial sites •Police records Variety Volume Velocity •Hospital/ ambulatory/ primary care records •Claims data •Surveillance systems •Administrative data •Literature reviews •Sensor data •Social media •Quantified self
  • IHME input data cataloged in the GHDx 3
  • From data to impact 4 Big data Audiences 1. Data access 2. Data preparation 3. Data analysis 4. Data translation Getting relevant data in useful formats to the right audiences
  • Example: Global Burden of Disease Study • A systematic, scientific effort to quantify the comparative magnitude of health loss due to diseases, injuries, and risk factors • GBD 2010 results published in The Lancet in 2012 – 291 causes, 67 risk factors – 187 countries – 1990-2010 – By age and sex • GBD 2013 update in process 5
  • 1. Accessing the data • Systematic identification of all relevant data sources – Data indexers – Lit reviews • Challenges – Data on paper, PDF, proprietary & obsolete formats – Patient/participant consent – Confidentiality/de-identification – Cost 6 Tristan Schmurr / Flickr CC Chapman / Flickr
  • 2. Preparing data for analysis • Data extraction (databases, tables, papers) • Analysis of microdata • Correction for bias • Data quality issues, e.g., garbage codes • Cross-walks, e.g., between ICDs 7
  • Demo: Tobacco Viz 8
  • 3. Analyzing data • Use all available data – Use covariates: indicators related to quantity of interest • Test the modeling approach, e.g., predictive validity testing (CODEm) • Apply appropriate corrections, e.g., causes of death to match all-cause mortality • Quantify uncertainty • Review: 1000+ experts, peer-reviewed publication
  • 4. Data translation 10 • Academic papers • Policy reports • Data search engine • Data visualizations – Input data – Comprehensive results – Key insights
  • Visualization demos 11
  • Parting thoughts • Grab relevant data and get started – Prep data with Data Wrangler, MS Power BI – Visualize with Tableau Public, Google Fusion – Learn to code: R, Python (analysis), JavaScript (viz) • Check out IHME data sources and GHDx Find resources & contact me at @peterspeyer 12