Managing and Analyzing Global Health Data
Seattle, August 30, 2011
Peter Speyer, Director of Data Development

Managing and Analyzing Health Data (VLDB Conference)


  1. Managing and Analyzing Global Health Data
       • Seattle, August 30, 2011
       • Peter Speyer, Director of Data Development
  2. IHME Background
       • Global institute dedicated to providing independent, rigorous, and scientific measurements and evaluations to accelerate progress on global health
       • Part of the Department of Global Health at the University of Washington
       • Funded by the Bill & Melinda Gates Foundation and the State of Washington (‘core funding’), and by other funders through specific research grants
       • Created in 2007
       • 70 researchers, 30 staff
  3. IHME Mission
       • Our goal is to improve the health of the world’s populations by providing the best information on population health
  4. [No text on this slide]
  5. Health Data
       • Health Data Innovation
       • Patient engagement
       • Open data
       • Health apps
  6. Key Health Data Challenges
       • Find & access data
       • Use data
       • Disseminate data
  7. Key Health Data Challenges
       • Lack of transparency
       • Timeliness of data
       • Lack of documentation
       • Access vs. privacy
  8. Key Health Data Challenges
       • Sheer quantity of data files (30TB, 20K+ source datasets, 40M files)
       • Diverse source data types and formats (pdf, csv, SPSS, CSPro, …)
       • Data quality issues
  9. Key Health Data Challenges
       • Make results data engaging
       • Accountability: share results, code, source data
       • Accommodate diverse audiences (expertise, geographies)
  10. Example: Global Burden of Disease
       • Mortality & causes of death
           • Sources: census, surveys, vital registration, verbal autopsy
           • Estimates: covariate models, spatial-temporal regressions; weighted combination of models
       • Morbidity
           • Sources: literature reviews, surveys, registries, hospital data
           • Disease modeling: compartmental Bayesian model
       • Health severity weights
       • Burden of disease
           • DALYnator
       • 300 diseases; 40 risk factors; 21 regions; 1990, 2005, 2010
  11. GBD Country Years, Causes of Death 1950-2009
  12. GBD Country Years, Causes of Death 1950-2009
  13. Solutions: Computing Infrastructure
       • Analysis with statistical packages
       • Projects with 100K+ lines of code
       • File system
           • 60TB disc space
           • Redundant backup
       • Cluster with 63 nodes (+300% in 2011), ~2000 cores
           • Runs 24x7, very little downtime
       • Virtual environments to test new applications, serve them to collaborators, etc.
  14. Solutions: Global Health Data Exchange
       • Objectives
           • Transparency => data catalog
           • Access => data repository
           • Information => data community (future)
       • Approach
           • One record per dataset
           • Standardized metadata
           • Internal users (10K records): files on file server
           • External users (5K records): files for download
       • Implementation
           • CMS: Drupal
           • Search: SOLR
  15. [No text on this slide]
  16. Thank you!
       • speyer@uw.edu
       • @peterspeyer
       • www.ghdx.org
       • Peter Speyer, Director of Data Development
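
Slide 10 closes with "Burden of disease" and the "DALYnator", which suggests that results are reported in disability-adjusted life years (DALYs). By the conventional definition, a DALY total is the sum of years of life lost (YLL) and years lived with disability (YLD). The sketch below assumes the simple prevalence-based form with no discounting or age weighting. The class name, function names, and all numbers are hypothetical illustrations; none of them come from the talk or from IHME's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CauseInput:
    """Hypothetical inputs for one cause in one population and year."""
    deaths: float                     # deaths attributed to the cause
    remaining_life_expectancy: float  # standard years of life left at age of death
    prevalent_cases: float            # people living with the condition
    disability_weight: float          # severity weight: 0 = full health, 1 = death

    def __post_init__(self):
        # Severity weights are only meaningful on the closed interval [0, 1].
        if not 0.0 <= self.disability_weight <= 1.0:
            raise ValueError("disability_weight must be between 0 and 1")


def years_of_life_lost(c: CauseInput) -> float:
    """YLL: deaths times the standard remaining life expectancy."""
    return c.deaths * c.remaining_life_expectancy


def years_lived_with_disability(c: CauseInput) -> float:
    """YLD (prevalence-based): prevalent cases times the severity weight."""
    return c.prevalent_cases * c.disability_weight


def dalys(c: CauseInput) -> float:
    """DALY = YLL + YLD."""
    return years_of_life_lost(c) + years_lived_with_disability(c)


# Made-up numbers, for illustration only.
example = CauseInput(
    deaths=100,
    remaining_life_expectancy=30.0,
    prevalent_cases=2000,
    disability_weight=0.25,
)
print(dalys(example))  # 3000 YLL + 500 YLD
```

The estimation steps named on the slide (covariate models, spatial-temporal regressions, the compartmental Bayesian model) would produce the inputs; this sketch covers only the final arithmetic.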

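
Slide 14 describes the Global Health Data Exchange as a catalog with one record per dataset, standardized metadata, and different file access for internal and external users. A minimal in-memory sketch of that idea follows. The field names (`title`, `country`, `year`, `file_format`, `public`), the `Catalog` class, and the sample records are all hypothetical. The slide states that the real system is built on Drupal with SOLR search, so this is not a description of its implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """One catalog record per dataset, with a fixed set of metadata fields."""
    record_id: int
    title: str
    country: str
    year: int
    file_format: str
    public: bool  # True: files offered for download; False: internal file server only


class Catalog:
    """Toy catalog: stores records by id and filters them by audience."""

    def __init__(self):
        self._records = {}

    def add(self, record: DatasetRecord) -> None:
        # Enforce "one record per dataset" by rejecting duplicate ids.
        if record.record_id in self._records:
            raise ValueError(f"duplicate record id {record.record_id}")
        self._records[record.record_id] = record

    def search(self, text: str = "", *, internal_user: bool = False):
        """Case-insensitive title search; external users see public records only."""
        needle = text.lower()
        return [
            r for r in self._records.values()
            if (internal_user or r.public) and needle in r.title.lower()
        ]


# Made-up sample records, for illustration only.
catalog = Catalog()
catalog.add(DatasetRecord(1, "Household Health Survey", "Exampleland", 2005, "csv", True))
catalog.add(DatasetRecord(2, "Vital Registration Survey Extract", "Exampleland", 2008, "pdf", False))

external_hits = catalog.search("survey")
internal_hits = catalog.search("survey", internal_user=True)
print(len(external_hits), len(internal_hits))
```

Keeping every dataset in the same record shape is what makes a single search index over heterogeneous sources (pdf, csv, SPSS, CSPro) workable, since the search runs over the metadata rather than the files themselves.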