Perl cures coronary heart disease


Life sciences such as genetics and biology have traditionally benefited from Perl, with excellent projects such as BioPerl leading the way. Unfortunately, in medical research and epidemiology the picture is different: researchers are struggling with the ever-increasing size and complexity of datasets. This presentation briefly describes the situation I faced when I first joined a research team working on coronary heart disease, what I did to make things better, and how I achieved one small victory for Perl.

Published in: Technology


  1. 1. Perl cures coronary heart disease (well, sort of). Spiros Denaxas, @fruit90210. London BioGeeks, 24th Feb. 2011
  2. 2. Talk outline
       • Who am I?
       • Background of two different worlds
       • Life before Perl
       • What I did
       • Life after Perl
       • Please help out!
  3. 3. Hi, I am Spiros.
       • Who are you and what do you want?
           • Computer Science
           • Bioinformatics
           • Nestoria
           • UCL Clinical Epidemiology
  4. 4. Bioinformatics vs. epidemiology
       • Bioinformatics
           • Computer science + molecular biology
           • Extremely large datasets
           • Both hardware and software innovation
           • Established standards (storage, searching, …)
           • Data sharing and collaboration
       • BioPerl project
           • Collection of Perl modules
           • International collaboration
           • Cross-platform
           • Plethora of tools based on it
           • Good documentation
  5. 5. Bioinformatics vs. epidemiology
       • (Clinical) Epidemiology
           • Collect and analyze clinical data on patients
           • Traditionally very expensive
           • Typical study: fewer than 5000 individuals
           • Everything was/is based on paper
       • Times changed:
           • Electronic Health Records (EHR)
           • NHS IT / Connecting For Health (CFH)
       • More data available from multiple sources:
           • GP, Hospitals, ONS, Government
  6. 6. Epidemiology now
       • Ever-increasing size of datasets (6M+)
       • Increasing complexity of structured ontologies
       • Data is a well-kept secret
       • Researchers are struggling
           • Data quality, formatting, specifications (lack of)
       • Focus on analysis, not management
           • Stata / SPSS
           • Text is king
       • Funding for data management (academia vs. “The Real World”)
       • Some common patterns emerged…
  7. 7. Life before Perl
  8. 8. At least he’s happy!
  9. 9. What did I do
       • One small step for man
       • Created the “Medical” namespace
       • Started thinking of “MedPerl”
       • It's all about exposure to non-Perl people (aka normal people)
       • We already have some medical modules:
           • UMLS::Interface
           • Image::ExifTool::DICOM
           • UMLS::Similarity
       • And I took it from there…
  10. 10. NHS numbers
       • NHS deals with 1M patients every 36 hours
       • (new school) NHS number = 10-digit UID
       • Modulus 11 algorithm
       • Of course, 21 different old-school formats
       • Nothing on CPAN?
       • No problem, Medical::NHSNumber
           • is_valid()
  11. 11. International Classification of Diseases (ICD10)
       • Created by the World Health Organization (WHO)
       • A coding of diseases and signs, symptoms, […]
       • Widely used
       • 15,000 terms, essentially an ontology
       • Excel ?%@?(%!?#?@$@!#%@
       • No problem, Medical::ICD10
           • get_term()
           • get_parent_term()
           • get_child_terms()
  12. 12. Standards? What's that?
       • Lack of formalized standards
       • Data is delivered in CSV
       • Documentation in Excel, Word, emails, inline comments
       • Why don’t we use DDI v. 2?
       • MINAP::Describe and Data::DDI
           • Clean, recode, describe
           • Internal module
           • Twiki as output
       • Study registration
           • WebService::ClinicalTrialsdotGov
  13. 14. Introduced Perl
       • People do want to make their lives easier
       • Excellent resource:
       • Introduced Perl to team members
           • Most of them scared away
           • One was happy (yay!)
           • (Also considered Python)
  14. 15. Life after Perl
       • Introduced a new namespace
       • Created several modules (internal and external)
       • Created better data documentation using Perl
       • Promoted standards
       • Introduced Perl to “Normal People”
       • Was it complicated?
       • Was it worth it?
  15. 16. Life after Perl
  16. 17. Please help out!
       • Introduce Perl to your academic group
       • Contribute to the “Medical” namespace
       • Help design and implement “medperl”
       • Use standards
       • Join the UCL Perl Users Group
  17. 18. Thank you.
       • Questions?
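
The "NHS numbers" slide mentions that the new-style 10-digit NHS number is checked with a Modulus 11 algorithm. As a rough illustration of that check, here is a minimal sketch in Python. It is not the code behind Medical::NHSNumber's is_valid(): the function name `is_valid_nhs_number` is hypothetical, and the sketch assumes the standard weighting scheme (first nine digits weighted 10 down to 2, tenth digit as the check digit). It does not attempt to handle any of the old-style formats.

```python
def is_valid_nhs_number(candidate: str) -> bool:
    """Modulus 11 check for a new-style 10-digit NHS number.

    Hypothetical helper for illustration only; it is not the
    Medical::NHSNumber implementation.
    """
    # Numbers are often written in 3-3-4 groups, so ignore spaces.
    digits = candidate.replace(" ", "")
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        return False

    # Weight the first nine digits by 10, 9, ..., 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))

    # The expected check digit is 11 minus the remainder modulo 11.
    check = 11 - (total % 11)
    if check == 11:
        check = 0          # a remainder of 0 maps to check digit 0
    if check == 10:
        return False       # no single digit can represent 10

    return check == int(digits[9])


if __name__ == "__main__":
    print(is_valid_nhs_number("943 476 5919"))
    print(is_valid_nhs_number("943 476 5918"))
```

A check like this catches single-digit typos and many transpositions, which is useful when cleaning CSV extracts before analysis, but passing it says nothing about whether the number has actually been issued to a patient.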