Quantitative Medicine Feb 2009


Published in: Business, Health & Medicine

  • Roberto Barrera: impressive set of contributions. John Delaney: persuaded you that you should all start working on ocean observation.
    I have worked for many years with some incredible people in the physical sciences, working to understand some fascinating phenomena, such as the nature of mass and the causes and likely effects of climate change. These people have been the early leaders in developing and applying grid technology, via such projects as the LHC Computing Grid and the Earth System Grid.
    Jonathan:
    1) Evidence-based medicine is the careful evaluation of the results (called outcomes) of different diagnostic or therapeutic procedures to determine the best choice for a population of patients. A physician with this information at his/her disposal (getting it there is a complicated problem in itself) can then make the best decision for the individual patient by comparing the characteristics of the studied patients with those of his/her own patient (N of 1), and can make recommendations (patients, not doctors, make choices).
    2) Personalized medicine (in a nutshell; purists might disagree) uses much finer distinguishing characteristics to do the same thing, such as specific genomic studies that ensure that the N-of-1 patient is precisely matched with the same sub-sub-sub-population of patients.
    Thus the distinction blurs at some level: once we have enough examples of personalized medicine, it becomes the evidence-based medicine of the future. For now, all we have is evidence-based medicine, a much blunter instrument with the same goal.
  • Quantitative Medicine Feb 2009

    1. 1. Quantitative medicine: A “killer app” for grid. Ian Foster, Computation Institute, Argonne National Lab & University of Chicago
    2. 2. Thanks in particular to … Carl Kesselman, Stephan Erberich, Steve Tuecke, Ravi Madduri, Jonathan Silverstein
    3. 3. Quantitative medicine is the key to reducing healthcare costs and improving healthcare outcomes. [Figure: patients with same diagnosis.]
    4. 4. Quantitative medicine is the key to reducing healthcare costs and improving healthcare outcomes. [Figure: patients with same diagnosis, divided into: misdiagnosed; non-responders and toxic responders; non-toxic responders.]
    5. 5. Major drugs ineffective for many…
        • Asthma drugs (beta-2-agonists): 40-70%
        • Hypertension drugs (ACE inhibitors): 10-30%
        • Heart failure drugs (beta blockers): 15-25%
        • Antidepressants (SSRIs): 20-50%
        • Cholesterol drugs (statins): 30-70%
        Source: Amy Miller, Personalized Medicine Coalition
    6. 6. Same clinical disease, but different response to the same chemotherapy, depending on gene expression profile. [Figure: colorectal cancer clinical trial data; axes: Patient ID Number, Danenberg Tumor Profile Scale.] Source: Salonga et al. Clin Cancer Res 2000; 6: 1322-1327.
    7. 8. Personalized medicine is quantitative: the right treatment for the right person at the right time. [Figure labels: Trial and Error; Personalized Medicine; Current Practice; One size fits all; Trial and error.] Source: Amy Miller, Personalized Medicine Coalition
    8. 9. To realize the promise of quantitative medicine, we must break down barriers to information sharing … and deliver new analytical tools to make sense of large quantities of data. [Figure labels: Discovering effective personalized treatments; Determining the right treatment for the individual.]
    9. 10. Why is it hard?
        • A large, dispersed community
        • Huge quantities of data
        • Great diversity of data
        • Inadequate computing capabilities
        • Lack of a culture of sharing
        • Privacy concerns
        [Diagram labels: Basic Research; Clinical Practice; Clinical Trials; trial subjects, outcomes; library; outcomes, tissue bank; screening tests; ongoing investigative studies; pathways.]
    10. 11. Healthcare and infrastructure
        • Increased recognition that information systems and data understanding are the limiting factor
            • … much of the promise associated with health IT requires high levels of adoption … and high levels of use of interoperable systems (in which information can be exchanged across unrelated systems) …. (RAND COMPARE)
        • The health system is a complex, adaptive system
            • There is no single point(s) of control. System behaviors are often unpredictable and uncontrollable, and no one is “in charge.” (W. Rouse, NAE Bridge)
        • Need to blur the boundary from research to clinical
            • … I advocate … a model of virtual integration rather than true vertical integration…. (George Halvorson, CEO, Kaiser)
    11. 12. Virtual organizations Grids and SOA
    12. 13. Children’s Oncology Group Grid Globus
    13. 14. Children’s Oncology Grid clinical imaging trials (Erberich)
    14. 15. Wide-area medical interface service
        • Maps local medical workflow actions to wide-area operations
            • Image workflow, EHR, …
        • Transparently manages federation of
            • Security
            • Data replication and recovery
            • Data discovery
        [Diagram labels: Enterprise/Grid Interface Service; DICOM protocols; Grid protocols (Web services); plug-in adapters: DICOM, XDS, HL7, vendor-specific; Wide Area Service Actor.]
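The plug-in adapter idea on this slide can be illustrated with a small dispatch table. This is only a sketch under assumptions: the class `InterfaceService`, the action names `store` and `query`, and the functions `replicate` and `discover` are hypothetical and do not come from the slides or from any real DICOM, HL7, or grid library.

```python
from typing import Callable, Dict

# Hypothetical wide-area operations; a real service would call grid protocols here.
def replicate(payload: str) -> str:
    return f"replicated:{payload}"

def discover(payload: str) -> str:
    return f"discovered:{payload}"

class InterfaceService:
    """Maps (local protocol, local action) pairs to wide-area operations."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Dict[str, Callable[[str], str]]] = {}

    def register(self, protocol: str, action: str,
                 operation: Callable[[str], str]) -> None:
        # Each protocol (e.g. DICOM, HL7) plugs in its own action table.
        self._adapters.setdefault(protocol, {})[action] = operation

    def handle(self, protocol: str, action: str, payload: str) -> str:
        try:
            operation = self._adapters[protocol][action]
        except KeyError:
            raise ValueError(f"no adapter for {protocol}/{action}") from None
        return operation(payload)

service = InterfaceService()
service.register("DICOM", "store", replicate)  # hypothetical mapping
service.register("HL7", "query", discover)     # hypothetical mapping
result = service.handle("DICOM", "store", "image-001")
print(result)
```

The local system keeps speaking its own protocol, and only the registered adapter knows which wide-area operation an action corresponds to.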
    15. 17. US National Institutes of Health infrastructure activities
        • Biomedical Informatics Research Network (BIRN)
            • National Center for Research Resources (NCRR)
            • General infrastructure, with initial focus on neuroscience applications
        • cancer Biomedical Informatics Grid (caBIG)
            • National Cancer Institute (NCI)
            • Initial focus on the cancer research community; BIGhealth initiative seeks to broaden it
        [Logo: Globus.]
    16. 19. Service-oriented medicine: caGrid, Introduce, and gRAVI
        • Introduce
            • Define service
            • Create skeleton
            • Discover types
            • Add operations
            • Configure security
        • Grid Remote Application Virtualization Infrastructure (gRAVI)
            • Wrap executables
        [Diagram labels: Index service; Repository Service; Introduce; Container; Appln Service; Create; Store; Advertise; Discover; Invoke, get results; Transfer GAR; Deploy. Credits: Ohio State University and Argonne/U.Chicago. Logo: Globus.]
    17. 20. As of Oct 19, 2008: 122 participants; 105 services (70 data, 35 analytical)
    18. 21. Microarray clustering using Taverna
        • Query and retrieve microarray data from a caArray data service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub
        • Normalize microarray data using GenePattern analytical service: node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService
        • Hierarchical clustering using geWorkbench analytical service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage
        [Diagram legend: workflow in/output; caGrid services; “Shim” services; others. Credit: Wei Tan.]
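The three workflow steps on this slide (retrieve, normalize, cluster) can be sketched locally as a pipeline. This is a minimal sketch under assumptions, not the caGrid or Taverna implementation: the data are hypothetical, `fetch_microarray` stands in for the caArray query, z-scoring is a swapped-in normalization, and average linkage with Euclidean distance is one of several possible hierarchical clustering choices.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

def fetch_microarray():
    """Stand-in for the data-service query: hypothetical gene -> expression values."""
    return {
        "geneA": [1.0, 2.0, 3.0],
        "geneB": [10.0, 20.0, 30.0],
        "geneC": [3.0, 2.0, 1.0],
    }

def normalize(data):
    """Stand-in for the normalization service: z-score each gene's profile."""
    out = {}
    for gene, values in data.items():
        mu, sigma = mean(values), pstdev(values)
        out[gene] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    return out

def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hierarchical_clustering(data):
    """Average-linkage agglomerative clustering; returns the merges in order."""
    clusters = [(name,) for name in data]
    merges = []
    while len(clusters) > 1:
        def linkage(pair):
            left, right = pair
            return mean(distance(data[g], data[h]) for g in left for h in right)
        # Merge the two clusters with the smallest average pairwise distance.
        left, right = min(combinations(clusters, 2), key=linkage)
        clusters.remove(left)
        clusters.remove(right)
        clusters.append(left + right)
        merges.append((left, right))
    return merges

# Chain the three steps, as the workflow does with remote services.
merges = hierarchical_clustering(normalize(fetch_microarray()))
for left, right in merges:
    print(left, "+", right)
```

In the workflow on the slide each step is a remote service call; here each step is a local function so the data flow between steps is easy to follow.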
    19. 22. Outsourcing analysis: caBIG’s geWorkbench/TeraGrid interface R. Madduri, U.Chicago, Taverna team
    20. 23. Schizophrenia as a neuropsychiatric model (Potkin, UCI)
        • A brain illness with subtle structural and functional changes
        • Active area of imaging research with many competing theories and approaches
        • Progress hampered by
            • Inconsistent data & lack of replications
            • Noncomparable imaging techniques
            • Small, diverse patient populations
    21. 24. Functional BIRN (fBIRN) information integration vision. [Diagram labels: Multi-Site User Query; Data Provenance Information; Derived data processing; FIPS; Results; FMRI/MRI Images; Processing Pipelines; HIDB(s); (Distributed) Data Grid; Clinical Data Input; DICOM, NIFTI; fMRI Scanner.]
    22. 25. fBIRN multi-site study, 2006. [Map of sites: UNM, UMN, UI, UCI, BWH, MGH, UCLA, UCSD, Stanford, Duke/UNC, Yale. Legend: 3 or 4T site; 1.5T site; development site.]
    23. 26. Lessons learned from BIRN (G. Farber)
        • There is little point in sharing data unless there is community agreement on how to standardize data collection
        • There continues to be a communications/ease-of-use gap between computer scientists and biomedical researchers
        • Sharing heterogeneous data from biomedical experiments is a challenge to existing data sharing infrastructures
        • Complex queries are a really hard problem
    24. 27. Health informatics services model. [Diagram labels: Analysis; Management; Integration; Publication; Policy and Security; Decision Support; Radiology; Medical Records; Labs; Pathology; Genomics; Applications.] Source: Carl Kesselman
    25. 28. Decision support for HIV drug ranking (Peter Sloot et al.)
    26. 29. ViroLab drug ranking decision support. [Diagram labels: Clinical parameters: weight, opportunistic infections and tumors, survival; Molecular Dynamics Binding Affinity; Protein Structure & Binding Affinity; Text Mining; Drug ranking; 1st-order logic; Complex Networks Epidemics; Agent-Based Entry Simulation; Phenotype; CA-Based Immune Response; Protease and RT mutations.]
    27. 30. ViroLab: DSS Virtual Laboratory. [Diagram labels: Users: experiment developer, scientist, clinical virologist. Interfaces: Experiment Planning Environment, experiment scenario, ViroLab Portal, Drug Ranking Scenario. Runtime: Virtual Laboratory runtime components (required to select resources and execute experiment scenarios). Services: computational services (WS, WSRF, components, jobs), data services (DAS data sources, standalone databases). Infrastructure: Grids (EGEE), clusters, computers, network.]
    28. 31. Many many tasks: Identifying potential drug targets. 2M+ ligands x protein target(s). (Mike Kubal, Benoit Roux, and others)
    29. 32. [Workflow diagram, start to report/end; recoverable labels:]
        • Inputs: PDB protein descriptions, 1 protein (1MB); ZINC 3-D structures, 2M structures (6 GB) of ligands
        • Manually prep DOCK6 rec file and FRED rec file; DOCK6 receptor and FRED receptor (1 per protein: defines pocket to bind to)
        • DOCK6, FRED: ~4M x 60s x 1 cpu, ~60K cpu-hrs; select best ~5K from each
        • Amber: ~10K x 20m x 1 cpu, ~3K cpu-hrs; select best ~500
            • Amber prep: 2. AmberizeReceptor; 4. perl: gen nabscript
            • Amber score: 1. AmberizeLigand; 3. AmberizeComplex; 5. RunNABScript
            • BuildNABScript: NAB script template plus NAB script parameters (defines flexible residues, #MDsteps) gives the NAB script
        • GCMC: ~500 x 10hr x 100 cpu, ~500K cpu-hrs
        • Outputs: complexes, report
        • For 1 target: 4 million tasks, 500,000 cpu-hrs (50 cpu-years)
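The per-stage costs on this slide follow from tasks x time per task x CPUs per task. The sketch below redoes that arithmetic from the slide's own figures; the stage grouping and the 8,760-hour year are assumptions made for illustration, and the slide's quoted totals appear to be rounded.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 365-day year

# (stage, tasks, seconds per task, CPUs per task, cpu-hrs quoted on the slide)
stages = [
    ("DOCK6/FRED", 4_000_000, 60, 1, 60_000),
    ("Amber", 10_000, 20 * 60, 1, 3_000),
    ("GCMC", 500, 10 * 3600, 100, 500_000),
]

def cpu_hours(tasks, seconds_per_task, cpus_per_task):
    """CPU-hours consumed by a stage."""
    return tasks * seconds_per_task * cpus_per_task / 3600

total = 0.0
for name, tasks, secs, cpus, quoted in stages:
    hours = cpu_hours(tasks, secs, cpus)
    total += hours
    print(f"{name}: computed {hours:,.0f} cpu-hrs, slide quotes ~{quoted:,}")

total_years = total / HOURS_PER_YEAR
print(f"total: {total:,.0f} cpu-hrs, about {total_years:.0f} cpu-years")
```

The GCMC stage dominates: it handles the fewest candidates but each task runs for hours on 100 CPUs, which is why the earlier, cheaper stages are used to filter the ligand set first.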
    30. 33. DOCK on BG/P: ~1M tasks on 118,000 CPUs
        • CPU cores: 118784
        • Tasks: 934803
        • Elapsed time: 7257 sec
        • Compute time: 21.43 CPU years
        • Average task time: 667 sec
        • Relative efficiency: 99.7% (from 16 to 32 racks)
        • Utilization:
            • Sustained: 99.6%
            • Overall: 78.3%
        • GPFS
            • 1 script (~5KB)
            • 2 file reads (~10KB)
            • 1 file write (~10KB)
        • RAM (cached from GPFS on first task per node)
            • 1 binary (~7MB)
            • Static input data (~45MB)
        [Plot axis: Time (secs).] Credits: Ioan Raicu, Zhao Zhang, Mike Wilde
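The overall utilization figure can be roughly cross-checked from the other numbers on this slide. The sketch assumes that overall utilization means compute time divided by (cores x elapsed time) and that a CPU year has 365 days; neither definition is stated on the slide.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

# Figures taken from the slide.
cores = 118_784
tasks = 934_803
elapsed_s = 7_257
compute_cpu_years = 21.43

# Core-seconds available during the run versus core-seconds spent computing.
available_core_s = cores * elapsed_s
used_core_s = compute_cpu_years * SECONDS_PER_YEAR

overall_utilization = used_core_s / available_core_s
tasks_per_second = tasks / elapsed_s

print(f"overall utilization: {overall_utilization:.1%}")
print(f"throughput: {tasks_per_second:.0f} tasks/sec")
```

Under these assumptions the result lands close to the 78.3% quoted on the slide; the small difference is consistent with rounding in the reported compute time.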
    31. 34. NAE Grand Challenges
    32. 35. Thank you! Computation Institute www.ci.uchicago.edu www.ci.anl.gov