AWS Customer Presentation- Pathwork Diagnostics


Ljubomir Buturovic, Ph.D., Sr. Director and Chief Scientist at Pathwork Diagnostics, talks about using AWS to power the company's molecular diagnostics for oncology.

Published in: Technology, Business

  1. Cloud Computing for Cancer Diagnostics. AWS Startup Tour. Ljubomir Buturovic, Chief Scientist, Pathwork Diagnostics, Inc. June 16, 2009
  2. Pathwork Diagnostics: Genomic Tests for Cancer
     • Pathwork Diagnostics, Inc.'s first product aids oncologists in the diagnosis of hard-to-identify cancer tumors
     • The test contains an analytical component (model) which converts genomic data into an actionable diagnostic report
       • Genomic data: gene expression, measured by Affymetrix GeneChip
     • The optimal model is chosen by analyzing large genomic libraries of tumor specimens
     • The analyses require massive amounts of computational power at certain peak times
  3. Tissue of Origin Test
     Workflow diagram: Tissue specimen → RNA extraction → cDNA synthesis → cRNA labeling / fragmentation → Hybridization to Pathchip microarray → Pathwork Analytics → Report
  4. Tissue of Origin Test
     • Measures the gene expression of >1,500 genes
     • Panel covers 58 different morphologies, consolidated into 15 tissue types
     • ~90% of all solid tumors
     • Accurate and robust: 89% PPA (sensitivity), >99% NPA (specificity)*
     • Reports on which tissues are present (rule-in) and which are not present (rule-out)
     *In 352 formalin-fixed, paraffin-embedded (FFPE) specimens, the test demonstrated 89% positive percent agreement (akin to sensitivity) with available diagnoses, and greater than 99% negative percent agreement (akin to specificity) in specimens that had previously been identified with existing methods as being among the 15 tumor types on the panel.
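As a rough illustration of the two agreement figures quoted on this slide, the sketch below computes positive and negative percent agreement from a 2x2 table of test calls against the available diagnoses. The function name and the counts in the usage lines are hypothetical and chosen only to make the arithmetic visible; they are not the study's data, and the slide does not say how Pathwork tallied per-tissue calls.

```python
def percent_agreement(tp, fn, tn, fp):
    """Return (PPA, NPA) as fractions from a 2x2 agreement table.

    tp: test and reference diagnosis both positive for a tissue
    fn: reference positive, test negative
    tn: test and reference both negative
    fp: reference negative, test positive
    """
    ppa = tp / (tp + fn)  # positive percent agreement, akin to sensitivity
    npa = tn / (tn + fp)  # negative percent agreement, akin to specificity
    return ppa, npa


# Illustrative counts only (assumed, not taken from the presentation).
ppa, npa = percent_agreement(tp=89, fn=11, tn=990, fp=10)
print(f"PPA = {ppa:.0%}, NPA = {npa:.0%}")
```

The term "agreement" rather than "sensitivity" reflects that the comparison is against available diagnoses, not against a verified ground truth.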
  5. Distributed Machine Learning System
     • Pathwork developed a proprietary distributed machine learning software system to develop the informatics component of its cancer diagnostic tests
     • First deployed on an in-house cluster
       • Uses the open-source job scheduler SGE (Sun Grid Engine)
     • Business need: rapidly expand computing capacity on demand, minimize cost
     • The requirement is triggered by the business cycle and research needs
  6. Application: Key Properties
     • Modest I/O
     • Large amount of computing per unit of data
     • Easily parallelizable calculations
       • In many cases
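The three properties above describe what is often called an embarrassingly parallel workload: each task takes a small input, does a lot of computation, and does not depend on any other task. A minimal sketch of that shape follows. The function names and the toy computation are assumptions made for illustration; the slides do not describe Pathwork's actual model evaluations, and their system dispatches work through SGE rather than through a Python map call.

```python
import random


def evaluate_candidate(seed, n_trials=20000):
    """Stand-in for one expensive, independent evaluation.

    Input is tiny (a seed), the work is CPU-bound, and nothing is shared
    with other tasks, so tasks can run in any order on any node.
    """
    rng = random.Random(seed)
    hits = sum(rng.random() < 0.5 for _ in range(n_trials))
    return seed, hits / n_trials


def evaluate_all(seeds, map_fn=map):
    """Apply evaluate_candidate to every seed using the supplied map function.

    The default is the serial built-in map; any map-like callable with the
    same (function, iterable) signature can be passed instead.
    """
    return dict(map_fn(evaluate_candidate, seeds))


scores = evaluate_all(range(4))
print(sorted(scores))
```

Because the tasks are independent, `map_fn` can be replaced by the `map` method of a `concurrent.futures.ProcessPoolExecutor` to spread the same work over local cores without changing `evaluate_candidate`.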
  7. Choosing AWS
     • Solution: port the existing system, with minimal changes, to a cloud computing platform
     • Started feasibility project in July 2008
     • Chose Amazon EC2 as the most flexible service for Pathwork's requirements
     • Integration of SGE required the use of the Univa UD UniCloud solution
       • Enables the incorporation of EC2 instances as SGE compute nodes
  8. Status
     • System operational since Q4 2008
       • About six person-months from feasibility to functional system
     • Main challenge: integration of EC2 nodes within SGE
     • Presently used for research projects (analysis of complex genomic models)
     • Plan to use in product development in Q3 2009
  9. AWS Experience
     • Max. deployment: provisioned 250 large (four-core) instances for about two days
     • Main cost: EC2 instances; all other usage negligible
       • EBS key to system architecture
     • Data transfer might be an issue
     • About 1-2% of requested instances fail to boot
     • Service denied twice (150 large instances)
       • Move to a different zone solved the issue
     • No memory usage monitor
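The figures on this slide lend themselves to a quick back-of-the-envelope calculation. The sketch below estimates the core-hours in the largest deployment and how many instances one might request to end up with a target number of booted nodes, given the reported boot-failure rate. It assumes "about two days" means 48 hours and treats the failure rate as an average; the function names are hypothetical and pricing is deliberately left out because the slides give no cost figures.

```python
import math


def core_hours(instances, cores_per_instance, hours):
    """Total core-hours consumed by a fleet running for a fixed duration."""
    return instances * cores_per_instance * hours


def instances_to_request(needed, boot_failure_rate):
    """Instances to request so that, on average, `needed` of them boot."""
    return math.ceil(needed / (1.0 - boot_failure_rate))


# 250 four-core instances for roughly two days (assumed to be 48 hours).
print(core_hours(250, 4, 48))            # 48000

# Over-provisioning for the reported 1-2% boot-failure range.
print(instances_to_request(250, 0.01))   # 253
print(instances_to_request(250, 0.02))   # 256
```

Requesting a handful of extra instances is one way to absorb the boot failures; retrying failed launches would be another, and the slides do not say which approach was used.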
  10. Business Benefits for Pathwork
     • Enables product improvements
       • Accurate cancer diagnosis
     • Expanded scope of research, maintaining a technological cutting edge
     • No need to plan/purchase/maintain large computing infrastructure
     • Improved project planning/product delivery/budgeting due to elasticity in computing power