Appistry WGDAS Presentation


Presentation given by Appistry's Vice President of Product Strategy, Sultan Meghji, at the World Genome Data Analysis Summit. Meghji discussed the big data challenges labs face as they strive to manage the flow of genetic data from the sequencer to the clinic.

  • Voice over points: Next Gen Sequencing (NGS) has provided a powerful new tool for the investigation of genomic information. The growing number of sequencers on the market is generating a huge demand for tools for data analysis. There is a lack of fast, affordable, easy-to-use, and comprehensive bioinformatics tools in the market. Another quote (backup): “Data handling is now the bottleneck. It costs more to analyze a genome than to sequence a genome.” - David Haussler, Director of the Center for Biomolecular Science & Engineering at the University of California, Santa Cruz, in NYT
  • Seen years ago in finance, logistics, geospatial & defense areas. 5 universal issues when dealing with “big data”: storage; computation; network bandwidth (movement of the data); operational complexity; complex programming tasks.

    1. 1. From Sequencer to Clinic: Managing Science and Scale. Sultan Meghji, Vice President of Product Strategy. World Genome Data Analysis Summit, November 28, 2012
    2. 2. Challenges Along the Path from Genomics Research to Personalized Medicine: implementing technology; implementing science; scaling from research to clinic. The Problem Restated: What’s the most efficient, reliable and robust way to capture my genetic data, analyze it and secure it for re-analysis and deeper interpretation in a clinical setting? Enabling Science at Scale: platform for big data; analytics framework for implementing science; flexible deployment. AGENDA
    3. 3. Target: Clinicians and Patients, leveraging a dynamically expanding field of science. Mega-scale data management and data analysis. Complex Pipeline Development, Test & Deployment. Infrastructure costs, complexity, security and compliance. Accelerating the Science of Genetic Discovery for Researchers, Bioinformatics Specialists & Tool Development. Government Funding. 3rd Party Payers. CUSTOMER NEEDS
    4. 4. “We can sequence the genome for dirt cheap, but we don’t know how to deal with the data.” - Eric Green, M.D., Ph.D., Director, NHGRI. “How do we avoid the pitfall of having cheap human genome sequencing but complex and expensive manual analysis to make clinical sense out of the data?” - Elaine Mardis, Ph.D., Director of Technology Development. Source: WSJ, NYT, Genome Medicine. THE GENOMICS DATA PROBLEM
    5. 5. “Big Data” is essentially large amounts of data: multiple sources or data formats; unstructured or semi-structured; difficult to put into databases and analyze. Seen in other industry areas: Telecom. THE BIG DATA CHALLENGE IN GENOMICS
    6. 6. STORAGE: “Moving data around and storing the data is painful. It’s a huge problem for us. We’re looking at the cloud for processing options.” - Carol Rohl, Ph.D., Director of Merck, Research Labs. COMPUTATION: “Datasets are so large, you have to analyze them at the same site where the data is or using mirrors. You do not want to be writing it onto a remote hard drive and move the data each time you want to analyze it.” - John Monahan, Novartis Institutes for Biomedical Research. APPLICATIONS: “Bioinformatics tools and reference datasets change monthly, weekly and in some cases daily. This requires easy to manage application and data management platforms to keep up to date with all the changes.” - Sultan Meghji, Appistry, GigaOM 2012. Source: Appistry proprietary market research by CBT Advisors. THE BIG DATA CHALLENGE IN GENOMICS
    10. 10. Capabilities needed: Automated Data Management and Storage Tightly Coupled to Analysis; Private Cloud Genomics Services (HIPAA Compliance); Industry Tools, Data Sets and YOUR Science; Massively Scalable/Reliable Fabric for Algorithms, Tools and Applications; Analytics Layer Simplifies the Build, Test and Deployment of Analytic Pipelines. APPISTRY’S GENOMIC SOLUTION
    11. 11. [Diagram: a ten-step manual workflow linking Data from Sequencers, Storage, Open-Source Algorithms, Public Gene Databases, the Public Cloud and the User, with data moved by download, FTP or FedEx; the individual step labels are not recoverable from the transcript.] Source: Appistry survey. AYRRIS PRODUCT
    12. 12. [Diagram labels:] Data from Sequencers (ATCGTA TCGGCA CTAATC GCTCGG CTATAG); SFTP Transfer; Appistry Courier Over HTTPS; HIPAA Compliant Genomics Cloud; Appistry Private Cloud; Ayrris Pipelines; Your Science; Annotated Results & Visualizations (SNPs, Indels, Rare Variants, etc); Appistry Courier Over HTTPS; Consumption of Results by internal Bioinformaticians and Clinicians; Data Center and Researcher. CLOUD WORKFLOW
    13. 13. APPISTRY CLOUD: cloud-based genomic data analysis and storage; subscription to Appistry’s secure, HIPAA compliant cloud storage. APPISTRY APPLIANCE: on-site modular turn-key hardware and software; enterprise-level implementation of private network HIPAA-enabled storage. INSTITUTION via INTERNET. Both: same access to pipeline analysis algorithms & annotations (Same Science); same underlying technology and efficiency. BUSINESS MODEL
    14. 14. [Diagram labels:] Data from Sequencers (ATCGTA TCGGCA CTAATC GCTCGG CTATAG); Data from other instruments; Regulatory Compliant Genomics Cloud; Appistry Private Cloud; Ayrris Pipelines; Your Science; Annotated Results & Visualizations; Integrated with Research Data; Integrated with Medical Data; Secured, Integrated Workflows, Data Management and Analysis Systems (Genomics, EMR, Biller/Payer, Pharma). CLOUD WORKFLOW
    16. 16. Genomic Information: Decisions for prevention or early treatment. Breast cancer; Osteoporosis; Lung cancer; Heart disease; Autism; Leukemia; ADHD; Genetic disorders
    17. 17. Thanks for Your Attention. main: 314.450.5720, fax: 314.450.5722, sultan@appistry.com, appistry.com, 1141 South 7th St., Suite 300, St. Louis, MO 63141