Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Related Books

Free with a 30 day trial from Scribd

See all

The Future of Healthcare with Big Data and AI with Ion Stoica and Frank Nothaft

  1. 1. The Future of Healthcare with Big Data and AI
  2. 2. DATA ENGINEERS DATA SCIENTISTS UNIFIED ANALYTICS PLATFORM
  3. 3. EXPERTISE GAP DOMAIN EXPERT UNIFIED ANALYTICS PLATFORM
  4. 4. DOMAIN EXPERT INDUSTRY-SPECIFIC TOOLS UNIFIED ANALYTICS PLATFORM
  5. 5. Unified Analytics Platform for Genomics
  6. 6. Massive Investments in Genomic Data
  7. 7. Potential to Transform the Industry Faster Drug Discovery Reduced Health Claims Better Patient Outcomes
  8. 8. 40,000 Petabytes / year by 2025 Genomic Data Volumes are Exploding From $2.7B to <$1,000
  9. 9. Sequencing Data / $ 2018
  10. 10. Sequencing Data Processed / $ 2018 Sequencing Data / $
  11. 11. Sequencing Data Processed / $ 2018 Cost of processing all DNA sequenced in a year is growing exponentially year-over-year! Sequencing Data / $
  12. 12. Challenge #1: Complex Pipelines Complex Genomic Pipelines Costly and time consuming Annotation Alignment Variant Calling Quality Control BWA Analysis Raw Data
  13. 13. Challenge #2: Rigid Analytics Complex Genomic Pipelines Costly and time consuming Annotation Alignment Variant Calling Quality Control BWA Analysis Raw Data Rigid Analytics Reduced Scope of Research
  14. 14. Challenge #3: Siloed Teams Complex Genomic Pipelines Costly and time consuming Annotation Alignment Variant Calling Quality Control BWA Analysis Raw Data Rigid Analytics Reduced Scope of Research Siloed Teams Lack of Productivity Researchers and Clinicians Bioinformatics Teams Computational Biologists
  15. 15. Solution #1: Prebuilt Pipelines Complex Genomic Pipelines Costly and time consuming Annotation Alignment Variant Calling Quality Control BWA Analysis Raw Data Rigid Analytics Reduced Scope of Research Siloed Teams Lack of Productivity Researchers and Clinicians Bioinformatics Teams Computational Biologists Raw Data Analyses Raw Data Raw Data Packaged Workflows and Tools powered by Databricks Runtime “One click” execution Best Practice Pipelines
  16. 16. Solution #1: Prebuilt Pipelines Complex Genomic Pipeline Costly and time consuming Annotation Alignment Variant Calling Quality Control BWA Analysis Raw Data Rigid Analytics Reduced Scope of Research Siloed Teams Lack of Productivity Researchers and Clinicians Bioinformatics Teams Computational Biologists Raw Data Analyses Raw Data Raw Data Packaged Workflows and Tools powered by Databricks Runtime “One click” execution Best Practice Pipelines 30x Coverage Whole Genome (GVCF) 0:30:00 1:00:00 1:30:00 2:00:00 Processing Time 3.8x faster than industry leader Edico 2:29:23 0:39:23
  17. 17. Solution #2: Powerful Analytics Rigid Analytics Reduced Scope of Research Siloed Teams Lack of Productivity Researchers and Clinicians Bioinformatics Teams Computational Biologists From interactive queries to AI Powerful Analytics Raw Data Analyses Raw Data Raw Data Packaged Workflows and Tools powered by Databricks Runtime “One click” execution Best Practice Pipelines
  18. 18. Solution #2: Powerful Analytics Rigid Analytics Reduced Scope of Research Siloed Teams Lack of Productivity Researchers and Clinicians Bioinformatics Teams Computational Biologists From interactive queries to AI Powerful Analytics Raw Data Analyses Raw Data Raw Data Packaged Workflows and Tools powered by Databricks Runtime “One click” execution Best Practice Pipelines “Having the data is the first step, enabling drug development teams to answer questions with the data is how we are building the future of drug discovery.” Dr. Jeff Reid, Exec Dir at Regeneron “Queries on 60B+ genome associations in 3 seconds vs. 30 minutes”
  19. 19. Siloed Teams Lack of Productivity Researchers and Clinicians Bioinformatics Teams Computational Biologists Solution #3: Collaborative Workspaces Raw Data Analyses Raw Data Raw Data Packaged Workflows and Tools powered by Databricks Runtime “One click” execution Best Practice Pipelines From interactive queries to AI Powerful Analytics Lack of ProductivityDramatically Improve Productivity Collaborative Workspaces Researchers and Clinicians Bioinformatics Teams Computational Biologists
  20. 20. Siloed Teams Lack of Productivity Researchers and Clinicians Bioinformatics Teams Computational Biologists Solution #3: Collaborative Workspaces Raw Data Analyses Raw Data Raw Data Packaged Workflows and Tools powered by Databricks Runtime “One click” execution Best Practice Pipelines From interactive queries to AI Powerful Analytics Lack of ProductivityDramatically Improve Productivity Collaborative Workspaces Researchers and Clinicians Bioinformatics Teams Computational Biologists “Databricks allows us to take clinical research and turn it into a clinically validated screen in far less time.” Sr. Director of Computational Bioinformatics, Lynn Carmichael
  21. 21. Unified Analytics Platform for Genomics All Your Genomic Data Visualizations Machine Learning Best Practice Pipelines Tertiary Analytics and AI at Scale Collaborative Workspaces Genomic Analytics (e.g. GWAS, eQTL)
  22. 22. Unified Analytics Platform for Genomics All Your Genomic Data Visualizations Machine Learning Best Practice Pipelines Tertiary Analytics and AI at Scale Collaborative Workspaces Genomic Analytics (e.g. GWAS, eQTL) Genomics-specific optimizations increase performance by up to 100x
  23. 23. Sign-up for the preview databricks.com/genomics
  24. 24. Accelerate Discovery Accelerate Discovery
  25. 25. Demo: Preventing Disease with Genomics at Scale
  26. 26. Typical patient intake and treatment Identify Diagnose Treat
  27. 27. Typical patient intake and treatment Identify Diagnose Treat ...but this is very reactive and costly.
  28. 28. Typical patient intake and treatment Identify Diagnose Treat ...but this is very reactive and costly. By the age of 15, over 30% of Europeans will develop a chronic disease
  29. 29. Let’s shift our thinking What if we could identify an individual’s risk for developing a disease and prevent that disease before it ever occurs?
  30. 30. The preventative care process Predict Prevent
  31. 31. The preventative care process Predict Prevent Accelerated treatment improves outcomes
  32. 32. The preventative care process Predict Prevent Huge opportunity for genomics Accelerated treatment improves outcomes
  33. 33. But genomic analysis is really hard Population Scale Data Arrives (e.g. Biobank) Process for Analysis Export Model and Apply to Individual Generate Dashboard for Clinician
  34. 34. Let’s try this with the Databricks Unified Analytics Platform for Genomics...
  35. 35. Sign-up for the preview databricks.com/genomics

×