
Mark Kaganovich, SolveBio // Data Infrastructure for Genomics


Mark Kaganovich, Co-Founder and CEO of SolveBio, presented at the February 2015 edition of Data Driven NYC. SolveBio provides programmatic access to critical data for genomics applications.

Published in: Technology

  1. Precision Medicine
  2. We have a lot of data and don’t know what to do with it yet... medicine
  3. Precision medicine? Books you don’t want to see at your doctor’s office.
  4. Not quite there yet...
  5. Are we there yet?
  6. We have the technology...
  7. Illumina: $28 billion market cap
  8. More data!
  9. Even more data!
  10. The horrawful truth!
  11. Good luck with that...
  12. Known unknowns. 20 billion new variants will be observed in 5 years. 150,000,000 variants observed (2015). Variants we understand.
  13. Challenge accepted!
  14. Bioinformatics expert. Rare disease go-to guy. Center for Rare Jewish Genetic Disorders, Brooklyn, NY.
  15. Outcomes. Variants: Unclassified. Diagnosis: Unknown. Family: Unsatisfied. Hospital: Job complete.
  16. One year later. Different family. Different hospital. Same story.
  17. ClinVar. The government’s solution. Yet another FTP site.
  18. Submitting to ClinVar. Super painful process. You’ll never want to submit again.
  19. Data infrastructure for genomics. Clinical report, DNA, MiSeq.
  20. SolveBio Beta
  21. ClinVar on SolveBio: Dataset.retrieve('ClinVar/3.1.0-2015-01-13/Variants').query()
  22. Variant Explorer. Better way to explore the genome. Variant: GRCh37:chr7:117199644-117199647>A. Date generated: 2012/12/08 12:01:45 PM EST. Rare variant. Clinical evidence: reported pathogenic. Population genetics: <1% GMAF. Effect prediction: inframe deletion. Variant identification: chr 7; type: deletion; size: 3 bp; start: 117,199,644; stop: 117,199,647; ref: ATCT; alt: A. HGVS: NM_000492.3:c.1521_1523delCTT. Genomic: NG_016465.3:g.98809_98811delCTT, NC_000007.13:g.117199646_117199648delCTT, NC_000007.14:g.117559592_117559594delCTT, NG_016465.1:g.84630_84632delCTT. Coding DNA: NM_000492.3:c.1521_1523delCTT, XM_006715842.1:c.1845_1847delCTT. Protein: NP_000483.3:p.Phe508del, NP_000483.3:p.Phe508delPhe, XP_006715905.1:p.Phe616del. 3’ alignment and 5’ alignment coordinates: 117,199,667; 117,199,644; 117,199,647; 117,199,624.
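
The HGVS names on the Variant Explorer slide all follow the same "accession:type.start_enddelBASES" shape for the genomic and coding DNA entries. As a rough illustration of how such a name breaks into fields, here is a minimal Python sketch. It is not SolveBio's code and not a full HGVS parser: the names HgvsDeletion, parse_hgvs_deletion, and deletion_length are hypothetical, and the pattern only covers simple range deletions like the ones shown on the slide (protein-level names such as p.Phe508del are deliberately not handled).

```python
import re
from typing import NamedTuple, Optional


class HgvsDeletion(NamedTuple):
    """Fields of a simple HGVS range deletion (hypothetical helper type)."""
    accession: str   # reference sequence, e.g. "NM_000492.3"
    coord_type: str  # "c" for coding DNA, "g" for genomic
    start: int       # first deleted position (1-based, inclusive)
    end: int         # last deleted position (1-based, inclusive)
    deleted: str     # deleted bases, if the name spells them out


# Assumption: only names shaped like "<accession>:<c|g>.<start>_<end>del<bases>"
# are supported. Real HGVS syntax is much broader than this.
_DELETION = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+):"
    r"(?P<coord_type>[cg])\."
    r"(?P<start>\d+)_(?P<end>\d+)"
    r"del(?P<deleted>[ACGT]*)$"
)


def parse_hgvs_deletion(name: str) -> Optional[HgvsDeletion]:
    """Return the parsed fields, or None if the name is not a simple deletion."""
    match = _DELETION.match(name.strip())
    if match is None:
        return None
    return HgvsDeletion(
        accession=match["accession"],
        coord_type=match["coord_type"],
        start=int(match["start"]),
        end=int(match["end"]),
        deleted=match["deleted"],
    )


def deletion_length(deletion: HgvsDeletion) -> int:
    """Number of deleted bases implied by the inclusive start/end range."""
    return deletion.end - deletion.start + 1


if __name__ == "__main__":
    for name in (
        "NM_000492.3:c.1521_1523delCTT",
        "NC_000007.13:g.117199646_117199648delCTT",
        "NP_000483.3:p.Phe508del",  # protein-level name, not handled here
    ):
        print(name, "->", parse_hgvs_deletion(name))
```

For the coding DNA and genomic names on the slide, the inclusive range spans three positions, which agrees with the 3 bp size shown in the variant identification panel.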