Using Bioinformatics Data to inform Therapeutics discovery and development


Diamond Age Data Science and Zafgen, Inc. co-present their work on using bioinformatics data effectively in the context of a small therapeutics company.

Eleanor Howe, PhD, CEO of Diamond Age, presents on the different types of computational biologist, the characteristics of a good bioinformatics team, and the pluses and minuses of using deep learning/AI in a discovery biology context.

Huseyin Mehmet, VP of Discovery Research at Zafgen, describes his team's work with Diamond Age and how Zafgen uses Diamond Age's capabilities to inform its drug development. He discusses biotech companies' need for a diverse, experienced bioinformatics team.

Published in: Data & Analytics
  1. 1. From data to insights and action: Strategies to take your bioinformatics to the next level Eleanor Howe, Diamond Age Data Science Huseyin Mehmet, Zafgen, Inc. December 7, 2018
  2. 2. What is this talk about? • Who are we? What is computational biology? • Lessons learned from working with our customers • Our ongoing relationship with Zafgen • Q&A
  3. 3. Eleanor Howe, PhD Background in molecular biology, statistics, programming and computational biology/bioinformatics
  4. 4. Diamond Age Data Science Bioinformatics/computational biology consulting Project-based analysis Staff augmentation Pipeline development “Drop-in” bioinformatics department The Diamond Age: or, A Young Lady’s Illustrated Primer by Neal Stephenson
  5. 5. Team Chris Friedline Sequencing, software engineering Somdutta Saha Computational chemistry and proteomics Bruce Romano Mathematics and data science Nicholas Crawford Human genetics and GWAS Mike DeRan Cancer and diabetes therapeutics, scRNA-seq Max Marin RNA splicing Zarko Boskovic Medicinal chemistry and metabolomics Chris Dwan IT and data security
  6. 6. A few of our clients
  7. 7. Computational Biology Computational biology is data science for biology Bioinformatics is sometimes a synonym for computational biology. Other times, bioinformatics refers to software engineering for biology.
  8. 8. Lessons learned
  9. 9. Drug discovery requires evaluation of diverse, complex data • Sequence analysis is very different from proteomics • Knowing the landscape of available datasets is key • Individual bioinformaticians tend to specialize in one sub-field or another
  10. 10. Public datasets are a gold mine • Cancer Cell Line Encyclopedia • The Cancer Genome Atlas • Gene Expression Omnibus • Dependency Map (DepMap) • UK Biobank • DrugBank • VarSome • GTEx
  11. 11. But the real gems come from your own experiments It’s not possible to validate a drug target using public datasets alone. The public datasets are general, and cover only the most common diseases or disease subtypes. The most useful results come from combining custom-generated data with public data.
  12. 12. CROs do the basics well • Ocean Ridge, Novogene ($200 transcriptome!) • Good for the basics - RNA-seq, DNA-seq, proteomics, metabolomics • Reasonable standardized analysis pipelines • Challenges: • combining multiple datasets across experiments or across CROs • more involved analysis (e.g. splicing) • Do a thorough cost-comparison when considering an academic collaborator • Also ask them when their student is graduating.
  13. 13. What additional expertise do you need? Early stage “traditional” therapeutics companies don’t need a full-time computational biologist. Part time can work fine. When the company expands, hire a computational biologist with substantial experience, or an analyst with some kind of advisor available.
  14. 14. Computational biologist: Experience/training in all three areas Analyst: Biology + programming, with an advisor to help with the statistics Methods developer: Wants to build new analytical tools Know what you need
  15. 15. What expertise do you need? For Teams: • Cross-discipline expertise: biology, chemistry, computer science, statistics • Communication skills • Lateral thinking
  16. 16. Expertise gets you fast answers The problem: Get a terabyte of data from a USB hard drive to the cloud in time to analyze a dataset for a conference
  17. 17. Expertise gets you fast answers The problem: Get a terabyte of data from a USB hard drive to the cloud in time to analyze a dataset for a conference The solution: Bicycle across the Charles 3Gb/s bicycle (latency of 1.2M ms) Datacenter internet connection Markley Data Center
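The bicycle figure on this slide is ordinary throughput arithmetic: payload size divided by transit time. A minimal sketch follows, assuming a decimal terabyte (10^12 bytes) and taking the slide's 1.2M ms latency as the whole trip time; the function names are illustrative and not from the talk. Under those assumptions a full terabyte works out to roughly 6.7 Gb/s, so the slide's 3 Gb/s presumably reflects a smaller payload or other trip details that the slide does not give.

```python
def effective_throughput_gbps(payload_bytes: float, transit_seconds: float) -> float:
    """Effective throughput, in gigabits per second, of physically carried data."""
    if transit_seconds <= 0:
        raise ValueError("transit_seconds must be positive")
    return payload_bytes * 8 / transit_seconds / 1e9


def payload_bytes_for(throughput_gbps: float, transit_seconds: float) -> float:
    """Payload size, in bytes, implied by a throughput and a transit time."""
    return throughput_gbps * 1e9 * transit_seconds / 8


# 1.2M ms latency from the slide, converted to seconds (20 minutes).
TRANSIT_SECONDS = 1.2e6 / 1000

# Assumption: "a terabyte" means a decimal terabyte.
ONE_TERABYTE = 1e12

full_terabyte_gbps = effective_throughput_gbps(ONE_TERABYTE, TRANSIT_SECONDS)
implied_payload_bytes = payload_bytes_for(3.0, TRANSIT_SECONDS)

print(f"1 TB in {TRANSIT_SECONDS:.0f} s: {full_terabyte_gbps:.1f} Gb/s")
print(f"3 Gb/s over {TRANSIT_SECONDS:.0f} s: {implied_payload_bytes / 1e9:.0f} GB")
```

Either way, the trade-off is the one the slide makes: very high bandwidth at the cost of a latency measured in minutes.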
  18. 18. Deep Learning / Artificial Intelligence Another danger zone
  19. 19. Deep Learning / Artificial Intelligence Deep learning is “new” in that it’s a more complex version of older technology: a neural network Modern compute power allows for powerful classifiers trained on very large datasets
  20. 20. The basics of machine learning (and DL) Deep Learning works in a similar way to other types of machine learning. The algorithms use larger datasets and are more complex. But the overall workflow is the same.
  21. 21. Should you use deep learning? Is your training data: Large. 100,000+ to 1M+ samples Well-annotated. Gene expression data usually isn’t. Representative of the questions you want to answer? In discovery biology, the data is usually not there. Hence “discovery”.
  22. 22. Good use-cases for deep learning Image processing Diagnostics from histology, radiology High-content screening Biochemical structure/sequence Epitope prediction Protein folding (DeepMind) Single-cell RNA-seq (potentially)
  23. 23. Should you use deep learning? (cont) Do you need an interpretable model? Deep learning is a black box Have you tried everything else? Linear models, random forests, other ML techniques These tools are often faster, cheaper, and easier to understand and implement
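The questions on the two "Should you use deep learning?" slides can be read as a rough checklist. The sketch below encodes them in Python purely as an illustration: the `Project` fields, the `deep_learning_concerns` function and the example values are hypothetical, and the 100,000-sample threshold is just the lower bound quoted on the slide, not a hard rule.

```python
from dataclasses import dataclass
from typing import List

# Lower bound for a "large" training set as quoted on the slide (an assumption, not a rule).
MIN_TRAINING_SAMPLES = 100_000


@dataclass
class Project:
    """Hypothetical summary of a project being considered for deep learning."""
    n_training_samples: int
    well_annotated: bool
    representative_of_question: bool
    needs_interpretable_model: bool
    tried_simpler_models: bool


def deep_learning_concerns(project: Project) -> List[str]:
    """Return the slide's checklist items that this project fails."""
    concerns = []
    if project.n_training_samples < MIN_TRAINING_SAMPLES:
        concerns.append("training data is small (slide suggests 100,000+ samples)")
    if not project.well_annotated:
        concerns.append("training data is not well annotated")
    if not project.representative_of_question:
        concerns.append("training data does not represent the question being asked")
    if project.needs_interpretable_model:
        concerns.append("an interpretable model is needed; deep learning is a black box")
    if not project.tried_simpler_models:
        concerns.append("linear models, random forests and other ML have not been tried")
    return concerns


# Made-up example: a small gene expression study in discovery biology.
expression_study = Project(
    n_training_samples=500,
    well_annotated=False,
    representative_of_question=False,
    needs_interpretable_model=True,
    tried_simpler_models=False,
)

for concern in deep_learning_concerns(expression_study):
    print("-", concern)
```

An empty list does not mean deep learning is the right choice, only that none of the slide's warning signs apply.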
  24. 24. Huseyin Mehmet, PhD Vice President and Head of Discovery Research Zafgen, Inc.
  25. 25. Zafgen, Inc • Publicly traded bio-pharmaceutical company • Founded 12 years ago (IPO in 2014) • Virtual company • Bringing MetAP2 inhibitors to market • Areas of interest: Metabolic disease
  26. 26. Zafgen and Diamond Age Diamond Age acts as a virtual bioinformatics department for Zafgen • Data Analysis • Data Management • Hypothesis generation • Technology recommendations
  27. 27. What Diamond Age has done for Zafgen • Transcriptional profiling • Proteomics/phosphoproteomics • Metabolomics • Clinical outcomes • Custom apps for client needs
  28. 28. The benefits What can Zafgen do now that it couldn’t before? • Iterative data generation • Cross-dataset analyses • Confidence in analysis results from CROs • Link between pre-clinical and clinical data • Cost efficiencies / value for money
  29. 29. Thank you! Questions?