
AWS HCLS Virtual Symposium 2021_Maze-Nichols.pptx


  1. Maze Therapeutics: genetic insight to new medicines (Nolan Nichols). © 2021, Amazon Web Services, Inc. or its Affiliates.
  2. why do some people get sick and others don’t, even when they have the same disease-causing gene?
  3. genetic modifiers are naturally occurring and can be identified; CRISPRi enables mapping of genetic interactions at scale
     • In 2016, the Resilience Project reported identifying individuals who should have had serious childhood diseases but did not, pointing to potential genetic modifiers (Chen et al., Nat Biotechnology 2016).
     • Dr. Jonathan Weissman and team observed that some gene-gene interactions have a ‘buffering’ or protective effect on disease-causing mutations (Horlbeck et al., Cell 2018).
  4. based on genetic insights, genetic modifier targets can be developed into transformative therapies for patients
     protective variants can…
     • be discovered from, or validated by, functional genomics data
     • be targeted to develop new therapeutics
     • be identified from human genetic data, where they naturally protect some people from disease
  5. COMPASS guides us along the path from genetic insight to new medicines
     • human genetics: mine biobanks across the world to identify genetic variation and prioritize novel targets that impact human disease
     • data science: seamlessly integrate diverse proprietary and external data sets and incorporate new computational methods, including machine learning, for analyses
     • functional genomics: define the biological mechanisms linking genes to disease and suggest therapeutic strategies for a broad range of unmet needs
     platform addresses key challenges:
     • identify relevant genomic associations: ability to discover novel gene-disease relationships that are pharmacologically relevant at scale
     • determine the mechanistic basis: tools enable us to establish the basis for the association between a particular gene, cellular pathology and disease state of interest
     • drug difficult genetic targets: ability to drug targets with or without structural biology information and direct therapy to the right location
  6. data challenges to our target identification workflow (https://www.anaconda.com/state-of-data-science-2020)
     Example question: which genes are differentially expressed in this experiment?
     • Data sources come in heterogeneous formats, which blocks scientists from integrating datasets
     • Datasets can contain hundreds of thousands of samples that take analysts weeks to process
     • Many “artisanal” analyses become untrustworthy over time as data drift
     • Reports and datasets cannot be found quickly and are not in an analysis-ready format
     • Analysts don’t have a process for sharing results and interactive visualizations
  7. overview of an analyst workflow: providing a computational environment
     [Diagram labels: Analyst, Maze Command Line Interface, Provision, AWS Batch, Analysis Environment]
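The provisioning step in the workflow above can be illustrated with a minimal sketch of how a command line tool might assemble an AWS Batch job request. The Maze Command Line Interface is not public, so the function names, the queue and job definition names, and the default resource sizes below are all hypothetical; only the `submit_job` call on the boto3 Batch client and its parameter names come from the AWS SDK.

```python
from typing import Dict, List, Optional


def build_batch_job_request(
    job_name: str,
    job_queue: str,
    job_definition: str,
    command: List[str],
    vcpus: int = 4,            # hypothetical default
    memory_mib: int = 16384,   # hypothetical default
    environment: Optional[Dict[str, str]] = None,
) -> dict:
    """Build keyword arguments for the boto3 Batch client's submit_job call."""
    if vcpus < 1 or memory_mib < 1:
        raise ValueError("vcpus and memory_mib must be positive")
    # Batch expects environment variables as a list of name/value pairs.
    env = [{"name": k, "value": v} for k, v in sorted((environment or {}).items())]
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            "command": list(command),
            "environment": env,
            # Resource values are passed as strings in the Batch API.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }


def submit(request: dict) -> str:
    """Submit the job. Requires boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder works without boto3 installed

    response = boto3.client("batch").submit_job(**request)
    return response["jobId"]


# Hypothetical names, for illustration only.
request = build_batch_job_request(
    job_name="rnaseq-de",
    job_queue="analysis-queue",
    job_definition="analysis-env:1",
    command=["python", "run_analysis.py"],
    vcpus=8,
    memory_mib=32768,
    environment={"PROJECT": "demo"},
)
```

Keeping the request builder separate from the network call makes the provisioning logic easy to test without an AWS account.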
  8. AWS Biotech Blueprint (https://github.com/aws-samples/biotech-blueprint-multi-account)
     A collaboration with AWS Healthcare and Life Sciences and Biotech Industry
     • Enabled Maze to go from a concept to a deployed architecture in hours
     • A multi-account architecture provided features to support a growing AWS footprint
       - Additional accounts improve security posture
       - SSO w/role-based access
       - Transit Gateway simplifies network configuration and maintenance
  9. AWS Transit Gateway simplifies network configuration and maintenance (https://medium.com/slalom-technology/next-generation-networking-with-aws-transit-gateway-and-shared-vpcs-9d971d868c65)
     [Diagram: Single Account with Multiple VPCs versus Multiple Accounts with Single VPC per Account. Account labels: Original Account, Data Science Account, Informatics Account, Comp Chem Account, N… Account]
  10. overview of an analyst workflow: providing analysis ready data
      [Diagram labels: Data Sources (Open Data, Internal Data, Vendor Data, Other Shared Data), Register, Maze Data Buckets, Athena, Analysis Ready Data, Data API, Maze Command Line Interface, Provision, AWS Batch, Analysis Environment, Analyst, BioBank Analysis]
  11. data sources: data lake as code (https://github.com/aws-samples/data-lake-as-code)
      • A framework to enroll data sources as registered assets in a data catalog
      • Optimized data formats (e.g., parquet) reduce data size and increase performance
      • Once registered, data can be directly queried through Athena or using BI tools
      • Examples provided for how to enroll data from the Registry of Open Data on AWS
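To make "directly queried through Athena" concrete, here is a minimal sketch of preparing an Athena query against a registered table. The database name, table name, and S3 output location are hypothetical placeholders, not names from the talk; `start_query_execution` and its parameters are part of the boto3 Athena client.

```python
import re

# Conservative whitelist so names can be embedded safely in the SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_athena_request(database: str, table: str,
                         output_location: str, limit: int = 10) -> dict:
    """Build keyword arguments for the boto3 Athena client's start_query_execution."""
    for name in (database, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// URI")
    query = f'SELECT * FROM "{database}"."{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(request: dict) -> str:
    """Start the query. Requires boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder works without boto3 installed

    response = boto3.client("athena").start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical database, table, and bucket names, for illustration only.
request = build_athena_request(
    database="chembl_demo",
    table="molecule_dictionary",
    output_location="s3://example-athena-results/queries/",
    limit=5,
)
```

Athena runs queries asynchronously, so a caller would poll the returned execution ID for completion before fetching results.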
  12. analysis ready data: life science data lake as code (https://registry.opendata.aws/)
      [Chart: Record Count and Table Count for GTEx, Open Targets, BindingDB, and ChEMBL. Record counts: 73,635,380; 17,198,174; 9,354,592; 49,005,575. Table counts: 107; 8; 50; 77.]
      • The Registry of Open Data on AWS (RODA) contains 237 datasets with 73 tagged as “life science”
      • Enabled Maze to import GTEx, Open Targets, BindingDB, and ChEMBL using Data Lake as Code in about an hour
      • Provides access to about 150M records on biological and chemical entities as well as their properties and associations (genes, diseases, compounds)
      • Questions that took hours or days to answer using public APIs now take seconds or minutes using Athena
      • Challenges remain for finding the right data elements when there are over 18k unique columns from 242 tables
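The headline figures on this slide can be checked against the per-dataset chart values. The snippet below simply sums the four record counts and four table counts as they appear in the chart, assuming the digits split across lines in the transcript belong together (for example "73,635,38 0" read as 73,635,380).

```python
# Chart values as listed on the slide, in the order they appear.
record_counts = [73_635_380, 17_198_174, 9_354_592, 49_005_575]
table_counts = [107, 8, 50, 77]

total_records = sum(record_counts)
total_tables = sum(table_counts)

print(f"{total_records:,} records across {total_tables} tables")
# prints "149,193,721 records across 242 tables"
```

The totals agree with the "150M records" (rounded) and "242 tables" quoted in the bullets.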
  13. overview of an analyst workflow: sharing results with collaborators
      [Diagram labels: Data Sources (Open Data, Internal Data, Vendor Data, Other Shared Data), Register, Maze Data Buckets, Athena, Analysis Ready Data, Data API, Maze Command Line Interface, Provision, AWS Batch, Analysis Environment, Analyst, BioBank Analysis, Publish, Published Analyses, Results Portal, Semantic Data Catalog, Self-Service Analytics, Drug Target Dashboards, Decision Support]
  14. Publishing analysis results
      • ontology terms define result types and relationships
      • provide canonical labels and definitions
      • designed using the Protégé editor and versioned in git
      • analysts initialize a templated project directory and environment
      • a dataset description is generated using ontology-driven tooling
      • a validated dataset description is published to a central data portal
      • metadata is added to a search index
      • tabular files are accessed via a data service API
      • dataset descriptions are modeled as a data graph
      • the Shapes Constraint Language (SHACL) is used to validate the graph
      [Figure labels: target constraint violation, dataset description]
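The idea of validating a dataset description graph against shape constraints can be sketched in plain Python. This is a simplified stand-in for SHACL, not a SHACL engine and not Maze's tooling: it supports only target-class selection and minimum/maximum property counts, and every class name, property name, and prefix below (`ex:Dataset`, `ex:title`, `ex:resultType`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class PropertyConstraint:
    path: str                       # predicate the constraint applies to
    min_count: int = 0
    max_count: Optional[int] = None


@dataclass(frozen=True)
class NodeShape:
    target_class: str               # nodes of this rdf:type are validated
    properties: Tuple[PropertyConstraint, ...]


def validate(graph: Set[Triple], shape: NodeShape) -> List[str]:
    """Return one message per violation; an empty list means the graph conforms."""
    violations: List[str] = []
    targets = sorted(s for s, p, o in graph
                     if p == "rdf:type" and o == shape.target_class)
    for node in targets:
        for c in shape.properties:
            count = sum(1 for s, p, _ in graph if s == node and p == c.path)
            if count < c.min_count:
                violations.append(
                    f"{node}: {c.path} has {count} value(s), expected at least {c.min_count}")
            elif c.max_count is not None and count > c.max_count:
                violations.append(
                    f"{node}: {c.path} has {count} value(s), expected at most {c.max_count}")
    return violations


# Hypothetical shape: a dataset needs exactly one title and at least one result type.
DATASET_SHAPE = NodeShape(
    target_class="ex:Dataset",
    properties=(
        PropertyConstraint("ex:title", min_count=1, max_count=1),
        PropertyConstraint("ex:resultType", min_count=1),
    ),
)

incomplete = {
    ("ex:ds1", "rdf:type", "ex:Dataset"),
    ("ex:ds1", "ex:title", "Demo differential expression results"),
}
complete = incomplete | {("ex:ds1", "ex:resultType", "ex:DifferentialExpression")}

report_incomplete = validate(incomplete, DATASET_SHAPE)
report_complete = validate(complete, DATASET_SHAPE)
```

Here `report_incomplete` holds a single violation for the missing `ex:resultType`, while `report_complete` is empty. A description would be published to the portal only when the report is empty, which mirrors the "validated dataset description" step in the bullets.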
  15. Summary
      • We were able to rapidly deploy our network and data architecture in a matter of days, not months
      • Reference architectures provided a foundation for building out a solution tailored to our goals
      • Barriers to weaving open data with proprietary data and analyses were reduced but remain a challenge
      • A key gap to fill is ensuring that data have embedded semantics and links to entities with associations relevant to drug discovery
  16. translating genetic modifying insights into new therapeutics
      • launched in 2019 with $190m+ investment
      • based in South San Francisco with ~80 employees
      • founded on concept of genetic modifiers
      [Figure label: investors]
  17. Q&A: Nolan Nichols
