
Best Practices for Building an End-to-End Workflow for Microbial Genomics


Invited talk presented at the 2019 CAFPA-ASM D.C. Branch Fall Meeting on "Current Testing Approaches & Implications for Public Health" at the FDA in College Park, MD.

Published in: Science

  1. Sample to Insight. Best Practices for Building an End-to-End Workflow for Microbial Genomics. Jonathan Jacobs, PhD, Director of Global Product Management, QIAGEN Bioinformatics. Current Testing Approaches & Implications for Public Health, Fall 2019 Meeting of CAFPA and DC ASM, October 31, 2019.
  2. Disclaimer: The opinions expressed in this presentation and on the following slides are solely those of the presenter, Jonathan Jacobs, an employee of QIAGEN N.V. (QIAGEN), and do not necessarily represent the opinions or position of QIAGEN or any other business unit associated with QIAGEN. QIAGEN does not guarantee the accuracy or reliability of the information provided herein.
  3. CLC Microbial Genomics Module v4. The QIAGEN Microbial Insight AR database (QMI-AR) and ARESdb are the latest additions to a growing ecosystem of tools and databases to support microbial genomics research. CLC Microbial Genomics Module: • Prebuilt & Customizable Reference Databases • QMI-AR Curated AMR Database • ARESdb Genomic & Phenotypic Database • Strain Typing & Phylogenetics • Genome & Metagenome Annotation • Functional & Taxonomic Microbiome Profiling • Genomic AMR Analysis. For more information, visit QIAGEN Bioinformatics.
  4. AGENDA: Best Practices for Building an End-to-End Workflow for Microbial Genomics
      1. What’s the “Process to build a Process” for Microbial Genomics?
      2. Best Practices for the Lab vs. Bioinformatics Analysis
      3. A brief comment on Storage, Archive, and Backup
      4. References & Links
  5. The Research Cycle (BASIC RESEARCH): Share, Observe, Hypothesis, Test, Refine. Assay Development (APPLIED TESTING, e.g. hospital microbiology labs, public health labs, government testing labs, university core labs, etc.): Design & Document Develop Prototype & Refine Validate & Implement QMS Testing & Monitor
  6. Assay Validation Framework: “RESEARCH”, “DEVELOPMENT”, “DEPLOYMENT”. Gargis, Amy S., Lisa Kalman, and Ira M. Lubin. “Assuring the Quality of Next-Generation Sequencing in Clinical Microbiology and Public Health Laboratories.” Edited by C. S. Kraft. Journal of Clinical Microbiology 54, no. 12 (December 2016): 2857–65. https://doi.org/10.1128/JCM.00949-16.
  7. What’s the “Process to build a Process” for Microbial Genomics?
  8. Choose a workflow someone else developed?
  9. EXAMPLE: The KARIUS™ commercial pipeline. Blauwkamp, Timothy A., Simone Thair, Michael J. Rosen, Lily Blair, Martin S. Lindner, Igor D. Vilfan, Trupti Kawli, et al. “Analytical and Clinical Validation of a Microbial Cell-Free DNA Sequencing Test for Infectious Disease.” Nature Microbiology, February 11, 2019. https://doi.org/10.1038/s41564-018-0349-6.
  10. EXAMPLE: CD-Genomics™ Pipelines
  11. Over 110 companies now offer some form of End-to-End workflows for microbial genomics.
      Pros:
      • Fast deployment and implementation
      • “Hands-Free” Solution
      • Lower Cost (at least initially)
      Cons:
      • May not have validation for your pathogen
      • Less control over implementation and data
      • Lack of transparency for methods / algos
      • Regulatory / Legal restrictions
      • May not have an on-prem solution (Most are “cloud only”)
      A list of companies in this space was recently posted to my blog: https://microbe.land/2019/05/25/the-commercial-microbiome-market/
      Companies (A–L): Adapsyn Bioscience; Advanced Biological Laboratories (ABL); AOBiome; Aperiomics; AppliedMaths; Ardigen; ARTpred; AsiaBiome; Astarte Medical; Axial Biotherapeutics; Azitra; Bio-Me; BioConsortia; Biome Bliss; Biome Makers; Biomes Metabolon; BiomX; Boost; CD Genomics; CeMeT GmbH; CHAIN Biotechnology; ChunLabs; Clinical Metagenomics A/S; ClostraBio; Commense; Concentric; CoreBiome; CosmosID; DayTwo; Diversigen; Dupont Nutrition & Health; Eligo Bioscience; Enbiome; Enterome; Enterome Bioscience; EpiBiome; Evelo Biosciences; Evolve BioSystems; Farmhouse Culture; Ferring Therapeutics; Finch Therapeutics; Floragraph; FryLabs; Genentech; Genome & Company; Ginkgo Bioworks; ID Genomics; IDbyDNA; Illumina; Indigo AG; Indigo Agriculture; Inocucor Technologies; Ixcela; Janssen; Kaleido Biosciences; Kallyope; Karius; Leucine Rich Bio; Locus Biosciences
      Companies (L–Z): Loop Genomics; Maat Pharma; Metabiota; Metabolon; Metagenome Analytics, LLC; Microbiome Insights; Microbiome Therapeutics; Microbiotica; MicroGen Biotech; MicrogenDX; Naked Biome; NatureMetrics; Nemesis Bioscience; Nextbiotix; Noblis; Norgen Biotek; Noscendo; Novosanis; One Codex; OpenBiome; OraSure; Oxford Nanopore; PacBio; PharmaBiome; Phylagen; Probionase; Promega; QIAGEN Biox; RealTime Genomics; ReBiotix; Reckitt Benckiser; Resphera Biosciences; Ritter Pharmaceuticals; Second Genome; Seres Therapeutics; Serimmune; Seventure; Shoreline Biome; Signature Science; Siolta Therapeutics; Snipr Biome; Sugarlogix; Symbiotix Biotherapies; Taconic; Takeda Pharmaceuticals; TargEDys; TGEN; The BioCollective; ThermoFisher; Thryve; uBiome; Vedanta Biosciences; Viome; WholeBiome; Zymergen; ZymoResearch
  12. Or… should you build it yourself?
  13. Building it yourself…
      • FOCUS on individual steps in the workflow
        ◦ Time is Money
        ◦ Do yourself a favor, and don’t tackle everything at once
      • LIMIT options to test
        ◦ Conduct an internal “Best of Breed” review of popular options
      • BENCHMARK in all three of these contexts
        ◦ On real data / samples from your own lab
        ◦ On real data / samples from 3rd party sources
        ◦ On synthetic data / contrived samples designed to test edge cases
      • DOCUMENT everything
        ◦ This may take a while, and people graduate, move, get promoted, etc.
      • STAY FOCUSED
        ◦ New technologies and methods are constantly being released. Lock in your tech options, and finish with those before considering new technology.
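The BENCHMARK step above can be sketched as a small scoring routine. This is a minimal illustration, assuming each candidate tool reports a set of taxon names for a contrived sample whose true composition is known; the tool names, taxa, and the `score_calls` helper are hypothetical and do not come from the talk.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    true_positives: int
    false_positives: int
    false_negatives: int

    @property
    def sensitivity(self) -> float:
        # Fraction of expected organisms that the tool reported.
        denom = self.true_positives + self.false_negatives
        return self.true_positives / denom if denom else 0.0

    @property
    def precision(self) -> float:
        # Fraction of reported organisms that were actually expected.
        denom = self.true_positives + self.false_positives
        return self.true_positives / denom if denom else 0.0


def score_calls(expected: set[str], observed: set[str]) -> BenchmarkResult:
    """Compare one tool's calls against the known sample composition."""
    return BenchmarkResult(
        true_positives=len(expected & observed),
        false_positives=len(observed - expected),
        false_negatives=len(expected - observed),
    )


# Hypothetical contrived sample and hypothetical tool outputs.
expected_taxa = {"Escherichia coli", "Staphylococcus aureus", "Klebsiella pneumoniae"}
candidate_calls = {
    "tool_a": {"Escherichia coli", "Staphylococcus aureus"},
    "tool_b": {
        "Escherichia coli",
        "Staphylococcus aureus",
        "Klebsiella pneumoniae",
        "Bacillus subtilis",
    },
}

scores = {name: score_calls(expected_taxa, calls) for name, calls in candidate_calls.items()}
for name, result in scores.items():
    print(f"{name}: sensitivity={result.sensitivity:.2f} precision={result.precision:.2f}")
```

Running the same scoring over samples from your own lab, third-party data, and contrived edge cases gives three comparable tables per tool, which is usually enough to narrow a "Best of Breed" shortlist.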
  14. Example: University Medical Center Groningen’s evaluation and pipeline. Couto, Natacha, Leonard Schuele, Erwin C. Raangs, Miguel P. Machado, Catarina I. Mendes, Tiago F. Jesus, Monika Chlebowicz, et al. “Critical Steps in Clinical Shotgun Metagenomics for the Concomitant Detection and Typing of Microbial Pathogens.” Scientific Reports 8, no. 1 (December 2018). https://doi.org/10.1038/s41598-018-31873-w.
  15. Example: PanGIA Best of Breed Evaluation (MRIGlobal’s mNGS pipeline). Building it yourself…
      Pros:
      • Greatest flexibility in development
      • Total control over deployment and maintenance
      • Staff development and training
      Cons:
      • Time consuming
      • Most expensive approach
      • High risk of delays
  16. Since 2016, over 200 publications present validated WGS workflows for clinical microbiology, public health, and industrial surveillance. Building it yourself…
      Pros:
      • Greatest flexibility in development
      • Total Control over development
      • Perfect Fit deployment
      • Staff development and training
      Cons:
      • Time consuming
      • Trial and Error Process
      • Most Expensive approach
      • High risk of delays
      • High cost of maintenance
      These publications are collected in my online repository of Microbial Genomics references on Zotero. https://www.zotero.org/groups/2255187/microbial_genomics_references?
  17. The Problem with all this… Each step has dozens, some hundreds, of options… where to start? Angers-Loustau A, Petrillo M, Bengtsson-Palme J et al. The challenges of designing a benchmark strategy for bioinformatics pipelines in the identification of antimicrobial resistance determinants using next generation sequencing technologies [version 2]. F1000Research 2018, 7:459 (doi: 10.12688/f1000research.14509.2)
  18. The Golden Rule of assay development…
  19. The Golden Rule of assay development… ~ Don’t let the Perfect be the enemy of the Good ~ “Premature optimization is the root of all evil” – Donald Knuth
  20. The “Other” Golden Rule of assay development…
  21. The “Other” Golden Rule of assay development… ~ Better The Devil You Know Than The Devil You Don't ... ~ Don’t start from scratch (!) Use what you know. Beg, Borrow and … from your friends and colleagues
  22. Best Practices & Guidelines for Designing an End-to-End Workflow
  23. First – Create a “Living” Requirements Document, Define Your Scope, Share it
      Stakeholders & End Users: Clinician; Quality Assurance Manager; Laboratory Staff; Bioinformatician; IT Administrator; Lab Director; Outside Agencies
      Workflow Scope, Goals & Outputs: Sample Types; Sequencing Objectives (WGS, mNGS, 16S, etc.); Material Standards & Controls Used; Sequencing Platforms Used; IT Infrastructure Assessment; Contamination Control and Monitoring; Database Curation, Acceptance Criteria, and Change Control; Regulatory, Reporting, Auditing Requirements; Benchmarking and Performance Monitoring
      Define Acceptance Thresholds: Cluster Density; Read Lengths; #Reads / Library; Expected vs. Observed GC Ratio; Max# Contaminating Reads; Min% Non-Host Reads; Max% Rejected / Discard Reads
      Data & IT Management: Establish Storage & Archive Policy
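The acceptance thresholds listed above lend themselves to a machine-readable form that can live alongside the requirements document. The sketch below is one possible shape, not a recommended set of limits: every metric name and number is a placeholder assumption chosen for illustration, and a real lab would substitute values from its own validation data.

```python
from __future__ import annotations

# Hypothetical thresholds: metric name -> (kind, limit).
# "min" means the observed value must be at least the limit,
# "max" means it must not exceed the limit. All numbers are placeholders.
ACCEPTANCE_THRESHOLDS: dict[str, tuple[str, float]] = {
    "reads_per_library": ("min", 1_000_000),
    "mean_read_length": ("min", 100),
    "gc_ratio_deviation": ("max", 0.05),   # |expected - observed| GC fraction
    "contaminating_reads": ("max", 10_000),
    "pct_non_host_reads": ("min", 1.0),
    "pct_discarded_reads": ("max", 20.0),
}


def check_run(metrics: dict[str, float]) -> list[str]:
    """Return a human-readable failure message for every threshold not met."""
    failures = []
    for name, (kind, limit) in ACCEPTANCE_THRESHOLDS.items():
        if name not in metrics:
            # A metric that was never recorded is treated as a failure.
            failures.append(f"{name}: missing")
            continue
        value = metrics[name]
        if kind == "min" and value < limit:
            failures.append(f"{name}: {value} below minimum {limit}")
        elif kind == "max" and value > limit:
            failures.append(f"{name}: {value} above maximum {limit}")
    return failures


# Hypothetical metrics for one sequencing library.
example_run = {
    "reads_per_library": 1_500_000,
    "mean_read_length": 148,
    "gc_ratio_deviation": 0.08,
    "contaminating_reads": 2_500,
    "pct_non_host_reads": 4.2,
}

failures = check_run(example_run)
print("PASS" if not failures else "FAIL:\n  " + "\n  ".join(failures))
```

Keeping the thresholds in a single table like this makes change control simpler: a revised limit is a one-line diff that can be reviewed and dated.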
  24. Tackle one workflow at a time… Schematic diagram of common sequencing laboratory workflows and approaches. Maljkovic Berry, Irina, Melanie C Melendrez, Kimberly A Bishop-Lilly, Wiriya Rutvisuttinunt, Simon Pollett, Eldin Talundzic, Lindsay Morton, and Richard G Jarman. “Next Generation Sequencing and Bioinformatics Methodologies for Infectious Disease Research and Public Health: Approaches, Applications, and Considerations for Development of Laboratory Capacity.” The Journal of Infectious Diseases, October 14, 2019, jiz286. https://doi.org/10.1093/infdis/jiz286.
      When to use:
      Isolate whole genome sequencing
      • Strain typing
      • Routine surveillance
      • Rapid phenotypic prediction (AMR, virulence)
      Shotgun metagenomics
      • Emergent pathogen discovery, in cases where targeted methods (e.g. PCR, AmpliSeq) fail
      • Coinfections
      • Slow growing pathogens
      • Environmental surveillance
      Positive / negative selection sequencing
      • Low-abundance targets in complex metagenomics / microbiome samples
      • Emerging pathogen surveillance
      Targeted amplicon, 16S / ITS
      • Cost-effective means to interrogate microbiome profiles
      • Rapid genotyping of specific genes
  25. Sequencing Platform. It seems like there are many platforms to choose from, but … start with a MiSeq.
      • Sanger ABI 3730xl. Applications: A. Runtime: 20 min–48 h. Advantages: high quality, long reads, low cost for small studies. Disadvantages: low throughput, high cost, substitution errors, sequenced material has to be pure to produce good-quality sequence data.
      • PacBio RSII. Applications: V, M, E, HE, RT, CP, EP. Runtime: 0.5–4 h. Advantages: used in methylome research. Disadvantages: indels, large lab footprint, expensive.
      • Ion Torrent / PGM318. Applications: A, V, M, E, HE, D, PS. Runtime: 4–7 h (chip). Advantages: lower cost instrument, upgradable, simple machine. Disadvantages: higher error rate with homopolymer issues, more hands-on time, fewer overall reads, higher cost/MB, indel issues.
      • Illumina MiSeq. Applications: A, V, M, E, HE, RT, SV, D, PS. Runtime: 4–55 h. Advantages: moderate cost/instrument and runs, low cost/MB, fast run time, versatile. Disadvantages: substitution errors; as the sequencing reaction proceeds, the error rate increases.
      • Oxford Nanopore MinION. Applications: A, V, M, E, HE, RT, SV, ME, EP, PS. Runtime: 1 min–48 h. Advantages: longest individual reads, accessible user community, portable USB size. Disadvantages: lower throughput than other machines, low single-read pass accuracy, deletions.
      • Illumina NextSeq 500. Applications: A, V, M, E, HE, RT, ML, ME, SV, C, MT, D, PS. Runtime: 12–30 h. Advantages: high sequence yield potential, easy to use, expandable. Disadvantages: expensive, high concentrations of DNA, requires high indexing capabilities, issues with substitution errors; as the sequencing reaction proceeds, the error rate increases.
      • Illumina NovaSeq 6000. Applications: V, M, E, HE, RT, ML, ME, SV, C, MT. Runtime: 13–44 h. Advantages: high sequence yield potential, no application restrictions. Disadvantages: expensive, high concentrations of DNA, requires high indexing capabilities, issues with substitution errors; as the sequencing reaction proceeds, the error rate increases; higher frequency of duplicate reads.
      • PacBio Sequel. Applications: V, M, E, HE, RT, CP, EP. Runtime: 30 min–20 h. Advantages: fast, desktop sized instrument, long reads. Disadvantages: moderate throughput, expensive.
      • Oxford Nanopore PromethION. Applications: A, V, M, E, HE, RT, SV, ME, EP, PS. Runtime: up to 64 h. Advantages: higher output than MinION, longest individual reads, accessible user community, scalable. Disadvantages: low single-read pass accuracy, issues with deletions.
      • Illumina iSeq 100. Applications: A, V, M, targeted-RT, PS, D (planned). Runtime: 9–17.5 h. Advantages: lower cost, faster sample preparation, minimizes potential user error or need for corrective maintenance, single-use cartridges so upgrades are in consumables only. Disadvantages: substitution errors; as the sequencing reaction proceeds, the error rate increases; prone to barcode hopping; cannot be used at high altitudes.
      Abbreviations: A, amplicon sequencing; C, ChIP-seq; CP, complex population sequencing; D, diagnostics; E, eukaryotic genome; EP, epigenetics; HE, human/exome genomics; M, microbial genome; MB, mega base; ME, metagenomics; ML, methylation studies; MT, metatranscriptomics; PS, pathogen surveillance; RT, RNAseq/transcriptomics; SV, single nucleotide polymorphism/variation studies; V, viral genome.
  26. Additional Points to Consider in the End-to-End Workflow
      Collection: Collection; Stabilization; Culturing; Isolation
      Preparation: Lysis; NA Purification; Host Depletion; WGA / WTA; Material Controls
      Sequencing: Targeted Amplicons; Shotgun Library Prep; Sequencing Platform; Material Controls
      Bioinformatics: Platform selection; Tools & Algos; Workflow construction; Outputs
      Interpretation: Database curation; Performance Thresholds; Reporting formats
      Management: QMS; ISO Certifications; CAP/CLIA; Performance monitoring; Contamination control; Proficiency Testing
      Example products from QIAGEN along the workflow continuum: LifeGuard Soil Preservation Solution PAXgene Blood RNA Kit RNAprotect Bacteria Reagent RNeasy Protect Animal Blood System QIAseq FastSelect –5S/16S/23S Viral RNA MiniKit RNAeasy RNeasy PowerSoil Total RNA DNeasy PowerSoil QIAamp Fast DNA Tissue QIAEX II Suspension 50kb REPLI-g WGA Mini QIAseq UPX 3’ Transcriptome QIAseq 1-Step Amplicon Library QIAseq 16S/ITS QIAseq FX DNA Library CLC Genomics Workbench CLC Microbial Genomics Module CLC Genome Finishing Module CLC Genomics Server CLC Genomics Cloud Engine Microbial Genomics ProSuite QIAGEN Microbial Insights AR ARESdb QIAGEN Managed Services QIAGEN Custom Solutions
      There are dozens of products that fit into each of these steps. Remember the “Other Golden Rule”.
  27. Budget planning… Estimated Turnaround Times and Costs for WGS of Bacterial Isolates
      • DNA extraction. Estimated time: 1–2 h (determinants: choice of kit, additional steps (e.g. enrichment), automation). Estimated cost per sample: 10 euros (determinants: kits vs. reagents, technician hands-on time vs. automation).
      • Library preparation. Estimated time: 4–6 h (determinants: method (enzymatic vs. shearing), choice of kit, automation). Estimated cost per sample: 30 euros (determinants: choice of kit, automation).
      • Sequencing. Estimated time: 50 h (determinants: platform, chemistry, read length, run protocol). Estimated cost per sample: 75 euros (determinants: platform, chemistry, read length, number of samples per run/coverage).
      • Initial analysis: 1–2 h; Specific analysis: 4 h (determinants: number of samples, computing power, available software and pipelines). Estimated cost per sample: NA (determinants: commercial vs. free software, availability of bioinformaticians, computer infrastructure).
      “We estimate that the performance of WGS for 16-20 bacterial isolates inhouse in a routine setting would cost around 200 euros per isolate and last around 2.5-3 days.” Rossen, J.W.A., A.W. Friedrich, and J. Moran-Gilad. “Practical Issues in Implementing Whole-Genome-Sequencing in Routine Diagnostic Microbiology.” Clinical Microbiology and Infection 24, no. 4 (April 2018): 355–60. https://doi.org/10.1016/j.cmi.2017.11.001.
      Material costs for metagenomics samples in MRIGlobal’s PanGIA pipeline were similar, about $240/sample.
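As a quick sanity check on the table above, the per-step figures can be totalled. The sketch below uses only the numbers on the slide and assumes the "NA" cost applies to both analysis steps; it shows that the step times add up to roughly the quoted 2.5–3 days, while the itemized per-sample costs cover only part of the quoted figure of around 200 euros per isolate (the slide does not itemize the remainder).

```python
# (step, min_hours, max_hours, cost_eur); cost is None where the slide lists "NA".
STEPS = [
    ("DNA extraction", 1, 2, 10),
    ("Library preparation", 4, 6, 30),
    ("Sequencing", 50, 50, 75),
    ("Initial analysis", 1, 2, None),
    ("Specific analysis", 4, 4, None),
]

min_hours = sum(step[1] for step in STEPS)
max_hours = sum(step[2] for step in STEPS)
itemized_cost = sum(step[3] for step in STEPS if step[3] is not None)

print(f"Turnaround: {min_hours}-{max_hours} h "
      f"({min_hours / 24:.1f}-{max_hours / 24:.1f} days)")
print(f"Itemized cost per sample: {itemized_cost} euros")
```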
  28. Bioinformatics workflow and considerations for sequence analysis. Maljkovic Berry, Irina, Melanie C Melendrez, Kimberly A Bishop-Lilly, Wiriya Rutvisuttinunt, Simon Pollett, Eldin Talundzic, Lindsay Morton, and Richard G Jarman. “Next Generation Sequencing and Bioinformatics Methodologies for Infectious Disease Research and Public Health: Approaches, Applications, and Considerations for Development of Laboratory Capacity.” The Journal of Infectious Diseases, October 14, 2019, jiz286. https://doi.org/10.1093/infdis/jiz286.
  29. WGS outbreak analysis tools. There are many more choices than what is shown here. For example, over 100 metagenomics pipelines have been published since 2008… IMHO… Database Quality is more important than specifically which tool you use. ADAPTED FROM: Quainoo, Scott, Jordy P. M. Coolen, Sacha A. F. T. van Hijum, Martijn A. Huynen, Willem J. G. Melchers, Willem van Schaik, and Heiman F. L. Wertheim. “Whole-Genome Sequencing of Bacterial Pathogens: The Future of Nosocomial Outbreak Analysis.” Clinical Microbiology Reviews 30, no. 4 (October 2017): 1015–63. https://doi.org/10.1128/CMR.00016-17.
  30. Example of Complexity: Analyses of Bacterial Genomes for Strain Typing, Resistance, etc.
      Databases
      • Antimicrobial Resistance: QIAGEN Microbial Insights AR; ARESdb; PATRIC; CARD; NCBI AMRFinder; ResFinder; ARG-ANNOT
      • Virulence: VirDB; VFdb
      • ALL IN ONE: RAST; MG RAST; PATRIC
      • MLST: pubMLST.org; cgMLST.org; Enterobase
      Methods
      • Gene Finding & Annotation: CLC Genomics Workbench; PROKKA; GeneMark; GLIMMER
      • SNP Calling: CLC Genomics Workbench; PointFinder
      • Assembly Free: CLC Genomics Workbench; ShortBRED; Kmer Spectra
      • Strain Typing: BioNumerics; SeqSphere+; CLC Microbial Genomics Module
      Platforms
      • On Premise: CLC Genomics Workbench; BioNumerics; SeqSphere+; GALAXY
      • Cloud Based: CLC Genomics Cloud Engine; DNA Nexus; OneCodex; 1921 Genomics; GALAXY CloudMan
  31. Connectivity and Overlap of Antimicrobial Resistance Databases
  32. Some opinionated advice…
      In general…
      - “Trust but Verify” public databases
      - Get a second opinion on your results, either from another pipeline or another scientist.
      - You don’t need a bioinformatician on your staff, but you need to know one you can call when things get weird.
      - 90% of scientists doing NGS need trouble-free software that does 90% of the job. The rest of them are expert bioinformaticians, or will be soon.
      Isolate Analysis
      - Assembly-free methods require less sequencing depth, and thus may save your lab money and time.
      - You don’t have to assemble contigs to get the correct strain; kmer and read mapping approaches work fine too.
      - Some assembly is required for many other applications.
      - Don’t use closed MLST schemas
      - Be wary of tribalism…
        - SNPs are “better” than cgMLST
        - cgMLST is “better” than SNPs
      Metagenomics
      - The database has the biggest impact on results, not the algorithm (as long as you are using “good enough” algos)
      - A database you can update and modify easily will come in handy to figure out tricky samples.
      - There are over 120 different algorithms published for metagenomics.
        - They “basically” all do the same thing.
        - 95% of them are good enough for 95% of the samples.
        - Only 1 is the best at 1% of the tasks
        - None are best at all of the tasks.
      - Choose wisely… 😉
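To make the point about assembly-free strain identification concrete, here is a toy k-mer comparison. It is an illustration of the general idea only, not the method used by CLC or any named tool: the sequences are invented, k is unrealistically small, and real implementations handle reverse complements, sequencing errors, and k-mer counts rather than simple presence.

```python
from __future__ import annotations


def kmers(sequence: str, k: int) -> set[str]:
    """All overlapping substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared k-mers divided by total distinct k-mers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_reference(reads: list[str], references: dict[str, str], k: int = 5):
    """Pool k-mers from unassembled reads and rank references by Jaccard similarity."""
    read_kmers: set[str] = set()
    for read in reads:
        read_kmers |= kmers(read, k)
    scores = {name: jaccard(read_kmers, kmers(seq, k)) for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Invented toy references that differ by a single base.
references = {
    "strain_A": "ACGTACGTTAGCCGATAGCTAGGCTA",
    "strain_B": "ACGTACGTTAGCTGATAGCTAGGCTA",
}
# Two overlapping "reads" taken from strain_A; no assembly step is performed.
reads = ["ACGTACGTTAGCCGA", "GCCGATAGCTAGGCTA"]

best, scores = closest_reference(reads, references)
print(best, {name: round(score, 2) for name, score in scores.items()})
```

The reads are never joined into contigs; the strain call comes purely from which reference shares the most k-mers with the pooled reads.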
  33. A Gold-Standard Case Study
  34. Case Study – California Public Health Laboratory CAP/CLIA validation of a WGS pipeline for diagnostics. “The following objectives were accomplished: (i) the establishment of the performance specifications for WGS applications in PHLs according to CLIA guidelines, (ii) the development of quality assurance and quality control measures, (iii) the development of a reporting format for end users with or without WGS expertise, (iv) the availability of a validation set of microorganisms, and (v) the creation of a modular template for the validation of WGS processes in PHLs”
  35. WGS Quality Control Scheme. Kozyreva, Varvara K., Chau-Linda Truong, Alexander L. Greninger, John Crandall, Rituparna Mukhopadhyay, and Vishnu Chaturvedi. “Validation and Implementation of Clinical Laboratory Improvements Act-Compliant Whole-Genome Sequencing in the Public Health Microbiology Laboratory.” Edited by Daniel J. Diekema. Journal of Clinical Microbiology 55, no. 8 (August 2017): 2502–20. https://doi.org/10.1128/JCM.00361-17.
  36. (figure only)
  37. (figure only)
  38. Summary of WGS Validation. Kozyreva, Varvara K., Chau-Linda Truong, Alexander L. Greninger, John Crandall, Rituparna Mukhopadhyay, and Vishnu Chaturvedi. “Validation and Implementation of Clinical Laboratory Improvements Act-Compliant Whole-Genome Sequencing in the Public Health Microbiology Laboratory.” Edited by Daniel J. Diekema. Journal of Clinical Microbiology 55, no. 8 (August 2017): 2502–20. https://doi.org/10.1128/JCM.00361-17.
  39. A bioinformatics example of a workflow
  40. Case Study PROM
      Summary
      • 8 pediatric patients who underwent stem cell transplantation
      • 50% developed acute Graft-versus-Host Disease (aGvHD)
      • Patients developing aGvHD are characterized by expansion of their gut resistome
      • Resistance genes extended beyond drug classifications administered during therapeutic case.
      The Data
      - 32 whole shotgun metagenomics samples, from 8 patients, along a pre- and post-operative time course
      - All data publicly available at NCBI BioProject Archive
      - https://www.ncbi.nlm.nih.gov/bioproject/PRJNA525982
      D’Amico, Federica, Matteo Soverini, Daniele Zama, Clarissa Consolandi, Marco Severgnini, Arcangelo Prete, Andrea Pession, et al. “Gut Resistome Plasticity in Pediatric Patients Undergoing Hematopoietic Stem Cell Transplantation.” Scientific Reports 9, no. 1 (2019): 5649. https://doi.org/10.1038/s41598-019-42222-w.
  41. A single workflow for Taxonomic Profiling and Prediction of AMR Genes and Markers. Workflow steps shown: Import Reads (FASTQ); Trim & QC Reads; de novo Assembly; Gene Finding; Functional Annotation; Bin Contigs (Taxonomy); Bin Contigs (Sequence); ID Contigs w/ AMR Genes; Find AMR Markers; Partition Reads; Taxonomic Profiling; Map Reads to Contigs; Visualize Results (four outputs)
  42. 1. Assembly-free detection of AMR markers directly from reads. Workflow steps shown: Import Reads (FASTQ); Trim & QC Reads; Find AMR Markers; Partition Reads; Visualize Results. CLC Microbial Genomics Module includes ShortBRED, an algorithm that detects peptide signatures directly from unassembled reads. A pre-built peptide index built from QMI-AR is available for download directly from within CLC. We ran ShortBRED with QMI-AR and obtained the following results.
  43. 100% concordance with the original authors’ findings. [Heatmap: values per resistance group (rows) and sample (columns), with markers for the date antibiotics were administered and for aGvHD-positive patients.]
      Samples, grouped by patient (4, 11, 5, 15, 19, 16, 20, 26): 4PRE 4P18 4P28 4P60; 11PRE 11P25 11P32 11P85; 5PRE 5P16 5P25 5P34 5P55; 15PRE 15P17 15P30 15P41 15P62 15P71 15P85; 19PRE 19P20 19P54; 16PRE 16P15 16P72; 20PRE 20P24 20P75; 26PRE 26P21 26P60
      Resistance groups: antibiotic inactivation enzyme; antibiotic resistant gene variant or mutant; antibiotic target modifying enzyme; antibiotic target protection protein; antibiotic target replacement protein; determinant of aminoglycoside resistance; determinant of beta-lactam resistance; determinant of fluoroquinolone resistance; determinant of fosfomycin resistance; determinant of lincosamide resistance; determinant of macrolide resistance; determinant of phenicol resistance; determinant of polymyxin resistance; determinant of resistance to glycopeptide antibiotics; determinant of streptogramin resistance; determinant of tetracycline resistance; efflux pump complex or subunit conferring antibiotic resistance; protein modulating permeability to antibiotic; protein(s) and two-component regulatory system modulating antibiotic efflux; protein(s) conferring antibiotic resistance via molecular bypass
  44. 2. Assembly-free taxonomic profiling of shotgun metagenomics samples. Workflow steps shown: Import Reads (FASTQ); Trim & QC Reads; Taxonomic Profiling; Visualize Results. CLC’s Taxonomic Profiling tool has been benchmarked as a “best-class” solution* for both 16S and shotgun metagenomics reads. We downloaded the default 16GB memory database from CLC and ran the TaxPro pipeline on a typical laptop. * Couto, N. et al. (2018) Critical steps in clinical shotgun metagenomics for the concomitant detection and typing of microbial pathogens. Scientific Reports 8: 13767
  45. Taxonomic profile of reads with AMR marker signatures identified by ShortBRED (aGvHD-positive patients marked). Metagenomics profiles produced in CLC clustered 1:1 with the profiles provided in the authors’ paper. 100% concordant.
  46. 3. Assembly and gene annotation pipeline to visualize specific genes. Workflow steps shown: Import Reads (FASTQ); Trim & QC Reads; de novo Assembly; Gene Finding; Functional Annotation; Bin Contigs (Taxonomy); Bin Contigs (Sequence); Visualize Results
  47. Various data and reports produced from this pipeline should be inspected on a per-sample basis (11P32 shown as example). All reports can also be exported as PDFs, graphic images, Excel files, csv, etc.
  48. After annotation, we then bin contigs and identify potential plasmid/mobile elements. After binning is completed, we can then map our reads to our assemblies to visualize specific genes and AMR gene clusters.
      - Both total trimmed reads and ShortBRED reads can be visualized in parallel by making use of CLC’s track viewer.
      Focusing on TaxBin3
      - Includes 45 contigs (~68% complete) with high similarity (>90%) to Enterococcus faecium
      - In this example we can see the reads used for the assembly, the genes predicted by our pipeline, as well as individual reads identified by ShortBRED as containing AMR peptide signatures.
  49. A single workflow for Taxonomic Profiling and Prediction of AMR Genes and Markers. Once validated, a workflow such as this could be locked and packaged, then distributed to multiple labs running CLC Genomics Workbench.
  50. A simplified view of this workflow in CLC Genomics Workbench
  51. A brief comment on Storage, Archive, and Backup
  52. Preview of new Microbial Genomics Module tools in the next release (Dec, 2019)
  53. Advanced tools to support the needs of Clinical Microbiology and Public Health laboratories. Advanced MLST Typing Tools: The latest release will include new tools for researchers to carry out Core Genome MLST (cgMLST) and Whole Genome MLST (wgMLST) typing of bacterial isolates. These tools will include the ability to
      - Identify (type) and compare isolates using publicly available schemas from PubMLST.org and other resources
      - Create, edit, modify, share schemas for any organism
      - Dynamic visualizations with minimum spanning trees created with force-directed layout. Add metadata and collapse or expand nodes as needed.
      - Import/Export schemas, alleles, or loci, to share with other researchers
      Example minimum spanning tree for MLST visualization.
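For readers unfamiliar with how a minimum spanning tree is derived from MLST data, the sketch below shows the general idea: count allele differences between isolate profiles, then connect isolates with the lowest-distance edges using Prim's algorithm. It is a generic illustration with invented isolate names and a 7-locus toy scheme, and it makes no claim about how the CLC tools compute or lay out their trees.

```python
from __future__ import annotations


def allelic_distance(a: tuple[int, ...], b: tuple[int, ...]) -> int:
    """Number of loci at which two allele profiles differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def minimum_spanning_tree(profiles: dict[str, tuple[int, ...]]) -> list[tuple[str, str, int]]:
    """Prim's algorithm over the complete graph of pairwise allelic distances."""
    if not profiles:
        return []
    names = sorted(profiles)
    in_tree = {names[0]}
    edges: list[tuple[str, str, int]] = []
    while len(in_tree) < len(names):
        best = None
        # Find the cheapest edge that joins a new isolate to the current tree.
        for u in sorted(in_tree):
            for v in names:
                if v in in_tree:
                    continue
                d = allelic_distance(profiles[u], profiles[v])
                if best is None or d < best[2]:
                    best = (u, v, d)
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Hypothetical isolates typed against a 7-locus scheme (allele numbers are invented).
profiles = {
    "iso1": (1, 1, 1, 1, 1, 1, 1),
    "iso2": (1, 1, 1, 1, 1, 1, 2),
    "iso3": (1, 1, 1, 1, 1, 2, 2),
    "iso4": (3, 3, 1, 1, 1, 1, 1),
}

tree = minimum_spanning_tree(profiles)
for u, v, d in tree:
    print(f"{u} -- {v}: {d} allele difference(s)")
```

A cgMLST or wgMLST scheme works the same way with hundreds or thousands of loci per profile; only the length of each tuple changes.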
  54. High quality, curated databases for Functional Genomics & Antimicrobial Resistance
      ARESdb Antibiotic Resistance DB: ARESdb v1 from ARES-Genetics.
      - 2,889 genes / 2,634 proteins
      - 746 novel point mutations
      - 25 drug resistance classes
      - 43,999 AMR performance indicators for each marker
      - Markers span 111 species and 21 genera
      The data was obtained from WGS of 11,000 drug resistant clinical isolates collected globally over a period of 10 years from over 200 clinical labs. All isolates were fully sequenced and subjected to CLSI-standardized testing for drug resistance. Licensed exclusively to QIAGEN, and available to CLC Microbial Genomics Module users for Research Use Only.
      QIAGEN Microbial Insight DB: QMI-AR is an integrated database for AMR genomics, built from four publicly available databases: CARD, ResFinder, ARG-ANNOT, and NCBI’s AMRFinder.
      - 3,827 peptide markers
      - 5,819 genes / proteins
      - 755 citable references
      - 12 species-specific SNP/PointFinder databases
      QMI-AR was created by deduplicating all four databases, and then applying a common ontology (ARO) to all entries. Multiple tools allow QMI-AR to be used with both bacterial isolates and microbiome / metagenomics samples.
      Additional Integrated Resources: Direct download of multiple established public databases
      - Drug Resistance Databases: CARD*, ResFinder, ARG-ANNOT
      - Virulence Factor Databases: VirDB
      - Functional Genomics Protein Databases: COG Protein DB, SwissProt, UniRef50, Gene Ontology
      - Microbiome Profiling: OTU databases for 16S profiling (SILVA, GreenGenes, UNITE); Whole Genome Databases for shotgun metagenomics (RefSeq, NCBI Pathogen Detection Project)
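The slide says QMI-AR was built by deduplicating four source databases, but not how duplicates were identified. The sketch below shows one simple possibility, assumed here purely for illustration: treat two entries as duplicates when their normalized protein sequences are identical, and keep track of every name and source that pointed to that sequence. The records are invented placeholder strings, not real database entries, and the ontology-mapping step is not shown.

```python
from __future__ import annotations

import hashlib


def sequence_key(protein_sequence: str) -> str:
    """Hash of the sequence after removing whitespace, case, and a trailing stop '*'."""
    normalized = "".join(protein_sequence.split()).upper().rstrip("*")
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


def deduplicate(records: list[tuple[str, str, str]]) -> dict[str, dict[str, set[str]]]:
    """Merge (source, name, sequence) records that share an identical sequence."""
    merged: dict[str, dict[str, set[str]]] = {}
    for source, name, sequence in records:
        entry = merged.setdefault(sequence_key(sequence), {"names": set(), "sources": set()})
        entry["names"].add(name)
        entry["sources"].add(source)
    return merged


# Invented placeholder records; sequences are NOT real proteins.
records = [
    ("CARD", "TEM-1", "MSIQHFRVAL"),
    ("ResFinder", "blaTEM-1", "msiqhfrval*"),
    ("ARG-ANNOT", "tet(A)", "MKPNRPLIVI"),
]

merged = deduplicate(records)
for entry in merged.values():
    print(sorted(entry["names"]), "from", sorted(entry["sources"]))
```

Exact-match hashing will miss near-identical variants, so a production curation effort would likely add a clustering or alignment step; that choice is outside what the slide describes.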
  55. 55. Sample to Insight Coming in Q1, 2020 CLC Long Read Analysis Plugin
  56. 56. Sample to Insight New Long-Read Analysis Plugin – available to all CLC Genomics Workbench users
      Support for PacBio data
      The new plugin will include support for sequence reads produced by Pacific Biosciences™ (PacBio) instruments. The existing PacBio de novo assembler and polishing tools will be moved from the Genome Finishing Module into this new plugin. The plugin will support:
      - De novo assembly with PacBio reads
      - Polishing with Illumina reads
      - The use of PacBio reads or contigs as scaffolds for an Illumina assembly
      Support for Oxford Nanopore data
      The new plugin will include the following capabilities for sequence reads produced on Oxford Nanopore Technologies™ (ONT) instruments:
      - De novo assembly with ONT reads
      - Polishing of ONT assemblies with Illumina reads
      - The use of ONT reads or contigs as scaffolds for an Illumina assembly
      Future Development
      In the future, we will add capabilities to the CLC tools to further support long-read sequencing, including:
      - Optimization of the MGM Taxonomic profiling tools for ONT and PacBio data
      - Full-length 16S OTU clustering with error-prone long reads
      - Structural variant calling and mapping of eukaryotic genomes using ONT and PacBio data
      The new Long-Read Analysis Plugin will be available in January 2020.
  57. 57. Sample to Insight THANK YOU! Jonathan Jacobs, PhD Director, Global Product Management, Genomic Analysis QIAGEN BIOX @bioinformer
  58. 58. Sample to Insight REFERENCES Best Practices for Building an End-to-End Workflow for Microbial Genomics
  59. 59. Sample to Insight 15 Critical Papers for Setting up Microbial Genomics Workflows for Production Labs
      1. Angers-Loustau, Alexandre, Mauro Petrillo, Johan Bengtsson-Palme, Thomas Berendonk, Burton Blais, Kok-Gan Chan, Teresa M. Coque, et al. “The Challenges of Designing a Benchmark Strategy for Bioinformatics Pipelines in the Identification of Antimicrobial Resistance Determinants Using next Generation Sequencing Technologies.” F1000Research 7 (December 7, 2018): 459. https://doi.org/10.12688/f1000research.14509.2.
      2. Blauwkamp, Timothy A., Simone Thair, Michael J. Rosen, Lily Blair, Martin S. Lindner, Igor D. Vilfan, Trupti Kawli, et al. “Analytical and Clinical Validation of a Microbial Cell-Free DNA Sequencing Test for Infectious Disease.” Nature Microbiology, February 11, 2019. https://doi.org/10.1038/s41564-018-0349-6.
      3. Bogaerts, Bert, Raf Winand, Qiang Fu, Julien Van Braekel, Pieter-Jan Ceyssens, Wesley Mattheus, Sophie Bertrand, Sigrid C. J. De Keersmaecker, Nancy H. C. Roosens, and Kevin Vanneste. “Validation of a Bioinformatics Workflow for Routine Analysis of Whole-Genome Sequencing Data and Related Challenges for Pathogen Typing in a European National Reference Center: Neisseria Meningitidis as a Proof-of-Concept.” Frontiers in Microbiology 10 (March 6, 2019): 362. https://doi.org/10.3389/fmicb.2019.00362.
      4. Brown, Eric, Uday Dessai, Sherri McGarry, and Peter Gerner-Smidt. “Use of Whole-Genome Sequencing for Food Safety and Public Health in the United States.” Foodborne Pathogens and Disease 16, no. 7 (July 2019): 441–50. https://doi.org/10.1089/fpd.2019.2662.
      5. Couto, Natacha, Leonard Schuele, Erwin C. Raangs, Miguel P. Machado, Catarina I. Mendes, Tiago F. Jesus, Monika Chlebowicz, et al. “Critical Steps in Clinical Shotgun Metagenomics for the Concomitant Detection and Typing of Microbial Pathogens.” Scientific Reports 8, no. 1 (December 2018). https://doi.org/10.1038/s41598-018-31873-w.
      6. Gargis, Amy S., Lisa Kalman, and Ira M. Lubin. “Assuring the Quality of Next-Generation Sequencing in Clinical Microbiology and Public Health Laboratories.” Edited by C. S. Kraft. Journal of Clinical Microbiology 54, no. 12 (December 2016): 2857–65. https://doi.org/10.1128/JCM.00949-16.
      7. Kozyreva, Varvara K., Chau-Linda Truong, Alexander L. Greninger, John Crandall, Rituparna Mukhopadhyay, and Vishnu Chaturvedi. “Validation and Implementation of Clinical Laboratory Improvements Act-Compliant Whole-Genome Sequencing in the Public Health Microbiology Laboratory.” Edited by Daniel J. Diekema. Journal of Clinical Microbiology 55, no. 8 (August 2017): 2502–20. https://doi.org/10.1128/JCM.00361-17.
      8. Maljkovic Berry, Irina, Melanie C Melendrez, Kimberly A Bishop-Lilly, Wiriya Rutvisuttinunt, Simon Pollett, Eldin Talundzic, Lindsay Morton, and Richard G Jarman. “Next Generation Sequencing and Bioinformatics Methodologies for Infectious Disease Research and Public Health: Approaches, Applications, and Considerations for Development of Laboratory Capacity.” The Journal of Infectious Diseases, October 14, 2019, jiz286. https://doi.org/10.1093/infdis/jiz286.
      9. Meehan, Conor J., Galo A. Goig, Thomas A. Kohl, Lennert Verboven, Anzaan Dippenaar, Matthew Ezewudo, Maha R. Farhat, et al. “Whole Genome Sequencing of Mycobacterium Tuberculosis: Current Standards and Open Issues.” Nature Reviews Microbiology 17, no. 9 (September 2019): 533–45. https://doi.org/10.1038/s41579-019-0214-5.
      10. Portmann, Anne-Catherine, Coralie Fournier, Johan Gimonet, Catherine Ngom-Bru, Caroline Barretto, and Leen Baert. “A Validation Approach of an End-to-End Whole Genome Sequencing Workflow for Source Tracking of Listeria Monocytogenes and Salmonella Enterica.” Frontiers in Microbiology 9 (March 14, 2018): 446. https://doi.org/10.3389/fmicb.2018.00446.
      11. Quainoo, Scott, Jordy P. M. Coolen, Sacha A. F. T. van Hijum, Martijn A. Huynen, Willem J. G. Melchers, Willem van Schaik, and Heiman F. L. Wertheim. “Whole-Genome Sequencing of Bacterial Pathogens: The Future of Nosocomial Outbreak Analysis.” Clinical Microbiology Reviews 30, no. 4 (October 2017): 1015–63. https://doi.org/10.1128/CMR.00016-17.
      12. Rossen, J.W.A., A.W. Friedrich, and J. Moran-Gilad. “Practical Issues in Implementing Whole-Genome-Sequencing in Routine Diagnostic Microbiology.” Clinical Microbiology and Infection 24, no. 4 (April 2018): 355–60. https://doi.org/10.1016/j.cmi.2017.11.001.
      13. Schlaberg, Robert, Charles Y. Chiu, Steve Miller, Gary W. Procop, George Weinstock, the Professional Practice Committee and Committee on Laboratory Practices of the American Society for Microbiology, and the Microbiology Resource Committee of the College of American Pathologists. “Validation of Metagenomic Next-Generation Sequencing Tests for Universal Pathogen Detection.” Archives of Pathology & Laboratory Medicine 141, no. 6 (June 2017): 776–86. https://doi.org/10.5858/arpa.2016-0539-RA.
      14. Simner, Patricia J, Steven Miller, and Karen C Carroll. “Understanding the Promises and Hurdles of Metagenomic Next-Generation Sequencing as a Diagnostic Tool for Infectious Diseases.” Clinical Infectious Diseases 66, no. 5 (February 15, 2018): 778–88. https://doi.org/10.1093/cid/cix881.
      15. Su, Michelle, Sarah W. Satola, and Timothy D. Read. “Genome-Based Prediction of Bacterial Antibiotic Resistance.” Edited by Alexander J. McAdam. Journal of Clinical Microbiology 57, no. 3 (October 31, 2018): e01405-18. https://doi.org/10.1128/JCM.01405-18.
