The path to implementation of Whole Genome Sequencing (WGS) in PulseNet

http://www.fao.org/about/meetings/wgs-on-food-safety-management/en/

The path to implementation of WGS in PulseNet. Presentation from the Technical Meeting on the impact of Whole Genome Sequencing (WGS) on food safety management and GMI-9, 23-25 May 2016, Rome, Italy.

  1. The path to implementation of WGS in PulseNet
     Peter Gerner-Smidt, MD, DSc, Enteric Diseases Laboratory Branch
     Division of Foodborne, Waterborne, and Environmental Diseases, National Center for Emerging and Zoonotic Infectious Diseases
     GMI9, Rome, Italy, May 23-25, 2016
  2. PulseNet International
     The international subtyping network of national and regional networks for foodborne disease surveillance
     “Saving Lives Since 2000”
     http://www.pulsenetinternational.org/
  3. Whole Genome Sequencing (WGS) is a Transforming and REPLACING Technology
     • Consolidating multiple laboratory workflows into one:
       – Identification, serotyping, virulence profiling, antimicrobial resistance characterization, plasmid characterization, subtyping
     • Replacing, NOT supplementing, current methods
     • More precise, more informative, more cost-efficient
  4. WGS in Public Health: The analytical tools must be
     • Simple
       – Public health microbiologists are NOT bioinformaticians
       – Standard desktop software
     • Comprehensive
       – All characterization, including analysis, in one workflow
     • Working in a network of laboratories, i.e. STANDARDIZED
       – Free sharing and comparison of data between labs
       – Central and local analysis
  5. MLST vs SNP

     | | SNP | MLST |
     |---|---|---|
     | Epidemiological concordance | High | High |
     | Stable nomenclature | (No) | Yes |
     | Reference characterization: identification, serotyping, virulence & resistance markers | No | Yes |
     | Speed | Slow SNP calling, slow analysis | Slow allele calling, fast analysis |
     | Local computing requirements | Medium-High | Low |
     | Local bioinformatics expertise | Yes | No |
     | Reference used to perform analysis | Sequence of closely related annotated strain | Allele database |
     | Requires curation | No | (Yes) |

     MLST is the primary approach for public health surveillance; SNP is used if more detail is needed or MLST fails
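
The "fast analysis" entry for MLST can be illustrated with a small sketch: once alleles have been called against the allele database, comparing two isolates comes down to comparing integer allele numbers locus by locus. The locus names, allele numbers and the rule of skipping loci that are missing in either profile are assumptions made for illustration; the slide does not specify them.

```python
from typing import Dict, Optional

# Hypothetical allele profiles: locus name -> allele number (None = locus not called).
Profile = Dict[str, Optional[int]]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ.

    Assumption: loci missing or uncalled in either profile are ignored.
    """
    distance = 0
    for locus, allele_a in a.items():
        allele_b = b.get(locus)
        if allele_a is None or allele_b is None:
            continue  # skip loci that cannot be compared
        if allele_a != allele_b:
            distance += 1
    return distance


isolate_1: Profile = {"locus_1": 2, "locus_2": 18, "locus_3": 20, "locus_4": 41}
isolate_2: Profile = {"locus_1": 2, "locus_2": 19, "locus_3": None, "locus_4": 41}
isolate_3: Profile = {"locus_1": 2, "locus_2": 18, "locus_3": 21, "locus_4": 40}

print(allelic_distance(isolate_1, isolate_2))  # one comparable locus differs
print(allelic_distance(isolate_1, isolate_3))  # two loci differ
```

How uncalled loci are treated affects the distances, so a real scheme would need to fix that rule explicitly.
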
  6. Listeria 1403MLGX6-1WGS: wgMLST and hqSNP Are Equally Discriminatory and Phylogenetic Trees Are Concordant
     [Figure: hqSNP tree and wgMLST tree (<All Characters>) of the same isolates (State 1 isolate; State 2 isolates 1, 2 and 3; State 3 isolate; 2013 isolate, nearest neighbor), shown with wgMLST allele calls for loci LMO_1, LMO_4, LMO_5, LMO_6, LMO_7 and LMO_10.]
  7. Trees ~ Tables
     [Figure: line list of human isolates of serotype Enteritidis (PFGE-XbaI pattern JEGX01.0009, outbreak code 1507MLJEG-3) with columns Key, SourceState, Serotype, PFGE-XbaI-pattern, PFGE-XbaI-status, PFGE-BlnI-pattern, PFGE-BlnI-status, Outbreak, SourceCounty, SourceCity, SourceCountry, SourceType, SourceSite, PatientAge, PatientSex, IsolatDate, ReceivedDate and UploadDate, shown next to a phylogenetic tree of the FDA000094xx, 2015K-096x and PNUSAS000xxx isolates (scale bar 0.001).]
  8. Definitive phylogenetically relevant naming of WGS profiles: “SNP Address”
     [Figure: hierarchical tree with clusters numbered at three levels (1-2, 1-3 and 1-6), giving the addresses 1.1.1, 1.2.2, 1.2.4, 1.2.3, 2.3.5 and 2.3.6.]
     Courtesy Tim Dallman, PHE
  9. “SNP address”
     • Hierarchical clustering based on the full pairwise distances between genomes
     • Used to assign a SNP address to a strain based on a specified index of distance thresholds, e.g. 50:25:10:5:0
     • Can be used for surveillance purposes
     PulseNet International will use MLST: “Allele Code”
     Courtesy Tim Dallman, PHE
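
How an address could be derived from a pairwise distance matrix can be sketched as follows. This is a minimal sketch, not PHE's implementation: it assumes single-linkage clustering at each threshold and numbers clusters in order of first appearance. The function names and the toy distance matrix are hypothetical.

```python
from typing import List, Sequence


def single_linkage_labels(dist: Sequence[Sequence[int]], threshold: int) -> List[int]:
    """Label strains by single-linkage cluster: two strains share a cluster if a
    chain of pairwise distances <= threshold connects them (assumed linkage rule)."""
    n = len(dist)
    parent = list(range(n))  # union-find forest

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)

    labels = {}
    result = []
    for i in range(n):
        root = find(i)
        if root not in labels:
            labels[root] = len(labels) + 1  # number clusters by first appearance
        result.append(labels[root])
    return result


def snp_addresses(dist: Sequence[Sequence[int]],
                  thresholds: Sequence[int] = (50, 25, 10, 5, 0)) -> List[str]:
    """Join the cluster label at each threshold into a dotted address per strain."""
    per_level = [single_linkage_labels(dist, t) for t in thresholds]
    return [".".join(str(level[i]) for level in per_level) for i in range(len(dist))]


# Toy pairwise SNP distances for four hypothetical strains.
toy_dist = [
    [0, 0, 4, 60],
    [0, 0, 4, 60],
    [4, 4, 0, 62],
    [60, 60, 62, 0],
]
addresses = snp_addresses(toy_dist)
print(addresses)  # ['1.1.1.1.1', '1.1.1.1.1', '1.1.1.1.2', '2.2.2.2.3']
```

Strains that share a long address prefix are close at the coarser thresholds and diverge only at the finer ones, which is what makes the address usable as a surveillance label.
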
  10. Considerations for a phylogenetically relevant strain nomenclature system
     • Must be simple
       – Sequence of numbers
     • Stability of system
       – Fit new sequences into an existing tree?
       – Recalculate the clusters with every new entry?
     • No matter which method is used, the stability can be controlled
     • < 2% risk that a new sequence cannot be fitted unambiguously into the nomenclatural system
     • Cutoffs between levels
     • Clustering algorithm
       – Single linkage? UPGMA?
  11. WGS Data Workflow
     [Diagram; components as labeled on the slide:]
     • Sequencer: raw sequences
     • LIMS
     • Temporary storage, QA/QC, data extraction (trimming, mapping, de novo assembly, SNP detection, allele detection): NO metadata
     • Allele & Allele code databases (allele names, Allele code (strain names)): NO metadata
     • Public health databases: extensive metadata
     • External storage (NCBI, ENA): limited metadata
     • Database managers and end users
     • Data labels: 7-gene MLST, allelic profile, cgMLST, ST, wgMLST, Allele Code, (SNPs)
  12. Acknowledgements
     • Public Health Agency of Canada
     • Institut Pasteur: S. Brisse, M. Lecuit
     • Center for Genomic Epidemiology, DTU
     • University of Oxford: M. Maiden, K. Jolley
     • Public Health England: T. Dallman
     Disclaimers:
     “The findings and conclusions in this presentation are those of the author and do not necessarily represent the official position of the Centers for Disease Control and Prevention”
     “Use of trade names is for identification only and does not imply endorsement by the Centers for Disease Control and Prevention or by the U.S. Department of Health and Human Services.”
  13. Hierarchical Nomenclature Is Inherently Unstable
     • As we use approximate matching to group strains, equality is no longer transitive. Given strains A, B and C with distances as indicated, at distance cutoff 21, A, B and C would be in the same cluster.
     • However, if B has not been sampled yet, A and C would not be in the same cluster
     • How bad is it?
     [Figure: strains A, B and C with pairwise distances 13, 17 and 28. For the statements above to hold, the A to C distance is 28, so A and C are linked only through B.]
     Courtesy: Hannes Pouseele, Applied Maths
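
The three-strain example can be checked with a few lines of code. This sketch assumes single-linkage grouping and assumes A to B = 13 and B to C = 17; the flattened figure does not show which of the two shorter distances belongs to which pair, and the outcome is the same either way. The helper name `clusters` is hypothetical.

```python
from itertools import combinations

# Pairwise distances from the slide; the A-B / B-C assignment is an assumption.
DIST = {
    frozenset(("A", "B")): 13,
    frozenset(("B", "C")): 17,
    frozenset(("A", "C")): 28,
}


def clusters(strains, dist, cutoff):
    """Single-linkage clusters: merge groups whenever any pair is within the cutoff."""
    groups = [{s} for s in strains]
    for a, b in combinations(strains, 2):
        if dist[frozenset((a, b))] <= cutoff:
            group_a = next(g for g in groups if a in g)
            group_b = next(g for g in groups if b in g)
            if group_a is not group_b:
                groups.remove(group_b)
                group_a |= group_b
    return groups


with_b = clusters(["A", "B", "C"], DIST, cutoff=21)
without_b = clusters(["A", "C"], DIST, cutoff=21)
print(with_b)     # one cluster containing A, B and C
print(without_b)  # two clusters: A alone and C alone
```

B acts as a bridge: A and C are 28 apart, above the cutoff, and end up together only because each is within 21 of B.
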
  14. Cutoff determination (case: PulseNet Listeria cgMLST database, N = 3,652)
     • Test procedure: find the cutoffs that give minimal name changes, starting from nothing and adding strains chronologically
     • Thresholds: 150:100:63:41:21:11
     Courtesy: Hannes Pouseele, Applied Maths
  15. Stability Assessment
     • Test 1: starting from nothing, add samples chronologically
     • Test 2: starting from a random subset (50%), add samples chronologically
     • Using a precalculated strain nomenclature structure based on what is known today reduces nomenclature instability (name changes) beyond what is expected (in this case, a 50% reduction)
     • The cutoff of 21 allelic changes might not be stable enough

     | Threshold | % change, Test 1 | % change, Test 2 |
     |---|---|---|
     | 11 | 1.01% | 0.30% |
     | 21 | 2.51% | 1.64% |
     | 41 | 2.51% | 0.57% |
     | 63 | 1.37% | 0.27% |
     | 100 | 2.52% | 0.03% |
     | 150 | 0.22% | 0% |

     Courtesy: Hannes Pouseele, Applied Maths
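
The slide does not define exactly how a name change is counted. One plausible reading of Test 1 is sketched below under stated assumptions: strains are added in chronological order, a new strain takes the name of any existing cluster within the cutoff, and when it links several clusters they merge under the lowest name, which renames the strains in the others. The function name and the counting rule are hypothetical; the toy distances reuse the A, B, C example from slide 13, again assuming A to B = 13 and B to C = 17.

```python
from typing import List, Sequence, Tuple


def chronological_name_changes(dist: Sequence[Sequence[int]],
                               cutoff: int) -> Tuple[List[int], float]:
    """Add strains in index order; return final names and the fraction of
    strains renamed at least once (assumed definition of 'name change')."""
    names: List[int] = []   # names[i] = current cluster name of strain i
    renamed = set()         # strains whose name changed after assignment
    next_name = 1
    for new in range(len(dist)):
        # Names of existing clusters that the new strain falls within.
        hits = sorted({names[old] for old in range(new) if dist[new][old] <= cutoff})
        if not hits:
            names.append(next_name)  # no match: open a new cluster
            next_name += 1
            continue
        keep, merged = hits[0], set(hits[1:])
        for old in range(new):
            if names[old] in merged:  # bridged cluster: rename its members
                names[old] = keep
                renamed.add(old)
        names.append(keep)
    return names, len(renamed) / len(dist)


# Strains in sampling order A, C, B: A-C = 28, A-B = 13, C-B = 17.
order_acb = [
    [0, 28, 13],
    [28, 0, 17],
    [13, 17, 0],
]
names_21, changed_21 = chronological_name_changes(order_acb, cutoff=21)
names_11, changed_11 = chronological_name_changes(order_acb, cutoff=11)
print(names_21, changed_21)  # C is renamed when B arrives and bridges A and C
print(names_11, changed_11)  # no strain is within 11 of another, so nothing is renamed
```

Running such a count for each candidate cutoff over a real, dated collection would give a table of the kind shown on the slide, though the percentages there come from the authors' own procedure.
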
  16. Stability Assessment: Conclusions
     MLST-based hierarchical strain nomenclature is feasible
     • Stability is good
       – Without the cutoff of 21 allelic changes, less than 1.17% name changes
     • Stability can be further increased by defining a broad starting set
       – Using a more international collection of strains
       – Using biological knowledge about the population structure of L. monocytogenes
     • Computational feasibility
       – Names can be assigned one sample at a time; no need for complete recalculations
     • wgMLST instead of cgMLST yields extremely similar results
     Courtesy: Hannes Pouseele, Applied Maths
