Sequencing platforms, Bioinformatic search tools
and Databases, Experience Feedback
Dr Michel-Yves MISTOU,
Head of Foodborne Pathogen department,
Laboratory for Food Safety, Anses
The 10th Expert Group on BioPesticides
Seminar on “Bioinformatics and regulation of microbial pesticides”, OECD, Paris, France Monday 24th June 2019
A strong marketing strategy to accommodate the diversity of uses
Instrument   Reads         Output   Price
iSeq 100     4 x 10^6      1.2 Gb   17 k€
MiniSeq      25 x 10^6     7.5 Gb
MiSeq        25 x 10^6     15 Gb
NextSeq      400 x 10^6    120 Gb
NextSeq      0.4 x 10^9    0.1 Tb
HiSeq X      5 x 10^9      1.5 Tb
HiSeq 4000   6 x 10^9      1.8 Tb
NovaSeq      20 x 10^9     6 Tb     900 k€
Scale: benchtop instruments > production platforms
Illumina, The dominant short-reads technology
• 75-90% market share
• Short reads (50-300 bp) / Sequencing by synthesis / Paired-end sequences
• Ion Torrent (PGM, Proton instruments) is not a strong competitor: more hands-on time / smaller user community / higher cost per Mb
Reference               Instrument        Error rate (%)
Fox et al., 2014        HiSeq 2000        0.1
Fox et al., 2014        MiSeq             0.1
Dohm et al., 2008       1G                0.3
May et al., 2015        MiSeq             0.21
Kelley et al., 2010     MiSeq             0.21-2.6
Pfeiffer et al., 2018   HiSeq / NextSeq   0.24
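To relate these percentages to the quality scores found in FASTQ files, a per-base error rate p can be converted to a Phred score with Q = -10 log10(p). The short sketch below applies this standard formula to some of the error rates listed above; the function name is only illustrative.

```python
import math

def phred_quality(error_rate_percent: float) -> float:
    """Convert a per-base error rate given in percent to a Phred quality score."""
    p = error_rate_percent / 100.0
    return -10.0 * math.log10(p)

# Error rates (%) taken from the slide; labels are for display only
reported = {
    "HiSeq 2000 (Fox et al., 2014)": 0.1,
    "1G (Dohm et al., 2008)": 0.3,
    "MiSeq (May et al., 2015)": 0.21,
    "HiSeq / NextSeq (Pfeiffer et al., 2018)": 0.24,
}

for label, rate in reported.items():
    print(f"{label}: {rate}% -> Q{phred_quality(rate):.1f}")
```

An error rate of 0.1% corresponds to Q30, the threshold commonly quoted for Illumina data.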
Could BGI become a competitor of Illumina ?
A modest gap in quality remains with the NovaSeq instrument
“The price points that buyers would consider tradeoffs
between less widely adopted and (slightly) less accurate
BGISEQ genomes in exchange for better economics.”
BGI, Beijing Genomics Institute: « A global genomics organisation »
Started developing sequencers in 2012
Comparison of BGISEQ 500 to Illumina NovaSeq Data
https://blog.dnanexus.com/2018-07-02-comparison-of-bgiseq-500-to-illumina-novaseq-data/
May 2019: BGI Europe is sued (in DK, GER) by Illumina Inc. for patent infringement (sequencing chemistry reagents)
Sequencing a genome, how much does it cost?
A complex question that depends on the context:
Internally - Investment (depreciation costs / maintenance contracts / staff costs)
Externally - Service provider (less flexibility)
The choice has to be made depending on the
• Volume of strains to sequence: X per unit of time (wk/m/y)
• Deadline for results delivery: emergency – flexibility has a cost
> Organisational strategy (staff resources etc. …)
As an example, at the LFS (Anses) the choice was made to work with a service provider (public tender)
Example, cost per bacterial genome (2.5-5 Mb):
45 € for 384 samples - 63 € (192) - 85 € (96) - 103 € (32) - 402 € (4) - 1610 € (1)
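The price list makes the cost of flexibility easy to quantify. The sketch below reads the figures as a price per genome for each batch size (an assumption about how the tender prices are applied) and derives the batch totals and the premium paid for sequencing a single urgent isolate.

```python
# Price per genome (EUR) by number of samples submitted together, from the slide
price_per_genome = {384: 45, 192: 63, 96: 85, 32: 103, 4: 402, 1: 1610}

def batch_cost(n_samples: int) -> int:
    """Total cost of one batch, for the batch sizes quoted on the slide."""
    return n_samples * price_per_genome[n_samples]

def premium(n_small: int, n_large: int) -> float:
    """How many times more a genome costs in a small batch than in a large one."""
    return price_per_genome[n_small] / price_per_genome[n_large]

for n in sorted(price_per_genome):
    print(f"{n:>3} samples: {price_per_genome[n]:>4} EUR/genome, "
          f"{batch_cost(n):>5} EUR per batch")

print(f"Single genome vs 384-sample batch: x{premium(1, 384):.1f} per genome")
```

Under this reading, one urgent genome costs roughly 36 times more than a genome sequenced in a full 384-sample batch.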
Main conclusions on short-read technology
• Short-read technologies are well suited for high-throughput applications. Illumina instruments allow rapid, cheap and accurate whole-genome bacterial analyses.
• Short-read technology does not usually enable complete DNA molecule assembly > fragmented chromosome (contigs/scaffolds) / plasmid reconstruction
Long-read sequencing technologies
Pacific Biosciences - PacBio (Sequel System)
• SMRT error rate: 11-15% (indels)
• Non-portable / reference-grade bacterial assemblies
• Platform instrument (500-700 k€)
• ~ 2000 € / genome on a public platform
• Planned $1.2 billion merger with Illumina
Oxford Nanopore Technologies - ONT (MinION)
• Error rate: 5-40%
• Highly portable: 10-30 Gb / 12x
• E. coli chromosomes > 1 contig
• ~ 300-400 € / genome
• The current cost and high error rates do not make long-read technologies suitable for standard microbial genomics
• Long-read technologies are used to refine assemblies > Reference strains / plasmidome
Phylogenetic analysis of Salmonella Derby isolates in France – Use of a PacBio-sequenced reference genome
Sevellec et al., Front. Microbiol., 2017
Sevellec et al., Genome Announc., 2017
• 1 serovar: 6 genetic clades
• Host-clade specific associations
The use of a closed Salmonella Derby reference genome increased the resolution of the WGS analysis
WGS analysis - For what purposes?
• Official missions at LFS: epidemiological surveillance, strain typing and contribution to foodborne outbreak (FBO) investigation.
• Typing > classification
• Epidemiology, short- and long-term
• Outbreak investigation > traceback
> Genomic information is often used, as in forensics, to distinguish isolates from each other. Phylogenetic reconstruction (SNP vs cg/wgMLST)
• Characterize hazards associated with pathogenic isolates
> Phenotypic properties of isolates: AMR / virulence / persistence / metabolic capacities > Refined risk assessment / evolution of regulation
Basic steps of analysis for phylogenetic reconstruction
[Workflow diagram]
Reads: Quality check 1 > Trimming > Quality check 2 > Assembly > Assembly quality checks
Nucleotide level / character-based: Reference genome > SNP calling > Phylogenetic reconstruction
Gene level / distance-based: Annotation > Genes / Alleles DB > cgMLST; K-mer
Two main approaches to establish genetic relatedness: SNP-based and cg/wgMLST
SNP versus gene-by-gene (cg/wgMLST) approaches for bacterial typing
METHOD / Pros / Cons
SNP
• Discriminatory power: coding & non-coding genomic regions are considered
• No standardisation
• Analysis is pipeline / parameters / reference genome / collection dependent
• Core genome only
cg/wgMLST
• Universal nomenclature (potential for)
• Less sensitive to homologous recombination
• Fast (after assembly)
• Centralized allele server (data sharing!)
• One scheme per species (on which everybody agrees)
• Coding sequences only
• Curation of new alleles
cg/wgMLST is potentially better suited to global real-time epidemiological surveillance, while SNP analysis is better suited to situations where strong discriminatory power is required.
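The difference in discriminatory power between the two approaches can be illustrated with a toy example. The sketch below is not taken from any of the pipelines discussed here: the sequences, locus names and allele numbers are invented. It counts nucleotide differences in an alignment on one side and differing alleles in a cgMLST-style profile on the other.

```python
def snp_distance(seq_a: str, seq_b: str) -> int:
    """Number of differing positions between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)

def allele_distance(profile_a: dict, profile_b: dict) -> int:
    """Number of loci, called in both profiles, whose allele numbers differ."""
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(1 for locus in shared_loci if profile_a[locus] != profile_b[locus])

# Hypothetical data: one gene carrying two SNPs between isolates 1 and 2
gene_iso1 = "ATGGCTAAAGGT"
gene_iso2 = "ATGGCTAAGGGC"

# Hypothetical allele profiles: the two SNPs produce a single new allele at locus_1
profile_iso1 = {"locus_1": 1, "locus_2": 4}
profile_iso2 = {"locus_1": 2, "locus_2": 4}

print("SNP distance:", snp_distance(gene_iso1, gene_iso2))
print("Allele distance:", allele_distance(profile_iso1, profile_iso2))
```

Two mutations in the same gene count as two SNPs but as a single allele difference. This is one reason gene-by-gene typing is less affected by recombination events that introduce several changes at once, and also why it discriminates less finely.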
Access to WGS analysis tools
• Open source
A wide variety of pipelines (with or without GUI) are available
• Commercial software
BioNumerics, Ridom SeqSphere+, CLCBio: offer cgMLST/wgMLST for a number of species
> BioNumerics (acquired by bioMérieux) is widely used in the reference laboratory community
• Genomic portals
Enterobase / CGE / NCBI Pathogens / PATRIC / BioCyc …
Open-source pipelines for processing and analysis of WGS data
Complex assemblies of published open-source programs, accessible to and recognized by the scientific community > constantly evolving
Many examples of open-source pipelines:
• CSI Phylogeny (DTU, DK), online version
• PHEnix pipeline (PHE, UK)
• CFSAN SNP Pipeline (FDA, US)
• ArtWork (Anses, FR)
• chewBBACA (Lisbon University, POR), wgMLST
• ASA3P (Justus Liebig University, GER)
• TORMES (Burgos University, SP)
First gene-ontology enrichment analysis based on bacterial coregenome variants: insights into adaptations of Salmonella serovars to mammalian- and avian-hosts. Felten et al., 2017. doi: 10.1186/s12866-017-1132-1
• In-house bioinformatics skills are required for development, installation, maintenance and running
• Unix system and storage/computing infrastructure
• Three SNP-calling pipelines of major institutes (DTU, PHE,
FDA) tested on the same dataset
• Global epidemiological concordance, however
• Highly different SNP distances returned : 2-4 / 3-6 / 12-24
Benchmarking of SNP calling pipelines
Comparison of SNP-based subtyping workflows for bacterial isolates using WGS data, applied to Salmonella enterica serotype Typhimurium and serotype 1,4,[5],12:i:-. Saltykova et al., 2018. doi: 10.1371/journal.pone.0192504
[Figure panels: CSI (DTU, DK) / PHEnix (PHE, UK) / CFSAN (FDA, US)]
> Results are sensitive to data quality, collection of isolates, pipelines, parameterization … A universal SNP distance cutoff value to decide whether isolates are linked cannot be determined
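To see why a single cutoff does not transfer between pipelines, the sketch below applies one arbitrary threshold to the three SNP distance ranges quoted above. The cutoff of 5 SNPs is purely hypothetical, and the labels A/B/C are placeholders that do not assign a range to a named pipeline.

```python
# SNP distance ranges returned for the same dataset by the three pipelines.
# Labels A/B/C are placeholders: no mapping to a named pipeline is implied.
reported_ranges = {
    "pipeline A": (2, 4),
    "pipeline B": (3, 6),
    "pipeline C": (12, 24),
}

def verdict(snp_range: tuple, cutoff: int) -> str:
    """Classify a range of pairwise SNP distances against a fixed cutoff."""
    low, high = snp_range
    if high <= cutoff:
        return "linked"
    if low > cutoff:
        return "not linked"
    return "ambiguous"

HYPOTHETICAL_CUTOFF = 5  # illustrative value only, not a recommended threshold

verdicts = {name: verdict(r, HYPOTHETICAL_CUTOFF) for name, r in reported_ranges.items()}
for name, result in verdicts.items():
    print(f"{name}: {reported_ranges[name]} SNPs -> {result}")
```

The same isolates would be called linked, ambiguous or not linked depending on the pipeline, even though the three pipelines agree on the overall epidemiological picture.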
Genome analysis to solve epidemiological questions
[Workflow diagram]
Epidemiological question > Build relevant collection of isolates (case / control) > Sequencing > WGS analysis > Phylogenetic tree / Distance matrix / Clustering > Genetic relatedness between isolates
Genetic relatedness + EpiData: use the genetic relatedness between isolates together with epidata to help solve the epidemiological questions
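The last steps of this workflow (distance matrix, clustering, confrontation with epidata) can be sketched in a few lines. The example below uses single-linkage clustering at a fixed threshold, one common choice among several. The isolate names, distances, threshold and sources are all invented for illustration.

```python
from itertools import combinations

def single_linkage_clusters(distances: dict, isolates: list, threshold: int) -> list:
    """Group isolates connected by chains of pairwise distances <= threshold."""
    parent = {i: i for i in isolates}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(isolates, 2):
        if distances[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for i in isolates:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical pairwise distances (SNPs or alleles) between four isolates
isolates = ["H1", "H2", "F1", "F2"]
distances = {
    frozenset(("H1", "H2")): 1, frozenset(("H1", "F1")): 3,
    frozenset(("H2", "F1")): 2, frozenset(("H1", "F2")): 40,
    frozenset(("H2", "F2")): 41, frozenset(("F1", "F2")): 39,
}
# Hypothetical epidata: origin of each isolate
epidata = {"H1": "human", "H2": "human", "F1": "food", "F2": "food"}

clusters = single_linkage_clusters(distances, isolates, threshold=5)
mixed = [c for c in clusters if len({epidata[i] for i in c}) > 1]
print("Clusters:", clusters)
print("Clusters with both human and food isolates:", mixed)
```

A cluster mixing human and food isolates is only a lead: it has to be confronted with dates, places and consumption data before a link is concluded.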
Web portals/Databases for epidemiology and characterization
Examples of genomic databases for epidemiological purposes
• Enterobase (https://enterobase.warwick.ac.uk/) (Warwick Medical School, UK): cgMLST and wgMLST for Salmonella, Escherichia/Shigella, Yersinia, Vibrio, Helicobacter, Moraxella
• NCBI Pathogen Detection Isolates Browser (https://www.ncbi.nlm.nih.gov/pathogens/isolates): web-based portal that integrates the genomic sequence, metadata, antibiotic susceptibility and resistance gene information, and the SNP cluster information
• CGE, Center for Genomic Epidemiology (http://www.genomicepidemiology.org/) (DTU, DK): in the process of implementing cgMLST for Campylobacter, E. coli, Listeria, Salmonella, and Yersinia
Examples of databases for genome annotation and characterization
• AMR: ResFinder, CARD, ARG-ANNOT
• Virulence: VFDB
• Salmonella serotyping: SeqSero, SISTR
• MLST: PubMLST
• Plasmids: PlasmidFinder
• Phages: PHASTER
• Metabolic pathways: BioCyc, KEGG
• Manually curated annotation: RAST, PATRIC, MicroScope
Enterobase https://enterobase.warwick.ac.uk/
A genotyping website for selected enteric pathogens
[Figure: overview of GrapeTree features]
GrapeTree: visualization of core genomic relationships among 100,000 bacterial pathogens. Zhou et al., 2018. doi: 10.1101/gr.232397.117
A genomic overview of the population structure of Salmonella. Alikhan et al., 2018. PLoS Genet 14: e1007261.
Use of Enterobase in FBO investigation
• Parallel upload of raw reads to Enterobase by the human (Pasteur Institute) and food (Anses) reference laboratories
• GrapeTree analysis for reconciliation of food and human isolates
• A fast and easy way to compare WGS data between laboratories
• Sequences will be made publicly available
[Figure: SNP analysis and cgMLST analysis; food isolates labelled "Food"]
Insertion of WGS analysis pipelines into the laboratory quality system
Validation of a Bioinformatics Workflow for Routine Analysis of Whole-Genome Sequencing Data and Related Challenges for Pathogen Typing in a European National Reference Center: Neisseria meningitidis as a Proof-of-Concept. Bogaerts et al., Front Microbiol. 2019 Mar 6;10:362. doi: 10.3389/fmicb.2019.00362
A Validation Approach of an End-to-End Whole Genome Sequencing Workflow for Source Tracking of Listeria monocytogenes and Salmonella enterica. Portmann et al., 2018. doi: 10.3389/fmicb.2018.00446
• Sciensano clinical laboratory (BE) / Nestlé Research Center (SWI) / Animal public Health Agency (UK)
• Validation datasets of N. meningitidis, Listeria monocytogenes, Salmonella …
• Repeatability (within run), reproducibility (between runs), accuracy, precision, (diagnostic) sensitivity and (diagnostic) specificity
• cgMLST / AMR / Serotyping
The bioinformatics workflow must be “fit-for-purpose,” defined by the ISO 17025 standard for testing laboratories as providing “confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use are fulfilled” (ISO/IEC 17025:2005)
Some laboratories have already had their internal WGS analysis pipelines accredited
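As a reminder of how such performance criteria are computed, the sketch below uses the textbook confusion-matrix definitions and a simple agreement fraction for replicate runs. The cited validation studies may define some metrics differently, and all counts and calls are invented.

```python
def performance_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix metrics for a binary call (e.g. AMR gene present/absent)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def agreement(run_a: list, run_b: list) -> float:
    """Fraction of identical calls between two replicate runs (repeatability / reproducibility)."""
    if len(run_a) != len(run_b):
        raise ValueError("replicate runs must contain the same number of calls")
    return sum(1 for a, b in zip(run_a, run_b) if a == b) / len(run_a)

# Hypothetical counts from comparing pipeline calls with a reference dataset
metrics = performance_metrics(tp=95, fp=2, tn=98, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Hypothetical allele calls at four loci in two runs of the same sample
print("agreement:", agreement([1, 4, 7, 2], [1, 4, 7, 3]))
```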
Towards an ISO standard for the WGS process of foodborne bacteria
• ISO TC34/SC9 WG25 (2015 - …): in charge of standards on methods for food microbiology
• WG25 has drafted a document: project ISO 23418 “Microbiology of the Food Chain — Whole genome sequencing for typing and genomic characterization of foodborne bacteria — General requirements and guidance”
> This international standard specifies minimum requirements for DNA extraction / sequencing / WGS data analysis / WGS data storage. The document does not go into great technical detail.
> The major components of pipelines should be described in peer-reviewed journals / General steps for validating a bioinformatic pipeline / Validation using reference datasets
• Draft NP 23418 was circulated to SC9 as a committee draft (CD) > currently under review by ISO national representatives > draft international standard (DIS) stage. (CD consultation closes on July 7, 2019)
Some Perspectives
• Trend in bioinformatics tools: towards ease of use (GUI) and scalability > Cloud
• The diversity of bioinformatic tools is puzzling. To ensure comparability of results in a regulatory environment, there is a need to establish SOPs, share common validation datasets and organize ring trials
• Establishment of common WGS databases is crucial for harmonization of procedures (i.e. cgMLST)
• How much change in a genome constitutes a significant change between individual isolates? This requires more empirical and statistical analysis (likelihood ratio) on comprehensive datasets
THANKS FOR YOUR ATTENTION

Overview of the commonly used sequencing platforms, bioinformatic search tools and databases - Michel-Yves Mistou
