João André Carriço, PhD
Microbiology Institute/Institute for Molecular Medicine
Faculty of Medicine, University of Lisbon
Portugal
How to compare typing techniques: do’s and don’ts
http://im.fm.ul.pt
http://imm.fm.ul.pt
http://www.joaocarrico.info
Workshop 20:
Typing of Bacterial Pathogens in 2015:
Expanding the scope of NGS
Conflicts of Interest
NOTHING TO DISCLOSE
Microbial typing
“Crude classifications and False generalizations
are the curse of organized life”
George Bernard Shaw (1856 – 1950)
Microbial Typing:
discriminating strains within a species/subspecies
Typing methods: types / subtypes
Street market, Florence, Italy
How to compare typing methods
Struelens, M.J. et al, 1996. Clinical microbiology and infection, 2(1), pp.2–11.
Performance Criteria:
Typeability
Reproducibility
Stability
Discriminatory power
Epidemiological concordance
Typing System concordance
Convenience Criteria
Typing methods: types / subtypes
PFGE :
PFGE Type (cut-off 80% DICE/UPGMA)
PFGE Subtype (cut-off 80% DICE/UPGMA)
PFT A
PFT B
PFT C
PFT D
PFT E
PFT F
Typing methods: types / subtypes
MLST :
Clonal Complex (goeBURST)
Sequence Type
ST 239 : 2-3-1-1-4-4-3
ST 8 : 3-3-1-1-4-4-3
Typing methods: types / subtypes
Serotyping: serogroup, serotype
MLVA: similar to MLST, with cut-offs on MSTs
emm typing: emm type, emm subtypes
Spa typing: spa type, BURP complex
Different typing method results are different partitions of a dataset
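A minimal sketch of what "different partitions of a dataset" means in practice. The strain names and type assignments are hypothetical, chosen only for illustration: each method groups the same seven strains, and the groups it produces form one partition.

```python
from collections import defaultdict

def partition(labels):
    """Group strain ids by their assigned type; the groups form a partition."""
    groups = defaultdict(set)
    for strain, type_label in labels.items():
        groups[type_label].add(strain)
    return {frozenset(members) for members in groups.values()}

strains = ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]
# Hypothetical type assignments, for illustration only.
pfge = dict(zip(strains, ["A", "A", "A", "B", "B", "C", "C"]))
mlst = dict(zip(strains, ["ST1", "ST1", "ST2", "ST2", "ST2", "ST3", "ST3"]))

pfge_partition = partition(pfge)
mlst_partition = partition(mlst)
for name, part in (("PFGE", pfge_partition), ("MLST", mlst_partition)):
    print(name, sorted(sorted(group) for group in part))
```

Both methods place s6 and s7 together, but they split s1 to s5 differently, so the two partitions are not the same.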
Traditional typing and NGS
Chronicle of a Death Foretold
http://en.wikipedia.org/wiki/File:ChronicleOfADeathForetold.JPG
Whole Genome Sequencing in typing:
- Gene-by-gene: wgMLST, cgMLST
- SNP comparison approaches: comparison with reference strains
- Ability to recover most of the present sequence-based typing information in a single experimental procedure
Comparing typing methods
Weissman S J et al. Appl. Environ. Microbiol. 2012;78:1353-1360
[Figure: concatenated MLST loci compared with fimH sequences]
The hard way…
Need for quantification and statistics
“When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind.”
Lord Kelvin, 1883
Population and Sample
[Figure: a population with groups of 9, 7, 6 and 6 strains, and a sample drawn from it with 3, 2, 2 and 3 strains]
Sampling introduces an error….
…. but this error can be quantified!
Confidence intervals allow for that quantification of sampling error
and should be used instead of point estimates!
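A small simulation can make the sampling error concrete. This is only a sketch: it assumes the numbers 9, 7, 6 and 6 on the slide are counts of four types in the population (the type names A to D are hypothetical), and it uses the share of type A as the quantity being estimated.

```python
import random

# Assumed population: four types with 9, 7, 6 and 6 strains (type names are hypothetical).
population = ["A"] * 9 + ["B"] * 7 + ["C"] * 6 + ["D"] * 6
true_share = population.count("A") / len(population)  # 9/28

rng = random.Random(0)  # fixed seed so the run is repeatable
sample_shares = []
for _ in range(5):
    sample = rng.sample(population, 10)  # draw 10 strains without replacement
    sample_shares.append(sample.count("A") / len(sample))

print(f"share of type A in the population: {true_share:.3f}")
print("share of type A in five samples:", sample_shares)
```

Each sample gives its own point estimate, which is why an interval that reflects the sample size is more informative than any single value.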
Comparing Partitions Framework
Three coefficients:
1) Simpson’s Index of Diversity
2) Adjusted Rand
3) Adjusted Wallace
And the respective 95% confidence intervals
Comparingpartitions website
http://www.comparingpartitions.info
Copy/Paste from Excel
Measuring diversity: SID
Simpson’s Index of Diversity
This index gives the probability that two strains sampled randomly from a population belong to two different types.
Since it is a probability, it varies between 0 and 1.
Highly discriminatory methods are desired…
…but are they always needed?
Confidence intervals have been defined for SID and should be used.
Simpson, 1949
Hunter and Gaston, 1988
Grundmann et al., 2001
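As a sketch of the calculation, SID can be computed as 1 − Σ n_j(n_j − 1) / (N(N − 1)), where n_j is the number of strains of type j and N the total. The interval below uses the large-sample variance (4/N)[Σ π_j³ − (Σ π_j²)²] with π_j = n_j/N and SID ± 2 standard deviations, which is the form usually attributed to Grundmann et al. The function name and the example counts are assumptions made for illustration, and the function expects at least two strains.

```python
from collections import Counter
from math import sqrt

def simpson_diversity(types):
    """Return (SID, lower, upper): point estimate and approximate 95% CI."""
    counts = list(Counter(types).values())
    n = sum(counts)
    sid = 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))
    # Large-sample variance of the index, computed from the type frequencies.
    freqs = [c / n for c in counts]
    variance = (4 / n) * (sum(f ** 3 for f in freqs) - sum(f ** 2 for f in freqs) ** 2)
    half_width = 2 * sqrt(max(variance, 0.0))
    # Clip to [0, 1] because SID is a probability.
    return sid, max(sid - half_width, 0.0), min(sid + half_width, 1.0)

# Hypothetical typing result: four types with 9, 7, 6 and 6 strains.
types = ["A"] * 9 + ["B"] * 7 + ["C"] * 6 + ["D"] * 6
sid, lower, upper = simpson_diversity(types)
print(f"SID = {sid:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

With only 28 strains the interval is noticeably wide, which is the practical argument for reporting it next to the point estimate.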
Comparing SID’s 95% CIs
Null Hypothesis: The values under comparison are the same
Comparing methods results
[Figure: PFGE clusters and Sequence Types for strains s1 to s7]
For each pair of isolates:

                          Same PFGE cluster?
                            Y      N
Same Sequence Type?   Y     a      b
                      N     c      d
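The pair counts can be tallied directly from two lists of type assignments. In this sketch a counts pairs that share a type under both methods and d pairs that share a type under neither. The assignment of b (same Sequence Type only) and c (same PFGE cluster only) is an assumption, as are the seven hypothetical isolates.

```python
from itertools import combinations

def pair_counts(types_a, types_b):
    """Count isolate pairs by agreement: returns (a, b, c, d)."""
    a = b = c = d = 0
    for i, j in combinations(range(len(types_a)), 2):
        same_a = types_a[i] == types_a[j]
        same_b = types_b[i] == types_b[j]
        if same_a and same_b:
            a += 1      # same type in both methods
        elif same_a:
            b += 1      # same type in method A only
        elif same_b:
            c += 1      # same type in method B only
        else:
            d += 1      # different types in both methods
    return a, b, c, d

# Hypothetical assignments for seven isolates (s1 to s7).
mlst = ["ST1", "ST1", "ST2", "ST2", "ST2", "ST3", "ST3"]
pfge = ["A", "A", "A", "B", "B", "C", "C"]
counts = pair_counts(mlst, pfge)
print(dict(zip("abcd", counts)))
```

The four counts always sum to N(N − 1)/2, here 21 pairs for 7 isolates.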
Adjusted RAND
Overall concordance of two methods taking into account that
the agreement between results could arise by chance alone.
Bi-directional agreement measure
Confidence intervals by jackknife pseudo-values method.
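A sketch of the point estimates, written in terms of pair counts: Rand = (a + d) / (a + b + c + d) and Adjusted Rand = 2(ad − bc) / [(a + b)(b + d) + (a + c)(c + d)]. The function name and data are hypothetical, the input needs at least two isolates, and the jackknife confidence interval is not included here.

```python
from itertools import combinations

def rand_indices(types_a, types_b):
    """Return (Rand, adjusted Rand) for two typings of the same isolates."""
    a = b = c = d = 0
    for i, j in combinations(range(len(types_a)), 2):
        same_a = types_a[i] == types_a[j]
        same_b = types_b[i] == types_b[j]
        if same_a and same_b:
            a += 1
        elif same_a:
            b += 1
        elif same_b:
            c += 1
        else:
            d += 1
    rand = (a + d) / (a + b + c + d)
    denominator = (a + b) * (b + d) + (a + c) * (c + d)
    # Convention assumed here: a zero denominator only occurs when the two
    # partitions are identical, so the adjusted value is reported as 1.
    adjusted = 2 * (a * d - b * c) / denominator if denominator else 1.0
    return rand, adjusted

# Hypothetical assignments for seven isolates.
mlst = ["ST1", "ST1", "ST2", "ST2", "ST2", "ST3", "ST3"]
pfge = ["A", "A", "A", "B", "B", "C", "C"]
rand, adjusted_rand = rand_indices(mlst, pfge)
print(f"Rand = {rand:.3f}, adjusted Rand = {adjusted_rand:.3f}")
```

For these isolates the pair counts are a = 3, b = 2, c = 2 and d = 14, so Rand is 17/21 (about 0.81) while Adjusted Rand is 0.475. Swapping the two arguments gives the same values, which is what "bi-directional" refers to.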
Chance Agreement illustration
Two possible random rearrangements…
Chance Agreement:
Rand vs Adjusted RAND
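The effect of chance agreement can be simulated. The sketch below gives 200 isolates two unrelated random typings with 20 types each (all values are arbitrary choices for illustration) and computes both coefficients.

```python
import random
from itertools import combinations

def rand_indices(types_a, types_b):
    """Return (Rand, adjusted Rand) computed from pairwise agreement counts."""
    a = b = c = d = 0
    for i, j in combinations(range(len(types_a)), 2):
        same_a = types_a[i] == types_a[j]
        same_b = types_b[i] == types_b[j]
        if same_a and same_b:
            a += 1
        elif same_a:
            b += 1
        elif same_b:
            c += 1
        else:
            d += 1
    rand = (a + d) / (a + b + c + d)
    denominator = (a + b) * (b + d) + (a + c) * (c + d)
    adjusted = 2 * (a * d - b * c) / denominator if denominator else 1.0
    return rand, adjusted

rng = random.Random(42)  # fixed seed for repeatability
n_isolates, n_types = 200, 20
# Two typings assigned independently at random: any agreement is due to chance.
method_a = [rng.randrange(n_types) for _ in range(n_isolates)]
method_b = [rng.randrange(n_types) for _ in range(n_isolates)]

rand, adjusted_rand = rand_indices(method_a, method_b)
print(f"Rand = {rand:.3f}, adjusted Rand = {adjusted_rand:.3f}")
```

Most pairs differ under both typings simply because there are many types, so the unadjusted Rand comes out high even though the methods are unrelated. The adjusted value stays close to zero.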
Adjusted Wallace
Probability that two strains sharing the same classification by Method A also share the same classification by Method B, corrected for chance agreement
Analytical confidence intervals.
Jackknife pseudo values confidence intervals
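A sketch of the point estimate, assuming the usual definition: Wallace A→B is the fraction of pairs sharing a type in A that also share one in B, and the adjusted value is (W − Wi) / (1 − Wi), where Wi is the Wallace value expected under independence (the fraction of all pairs that share a type in B, i.e. 1 − SID of B). Names and data are hypothetical, and the confidence intervals are not included.

```python
from itertools import combinations

def adjusted_wallace(types_a, types_b):
    """Adjusted Wallace A->B for two typings of the same isolates."""
    same_a = same_b = same_both = total = 0
    for i, j in combinations(range(len(types_a)), 2):
        in_a = types_a[i] == types_a[j]
        in_b = types_b[i] == types_b[j]
        total += 1
        same_a += in_a
        same_b += in_b
        same_both += in_a and in_b
    if same_a == 0 or same_b == total:
        raise ValueError("adjusted Wallace is undefined for these partitions")
    wallace = same_both / same_a
    expected = same_b / total  # Wallace expected if A and B were independent
    return (wallace - expected) / (1 - expected)

# Hypothetical example: the fine method splits the groups of the coarse one.
fine = ["a1", "a1", "a2", "a2", "a3", "a3"]
coarse = ["b1", "b1", "b1", "b1", "b2", "b2"]
aw_fine_to_coarse = adjusted_wallace(fine, coarse)
aw_coarse_to_fine = adjusted_wallace(coarse, fine)
print(f"AW fine->coarse = {aw_fine_to_coarse:.3f}")
print(f"AW coarse->fine = {aw_coarse_to_fine:.3f}")
```

Unlike Adjusted Rand, the coefficient is directional: here the fine typing fully predicts the coarse one (AW = 1), while the coarse typing predicts the fine one only partly (AW = 2/7).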
Comparing AR and AW 95% CI
Null Hypothesis: The values under comparison are the same
Comparingpartitions website
Scripts
Other available software
•Bionumerics™ Partition Mapping module
(http://www.applied-maths.com/features/partition-mapping)
Other applications for SID, AR and AW
• Determination of the best set of markers for typing purposes: given dozens, hundreds or thousands of possible loci or SNPs, is there a subset with enough discrimination to produce the same results as another typing method?
http://www.cidmpublichealth.org/pages/ausetts.html
Other applications for SID, AR and AW
• Determination of the best set of markers or typing methods for predicting a specific outcome or any associated metadata. Examples:
• Using AW to determine which typing method better predicts a clinical outcome or prognosis.
• Using AW to determine the association between alleles and Clonal Complexes (Weissman S J et al. Appl. Environ. Microbiol. 2012;78:1353-1360)
• Determining the association between alleles or types and the geographical location of sampling
Conclusions: Do’s and Don’t’s
DO’s
• The larger the sample size, the more accurate the conclusions can be
• Always use SID, Adjusted Rand and Adjusted Wallace
• Confidence intervals give more information than point estimates because they intrinsically take the sample size into consideration
• Understand the algorithm before drawing conclusions from the results
• Assess the biological meaning of the results
DON’T’s
• Don’t make comparisons using a small number of isolates. Usually >50 is enough, but >100 is better for obtaining statistically significant results
• Don’t use coefficients that are not corrected for chance agreement when comparing typing methods
To Know More:
For examples of usage see the list of references in:
http://darwin.phyloviz.net/ComparingPartitions/index.php?link=References
Acknowledgements
Mário Ramirez
Francisco Pinto
Ana Severiano
UMMI Members
Funding from Fundação para a Ciência e Tecnologia
EU 7th Framework Programme
Dag Harmsen, for the invitation to participate in the workshop
www.comparingpartitions.info
Draft Scientific Programme:
Plenaries:
1) Small Scale Microbial Epidemiology
2) Large Scale Microbial Epidemiology
3) Bioinformatics for Genome-based Microbial Epidemiology
4) Population Genetics: Pathogen Emergence
5) Population Dynamics: Transmission networks and surveillance
6) Molecular Epidemiology for Global Health and One Health
Parallel Sessions:
1) Food and Environmental pathogens
2) Microbial Forensics
3) Virus
4) Fungi and Yeasts
5) Novel Diagnostics methodologies
6) Novel Typing approaches
7) Phylogenetic Inference
8) Interactive Illustration Platforms
Save the date!
