Robustness, Reproducibility & Ecological Consistency
in the Demarcation of Operational Taxonomic Units
Sebastian Schmidt
Institute for Molecular Life Sciences
University of Zürich
sebastian.schmidt@imls.uzh.ch
A general workflow in (targeted) metagenomics
ISME15, Seoul, 2014/08/29 sebastian.schmidt@imls.uzh.ch
Sampling & Sequencing
“Making OTUs”
Understanding your data (hopefully)
[Images: Lake Zürich; Jean Tinguely, “Heureka”]
Concepts
replicability
robustness
reproducibility
ecological consistency
42
Life, the Universe and Everything?
42
Life, Microbial Ecology and Everything?
The Human Skin Microbiome (HSM) dataset:
~115,000 full-length 16S sequences
sampled from 21 distinct body sites (Grice et al, Science, 2009)
clustered to 97% sequence identity
[Figure: OTUs obtained from the HSM dataset by UPARSE, UCLUST, CD-HIT, single linkage (SL), complete linkage (CL) and average linkage (AL)]
OTU A (5,423 seq.): all methods agree (almost) perfectly
OTU B (8,465 seq.): lumping by SL
OTU C (sequence count not legible): splitting by CL
OTU D (2,692 seq.): splitting by UCLUST
Small OTUs: methods provide different numbers of “small” OTUs
Per-method OTU counts: only one value is legible (3,282 OTUs, printed between the UCLUST and CD-HIT labels)
Schmidt et al, Environ Microbiol, in press
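The lumping and splitting behaviour named on this slide can be illustrated with a toy sketch. It assumes a naive agglomerative procedure and a made-up distance matrix (four sequences spaced 2% apart, cut at 3% divergence, i.e. 97% identity). The function name `cluster_at_threshold` and the data are hypothetical, and the greedy heuristics (UCLUST, CD-HIT, UPARSE) are not modelled here.

```python
def cluster_at_threshold(dist, threshold, linkage):
    """Naive agglomerative clustering, stopped once the closest pair of
    clusters is further apart than `threshold` (illustrative only)."""
    aggregate = {
        "single": min,                                   # nearest members
        "complete": max,                                 # furthest members
        "average": lambda values: sum(values) / len(values),
    }[linkage]
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = aggregate([dist[i][j] for i in clusters[x] for j in clusters[y]])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > threshold:
            break
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical data: four sequences on a line, neighbours 2% divergent.
positions = [0, 2, 4, 6]
dist = [[abs(p - q) for q in positions] for p in positions]

single = cluster_at_threshold(dist, 3, "single")      # chains everything together
complete = cluster_at_threshold(dist, 3, "complete")  # keeps the two ends apart
print(single, complete)
```

In this toy case single linkage chains all four sequences into one OTU, while complete linkage stops at two OTUs because the end sequences are more than 3% apart.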
[Figure: pairwise comparison of clustering methods (UPARSE, UCLUST, CD-HIT, single linkage, complete linkage, average linkage) on diversity estimates for the HSM dataset. Each cell gives the Pearson correlation and the relative shift (log2) between two methods for the indices Chao1, Inv Simpson, Shannon, Sørensen, JABD and Morisita-Horn. Colour shows the significance of the mean shift (red: shift towards higher values; blue: shift towards lower values). The individual cell values cannot be assigned to method pairs in this transcript.]
Schmidt et al, Environ Microbiol, in press
(data from Grice et al, Science, 2009)
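As a rough sketch of what is being compared here, several of the indices named in the figure can be computed from per-OTU read counts as follows. The formulas are the standard textbook ones. The bias-corrected form of Chao1 is an assumption, since the slides do not say which variant was used, and the count vectors are invented.

```python
from math import log


def shannon(counts):
    """Shannon entropy (natural log) of an OTU count vector."""
    n = sum(counts)
    return -sum((c / n) * log(c / n) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index, 1 / sum(p_i^2)."""
    n = sum(counts)
    return 1.0 / sum((c / n) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed variant)."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def morisita_horn(x, y):
    """Morisita-Horn similarity between two samples over the same OTUs."""
    nx, ny = sum(x), sum(y)
    dx = sum(v * v for v in x) / nx ** 2
    dy = sum(v * v for v in y) / ny ** 2
    return 2 * sum(a * b for a, b in zip(x, y)) / ((dx + dy) * nx * ny)


# Hypothetical sample: the same 10 reads, clustered by a "lumping" method
# (2 OTUs) and by a "splitting" method (4 OTUs).
lumped = [8, 2]
split = [5, 3, 1, 1]
for name, index in [("Shannon", shannon), ("Inv Simpson", inverse_simpson), ("Chao1", chao1)]:
    print(name, index(lumped), index(split))
```

The same reads yield different index values depending on how they are partitioned into OTUs, which is the kind of shift the figure quantifies across methods.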
[Figure: heatmaps of Adjusted Mutual Information (colour scale 0.5 to 1.0) between OTU partitions for every pair of methods (UPARSE, UCLUST, CD-HIT, single linkage, complete linkage, average linkage), at clustering thresholds from 90% to 100%]

A ‘global’ 16S dataset
~1.1M full-length sequences
≥30k samples, diverse environments

Adjusted Mutual Information (AMI), a measure of partition similarity

high replicability
…when clustering twice to the exact same threshold

differential robustness
…to slight threshold changes

Schmidt et al, Environ Microbiol, in press

differential reproducibility
pairwise similarity maxima between methods off-diagonal
comparability of results across studies?

“Greengenes 97” vs. “SILVA 99”: AMI ~ 0.65
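For readers unfamiliar with AMI, the following is a minimal pure-Python sketch of how two OTU partitions of the same sequences can be compared. It assumes the arithmetic mean of the two entropies as the normaliser, which is one common convention. The slides do not state which normalisation was used, and this is not the authors' implementation. The labels are invented, and the triple loop is only suitable for toy inputs, not for ~1.1M sequences.

```python
from collections import Counter
from math import exp, lgamma, log


def _entropy(sizes, n):
    return -sum((s / n) * log(s / n) for s in sizes)


def adjusted_mutual_info(labels_u, labels_v):
    """AMI between two partitions given as per-sequence OTU labels.
    Normaliser: arithmetic mean of the two entropies (an assumption)."""
    if len(labels_u) != len(labels_v):
        raise ValueError("both partitions must label the same sequences")
    n = len(labels_u)
    a = Counter(labels_u)                 # OTU sizes in partition U
    b = Counter(labels_v)                 # OTU sizes in partition V
    joint = Counter(zip(labels_u, labels_v))

    # Observed mutual information.
    mi = sum((nij / n) * log(n * nij / (a[u] * b[v]))
             for (u, v), nij in joint.items())

    # Expected mutual information under random (hypergeometric) overlap.
    emi = 0.0
    for ai in a.values():
        for bj in b.values():
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                log_p = (lgamma(ai + 1) + lgamma(bj + 1)
                         + lgamma(n - ai + 1) + lgamma(n - bj + 1)
                         - lgamma(n + 1) - lgamma(nij + 1)
                         - lgamma(ai - nij + 1) - lgamma(bj - nij + 1)
                         - lgamma(n - ai - bj + nij + 1))
                emi += (nij / n) * log(n * nij / (ai * bj)) * exp(log_p)

    denom = 0.5 * (_entropy(a.values(), n) + _entropy(b.values(), n)) - emi
    if abs(denom) < 1e-12:                # both partitions trivial
        return 1.0
    return (mi - emi) / denom


# Hypothetical OTU assignments of six sequences by two methods.
method_1 = ["otu1", "otu1", "otu2", "otu2", "otu3", "otu3"]
method_2 = ["a", "a", "a", "a", "b", "b"]   # lumps otu1 and otu2
print(adjusted_mutual_info(method_1, method_1))
print(adjusted_mutual_info(method_1, method_2))
```

Identical partitions score 1 regardless of how the OTUs are named, and partitions that agree no better than chance score around 0, which is why AMI suits comparisons between methods that produce very different numbers of OTUs.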
But which method makes the ‘best’ OTUs?
‘Good’ OTUs should correspond to ‘true’ bacterial lineages (‘species’)
they should comply with evolutionary theory of bacterial speciation
BUT: no unifying / commonly accepted bacterial species concept

Two main criteria for theory-compliant OTUs
phylogenetic consistency (represent monophyletic lineages)
ecological consistency (represent ecologically homogeneous groups of organisms)

Gevers et al., Nat Rev Microbiol, 2005
Cohan, Philos T R Soc B, 2006
Koeppel et al., PNAS, 2008
Hunt et al., Science, 2008
Fraser et al., Science, 2009
Vos, Trends Microbiol, 2011
Koeppel & Wu, NAR, 2013
Preheim et al., Appl Env Microbiol, 2013
[and many more…]
[Figure: word cloud of ecological terms, e.g. soil, water, sediment, marine, plant, sludge, gut, rhizosphere, lake, hydrothermal, biofilm, skin, sponge, coral, compost, wastewater, groundwater]
[Figure: Ecological Consistency Score (ECS) against number of OTUs (log scale) for average linkage, single linkage, complete linkage, UCLUST and CD-HIT, with the 97% nominal similarity level marked. Panel A is shown first on its own, then together with panels B: Archaea, ecological terms; C: Eukarya, ecological terms; D: Bacteria, sampling sites; E: Bacteria, host taxonomy; F: Bacteria, ENVO terms.]
Schmidt et al, PLOS Comp Biol, 2014
Conclusions

replicability
clustering was generally replicable

robustness
AL, CL & CD-HIT were highly robust to (slightly) changing thresholds; UCLUST, UPARSE & SL were more sensitive
similar trends for robustness to clustering context and choice of subregion (not shown)

reproducibility
surprisingly discordant partitions by different methods
similarity maxima generally off-diagonal
AL and CD-HIT most similar pair
implications for reference-based OTU-binning: choice of reference clustering determines quality!

ecological consistency
CL provided most consistent OTU sets
implications for taxonomy and species definitions?
