Phylogeny-Driven Approaches to
Genomics and Metagenomics
Jonathan A. Eisen
University of California, Davis
@phylogenomics
...
My Obsessions
Jonathan A. Eisen
University of California, Davis
@phylogenomics
Talk at
University of Washington
October 23...
Open Science

Wednesday, October 23, 13
X
Open Science

Wednesday, October 23, 13
Social Media & Science

Wednesday, October 23, 13
X
Social Media & Science

Wednesday, October 23, 13
RedSox
• RedSox

Wednesday, October 23, 13
X
RedSox

• RedSox

Wednesday, October 23, 13
Microbial Evolution

Wednesday, October 23, 13
Sequencing

Wednesday, October 23, 13
Sequencing, Phylogeny, Microbes

Wednesday, October 23, 13
Four Eras of Sequencing & Microbes

Wednesday, October 23, 13
Era I: The Tree of Life

Wednesday, October 23, 13
Lost in Graduate School?

Colias

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
Lost in Graduate School?

X
Colias

Phil Hanawalt

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, Octobe...
Lost in Graduate School?

X
Colias

Phil Hanawalt

Adaptive Mutation

Tree from Woese. 1987.
Microbiological Reviews 51:22...
Lost in Graduate School?

X
Colias

Phil Hanawalt

X

Adaptive Mutation

@RELenski
Tree from Woese. 1987.
Microbiological ...
Lost in Graduate School?

Get A Map
Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
Woese - Three Domains 1977

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
Map for Graduate School

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
Limited Sampling of RRR Studies

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
My Study Organisms

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
E. coli vs. H. volcanii UV survival
[Plot: UV Survival, E. coli vs. H. volcanii; y-axis: Relative Survival (log scale)]
H. volcanii Excision Repair
[Plot: H. volcanii UV Repair (Label 7, 45 J/m2); series: 0 J/m2 t0, 45 J/m2 t0, 45 J/m2 Photoreac., 45 J/m2...]
RecA vs. rRNA

Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
Wednesday, October 23, 13
RecA vs. rRNA

Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
Wednesday, October 23, 13
Whatever the History: Try to Incorporate It

from Lake et al. doi: 10.1098/rstb.2009.0035
Wednesday, October 23, 13
Tree Updated

adapted from Baldauf, et al., in Assembling the Tree of Life, 2004
Wednesday, October 23, 13
Era II: rRNA in the Environment

Wednesday, October 23, 13
PCR and phylogenetic analysis of rRNA genes
DNA
extraction

PCR
Makes lots of
copies of the
rRNA genes
in sample

PCR

Phy...
Chemosynthetic Symbionts

Eisen et al. 1992. J. Bact. 174: 3416
Wednesday, October 23, 13

Eisen et al.
1992
PCR and phylogenetic analysis of rRNA genes
DNA
extraction

PCR
Makes lots of
copies of the
rRNA genes
in sample

PCR

Phy...
PCR and phylogenetic analysis of rRNA genes
DNA
extraction

PCR
Makes lots of
copies of the
rRNA genes
in sample

PCR

rRN...
PCR and phylogenetic analysis of rRNA genes
DNA
extraction

PCR
Makes lots of
copies of the
rRNA genes
in sample

PCR

rRN...
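The slides above treat a sequence alignment as a data matrix from which a tree is inferred. As a rough illustration of that first step only, here is a minimal sketch that turns aligned sequences into pairwise distances. The taxon names and sequences are hypothetical toy data, and the uncorrected p-distance is an assumption chosen for simplicity, not the method used in the talk.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned columns at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2), names in sorted order."""
    names = sorted(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for m, n in combinations(names, 2)
    }


# Hypothetical toy alignment: one row per taxon, one column per site.
toy_alignment = {
    "taxon_A": "ACACAC",
    "taxon_B": "TACAGT",
    "taxon_C": "TATAGT",
}

if __name__ == "__main__":
    for pair, dist in distance_matrix(toy_alignment).items():
        print(pair, round(dist, 3))
```

A distance matrix like this could then feed a clustering or tree-building step; real analyses usually apply a substitution model and handle gaps, which this sketch ignores.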
Uses of rRNA Phylogeny
• OTUs
• Taxonomic lists
• Relative abundance of taxa
• Ecological metrics (alpha / beta diversity)...
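To make the "alpha / beta diversity" bullet concrete, here is a minimal sketch of one metric of each kind computed from OTU count tables. The Shannon index stands in for alpha diversity and the Sørensen similarity for beta diversity. Both are standard formulas, but the choice of these two metrics, the function names, and the sample data are assumptions made for illustration.

```python
import math


def shannon_index(counts: dict) -> float:
    """Alpha diversity: H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )


def sorensen_similarity(sample_a: dict, sample_b: dict) -> float:
    """Beta diversity (as similarity): 2 * shared OTUs / (OTUs in A + OTUs in B)."""
    present_a = {otu for otu, c in sample_a.items() if c > 0}
    present_b = {otu for otu, c in sample_b.items() if c > 0}
    if not present_a and not present_b:
        raise ValueError("both samples are empty")
    shared = len(present_a & present_b)
    return 2 * shared / (len(present_a) + len(present_b))


# Hypothetical OTU read counts for two samples.
sample_1 = {"otu_1": 5, "otu_2": 5}
sample_2 = {"otu_2": 2, "otu_3": 3}

if __name__ == "__main__":
    print("Shannon, sample 1:", shannon_index(sample_1))
    print("Sorensen, 1 vs 2:", sorensen_similarity(sample_1, sample_2))
```

The Sørensen calculation uses presence or absence only, so it is insensitive to read depth differences that would affect abundance-based metrics.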
Sequencing Has Gone Crazy

[Timeline figure; year labels: 1953, 1977, 1983, 2000, 2010, ...]
Sanger sequencing method by F. Sanger (PNAS, 1977, 74: 560-564)
rRNA PCR Revolution
• More PCR products
• Deeper sequencing
• The rare biosphere
• Relative abundance estimates

• More sa...
[Slide image: Martiny JBH, Eisen JA, Penn K, Allison SD, Horner-Devine MC (2011) Drivers of bacterial β-diversity depend on spatial scale. PNAS 108: 7850–7854]
Drosophila microbiome

Both natural surveys and laboratory experiments indicate
that host diet plays a major role in shapi...
The Built Environment
Microbial Biogeography of Public Restroom Surfaces
Gilberto E. Flores, Scott T. Bates, Dan Knights...
Citizen Science - Project MERCCURI

Wednesday, October 23, 13
Phone Microbiome

Jack Gilbert

Georgia Barguil

Wednesday, October 23, 13
Era III: Genomics

Wednesday, October 23, 13
1st Genome Sequence

Fleischmann et al.
1995
Wednesday, October 23, 13
My Study Organisms

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
TIGR Genome Projects

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
TIGR Genome Projects

Tree from Woese. 1987.
Microbiological Reviews 51:221
Wednesday, October 23, 13
If you can’t beat them, critique them ...

Fleischmann et al.
1995
Wednesday, October 23, 13
Helicobacter pylori genome 1997

Wednesday, October 23, 13
Phylogenomics
PHYLOGENETIC PREDICTION OF GENE FUNCTION

EXAMPLE A

METHOD

EXAMPLE B

2A

CHOOSE GENE(S) OF INTEREST

5
...
Phylogenetic Prediction of Function
• Many powerful and automated similarity based
methods for assigning genes to protein ...
Phylogenetic Prediction of Function
• Many powerful and automated similarity based
methods for assigning genes to protein ...
Carboxydothermus hydrogenoformans
• Isolated from a Russian hot spring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows v...
Homologs of Sporulation Genes

Wu et al. 2005 PLoS
Genetics 1: e65.
Wednesday, October 23, 13
Carboxydothermus sporulates

Wu et al. 2005 PLoS Genetics 1: e65.
Wednesday, October 23, 13
Non-Homology Predictions:
Phylogenetic Profiling

• Step 1: Search all genes in
organisms of interest against all
other ge...
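The remaining steps of this slide are cut off in the transcript, so the following is only a generic sketch of the idea behind phylogenetic profiling, not the workflow from Wu et al. 2005. It assumes the similarity search has already produced a presence/absence profile per gene across a fixed list of genomes, and it ranks genes by how closely their profile matches a target pattern (for example, the genomes of organisms that share a trait). All gene names, genome columns, and values are hypothetical.

```python
def hamming(profile_a: tuple, profile_b: tuple) -> int:
    """Number of genomes at which two presence/absence profiles disagree."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def rank_genes_by_profile(gene_profiles: dict, target_profile: tuple) -> list:
    """Return (gene, mismatches) pairs, best match first; ties sorted by name."""
    scored = [
        (gene, hamming(profile, target_profile))
        for gene, profile in gene_profiles.items()
    ]
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Hypothetical data: columns are five genomes, 1 = homolog detected.
target = (1, 1, 0, 0, 1)  # genomes of organisms showing the trait of interest
gene_profiles = {
    "gene_x": (1, 1, 0, 0, 1),  # matches the trait pattern exactly
    "gene_y": (1, 1, 1, 1, 1),  # present everywhere, so uninformative
    "gene_z": (0, 1, 0, 0, 1),  # one mismatch
}

if __name__ == "__main__":
    for gene, mismatches in rank_genes_by_profile(gene_profiles, target):
        print(gene, mismatches)
```

Genes whose distribution tracks the trait rise to the top without any homology to a gene of known function, which is what makes this a non-homology prediction.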
Sporulation Gene Profile

Wu et al. 2005 PLoS Genetics 1: e65.
Wednesday, October 23, 13
B. subtilis new sporulation genes

Wednesday, October 23, 13
From http://genomesonline.org
Wednesday, October 23, 13
PG Profiling Independent Contrasts

Wednesday, October 23, 13
Whole Genome Trees
AMPHORA

Wednesday, October 23, 13
Era IV: Genomes in the Environment

Wednesday, October 23, 13
PCR and phylogenetic analysis of rRNA genes
DNA
extraction

PCR
Makes lots of
copies of the
rRNA genes
in sample

PCR

Phy...
Shotgun metagenomics
DNA
extraction

PCR

Wednesday, October 23, 13

Shotgun
Sequence
all genes
Shotgun metagenomics
DNA
extraction

PCR

Wednesday, October 23, 13

Shotgun
Sequence
all genes
Phylogeny has many uses in shotgun metagenomics
DNA
extraction

PCR

Phylotyping
Phylogenetic tree
rRNA1

rRNA2
rRNA4

rRN...
Uses of Phylogeny in Metagenomics
• Taxonomic assessment
• Phylogenetic OTUs
• Phylogenetic taxonomy assignment
• Phylogen...
rRNA Phylotyping - Sargasso Metagenome

Venter et al., Science 304: 66. 2004
Wednesday, October 23, 13
RecA Phylotyping - Sargasso Metagenome

Venter et al., Science 304: 66. 2004
Wednesday, October 23, 13
Venter et al., Scie...
Genome Biology 2008,

http://genomebiology.com/2008/9/10/R151

Volume 9, Issue 10, Article R151

AMPHORA Phylotyping

AMPH...
Phylogenetic ID of Novel Lineages
GOS 1
GOS 2

GOS 3
GOS 4

Wu et al PLoS One 2011

Wednesday, October 23, 13

GOS 5
Phylogenetic Functional Prediction

Venter et al., Science 304: 66. 2004
Wednesday, October 23, 13
Phylogenetic Binning
Sulcia makes amino acids

Baumannia makes vitamins and cofactors

Wu et al. 2006 PLoS Biology 4: e188...
Improving Phylogenomics I

Wednesday, October 23, 13
Updated Tree of Life

Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
Wednesday, Octo...
Genomes Poorly Sampled

Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
Wednesday, Oc...
TIGR Tree of Life Project

Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
Wednesday,...
Genomic Encyclopedia of Bacteria & Archaea
Wu et al. 2009 Nature 462, 1056-1060

Figure from Barton, Eisen et al. “Evoluti...
Genomic Encyclopedia of Bacteria & Archaea
Wu et al. 2009 Nature 462, 1056-1060

Figure from Barton, Eisen et al. “Evoluti...
Family Diversity vs. PD

Wu et al. 2009 Nature 462, 1056-1060
Wednesday, October 23, 13
The Dark Matter of Biology

From Wu et al. 2009 Nature 462, 1056-1060
Wednesday, October 23, 13
GEBA Uncultured
SAR

A: Hydrothermal vent
B: Gold Mine
C: Tropical gyres (Mesopelagic)
D: Tropical gyres (Photic zone)

OP...
JGI Dark Matter Project
[Figure labels: brackish/freshwater, hydrothermal, sediment; sites TG, HSM, GBS, HOT, SAK, ETL; BACTERIA ...]
[Figure labels: recognizes UGA; isolation of single cells (n=9,600); whole genome amplification (n=3,300); OP11 (Microgenomates)...]
Caldithrix
GOUTA4
Acidobacteria
Elusimicrobia
Nitrospirae
49S1 2B
Chloroflexi
Caldiserica
AD3
OP9 (Atribacteria)
WPS-2
Syn...
Deltaproteobacteria
Cyanobacteria
WPS-2
Actinobacteria
Gemmatimonadetes
NC10
SC4
WS2
NKB19 (Hydrogenedentes)
WYO
Armatimon...
A Genomic Encyclopedia of Microbes (GEM)

Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al ...
Improving Phylogenomics II
• Better Methods

Wednesday, October 23, 13
iSEEM

Wednesday, October 23, 13
Zorro - Automated Masking
[Plot; axis label: Distance to True Tree]

Wu M, Chatterji S, Eisen JA (2012) Accounting ...
Kembel Combiner

typically used as a qualitative measure because duplicate sequences are usually removed from the tree. Ho...
Kembel Copy # Correction

Kembel SW, Wu M, Eisen JA, Green JL (2012) Incorporating 16S Gene Copy Number Information Improv...
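The idea named in this slide title, using 16S gene copy number to improve abundance estimates, can be sketched in a few lines. Read counts are divided by each taxon's 16S copy number and then renormalized. This is a minimal illustration that assumes copy numbers are already known for every taxon. How they are estimated for taxa without sequenced genomes is outside the sketch, and the taxon names and numbers are hypothetical.

```python
def correct_for_copy_number(read_counts: dict, copy_numbers: dict) -> dict:
    """Relative abundances after dividing 16S read counts by per-taxon copy number."""
    corrected = {}
    for taxon, reads in read_counts.items():
        copies = copy_numbers[taxon]  # KeyError if a taxon has no estimate
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        corrected[taxon] = reads / copies
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical example: equal read counts, very different copy numbers.
reads = {"taxon_A": 10, "taxon_B": 10}
copies = {"taxon_A": 1, "taxon_B": 5}

if __name__ == "__main__":
    for taxon, fraction in correct_for_copy_number(reads, copies).items():
        print(taxon, round(fraction, 3))
```

In the toy data the two taxa have identical read counts, yet after correction the single-copy taxon accounts for most of the estimated cells, which is the kind of shift the correction is meant to capture.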
Sharpton PhylOTU

Finding Metagenomic OTU

Figure 1. PhylOTU Workflow. Computational processes are represented as squares ...
NMF in Metagenomes

Characterizing the niche-space distributions of components

...
Phylosift - Mining the Global Metagenome
Erick Matsen
FHCRC
Todd Treangen
BNBI, NBACC
Holly Bik

Jonathan Guillaume
Jospin...
Phylosift/ pplacer Workflow

each input sequence scanned against both workflows

Input Sequences
rRNA workflow
600 bp

LAS...
Markers
• PMPROK – Dongying Wu’s Bac/Arch
markers
• Eukaryotic Orthologs – Parfrey 2011 paper
• 16S/18S rRNA
• Mitochondri...
Output 1: Taxonomy
Taxonomic
summary
plots in
Krona
(Ondov et al
2011)

Wednesday, October 23, 13
Output 2: Phylogenetic Tree of Reads
Placement tree from 2 week old infant gut data

Wednesday, October 23, 13
Edge PCA vs. UNIFRAC PCA
QIIME and Edge PCA on
110 fecal metagenomes from
Yatsunenko et al 2012
Nature.
Sequenced with 454...
Improving Phylogenomics III
• Better Data Sets

Wednesday, October 23, 13
More Markers
Phylogenetic group | Genome Number | Gene Number | Marker Candidates
Archaea | 62 | 145415 | 106
Actinobacteria | 63 | 2...

"Phylogeny-Driven Approaches to Genomics and Metagenomics" talk by Jonathan Eisen at U. Washington on


Talk by Jonathan Eisen at U. Washington 10/23/13

Published in: Health & Medicine, Technology

  1. Phylogeny-Driven Approaches to Genomics and Metagenomics Jonathan A. Eisen University of California, Davis @phylogenomics Talk at University of Washington October 23, 2013
  2. My Obsessions Jonathan A. Eisen University of California, Davis @phylogenomics Talk at University of Washington October 23, 2013
  3. Open Science
  4. X Open Science
  5. Social Media & Science
  6. X Social Media & Science
  7. RedSox • RedSox
  8. X RedSox • RedSox
  9. Microbial Evolution
  10. Sequencing
  11. Sequencing, Phylogeny, Microbes
  12. Four Eras of Sequencing & Microbes
  13. Era I: The Tree of Life
  14. Lost in Graduate School? Colias Tree from Woese. 1987. Microbiological Reviews 51:221
  15. Lost in Graduate School? X Colias Phil Hanawalt Tree from Woese. 1987. Microbiological Reviews 51:221
  16. Lost in Graduate School? X Colias Phil Hanawalt Adaptive Mutation Tree from Woese. 1987. Microbiological Reviews 51:221
  17. Lost in Graduate School? X Colias Phil Hanawalt X Adaptive Mutation @RELenski Tree from Woese. 1987. Microbiological Reviews 51:221
  18. Lost in Graduate School? Get A Map Tree from Woese. 1987. Microbiological Reviews 51:221
  19. Woese - Three Domains 1977 Tree from Woese. 1987. Microbiological Reviews 51:221
  20. Map for Graduate School Tree from Woese. 1987. Microbiological Reviews 51:221
  21. Limited Sampling of RRR Studies Tree from Woese. 1987. Microbiological Reviews 51:221
  22. My Study Organisms Tree from Woese. 1987. Microbiological Reviews 51:221
  23. E. coli vs. H. volcanii UV survival [Plot: UV Survival, E. coli vs. H. volcanii; x-axis: UV J/m2 (0 to 400); y-axis: Relative Survival (1 to 1E-07, log scale); series: E. coli NR10121 mfd, E. coli NR10125 mfd+, H. volcanii WFD11]
  24. H. volcanii Excision Repair [Plot: H. volcanii UV Repair (Label 7, 45 J/m2); x-axis: Avg. Mol. Wt. (Base Pairs), 0 to 18000; series: 0 J/m2 t0, 45 J/m2 t0, 45 J/m2 Photoreac., 45 J/m2 Dark 24 Hours]
  25. RecA vs. rRNA Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
  26. RecA vs. rRNA Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
  27. Whatever the History: Try to Incorporate It from Lake et al. doi: 10.1098/rstb.2009.0035
  28. Tree Updated adapted from Baldauf, et al., in Assembling the Tree of Life, 2004
  29. Era II: rRNA in the Environment
  30. PCR and phylogenetic analysis of rRNA genes [Workflow figure: DNA extraction; PCR (makes lots of copies of the rRNA genes in sample); sequence rRNA genes; sequence alignment = data matrix; phylogenetic tree. Taxa: rRNA1, E. coli, Humans, Yeast. rRNA1 5’ ...TACAGTATAGGTGGAGCTAGCGATCGATCGA... 3’]
  31. Chemosynthetic Symbionts Eisen et al. 1992. J. Bact. 174: 3416
  32. PCR and phylogenetic analysis of rRNA genes [Workflow figure as in slide 30. Taxa: rRNA1, rRNA2, E. coli, Humans, Yeast. rRNA1 5’ ...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’; rRNA2 5’ ...TACAGTATAGGTGGAGCTAGCGATCGATCGA... 3’]
  33. PCR and phylogenetic analysis of rRNA genes [Workflow figure as in slide 30. Taxa: rRNA1, rRNA2, rRNA3, rRNA4, E. coli, Humans, Yeast. rRNA1 5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’; rRNA2 5’..TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’; rRNA3 5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’; rRNA4 5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’]
  34. PCR and phylogenetic analysis of rRNA genes [Workflow figure as in slide 33, with the added label "Phylogeny". Taxa: rRNA1, rRNA2, rRNA3, rRNA4, E. coli, Humans, Yeast. rRNA1 5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’; rRNA2 5’..TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’; rRNA3 5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’; rRNA4 5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’]
  35. Uses of rRNA Phylogeny • OTUs • Taxonomic lists • Relative abundance of taxa • Ecological metrics (alpha / beta diversity) • Phylogenetic metrics • Binning • Identification of novel groups • Clades • Rates of change • LGT • Convergence • PD • Phylogenetic ecology (e.g., Unifrac)
  36. Sequencing Has Gone Crazy 1977 2010 Sanger sequencing method by F. Sanger (PNAS, 1977, 74: 560-564) 1983 1953 2000 1990 1980 Approaching to NGS PCR by K. Mullis (Cold Spring Harb Symp Quant Biol. 1986;51 Pt 1:263-73) Discovery of DNA structure (Cold Spring Harb. Symp. Quant. Biol. 1953;18:123-31) Human Genome Project (Nature, 2001, 409: 860–92; Science, 2001, 291: 1304–1351) 1993 Development of pyrosequencing (Anal. Biochem., 1993, 208: 171-175; Science, 1998, 281: 363-365) Single molecule emulsion PCR 1998 Founded Solexa 1998 Founded 454 Life Science 2000 454 GS20 sequencer (First NGS sequencer) 2005 Solexa Genome Analyzer (First short-read NGS sequencer) Illumina acquires Solexa (Illumina enters the NGS business) 2006 2006 ABI SOLiD (Short-read sequencer based upon ligation) Roche acquires 454 Life Sciences (Roche enters the NGS business) 2007 2007 GS FLX sequencer (NGS with 400-500 bp read length) NGS Human Genome sequencing (First Human Genome sequencing based upon NGS technology) 2008 2008 Hi-Seq2000 (200Gbp per Flow Cell) From Slideshare presentation of Cosentino Cristian http://www.slideshare.net/cosentia/high-throughput-equencing 2010 Miseq Roche Jr Ion Torrent PacBio Oxford
  37. rRNA PCR Revolution • More PCR products • Deeper sequencing • The rare biosphere • Relative abundance estimates • More samples (with barcoding) • Time series • Spatially diverse sampling • Fine scale sampling
  38. Drivers of bacterial β-diversity depend on spatial scale. Jennifer B. H. Martiny, Jonathan A. Eisen, Kevin Penn, Steven D. Allison, and M. Claire Horner-Devine. PNAS, May 10, 2011, vol. 108, 7850–7854. Edited by Edward F. DeLong, Massachusetts Institute of Technology, Cambridge, MA, and approved March 31, 2011 (received for review November 1, 2010).
      Abstract: The factors driving β-diversity (variation in community composition) yield insights into the maintenance of biodiversity on the planet. Here we tested whether the mechanisms that underlie bacterial β-diversity vary over centimeters to continental spatial scales by comparing the composition of ammonia-oxidizing bacteria communities in salt marsh sediments. As observed in studies of macroorganisms, the drivers of salt marsh bacterial β-diversity depend on spatial scale. In contrast to macroorganism studies, however, we found no evidence of evolutionary diversification of ammonia-oxidizing bacteria taxa at the continental scale, despite an overall relationship between geographic distance and community similarity. Our data are consistent with the idea that dispersal limitation at local scales can contribute to β-diversity, even though the 16S rRNA genes of the relatively common taxa are globally distributed. These results highlight the importance of considering multiple spatial scales for understanding microbial biogeography.
      Fig. 1. The 13 marshes sampled (see Table S1 for details). Marshes compared with one another within regions are circled. (Inset) The arrangement of sampling points within marshes. Six points were sampled along a 100-m transect, and a seventh point was sampled ∼1 km away. Two marshes in the Northeast United States (outlined stars) were sampled more intensively, along four 100-m transects in a grid pattern.
      Fig. 2. Distance-decay curves for the Nitrosomonadales communities. The dashed, blue line denotes the least-squares linear regression across all spatial scales. The solid lines denote separate regressions within each of the three spatial scales: within marshes, regional (across marshes within regions circled in Fig. 1), and continental (across regions). The slopes of all lines (except the solid light blue line) are significantly less than zero. The slopes of the solid red lines are significantly different from the slope of the all scale (blue dashed) line.
  39. Drosophila microbiome Both natural surveys and laboratory experiments indicate that host diet plays a major role in shaping the Drosophila bacterial microbiome. Laboratory strains provide only a limited model of natural host–microbe interactions
  40. The Built Environment
      Microbial Biogeography of Public Restroom Surfaces. Gilberto E. Flores, Scott T. Bates, Dan Knights, Christian L. Lauber, Jesse Stombaugh, Rob Knight, Noah Fierer. Affiliations: Cooperative Institute for Research in Environmental Science; Department of Computer Science; Department of Chemistry and Biochemistry; Howard Hughes Medical Institute; Department of Ecology and Evolutionary Biology (all University of Colorado, Boulder, Colorado, United States of America).
      Abstract: We spend the majority of our lives indoors where we are constantly exposed to bacteria residing on surfaces. However, the diversity of these surface-associated communities is largely unknown. We explored the biogeographical patterns exhibited by bacteria across ten surfaces within each of twelve public restrooms. Using high-throughput barcoded pyrosequencing of the 16S rRNA gene, we identified 19 bacterial phyla across all surfaces. Most sequences belonged to four phyla: Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria. The communities clustered into three general categories: those found on surfaces associated with toilets, those on the restroom floor, and those found on surfaces routinely touched with hands. On toilet surfaces, gut-associated taxa were more prevalent, suggesting fecal contamination of these surfaces. Floor surfaces were the most diverse of all communities and contained several taxa commonly found in soils. Skin-associated bacteria, especially the Propionibacteriaceae, dominated surfaces routinely touched with our hands. Certain taxa were more common in female than in male restrooms as vagina-associated Lactobacillaceae were widely distributed in female restrooms, likely from urine contamination. Use of the SourceTracker algorithm confirmed many of our taxonomic observations as human skin was the primary source of bacteria on restroom surfaces. Overall, these results demonstrate that restroom surfaces host relatively diverse microbial communities dominated by human-associated bacteria with clear linkages between communities on or in different body sites and those communities found on restroom surfaces. More generally, this work is relevant to the public health field as we show that human-associated microbes are commonly found on restroom surfaces suggesting that bacterial pathogens could readily be transmitted between individuals by the touching of surfaces. Furthermore, we demonstrate that we can use high-throughput analyses of bacterial communities to determine sources of bacteria on indoor surfaces, an approach which could be used to track pathogen transmission and test the efficacy of hygiene practices.
      Figure 3. Cartoon illustrations of the relative abundance of discriminating taxa on public restroom surfaces. Light blue indicates low abundance while dark blue indicates high abundance of taxa. (A) Although skin-associated taxa (Propionibacteriaceae, Corynebacteriaceae, Staphylococcaceae and Streptococcaceae) were abundant on all surfaces, they were relatively more abundant on surfaces routinely touched with hands. (B) Gut-associated taxa (Clostridiales, Clostridiales group XI, Ruminococcaceae, Lachnospiraceae, Prevotellaceae and Bacteroidaceae) were most abundant on toilet surfaces. (C) Although soil-associated taxa (Rhodobacteraceae, Rhizobiales, Microbacteriaceae and Nocardioidaceae) were in low abundance on all restroom surfaces, they were relatively more abundant on the floor of the restrooms we surveyed. Figure not drawn to scale. doi:10.1371/journal.pone.0028132.g003
      Citation: Flores GE, Bates ST, Knights D, Lauber CL, Stombaugh J, et al. (2011) Microbial Biogeography of Public Restroom Surfaces. PLoS ONE 6(11): e28132. doi:10.1371/journal.pone.0028132. Editor: Mark R. Liles, Auburn University, United States of America. Received September 12, 2011; Accepted November 1, 2011; Published November 23, 2011.
      ORIGINAL ARTICLE. Architectural design influences the diversity and structure of the built environment microbiome. Steven W Kembel, Evan Jones, Jeff Kline, Dale Northcutt, Jason Stenson, Ann M Womack, Brendan JM Bohannan, G Z Brown and Jessica L Green. The ISME Journal (2012), 1–11. Buildings are complex ecosystems that house trillions of microorganisms interacting with each other, with humans and with their environment. Understanding the ecological and evolutionary processes that determine the diversity and composition of the built environment microbiome (the community of microorganisms that live indoors) is important for understanding the relationship between building design, biodiversity and human health. In this study, we used high-throughput sequencing of the bacterial 16S rRNA gene to quantify relationships between building attributes and
airborne bacterial communities at 0 health-care facility. We quantified airborne bacterial community a Competing Interests: structure and environmental conditions in patient rooms exposed to mechanical or window wh i c h yet to gel. And the Sloan The authors have declared that no competing interests exist. ventilation and in outdoor air. The phylogenetic diversity of airborne bacterial communities was * E-mail: noah.fierer@colorado.edu 26 JanuFoundation’s Olsiewski lower indoors than outdoors, and mechanically ventilated rooms contained less diverse microbial communities than did window-ventilated rooms. Bacterial communities in indoor environments Journal, shares some of his concontained many taxa that are absent or rare outdoors, including taxa closely related to potential communities and revealed a greater diversity of bacteria on Introduction hanically cern. “Everybody’s genhuman pathogens. Building attributes, specifically the source of ventilation air, airflow rates, relative indoor surfaces than captured using cultivation-based techniques humidity and temperature, were correlated with the diversity and composition of indoor bacterial had lower erating vastMore than ever, individuals across the globe spend a large [10–13]. Most of the organisms identified in these studies are amounts of communities. The relative abundance of bacteria closely related to human pathogens was higher portion their sets are y than ones with openthan outdoors, and higher in rooms withquantify those con- lower relative humidity. looking acrossofdata lives indoors, yet relatively little is known about the related to human commensals suggesting that the organisms were they move around. But to lower airflow rates and data,” she says, but indoors winFigure 2. Relationship between the studies that microbial diversity of indoor environments. 
Of bacterial communities associatedgrowing on the restroombut rather Communities The observed relationship between building design and airborne bacterial diversity suggests that not actively with ten public surfaces surfaces. were deposited ility of fresh air can manage indoor environments, altering through building design and operation the community translated tributions, Peccia’s team has had to develop can be difficult because groups choose dif- of the unweighted UniFrac distance matrix. Each point represents atouching) or indirectly (e.g. floor (triangles) andcells) by PCoA single sample. Note that the toilet (as have examined microorganisms associated with indoor environwe directly (i.e. shedding of skin form of microbial species that potentially colonize the human bacteria during ferent indoors. tions of microbes associ- new methods to collect airbornemicrobiomeand our timeanalytical tools. With ments, most have relied upon cultivation-based techniques hands. humans. Despite these efforts, we still have an incomplete Sloan support, clusters distinct from surfaces touched with to doi:10.1371/journal.pone.0028132.g002 The ISME Journal detect organisms residing an body, and consequently, advance online publication,as the microbesdoi:10.1038/ismej.2011.211 a data archive and integrated analyt- on a variety of household surfaces [1–5]. understanding of bacterial communities associated with indoor extract their DNA, 26 January 2012; are much though, Subject Category: microbial population and community ecology Not surprisingly, these studies have identified surfaces in kitchens environments related differences in the relative abundances of high diversity of floor communities is likely due to the frequency of because limitations of traditional 16 S rRNA genes Keywords: aeromicrobiology; bacteria; built environment microbiome; community ecology;are in the works. pathogens. Although this less abundant in air than on surfaces. 
ical tools dispersal; environmental filtering on February 9, 2012 Do or Do in or ou t St all i Fa Sta n uc et ll ou So han t ap d dis les pe ns To T e ile oile r tf ts lus ea hh t a To ndle ile tf lo Si or nk flo or e human ck to pre- and restrooms as being hot spots of bacterial contamination. cloning and sequencing techniques have made replicate sampling contact shoes, which would track a diversity some surfaces (Figure 1B, Table notably Because several pathogenic bacteria are known hat having natural airflow In one recent study, they used air filters To foster collaborations between micro- with the bottom aofvariety of to survive on inandwhich is characterizations of abundant onS2). Most surfaces of microorganisms from sources including soil, in-depth were clearly more the communities prohibitive. certain surfaces for extended of time [6–8], are of With the Green says answering that to sample airborne particles and microbes biologists, architects, and building scientists,inperiods a highly-diversethese studiesdisease.[27,39]. Indeed,advent of high-throughputrestrooms (Figure 1B). Some known to be microbial habitat restrooms than male sequencing techniques, we obvious importance preventing the spread of human can now investigate are the most common, and often most abun indoor Introduction microbiome—includes human pathogens and combacteria commonly associated with soil (e.g. 
clinical data; she’s hoping in a classroom during 4 days during which with each other andalso sponsored a symposium widely recognized that the majority of Rhodobacteraceae, family and beginmicrobial communities at an the foundation with their However, it is now unprecedented found in the vagina of healthy reproductive age w depth to understand the relationship mensals interacting Rhizobiales, Microbacteriaceae and Nocardioidaceae) were, on average, Humans spend up to students were indoors microorganisms cannot be readily cultivated [9] and thus, ital to participate in a study 90% of their lives present and 4 days during et on the microbiome of the built environment abundant on floor surfaces (Figurethe Table S2). and are relatively less abundant in male urine environment (Eames al., 2009). There have been more 3C, between humans, microbes and the built environment. (Klepeis October 23, the overall diversity of associated with indoor Wednesday,et al., 2001). Consequently,roomway we few attempts to comprehensively survey the built ence of hospital-acquired 13 which the was vacant. They measured at the 2011 Indoor Air conference in Austin, microorganisms the toilet flush handles harbored In order to begin to of female urine samples collected as part Interestingly, some of bacterial analysis comprehensively describe the microbial
  41. 41. Citizen Science - Project MERCCURI
  42. 42. Phone Microbiome Jack Gilbert Georgia Barguil
  43. 43. Era III: Genomics
  44. 44. 1st Genome Sequence Fleischmann et al. 1995
  45. 45. My Study Organisms Tree from Woese. 1987. Microbiological Reviews 51:221
  46. 46. TIGR Genome Projects Tree from Woese. 1987. Microbiological Reviews 51:221
  47. 47. TIGR Genome Projects Tree from Woese. 1987. Microbiological Reviews 51:221
  48. 48. If you can’t beat them, critique them ... Fleischmann et al. 1995
  49. 49. Helicobacter pylori genome 1997
  50. 50. Phylogenomics: Phylogenetic Prediction of Gene Function (flowchart with Example A and Example B). Method: choose gene(s) of interest → identify homologs → align sequences → calculate gene tree → overlay known functions onto tree → infer likely function of gene(s) of interest (possibly ambiguous). The actual evolution (assumed to be unknown) involves a duplication, giving genes 1A/1B, 2A/2B and 3A/3B in Species 1, 2 and 3. Based on Eisen, 1998 Genome Res 8: 163-167.
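
The "overlay known functions onto tree" and "infer likely function" steps on this slide can be sketched in a few lines. This is only a toy illustration, not the method from Eisen 1998: the nested-tuple tree, the gene names, the function labels and the `predict_function` helper are all hypothetical, and the rule used (take the smallest clade around the query that contains annotated genes) is one simple assumption about how the inference could be made.

```python
# Hypothetical gene tree as nested tuples; leaves are gene names.
gene_tree = ((("1A", "2A"), "3A"), (("1B", "query"), "3B"))

# Hypothetical known functions overlaid onto the tree.
known = {
    "1A": "func_A", "2A": "func_A", "3A": "func_A",
    "1B": "func_B", "3B": "func_B",
}


def leaves(node):
    """Return all leaf names under a node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def predict_function(node, query, known):
    """Predict from the smallest clade around `query` that has annotations.

    Returns a function label, "ambiguous" if that clade mixes functions,
    or None if no annotated gene is found.
    """
    if isinstance(node, str):
        return None  # a single leaf has no neighbours to learn from
    for child in node:
        if query in leaves(child):
            deeper = predict_function(child, query, known)
            if deeper is not None:
                return deeper
    functions = {known[leaf] for leaf in leaves(node)
                 if leaf in known and leaf != query}
    if not functions:
        return None
    return functions.pop() if len(functions) == 1 else "ambiguous"


print(predict_function(gene_tree, "query", known))
```

In this toy tree the query groups with 1B, so the prediction follows the B subfamily even if a gene from the A subfamily happened to be the most similar sequence.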
  51. 51. Phylogenetic Prediction of Function • Many powerful and automated similarity based methods for assigning genes to protein families • COGs • PFAM HMM searches • Some limitations of similarity based methods can be overcome by phylogenetic approaches • Automated methods now available • Sean Eddy • Steven Brenner • Kimmen Sjölander
  52. 52. Phylogenetic Prediction of Function • Many powerful and automated similarity based methods for assigning genes to protein families • COGs • PFAM HMM searches • Some limitations of similarity based methods can be overcome by phylogenetic approaches • Automated methods now available • Sean Eddy • Steven Brenner • Kimmen Sjölander • But …
  53. 53. Carboxydothermus hydrogenoformans • Isolated from a Russian hotspring • Thermophile (grows at 80°C) • Anaerobic • Grows very efficiently on CO (Carbon Monoxide) • Produces hydrogen gas • Low GC Gram positive (Firmicute) • Genome Determined (Wu et al. 2005 PLoS Genetics 1: e65.)
  54. 54. Homologs of Sporulation Genes Wu et al. 2005 PLoS Genetics 1: e65.
  55. 55. Carboxydothermus sporulates Wu et al. 2005 PLoS Genetics 1: e65.
  56. 56. Non-Homology Predictions: Phylogenetic Profiling • Step 1: Search all genes in organisms of interest against all other genomes • Ask: Yes or No, is each gene found in each other species • Cluster genes by distribution patterns (profiles)
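
The profiling steps listed on this slide can be sketched as follows. It is a minimal illustration under stated assumptions: the gene names, genome names and hit sets are made up, the search step is taken as already done, and "clustering" is reduced to grouping genes with identical presence/absence profiles (real analyses would typically cluster similar, not only identical, profiles).

```python
from collections import defaultdict

# Hypothetical search results: gene -> set of genomes where a homolog was found.
hits = {
    "geneA": {"sp1", "sp2", "sp4"},
    "geneB": {"sp1", "sp2", "sp4"},
    "geneC": {"sp3"},
    "geneD": {"sp1", "sp2", "sp3", "sp4"},
}
genomes = ["sp1", "sp2", "sp3", "sp4"]


def profile(gene_hits, genome_order):
    """Return a tuple of yes/no (1/0) flags, one per genome, in a fixed order."""
    return tuple(1 if genome in gene_hits else 0 for genome in genome_order)


def group_by_profile(hits, genome_order):
    """Group genes that share an identical presence/absence profile."""
    groups = defaultdict(list)
    for gene in sorted(hits):
        groups[profile(hits[gene], genome_order)].append(gene)
    return dict(groups)


groups = group_by_profile(hits, genomes)
for prof, genes in groups.items():
    print(prof, genes)
```

Genes that end up in the same group as known sporulation genes would be candidates for a role in the same process, which is the logic behind the sporulation profile slides that follow.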
  57. 57. Sporulation Gene Profile Wu et al. 2005 PLoS Genetics 1: e65.
  58. 58. B. subtilis new sporulation genes
  59. 59. From http://genomesonline.org
  60. 60. PG Profiling Independent Contrasts
  61. 61. Whole Genome Trees AMPHORA
  62. 62. Era IV: Genomes in the Environment
  63. 63. PCR and phylogenetic analysis of rRNA genes (diagram): DNA extraction → PCR (makes lots of copies of the rRNA genes in sample) → sequence rRNA genes → sequence alignment (= data matrix) → phylogenetic tree (phylotyping). Example sequences: rRNA1 5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’; rRNA2 5’..TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’; rRNA3 5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’; rRNA4 5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’. The tree places rRNA1 to rRNA4 alongside E. coli, Humans and Yeast.
  64. 64. Shotgun metagenomics (diagram labels: DNA extraction; PCR; Shotgun; Sequence all genes)
  65. 65. Shotgun metagenomics (diagram labels: DNA extraction; PCR; Shotgun; Sequence all genes)
  66. 66. Phylogeny has many uses in shotgun metagenomics (diagram labels: DNA extraction; PCR; Phylotyping; Phylogenetic tree with rRNA1 to rRNA4, Humans, E. coli and Yeast; Shotgun; Sequence all genes)
  67. 67. Uses of Phylogeny in Metagenomics • Taxonomic assessment • Phylogenetic OTUs • Phylogenetic taxonomy assignment • Phylogenetic binning • Sample comparisons and hypothesis testing • Alpha diversity (i.e., PD) • Beta diversity • Trait evolution • Dispersal • Functional predictions • Rates of evolution • Convergence
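
As a concrete instance of the "Alpha diversity (i.e., PD)" item, phylogenetic diversity can be computed as the total branch length spanned by the taxa observed in a sample. The sketch below assumes a small made-up rooted tree stored as a child-to-parent map, and it uses the convention that the path back to the root is included; the node names and branch lengths are hypothetical.

```python
# Hypothetical rooted tree: node -> (parent, length of the branch to the parent).
tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 3.0),
    "n1": ("root", 0.5),
    "n2": ("root", 1.5),
}


def faith_pd(tree, tips):
    """Sum the lengths of all branches on the paths from the given tips to the root.

    Each branch is counted once, even if several tips share it.
    """
    seen = set()
    total = 0.0
    for tip in tips:
        node = tip
        while node in tree and node not in seen:
            seen.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


print(faith_pd(tree, ["A", "B"]))  # close relatives: 1.0 + 2.0 + 0.5
print(faith_pd(tree, ["A", "C"]))  # distant relatives: 1.0 + 0.5 + 3.0 + 1.5
```

Two samples with the same number of taxa can therefore differ in PD, depending on how much of the tree those taxa span.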
  68. 68. rRNA Phylotyping - Sargasso Metagenome Venter et al., Science 304: 66. 2004
  69. 69. RecA Phylotyping - Sargasso Metagenome Venter et al., Science 304: 66. 2004
  70. 70. Phylotyping - Sargasso Metagenome (bar chart, "Sargasso Phylotypes"): weighted % of clones (axis 0 to 0.500) by major phylogenetic group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota) for the markers rRNA, RecA, EFG, EFTu, HSP70 and RpoB. Venter et al., Science 304: 66. 2004
  71. 71. AMPHORA Phylotyping (bar chart): relative abundance (axis 0 to 0.8) of major phylotypes identified in Sargasso Sea metagenomic data (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified proteobacteria, Bacteroidetes, Chlamydiae, Cyanobacteria, Acidobacteria, Thermotogae, Fusobacteria, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified bacteria), by marker gene: dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf. Figure 3 of Wu and Eisen, Genome Biology 2008, Volume 9, Issue 10, Article R151. http://genomebiology.com/2008/9/10/R151
  72. 72. Phylogenetic ID of Novel Lineages (tree labels: GOS 1, GOS 2, GOS 3, GOS 4, GOS 5) Wu et al PLoS One 2011
  73. 73. Phylogenetic Functional Prediction Venter et al., Science 304: 66. 2004
  74. 74. Phylogenetic Binning Sulcia makes amino acids Baumannia makes vitamins and cofactors Wu et al. 2006 PLoS Biology 4: e188.
  75. 75. Improving Phylogenomics I
  76. 76. Updated Tree of Life Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  77. 77. Genomes Poorly Sampled Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  78. 78. TIGR Tree of Life Project Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  79. 79. Genomic Encyclopedia of Bacteria & Archaea Wu et al. 2009 Nature 462, 1056-1060 Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  80. 80. Genomic Encyclopedia of Bacteria & Archaea Wu et al. 2009 Nature 462, 1056-1060 Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  81. 81. Family Diversity vs. PD Wu et al. 2009 Nature 462, 1056-1060
  82. 82. The Dark Matter of Biology From Wu et al. 2009 Nature 462, 1056-1060
  83. 83. GEBA Uncultured (table: number of SAGs from candidate phyla OD1, OP11, OP3 and SAR406 at four sites) • Site A: Hydrothermal vent • Site B: Gold Mine • Site C: Tropical gyres (Mesopelagic) • Site D: Tropical gyres (Photic zone) • Sample collections at 4 additional sites are underway. Phil Hugenholtz
  84. 84. JGI Dark Matter Project (figure). Workflow: environmental samples (n=9) from brackish/freshwater, hydrothermal, sediment, seawater and bioreactor sites (TG, HSM, GBS, HOT, SAK, ETL, EPR, TA, GOM) → isolation of single cells (n=9,600) → whole genome amplification (n=3,300) → SSU rRNA gene based identification (n=2,000) → genome sequencing, assembly and QC (n=201) → draft genomes (n=201). Highlighted findings: UGA recoded for Gly (Gracilibacteria); HGT from Eukaryotes (Nanoarchaea); stringent response (Diapherotrites, Nanoarchaea); sigma factor (Diapherotrites, Nanoarchaea); archaeal type purine synthesis (Microgenomates); archaeal toxins (Nanoarchaea); murein (peptido-glycan) and lytic murein transglycosylase.
  85. 85. JGI Dark Matter Project (figure, bacterial lineages): OP11 (Microgenomates), OD1 (Parcubacteria), SR1, BH1, TM7, GN02 (Gracilibacteria), Bacteroidetes, OP1 (Acetothermia), Deinococcus-Thermus, ZB3, Fibrobacteres, TG3, Spirochaetes, WWE1 (Cloacamonetes), Proteobacteria, Firmicutes, Tenericutes, Fusobacteria, Chrysiogenetes, Chlorobi, Marinimicrobia
  86. 86. JGI Dark Matter Project (figure, bacterial lineages continued): Caldithrix, GOUTA4, Acidobacteria, Elusimicrobia, Nitrospirae, 49S1 2B, Chloroflexi, Caldiserica, AD3, OP9 (Atribacteria), Synergistetes, Thermodesulfobacteria, Deferribacteres, CD12 (Aerophobetes), OP8 (Aminicenantes), AC1, SBR1093, SPAM, GAL15, Dictyoglomi, EM3, Thermotogae, Aquificae, GAL35, EM19 (Calescamantes), Fervidibacteria
  87. 87. JGI Dark Matter Project (figure, remaining lineages). Bacteria: Deltaproteobacteria, Cyanobacteria, Actinobacteria, Gemmatimonadetes, NC10, SC4, WS2, NKB19 (Hydrogenedentes), WYO, Armatimonadetes, WS4, Planctomycetes, Chlamydiae, OP3 (Omnitrophica), Lentisphaerae, Verrucomicrobia, BRC1, Poribacteria, WS1, LD1, GN01, WS3 (Latescibacteria), GN04. Archaea: Korarchaeota, Cren Thermoprotei, Thaumarchaeota, Cren MCG, Cren pISA7, Cren C2, Aigarchaeota, Nanoarchaea, Micrarchaea, pMC2A384 (Diapherotrites), DSEG (Aenigmarchaea), Nanohaloarchaea, Euryarchaeota. Eukaryota. Woyke et al. Nature 2013.
  88. 88. A Genomic Encyclopedia of Microbes (GEM) Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  89. 89. Improving Phylogenomics II • Better Methods
  90. 90. iSEEM
  91. 91. Zorro - Automated Masking (plot: Distance to True Tree, axis 0.0 to 9.0, versus Sequence Length of 200, 400, 800, 1600 and 3200, for no masking, zorro and gblocks) Wu M, Chatterji S, Eisen JA (2012) Accounting For Alignment Uncertainty in Phylogenomics. PLoS ONE 7(1): e30288. doi:10.1371/journal.pone.0030288
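
The idea of alignment masking compared on this slide can be sketched as: score each alignment column, then drop the columns whose score falls below a cutoff before building the tree. The example below only illustrates that last filtering step. The per-column scores, the cutoff and the `mask_alignment` helper are hypothetical, and this is not Zorro's scoring model, which is described in the cited paper.

```python
def mask_alignment(alignment, column_scores, cutoff):
    """Keep only the alignment columns whose confidence score is >= cutoff.

    alignment: dict of sequence name -> aligned sequence (all the same length)
    column_scores: one confidence score per alignment column
    """
    keep = [i for i, score in enumerate(column_scores) if score >= cutoff]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Hypothetical two-sequence alignment with made-up per-column confidence scores.
alignment = {"s1": "ACGT-A", "s2": "AC-TTA"}
scores = [9.5, 8.0, 1.2, 7.0, 0.4, 9.9]

masked = mask_alignment(alignment, scores, cutoff=5.0)
print(masked)
```

Columns 3 and 5, which contain gaps and carry low scores in this toy input, are removed; the remaining columns would be passed on to tree inference.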
  92. 92. Kembel Combiner (slide shows an excerpt on UniFrac) ... typically used as a qualitative measure because duplicate sequences are usually removed from the tree. However, the test may be used in a semiquantitative manner if all clones, even those with identical or near-identical sequences, are included in the tree (13). Here we describe a quantitative version of UniFrac that we call “weighted UniFrac.” We show that weighted UniFrac behaves similarly to the FST test in situations where both a ... FIG. 1. Calculation of the unweighted and the weighted UniFrac measures. Squares and circles represent sequences from two different environments. (a) In unweighted UniFrac, the distance between the circle and square communities is calculated as the fraction of the branch length that has descendants from either the square or the circle environment (black) but not both (gray). (b) In weighted UniFrac, branch lengths are weighted by the relative abundance of sequences in the square and circle communities; square sequences are weighted twice as much as circle sequences because there are twice as many total circle sequences in the data set. The width of branches is proportional to the degree to which each branch is weighted in the calculations, and gray branches have no weight. Branches 1 and 2 have heavy weights since the descendants are biased toward the square and circles, respectively. Branch 3 contributes no value since it has an equal contribution from circle and square sequences after normalization. Kembel SW, Eisen JA, Pollard KS, Green JL (2011) The Phylogenetic Diversity of Metagenomes. PLoS ONE 6(8): e23214. doi:10.1371/journal.pone.0023214
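
The two measures described in the figure caption on this slide can be sketched on a toy tree. The code follows the caption directly: unweighted UniFrac is the fraction of branch length leading to only one community, and the weighted version sums branch lengths multiplied by the difference in relative abundance between the two communities. The tree, the counts and the function names are hypothetical, and the weighted value is left unnormalized.

```python
# Hypothetical rooted tree: node -> (parent, length of the branch to the parent).
tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 3.0),
    "n1": ("root", 0.5),
    "n2": ("root", 1.5),
}


def descendant_counts(tree, tip_counts):
    """For every branch (keyed by its child node), sum the counts of tips below it."""
    totals = {node: 0 for node in tree}
    for tip, count in tip_counts.items():
        node = tip
        while node in tree:
            totals[node] += count
            node = tree[node][0]
    return totals


def unifrac(tree, counts_a, counts_b):
    """Return (unweighted, weighted) UniFrac between two communities."""
    a = descendant_counts(tree, counts_a)
    b = descendant_counts(tree, counts_b)
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    unique = shared = weighted = 0.0
    for node, (_, length) in tree.items():
        if a[node] and b[node]:
            shared += length   # branch has descendants in both communities
        elif a[node] or b[node]:
            unique += length   # branch has descendants in only one community
        weighted += length * abs(a[node] / total_a - b[node] / total_b)
    return unique / (unique + shared), weighted


# Hypothetical sequence counts per tip for two communities.
counts_a = {"A": 2, "B": 2}
counts_b = {"B": 1, "C": 3}
unweighted, weighted = unifrac(tree, counts_a, counts_b)
print(unweighted, weighted)
```

Dividing each count by its community total is the normalization the caption refers to: a branch with equal relative abundance in both communities contributes nothing to the weighted measure.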
  93. 93. Kembel Copy # Correction Kembel SW, Wu M, Eisen JA, Green JL (2012) Incorporating 16S Gene Copy Number Information Improves Estimates of Microbial Diversity and Abundance. PLoS Comput Biol 8(10): e1002743. doi:10.1371/journal.pcbi.1002743
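
The arithmetic behind a copy number correction is simple: a taxon with many 16S copies per genome contributes proportionally more reads per cell, so read counts are divided by copy number before relative abundances are computed. The sketch below assumes the copy numbers are already known; the taxon names, counts and copy numbers are made up, and estimating copy number for uncharacterized taxa (the harder part, addressed in the cited paper) is not shown.

```python
def correct_for_copy_number(read_counts, copy_numbers):
    """Divide 16S read counts by gene copies per genome, then renormalize to sum to 1."""
    cells = {taxon: read_counts[taxon] / copy_numbers[taxon]
             for taxon in read_counts}
    total = sum(cells.values())
    return {taxon: value / total for taxon, value in cells.items()}


# Hypothetical data: taxonX has 7 copies of the 16S gene, taxonY has 1.
reads = {"taxonX": 700, "taxonY": 100}
copies = {"taxonX": 7, "taxonY": 1}

corrected = correct_for_copy_number(reads, copies)
print(corrected)
```

In this toy case the raw reads suggest taxonX is seven times more abundant than taxonY, while the corrected estimate puts the two at equal abundance.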
  94. 94. Sharpton PhylOTU: Finding Metagenomic OTUs. Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders in this generalized workflow of PhylOTU. See Results section for details. doi:10.1371/journal.pcbi.1001061.g001 Sharpton TJ, Riesenfeld SJ, Kembel SW, Ladau J, O'Dwyer JP, Green JL, Eisen JA, Pollard KS. (2011) PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity and Resolves Novel Taxa from Metagenomic Data. PLoS Comput Biol 7(1): e1001061. doi:10.1371/journal.pcbi.1001061
  95. 95. NMF in Metagenomes: Characterizing the niche-space distributions of components. Figure 3: a) Niche-space distributions for our five components (H^T); b) the site-similarity matrix (Ĥ^T Ĥ); c) environmental variables for the sites (Salinity, Sample Depth, Chlorophyll, Temperature, Insolation, Water Depth). The matrices are aligned so that the same row corresponds to the same site in each matrix. Sites are ordered by applying spectral reordering to the similarity matrix (see Materials and Methods). Rows are aligned across the three matrices. Sites are sampling stations (GS codes) from the Polynesia Archipelagos, Indian Ocean, Galapagos Islands, Caribbean Sea, Eastern Tropical Pacific, Sargasso Sea and North American East Coast. Functional biogeography of ocean microbes revealed through non-negative matrix factorization. Jiang et al. In press PLoS One. Comes out 9/18. w/ Weitz, Dushoff, Langille, Neches, Levin, etc
  96. 96. Phylosift - Mining the Global Metagenome • Erick Matsen (FHCRC) • Todd Treangen (BNBI, NBACC) • Holly Bik • Jonathan Eisen • Guillaume Jospin • Aaron Darling • Mark Brown • Tiffanie Nelson • Students and other staff: Eric Lowe, John Zhang, David Coil • Open source community: BLAST, LAST, HMMER, Infernal, pplacer, Krona, metAMOS, Bioperl, Bio::Phylo, JSON, etc. etc. • PhyloSift is open source software: http://phylosift.wordpress.org and http://github.com/gjospin/phylosift • Supported by DHS Grant
  97. 97. Phylosift/pplacer Workflow (diagram): Input Sequences, with each input sequence scanned against both workflows • rRNA workflow: LAST fast candidate search (search input against references; branches labeled 600 bp) → Infernal multiple alignment • protein workflow: LAST fast candidate search (parallel option) → hmmalign multiple alignment (profile HMMs used to align candidates to reference alignment) • pplacer phylogenetic placement • Taxonomic Summaries: Krona plots, number of reads placed for each marker gene • Sample Analysis & Comparison: Edge PCA, tree visualization, Bayes factor tests • Aaron Darling, Guillaume Jospin, Holly Bik, Erick Matsen, Eric Lowe, and others
  98. 98. Markers • PMPROK – Dongying Wu’s Bac/Arch markers • Eukaryotic Orthologs – Parfrey 2011 paper • 16S/18S rRNA • Mitochondria - protein-coding genes • Viral Markers – Markov clustering on genomes • Codon Subtrees – finer scale taxonomy • Extended Markers – plastids, gene families
  99. 99. Output 1: Taxonomy Taxonomic summary plots in Krona (Ondov et al 2011)
  100. 100. Output 2: Phylogenetic Tree of Reads Placement tree from 2 week old infant gut data
  101. 101. Edge PCA vs. UNIFRAC PCA QIIME and Edge PCA on 110 fecal metagenomes from Yatsunenko et al 2012 Nature. Sequenced with 454, to about 150Mbp/metagenome Edge PCA: Matsen and Evans 2013 Darling et al Submitted.
  102. 102. Improving Phylogenomics III • Better Data Sets
  103. 103. More Markers. Phylogenetic group (Genome Number; Gene Number; Marker Candidates): Archaea (62; 145415; 106) • Actinobacteria (63; 267783; 136) • Alphaproteobacteria (94; 347287; 121) • Betaproteobacteria (56; 266362; 311) • Gammaproteobacteria (126; 483632; 118) • Deltaproteobacteria (25; 102115; 206) • Epsilonproteobacteria (18; 33416; 455) • Bacteroides (25; 71531; 286) • Chlamydiae (13; 13823; 560) • Chloroflexi (10; 33577; 323) • Cyanobacteria (36; 124080; 590) • Firmicutes (106; 312309; 87) • Spirochaetes (18; 38832; 176) • Thermi (5; 14160; 974) • Thermotogae (9; 17037; 684). Wu D, Jospin G, Eisen JA (2013) Systematic Identification of Gene Families for Use as “Markers” for Phylogenetic and Phylogeny-Driven Ecological Studies of Bacteria and Archaea and Their Major Subgroups. PLoS ONE 8(10): e77033. doi:10.1371/journal.pone.0077033
  104. 104. Sifting Families (workflow, Figure 1, panels A to C): Representative Genomes → Extract Protein Annotation → All v. All BLAST → Homology Clustering (MCL) → SFams → Align → Build HMMs → HMMs; New Genomes → Extract Protein Annotation → Screen for Homologs. Sharpton et al. 2012. BMC Bioinformatics, 13(1), 264.
  105. 105. Better Reference Tree Lang JM, Darling AE, Eisen JA (2013) Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees and Supermatrices. PLoS ONE 8(4): e62510. doi:10.1371/journal.pone.0062510
  106. 106. Acknowledgements • GEBA: $$: DOE-JGI, DSMZ; Eddy Rubin, Phil Hugenholtz, Hans-Peter Klenk, Nikos Kyrpides, Tanya Woyke, Dongying Wu, Aaron Darling, Jenna Lang • GEBA Cyanobacteria: $$: DOE-JGI; Cheryl Kerfeld, Dongying Wu, Patrick Shih • Haloarchaea: $$$ NSF; Marc Facciotti, Aaron Darling, Erin Lynch • iSEEM: $$: GBMF; Katie Pollard, Jessica Green, Martin Wu, Steven Kembel, Tom Sharpton, Morgan Langille, Guillaume Jospin, Dongying Wu • aTOL: $$: NSF; Naomi Ward, Jonathan Badger, Frank Robb, Martin Wu, Dongying Wu • Phylosift: $$$ DHS; Aaron Darling, Erick Matsen, Holly Bik, Guillaume Jospin • Others (not mentioned in detail): $$: NSF, NIH, DOE, GBMF, DARPA, Sloan; Frank Robb, Craig Venter, Doug Rusch, Shibu Yooseph, Nancy Moran, Colleen Cavanaugh, Josh Weitz • EisenLab: Srijak Bhatnagar, Russell Neches, Lizzy Wilbanks, Holly Bik