"Phylogeny-Driven Approaches to Genomics and Metagenomics" talk by Jonathan Eisen at U. Washington on
1. Phylogeny-Driven Approaches to
Genomics and Metagenomics
Jonathan A. Eisen
University of California, Davis
@phylogenomics
Talk at
University of Washington
October 23, 2013
Wednesday, October 23, 13
2. My Obsessions
12. Four Eras of Sequencing & Microbes
13. Era I: The Tree of Life
14. Lost in Graduate School?
Colias
Tree from Woese. 1987.
Microbiological Reviews 51:221
15. Lost in Graduate School?
X
Colias
Phil Hanawalt
Tree from Woese. 1987.
Microbiological Reviews 51:221
16. Lost in Graduate School?
X
Colias
Phil Hanawalt
Adaptive Mutation
Tree from Woese. 1987.
Microbiological Reviews 51:221
17. Lost in Graduate School?
X
Colias
Phil Hanawalt
X
Adaptive Mutation
@RELenski
Tree from Woese. 1987.
Microbiological Reviews 51:221
18. Lost in Graduate School?
Get A Map
Tree from Woese. 1987.
Microbiological Reviews 51:221
19. Woese - Three Domains 1977
Tree from Woese. 1987.
Microbiological Reviews 51:221
20. Map for Graduate School
Tree from Woese. 1987.
Microbiological Reviews 51:221
21. Limited Sampling of RRR Studies
Tree from Woese. 1987.
Microbiological Reviews 51:221
22. My Study Organisms
Tree from Woese. 1987.
Microbiological Reviews 51:221
25. RecA vs. rRNA
Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
26. RecA vs. rRNA
Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
27. Whatever the History: Try to Incorporate It
from Lake et al. doi: 10.1098/rstb.2009.0035
28. Tree Updated
adapted from Baldauf, et al., in Assembling the Tree of Life, 2004
29. Era II: rRNA in the Environment
30. PCR and phylogenetic analysis of rRNA genes
DNA extraction
PCR: makes lots of copies of the rRNA genes in sample
Sequence rRNA genes
rRNA1: 5’...TACAGTATAGGTGGAGCTAGCGATCGATCGA... 3’
Sequence alignment = Data matrix (rows: rRNA1, Yeast, E. coli, Humans)
Phylogenetic tree (tips: rRNA1, Yeast, E. coli, Humans)
32. PCR and phylogenetic analysis of rRNA genes
DNA extraction
PCR: makes lots of copies of the rRNA genes in sample
Sequence rRNA genes
rRNA1: 5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’
rRNA2: 5’...TACAGTATAGGTGGAGCTAGCGATCGATCGA... 3’
Sequence alignment = Data matrix (rows: rRNA1, rRNA2, Yeast, E. coli, Humans)
Phylogenetic tree (tips: rRNA1, rRNA2, Yeast, E. coli, Humans)
33. PCR and phylogenetic analysis of rRNA genes
DNA extraction
PCR: makes lots of copies of the rRNA genes in sample
Sequence rRNA genes
rRNA1: 5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’
rRNA2: 5’...TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’
rRNA3: 5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’
rRNA4: 5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’
Sequence alignment = Data matrix (rows: rRNA1, rRNA2, rRNA3, rRNA4, Yeast, E. coli, Humans)
Phylogenetic tree (tips: rRNA1, rRNA2, rRNA3, rRNA4, Yeast, E. coli, Humans)
34. PCR and phylogenetic analysis of rRNA genes
DNA extraction
PCR: makes lots of copies of the rRNA genes in sample
Sequence rRNA genes
rRNA1: 5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’
rRNA2: 5’...TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’
rRNA3: 5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’
rRNA4: 5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’
Phylogeny
Sequence alignment = Data matrix (rows: rRNA1, rRNA2, rRNA3, rRNA4, Yeast, E. coli, Humans)
Phylogenetic tree (tips: rRNA1, rRNA2, rRNA3, rRNA4, Yeast, E. coli, Humans)
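The last two steps of this pipeline (alignment treated as a data matrix, then a tree) can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: it assumes a simple p-distance (fraction of differing columns) and UPGMA average-linkage clustering, and it returns topology only, with no branch lengths. The six-column `toy_alignment`, its labels, and the function names are all hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(alignment):
    """Average-linkage clustering of aligned sequences; returns a Newick topology."""
    # Each cluster label maps to the list of original sequences it contains.
    clusters = {name: [name] for name in alignment}
    # Pairwise distances between the original sequences (the "data matrix" step).
    dist = {
        frozenset(pair): p_distance(alignment[pair[0]], alignment[pair[1]])
        for pair in combinations(alignment, 2)
    }

    def cluster_distance(c1, c2):
        # UPGMA: mean distance over all member pairs of the two clusters.
        pairs = [(m1, m2) for m1 in clusters[c1] for m2 in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        merged = f"({c1},{c2})"
        clusters[merged] = clusters.pop(c1) + clusters.pop(c2)
    return next(iter(clusters)) + ";"


# Hypothetical toy alignment: one environmental sequence plus three references.
toy_alignment = {
    "rRNA1": "ACACAC",
    "Yeast": "TACAGT",
    "E_coli": "AGACAG",
    "Humans": "TATAGT",
}

print(upgma(toy_alignment))
```

In this toy data the environmental sequence groups with the bacterial reference, which is the kind of placement the slides use to identify what is in a sample. Real analyses would use a proper aligner and a model-based tree method rather than raw p-distances.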
35. Uses of rRNA Phylogeny
• OTUs
• Taxonomic lists
• Relative abundance of taxa
• Ecological metrics (alpha / beta diversity)
• Phylogenetic metrics
• Binning
• Identification of novel groups
• Clades
• Rates of change
• LGT
• Convergence
• PD
• Phylogenetic ecology (e.g., Unifrac)
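As a rough illustration of the non-phylogenetic items in this list (relative abundance of taxa, alpha diversity, beta diversity), the sketch below works from per-sample OTU counts. The choice of Shannon's index for alpha diversity and Bray-Curtis dissimilarity for beta diversity is an assumption made here for illustration; the slide does not name specific metrics. The sample data and function names are hypothetical.

```python
import math


def relative_abundance(counts):
    """Convert raw OTU counts into proportions of the sample total."""
    total = sum(counts.values())
    return {otu: n / total for otu, n in counts.items() if n > 0}


def shannon(counts):
    """Alpha diversity: Shannon index (natural log) of one sample."""
    return -sum(p * math.log(p) for p in relative_abundance(counts).values())


def bray_curtis(counts_a, counts_b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples."""
    otus = set(counts_a) | set(counts_b)
    shared = sum(min(counts_a.get(o, 0), counts_b.get(o, 0)) for o in otus)
    return 1 - 2 * shared / (sum(counts_a.values()) + sum(counts_b.values()))


# Hypothetical OTU tables for two samples.
sample_a = {"OTU1": 50, "OTU2": 30, "OTU3": 20}
sample_b = {"OTU1": 10, "OTU3": 40, "OTU4": 50}

print(relative_abundance(sample_a))
print(shannon(sample_a), shannon(sample_b))
print(bray_curtis(sample_a, sample_b))
```

Phylogenetic metrics such as PD or UniFrac additionally need a tree relating the OTUs, which this sketch does not model.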
36. Sequencing Has Gone Crazy
Approaching to NGS
1953: Discovery of DNA structure (Cold Spring Harb. Symp. Quant. Biol. 1953;18:123-31)
1977: Sanger sequencing method by F. Sanger (PNAS, 1977, 74: 560-564)
1983: PCR by K. Mullis (Cold Spring Harb Symp Quant Biol. 1986;51 Pt 1:263-73)
Human Genome Project (Nature, 2001, 409: 860–92; Science, 2001, 291: 1304–1351)
1993: Development of pyrosequencing (Anal. Biochem., 1993, 208: 171-175; Science, 1998, 281: 363-365)
1998: Single molecule emulsion PCR
1998: Founded Solexa
2000: Founded 454 Life Science
2005: 454 GS20 sequencer (First NGS sequencer)
2006: Solexa Genome Analyzer (First short-read NGS sequencer)
2006: Illumina acquires Solexa (Illumina enters the NGS business)
2007: ABI SOLiD (Short-read sequencer based upon ligation)
2007: Roche acquires 454 Life Sciences (Roche enters the NGS business)
2008: GS FLX sequencer (NGS with 400-500 bp read length)
2008: NGS Human Genome sequencing (First Human Genome sequencing based upon NGS technology)
2010: Hi-Seq2000 (200Gbp per Flow Cell)
Miseq, Roche Jr, Ion Torrent, PacBio, Oxford
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing
37. rRNA PCR Revolution
• More PCR products
• Deeper sequencing
• The rare biosphere
• Relative abundance estimates
• More samples (with barcoding)
• Time series
• Spatially diverse sampling
• Fine scale sampling
38. Beta-Diversity
Drivers of bacterial β-diversity depend on spatial scale
Jennifer B. H. Martiny, Jonathan A. Eisen, Kevin Penn, Steven D. Allison, and M. Claire Horner-Devine
PNAS, May 10, 2011, vol. 108, no. 19, 7850–7854
The factors driving β-diversity (variation in community composition) yield insights into the maintenance of biodiversity on the planet. Here we tested whether the mechanisms that underlie bacterial β-diversity vary over centimeters to continental spatial scales by comparing the composition of ammonia-oxidizing bacteria communities in salt marsh sediments. As observed in studies of macroorganisms, the drivers of salt marsh bacterial β-diversity depend on spatial scale. In contrast to macroorganism studies, however, we found no evidence of evolutionary diversification of ammonia-oxidizing bacteria taxa at the continental scale, despite an overall relationship between geographic distance and community similarity. Our data are consistent with the idea that dispersal limitation at local scales can contribute to β-diversity, even though the 16S rRNA genes of the relatively common taxa are globally distributed. These results highlight the importance of considering multiple spatial scales for understanding microbial biogeography.
Keywords: microbial composition | distance-decay | Nitrosomonadales | ecological drift
Fig. 1. The 13 marshes sampled (see Table S1 for details). Marshes compared with one another within regions are circled. (Inset) The arrangement of sampling points within marshes. Six points were sampled along a 100-m transect, and a seventh point was sampled ∼1 km away. Two marshes in the Northeast United States (outlined stars) were sampled more intensively, along four 100-m transects in a grid pattern.
Fig. 2. Distance-decay curves for the Nitrosomonadales communities. The dashed, blue line denotes the least-squares linear regression across all spatial scales. The solid lines denote separate regressions within each of the three spatial scales: within marshes, regional (across marshes within regions circled in Fig. 1), and continental (across regions). The slopes of all lines (except the solid light blue line) are significantly less than zero. The slopes of the solid red lines are significantly different from the slope of the all scale (blue dashed) line.
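A distance-decay curve of the kind described for this paper relates pairwise community similarity to geographic distance and fits a least-squares line through the points. The sketch below is a simplified stand-in, not the paper's analysis: it assumes a plain (non-rarefied) Sørensen index on OTU presence or absence, sample positions along a one-dimensional transect, and a regression of similarity on log10 distance. All sample data and function names are hypothetical.

```python
import math
from itertools import combinations


def sorensen(otus_a, otus_b):
    """Sørensen similarity between two sets of OTUs (presence/absence only)."""
    return 2 * len(otus_a & otus_b) / (len(otus_a) + len(otus_b))


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (x, y) points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def distance_decay_slope(samples):
    """samples: list of (position_in_metres, set_of_OTUs) pairs."""
    xs, ys = [], []
    for (pos_a, otus_a), (pos_b, otus_b) in combinations(samples, 2):
        xs.append(math.log10(abs(pos_a - pos_b)))  # geographic distance
        ys.append(sorensen(otus_a, otus_b))        # community similarity
    return least_squares_slope(xs, ys)


# Hypothetical samples: two close together, one far away.
toy_samples = [
    (0, {"A", "B", "C", "D"}),
    (10, {"A", "B", "C", "E"}),
    (1000, {"A", "F", "G", "H"}),
]

print(distance_decay_slope(toy_samples))
```

A negative slope means similarity falls off with distance. Fitting separate lines to pairs within each spatial scale, as in Fig. 2, would amount to calling `least_squares_slope` on the corresponding subsets of pairs.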
39. Drosophila microbiome
Both natural surveys and laboratory experiments indicate
that host diet plays a major role in shaping the Drosophila
bacterial microbiome.
Laboratory strains provide only a limited model of natural
host–microbe interactions
40. The Built Environment
Microbial Biogeography of Public Restroom Surfaces
Gilberto E. Flores, Scott T. Bates, Dan Knights, Christian L. Lauber, Jesse Stombaugh, Rob Knight, Noah Fierer
Citation: Flores GE, Bates ST, Knights D, Lauber CL, Stombaugh J, et al. (2011) Microbial Biogeography of Public Restroom Surfaces. PLoS ONE 6(11): e28132. doi:10.1371/journal.pone.0028132
Abstract
We spend the majority of our lives indoors where we are constantly exposed to bacteria residing on surfaces. However, the diversity of these surface-associated communities is largely unknown. We explored the biogeographical patterns exhibited by bacteria across ten surfaces within each of twelve public restrooms. Using high-throughput barcoded pyrosequencing of the 16S rRNA gene, we identified 19 bacterial phyla across all surfaces. Most sequences belonged to four phyla: Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria. The communities clustered into three general categories: those found on surfaces associated with toilets, those on the restroom floor, and those found on surfaces routinely touched with hands. On toilet surfaces, gut-associated taxa were more prevalent, suggesting fecal contamination of these surfaces. Floor surfaces were the most diverse of all communities and contained several taxa commonly found in soils. Skin-associated bacteria, especially the Propionibacteriaceae, dominated surfaces routinely touched with our hands. Certain taxa were more common in female than in male restrooms as vagina-associated Lactobacillaceae were widely distributed in female restrooms, likely from urine contamination. Use of the SourceTracker algorithm confirmed many of our taxonomic observations as human skin was the primary source of bacteria on restroom surfaces. Overall, these results demonstrate that restroom surfaces host relatively diverse microbial communities dominated by human-associated bacteria with clear linkages between communities on or in different body sites and those communities found on restroom surfaces. More generally, this work is relevant to the public health field as we show that human-associated microbes are commonly found on restroom surfaces suggesting that bacterial pathogens could readily be transmitted between individuals by the touching of surfaces. Furthermore, we demonstrate that we can use high-throughput analyses of bacterial communities to determine sources of bacteria on indoor surfaces, an approach which could be used to track pathogen transmission and test the efficacy of hygiene practices.
Figure 3. Cartoon illustrations of the relative abundance of discriminating taxa on public restroom surfaces. Light blue indicates low abundance while dark blue indicates high abundance of taxa. (A) Although skin-associated taxa (Propionibacteriaceae, Corynebacteriaceae, Staphylococcaceae and Streptococcaceae) were abundant on all surfaces, they were relatively more abundant on surfaces routinely touched with hands. (B) Gut-associated taxa (Clostridiales, Clostridiales group XI, Ruminococcaceae, Lachnospiraceae, Prevotellaceae and Bacteroidaceae) were most abundant on toilet surfaces. (C) Although soil-associated taxa (Rhodobacteraceae, Rhizobiales, Microbacteriaceae and Nocardioidaceae) were in low abundance on all restroom surfaces, they were relatively more abundant on the floor of the restrooms we surveyed. Figure not drawn to scale. doi:10.1371/journal.pone.0028132.g003
[Chart: average contribution (%) of sources (Soil, Water, Mouth, Urine, Gut, Skin) to the bacteria on each restroom surface]
Architectural design influences the diversity and structure of the built environment microbiome
Steven W Kembel, Evan Jones, Jeff Kline, Dale Northcutt, Jason Stenson, Ann M Womack, Brendan JM Bohannan, G Z Brown and Jessica L Green
The ISME Journal (2012), 1–11; advance online publication, 26 January 2012; doi:10.1038/ismej.2011.211
Buildings are complex ecosystems that house trillions of microorganisms interacting with each other, with humans and with their environment. Understanding the ecological and evolutionary processes that determine the diversity and composition of the built environment microbiome (the community of microorganisms that live indoors) is important for understanding the relationship between building design, biodiversity and human health. In this study, we used high-throughput sequencing of the bacterial 16S rRNA gene to quantify relationships between building attributes and airborne bacterial communities at a health-care facility. We quantified airborne bacterial community structure and environmental conditions in patient rooms exposed to mechanical or window ventilation and in outdoor air. The phylogenetic diversity of airborne bacterial communities was lower indoors than outdoors, and mechanically ventilated rooms contained less diverse microbial communities than did window-ventilated rooms. Bacterial communities in indoor environments contained many taxa that are absent or rare outdoors, including taxa closely related to potential human pathogens. Building attributes, specifically the source of ventilation air, airflow rates, relative humidity and temperature, were correlated with the diversity and composition of indoor bacterial communities. The relative abundance of bacteria closely related to human pathogens was higher indoors than outdoors, and higher in rooms with lower airflow rates and lower relative humidity. The observed relationship between building design and airborne bacterial diversity suggests that we can manage indoor environments, altering through building design and operation the community of microbial species that potentially colonize the human microbiome during our time indoors.
Keywords: aeromicrobiology; bacteria; built environment microbiome; community ecology; dispersal; environmental filtering
50. Phylogenomics
PHYLOGENETIC PREDICTION OF GENE FUNCTION
[Flowchart with two worked examples, EXAMPLE A and EXAMPLE B; gene labels 1-6 and 1A, 2A, 3A, 1B, 2B, 3B]
METHOD
• CHOOSE GENE(S) OF INTEREST
• IDENTIFY HOMOLOGS
• ALIGN SEQUENCES
• CALCULATE GENE TREE (candidate node marked "Duplication?")
• OVERLAY KNOWN FUNCTIONS ONTO TREE
• INFER LIKELY FUNCTION OF GENE(S) OF INTEREST (one inference marked "Ambiguous")
ACTUAL EVOLUTION (ASSUMED TO BE UNKNOWN): Species 1 (1A, 1B), Species 2 (2A, 2B), Species 3 (3A, 3B), with a Duplication
Based on Eisen, 1998 Genome Res 8: 163-167.
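The last two steps of the flowchart (overlay known functions, then infer the likely function) can be sketched in a few lines. This is only an illustration, not the procedure from Eisen 1998: it assumes the gene tree is already computed and given as nested tuples, and it uses a simple "smallest enclosing clade" rule. The tree shape, the function labels and the helper names `leaves`, `path_to` and `infer_function` are all hypothetical.

```python
def leaves(tree):
    """Return all leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    found = []
    for child in tree:
        found.extend(leaves(child))
    return found


def path_to(tree, target):
    """Return the clades from the root down to the target leaf, or None if absent."""
    if isinstance(tree, str):
        return [tree] if tree == target else None
    for child in tree:
        below = path_to(child, target)
        if below is not None:
            return [tree] + below
    return None


def infer_function(tree, known, query):
    """Predict a function from the smallest clade around `query` that holds annotated genes."""
    path = path_to(tree, query)
    if path is None:
        raise ValueError(f"{query} is not in the tree")
    # Walk outward from the query: smallest enclosing clade first, root last.
    for clade in reversed(path[:-1]):
        functions = {known[name] for name in leaves(clade)
                     if name != query and name in known}
        if len(functions) == 1:
            return functions.pop()
        if len(functions) > 1:
            return "ambiguous"
    return "unknown"


# Hypothetical gene tree: an ancient duplication gave rise to an "A" and a "B" subfamily.
gene_tree = (("1A", ("2A", "3A")), ("1B", ("2B", "3B")))
known = {"1A": "function X", "3A": "function X",
         "1B": "function Y", "2B": "function Y"}

print(infer_function(gene_tree, known, "2A"))
print(infer_function(gene_tree, known, "3B"))
```

A gene that branches outside both subfamilies would be reported as "ambiguous" under this rule, which mirrors the "Ambiguous" label in the flowchart.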
52. Phylogenetic Prediction of Function
• Many powerful and automated similarity-based methods for assigning genes to protein families
  • COGs
  • Pfam HMM searches
• Some limitations of similarity-based methods can be overcome by phylogenetic approaches
• Automated methods now available
  • Sean Eddy
  • Steven Brenner
  • Kimmen Sjölander
• But …
53. Carboxydothermus hydrogenoformans
• Isolated from a Russian hot spring
• Thermophile (grows at 80°C)
• Anaerobic
• Grows very efficiently on CO (carbon monoxide)
• Produces hydrogen gas
• Low GC Gram positive (Firmicute)
• Genome determined (Wu et al. 2005 PLoS Genetics 1: e65)
54. Homologs of Sporulation Genes
Wu et al. 2005 PLoS Genetics 1: e65.
56. Non-Homology Predictions: Phylogenetic Profiling
• Step 1: Search all genes in organisms of interest against all other genomes
• Ask: Yes or No, is each gene found in each other species
• Cluster genes by distribution patterns (profiles)
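The three profiling steps can be sketched as follows. This is a minimal illustration that assumes the homology searches have already been run and summarized as "gene, set of genomes with a hit"; the gene names, genome names and helper functions are hypothetical, and genes are grouped only when their presence/absence profiles are identical (real analyses usually allow near matches).

```python
from collections import defaultdict


def build_profiles(hits, genomes):
    """Turn per-gene hit sets into yes/no (1/0) presence vectors, one entry per genome."""
    return {gene: tuple(int(genome in found) for genome in genomes)
            for gene, found in hits.items()}


def cluster_by_profile(profiles):
    """Group genes that share exactly the same presence/absence profile."""
    clusters = defaultdict(list)
    for gene, profile in profiles.items():
        clusters[profile].append(gene)
    return {profile: sorted(genes) for profile, genes in clusters.items()}


# Hypothetical search results: which other genomes contain a homolog of each gene.
genomes = ["genome1", "genome2", "genome3", "genome4"]
hits = {
    "geneA": {"genome1", "genome3"},
    "geneB": {"genome1", "genome3"},
    "geneC": {"genome2", "genome4"},
    "geneD": {"genome1", "genome2", "genome3", "genome4"},
}

profiles = build_profiles(hits, genomes)
clusters = cluster_by_profile(profiles)
for profile, genes in clusters.items():
    print(profile, genes)
```

Here geneA and geneB end up in the same cluster because they are present in, and absent from, the same genomes, which is the signal used to suggest a shared role without relying on sequence homology between the two genes.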
62. Era IV: Genomes in the Environment
63. PCR and phylogenetic analysis of rRNA genes
[Workflow diagram]
• DNA extraction
• PCR: makes lots of copies of the rRNA genes in sample
• Sequence rRNA genes
• Sequence alignment = Data matrix (rows: rRNA1, rRNA2, rRNA3, rRNA4, E. coli, Humans, Yeast; columns: aligned nucleotide positions)
• Phylogenetic tree (tips: rRNA1, rRNA2, rRNA3, rRNA4, E. coli, Humans, Yeast)
• Phylotyping
Example sequences:
rRNA1 5’...ACACACATAGGTGGAGCTAGCGATCGATCGA... 3’
rRNA2 5’..TACAGTATAGGTGGAGCTAGCGACGATCGA... 3’
rRNA3 5’...ACGGCAAAATAGGTGGATTCTAGCGATATAGA... 3’
rRNA4 5’...ACGGCCCGATAGGTGGATTCTAGCGCCATAGA... 3’
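To make "sequence alignment = data matrix" concrete, the sketch below treats each aligned sequence as a row and computes a simple pairwise distance (the fraction of differing columns). It is not a tree-building method; a real analysis would hand the alignment to phylogenetic software. The 19-column windows were trimmed by hand from the four example sequences so that the shared ATAGGTGG motif lines up; that trimming is an assumption for illustration, not an alignment from the talk.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned columns that differ, skipping columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0


def nearest(name, alignment):
    """Return the other sequence in the alignment with the smallest p-distance."""
    others = [other for other in alignment if other != name]
    return min(others, key=lambda other: p_distance(alignment[name], alignment[other]))


# Hand-trimmed, equal-length windows of the example rRNA sequences (illustrative only).
alignment = {
    "rRNA1": "ACACACATAGGTGGAGCTA",
    "rRNA2": "TACAGTATAGGTGGAGCTA",
    "rRNA3": "GGCAAAATAGGTGGATTCT",
    "rRNA4": "GGCCCGATAGGTGGATTCT",
}

for name in alignment:
    closest = nearest(name, alignment)
    print(name, "is closest to", closest,
          round(p_distance(alignment[name], alignment[closest]), 3))
```

With these windows, rRNA1 pairs with rRNA2 and rRNA3 pairs with rRNA4, matching the grouping one would expect from the sequences as written.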
66. Phylogeny has many uses in shotgun metagenomics
[Workflow diagram]
• DNA extraction
• PCR
• Phylotyping
• Phylogenetic tree (tips: rRNA1, rRNA2, rRNA3, rRNA4, E. coli, Humans, Yeast)
• Shotgun: sequence all genes
68. rRNA Phylotyping - Sargasso Metagenome
Venter et al., Science 304: 66. 2004
69. RecA Phylotyping - Sargasso Metagenome
Venter et al., Science 304: 66. 2004
70. Phylotyping - Sargasso Metagenome
[Bar chart: Sargasso Phylotypes. x-axis: Major Phylogenetic Group (Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Epsilonproteobacteria, Deltaproteobacteria, Cyanobacteria, Firmicutes, Actinobacteria, Chlorobi, CFB, Chloroflexi, Spirochaetes, Fusobacteria, Deinococcus-Thermus, Euryarchaeota, Crenarchaeota); y-axis: Weighted % of Clones (0 to 0.500); series: EFG, EFTu, HSP70, RecA, RpoB, rRNA]
Venter et al., Science 304: 66. 2004
71. AMPHORA Phylotyping
[Bar chart, Figure 3: Major phylotypes identified in Sargasso Sea metagenomic data. y-axis: Relative abundance (0 to 0.8); x-axis: Alphaproteobacteria, Betaproteobacteria, Gammaproteobacteria, Deltaproteobacteria, Epsilonproteobacteria, Unclassified Proteobacteria, Bacteroidetes, Chlamydiae, Cyanobacteria, Acidobacteria, Thermotogae, Fusobacteria, Actinobacteria, Aquificae, Planctomycetes, Spirochaetes, Firmicutes, Chloroflexi, Chlorobi, Unclassified Bacteria; one series per marker gene]
Marker genes: dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, tsf
Wu and Eisen, Genome Biology 2008, Volume 9, Issue 10, Article R151 (R151.7)
http://genomebiology.com/2008/9/10/R151
72. Phylogenetic ID of Novel Lineages
GOS 1, GOS 2, GOS 3, GOS 4, GOS 5
Wu et al PLoS One 2011
74. Phylogenetic Binning
Sulcia makes amino acids
Baumannia makes vitamins and cofactors
Wu et al. 2006 PLoS Biology 4: e188.
76. Updated Tree of Life
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
77. Genomes Poorly Sampled
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
78. TIGR Tree of Life Project
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
79. Genomic Encyclopedia of Bacteria & Archaea
Wu et al. 2009 Nature 462, 1056-1060
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
81. Family Diversity vs. PD
Wu et al. 2009 Nature 462, 1056-1060
82. The Dark Matter of Biology
From Wu et al. 2009 Nature 462, 1056-1060
83. GEBA Uncultured
Number of SAGs from Candidate Phyla (table; cell layout not recoverable from the slide text)
Candidate phyla: SAR406, OP3, OP1, OD1
Site A: Hydrothermal vent
Site B: Gold Mine
Site C: Tropical gyres (Mesopelagic)
Site D: Tropical gyres (Photic zone)
Cell values as extracted: 1, 4, 6, 1, 1, 13, -, 2, -, 2, -
Sample collections at 4 additional sites are underway.
Phil Hugenholtz
84. JGI Dark Matter Project
[Figure from Woyke et al. Nature 2013; labels recovered from the slide image, a few are unreadable]
Sample environments: brackish/freshwater, hydrothermal, sediment, seawater, bioreactor
Site labels: TG, HSM, GBS, HOT, SAK, ETL, EPR, TA, GOM
Workflow: environmental samples (n=9) → isolation of single cells (n=9,600) → whole genome amplification (n=3,300) → SSU rRNA gene based identification (n=2,000) → genome sequencing, assembly and QC (n=201) → draft genomes (n=201)
Highlighted features:
• UGA recoded for Gly (Gracilibacteria)
• HGT from Eukaryotes (Nanoarchaea)
• archaeal type purine synthesis (Microgenomates)
• sigma factor (Diapherotrites, Nanoarchaea)
• stringent response (Diapherotrites, Nanoarchaea)
• archaeal toxins (Nanoarchaea)
• oxidoreductase
• lytic murein transglycosylase; murein (peptidoglycan)
BACTERIA: OP11 (Microgenomates), OD1 (Parcubacteria), SR1, BH1, TM7, GN02 (Gracilibacteria), Bacteroidetes, OP1 (Acetothermia), Deinococcus-Thermus, ZB3, Fibrobacteres, TG3, Spirochaetes, WWE1 (Cloacamonetes), Proteobacteria, Firmicutes, Tenericutes, Fusobacteria, Chrysiogenetes, Chlorobi, SAR406 (Marinimicrobia), Deltaproteobacteria, Cyanobacteria, WPS-2, Actinobacteria, Gemmatimonadetes, NC10, SC4, WS2, NKB19 (Hydrogenedentes), WYO, Armatimonadetes, WS4, Planctomycetes, Chlamydiae, OP3 (Omnitrophica), Lentisphaerae, Verrucomicrobia, BRC1, Poribacteria, WS1, LD1, GN01, WS3 (Latescibacteria), GN04
ARCHAEA: Korarchaeota, Cren Thermoprotei, Thaumarchaeota, Cren MCG, Cren pISA7, Cren C2, Aigarchaeota, Nanoarchaea, Micrarchaea, pMC2A384 (Diapherotrites), DSEG (Aenigmarchaea), Nanohaloarchaea, Euryarchaeota
Eukaryota
Woyke et al. Nature 2013.
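The "UGA recoded for Gly (Gracilibacteria)" label on this figure can be illustrated with a small translation sketch. It builds the standard genetic code from the usual TCAG-ordered table and then makes a copy in which UGA (TGA in DNA) encodes glycine instead of stop. The example sequence and the names `STANDARD_CODE`, `RECODED` and `translate` are hypothetical; nothing here comes from the genomes in the figure.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' marks a stop).
STANDARD_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD_CODE = dict(zip(CODONS, STANDARD_AMINO_ACIDS))

# Same table, except that UGA (TGA in DNA) is read as glycine rather than stop.
RECODED = dict(STANDARD_CODE, TGA="G")


def translate(sequence, code):
    """Translate an RNA or DNA coding sequence up to the first stop codon."""
    dna = sequence.upper().replace("U", "T")
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = code[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical message: AUG UGA GGC UAA
message = "AUGUGAGGCUAA"
print(translate(message, STANDARD_CODE))  # translation ends at UGA
print(translate(message, RECODED))        # UGA is read through as Gly
```

Under the standard code the toy message yields a single residue, while the recoded table reads through UGA, which is why using the wrong table makes genes from such lineages look truncated.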
88. A Genomic Encyclopedia of Microbes (GEM)
Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
91. Zorro - Automated Masking
[Plot: Distance to True Tree (y-axis, 0.0 to 9.0) versus Sequence Length (x-axis: 200, 400, 800, 1600, 3200); series: no masking, zorro, gblocks]
Wu M, Chatterji S, Eisen JA (2012) Accounting For Alignment Uncertainty in Phylogenomics. PLoS ONE 7(1): e30288. doi:10.1371/journal.pone.0030288
92. Kembel Combiner
… typically used as a qualitative measure because duplicate sequences are usually removed from the tree. However, the test may be used in a semiquantitative manner if all clones, even those with identical or near-identical sequences, are included in the tree (13).
Here we describe a quantitative version of UniFrac that we call “weighted UniFrac.” We show that weighted UniFrac behaves similarly to the FST test in situations where both a …
FIG. 1. Calculation of the unweighted and the weighted UniFrac measures. Squares and circles represent sequences from two different environments. (a) In unweighted UniFrac, the distance between the circle and square communities is calculated as the fraction of the branch length that has descendants from either the square or the circle environment (black) but not both (gray). (b) In weighted UniFrac, branch lengths are weighted by the relative abundance of sequences in the square and circle communities; square sequences are weighted twice as much as circle sequences because there are twice as many total circle sequences in the data set. The width of branches is proportional to the degree to which each branch is weighted in the calculations, and gray branches have no weight. Branches 1 and 2 have heavy weights since the descendants are biased toward the square and circles, respectively. Branch 3 contributes no value since it has an equal contribution from circle and square sequences after normalization.
Kembel SW, Eisen JA, Pollard KS, Green JL (2011) The Phylogenetic Diversity of Metagenomes. PLoS ONE 6(8): e23214. doi:10.1371/journal.pone.0023214
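The two measures described in the FIG. 1 caption can be sketched on a toy tree. This is a minimal illustration, not the published implementation: the tree is stored as a hypothetical child → (parent, branch length) map, the leaf counts are made up, and the weighted value is the raw (unnormalized) sum of branch length times the absolute difference in relative abundance, which is my reading of the caption.

```python
def descendant_counts(parent, counts):
    """Sum per-environment sequence counts over the leaves below every node."""
    totals = {}
    for leaf, env_counts in counts.items():
        node = leaf
        while node is not None:
            node_totals = totals.setdefault(node, {})
            for env, n in env_counts.items():
                node_totals[env] = node_totals.get(env, 0) + n
            node = parent[node][0] if node in parent else None  # stop after the root
    return totals


def unweighted_unifrac(parent, counts, env_a, env_b):
    """Fraction of branch length leading to descendants from only one environment."""
    totals = descendant_counts(parent, counts)
    unique = total = 0.0
    for node, (_, length) in parent.items():
        in_a = totals.get(node, {}).get(env_a, 0) > 0
        in_b = totals.get(node, {}).get(env_b, 0) > 0
        if in_a or in_b:
            total += length
            if in_a != in_b:
                unique += length
    return unique / total


def weighted_unifrac(parent, counts, env_a, env_b):
    """Raw weighted value: sum of length * |relative abundance in A - in B| per branch."""
    totals = descendant_counts(parent, counts)
    a_total = sum(c.get(env_a, 0) for c in counts.values())
    b_total = sum(c.get(env_b, 0) for c in counts.values())
    score = 0.0
    for node, (_, length) in parent.items():
        node_totals = totals.get(node, {})
        score += length * abs(node_totals.get(env_a, 0) / a_total
                              - node_totals.get(env_b, 0) / b_total)
    return score


# Hypothetical tree: root has children X and Y; X has leaves L1, L2; Y has leaves L3, L4.
parent = {"X": ("root", 1.0), "Y": ("root", 1.0),
          "L1": ("X", 0.5), "L2": ("X", 0.5),
          "L3": ("Y", 0.5), "L4": ("Y", 0.5)}
counts = {"L1": {"square": 1}, "L2": {"square": 1},
          "L3": {"square": 1}, "L4": {"circle": 2}}

print(unweighted_unifrac(parent, counts, "square", "circle"))
print(weighted_unifrac(parent, counts, "square", "circle"))
```

In this toy tree only the branch above Y is shared by both environments, so three of the four units of branch length count as unique in the unweighted measure.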
93. Kembel Copy # Correction
Kembel SW, Wu M, Eisen JA, Green JL (2012) Incorporating 16S Gene Copy Number Information Improves Estimates
of Microbial Diversity and Abundance. PLoS Comput Biol 8(10): e1002743. doi:10.1371/journal.pcbi.1002743
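The basic arithmetic behind a copy number correction can be sketched as below. This assumes the 16S copy number of each taxon is already known; the cited paper estimates copy numbers rather than taking them as given, and that step is not shown. The taxon names, read counts and copy numbers are hypothetical.

```python
def correct_for_copy_number(read_counts, copy_number):
    """Divide 16S read counts by gene copy number, then renormalize to relative abundance."""
    adjusted = {taxon: reads / copy_number[taxon]
                for taxon, reads in read_counts.items()}
    total = sum(adjusted.values())
    return {taxon: value / total for taxon, value in adjusted.items()}


# Hypothetical community: equal read counts, very different 16S copy numbers.
read_counts = {"taxonA": 100, "taxonB": 100}
copy_number = {"taxonA": 10, "taxonB": 1}

corrected = correct_for_copy_number(read_counts, copy_number)
for taxon, fraction in corrected.items():
    print(taxon, round(fraction, 3))
```

Uncorrected, the two taxa look equally abundant; after correction the single-copy taxon accounts for most of the estimated cells.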
94. Sharpton PhylOTU
Finding Metagenomic OTU
Figure 1. PhylOTU Workflow. Computational processes are represented as squares and databases are represented as cylinders in this generalized workflow of PhylOTU. See Results section for details. doi:10.1371/journal.pcbi.1001061.g001
Sharpton TJ, Riesenfeld SJ, Kembel SW, Ladau J, O'Dwyer JP, Green JL, Eisen JA, Pollard KS. (2011) PhylOTU: A High-Throughput Procedure Quantifies Microbial Community Diversity and Resolves Novel Taxa from Metagenomic Data. PLoS Comput Biol 7(1): e1001061. doi:10.1371/journal.pcbi.1001061
95. NMF in Metagenomes
Characterizing the niche-space distributions of components
[Figure with three row-aligned panels: (a) Component 1 to Component 5; (b) site-similarity matrix; (c) environmental variables (Salinity, Sample Depth, Chlorophyll, Temperature, Insolation, Water Depth). Rows are 45 Sites from the Polynesia Archipelagos, Indian Ocean, Galapagos Islands, Caribbean Sea, Eastern Tropical Pacific, Sargasso Sea and North American East Coast, with habitats including Open Ocean, Coastal, Coral Reef, Lagoon Reef, Fringing Reef, Harbor, Warm Seep, Coastal upwelling, Estuary and Embayment. Legend: General (High, Medium, Low, NA); Water depth (4000m, 2000-4000m, 900-2000m, 100-200m, 20-100m, 0-20m)]
Figure 3: a) Niche-space distributions for our five components (H^T); b) the site-similarity matrix (Ĥ^T Ĥ); c) environmental variables for the sites. The matrices are aligned so that the same row corresponds to the same site in each matrix. Sites are ordered by applying spectral reordering to the similarity matrix (see Materials and Methods). Rows are aligned across the three matrices.
Functional biogeography of ocean microbes revealed through non-negative matrix factorization. Jiang et al. In press PLoS One. Comes out 9/18.
w/ Weitz, Dushoff, Langille, Neches, Levin, etc
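For readers unfamiliar with non-negative matrix factorization, here is a small pure-Python sketch using the classic multiplicative update rules for the squared-error objective. It is a generic illustration, not the analysis from the cited paper: the 4 x 4 matrix is invented (rows as hypothetical sites, columns as hypothetical function counts), and the names `nmf`, `W` and `H` are generic choices that need not match the paper's notation.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Square root of the summed squared differences between V and W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh)) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # Update H: H <- H * (W^T V) / (W^T W H), element by element.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Update W: W <- W * (V H^T) / (W H H^T), element by element.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Invented data: two groups of sites with different dominant functions.
V = [[5, 4, 0, 0],
     [4, 5, 1, 0],
     [0, 1, 5, 4],
     [0, 0, 4, 5]]

W0, H0 = nmf(V, rank=2, iterations=0)   # random starting point
W, H = nmf(V, rank=2)                   # after 200 updates
print(round(frobenius_error(V, W0, H0), 3), "->", round(frobenius_error(V, W, H), 3))
```

Because every update multiplies by a non-negative ratio, W and H stay non-negative, which is what lets each component be read as an additive "part" of the site profiles.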
96. Phylosift - Mining the Global Metagenome
Aaron Darling
Guillaume Jospin
Holly Bik
Jonathan Eisen
Erick Matsen (FHCRC)
Todd Treangen (BNBI, NBACC)
Mark Brown
Tiffanie Nelson
Students and other staff:
- Eric Lowe, John Zhang, David Coil
Open source community:
- BLAST, LAST, HMMER, Infernal, pplacer, Krona, metAMOS, Bioperl, Bio::Phylo, JSON, etc. etc.
PhyloSift is open source software:
- http://phylosift.wordpress.org
- http://github.com/gjospin/phylosift
Supported by DHS Grant
97. Phylosift/ pplacer Workflow
[Workflow diagram; boxes and annotations as extracted]
• Input Sequences: each input sequence scanned against both workflows (rRNA workflow and protein workflow; parallel option)
• LAST: fast candidate search (search input against references); two branches are labeled 600 bp
• hmmalign / Infernal: multiple alignment (profile HMMs used to align candidates to reference alignment)
• pplacer: phylogenetic placement
• Taxonomic Summaries: Krona plots, number of reads placed for each marker gene
• Sample Analysis & Comparison: Edge PCA, tree visualization, Bayes factor tests
Aaron Darling, Guillaume Jospin, Holly Bik, Erick Matsen, Eric Lowe, and others
100. Output 2: Phylogenetic Tree of Reads
Placement tree from 2 week old infant gut data
101. Edge PCA vs. UNIFRAC PCA
QIIME and Edge PCA on 110 fecal metagenomes from Yatsunenko et al 2012 Nature.
Sequenced with 454, to about 150Mbp/metagenome
Edge PCA: Matsen and Evans 2013
Darling et al Submitted.
105. Better Reference Tree
Lang JM, Darling AE, Eisen JA (2013) Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees and Supermatrices. PLoS ONE 8(4): e62510. doi:10.1371/journal.pone.0062510
106. Acknowledgements
• GEBA:
  • $$: DOE-JGI, DSMZ
  • Eddy Rubin, Phil Hugenholtz, Hans-Peter Klenk, Nikos Kyrpides, Tanya Woyke, Dongying Wu, Aaron Darling, Jenna Lang
• GEBA Cyanobacteria
  • $$: DOE-JGI
  • Cheryl Kerfeld, Dongying Wu, Patrick Shih
• Haloarchaea
  • $$$ NSF
  • Marc Facciotti, Aaron Darling, Erin Lynch
• Phylosift
  • $$$ DHS
  • Aaron Darling, Erick Matsen, Holly Bik, Guillaume Jospin
• iSEEM:
  • $$: GBMF
  • Katie Pollard, Jessica Green, Martin Wu, Steven Kembel, Tom Sharpton, Morgan Langille, Guillaume Jospin, Dongying Wu
• aTOL
  • $$: NSF
  • Naomi Ward, Jonathan Badger, Frank Robb, Martin Wu, Dongying Wu
• Others (not mentioned in detail)
  • $$: NSF, NIH, DOE, GBMF, DARPA, Sloan
  • Frank Robb, Craig Venter, Doug Rusch, Shibu Yooseph, Nancy Moran, Colleen Cavanaugh, Josh Weitz
• EisenLab: Srijak Bhatnagar, Russell Neches, Lizzy Wilbanks, Holly Bik