Inferring microbial ecosystem function from community structure

Jeff S. Bowman and Hugh W. Ducklow
Lamont-Doherty Earth Observatory at Columbia University
bowmanjs@ldeo.columbia.edu | www.polarmicrobes.org
Introduction and Motivation
Marine microbes play a central role in the sustainability of the global ocean by mediating the flow of carbon and nutrients through the marine system. Ecologists
commonly study the structure and composition of marine microbial communities by analyzing the 16S rRNA gene. Although these data are well suited to evaluating
differences between communities, and to correlating community structure with other environmental parameters (e.g. chlorophyll concentration, temperature,
salinity), they are less well suited to describing the ecosystem functions (i.e. metabolic functions) of these communities. Although metagenomics and other
techniques can bridge the gap between microbial community structure and ecosystem function, these techniques are costly, data intensive, and low throughput.
Our goal was to develop a high-throughput method for inferring community metabolism from community taxonomy. By evaluating metabolic structure in
place of community structure we capture key inter-sample relationships and their impact on microbial ecosystem function. Our method produces pathway
genome databases (PGDBs) that describe the metabolic pathways likely to be present in the sample. These PGDBs are amenable to flux-based metabolic modeling.
Future work will focus on predicting the flow of elements and energy through these pathways, providing a way to model the impact of changing
community structure on biogeochemical cycles.
Here we apply our method to a seasonally variable, depth-stratified microbial community from the West Antarctic Peninsula (WAP), a region undergoing unprecedented
environmental change.
[Fig. 1 flowchart. Boxes: 16S sequence library (the bigger the better!); Obtain all completed genomes; Build 16S rRNA reference tree; Find consensus genome for each tree node; Place reads on reference tree; Extract pathways for each placement; Generate confidence score for sample; Predict metabolic pathways; Calculate confidence for each node; Evaluate genomic plasticity for terminal nodes; Evaluate relative core genome size.]
Fig. 1. Methods. Our metabolic inference pipeline, PAPRICA [1], uses a phylogenetic placement
program (pplacer) [2] to place query reads on a reference tree of 16S rRNA genes from all completed
genomes. We determine a consensus genome for each point of placement on the tree, and determine
the metabolic pathways represented in these genomes. Separately we determine a confidence
score for each point of placement on the reference tree from a novel indicator of genomic stability.
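
The pathway-extraction step described above can be illustrated with a minimal sketch. It assumes each point of placement (edge) already has a set of pathways predicted for its consensus genome, and that a sample's pathway abundance is the number of reads placed on edges carrying that pathway. The names `EDGE_PATHWAYS` and `tally_pathways`, and all values, are hypothetical and are not taken from PAPRICA itself.

```python
from collections import Counter

# Hypothetical toy data: pathways predicted for the consensus genome at each
# reference-tree edge. Names and values are illustrative only.
EDGE_PATHWAYS = {
    "edge_1": {"glycogen degradation I", "urea degradation I"},
    "edge_2": {"urea degradation I"},
    "edge_3": {"formate oxidation to CO2", "urea degradation I"},
}


def tally_pathways(placements, edge_pathways):
    """Sum pathway abundance over placed reads.

    placements maps an edge name to the number of reads placed on it.
    Edges with no pathway annotation contribute nothing.
    """
    totals = Counter()
    for edge, n_reads in placements.items():
        for pathway in edge_pathways.get(edge, ()):
            totals[pathway] += n_reads
    return totals


# Assumed read counts per edge for one sample.
sample_placements = {"edge_1": 10, "edge_2": 5, "edge_3": 1}
pathway_abundance = tally_pathways(sample_placements, EDGE_PATHWAYS)
```

The resulting pathway-by-sample counts are the kind of table that clustering by metabolic structure (Fig. 4) would operate on.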
[Fig. 2 diagram. Labels: Terminal Node, Terminal Node, Internal Node, Core genome, Accessory Genome.]
$$c = \left( \frac{S_{core}}{S_{clade}} \right) (1 - \phi)$$
Fig. 2. Confidence score. Placements can be made to terminal and internal nodes. To determine the confidence
(c) of a metabolic inference for a given placement we consider the core genome size (S_core), the mean genome size
of the clade (S_clade), and the mean index of plasticity for the clade (φ; Fig. 3).
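
A small worked sketch of how the three quantities in Fig. 2 might combine. It assumes the score takes the form c = (S_core / S_clade)(1 − φ), which is what the surviving fragments of the figure's equation suggest; the function name, argument names, and example values are hypothetical, and genome sizes only need to share a unit.

```python
def confidence(core_size, clade_genome_sizes, clade_plasticities):
    """Confidence for a placement, assuming c = (S_core / S_clade) * (1 - phi).

    core_size          -- size of the clade's core genome (S_core)
    clade_genome_sizes -- genome sizes of the clade members; their mean is S_clade
    clade_plasticities -- plasticity index of each member; their mean is phi
    """
    mean_clade_size = sum(clade_genome_sizes) / len(clade_genome_sizes)
    mean_phi = sum(clade_plasticities) / len(clade_plasticities)
    return (core_size / mean_clade_size) * (1.0 - mean_phi)


# Illustrative values only: a clade of two genomes sharing a 1500-gene core.
c_example = confidence(1500, [2000, 3000], [0.1, 0.3])
```

Under this form the score falls as the core genome shrinks relative to the clade's mean genome size, and as the clade's genomes become more plastic.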
Fig. 3. Genomic plasticity of genomes in our database. A major impediment to
accurate metabolic inference is the genetic diversity that can exist within even a
narrow taxonomic clade. We developed a confidence metric for our inferred metabolisms
that is based on the degree of genomic plasticity inherent to each
genome. The X-axis gives the position of each genome on our reference tree; the Y-axis
gives the degree of plasticity. Unusually plastic genomes are indicated by Roman
numerals: I) Nanoarchaeum equitans, II) the Mycobacteria, III) a butyrate-producing
bacterium within the Clostridium, IV) Candidatus Hodgkinia cicadicola, V) the Mycoplasma,
VI) Sulcia muelleri, VII) Portiera aleyrodidarum, VIII) Buchnera aphidicola, IX) the
Oxalobacteraceae.
[Fig. 3 plot. X-axis: Terminal node (0 to 2500). Y-axis: Relative plasticity (0.0 to 1.0). Outliers labeled I to IX.]
Fig. 4. Sample locations within the Palmer LTER off the WAP (left) and inter-sample similarity (right). The location of Palmer Station
is given by the star. Summer surface and deep samples along with winter surface samples were analyzed [3]. A) Hierarchical clustering
of samples by metabolic structure. B) Hierarchical clustering of samples by taxonomic structure. Note duplicate samples in both A
and B. C) Distances between samples are in good agreement between the two methods (R² = 0.70). D) Distances are also correlated,
albeit less well (R² = 0.40), for the alternative metabolic inference approach PICRUSt [4].
[Fig. 4 map. Four sampling sites (NW, NE, SW, SE) off the WAP.]
[Fig. 4 panels. A: Clustering by pathway abundance, this method. B: Clustering by edge abundance, this method. Dendrogram Y-axis: Height. Legend: Surface, Deep, Winter surface. C: This method, R² = 0.70; X-axis: Distance by pathway abundance; Y-axis: Distance by edge abundance. D: PICRUSt, R² = 0.40; X-axis: Distance by pathway abundance; Y-axis: Distance by OTU abundance.]
[Fig. 4 dendrogram leaves: summer_sw_deep_b.1, summer_sw_deep_b.2, summer_nw_deep_b.1, summer_nw_deep_b.2, summer_se_deep_b.1, summer_se_deep_b.2, winter_ne_shallow_b.1, summer_ne_deep_b.1, summer_ne_deep_b.2, summer_ne_shallow_b.1, summer_se_shallow_b.1, summer_sw_shallow_b.1, summer_nw_shallow_b.1.]
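
The comparison in Fig. 4C can be sketched as follows: compute pairwise distances between samples twice, once from pathway abundance and once from edge abundance, then ask how well the two sets of distances agree. The poster does not state the distance metric, so Bray-Curtis dissimilarity is an assumption here, as are the toy abundances and all function names.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


def pairwise_distances(samples):
    """Distances for every sample pair, in a fixed (sorted-name) order."""
    names = sorted(samples)
    return [bray_curtis(samples[a], samples[b]) for a, b in combinations(names, 2)]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical abundances for three samples (not data from the poster).
pathway_abundance = {"s1": [10, 5, 1], "s2": [9, 6, 1], "s3": [2, 3, 12]}
edge_abundance = {"s1": [20, 1], "s2": [18, 3], "s3": [2, 19]}

d_pathway = pairwise_distances(pathway_abundance)
d_edge = pairwise_distances(edge_abundance)
agreement = r_squared(d_pathway, d_edge)
```

Because both distance lists are built over the same sorted sample pairs, each element of `d_pathway` lines up with the matching element of `d_edge`, which is what makes the correlation meaningful.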
Key Points
• Microbial communities can be described
by their metabolic structure.
• Metabolic structure provides information
on potential microbial ecosystem functions.
• Representing a microbial community by
metabolic structure may provide a way to
model the flow of elements and energy
through the community.
1. Bowman, JS, HW Ducklow. 2015. Microbial communities can be described by metabolic structure: A general framework and application to a seasonally variable, depth-stratified microbial community from the coastal West Antarctic Peninsula. PLoS ONE, 10(8): e0135868.
2. Matsen, F, R Kodner, E Armbrust. 2010. pplacer: Linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics, 11:538.
3. Luria, C, H Ducklow, L Amaral-Zettler. 2014. Marine bacterial, archaeal and eukaryotic diversity and community structure on the continental shelf of the western Antarctic Peninsula. Aquatic Microbial Ecology, 73(2): 107-121.
4. Langille, MGI, et al. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnology, 31(9): 814-821.
[Fig. 5 plot. Y-axis: metabolic pathways, listed below in plotted order. X-axis: anomaly (-0.6 to 0.6), labeled "Enriched in surface | Enriched in deep and winter". Color scale: p-value, from 0.05 to 4.57 × 10^-5. Pathway groups marked: Key intracellular metabolism, Anaerobic metabolism, Nitrogen degradation, Carbon degradation, C1 metabolism, Autotrophy, Mercury degradation.]
pyruvate fermentation to lactate
phosphonoacetate degradation
adenosine nucleotides degradation III
creatinine degradation II
D-galacturonate degradation I
triacylglycerol degradation
allantoin degradation to ureidoglycolate I (urea producing)
nitrate reduction I (denitrification)
oxalate degradation II
sucrose degradation IV (sucrose phosphorylase)
galactose degradation I (Leloir pathway)
threonine degradation I
S-methyl-5-thio-alpha-D-ribose 1-phosphate degradation
nitrate reduction IV (dissimilatory)
taurine degradation IV
cholesterol degradation to androstenedione II (cholesterol dehydrogenase)
sitosterol degradation to androstenedione
reactive oxygen species degradation (mammalian)
alkylnitronates degradation
reductive monocarboxylic acid cycle
trehalose degradation VI (periplasmic)
arginine degradation III (arginine decarboxylase/agmatinase pathway)
propionyl CoA degradation
phenylmercury acetate degradation
thymine degradation
glutamate degradation I
uracil degradation I (reductive)
ethanol degradation IV
threonine degradation III (to methylglyoxal)
formaldehyde oxidation II (glutathione-dependent)
ethanol degradation II
valine degradation II
S-methyl-5'-thioadenosine degradation II
guanosine nucleotides degradation III
formate oxidation to CO2
pyrimidine deoxyribonucleosides degradation
2'-deoxy-alpha-D-ribose 1-phosphate degradation
methylglyoxal degradation II
glutamate degradation X
glucose and glucose-1-phosphate degradation
glycogen degradation I
urate biosynthesis/inosine 5'-phosphate degradation
pseudouridine degradation
phenylacetate degradation I (aerobic)
D-mannose degradation
urea degradation I
methionine degradation I (to homocysteine)
aspartate degradation I
citrulline degradation
glutamine degradation I
Columbia / Kiel University Sustainable Oceans Symposium
Fig. 5. What metabolic pathways are differentially
present between summer surface samples and
winter and deep samples? Having determined that
the relationship between samples can be accurately
represented by metabolic structure, we can begin to
ask ecologically relevant questions. A question frequently
posed to community structure data is how
metabolisms are partitioned between niches. In the
figure at left, color gives the p-value for a Mann-Whitney
test between sample groups (summer surface vs.
summer deep and winter surface). The X-axis gives
the anomaly, calculated as the difference in sample
group means divided by the sum of the sample group
means.
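
The anomaly defined in the Fig. 5 caption, together with the Mann-Whitney U statistic that underlies the test, can be sketched for one pathway as below. The sign convention (positive meaning enriched in the deep and winter group) is an assumption read from the axis label, and the function names and abundances are hypothetical. Converting U to a p-value is left to a statistics library such as SciPy's `scipy.stats.mannwhitneyu`.

```python
def anomaly(surface, deep_and_winter):
    """Difference in group means divided by the sum of group means.

    Assumed sign convention: positive when the pathway is more abundant
    in the deep and winter group, negative when more abundant at the surface.
    """
    mean_surface = sum(surface) / len(surface)
    mean_other = sum(deep_and_winter) / len(deep_and_winter)
    return (mean_other - mean_surface) / (mean_other + mean_surface)


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic (the smaller of U_a and U_b), ties count 0.5."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical abundances of a single pathway in each sample group.
surface_abundance = [1, 2, 3]
deep_winter_abundance = [7, 8, 9, 10]

pathway_anomaly = anomaly(surface_abundance, deep_winter_abundance)
pathway_u = mann_whitney_u(surface_abundance, deep_winter_abundance)
```

Dividing by the sum of the means bounds the anomaly between -1 and 1 for non-negative abundances, so pathways with very different overall abundance can share one axis.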
