SlideShare a Scribd company logo
1 of 32
Areejit Samal
LPTMS/CNRS, Univ Paris-Sud 11, Orsay, France
and
Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
Environmental versatility promotes modularity in
large-scale metabolic networks
International Conference on Mathematical and Theoretical Biology, Jan 23-27, 2012, Pune, India
Jan 24, 2012
In collaboration
with:
Andreas Wagner
University of Zurich
Olivier C. Martin
Université Paris-Sud
Modular Organization of Biological Systems
Spatial modularity:
Organisms are partitioned into organs
and tissues where cells of similar types
are in close proximity.
Topological modularity:
Modules represent subsets of nodes in a
network which are strongly connected to
each other but weakly connected to the
remaining network.
If nodes within a module tend to be involved in the same biological function, then both
spatial and topological modularity point to a general architectural principle:
The parts of an organism that perform specific tasks or functions are grouped
into modules that can function semi-autonomously.
For a review see: LH Hartwell et al, Nature (1999)
Motivation
 We ask if modularity in large-scale metabolic networks can arise as a by-
product of phenotypic constraints associated with the need to sustain life in
one or more chemical environments.
 An organism’s metabolism is highly versatile if it can sustain life in many
different chemical environments.
 Using recently developed techniques to randomly sample large numbers of
viable metabolic networks from a vast space of metabolic networks, we use
flux balance analysis to study insilico metabolic networks that differ in their
versatility.
 We then ask whether versatility affects the modularity of metabolic networks.
Outline
 Modeling Framework
 MCMC sampling of random viable genotypes
 Environmental versatility index and measures of modularity
 Results
 Conclusions and Outlook
Framework: Global Reaction Set
+
Biomass composition
The E. coli biomass
composition of iJR904
was used to determine
viability of genotypes.
Nutrient metabolites
The set of external
metabolites in iJR904
were allowed for
uptake/excretion in the
global reaction set.
KEGG Database + E. coli iJR904
5870 reactions
We can implement Flux
Balance Analysis (FBA)
within this database.
Genome-scale metabolic model reconstruction pipeline
Genome sequence with annotation
Biochemical Knowledge
Physiology
Genome-scale reconstructed metabolic network
AbbreviationOfficialName Equation (note [c] and [e] at the beginning refer to the compartment the reaction takes place in, cytosolic and extracellular respectively)Subsystem ProteinClassDescription
ALATA_L L-alanine transaminase [c]akg + ala-L <==> glu-L + pyr Alanine and aspartate metabolismEC-2.6.1.2
ALAR alanine racemase [c]ala-L <==> ala-D Alanine and aspartate metabolismEC-5.1.1.1
ASNN L-asparaginase [c]asn-L + h2o --> asp-L + nh4 Alanine and aspartate metabolismEC-3.5.1.1
ASNS2 asparagine synthetase [c]asp-L + atp + nh4 --> amp + asn-L + h + ppi Alanine and aspartate metabolismEC-6.3.1.1
ASNS1 asparagine synthase (glutamine-hydrolysing) [c]asp-L + atp + gln-L + h2o --> amp + asn-L + glu-L + h + ppiAlanine and aspartate metabolismEC-6.3.5.4
ASPT L-aspartase [c]asp-L --> fum + nh4 Alanine and aspartate metabolismEC-4.3.1.1
ASPTA aspartate transaminase [c]akg + asp-L <==> glu-L + oaa Alanine and aspartate metabolismEC-2.6.1.1
VPAMT Valine-pyruvate aminotransferase [c]3mob + ala-L --> pyr + val-L Alanine and aspartate metabolismEC-2.6.1.66
DAAD D-Amino acid dehydrogenase [c]ala-D + fad + h2o --> fadh2 + nh4 + pyr Alanine and aspartate metabolismEC-1.4.99.1
ALARi alanine racemase (irreversible) [c]ala-L --> ala-D Alanine and aspartate metabolismEC-5.1.1.1
FFSD beta-fructofuranosidase [c]h2o + suc6p --> fru + g6p Alternate Carbon Metabolism EC-3.2.1.26
A5PISO arabinose-5-phosphate isomerase [c]ru5p-D <==> ara5p Alternate Carbon Metabolism EC-5.3.1.13
MME methylmalonyl-CoA epimerase [c]mmcoa-R <==> mmcoa-S Alternate Carbon Metabolism EC-5.1.99.1
MICITD 2-methylisocitrate dehydratase [c]2mcacn + h2o --> micit Alternate Carbon Metabolism EC-4.2.1.99
ALCD19 alcohol dehydrogenase (glycerol) [c]glyald + h + nadh <==> glyc + nad Alternate Carbon Metabolism EC-1.1.1.1
LCADi lactaldehyde dehydrogenase [c]h2o + lald-L + nad --> (2) h + lac-L + nadh Alternate Carbon Metabolism EC-1.2.1.22
TGBPA Tagatose-bisphosphate aldolase [c]tagdp-D <==> dhap + g3p Alternate Carbon Metabolism EC-4.1.2.40
LCAD lactaldehyde dehydrogenase [c]h2o + lald-L + nad <==> (2) h + lac-L + nadh Alternate Carbon Metabolism EC-1.2.1.22
ALDD2x aldehyde dehydrogenase (acetaldehyde, NAD) [c]acald + h2o + nad --> ac + (2) h + nadh Alternate Carbon Metabolism EC-1.2.1.3
ARAI L-arabinose isomerase [c]arab-L <==> rbl-L Alternate Carbon Metabolism EC-5.3.1.4
RBK_L1 L-ribulokinase (L-ribulose) [c]atp + rbl-L --> adp + h + ru5p-L Alternate Carbon Metabolism EC-2.7.1.16
RBP4E L-ribulose-phosphate 4-epimerase [c]ru5p-L <==> xu5p-D Alternate Carbon Metabolism EC-5.1.3.4
ACACCT acetyl-CoA:acetoacetyl-CoA transferase [c]acac + accoa --> aacoa + ac Alternate Carbon Metabolism
BUTCT Acetyl-CoA:butyrate-CoA transferase [c]accoa + but --> ac + btcoa Alternate Carbon Metabolism EC-2.8.3.8
AB6PGH Arbutin 6-phosphate glucohydrolase [c]arbt6p + h2o --> g6p + hqn Alternate Carbon Metabolism EC-3.2.1.86
PMANM phosphomannomutase [c]man1p <==> man6p Alternate Carbon Metabolism EC-5.4.2.8
PPM2 phosphopentomutase 2 (deoxyribose) [c]2dr1p <==> 2dr5p Alternate Carbon Metabolism EC-5.4.2.7
List of metabolic reactions that can occur in an organism along with
gene-protein -reaction (GPR association)
Genome-scale reconstructed metabolic model
ACALDt acetaldehyde reversible transport acald[e] <==> acald[c] Transport, Extracellular
GUAt Guanine transport gua[e] <==> gua[c] Transport, Extracellular
HYXNt Hypoxanthine transport hxan[e] <==> hxan[c] Transport, Extracellular
XANt xanthine reversible transport xan[e] <==> xan[c] Transport, Extracellular
NACUP Nicotinic acid uptake nac[e] --> nac[c] Transport, Extracellular
ASNabc L-asparagine transport via ABC system asn-L[e] + atp[c] + h2o[c] --> adp[c] + asn-L[c] + h[c] + pi[c]Transport, Extracellular
ASNt2r L-asparagine reversible transport via proton symportasn-L[e] + h[e] <==> asn-L[c] + h[c] Transport, Extracellular
DAPabc M-diaminopimelic acid ABC transport 26dap-M[e] + atp[c] + h2o[c] --> 26dap-M[c] + adp[c] + h[c] + pi[c]Transport, Extracellular
CYSabc L-cysteine transport via ABC system atp[c] + cys-L[e] + h2o[c] --> adp[c] + cys-L[c] + h[c] + pi[c]Transport, Extracellular
ACt2r acetate reversible transport via proton symport ac[e] + h[e] <==> ac[c] + h[c] Transport, Extracellular
ETOHt2r ethanol reversible transport via proton symport etoh[e] + h[e] <==> etoh[c] + h[c] Transport, Extracellular
PYRt2r pyruvate reversible transport via proton symport h[e] + pyr[e] <==> h[c] + pyr[c] Transport, Extracellular
O2t o2 transport (diffusion) o2[e] <==> o2[c] Transport, Extracellular
CO2t CO2 transporter via diffusion co2[e] <==> co2[c] Transport, Extracellular
H2Ot H2O transport via diffusion h2o[e] <==> h2o[c] Transport, Extracellular
DHAt Dihydroxyacetone transport via facilitated diffusiondha[e] <==> dha[c] Transport, Extracellular
NH3t ammonia reversible transport nh4[e] <==> nh4[c] Transport, Extracellular
ARBt2r L-arabinose transport via proton symport arab-L[e] + h[e] <==> arab-L[c] + h[c] Transport, Extracellular
Exchange and transport reactions defining possible nutrients
Cellular Objective: Biomass Reaction
(0.05) 5mthf + (5.0E-5) accoa + (0.488) ala-L + (0.0010) amp + (0.281) arg-L + (0.229) asn-L + (0.229) asp-L + (45.7318) atp + (1.29E-4) clpn_EC + (6.0E-6) coa + (0.126) ctp + (0.087) cys-L + (0.0247)
datp + (0.0254) dctp + (0.0254) dgtp + (0.0247) dttp + (1.0E-5) fad + (0.25) gln-L + (0.25) glu-L + (0.582) gly + (0.154) glycogen + (0.203) gtp + (45.5608) h2o + (0.09) his-L + (0.276) ile-L + (0.428) leu-L +
(0.0084) lps_EC + (0.326) lys-L + (0.146) met-L + (0.00215) nad + (5.0E-5) nadh + (1.3E-4) nadp + (4.0E-4) nadph + (0.001935) pe_EC + (0.0276) peptido_EC + (4.64E-4) pg_EC + (0.176) phe-L + (0.21)
pro-L + (5.2E-5) ps_EC + (0.035) ptrc + (0.205) ser-L + (0.0070) spmd + (3.0E-6) succoa + (0.241) thr-L + (0.054) trp-L + (0.131) tyr-L + (0.0030) udpg + (0.136) utp + (0.402) val-L --> (45.5608) adp +
(45.56035) h + (45.5628) pi + (0.7302) ppi
Flux Balance Analysis (FBA) can be used to analyze the capabilities of these models.
Framework: Metabolic Genotype
Global Reaction Set List of catalyzed reactions
R1: 3pg + nad → 3php + h + nadh
R2: 6pgc + nadp → co2 + nadph + ru5p-D
R3: 6pgc → 2ddg6p + h2o
.
.
.
RN: ---------------------------------
1
0
1
.
.
.
.
1
1
0
.
.
.
.
1
1
1
.
.
.
.
G1 G2 G3
Genotypes G1 and G3 differ by a single reaction swap. This reaction swap can be decomposed into a
single reaction addition, which produces from G1 a new genotype G2, followed by a single reaction
deletion, which produces G3 from G2.
A metabolic genotype G is any subset of reactions taken from the global reaction
set of N reactions. Such a genotype G can be represented as a binary string of
length N with each reaction i either present (1) or absent (0).
Framework: Genotype-to-Phenotype Map
0
1
1
.
.
.
.
0
0
1
.
.
.
.
GA
GB
x
Deleted
reaction
Flux
Balance
Analysis
(FBA)
Viable
genotype
Unviable
genotype
Bit string and equivalent network
representation of genotypes
Genotypes Ability to synthesize Viability
biomass components
Growth
No Growth
For FBA, see Price et al, Nature Rev Micro (2004)
Definition: Genotype network
We denote by Ω(n) the vast space of metabolic genotypes with a given number n of
reactions. Even for modest n, Ω(n) is a huge space containing N!/[n! (N - n)!]
genotypes.
We refer to the set or space of viable genotypes within Ω(n) as V(n).
1
0
1
.
.
.
.
1
0
0
.
.
.
.
1
1
1
.
.
.
.
1
1
0
.
.
.
.
…………………………………………………………..…
Genotypes with n reactions:
Bit strings with exactly n entries 1.
Space of possible genotypes Ω(n)
Viable genotypes
V(n)
Unviable genotypes
The set of viable genotypes with fixed number of reactions n forms a genotype network or
neutral network
Viable genotypes occupy tiny fraction of the
genotype space
Number of reactions in genotype
2000 2200 2400 2600 2800
Fractionofgenotypesthatareviable
100
10-5
10-10
10-15
10-20
Glucose
Acetate
Succinate
Analytic prefactor
We tried to estimate the fraction of viable space by generating random genotypes with a
given number of reactions n and determining their phenotype.
Constraint of having super-
essential reactions
The convex curve indicates that the effect on
viability of successive reaction removals becomes
more and more severe.
The numerical estimation of the fraction
becomes extremely difficult for genotypes
with n<2100 reactions while typical
microbial metabolic networks contain 400-
1200 reactions.
1
0
1
.
.
.
.
1
0
0
.
.
.
.
1
1
1
.
.
.
.
1
1
0
.
.
.
.
……………………
Generate a sample of random bit strings with
exactly n entries 1.
Count the number of viable bit strings in the
sample.
Markov Chain Monte Carlo (MCMC) sampling of viable
genotypes
• MCMC allows one to uniformly
sample the tiny fraction of viable
space by generating a random walk
among viable genotypes, going from
one viable to another by performing a
reaction swap.
• If a swap leads to a non-viable
genotype, the walk stays at the
previous genotype for that step and
then the process is repeated.
G1 G3
G4
Viable genotype Unviable genotype
Random walkReaction swap
Genotype networks in metabolic reaction spaces
Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner*
BMC Systems Biology 2010, 4:30
MCMC sampling of viable genotypes
1
0
1
0
1
.
.
1
1
0
0
1
.
.
1
0
1
1
0
.
.
1
1
1
0
0
.
.
0
0
1
0
1
.
.
E. coli
Accept
Reject
Step 1
Swap Swap
Reject
106 Swaps
Step 2
103 Swaps
store store
Erase
memory of
initial
network
Saving
Frequency
Accept/Reject Criterion of reaction swap:
New network is able to grow on the specified environment
Global Reaction Set
R1: 3pg + nad → 3php + h + nadh
R2: 6pgc + nadp → co2 + nadph + ru5p-D
R3: 6pgc → 2ddg6p + h2o
.
.
.
RN: ---------------------------------
The advantage of using reaction swaps in our approach is that it leaves the number of reactions constant
over time.
Genotype networks in metabolic reaction spaces
Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner*
BMC Systems Biology 2010, 4:30
Environmental Versatility Index Venv
 MCMC sampling algorithm can be used to sample
the space of genotypes having a given phenotype.
 The desired phenotype is viability on a given set of
minimal environments.
 If the set of minimal environments, consists of Venv
environments, we designate the genotype’s
Environmental Versatility Index to Venv.
 Venv =1 refers to genotypes viable in one specific
environment; Venv =2 refers to genotypes viable in
two given environments; and so on.
 We have considered 89 minimal environments
whose sole carbon sources are their only
distinguishing feature.
Versatility
Venv
Number of
Sampled
genotypes
1 1000
2 1000
5 1000
10 1000
. .
. .
89 1000
Fully Coupled Set (FCS)
 A pair of reactions v1 and v2 are said to be fully coupled to each other if a nonzero flux
for v1 implies a proportionate (nonzero) flux for v2 in any steady state and vice versa.
 A fully coupled set (FCS) in a metabolic network is a maximal set of reactions that are
mutually fully coupled to each other. Such sets have been referred to as
reaction/enzyme subsets or correlated reaction sets in the literature.
 FCSs define functional modules in metabolic network which have been shown to be
biochemically sensible.
 Determining all FCSs in a large metabolic network can be done efficiently using linear
optimization.
Properties of Fully Coupled Sets (FCSs)
 In steady state, each FCS can be replaced by a single effective reaction. FCSs
can be used to coarse-grain metabolic networks.
 It has been shown that if one reaction in a FCS is essential, the complete FCS is
essential (Samal et al., BMC Bioinformatics (2006)).
 Since, FCSs guarantee flux proportionality among its constituent reactions, FCSs
are more relevant for unveiling functional coupling than graph-based measures
(Samal et al., BMC Bioinformatics (2006), Notebaart et al., PLoS Comp Bio
(2008)).
 FCS can be branched and contains cycles.
Example of a FCS in the E. coli metabolic
network that is branched and contains a
cycle: Green (red) edges represent
irreversible (reversible) reactions. The
reactions in this FCS are involved in cell
envelope biosynthesis.
Functional measures of modularity in metabolic networks
We defined two indices of metabolic
network modularity based on Fully
Coupled Sets (FCSs) of reactions in a
metabolic network.
 M : the number of reactions in a
metabolic network that belong to
different FCSs (modules).
 s : the number of FCSs (modules) in
a metabolic network.
Versatility
Venv
Number of
Sampled
genotypes
<M> <s>
1 1000 . .
2 1000
5 1000
10 1000
. .
. .
89 1000 . .
Modularity index M increases with versatility
Venv
1 5 10 20 30 40 50 70 90
<M>
240
250
260
270
280
290
300
310
320
Average over different nested sets
(a)
The data shown here are
based on MCMC sampled
genotypes with number
n=831 reactions (same as
in E. coli metabolic model.
The Environmental Versatility Index Venv denotes the number of distinct environments in
which a genotype is viable.
The modularity index M for a genotype gives the number of reactions contained in the FCSs
(modules) of that genotype.
Modularity index s with environmental versatility
Venv
12 5 10 20 30 40 50 70 89
<s>
70
75
80
85
90
95
100
Average over different nested sets
(b)
The data shown here are
based on MCMC sampled
genotypes with number
n=831 reactions (same as
in E. coli metabolic model.
The Environmental Versatility Index Venv denotes the number of distinct environments in
which a genotype is viable.
The modularity index s for a genotype gives the number of FCSs (modules) of that
genotype.
Environmental versatility enhances M and s regardless of
the value of number n of reactions in a genotype
Venv
1 5 10 20 30 40
<M>
190
200
210
220
230
240
250
260
270
Average over different orders
Venv
1 5 10 20 30 40
<M>
220
230
240
250
260
270
280
Average over different orders
(a)
(b)
Venv
1 5 10 20 30 40
<s>
45
50
55
60
65
70
Average over different orders
Venv
1 5 10 20 30 40
<s>
60
62
64
66
68
70
72
74
76
78
Average over different orders
(a)
(b)
Modularity index M Modularity index s
n = 500
n = 700
Higher values of M and s are by-products of increasing versatility
We next wished to check if the additional
reactions accounting for the increase in
modularity would be closely linked to the
additional nutrients that metabolic networks
must utilize as their versatility increases.
These additional reactions and the modules
they reside could occur just downstream of
the nutrients.
The notion of downstream can be made
quantitative through the Scope or Network
expansion algorithm.
Network Expansion or Scope algorithm
Reference: Handorf, Ebenhöh and Heinrich J Mol Evol (2005)
This algorithm is based on principle of forward evolution
The scope distance of a reaction from the nutrient metabolites in the seed set is
defined as the iteration number of the step when that particular reaction is first
selected by the algorithm.
Distance 1 Distance 2 Distance 3Distance 0
The increase in modularity with environmental versatility can be
attributed to reactions that are close to nutrients
Global reaction set
Scope distance from nutrient metabolites
0 5 10 15 20 25
Frequency
0
100
200
300
400
500
(a) Reactions that account for increase in M
Scope distance from nutrient metabolites
0 5 10 15 20 25
Frequency
0
5
10
15
20
25
30
(b)
Distribution of scope distances from
the nutrients for all possible
reactions.
Distribution of scope distances from
nutrients for those reactions in additional
FCSs that differentiate the sampled
genotypes at Venv=89 from those at Venv =1.
Using Kolmogorov–Smirnov test (K–S test) we could reject the hypothesis that
the two distributions are same at a p-value < 0.00001.
Example of FCSs that account for the increase in
modularity with versatility
These FCSs metabolize the nutrients:
fucose, rhamnose and 3-hydroxycinnamic acid.
Distribution of M for genotypes in our most constrained
ensemble and comparison with E. coli
M
260 280 300 320 340 360 380
Frequency
0
50
100
150
200
250
300
E. coli
(a)
The histogram of modularity M for a sample of 1000 genotypes with Venv =89
different minimal environments. The estimate of M for E. coli is also displayed.
We can reject the hypothesis that the modularity index M of E. coli is drawn from
this same distribution.
Distribution of s for genotypes in our most constrained
ensemble and comparison with E. coli
The histogram of s for a sample of 1000 genotypes with Venv =89 different minimal
environments. The estimate of s for E. coli is also displayed.
We see that the number of FCSs in E. coli is just slightly above the position of the
distribution's peak in our ensemble. Thus, the atypically high M of the E. coli network
does not stem from its having more modules than typical networks allowing growth
on 89 carbon sources.
s
70 80 90 100 110 120
Frequency
0
50
100
150
200
250
300
350
E. coli
(b)
Atypically large FCSs near the output end of the E. coli
metabolic network
Samal et al, BMC Bioinformatics (2006); Private Gallery
Pink metabolites are part of
Biomass reaction
FCSs and operons are
positively associated
in E. coli.
Conclusions
• We find that more versatile networks are more modular and also have more
modules than less versatile networks.
• The reactions that form part of newly arising modules in highly versatile
networks tend to be close to reactions that process nutrients.
• E. coli metabolic network is significantly more modular than random networks
in our most constrained ensemble.
• Since versatility corresponds to viability in increasing numbers of
environments, it can be considered as a trait associated with fitness itself.
• Our work supports a scenario for the origin of modularity in which increasing
functional constraints can make modularity emerge.
Two scenarios for the evolution of modularity
• Modularity might result from directional selection favouring change in one trait
while stabilizing selection maintains other traits unchanged.
This scenario was shown to be true in model gene regulatory networks.
• Modular fluctuations in evolutionary goals can be sufficient to produce and
maintain modularity. In this scenario, modularity will be lost once there are no
fluctuations in the evolutionary goals.
Empirical studies show that generalist prokaryotes are more
modular than specialists
Empirical observations by groups of Uri Alon
(Parter et al, BMC Evol Bio (2009)) and Eytan
Ruppin (Kreimer et al, PNAS (2008)) where
generalists prokaryotes living in many different
environments are more modular than specialists
are fully consistent with our conclusion.
Note that the size of
the networks increases
with environmental
variability in these
studies.
Our samples of random viable networks have not
been subject to unknown selection pressures
unlike metabolism of real organisms.
Functional constraints lead to emergence of modularity
• Our results support the scenario proposed can explain the
emergence of modularity in both non-fluctuating and
fluctuating environment scenario.
• The main requirement for emergence is an increase in the
number of functions that a network performs.
The intersection of genotype
spaces for two different
environments contains
modular networks.
Modularity is ultimately a
property of genotype spaces
(or the genotype-phenotype
map).
Uri Alon’s PictureOur Picture
Global structural properties of real metabolic networks:
Consequence of simple biochemical and functional constraints
Increasinglevelofconstraints
Biological
function is a
main driver of
structure in
metabolic
networks
• Power law degree distribution
• Small Average Path Length
• High Clustering coefficient
• Bow-tie architecture
Acknowledgement
CNRS & Max Planck Society Joint Program in Systems Biology (CNRS MPG
GDRE 513) for a Postdoctoral Fellowship.
Reference
Genotype networks in metabolic reaction spaces
Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner*
BMC Systems Biology 2010, 4:30
Environmental versatility promotes modularity in genome-scale metabolic networks
Areejit Samal, Andreas Wagner* and Olivier C Martin*
BMC Systems Biology 2011, 5:135
Funding
Andreas Wagner
University of Zurich
Olivier C. Martin
Université Paris-Sud
Federation of European Biochemical Societies (FEBS) for a Fellowship.

More Related Content

What's hot (6)

Diprotic Acids and Equivalence Points
Diprotic Acids and Equivalence PointsDiprotic Acids and Equivalence Points
Diprotic Acids and Equivalence Points
 
Andrey rubin target photosynthetic products and chlorophyll fluorescence
Andrey rubin target photosynthetic products and chlorophyll fluorescenceAndrey rubin target photosynthetic products and chlorophyll fluorescence
Andrey rubin target photosynthetic products and chlorophyll fluorescence
 
1 s2.0-s0022072896048048-main
1 s2.0-s0022072896048048-main1 s2.0-s0022072896048048-main
1 s2.0-s0022072896048048-main
 
Htos Presentation
Htos PresentationHtos Presentation
Htos Presentation
 
Single- and Double- Coordination Mechanism in Ethylene Tri- and Tetramerizati...
Single- and Double- Coordination Mechanism in Ethylene Tri- and Tetramerizati...Single- and Double- Coordination Mechanism in Ethylene Tri- and Tetramerizati...
Single- and Double- Coordination Mechanism in Ethylene Tri- and Tetramerizati...
 
Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...
 

Similar to Modularity in metabolic networks

Similar to Modularity in metabolic networks (20)

Cellular respiration ppt
Cellular respiration pptCellular respiration ppt
Cellular respiration ppt
 
Cac
CacCac
Cac
 
Cellular Respiration
Cellular Respiration Cellular Respiration
Cellular Respiration
 
Cellular respiration ppt wit turning pt qs
Cellular respiration ppt wit turning pt qsCellular respiration ppt wit turning pt qs
Cellular respiration ppt wit turning pt qs
 
Biol101 chp9-pp-pt1-fall10-101123170617-phpapp02
Biol101 chp9-pp-pt1-fall10-101123170617-phpapp02Biol101 chp9-pp-pt1-fall10-101123170617-phpapp02
Biol101 chp9-pp-pt1-fall10-101123170617-phpapp02
 
Chapter 15 lecture 3
Chapter 15 lecture 3Chapter 15 lecture 3
Chapter 15 lecture 3
 
Respiration Part3
Respiration Part3Respiration Part3
Respiration Part3
 
09_Respiración celular.ppt
09_Respiración celular.ppt09_Respiración celular.ppt
09_Respiración celular.ppt
 
09 cellrespiration text
09  cellrespiration text09  cellrespiration text
09 cellrespiration text
 
Chapter 9
Chapter 9Chapter 9
Chapter 9
 
Cell respiration haf 1
Cell respiration haf 1Cell respiration haf 1
Cell respiration haf 1
 
Cellular Respiration
Cellular RespirationCellular Respiration
Cellular Respiration
 
lecture_23a.ppt
lecture_23a.pptlecture_23a.ppt
lecture_23a.ppt
 
cell-respiration (1).ppt
cell-respiration (1).pptcell-respiration (1).ppt
cell-respiration (1).ppt
 
Respiratory chain
Respiratory chainRespiratory chain
Respiratory chain
 
Cellular respiration lecture
Cellular respiration lectureCellular respiration lecture
Cellular respiration lecture
 
TCA cycle citric acid cycle krebs cycle.
TCA cycle citric acid cycle krebs cycle.TCA cycle citric acid cycle krebs cycle.
TCA cycle citric acid cycle krebs cycle.
 
Cellular Respiration PowerPoint
Cellular Respiration PowerPointCellular Respiration PowerPoint
Cellular Respiration PowerPoint
 
IB Biology HL Cellular respiration
IB Biology HL Cellular respirationIB Biology HL Cellular respiration
IB Biology HL Cellular respiration
 
Cell Metabolism Part 2
Cell Metabolism Part 2Cell Metabolism Part 2
Cell Metabolism Part 2
 

More from Areejit Samal (7)

Areejit Samal Emergence Alaska 2013
Areejit Samal Emergence Alaska 2013Areejit Samal Emergence Alaska 2013
Areejit Samal Emergence Alaska 2013
 
Metabolic Network Analysis
Metabolic Network AnalysisMetabolic Network Analysis
Metabolic Network Analysis
 
Areejit Samal Preferential Attachment in Catalytic Model
Areejit Samal Preferential Attachment in Catalytic ModelAreejit Samal Preferential Attachment in Catalytic Model
Areejit Samal Preferential Attachment in Catalytic Model
 
Areejit Samal Regulation
Areejit Samal RegulationAreejit Samal Regulation
Areejit Samal Regulation
 
Areejit Samal Randomizing metabolic networks
Areejit Samal Randomizing metabolic networksAreejit Samal Randomizing metabolic networks
Areejit Samal Randomizing metabolic networks
 
Randomizing genome-scale metabolic networks
Randomizing genome-scale metabolic networksRandomizing genome-scale metabolic networks
Randomizing genome-scale metabolic networks
 
Areejit samal fba_tutorial
Areejit samal fba_tutorialAreejit samal fba_tutorial
Areejit samal fba_tutorial
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Modularity in metabolic networks

  • 1. Areejit Samal LPTMS/CNRS, Univ Paris-Sud 11, Orsay, France and Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany Environmental versatility promotes modularity in large-scale metabolic networks International Conference on Mathematical and Theoretical Biology, Jan 23-27, 2012, Pune, India Jan 24, 2012 In collaboration with: Andreas Wagner University of Zurich Olivier C. Martin Université Paris-Sud
  • 2. Modular Organization of Biological Systems Spatial modularity: Organisms are partitioned into organs and tissues where cells of similar types are in close proximity. Topological modularity: Modules represent subsets of nodes in a network which are strongly connected to each other but weakly connected to the remaining network. If nodes within a module tend to be involved in the same biological function, then both spatial and topological modularity point to a general architectural principle: The parts of an organism that perform specific tasks or functions are grouped into modules that can function semi-autonomously. For a review see: LH Hartwell et al, Nature (1999)
  • 3. Motivation  We ask if modularity in large-scale metabolic networks can arise as a by- product of phenotypic constraints associated with the need to sustain life in one or more chemical environments.  An organism’s metabolism is highly versatile if it can sustain life in many different chemical environments.  Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study insilico metabolic networks that differ in their versatility.  We then ask whether versatility affects the modularity of metabolic networks.
  • 4. Outline  Modeling Framework  MCMC sampling of random viable genotypes  Environmental versatility index and measures of modularity  Results  Conclusions and Outlook
  • 5. Framework: Global Reaction Set + Biomass composition The E. coli biomass composition of iJR904 was used to determine viability of genotypes. Nutrient metabolites The set of external metabolites in iJR904 were allowed for uptake/excretion in the global reaction set. KEGG Database + E. coli iJR904 5870 reactions We can implement Flux Balance Analysis (FBA) within this database.
  • 6. Genome-scale metabolic model reconstruction pipeline Genome sequence with annotation Biochemical Knowledge Physiology Genome-scale reconstructed metabolic network AbbreviationOfficialName Equation (note [c] and [e] at the beginning refer to the compartment the reaction takes place in, cytosolic and extracellular respectively)Subsystem ProteinClassDescription ALATA_L L-alanine transaminase [c]akg + ala-L <==> glu-L + pyr Alanine and aspartate metabolismEC-2.6.1.2 ALAR alanine racemase [c]ala-L <==> ala-D Alanine and aspartate metabolismEC-5.1.1.1 ASNN L-asparaginase [c]asn-L + h2o --> asp-L + nh4 Alanine and aspartate metabolismEC-3.5.1.1 ASNS2 asparagine synthetase [c]asp-L + atp + nh4 --> amp + asn-L + h + ppi Alanine and aspartate metabolismEC-6.3.1.1 ASNS1 asparagine synthase (glutamine-hydrolysing) [c]asp-L + atp + gln-L + h2o --> amp + asn-L + glu-L + h + ppiAlanine and aspartate metabolismEC-6.3.5.4 ASPT L-aspartase [c]asp-L --> fum + nh4 Alanine and aspartate metabolismEC-4.3.1.1 ASPTA aspartate transaminase [c]akg + asp-L <==> glu-L + oaa Alanine and aspartate metabolismEC-2.6.1.1 VPAMT Valine-pyruvate aminotransferase [c]3mob + ala-L --> pyr + val-L Alanine and aspartate metabolismEC-2.6.1.66 DAAD D-Amino acid dehydrogenase [c]ala-D + fad + h2o --> fadh2 + nh4 + pyr Alanine and aspartate metabolismEC-1.4.99.1 ALARi alanine racemase (irreversible) [c]ala-L --> ala-D Alanine and aspartate metabolismEC-5.1.1.1 FFSD beta-fructofuranosidase [c]h2o + suc6p --> fru + g6p Alternate Carbon Metabolism EC-3.2.1.26 A5PISO arabinose-5-phosphate isomerase [c]ru5p-D <==> ara5p Alternate Carbon Metabolism EC-5.3.1.13 MME methylmalonyl-CoA epimerase [c]mmcoa-R <==> mmcoa-S Alternate Carbon Metabolism EC-5.1.99.1 MICITD 2-methylisocitrate dehydratase [c]2mcacn + h2o --> micit Alternate Carbon Metabolism EC-4.2.1.99 ALCD19 alcohol dehydrogenase (glycerol) [c]glyald + h + nadh <==> glyc + nad Alternate Carbon Metabolism EC-1.1.1.1 LCADi lactaldehyde dehydrogenase [c]h2o + lald-L + nad --> (2) h + lac-L + nadh Alternate Carbon Metabolism EC-1.2.1.22 TGBPA Tagatose-bisphosphate aldolase [c]tagdp-D <==> dhap + g3p Alternate Carbon Metabolism EC-4.1.2.40 LCAD lactaldehyde dehydrogenase [c]h2o + lald-L + nad <==> (2) h + lac-L + nadh Alternate Carbon Metabolism EC-1.2.1.22 ALDD2x aldehyde dehydrogenase (acetaldehyde, NAD) [c]acald + h2o + nad --> ac + (2) h + nadh Alternate Carbon Metabolism EC-1.2.1.3 ARAI L-arabinose isomerase [c]arab-L <==> rbl-L Alternate Carbon Metabolism EC-5.3.1.4 RBK_L1 L-ribulokinase (L-ribulose) [c]atp + rbl-L --> adp + h + ru5p-L Alternate Carbon Metabolism EC-2.7.1.16 RBP4E L-ribulose-phosphate 4-epimerase [c]ru5p-L <==> xu5p-D Alternate Carbon Metabolism EC-5.1.3.4 ACACCT acetyl-CoA:acetoacetyl-CoA transferase [c]acac + accoa --> aacoa + ac Alternate Carbon Metabolism BUTCT Acetyl-CoA:butyrate-CoA transferase [c]accoa + but --> ac + btcoa Alternate Carbon Metabolism EC-2.8.3.8 AB6PGH Arbutin 6-phosphate glucohydrolase [c]arbt6p + h2o --> g6p + hqn Alternate Carbon Metabolism EC-3.2.1.86 PMANM phosphomannomutase [c]man1p <==> man6p Alternate Carbon Metabolism EC-5.4.2.8 PPM2 phosphopentomutase 2 (deoxyribose) [c]2dr1p <==> 2dr5p Alternate Carbon Metabolism EC-5.4.2.7 List of metabolic reactions that can occur in an organism along with gene-protein -reaction (GPR association) Genome-scale reconstructed metabolic model ACALDt acetaldehyde reversible transport acald[e] <==> acald[c] Transport, Extracellular GUAt Guanine transport gua[e] <==> gua[c] Transport, Extracellular HYXNt Hypoxanthine transport hxan[e] <==> hxan[c] Transport, Extracellular XANt xanthine reversible transport xan[e] <==> xan[c] Transport, Extracellular NACUP Nicotinic acid uptake nac[e] --> nac[c] Transport, Extracellular ASNabc L-asparagine transport via ABC system asn-L[e] + atp[c] + h2o[c] --> adp[c] + asn-L[c] + h[c] + pi[c]Transport, Extracellular ASNt2r L-asparagine reversible transport via proton symportasn-L[e] + h[e] <==> asn-L[c] + h[c] Transport, Extracellular DAPabc M-diaminopimelic acid ABC transport 26dap-M[e] + atp[c] + h2o[c] --> 26dap-M[c] + adp[c] + h[c] + pi[c]Transport, Extracellular CYSabc L-cysteine transport via ABC system atp[c] + cys-L[e] + h2o[c] --> adp[c] + cys-L[c] + h[c] + pi[c]Transport, Extracellular ACt2r acetate reversible transport via proton symport ac[e] + h[e] <==> ac[c] + h[c] Transport, Extracellular ETOHt2r ethanol reversible transport via proton symport etoh[e] + h[e] <==> etoh[c] + h[c] Transport, Extracellular PYRt2r pyruvate reversible transport via proton symport h[e] + pyr[e] <==> h[c] + pyr[c] Transport, Extracellular O2t o2 transport (diffusion) o2[e] <==> o2[c] Transport, Extracellular CO2t CO2 transporter via diffusion co2[e] <==> co2[c] Transport, Extracellular H2Ot H2O transport via diffusion h2o[e] <==> h2o[c] Transport, Extracellular DHAt Dihydroxyacetone transport via facilitated diffusiondha[e] <==> dha[c] Transport, Extracellular NH3t ammonia reversible transport nh4[e] <==> nh4[c] Transport, Extracellular ARBt2r L-arabinose transport via proton symport arab-L[e] + h[e] <==> arab-L[c] + h[c] Transport, Extracellular Exchange and transport reactions defining possible nutrients Cellular Objective: Biomass Reaction (0.05) 5mthf + (5.0E-5) accoa + (0.488) ala-L + (0.0010) amp + (0.281) arg-L + (0.229) asn-L + (0.229) asp-L + (45.7318) atp + (1.29E-4) clpn_EC + (6.0E-6) coa + (0.126) ctp + (0.087) cys-L + (0.0247) datp + (0.0254) dctp + (0.0254) dgtp + (0.0247) dttp + (1.0E-5) fad + (0.25) gln-L + (0.25) glu-L + (0.582) gly + (0.154) glycogen + (0.203) gtp + (45.5608) h2o + (0.09) his-L + (0.276) ile-L + (0.428) leu-L + (0.0084) lps_EC + (0.326) lys-L + (0.146) met-L + (0.00215) nad + (5.0E-5) nadh + (1.3E-4) nadp + (4.0E-4) nadph + (0.001935) pe_EC + (0.0276) peptido_EC + (4.64E-4) pg_EC + (0.176) phe-L + (0.21) pro-L + (5.2E-5) ps_EC + (0.035) ptrc + (0.205) ser-L + (0.0070) spmd + (3.0E-6) succoa + (0.241) thr-L + (0.054) trp-L + (0.131) tyr-L + (0.0030) udpg + (0.136) utp + (0.402) val-L --> (45.5608) adp + (45.56035) h + (45.5628) pi + (0.7302) ppi Flux Balance Analysis (FBA) can be used to analyze the capabilities of these models.
  • 7. Framework: Metabolic Genotype Global Reaction Set List of catalyzed reactions R1: 3pg + nad → 3php + h + nadh R2: 6pgc + nadp → co2 + nadph + ru5p-D R3: 6pgc → 2ddg6p + h2o . . . RN: --------------------------------- 1 0 1 . . . . 1 1 0 . . . . 1 1 1 . . . . G1 G2 G3 Genotypes G1 and G3 differ by a single reaction swap. This reaction swap can be decomposed into a single reaction addition, which produces from G1 a new genotype G2, followed by a single reaction deletion, which produces G3 from G2. A metabolic genotype G is any subset of reactions taken from the global reaction set of N reactions. Such a genotype G can be represented as a binary string of length N with each reaction i either present (1) or absent (0).
  • 8. Framework: Genotype-to-Phenotype Map 0 1 1 . . . . 0 0 1 . . . . GA GB x Deleted reaction Flux Balance Analysis (FBA) Viable genotype Unviable genotype Bit string and equivalent network representation of genotypes Genotypes Ability to synthesize Viability biomass components Growth No Growth For FBA, see Price et al, Nature Rev Micro (2004)
  • 9. Definition: Genotype network We denote by Ω(n) the vast space of metabolic genotypes with a given number n of reactions. Even for modest n, Ω(n) is a huge space containing N!/[n! (N - n)!] genotypes. We refer to the set or space of viable genotypes within Ω(n) as V(n). 1 0 1 . . . . 1 0 0 . . . . 1 1 1 . . . . 1 1 0 . . . . …………………………………………………………..… Genotypes with n reactions: Bit strings with exactly n entries 1. Space of possible genotypes Ω(n) Viable genotypes V(n) Unviable genotypes The set of viable genotypes with fixed number of reactions n forms a genotype network or neutral network
  • 10. Viable genotypes occupy tiny fraction of the genotype space Number of reactions in genotype 2000 2200 2400 2600 2800 Fractionofgenotypesthatareviable 100 10-5 10-10 10-15 10-20 Glucose Acetate Succinate Analytic prefactor We tried to estimate the fraction of viable space by generating random genotypes with a given number of reactions n and determining their phenotype. Constraint of having super- essential reactions The convex curve indicates that the effect on viability of successive reaction removals becomes more and more severe. The numerical estimation of the fraction becomes extremely difficult for genotypes with n<2100 reactions while typical microbial metabolic networks contain 400- 1200 reactions. 1 0 1 . . . . 1 0 0 . . . . 1 1 1 . . . . 1 1 0 . . . . …………………… Generate a sample of random bit strings with exactly n entries 1. Count the number of viable bit strings in the sample.
  • 11. Markov Chain Monte Carlo (MCMC) sampling of viable genotypes • MCMC allows one to uniformly sample the tiny fraction of viable space by generating a random walk among viable genotypes, going from one viable to another by performing a reaction swap. • If a swap leads to a non-viable genotype, the walk stays at the previous genotype for that step and then the process is repeated. G1 G3 G4 Viable genotype Unviable genotype Random walkReaction swap Genotype networks in metabolic reaction spaces Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner* BMC Systems Biology 2010, 4:30
  • 12. MCMC sampling of viable genotypes 1 0 1 0 1 . . 1 1 0 0 1 . . 1 0 1 1 0 . . 1 1 1 0 0 . . 0 0 1 0 1 . . E. coli Accept Reject Step 1 Swap Swap Reject 106 Swaps Step 2 103 Swaps store store Erase memory of initial network Saving Frequency Accept/Reject Criterion of reaction swap: New network is able to grow on the specified environment Global Reaction Set R1: 3pg + nad → 3php + h + nadh R2: 6pgc + nadp → co2 + nadph + ru5p-D R3: 6pgc → 2ddg6p + h2o . . . RN: --------------------------------- The advantage of using reaction swaps in our approach is that it leaves the number of reactions constant over time. Genotype networks in metabolic reaction spaces Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner* BMC Systems Biology 2010, 4:30
  • 13. Environmental Versatility Index Venv  MCMC sampling algorithm can be used to sample the space of genotypes having a given phenotype.  The desired phenotype is viability on a given set of minimal environments.  If the set of minimal environments, consists of Venv environments, we designate the genotype’s Environmental Versatility Index to Venv.  Venv =1 refers to genotypes viable in one specific environment; Venv =2 refers to genotypes viable in two given environments; and so on.  We have considered 89 minimal environments whose sole carbon sources are their only distinguishing feature. Versatility Venv Number of Sampled genotypes 1 1000 2 1000 5 1000 10 1000 . . . . 89 1000
  • 14. Fully Coupled Set (FCS)  A pair of reactions v1 and v2 are said to be fully coupled to each other if a nonzero flux for v1 implies a proportionate (nonzero) flux for v2 in any steady state and vice versa.  A fully coupled set (FCS) in a metabolic network is a maximal set of reactions that are mutually fully coupled to each other. Such sets have been referred to as reaction/enzyme subsets or correlated reaction sets in the literature.  FCSs define functional modules in metabolic network which have been shown to be biochemically sensible.  Determining all FCSs in a large metabolic network can be done efficiently using linear optimization.
  • 15. Properties of Fully Coupled Sets (FCSs)  In steady state, each FCS can be replaced by a single effective reaction. FCSs can be used to coarse-grain metabolic networks.  It has been shown that if one reaction in a FCS is essential, the complete FCS is essential (Samal et al., BMC Bioinformatics (2006)).  Since, FCSs guarantee flux proportionality among its constituent reactions, FCSs are more relevant for unveiling functional coupling than graph-based measures (Samal et al., BMC Bioinformatics (2006), Notebaart et al., PLoS Comp Bio (2008)).  FCS can be branched and contains cycles. Example of a FCS in the E. coli metabolic network that is branched and contains a cycle: Green (red) edges represent irreversible (reversible) reactions. The reactions in this FCS are involved in cell envelope biosynthesis.
  • 16. Functional measures of modularity in metabolic networks We defined two indices of metabolic network modularity based on Fully Coupled Sets (FCSs) of reactions in a metabolic network.  M : the number of reactions in a metabolic network that belong to different FCSs (modules).  s : the number of FCSs (modules) in a metabolic network. Versatility Venv Number of Sampled genotypes <M> <s> 1 1000 . . 2 1000 5 1000 10 1000 . . . . 89 1000 . .
  • 17. Modularity index M increases with versatility Venv 1 5 10 20 30 40 50 70 90 <M> 240 250 260 270 280 290 300 310 320 Average over different nested sets (a) The data shown here are based on MCMC sampled genotypes with number n=831 reactions (same as in E. coli metabolic model. The Environmental Versatility Index Venv denotes the number of distinct environments in which a genotype is viable. The modularity index M for a genotype gives the number of reactions contained in the FCSs (modules) of that genotype.
  • 18. Modularity index s with environmental versatility Venv 12 5 10 20 30 40 50 70 89 <s> 70 75 80 85 90 95 100 Average over different nested sets (b) The data shown here are based on MCMC sampled genotypes with number n=831 reactions (same as in E. coli metabolic model. The Environmental Versatility Index Venv denotes the number of distinct environments in which a genotype is viable. The modularity index s for a genotype gives the number of FCSs (modules) of that genotype.
  • 19. Environmental versatility enhances M and s regardless of the value of number n of reactions in a genotype Venv 1 5 10 20 30 40 <M> 190 200 210 220 230 240 250 260 270 Average over different orders Venv 1 5 10 20 30 40 <M> 220 230 240 250 260 270 280 Average over different orders (a) (b) Venv 1 5 10 20 30 40 <s> 45 50 55 60 65 70 Average over different orders Venv 1 5 10 20 30 40 <s> 60 62 64 66 68 70 72 74 76 78 Average over different orders (a) (b) Modularity index M Modularity index s n = 500 n = 700
  • 20. Higher values of M and s are by-products of increasing versatility We next wished to check if the additional reactions accounting for the increase in modularity would be closely linked to the additional nutrients that metabolic networks must utilize as their versatility increases. These additional reactions and the modules they reside could occur just downstream of the nutrients. The notion of downstream can be made quantitative through the Scope or Network expansion algorithm.
  • 21. Network Expansion or Scope algorithm Reference: Handorf, Ebenhöh and Heinrich J Mol Evol (2005) This algorithm is based on principle of forward evolution The scope distance of a reaction from the nutrient metabolites in the seed set is defined as the iteration number of the step when that particular reaction is first selected by the algorithm. Distance 1 Distance 2 Distance 3Distance 0
  • 22. The increase in modularity with environmental versatility can be attributed to reactions that are close to nutrients Global reaction set Scope distance from nutrient metabolites 0 5 10 15 20 25 Frequency 0 100 200 300 400 500 (a) Reactions that account for increase in M Scope distance from nutrient metabolites 0 5 10 15 20 25 Frequency 0 5 10 15 20 25 30 (b) Distribution of scope distances from the nutrients for all possible reactions. Distribution of scope distances from nutrients for those reactions in additional FCSs that differentiate the sampled genotypes at Venv=89 from those at Venv =1. Using Kolmogorov–Smirnov test (K–S test) we could reject the hypothesis that the two distributions are same at a p-value < 0.00001.
  • 23. Example of FCSs that account for the increase in modularity with versatility These FCSs metabolize the nutrients: fucose, rhamnose and 3-hydroxycinnamic acid.
  • 24. Distribution of M for genotypes in our most constrained ensemble and comparison with E. coli M 260 280 300 320 340 360 380 Frequency 0 50 100 150 200 250 300 E. coli (a) The histogram of modularity M for a sample of 1000 genotypes with Venv =89 different minimal environments. The estimate of M for E. coli is also displayed. We can reject the hypothesis that the modularity index M of E. coli is drawn from this same distribution.
  • 25. Distribution of s for genotypes in our most constrained ensemble and comparison with E. coli The histogram of s for a sample of 1000 genotypes with Venv =89 different minimal environments. The estimate of s for E. coli is also displayed. We see that the number of FCSs in E. coli is just slightly above the position of the distribution's peak in our ensemble. Thus, the atypically high M of the E. coli network does not stem from its having more modules than typical networks allowing growth on 89 carbon sources. s 70 80 90 100 110 120 Frequency 0 50 100 150 200 250 300 350 E. coli (b)
  • 26. Atypically large FCSs near the output end of the E. coli metabolic network Samal et al, BMC Bioinformatics (2006); Private Gallery Pink metabolites are part of Biomass reaction FCSs and operons are positively associated in E. coli.
  • 27. Conclusions • We find that more versatile networks are more modular and also have more modules than less versatile networks. • The reactions that form part of newly arising modules in highly versatile networks tend to be close to reactions that process nutrients. • E. coli metabolic network is significantly more modular than random networks in our most constrained ensemble. • Since versatility corresponds to viability in increasing numbers of environments, it can be considered as a trait associated with fitness itself. • Our work supports a scenario for the origin of modularity in which increasing functional constraints can make modularity emerge.
  • 28. Two scenarios for the evolution of modularity • Modularity might result from directional selection favouring change in one trait while stabilizing selection maintains other traits unchanged. This scenario was shown to be true in model gene regulatory networks. • Modular fluctuations in evolutionary goals can be sufficient to produce and maintain modularity. In this scenario, modularity will be lost once there are no fluctuations in the evolutionary goals.
  • 29. Empirical studies show that generalist prokaryotes are more modular than specialists Empirical observations by groups of Uri Alon (Parter et al, BMC Evol Bio (2009)) and Eytan Ruppin (Kreimer et al, PNAS (2008)) where generalists prokaryotes living in many different environments are more modular than specialists are fully consistent with our conclusion. Note that the size of the networks increases with environmental variability in these studies. Our samples of random viable networks have not been subject to unknown selection pressures unlike metabolism of real organisms.
  • 30. Functional constraints lead to emergence of modularity • Our results support the scenario proposed can explain the emergence of modularity in both non-fluctuating and fluctuating environment scenario. • The main requirement for emergence is an increase in the number of functions that a network performs. The intersection of genotype spaces for two different environments contains modular networks. Modularity is ultimately a property of genotype spaces (or the genotype-phenotype map). Uri Alon’s PictureOur Picture
  • 31. Global structural properties of real metabolic networks: Consequence of simple biochemical and functional constraints Increasinglevelofconstraints Biological function is a main driver of structure in metabolic networks • Power law degree distribution • Small Average Path Length • High Clustering coefficient • Bow-tie architecture
  • 32. Acknowledgement CNRS & Max Planck Society Joint Program in Systems Biology (CNRS MPG GDRE 513) for a Postdoctoral Fellowship. Reference Genotype networks in metabolic reaction spaces Areejit Samal, João F Matias Rodrigues, Jürgen Jost, Olivier C Martin* and Andreas Wagner* BMC Systems Biology 2010, 4:30 Environmental versatility promotes modularity in genome-scale metabolic networks Areejit Samal, Andreas Wagner* and Olivier C Martin* BMC Systems Biology 2011, 5:135 Funding Andreas Wagner University of Zurich Olivier C. Martin Université Paris-Sud Federation of European Biochemical Societies (FEBS) for a Fellowship.