The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks


  1. The Subliminal Toolbox: automating steps in the reconstruction of metabolic networks. Neil Swainston, Manchester Centre for Integrative Systems Biology. Mendes meeting, 13 January 2011.
  2. Metabolic reconstructions: computational and mathematical representation of the metabolic capabilities of a given organism. Consists of: metabolic reactions; gene–protein–reaction relationships; compartmentalisation; reaction directionality; objective function(s).
  3. Metabolism
  4. Uses of metabolic reconstructions: metabolic engineering; genome annotation; evolutionary studies; network property analysis; interpretation of 'omics datasets; compendium / source for smaller, kinetic models.
  5. Requirements. Comprehensive: "genome-scale". Connected: minimise gaps, blocked reactions. Predictive: produce biologically relevant results, gene-essentiality studies.
  6. How are they generated?
  7. How are they generated? Start from: existing reconstructions (generate a consensus); genome sequence (infer metabolic reactions through gene homology); existing resources (KEGG, MetaCyc). Provides a first draft.
  8. Next steps: ensure consistent naming of metabolites / enzymes (allows merging); assign genes / proteins to reactions; ensure mass / charge balancing; add reaction directionality; add compartmentalisation; add annotation (EC terms, PubMed references, confidence scores).
  9. Jamborees
  10. Automation: many of these steps can be automated. Subliminal Toolbox: goal is to generate a metabolic reconstruction automatically. Manual curation still necessary BUT reduce what needs to be done. Investigation: can we automate the generation of a metabolic network in yeast?
  11. [Workflow diagram. Box labels: KEGG; MetaCyc; Draft; (De)protonate metabolites; Balance reactions; Merge / Merge pathways; Add transport reactions; Add transport proteins; Add biomass reaction; Format for COBRA]
  12. Initial draft: both KEGG and MetaCyc allow export of pathways / networks in SBML (KEGG2SBML), BUT these are representations of the database, NOT computational models. Merging issue: components are named inconsistently.
  13. Naming: glucose, glc, D-glucose, alpha-D-glucose? Need to be reconciled. Use semantic annotations: ChEBI terms for metabolites; UniProt terms for enzymes. Apply MIRIAM standard.
  14. Merging: standard identifiers: job done? Inconsistent charge states: pyruvic acid and pyruvate.
  15. Charge state determination: annotated ChEBI terms provide programmatic access to structural data (InChI, SMILES strings).
      InChI: InChI=1/C3H4O3/c1-2(4)3(5)6/h1H3,(H,5,6)/p-1/fC3H3O3/q-1
      SMILES: CC(=O)C([O-])=O
      Cheminformatics software (ChemAxon MARVIN) can be used to predict charge state at a given pH. Consistency.
  16. Stereochemistry: KEGG and MetaCyc are inconsistent in their definition of stereochemical precision. Apparently minor but can cause gaps in the network, e.g. beta-D-glucose versus D-glucose.
  17. Stereochemistry-induced gaps [figure; labels: X, Y]
  18. ChEBI ontology: ChEBI contains relationships between metabolites.
  19. Stereochemistry-induced gaps [figure; labels: X, Y]
  20. Stereochemistry-induced gaps [figure; labels: X, Y]
  21. Stereochemistry-induced gaps [figure; labels: X, Y]
  22. Reaction balancing: reaction elemental and charge balancing. Aids merging. Requirement of Flux Balance Analysis. Prevents inconsistencies arising from "magical" production or disappearance of matter. KEGG and MetaCyc reactions don't always balance: incorrect stoichiometry; missing protons, water, etc. Solution: use linear programming.
  23. Reaction balancing
      carbon dioxide + 2-acetolactate → pyruvate
      CO2 + C5H7O4- → C3H3O3-
      Ab = 0, where A =

                 Reactants        Products   Optional reactants   Optional products
                 CO2   C5H7O4     C3H3O3     H+    H2O            H+    H2O   CO2
      C            1     5          -3        0     0              0     0    -1
      O            2     4          -3        0     1              0    -1    -2
      H            0     7          -3        1     2             -1    -2     0
      charge       0    -1           1        1     0             -1     0     0
      b min        1     1           1        0     0              0     0     0

  24. Reaction balancing: linear programming solver solves Ab = 0, where b is a vector of stoichiometries.
      carbon dioxide + 2-acetolactate → 2 pyruvate + H+
      CO2 + C5H7O4- → 2 C3H3O3- + H+

                 Reactants        Products   Optional reactants   Optional products
                 CO2   C5H7O4     C3H3O3     H+    H2O            H+    H2O   CO2
      C            1     5          -3        0     0              0     0    -1
      O            2     4          -3        0     1              0    -1    -2
      H            0     7          -3        1     2             -1    -2     0
      charge       0    -1           1        1     0             -1     0     0
      b min        1     1           1        0     0              0     0     0
      b            1     1           2        0     0              1     0     0
  25. Transporters: transporters are required to transport metabolites into and out of the cell. TransportDB is a source of transporter proteins, BUT not comprehensive enough to assign these to individual reactions. Approach taken is a pragmatic one: add all transport proteins from TransportDB; generate transport reactions for ALL metabolites; map the proteins to the reactions manually.
  26. Biomass function: Flux Balance Analysis requires an objective function to maximise. Traditionally, a biomass function is specified. Subliminal adds a generic biomass function (amino acids, nucleotides, lipids, ATP). Formats model such that it can be loaded into the COBRA Toolbox.
  27. [Workflow diagram. Box labels: KEGG; MetaCyc; Draft; (De)protonate metabolites; Balance reactions; Merge / Merge pathways; Add transport reactions; Add transport proteins; Add biomass reaction; Format for COBRA]
  28. Analysis. Goal: can biomass be generated from growth medium? Next question: what is the growth medium? By default, ALL metabolites can be transported into the cell. Approach: using FBA to analyse biomass generation and iteratively knock out transporters. Generates a minimum required growth medium.
  29. Analysis [figure; labels: biomass, glc, PO4 3-, x, y, …, z]
  30. Analysis [figure; labels: biomass, glc, PO4 3-, x, y, …, z; one element marked ✗]
  31. Analysis. Minimum biomass-producing growth medium: phosphate, histidine and methionine. Amino acids being used as C-, N- and S-sources. Wrong! Suggests insufficient directionality constraints in the model. Second approach specified a "sensible" growth medium: only histidine had to be added to the medium. Suggests good connectivity BUT suggests gap(s) in histidine synthesis pathways.
  32. Subliminal generated model. Many more metabolites and enzymes: better coverage? Poor merging? Few unreachable metabolites: good connectivity. Many blocked reactions: insufficient sink / export reactions?

      Components                Subliminal         Manual
      Compartments              2                  2
      Unique metabolites        1385               728
      Unique enzymes            1229               939
      Metabolic reactions       1440               947
      Unreachable metabolites   238/2953 (8.1%)    75/758 (9.9%)
      Blocked reactions         831/1538 (54%)     140/1102 (13%)
  33. Future developments. Directionality: use thermodynamic predictions of reaction reversibility; possible to automate due to our mapping to chemical structures. Compartmentalisation: use protein localisation prediction to infer intra-cellular compartmentalisation; possible to automate due to our mapping to UniProt identifiers and protein sequences.
  34. Conclusion: many steps can be automated in generating genome-scale metabolic reconstructions. Additional modules would be useful. Manual curation still necessary… but… Subliminal Toolbox is modular and can be used in the manual curation phase. Approach is far better than starting from scratch.
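
The balancing example on slides 22 to 24 can be reproduced with a small sketch. The matrix A and the lower bounds b min are copied from slide 23. The slides say a linear programming solver is used; the code below instead runs an exhaustive search over small integer coefficients, which is only a stand-in that happens to work for a reaction this small. The names `balance` and `max_coeff` are hypothetical and not part of the Subliminal Toolbox.

```python
from itertools import product

# Columns as on slide 23: reactants, product, optional reactants, optional products.
species = ["CO2", "C5H7O4-", "C3H3O3-",
           "H+ (opt. reactant)", "H2O (opt. reactant)",
           "H+ (opt. product)", "H2O (opt. product)", "CO2 (opt. product)"]

# Rows: C, O, H, charge. Reactant entries are positive, product entries negative.
A = [
    [1,  5, -3, 0, 0,  0,  0, -1],  # C
    [2,  4, -3, 0, 1,  0, -1, -2],  # O
    [0,  7, -3, 1, 2, -1, -2,  0],  # H
    [0, -1,  1, 1, 0, -1,  0,  0],  # charge
]

# Species in the original reaction must appear at least once; optional ones may be absent.
b_min = [1, 1, 1, 0, 0, 0, 0, 0]


def balance(A, b_min, max_coeff=3):
    """Return the integer vector b >= b_min with A.b = 0 and the smallest sum.

    Assumption: brute-force search up to max_coeff replaces the LP solver
    mentioned on the slides. Returns None if no balanced vector is found.
    """
    best = None
    ranges = [range(lo, max_coeff + 1) for lo in b_min]
    for b in product(*ranges):
        balanced = all(sum(a * x for a, x in zip(row, b)) == 0 for row in A)
        if balanced and (best is None or sum(b) < sum(best)):
            best = b
    return best


solution = balance(A, b_min)
for name, coeff in zip(species, solution):
    if coeff:
        print(coeff, name)
```

The returned vector matches the row b on slide 24: one CO2 and one 2-acetolactate on the left, two pyruvate and one proton on the right. A real pipeline would hand the same A and b min to an LP or integer programming solver rather than enumerate candidates.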

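
The iterative transporter knockout described on slide 28 can be sketched as a greedy loop. The slides use FBA to test whether biomass can still be produced; the sketch below substitutes a much simpler reachability check (a reaction fires once all of its substrates are available), so it ignores stoichiometry and flux entirely. The toy network, the metabolite names other than glc and biomass, and the functions `can_make_biomass` and `minimal_medium` are all hypothetical illustrations.

```python
def can_make_biomass(reactions, medium, target="biomass"):
    """Reachability stand-in for FBA: expand the set of available metabolites
    until no reaction adds anything new, then check for the target."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return target in available


def minimal_medium(reactions, candidates, target="biomass"):
    """Greedily knock out one uptake at a time, keeping the knockout
    whenever the target can still be reached without it."""
    medium = list(candidates)
    for metabolite in candidates:
        trial = [m for m in medium if m != metabolite]
        if can_make_biomass(reactions, trial, target):
            medium = trial
    return medium


# Hypothetical toy network: (substrates, products) pairs with no directionality limits.
reactions = [
    (("glc",), ("pyr",)),
    (("pyr", "nh4"), ("ala",)),
    (("ala", "po4"), ("biomass",)),
]

# By default every metabolite may be taken up, as on slide 28.
candidates = ["glc", "nh4", "po4", "ala"]

print(minimal_medium(reactions, candidates))
```

On this toy network the loop drops glucose and ammonium and keeps phosphate plus an amino acid, which mirrors the kind of result reported on slide 31, where amino acids end up serving as carbon and nitrogen sources. The outcome of a greedy loop like this depends on the order in which uptakes are tried, so it yields a minimal medium rather than a unique one.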