Informatics In The Manchester Centre For Integrative Systems Biology



  1. 1. Informatics in the Manchester Centre for Integrative Systems Biology <ul><li>Daniel Jameson, Neil Swainston </li></ul><ul><li>Manchester Centre for Integrative Systems Biology </li></ul><ul><li>SysMO-DB Workshop – Connecting Models and Data, Berlin </li></ul><ul><li>23 November 2009 </li></ul>
  2. 2. The MCISB <ul><li>Currently employs 9.5 multidisciplinary people </li></ul><ul><ul><li>All share the same office and lab </li></ul></ul><ul><li>Pioneer the development of new experimental and computational technologies in systems biology </li></ul><ul><li>Develop an annotated, kinetic model of yeast metabolism </li></ul>
  3. 3. Goals of the MCISB <ul><li>Follow an integrative approach: </li></ul>
  4. 4. Goals of the MCISB <ul><li>Follow an iterative approach: </li></ul>
  5. 5. Definition of the problem <ul><li>Experimentalists generate data </li></ul><ul><li>Modellers require data </li></ul><ul><li>How do we pass data from the experimentalist to the modeller? </li></ul><ul><li>Traditional method </li></ul><ul><ul><li>Experimentalist analyses data, produces spreadsheet </li></ul></ul><ul><ul><li>Experimentalist e-mails spreadsheet to modeller </li></ul></ul><ul><ul><li>Modeller cuts and pastes data into modelling tool </li></ul></ul><ul><ul><li>Do the experimentalist and the modeller speak the same language? </li></ul></ul>
  6. 6. Informatics challenges <ul><li>How do we map experimental data to models? </li></ul><ul><ul><li>How do we know which data apply to which molecule or reaction? </li></ul></ul><ul><ul><li>How do we identify molecules or reactions? </li></ul></ul><ul><li>(Same problem in merging models) </li></ul><ul><li>Use names…? </li></ul>
  7. 7. Computers don’t like names … because they are non-unique / ambiguous / imprecise / etc.
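  8. 8. Biochemists like names a little too much… <ul><li>(3R,4R,5S,6S)-6-(hydroxymethyl)oxane-2,3,4,5-tetrol </li></ul><ul><li>Glucose </li></ul><ul><li>Glc </li></ul><ul><li>Anhydrous dextrose </li></ul><ul><li>Cerelose 2001 </li></ul><ul><li>Traubenzucker </li></ul><ul><li>Staleydex 95M </li></ul>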
  9. 9. Solution <ul><li>Utilise unique, public identifiers for identifying molecules </li></ul><ul><ul><li>Don’t invent your own… </li></ul></ul><ul><ul><li>Use ChEBI terms to uniquely identify metabolites </li></ul></ul><ul><ul><li>Use UniProt terms to uniquely identify enzymes </li></ul></ul>
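
A minimal sketch of why a shared public identifier helps, assuming a small hand-written synonym table. The table and the `resolve` and `same_molecule` helpers are hypothetical illustrations, not part of any MCISB tool; `CHEBI:17634` is the D-glucose identifier used in the SBML annotation slide of this deck.

```python
# Hypothetical synonym table: several names from the previous slide,
# all mapped to one public ChEBI identifier (illustrative only).
SYNONYMS = {
    "glucose": "CHEBI:17634",
    "glc": "CHEBI:17634",
    "anhydrous dextrose": "CHEBI:17634",
    "traubenzucker": "CHEBI:17634",
}


def resolve(name):
    """Return the public identifier for a name, or None if it is unknown."""
    return SYNONYMS.get(name.strip().lower())


def same_molecule(name_a, name_b):
    """Two names match only if both resolve to the same identifier."""
    id_a, id_b = resolve(name_a), resolve(name_b)
    return id_a is not None and id_a == id_b


if __name__ == "__main__":
    print(resolve("Glc"))
    print(same_molecule("Glucose", "Traubenzucker"))
```

Comparing identifiers rather than strings is what lets data from an experimentalist and a species in a model be matched without agreeing on a name first.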
  10. 11. Solution <ul><li>Further advantage: </li></ul><ul><ul><ul><li>Links into existing databases (ChEBI, UniProt) provide additional information immediately </li></ul></ul></ul><ul><ul><ul><li>Chemical formulae, structures </li></ul></ul></ul><ul><ul><ul><li>Protein sequences, phosphorylation sites, SNPs </li></ul></ul></ul><ul><ul><li>Use unique, public IDs </li></ul></ul>
  11. 12. But names are still important <ul><li>Names are for humans (human-ish) </li></ul><ul><li>Unique ids (e-mail addresses, bank account numbers) are for computers (geek-ish) </li></ul><ul><li>BOTH are needed </li></ul>
  12. 13. But names are still important
  13. 14. Models <ul><li>Useful to have a standard to allow models to be shared / re-used </li></ul><ul><ul><li>Use SBML </li></ul></ul><ul><ul><li>Very well developed / supported </li></ul></ul><ul><ul><li>Tool set increasing all the time </li></ul></ul><ul><li>Identifying metabolites / proteins in models? </li></ul><ul><ul><li>Use MIRIAM standards </li></ul></ul><ul><ul><li>Allows unique, public IDs to be embedded into SBML as annotations (along with human-readable names) </li></ul></ul>
  14. 15. Models <ul><li>Genome-scale SBML model of yeast metabolism </li></ul><ul><li>Annotated model </li></ul><ul><ul><li>All >2000 molecules have unique database references </li></ul></ul><ul><ul><li>MIRIAM standards have been followed </li></ul></ul><ul><ul><li>Should be entirely unambiguous for third party users </li></ul></ul><ul><ul><li>Should be usable in third party tools </li></ul></ul><ul><ul><li>Should allow data to be imported “easily” </li></ul></ul>
  15. 17. SBML annotation <species id="glc" name="D-Glucose"> <annotation> <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI:17634"/> </annotation> </species>
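
The slide snippet is abbreviated. In a complete SBML file the `rdf:li` element normally sits inside `rdf:RDF`, `rdf:Description`, a qualifier such as `bqbiol:is`, and `rdf:Bag`. The sketch below assumes that fuller form (the `metaid` value and the choice of `bqbiol:is` are assumptions, and the SBML namespace is omitted for brevity) and shows how a tool could read back both the human-readable name and the machine-readable identifier using Python's standard library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed expanded form of the slide's abbreviated snippet.
SPECIES_XML = """
<species metaid="meta_glc" id="glc" name="D-Glucose">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#meta_glc">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI:17634"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""


def miriam_resources(species_xml):
    """Return (id, name, [resource URNs]) for one annotated species element."""
    species = ET.fromstring(species_xml.strip())
    # Namespaced tags and attributes are written as {namespace}localname.
    resources = [
        li.get(f"{{{RDF}}}resource") for li in species.iter(f"{{{RDF}}}li")
    ]
    return species.get("id"), species.get("name"), resources


if __name__ == "__main__":
    print(miriam_resources(SPECIES_XML))
```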
  16. 18. Solution on the experimental side <ul><li>Ensure that unique identifiers are captured and associated with data at the time of the experiment </li></ul><ul><ul><li>BUT… this is all a bit geek-ish for biologists </li></ul></ul><ul><li>So… generate intuitive tools to do this by stealth </li></ul>
  17. 19. KineticsWizard
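  18. 20. Project overview <ul><li>Experiments: enzyme kinetics, quantitative metabolomics, quantitative proteomics </li></ul><ul><li>SBML model </li></ul><ul><ul><li>Parameters (K_M, k_cat) </li></ul></ul><ul><ul><li>Variables (metabolite, protein concentrations) </li></ul></ul><ul><li>Data resources shown in the diagram: PRIDE XML, MeMo, SABIO-RK, MeMo-RK (the diagram labels four web services) </li></ul>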
  19. 24. CellDesigner plugins …eventually
  20. 25. But… <ul><li>… MCISB has to manage “only” three types of experiment </li></ul><ul><ul><li>Proteomics, metabolomics, enzyme kinetics </li></ul></ul><ul><li>Informatics team share office with experimentalists and modellers </li></ul><ul><li>We’ve been doing this for years… </li></ul><ul><ul><ul><li>Lots of time, lots of people, lots of resource </li></ul></ul></ul><ul><ul><ul><li>Infrastructure development is part of our remit </li></ul></ul></ul>
  21. 26. And… <ul><li>… SysMO projects are far more diverse </li></ul><ul><li>Informatics team separated from experimentalists, who are separated from modellers </li></ul><ul><li>Less informatics resource </li></ul><ul><li>Heavyweight approach of MCISB (bespoke tools for each experiment) probably not applicable </li></ul>
  22. 27. So… <ul><li>… lightweight approach may be more suitable </li></ul><ul><li>Store only secondary data necessary for modelling </li></ul><ul><ul><ul><li>Not raw data </li></ul></ul></ul><ul><li>Daniel… </li></ul>
  23. 28. Simply great! (“Einfach Klasse!”)
  24. 29. Modelling infrastructure
  25. 30. Taverna
  26. 31. Taverna
  27. 32. Modelling life-cycle workflows
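  28. 33. Model construction <ul><li>Input: list of ORFs </li></ul><ul><li>Output: SBML file </li></ul><ul><li>Workflow steps </li></ul><ul><ul><li>1. Get reaction info </li></ul></ul><ul><ul><li>2. Create compartments </li></ul></ul><ul><ul><li>3. Create species </li></ul></ul><ul><ul><li>4. Create reactions </li></ul></ul><ul><ul><li>Get annotations (unnumbered in the diagram) </li></ul></ul>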
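
The four numbered steps can be sketched as a single function, assuming the reaction information has already been looked up. Everything here is hypothetical: the `REACTION_INFO` table stands in for whatever service the real Taverna workflow queries, `ORF_A` is a placeholder rather than a real ORF name, and the annotation step is left out. The output is a bare SBML-like skeleton, not a validated model.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for step 1: reaction details keyed by ORF.
REACTION_INFO = {
    "ORF_A": {
        "reaction_id": "r_hexokinase",
        "compartment": "cytoplasm",
        "reactants": {"glc": "D-Glucose", "atp": "ATP"},
        "products": {"g6p": "D-Glucose 6-phosphate", "adp": "ADP"},
    },
}


def build_model(orfs, reaction_info):
    """Turn a list of ORFs into an SBML-like XML string."""
    infos = [reaction_info[orf] for orf in orfs]  # 1. get reaction info

    sbml = ET.Element("sbml", {
        "xmlns": "http://www.sbml.org/sbml/level2/version4",
        "level": "2",
        "version": "4",
    })
    model = ET.SubElement(sbml, "model", id="sketch")

    compartments = ET.SubElement(model, "listOfCompartments")
    for name in sorted({info["compartment"] for info in infos}):  # 2.
        ET.SubElement(compartments, "compartment", id=name)

    species = ET.SubElement(model, "listOfSpecies")
    seen = set()
    for info in infos:  # 3. create each species once
        for sid, name in {**info["reactants"], **info["products"]}.items():
            if sid not in seen:
                seen.add(sid)
                ET.SubElement(species, "species", id=sid, name=name,
                              compartment=info["compartment"])

    reactions = ET.SubElement(model, "listOfReactions")
    for info in infos:  # 4. create reactions
        rxn = ET.SubElement(reactions, "reaction", id=info["reaction_id"])
        for tag, key in (("listOfReactants", "reactants"),
                         ("listOfProducts", "products")):
            refs = ET.SubElement(rxn, tag)
            for sid in info[key]:
                ET.SubElement(refs, "speciesReference", species=sid)

    return ET.tostring(sbml, encoding="unicode")


if __name__ == "__main__":
    print(build_model(["ORF_A"], REACTION_INFO))
```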
  29. 34. Model construction
  30. 35. Model parameterisation <ul><li>Data requirements </li></ul><ul><ul><li>SBML model </li></ul></ul><ul><ul><li>Starting concentrations for enzymes and source metabolites </li></ul></ul><ul><ul><li>Key results database </li></ul></ul><ul><ul><li>Enzyme kinetics </li></ul></ul><ul><ul><li>SABIO-RK database web service </li></ul></ul>
  31. 36. SABIO-RK web service
  32. 37. Model parameterisation
  33. 38. Model calibration <ul><li>Data requirements </li></ul><ul><ul><li>Parameterised SBML model </li></ul></ul><ul><ul><li>Experimental data </li></ul></ul><ul><ul><li>Metabolite concentrations from key results database </li></ul></ul><ul><ul><li>Calibration by COPASI web service </li></ul></ul>
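
As a toy illustration of what calibration means (adjusting parameters until the model reproduces measurements), the sketch below recovers two Michaelis-Menten parameters from rate data. It does not use or imitate the COPASI web service. The fitting method (a Lineweaver-Burk linearisation with ordinary least squares) is a swapped-in textbook technique, and the data are synthetic, generated from known parameters purely so the result can be checked.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_lineweaver_burk(substrate, rate):
    """Estimate (vmax, km) from the line 1/v = (km/vmax) * (1/s) + 1/vmax."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rate]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic, noise-free "measurements" (arbitrary units, illustrative only).
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
rate = [michaelis_menten(s, vmax=2.0, km=0.5) for s in substrate]

if __name__ == "__main__":
    print(fit_lineweaver_burk(substrate, rate))
```

With real, noisy data the linearisation distorts the error structure, which is one reason dedicated estimation tools are used in practice.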
  34. 39. COPASI web service <ul><li>Dada JO, Mendes P. Design and Architecture of Web Services for Simulation of Biochemical Systems. Data Integration in the Life Sciences, Manchester, UK (2009). </li></ul>
  35. 40. Model calibration
  36. 41. Model simulation <ul><li>Using COPASI web service </li></ul>
  37. 42. Conclusion <ul><li>Integrating experimental data with models is “easy” and can be automated </li></ul><ul><ul><li>If we adopt some standards </li></ul></ul><ul><li>Data can be shared “easily” between groups </li></ul><ul><ul><li>If we all adopt some standards </li></ul></ul><ul><li>Lightweight approach more achievable </li></ul><ul><ul><ul><li>Key Results Database </li></ul></ul></ul>
  38. 43. Thanks…