
Systems Biology Model Semantics and Integration


A short description of my research experiences using OWL to perform semantic data integration and, ultimately, the addition of annotation for systems biology models.

Published in: Education, Technology

Systems Biology Model Semantics and Integration
  Allyson Lister
  Biology, Neurosciences & Computing Group, Newcastle University
  29 July 2011
  This presentation is licensed under Creative Commons BY-SA 2.5
Background in Standards
  • In the beginning, there was syntax...
      • TrEMBL, FuGE, SBML
  • ...then came content...
      • MIGS/MIMS
  • ...and ultimately, semantics
      • OBI, SBO, BioPAX
Background in Integration
  • Throughout, an interest in data integration
      • Redundancy removal in TrEMBL
      • International Protein Index (IPI)
      • FuGE-based metadata database and data storage (SyMBA)
      • Semantic data integration
Rule-Based Mediation (RBM)
  • Integrate data from multiple data sources into a single core ontology for reasoning, querying and data extraction back to a chosen (non-OWL) format
RBM (continued)
  • Resolution of syntactic and semantic heterogeneity occurs separately
      • The core ontology is a semantically rich description of the research domain of interest
      • Syntactic ontologies pass data to the core via SWRL
          • They are either syntactic translations of data formats into OWL or pre-existing OWL ontologies
  • Mainly uses existing, independent ontologies and off-the-shelf libraries and applications
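The separation between syntactic ontologies and the core ontology can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `SyntacticInstance` and `CoreInstance` classes, the `sbml:species` and `core:Protein` labels, the placeholder identifier and the `species_to_protein` rule are hypothetical stand-ins for OWL individuals and a SWRL rule, and are not taken from the RBM code base.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class SyntacticInstance:
    """An individual in a syntactic ontology: a direct mirror of the source format."""
    source_class: str                      # hypothetical label, e.g. "sbml:species"
    properties: dict = field(default_factory=dict)


@dataclass
class CoreInstance:
    """An individual in the semantically rich core ontology."""
    core_class: str                        # hypothetical label, e.g. "core:Protein"
    label: str
    provenance: str


# A rule stands in for a SWRL rule: it either maps one syntactic
# instance into the core ontology or declines by returning None.
Rule = Callable[[SyntacticInstance], Optional[CoreInstance]]


def species_to_protein(inst: SyntacticInstance) -> Optional[CoreInstance]:
    """Hypothetical rule: a species carrying a protein identifier becomes a core protein."""
    if inst.source_class != "sbml:species":
        return None
    identifier = inst.properties.get("uniprot")
    if identifier is None:
        return None
    return CoreInstance("core:Protein", inst.properties.get("name", identifier), identifier)


def mediate(instances: Iterable[SyntacticInstance], rules: Iterable[Rule]) -> list:
    """Apply every rule to every syntactic instance and collect the core instances."""
    rules = list(rules)
    core = []
    for inst in instances:
        for rule in rules:
            mapped = rule(inst)
            if mapped is not None:
                core.append(mapped)
    return core


if __name__ == "__main__":
    source = [
        # "X00000" is a placeholder, not a real accession number.
        SyntacticInstance("sbml:species", {"name": "ProteinA", "uniprot": "X00000"}),
        SyntacticInstance("sbml:compartment", {"name": "nucleus"}),
    ]
    for item in mediate(source, [species_to_protein]):
        print(item)
```

Only the rule knows about both vocabularies. The source mirror stays purely syntactic and the core stays purely semantic, which is the point of resolving the two kinds of heterogeneity separately.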
RBM (continued)
  • Is not
      • Ontology alignment
          • Alignment is often a prelude to ontology merging, and is used where domains at least partly intersect
          • We do not intend to merge ontologies, and each data source may be very different from another
      • Format reconciliation
          • We are not trying to create a single, overarching format, just to pull data quickly from many formats
Systems Biology and RBM
  • Add information to models
      • Add new interactions/pathways to existing models
      • Add new biological annotation to existing models
      • Build skeleton models of requested interactions/pathways
RBM Overview
  [Diagram. Data sources: UniProtKB, CellML, PathwayCommons, BioModels, ... Formats: XML, OWL, BioPAX, MFO. Stages: retrieve data; resolve syntactic heterogeneity; resolve semantic heterogeneity, reasoning and querying over core ontology instances; export to SBML or other formats, if required.]
Why have an OWL intermediary where converters exist?
  • BioPAX ↔ SBML, CellML ↔ SBML converters exist
      • Lossy (due to the different scopes of each format)
      • Might not get what we need from such a conversion
Why have an OWL intermediary where converters exist?
  • BioPAX ↔ SBML, CellML ↔ SBML converters exist
      • With SWRL rules we can pull information for exactly those portions of each format we're interested in
      • Not dependent upon external developers if the meaning or structure of a format changes
      • Easier to change rules (especially for web applications or novices) and re-run mappings than to rewrite hard-coded Java, Perl, etc.
The core ontology
  • Ideally, a core ontology should be a tightly scoped ontology describing the domain of interest
  • Multiple core ontologies can be created as necessary to address multiple biological questions
  • We began with an ontology describing the basics of telomere uncapping
Sharing common concepts among core ontologies
  • To make it easier to swap out core ontologies, use a common ontology from which all can inherit
      • BioPAX Level 3 (and perhaps the SBPAX3 extension) is being considered for my research
      • Such an ontology can be selectively enriched with the biological information of interest
      • Only a small number of domain-specific SWRL rules would be needed with each new core ontology
Visible face of RBM
  • Saint
      • Pulls suggested MIRIAM annotation and possible interactors from web services
      • “Syntactic” integration of data, or direct querying of web services based on query strings built from the SBML/CellML models
      • “Semantic” Saint will also pull information out of RBM-integrated data
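As a rough illustration of building query strings from a model, the sketch below reads species names from a small SBML fragment and turns each into a web-service query URL. It is not Saint's implementation: the `SERVICE_URL` endpoint, the `query` parameter name and the toy model are all hypothetical, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint, used only to show the shape of a query URL.
SERVICE_URL = "https://example.org/annotation-service/search"

# Toy SBML Level 2 Version 4 fragment (invented for this example).
SBML_EXAMPLE = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfSpecies>
      <species id="s1" name="protein A" compartment="c"/>
      <species id="s2" compartment="c"/>
    </listOfSpecies>
  </model>
</sbml>
"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def species_terms(sbml_text: str) -> list:
    """Return one search term per species: its name, or its id if it has no name."""
    root = ET.fromstring(sbml_text)
    terms = []
    for element in root.iter():
        if local_name(element.tag) == "species":
            term = element.get("name") or element.get("id")
            if term:
                terms.append(term)
    return terms


def build_queries(sbml_text: str) -> list:
    """Build one URL-encoded query per species term."""
    return [SERVICE_URL + "?" + urlencode({"query": term})
            for term in species_terms(sbml_text)]


if __name__ == "__main__":
    for url in build_queries(SBML_EXAMPLE):
        print(url)
```

Matching on the local tag name rather than a fixed namespace keeps the sketch independent of the SBML level and version of the input model.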
Reasoning
  • Not much reasoning over BioPAX yet, though as a component of my core ontology this will be coming soon
  • Reasoning over MFO models is quick, which is to be expected given the (deliberate) relative lack of complexity
Reasoning
  • Reasoning and querying over the core ontology have already yielded new annotations as well as possible identifications of unknown species in SBML models
  • Reasoning tends to be slower than I'd like, although much of it can be done behind the scenes and the results stored for later queries (i.e. with SQWRL)
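The idea of reasoning behind the scenes and storing the results for later queries can be sketched as follows. This is a toy under stated assumptions: `infer_types` is a stand-in for a real OWL reasoner, the class and individual names are invented, the subclass hierarchy is assumed to be acyclic, and a JSON file takes the place of whatever store is actually used.

```python
import json
import tempfile
from pathlib import Path

# Toy axioms (hypothetical names): direct subclass links and asserted types.
SUBCLASS_OF = {
    "Kinase": "Protein",
    "Protein": "PhysicalEntity",
}
ASSERTED_TYPES = {
    "species_1": "Kinase",
    "species_2": "Protein",
}


def infer_types(subclass_of: dict, asserted: dict) -> dict:
    """Stand-in for a reasoner: add every superclass of each asserted type.

    Assumes the subclass hierarchy contains no cycles.
    """
    inferred = {}
    for individual, cls in asserted.items():
        types = [cls]
        while cls in subclass_of:
            cls = subclass_of[cls]
            types.append(cls)
        inferred[individual] = types
    return inferred


def materialise(path: Path) -> None:
    """Run the slow inference step once and store its results."""
    path.write_text(json.dumps(infer_types(SUBCLASS_OF, ASSERTED_TYPES)))


def instances_of(path: Path, cls: str) -> list:
    """Answer a query from the stored results without re-running inference."""
    inferred = json.loads(path.read_text())
    return sorted(name for name, types in inferred.items() if cls in types)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "inferred.json"
        materialise(store)
        print(instances_of(store, "Protein"))
```

The slow step runs once in `materialise`, and each later call to `instances_of` is a cheap lookup over the stored inferences.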
Many interesting projects
  • Model annotation for Synthetic Biology
      • Goksel Misirli and others
  • BioPAX ↔ SBML
      • SBPAX3 and other work by Oliver Ruebenacker and others
      • EBI BioPAX ↔ SBML conversion
      • RBM using both as data sources
Many interesting projects
  • SBML and OWL
      • MFO
      • SBMLHarvester by Robert Hoehndorf and others
  • CellML and OWL
Related Work from Us
  • Model annotation for synthetic biology: http://dx.doi.org/10.1093/bioinformatics/btr048
  • Rule-Based Mediation: http://cisban-silico.cs.ncl.ac.uk/RBM/
      • http://dx.doi.org/10.1186/2041-1480-1-S1-S3
  • MFO: http://cisban-silico.cs.ncl.ac.uk/MFO/
      • doi:10.2390/biecoll-jib-2007-80
Related Work from Us
  • SyMBA: http://symba.sf.net
  • Saint: http://saint-annotate.sf.net
      • http://dx.doi.org/10.1093/bioinformatics/btp523
Other Related Work
  • SBPAX3: http://sourceforge.net/apps/mediawiki/biopax/index.php?title=SBPAX3
  • SBMLHarvester: http://bioonto.gen.cam.ac.uk/sbmlharvester/
  • SBML → BioPAX conversion (sbml2biopax): http://www.ebi.ac.uk/compneur-srv/sbml/converters/SBMLtoBioPax.html
  • CellML and OWL, Wimalaratne et al.: doi:10.1093/bioinformatics/btp391
Thank you!
  • And thanks also to
      • Phil Lord and Neil Wipat, my PhD supervisors
      • Biology, Neurosciences & Computing Group at the Computing Science Department, Newcastle University
      • CISBAN
      • BBSRC
Contact Me
  • @allysonlister
  • http://themindwobbles.wordpress.com
