Possibilities for integrating model-related data in computational biology (DILS 2013)

Published in: Education, Technology

  1. Possibilities for Integrating Model-related Data in Computational Biology. Databases in Life Sciences, Montreal, July 2013. Dagmar Waltemath, University of Rostock, Germany; Nicolas Le Novère, Babraham Institute, UK; Michel Dumontier, Carleton University, Canada.
  2. Introduction. Fig.: DOI: 10.1038/35002125. Slide footer: 13-07-12, Integrating model-related data.
  3. Introduction. Fig.: DOI: 10.1038/35002125
  4. Introduction. Plot: number and size of models over time. Fig.: DOI: 10.1038/35002125
  5. Introduction. Model reuse, result reproducibility. Fig.: DOI: 10.1038/35002125
  6. Introduction
  7. Introduction: (1) How can we distribute models with all information necessary to reuse them (MIRIAM)? (2) How can we effectively manage different types of model-related data? (3) How can we link model-related data to the rest of the world?
  8. Part 1: Distributing models (Archive). Frank Bergmann, Nicolas Le Novère
  9. Part 1: Distributing models. The COMBINE archive v0.1: a single ".zip" file that bundles models and model-related data. http://co.mbine.org/documents/archive
  10. Part 1: Distributing models. Archive contents: (1) a manifest file, "manifest.xml"; (2) all described files; (3) a metadata file, "metadata.*"; (4) remaining files. The archive holds all documents necessary for the description of a model and all associated data and procedures. In the future: also references to documents. Example manifest:
      <?xml version="1.0" encoding="utf-8"?>
      <omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
        <content location="./manifest.xml" format="http://identifiers.org/combine.specifications/omex-manifest"/>
        <content location="./model/model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
        <content location="./simulation.xml" format="http://identifiers.org/combine.specifications/sedml"/>
        <content location="./article.pdf" format="application/pdf"/>
        <content location="./metadata.rdf" format="http://identifiers.org/combine.specifications/omex-metadata"/>
      </omexManifest>
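
The bundling idea on slides 9 and 10 can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the COMBINE reference implementation: the function names `build_manifest` and `build_archive` are hypothetical, the file contents are placeholders, and the sketch writes no "metadata.*" file. Only the namespace and format URIs are taken from the manifest shown on the slide.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# URIs copied from the manifest on slide 10.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
SPEC = "http://identifiers.org/combine.specifications/"


def build_manifest(entries):
    """Return manifest.xml as bytes for a list of (location, format) pairs."""
    ET.register_namespace("", OMEX_NS)  # serialize with a default namespace
    root = ET.Element("{%s}omexManifest" % OMEX_NS)
    # The manifest lists itself first, as in the slide's example.
    all_entries = [("./manifest.xml", SPEC + "omex-manifest")] + list(entries)
    for location, fmt in all_entries:
        ET.SubElement(root, "{%s}content" % OMEX_NS,
                      {"location": location, "format": fmt})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_archive(files):
    """Bundle files into one zip; files maps location -> (format, bytes)."""
    entries = [(loc, fmt) for loc, (fmt, _) in files.items()]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.xml", build_manifest(entries))
        for loc, (_, data) in files.items():
            # Strip the leading "./" so zip member names are plain paths.
            name = loc[2:] if loc.startswith("./") else loc
            zf.writestr(name, data)
    return buf.getvalue()


# Hypothetical placeholder contents, for illustration only.
example_files = {
    "./model/model.xml": (SPEC + "sbml", b"<sbml/>"),
    "./simulation.xml": (SPEC + "sedml", b"<sedML/>"),
}
archive_bytes = build_archive(example_files)
```

Writing `archive_bytes` to disk yields the single file that slide 9 describes; a consumer can read "manifest.xml" first to learn the format of every other member.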
  11. Part 2: Managing models. Ron Henkel
  12. Part 2: Managing models. Neo4j database; model-to-graph mapping; rich relations (http://biomodels.net/qualifiers); links to annotations. Example queries: "Which models are annotated with 'Adenosine tri-phosphate'?" and "Which models contain reactions with ATP as reactant and ADP as product?" Figure (graph): nodes Document, Model, P, E, C, R, S; annotation nodes SBO:0000268, uniprot:P07101, uniprot:Q03393, GO:0005737, HGNC:8582; edge labels is, isVersionOf, isEncodedBy, asProduct, asReactant, asModifier. Fig.: Henkel et al. (2012) INFORMATIK 2012, Braunschweig
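
The two example queries on slide 12 can be illustrated with a tiny in-memory graph. The slide's system uses a Neo4j database; the sketch below is a plain-Python stand-in, not that implementation. The class `ModelGraph`, the "contains" relation, the edge directions, and all node names and annotation identifiers are assumptions made for illustration. Only the relation names "is", "asReactant" and "asProduct" come from the slide.

```python
class ModelGraph:
    """A set of (source, relation, target) triples with two lookups."""

    def __init__(self):
        self.triples = set()

    def add(self, source, relation, target):
        self.triples.add((source, relation, target))

    def sources(self, relation, target):
        return {s for s, r, t in self.triples if r == relation and t == target}

    def targets(self, source, relation):
        return {t for s, r, t in self.triples if s == source and r == relation}


def models_annotated_with(graph, term):
    """Models containing any entity linked to the annotation term via 'is'."""
    entities = graph.sources("is", term)
    return {m for e in entities for m in graph.sources("contains", e)}


def models_converting(graph, reactant_term, product_term):
    """Models with a reaction that consumes one term and produces the other."""
    reactants = graph.sources("is", reactant_term)
    products = graph.sources("is", product_term)
    consuming = {r for s in reactants for r in graph.targets(s, "asReactant")}
    producing = {r for s in products for r in graph.targets(s, "asProduct")}
    return {m for r in consuming & producing
            for m in graph.sources("contains", r)}


# Hypothetical toy data: model_A turns ATP into ADP, model_B does the reverse.
g = ModelGraph()
for model, atp, adp, reaction in [("model_A", "atp_a", "adp_a", "r_a"),
                                  ("model_B", "atp_b", "adp_b", "r_b")]:
    for node in (atp, adp, reaction):
        g.add(model, "contains", node)
    g.add(atp, "is", "annotation:ATP")  # placeholder annotation identifiers
    g.add(adp, "is", "annotation:ADP")
g.add("atp_a", "asReactant", "r_a")
g.add("adp_a", "asProduct", "r_a")
g.add("adp_b", "asReactant", "r_b")
g.add("atp_b", "asProduct", "r_b")

annotated = models_annotated_with(g, "annotation:ATP")
converting = models_converting(g, "annotation:ATP", "annotation:ADP")
```

Both models are annotated with the ATP term, but only `model_A` satisfies the structural query, which is the distinction the slide draws between annotation lookup and queries over reaction roles.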
  13. Part 2: Managing models. Lucene-based ranked retrieval. Example request: "Give me the best matching model published about the Cell Cycle and covering forms of cdc." Lucene query: "cdc*" AND "Cell Cycle". http://www.ebi.ac.uk/biomodels-demo/ Henkel et al. (2010), Bioinformatics. Figure: the model graph from slide 12, repeated. Fig.: Henkel et al. (2012) INFORMATIK 2012, Braunschweig
  14. Part 2: Managing models. Representing simulation descriptions, and other types of model-related data. Example request: "Give me all possible simulations that show the dependency of the Cell Cycle on the concentration of cdc25." Fig.: Henkel et al. (2012) INFORMATIK 2012, Braunschweig
  15. Part 3: Integrating model data
  16. Part 3: Integrating model data. Bio2RDF: at the heart of Linked Data for the Life Sciences. Free and open source; based on Semantic Web standards; billions of interlinked statements from dozens of conventional and high-value datasets; partnerships with EBI, NCBI, DBCLS, NCBO, OpenPHACTS, and commercial tool providers. Coverage: chemicals/drugs/formulations; genomes/genes/proteins, domains; interactions, complexes & pathways; BioModels; animal models and phenotypes; disease, genetic markers, treatments; terminologies & publications.
  17. Part 3: Integrating model data. Example request: "Give me all reactions in BioModels Database that represent protein catabolic processes." Query:
      # Get all biochemical reactions in BioModels that are kinds of
      # "protein catabolic process", as defined by the Gene Ontology (in the BioPortal endpoint)
      SELECT ?go ?label count(distinct ?x)
      WHERE {
        ?go rdfs:label ?label .
        ?go rdfs:subClassOf ?tgo OPTION (TRANSITIVE) .
        ?tgo rdfs:label ?tlabel .
        FILTER regex(?tlabel, "^protein catabolic process")
        service <http://biomodels.bio2rdf.org/sparql> {
          ?x <http://bio2rdf.org/biopax_vocabulary:identical-to> ?go .
          ?x a <http://www.biopax.org/release/biopax-level3.owl#BiochemicalReaction> .
        }
      }
      Result:
      Gene Ontology Annotation                                          Number of Reactions
      protein catabolic process [go:0030163]                            51
      cellular protein catabolic process [go:0044257]                   26
      modification-dependent protein catabolic process [go:0019941]     1
      beta-amyloid formation [go:0034205]                               1
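
What the query on slide 17 computes can be mimicked on toy data: collect the terms reachable through transitive subclass links below a root term, then count the distinct reactions attached to each. The Python sketch below is only an illustration of that logic. It does not contact any SPARQL endpoint, the term and reaction names are hypothetical, and including the root term itself in the closure is an assumption.

```python
from collections import Counter, defaultdict


def descendants(subclass_of, root):
    """Return root plus every term that is transitively a subclass of it.

    subclass_of is an iterable of (child, parent) pairs.
    """
    children = defaultdict(set)
    for child, parent in subclass_of:
        children[parent].add(child)
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def count_reactions(subclass_of, identical_to, root):
    """Count distinct reactions per term at or below root.

    identical_to is an iterable of (reaction, term) pairs, standing in
    for the identical-to links used in the slide's query.
    """
    terms = descendants(subclass_of, root)
    counts = Counter()
    for reaction, term in set(identical_to):  # set() gives count(distinct)
        if term in terms:
            counts[term] += 1
    return dict(counts)


# Hypothetical toy hierarchy: C is a subclass of B, B of A; X is unrelated.
toy_subclass_of = [("term:B", "term:A"), ("term:C", "term:B"),
                   ("term:X", "term:Y")]
toy_identical_to = [("r1", "term:A"), ("r2", "term:A"), ("r3", "term:C"),
                    ("r4", "term:X"), ("r1", "term:A")]
toy_counts = count_reactions(toy_subclass_of, toy_identical_to, "term:A")
```

In the slide's setting the hierarchy comes from the Gene Ontology and the reaction links from the BioModels endpoint, so the two inputs live in different datasets; the federated `service` clause is what joins them.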
  18. Summary (approach: features; purpose).
      COMBINE archive: file bundle; easy access to all model-related data through one single file. Purpose: shipping files.
      Graph-DB (MORRE): network of interrelated nodes; IR techniques easily applicable; no schema; links models and simulations. Purpose: managing existing model data.
      BIO2RDF: semantic integration of knowledge; automated reasoning; no schema; linking into LOD. Purpose: full integration.
  19. Thank you. http://co.mbine.org/events/COMBINE_2013
