From data to knowledge – the Ondex System for integrating Life Sciences data sources


ISMB'08 poster

Published in: Technology, Education
C. Rawlings¹, A. Karp¹, C. Goble², R. Stevens², S. Ananiadou², D. Kell³, A. Wipat⁴, D. Wilkinson⁴, P. Lord⁴, D. Lydall⁴, C. Canevet¹

¹ Rothamsted Research, Harpenden
² University of Manchester, School of Computer Science
³ University of Manchester, School of Chemistry & Manchester Centre for Integrative Systems Biology
⁴ Newcastle University, Centre for Integ

Summary

• Funded under SABR initiative (2008-2011)
• Create a robust, extensible data integration system for supporting systems biology research.
• Extend the ONDEX data integration platform and support four demonstrator projects involving three CISB centres.

ONDEX data integration platform

The ONDEX system stores data as a graph of Concepts and Relations. Concepts represent data entities and Relations link these entities together. Additional semantic annotation is added using concept classes, relation types, evidences and controlled vocabularies.

Data is imported by data source specific parsers. Mapping methods create new Relations between Concepts. Local and global consistency checks are performed. Data integration can be configured and executed using web services via Taverna.

The ONDEX system is open source and written in Java. It can be obtained at

Figure: ONDEX architecture. Stages: Data Input, Data Integration, Data Analysis. Labels: Biological Databases, Experimental Data, Ontologies & Free Text, Importer, Parses, Graph of Concepts & Relations, Data alignment methods (e.g. accession based mapping, BLAST mapping), Query API, XML Export, Web services, OVTK, OVTK2.

Technology Development & Project Overview

• Extend core data structures, interfaces and data integration framework to support probabilistic relationships.
• Adapt NaCTeM text mining tools to use semantic parsing to extract more complex relationships from bio-text sources.
• Upgrade the workflow management to incorporate new developments from myGrid.
• Develop the user interfaces and other components to support comparative analyses.
• Exploit new data structures with statistical data analysis methods and associated visualization methods.

Figure: NaCTeM text mining. Labels: raw (unstructured) text, part-of-speech tagging, GENIA, named entity recognition, deep syntactic parsing, ENJU, annotated (structured) text, Natural Language Processing, lexicon, ontology.

Demonstrator Projects

• Identify new genetic and molecular targets to improve bioenergy crops (Rothamsted)
• Integrate different yeast metabolome models (Manchester)
• Support studies of telomere function relating to ageing research in yeast (Newcastle)
• Support studies of signalling pathways controlling circadian rhythms (Edinburgh)

Figure: demonstrator projects. Labels: Bioenergy crop improvement; Yeast metabolome models (Different metabolic networks, Data Integration (ONDEX), Consensus metabolic network); Telomere function relating to ageing; Protein Networks; New Insights.
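
The graph model described on the poster (Concepts linked by Relations, annotated with concept classes, relation types and evidences) and the idea of an accession based mapping method can be illustrated with a minimal sketch. This is not the ONDEX API, which is written in Java; every class, field and function name below (Concept, Relation, Graph, map_by_accession, the "equivalent" relation type, the example database labels) is a hypothetical choice made for illustration only.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Concept:
    """A data entity, e.g. one protein record imported from one database."""
    id: int
    concept_class: str                    # semantic type, e.g. "Protein"
    source: str                           # label of the originating data source
    accessions: frozenset = frozenset()   # identifiers attached to the entity


@dataclass(frozen=True)
class Relation:
    """A typed link between two Concepts, annotated with its evidence."""
    from_id: int
    to_id: int
    relation_type: str
    evidence: str


@dataclass
class Graph:
    """Hypothetical in-memory store for Concepts and Relations."""
    concepts: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_concept(self, concept_class, source, accessions=()):
        concept = Concept(len(self.concepts) + 1, concept_class, source,
                          frozenset(accessions))
        self.concepts[concept.id] = concept
        return concept

    def add_relation(self, a, b, relation_type, evidence):
        relation = Relation(a.id, b.id, relation_type, evidence)
        self.relations.append(relation)
        return relation


def map_by_accession(graph):
    """Link Concepts of the same class from different sources that share
    at least one accession. Returns the newly created Relations."""
    created = []
    for a, b in combinations(list(graph.concepts.values()), 2):
        same_class = a.concept_class == b.concept_class
        different_source = a.source != b.source
        if same_class and different_source and a.accessions & b.accessions:
            created.append(graph.add_relation(
                a, b, "equivalent", "accession based mapping"))
    return created


if __name__ == "__main__":
    graph = Graph()
    graph.add_concept("Protein", "DB_A", ["P12345"])
    graph.add_concept("Protein", "DB_B", ["P12345", "X1"])
    graph.add_concept("Gene", "DB_B", ["P12345"])
    for relation in map_by_accession(graph):
        print(relation)
```

In this sketch the mapping step only adds Relations and never merges Concepts, so the imported data from each source stays intact and each new link carries the evidence that produced it. The pairwise comparison is quadratic in the number of Concepts; a real system would presumably index accessions first.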