
The Ondex Data Integration Framework


Title: The Ondex Data Integration Framework
Author: Jan Taubert

Published in: Technology, Education


  1. 1. The ONDEX data integration framework. BOSC2007, Vienna, 20.07.2007. Jan Taubert ( [email_address] ), Rothamsted Research, UK
  2. 2. Summary
       • ONDEX: framework for large-scale data integration, text mining & graph analysis
       • Java API and standalone application
       • License: GNU General Public License
       • Project status: early alpha
       • Statistics: 814 files, 159572 lines
       http://ondex.sourceforge.net
  3. 3. Members
       • University of Bielefeld, Bielefeld, Germany
       • University of Koblenz, Koblenz, Germany
       • University of Nottingham, Nottingham, Nottinghamshire, UK
       • University of Tromsø, Tromsø, Norway
       • University of Wageningen, Wageningen, Netherlands
       • Rothamsted Research, Harpenden, Hertfordshire, UK
       Current members: Jan Baumbach, Sonja Ernst, Keywan Hassani-Pak, Matthew Hindle, Berend Hoekman, Jacob Köhler, Artem Lysenko, Stephan Philippi, Chris Rawlings, Jan Taubert, Paul Verrier, Jochen Weile, Rainer Winnenburg, Tully Yates
       Former members: Jessica Butz, Sebastian Elsner, Ina Kupp, Alexander Rüegg, Klaus Peter Sieren, Andre Skusa, Michael Specht
  4. 4. Based on
       • Java J2SE 5.0
       • Berkeley DB Java Edition
       • XFire SOAP framework
       • Jetty WebServer
       • Lucene text search engine
       • Taverna for workflows
  5. 5. Motivation [diagram]: ONDEX combines 100s of bio-databases and large experimental data to yield new insights
  6. 6. Everything is a network … in which nodes and edges can have different properties. [Diagram labels: enzyme kinetics, protein interactions, metabolic pathways, protein structure, relation properties, ontologies]
  7. 7. ONDEX: Graph of Concepts and Relations
       • Ontology-based graph: an ontology of Concept Classes, Relation Types and additional Properties
       • Biology: protein interaction network; protein-ligand interaction network
       • Properties: compound name, protein sequence, protein structure, cellular component, KM-value, pH optimum …
       • Example: Concepts of Concept Class Protein and Ligand, linked by Relations of Relation Type interact
  8. 9. One example application: microarray experiment result analysis [diagram: map onto 100s of bio-databases]
  9. 10. [Diagram of concept classes: Comp, Protein, Gene, Enzyme, EC, Treatment, Reaction, Pathway]
  10. 11. Treatments from DRASTIC; pathways from KEGG
  11. 12. How to contribute
        Three steps:
        • Try it out and submit the bugs you find
        • Suggest or implement your improvements
        • Become a contributor and submit improvements
        Contributors will be acknowledged on the project website and in publications involving their work. Contributors are welcome to publish their work on ONDEX under their own names.
        http://ondex.sourceforge.net
  12. 13. What to contribute
        • Good: write a flat-file parser for ONDEX
        • Better: provide your database in OXL (see Taubert et al. (2007) “Exchange of integrated datasets – the OXL format”, in press, IB2007)
        • Also welcome: provide your database in another standard (BioPax, SBML, XGMML), although this may result in loss of information
        http://ondex.sourceforge.net
  13. 14. What to contribute
        • Algorithms: needed for the alignment of integrated data
        • Core: improve the persistency layer of the ontology-based graph
        • Exporter: provide your own exchange standard
        • Webservices: increase compatibility
        http://ondex.sourceforge.net
  14. 15. What to contribute
        • Support: OXL in your application
        • Connect: import from the web service or directly from the Core API
        • Algorithms: graph analysis using the ONDEX Visualisation and Analysis Tool Kit (OVTK)
        • Feedback & feature requests: mailing lists and Sourceforge.net
        http://ondex.sourceforge.net
  15. 16. Sourceforge.net
        • Jun 06 – Jun 07: 1828 downloads, 15289 page views
        • Current SF.net rank: 1074
        • Subversion activity Jan 07 – Jun 07: 8904 reads, 2072 writes, 4821 file uploads
        • Developer mailing list: [email_address]
        • User mailing list: [email_address]
        • Current release: 0.9alpha1
  16. 17. Publications
        • J Taubert, R Winnenburg, M Hindle, J Weile, J Baumbach, S Philippi, C Rawlings and J Köhler (2007) "Data integration, information filtering and knowledge extraction with ONDEX", paper in preparation
        • J Taubert, K P Sieren, M Hindle, B Hoekman, R Winnenburg, S Philippi, C Rawlings and J Köhler (2007) "Exchange of integrated datasets – the OXL format", submitted paper, 4th Integrative Bioinformatics workshop (IB2007)
        • Jacob Köhler, Stephan Philippi, Michael Specht and Alexander Rüegg (2006) "Ontology based text indexing and querying for the semantic web", Knowledge-Based Systems, Volume 19, Issue 8
        • Jacob Köhler, Jan Baumbach, Jan Taubert, Michael Specht, Andre Skusa, Alexander Rüegg, Chris Rawlings, Paul Verrier and Stephan Philippi (2006) "Graph-based analysis and visualization of experimental results with ONDEX", Bioinformatics 22(11)
        • Skusa, A., Rüegg, A., Köhler, J. (2005) "Extraction of biological networks from scientific literature", Briefings in Bioinformatics 6(3)
        • Köhler, J., Rawlings, C., Verrier, P., Mitchell, R., Skusa, A., Rüegg, A. and Philippi, S. (2004) "Linking experimental results, biological networks and sequence analysis methods using Ontologies and Generalized Data Structures", In Silico Biol, Volume 5, Special Issue: Ontology and Genome, Manuscript number in online journal: 0005
  17. 18. Acknowledgements
        • Centre for Mathematical and Computational Biology, Department of Biomathematics and Bioinformatics, Rothamsted Research
        • Dr Jacob Köhler, Principal Investigator
        • Prof Chris Rawlings, Head of Department
        • Rothamsted Research is supported by the BBSRC
        • Travel grants and scholarships by
  18. 19. 4th Integrative Bioinformatics workshop, 10th to 12th September 2007, University of Ghent, Belgium
        http://www.rothamsted.bbsrc.ac.uk/bab/conf/ib07/
        Invited speakers:
        • Prof Carole Goble, School of Computer Science, University of Manchester, UK
        • Prof Søren Brunak, BioCentrum-DTU, Technical University of Denmark, Denmark
        • Dr David Searls, Senior Vice President, Informatics, GlaxoSmithKline Pharmaceuticals, USA
        • Dr Luis Serrano, EMBL, Heidelberg, Germany
        Organising committee:
        • Prof Ralf Hofestädt, University of Bielefeld, Germany (Co-chair)
        • Dr Jacob Köhler, Rothamsted Research, UK (Co-chair)
        • Prof Martin Kuiper, University of Ghent, Belgium (Local organisation)
        • Paul Verrier, Rothamsted Research, UK (Local organisation)
        Poster submission deadline: 27th August 2007
        Registration deadline: 13th August 2007
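
The ontology-based graph from the "Graph of Concepts and Relations" slide, together with the flat-file parser suggested under "What to contribute", can be illustrated with a small sketch. This is Python for brevity and is not the ONDEX Java API: the names `Concept`, `Relation`, `OntologyGraph` and `load_flatfile`, and the five-column tab-separated layout, are assumptions made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A node; concept_class names an ontology entry such as "Protein"."""
    id: int
    concept_class: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relation:
    """A typed edge between two concept ids, e.g. relation type "interact"."""
    source: int
    target: int
    relation_type: str
    properties: dict = field(default_factory=dict)


class OntologyGraph:
    """Graph that only accepts concept classes and relation types declared up front."""

    def __init__(self, concept_classes, relation_types):
        self.concept_classes = set(concept_classes)
        self.relation_types = set(relation_types)
        self.concepts = {}   # concept id -> Concept
        self.relations = []  # list of Relation, in insertion order

    def add_concept(self, concept_class, **properties):
        if concept_class not in self.concept_classes:
            raise ValueError(f"unknown concept class: {concept_class}")
        concept = Concept(len(self.concepts) + 1, concept_class, properties)
        self.concepts[concept.id] = concept
        return concept

    def add_relation(self, source, target, relation_type, **properties):
        if relation_type not in self.relation_types:
            raise ValueError(f"unknown relation type: {relation_type}")
        relation = Relation(source.id, target.id, relation_type, properties)
        self.relations.append(relation)
        return relation

    def neighbours(self, concept, relation_type=None):
        """Concepts linked to `concept` in either direction, optionally filtered by type."""
        found = []
        for rel in self.relations:
            if relation_type is not None and rel.relation_type != relation_type:
                continue
            if rel.source == concept.id:
                found.append(self.concepts[rel.target])
            elif rel.target == concept.id:
                found.append(self.concepts[rel.source])
        return found


def load_flatfile(lines, graph):
    """Parse a hypothetical layout: name_a, class_a, relation_type, name_b, class_b (tab-separated)."""
    index = {}  # (name, concept class) -> Concept, so each entity is created once
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name_a, class_a, rel_type, name_b, class_b = line.split("\t")
        for key in ((name_a, class_a), (name_b, class_b)):
            if key not in index:
                index[key] = graph.add_concept(key[1], name=key[0])
        graph.add_relation(index[(name_a, class_a)], index[(name_b, class_b)], rel_type)
    return index


# Usage: the protein and ligand interaction example from the slide.
graph = OntologyGraph({"Protein", "Ligand"}, {"interact"})
rows = [
    "P1\tProtein\tinteract\tP2\tProtein",
    "P1\tProtein\tinteract\tL1\tLigand",
]
index = load_flatfile(rows, graph)
partners = graph.neighbours(index[("P1", "Protein")], "interact")
print([c.properties["name"] for c in partners])
```

Declaring the concept classes and relation types before loading data mirrors the idea that the ontology constrains the graph: a row with an undeclared class or type is rejected rather than silently stored.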
