Bio2RDF presentation at Combine 2012

Bio2RDF's namespace SPARQL endpoint

Published in: Technology, Education

  1. Bio2RDF's namespace SPARQL endpoint. François Belleau, Centre de Biologie Computationnelle du CRCHUQ
  2. You know them? How can we help people navigate the huge cloud of bioinformatics databases?
  3. 2005: BioPAX.gif next to the Semantic Web image, the vision of Tim Berners-Lee
  4. Linked Data in 2007
  5. Bio2RDF contribution in 2009
  6. 2011 Linked Data cloud
  7. Databases of database names
     ● PathGuide
     ● Bioinformatics.ca Links Directory
     ● Annual NAR Database issue
     ● GO, UniProt, GenBank cross-reference lists
     ● LSRN initiative
     ● MIRIAM EBI project
     ● BioPAX data provider community
     ● Bio2RDF Linked Data space
  8. Two interesting questions
     ● Which namespaces are the most popular for identifying a database?
     ● How far along is the BioPAX community in adopting the new MIRIAM namespace standard?
  9. Which namespaces are the most popular for identifying a database?
  10. Namespace collection used by the BioPAX data provider community.
  11. How far along is the BioPAX community in adopting the new MIRIAM namespace standard?
  12. How did we do this? To answer a complex question, we first need to build the database that can potentially answer it: a semantic mashup.
  13. "A mashup, in web development, is a web page, or web application, that uses and combines data, presentation or functionality from two or more sources to create new services. The term implies easy, fast integration, frequently using open Application programming interfaces (API) and data sources to produce enriched results that were not necessarily the original reason for producing the raw source data." http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)
  14. Building a mashup is a lot easier when using Semantic Web technologies like RDF and SPARQL, which are designed for data interoperability.
  15. A three-step method
     ● Get the data from the data provider and transform it into RDF; we use Talend, an open-source, Eclipse-based ETL tool.
     ● Load the data into a triplestore; many are available (Virtuoso, Sesame, Jena, store, Mulgara, etc.) to host your mashup.
     ● Explore the new dataset using a specialised user interface (RelFinder, Virtuoso facet browser).
     ● Design your SPARQL query and get the answer.
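The first step of the method on slide 15 (transforming provider data into RDF) can be sketched in a few lines of Python. This is only an illustration, not the Talend workflow used in the talk: the base URI `http://example.org/namespace/`, the record fields (`prefix`, `name`, `homepage`) and the choice of `rdfs:label` and `foaf:homepage` as predicates are all assumptions made for the example.

```python
# Minimal sketch: serialise one database description as N-Triples lines.
# BASE and the record layout are hypothetical, not taken from Bio2RDF.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_HOMEPAGE = "http://xmlns.com/foaf/0.1/homepage"
BASE = "http://example.org/namespace/"


def escape_literal(text):
    """Escape backslashes, double quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def record_to_ntriples(record):
    """Return the N-Triples lines describing one namespace record."""
    subject = f"<{BASE}{record['prefix']}>"
    lines = [f'{subject} <{RDFS_LABEL}> "{escape_literal(record["name"])}" .']
    if record.get("homepage"):
        lines.append(f"{subject} <{FOAF_HOMEPAGE}> <{record['homepage']}> .")
    return lines


if __name__ == "__main__":
    demo = {"prefix": "go", "name": "Gene Ontology",
            "homepage": "http://www.geneontology.org/"}
    print("\n".join(record_to_ntriples(demo)))
```

The resulting lines can be written to a file and loaded into whichever triplestore is chosen for the second step.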
  16. Data provider xref resources
     ● http://www.ebi.ac.uk/miriam/main/ (XML format)
     ● Bio2RDF DNS zone description file (text)
     ● http://lsrn.org/ (RDF/XML)
     ● http://www.geneontology.org/doc/GO.xrf_abbs (key/value format)
     ● http://www.uniprot.org/docs/dbxref (key/value format)
     ● http://www.ncbi.nlm.nih.gov/genbank/collab/db_xref/ (HTML)
     ● 12 BioPAX providers (Reactome, Biomodels, Biocyc, Panther, INOH, etc.)
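Two of the sources on slide 16 are described as key/value files. A small parser for that kind of file might look like the sketch below. It assumes records are stanzas of `key: value` lines separated by blank lines, with `!` starting a comment line; the field names and values in the sample are made up for illustration and are not copied from the real files.

```python
# Minimal sketch: parse "key: value" stanzas separated by blank lines.
# The stanza layout and the "!" comment marker are assumptions.
SAMPLE = """\
! illustrative comment line
abbreviation: EXAMPLE_DB
database: Example Database
generic_url: http://example.org/

abbreviation: OTHER_DB
database: Another Example
"""


def parse_stanzas(text):
    """Return one dict per stanza; a repeated key keeps its last value."""
    records, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("!"):
            continue  # skip comment lines
        if not line:
            if current:  # a blank line closes the current stanza
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


if __name__ == "__main__":
    for record in parse_stanzas(SAMPLE):
        print(record)
```

Splitting on the first colon only is the main design choice here, because values such as URLs contain colons themselves.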
  17. Lesson #1: Produce RDF triples with a professional ETL tool.
  18. Talend ETL: free, open-source software. http://www.talend.com/index.php
  19. Talend workflows to convert the GenBank HTML page and the MIRIAM XML to triples
  20. Lesson #2: Publish with a SPARQL endpoint (to get a free 5-star cup).
  21. Load RDF triples into a triplestore (we use OpenLink Virtuoso). http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/
  22. Full-text search: http://namespace.bio2rdf.org/fct/
  23. Discover entity names and browse the triplestore
  24. Lesson #3: Consume as you like: HTTP GET to obtain RDF from a URI, a SPARQL endpoint, SOAP services returning RDF, new Semantic Web software...
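The first option on slide 24, an HTTP GET that returns RDF for a URI, relies on content negotiation: the client states the RDF serialisation it wants in the `Accept` header. The sketch below uses only the Python standard library. The URI in the demo is a placeholder, and whether a given server honours `application/rdf+xml` is an assumption that has to be checked against that server.

```python
# Minimal sketch: ask a Linked Data server for RDF via the Accept header.
from urllib.request import Request, urlopen


def build_rdf_request(uri, accept="application/rdf+xml"):
    """Build a GET request that asks for an RDF serialisation of the resource."""
    return Request(uri, headers={"Accept": accept}, method="GET")


def fetch_rdf(uri, accept="application/rdf+xml", timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_rdf_request(uri, accept), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Placeholder URI for illustration; replace it with a real resource URI.
    request = build_rdf_request("http://example.org/resource/demo:1")
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Keeping request construction separate from the network call makes the negotiation logic easy to inspect without contacting any server.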
  25. The SPARQL query needed to draw the previous graph using the ManyEyes service: http://namespace.bio2rdf.org/sparql
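The query shown on slide 25 is not part of this transcript, so the following is only a guess at its general shape: count how many providers use each namespace and rank the namespaces by that count. The `ex:usesNamespace` predicate and the `http://example.org/vocab#` vocabulary are hypothetical. The endpoint URL is the one given on the slide, and the result parsing follows the standard SPARQL JSON results layout.

```python
# Minimal sketch: build a SPARQL protocol GET URL and read JSON results.
# The vocabulary used in QUERY is hypothetical, not the mashup's real schema.
import json
from urllib.parse import urlencode

ENDPOINT = "http://namespace.bio2rdf.org/sparql"  # endpoint named on the slide

QUERY = """\
PREFIX ex: <http://example.org/vocab#>
SELECT ?namespace (COUNT(DISTINCT ?provider) AS ?providers)
WHERE { ?provider ex:usesNamespace ?namespace . }
GROUP BY ?namespace
ORDER BY DESC(?providers)
LIMIT 20
"""


def build_query_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET."""
    return endpoint + "?" + urlencode({"query": query})


def rows_from_json(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    data = json.loads(payload)
    return [{name: cell["value"] for name, cell in binding.items()}
            for binding in data["results"]["bindings"]]


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
```

To receive JSON, a client would normally send the request with the header `Accept: application/sparql-results+json`; the rows returned by `rows_from_json` can then be passed to any charting tool.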
  26. Use a SOAP service: http://namespace.bio2rdf.org/bio2rdf/services.wsdl
  27. Discover your relations graphically with RelFinder: http://www.visualdataweb.org/relfinder.php
  28. Conclusion
     ● Building a mashup is easy with current software; we still need the RDF data.
     ● One SPARQL query against the proper triplestore (Bio2RDF's namespace mashup) could answer our two initial questions.
     ● Why not consider publishing your own SPARQL endpoint to make semantic hackers' lives easier?
  29. Acknowledgements
     ● Bio2RDF is a community project available at http://bio2rdf.org
     ● The community can be joined at https://groups.google.com/forum/?fromgroups#!forum/bio2rdf
     ● This work was done under the supervision of Dr Arnaud Droit, assistant professor and director of the Centre de Biologie Computationnelle du CRCHUQ at Laval University, where a mirror of Bio2RDF is hosted.
     ● Michel Dumontier, from the Dumontier Lab at Carleton University, also hosts a Bio2RDF server and currently leads the project.
     ● Thanks to all the members of the Bio2RDF community, especially Marc-Alexandre Nolin and Peter Ansell, the initial developers.
  30. Come to Montreal in July 2013 with your SPARQL endpoint and get a FREE cup! http://www.unbsj.ca/sase/csas/data/semantic-trilogy-2013/
