Context
What is known about PARP family proteins involved in Reactome
pathways? To answer this question, we propose building a semantic
mashup with two open source tools: the OpenLink Virtuoso triplestore
and Talend Open Studio for data integration.
Our goal is to help solve the data integration problem, a reality in
bioinformatics. Taverna and Galaxy workflows have been very
successful in addressing this problem, but they still lack support for
Semantic Web technologies such as RDF and SPARQL. BioMart has also
been successful by offering a global model to share and query data.
The Bio2RDF project has the same goal, but it instead uses a Semantic
Web strategy based on the distributed RDF graph of Linked Data and
public SPARQL endpoints.
Methodology
To implement our strategy, we added Semantic Web technology and Life
Science linked data sources to Talend. We created two collections of
components. The first, Talend4SW, integrates the Virtuoso triplestore
into Talend and offers simple utilities to transform RDF data. The
second, Talend4Bio2RDF, is used to fetch RDF data from Life Science
SPARQL endpoints. Connected together in a workflow, these components
query the Bio2RDF release 2 endpoints, the UniProt REST service and
EBI’s SPARQL endpoints. They all consume the new Bio2RDF REST services
available at http://bio2rdf.org.
Using these components in a Talend workflow, we populate a triplestore
by fetching RDF data directly from the web. Each triple is stored in a
local Virtuoso triplestore, which is then queried using SPARQL to
discover new URIs to dereference. Once the needed data has been
collected, a final SPARQL query returns the answer to our initial
question.
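The fetch, store and discover loop described above can be sketched in a few lines of Python. This is only an illustration, not the Talend workflow itself: a Python set stands in for the local Virtuoso triplestore, a scan of object URIs stands in for the SPARQL discovery query, and `crawl`, `fake_fetch`, `FAKE_WEB` and the example.org URIs are all hypothetical names introduced here.

```python
def crawl(seed_uris, fetch, max_rounds=3):
    """Populate a local triple set by dereferencing newly discovered URIs."""
    store = set()   # assumption: stands in for the local Virtuoso triplestore
    seen = set()    # URIs that have already been dereferenced
    frontier = list(seed_uris)
    for _ in range(max_rounds):
        new_triples = set()
        for uri in frontier:
            if uri in seen:
                continue
            seen.add(uri)
            new_triples.update(fetch(uri))
        store |= new_triples
        # Stand-in for the SPARQL discovery query: every object that looks
        # like a URI and has not been fetched yet goes into the next round.
        frontier = sorted(
            o for (_, _, o) in new_triples
            if o.startswith("http") and o not in seen
        )
        if not frontier:
            break
    return store


# Hypothetical in-memory "web" so the sketch runs without network access.
FAKE_WEB = {
    "http://example.org/protein/1": [
        ("http://example.org/protein/1",
         "http://example.org/inPathway",
         "http://example.org/pathway/9"),
    ],
    "http://example.org/pathway/9": [
        ("http://example.org/pathway/9",
         "http://www.w3.org/2000/01/rdf-schema#label",
         "Example pathway"),
    ],
}


def fake_fetch(uri):
    """Return the triples 'published' at a URI, or nothing if unknown."""
    return FAKE_WEB.get(uri, [])


store = crawl(["http://example.org/protein/1"], fake_fetch)
```

In a real run, `fetch` would dereference the URI over HTTP and parse the returned RDF, and the discovery step would be a SPARQL query against the triplestore.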
Results
This semantic workflow instantiates the database needed to answer the
initial query in a few steps. The result: PARP1_HUMAN is the only
protein of the PARP family present in Reactome’s pathways.
These new Talend components can be imported from Talend Exchange,
http://www.talendforge.org/exchange. The Talend workflow used to
answer the PARP question can be downloaded from myExperiment,
http://www.myexperiment.org/workflows/4050.html
Building mashup from Linked Data
using Bio2RDF’s Talend components
François Belleau, Vincent Emonet, Arnaud Droit
Centre de Biologie Computationnelle
Centre de recherche du CHUQ
The PI of this project is Dr Arnaud Droit, director of the Centre de
Biologie Computationnelle of the CRCHUQ at Université Laval.
http://bio2rdf.org
The tBio2RDFRequest component fetches an RDF graph from the
describe, links and search Bio2RDF REST services. Results are
available in different formats.
The tNtriplesTemplate component generates N-Triples from the incoming data flow
using a text template. Here it creates the owl:sameAs triples needed to connect
Bio2RDF resources to their UniProt counterparts, because the two use different URI patterns.
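As a rough illustration of template-based N-Triples generation, the Python sketch below fills a text template with one accession per incoming row. It is not the component's actual code: the function and template names are hypothetical, the two URI patterns are assumed to follow the usual Bio2RDF (`http://bio2rdf.org/uniprot:`) and UniProt (`http://purl.uniprot.org/uniprot/`) conventions, and P09874 is used only as a sample accession.

```python
from string import Template

# Assumed URI patterns for the two sides of the owl:sameAs link.
SAME_AS_TEMPLATE = Template(
    "<http://bio2rdf.org/uniprot:$acc> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://purl.uniprot.org/uniprot/$acc> ."
)


def same_as_triples(accessions):
    """Render one N-Triples line per accession in the incoming flow."""
    return [SAME_AS_TEMPLATE.substitute(acc=acc) for acc in accessions]


lines = same_as_triples(["P09874"])
print("\n".join(lines))
```

Each output line is a complete N-Triples statement, so the result can be concatenated and loaded into a triplestore as is.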
The tDerefrencableURI component fetches a graph using
its URI. Here it dereferences UniProt URIs for
proteins and keywords.
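Dereferencing a Linked Data URI usually means an HTTP GET with an Accept header that asks for an RDF serialization. The sketch below only builds such a request with Python's standard library and does not send it; `build_rdf_request` is a hypothetical helper, and both the RDF/XML media type and the sample UniProt URI are assumptions rather than details taken from the component.

```python
from urllib.request import Request


def build_rdf_request(uri, accept="application/rdf+xml"):
    """Build a GET request that asks the server for an RDF representation."""
    return Request(uri, headers={"Accept": accept})


# Sample URI, assumed to follow the UniProt purl pattern.
req = build_rdf_request("http://purl.uniprot.org/uniprot/P09874")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual fetch; the response body would then need an RDF parser before its triples could be stored.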
The tEBIRequest component sends queries to EBI’s new
SPARQL endpoints; here it fetches Reactome data.
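Under the SPARQL Protocol, a query can be sent to an endpoint as an HTTP GET with the query text in a `query` parameter. The sketch below builds such a URL; the endpoint address is a placeholder rather than the real EBI endpoint, and the helper name and the query itself are hypothetical.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint, query):
    """Return the GET URL that submits a SPARQL query to an endpoint."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder endpoint and a generic query, for illustration only.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
url = sparql_get_url("http://example.org/sparql", QUERY)
print(url)
```

The desired result format is normally negotiated with an Accept header on the request.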
This final complex query answers the question by linking
together data from the HGNC, UniProt and Reactome
databases the Linked Data way.
The execution process can be monitored by looking at the
URIs used to fetch RDF data from the web. The table shows
the number of triples loaded in the previous run.
Because Talend is a complete ETL solution, results can easily be exported
to an Excel spreadsheet for analysis.
Our team can help you add your own curated
database to this RDF Linked Data project based on
Open Source software. Now your project can join
the Semantic Web of Life Sciences resources.

Bio2RDF poster for Biocurator 2014 conference
