Semantic Trilogy Bio2RDF tutorial

The Bio2RDF project aims to transform silos of life science data into a globally distributed network of linked data for biological knowledge discovery. Bio2RDF creates and provides machine-understandable descriptions of biological entities using the RDF/RDFS/OWL Semantic Web languages. Using both syntactic and semantic data integration techniques, Bio2RDF integrates diverse biological data and enables new SPARQL-based services across its globally distributed knowledge bases. The project has released 28 public databases in RDF format, all available on the internet through a SPARQL endpoint or by dereferencing their URIs.
Now that major data providers such as NCBO, UniProt, KEGG, PDB and EBI also expose their data as Linked Data, we need a framework to ease the building of mashup applications, and designing a workflow is a well-known approach to doing so. The tutorial proposes using Talend, an open source professional ETL tool, to help with the RDFization of existing data and to automate the fetching of triples to populate a mashup in the OpenLink Virtuoso triplestore.
How can we build a specific database to answer a very specialized question? How can we build a mashup by fetching linked data from the web? How can we merge our own lab results with the publicly available knowledge from the semantic web? Those are the questions we answer in the tutorial by proposing tools and methods to the participants. In this tutorial you will learn how to install and administer the Virtuoso triplestore. We will then show you how to load RDF triples directly from the web, or from your own data once you have converted it to RDF with Talend. Now that the Life Sciences semantic web is a reality, we need to make it answer our questions.


Presentation Transcript

  • How to produce and consume Linked Data the Bio2RDF way (using Virtuoso triplestore and Talend ETL)
      François Belleau, Arnaud Droit
      Centre de recherche du CHUQ, Laval University, Québec, Canada
  • Download stuff...
      ● Virtuoso Triplestore: http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSDownload
      ● Talend Software: http://www.talend.com/products/data-integration
      ● Bio2RDF Talend jobs: http://sourceforge.net/p/bio2rdf/git/ci/master/tree/
  • Program
      ● Presentation of the Bio2RDF project and other public RDF data providers such as NCBO, UniProt and KEGG
          ○ 15 minutes
      ● Virtuoso triplestore installation and administration
          ○ 30 minutes
      ● Talend Open Studio installation and basic introduction
          ○ 30 minutes
      ● Hands-on part of the tutorial
          ○ 90 minutes
  • Virtuoso triplestore installation and administration (30 min.)
      ● Basic server configuration
      ● Installing the facet browser
      ● Loading RDF into the triplestore
      ● Submitting SPARQL queries
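
    Submitting a SPARQL query over HTTP can be sketched as follows. This is a minimal illustration, not part of the original slides: it only builds a SPARQL Protocol GET request and leaves sending it to the reader. The endpoint URL assumes a local Virtuoso instance on its usual port, and `build_sparql_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Assumption: a local Virtuoso server exposing its SPARQL endpoint here.
ENDPOINT = "http://localhost:8890/sparql"


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
request = build_sparql_request(QUERY)
print(request.full_url)

# To run the query against a live server:
#   import json, urllib.request
#   with urllib.request.urlopen(request) as resp:
#       rows = json.load(resp)["results"]["bindings"]
```

    Keeping request construction separate from the network call makes it easy to point the same query at another endpoint by passing a different `endpoint` argument.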
  • Talend Open Studio installation and basic introduction (30 min.)
      ● Concept of Job and Component
      ● Java compilation and exporting a package
      ● How to access and transform data from an SQL database or in XML, JSON or text format
      ● How to access the web and consume a SOAP service
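
    Talend jobs are assembled graphically, so the following is only a plain-Python analogue of the extract-and-transform step for an SQL source, not a Talend job. The table `sample`, its columns, the rows, and the `http://example.org/sample/` base URI are all made up for illustration.

```python
import sqlite3

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def extract_rows(conn):
    """Extract step: read (id, name) pairs from the hypothetical table."""
    return conn.execute("SELECT id, name FROM sample ORDER BY id").fetchall()


def transform(rows, base="http://example.org/sample/"):
    """Transform step: turn each row into one N-Triples line."""
    lines = []
    for row_id, name in rows:
        # Escape backslashes and double quotes inside the literal.
        safe = name.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{base}{row_id}> <{RDFS_LABEL}> "{safe}" .')
    return lines


# Toy in-memory database standing in for a lab database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO sample VALUES (?, ?)", [(1, "control"), (2, "treated")])

for line in transform(extract_rows(conn)):
    print(line)
```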
  • Hands-on part of the tutorial (90 minutes)
      ● Learning basic Talend techniques
      ● Fetching data from the web
      ● Creating triples in N-Triples format
      ● Parsing an XML document
      ● Accessing the Virtuoso triplestore via the JDBC API
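
    The "parse an XML document, create N-Triples" steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the tutorial's Talend job: the XML layout is a toy example, and the `geneid` namespace in the `http://bio2rdf.org/namespace:identifier` URI pattern is assumed for illustration.

```python
import xml.etree.ElementTree as ET

# Toy input document (made up for this sketch).
SAMPLE_XML = """<genes>
  <gene id="1017"><symbol>CDK2</symbol></gene>
  <gene id="7157"><symbol>TP53</symbol></gene>
</genes>"""

BASE = "http://bio2rdf.org/"  # Bio2RDF-style namespace:identifier URIs
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape characters that are not allowed raw in an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def xml_to_ntriples(xml_text, namespace="geneid"):
    """Emit one rdfs:label triple per <gene> element."""
    root = ET.fromstring(xml_text)
    lines = []
    for gene in root.iter("gene"):
        subject = f"<{BASE}{namespace}:{gene.get('id')}>"
        label = escape_literal(gene.findtext("symbol", default=""))
        lines.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
    return lines


for triple in xml_to_ntriples(SAMPLE_XML):
    print(triple)
```

    Each output line is a complete statement (subject, predicate, object, terminating dot), so the resulting file can be concatenated with other N-Triples files before loading into a triplestore.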
  • Would you contribute?
      https://github.com/fbelleau/talend4sw
      The goal of the project is to build Talend components for the Semantic Web.