UniProt SPARQL cloud


This 15-minute presentation shows how multiple SPARQL endpoints can be used to integrate biological data. With SPARQL we no longer need to start a data warehouse project to integrate multiple data sources; we just use the SPARQL 1.1 SERVICE keyword.
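As a rough illustration of what the SERVICE keyword does, the sketch below assembles a federated query as a plain string: one graph pattern is wrapped in a SERVICE clause so that it is evaluated at a remote endpoint, and the remaining pattern is evaluated by the endpoint that receives the query. The helper name `build_federated_query` and its parameters are hypothetical. The endpoint URL and the prefixed names are taken from the slides, and the prefixes are assumed to be declared elsewhere. Nothing is sent over the network.

```python
def build_federated_query(variables, remote_endpoint, remote_pattern, local_pattern):
    """Return a SPARQL 1.1 SELECT query string (hypothetical helper).

    remote_pattern is wrapped in SERVICE <remote_endpoint> { ... };
    local_pattern is left for the endpoint that receives the query.
    """
    projection = " ".join("?" + name for name in variables)
    return (
        f"SELECT {projection}\n"
        "WHERE {\n"
        f"  SERVICE <{remote_endpoint}> {{\n"
        f"    {remote_pattern}\n"
        "  }\n"
        f"  {local_pattern}\n"
        "}"
    )


# Patterns modelled on the ChEMBL example in the slides; the prefixes
# (chembl:, owl:, up:) are assumed to be declared elsewhere.
query = build_federated_query(
    ["target", "protein"],
    "http://rdf.farmbio.uu.se/chembl/sparql",
    "?target a chembl:Target . ?target owl:sameAs ?protein .",
    "?protein a up:Protein .",
)
print(query)
```

The resulting string can be submitted to any SPARQL 1.1 endpoint that supports federation; no separate warehouse load step is involved.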


Transcript

  • 1. SPARQLing data stars in the biology cloud. Jerven Bolleman, Developer, Swiss-Prot Group, Swiss Institute of Bioinformatics. Monday, September 3, 2012
  • 2. Biohackathon: a nursery galaxy for SPARQL endpoints
  • 3. http://beta.sparql.uniprot.org
      • 5.2 billion triples
        – All UniProt data: taxonomy, sequences, enzymes, pathways, etc.
      • SPARQL 1.1 (January 5, 2012 Working Draft)
        – SERVICE keyword
      • No blank nodes
        – SHA-512 series used to stabilize anonymous resources
      © 2012 SIB Swiss Institute of Bioinformatics
  • 4. Data integration: RDF/SPARQL. [Diagram labels: Own Lab Data; Chembl.rdf; UniProt.rdf; Triple Store; SPARQL Federation; SPARQL Queries; SPARQL Driving Services; Developer & maintenance time saved.] © 2011 SIB
  • 5. (no text transcribed)
  • 6. SELECT ?name (COUNT(?protein) as ?size)
       WHERE {
         {
           ?protein :enzyme ?name .
           ?name rdfs:subClassOf enzyme:1.-.-.-
         } UNION {
           ?protein :enzyme ?name .
           ?name rdfs:subClassOf enzyme:2.-.-.-
         }
         ...
       }
       GROUP BY ?name
       ORDER BY ?name
  • 7. (no text transcribed)
  • 8. SELECT ?japaneseTerm ?geneSymbol ?protein
       WHERE {
         SERVICE <http://data.allie.dbcls.jp/sparql> {
           ?pc a allie:PairCluster ;
               allie:hasShortFormRepresentationOf ?sfr ;
               allie:hasLongFormRepresentationOf ?lfr .
           ?sfr rdfs:label ?sf .
           ?lfr rdfs:label "β1アドレナリン受容体, β1アドレナリン レセプター"@ja, ?japaneseTerm .
           FILTER (lang(?sf) = "en")
         }
         BIND(str(?sf) as ?geneSymbol) .
         ?gene skos:prefLabel ?geneSymbol .
         ?protein uniprot:encodedBy ?gene .
       }
  • 9. (no text transcribed)
  • 10. SELECT ?target ?protein
        WHERE {
          SERVICE <http://rdf.farmbio.uu.se/chembl/sparql> {
            ?target a chembl:Target .
            ?target owl:sameAs ?bio2rdfUniprot .
            FILTER(contains(str(?bio2rdfUniprot), "uniprot:"))
          }
          BIND(iri(concat("http://purl.uniprot.org/uniprot/", substr(str(?bio2rdfUniprot), 28))) as ?protein)
          ?protein a up:Protein .
          FILTER (NOT EXISTS {
            ?protein up:annotation ?annotation .
            ?annotation a up:Disease_Annotation .
          })
        }
  • 11. SELECT ?target ?protein
        WHERE {
          SERVICE <http://rdf.farmbio.uu.se/chembl/sparql> {
            ?target a chembl:Target .
            ?target owl:sameAs ?protein
          }
          ?protein a up:Protein .
          FILTER (NOT EXISTS {
            ?protein up:annotation ?annotation .
            ?annotation a up:Disease_Annotation .
          })
        }
  • 12. Questions?
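The BIND expression on slide 10 rewrites a Bio2RDF-style identifier into a purl.uniprot.org IRI by dropping everything before character 28 and prepending the UniProt prefix. The sketch below mirrors that step in Python to make the offset explicit. It assumes the remote IRIs start with `http://bio2rdf.org/uniprot:`, which is 27 characters long and matches the offset of 28 used on the slide; that prefix is not stated in the slides. The accession `P12345` and the function names are placeholders for illustration only.

```python
# Assumed prefix of the remote IRIs (not stated in the slides); 27 characters.
BIO2RDF_PREFIX = "http://bio2rdf.org/uniprot:"
# Prefix used in the BIND expression on slide 10.
UNIPROT_PREFIX = "http://purl.uniprot.org/uniprot/"


def sparql_substr(text, start):
    """Mimic SPARQL SUBSTR(text, start), whose start index is 1-based."""
    return text[start - 1:]


def to_uniprot_iri(bio2rdf_iri):
    """Mirror the FILTER(contains(...)) and BIND(iri(concat(...))) steps.

    Returns None when the IRI does not contain "uniprot:", which
    corresponds to the solution being filtered out.
    """
    if "uniprot:" not in bio2rdf_iri:
        return None
    return UNIPROT_PREFIX + sparql_substr(bio2rdf_iri, 28)


example = to_uniprot_iri(BIO2RDF_PREFIX + "P12345")
print(example)
```

Slide 11 shows the same query without this step: once the remote `owl:sameAs` links point directly at purl.uniprot.org IRIs, the FILTER and BIND become unnecessary.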