Uni protsparqlcloud

This 15-minute presentation shows how multiple SPARQL endpoints can be used to integrate biological data. With SPARQL, we no longer need to start a data warehouse project to integrate multiple data sources; we just use the SPARQL 1.1 SERVICE keyword.

Transcript

  • 1. SPARQLing data stars in the biology cloud. Jerven Bolleman, Developer, Swiss-Prot Group, Swiss Institute of Bioinformatics. Monday, September 3, 2012
  • 2. Biohackathon: a nursery galaxy for SPARQL endpoints
  • 3. http://beta.sparql.uniprot.org
       • 5.2 billion triples
           – All UniProt data
       • Taxonomy
       • Sequences
       • Enzymes
       • Pathways
       • etc.
       • SPARQL 1.1 (January 05 2012 Working Draft)
           – SERVICE keyword
       • No blank nodes
           – SHA-512 series used to stabilize anonymous resources
       © 2012 SIB Swiss Institute of Bioinformatics
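Slide 3 says that blank nodes are avoided by using SHA-512 to give anonymous resources stable names. The slides do not show what is hashed or how the resulting IRI is formed, so the Python sketch below is only one plausible scheme: it hashes a sorted serialisation of a resource's own property/value pairs. The `stable_iri` function, the `BASE` namespace and the sample properties are all hypothetical.

```python
import hashlib

# Hypothetical namespace; the slides do not say where such resources are minted.
BASE = "http://example.org/anon/"


def stable_iri(properties):
    """Return an IRI derived from the SHA-512 digest of a resource's content.

    `properties` is an iterable of (predicate, value) string pairs. Sorting
    makes the result independent of the order in which pairs are supplied.
    """
    canonical = "\n".join(f"{p}\t{v}" for p, v in sorted(properties))
    digest = hashlib.sha512(canonical.encode("utf-8")).hexdigest()
    return BASE + "SHA_" + digest


# The same content in a different order yields the same IRI.
a = stable_iri([("rdf:type", "up:Disease_Annotation"), ("rdfs:comment", "example text")])
b = stable_iri([("rdfs:comment", "example text"), ("rdf:type", "up:Disease_Annotation")])
print(a == b)  # prints True
```

Because the name depends only on content, reloading the same data produces the same IRIs, which is what makes them usable across releases and across endpoints in a way blank nodes are not.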
  • 4. Data integration with RDF/SPARQL (diagram). Labels: Own Lab Data; Chembl.rdf; UniProt.rdf; Triple Store (twice); SPARQL Federation; SPARQL Queries; SPARQL Driving Services; Developer & maintenance time saved. © 2011 SIB
  • 6. SELECT ?name (COUNT(?protein) AS ?size)
       WHERE {
         {
           ?protein :enzyme ?name .
           ?name rdfs:subClassOf enzyme:1.-.-.-
         } UNION {
           ?protein :enzyme ?name .
           ?name rdfs:subClassOf enzyme:2.-.-.-
         }
         ...
       }
       GROUP BY ?name
       ORDER BY ?name
  • 8. SELECT ?japaneseTerm ?geneSymbol ?protein
       WHERE {
         SERVICE <http://data.allie.dbcls.jp/sparql> {
           ?pc a allie:PairCluster ;
               allie:hasShortFormRepresentationOf ?sfr ;
               allie:hasLongFormRepresentationOf ?lfr .
           ?sfr rdfs:label ?sf .
           ?lfr rdfs:label "β1アドレナリン受容体, β1アドレナリン レセプター"@ja, ?japaneseTerm .
           FILTER (lang(?sf) = "en")
         }
         BIND(str(?sf) AS ?geneSymbol) .
         ?gene skos:prefLabel ?geneSymbol .
         ?protein uniprot:encodedBy ?gene .
       }
  • 10. SELECT ?target ?protein
        WHERE {
          SERVICE <http://rdf.farmbio.uu.se/chembl/sparql> {
            ?target a chembl:Target .
            ?target owl:sameAs ?bio2rdfUniprot .
            FILTER(contains(str(?bio2rdfUniprot), "uniprot:"))
          }
          BIND(iri(concat("http://purl.uniprot.org/uniprot/", substr(str(?bio2rdfUniprot), 28))) AS ?protein)
          ?protein a up:Protein .
          FILTER (NOT EXISTS {
            ?protein up:annotation ?annotation .
            ?annotation a up:Disease_Annotation .
          })
        }
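The BIND expression on slide 10 rewrites the identifier returned by the ChEMBL endpoint into a UniProt IRI by dropping the first 27 characters and prepending the purl.uniprot.org prefix. The Python sketch below mirrors that string manipulation. The exact Bio2RDF prefix is an assumption: the slide only shows that the value contains "uniprot:" and that the accession starts at position 28. The function name `to_uniprot_iri` and the sample accession are hypothetical.

```python
# Assumed shape of the Bio2RDF identifiers; chosen because it is 27 characters
# long, which matches the substr(..., 28) offset used on the slide.
BIO2RDF_PREFIX = "http://bio2rdf.org/uniprot:"
UNIPROT_PREFIX = "http://purl.uniprot.org/uniprot/"


def to_uniprot_iri(bio2rdf_iri):
    """Mirror BIND(iri(concat(UNIPROT_PREFIX, substr(str(?x), 28)))).

    SPARQL's substr is 1-based, so position 28 corresponds to Python index 27.
    """
    if "uniprot:" not in bio2rdf_iri:  # same check as FILTER(contains(...))
        return None
    return UNIPROT_PREFIX + bio2rdf_iri[27:]


print(to_uniprot_iri("http://bio2rdf.org/uniprot:P12345"))
# prints http://purl.uniprot.org/uniprot/P12345
```

The fixed offset is fragile: it only works while every matching identifier shares a prefix of exactly that length, which is why agreeing on IRIs across datasets makes the rewrite unnecessary.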
  • 11. SELECT ?target ?protein
        WHERE {
          SERVICE <http://rdf.farmbio.uu.se/chembl/sparql> {
            ?target a chembl:Target .
            ?target owl:sameAs ?protein
          }
          ?protein a up:Protein .
          FILTER (NOT EXISTS {
            ?protein up:annotation ?annotation .
            ?annotation a up:Disease_Annotation .
          })
        }
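The federated queries on these slides all follow the same shape: a SERVICE block evaluated at a remote endpoint, followed by patterns evaluated locally. As a rough illustration, the Python sketch below composes such a query and builds the GET URL that the SPARQL 1.1 Protocol defines for sending it (the query text goes in the `query` parameter). Nothing is sent over the network. The helper names are hypothetical, PREFIX declarations are omitted as on the slides, and it is an assumption that the UniProt endpoint accepts queries at the root path shown.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def federated_query(variables, remote_endpoint, remote_pattern, local_pattern):
    """Compose a SELECT query whose first block is evaluated at a remote endpoint."""
    return (
        f"SELECT {' '.join(variables)}\n"
        "WHERE {\n"
        f"  SERVICE <{remote_endpoint}> {{\n"
        f"    {remote_pattern}\n"
        "  }\n"
        f"  {local_pattern}\n"
        "}"
    )


def request_url(endpoint, query):
    """Build a SPARQL Protocol GET URL; the query travels in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Patterns taken from slide 11; PREFIX declarations are left out as on the slide.
query = federated_query(
    ["?target", "?protein"],
    "http://rdf.farmbio.uu.se/chembl/sparql",
    "?target a chembl:Target ; owl:sameAs ?protein .",
    "?protein a up:Protein .",
)
# Assumed endpoint path; the slides only give the host name.
url = request_url("http://beta.sparql.uniprot.org/", query)
print(url)
```

The client only talks to one endpoint; it is that endpoint which contacts the remote service named in the SERVICE clause, so no client-side joining is needed.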
  • 12. Questions?