Learning SPARQL 2012 12


Tutorial on SPARQL 1.1 given at SWAT4LS 2012 in Paris to a full room. This material covers enough to get started and includes working with TopBraid Composer.


  1. SPARQL UniProt.RDF. Jerven Bolleman, Developer, Swiss-Prot Group, Swiss Institute of Bioinformatics. Tuesday, December 4, 2012
  2. A few notes before we begin
      • SPARQL 1
        – Somewhat useful
        – Standardized in 2008
      • SPARQL 1.1
        – Very useful
        – Currently at the recommended-standard stage
      • Still finding incompatibilities
      • Or not-yet-implemented features
      © 2012 SIB
  3. Raise your hand if you have questions
  4. Tutorial plan
      • Set up TopBraid Composer
        – Skipped in talk
        – On VM
      • Gather data from the UniProt website
        – Already there
      • Learn SPARQL
      You do not need TopBraid Composer to use UniProt RDF data or run SPARQL queries. You can use beta.sparql.uniprot.org as well.
  5. Download and install TopBraid Composer
      • Requirements
        – Sun/Oracle JVM
      • Go to
        – http://www.topquadrant.com/products/TB_download.html
        – Register
        – Select any edition, free is ok for today
  6. Start TopBraid
  7. Setting up a workspace for this tutorial
      • http://www.topquadrant.com/products/TB_download.html
  8. New project
      • File > New Project > General
  9. Gather data from uniprot.org website
      • In the navigator, select the new project you just made.
  10. Gather data from uniprot.org website. Right-click on your new project. Select “Import” in the drop-down menu.
      • Import RDF or OWL file from the web
  11. Using the same process, download core.owl. You can see an HTML view of this schema ontology at http://www.uniprot.org/core/
  12. Gather data from uniprot.org website. You can see an HTML view of this entry at http://www.uniprot.org/taxonomy/40674
  13. Gather data from uniprot.org website
      • Open the mammalia.rdf file by double-clicking
  14. You get a very helpful dialog. Hit yes.
  15. It's SPARQLy mammal time!!
  16. Let's look at a single taxon record
  17. Let's navigate to it in TopBraid
      • Type the URI as is, with the angle brackets
  18. Investigate the taxon record
  19. The “Eastern Chipmunk” in Turtle
  20. Turtle is the RDF serialization aligned with SPARQL
      • Shorthand to avoid typing so much
        – ‘.’ (dot) ends the statement
        – ‘;’ (semicolon) repeats the subject
        – ‘,’ (comma) repeats the subject and predicate
      • Prefix
        – the part before ‘:’ is an abbreviation of a URI
  21. Why don’t these queries work on the web?
      • PREFIX
        – TopBraid Composer uses the prefixes defined in the file’s “overview” tab.
        – On the web you often have to add these.
      PREFIX : <http://purl.uniprot.org/core/>
      SELECT ?x
      FROM <http://purl.uniprot.org/taxonomy/>
      WHERE { ?x a :Taxon }
  22. a = rdf:type = <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
  23. rdfs:subClassOf: taxon:45474 is a more specific classification than taxon:13712
  24. rank => “The level, for nomenclatural purposes, of a taxon in a taxonomic hierarchy”
  25. Why learn SPARQL
      • Standardized formal query language
        – Implementation independent
          • SPARQL ➔ SQL (via R2RML)
          • SPARQL ➔ web service (via SADI)
          • SPARQL ➔ LDAP (e.g. SquirrelRDF)
          • SPARQL ➔ RDF (triplestore, e.g. OWLIM-SE)
          • SPARQL ➔ Hadoop/Hive (e.g. SHARD)
        – How you query is independent of how you store!
  26. Apparently it helps kill vampires!!!
  27. Let's learn SPARQL
      • Queries over RDF data.
        – Four basic types
          • SELECT: returns “tab delimited” results
          • CONSTRUCT: makes new triples
          • DESCRIBE: returns all triples mentioning a resource
          • ASK: returns true if anything matches
  28. SPARQL queries: triple pattern
      taxon:9606 rdf:type core:Taxon .
  29. SPARQL queries: triple pattern
      ?anyTaxon rdf:type core:Taxon .
  30. SPARQL queries: triple pattern
      SELECT ?anyTaxon
      WHERE { ?anyTaxon rdf:type core:Taxon . }
  31. SPARQL queries: triple pattern
      taxon:9606 rdf:type core:Taxon .
      taxon:9606 core:reviewed "true" .
  32. SPARQL queries: triple pattern
      ?anyTaxon rdf:type core:Taxon .
      ?anyTaxon core:reviewed "true" .
  33. SPARQL queries: triple pattern
      SELECT ?anyTaxon
      WHERE { ?anyTaxon rdf:type core:Taxon . ?anyTaxon core:reviewed "true" . }
  34. SPARQL queries: triple pattern
      SELECT ?anyTaxon
      WHERE { ?anyTaxon rdf:type core:Taxon . ?anyTaxin core:reviewed "true" . }
  35. SPARQL queries: triple pattern
      SELECT ?anyTaxon
      WHERE { ?anyTaxon rdf:type core:Taxon . $anyTaxon core:reviewed "true" . }
  36. Let's learn SPARQL
  39. Shorthand a = rdf:type
  40. AND join (default)
  41. Now you type
  42. Remember the ‘;’ shortcut
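As a rough illustration of the ‘;’ shortcut inside a query, the sketch below writes the same three triple patterns once in long form (as comments) and once in shorthand. The property core:scientificName and the use of core:Genus as a rank value are assumptions about the UniProt core vocabulary, not taken from the slides.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

# Long form, with the subject repeated in every pattern:
#   ?taxon a core:Taxon .
#   ?taxon core:rank core:Genus .
#   ?taxon core:scientificName ?name .   # property name is an assumption

SELECT ?taxon ?name
WHERE {
  # Shorthand: ';' repeats the subject, 'a' stands for rdf:type.
  ?taxon a core:Taxon ;
         core:rank core:Genus ;
         core:scientificName ?name .
}
```

Both forms describe the same basic graph pattern, so the results should not differ.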
  43. Two variables, one output column
  44. Optional
      • When values may be missing
        – yet interesting when they are there
      • Use as a sub query
      • Bound values from outside stay bound inside
        – ?x ?y ?z . OPTIONAL { ?x ?b ?c }
      • ?x: same variable = same thing
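A minimal OPTIONAL sketch, assuming that taxa carry a core:scientificName and only sometimes a core:commonName (both property names are assumptions here). Taxa without a common name should still appear, with ?commonName left unbound.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name ?commonName
WHERE {
  ?taxon a core:Taxon ;
         core:scientificName ?name .        # assumed property
  # ?taxon is already bound outside, so it stays the same taxon inside.
  OPTIONAL { ?taxon core:commonName ?commonName . }   # assumed property
}
```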
  46. UNION
      • Allows you to combine query patterns as an OR operation.
      • Joins are still from outer to inner.
  47. UNION
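One possible UNION query, sketched under the assumption that core:Species exists as a rank value alongside the core:Genus used elsewhere in this tutorial. A taxon is returned if either branch matches.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon
WHERE {
  { ?taxon core:rank core:Genus . }
  UNION
  { ?taxon core:rank core:Species . }   # core:Species is an assumption
}
```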
  48. Negation
      • When you do not want a certain category of matches.
      SELECT ?pet WHERE { ?pet a pets:Friendly . }
  49. Oooops
  50. NOT EXISTS (Negation 1)
  51. MINUS (Negation 2)
  52. MINUS {} or FILTER (NOT EXISTS {})
      • What's the difference?
        – MINUS subtracts results
        – NOT EXISTS tests if the sub pattern is possible at all.
          • Normally the faster option.
  53. MINUS: all data
  54. FILTER (NOT EXISTS {}): no results
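The “all data” versus “no results” contrast can be reproduced with a pair of queries in which the inner pattern shares no variables with the outer one. This is a sketch of the well-known corner case, not necessarily the exact queries shown on the slides.

```sparql
# MINUS: the inner pattern shares no variable with ?s ?p ?o,
# so nothing is subtracted and every triple comes back.
SELECT * WHERE {
  ?s ?p ?o .
  MINUS { ?x ?y ?z }
}
```

```sparql
# NOT EXISTS: the inner pattern matches as soon as the data holds
# any triple at all, so the filter rejects every row.
SELECT * WHERE {
  ?s ?p ?o .
  FILTER NOT EXISTS { ?x ?y ?z }
}
```

When the inner pattern does share a variable with the outer one (for example ?s core:rank core:Genus), the two forms would normally agree.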
  55. Negation option 3: SPARQL 1.0
      SELECT ?subject ?rank
      WHERE {
        ?subject core:rank ?rank .
        OPTIONAL {
          ?subject core:rank core:Genus .
          ?subject core:rank ?genus .
        }
        FILTER (!BOUND(?genus))
      }
  57. FILTERS
      • You just saw it twice
        – Once in the !BOUND
        – Once in the NOT EXISTS
      • FILTER narrows a result set by possibly removing values
        – FILTERs do not add values to the result
      • Inside the same graph pattern, order independent.
  58. Filter
  59. Filter on NOT IN
  62. IN
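A small sketch of IN, again assuming core:Species as a second rank value. Swapping IN for NOT IN should return the taxa whose rank is neither of the listed values.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?rank
WHERE {
  ?taxon core:rank ?rank .
  # Keep rows whose rank equals one of the listed values.
  FILTER (?rank IN (core:Genus, core:Species))   # core:Species is an assumption
  # Negated form: FILTER (?rank NOT IN (core:Genus, core:Species))
}
```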
  64. FILTER on numbers
      • <  – FILTER (1 < 2)
      • >  – FILTER (2 > 1)
      • =  – FILTER (1 = 1)
      • != – FILTER (1 != 2)
  65. Filters
      • ?x = ?y does casting (value conversions)
        – "1.0"^^xsd:float = "1"^^xsd:int is true
      • sameTerm(?x, ?y) does not
        – sameTerm("1.0"^^xsd:float, "1"^^xsd:int) is false
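To see the difference between = and sameTerm without touching any data, a query along these lines could be tried. It only uses BIND with constant literals; the variable names are made up for illustration. Following the slide, ?equalByValue should come back true and ?identicalTerm false.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?equalByValue ?identicalTerm
WHERE {
  BIND ("1.0"^^xsd:float AS ?x)
  BIND ("1"^^xsd:int AS ?y)
  # '=' compares numeric values after type promotion.
  BIND (?x = ?y AS ?equalByValue)
  # sameTerm compares the RDF terms themselves (lexical form and datatype).
  BIND (sameTerm(?x, ?y) AS ?identicalTerm)
}
```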
  66. FILTER on strings
      • Functions
        – STRLEN
        – ENCODE_FOR_URI
        – SUBSTR
        – CONCAT
        – UCASE
        – langMatches
        – LCASE
        – REGEX
        – STRSTARTS
        – REPLACE
        – STRENDS
        – CONTAINS
        – IRI
        – STRBEFORE
        – STRAFTER
  67. STRLEN == string length
  68. CONTAINS is case sensitive: is it in there?
  69. REGEX, just like Java regex
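A few of these functions combined in one hedged example. The property core:scientificName and the search string "Tamias" are assumptions chosen for illustration; the "i" flag makes the REGEX case insensitive, while CONTAINS stays case sensitive.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name (STRLEN(?name) AS ?length) (UCASE(?name) AS ?upper)
WHERE {
  ?taxon core:scientificName ?name .     # assumed property
  FILTER (
    CONTAINS(?name, "Tamias")            # exact case only
    || REGEX(?name, "^tamias", "i")      # any case, anchored at the start
  )
}
```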
  70. BIND
      • Builds new values
        – Closes the basic graph pattern
      SELECT ?p WHERE { { ?taxon a :Taxon . } BIND (?taxon AS ?p) }
      • Always declare before use.
  73. BIND can assign any output
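As one example of BIND assigning a computed value, the sketch below builds a display label from two other variables. core:scientificName is an assumed property name and ?label is a made-up variable; the BIND comes after the patterns that bind ?name and ?rank, in line with “declare before use”.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?label
WHERE {
  ?taxon core:scientificName ?name ;     # assumed property
         core:rank ?rank .
  # STR() turns the rank IRI into a string so CONCAT can use it.
  BIND (CONCAT(?name, " (", STR(?rank), ")") AS ?label)
}
```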
  74. Aggregate functions
      • On the SELECT line
      • Limited in number
        – count
        – sum
        – avg
        – min
        – max
        – group_concat
        – sample
  75. count
  76. SAMPLE should give a random result back
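A possible query using two of the aggregates on the SELECT line. With no GROUP BY, all matches form a single group, so one row is expected. core:scientificName is an assumed property name.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT (COUNT(?taxon) AS ?taxa) (SAMPLE(?name) AS ?anyName)
WHERE {
  ?taxon a core:Taxon ;
         core:scientificName ?name .     # assumed property
}
```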
  77. Follow the path
  78. Path queries
  79. Finding a grandparent using normal joins
  80. Finding a grandparent using a path query
  81. | is OR for predicates
  82. Same result with UNION
  83. Finding any ancestor
  84. Can use the variable in a normal join afterwards
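The path slides might look roughly like the following, using taxon:45474 and rdfs:subClassOf from the earlier slides. The commented lines show other path forms; core:commonName and core:scientificName are assumed property names.

```sparql
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX core:  <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>

SELECT ?ancestor ?rank
WHERE {
  # '+' follows rdfs:subClassOf one or more times: any ancestor.
  taxon:45474 rdfs:subClassOf+ ?ancestor .
  # The path result is an ordinary variable, usable in a normal join.
  ?ancestor core:rank ?rank .

  # Other path forms:
  #   grandparent only:  taxon:45474 rdfs:subClassOf/rdfs:subClassOf ?grandParent .
  #   either predicate:  ?taxon (core:scientificName|core:commonName) ?anyName .
}
```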
  85. GROUP BY
  86. GROUP BY
      • Needed for aggregate values
      • After closing the WHERE clause
        – ... WHERE { ?x ?y ?z } GROUP BY ?x
  87. GROUP BY
  88. HAVING: I have carrot!
  89. HAVING
      • FILTER for aggregates
      • After the GROUP BY clause
        – ... GROUP BY ?x HAVING (count(?y) > 2)
        – ... GROUP BY ?x HAVING (min(?y) = 2)
        – etc...
  90. HAVING
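Putting GROUP BY and HAVING together, a sketch that counts taxa per rank and keeps only ranks with more than two taxa. The threshold is arbitrary and the variable names are illustrative.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?rank (COUNT(?taxon) AS ?taxa)
WHERE {
  ?taxon core:rank ?rank .
}
GROUP BY ?rank                 # one result row per rank
HAVING (COUNT(?taxon) > 2)     # filter applied to the aggregate
```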
  91. LIMIT & OFFSET
  92. LIMIT and OFFSET
      • OFFSET skips the first x results
      • LIMIT returns no more than x results
  93. ORDER
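A paging sketch that combines the three solution modifiers. Without ORDER BY the order of results is not guaranteed, so LIMIT and OFFSET are usually paired with it. core:scientificName is an assumed property name.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE {
  ?taxon core:scientificName ?name .     # assumed property
}
ORDER BY ?name     # sort first
LIMIT 10           # return at most 10 rows
OFFSET 20          # after skipping the first 20
```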
  97. VALUES
      • Super BIND
      • Provides inline data
  99. Examples
      • Parameter lists are between ()
      VALUES (?annotation) {
        (core:Disease_Annotation)
        (core:Disulfide_Bond_Annotation)
      }
  100. Examples
      • UNDEF means no value at all, not bound
      VALUES (?annotation ?begin) {
        (core:Disease_Annotation UNDEF)
        (core:Disulfide_Bond_Annotation 2)
      }
  101. VALUES
      • After declaring a set of values you can use them in your query.
      SELECT ?comment
      WHERE {
        VALUES (?annotation ?begin) {
          (core:Disease_Annotation UNDEF)
          (core:Disulfide_Bond_Annotation 2)
        }
        ?annotation rdfs:comment ?comment .
      }
  102. SERVICE: using other SPARQL endpoints
      • SERVICE <URL of other endpoint>
        – Runs a sub query on the other endpoint and merges it back into your query.
  103. “Life is better with friends who understand you.”
  104. SERVICE
  105. SERVICE
      • Useful
        – Quick experimenting with combining multiple data sources
        – Quick for queries where not too much data is sent to the remote endpoint
      • Slow
        – When you ask for too much data
        – Remote endpoint not resourced for your questions
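A federated query could be sketched as below. The endpoint IRI is built from the beta.sparql.uniprot.org host mentioned earlier; the exact endpoint path is an assumption, as is core:scientificName. Keeping the remote pattern small and adding a LIMIT follows the advice above about not moving too much data.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>

SELECT ?taxon ?name
WHERE {
  # Everything inside SERVICE is evaluated on the remote endpoint.
  SERVICE <http://beta.sparql.uniprot.org/> {   # endpoint path is an assumption
    ?taxon a core:Taxon ;
           core:scientificName ?name .          # assumed property
  }
}
LIMIT 10
```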
  106. Let's make some triples
  107. Construction
      • CONSTRUCT
        – New triples
          • downloads RDF
        – Does not update store
  108. New triples
  109. Constructing an owl:sameAs between two URIs
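One way an owl:sameAs CONSTRUCT might look. The http://example.org/taxon/ namespace is hypothetical and stands in for whatever second URI scheme is being linked; the query returns the new triples and leaves the store unchanged.

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX core: <http://purl.uniprot.org/core/>

CONSTRUCT {
  ?taxon owl:sameAs ?other .
}
WHERE {
  ?taxon a core:Taxon .
  # Take the numeric id from the UniProt URI and mint a URI in the
  # hypothetical example.org namespace.
  BIND (IRI(CONCAT("http://example.org/taxon/",
                   STRAFTER(STR(?taxon), "http://purl.uniprot.org/taxonomy/")))
        AS ?other)
}
```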
  110. INSERT
      • Adds data
        – like construct
  111. Modifies data
  112. DELETE
      • Removes data
        – Triples matching are removed from the data
        – Triples can be bound using where clause
  113. DELETE
  114. DELETE INSERT
      • Single atomic operation.
  115. Atomic operation
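A combined DELETE/INSERT update, sketched with a hypothetical ex:status property that is not part of UniProt. The WHERE clause binds ?taxon once, and the delete and insert are then applied together as one operation.

```sparql
PREFIX core: <http://purl.uniprot.org/core/>
PREFIX ex:   <http://example.org/>        # hypothetical namespace

DELETE { ?taxon ex:status "draft" . }
INSERT { ?taxon ex:status "checked" . }
WHERE  {
  ?taxon a core:Taxon ;
         ex:status "draft" .
}
```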
  116. I’m exhausted now
  117. Questions