Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Successfully reported this slideshow.

1,991 views

Published on

No Downloads

Total views

1,991

On SlideShare

0

From Embeds

0

Number of Embeds

2

Shares

0

Downloads

43

Comments

0

Likes

2

No embeds

No notes for slide

- 1. Linked Data Query ProcessingTutorial at the 22nd International World Wide Web Conference (WWW 2013)May 14, 2013http://db.uwaterloo.ca/LDQTut2013/2.Theoretical FoundationsOlaf HartigUniversity of Waterloo
- 2. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 2● Data retrieval approach● Data source selection● Data source ranking(optional, for optimization)Query-local dataGET http://.../movie2449● Result construction approach● i.e., query-local data processing● Combining data retrievaland result constructionhttp://mdb.../Paul http://geo.../Berlinhttp://mdb.../Ric http://geo.../Rome?loc?actor“Ingredients” for LD Query Execution
- 3. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 3Outline Basics of the SPARQL Query Language Data Model for Linked Data Queries Full-Web Query Semantics➢ Definition➢ Theoretical Properties Reachability-Based Query Semantics➢ Definition➢ Theoretical Properties
- 4. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 4General Idea of SPARQL● SPARQL: Declarative query language for RDF data● Main idea: pattern matching● Query describes a pattern to be found in RDF data● Basically, query patterns consist of RDF triples with variables( ?p , affiliated with , orgaX ) ,( ?p , interested in , ?i )Basic Query Pattern:
- 5. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 5General Idea of SPARQL● SPARQL: Declarative query language for RDF data● Main idea: pattern matching● Query describes a pattern to be found in RDF data● Basically, query patterns consist of RDF triples with variables● Any part of the queried data that matches the query patterncontributes a solution to the query result( ?p , affiliated with , orgaX ) ,( ?p , interested in , ?i )Basic Query Pattern:( alice , affiliated with , orgaX ) ,( bob , affiliated with , acme ) ,( alice , interested in , yoga ) ,( alice , interested in , tea ) ,( bob , interested in , tea ) ,Data:?p ?ialice yogaalice teaQuery Result:
- 6. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 6Some Terminology● Any query result is represented by a set of valuationsΩ = { μ1 , … , μn }● Valuation: mapping of query variables to RDF termsμ = { ?p → alice , ?i → yoga }● Any valuation in a query result is a solution● Two valuations μ1 and μ2 are compatible if they agree onthe binding for any shared variable● i.e., μ1(?v) = μ2(?v) for all ?v Є dom(μ1) ∩ dom(μ2)?p ?ialice yogaalice tea
- 7. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 7Query Patterns and their Semantics● Triple pattern: An RDF triple that may have a variable assubject, predicate, or object, respectively●eval( tp, G ) :═ { μ | μ[tp] Є G and dom(μ) = vars(tp) }● Group graph pattern: Conjunction of two query patterns●eval( (P1 AND P2), G ) :═ { μ1 U μ2 | μ1 Є eval(P1,G) andμ2 Є eval(P2,G) andμ1 and μ2 are compatible }● Associative and commutative● Basic graph pattern (BGP): A set of triple patterns● B = { tp1 , … , tpn } is equivalent to ( tp1 AND … AND tpn )● Filter, union, optional, etc.
- 8. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 8Outline Basics of the SPARQL Query Language Data Model for Linked Data Queries Full-Web Query Semantics➢ Definition➢ Theoretical Properties Reachability-Based Query Semantics➢ Definition➢ Theoretical Properties√
- 9. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 9Data Model● Static view (i.e., no changes during computation)● Web of Linked Data: W = ( D, adoc, data )● D – set of symbols that represent Web documentsfrom which Linked Data can be extracted● adoc – partial mapping from URIs to D● data – total mapping from D to finite sets of RDF triples●All data in W: AllData(W) :═ Ud є D data(d)D = adoc( ) = data( ) =
- 10. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 10Linked Data QueryQ(W)
- 11. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 11( ?p , affiliated with , orgaX ) ,( ?p , interested in , ?i )?p ?ialice yogaalice teaQP(W) = { μ1 , μ2 , ... }SPARQL-Based Linked Data Query
- 12. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 12Outline Basics of the SPARQL Query Language Data Model for Linked Data Queries Full-Web Query Semantics➢ Definition➢ Theoretical Properties Reachability-Based Query Semantics➢ Definition➢ Theoretical Properties√√
- 13. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 13Full-Web SemanticsQP(W) :═ eval(P,AllData(W))
- 14. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 14Computability● (Ordinary) Turing machinesare unsuitable:● Limited data access capabilitiesnot properly captured● Web machines● Abiteboul and Vianu, 1997● Mendelzon and Milo, 1997QP(W) :═ eval(P,AllData(W))TM
- 15. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 15LD Machine● Multi-tape Turing machine➔ Input➔ Work➔ Output
- 16. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 16LD Machine# enc(u1) enc(adoc(u1)) # enc(u2) enc(adoc(u2)) # ∙ ∙ ∙● Multi-tape Turing machine➔ Web Input➔ Input➔ Work➔ Output
- 17. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 17LD Machine# enc(u1) enc(adoc(u1)) # enc(u2) enc(adoc(u2)) # ∙ ∙ ∙● Multi-tape Turing machine➔ Web Input➔ Input➔ Work➔ Output● Access to Web Input is restricted● Only by performinga particular procedurein a particular state
- 18. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 18# enc(u1) enc(adoc(u1)) # enc(u2) enc(adoc(u2)) # ∙ ∙ ∙➔ Web Input➔ Input➔ Work➔ Output● For Q exists an LD machine MQ such that for any W holds:● MQ halts after a finite number of computation steps, and● MQ outputs the complete result Q(W)Finitely Computable LD Queriesstep 1 ∙ ∙ ∙ step k - 3 step k - 2 step k – 1 step k∙ ∙ ∙# enc(μ1) # enc(μ2) # ∙ ∙ ∙ # enc(μn) #
- 19. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 19Eventually Computable LD Queriesstepk + 2∙ ∙ ∙∙ ∙ ∙stepk - 3stepk - 2stepk - 1stepkstepk + 1# enc(u1) enc(adoc(u1)) # enc(u2) enc(adoc(u2)) # ∙ ∙ ∙# enc(μ1) # enc(μ2)➔ Web Input➔ Input➔ Work➔ Output● For Q exists an LD machine MQ such that for any W holds:1. Output always encodes a subset of query result Q(W), and2. Each μ Q(W) eventually appears on the output✗ No guarantee for termination
- 20. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 20Not (LD Machine) Computablestepk + 2∙ ∙ ∙∙ ∙ ∙stepk - 3stepk - 2stepk - 1stepkstepk + 1# enc(u1) enc(adoc(u1)) # enc(u2) enc(adoc(u2)) # ∙ ∙ ∙# enc(μ1) # enc(μ2)➔ Web Input➔ Input➔ Work➔ Output● For Q exists an LD machine MQ such that for any W holds:1. Output always encodes a subset of query result Q(W), and2. Each μ Q(W) eventually appears on the output✗ No guarantee for termination
- 21. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 21Theorem [Har12]: If a satisfiable SPARQLLD queryQPunder full-Web semantics is monotonic, then it iseventually computable (but not finitely computable);otherwise, it is not even eventually computable.Theorem [Har12]: If a satisfiable SPARQLLD queryQPunder full-Web semantics is monotonic, then it iseventually computable (but not finitely computable);otherwise, it is not even eventually computable.● Main reason: Set of all URIs is infinite (but countable)Computability (Full-Web Semantics)QP(W) :═ eval(P,AllData(W))
- 22. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 22Satisfiability and MonotonicityQ( ) = { …, ri , … } ≠ Ø⟹ Q( ) ⊆ Q( )Proposition [Har12]: Let QPbe a SPARQLLD query under full-Websemantics.● QPis satisfiable ⟺ its query pattern P is satisfiable● QPis monotonic ⟺ its query pattern P is monotonicProposition [Har12]: Let QPbe a SPARQLLD query under full-Websemantics.● QPis satisfiable ⟺ its query pattern P is satisfiable● QPis monotonic ⟺ its query pattern P is monotonicUnfortunately:Satisfiability andmonotonicity of SPARQLquery patterns is undecidableSatisfiability:Monotonicity:OPTtpANDFILTERUNION
- 23. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 23Theorem [Har12]: If a satisfiable SPARQLLD queryQPunder full-Web semantics is monotonic, then it iseventually computable (but not finitely computable);otherwise, it is not even eventually computable.Theorem [Har12]: If a satisfiable SPARQLLD queryQPunder full-Web semantics is monotonic, then it iseventually computable (but not finitely computable);otherwise, it is not even eventually computable.● Main reason: Set of all URIs is infinite (but countable)Computability (Full-Web Semantics)QP(W) :═ eval(P,AllData(W))
- 24. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 24Outline Basics of the SPARQL Query Language Data Model for Linked Data Queries Full-Web Query Semantics➢ Definition➢ Theoretical Properties Reachability-Based Query Semantics➢ Definition➢ Theoretical Properties√√√
- 25. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 25Reachability-Based Query Semantics
- 26. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 26●Seed URIs S ᑕ UReachability-Based Query Semantics
- 27. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 27●Seed URIs S ᑕ U● Reachability criterion c●(Turing) computable function c: T ×U × P → {true,false}Reachability-Based Query Semantics
- 28. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 28WQP,S( ):= eval(P,AllData(W*))cReachability-Based Query Semantics
- 29. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 29(Reachability-Based) cAll-SemanticsWQP,S( ) eval(P,AllData(W*))cAll:=
- 30. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 30WQP,S( ) eval(P,AllData(W*))cNone:=(Reachability-Based) cNone-Semantics
- 31. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 31WQP,S( ) eval(P,AllData(W*))cMatch:=(Reachability-Based) cMatch-Semantics
- 32. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 32Satisfiability and MonotonicityQ( ) = { …, ri , … } ≠ Ø⟹ Q( ) ⊆ Q( )Satisfiability:Monotonicity:Proposition [Har12]: Let c be a reachability criterion and QPcSbe aSPARQLLD query under c-semantics.● QPcSis satisfiable ⟺ its SPARQL expression P is satisfiable● QPcSis monotonic ⇒ its SPARQL expression P is monotonicProposition [Har12]: Let c be a reachability criterion and QPcSbe aSPARQLLD query under c-semantics.● QPcSis satisfiable ⟺ its SPARQL expression P is satisfiable● QPcSis monotonic ⇒ its SPARQL expression P is monotonicCounterexample: Let QPcSbe a SPARQLLD query under c-semantics.If c = cNone and |S| = 1, then QPcSis monotonic.Counterexample: Let QPcSbe a SPARQLLD query under c-semantics.If c = cNone and |S| = 1, then QPcSis monotonic.
- 33. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 33LD Machine-Based ComputabilityTheorem [Har12]: For any reachability criterion c, anySPARQL based Linked Data query under c-semanticsis finitely computable.Theorem [Har12]: For any reachability criterion c, anySPARQL based Linked Data query under c-semanticsis finitely computable.● If we restrict our data model such that Websof Linked Data are finite (i.e., consist of a finitenumber of documents), then:For results w.r.t. the unrestricted data model (i.e.,Webs of Linked Data that may be infinitely large),refer to [Har12].
- 34. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 34Outline Basics of the SPARQL Query Language Data Model for Linked Data Queries Full-Web Query Semantics➢ Definition➢ Theoretical Properties Reachability-Based Query Semantics➢ Definition➢ Theoretical Properties√√√√Next part: 3. Source Selection Strategies ...
- 35. WWW 2013 Tutorial on Linked Data Query Processing [ Theoretical Foundations ] 35These slides have been created byOlaf Hartigfor theWWW 2013 tutorial onLink Data Query ProcessingTutorial Website: http://db.uwaterloo.ca/LDQTut2013/This work is licensed under aCreative Commons Attribution-Share Alike 3.0 License(http://creativecommons.org/licenses/by-sa/3.0/)

No public clipboards found for this slide

×
### Save the most important slides with Clipping

Clipping is a handy way to collect and organize the most important slides from a presentation. You can keep your great finds in clipboards organized around topics.

Be the first to comment