
Querying federations 
of Triple Pattern Fragments

Tutorial

Published in: Internet

  1. Querying Federations of Triple Pattern Fragments: a tutorial by Ruben Verborgh
  2. Outline: Linked Data Fragments; Triple Pattern Fragments; Federated querying
  3. Outline: Linked Data Fragments; Triple Pattern Fragments; Federated querying
  4. A whole spectrum of trade-offs exists between the two extremes.
     Interfaces offered by the server: data dump, Linked Data documents, SPARQL endpoint.
     Data dump: low server cost, high client cost, high availability, high bandwidth, out-of-date data.
     SPARQL endpoint: high server cost, low client cost, low availability, low bandwidth, live data.
  5. All RDF interfaces offer fragments with the following characteristics.
     data: What triples does it contain?
     metadata: What do we know about it?
     controls: How to access more data?
  6. Each type of Linked Data Fragment is defined by three characteristics. Example: data dump.
     data: all dataset triples
     metadata: number of triples, file size
     controls: (none)
  7. Each type of Linked Data Fragment is defined by three characteristics. Example: SPARQL query result.
     data: triples matching the query
     metadata: (none)
     controls: (none)
  8. Outline: Linked Data Fragments; Triple Pattern Fragments; Federated querying
  9. We design new mixes of trade-offs with much lower server-side cost.
     Existing interfaces: data dump, Linked Data documents, SPARQL query results.
     Data dump: low server cost, high client cost, high availability, high bandwidth, out-of-date data.
     SPARQL query results: high server cost, low client cost, low availability, low bandwidth, live data.
  10. A Triple Pattern Fragments interface is low-cost and enables clients to query.
      Triple Pattern Fragments sit on the same axis as data dumps, Linked Data documents, and SPARQL query results, and combine low server cost, high availability, and live data.
  11. A Triple Pattern Fragments interface is low-cost and enables clients to query.
      data: matches of a triple pattern (paged)
      metadata: total number of matches
      controls: access to all other fragments
  12. Anatomy of a fragment: data (first 100), metadata (total count), controls (other fragments)
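The three parts of a fragment can be sketched as a small data structure. This is only an illustration, not the actual server implementation: the `Fragment` class, the `paginate` helper, the `?page=` parameter, and the `example.org` URL are all assumptions made up for this sketch. Only the page size of 100 and the data/metadata/controls split come from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Fragment:
    data: List[Triple]        # data: one page of triples matching the pattern
    total_count: int          # metadata: total number of matches
    next_page: Optional[str]  # control: link to the next page, None on the last page


def paginate(matches: List[Triple], page_size: int = 100,
             base_url: str = "http://example.org/fragment") -> List[Fragment]:
    """Split all matches of one triple pattern into fragment pages."""
    # Ceiling division; an empty result still yields one (empty) page.
    n_pages = max(1, -(-len(matches) // page_size))
    pages = []
    for i in range(n_pages):
        chunk = matches[i * page_size:(i + 1) * page_size]
        # Hypothetical paging control: page numbers start at 1.
        next_page = f"{base_url}?page={i + 2}" if i + 1 < n_pages else None
        pages.append(Fragment(chunk, len(matches), next_page))
    return pages
```

With 135 matches, a client would receive the first 100 triples, learn the total count of 135 from the metadata, and follow the control to fetch the remaining 35.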
  13. How can intelligent clients solve SPARQL queries over fragments?
      Give them a SPARQL query.
      Give them a URL of any dataset fragment.
      They look inside the fragment to see how to access the dataset and use the metadata to decide how to plan the query.
  14. Let’s follow the execution of an example SPARQL query: find names of artists born in Padua, Italy.
        SELECT ?artist ?name WHERE {
          ?artist a dbpedia-owl:Artist;
                  rdfs:label ?name;
                  dbpedia-owl:birthPlace dbpedia:Padua.
          FILTER LANGMATCHES(LANG(?name), "EN")
        }
      Fragment: http://fragments.dbpedia.org/2014/en
  15. The client looks inside the fragment to see how to access the dataset.
      Fragment: http://fragments.dbpedia.org/2014/en
        <http://fragments.dbpedia.org/2014/en#dataset> hydra:search [
          hydra:template "http://fragments.dbpedia.org/2014/en{?subject,predicate,object}";
          hydra:mapping [ hydra:variable "subject";   hydra:property rdf:subject ],
                        [ hydra:variable "predicate"; hydra:property rdf:predicate ],
                        [ hydra:variable "object";    hydra:property rdf:object ]
        ].
      “I can query the dataset by triple pattern.”
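To make the Hydra control concrete, here is a minimal sketch of how a client might turn the template into a request URL for one triple pattern. It handles only a single form-style `{?a,b,c}` expression (the RFC 6570 form used in the template above) and is not a full URI Template implementation. The function name `expand_template` is made up for this sketch, and the full IRIs for `dbpedia-owl:birthPlace` and `dbpedia:Padua` assume the usual DBpedia namespaces, which the slides do not spell out.

```python
from urllib.parse import quote


def expand_template(template: str, bindings: dict) -> str:
    """Expand one form-style query expression such as {?subject,predicate,object}."""
    start = template.index("{?")
    end = template.index("}", start)
    names = template[start + 2:end].split(",")
    # Variables without a value are left out, so they act as wildcards.
    pairs = [f"{name}={quote(bindings[name], safe='')}"
             for name in names if bindings.get(name) is not None]
    query = "?" + "&".join(pairs) if pairs else ""
    return template[:start] + query + template[end + 1:]


TEMPLATE = "http://fragments.dbpedia.org/2014/en{?subject,predicate,object}"

# Pattern: ?artist dbpedia-owl:birthPlace dbpedia:Padua (subject stays unbound).
url = expand_template(TEMPLATE, {
    "predicate": "http://dbpedia.org/ontology/birthPlace",  # assumed namespace
    "object": "http://dbpedia.org/resource/Padua",          # assumed namespace
})
```

Leaving the subject unbound simply omits it from the query string; with no bindings at all, the function returns the base fragment URL.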
  16. The client splits the query into the available fragments.
        SELECT ?artist ?name WHERE {
          ?artist a dbpedia-owl:Artist;
                  rdfs:label ?name;
                  dbpedia-owl:birthPlace dbpedia:Padua.
          FILTER LANGMATCHES(LANG(?name), "EN")
        }
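One way to picture the split: the query body becomes three triple patterns, each of which corresponds to one fragment. The sketch below assumes that the FILTER is then applied by the client to the solutions it assembles, since a triple pattern interface only answers triple patterns. The tuple representation, the `(text, language tag)` encoding of literals, and the names `PATTERNS`, `lang_matches`, and `keep` are assumptions for illustration.

```python
# The three triple patterns of the example query ("a" abbreviates rdf:type).
PATTERNS = [
    ("?artist", "a", "dbpedia-owl:Artist"),
    ("?artist", "rdfs:label", "?name"),
    ("?artist", "dbpedia-owl:birthPlace", "dbpedia:Padua"),
]


def lang_matches(tag: str, lang_range: str) -> bool:
    """Basic language-range matching, case-insensitive, as used by LANGMATCHES."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")


def keep(solution: dict) -> bool:
    """Client-side check for FILTER LANGMATCHES(LANG(?name), "EN").

    Assumes ?name is bound to a (text, language tag) pair.
    """
    _, lang = solution["?name"]
    return lang_matches(lang, "EN")
```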
  17. The client gets the fragments and inspects their metadata.
        ?artist a dbpedia-owl:Artist.             first 100 triples, total count 96,000
        ?artist rdfs:label ?name.                 first 100 triples, total count 12,000,000
        ?artist dbont:birthPlace dbpedia:Padua.   first 100 triples, total count 135
  18. The metadata enables the client to choose the right starting point.
        ?artist a dbpedia-owl:Artist.             96,000
        ?artist rdfs:label ?name.                 12,000,000
        ?artist dbont:birthPlace dbpedia:Padua.   135
          dbpedia:Alberto_Benettin dbont:birthPlace dbpedia:Padua.
          dbpedia:Alberto_Bigon dbont:birthPlace dbpedia:Padua.
      The pattern with the fewest matches (135) is the starting point. For each of its matches, the remaining patterns become more specific:
        dbp:Alberto_Benettin a dbont:Artist.
        dbp:Alberto_Benettin rdfs:label ?name.
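The planning step on this slide can be sketched in a few lines: pick the pattern with the lowest total count, then substitute each of its matches into the remaining patterns. This is a simplified illustration under the assumption that patterns are plain string tuples; it is not the actual client code, and the helper names `pick_start`, `match`, and `bind` are invented here.

```python
def is_var(term: str) -> bool:
    return term.startswith("?")


def pick_start(counts: dict) -> tuple:
    """Return the pattern whose fragment metadata reports the fewest matches."""
    return min(counts, key=counts.get)


def match(pattern: tuple, triple: tuple):
    """Return the variable bindings if the triple matches the pattern, else None."""
    solution = {}
    for p, t in zip(pattern, triple):
        if is_var(p):
            if solution.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return solution


def bind(pattern: tuple, solution: dict) -> tuple:
    """Replace bound variables in a pattern, making it more specific."""
    return tuple(solution.get(term, term) for term in pattern)


# Total counts as shown in the fragment metadata on the slide.
counts = {
    ("?artist", "a", "dbont:Artist"): 96_000,
    ("?artist", "rdfs:label", "?name"): 12_000_000,
    ("?artist", "dbont:birthPlace", "dbpedia:Padua"): 135,
}
start = pick_start(counts)
solution = match(start, ("dbpedia:Alberto_Benettin", "dbont:birthPlace", "dbpedia:Padua"))
remaining = [bind(p, solution) for p in counts if p != start]
```

Starting from the 135-match pattern keeps the number of follow-up requests small, whereas starting from the 12,000,000 labels would require paging through far more data.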
  19. Clients execute the query in 3 seconds on a highly available, low-cost server.
        SELECT ?artist ?name WHERE {
          ?artist a dbpedia-owl:Artist;
                  rdfs:label ?name;
                  dbpedia-owl:birthPlace dbpedia:Padua.
          FILTER LANGMATCHES(LANG(?name), "EN")
        }
      Try it yourself: bit.ly/artistspadua
  20. The query throughput is lower, but resilient to high client numbers.
      [Fig. 3.1: Server performance (log-log plot). Executed SPARQL queries per hour (throughput, q/hr) against number of clients; series include Virtuoso 6, Fuseki–tdb, and triple pattern fragments.]
  21. The server traffic is higher, but requests are significantly lighter.
      [Fig. 3.2: Server network traffic. Data sent by server in MB against number of clients; series: Virtuoso 6, Virtuoso 7, Fuseki–tdb, Fuseki–hdt, triple pattern fragments.]
  22. Caching is significantly more effective, as clients reuse fragments for queries.
      [Fig. 3.4: Cache network traffic. Data sent by cache in MB against number of clients.]
  23. The server uses much less CPU, allowing for higher availability.
      [Fig. 3.5: Server processor usage per core. CPU use (%) against number of clients.]
  24. Outline: Linked Data Fragments; Triple Pattern Fragments; Federated querying
  25. Federated querying is native to Triple Pattern Fragment clients.
      Every query is decomposed locally.
      Clients send simple requests to a server.
      For clients, it doesn’t matter which server they send queries to.
  26. For federation, we just send queries to multiple servers.
      No prior source selection.
      Each triple pattern is sent to all servers.
      If a certain pattern has no result, just don’t send more specific patterns.
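The federation strategy above can be sketched with in-memory stand-ins for the servers. Everything here is a simplified assumption for illustration: servers are plain lists of triples, `federated_fragment` plays the role of sending one pattern to all servers, and the `ex:` data is made up. The real client works over HTTP with paged fragments and count metadata.

```python
def is_var(term):
    return term.startswith("?")


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    solution = {}
    for p, t in zip(pattern, triple):
        if is_var(p):
            if solution.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return solution


def bind(pattern, solution):
    return tuple(solution.get(term, term) for term in pattern)


def federated_fragment(servers, pattern):
    """No source selection: ask every server and take the union of the matches."""
    return [t for triples in servers.values() for t in triples
            if match(pattern, t) is not None]


def evaluate(servers, patterns, solution=None):
    """Yield solution mappings for a list of triple patterns."""
    solution = solution or {}
    if not patterns:
        yield solution
        return
    bound = [bind(p, solution) for p in patterns]
    fragments = [federated_fragment(servers, p) for p in bound]
    # A pattern without results cannot gain results by becoming more specific,
    # so this branch stops before sending any more specific patterns.
    if any(not f for f in fragments):
        return
    i = min(range(len(bound)), key=lambda k: len(fragments[k]))
    rest = bound[:i] + bound[i + 1:]
    for triple in fragments[i]:
        yield from evaluate(servers, rest, {**solution, **match(bound[i], triple)})


# Hypothetical data spread over two servers.
servers = {
    "A": [("ex:alice", "ex:birthPlace", "ex:Padua"),
          ("ex:bob", "ex:birthPlace", "ex:Rome")],
    "B": [("ex:alice", "rdfs:label", "Alice"),
          ("ex:bob", "rdfs:label", "Bob")],
}
query = [("?p", "ex:birthPlace", "ex:Padua"), ("?p", "rdfs:label", "?name")]
results = list(evaluate(servers, query))
```

The client never needs to know in advance which server holds which triples: the join across servers A and B falls out of sending each pattern everywhere and combining the matches locally.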
  27. Federation compares pretty well to SPARQL endpoint federation.
      [Table: FedBench recall per query (LD and LS queries) for TPF, ANAPSID, ANAPSID EG, FedX (warm), and SPLENDID.]
  28. Federation compares pretty well to SPARQL endpoint federation.
      [Table: recall per query (LD, LS, CD, and C queries, including complex queries) for TPF, ANAPSID, ANAPSID EG, FedX (warm), and SPLENDID.]
  29. Federation compares pretty well, even time-wise in some cases.
      [Figure: evaluation times of FedBench query execution on the TPF client/server setup compared to SPARQL endpoint federation systems. Execution time (s) per query (LD, CD, LS, C) for TPF, ANAPSID, ANAPSID EG, FedX, and SPLENDID.]
  30. Federation compares pretty well, even time-wise in some cases.
      [Figure: evaluation times of FedBench query execution on the TPF client/server setup compared to SPARQL endpoint federation systems, including complex queries. Execution time (s) per query (LD, CD, LS, C) for TPF, ANAPSID, ANAPSID EG, FedX, and SPLENDID. These measurements should be considered together with the recall for each query. The TPF-related measurements were performed in the context of this article; the numbers for the four SPARQL endpoint federation …]
  31. Note the different setup in the previous comparisons.
      SPARQL endpoint federation was measured with local servers.
      Triple Pattern Fragments federation was measured over the Web.
  32. Outline: Linked Data Fragments; Triple Pattern Fragments; Federated querying
  33. Triple Pattern Fragments are easy: all software is available as open source.
      Software: github.com/LinkedDataFragments
      Documentation and specification: linkeddatafragments.org
  34. More than 650,000 TPF interfaces are available for federated querying.
      fragments.dbpedia.org
      lodlaundromat.org/wardrobe/
      data.linkeddatafragments.org
  35. tutorial.linkeddatafragments.org
