Triple Pattern Fragments
Ruben Taelman - @rubensworks
imec - Ghent University
1
Evaluate SPARQL queries client-side with TPF
SELECT ?person ?city WHERE {
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
}
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
2
Publishing with Triple Pattern Fragments
Server
Client
Evaluations
3
Simple triple pattern interface
Triple pattern queries, paged results
Example: s1 p1 o1 at page 2
http://example.org/my-dataset?subject=s1&predicate=p1&object=o1&page=2
Metadata and controls
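
The request URL in the example above can be assembled from the three pattern components and a page number. The following is a minimal sketch, assuming the parameter names `subject`, `predicate`, `object` and `page` shown on the slide; the helper name `fragment_url` is hypothetical, and a real client would normally discover the parameter names from the controls in the server's response rather than hard-code them.

```python
from urllib.parse import urlencode


def fragment_url(base, subject=None, predicate=None, obj=None, page=1):
    """Build the URL of one page of a triple pattern fragment.

    Components left as None act as wildcards and are omitted from the URL.
    The parameter names follow the slide's example (an assumption here).
    """
    params = {}
    if subject is not None:
        params["subject"] = subject
    if predicate is not None:
        params["predicate"] = predicate
    if obj is not None:
        params["object"] = obj
    if page > 1:
        params["page"] = page
    return base + ("?" + urlencode(params) if params else "")


url = fragment_url("http://example.org/my-dataset", "s1", "p1", "o1", page=2)
print(url)
```

Because every request is fully described by its URL, two clients asking for the same pattern and page produce the same URL.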
5
Effective caching because of the limited number of URIs
Frequently used TPFs are cached
Cached TPFs can be delivered efficiently
Queries with common data fragments will evaluate faster
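
The caching effect can be illustrated with a small sketch. Here a plain dictionary keyed by fragment URL stands in for an HTTP cache, and `fake_fetch` is a hypothetical placeholder for a real HTTP request; neither is part of any TPF library.

```python
class CachingFetcher:
    """Wrap a fetch function with a cache keyed by fragment URL."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._cache:
            self.hits += 1
        else:
            # Only the first request for a URL reaches the server.
            self.misses += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def fake_fetch(url):
    # Hypothetical stand-in for an HTTP GET on the fragment URL.
    return "response for " + url


fetcher = CachingFetcher(fake_fetch)
a = "http://example.org/my-dataset?predicate=p1"
b = "http://example.org/my-dataset?predicate=p2"
for url in [a, b, a, a]:
    fetcher.get(url)
print(fetcher.hits, fetcher.misses)
```

Two queries that share a triple pattern request the same URL, so the second one is served from the cache.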
6
Storage solution must support triple pattern queries
In-memory triplestore for RDF files
HDT (Fernández 2010)
SPARQL endpoints
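
What "supporting triple pattern queries" asks of a storage back end can be sketched with an in-memory list of triples. The data, the page size and the function names below are illustrative assumptions, not the behaviour of any particular TPF server.

```python
PAGE_SIZE = 2  # assumed small page size, for illustration only

TRIPLES = [
    ("s1", "p1", "o1"),
    ("s1", "p1", "o2"),
    ("s1", "p2", "o1"),
    ("s2", "p1", "o1"),
    ("s3", "p1", "o1"),
]


def matches(triple, pattern):
    """A pattern component of None is a wildcard; others must be equal."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


def fragment(triples, pattern, page=1, page_size=PAGE_SIZE):
    """Return one page of matching triples plus the total count as metadata."""
    found = [t for t in triples if matches(t, pattern)]
    start = (page - 1) * page_size
    return {
        "triples": found[start:start + page_size],
        "count": len(found),
        "page": page,
    }


result = fragment(TRIPLES, (None, "p1", "o1"), page=2)
print(result)
```

Any store that can answer `fragment` for the eight possible combinations of bound and unbound components is sufficient for the server.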
7
Server is simple
In order to make publication cheap and easy
8
Publishing with Triple Pattern Fragments
Server
Client
Evaluations
9
Clients can evaluate any SPARQL query
using the simple TPF interface of one or more servers
Split up SPARQL queries into separate triple pattern queries
Combine results client-side
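
Combining results client-side amounts to joining the variable bindings obtained for each triple pattern. The sketch below uses a simple nested-loop join; the `ex:` values are made-up placeholders, and the function names are hypothetical.

```python
def compatible(a, b):
    """Two bindings are compatible if they agree on every shared variable."""
    return all(b[var] == value for var, value in a.items() if var in b)


def join(left, right):
    """Nested-loop join of two lists of variable bindings."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


# Bindings for "?person dbpedia-owl:birthPlace ?city." (toy data)
born_in = [
    {"?person": "dbp:Adam_DeVine", "?city": "dbp:Waterloo,_Iowa"},
    {"?person": "ex:Someone", "?city": "ex:Elsewhere"},
]
# Bindings for '?city foaf:name "Waterloo"@en.' (toy data)
named_waterloo = [
    {"?city": "dbp:Waterloo,_Iowa"},
    {"?city": "dbp:Waterloo,_London"},
]

solutions = join(born_in, named_waterloo)
print(solutions)
```

Downloading both fragments in full before joining is rarely practical for large patterns, which is why the order of evaluation matters.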
10
Joining triple pattern fragments can be tricky
SELECT ?person ?city WHERE {
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
}
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
Some query plans are more efficient than others
11
Number of triples as metadata in TPFs
?person rdf:type dbpedia-owl:Artist.          96 000
?person dbpedia-owl:birthPlace ?city.     12 000 000
?city foaf:name "Waterloo"@en.                    26
12
Select most selective triple pattern
?person rdf:type dbpedia-owl:Artist.          96 000
?person dbpedia-owl:birthPlace ?city.     12 000 000
?city foaf:name "Waterloo"@en.                    26   Most selective!
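
Selecting the starting pattern is a minimum over the count metadata. This sketch reads the slide's numbers as 96 000 for the `rdf:type` pattern, 12 000 000 for the `birthPlace` pattern and 26 for the `foaf:name` pattern; the function name is hypothetical.

```python
# Estimated number of matching triples per pattern, as reported
# in the metadata of each fragment (values taken from the slide).
counts = {
    "?person rdf:type dbpedia-owl:Artist.": 96_000,
    "?person dbpedia-owl:birthPlace ?city.": 12_000_000,
    '?city foaf:name "Waterloo"@en.': 26,
}


def most_selective(counts):
    """Return the pattern with the fewest estimated matches."""
    return min(counts, key=counts.get)


start = most_selective(counts)
print(start)
```

Only the first page of each fragment is needed to read its count, so this choice costs one request per pattern.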
13
Find all results for most selective pattern
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
dbp:Waterloo,_Iowa foaf:name "Waterloo"@en
dbp:Waterloo,_London foaf:name "Waterloo"@en
dbp:Waterloo,_Ontario foaf:name "Waterloo"@en
...
14
Fill in results in other patterns
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa.
?city foaf:name "Waterloo"@en.
dbp:Waterloo,_Iowa foaf:name "Waterloo"@en
dbp:Waterloo,_London foaf:name "Waterloo"@en
dbp:Waterloo,_Ontario foaf:name "Waterloo"@en
...
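
Filling in a result means substituting the bound variable in each remaining pattern. A minimal sketch, with patterns written as tuples of strings and a hypothetical helper `bind`:

```python
def bind(pattern, bindings):
    """Replace every variable that has a binding by its value."""
    return tuple(bindings.get(term, term) for term in pattern)


remaining = [
    ("?person", "rdf:type", "dbpedia-owl:Artist"),
    ("?person", "dbpedia-owl:birthPlace", "?city"),
]

# One result of the most selective pattern binds ?city.
bound = [bind(p, {"?city": "dbp:Waterloo,_Iowa"}) for p in remaining]
for pattern in bound:
    print(pattern)
```

Patterns that do not mention the bound variable are left unchanged.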
15
Recursively repeat for remaining patterns
?person rdf:type dbpedia-owl:Artist.                     96 000
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .          45
16
Select most selective pattern
?person rdf:type dbpedia-owl:Artist.                     96 000
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .          45
17
Find all results for most selective pattern
?person rdf:type dbpedia-owl:Artist.                     96 000
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .          45
dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...
18
Fill in results in other patterns
dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .
96 000
45
dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...
19
One solution is found
dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist.     1
Repeat process for all other matches
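
The whole walkthrough (pick the most selective pattern, fetch its results, fill them into the other patterns, recurse) can be sketched in a few lines. This is an illustration under simplifying assumptions, not the actual client implementation: a local list replaces the server, counts are obtained by matching locally instead of reading metadata, and the toy data only loosely mirrors the slides.

```python
def is_var(term):
    return term.startswith("?")


def match(data, pattern):
    """Return one binding dict per triple in data that matches the pattern."""
    results = []
    for triple in data:
        bindings = {}
        for p, t in zip(pattern, triple):
            if is_var(p):
                if bindings.get(p, t) != t:
                    break  # same variable bound to two different values
                bindings[p] = t
            elif p != t:
                break  # constant does not match
        else:
            results.append(bindings)
    return results


def bind(pattern, bindings):
    return tuple(bindings.get(term, term) for term in pattern)


def evaluate(data, patterns):
    """Recursively evaluate a list of triple patterns."""
    if not patterns:
        return [{}]
    # Stand-in for the count metadata of each fragment.
    best = min(range(len(patterns)), key=lambda i: len(match(data, patterns[i])))
    rest = patterns[:best] + patterns[best + 1:]
    solutions = []
    for bindings in match(data, patterns[best]):
        filled = [bind(p, bindings) for p in rest]
        for tail in evaluate(data, filled):
            solutions.append({**bindings, **tail})
    return solutions


# Toy data for illustration; not DBpedia's actual content.
DATA = [
    ("dbp:Waterloo,_Iowa", "foaf:name", '"Waterloo"@en'),
    ("dbp:Waterloo,_London", "foaf:name", '"Waterloo"@en'),
    ("dbp:Allan_Carpenter", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Adam_DeVine", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Allan_Carpenter", "rdf:type", "dbo:Artist"),
    ("ex:Someone", "rdf:type", "dbo:Artist"),
]
QUERY = [
    ("?person", "rdf:type", "dbo:Artist"),
    ("?person", "dbo:birthPlace", "?city"),
    ("?city", "foaf:name", '"Waterloo"@en'),
]

solutions = evaluate(DATA, QUERY)
print(solutions)
```

A pattern with zero matches is always the most selective one, so dead-end branches are abandoned after a single lookup.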
20
Joining algorithm is not always optimal
Other algorithms are possible
Improved algorithm minimizes the number of requests (Van Herwegen 2015)
Additional metadata may improve query plans
21
Publishing with Triple Pattern Fragments
Server
Client
Evaluations
22
Send many client queries to a single server
1 server (TPF, Virtuoso, Fuseki)
1 - 244 simultaneous clients
Different query types from Berlin SPARQL benchmark
23
(Verborgh 2016)
Query throughput is lower with TPF
24
Server load is lower with TPF
25
TPF is just one possible trade-off
Reduce client load:
Additional metadata for membership search (Vander Sande 2015)
Substring filtering (Van Herwegen 2015)
Dynamic data publication and querying (Taelman 2016)
Reduce server load:
Decentralized caching (Folz 2016)
26
Conclusions
TPF servers have a simple low-cost interface
TPF clients evaluate SPARQL queries locally, using this interface
27
Next session
Setting up a TPF server yourself
Querying the server
28
Sources
R. Verborgh. “Linked Data Publishing”.
http://rubenverborgh.github.io/WebFundamentals/linked-data-publishing/
R. Verborgh, M. Vander Sande, O. Hartig, et al. “Triple Pattern Fragments: a Low-cost Knowledge Graph Interface for the Web”.
29

EKAW - Triple Pattern Fragments
