6.
Caching is effective because the number of URIs is limited
Frequently used TPFs are cached
Cached TPFs can be delivered efficiently
Queries with common data fragments will evaluate faster
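A minimal sketch of why a small URI space helps caching: the same triple pattern always maps to the same URL, so a repeated request can be answered from a cache. The base URL, the `subject`/`predicate`/`object` parameter names and the `fetch_fragment` stand-in are assumptions for illustration; a real client would read the URL template from the server's responses and issue an actual HTTP request.

```python
from functools import lru_cache
from urllib.parse import urlencode

BASE_URL = "http://example.org/dataset"  # hypothetical fragment endpoint


def fragment_url(subject=None, predicate=None, obj=None):
    """Build the URL of one triple pattern fragment; unbound positions are omitted."""
    params = {"subject": subject, "predicate": predicate, "object": obj}
    bound = {k: v for k, v in params.items() if v is not None}
    return BASE_URL + ("?" + urlencode(bound) if bound else "")


fetch_count = 0  # how often the (simulated) server is really contacted


@lru_cache(maxsize=1024)
def fetch_fragment(url):
    """Stand-in for an HTTP GET; a real client would download the fragment here."""
    global fetch_count
    fetch_count += 1
    return f"<fragment for {url}>"


url_a = fragment_url(predicate="rdf:type", obj="dbpedia-owl:Artist")
url_b = fragment_url(predicate="rdf:type", obj="dbpedia-owl:Artist")
fetch_fragment(url_a)
fetch_fragment(url_b)  # same URL, so this call is answered from the cache
```

The same idea applies to shared HTTP caches between clients and server, since the cache key is just the fragment URL.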
7.
Storage solution must support triple pattern queries
In-memory triplestore for RDF files
HDT (Fernández 2010)
SPARQL endpoints
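The only requirement on the storage layer is that it can answer triple pattern queries. A minimal in-memory sketch, assuming triples are plain string tuples and `None` marks an unbound position; `MemoryTripleStore` is a hypothetical name, not part of any TPF server:

```python
class MemoryTripleStore:
    """Toy in-memory store that answers triple pattern queries by scanning."""

    def __init__(self, triples):
        self.triples = list(triples)

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None acts as a variable."""
        return [
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        ]

    def count(self, s=None, p=None, o=None):
        """Number of matches, usable as count metadata for a fragment."""
        return len(self.match(s, p, o))


store = MemoryTripleStore([
    ("dbp:Allan_Carpenter", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Adam_DeVine", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Allan_Carpenter", "rdf:type", "dbpedia-owl:Artist"),
])
born_in_waterloo = store.match(p="dbo:birthPlace", o="dbp:Waterloo,_Iowa")
```

A linear scan is enough to show the interface; a backend such as HDT answers the same kind of lookup with indexes.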
8.
Server is simple
In order to make publication cheap and easy
9.
Publishing with Triple Pattern Fragments
Server
Client
Evaluations
10.
Clients can evaluate any SPARQL query
using the simple TPF interface of one or more servers
Split up SPARQL queries into separate triple pattern queries
Combine results client-side
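A rough sketch of the client-side combination step, assuming each triple pattern has already been sent to the server and its matching triples are available as tuples. The helper names `bind` and `join` and the two toy fragments are illustrative only, not actual server responses:

```python
def bind(pattern, triple):
    """Match one triple against a pattern; return variable bindings or None."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if bindings.get(term, value) != value:
                return None  # same variable would need two different values
            bindings[term] = value
        elif term != value:
            return None  # constant in the pattern does not match
    return bindings


def join(left, right):
    """Combine two lists of bindings, keeping pairs that agree on shared variables."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[v] == r[v] for v in l.keys() & r.keys())
    ]


pattern1 = ("?person", "rdf:type", "dbpedia-owl:Artist")
pattern2 = ("?person", "dbo:birthPlace", "?city")

# Toy fragments standing in for what the server would return per pattern.
fragment1 = [("dbp:Allan_Carpenter", "rdf:type", "dbpedia-owl:Artist")]
fragment2 = [
    ("dbp:Allan_Carpenter", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Adam_DeVine", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
]

left = [b for t in fragment1 if (b := bind(pattern1, t)) is not None]
right = [b for t in fragment2 if (b := bind(pattern2, t)) is not None]
solutions = join(left, right)
```

Downloading every fragment in full and joining afterwards is the naive plan; it works, but it can transfer far more data than necessary.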
11.
Joining triple pattern fragments can be tricky
SELECT ?person ?city WHERE {
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
}
Some query plans are more efficient than others
12.
Number of triples as metadata in TPFs
?person rdf:type dbpedia-owl:Artist.        96 000
?person dbpedia-owl:birthPlace ?city.       12 000 000
?city foaf:name "Waterloo"@en.              26
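The count metadata lets a client choose where to start. A small sketch, assuming the three counts belong to the three patterns in the order read from the slide (96 000, 12 000 000 and 26):

```python
# Estimated number of matches per triple pattern, as reported in fragment metadata.
counts = {
    "?person rdf:type dbpedia-owl:Artist.": 96_000,
    "?person dbpedia-owl:birthPlace ?city.": 12_000_000,
    '?city foaf:name "Waterloo"@en.': 26,
}

# Start with the pattern that has the fewest matches (the most selective one).
start = min(counts, key=counts.get)

# Full ordering from most to least selective, useful as a simple query plan.
plan = sorted(counts, key=counts.get)
```

Starting from the pattern with 26 matches leaves at most 26 candidate cities to follow up, instead of paging through millions of birthPlace triples.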
18.
Find all results for most selective pattern
?person rdf:type dbpedia-owl:Artist.                      96 000
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .       45
dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...
19.
Fill in results in other patterns
dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .
96 000
45
dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...
20.
One solution is found
dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist.          1
Repeat process for all other matches
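The walkthrough on the preceding slides can be sketched as a recursive procedure: take the pattern with the fewest matches, fetch its results, fill each result into the remaining patterns and repeat. This is a simplified illustration under stated assumptions, not the reference client: `fragment` is a stand-in for a TPF request over toy data, counts are measured locally instead of being read from metadata, and the `ex:` artists are invented filler.

```python
TRIPLES = [
    ("dbp:Allan_Carpenter", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Adam_DeVine", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Bonnie_Koloc", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Allan_Carpenter", "rdf:type", "dbpedia-owl:Artist"),
]
# Invented filler so that the Artist pattern is the less selective one.
TRIPLES += [(f"ex:artist{i}", "rdf:type", "dbpedia-owl:Artist") for i in range(10)]


def is_var(term):
    return term.startswith("?")


def fragment(pattern):
    """Stand-in for a TPF request: all triples matching one pattern."""
    return [
        t for t in TRIPLES
        if all(is_var(p) or p == v for p, v in zip(pattern, t))
    ]


def substitute(pattern, bindings):
    """Replace variables that already have a value."""
    return tuple(bindings.get(term, term) for term in pattern)


def evaluate(patterns, bindings=None):
    """Yield solutions, always expanding the pattern with the fewest matches."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    bound = [substitute(p, bindings) for p in patterns]
    # A real client would read this count from the fragment's metadata.
    best = min(range(len(bound)), key=lambda i: len(fragment(bound[i])))
    rest = patterns[:best] + patterns[best + 1:]
    for triple in fragment(bound[best]):
        new = {t: v for t, v in zip(bound[best], triple) if is_var(t)}
        yield from evaluate(rest, {**bindings, **new})


patterns = [
    ("?person", "rdf:type", "dbpedia-owl:Artist"),
    ("?person", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
]
solutions = list(evaluate(patterns))
```

With this toy data the birthPlace pattern is expanded first, and only dbp:Allan_Carpenter survives the check against the Artist pattern. The sketch does not handle a variable that occurs twice within one pattern.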
21.
Joining algorithm is not always optimal
Other algorithms are possible
An improved algorithm minimizes the number of requests (Van Herwegen 2015)
Additional metadata may improve query plans
22.
Publishing with Triple Pattern Fragments
Server
Client
Evaluations
23.
Send many client queries to a single server
1 server (TPF, Virtuoso, Fuseki)
1 to 244 simultaneous clients
Different query types from Berlin SPARQL benchmark
(Verborgh 2016)
26.
TPF is just one possible trade-off
Reduce client load:
Additional metadata for membership search (Vander Sande 2015)
Substring filtering (Van Herwegen 2015)
Dynamic data publication and querying (Taelman 2016)
Reduce server load:
Decentralized caching (Folz 2016)
27.
Conclusions
TPF servers have a simple low-cost interface
TPF clients evaluate SPARQL queries locally, using this interface
28.
Next session
Setting up a TPF server yourself
Querying the server
29.
Sources
R. Verborgh. "Linked Data Publishing". http://rubenverborgh.github.io/WebFundamentals/linked-data-publishing/
R. Verborgh, M. Vander Sande, O. Hartig, et al. "Triple Pattern Fragments: a Low-cost Knowledge Graph Interface for the Web".