6. Effective caching because of the limited number of URIs
Frequently used TPFs are cached
Cached TPFs can be delivered efficiently
Queries with common data fragments will evaluate faster
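The caching effect can be sketched in a few lines of Python. This is only an illustration: `fetch_fragment` is a hypothetical stand-in for an HTTP GET, the `example.org` URL is made up, and a real deployment would more likely rely on a standard HTTP cache or reverse proxy than on an in-process cache.

```python
from functools import lru_cache

# Counts how often the (simulated) server is actually contacted.
server_hits = 0

@lru_cache(maxsize=1024)
def fetch_fragment(url: str) -> str:
    """Hypothetical stand-in for an HTTP GET of one TPF page."""
    global server_hits
    server_hits += 1
    return f"<fragment for {url}>"

# Two different queries that share a triple pattern request the same URL.
shared = "http://example.org/dataset?predicate=rdf%3Atype&object=dbpedia-owl%3AArtist"
first = fetch_fragment(shared)   # first query: reaches the server
second = fetch_fragment(shared)  # second query: answered from the cache
```

Because every triple pattern maps to one URL, the second request never reaches the server.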
7. Storage solution must support triple pattern queries
In-memory triplestore for RDF files
HDT (Fernández 2010)
SPARQL endpoints
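What "supporting triple pattern queries" means can be shown with a minimal in-memory sketch. The function name `match_pattern` and the sample triples are illustrative assumptions; real backends such as HDT use indexes rather than a linear scan.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]

def match_pattern(triples: Iterable[Triple],
                  s: Optional[str] = None,
                  p: Optional[str] = None,
                  o: Optional[str] = None) -> Iterator[Triple]:
    """Yield the triples matching a pattern; None acts as a variable."""
    for t in triples:
        if ((s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)):
            yield t

# Toy data taken from the example used on the later slides.
triples = [
    ("dbp:Allan_Carpenter", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Adam_DeVine", "dbo:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Allan_Carpenter", "rdf:type", "dbo:Artist"),
]

# Pattern: ?person dbo:birthPlace dbp:Waterloo,_Iowa
matches = list(match_pattern(triples, p="dbo:birthPlace", o="dbp:Waterloo,_Iowa"))
```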
10. Clients can evaluate any SPARQL query
using the simple TPF interface of one or more servers
Split up SPARQL queries into separate triple pattern queries
Combine results client-side
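As a rough sketch of the splitting step, each triple pattern of a query can be turned into one request URL. The parameter names `subject`, `predicate` and `object` and the base URL are assumptions for illustration; an actual TPF client discovers the URL template from the hypermedia controls in the server's response.

```python
from urllib.parse import urlencode

def fragment_url(base: str, pattern: tuple) -> str:
    """Build a request URL for one triple pattern.

    Variables (terms starting with '?') are left out of the query string.
    """
    names = ("subject", "predicate", "object")
    params = {name: term for name, term in zip(names, pattern)
              if not term.startswith("?")}
    return f"{base}?{urlencode(params)}"

# The three triple patterns of the example query on the next slide.
patterns = [
    ("?person", "rdf:type", "dbpedia-owl:Artist"),
    ("?person", "dbpedia-owl:birthPlace", "?city"),
    ("?city", "foaf:name", '"Waterloo"@en'),
]
urls = [fragment_url("http://example.org/dataset", p) for p in patterns]
```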
11. Joining triple pattern fragments can be tricky
SELECT ?person ?city WHERE {
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
}
?person rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace ?city.
?city foaf:name "Waterloo"@en.
Some query plans are more efficient than others
12. Number of triples as metadata in TPFs
?person rdf:type dbpedia-owl:Artist.        96 000
?person dbpedia-owl:birthPlace ?city.   12 000 000
?city foaf:name "Waterloo"@en.                  26
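The count metadata lets a client pick its starting point. The sketch below assumes the counts map to the patterns as 96 000 for the `rdf:type` pattern, 12 000 000 for the `birthPlace` pattern and 26 for the `foaf:name` pattern; the variable names are illustrative.

```python
# Estimated number of matching triples per pattern (assumed mapping, see above).
counts = {
    "?person rdf:type dbpedia-owl:Artist.": 96_000,
    "?person dbpedia-owl:birthPlace ?city.": 12_000_000,
    '?city foaf:name "Waterloo"@en.': 26,
}

# The most selective pattern is the one with the fewest matches.
most_selective = min(counts, key=counts.get)
```

Starting from the pattern with the fewest matches keeps the number of follow-up requests small.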
18. Find all results for most selective pattern
?person rdf:type dbpedia-owl:Artist.                    96 000
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa.          45
dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...
19. Fill in results in other patterns
dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist.
?person dbpedia-owl:birthPlace dbp:Waterloo,_Iowa .
96 000
45
dbp:Allan_Carpenter dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Adam_DeVine dbo:birthPlace dbp:Waterloo,_Iowa.
dbp:Bonnie_Koloc dbo:birthPlace dbp:Waterloo,_Iowa.
...
20. One solution is found
dbp:Allan_Carpenter rdf:type dbpedia-owl:Artist. 1
Repeat process for all other matches
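The whole procedure (pick the most selective pattern, fetch its matches, fill them into the remaining patterns, repeat) can be sketched as a recursive function. This is a simplified illustration over an in-memory list, not the actual client implementation: `match`, `bind` and `solve` are hypothetical names, the toy data is a tiny made-up subset, a single prefix is used because terms are compared as plain strings, and patterns that repeat a variable are not handled.

```python
def is_var(term):
    return term.startswith("?")

def match(triples, pattern):
    """Return all triples matching a pattern; variables match anything."""
    return [t for t in triples
            if all(is_var(p) or p == v for p, v in zip(pattern, t))]

def bind(pattern, binding):
    """Substitute already-bound variables into a pattern."""
    return tuple(binding.get(term, term) for term in pattern)

def solve(triples, patterns, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    bound = [bind(p, binding) for p in patterns]
    # Pick the pattern with the fewest matches (stands in for TPF count metadata).
    i = min(range(len(bound)), key=lambda k: len(match(triples, bound[k])))
    best, rest = bound[i], bound[:i] + bound[i + 1:]
    for triple in match(triples, best):
        new = dict(binding)
        new.update({p: v for p, v in zip(best, triple) if is_var(p)})
        yield from solve(triples, rest, new)

# Toy subset of the data; not representative of DBpedia.
triples = [
    ("dbp:Allan_Carpenter", "rdf:type", "dbpedia-owl:Artist"),
    ("dbp:Allan_Carpenter", "dbpedia-owl:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Adam_DeVine", "dbpedia-owl:birthPlace", "dbp:Waterloo,_Iowa"),
    ("dbp:Waterloo,_Iowa", "foaf:name", '"Waterloo"@en'),
]
patterns = [
    ("?person", "rdf:type", "dbpedia-owl:Artist"),
    ("?person", "dbpedia-owl:birthPlace", "?city"),
    ("?city", "foaf:name", '"Waterloo"@en'),
]
solutions = list(solve(triples, patterns))
```

In a real client each call to `match` would be an HTTP request for a fragment, which is why the order in which patterns are evaluated matters.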
21. Joining algorithm is not always optimal
Other algorithms are possible
Improved algorithm minimizes the number of requests (Van Herwegen 2015)
Additional metadata may improve query plans
23. Send many client queries to a single server
1 server (TPF, Virtuoso, Fuseki)
1 - 244 simultaneous clients
Different query types from Berlin SPARQL benchmark
(Verborgh 2016)
26. TPF is just one possible trade-off
Reduce client load:
Additional metadata for membership search (Vander Sande 2015)
Substring filtering (Van Herwegen 2015)
Dynamic data publication and querying (Taelman 2016)
Reduce server load:
Decentralized caching (Folz 2016)
27. Conclusions
TPF servers have a simple low-cost interface
TPF clients evaluate SPARQL queries locally, using this interface
29. Sources
R. Verborgh. “Linked Data Publishing.”
http://rubenverborgh.github.io/WebFundamentals/linked-data-publishing/
R. Verborgh, M. Vander Sande, O. Hartig, et al. “Triple Pattern Fragments: a Low-cost Knowledge Graph Interface for the Web.”