HiBISCuS: Hypergraph-Based Source
Selection for SPARQL Endpoint
Federation
Muhammad Saleem, Axel-Cyrille Ngonga Ngomo
{las...
SPARQL Endpoint Federation
S1 S2 S3 S4
RDF RDF RDF RDF
Parser
Source Selection
Federator Optimizer
Integrator
Get individu...
Our Contribution
S1 S2 S3 S4
RDF RDF RDF RDF
Parser
Source Selection
Federator Optimizer
Integrator
Efficiently identify
c...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Motivation
FedBench (LD3): Return for all US presidents their party
membership and news pages about them.
SELECT ?presiden...
Problem Statement
• An overestimation of triple pattern-wise
source selection can be expensive
– Resources are wasted
– Qu...
HiBISCuS: Key Concept
• Makes use of the URI’s authorities
http://dbpedia.org/ontology/party
Scheme Authority Path
For URI...
HiBISCuS: SPARQL Query as Hypergraph
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President .
?presi...
HiBISCuS: SPARQL Query as Hypergraph
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President .
?presi...
HiBISCuS: SPARQL Query as Hypergraph
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President .
?presi...
HiBISCuS: SPARQL Query as Hypergraph
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President .
?presi...
HiBISCuS: SPARQL Query as Hypergraph
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President .
?presi...
HiBISCuS: SPARQL Query as Hypergraph
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President .
?presi...
HiBISCuS: Data Summaries
[] a ds:Service ;
ds:endpointUrl <http://dbpedia.org/sparql> ;
ds:capability [
ds:predicate dbped...
HiBISCuS: Triple Pattern-wise Source Selection
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:Presiden...
HiBISCuS: Triple Pattern-wise Source Pruning
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President ...
HiBISCuS: Triple Pattern-wise Source Pruning
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President ...
HiBISCuS: Triple Pattern-wise Source Pruning
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President ...
HiBISCuS: Triple Pattern-wise Source Pruning
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President ...
HiBISCuS: Triple Pattern-wise Source Pruning
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President ...
HiBISCuS: Triple Pattern-wise Source Pruning
SELECT ?president ?party ?page
WHERE {
?president rdf:type dbpedia:President ...
Experimental Setup
• Benchmark
– FedBench
– Real-world datasets collection
– Real queries showing typical request
– Used a...
Experimental Setup
• Metrics
– Index generation time
– Index size
– Total triple pattern-wise sources selected
– Total num...
Index Generation Time and Compression Ratio
FedX SPLENDIDLHD DARQ ANAPSID ADERIS Qtree HiBISCus
Index
Generation
Time (min...
Efficient Source Selection
FedBench Cross Domain Queries
System #TP #AR SST(ms)
FedX(cold) 78 252 317.8
FedX(warm) 78 0 7....
Efficient Source Selection
FedBench Cross Domain Queries
System #TP #AR SST(ms)
FedX(cold) 78 252 317.8
FedX(warm) 78 0 7....
FedX Extension with HiBISCuS
Improvement in 20/25 queries with net performance improvement 24.61%
0
50
100
150
200
250
300...
SPLENDID Extension with HiBISCuS
Improvement in 25/25 queries with net performance improvement 82.72%
0
200
400
600
800
10...
DARQ Extension with HiBISCuS
Improvement in 21/21 queries with net performance improvement 92.22%
0.01
0.1
1
10
100
1000
1...
SPLENDID+HiBISCuS vs ANAPSID
Improvement in 24/24 queries with net performance improvement 98%
0.01
0.1
1
10
100
1000
CD1 ...
Conclusion and Future Work
• An overestimation of triple pattern-wise source
selection can be expensive
• Join-aware tripl...
Thanks!
homepage and source code:
https://code.google.com/p/hibiscusfederation/
{saleem,ngonga}@informatik.uni-leipzig.de
...
Source Selection
• Triple pattern-wise source selection
– Ensures 100% recall
– Can over-estimate capable sources
– Can be...
Upcoming SlideShare
Loading in …5
×

HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation

898 views

Published on

Efficient federated query processing is of significant importance to tame the large amount of data available on the Web of Data. Previous works have focused on generating optimized query execution plans for fast result retrieval. However, devising source selection approaches beyond triple pattern-wise source selection has not received much attention. This work presents HiBISCuS, a novel hypergraph-based source selection approach to federated SPARQL querying. Our approach can be directly combined with existing SPARQL query federation engines to achieve the same recall while querying fewer data sources. We extend three well-known SPARQL query federation engines with HiBISCus and compare our extensions with the original approaches on FedBench. Our evaluation shows that HiBISCuS can efficiently reduce the total number of sources selected without losing recall. Moreover, our approach significantly reduces the execution time of the selected engines on most of the benchmark queries.

Published in: Engineering, Education
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
898
On SlideShare
0
From Embeds
0
Number of Embeds
20
Actions
Shares
0
Downloads
14
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation

  1. 1. HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation Muhammad Saleem, Axel-Cyrille Ngonga Ngomo {lastname}@informatik.uni-leipzig.de Agile Knowledge Engineering and Semantic Web (AKSW), University of Leipzig, Germany 11th Extended Semantic Web Conferece (ESWC), Crete, Greece, 25th-29th of May 2014
  2. 2. SPARQL Endpoint Federation S1 S2 S3 S4 RDF RDF RDF RDF Parser Source Selection Federator Optimizer Integrator Get individual triple patterns Identify capable source against individual triple patterns Generate optimized sub- query execution plan Integrate sub- queries results Execute sub- queries
  3. 3. Our Contribution S1 S2 S3 S4 RDF RDF RDF RDF Parser Source Selection Federator Optimizer Integrator Efficiently identify capable source against individual triple patterns of a SPARQL query
  4. 4. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2
  5. 5. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2 TP2 = S1
  6. 6. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2 TP2 = S1 TP3 = S1
  7. 7. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2 TP2 = S1 TP3 = S1 TP4 = S4
  8. 8. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2 TP2 = S1 TP3 = S1 TP4 = S4 TP5 = S1 S2 S4 S5 S6 S7 S8 S9
  9. 9. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2 TP2 = S1 TP3 = S1 TP4 = S4 TP5 = S1 S2 S4 S5 S6 S7 S8 S9 Total triple pattern-wise selected sources = 12 Total SPARQL ASK queries : 9*5 = 45
  10. 10. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2 TP2 = S1 TP3 = S1 TP4 = S4 TP5 = S1 S2 S4 S5 S6 S7 S8 S9 Total triple pattern-wise selected sources = 12 Total SPARQL ASK queries : 9*5 = 45
  11. 11. Motivation FedBench (LD3): Return for all US presidents their party membership and news pages about them. SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } dbpedia RDF Source Selection Algorithm Triple pattern-wise source selection S1TP1 = TP3 = S1 Optimal triple pattern-wise selected sources 5 KEGG RDF ChEBI RDF NYT RDF SWDF RDF LMDB RDF Jamendo RDF Geo Names RDF DrugBank RDF S1 S2 S3 S4 S5 S6 S7 S8 S9 //TP1 //TP3 //TP4 //TP5 //TP2 TP2 = S1 TP4 = S4 TP5 = S1 S2 S4 S5 S6 S7 S8 S9
  12. 12. Problem Statement • An overestimation of triple pattern-wise source selection can be expensive – Resources are wasted – Query runtime is increased – Extra traffic is generated • How do we perform join-aware triple pattern wise source selection in time efficient way?
  13. 13. HiBISCuS: Key Concept • Makes use of the URI’s authorities http://dbpedia.org/ontology/party Scheme Authority Path For URI details: http://tools.ietf.org/html/rfc3986
  14. 14. HiBISCuS: SPARQL Query as Hypergraph SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President
  15. 15. HiBISCuS: SPARQL Query as Hypergraph SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_S tates dbpedia: nationality
  16. 16. HiBISCuS: SPARQL Query as Hypergraph SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_S tates dbpedia: nationality dbpedia: party ?party
  17. 17. HiBISCuS: SPARQL Query as Hypergraph SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_S tates dbpedia: nationality dbpedia: party ?party ?x nyt:topi cPage ?page
  18. 18. HiBISCuS: SPARQL Query as Hypergraph SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_S tates dbpedia: nationality dbpedia: party ?party ?x nyt:topi cPage ?page owl: SameAs
  19. 19. HiBISCuS: SPARQL Query as Hypergraph SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_S tates dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page Star simple hybrid Tail of hyperedge
  20. 20. HiBISCuS: Data Summaries [] a ds:Service ; ds:endpointUrl <http://dbpedia.org/sparql> ; ds:capability [ ds:predicate dbpedia:party ; ds:sbjAuthority <http://dbpedia.org/> ; ds:objAuthority <http://dbpedia.org/> ; ] ; ds:capability [ ds:predicate rdf:type ; ds:sbjAuthority <http://dbpedia.org/> ; ds:objAuthority owl:Thing, dbpedia:President; #we store all distinct classes ] ; ds:capability [ ds:predicate dbpedia:postalCode ; ds:sbjAuthority <http://dbpedia.org/> ; #No objAuthority as the object value for dbpedia:postalCode is string ] ;
  21. 21. HiBISCuS: Triple Pattern-wise Source Selection SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_ States dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page dbpedia KEGG NYT SWDF LMDB Geo DrgBnk Jamendo
  22. 22. HiBISCuS: Triple Pattern-wise Source Pruning SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_ States dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page dbpedia KEGG NYT SWDF DrgBnk LMDB Geo Jamendo Obj. auth. dbpedia Sbj. auth. KEGG Sbj. auth. NYT Sbj. auth. SWDF Sbj. auth. LMDB Sbj. auth. Geo Sbj. auth. DrgBnk Sbj. auth. Jamendo Sbj. auth.
  23. 23. HiBISCuS: Triple Pattern-wise Source Pruning SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_ States dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page dbpedia Sbj. auth. KEGG Sbj. auth. NYT Sbj. auth. SWDF Sbj. auth. LMDB Sbj. auth. Geo Sbj. auth. DrgBnk Sbj. auth. Jamendo Sbj. auth. dbpedia KEGG NYT SWDF DrgBnk LMDB Geo Jamendo Obj. auth.
  24. 24. HiBISCuS: Triple Pattern-wise Source Pruning SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_ States dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page dbpedia KEGG NYT SWDF DrgBnk LMDB Geo Jamendo Obj. auth.
  25. 25. HiBISCuS: Triple Pattern-wise Source Pruning SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_ States dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page NYT Obj. auth.
  26. 26. HiBISCuS: Triple Pattern-wise Source Pruning SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_ States dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page NYT Obj. auth.
  27. 27. HiBISCuS: Triple Pattern-wise Source Pruning SELECT ?president ?party ?page WHERE { ?president rdf:type dbpedia:President . ?president dbpedia:nationality dbpedia:United_States . ?president dbpedia:party ?party . ?x nyt:topicPage ?page . ?x owl:sameAs ?president . } ?president rdf:type dbpedia: President dbpedia: United_ States dbpedia: nationality ?x owl: SameAs dbpedia: party ?party nyt:topi cPage ?page Total triple pattern-wise selected sources = 5 Total SPARQL ASK queries : 0
  28. 28. Experimental Setup • Benchmark – FedBench – Real-world datasets collection – Real queries showing typical request – Used all of the 25 queries • HiBISCuS Extensions – FedX 2.0 – DARQ – SPLENDID • We also included ANAPSID (version of 12/2013) without HiBISCus extension
  29. 29. Experimental Setup • Metrics – Index generation time – Index size – Total triple pattern-wise sources selected – Total number of SPARQL ASK requests used – Source selection time – Query execution time
  30. 30. Index Generation Time and Compression Ratio FedX SPLENDIDLHD DARQ ANAPSID ADERIS Qtree HiBISCus Index Generation Time (min) NA 75 75 102 6 6 - 36 Compression Ratio NA 99.998 99.998 99.997 99.999 99.999 96.000 99.997 Compression Ratio = 1 - index size/ total datadump size
  31. 31. Efficient Source Selection FedBench Cross Domain Queries System #TP #AR SST(ms) FedX(cold) 78 252 317.8 FedX(warm) 78 0 7.3 SPLENDID 78 99 320.8 DARQ 84 0 7.2 ANAPSID 36 43 186 HiBISCuS(cold) 35 27 63 HiBISCuS(warm) 35 0 30.4 FedBench Life Sciences Queries System #TP #AR SST(ms) FedX(cold) 56 297 375.5 FedX(warm) 56 0 8.2 SPLENDID 56 90 307.2 DARQ 77 0 7.5 ANAPSID 44 63 477.4 HiBISCuS(cold) 41 18 31.8 HiBISCuS(warm) 41 0 23.1 FedBench Linked Data Queries System #TP #AR SST(ms) FedX(cold) 97 369 307.8 FedX(warm) 97 0 8 SPLENDID 97 126 279 DARQ 113 0 7.7 ANAPSID 54 37 803.5 HiBISCuS(cold) 47 9 20.6 HiBISCuS(warm) 47 0 16 Overall System #TP #AR SST(ms) FedX(cold) 231 918 330 FedX(warm) 231 0 8 SPLENDID 231 315 299 DARQ 274 0 7.5 ANAPSID 134 143 554 HiBISCuS(cold) 123 54 35.6 HiBISCuS(warm) 123 0 22
  32. 32. Efficient Source Selection FedBench Cross Domain Queries System #TP #AR SST(ms) FedX(cold) 78 252 317.8 FedX(warm) 78 0 7.3 SPLENDID 78 99 320.8 DARQ 84 0 7.2 ANAPSID 36 43 186 HiBISCuS(cold) 35 27 63 HiBISCuS(warm) 35 0 30.4 FedBench Life Sciences Queries System #TP #AR SST(ms) FedX(cold) 56 297 375.5 FedX(warm) 56 0 8.2 SPLENDID 56 90 307.2 DARQ 77 0 7.5 ANAPSID 44 63 477.4 HiBISCuS(cold) 41 18 31.8 HiBISCuS(warm) 41 0 23.1 FedBench Linked Data Queries System #TP #AR SST(ms) FedX(cold) 97 369 307.8 FedX(warm) 97 0 8 SPLENDID 97 126 279 DARQ 113 0 7.7 ANAPSID 54 37 803.5 HiBISCuS(cold) 47 9 20.6 HiBISCuS(warm) 47 0 16 Overall System #TP #AR SST(ms) FedX(cold) 231 918 330 FedX(warm) 231 0 8 SPLENDID 231 315 299 DARQ 274 0 7.5 ANAPSID 134 143 554 HiBISCuS(cold) 123 54 35.6 HiBISCuS(warm) 123 0 22
  33. 33. FedX Extension with HiBISCuS Improvement in 20/25 queries with net performance improvement 24.61% 0 50 100 150 200 250 300 350 400 CD1 CD2 CD3 CD4 CD5 CD6 CD7 LS1 LS2 LS3 LS4 LS5 LS6 LS7 LD1 LD2 LD3 LD4 LD5 LD6 LD7 LD8 LD9 LD10 LD11 Avg. Queryexecutiontime(ms) FedX (warm) FedX+HiBISCus
  34. 34. SPLENDID Extension with HiBISCuS Improvement in 25/25 queries with net performance improvement 82.72% 0 200 400 600 800 1000 1200 CD1 CD2 CD3 CD4 CD5 CD6 CD7 LS1 LS2 LS3 LS4 LS5 LS6 LS7 LD1 LD2 LD3 LD4 LD5 LD6 LD7 LD8 LD9 LD10 LD11 Avg. Queryexecutiontime(ms) SPLENDID SPLENDID+HiBISCus
  35. 35. DARQ Extension with HiBISCuS Improvement in 21/21 queries with net performance improvement 92.22% 0.01 0.1 1 10 100 1000 10000 100000 CD1 CD2 CD3 CD4 CD5 CD6 CD7 LS1 LS2 LS3 LS4 LS5 LS6 LS7 LD1 LD2 LD3 LD4 LD5 LD6 LD7 LD8 LD9 LD10 LD11 Avg Queryexecutiontime(ms)logscale Hundreds DARQ DARQ+HiBISCus Notsupported Notsupported Runtimeerror Runtimeerror Runtimeerror Runtimeerror Timeout Timeout Notsupported Notsupported Timeout Timeout
  36. 36. SPLENDID+HiBISCuS vs ANAPSID Improvement in 24/24 queries with net performance improvement 98% 0.01 0.1 1 10 100 1000 CD1 CD2 CD3 CD4 CD5 CD6 CD7 LS1 LS2 LS3 LS4 LS5 LS6 LS7 LD1 LD2 LD3 LD4 LD5 LD6 LD7 LD8 LD9 LD10 LD11 Avg. Queryexecutiontime(ms)logscale Hundreds ANAPSID SPLENDID+HiBISCus ZeroResults
  37. 37. Conclusion and Future Work • An overestimation of triple pattern-wise source selection can be expensive • Join-aware triple pattern-wise source selection is more efficient than simple triple pattern-wise source selection • Performance improvements – FedX : 24.61 % – SPLENDID: 82.72 % – DARQ: 92.22 % – SPLENDID+HiBISCus is 98% faster than ANAPSID • BigRDFBench is on the roadmap
  38. 38. Thanks! homepage and source code: https://code.google.com/p/hibiscusfederation/ {saleem,ngonga}@informatik.uni-leipzig.de AKSW, University of Leipzig, Germany
  39. 39. Source Selection • Triple pattern-wise source selection – Ensures 100% recall – Can over-estimate capable sources – Can be expensive, e.g., total number of SPARQL ASK requests used – Performed by FedX, SPLENDID, LHD, DARQ, ADERIS etc. • Join-aware triple-pattern wise source selection – Ensures 100% recall – May selects optimal/close to optimal capable sources – Can be expensive, e.g., total number of SPARQL ASK requests used – Can significantly reduce the query execution time – Performed by ANAPSID

×