Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing Approaches

During empirical evaluations of query processing techniques, metrics like execution time, time for the first answer, and throughput are usually reported. Albeit informative, these metrics are unable to quantify and evaluate the efficiency of a query engine over a certain time period (its diefficiency), thus hampering the distinction of cutting-edge engines able to exhibit high performance gradually. We tackle this issue and devise two experimental metrics, named dief@t and dief@k, which measure the diefficiency during an elapsed time period t or while k answers are produced, respectively. The dief@t and dief@k measurement methods rely on computing the area under the curve of answer traces, thus capturing the answer concentration over a time interval. We report experimental results of evaluating the behavior of a generic SPARQL query engine using both metrics. The observed results suggest that dief@t and dief@k are able to measure the performance of SPARQL query engines based on both the number of answers produced by an engine and the time required to generate these answers.

  1. Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing Approaches. Maribel Acosta, Maria-Esther Vidal, York Sure-Vetter. Presented at the International Semantic Web Conference 2017. Best Resource Paper Nominee.
  2. Motivation (1). Query (retrieve DBpedia resources classified as alcohols that have SMILES identifiers):
     SELECT ?d1 WHERE { ?d1 dcterms:subject dbc:Alcohols . ?d1 dbp:smiles ?s . }
     Blocking approach: the query engine produces all results at the end of execution.
     Answer                        Time
     {?d1 → dbr:Zuclopenthixol}    0.37 sec.
     {?d1 → dbr:Ziprepol}          0.37 sec.
     {?d1 → dbr:Viminol}           0.37 sec.
     {?d1 → dbr:Trifluperidol}     0.37 sec.
     {?d1 → dbr:Trabectedin}       0.37 sec.
     {?d1 → dbr:Tolvaptan}         0.37 sec.
  3. Motivation (1), continued. Same query. Incremental approach: the query engine produces results as soon as they are ready, e.g., ANAPSID, nLDE, TPF Client.
     Answer                        Time
     {?d1 → dbr:Zuclopenthixol}    0.33 sec.
     {?d1 → dbr:Ziprepol}          0.35 sec.
     {?d1 → dbr:Viminol}           0.35 sec.
     {?d1 → dbr:Trifluperidol}     0.36 sec.
     {?d1 → dbr:Trabectedin}       0.36 sec.
     {?d1 → dbr:Tolvaptan}         0.37 sec.
  4. Motivation (2). Traditional metrics vs. continuous performance for one query engine (nLDE):
     Metric                      nLDE Not Adaptive   nLDE Selective   nLDE Random
     Time First Answer (sec.)     0.37                0.24             0.33
     Execution Time (sec.)       10.59               12.10             9.30
     Throughput (answers/sec.)  486.27              421.87           553.66
     Completeness                 100%                100%             100%
     [Plot: #answers produced over time (sec.) for the three nLDE configurations.]
     Traditional metrics: overall, nLDE Random outperforms the other approaches. Continuous performance: nLDE Not Adaptive outperforms the other approaches in the first 7.5 sec. of execution.
  5. Motivation (3). We need quantitative methods to measure the continuous efficiency of query processing approaches.
  6. Related Work
  7. Current Performance Metrics.
     Effectiveness: Answer Completeness [Guo05][Montoya12]; Correctness [Zhang12]; Answer Soundness [Guo05].
     Efficiency: Execution Time [Guo05][Bizer09][Montoya12][Zhang12]; Loading Time [Guo05]; Throughput [Zhang12]; Time for the First Tuple [Acosta11]; Queries per Second [Bizer09]; Average Slowdown [Sharaf08].
     Combined: combined metric [Guo05].
     These metrics do not consider continuous performance; they are not tailored to benchmark incremental approaches.
  8. Our Approach: Measuring Continuous Efficiency
  9. Diefficiency Metrics.
     • Diefficiency: continuous efficiency. A combination of the Greek prefix di(a)- (meaning “through” or “across”) and efficiency.
     • The continuous performance of approaches is recorded in answer traces.
     • Our metrics quantify the diefficiency of incremental approaches.
     Answer trace example:
     Answer                        Time
     {?d1 → dbr:Zuclopenthixol}    0.33
     {?d1 → dbr:Ziprepol}          0.35
     {?d1 → dbr:Viminol}           0.35
     {?d1 → dbr:Trifluperidol}     0.36
     {?d1 → dbr:Tolvaptan}         0.37
 10. Answer Distribution Function.
     • Defined as X: [0, t_n] → ℕ.
     • t_n is the point in time when the last answer was produced.
     • X(x) indicates the number of answers produced until time x.
     • X is built from answer traces (applying linear interpolation), as in the sketch below.
     [Plot: answer trace and answer distribution functions for Q9.sparql — #answers produced over time for nLDE Not Adaptive, nLDE Selective, nLDE Random.]
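To make the construction concrete, here is a minimal Python sketch of building X from an answer trace by linear interpolation. This is not the authors' dief package; the trace representation (an ascending list of answer timestamps) and the helper name answer_distribution are assumptions for illustration.

```python
import numpy as np

def answer_distribution(trace_times):
    """Build X: [0, t_n] -> number of answers, from an answer trace.

    trace_times is assumed to list, in ascending order, the time at which
    each answer was produced (the i-th entry is the time of the i-th answer).
    """
    cumulative = {}  # collapse duplicate timestamps, keeping the latest count
    for i, t in enumerate(sorted(trace_times), start=1):
        cumulative[t] = i
    xs = np.array(list(cumulative.keys()), dtype=float)
    ys = np.array(list(cumulative.values()), dtype=float)

    def X(x):
        # 0 before the first answer; linear interpolation between trace points.
        return np.interp(x, xs, ys, left=0.0)

    return X, xs[-1]  # X and t_n, the time of the last answer

# Trace from the slide: answers at 0.33, 0.35, 0.35, 0.36, 0.37 sec.
X, t_n = answer_distribution([0.33, 0.35, 0.35, 0.36, 0.37])
print(X(0.34))  # -> 2.0, interpolated between (0.33, 1) and (0.35, 3)
```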
 11. Metric dief@t.
     • Quantifies diefficiency during the first t time units of execution.
     • Measures the area under the curve of X(x) in the interval [0, t]: dief@t := ∫_0^t X(x) dx.
     • Interpretation: higher is better.
     Example values for the nLDE configurations: Not Adaptive 7323.46, Selective 1148.63, Random 5031.90.
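A minimal sketch of computing dief@t under the same assumed trace representation (the dief R package listed under resources is the reference implementation; dief_at_t is a hypothetical helper). Since X is piecewise linear, the area under it on [0, t] can be computed exactly with the trapezoidal rule:

```python
import numpy as np

def dief_at_t(trace_times, t):
    """dief@t = area under X on [0, t] for the given answer trace (sketch)."""
    times = np.asarray(sorted(trace_times), dtype=float)
    counts = np.arange(1, len(times) + 1, dtype=float)
    mask = times <= t
    if not mask.any():
        return 0.0  # no answer produced by time t, so the area is zero
    # X is 0 before the first answer and linearly interpolated afterwards;
    # close the interval at t with the interpolated value X(t).
    # (For t beyond t_n this sketch keeps X flat at the total answer count.)
    xs = np.concatenate((times[mask], [t]))
    ys = np.concatenate((counts[mask], [np.interp(t, times, counts)]))
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))

print(dief_at_t([0.33, 0.35, 0.35, 0.36, 0.37], 0.36))  # area on [0, 0.36]
```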
 12. Metric dief@k.
     • Quantifies diefficiency while producing the first k answers.
     • Measures the area under the curve of X(x) in the interval [0, t_k], where t_k is the point in time when the k-th answer is produced: dief@k := ∫_0^{t_k} X(x) dx.
     • Interpretation: lower is better.
     Example values for k = 2000: Not Adaptive 4686.11, Selective 3235.67, Random 3517.85.
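An analogous sketch for dief@k under the same assumptions (dief_at_k is again a hypothetical helper): read t_k off the trace and integrate X up to it.

```python
import numpy as np

def dief_at_k(trace_times, k):
    """dief@k = area under X on [0, t_k], t_k = time of k-th answer (sketch)."""
    times = np.asarray(sorted(trace_times), dtype=float)
    if not 1 <= k <= len(times):
        raise ValueError("the trace contains fewer than k answers")
    # X is 0 before the first answer, so the integral on [0, t_k] reduces to
    # the trapezoidal rule over the first k (time, #answers) trace points.
    xs = times[:k]
    ys = np.arange(1, k + 1, dtype=float)
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * np.diff(xs)))

print(dief_at_k([0.33, 0.35, 0.35, 0.36, 0.37], 4))  # area up to the 4th answer
```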
 13. Extensions of dief@t and dief@k: measuring diefficiency at any time interval.
     • With dief@t it is possible to measure the diefficiency of an approach during the interval [t_a, t_b] as dief@t_b − dief@t_a (a usage sketch for both extensions follows slide 14).
     Example values: Not Adaptive 5073.37, Selective 869.18, Random 4024.21.
 14. Extensions of dief@t and dief@k: measuring diefficiency between the k_a-th and k_b-th answers.
     • With dief@k it is possible to measure the diefficiency of an approach while producing answers k_a through k_b (with k_a ≤ k_b) as dief@k_b − dief@k_a.
     Example values: Not Adaptive 5847.05, Selective 5457.67, Random 3468.71.
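As referenced on slide 13, a short usage sketch of both extensions, reusing the hypothetical dief_at_t and dief_at_k helpers from the previous sketches; the interval bounds and k values are made-up illustration values, not results from the experiments.

```python
trace = [0.33, 0.35, 0.35, 0.36, 0.37]  # hypothetical answer trace

# Diefficiency during the time interval [t_a, t_b] = [0.34, 0.36]:
dief_interval = dief_at_t(trace, 0.36) - dief_at_t(trace, 0.34)

# Diefficiency while producing answers k_a = 2 through k_b = 4:
dief_answers = dief_at_k(trace, 4) - dief_at_k(trace, 2)
```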
 15. Properties of dief@t and dief@k.
     Analytical relationship between dief@t and dief@k: let t_k be the point in time when the k-th answer is produced; then dief@t_k = dief@k.
     Theorem 1: The diefficiency of blocking approaches is always zero.
     Theorem 2: In queries where the number of answers is greater than one, the total diefficiency of incremental approaches is higher than zero.
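The analytical relationship can be checked numerically with the sketches above; the equality holds exactly under their piecewise-linear model of X.

```python
trace = [0.33, 0.35, 0.35, 0.36, 0.37]
k = 4
t_k = sorted(trace)[k - 1]  # time of the k-th answer
# Integrating up to t_k yields the same value as dief@k.
assert abs(dief_at_t(trace, t_k) - dief_at_k(trace, k)) < 1e-12
```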
 16. Empirical Study
 17. Experimental Settings.
     • Query engine: nLDE [Acosta15] with three configurations: nLDE Not Adaptive (NA), nLDE Selective (Sel), and nLDE Random (Ran).
     • Queries and dataset: nLDE Benchmark 1, 16 non-selective queries (4–14 triple patterns), over the DBpedia dataset (v. 2015).
     • Technical specifications: Debian Wheezy 64-bit, 2x Intel(R) Xeon(R) E5-2670 CPUs at 2.60GHz (16 physical cores), 256GB RAM.
 18. Comparing dief@t with Other Metrics (1). Results for query Q17.
     [Plot: answer trace for Q17.sparql and a comparison of (Time for the First Tuple)⁻¹, (Execution Time)⁻¹, Completeness, Throughput, and dief@t for NA, Sel, and Ran. Interpretation: higher is better.]
     Uncovered pattern: Ran outperforms NA.
 19. Comparing dief@t with Other Metrics (2). Queries in which dief@t uncovers previously unknown patterns.
 20. Measuring Answer Rate with dief@k (1). Results for query Q2.
     [Plot: dief@k for NA, Sel, and Ran at k = 25%, 50%, 75%, and 100% of the answers (Q2.sparql). Interpretation: lower is better.]
     Sel produces the first 25% of the answers more slowly than Ran, but produces the last portions of the answers at a faster rate.
 21. Measuring Answer Rate with dief@k (2). Only in these queries did all the approaches produce results at a uniform rate.
 22. Conclusions & Outlook
 23. Conclusions. dief@t and dief@k measure the diefficiency of incremental approaches.
     • We have demonstrated the theoretical soundness of the metrics.
     • Our empirical study indicates that dief@t and dief@k allow for uncovering performance particularities.
     • A final remark: dief@t and dief@k can measure the performance of any incremental approach: ✔ streaming query processing, ✔ top-k, ✔ monotonic reasoning, ✔ crowdsourcing.
 24. Available Resources.
     • dief R package to compute dief@t and dief@k: https://github.com/maribelacosta/dief
     • Jupyter notebooks: https://github.com/maribelacosta/dief-notebooks
     • Online demo: http://km.aifb.kit.edu/services/dief-app/
 25. References.
     [Acosta15] M. Acosta and M.-E. Vidal. Networks of Linked Data eddies: An adaptive web query processing engine for RDF data. In ISWC, pages 111–127, 2015.
     [Acosta11] M. Acosta, M.-E. Vidal, J. Castillo, T. Lampo, and E. Ruckhaus. ANAPSID: An adaptive query processing engine for SPARQL endpoints. In ISWC, pages 18–34, 2011.
     [Bizer09] C. Bizer and A. Schultz. The Berlin SPARQL benchmark. Int. J. Semantic Web Inf. Syst., 5(2):1–24, 2009.
     [Guo05] Y. Guo, Z. Pan, and J. Heflin. LUBM: A benchmark for OWL knowledge base systems. Web Semant., 3(2-3):158–182, Oct. 2005.
     [Montoya12] G. Montoya, M.-E. Vidal, Ó. Corcho, E. Ruckhaus, and C. Buil-Aranda. Benchmarking federated SPARQL query engines: Are existing testbeds enough? In ISWC, pages 313–324, 2012.
     [Sharaf08] M. A. Sharaf, P. K. Chrysanthis, A. Labrinidis, and K. Pruhs. Algorithms and metrics for processing multiple heterogeneous continuous queries. ACM Trans. Database Syst., 33(1):5:1–5:44, 2008.
     [Zhang12] Y. Zhang, M. Pham, Ó. Corcho, and J.-P. Calbimonte. SRBench: A streaming RDF/SPARQL benchmark. In ISWC, pages 641–657, 2012.
 26. Closing slide. Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing Approaches. Maribel Acosta, Maria-Esther Vidal, York Sure-Vetter. dief@t := ∫_0^t X(x) dx; dief@k := ∫_0^{t_k} X(x) dx.
