Practical SPARQL Benchmarking Revisited

A talk given at SemTechBiz 2014 in San Jose that follows up on the tool originally presented at the 2012 conference. It covers the limitations we encountered with the original tool and how we evolved it to address them and build a more robust, general-purpose, open source SPARQL testing tool.

The tool is available on SourceForge in pre-built form at http://sourceforge.net/projects/sparql-query-bm/ or as code on SourceForge or GitHub (https://github.com/rvesse/sparql-query-bm)

Published in: Technology

  1. Rob Vesse rvesse@yarcdata.com @RobVesse
  2. Agenda:
       1. Rewind to 2012
       2. Limitations
       3. Evolving the Framework
       4. Examples
       5. Future Work
  3. Rewind to 2012
  4. • Presentation I gave at this conference in 2012
       • Slides at http://www.slideshare.net/RobVesse/practical-sparql-benchmarking
     • Highlighted some issues with SPARQL benchmarking:
       • Standard benchmarks all have known deficiencies
       • Lack of standardized methodology
       • Best benchmark is the one you run with your data and workload
     • Introduced the 1.x version of our SPARQL Query Benchmarker tool
       • Java tool and API for benchmarking
       • Used a methodology based upon a combination of the BSBM runner and the Revelytix SP2B white paper
       • Reports various appropriate statistics
       • Various configuration options to change what exactly is benchmarked, e.g. whether results are fully parsed and counted
  5. • The 1.x tool was open sourced shortly after the 2012 conference under a 3-clause BSD License
     • Available on SourceForge
       • http://sourceforge.net/projects/sparql-query-bm/files/1.0.0/
     • Also as Maven artifacts (in Maven Central):
       • Group ID: net.sf.sparql-query-bm
       • Artifact IDs: cmd, core
       • Latest 1.x version: 1.1.0
  6. Limitations
  7. • The 1.x tool can only benchmark SPARQL queries
       • SPARQL 1.1 has been standardized since the 1.x version of the tool was written and adds further SPARQL features that you may want to test:
         • SPARQL Updates
         • SPARQL Graph Store Protocol
     • Queries are fixed
       • No parameterization support
     • Can't pass custom endpoint parameters in
       • For example, to enable/disable reasoning
     • Also no way to test endpoint-specific extensions
       • e.g. transactions
  8. • Requires using HTTP endpoints to access the SPARQL system to be tested
       • Adds communication overheads to the results
       • Sometimes this may be desirable
     • No ability to test SPARQL operations in-memory
       • i.e. can't test lower level APIs
  9. • Only supports a single benchmarking methodology
     • Methodology is hard coded
       • Can't do things like run a subset of the provided operations on each run
       • Or repeat an operation within a run
       • Or retry an operation under specific failure conditions
     • Configuration of the methodology is tightly coupled to the methodology
       • Many aspects are actually independent of the methodology
  10. • Used a simplistic text-based format
        • One query file per line
        • No way to specify additional parameters
        • No way to assign a friendly name to queries
          • Assigns each query the filename
  11. • There is a progress monitoring API but it is limited
        • E.g. it gets called after a query completes but not before it starts
        • Makes it awkward/impossible to implement some kinds of monitoring
          • e.g. crash detection, memory usage
  12. • In the interests of speed over usability we rolled our own command line arguments parser
        • Means argument parsing is awkward to extend
  13. Evolving the Framework
  14. • Earlier this year we found a compelling reason to rewrite the tool and address the various limitations
      • First 2.x release was made 9th June 2014
        • Minor bug fix and maintenance releases since
      • Releases available at:
        • http://sourceforge.net/projects/sparql-query-bm/files/
      • Code is now using Git
        • http://git.code.sf.net/p/sparql-query-bm/git sparql-query-bm-git
        • Mirrors available on GitHub for those who think that it is the one true source
          • https://github.com/rvesse/sparql-query-bm
      • Maven artifacts available through Maven Central as before:
        • Group ID: net.sf.sparql-query-bm
        • Artifact IDs: core, cmd and dist
        • Latest 2.x version: 2.0.1
  15. • Concept of Queries replaced with the general concept of Operations
        • Also divorces the definition of an operation from how to run said operation
        • Makes it easier to change runtime behaviour of operations
      • 20 built-in operations provided
      • API allows defining and plugging in new operations as desired
        • http://sparql-query-bm.sourceforge.net/javadoc/latest/core/
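The split between defining an operation and running it can be sketched roughly as follows. This is a minimal illustration, not the tool's API: `OperationDef`, `Result`, `fixed` and `runWithTimeout` are hypothetical names invented here, and the timeout policy is just one example of runtime behaviour that can change without touching the definition.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative only: hypothetical types, not the sparql-query-bm API. */
final class OperationSketch {

    /** The definition: what the operation is called and how to build its work. */
    interface OperationDef {
        String name();
        Callable<Boolean> createCallable();
    }

    /** Outcome of one run of an operation. */
    static final class Result {
        final String name;
        final boolean ok;
        final long elapsedNanos;

        Result(String name, boolean ok, long elapsedNanos) {
            this.name = name;
            this.ok = ok;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /** Convenience factory for a definition whose work never changes. */
    static OperationDef fixed(final String name, final Callable<Boolean> work) {
        return new OperationDef() {
            @Override public String name() { return name; }
            @Override public Callable<Boolean> createCallable() { return work; }
        };
    }

    /** The execution policy: run on a worker thread and give up after the timeout. */
    static Result runWithTimeout(OperationDef op, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        long start = System.nanoTime();
        boolean ok;
        try {
            ok = Boolean.TRUE.equals(
                    executor.submit(op.createCallable()).get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (Exception e) {
            ok = false; // timeout, interruption, or a failure inside the operation
        } finally {
            executor.shutdownNow(); // interrupts work that is still running
        }
        return new Result(op.name(), ok, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        Result r = runWithTimeout(fixed("always-ok", () -> true), 1000);
        System.out.println(r.name + " ok=" + r.ok);
    }
}
```

Because the definition only hands back a `Callable`, a different execution policy (retries, in-memory execution, extra monitoring) could be swapped in without changing how operations are declared.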
  16. • Several kinds of query/update
        • Fixed
        • Parameterized
        • Dataset Size
      • Variants for both remote endpoints and in-memory datasets
      • Remote variants have additional NVP variants
        • Allows adding custom parameters to the remote request
      • Accounts for 13 of the built-in operations
  17. • One for each graph store protocol operation:
        • DELETE
        • GET
        • HEAD
        • POST
        • PUT
      • Accounts for a further 5 of the built-in operations
  18. • Sleep
        • Do nothing for some period
        • Useful for simulating quiet periods as part of testing
      • Mix
        • Allow grouping a set of operations into a single operation
        • Lets you compose mixes from other mixes
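Treating a mix as just another operation is the classic composite pattern. The sketch below shows the idea under stated assumptions: `Op`, `SleepOp` and `MixOp` are hypothetical stand-ins invented for illustration, not the tool's own classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical types, not the sparql-query-bm API. */
final class MixCompositionSketch {

    /** Anything that can be run as part of a test. */
    interface Op {
        String name();

        /** Runs the operation and appends the names of the leaf operations executed. */
        void run(List<String> log);
    }

    /** Does nothing for a fixed period, simulating a quiet spell. */
    static final class SleepOp implements Op {
        private final long millis;

        SleepOp(long millis) { this.millis = millis; }

        @Override public String name() { return "sleep(" + millis + "ms)"; }

        @Override public void run(List<String> log) {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
            log.add(name());
        }
    }

    /** Groups operations; because a mix is itself an Op, mixes can nest. */
    static final class MixOp implements Op {
        private final String name;
        private final List<Op> children;

        MixOp(String name, Op... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }

        @Override public String name() { return name; }

        @Override public void run(List<String> log) {
            for (Op child : children) {
                child.run(log);
            }
        }
    }

    /** Builds a mix that contains another mix and returns the leaf operations in execution order. */
    static List<String> demo() {
        Op inner = new MixOp("inner", new SleepOp(1), new SleepOp(2));
        Op outer = new MixOp("outer", inner, new SleepOp(3));
        List<String> log = new ArrayList<>();
        outer.run(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```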
  19. • As already noted, in-memory variants of some operations are now available
        • These run tests against a Dataset implementation
          • Part of the Apache Jena ARQ API
      • Removes SPARQL Protocol and HTTP overhead from testing
        • Of course, depending on the Dataset implementation there may still be some communication overhead
        • But this is likely using lower level back end native communications protocols instead
  20. • Addresses the limitation of hard coded methodology
      • Separates test running into three components:
        • Overall runner
        • Mix runner
        • Operation runner
      • Each has its own API and can be customized as desired
        • Various useful base/abstract implementations provided
      • Four different test runners are provided:
        • Benchmark
        • Smoke
        • Soak
        • Stress
  21. • Smoke
        • Runs the mix once and indicates whether it passes/fails
        • Pass is defined as all operations pass
      • Soak
        • Run the mix continuously for some period of time
        • Test how a system reacts under continuous load
      • Stress
        • Run the mix with increasingly high load
        • Test how a system reacts under increasing load
      • AbstractRunner provides a basic framework and helper methods to make it easy to add custom runners or customize existing runs
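The smoke and soak behaviours described above can be approximated in a few lines. This is a sketch under assumptions: operations are modelled as plain `Callable<Boolean>` values, and `runMixOnce`, `smoke`, `soak` and `constant` are hypothetical helpers, not the tool's runner API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

/** Illustrative only: hypothetical helpers, not the sparql-query-bm runner API. */
final class RunnerSketch {

    /** Builds a trivial operation with a fixed outcome (stand-in for a real query). */
    static Callable<Boolean> constant(final boolean outcome) {
        return new Callable<Boolean>() {
            @Override public Boolean call() { return outcome; }
        };
    }

    /** Runs every operation in the mix once and returns how many failed. */
    static int runMixOnce(List<Callable<Boolean>> mix) {
        int failures = 0;
        for (Callable<Boolean> op : mix) {
            try {
                if (!Boolean.TRUE.equals(op.call())) {
                    failures++;
                }
            } catch (Exception e) {
                failures++; // an exception counts as a failed operation
            }
        }
        return failures;
    }

    /** Smoke test: one pass over the mix; passes only if every operation passes. */
    static boolean smoke(List<Callable<Boolean>> mix) {
        return runMixOnce(mix) == 0;
    }

    /** Soak test: repeats the mix until the period elapses (at least once); returns completed runs. */
    static int soak(List<Callable<Boolean>> mix, long periodMillis) {
        long deadline = System.nanoTime() + periodMillis * 1_000_000L;
        int runs = 0;
        do {
            runMixOnce(mix);
            runs++;
        } while (System.nanoTime() - deadline < 0);
        return runs;
    }

    public static void main(String[] args) {
        List<Callable<Boolean>> mix = Arrays.asList(constant(true), constant(true));
        System.out.println("smoke passed: " + smoke(mix));
        System.out.println("soak runs in 20 ms: " + soak(mix, 20));
    }
}
```

A stress runner would follow the same shape but raise the load between runs; how the real tool ramps load is not shown here.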
  22. • Allows customizing how mixes and individual operations are run
      • Some alternative implementations built in:
        • E.g. SamplingOperationMixRunner
          • Runs a sample of the operations in the mix
          • May include repeats
        • E.g. RetryingOperationRunner
          • Retries an operation if it doesn't succeed
      • Easy to implement your own
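Sampling a mix, with or without repeats, comes down to drawing from the list of operations. The following is a minimal sketch of that idea only: the `sample` helper and its parameters are assumptions made for illustration and do not reflect how SamplingOperationMixRunner is actually implemented or configured.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Illustrative only: a hypothetical helper, not the SamplingOperationMixRunner implementation. */
final class SamplingSketch {

    /**
     * Picks sampleSize operations from the mix in random order.
     * With allowRepeats the same operation may be chosen more than once.
     */
    static <T> List<T> sample(List<T> mix, int sampleSize, boolean allowRepeats, Random random) {
        if (sampleSize > 0 && mix.isEmpty()) {
            throw new IllegalArgumentException("Cannot sample from an empty mix");
        }
        if (!allowRepeats && sampleSize > mix.size()) {
            throw new IllegalArgumentException("Sample larger than mix requires repeats");
        }
        List<T> pool = new ArrayList<>(mix); // copy so the caller's mix is left untouched
        List<T> chosen = new ArrayList<>(sampleSize);
        for (int i = 0; i < sampleSize; i++) {
            int index = random.nextInt(pool.size());
            // Without repeats, a chosen operation is removed from the pool.
            chosen.add(allowRepeats ? pool.get(index) : pool.remove(index));
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> mix = Arrays.asList("q1", "q2", "q3");
        System.out.println(sample(mix, 5, true, new Random()));
        System.out.println(sample(mix, 2, false, new Random()));
    }
}
```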
  23. • Separates test configuration from the test runner
      • Interface with all common configuration defined
        • Endpoints
        • Timeouts
        • Progress Listeners
        • etc.
      • NB - Runners are typically defined such that they restrict their input options to sub-interfaces that add runner-specific configuration, e.g.
        • Warm-ups for benchmarks
        • Total runtime for soak testing
        • Ramp-up factor for stress testing
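One way to picture runners restricting their input to an options sub-interface is a bounded generic type. The sketch below is an assumption-laden illustration: the interface and method names (`Options`, `SoakOptions`, `Runner`, `plan`, `soakOptions`) and the endpoint URL are hypothetical and chosen only to show the shape.

```java
/** Illustrative only: hypothetical interfaces, not the sparql-query-bm options API. */
final class OptionsSketch {

    /** Configuration common to every kind of test run. */
    interface Options {
        String endpoint();
        int timeoutSeconds();
    }

    /** Adds the configuration that only soak testing needs. */
    interface SoakOptions extends Options {
        int maxRuntimeMinutes();
    }

    /** A runner declares which options sub-interface it requires. */
    interface Runner<T extends Options> {
        String plan(T options);
    }

    /** Accepts only SoakOptions, so the compiler rejects plain Options. */
    static final class SoakRunner implements Runner<SoakOptions> {
        @Override public String plan(SoakOptions options) {
            return "soak " + options.endpoint() + " for " + options.maxRuntimeMinutes()
                    + " min, timeout " + options.timeoutSeconds() + "s";
        }
    }

    /** Simple immutable SoakOptions for the example. */
    static SoakOptions soakOptions(final String endpoint, final int timeoutSeconds, final int minutes) {
        return new SoakOptions() {
            @Override public String endpoint() { return endpoint; }
            @Override public int timeoutSeconds() { return timeoutSeconds; }
            @Override public int maxRuntimeMinutes() { return minutes; }
        };
    }

    public static void main(String[] args) {
        // The endpoint URL is a placeholder.
        System.out.println(new SoakRunner().plan(soakOptions("http://localhost:3030/ds/query", 30, 10)));
    }
}
```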
  24. • Now using TSV as the file format
        • Still wanted it to be simple enough that someone with zero RDF/SPARQL knowledge can configure it
        • Each line is a series of parameters separated by a tab character
        • First parameter is an identifier for the type of the operation
          • Used to decide how to interpret the remaining parameters
      • Can define your own mix file format and register a loader for it
      • Possible to override the loader for a specific operation identifier since this has an API
        • Means you can do neat tricks like use a mix designed for remote endpoints against an in-memory dataset
  25. query 806670-warmup1.rq 806670 Warmup Query 1
      query 806670-warmup2.rq 806670 Warmup Query 2
      query 806670-nofilter.rq 806670 Query with No Filter
      query 806670-filter3.rq 806670 Query with Filter (Variant 3)
      param-query 806670-filter3-params.rq instances.tsv Parameterized Query with Filter (Variant 3)
      query 806670-filter4.rq 806670 Query with Filter (Variant 4)
      query 806670-filter4a.rq 806670 Query with Filter (Variant 4a - Zero Results)
      param-query 806670-filter4-params.rq instances.tsv Parameterized Query with Filter (Variant 4)
      query 806238-warmup1.rq 806238 Warmup Query 1
      query 806238-warmup2.rq 806238 Warmup Query 2
      query 806238-comment43.rq 806238 Query (Comment 43)
      query 806238-comment43a.rq 806238 Query (Comment 43 - SELECT * sub-query)
      query 806238-comment45.rq 806238 Query (Comment 45 - Multiple sub-queries)
      query 806238-comment54.rq 806238 Query (Comment 54)
      param-update load-full1m.ru graph-names.tsv Load 1M Dataset into named graph
      param-query count-loaded.rq graph-names.tsv Count named graph
      param-update drop-loaded.ru graph-names.tsv Drop named graph
      query count.rq Count quads
      checkpoint10 Checkpoint every 10 runs
      sleep 180 3 minute sleep
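The mix format (a type identifier followed by tab-separated parameters, with a loader per identifier) can be sketched as a small registry. Everything here is an assumption for illustration: `LineLoader`, `register` and `parseLine` are hypothetical names, and the field order for `query` lines (file, then friendly name) is inferred from the example mix, whose tab positions did not survive in this transcript.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical loader registry, not the sparql-query-bm mix loader API. */
final class MixLineSketch {

    /** Turns the parameters that follow the type identifier into a description of the operation. */
    interface LineLoader {
        String load(List<String> params);
    }

    private final Map<String, LineLoader> loaders = new HashMap<>();

    /** Registers a loader for a type identifier, replacing any previous one. */
    void register(String typeId, LineLoader loader) {
        loaders.put(typeId, loader);
    }

    /** Splits a line on tabs and hands the remaining fields to the loader for its type. */
    String parseLine(String line) {
        String[] fields = line.split("\t", -1);
        LineLoader loader = loaders.get(fields[0]);
        if (loader == null) {
            throw new IllegalArgumentException("No loader registered for operation type '" + fields[0] + "'");
        }
        return loader.load(Arrays.asList(fields).subList(1, fields.length));
    }

    public static void main(String[] args) {
        MixLineSketch mix = new MixLineSketch();
        String line = "query\tcount.rq\tCount quads";

        mix.register("query", params -> "remote query from " + params.get(0));
        System.out.println(mix.parseLine(line));

        // Overriding the loader lets the same mix file target an in-memory dataset instead.
        mix.register("query", params -> "in-memory query from " + params.get(0));
        System.out.println(mix.parseLine(line));
    }
}
```

Keying loaders by identifier is what makes the override trick possible: the mix file stays the same and only the registration changes.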
  26. • Now provides notifications before and after operation and mix runs
      • Improvements to how some of the built-in implementations handle multi-threaded output
        • Makes it easier to distinguish where errors occurred when running multi-threaded benchmarks
  27. • Now based upon the powerful open source Airline library
        • https://github.com/airlift/airline
      • Provides a command line interface to each built-in runner
      • Also provides AbstractCommand with all standard options exposed
      • Standardized exit codes across all commands
      • Comprehensive built-in help
        • Can help you define operation mixes
        • ./operations
        • ./operation --op param-query
  28. Examples
  29. • These are things we've done (or are currently doing) with the framework that aren't in the open source releases
      • However the 2.x framework makes these (hopefully) easy to replicate yourself
  30. • Many stores have rich REST APIs in addition to their SPARQL APIs
        • Can be useful to include testing of these in your mixes
      • Requires implementing two interfaces:
        • Operation
        • OperationCallable
      • Abstract implementations of both are available to give you the boilerplate bits
      • Internally we have 9 different custom operations defined which test a subset of our REST API:
        • Database Management
        • Asynchronous Queries
        • Import Management
  31. • One thing we're particularly interested in is how operations affect memory usage
      • We added custom progress listeners that track and monitor memory usage
        • Reports on min, max and average memory usage
      • We also have another progress listener that tracks processes to identify when a test run may have been impacted by other activity on the system
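A listener that samples memory before and after each operation and reports min, max and average might look roughly like this. It is a sketch under assumptions, not the internal listener described above: `ProgressListener` and `MemoryTracker` are hypothetical, and it measures the heap of the JVM running the test, which only reflects the store when testing in-memory. Monitoring a remote store would need a different data source.

```java
/** Illustrative only: a hypothetical listener, not the sparql-query-bm progress listener API. */
final class MemoryListenerSketch {

    /** Notified both before and after each operation runs. */
    interface ProgressListener {
        void beforeOperation(String name);
        void afterOperation(String name);
    }

    /** Samples used heap around every operation and keeps running statistics. */
    static final class MemoryTracker implements ProgressListener {
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;
        private long total;
        private int samples;

        @Override public void beforeOperation(String name) { record(usedHeapBytes()); }

        @Override public void afterOperation(String name) { record(usedHeapBytes()); }

        /** Adds one measurement; exposed so values from another source can be fed in. */
        void record(long usedBytes) {
            min = Math.min(min, usedBytes);
            max = Math.max(max, usedBytes);
            total += usedBytes;
            samples++;
        }

        /** Heap currently in use by this JVM. */
        private static long usedHeapBytes() {
            Runtime runtime = Runtime.getRuntime();
            return runtime.totalMemory() - runtime.freeMemory();
        }

        long min() { return min; }

        long max() { return max; }

        double average() { return samples == 0 ? 0.0 : (double) total / samples; }

        int samples() { return samples; }
    }

    public static void main(String[] args) {
        MemoryTracker tracker = new MemoryTracker();
        tracker.beforeOperation("example");
        byte[] workload = new byte[1024 * 1024]; // stand-in for the operation's work
        tracker.afterOperation("example");
        System.out.println("samples=" + tracker.samples() + " workload=" + workload.length
                + " min=" + tracker.min() + " max=" + tracker.max() + " avg=" + tracker.average());
    }
}
```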
  32. public class RetryOnAuthFailureOperationRunner extends RetryingOperationRunner {

          public RetryOnAuthFailureOperationRunner() {
              this(1);
          }

          public RetryOnAuthFailureOperationRunner(int maxRetries) {
              super(maxRetries);
          }

          @Override
          protected <T extends Options> boolean shouldRetry(Runner<T> runner, T options, Operation op, OperationRun run) {
              // Only retry when the failure was an authentication error
              return run.getErrorCategory() == ErrorCategories.AUTHENTICATION;
          }
      }

      • Extends the built-in RetryingOperationRunner
      • Simply adds a constraint on retries by overriding the shouldRetry() method
  33. Future Work
  34. • Embrace Java 7 features fully
      • Use ServiceLoader to automatically discover new operations and mix formats
      • Make it even easier to customize runners
        • i.e. provide more abstraction of the current implementations
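For readers unfamiliar with `java.util.ServiceLoader`, discovery of pluggable operations could look something like the sketch below. The `OperationPlugin` interface is hypothetical and stands in for whatever extension point the tool might expose; only the `ServiceLoader` calls are standard Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Illustrative only: OperationPlugin is a hypothetical extension point, not part of sparql-query-bm. */
final class DiscoverySketch {

    /** What a pluggable operation type would need to expose to be discovered. */
    public interface OperationPlugin {
        String typeId();
    }

    /**
     * Asks ServiceLoader for every implementation listed on the classpath in a
     * META-INF/services file named after the interface's binary class name.
     */
    static List<String> discoverTypeIds() {
        List<String> typeIds = new ArrayList<>();
        for (OperationPlugin plugin : ServiceLoader.load(OperationPlugin.class)) {
            typeIds.add(plugin.typeId());
        }
        return typeIds;
    }

    public static void main(String[] args) {
        // With no provider files on the classpath the list is simply empty.
        System.out.println("Discovered operation types: " + discoverTypeIds());
    }
}
```

The appeal of this approach is that adding a new operation or mix format would only require dropping a jar with the right provider file on the classpath, with no registration code.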
  35. Questions? rvesse@yarcdata.com @RobVesse
