Practical SPARQL Benchmarking
 

Talk from SemTech 2012 West in San Francisco - Discusses the why and how of SPARQL benchmarking and shows some example results generated by our tool

Key takeaway: a benchmark can only tell you so much. You need to test with your own data and your own queries.

Speaker Notes

  • Introduce myself. May want to add a disclaimer here about the views/opinions expressed being primarily my personal ones and not those of the company, a la DVD extras disclaimers ;-)
  • What it says on the slide ;-)
  • Describe the benchmarks shown on the slides and discuss the deficiencies of each: BSBM is relational and doesn't really show off the capabilities of a SPARQL engine; LUBM needs reasoning, and the implementation thereof (forward vs. backward chaining) can make a huge difference in performance; SP2B's queries are unrealistic and it focuses on optimization.
  • Self-explanatory slide for the most part. Highlight that a store being good/bad at a particular benchmark doesn't tell you whether that store is good/bad for your use case.
  • Describe the methodology in detail. Note that this is based on an amalgamation of the BSBM-style and Revelytix SP2B methodologies.
  • Key point is to cover the difference between response time and runtime. Note that this statistic can give some interesting information about how stores execute queries: an almost instant response time with a much longer runtime indicates streaming execution, while a long response time with only a small difference to the runtime indicates batch execution (see the timing sketch after these notes).
  • Run through a brief demo of the command line tool. Make sure to have a running Stardog/Fuseki instance to run against; it is likely safer to use Fuseki as it is easier to ensure it is running, and being open source there is no appearance of bias towards a commercial product. Run on SP2B 10k, which will complete in reasonable time while I'm talking; suggest using a limited number of runs for demo purposes. Show the output data (CSV and XML). The key difference is that CSV converts to seconds while XML uses raw nanoseconds; XML is better for post-processing, while CSV is useful for quick import into spreadsheet tools.
  • Discuss the setup for the example results and why the stores were chosen: ease of availability (open source, runnable on *nix, personal interest, etc.). Ensure to highlight YMMV. Disclaimer: be sure to state that this is just an arbitrarily selected sample of stores and that the performance indicated here may not be representative of the true performance of any store; most importantly, Cray/YarcData is not endorsing any specific store. Again, point out the importance of people running their own benchmarks.
  • Note how, as dataset size increases, many stores can't complete within reasonable time on the machines we used (logarithmic scale). Make sure to mention that the fact that many stores did not complete on the 50k and 250k sizes doesn't mean they are defective, merely that with the machine resources available they couldn't run in a timely fashion. This leads nicely to the point that it is important to benchmark on the hardware you actually intend to use.
  • Discuss the variation in average runtime: some stores are way ahead of others. Note that some stores' results are heavily influenced by poor performance on certain queries (see next slide). Logarithmic scale.
  • Highlight the variation in performance both between stores and queries. Note how certain queries are just fundamentally hard even with clever optimisation. In-memory trumps disk for the relevant stores in most cases.
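
To make the response time vs. runtime distinction in these notes concrete, here is a minimal timing sketch. It is not the benchmarking tool's own code; it assumes a recent Apache Jena (ARQ) on the classpath, and the endpoint URL and query are purely illustrative placeholders.

    import org.apache.jena.query.QueryExecution;
    import org.apache.jena.query.QueryExecutionFactory;
    import org.apache.jena.query.ResultSet;

    public class TimingSketch {
        public static void main(String[] args) {
            // Hypothetical endpoint and query - substitute your own
            String endpoint = "http://localhost:3030/ds/query";
            String query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10000";

            long start = System.nanoTime();
            try (QueryExecution qe = QueryExecutionFactory.sparqlService(endpoint, query)) {
                ResultSet results = qe.execSelect();

                // Response time: from issuing the query until the first result is available
                results.hasNext();
                long responseTime = System.nanoTime() - start;

                // Runtime: from issuing the query until all results are received and counted
                long count = 0;
                while (results.hasNext()) {
                    results.next();
                    count++;
                }
                long runtime = System.nanoTime() - start;

                System.out.printf("%d results, response time %.3fs, runtime %.3fs%n",
                        count, responseTime / 1e9, runtime / 1e9);
            }
        }
    }

With a streaming store the first hasNext() typically returns almost immediately and the counting loop dominates; with a batch store most of the time is spent before the first result ever arrives.
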

Practical SPARQL Benchmarking: Presentation Transcript

  • Rob Vesse
      ◦ rvesse@yarcdata.com
      ◦ @RobVesse
  • Regardless of what technology your solution will be built on (RDBMS, RDF + SPARQL, NoSQL, etc.) you need to know it performs sufficiently to meet your goals
  • You need to justify option X over option Y
      ◦ Business – price vs. performance
      ◦ Technical – does it perform sufficiently?
  • No guarantee that a standard benchmark accurately models your usage
  • Berlin SPARQL Benchmark (BSBM)
      ◦ Relational style data model
      ◦ Access pattern simulates replacing a traditional RDBMS with a Triple Store
  • Lehigh University Benchmark (LUBM)
      ◦ More typical RDF data model
      ◦ Stores require reasoning to answer the queries correctly
  • SPARQL2Bench (SP2B)
      ◦ Again typical RDF data model
      ◦ Queries designed to be hard – cross products, filters, etc.
      ◦ Generates artificially massive, unrealistic results
      ◦ Tests clever optimization and join performance
  • Often no standardized methodology
      ◦ E.g. only BSBM provides a test harness
  • Lack of transparency as a result
      ◦ If I say I'm 10x faster than you, is that really true or did I measure differently?
      ◦ Are the figures you're comparing with even current?
  • What actually got measured?
      ◦ Time to start responding
      ◦ Time to count all results
      ◦ Something else?
  • Even if you run a benchmark, does it actually tell you anything useful?
  • Java command line tool (and API) for benchmarking
  • Designed to be highly configurable
      ◦ Runs any set of SPARQL queries you can devise against any HTTP-based SPARQL endpoint
      ◦ Runs single and multi-threaded benchmarks
      ◦ Generates a variety of statistics
  • Methodology (a rough sketch of the statistics step appears after the transcript)
      ◦ Runs some quick sanity tests to check the provided endpoint is up and working
      ◦ Optionally runs W warm-up runs prior to actual benchmarking
      ◦ Runs a Query Mix N times
      ◦ Randomizes query order for each run
      ◦ Discards outliers (best and worst runs)
      ◦ Calculates averages, variances and standard deviations over the runs
      ◦ Generates reports as CSV and XML
  • Response Time
      ◦ Time from when the query is issued to when results start being received
  • Runtime
      ◦ Time from when the query is issued to all results being received and counted
      ◦ Exact definition may vary according to configuration
  • Queries per Second
      ◦ How many times a given query can be executed per second
  • Query Mixes per Hour
      ◦ How many times a query mix can be executed per hour (a worked example of these derived metrics appears after the transcript)
  • SP2B at 10k, 50k and 250k run with 5 warm-ups and 25 runs
      ◦ All options left as defaults, i.e. full result counting
      ◦ Runs for 50k and 250k skipped if a store was incapable of performing the run in reasonable time
  • Run on the following systems
      ◦ *nix based stores run on a late 2011 MacBook Pro (quad core, 8GB RAM, SSD)
          ▪ Java heap space set to 4GB
      ◦ Windows based stores run on an HP laptop (dual core, 4GB RAM, HDD)
      ◦ Both low powered systems compared to servers
  • Benchmarked stores
      ◦ Jena TDB 0.9.1
      ◦ Sesame 2.6.5 (Memory and Native Stores)
      ◦ Bigdata 1.2 (WORM Store)
      ◦ Dydra
      ◦ Virtuoso 6.1.3 (Open Source Edition)
      ◦ dotNetRDF (In-Memory Store)
      ◦ Stardog 0.9.4 (In-Memory and Disk Stores)
      ◦ OWLIM
  • Code release is management approved
      ◦ Currently undergoing Legal and IP clearance
      ◦ Should be open sourced shortly under a BSD license
      ◦ Will be available from https://sourceforge.net/p/sparql-query-bm
      ◦ Apologies this isn't yet available at time of writing
  • Example results data available from:
      ◦ https://dl.dropbox.com/u/590790/semtech2012.tar.gz
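
As a rough illustration of the statistics step in the methodology above, the following sketch (not the tool's actual implementation; the runtimes are made-up placeholders) discards the best and worst runs and then computes the average, variance and standard deviation:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class RunStatistics {
        public static void main(String[] args) {
            // Placeholder runtimes (in seconds) for one query across several runs
            List<Double> runtimes = new ArrayList<>(List.of(1.92, 2.05, 1.88, 4.73, 1.95, 2.01));

            // Discard outliers: drop the best (fastest) and worst (slowest) runs
            Collections.sort(runtimes);
            List<Double> kept = runtimes.subList(1, runtimes.size() - 1);

            // Average, variance and standard deviation over the remaining runs
            double avg = kept.stream().mapToDouble(Double::doubleValue).average().orElse(0);
            double variance = kept.stream()
                    .mapToDouble(t -> (t - avg) * (t - avg))
                    .average().orElse(0);
            double stdDev = Math.sqrt(variance);

            System.out.printf("Average %.3fs, variance %.4f, std dev %.3fs%n", avg, variance, stdDev);
        }
    }
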
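And a worked example of the derived metrics (Queries per Second and Query Mixes per Hour) defined above, again using made-up averages purely for illustration:

    public class DerivedMetrics {
        public static void main(String[] args) {
            // Made-up averages purely for illustration
            double avgQueryRuntimeSeconds = 0.25; // average runtime of a single query
            double avgMixRuntimeSeconds = 42.0;   // average runtime of the whole query mix

            // Queries per Second: how many times the query could run in one second
            double qps = 1.0 / avgQueryRuntimeSeconds;   // 4.0

            // Query Mixes per Hour: how many times the mix could run in one hour
            double qmph = 3600.0 / avgMixRuntimeSeconds; // ~85.7

            System.out.printf("QPS: %.1f, QMPH: %.1f%n", qps, qmph);
        }
    }
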