Flexible transactional scale for the connected world.
Demystifying Benchmarks
How to Use Them to Better Evaluate Databases
Peter Friedenbach, Performance Architect | Clustrix
Outline
o A Brief History of Database Benchmarks
o A Brief Introduction to Open Source Database Benchmark Tools
o How to Evaluate Database Performance
– Best Practices and Lessons Learned
First, a little history…
In the beginning, there were the “TPS Wars”
Debit/Credit: The 1st Attempt
o Published in 1985 by “Anon. et al.” (Jim Gray)
o The first published database performance benchmark
o Novelties of DebitCredit:
– ACID property transactions
– Price/performance metrics
– Response time constraints
– Database scaling rules
The Call for an Industry Standard
o Transaction Processing Performance Council
– Founded in 1988, the TPC was chartered to establish industry standards out of the madness.
TPC Timeline:
1988 : TPC Established
1989 : TPC-A Approved (formalizes Debit/Credit)
1990 : TPC-B Approved (database-only version of TPC-A)
1992 : TPC-C Approved (replaces the TPC-A OLTP workload)
1995 : TPC-D Approved (1st decision support workload); TPC-A & TPC-B retired
1999 : TPC-H Approved (replaces the TPC-D workload)
2000 : TPC-C v5 Approved (major revision to TPC-C)
2006 : TPC-E Approved (next-generation OLTP workload)
2009 : First TPC Technology Conference on Performance Evaluation & Benchmarking (held in conjunction with VLDB)
2012 : TPC-VMS Approved (1st virtualized database benchmark)
2014 : TPCx-HS Approved (1st Hadoop-based benchmark)
2015 : TPC-DS Approved (next-generation decision support benchmark)
2016 : TPCx-V Approved (virtual DB server benchmark)
2016 : TPCx-BB Published (Hadoop big data benchmark)
Transaction Processing Performance Council
o The Good:
– Established the rules of the game
– For the first time, competitive performance claims could be compared
– Audited results
– Standard workloads focused the industry and drove performance improvements
o The Bad:
– Expensive to play
– “Benchmarketing” and gamesmanship
– Dominated by vendors
• Hardware vendors
• Database vendors
– Slow to evolve with a changing marketplace
The Impact of the TPC
What happened to the TPC?
Open Source Database Benchmarking Tools
o Sysbench
– Open source toolkit
• Maintained by Alexey Kopytov at Percona
– Implements multiple workloads designed to test the CPU, disk, memory, and database capabilities of a system
– The database workload allows a mixture of reads (singleton selects and range queries) and writes (inserts, updates, and deletes); see the sketch below
– Sysbench is popular in the MySQL marketplace
“sysbench is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS parameters that are important for a system running a database under intensive load.”
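To make this concrete, here is a minimal sketch of driving sysbench 1.0's bundled OLTP scripts against a MySQL-compatible database. The host name, credentials, and sizing values (ten tables of 1M rows, 64 threads, five minutes) are illustrative assumptions, not recommendations:

    # Connection settings are illustrative placeholders; substitute your own.
    CREDS="--db-driver=mysql --mysql-host=db.example.com --mysql-user=bench --mysql-password=secret --mysql-db=sbtest"

    # Create and load ten 1M-row tables.
    sysbench oltp_read_write $CREDS --tables=10 --table-size=1000000 prepare

    # Mixed read/write run: 64 threads for 300 seconds, stats every 10 seconds.
    sysbench oltp_read_write $CREDS --tables=10 --table-size=1000000 \
        --threads=64 --time=300 --report-interval=10 run

    # Drop the test tables.
    sysbench oltp_read_write $CREDS --tables=10 cleanup

The same prepare/run/cleanup pattern applies to sysbench's other bundled workloads (e.g., oltp_point_select or oltp_read_only for read-heavy mixes).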
Open Source Database Benchmarking Tools
o YCSB (Yahoo! Cloud Serving Benchmark)
– Open source toolkit
• Maintained by Brian Cooper, now at Google
– Multi-threaded driver exercising get and put operations against an object store; see the sketch below
– YCSB is popular in the NoSQL marketplace
“YCSB is a framework and common set of workloads for evaluating the performance of different “key-value” and “cloud” serving stores.”
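Similarly, a minimal sketch of a YCSB load/run cycle. The target binding (mongodb here) and the record and operation counts are illustrative assumptions; any supported store can be substituted:

    # Phase 1: load the data set defined by workloada (a 50/50 read/update mix).
    bin/ycsb load mongodb -P workloads/workloada -p recordcount=1000000 -threads 16

    # Phase 2: run the transaction phase and report throughput and latencies.
    bin/ycsb run mongodb -P workloads/workloada -p operationcount=10000000 -threads 64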
Open Source Database Benchmarking Tools
o Others
– “TPC-like” workloads live on:
• DBT2 (MySQL benchmark toolkit)
• DebitCredit (TPC-A / TPC-B like)
• OrderEntry (TPC-C like)
– OLTPBench
• https://github.com/oltpbenchmark/oltpbench
– Others?
How to Evaluate Database Performance
o #1 – Know your objective.
– What are you trying to measure/test?
• OLTP Performance: Capacity and Throughput
• Data Analytics: Query Performance
• Do you need full ACID property transactions?
• … and other questions?
How to Evaluate Database Performance
o #2 – Choose your approach.
Option 1: Rely on “published results”
• Words of advice: Trust but verify!
• Be wary of competitive benchmark claims
• Without the TPC, there is no standard for comparison
Option 2: Leverage open source benchmarks
• Use Sysbench and/or YCSB mixed workloads
• Create your own custom mix as appropriate
Option 3: Model your own workload (“Proof of Concept”)
• Particularly useful if you have existing data and existing query profiles
How to Evaluate Database Performance
o #3 – A data point is not a curve. (Common mistake.)
Performance Curves
[Figure: latency (response time) plotted against throughput]
Performance is a tradeoff of throughput versus latency. Design your tests with a variable in mind, and sweep that variable to trace the full curve (see the sketch below).
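One way to honor this in practice: fix everything except concurrency and sweep it, so each run contributes one point on the curve. A sketch reusing the illustrative sysbench settings from earlier:

    # CREDS as in the earlier sysbench sketch (illustrative values).
    CREDS="--db-driver=mysql --mysql-host=db.example.com --mysql-user=bench --mysql-password=secret --mysql-db=sbtest"

    # Sweep concurrency; everything else stays fixed.
    for THREADS in 1 2 4 8 16 32 64 128; do
        echo "== $THREADS threads =="
        sysbench oltp_read_write $CREDS --tables=10 --table-size=1000000 \
            --threads=$THREADS --time=120 run \
            | grep -E "transactions:|avg:|95th percentile:"
    done

Plotting the extracted throughput and latency figures against each other yields the curve, rather than a single point.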
How to Evaluate Database Performance
o #4 – Understand where the bottleneck is. (Common mistake.)
– Where systems can bottleneck:
• Hardware (CPU, disk, network); see the sketch below
• Database (internal locks/latches, buffer managers, transaction managers, …)
• Application (data locks)
• Driver systems
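While a run is in flight, standard Linux utilities can confirm or rule out a hardware bottleneck; the five-second interval here is an arbitrary illustrative choice:

    # Run alongside the benchmark, each in its own terminal.
    vmstat 5            # run queue, context switches, swap: CPU/memory pressure
    iostat -x 5         # per-device utilization and wait times: disk saturation
    mpstat -P ALL 5     # per-core load: watch for a single hot core
    sar -n DEV 5        # per-interface throughput: network saturation

If none of these show saturation, look inward (database locks/latches) or outward (application locking, an undersized driver system).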
How to Evaluate Database Performance
o #5 – Limit the number of variables.
In any test, there are:
– Three fundamental system variables
• Hardware, operating system, and database system
– Driver mode
• On-box versus external client
– Database design variables
• Connectivity (ODBC, JDBC, session pools, proprietary techniques, …)
• Execution model (session-less, prepare/exec, stored procedures, …)
• Number of tables, number of columns, types of columns
– Multiple test variables
• Database scale size
• Concurrent sessions/streams
• Query complexity
Real performance work is an exercise of control; record the fixed variables with every result (see the sketch below).
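Part of that control is snapshotting the “fixed” variables next to each result set, so runs stay comparable weeks later. A minimal sketch; the host name, the MySQL client, and the output-file convention are illustrative assumptions:

    # Capture the test environment alongside each result set.
    HOST=db.example.com    # illustrative placeholder
    {
        date
        uname -a                                   # OS and kernel
        lscpu | grep -E "Model name|^CPU\(s\)"     # CPU model and core count
        mysql --version                            # client/driver version
        mysql -h "$HOST" -e "SELECT VERSION();"    # server version
    } > "run-$(date +%Y%m%d-%H%M%S).env"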
How to Evaluate Database Performance
o #6 – Scalability testing: the quest for “linear scalability”
A workload will scale only if it is sized appropriately: as the system grows, the data set and the offered load must grow with it (see the sketch below).
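A sketch of “sizing appropriately”: as node count grows, the data set and the offered load grow in proportion, rather than pointing a fixed small workload at ever more hardware. Node counts and scaling factors here are illustrative assumptions:

    # CREDS as in the earlier sysbench sketch (illustrative values).
    CREDS="--db-driver=mysql --mysql-host=db.example.com --mysql-user=bench --mysql-password=secret --mysql-db=sbtest"

    for NODES in 3 6 12; do
        # (Re)provision the cluster to $NODES nodes before each pass.
        TABLES=$((10 * NODES))       # data volume tracks cluster size
        THREADS=$((32 * NODES))      # offered load tracks cluster size
        sysbench oltp_read_write $CREDS --tables=$TABLES --table-size=1000000 prepare
        sysbench oltp_read_write $CREDS --tables=$TABLES --table-size=1000000 \
            --threads=$THREADS --time=300 run
        sysbench oltp_read_write $CREDS --tables=$TABLES cleanup
    done

Near-constant per-node throughput across the passes is what “linear scalability” looks like in practice.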
How to Evaluate Database Performance
o #7 – The myth of the “representative workload.”
Benchmarks are not representative workloads:
– The complexity of a benchmark does not determine its goodness.
Good benchmark performance is “necessary but not sufficient” for good application performance.
How I Use the Tools Available
o To assess “system” health, I use (see the sketch below):
– CPU/processor: sysbench cpu
– Disk subsystem: sysbench fileio
– Network subsystem: iperf
o To assess “database” health, I use:
– Sysbench for ACID transactions: point selects, point updates, and simple mixes (90:10 or 80:20)
– YCSB for non-ACID transactions: workloadc (read-only), workloada and workloadb (read/write mixes)
o To assess “database” transaction capabilities, I use:
– DebitCredit and OrderEntry (“TPC-like,” database-only workloads)
o To model application-specific problems, I will sometimes leverage any or all of the above.
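For reference, minimal invocations of the system-health checks named above. File sizes, durations, thread counts, and the peer host are illustrative assumptions; iperf3 is shown, though classic iperf works similarly:

    # CPU: compute primes up to a bound; compare events/sec across machines.
    sysbench cpu --cpu-max-prime=20000 --threads=16 run

    # Disk: create test files, run a random read/write mix, then clean up.
    sysbench fileio --file-total-size=8G prepare
    sysbench fileio --file-total-size=8G --file-test-mode=rndrw --time=120 run
    sysbench fileio --file-total-size=8G cleanup

    # Network: start "iperf3 -s" on the peer first, then drive traffic at it.
    iperf3 -c peer.example.com -P 4 -t 30    # 4 parallel streams for 30 seconds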