DBMS benchmarking overview and trends for Moscow ACM SIGMOD Chapter

Andrei Nikolaenko, Systems architect at IBS
DATABASE MANAGEMENT
SYSTEMS BENCHMARKING
overview
and
trends
TPC-C RACES
296 results
Symfoware, Oracle DB 7–8, DB2/400 and UDB, Informix, MS SQL Server, Sybase ASE …
TPC-C, APRIL, 2017
SQL
Anywhere
AGENDA
What happened with TPC-C?
•TPC: early history
•TPC-A/B, -C, -H, -E
•TPC: obsolete and new
•Publication issues
The New Wave of benchmarks
•MapReduce (Hadoop)
•Graph benchmarks
•Meta-universal, atomic
«Pocket tools» to run
Input/Output
Outcomes
TP1
Mid 1970s,
IBM benchmark
Bank
transactions
processing
Idefix: 100 tps
(1973, “bank
with 1,000
branches and
10,000 tellers”)
Batch mode
without
networking
no delay for
teller response
Early 1980s:
fantastic
victorious reports
with 10 ktps
$$bln market…
… focus on fast-
growing OLTP
…however,
customers could
not yet achieve
1 ktps …
DAVID DEWITT:
WISCONSIN BENCHMARK
Alternative, more
strictly defined
benchmark
Aimed to end the
“benchmarking
wars”, but made the
war even fiercer!
Weak results of
Oracle DB led to the
‘DeWitt clause’:
the DBMS license
prohibits publishing
benchmarks
JIM GRAY: DEBITCREDIT
Counterweight to the Wisconsin benchmark,
based on the TP1 subject area
Requires total system cost, including
equipment, licenses, 5 years of support
Specified as textual functional
requirements, without code requirements
or code examples
Introduced scaling rules by number of users and table
sizes
Postulated a response time limit: 95% of
transactions should complete within 1 s
PRINCIPLE: RESPONSE TIME
BOUND
Image: ©Dell, 2013
TPC.ORG
Benchmarking wars continued: how to validate results?
An independent, non-profit organization was needed…
1988: Transaction Processing Performance Council
Omri Serlin and 8 [consenting] vendors
Members (2017):
Actian, Cisco, Cloudera, Dell, DataCore, Fujitsu, HPE, Hitachi, Huawei, IBM,
Inspur, Intel, Lenovo, Microsoft, Oracle, Pivotal, Red Hat, SAP, Teradata, VMware
TPC-A AND TPC-B: PROCESS
READ 100 bytes from TTY (AID, TID, BID, DELTA)
BEGIN TRANSACTION
UPDATE ACCOUNT WHERE ACCOUNT_ID = AID:
READ ACCOUNT_BALANCE FROM ACCOUNT
SET ACCOUNT_BALANCE = ACCOUNT_BALANCE + DELTA
WRITE ACCOUNT_BALANCE TO ACCOUNT
WRITE TO HISTORY:
AID, TID, BID, DELTA, TIME_STAMP
UPDATE TELLER WHERE TELLER_ID = TID:
SET TELLER_BALANCE = TELLER_BALANCE + DELTA
WRITE TELLER_BALANCE TO TELLER
UPDATE BRANCH WHERE BRANCH_ID = BID:
SET BRANCH_BALANCE = BRANCH_BALANCE + DELTA
WRITE BRANCH_BALANCE TO BRANCH
COMMIT TRANSACTION
WRITE 200 bytes to TTY (AID, TID, BID, DELTA)
Based on TP1 – retail banking transactions
TPC-A AND TPC-B: MODEL
BRANCH
B
ACCOUNT
B*100K
100K
HISTORY
B*2.6M
TELLER
B*1010
10s cycle for each
terminal
1 transaction per
second for each
branch
Response time for 90%
of transactions
under 2 s
Average
transactions per
second in 15 min
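The response-time bound and the throughput average can be checked with a few lines of code. The sketch below assumes a nearest-rank percentile definition and hypothetical function names; the specification's exact measurement rules are not reproduced here.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct% of the sample count, at least 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def meets_response_bound(response_times_s, pct=90, limit_s=2.0):
    """True if the pct-th percentile response time is under the limit."""
    return percentile(response_times_s, pct) < limit_s


def average_tps(completed_transactions, window_s=15 * 60):
    """Average transactions per second over the measurement window."""
    return completed_transactions / window_s
```

A run with nine fast responses and one slow outlier still passes, which is why the bound is stated as a percentile rather than a maximum.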
TPC-A AND TPC-B:
DIFFERENCES
TPC-A
Terminals
User response
delay
TPC-B
Server-only
benchmark
Reduced history
(30 days)
TPC-A AND TPC-B: CRITICISM
1995: considered unreliable
Last result (DEC, 1994)
3700 tpsA 4800 $/tpsA
First result (HP, 1990)
38.2 tpsA 29200 $/tpsA
TP1 legacy
Too simple to avoid “adjustments”
Implausible dispersion of results
TPC-C: COMPLICATION
Response thresholds for 90% of transactions
Less than 5 s for interactive operations
Less than 20 s for batch
More variety…
9 tables
Inserts, updates, deletes,
operation cancellations
Access by primary and
secondary keys
5 transaction types
NEW-ORDER
•new customer
order
PAYMENT
•payment event
•customer balance
refresh
DELIVERY
•delivery order
•(batch)
ORDER-STATUS
•checking the status
of the last customer
order
STOCK-LEVEL
•checking the stock
level at the
warehouse
45% 43% 4% 4% 4%
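A load driver has to pick the next transaction type according to this mix. The sketch below assumes the percentages map to the five transaction types in the order they are listed on the slide; the names `MIX`, `next_transaction` and `sample_mix` are illustrative.

```python
import random

# Assumed mapping: percentages applied to the transaction types in slide order.
MIX = {
    "NEW-ORDER": 45,
    "PAYMENT": 43,
    "DELIVERY": 4,
    "ORDER-STATUS": 4,
    "STOCK-LEVEL": 4,
}


def next_transaction(rng):
    """Draw one transaction type with probability proportional to its weight."""
    names = list(MIX)
    weights = [MIX[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def sample_mix(n, seed=0):
    """Draw n transactions and count how often each type was chosen."""
    rng = random.Random(seed)
    counts = dict.fromkeys(MIX, 0)
    for _ in range(n):
        counts[next_transaction(rng)] += 1
    return counts
```

Over a long run the observed shares converge to the configured weights, so a driver can verify its own mix after the measurement interval.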
TPC-C DATA MODEL
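WAREHOUSE: W rows
DISTRICT: W*10 rows (10 per warehouse)
CUSTOMER: W*30K rows (3K per district)
HISTORY: W*30K+ rows (1+ per customer)
ITEM: 100K rows (fixed)
STOCK: W*100K rows (100K per warehouse, W per item)
ORDER: W*30K+ rows (1+ per customer)
ORDER-LINE: W*300K+ rows (10–15 per order)
NEW-ORDER: W*5K rows (0–1 per order)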
TPC-C: SCALING
Each new warehouse →
+10 districts, +100K stock rows, +30K customers
Maximum 1.2 tpmC per terminal
10 terminals per warehouse
Scaling factor: warehouse (W)
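The scaling rule makes database size and the throughput ceiling a function of W alone. The sketch below uses the per-warehouse cardinalities from the data model slide and the 1.2 tpmC per terminal ceiling quoted here; the function names are illustrative, and the exact constants should be taken from the specification.

```python
import math

TERMINALS_PER_WAREHOUSE = 10
MAX_TPMC_PER_TERMINAL = 1.2  # figure quoted on the slide


def tpcc_sizing(warehouses):
    """Initial row counts and throughput ceiling for W warehouses."""
    if warehouses < 1:
        raise ValueError("at least one warehouse is required")
    terminals = warehouses * TERMINALS_PER_WAREHOUSE
    return {
        "warehouse": warehouses,
        "district": warehouses * 10,
        "customer": warehouses * 30_000,
        "stock": warehouses * 100_000,
        "item": 100_000,  # fixed, does not scale with W
        "terminals": terminals,
        "max_tpmC": terminals * MAX_TPMC_PER_TERMINAL,
    }


def warehouses_needed(target_tpmc):
    """Smallest W whose throughput ceiling reaches the target."""
    per_warehouse = TERMINALS_PER_WAREHOUSE * MAX_TPMC_PER_TERMINAL
    return math.ceil(target_tpmc / per_warehouse)
```

For example, a result of one million tpmC would need on the order of 83 thousand warehouses under these assumptions, which is where the "megaconfigurations" come from.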
TPC-C:
CLUSTERED AND NON-CLUSTERED
Clustered >1 nodes
Shared
nothing (fully
federated)
Sharded
Shared storage
(RAC)
Non-clustered 1 node
TPC-C: METRICS
tpmC
Transactions per
minute
$ / tpmC
Cost per tpmC
Hardware
acquisition cost
Software and
services cost for
3 years
W / ktpmC
Energy
consumption per
1000 transactions
per minute
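The two derived metrics are simple ratios. The sketch below shows the arithmetic with hypothetical function names and made-up inputs; the pricing and energy measurement rules themselves are defined by the TPC and are not modeled here.

```python
def price_performance(total_cost_usd, tpmc):
    """Total system cost (hardware plus software and services) per tpmC."""
    if tpmc <= 0:
        raise ValueError("tpmC must be positive")
    return total_cost_usd / tpmc


def watts_per_ktpmc(average_power_watts, tpmc):
    """Average power draw per 1000 transactions per minute."""
    if tpmc <= 0:
        raise ValueError("tpmC must be positive")
    return average_power_watts / (tpmc / 1000)
```

With a hypothetical 1,600,000 USD configuration delivering 1,000,000 tpmC, `price_performance` gives 1.6 USD per tpmC.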
TPC-C INTERPRETATION
By Allan Packer (Allan N. Packer. Configuring and Tuning Databases on the Solaris Platform. Prentice Hall, 2002)
•…not only order entry: tpmC × 2
•…if a transaction monitor is not used: tpmC / 2
•…if an Oracle Forms-like client is used: tpmC / 3
•…if lightweight client forms are used (like curses-based): tpmC × 2 / 3
•…if SQL has not been tuned: tpmC / 2
•…if there are heavy reports and batches: tpmC / 2
COULD WE CHEAT TPC-C?
mount -t tmpfs -o size=2048g tmpfs /u01/tablespaces
CREATE UNLOGGED TABLE …
_ALLOW_RESETLOGS_CORRUPTION = TRUE
_IN_MEMORY_UNDO = TRUE
_DB_BLOCK_HASH_LATCHES = 32768
…
RESPONSE FROM COUNCIL
All ACID aspect checks are strictly interpreted in the specification
Single node reboot failure
Synchronous commit on
two nodes
Redo logs and recovery
Reboot and recovery
Single media failure protection
Synchronous commit on
two separately powered media
Write-ahead logs on separately powered media
Commit = written on durable media
TPC-C: CRITICISM OF 1990S
Even wholesale suppliers work somewhat differently!
Order: some
unsuccessful wildcard
searches
Printing report after
each entry
(probably, 3 times)
Balance refresh for each
operation is impossible in
a heavy load environment
→ insert + batch calc
Trivial logic
No declarative constraints
No trigger logic
Non-typical loads
What about reporting? Decision support systems?
TPC: OBSOLETE
TPC-D 1995–1999
First attempt
for OLAP
benchmark
TPC-R 2001–2005 Reporting
TPC-W 2001–2005
Online web-
commerce
TPC-H
1999 :
OLAP vs OLTP
“Ad-hoc
decision
support”
Instead of
TPC-D
(considered
irrelevant)
“Weight
categories”
100 GB
300 GB
1 TB
3 TB
10 TB
30 TB
100 TB
Parallel load
22 kinds of
complex
queries
2 kinds of
DWH
refresh
TPC-H: MODEL
1 SF ~ 1 GB
Declarative
constraints
Not star schema
TPC-H Q1
SELECT
l_returnflag, l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval Δ day (3)
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus
Δ = random [60…120]
“Functional Query Definition”
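Each execution of Q1 substitutes a fresh Δ into the query text. The sketch below shows that substitution, assuming a plain string template and a hypothetical `render_q1` helper; it only builds the SQL and does not run it against a database.

```python
import random

Q1_TEMPLATE = """SELECT
  l_returnflag, l_linestatus,
  sum(l_quantity) as sum_qty,
  sum(l_extendedprice) as sum_base_price,
  sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
  sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
  avg(l_quantity) as avg_qty,
  avg(l_extendedprice) as avg_price,
  avg(l_discount) as avg_disc,
  count(*) as count_order
FROM lineitem
WHERE l_shipdate <= date '1998-12-01' - interval '{delta}' day (3)
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus"""


def render_q1(rng=None):
    """Return (delta, sql) with delta drawn uniformly from 60..120 inclusive."""
    rng = rng or random.Random()
    delta = rng.randint(60, 120)
    return delta, Q1_TEMPLATE.format(delta=delta)
```

Passing a seeded `random.Random` makes a query stream reproducible, which helps when comparing runs.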
TPC-H: RESULTS, 2017
Tricky to apply
for non-relational DBMS
Variants to
apply for
MOLAP
(MDX)
Reports
with
application
for
Apache Hive
Exasol
MS SQL
Server
Oracle
Database
Actian
Vector
1 – 10 results for each
“weight category”
TPC-E: “PRAGMATIC OLTP”
Still OLTP, but more “hybrid”
Declarative constraints
No transaction monitors
More reads
More kinds of load
TPC-E:
MORE COMPLEX WORKFLOW
TPC-E: MODEL
Image: ©Transaction Processing Performance Council, 2009
78 results
All non-clustered, all on MS SQL Server for Windows Server x64
11 ktpsE, $144/tpsE ($1.6M)
Image: ©Lenovo, ©TPC, 2015
TPC-ABCEH
BRAND NEW: TPC-X AND TPCX-X
TPC-DI
ETL workloads
TPC_DI_RPS
0 results
TPC-DS
Decision support
systems “on Big
Data solutions,
such as RDBMS as
well as
Hadoop/Spark
based systems”
QphDS@Size
0 results
TPCx-BB
Express benchmark
for “Hadoop-
based Big Data
Systems” adapted
from BigBench
BBpm@Size
0 results
TPCx-HS
Express benchmark
for HDFS-
compatible
systems, adapted
from TeraSort
HSph@Size
From 1 to 4 results
in different
“weight
categories”
PUBLISHING?
Publishing TPC benchmarks is prohibited,
except in the following cases:
Publishing on TPC.org (requires an audit from sizing.com)
In an academic or research paper that is not used for
marketing purposes
With an explicit clause that the outcome is not comparable with
TPC.org results
With permission of TPC.org
DEWITT CLAUSE IN 2017?
MSFT
EULA
•“You may not disclose the results of any benchmark test
… without Microsoft’s prior written approval”
OTN
Lic.
•“You may not disclose results of any Program benchmark
tests without Oracle’s prior consent”
IBM
IPLA
•“Licensee may disclose the results of any benchmark test of
the Program or its subcomponents to any third party
provided that Licensee, if … A) … B) … C)…”
NEW BENCHMARKS FOR
NEW WORKLOADS
2010s
2000s
1990s
1980s
OLTP
ROLAP ROLAP
In-memory
OLAP
TPC-DS
ROLAP TPC-H
Graph
queries
LinkBench
MapReduce HiBench
OLTP
OLTP[E] OLTP[E] TPC-E
OLTP
Atomic
access
YCSB
TPCx-HS
TPCx-BB
PRINCIPAL BENCHMARKS
OF “NEW WAVE”
TeraSort Benchmark
• Adopted in TPCx-HS
BigBench
• Adopted in TPCx-BB
Intel HiBench
• MapReduce series
Yahoo! Cloud Serving
Benchmark
• Group of NoSQL
reference runs
LinkBench
• Graph-like workload on
RDBMS and HBase
INTEL HIBENCH FOR
HADOOP
Image: Intel, 2013
YAHOO! CLOUD SERVING
BENCHMARK
New tool for “benchmark marketing”
Used by researchers for wide comparisons
(V. Abramova et al. Experimental Evaluation of NoSQL Databases // IJDMS Vol. 6, No. 3, 2014)
Cassandra, HBase, Elasticsearch, MongoDB, Oracle NoSQL,
OrientDB, Redis, Scalaris, Tarantool, Voldemort
Bombardment from one load station
Atomic operation instead of transaction
(probably, read a few records)
YCSB: WORKLOADS
LINKBENCH
Facebook workload
•Early graph benchmarks were
graph-analysis oriented
•Real workload of the Internet
Giant
•Transactions (MVCC)
Methodologically sound
•Statistical laws for data
generation
•Measures: average response time
for each type of workload
•Avg from 99-percentile
•Documents per second for writes
•Queries per second for read
MySQL
(InnoDB vs
TokuDB)
HBase
MongoDB /
TokuMX
OrientDB
The Question of Facebook
POCKET TOOLS
TO MEASURE
Possible?
Correct?
Representative?
Comparable?
Repeatable?
PGBENCH
TPC-B
Standard
part of
PostgreSQL
distributions
One simple
command to
run
De facto standard for PostgreSQL
internal investigations, such as:
XFS or ext4?
Tables on SSD, indexes
on HDD?
Block size: 4K or 8K?
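A typical pgbench comparison is two commands: initialize at a given scale, then run a timed load. The sketch below only builds the command lines, assuming `pgbench` is on the PATH and a database named by the caller already exists; the helper names and default values are illustrative.

```python
import shlex


def pgbench_commands(dbname, scale=100, clients=16, threads=4, duration_s=300):
    """Return (init, run) argument lists for a basic pgbench session."""
    # -i initializes the tables, -s sets the scale factor.
    init = ["pgbench", "-i", "-s", str(scale), dbname]
    # -c client connections, -j worker threads, -T run time in seconds.
    run = ["pgbench", "-c", str(clients), "-j", str(threads),
           "-T", str(duration_s), dbname]
    return init, run


def as_shell(command):
    """Render an argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

To answer a question such as "XFS or ext4?", the same pair of commands would be repeated on each configuration with identical parameters, for instance by passing each list to `subprocess.run`.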
SYSBENCH
Widely used for internal comparisons in the MySQL, MariaDB,
and related fork communities
Sui generis tests, not similar to standard benchmarks
MySQL utility (by Alexei Kopytov)
fileio cpu memory threads mutex oltp
TPCE-MYSQL
TPC-E loader for MySQL by Percona
HAMMERDB
Free GUI program (Tcl/Tk) for
TPC-C (?) and TPC-H (?) supporting a range of DBMSs
Oracle Database, Microsoft SQL Server, IBM DB2, TimesTen, MySQL, MariaDB,
PostgreSQL, Postgres Plus AS, Greenplum DB, Redis,
Amazon Aurora (MySQL), Amazon Redshift (ParAccel), Trafodion SQL on Hadoop
HAMMERDB:
ONE-CLICK TPC-C?
HAMMERDB:
TPC, BUT NOT QUITE…
Modern benchmarks have not been implemented
TPC-E TPC-DS
TPC-H
Single load station
TPC-C
Full environment is not emulated, no
transaction monitor software
Load station acts as a single
transaction monitor
HAMMERDB: POPULARITY
Used by
hardware and
software
vendors
Indirectly,
in form of a
“blog of our
tech guy”
Hub with
benchmark
results
Section
«Performance
Data»
SWINGBENCH
TPC-C-like workload for Oracle Database and TimesTen
+ Idiosyncratic OLTP benchmark with a lot of PL/SQL
Oracle DB-specific tools for monitoring and analysis (AWR, etc.)
Coordinator support
GUI and command line
Only internal comparability
Non-representative in RAC environments
DELL BENCHMARK FACTORY
FOR DATABASES
Oracle Database, MySQL, MS SQL Server, SQLite, SQL Anywhere
Commercial tool (Quest Software legacy)
TPC-C, TPC-D, TPC-E, TPC-H, AS3AP
Supports a number of load stations
(Windows)
OSDLDBT.SOURCEFORGE.NET
•TPC-W: DBT-1
•TPC-C: DBT-2
•TPC-H: DBT-3
•TPC-App: DBT-4
•TPC-E: DBT-5
While the inspiration for these workloads
are the TPC-<x…>, workloads are entirely
different and results obtained from them
should not and can not be compared to
TPC results.
The use of any supplied results of these
tests for commercial purposes is expressly
prohibited.
MySQL PostgreSQL
…extensible
OLTPBENCH
github.com/oltpbenchmark/oltpbench
Java tool for command line
Supports any JDBC-enabled RDBMS
Special version for H-Store (VoltDB)
TPC-C, Wikipedia, Synthetic Resource Stresser, Twitter, Epinions.com,
TATP, AuctionMark, SEATS, YCSB, JPAB (Hibernate), CH-benCHmark, Voter,
SIBench (Snapshot Isolation), SmallBank, LinkBench
TPC TOOLS
Tools by the Transaction
Processing Performance
Council
C source files
None exist for TPC-C:
just a sample in the spec
“Do it
yourself”:
all connectivity
and other stuff
TPC-* BY EXAMPLE
Based on the “full disclosure reports” on TPC.org
RPE2
SAP SD
2-Tier
TPC-C
TPC-H
SPECjbb2005
SPEC CPU2006
Supercomposite indicator by Gartner (Ideas)
RPE2-ERP
RPE2-Java
RPE2-OLTP
RPE2-Compute
Intensive
BENCHWARE
Swiss Army knife by Manfred Drozd
Peakmarks Benchware
OraCPU
PL/SQL op
•[ops]
PL/SQL alg
•[ops]
OraSRV
In-memory
SQL
•[ms]
•[dbps]
•[tps]
•[rps]
OraSTO
SeqIO
•[GBps]
•[iops]
RandIO
•[GBps]
•[iops]
OraOLTP
OLTP Select
•[rps]
•[tps]
OLTP Update
•[rps]
•[tps]
OraLoad
TransLoad
•[rps]
•[tps]
BulkLoad
•[rps]
•[tps]
OraAgg
OraAgg &
Rep
•[rps]
•[tps]
Only for Oracle Database, only PL/SQL and SYS.V_$%
DATABASE MACHINES?
Pre-configured
appliances for databases
Should be precisely
measured and
benchmarked?
TERADATA
Latest Teradata publications with TPC-H:
Licensed by
«internal Qph» –
tPerf [Traditional Performance]
EXADATA
tps, Qph – not published
“Passport metrics” (X6-8)
∑ V × IOPS ≈ const
IBM PURE DATA FOR
OPERATIONAL ANALYTICS
Qph not published…
“Passport metrics” about input/output
“SQL IOPS”
Measuring SQL IOPS for another DBMS?
Statistical views (…IO_STATS…)
IOPS from Oracle Database side
Orion
(Oracle IO Numbers)
SLOB
(recommended by EMC,
Flashgrid)
Benchware
(?)
DBMS_RESOURCE_MANAGER
.CALIBRATE_IO
ATOMIZATION OF
AGGREGATE METRICS
tpmC
QphH
SQL IOPS
SQL
bandwidth
METRICS ATOMIZATION:
PRO
Independent from predefined
models and schemes
Accepted by database
appliance vendors
Pocket tools availability
Representative for wide class
of DBMS and DBMS-like
systems
• Representative not only for
3NF, snowflakes, stars
• Included in passport
metrics
• Running from DB side
• Could be of interest for
NoSQL
METRICS ATOMIZATION:
…ET CONTRA
Methodology
Different results on
the same
environment from
CALIBRATE_IO
and Benchware
Not clear
application
sense
The same IO
operations in
different DBMSs
produce different
transaction counts
No tool exists
for most
DBMSs
Found just for Oracle
DB, IBM DB2, MS SQL
RUNNING REAL
WORKLOAD AND
APPLICATIONS
BENCHMARKS
Most
practical
approach?
WORKLOAD REPLAY AS
TRUSTED EXPERIENCE
For systems with full API access
(usually JSON via HTTP)
Logging and
autocapture
Replay with
“sleeps”
Splitting workload
patterns
(user types)
Data scaling?
Workload emulators
JMeter LoadRunner …
DB-side tools
Oracle Real Application Testing
(Database Replay)
MS SQL Server Distributed Replay
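Replay with “sleeps” means re-issuing captured requests while preserving the gaps between them. The sketch below assumes a captured log of `(timestamp_s, request)` pairs and a caller-supplied `send` function; both are hypothetical and stand in for whatever the capture tool and target API actually provide.

```python
import time


def replay(events, send, speedup=1.0, sleep=time.sleep):
    """Re-issue captured requests, sleeping for the recorded gap between them.

    events  -- iterable of (timestamp_s, request) in capture order
    send    -- callable that issues one request against the system under test
    speedup -- values above 1.0 compress the recorded gaps
    """
    previous = None
    sent = 0
    for timestamp, request in events:
        if previous is not None:
            gap = (timestamp - previous) / speedup
            if gap > 0:
                sleep(gap)
        send(request)
        previous = timestamp
        sent += 1
    return sent
```

This single-threaded version ignores the time `send` itself takes and does not separate user types; a more faithful replay would run one such loop per captured session.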
BOOTSTRAPPING TRICKS
How to
rollout
data?
Cloning
Repeated
with random
shift?
Mixing with
real data
(open data)
Impact on
analytics
Predictable
query results
Low
selectivity
Impact on
OLTP
Keys,
indexes…
PACKAGED APPLICATIONS
1C benchmark series
Metrics: count of concurrent users with acceptable response time
Microsoft Dynamics AX Application Benchmark Toolkit
Oracle E-Business Suite Standard Benchmarks
Order-to-Cash OLTP Payroll …
SAPS [SAP Application Performance Standard] with SD module
QUESTIONS FOR
DISCUSSION
•Could they become as popular as TPC-C and TPC-H?
•Megaconfigurations problem
TPC-E and TPC-DS
•Running TPC-B is simple
•Running others is tricky or not fully compliant
Pocket tool
•Acceptance, adoption, standardization, interpretation
•Other metrics of such kind?
SQL IOPS & Bandwidth
•Atomic (YCSB-type)
•Graph (LinkBench-type)
Standardization and generalization of benchmarks for new workloads
•New types of applications: portals, groupware, …
•New database architectures: sharding, in-memory databases, in-memory data grids
Future of benchmarks on real up-to-date workload
THANK YOU! mailto:anikolaenko@ibs.ru
mailto:anikolaenko@acm.org
Cover image: Paul Hudson, CC-SA
1 of 72

Recommended

Haskell Symposium 2010: An LLVM backend for GHC by
Haskell Symposium 2010: An LLVM backend for GHCHaskell Symposium 2010: An LLVM backend for GHC
Haskell Symposium 2010: An LLVM backend for GHCdterei
980 views24 slides
eBPF/XDP by
eBPF/XDP eBPF/XDP
eBPF/XDP Netronome
3.3K views41 slides
Q2.12: Debugging with GDB by
Q2.12: Debugging with GDBQ2.12: Debugging with GDB
Q2.12: Debugging with GDBLinaro
10.8K views44 slides
Onnc intro by
Onnc introOnnc intro
Onnc introLuba Tang
1K views33 slides
eBPF Debugging Infrastructure - Current Techniques by
eBPF Debugging Infrastructure - Current TechniqueseBPF Debugging Infrastructure - Current Techniques
eBPF Debugging Infrastructure - Current TechniquesNetronome
508 views13 slides
ONNC - 0.9.1 release by
ONNC - 0.9.1 releaseONNC - 0.9.1 release
ONNC - 0.9.1 releaseLuba Tang
669 views15 slides

More Related Content

What's hot

Cray XT Porting, Scaling, and Optimization Best Practices by
Cray XT Porting, Scaling, and Optimization Best PracticesCray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best PracticesJeff Larkin
782 views122 slides
Target updated track f by
Target updated   track fTarget updated   track f
Target updated track fAlona Gradman
281 views24 slides
Chip Ex2010 Gert Goossens by
Chip Ex2010 Gert GoossensChip Ex2010 Gert Goossens
Chip Ex2010 Gert GoossensAlona Gradman
454 views24 slides
FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine by
FARIS: Fast and Memory-efficient URL Filter by Domain Specific MachineFARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine
FARIS: Fast and Memory-efficient URL Filter by Domain Specific MachineYuuki Takano
441 views17 slides
Berkeley Packet Filters by
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet FiltersKernel TLV
6.4K views33 slides
Arm tools and roadmap for SVE compiler support by
Arm tools and roadmap for SVE compiler supportArm tools and roadmap for SVE compiler support
Arm tools and roadmap for SVE compiler supportLinaro
4.7K views24 slides

What's hot(7)

Cray XT Porting, Scaling, and Optimization Best Practices by Jeff Larkin
Cray XT Porting, Scaling, and Optimization Best PracticesCray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best Practices
Jeff Larkin782 views
Chip Ex2010 Gert Goossens by Alona Gradman
Chip Ex2010 Gert GoossensChip Ex2010 Gert Goossens
Chip Ex2010 Gert Goossens
Alona Gradman454 views
FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine by Yuuki Takano
FARIS: Fast and Memory-efficient URL Filter by Domain Specific MachineFARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine
FARIS: Fast and Memory-efficient URL Filter by Domain Specific Machine
Yuuki Takano441 views
Berkeley Packet Filters by Kernel TLV
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
Kernel TLV6.4K views
Arm tools and roadmap for SVE compiler support by Linaro
Arm tools and roadmap for SVE compiler supportArm tools and roadmap for SVE compiler support
Arm tools and roadmap for SVE compiler support
Linaro4.7K views
TIP1 - Overview of C/C++ Debugging/Tracing/Profiling Tools by Xiaozhe Wang
TIP1 - Overview of C/C++ Debugging/Tracing/Profiling ToolsTIP1 - Overview of C/C++ Debugging/Tracing/Profiling Tools
TIP1 - Overview of C/C++ Debugging/Tracing/Profiling Tools
Xiaozhe Wang11.4K views

Similar to DBMS benchmarking overview and trends for Moscow ACM SIGMOD Chapter

TPC_Microsoft.ppt by
TPC_Microsoft.pptTPC_Microsoft.ppt
TPC_Microsoft.pptAsimTaj2
8 views62 slides
22 levine by
22 levine22 levine
22 levineashish61_scs
587 views62 slides
System X - About Benchmarks by
System X - About BenchmarksSystem X - About Benchmarks
System X - About BenchmarksIBM India Smarter Computing
716 views19 slides
Introducing the TPCx-HS Benchmark for Big Data by
Introducing the TPCx-HS Benchmark for Big DataIntroducing the TPCx-HS Benchmark for Big Data
Introducing the TPCx-HS Benchmark for Big Datainside-BigData.com
3.8K views23 slides
Accordion - VLDB 2014 by
Accordion - VLDB 2014Accordion - VLDB 2014
Accordion - VLDB 2014Marco Serafini
845 views19 slides
TPC-H Column Store and MPP systems by
TPC-H Column Store and MPP systemsTPC-H Column Store and MPP systems
TPC-H Column Store and MPP systemsMostafa Mokhtar
3.4K views39 slides

Similar to DBMS benchmarking overview and trends for Moscow ACM SIGMOD Chapter(20)

TPC_Microsoft.ppt by AsimTaj2
TPC_Microsoft.pptTPC_Microsoft.ppt
TPC_Microsoft.ppt
AsimTaj28 views
Introducing the TPCx-HS Benchmark for Big Data by inside-BigData.com
Introducing the TPCx-HS Benchmark for Big DataIntroducing the TPCx-HS Benchmark for Big Data
Introducing the TPCx-HS Benchmark for Big Data
inside-BigData.com3.8K views
TPC-H Column Store and MPP systems by Mostafa Mokhtar
TPC-H Column Store and MPP systemsTPC-H Column Store and MPP systems
TPC-H Column Store and MPP systems
Mostafa Mokhtar3.4K views
Datacenter App July09 Bashar by Santosh Pania
Datacenter App   July09   BasharDatacenter App   July09   Bashar
Datacenter App July09 Bashar
Santosh Pania970 views
Keynote IDEAS 2013 - Peter Boncz by LDBC council
Keynote IDEAS 2013 - Peter BonczKeynote IDEAS 2013 - Peter Boncz
Keynote IDEAS 2013 - Peter Boncz
LDBC council84 views
Keynote IDEAS2013 - Peter Boncz by Ioan Toma
Keynote IDEAS2013 - Peter BonczKeynote IDEAS2013 - Peter Boncz
Keynote IDEAS2013 - Peter Boncz
Ioan Toma853 views
Automation of MultiDimensional DB Design (poster) by Rim Moussa
Automation of MultiDimensional DB Design (poster)Automation of MultiDimensional DB Design (poster)
Automation of MultiDimensional DB Design (poster)
Rim Moussa600 views
Raghu nambiar:industry standard benchmarks by hdhappy001
Raghu nambiar:industry standard benchmarksRaghu nambiar:industry standard benchmarks
Raghu nambiar:industry standard benchmarks
hdhappy0012.1K views
Principles in Data Stream Processing | Matthias J Sax, Confluent by HostedbyConfluent
Principles in Data Stream Processing | Matthias J Sax, ConfluentPrinciples in Data Stream Processing | Matthias J Sax, Confluent
Principles in Data Stream Processing | Matthias J Sax, Confluent
HostedbyConfluent628 views
Two way data sync between legacy and your brand new micro-service architecture by bleporini
 Two way data sync between legacy and your brand new micro-service architecture Two way data sync between legacy and your brand new micro-service architecture
Two way data sync between legacy and your brand new micro-service architecture
bleporini3.3K views
lec25-final.ppt by zahixdd
lec25-final.pptlec25-final.ppt
lec25-final.ppt
zahixdd1 view
Computer Organozation by Aabha Tiwari
Computer OrganozationComputer Organozation
Computer Organozation
Aabha Tiwari573 views
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian... by DataStax
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
An Effective Approach to Migrate Cassandra Thrift to CQL (Yabin Meng, Pythian...
DataStax3.5K views

More from Andrei Nikolaenko

Байтоадресуемая энергонезависимая память и СУБД by
Байтоадресуемая энергонезависимая память и СУБДБайтоадресуемая энергонезависимая память и СУБД
Байтоадресуемая энергонезависимая память и СУБДAndrei Nikolaenko
177 views46 slides
Машины баз данных: концентрированное обозрение by
Машины баз данных: концентрированное обозрениеМашины баз данных: концентрированное обозрение
Машины баз данных: концентрированное обозрениеAndrei Nikolaenko
157 views63 slides
DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation) by
DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation)DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation)
DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation)Andrei Nikolaenko
196 views62 slides
Нереляционный SQL by
Нереляционный SQLНереляционный SQL
Нереляционный SQLAndrei Nikolaenko
484 views57 slides
DBMS Benchmarks in a Nutshell by
DBMS Benchmarks in a Nutshell DBMS Benchmarks in a Nutshell
DBMS Benchmarks in a Nutshell Andrei Nikolaenko
307 views76 slides
Инструменты больших данных: от конкуренции — к интеграции by
Инструменты больших данных: от конкуренции — к интеграцииИнструменты больших данных: от конкуренции — к интеграции
Инструменты больших данных: от конкуренции — к интеграцииAndrei Nikolaenko
465 views50 slides

More from Andrei Nikolaenko(16)

Байтоадресуемая энергонезависимая память и СУБД by Andrei Nikolaenko
Байтоадресуемая энергонезависимая память и СУБДБайтоадресуемая энергонезависимая память и СУБД
Байтоадресуемая энергонезависимая память и СУБД
Andrei Nikolaenko177 views
Машины баз данных: концентрированное обозрение by Andrei Nikolaenko
Машины баз данных: концентрированное обозрениеМашины баз данных: концентрированное обозрение
Машины баз данных: концентрированное обозрение
Andrei Nikolaenko157 views
DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation) by Andrei Nikolaenko
DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation)DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation)
DB-Technologies-2017 Keynote (Strategy and tactics for db evaluation)
Andrei Nikolaenko196 views
Инструменты больших данных: от конкуренции — к интеграции by Andrei Nikolaenko
Инструменты больших данных: от конкуренции — к интеграцииИнструменты больших данных: от конкуренции — к интеграции
Инструменты больших данных: от конкуренции — к интеграции
Andrei Nikolaenko465 views
Машины баз данных на Web-scale IT — 2017 (РИТ++) by Andrei Nikolaenko
Машины баз данных на Web-scale IT — 2017 (РИТ++)Машины баз данных на Web-scale IT — 2017 (РИТ++)
Машины баз данных на Web-scale IT — 2017 (РИТ++)
Andrei Nikolaenko365 views
Эталонные тесты производительнсоти СУБД: обзор и тенденции by Andrei Nikolaenko
Эталонные тесты производительнсоти СУБД: обзор и тенденцииЭталонные тесты производительнсоти СУБД: обзор и тенденции
Эталонные тесты производительнсоти СУБД: обзор и тенденции
Andrei Nikolaenko345 views
Note on hyperconvered infrastructure on CIPR by Andrei Nikolaenko
Note on hyperconvered infrastructure on CIPR Note on hyperconvered infrastructure on CIPR
Note on hyperconvered infrastructure on CIPR
Andrei Nikolaenko141 views
SQL+NoSQL: On the Way to Converged Data Management Platforms by Andrei Nikolaenko
SQL+NoSQL: On the Way to Converged Data Management PlatformsSQL+NoSQL: On the Way to Converged Data Management Platforms
SQL+NoSQL: On the Way to Converged Data Management Platforms
Andrei Nikolaenko221 views
NoSQL: issues and progress, current status and prospects by Andrei Nikolaenko
NoSQL: issues and progress, current status and prospectsNoSQL: issues and progress, current status and prospects
NoSQL: issues and progress, current status and prospects
Andrei Nikolaenko593 views
Cloud Databases, ACM SIGMOD Moscow Workshop, November, 2013 by Andrei Nikolaenko
Cloud Databases, ACM SIGMOD Moscow Workshop, November, 2013Cloud Databases, ACM SIGMOD Moscow Workshop, November, 2013
Cloud Databases, ACM SIGMOD Moscow Workshop, November, 2013
Andrei Nikolaenko326 views
Rapid Deployment of Hadoop Development Environments by Andrei Nikolaenko
Rapid Deployment of Hadoop Development EnvironmentsRapid Deployment of Hadoop Development Environments
Rapid Deployment of Hadoop Development Environments
Andrei Nikolaenko834 views
Introductory Keynote at Hadoop Workshop by Ospcon (2014) by Andrei Nikolaenko
Introductory Keynote at Hadoop Workshop by Ospcon (2014)Introductory Keynote at Hadoop Workshop by Ospcon (2014)
Introductory Keynote at Hadoop Workshop by Ospcon (2014)
Andrei Nikolaenko737 views

Recently uploaded

ict act 1.pptx by
ict act 1.pptxict act 1.pptx
ict act 1.pptxsanjaniarun08
13 views17 slides
Headless JS UG Presentation.pptx by
Headless JS UG Presentation.pptxHeadless JS UG Presentation.pptx
Headless JS UG Presentation.pptxJack Spektor
7 views24 slides
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI... by
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...Marc Müller
36 views83 slides
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -... by
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...Deltares
6 views15 slides
DSD-INT 2023 Thermobaricity in 3D DCSM-FM - taking pressure into account in t... by
DSD-INT 2023 Thermobaricity in 3D DCSM-FM - taking pressure into account in t...DSD-INT 2023 Thermobaricity in 3D DCSM-FM - taking pressure into account in t...
DSD-INT 2023 Thermobaricity in 3D DCSM-FM - taking pressure into account in t...Deltares
9 views26 slides
SUGCON ANZ Presentation V2.1 Final.pptx by
SUGCON ANZ Presentation V2.1 Final.pptxSUGCON ANZ Presentation V2.1 Final.pptx
SUGCON ANZ Presentation V2.1 Final.pptxJack Spektor
22 views34 slides

Recently uploaded(20)

Headless JS UG Presentation.pptx by Jack Spektor
Headless JS UG Presentation.pptxHeadless JS UG Presentation.pptx
Headless JS UG Presentation.pptx
Jack Spektor7 views
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI... by Marc Müller
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...
Dev-Cloud Conference 2023 - Continuous Deployment Showdown: Traditionelles CI...
Marc Müller36 views
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -... by Deltares
DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...DSD-INT 2023 Simulating a falling apron in Delft3D 4 - Engineering Practice -...

DBMS benchmarking overview and trends for Moscow ACM SIGMOD Chapter

  • 2. TPC-C RACES 296 results Symfoware, Oracle DB 7–8, DB2/400 and UDB, Informix, MS SQL Server, Sybase ASE …
  • 4. AGENDA What happened with TPC-C? •TPC: early history •TPC-A/B, -C, -H, -E •TPC: obsolete and new •Publication issues The New Wave of benchmarks •MapReduce (Hadoop) •Graph benchmarks •Meta-universal, atomic “Pocket tools” to run Input/Output Outcomes
  • 5. TP1 Mid 1970s, IBM benchmark Bank transaction processing Idefix: 100 tps (1973, “bank with 1,000 branches and 10,000 tellers”) Batch mode without networking, no delay for teller response Early 1980s: fantastic victorious reports with 10 ktps $$bln market… … focus on fast-growing OLTP …however, customers could not yet achieve 1 ktps …
  • 6. DAVID DEWITT: WISCONSIN BENCHMARK Alternative, more strictly defined benchmark Aimed to end the “benchmarking wars”, but inflamed them even further! Weak results of Oracle DB led to the ‘DeWitt clause’: the DBMS license prohibits publishing benchmarks
  • 7. JIM GRAY: DEBITCREDIT Counterweight to the Wisconsin benchmark, based on the TP1 subject area Requires total system cost, including equipment, licenses, 5 years of support Specified as textual functional requirements, without code requirements or code examples Introduced scaling rules by users and table sizes Postulated a response time limit: 95% of transactions should complete in 1 s
  • 9. TPC.ORG Benchmarking wars continued: how to validate results? An independent, non-profit organization was needed… 1988: Transaction Processing Performance Council Omri Serlin and 8 [consonant] vendors Members (2017): Actian Cisco Cloudera Dell DataCore Fujitsu HPE Hitachi Huawei IBM Inspur Intel Lenovo Microsoft Oracle Pivotal Red Hat SAP Teradata VMware
  • 10. TPC-A AND TPC-B: PROCESS READ 100 bytes from TTY (AID, TID, BID, DELTA) BEGIN TRANSACTION UPDATE ACCOUNT WHERE ACCOUNT_ID = AID: READ ACCOUNT_BALANCE FROM ACCOUNT SET ACCOUNT_BALANCE = ACCOUNT_BALANCE + DELTA WRITE ACCOUNT_BALANCE TO ACCOUNT WRITE TO HISTORY: AID, TID, BID, DELTA, TIME_STAMP UPDATE TELLER WHERE TELLER_ID = TID: SET TELLER_BALANCE = TELLER_BALANCE + DELTA WRITE TELLER_BALANCE TO TELLER UPDATE BRANCH WHERE BRANCH_ID = BID: SET BRANCH_BALANCE = BRANCH_BALANCE + DELTA WRITE BRANCH_BALANCE TO BRANCH COMMIT TRANSACTION WRITE 200 bytes to TTY (AID, TID, BID, DELTA) Based on TP1 – retail banking transactions
  • 11. TPC-A AND TPC-B: MODEL BRANCH: B; ACCOUNT: B*100K (100K per branch); HISTORY: B*2.6M; TELLER: B*10 (10 per branch) 10 s cycle for each terminal 1 transaction per second for each branch Response time for 90% of transactions less than 2 s Average transactions per second over 15 min
  • 12. TPC-A AND TPC-B: DIFFERENCES TPC-A: terminals, user response delay TPC-B: server-only benchmark, reduced history (30 days)
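As a rough illustration of how the two measurement rules on this slide could be checked, the sketch below reads the response bound as an upper limit (90% of transactions complete in under 2 s, which is an assumption about the intended meaning) and averages throughput over a 15-minute window. The function names and the nearest-rank percentile method are illustrative choices, not taken from the TPC-A text.

```python
def percentile_90(response_times):
    # Nearest-rank 90th percentile; integer arithmetic avoids float rounding.
    ordered = sorted(response_times)
    rank = -(-9 * len(ordered) // 10)  # ceil(0.9 * n)
    return ordered[rank - 1]


def meets_response_bound(response_times, limit_s=2.0):
    # True if 90% of the measured transactions finished in under limit_s seconds.
    return percentile_90(response_times) < limit_s


def average_tps(completed_transactions, interval_s=15 * 60):
    # Average throughput over the measurement interval (15 minutes by default).
    return completed_transactions / interval_s


samples = [0.5] * 9 + [3.0]
print(meets_response_bound(samples), average_tps(9000))
```

A single slow transaction in ten does not break the bound, but two in ten do, which is the point of stating the limit as a percentile rather than an average.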
  • 13. TPC-A AND TPC-B: CRITICISM 1995: considered unreliable Last result (DEC, 1994) 3700 tpsA 4800 $/tpsA First result (HP, 1990) 38.2 tpsA 29200 $/tpsA TP1 legacy Too simple to avoid “adjustments” Implausible dispersion of results
  • 14. TPC-C: COMPLICATION Response thresholds for 90% of transactions Less than 5s for interactive operation Less than 20s for batch More variety… 9 tables Inserts, updates, deletes, operation cancellations Access by primary and secondary keys 5 transaction types NEW-ORDER (45%) •new customer order PAYMENT (43%) •payment event •customer balance refresh DELIVERY (4%) •delivery order •(batch) ORDER-STATUS (4%) •checking the status of the last customer order STOCK-LEVEL (4%) •checking the stock level at the warehouse
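A load driver has to pick the next transaction type according to the mix above. The sketch below uses a plain weighted random choice. It assumes the five percentages map to the types as NEW-ORDER 45, PAYMENT 43, and 4 each for the rest, and it is not the selection procedure prescribed by the TPC-C specification. `MIX` and `next_transaction` are hypothetical names.

```python
import random
from collections import Counter

# Assumed mapping of the slide's percentages to transaction types.
MIX = {
    "NEW-ORDER": 45,
    "PAYMENT": 43,
    "DELIVERY": 4,
    "ORDER-STATUS": 4,
    "STOCK-LEVEL": 4,
}


def next_transaction(rng):
    # Weighted random pick of one transaction type.
    return rng.choices(list(MIX), weights=list(MIX.values()), k=1)[0]


rng = random.Random(42)  # fixed seed so repeated runs are comparable
counts = Counter(next_transaction(rng) for _ in range(10_000))
for name in MIX:
    print(f"{name:13s} {counts[name] / 100:.1f}%")
```

Only NEW-ORDER transactions count toward tpmC, so the mix determines how much supporting work each reported transaction carries.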
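  • 15. TPC-C DATA MODEL WAREHOUSE: W; DISTRICT: W*10 (10 per warehouse); CUSTOMER: W*30K (3K per district); HISTORY: W*30K+ (1+ per customer); ITEM: 100K (fixed); STOCK: W*100K (100K per warehouse, W per item); ORDER: W*30K+ (1+ per customer); ORDER-LINE: W*300K+ (10–15 per order); NEW-ORDER: W*5K (0–1 per order)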
  • 16. TPC-C: SCALING Each new warehouse → +10 districts, +30K customers, +100K stock rows (the item table stays fixed at 100K) Maximum 1.2 tpmC per terminal 10 terminals per warehouse Scaling factor: warehouse (W)
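The scaling rules translate into simple arithmetic. The sketch below takes the per-warehouse row counts from the data model slide (W*10 districts, W*30K customers, W*100K stock rows, a fixed 100K items) and the limits stated here (10 terminals per warehouse, at most 1.2 tpmC per terminal, so 12 tpmC per warehouse). The function names are hypothetical.

```python
def tpcc_scale(warehouses):
    # Initial row counts and throughput ceiling for a given number of warehouses.
    return {
        "warehouse": warehouses,
        "district": 10 * warehouses,
        "customer": 30_000 * warehouses,
        "stock": 100_000 * warehouses,
        "item": 100_000,                 # fixed, does not scale with W
        "terminals": 10 * warehouses,
        "max_tpmC": 12 * warehouses,     # 1.2 tpmC x 10 terminals per warehouse
    }


def warehouses_needed(target_tpmc):
    # Smallest W whose ceiling reaches the target (integer ceiling division).
    return -(-target_tpmc // 12)


print(tpcc_scale(100))
print(warehouses_needed(1_000_000))
```

Because throughput is capped per warehouse, a higher tpmC claim forces a proportionally larger database, which is what keeps results from being produced on a tiny cached data set.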
  • 17. TPC-C: CLUSTERED AND NON-CLUSTERED Clustered: >1 node; shared nothing (fully federated), sharded, shared storage (RAC) Non-clustered: 1 node
  • 18. TPC-C: METRICS tpmC: transactions per minute $ / tpmC: transaction cost (hardware acquisition cost; software and services cost for 3 years) W / ktpmC: energy consumption per 1000 transactions per minute
  • 19. TPC-C INTERPRETATION By Allan Packer (Allan N. Packer. Configuring and Tuning Databases on the Solaris Platform. Prentice Hall, 2002) •…not only order entry: tpmC × 2 •…if a transaction monitor is not used: tpmC / 2 •…if an Oracle Forms-like client is used: tpmC / 3 •…if lightweight client forms are used (like curses-based): tpmC × 2 / 3 •…if SQL has not been tuned: tpmC / 2 •…if there are heavy reports and batches: tpmC / 2
  • 20. COULD WE CHEAT TPC-C? mount -t tmpfs -o size=2048g tmpfs /u01/tablespaces CREATE UNLOGGED TABLE … _ALLOW_RESETLOGS_CORRUPTION = TRUE _IN_MEMORY_UNDO = TRUE _DB_BLOCK_HASH_LATCHES = 32768 …
  • 21. RESPONSE FROM COUNCIL All ACID checks are strictly defined in the specification Single node reboot failure Synchronous commit on two nodes Redo logs and recovery Reboot and recovery Single media failure protection Synchronous commit on two separately powered media Write-ahead logs on separately powered media Commit = written to durable media
  • 22. TPC-C: CRITICISM OF 1990S Even wholesale providers work somewhat differently! Order: some unsuccessful wildcard searches Printing report after each entry (probably, 3 times) Balance refresh for each operation is impossible in a heavy load environment → insert + batch calc Trivial logic No declarative constraints No trigger logic Non-typical loads What about reporting? Decision support systems?
  • 23. TPC: OBSOLETED TPC-D 1995–1999 First attempt at an OLAP benchmark TPC-R 2001–2005 Reporting TPC-W 2001–2005 Online web commerce
  • 24. TPC-H 1999 : OLAP vs OLTP “Ad-hoc decision support” Instead of TPC-D (considered irrelevant) “Weight categories” 100 GB 300 GB 1 TB 3 TB 10 TB 30 TB 100 TB Parallel load 22 kinds of complex queries 2 kinds of DWH refresh
  • 25. TPC-H: MODEL 1 SF ~ 1 GB Declarative constraints Not star schema
  • 26. TPC-H Q1 SELECT l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice*(1-l_discount)) as sum_disc_price, sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order FROM lineitem WHERE l_shipdate <= date '1998-12-01' - interval Δ day (3) GROUP BY l_returnflag, l_linestatus ORDER BY l_returnflag, l_linestatus Δ = random [60…120] “Functional Query Definition”
  • 27. TPC-H: RESULTS, 2017 Tricky to apply to non-relational DBMSs Variants applied to MOLAP (MDX) Reports of application to Apache Hive Exasol MS SQL Server Oracle Database Actian Vector 1–10 results for each “weight category”
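To see the query shape and the Δ substitution in action, here is a small sketch that runs Q1 against an in-memory SQLite table holding three made-up rows. The date arithmetic is rewritten with SQLite's `date()` function because SQLite has no `interval` syntax. The toy data, the reduced `lineitem` column set, and the helper name `run_q1` are assumptions for illustration only.

```python
import random
import sqlite3

Q1 = """
SELECT l_returnflag, l_linestatus,
       sum(l_quantity)                                       AS sum_qty,
       sum(l_extendedprice)                                  AS sum_base_price,
       sum(l_extendedprice * (1 - l_discount))               AS sum_disc_price,
       sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
       avg(l_quantity)                                       AS avg_qty,
       avg(l_extendedprice)                                  AS avg_price,
       avg(l_discount)                                       AS avg_disc,
       count(*)                                              AS count_order
FROM lineitem
WHERE l_shipdate <= date('1998-12-01', ?)
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus
"""


def run_q1(conn, delta_days):
    # The modifier "-N days" moves the cutoff N days before 1998-12-01.
    return conn.execute(Q1, (f"-{delta_days} days",)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE lineitem (
    l_returnflag TEXT, l_linestatus TEXT, l_quantity REAL,
    l_extendedprice REAL, l_discount REAL, l_tax REAL, l_shipdate TEXT)""")
conn.executemany("INSERT INTO lineitem VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("A", "F", 10, 1000.0, 0.0, 0.0, "1995-01-15"),
    ("A", "F", 20, 2000.0, 0.0, 0.0, "1995-02-20"),
    ("N", "O", 5, 500.0, 0.0, 0.0, "1998-11-30"),  # after any cutoff in range
])

delta = random.randint(60, 120)  # Δ as defined on the slide
for row in run_q1(conn, delta):
    print(row)
```

Q1 scans almost the whole `lineitem` table whatever Δ is chosen, so it mostly measures raw scan and aggregation speed.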
  • 28. TPC-E: “PRAGMATIC OLTP” Still OLTP, but more “hybrid” Declarative constraints No transaction monitors More reads More kinds of load
  • 30. TPC-E: MODEL Image: ©Transaction Processing Council, 2009
  • 31. 78 results All non-clustered, all on MS SQL Server for Windows Server x64
  • 32. 11 KTPSE, $144/TPSE (1.6 MILLION) Image: ©Lenovo, © TPC, 2015
  • 34. BRAND NEW: TPC-X AND TPCX-X TPC-DI ETL workloads TPC_DI_RPS 0 results TPC-DS Decision support systems “on Big Data solutions, such as RDBMS as well as Hadoop/Spark based systems” QphDS@Size 0 results TPCx-BB Express benchmark for “Hadoop- based Big Data Systems” adapted from BigBench BBpm@Size 0 results TPCx-HS Express benchmark for HDFS- compatible systems, adapted from TeraSort HSph@Size From 1 to 4 results in different “weight categories”
  • 35. PUBLISHING? Publishing TPC benchmarks is prohibited, except in the following cases: Publishing on TPC.org (audit required, from sizing.com) Academic or research paper, not applicable for marketing purposes With an explicit clause that the outcome is not comparable with TPC.org results With permission of TPC.org
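The three figures in this slide title are tied together by the price/performance definition: total system price equals throughput times price per unit of throughput. A quick check with the rounded numbers from the slide (the function name is illustrative):

```python
def total_system_price(tps_e, usd_per_tps_e):
    # Price/performance is total price divided by throughput, so invert it.
    return tps_e * usd_per_tps_e


price = total_system_price(11_000, 144)
print(f"{price:,} USD, about {price / 1e6:.1f} million")
```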
  • 36. DEWITT CLAUSE IN 2017? MSFT EULA •“You may not disclose the results of any benchmark test … without Microsoft’s prior written approval” OTN Lic. •“You may not disclose results of any Program benchmark tests without Oracle’s prior consent” IBM IPLA •“Licensee may disclose the results of any benchmark test of the Program or its subcomponents to any third party provided that Licensee, if … A) … B) … C)…”
  • 37. NEW BENCHMARKS FOR NEW WORKLOADS 2010s 2000s 1990s 1980s OLTP ROLAP ROLAP In-memory OLAP TPC-DS ROLAP TPC-H Graph queries LinkBench MapReduce HiBench OLTP OLTP[E] OLTP[E] TPC-E OLTP Atomic access YCSB TPCx-HS TPCx-BB
  • 38. PRINCIPAL BENCHMARKS OF “NEW WAVE” TeraSort Benchmark • Adopted in TPCx-HS BigBench • Adopted in TPCx-BB Intel HiBench • Map-Reduce series Yahoo! Cloud Serving Benchmark • Group of NoSQL reference runs LinkBench • Graph-like workload on RDBMS and HBase
  • 40. YAHOO! CLOUD SERVING BENCHMARK New tool for “benchmark marketing” Used by researchers for wide comparisons (V. Abramova et al. Experimental Evaluation of NoSQL Databases // IJDMS Vol. 6, No. 3, 2014) Cassandra HBase Elasticsearch MongoDB Oracle NoSQL OrientDB Redis Scalaris Tarantool Voldemort Bombardment from one load station Atomic operation instead of transaction (probably, read a few records)
  • 42. LINKBENCH Facebook workload •Early graph benchmarks were graph-analysis oriented •Real workload of the Internet giant •Transactions (MVCC) Methodical •Statistical laws for data generation •Measures: average response time for each type of workload •Avg from 99th percentile •Documents per second for writes •Queries per second for reads MySQL (InnoDB vs TokuDB) HBase MongoDB / TokuMX OrientDB The Question of Facebook
  • 44. PGBENCH TPC-B Standard part of PostgreSQL distributions One simple command to run De facto standard for PostgreSQL internal investigations, such as: XFS or ext4? Tables on SSD, indexes on HDD? Block size: 4K or 8K?
  • 45. SYSBENCH Widely used for internal comparisons in the MySQL and MariaDB communities and those of their forks Sui generis tests, not similar to standard benchmarks MySQL utility (by Alexey Kopytov) fileio cpu memory threads mutex oltp
  • 46. TPCE-MYSQL TPC-E loader for MySQL by Percona
  • 47. HAMMERDB Free GUI program (Tcl/Tk-based) for TPC-C (?) and TPC-H (?) supporting a range of DBMSs Oracle Database Microsoft SQL Server IBM DB2 TimesTen MySQL MariaDB PostgreSQL Postgres Plus AS Greenplum DB Redis Amazon Aurora (MySQL) Amazon Redshift (ParAccel) Trafodion SQL on Hadoop
  • 49. HAMMERDB: TPC, BUT NOT QUITE… Modern benchmarks have not been implemented: TPC-E, TPC-DS TPC-H: single load station TPC-C: full environment not emulated, no transaction monitor software; load station acts as a single transaction monitor
  • 50. HAMMERDB: POPULARITY Used by hardware and software vendors Indirectly, in form of a “blog of our tech guy” Hub with benchmark results Section “Performance Data”
  • 51. SWINGBENCH TPC-C-like workload for Oracle Database and TimesTen + Idiosyncratic OLTP benchmark with a lot of PL/SQL Oracle DB-specific tools for monitoring and analysis (AWR, etc.) Coordinator support GUI and command line Only internal comparability Non-representative in RAC environments
  • 52. DELL BENCHMARK FACTORY FOR DATABASES Oracle Database MySQL MS SQL Server SQLite SQL Anywhere Commercial tool (Quest Software legacy) TPC-C TPC-D TPC-E TPC-H AS3AP Supports a number of load stations (Windows)
  • 53. OSDLDBT.SOURCEFORGE.NET •DBT-1: TPC-W •DBT-2: TPC-C •DBT-3: TPC-H •DBT-4: TPC-App •DBT-5: TPC-E While the inspiration for these workloads are the TPC-<x…>, workloads are entirely different and results obtained from them should not and can not be compared to TPC results. The use of any supplied results of these tests for commercial purposes is expressly prohibited. MySQL PostgreSQL …extensible
  • 54. OLTPBENCH github.com/oltpbenchmark/oltpbench Java tool for command line Supports any JDBC-enabled RDBMS Special version for H-Store (VoltDB) TPC-C Wikipedia Synthetic Resource Stresser Twitter Epinions.com TATP AuctionMark SEATS YCSB JPAB (Hibernate) CH-benCHmark Voter SIBench (Snapshot Isolation) SmallBank LinkBench
  • 55. TPC TOOLS Tools by Transaction Processing Council C source files None exist for TPC-C: just a sample in the spec “Do it yourself”: all connectivity and other stuff
  • 56. TPC-* BY EXAMPLE Based on the “full disclosure reports” on TPC.org
  • 57. RPE2 SAP SD 2-Tier TPC-C TPC-H SPEC jbb2005 SPEC CPU2006 Supercomposite indicator by Gartner (Ideas) RPE2-ERP RPE2-Java RPE2-OLTP RPE2-Compute Intensive
  • 58. BENCHWARE Swiss knife by Manfred Drozd Peakmarks Benchware OraCPU PL/SQL op •[ops] PL/SQL alg •[ops] OraSRV In-memory SQL •[ms] •[dbps] •[tps] •[rps] OraSTO SeqIO •[GBps] •[iops] RandIO •[GBps] •[iops] OraOLTP OLTP Select •[rps] •[tps] OLTP Update •[rps] •[tps] OraLoad TransLoad •[rps] •[tps] BulkLoad •[rps] •[tps] OraAgg OraAgg & Rep •[rps] •[tps] Only for Oracle Database, only PL/SQL and SYS.V_$%
  • 59. DATABASE MACHINES? Pre-configured appliances for databases Should be precisely measured and benchmarked?
  • 60. TERADATA Latest Teradata publications with TPC-H: Licensed by «internal Qph» – tPerf [Traditional Performance]
  • 61. EXADATA tps, Qph – not published “Passport metrics” (X6-8) ∑ V × IOPS ≈ Const
  • 62. IBM PURE DATA FOR OPERATIONAL ANALYTICS Qph not published… “Passport metrics” about input/output
  • 63. “SQL IOPS” Measuring SQL IOPS for another DBMS? Statistical views (…IO_STATS…) IOPS from Oracle Database side Orion (Oracle IO Numbers) SLOB (recommended by EMC, Flashgrid) Benchware (?) DBMS_RESOURCE_MANAGER .CALIBRATE_IO
  • 65. METRICS ATOMIZATION: PRO Independent from predefined models and schemes • Representative not only for 3NF, snowflakes, stars Accepted by database appliance vendors • Included in passport metrics Pocket tools availability • Running from DB side Representative for a wide class of DBMS and DBMS-like systems • Could be of interest for NoSQL
  • 66. METRICS ATOMIZATION: …ET CONTRA Methodology Different results on the same environment from CALIBRATE_IO and Benchware Application meaning is not clear The same IO operations in different DBMSs produce different transaction counts No tool exists for most DBMSs Found just for Oracle DB, IBM DB2, MS SQL
  • 68. WORKLOAD REPLAY AS TRUSTED EXPERIENCE For systems with full API access (usually JSON via HTTP) Logging and autocapture Replay with “sleeps” Splitting workload patterns (user types) Data scaling? Workload emulators JMeter LoadRunner … DB-side tools Oracle Real Application Testing (Database Replay) MS SQL Server Distributed Replay
  • 69. BOOTSTRAPPING TRICKS How to roll out data? Cloning Repeated with random shift? Mixing with real data (open data) Impact on analytics Predictable query results Low selectivity Impact on OLTP Keys, indexes…
  • 70. PACKAGED APPLICATIONS 1C benchmark series Metrics: count of concurrent users with acceptable response time Microsoft Dynamics AX Application Benchmark Toolkit Oracle E-Business Suite Standard Benchmarks Order-to-Cash OLTP Payroll … SAPS [SAP Application Performance Standard] with SD module
  • 71. QUESTIONS FOR DISCUSSION TPC-E and TPC-DS •Could they become as popular as TPC-C and TPC-H? •Megaconfigurations problem Pocket tool •Running TPC-B is simple •Running others is tricky or not fully compliant SQL IOPS & Bandwidth •Acceptance, adoption, standardization, interpretation •Other metrics of such kind? Standardization and generalization of benchmarks for new workloads •Atomic (YCSB-type) •Graph (LinkBench-type) Future of benchmarks on real up-to-date workload •New types of applications: portals, groupware, … •New database architectures: sharding, in-memory databases, in-memory data grids
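The “replay with sleeps” idea can be sketched in a few lines: captured requests are re-issued in order, and the gaps between their original timestamps are reproduced by sleeping. The log format (a list of `(timestamp_seconds, request)` pairs), the `execute` callback, and the `speedup` parameter are all assumptions made for this illustration. Real tools such as those named on the slide handle concurrency and session state, which this sketch does not.

```python
import time


def replay(events, execute, speedup=1.0, sleep=time.sleep):
    """Re-issue captured requests, preserving their inter-arrival gaps.

    events  -- list of (timestamp_seconds, request), sorted by timestamp
    execute -- callable that sends one request to the system under test
    speedup -- values above 1.0 compress the gaps to raise the load
    sleep   -- injectable for testing; defaults to time.sleep
    """
    previous = None
    for timestamp, request in events:
        if previous is not None:
            gap = (timestamp - previous) / speedup
            if gap > 0:
                sleep(gap)
        execute(request)
        previous = timestamp


captured = [(0.0, "GET /orders/1"), (0.2, "POST /orders"), (0.5, "GET /orders/2")]
replay(captured, execute=print, speedup=10.0)
```

Sleeping on recorded gaps keeps the arrival pattern of the real workload, which is what makes a replay more trustworthy than a closed-loop generator that fires requests as fast as the system answers.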