Methods of NoSQL database systems benchmarking
Upcoming SlideShare
Loading in...5
×
 

Methods of NoSQL database systems benchmarking

on

  • 1,394 views

 

Statistics

Views

Total Views
1,394
Views on SlideShare
1,394
Embed Views
0

Actions

Likes
1
Downloads
15
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

Methods of NoSQL database systems benchmarking Methods of NoSQL database systems benchmarking Presentation Transcript

  • Methods of benchmarking NoSQL database systems Ilya Bakulin webmaster@kibab.com, kibab@FreeBSD.org SMS Traffic LVEE 2011Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 1 / 16
  • 1 Introduction2 YCSB benchmarking framework3 YCSB practical usage4 Results5 Where to find further information Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 2 / 16
  • Why benchmarking NoSQL is nessesary No guides / FAQs about performance are generally available, or are outdated NoSQL systems are actively developed Nobody wants to end up with crashed DB in production right before 2-week vacation Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 3 / 16
  • Why benchmarking NoSQL is complex RDBMS use SQL to provide access to data stored in them, while NOSQL systems don’t Each NoSQL uses different protocol (Thrift, Memcached-style, own protocols) Existing benchmarks require SQL to work with database under inspection. Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 4 / 16
  • What is YCSB? YCSB stands for Yahoo Cloud Serving Benchmark Developed by Yahoo! Research group Open Source project, hosted on GitHub (178 watchers, 42 forks) Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 5 / 16
  • Architecture Benchmark tool •! Java application Java application –! Many systems have Java APIs Shipped with ready-to-use adapters for several popular Opensource databases systems via HTTP/REST, JNI or some other solution –! Other Command-line parameters •! DB to use •! Target throughput •! Number of threads •! … Workload YCSB client Cloud DB parameter file DB client •! R/W mix •! Record size Client Workload threads •! Data set executor •! … Stats Extensible: define new workloads Extensible: plug in new clients 5 Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 6 / 16
  • More on DB interface Benchmark tool Simple operations: INSERT, UPDATE, REPLACE, DELETE, SCAN •! Does not use SQL Java application ... –! Many systems have Java APIs but SQL support is avaible through contributed JDBC driver ... –! Other systems configurations are possible other solution Even sharding via HTTP/REST, JNI or some Command-line parameters •! DB to use •! Target throughput •! Number of threads •! … Workload YCSB client Cloud DB parameter file DB client •! R/W mix •! Record size Client Workload threads •! Data set executor •! … Stats Extensible: define new workloads Extensible: plug in new clients 5 Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 7 / 16
  • More on DB interface Benchmark tool Simple operations: INSERT, UPDATE, REPLACE, DELETE, SCAN •! Does not use SQL Java application ... –! Many systems have Java APIs but SQL support is avaible through contributed JDBC driver ... –! Other systems configurations are possible other solution Even sharding via HTTP/REST, JNI or some Command-line parameters •! DB to use •! Target throughput •! Number of threads •! … Workload YCSB client Cloud DB parameter file DB client •! R/W mix •! Record size Client Workload threads •! Data set executor •! … Stats Extensible: define new workloads Extensible: plug in new clients 5 Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 7 / 16
  • More on DB interface Benchmark tool Simple operations: INSERT, UPDATE, REPLACE, DELETE, SCAN •! Does not use SQL Java application ... –! Many systems have Java APIs but SQL support is avaible through contributed JDBC driver ... –! Other systems configurations are possible other solution Even sharding via HTTP/REST, JNI or some Command-line parameters •! DB to use •! Target throughput •! Number of threads •! … Workload YCSB client Cloud DB parameter file DB client •! R/W mix •! Record size Client Workload threads •! Data set executor •! … Stats Extensible: define new workloads Extensible: plug in new clients 5 Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 7 / 16
  • More on workload d by replacing the value of one Uniform: +(,-./%&0 ther one randomly chosen field Specifies what DB 1)))2)))333 operations are used by order, starting at a randomly !"#$%&(")(%*$% 4 number of application records to scan is Zipfian: Also defines request +(,-./%&0 distribution of scan lengths is distributionoad. Thus, the scan() method number of records to scan.to specify It is possible Of y instead specify a scan interval 1)))2)))333 4 record sizeFebruary 15th). The number of !"#$%&(")(%*$% It’s possible to specify to control the size of these in- Latest: termine and number of records and specify meaningful +(,-./%&0 of the database calls, including tion 5.2.1.) operations 1)))2)))333 4 !"#$%&(")(%*$% t make many random choices Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 8 / 16
  • /* TODO: Remove this crap */ Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 9 / 16
  • SMS Traffic: workload constructionSMS service provider, several gateways, big clients (such as banks) 15% inserts, 65% updates, 15% reads Request distribution: latest SMS messages are the ”hottest” ones Evaluated Cassandra and sharded MySQL as DB storage for the next generation of SMS sending platform Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 10 / 16
  • Testing process 3 instances of DBMS system on one server (Core Quad Q9400, 4GB RAM, SATA-II HDD, FreeBSD 8.2-amd64) Cassandra 0.7.4 (1GB Java heap / instance) MySQL 5.1 + InnoDB engine (1GB InnoDB buffer pool size / instance) Client: separate machine, 1Gb/s connectionShould avoid swapping and disk IO saturation Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 11 / 16
  • Some resuts: Workload ”A”: 50% read / 50% write Workload A – Upd •! 50/50 Read/update Workload A - Read latency 70 80 60 70 Average read latency (ms) Update latency (ms) 60 50 50 40 40 30 30 20 20 10 10 0 0 0 5000 10000 15000 0 Throughput (ops/sec) Cassandra Hbase Sherpa MySQL Cassan Ilya Bakulin (SMS Traffic) Comment: Cassandra is optimized for writes, and achieves h NoSQL benchmarking July 2, 2011 12 / 16
  • oad A – Workload ”A”: 50% read / 50% write Some resuts: Update heavy y Workload A - Update latency 80 70 Update latency (ms) 60 50 40 30 20 10 0 15000 0 5000 10000 15000 ) Throughput (ops/sec) MySQL Cassandra Hbase Sherpa MySQL zed for writes, and achieves higher throughput and lower Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 13 / 16
  • Some resuts: Workload ”B”: 95% read / 5% write Workload B – Re •! 95/5 Read/update Workload B - Read latency 20 40 18 35 Average update latency (ms) Average read latency (ms) 16 30 14 12 25 10 20 8 15 6 10 4 5 2 0 0 0 2000 4000 6000 8000 10000 0 2 Throughput (operations/sec) Cassandra HBase Sherpa MySQL Cassan Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 14 / 16
  • oad resuts: Workload ”B”:heavy 5% write Some B – Read 95% read / Workload B - Update latency 40 35 Average update latency (ms) 30 25 20 15 10 5 0 8000 10000 0 2000 4000 6000 8000 10000ec) Throughput (operations/sec) MySQL Cassandra Hbase Sherpa MySQL Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 15 / 16
  • Links Yahoo Cloud Serving Benchmark: https://github.com/brianfrankcooper/YCSB google:// webmaster@kibab.com, kibab@FreeBSD.org Ilya Bakulin (SMS Traffic) NoSQL benchmarking July 2, 2011 16 / 16