Evaluating NoSQL Performance: Time for Benchmarking
 


Comments
  • This is a useless document. The MongoDB config has a shard key that will nail all writes to the primary of a single node.

    The _id field is an incrementing value, as it incorporates a timestamp. This will force all inserts to be stored in the last extent range, which resides on a single shard, so the shard configuration is invalid.

    The results you are reporting are ridiculous. Cassandra read performance should be terrible compared to MongoDB, and the write performance is way off for the reasons previously mentioned. Your high read latency is likely caused by the same problem as the write performance: range queries are all being served by one node because of the poor selection of shard key.
  • I would add one more criterion for evaluating the systems: the learning curve and complexity of operation. The effort required to operate these systems in production differs significantly from product to product.
  • Yes. Target means the number of operations over a certain time:
    Throughput = Operations / Seconds
  • @oussama2nd Correct, a specific throughput is achieved by manipulating the --target= option on the command line (or in the properties file)
  • Thank you VERY much for this. I spent a whole day looking for this kind of benchmark, and I imagine you spent much more time producing it.
Presentation Transcript

    • EVALUATING NOSQL PERFORMANCE: TIME FOR BENCHMARKING
      Sergey Bushik, Senior R&D Engineer, Altoros
      © ALTOROS Systems | CONFIDENTIAL
    • Agenda
      • Introduction
      • A few words about the benchmarking client
      • Workloads
      • Cluster setup
      • Meet the evaluated databases: a short stop on each
      • Diagrams and comments
      • Summary
      • Questions
    • Introduction
      • The list of databases is extensive (90+ RDBMS, 122+ NoSQL)
      • Core NoSQL system types:
        • Key-value stores
        • Document-oriented stores
        • Column-family stores
        • Graph databases
        • XML databases
        • Object database management systems
      • Different APIs and clients
      • Different performance
    • How do they compare?
      • The Yahoo! team offered a "standard" benchmark: the Yahoo! Cloud Serving Benchmark (YCSB)
        • Focus on the database
        • Focus on performance
      • The YCSB Client consists of 2 parts:
        • Workload generator
        • Workload scenarios
    • YCSB features
      • Open source
      • Extensible
      • Has connectors: Azure, BigTable, Cassandra, CouchDB, Dynomite, GemFire, HBase, Hypertable, Infinispan, MongoDB, PNUTS, Redis, a connector for sharded RDBMS (i.e. MySQL), Voldemort, GigaSpaces XAP
      • We developed a few connectors: Accumulo, Couchbase, Riak, and a connector for shared-nothing RDBMS (i.e. MySQL Cluster)
    • YCSB architecture
      [Architecture diagram]
    • Workloads
      • A workload is a combination of key-value settings:
        • Request distribution (uniform, zipfian)
        • Record size
        • Operation proportions (%)
      • Types of workload phases:
        • Load phase
        • Transaction phase
    • Workloads
      • Load phase workload: the working set is created
        • 100 million records
        • 1 KB records (10 fields of 100 bytes)
        • 120–140 GB total, or ≈30–40 GB per node
      • Transaction phase workloads:
        • Workload A (read/update ratio: 50/50, zipfian)
        • Workload B (read/update ratio: 95/5, zipfian)
        • Workload C (read ratio: 100, zipfian)
        • Workload D (read/update/insert ratio: 95/0/5, zipfian)
        • Workload E (read/update/insert ratio: 95/0/5, uniform)
        • Workload F (read/read-modify-write ratio: 50/50, zipfian)
        • Workload G (read/insert ratio: 10/90, zipfian)
    • Cluster setup
      • Amazon EC2 as the cluster infrastructure
      • No replication (replication factor = 1)
      • EBS volumes in a RAID0 (striping) array for the data storage directory
      • OS swapping is OFF
    • Cluster specification
      • YCSB Client: Amazon m1.large instance
        • 7.5 GB memory, 2 virtual cores, 8 GB instance storage
        • 64-bit Amazon Linux (CentOS binary compatible)
      • Database nodes: Amazon m1.xlarge instances × 4*
        • 15 GB memory, 4 virtual cores
        • 4 EBS 50 GB volumes in RAID0
        • 64-bit Amazon Linux
      * Extra nodes for masters, routers, etc.
    • Databases
      The list of databases:
      • Cassandra 1.0
      • HBase 0.92.0
      • MongoDB 2.0.5
      • MySQL Cluster 7.2.5
      • MySQL Sharded 5.5.2.3
      • Riak 1.1.1
      We tuned each system as well as we knew how. Let's see who is worth the prize.
    • Cassandra 1.0
      • Column-oriented
      • No single point of failure
      • Distributed
      • Elastically scalable
      • Tuneably consistent
      • Caching: key cache, off-heap/on-heap row cache, memory-mapped files
    • Cassandra 1.0
      Cassandra configuration:
      • Random partitioner; initial tokens spaced at 2^127 / 4
      • Memtable space: 4 GB
      • Commit log on a separate disk
      • Concurrent reads: 32 (8 × 4 cores)
      • Concurrent writes: 64 (16 × 4 disks)
      • Compression: Snappy
      • YCSB Client connects over Thrift
      JVM tuning:
      • MAX_HEAP_SIZE: 6G
      • HEAP_NEWSIZE: 400M
      • The rest of the 15 GB of RAM is left for OS caching
    • Cassandra 1.0
      CREATE KEYSPACE UserKeyspace
        WITH placement_strategy = SimpleStrategy
        AND strategy_options = {replication_factor:1}
        AND durable_writes = true;
      USE UserKeyspace;
      CREATE COLUMN FAMILY UserColumnFamily
        WITH comparator = UTF8Type
        AND key_validation_class = UTF8Type
        AND keys_cached = 100000000
        AND rows_cached = 1000000
        AND row_cache_provider = 'SerializingCacheProvider'
        AND compression_options = {sstable_compression:SnappyCompressor};
      * Make sure you use off-heap row caching! (It requires the JNA library.)
    • HBase 0.92.0
      • Column-oriented
      • Distributed
      • Built on top of Hadoop DFS (Distributed File System)
      • Block cache
      • Bloom filters on a per-column-family basis
      • Consistent reads/writes
    • HBase 0.92.0
      [Cluster diagram: YCSB Client, HMaster, HQuorumPeer, Namenode; HRegionServer and Datanode (HDFS process) on each data node]
      HBase configuration:
      • Auto flush: OFF
      • Write buffer: 12 MB (2 MB default)
      • Compression: LZO
      JVM tuning:
      • Namenode: 4G
      • Datanode: 2G
      • Region server: 6G
      • Master: 2G
    • HBase 0.92.0
      create 't1', { NAME => 'f1',
        BLOOMFILTER => 'ROW',
        REPLICATION_SCOPE => 0,
        COMPRESSION => 'LZO',
        VERSIONS => 3,
        TTL => 2147483647,
        BLOCKSIZE => 16384,
        IN_MEMORY => true,
        BLOCKCACHE => true,
        DEFERRED_LOG_FLUSH => true }
      disable 't1'
      alter 't1', METHOD => 'table_att', MAX_FILESIZE => 1073741824
      enable 't1'
      * Keep the column family name short (a few characters).
    • MongoDB 2.0.5
      • Document-oriented
      • Distributed
      • Load balancing by sharding
      • Relies on memory-mapped files for caching
    • MongoDB 2.0.5
      [Cluster diagram: YCSB Client, Mongos Router, Mongod Config Server, Mongod shards]
      Mongo configuration:
      • Sharding enabled on the database
      • Mongo Wire Protocol + BSON
      • The collection is sharded by _id (the primary key)
    • MongoDB 2.0.5
      // use hostnames instead of IPs
      var shards = […];
      // adding shards
      for (var i = 0; i < shards.length; i++) {
          db.runCommand({ addshard : shards[i] });
      }
      // enabling sharding
      db.runCommand({ enablesharding : "UserDatabase" });
      // sharding the collection by key
      db.runCommand({ shardcollection : "UserDatabase.UserTable", key : { _id : 1 } });
    • MySQL Sharded 5.5.2.3
      • RDBMS (no surprises here)
      • Sharding is done on the client (YCSB) side
      • Not scalable
    • MySQL Sharded 5.5.2.3
      MySQL configuration:
      • Storage engine: MyISAM
      • key_buffer_size: 6G
      DDL for table creation (sharding by the primary key):
      CREATE TABLE user_table(
        ycsb_key VARCHAR(32) PRIMARY KEY,
        // specify 10 table columns (100 bytes each)
      ) ENGINE=MYISAM;
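      Since sharding is done on the client (YCSB) side, the routing logic reduces to a deterministic key-to-node mapping. A minimal sketch in Python, assuming hash-modulo routing; the host names are hypothetical, and the actual YCSB connector's scheme may differ:

      ```python
      # Minimal sketch of client-side sharding: route each ycsb_key to one of N
      # MySQL instances by hashing the primary key. Host names are hypothetical.
      import hashlib

      SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]

      def shard_for(ycsb_key: str) -> str:
          """Deterministically map a primary key to a shard host."""
          digest = hashlib.md5(ycsb_key.encode("utf-8")).digest()
          index = int.from_bytes(digest[:8], "big") % len(SHARDS)
          return SHARDS[index]

      # Every client computes the same mapping, so reads find the rows that
      # writes placed; no coordinator process is needed.
      ```

      The slide's "not scalable" point follows from this design: adding a shard changes the modulus and remaps most keys, unlike a consistent-hashing scheme.
      
      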
    • MySQL Cluster 7.2.5
      • Relational, but not really relational:
        • No foreign keys
        • ACID: read-committed transactions only
      • Shared-nothing
      • In-memory database
      • Can be persistent (non-indexed columns)
    • MySQL Cluster 7.2.5
      MySQL Cluster configuration:
      • DataMemory: 3G
      • IndexMemory: 5G
      • DiskPageBufferMemory: 2G
      Each shard has:
      • a MySQL Database Daemon
      • an NDB Daemon
      • a MySQL Cluster Management Server Daemon
    • MySQL Cluster 7.2.5
      CREATE TABLE user_table(
        ycsb_key VARCHAR(32) PRIMARY KEY,
        … // columns
      ) MAX_ROWS=200000000 ENGINE=NDBCLUSTER
        PARTITION BY KEY(ycsb_key);
      CREATE LOGFILE …;    // log files are created
      CREATE TABLESPACE …; // table space for disk persistence
      ALTER TABLE user_table
        TABLESPACE user_table_space STORAGE DISK
        ENGINE=NDBCLUSTER; // assigning the table space to the target table
    • Riak 1.1.1
      • Key-value store (inspired by Amazon's Dynamo)
      • Distributed
      • Scalable
      • Schema-free
      • Decentralized (no single point of failure)
    • Riak 1.1.1
      Riak configuration:
      • storage_backend: bitcask  // the eleveldb backend was slow
      • YCSB Client connects over HTTP or Protocol Buffers
      Erlang (vm.args) configuration:
      • +A 256  // number of threads in the async thread pool
      • +K true  // kernel poll enabled
      • -env ERL_MAX_PORTS 4096  // number of concurrent ports and sockets
      Schema/DDL is not required, just a bucket name.
    • Load phase, [INSERT]
      [Chart: Load phase, 100,000,000 records × 1 KB, [INSERT]; average latency (ms) vs. throughput (ops/sec) for Cassandra 1.0, HBase 0.92.0, MongoDB 2.0.5, MySQL Cluster 7.2.5, MySQL Sharded 5.5.2.3, and Riak 1.1.1]
      HBase has unconquerable superiority in writes, and with pre-created regions it showed us up to 40K ops/sec. Cassandra also provides noticeable performance during the loading phase, with around 15K ops/sec. MySQL Cluster can show much higher numbers in "just-in-memory" mode.
    • Read-heavy workload (B), [READ]
      [Chart: workload B (UPDATE 0.05, READ 0.95), [READ]; average latency (ms) vs. throughput (ops/sec)]
      MySQL Sharded is the performance leader in reads. MongoDB is close behind, accelerated by its "memory-mapped files" style of caching: MongoDB uses memory-mapped files for all disk I/O. Cassandra's key and row caching allows very fast access to frequently requested data. Random read performance is slower in HBase.
    • Read-heavy workload (B), [UPDATE]
      [Chart: workload B (UPDATE 0.05, READ 0.95), [UPDATE]; average latency (ms) vs. throughput (ops/sec)]
      Deferred log flush does the right job for HBase during mutation ops: edits are committed to the memstore first, and the aggregated edits are then flushed to the HLog asynchronously. Cassandra has great write throughput, since writes are first appended to the commit log, which is a fast operation. MongoDB's latency suffers from the global write lock. Riak behaves more stably than MongoDB.
    • Read-only workload (C)
      [Chart: workload C (READ 1); average latency (ms) vs. throughput (ops/sec)]
      The read-only workload simulates a data caching system, where the data itself is constructed elsewhere and the application just reads it. B-tree indexes make MySQL Sharded a notable winner in this competition.
    • Scan ranges workload (E), [SCAN]
      [Chart: workload E (INSERT 0.05, SCAN 0.95), [SCAN]; average latency (ms) vs. throughput (ops/sec) for Cassandra 1.0 and HBase 0.92.0]
      HBase performs a bit better than Cassandra in range scans, though Cassandra's range scans have improved noticeably since the 0.6 version presented in the YCSB slides.
      • MongoDB 2.0.5: max throughput 20 ops/sec, latency ≈1 sec
      • MySQL Cluster 7.2.5: <10 ops/sec, latency ≈400 ms
      • MySQL Sharded 5.5.2.3: <40 ops/sec, latency ≈400 ms
      • Riak 1.1.1: the bitcask storage backend doesn't support range scans (eleveldb was slow during load)
    • Insert-mostly workload (G), [INSERT]
      [Chart: workload G (INSERT 0.9, READ 0.1), [INSERT]; average latency (ms) vs. throughput (ops/sec)]
      A workload with a high volume of inserts proves that HBase is the winner here, closely followed by Cassandra. MySQL Cluster's NDB engine also copes perfectly well with intensive writes.
    • Before conclusions
      • Is there a single winner?
      • Who is worth the prize?
    • Summary
      Answers:
      • You decide who the winner is
      • NoSQL is "different horses for different courses"
      • Evaluate before choosing the "horse"
      • Construct your own workloads or use the existing ones:
        • Benchmark it
        • Tune the database!
        • Benchmark it again
      Amazon EC2 observations:
      • Scales perfectly for NoSQL
      • EBS slows the database down on reads
      • RAID0 it! 4 disks in an array is a good choice; some reported degraded performance with a higher number (6 and more)
      • Don't be sparing with RAM!
    • Thank you for your attention
      mailto: sergey.bushik@altoros.com
      skype: siarhei_bushyk
      linkedin: http://www.linkedin.com/pub/sergey-bushik/25/685/199