“Quick” NoSQL Comparison: Measuring performance and failover of Aerospike, Cassandra, Couchbase, and MongoDB
 

  • CAP is not really that relevant. If you’re using a NoSQL database, you are probably running on a cluster with multiple partitions. And the consistency versus availability trade-off is not necessarily easy to interpret except in a formal, theoretical way. Plus, it’s tunable. Plus, what does consistency even mean? Example: Cassandra partial writes.
  • and have been clobbered by db, and managing relational shards is hard. RAD: it’s a great reason and very valuable for document and graph type problems. But it’s not what we’re talking about.
  • Fair use: they kept yelling at us! All the other tests out there seemed to run the databases without understanding the core models. They tended to analyze breadth but not depth. Even choosing the problem is quite hard. For example, Aerospike and Couchbase were the most closely related databases in our test, but the kind of problem space that makes Aerospike interesting is the kind that breaks Couchbase.
  • Examine how they fail over, and what that means in real terms.
  • Couchbase in its default mode is consistent, but by virtue of having a master copy. Data set fits in RAM: think about it. To disk: if the number of IOPS is 40k and the number of transactions is 250k, think about it.
  • These are all approximate. Find vBucket for key, master on Node 1, write.
  • The client connects to some node, which can act as a coordinator. Different drivers come with better routing.
  • Similar to Cassandra, but no partial writes – each transaction is
  • 1. Provision the hardware, operating systems, and environment using our best estimates of what customers will run in their data centers and the best practices, where specified, of the various databases.
    2. Configure it optimally and ensure it is functioning as a single cluster.
    3. Load the data by inserting records individually but as fast as possible.
  • Secondary indexes: queries hit all nodes.
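
The note above about finding the vBucket for a key describes hash-based routing: the client hashes the key to a partition and looks up which node is master for that partition. The sketch below illustrates the idea only. The CRC32 hash, the count of 1024 partitions, the round-robin cluster map, and every function name are assumptions made for this example and are not taken from the presentation or from Couchbase's client code.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed partition count for this sketch


def vbucket_for_key(key, num_vbuckets=NUM_VBUCKETS):
    """Map a key to a partition id by hashing (illustrative, not the real client hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(nodes, num_vbuckets=NUM_VBUCKETS):
    """Hypothetical cluster map: assign each vBucket a master node round-robin."""
    return [nodes[i % len(nodes)] for i in range(num_vbuckets)]


def master_for_key(key, vbucket_map):
    """Return the node that owns the master copy of the key's vBucket."""
    return vbucket_map[vbucket_for_key(key, len(vbucket_map))]


if __name__ == "__main__":
    cluster = ["node1", "node2", "node3", "node4"]
    vmap = build_vbucket_map(cluster)
    for k in ("user:1", "user:2", "user:3"):
        print(k, vbucket_for_key(k), master_for_key(k, vmap))
```

Because every client computes the same hash against the same map, reads and writes for a key always land on the same master, which is what makes the default mode consistent.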

Presentation Transcript

  • Measuring performance and failover of Aerospike, Cassandra, Couchbase, and MongoDB. Ben Engber, Thumbtack Technology
  • ‣ A lot of interest in NoSQL databases
    ‣ Discussions around these tend to be confusing
      ◦ e.g. discussions of CAP Theorem
    ‣ Our goal was to present business use case answers
  • ‣ You want to support a large transaction volume
    ‣ You want to find a way to distribute your data tier
    ‣ You want simpler failover and other administration
    ‣ You want rapid application development
  • ‣ Test a bunch of databases
    ‣ Start with a nice simple workload
      ◦ (key value storage)
    ‣ Use a standard client (YCSB)
    ‣ Test something new but legit
      ◦ Solid state performance on bare metal
    ‣ Then move on to
      ◦ secondary indexes
      ◦ other databases
      ◦ failover
  • ‣ Running a database is easy; running it correctly is hard
      ◦ Memory sizing, problem sizing, etc.
      ◦ Eviction / ejection
      ◦ These databases work in very different ways
    ‣ How do we even represent a fair use of a database?
    ‣ What does it mean to test key-value storage?
  • ‣ Choose the databases we hear about most often
    ‣ Create standard baseline scenarios
    ‣ Measure raw performance for various scenarios
    ‣ Later: Examine how they fail over
  • [Table: the two test scenarios]

    | Fast | Reliable |
    |---|---|
    | Not the same as Available | Not the same as Consistent |
    | Asynchronous replication to nodes | Synchronous replication to nodes |
    | Asynchronous writes to disk | Synchronous or asynchronous writes to disk |
    | Data set fits in RAM | Data set cannot live in RAM |
    | Pretty reliable | Pretty fast |
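
The speaker notes for this comparison ask what happens when the device can do 40k IOPS but the workload is 250k transactions. A quick worked calculation, assuming (for illustration only) that each synchronous flush costs one I/O operation and that both figures apply to the same device; the function name is made up for this sketch.

```python
import math


def writes_per_io(transactions_per_sec, device_iops):
    """Average number of transactions that must share one I/O to keep up."""
    return transactions_per_sec / device_iops


if __name__ == "__main__":
    ratio = writes_per_io(250_000, 40_000)
    print(ratio)             # 6.25
    print(math.ceil(ratio))  # 7
```

Under those assumptions, at least 6.25 transactions on average have to be grouped into each I/O, so a flush per transaction cannot keep up and some batching or asynchronous writing is unavoidable at that load.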
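  • [Diagram: six-node cluster in which each node is master for one partition and slave for two others (Node 1: master A, slave B,C; Node 2: master B, slave C,D; Node 3: master C, slave D,E; Node 4: master D, slave E,F; Node 5: master E, slave F,A; Node 6: master F, slave A,B). One client writes to the master and reads from the master; a second client writes to the master and observes, then reads from the master.]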
  • [Diagram, two panels: Node 1 is master for A with Nodes 2 and 3 as slaves for A; Node 4 is master for B with Nodes 5 and 6 as slaves for B. Client 1 writes to the master and reads from the master; Client 2 writes to the master and observes, then reads from the master.]
  • [Diagram: six-node ring in which each node holds three partitions (Node 1: A,B,C; Node 2: B,C,D; Node 3: C,D,E; Node 4: D,E,F; Node 5: E,F,A; Node 6: F,A,B). Clients 1, 2, and 3 write one and read one; Client 4 writes quorum and reads quorum; Client 5 writes one and reads all; Client 6 writes all and reads one.]
    “In distributed data systems like Cassandra, [consistency] usually means that once a writer has written, all readers will see that write.”
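
The write/read combinations in this diagram can be checked with the usual overlap rule: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor (R + W > N). A minimal sketch, assuming a replication factor of 3 as the diagram's three partitions per node suggest; the function names are chosen for this example.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def read_sees_latest_write(n, w, r):
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


if __name__ == "__main__":
    N = 3  # assumed replication factor
    levels = {"one": 1, "quorum": quorum(N), "all": N}
    combos = [("one", "one"), ("quorum", "quorum"), ("one", "all"), ("all", "one")]
    for write_level, read_level in combos:
        ok = read_sees_latest_write(N, levels[write_level], levels[read_level])
        print(f"write {write_level:6} read {read_level:6} -> overlap guaranteed: {ok}")
```

With these assumptions only the write-one/read-one clients lack the overlap guarantee, which is why that setting behaves as eventually consistent.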
  • [Diagram: six-node ring in which each node holds three partitions (Node 1: A,B,C; Node 2: B,C,D; Node 3: C,D,E; Node 4: D,E,F; Node 5: E,F,A; Node 6: F,A,B). Clients 1, 2, and 3 use fire and forget; Clients 4 and 5 use ACID.]
  • [Table: replication, durability, and consistency per configuration]

    | | Aerospike (async) | Aerospike (sync) | Cassandra (async) | Cassandra (sync) | Couchbase (async) | MongoDB (async) |
    |---|---|---|---|---|---|---|
    | Standard Replication Model | Asynchronous | Synchronous | Asynchronous | Synchronous | Asynchronous | Asynchronous |
    | Durability | Asynchronous | Synchronous | Asynchronous | Asynchronous | Asynchronous | Asynchronous |
    | Default sync batch | 128kB per device | immediate | 10 seconds | 10 seconds | 250k records | 100ms |
    | Consistency Model | Eventual | Immediate | Eventual | Immediate | Immediate | Immediate |
    | Consistency on single node failure | Inconsistent | Consistent | Inconsistent | Consistent | Inconsistent | Inconsistent |
    | Availability on single node failure / no quorum | Available | Available | Available | Unavailable | Available | Available |
    | Data loss on replica set failure | 25% | 25% | 25% | 25% | 25% | 50% |
  • 1. Provision according to best practices and reality
    2. Install a database on a 4-node cluster
    3. Load a large dataset to disk (SSD)
    4. Determine maximum load, using the strongest durability guarantees practical
    5. Perform a stepwise load for latency
    6. Repeat for read-heavy and balanced read-write
    7. Repeat steps 3-6 for a dataset that fits into RAM
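
Steps 4 and 5 above can be sketched as a small driver: once the maximum load is known, request a series of increasing target rates and record latency at each. This is an illustration under assumptions, not the harness used in the study. The function names, the number of steps, and the toy latency curve are invented here; in practice the measurement callback would wrap a real client run (for example YCSB with a target throughput).

```python
def stepwise_targets(max_throughput, steps=10):
    """Evenly spaced target rates, from max/steps up to max."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [max_throughput * i // steps for i in range(1, steps + 1)]


def run_stepwise(max_throughput, measure_latency_ms, steps=10):
    """Call the supplied measurement function at each target rate."""
    return [(t, measure_latency_ms(t)) for t in stepwise_targets(max_throughput, steps)]


def toy_latency_ms(target, capacity=300_000):
    """Made-up latency curve for demonstration only; not measured data."""
    utilization = min(target / capacity, 0.99)
    return round(1.0 / (1.0 - utilization), 2)


if __name__ == "__main__":
    for target, latency in run_stepwise(250_000, toy_latency_ms, steps=5):
        print(target, latency)
```

Plotting the resulting pairs gives the latency-versus-throughput curves that the later slides show.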
  • [Chart: Aerospike, Cassandra, MongoDB, Couchbase 1.8*, and Couchbase 2.0* compared on SSD and In Memory datasets; axis range 0 to 350,000.]
  • [Charts, two panels: “SSD / Synchronous” (axis range 0 to 350,000) and “RAM / Asynchronous” (axis range 0 to 1,000,000), each comparing Balanced and Read-Heavy workloads. Series: Aerospike, Cassandra, MongoDB, Couchbase 1.8, Couchbase 2.0*.]
  • [Charts, four panels: “Balanced Workload Read Latency (Full view)” and “Balanced Workload Update Latency (Full view)”, each for SSD / Synchronous and RAM / Asynchronous. Axes: Average Latency, ms against Throughput, ops/sec. Series: Aerospike, Cassandra, MongoDB, Couchbase 1.8, Couchbase 2.0.]
  • ‣ Aerospike is an SSD optimized DB, and proved itself
    ‣ Asynchronous K/V stores can do a huge amount of traffic
    ‣ This says nothing about secondary indexes
    ‣ Doesn’t answer our other main concern about distributed DBs
  • ‣ Throughput (50%, 75%, 100%)
    ‣ Failure type (graceful, kill -9, split brain)
    ‣ Workload (balanced read-write, mostly read)
    ‣ Replication Model / Durability Model
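
The four dimensions above multiply into a test matrix. A minimal sketch of enumerating it follows. The first three value lists come from the slide; the two values for the replication/durability model ("async", "sync") are an assumption based on the configurations compared elsewhere in the deck, and the dictionary keys are names chosen for this example.

```python
from itertools import product

THROUGHPUT_PCT = [50, 75, 100]
FAILURE_TYPES = ["graceful", "kill -9", "split brain"]
WORKLOADS = ["balanced read-write", "mostly read"]
MODELS = ["async", "sync"]  # assumed values; the slide does not enumerate them


def failover_scenarios():
    """Return every combination of the four test dimensions."""
    return [
        {"throughput_pct": t, "failure": f, "workload": w, "model": m}
        for t, f, w, m in product(THROUGHPUT_PCT, FAILURE_TYPES, WORKLOADS, MODELS)
    ]


if __name__ == "__main__":
    scenarios = failover_scenarios()
    print(len(scenarios))  # 36
    print(scenarios[0])
```

Each of these scenarios would then be run per database, which is part of why the test hardware budget becomes a limiting factor.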
  • [Charts, three slides: one panel each for Aerospike, Cassandra, Couchbase, and MongoDB.]
  • [Charts: “Downtimes on node down” (median and min/max downtime, ms) and “Downtime on node restore” for Aerospike, Cassandra, Couchbase, and MongoDB.]
  • [Table: failover results]

    | | Aerospike (async) | Aerospike (sync) | Cassandra (async) | Cassandra (sync) | Couchbase | MongoDB |
    |---|---|---|---|---|---|---|
    | Original Throughput | 300,000 | 150,000 | 27,500 | 30,000 | 375,000 | 33,750 |
    | Original Replication | 100% | 100% | 99% | 104% | 100% | 100% |
    | Downtime (ms) | 3,200 | 1,600 | 6,300 | ∞ | 2,400* | 4,250 |
    | Recovery time (ms) | 4,500 | 900 | 27,000 | N/A | 5,000 | 600 |
    | Node Down Throughput | 300,000 | 149,200 | 22,000 | 0 | 362,000 | 31,200 |
    | Node Down Replication | 52% | 52% | N/A | 54% | 50% | 50% |
    | Potential Data Loss | large | none | 220,000 rows | none | large | 2400 rows |
    | Time to stabilize on node up (ms) | small | 3,300 | small | small | small | 31,100 |
    | Final Throughput | 300,000 | 88,300 | 21,300† | 17,500† | 362,000 | 31,200 |
    | Final Replication | 100% | 100% | 101% | 108% | 76% | 100% |

    † Depends on driver being used. Newer drivers like Hector restore to 100% throughput.
    * Assuming perfect monitoring scripts.
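
One way to read the table above is to express the node-down and final throughput as a share of the original throughput. The sketch below does that arithmetic using the throughput rows exactly as given; the dictionary layout and function name are choices made for this example, and the percentages are derived here rather than quoted from the presentation.

```python
# (original, node down, final) throughput per configuration, copied from the table
THROUGHPUT = {
    "Aerospike (async)": (300_000, 300_000, 300_000),
    "Aerospike (sync)": (150_000, 149_200, 88_300),
    "Cassandra (async)": (27_500, 22_000, 21_300),
    "Cassandra (sync)": (30_000, 0, 17_500),
    "Couchbase": (375_000, 362_000, 362_000),
    "MongoDB": (33_750, 31_200, 31_200),
}


def retained_pct(original, current):
    """Share of the original throughput still being served, in percent."""
    return round(100 * current / original, 1)


if __name__ == "__main__":
    for name, (original, node_down, final) in THROUGHPUT.items():
        print(f"{name:18} node down: {retained_pct(original, node_down):5}%  "
              f"final: {retained_pct(original, final):5}%")
```

For the Cassandra rows, the footnote about drivers applies: the final figures depend on the client driver in use.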
  • ‣ For “Fast” scenario, these systems function as advertised
      ◦ Downtime is low
      ◦ Performance effect is not dramatic
    ‣ For “Reliable” scenario:
      ◦ For MongoDB and Cassandra, make sure you have a replication factor of 3
      ◦ Include replication lag in your capacity planning
    ‣ For both, understand your potential data loss
  • ‣ Finite wealth
      ◦ Building and reserving bare metal hardware each with multiple SSDs is expensive
      ◦ Not a fair synchronous failover test for Cassandra and MongoDB
    ‣ Lab != real world
    ‣ Replication delays
  • ‣ Do it in a larger cluster
    ‣ Measure data loss
    ‣ Measure more than K/V store
    ‣ Get other databases involved
      ◦ What’s your preference?
  • Thumbtack: http://www.thumbtack.net
    Ben Engber: bengber@thumbtack.net, bengber
    NoSQL implementations
    Everything Else: Application scalability, Social applications, Mobile, Cloud migrations
    http://www.slideshare.net/bengber/no-sql-presentation
    http://thumbtack.net/solutions/ThumbtackWhitePaper.html