Couchbase performance benchmarking
 


On September 21, 2012, Renat Khasanshyn, the Founder and Chairman of Altoros, presented the session “Benchmarking Couchbase Server” at CouchConf in San Francisco. In his session, Renat pointed out that every NoSQL vendor claims its database is fast and scalable, but that such claims are of little help to end users.

    Couchbase performance benchmarking: Presentation Transcript

    • Benchmarking Couchbase Server
      Renat Khasanshyn, CEO, Altoros Systems, Inc.
      CouchConf 2012, September 21, 2012
      Copyright © Altoros Systems, Inc. | CONFIDENTIAL
    • Presentation Outline
      • Benchmark Goals
      • Benchmark Design and Scenario
      • Benchmarking Tools
      • Benchmark Results
    • About Altoros
      • Software delivery acceleration specialist for big data application implementation services
      • 200+ employees globally (Eastern Europe, US, UK, Denmark, Norway)
      • Big data practice areas:
        • Advertising analytics
        • Automated device analytics
        • Big data warehouse
    • Why Benchmark NoSQL technologies?
      • All NoSQL technologies say they are “high performance and scalable”, but this isn’t helpful to end users
      • Performance needs to be measured for meaningful workloads
        ⇒ To help users understand the performance characteristics of databases under those workloads
      • So we decided to compare the commonly used NoSQL databases
        • MongoDB 2.2RC
        • Cassandra 1.1.2
        • Couchbase Server 2.0 (recent build)
    • Benchmark Goals
      • Reproducible by anyone
        • Open source workload generator
      • Focus on the use case for which NoSQL is typically selected
      • Use a realistic workload
        • Simulate the steady state of a running application
        • Meaningful data amounts and runtime
      • Compare latency vs. throughput
      • Measure max throughput (for the given scenario)
    • Benchmarking Scenario
      • For an interactive web application
        • Scalability and performance are the most common requirements
        • Typically leads to users selecting NoSQL over RDBMS
      • The working set of data changes with time
        • End users using the application change over time
        • Example: every few hours, every few days, every few weeks
      • There is more data available than memory (RAM)
      • Replication is used for fault tolerance
      • Real-world data sizes
      • Use EC2 as the deployment platform
        • Commonly used
        • Easy to replicate results
    • Benchmarking Scenario Details
      Hardware
      • 4 Amazon m1.xlarge instances for the NoSQL DBs
      • 1 instance used as the client
      Workload details
      • Operations are a mix of C:R:U:D in the ratio 5:60:33:2
      • Each document is roughly 1.5-2K in size (15 fields * 100 bytes)
      • 15 million active and 15 million replica documents
      • Workload with a sliding working set
      • Load phase, warm-up phase, access phase
      • Runtime of the access phase: ~1 hour
      • Latency measured for varying throughput, 3 times for each run
      • Focus on transaction performance
        • Latency
        • Throughput
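The operation mix and document shape described above can be illustrated with a small sketch. This is not the YCSB generator used in the benchmark; the function names, the field naming scheme, and the random ASCII payload are assumptions made purely for illustration.

```python
import random
from collections import Counter

# Operation weights from the slide: create:read:update:delete = 5:60:33:2.
OPERATION_WEIGHTS = {"create": 5, "read": 60, "update": 33, "delete": 2}


def next_operation(rng):
    """Pick one operation name with probability proportional to its weight."""
    names = list(OPERATION_WEIGHTS)
    weights = list(OPERATION_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=1)[0]


def make_document(rng, fields=15, field_bytes=100):
    """Build a document of `fields` random ASCII strings of `field_bytes` each.

    The "fieldN" naming is a hypothetical choice, not taken from the slides.
    """
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return {
        f"field{i}": "".join(rng.choice(alphabet) for _ in range(field_bytes))
        for i in range(fields)
    }


rng = random.Random(42)  # fixed seed so the run is repeatable
mix = Counter(next_operation(rng) for _ in range(100_000))
doc = make_document(rng)
payload_bytes = sum(len(value) for value in doc.values())  # 15 * 100 bytes
```

Over a long run the counts in `mix` approach the 5:60:33:2 proportions, and `payload_bytes` gives the raw field payload of one document before any key or storage overhead.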
    • What was measured?
      • Latency
        • Round trip time taken for a request to execute from the client to the server and back
        • Average, 95th and 99th percentile measured
        • Why is this important?
          • You want your users to have a great experience
          • Not just an “average” one
      • Throughput
        • Throughput was varied from 1K ops/sec to 25K ops/sec depending on the NoSQL database
        • Max throughput was measured
        • Why is this important?
          • You want your app to support hundreds of thousands of users
      • Workloads are not rate limited, focused on max throughput.
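To show why the slide reports the 95th and 99th percentiles alongside the average, here is a minimal sketch using the nearest-rank percentile definition. The sample latencies are invented for the example, and the slides do not say which percentile method the benchmark tooling applies.

```python
def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an ascending-sorted, non-empty list.

    `pct` is an integer from 1 to 100.
    """
    rank = (pct * len(sorted_samples) + 99) // 100  # integer ceiling division
    return sorted_samples[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Return the average, 95th and 99th percentile latency in milliseconds."""
    ordered = sorted(latencies_ms)
    return {
        "avg": sum(ordered) / len(ordered),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }


# Hypothetical samples: 98 fast requests and 2 slow ones.
samples = [1.0] * 98 + [50.0] * 2
stats = summarize(samples)
```

With these invented samples the average stays under 2 ms while the 99th percentile is 50 ms, which is the kind of tail behaviour an average alone would hide.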
    • YCSB
    • Benchmark Implementation: YCSB
      • Yahoo! team offered a “standard” benchmark
      • Yahoo! Cloud Serving Benchmark (YCSB)
        • Focus on database
        • Focus on performance
      • YCSB Client consists of 2 parts
        • Workload generator
        • Workload scenarios
    • Why YCSB
      • Open source
      • Extensible
      • Rich selection of connectors
        • Azure, BigTable, Cassandra, CouchDB, Dynomite, GemFire, HBase, Hypertable, Infinispan, MongoDB, PNUTS, Redis, Voldemort, GigaSpaces XAP
        • Connector for sharded RDBMS (e.g. MySQL)
      • We developed a few connectors
        • Accumulo, Couchbase, Riak
        • Connector for shared-nothing RDBMS (e.g. MySQL Cluster)
    • How YCSB Works
    • THE CONFIGURATIONS
    • Cluster specification
      • YCSB Client: 1 Amazon m1.xlarge instance
        • 15 GB memory
        • 4 virtual cores
        • 4 EBS 50 GB volumes in RAID0
        • 64-bit Amazon Linux (CentOS binary compatible)
      • Database cluster: 4 Amazon m1.xlarge instances
        • 15 GB memory
        • 4 virtual cores
        • 4 EBS 50 GB volumes in RAID0
        • 64-bit Amazon Linux
      * Extra nodes for masters, routers, etc.
    • Couchbase Configuration
      • 4-node Couchbase cluster
      • 1 replica setting
      • Each node has some active and some replica data
      • 12 GB (12288 MB) used as the Couchbase bucket size per node
    • MongoDB Configuration
      • 4 shards, each with 1 replica (replication factor 1), where each shard is a set of 2 nodes: primary and secondary
      • Journaling disabled (trying to maximize performance)
      • var shards = [
            "shard1/ycsb-node1:27017,ycsb-node2:27018",
            "shard2/ycsb-node2:27017,ycsb-node1:27018",
            "shard3/ycsb-node3:27017,ycsb-node4:27018",
            "shard4/ycsb-node4:27017,ycsb-node3:27018"];
      • Each node running
        • 2 mongod processes (8 mongod processes in total on 4 nodes)
        • 4 mongos processes (mongos is the MongoDB router process) on port 27019
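As a quick sanity check on the shard layout listed above, the following sketch parses the shard strings and counts mongod processes per node. The `parse_shard` helper is hypothetical, and treating the first listed member as the primary is an assumption; the slide only says each shard has a primary and a secondary.

```python
from collections import Counter

# Shard definitions copied from the slide: "<shard>/<member>,<member>".
SHARDS = [
    "shard1/ycsb-node1:27017,ycsb-node2:27018",
    "shard2/ycsb-node2:27017,ycsb-node1:27018",
    "shard3/ycsb-node3:27017,ycsb-node4:27018",
    "shard4/ycsb-node4:27017,ycsb-node3:27018",
]


def parse_shard(spec):
    """Split a shard string into its name and a list of (host, port) members."""
    name, members = spec.split("/", 1)
    hosts = []
    for member in members.split(","):
        host, port = member.rsplit(":", 1)
        hosts.append((host, int(port)))
    return name, hosts


layout = dict(parse_shard(spec) for spec in SHARDS)

# Each (host, port) pair corresponds to one mongod process.
endpoints = [member for hosts in layout.values() for member in hosts]
mongod_per_node = Counter(host for host, _ in endpoints)
```

The counts confirm what the slide states: every node hosts two mongod processes (one on port 27017, one on 27018), for eight in total, and each shard pairs two different nodes.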
    • Cassandra Configuration
      • Cassandra JVM settings:
        • 1.1) MAX_HEAP_SIZE, the total amount of memory dedicated to the Java heap: 6G
        • 1.2) HEAP_NEWSIZE, the total amount of memory for the new generation of objects: 400M
      • Cassandra settings:
        • 2.1) RandomPartitioner was used, which distributes rows across the cluster evenly by MD5
        • 2.2) Memtable size 4048 MB
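The idea behind distributing rows "evenly by MD5" can be sketched as follows. This is a simplified illustration, not Cassandra's actual token assignment or replica placement: the node names, the key format, and the equal-range split of the hash space are all assumptions.

```python
import hashlib
from collections import Counter

NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def md5_token(key):
    """Map a row key to a 128-bit integer using its MD5 digest."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def owner(key, nodes=NODES):
    """Assign a key to a node by splitting the token space into equal ranges."""
    range_size = 2**128 // len(nodes)
    index = min(md5_token(key) // range_size, len(nodes) - 1)
    return nodes[index]


# Place 100,000 hypothetical keys and count how many land on each node.
placement = Counter(owner(f"user{i}") for i in range(100_000))
```

Because MD5 output is close to uniformly distributed, sequential keys such as `user0`, `user1`, ... end up spread almost evenly across the four ranges, which is the property the slide attributes to RandomPartitioner.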
    • THE RESULTS
    • Reads (Average time): read latencies against throughput
      [Chart: average latency in ms (0 to 7) against operations per second (0 to 22,000); series: Cassandra, MongoDB, Couchbase]
    • Reads (95th percentile): read latencies against throughput
      [Chart: 95th percentile latency in ms (0 to 18) against operations per second (0 to 22,000); series: Cassandra, Couchbase, MongoDB]
    • Reads (99th percentile): read latencies against throughput
      [Chart: 99th percentile latency in ms (0 to 60) against operations per second (0 to 22,000); series: Cassandra, MongoDB, Couchbase]
    • Mongo Replica Reads
      • MongoDB setup had 4 shards
        • By default only masters will service reads
      • To allow replica reads and still be comparable, need to ensure that replica data is up-to-date
        • This was done using write-concern (REPLICAS_SAFE)
      • Tests showed that results did not improve
        • This includes results for writes
    • Writes (Average time): insert and update latencies against throughput
      [Chart: average latency in ms (0 to 5) against operations per second (0 to 22,000); series: MongoDB, Cassandra, Couchbase]
    • Writes (95th percentile): insert and update latencies against throughput
      [Chart: 95th percentile latency in ms (0 to 30) against operations per second (0 to 22,000); series: MongoDB, Cassandra, Couchbase]
    • Writes (99th percentile): insert and update latencies against throughput
      [Chart: 99th percentile latency in ms (0 to 50) against operations per second (0 to 22,000); series: MongoDB, Cassandra, Couchbase]
    • Results Analysis
      • Couchbase
        • Showed the lowest latencies and the highest throughput
        • Latency was independent of throughput up to three quarters of the maximum achievable throughput (for both reads and writes)
      • Cassandra
        • Had the highest latencies of all the databases
        • Showed higher max throughput than MongoDB, but only 60% of the throughput achieved by Couchbase
        • Latencies rose fast as throughput was increased
      • MongoDB
        • Read latencies were better than Cassandra's but higher than Couchbase's
        • Max throughput for reads and writes was the lowest of all the databases
          • Particularly for writes, high latencies were seen at average throughput
          • The coarse write lock seems to have a big impact on performance
    • Other Thoughts
      • You decide who is the winner
      • NoSQL is a case of “different horses for different courses”
      • Evaluate before choosing the “horse”
      • Construct your own workloads or use existing ones
        • Benchmark it
        • Tune the database!
        • Benchmark it again
      Amazon EC2 observations
      • Scales perfectly for NoSQL
      • EBS slows down the database on reads
      • RAID0 it! Use 4 disks in the array (a good choice); some reported degraded performance with a higher number (6 and more)
    • What are we missing in our benchmarking scenario?
      Load phase workload
      • Working set is created
      • 15 million records
      • 1.5 KB record (15 fields by 100 bytes)
      • 45 GB total or ≈12 GB per node
      Ideas, anyone?
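The data volume figures above can be reproduced with a few lines of arithmetic. The sketch assumes that the 45 GB total counts both the 15 million active and the 15 million replica documents mentioned in the scenario details, that GB means 10^9 bytes, and that only the raw field payload is counted (no keys or storage metadata).

```python
ACTIVE_DOCS = 15_000_000
REPLICA_DOCS = 15_000_000   # assumption: replicas are included in the total
DOC_BYTES = 15 * 100        # 15 fields of 100 bytes each
NODES = 4

total_bytes = (ACTIVE_DOCS + REPLICA_DOCS) * DOC_BYTES
total_gb = total_bytes / 10**9      # decimal gigabytes
per_node_gb = total_gb / NODES
```

Under these assumptions `total_gb` comes out at 45 and `per_node_gb` at 11.25, which is consistent with the slide's "45 GB total or ≈12 GB per node" once rounding and per-document overhead are allowed for.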
    • YCSB Connectors: github.com/Altoros/YCSB
    • Workload Generator Specs
      Hotspot generator with sliding window:
      • hotspotslidingspeed=10
        Speed of the hot set window movement, measured in keys per second, with a default value of 10 keys/sec (can be overridden in the workload properties file).
      • hotspotdatafraction=0.2
        Proportion of the hot data set to the whole dataset; default is 0.2.
      • hotspotoperationfraction=0.9
        How often the hot dataset is queried compared to the cold dataset; default is 0.8, 0.9 was used.
      • lowerbound=0
        The minimum key value allowed to be queried. Set to 0.
      • upperbound=15000000
        The maximum key value allowed to be queried. Set to 15 million.
      Specification of the client process, which drives the workload:
      • threadcount=30
        Number of parallel threads spawned on the client node to drive the benchmark.
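The interplay of these properties can be sketched as a small key chooser. This is an illustrative model only, not the generator from the Altoros YCSB fork: the class name, the wrap-around behaviour of the window, and treating `upperbound` as exclusive are all assumptions.

```python
import random


class SlidingHotspot:
    """Choose keys so that most requests hit a hot window that moves over time."""

    def __init__(self, lower, upper, data_fraction, op_fraction, sliding_speed, rng):
        self.lower = lower
        self.span = upper - lower                   # keys in [lower, upper)
        self.hot_size = int(self.span * data_fraction)
        self.op_fraction = op_fraction              # share of requests to hot keys
        self.speed = sliding_speed                  # keys per second
        self.rng = rng

    def hot_start(self, elapsed_seconds):
        """Offset of the first hot key after `elapsed_seconds` (wraps around)."""
        return int(elapsed_seconds * self.speed) % self.span

    def is_hot(self, key, elapsed_seconds):
        offset = (key - self.lower - self.hot_start(elapsed_seconds)) % self.span
        return offset < self.hot_size

    def next_key(self, elapsed_seconds):
        start = self.hot_start(elapsed_seconds)
        if self.rng.random() < self.op_fraction:
            offset = self.rng.randrange(self.hot_size)          # inside hot window
        else:
            cold_size = self.span - self.hot_size
            offset = self.hot_size + self.rng.randrange(cold_size)  # cold keys
        return self.lower + (start + offset) % self.span


# Values from the slide: 20% hot data, 90% hot operations, 10 keys/sec.
gen = SlidingHotspot(0, 15_000_000, 0.2, 0.9, 10, random.Random(7))
keys = [gen.next_key(3600) for _ in range(20_000)]   # one hour into the run
hot_share = sum(gen.is_hot(key, 3600) for key in keys) / len(keys)
```

After one hour at 10 keys/sec the window has moved by 36,000 keys, and roughly 90% of the sampled keys fall inside the 3-million-key hot window, matching the `hotspotoperationfraction` that was used.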
    • Thank you!
      renat.k@altoros.com
      @renatkhasanshyn
      Tel. (650) 395-7002