Sizing Your Couchbase Cluster: Couchbase Connect 2015

How many nodes? That is the million-dollar question this session answers. Consider RAM, disk, CPU, and your specific hardware and workload requirements to identify the ideal cluster size. The session also covers architecture and deployment considerations, as well as the effects of using different Couchbase features.

  1. HOW MANY NODES? PROPERLY SIZING YOUR COUCHBASE CLUSTER (Perry Krug, Sr. Solutions Architect)
  2. ©2015 Couchbase Inc. Read this article: http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster
  3. Size Couchbase Server
     • Sizing == performance
     • Serve reads out of RAM
     • Enough IO for writes and disk operations
     • Mitigate inevitable failures
     [Diagram: Reading Data: the application server asks "Give me document A" and Couchbase Server answers "Here is document A". Writing Data: the application server asks "Please store document A" and Couchbase Server answers "OK, I stored document A".]
  4. Scaling out permits matching of aggregate flow rates so queues do not grow.
     [Diagram: three application servers connected over the network to three Couchbase Server nodes.]
  5. 5 Factors of Sizing
  6. How many nodes?
     5 key factors determine the number of nodes needed:
     1) RAM
     2) Disk
     3) CPU
     4) Network
     5) Data Distribution/Safety
     (per-bucket, multiple buckets aggregate)
     [Diagram: application user, web application server, Couchbase servers.]
  7. RAM sizing
     1) Total RAM
     • Managed document cache:
       • Working set
       • Metadata
       • Active + replicas
     • Index caching (I/O buffer)
     Keep the working set in RAM for best read performance.
     [Diagram: Reading Data: the application server asks "Give me document A" and the server returns it from RAM.]
  8. Working set depends on your application
     • Late-stage social game: many users no longer active; few logged in at any given time. working/total set = .01
     • Ad network: any cookie can show up at any time. working/total set = 1
     • Business application: users logged in during the day; the day moves around the globe. working/total set = .33
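
The RAM slides name the ingredients (working set, metadata, active plus replica copies) without giving a formula. A minimal sketch of how they might combine is below. Everything in it is an assumption for illustration and not a formula from this deck: the function name, the 56-byte per-item metadata default, and the 30% headroom are placeholders that should be checked against the official sizing documentation for your version.

```python
def ram_needed_gb(item_count, avg_value_bytes, working_set_ratio,
                  replicas=1, metadata_bytes_per_item=56, headroom=0.3):
    """Rough RAM estimate for one bucket, in GiB (illustrative only)."""
    copies = 1 + replicas  # active + replicas
    # Metadata for every item is assumed to stay in RAM for all copies.
    metadata = item_count * metadata_bytes_per_item * copies
    # Only the working-set fraction of document values is assumed cached.
    working_set = item_count * avg_value_bytes * working_set_ratio * copies
    # Headroom is an assumed allowance so the cache is not run at 100%.
    return (metadata + working_set) * (1 + headroom) / 1024 ** 3


# Same dataset, three working-set ratios from the slide above.
for ratio in (0.01, 0.33, 1.0):
    print(ratio, round(ram_needed_gb(100_000_000, 2048, ratio), 1))
```

The point of the sketch is the shape of the calculation: for a low working-set ratio the metadata term dominates, while for a ratio of 1 the document values dominate.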
  9. RAM Sizing: View/Index cache (disk I/O)
     • File system cache availability for the index has a big impact on performance:
       • Test runs were based on 10 million items with a 16GB bucket quota and 4GB or 8GB of system RAM available for indexes
       • Results show that doubling system cache availability:
         • halves query latency
         • increases throughput by 50%
     • Set quotas so that some RAM is left free for the file system cache
  10. Disk Sizing: Space and I/O
      2) Disk
      • Sustained write rate
      • Rebalance capacity
      • Backups
      • XDCR
      • Views/Indexes
      • Compaction
      • Total dataset: (active + replicas + indexes)
      • Append-only
      [Diagram labels: I/O, Space. Writing Data: the application server asks "Please store document A" and the server answers "OK, I stored document A".]
  11. Disk Sizing: Space and I/O
      • Disk writes are buffered
        • Bursts of data expand the disk write queue
        • Sustained writes need corresponding throughput
      • Disk throughput is affected by disk speed
        • SSD > 10K RPM > EBS
        • SSDs give a huge boost to write throughput and startup/warmup times
        • RAID can provide redundancy and increase throughput
      • Throughput = read/write + compaction + indexing + XDCR
      • Version 2.1 introduces multiple disk threads
        • Default is 3 (1 writer / 2 readers), max is 8 combined
      • Best to configure different paths for data and indexes
      • Plan on about 3x space (append-only, compaction, backups, etc.)
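
The two disk slides give a total dataset of active + replicas + indexes and a rule of thumb of about 3x space. A short sketch of that arithmetic follows. Applying the 3x factor to the whole dataset (rather than to the active data only) is an assumption here, and the function and parameter names are illustrative.

```python
def disk_space_needed_gb(active_gb, replicas=1, index_gb=0.0, overhead_factor=3.0):
    """Rough cluster-wide disk space estimate (illustrative only)."""
    # Total dataset = active + replicas + indexes, as listed on the slide.
    total_dataset_gb = active_gb * (1 + replicas) + index_gb
    # "About 3x" covers append-only growth, compaction, backups, etc.
    return total_dataset_gb * overhead_factor


print(disk_space_needed_gb(100, replicas=1, index_gb=20))
```

This covers space only. I/O throughput (sustained writes plus compaction, indexing, and XDCR) needs to be measured against the actual disks.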
  12. CPU sizing
      3) CPU
      • Disk writing
      • Views/compaction/XDCR
      • RAM r/w performance not impacted
      Min. production requirement: 4 cores
      • +1 core per bucket
      • +1 core per design document
      • +1 core per XDCR stream
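
The minimum-core rule on this slide is plain addition, sketched below. The function name is illustrative, and `base_cores` is left as a parameter because the Hardware Minimums slide later in the deck quotes a base of 8 cores rather than 4.

```python
def min_cores(buckets, design_docs, xdcr_streams, base_cores=4):
    """Minimum production cores per node, following the slide's rule of thumb."""
    return base_cores + buckets + design_docs + xdcr_streams


# Example: 2 buckets, 3 design documents, 1 XDCR stream.
print(min_cores(buckets=2, design_docs=3, xdcr_streams=1))
```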
  13. Network sizing
      4) Network
      • Client traffic
      • Replication (writes)
      • Rebalancing
      • XDCR
      [Diagram: reads and writes flow between three application servers and three Couchbase Server nodes; replication (which multiplies writes) and rebalancing flow between the Couchbase Server nodes.]
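
Because replication multiplies writes, a first-order bandwidth estimate can be sketched as client traffic plus one extra copy of every write per replica. This is an assumption-laden illustration: it ignores rebalancing, XDCR, and protocol overhead, and the function and parameter names are hypothetical.

```python
def cluster_bandwidth_mbps(reads_per_sec, writes_per_sec, avg_doc_bytes, replicas=1):
    """Approximate steady-state cluster traffic in megabits per second."""
    # Client traffic: every read and write moves one document.
    client_bytes = (reads_per_sec + writes_per_sec) * avg_doc_bytes
    # Intra-cluster replication: each write is sent once per replica.
    replication_bytes = writes_per_sec * avg_doc_bytes * replicas
    return (client_bytes + replication_bytes) * 8 / 1_000_000


print(cluster_bandwidth_mbps(1000, 1000, 1000, replicas=1))
```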
  14. Network Considerations
      • Low latency, high throughput (LAN) within the cluster
      • Eliminate router hops:
        • Between cluster nodes
        • Between clients and cluster
      • Check who else is sharing the network
      • Increase bandwidth by:
        • Adding more nodes (will scale linearly)
        • Upgrading routers/switches/NICs/etc.
  15. Data Distribution
      5) Data Distribution / Safety (assuming one replica):
      • 1 node = single point of failure
      • 2 nodes = + replication
      • 3+ nodes = best for production
        • Autofailover
        • Upgrade-ability
        • Further scale-ability
      Note: many applications will need more than 3 nodes.
      Servers fail, be prepared. The more nodes, the less impact a failure will have.
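
The claim that more nodes means less impact per failure can be illustrated with simple fractions. The sketch assumes data and load are spread evenly across nodes, which is a simplification, and the function name is hypothetical.

```python
def failure_impact(nodes):
    """Return (share of active data on the failed node,
    relative load increase on each surviving node)."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes to survive a failure")
    lost_share = 1 / nodes        # data that must be served from replicas
    extra_load = 1 / (nodes - 1)  # e.g. 0.25 means 25% more work per survivor
    return lost_share, extra_load


for n in (2, 3, 5, 10):
    print(n, failure_impact(n))
```

With 2 nodes a single failure doubles the load on the survivor; with 10 nodes each survivor picks up roughly 11% more.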
  16. How many nodes recap
      5 key factors determine the number of nodes needed:
      1) RAM
      2) Disk
      3) CPU
      4) Network
      5) Data Distribution/Safety
      (per-bucket, multiple buckets aggregate)
      [Diagram: application user, web application server, Couchbase servers.]
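
One way to combine the five factors, sketched here as an assumption rather than the deck's stated method, is to compute the node count each resource would require on its own and take the largest, with a floor of 3 nodes from the data distribution slide. The dictionary keys and per-node capacities are made-up example values.

```python
import math


def nodes_needed(required, per_node, min_nodes=3):
    """Return (node count, per-factor counts) for cluster-wide requirements."""
    per_factor = {
        name: math.ceil(required[name] / per_node[name]) for name in required
    }
    # The most demanding factor decides; min_nodes covers data safety.
    return max(max(per_factor.values()), min_nodes), per_factor


required = {"ram_gb": 200, "disk_gb": 1500, "cores": 40, "network_mbps": 900}
per_node = {"ram_gb": 48, "disk_gb": 500, "cores": 16, "network_mbps": 1000}
print(nodes_needed(required, per_node))
```

In this example RAM is the limiting factor, which matches the deck's later remark that dataset growth is the most common reason to add nodes.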
  17. Deployment Considerations
  18. Hardware Minimums
      • RAM: at least ~4GB (highly dependent on data set)
      • Disk: fastest "local" storage available
        • SSD is better
        • RAID 0 or 10, not 5
      • CPU (minimums): 8 cores
        • +1 per bucket
        • +1 per design document
        • +1 per XDCR stream
      Hardware requirements/recommendations are the intersection of what's needed versus what's available.
  19. Hardware Considerations
      • Designed for commodity hardware
      • Scale out, not up: more smaller nodes are better than fewer larger ones (can scale up later)
      • Tested and deployed in EC2
      • Physical hardware offers best performance and efficiency
      • Certain considerations when using VMs:
        • RAM use inefficient / disk IO usually not as fast
        • Local storage better than shared SAN
        • 1 Couchbase VM per physical host
        • You will generally need more nodes
        • Don't overcommit
  20. Couchbase in AWS
      • R3 or C3 instances are the best value for performance
        • Higher RAM-to-CPU ratios
        • Come with SSDs
      • Disk choice: SSDs are best
        • Ephemeral is okay
        • Single EBS not great, use LVM/RAID
        • Views/indexes on ephemeral, main data on EBS, or both on SSD
      • Backups: use cbbackup locally on each node and migrate to EBS/S3
        • Can use EBS snapshots
  21. Couchbase in AWS
      • Deploy across AZs with rack/zone awareness
      • Use an EIP/public hostname instead of a private IP:
        • Easier connectivity from outside AWS
        • Easier restoration/better availability
      • Couchbase XDCR across regions must use hostnames
      • In AWS, as with any cloud/virtual deployment, you will likely need more nodes than you would with physical infrastructure
  22. Effects of…
  23. Views/Indexes
      • Effect on scale/sizing:
        • Increase the CPU and disk IO requirements
        • More complex views require more CPU
        • More view output requires more disk IO
        • More RAM should be left out of the quota for better IO caching
      • Indication:
        • Indexes significantly behind data writes (or growing delays)
      • What to do:
        • Make sure you follow best practices in view writing
        • Add more nodes to distribute processing "work"
        • Look into SSDs
  24. XDCR
      • Effect on scale/sizing:
        • XDCR is CPU intensive
        • Disk IO will double
        • Memory needs to be sized accordingly (bi-directional may mean more data)
      • Indication:
        • A rising XDCR queue on the source
      • What to do:
        • More nodes on source and destination will drain the queue faster (scales linearly)
        • Tune replication streams according to CPU availability
  25. As your workload grows…
      • Effects on scale/sizing:
        • More reads:
          • Individual documents will not be impacted (static working set)
          • Views may require faster disks, more disk IO caching
        • More writes will increase disk IO needs
      • Indications:
        • Cache miss ratio rising
        • Growing disk write queue / XDCR queue
        • Compaction not keeping up
      • What to do:
        • Revise sizing calculations and add more nodes if needed
      • Most applications don't need to scale the number of nodes based upon normal workload variation.
  26. As your dataset grows…
      • Effects on scale/sizing:
        • Your RAM needs will grow:
          • Metadata needs increase with item count
          • Is your working set increasing?
        • Your disk space will likely grow (duh?)
      • Indications:
        • Dropping resident ratio
        • Rising ejections/cache miss ratio
      • What to do:
        • Revise sizing calculations, add more nodes
        • Remove un-needed data
      • This is the most common need for scaling and will most likely result in needing more nodes
  27. Rebalancing
      • Yes, a rebalance uses resources, but in a "properly" sized cluster it should not affect performance:
        • Distribution of data and work across all nodes
        • Managed caching layer separates RAM-based performance from IO utilization
        • Rebalance automatically manages the working set in RAM
        • Rebalance automatically throttles itself if needed
        • Can be stopped midway without endangering data or progress
      • Proper sizing includes not maxing out all resources: leave some headroom in preparation
  28. Couchbase 4.0
  29. Sizing Couchbase Server 4.0
      • Multi-Dimensional Scalability (MDS): optionally scale each service independently:
        • Data
        • Index
        • Query
      • 5 factors still apply:
        • RAM
        • Disk
        • CPU
        • Network
        • Data Safety/Distribution
  30. Sizing Couchbase Server 4.0: Data
      • Data Service in 4.0 is the same as in previous Couchbase Server versions:
        • Enough RAM to cache reads
        • Enough disk to eventually persist writes
        • CPU primarily for Views and XDCR
        • At least 3 nodes; replication at the bucket level
      • Minimum requirements: 4GB RAM, 8-core CPU
  31. Sizing Couchbase Server 4.0: Index
      • Index Service is new in 4.0 (a.k.a. GSI or "Secondary Indexes"):
        • Primarily RAM and disk IO bound
        • ForestDB persistence engine
        • At least 2 nodes for HA, each index replicated individually
      • Minimum requirements: 8GB RAM, 8-core CPU, "fast disk"
      • Note: 4.0 is still in beta; final sizing numbers are being formulated
  32. Sizing Couchbase Server 4.0: Query
      • Query Service is new in 4.0 (a.k.a. N1QL):
        • Primarily CPU bound
        • Optimized for multi-core systems
        • Very low RAM and disk requirements
        • At least 2 nodes for HA; queries are automatically load balanced
      • Minimum requirements: 4GB RAM, 16+ core CPU
      • Note: 4.0 is still in beta; final sizing numbers are being formulated
  33. Sizing Couchbase Server 4.0: MDS
      • Multi-Dimensional Scalability (MDS)
      • Option 1: all 3 services enabled on all nodes. Size for aggregate requirements (Data + Index + Query).
      • Option 2: separated services. Size nodes independently for different workloads, e.g.:
        • Data Service: more nodes with more RAM, less disk, less CPU
        • Index Service: fewer nodes with more RAM, more disk, less CPU
        • Query Service: fewer nodes with less RAM, less disk, more CPU
  34. Sizing Couchbase Server 4.0: MDS, Independent Load Distribution
      • Modular architecture to construct the database for your need
        • Pick HW capacity: scale up and/or scale out
        • Pick services layout: overlap and/or isolate services
        • Pick data/index partitioning
      [Diagram: a Couchbase cluster of node1 through node8 running the Index, Query, and Data services.]
  35. Sizing is tricky business…
      • Work with the Couchbase team
      • Validate your "on-paper" numbers with testing
      • Constantly monitor production
  36. Dive in…
      • Gather your workload and dataset requirements:
        • Item counts and sizes, read/write/delete ratios
      • Review our documentation and formulas
      • Test, deploy, monitor… rinse and repeat
  37. Want more?
      Lots of details and best practices in our documentation: http://www.couchbase.com/docs/
      And my sizing blog: http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster
  38. Get started with Couchbase Server 4.0: www.couchbase.com/beta
      Get trained on Couchbase: training.couchbase.com
      Thank you. perry@couchbase.com | @couchbase