CouchConf London: Couchbase Server in Production
 

  • In this session, we’re shifting gears from development to production. I’m going to talk about how to operate Couchbase in production – how to care for and feed the system to maintain application uptime and performance. I will try to demo as much as time permits, as this is largely about practice.
  • The typical Couchbase production environment: many users of a web application, served by a load-balanced tier of web/application servers, backed by a cluster of Couchbase Servers. Couchbase provides the real-time/transactional data store for the application data.
  • Ultimately, what matters is the interaction between the application and the database. The database must allow data to be randomly read by the application with low latency and high throughput. It must also accept writes from the application, replicate the data for safety and durably store the data as quickly as possible.
  • And it must continue those things across the application lifecycle. Not only when the application is in “steady state” but when adding and removing capacity. When a node fails. When nodes are in the process of maintenance. Sizing the cluster properly is not just about ensuring things work when everything is “steady” but also about ensuring that things work when things aren’t “steady.”
  • Before getting into the detailed recommendations and considerations for operating Couchbase across the application lifecycle, we’ll cover a few key concepts and describe the “high level” considerations for successfully operating Couchbase in production.
  • Operating Couchbase ultimately boils down to ensuring applications can read and write data – safely, without interruption and with high performance. We’ll focus first on reading, and then on writing. We’ll save the arithmetic for the “sizing” section : )
  • When applications read data from Couchbase, they request a given record (document) – we’ll focus here on simple (non-query) reads and writes of individual documents. The flow inside Couchbase is pretty simple: if the document is cached in RAM, it can be returned to the application almost instantly, with 99th-percentile latency not exceeding one millisecond. If the document is NOT present in cache, then Couchbase must ask the storage subsystem to read the data from disk back into cache. This obviously requires more time than simply returning the document from memory; how much time depends on contention for the disk (both readers and writers could be in line for the IO cycles). Now, these “cache misses” are fine from time to time. If you have a user who has not logged in for a while and is just logging in, then fetching their application data from disk is fine – they’ll stay logged in for a while, and all subsequent accesses will be to and from cache. (A minimal Java read sketch appears at the end of these notes.)
  • In order to ensure predictable, low-latency random access to data, your application’s working set should be in RAM, so that reading from disk is an infrequent occurrence.
  • Different applications – and even different stages of the same application’s lifecycle – require different ratios of data in RAM to data only on disk (i.e., the working-set-to-total-set ratio varies by application). We have three examples of very different working-set-to-total-dataset ratios.
  • Now on to writing. Whereas with reading we were trying to AVOID hitting the disk as much as practical, here hitting the disk (and, as you’ll see, the network) is IMPORTANT. We actually want to hit them as fast as possible, to get data replicated to other nodes for high availability… and to disk for disaster recovery. So let’s look at a write inside Couchbase. When the application writes a document to Couchbase, it is stored in RAM (cached). If RAM is full, data is ejected from memory (though, of course, it still remains on disk) to make room for the new item. Couchbase then schedules the document to be replicated to other servers in the cluster, to allow the loss of this server without interrupting data availability. It also schedules the write of the document to disk.
  • So you can see here that the application tier is busily writing data to Couchbase. Couchbase is then busily replicating those documents and storing them to disk.
  • But what if documents are being written faster than the network and disk can keep up? Here the aggregate arrival rate of data to the Couchbase Server exceeds the ability of Couchbase to replicate data to other servers over the network and to flush data to disk. As a result, a backlog (queue) builds. These backlogs are no problem as long as they don’t grow without bound: they allow application write spikes to be absorbed without impacting application-perceived performance. Couchbase effectively “decouples” application performance from the unpredictable performance of back-end disk and network IO. But if the queues grow out of control, you’ll fill memory and have no room left for caching your data.
  • The solution for scaling writes is to add more servers to the Couchbase cluster, ensuring AGGREGATE back-end IO performance matches the AGGREGATE front-end data rate (or at least allows absorption of the maximum write spike you expect). If the queues build up too far and Couchbase can’t drain them fast enough, Couchbase will eventually tell your application to “slow down” because it needs time to ingest the spike. As we’ll discuss in the sizing section, ensuring aggregate back-end disk IO is available and sizing RAM to match the working set size are the two primary requirements for getting your cluster correctly configured. Likewise, monitoring will primarily focus on ensuring you’ve done that job correctly and don’t need to make adjustments. (A sketch that polls the queue statistics from the Java client appears at the end of these notes.)
  • Memcached: a widely used protocol, used on 18 of the top 20 websites in the world. Most developers who have faced scale or performance issues have dealt with it, or with some other flavor of caching layer.
  • Calculate for both active data and the number of replicas. Replicas will be the first to be dropped out of RAM if there is not enough memory. (A back-of-the-envelope sizing calculation appears at the end of these notes.)
  • Each one of these can determine the number of nodes: data sets, workload, etc.
  • This is a cluster-wide quota; it needs to be distributed across the cluster.
  • Replication is needed only for writes/updates. Gets are not replicated.
  • The more nodes you have, the less impact the failure of one node has on the remaining nodes in the cluster. 1 node is a single point of failure – obviously bad. 2 nodes gives you replication, which is better, but if one node goes down the whole load falls on just one node and you’re back at a single point of failure. 3 nodes is the minimum recommendation, because a failure of one distributes the load over two. The more nodes the better, as recovering from a single node failure is easier with more nodes in the cluster.
  • Stats; stats timing. Demo 3 – stats – check how we are on time. Load: -h localhost -i1000000 -M1000 -m9900 -t8 -K sharon -c500000 -l
  • An example of a weekly view of an application in production; you can clearly see the oscillation in the disk write queue load. It was about a 13-node cluster at the time (it has grown since), with ops/sec varying from 1K at quiet times to 65K at peak, running on EC2. We can easily see the traffic patterns in the disk write queue, and regardless of the load, the application sees the same deterministic latency.
  • So the monitoring goal is to help assess cluster capacity usage, which drives the decision of when to grow.
  • In the demo, you just saw blah blah blah. Under the hood, blah blah. Earlier today, you saw us walk through a fuller series of steps involving updating clients. Wanted to show you here that the progress bar in the UI was showing this activity going on.
  • Worthwhile to say that during warmup, data is not available from the node… Unlike a traditional RDBMS… This can be handled at the application level with “move on”, “retry”, “log” or “blow up”… some data is unavailable, not all. (A small retry sketch appears at the end of these notes.)
  • Talk about the Amazon “disaster” in December: Amazon told almost all of our customers that almost all of their nodes would be restarted. We advised them to proactively rebalance in a whole cluster of new nodes and rebalance out the old ones, preventing any disruption when the restarts actually happened.
  • Not unique to Couchbase… MySQL suffers as well, for example.
  • Do not fail over a healthy node!
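
A few sketches follow to make the notes above concrete. First, the read path from the application side, using the Java client shown later in the deck. A minimal sketch, assuming a single local node and a bucket named "default" (both placeholders):

    import java.net.URI;
    import java.util.Arrays;
    import java.util.List;

    import com.couchbase.client.CouchbaseClient;

    public class ReadPathExample {
        public static void main(String[] args) throws Exception {
            // Placeholder node address and bucket; adjust for your cluster.
            List<URI> nodes = Arrays.asList(URI.create("http://127.0.0.1:8091/pools"));
            CouchbaseClient client = new CouchbaseClient(nodes, "default", "");

            // A get() is answered straight from RAM when the document is resident;
            // on a cache miss the server must first fetch it from disk, which is
            // the slow path these notes warn about.
            Object doc = client.get("user::1234");
            if (doc == null) {
                // Key does not exist -- distinct from a slow disk fetch,
                // which still returns the document, just later.
            }

            client.shutdown();
        }
    }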
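Next, the replication and disk write queues. To watch those backlogs from code rather than the admin UI, the client's per-node statistics can be polled. A sketch, assuming the engine exposes the disk write queue length under the stat name "ep_queue_size" on your server version:

    import java.net.SocketAddress;
    import java.net.URI;
    import java.util.Arrays;
    import java.util.Map;

    import com.couchbase.client.CouchbaseClient;

    public class QueueWatcher {
        public static void main(String[] args) throws Exception {
            CouchbaseClient client = new CouchbaseClient(
                    Arrays.asList(URI.create("http://127.0.0.1:8091/pools")),
                    "default", "");

            // getStats() returns the raw engine statistics for every node.
            // "ep_queue_size" (items waiting to be persisted to disk) is the
            // assumed stat name; verify it against your server version.
            Map<SocketAddress, Map<String, String>> stats = client.getStats();
            for (Map.Entry<SocketAddress, Map<String, String>> node : stats.entrySet()) {
                System.out.println(node.getKey() + " disk write queue: "
                        + node.getValue().get("ep_queue_size"));
            }

            client.shutdown();
        }
    }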
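The "calculate for both active and replicas" guidance lends itself to a back-of-the-envelope calculation. A sketch where every number is an illustrative assumption to be replaced with your own measurements:

    public class RamSizingSketch {
        public static void main(String[] args) {
            // Every figure below is illustrative only.
            long workingSetItems = 10000000L; // items that must stay resident in RAM
            long avgDocBytes     = 2048;      // average document size
            long perItemOverhead = 150;       // assumed key + metadata overhead per item
            int  replicaCopies   = 1;         // replica copies kept alongside active data
            double headroom      = 0.30;      // buffer for overhead and growth

            // Active plus replica copies, each carrying value and metadata,
            // plus headroom -- the "active and replicas" note in arithmetic form.
            double totalBytes = workingSetItems
                    * (avgDocBytes + perItemOverhead)
                    * (1 + replicaCopies)
                    * (1 + headroom);

            double perNodeQuotaBytes = 12.0 * 1024 * 1024 * 1024; // 12 GiB quota per node
            long nodes = (long) Math.ceil(totalBytes / perNodeQuotaBytes);

            System.out.printf("Need ~%.1f GiB of cluster RAM quota => at least %d nodes%n",
                    totalBytes / (1024.0 * 1024 * 1024), nodes);
        }
    }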
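Finally, the "move on / retry / log / blow up" choices during warmup can be expressed as a small application-level policy. A sketch of the "retry" flavor (the attempt count and back-off values are arbitrary):

    import java.net.URI;
    import java.util.Arrays;

    import com.couchbase.client.CouchbaseClient;

    public class WarmupRetrySketch {
        public static void main(String[] args) throws Exception {
            CouchbaseClient client = new CouchbaseClient(
                    Arrays.asList(URI.create("http://127.0.0.1:8091/pools")),
                    "default", "");

            Object doc = null;
            for (int attempt = 1; attempt <= 3 && doc == null; attempt++) {
                try {
                    doc = client.get("user::1234");
                } catch (RuntimeException e) {
                    // The node may still be warming up or be temporarily unreachable:
                    // log it and back off before retrying ("retry"); alternatives are
                    // returning a default ("move on") or rethrowing ("blow up").
                    System.err.println("attempt " + attempt + " failed: " + e.getMessage());
                    Thread.sleep(100L * attempt);
                }
            }

            client.shutdown();
        }
    }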

CouchConf London: Couchbase Server in Production Presentation Transcript

  • 1. Couchbase Server in Production – Perry Krug, Sr. Solutions Architect
  • 2. Typical Couchbase production environment (diagram: application users, load balancer, application servers, Couchbase Servers)
  • 3. We’ll focus on the app–Couchbase interaction … (same diagram)
  • 4. … at each step of the application lifecycle: Dev/Test, Size, Deploy, Monitor, Manage
  • 5. KEY CONCEPTS
  • 6. Reading, Writing and Arithmetic (diagram – reading data: “Give me document A” / “Here is document A”; writing data: “Please store document A” / “OK, I stored document A”). (We’ll save the arithmetic for the sizing section : )
  • 7. Reading data: “Give me document A” / “Here is document A”. If document A is in memory, return document A to the application. Else, add the document to the read queue; the reader eventually loads the document from disk into memory, then returns document A to the application.
  • 8. Keeping the working data set in RAM is key to read performance. Your application’s working set should fit in RAM… or else! (Because you don’t want the “else” part happening very often – it is MUCH slower than a memory read, and you could have to wait in line an indeterminate amount of time for the read to happen.)
  • 9. The working set ratio depends on your application: working/total set = .01 – late-stage social game (many users no longer active; few logged in at any given time); working/total set = .33 – business application (users logged in during the day; the day moves around the globe); working/total set = 1 – ad network (any cookie can show up at any time).
  • 10. Couchbase in operation: writing data. “Store document A” / “OK, it is stored”. If there is room for the document in RAM, store the document in RAM. Else, eject other document(s) from RAM and store the document in RAM. Then add the document to the replication queue (the replicator eventually transmits the document) and add the document to the write queue (the writer eventually writes the document to disk).
  • 11. Flow of data when writing (diagram): applications writing to Couchbase Server; Couchbase transmitting replicas over the network; Couchbase writing to disk.
  • 12. Queues build if the aggregate arrival rate exceeds the drain rates (diagram: replication queue and disk write queue on each server).
  • 13. Scaling out permits matching of aggregate flow rates so queues do not grow.
  • 14. DEVELOPMENT (Dev-Test, Size, Deploy, Monitor, Manage)
  • 15. Couchbase Data: Couchbase uses (and is completely compatible with) the memcached protocol. While you can use any standard memcached library, Couchbase also provides its own libraries for a variety of languages. Couchbase is document-oriented. See http://www.couchbase.com/develop
  • 16. Couchbase SDKs: Java SDK, .NET SDK, PHP SDK, Ruby SDK, and many more (http://www.couchbase.com/develop). User code sits on the Couchbase Java library (spymemcached), which talks to Couchbase Server. Java client API example: CouchbaseClient cb = new CouchbaseClient(listURIs, "aBucket", "letmein"); cb.set("hello", 0, "world"); cb.get("hello"); // this is all the same as before. (A compilable version of this snippet appears at the end of the transcript.)
  • 17. Couchbase Client Deployment (diagram): the application server runs either a client-side “smart” library (e.g., the Couchbase Java client) or a standard memcached client plus Moxi (the Couchbase proxy), talking to the Couchbase Servers on ports 11210 and 8091.
  • 18. SERVER AND CLUSTER SIZING (TIME FOR THE ARITHMETIC)
  • 19. Size: Couchbase Server sizing == performance. Serve reads out of RAM; enough I/O for writes; mitigate inevitable failures.
  • 20. How many nodes? 4 key factors determine the number of nodes needed: 1) RAM, 2) Disk, 3) Network, 4) Data Distribution/Safety.
  • 21. RAM sizing: keep the working set in RAM for best read performance. Account for: working set, metadata, buffer/overhead, active + replica(s).
  • 22. Disk sizing – space and I/O. I/O: sustained write rate, rebalance capacity. Space: backups, total dataset, active + replicas.
  • 23. Network sizing: client traffic (reads + writes), replication (multiplies writes), rebalancing.
  • 24. Data Distribution: servers fail, be prepared. The more nodes, the less impact a failure will have. Assuming one replica: 1 node = BAD; 2 nodes = …better…; 3+ nodes = BEST! Note: many applications will need more than 3 nodes.
  • 25. Data Distribution (diagram: app servers with the Couchbase client library and cluster map reading/writing/updating; active and replica documents distributed across the three servers of the Couchbase Server cluster).
  • 26. How many nodes? (recap) 4 key factors determine the number of nodes needed: 1) RAM, 2) Disk, 3) Network, 4) Data Distribution.
  • 27. MONITORING
  • 28. Key resources: RAM, disk, network (diagram: application servers and Couchbase Servers connected over the network, each server with RAM and disk).
  • 29. Monitoring: once in production, the heart of operations is monitoring – RAM usage; disk write queues / read activity; network bandwidth, replication queues; data distribution (balance, replicas).
  • 30. Monitoring: an IMMENSE amount of information is available – real-time traffic graphs; REST API accessible; per-bucket, per-node and aggregate statistics; application and inter-node traffic; RAM <-> disk; inter-system timing.
  • 31. How do you know when your working set is not in RAM? (Same read-path flow as slide 7.) Watch the cache miss ratio.
  • 32. How do you know when you don’t have enough disk I/O? Watch the disk write queue.
  • 33. How do you know when you don’t have enough network I/O? Watch the TAP replication queue.
  • 34. (no slide text)
  • 35. MANAGEMENT AND MAINTENANCE
  • 36. Growth: going from 5 million to 100 million users… RAM usage is growing (cache misses increasing, resident item ratios decreasing, disk fetches increasing); the disk write queue is growing higher than usual. Need to add a few more nodes… more RAM, disk and network, without any downtime.
  • 37. Add Nodes (diagram: two new servers join the cluster map; active and replica documents are redistributed across all five servers).
  • 38. Backup: as simple as running a packaged script (cbbackup). Done on a live system with minimal to no performance impact.
  • 39. Restore, option 1: put the backup files back in place and the server will automatically “warm up” from the disk files upon restart. Traditional RDBMS performance is considered acceptable while the cache slowly repopulates; our applications demand a different level of performance, so Couchbase Server pre-loads as much as possible into RAM (warmup).
  • 40. Restore, option 2: “cbrestore” is used to restore data files into a live or different cluster.
  • 41. Upgrade: 1. Add nodes of the new version, rebalance… 2. Remove nodes of the old version, rebalance… 3. Done! No disruption. Generally useful for software upgrades, hardware refreshes and planned maintenance – e.g., upgrading existing Membase 1.7 to Couchbase Server 1.8.
  • 42. Disk fragmentation: the current use of SQLite causes performance degradation as DB files get fragmented. “Vacuum” is available (but not as an online operation); best practice is to repeat rebalance to “clean” the disk files. Under development: a “maintenance mode” to allow safely offlining a node to perform vacuuming in place. Couchbase Server 2.0 has much improved behavior.
  • 43. Failures happen! Hardware, network, bugs.
  • 44. Easy to manage failures with Couchbase. Failover (automatic or manual): replica data is promoted for immediate access; replicas are not recreated; do NOT fail over a healthy node.
  • 45. Fail Over (diagram: a failed server’s replica documents on the remaining servers are promoted to active).
  • 46. Easy to maintain Couchbase. Use remove + rebalance on a “malfunctioning” node: protects data distribution and “safety”; replicas are recreated; best to “swap” in a new node to maintain capacity.
  • 47. (lifecycle recap: Dev/Test, Size, Deploy, Monitor, Manage)
  • 48. QUESTIONS?
  • 49. DEMO
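
For reference, here is the slide 16 snippet assembled into a compilable program; "aBucket" and "letmein" are the slide's own placeholder credentials, and the node URI is assumed:

    import java.net.URI;
    import java.util.Arrays;
    import java.util.List;

    import com.couchbase.client.CouchbaseClient;

    public class HelloCouchbase {
        public static void main(String[] args) throws Exception {
            List<URI> listURIs = Arrays.asList(URI.create("http://127.0.0.1:8091/pools"));
            CouchbaseClient cb = new CouchbaseClient(listURIs, "aBucket", "letmein");

            // This is all the same as the plain memcached-style API:
            cb.set("hello", 0, "world").get(); // wait for the write to be acknowledged
            Object value = cb.get("hello");
            System.out.println(value);         // prints "world"

            cb.shutdown();
        }
    }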