CouchConf Israel Couchbase in Production 24x7

Published in: Technology
  • In this session I am going to talk about the production lifecycle of using Couchbase, from setup and test to deployment and maintenance. I will try to demo as much as time permits, as this is a lot about practice.
  • This is based on real-world experience from customers like Zynga, Quepasa, TribalCrossing and many others. We worked through a model of how users use Couchbase Server throughout their product life cycle.
  • Slow down on the demo and describe each section of the setup. Prepare 2 nodes on EC2. Uninstall from one and have it ready for installation. Install, go to the UI, and show the registration and installation process. Add it to another cluster. Rebalance, so that a 2-node cluster is ready. In parallel, have a 5-node cluster ready with data.
  • Memcached is a widely used protocol, used on 18 of the top 20 websites in the world. Most developers who have faced scale or performance issues have dealt with it, or with some other flavor of caching layer. Queries go directly through the CouchDB API. Both are available through the SDKs.
  • Prepare some JSON objects to be inserted, then create a view. Map function: function (doc) { if (doc.jsonType == "player") { emit(["Level", doc.level], doc.name); } }
  • Each one of these can determine the number of nodes: data sets, workload, etc.
  • Calculate for both the active data and the number of replicas. Replicas will be the first to be dropped out of RAM if there is not enough memory.
  • This is a cluster-wide quota; it needs to be distributed across the cluster.
  • Replication is needed only for writes/updates. Gets are not replicated.
  • The more nodes you have, the less impact the failure of one node has on the remaining nodes in the cluster.
  • Recapping: any one of these factors could affect the number of nodes needed, and the operations person needs to understand the long pole of a deployment and size the cluster accordingly.
  • Stats. Stats timing. Demo 3: Stats; check how we are on time. Load: -h localhost -i1000000 -M1000 -m9900 -t8 -K sharon -c500000 -l
  • An example of a weekly view of an application in production; the oscillation in the disk write queue load is clearly visible. It was about a 13-node cluster at the time (it has grown since then), with ops/sec varying from 1K at the low point to 65K at peak, running on EC2. We can easily see the traffic patterns in the disk write queue and, regardless of the load, the application sees the same deterministic throughput.
  • So the goal of monitoring is to help assess how much of the cluster's capacity is in use, which drives the decision of when to grow.
  • Do not failover a healthy node!
  • To summarize, we went over the initial setup and development, deploying and sizing your cluster, and the continuous loop of maintaining, monitoring and growing your cluster.

    1. Couchbase Server 2.0 in Production
    2. Couchbase Server 2.0: Overview = Simple. Fast. Elastic.
        • Membase + CouchDB
        • Managed memory caching layer = super high performance
        • Clustering and online data redistribution (Rebalancing)
        • Indexing and Querying via JSON Map-Reduce
        • New SDKs and client libraries
        • Developer Preview available at: http://www.couchbase.com/downloads
    3. Let’s build a social game… www.facebook.com/farm_town_wars
        (Diagram: Load Balancer, Web Servers, Couchbase Servers)
    4. COUCHBASE SERVER 2.0 IN PRODUCTION: DEV/OPS LIFECYCLE
        (Diagram: Client setup, Initial Setup, Test, Deploy, View Development, Sizing, Monitor, Grow, Maintain, Upgrade, Backup/Restore, Failures)
    5. Couchbase Server 2.0 in production: Initial Setup
        Extremely easy to get up and running:
        • RPM/deb/OSX/exe installation
        • Simple Web UI and setup wizard
    6. Data and Indexes
        • Data goes “in and out” via the Memcached protocol
        • Queries/indexes are created and accessed via an HTTP protocol
        • Both are available separately; Couchbase-provided SDKs will expose a single API to the developer and abstract the actual traffic
    7. Data and Indexes
        Example JSON document:
            {
              "_id": "Keith4540",
              "jsonType": "player",
              "name": "Keith4540",
              "level": 4,
              …
            }
            …
        Example view:
            function (doc) {
              if (doc.jsonType == "player") {
                emit(["Level", doc.level], doc._id);
              }
            }
        Document + view = secondary index of players, by level (Demo)
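
To make the "secondary index of players, by level" concrete, here is a minimal sketch that runs the slide's map function outside the server. Everything except the map body is an assumption for illustration: the extra sample documents, the `buildIndex` helper, passing `emit` as a parameter, and the simple numeric sort, which stands in for the server's real key ordering.

```javascript
// Sample documents. Only the first mirrors the slide; the others are made up.
const docs = [
  { _id: "Keith4540", jsonType: "player", name: "Keith4540", level: 4 },
  { _id: "Dana0012", jsonType: "player", name: "Dana0012", level: 2 },
  { _id: "farm_77", jsonType: "farm", owner: "Keith4540" },
];

// The map function from the slide. The only change is that emit is passed in
// as a parameter so the function can run in plain JavaScript.
function map(doc, emit) {
  if (doc.jsonType == "player") {
    emit(["Level", doc.level], doc._id);
  }
}

// Hypothetical helper: collect every emitted row, then order rows by level.
function buildIndex(documents) {
  const rows = [];
  for (const doc of documents) {
    map(doc, (key, value) => rows.push({ key, value }));
  }
  // Simplified ordering: compare only the numeric level in key[1].
  rows.sort((a, b) => a.key[1] - b.key[1]);
  return rows;
}

const index = buildIndex(docs);
console.log(index);
```

The non-player document emits nothing, so the index holds one row per player, ordered by level.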
    8. Data and Indexes
        • Indexes/views are based on incremental map-reduce:
            - Indexes are updated with incremental changes (not batch)
        • View processing is done per-vbucket:
            - Parallel processing on subset of data
            - Couchbase provides scatter-gather aggregation
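
The two ideas on this slide, incremental updates and per-vbucket processing with scatter-gather, can be sketched with a toy model. This is not how the server is implemented: the four partitions, the `partitionOf` hash, and the `upsert` and `query` helpers are all hypothetical names chosen for illustration.

```javascript
// Toy model: 4 partitions stand in for vbuckets.
const NUM_PARTITIONS = 4;

// Hypothetical hash for illustration only; it is not Couchbase's key hashing.
function partitionOf(id) {
  let sum = 0;
  for (const ch of id) sum += ch.charCodeAt(0);
  return sum % NUM_PARTITIONS;
}

// Map step: a player document emits one row keyed by level.
function mapPlayer(doc) {
  return doc.jsonType === "player" ? [{ key: doc.level, value: doc._id }] : [];
}

// One small index per partition: document id -> rows emitted by that document.
const partitions = Array.from({ length: NUM_PARTITIONS }, () => new Map());

// Incremental update: only the changed document's rows are recomputed,
// and only in the partition that owns the document.
function upsert(doc) {
  partitions[partitionOf(doc._id)].set(doc._id, mapPlayer(doc));
}

// Scatter-gather: collect rows from every partition, then merge by key.
function query() {
  const rows = [];
  for (const partition of partitions) {
    for (const docRows of partition.values()) rows.push(...docRows);
  }
  return rows.sort((a, b) => a.key - b.key);
}

upsert({ _id: "Keith4540", jsonType: "player", level: 4 });
upsert({ _id: "Dana0012", jsonType: "player", level: 2 });
const before = query().map((row) => row.value);

// Dana levels up: one document changes, nothing else is rescanned.
upsert({ _id: "Dana0012", jsonType: "player", level: 7 });
const after = query().map((row) => row.value);

console.log(before, after);
```

The design point is that an update touches a single partition, while a query pays the cost of merging results from all of them.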
    9. Data and Indexes
        • Views are “developed” off of a random (or specific) sampling of the overall dataset and then deployed
            - Faster
            - Less load on system
        • Updated views can be applied without rescanning entire dataset
    10. Couchbase Client SDKs
        (Diagram labels: User Code, Java client API, Spy memcached connection, HTTP couchDB connection, Couchbase Server. SDKs shown: Java Client SDK, .Net SDK, PHP SDK, Ruby SDK, Python SDK.)
        Java client example:
            CouchbaseClient cb = new CouchbaseClient(listURIs, "aBucket", "letmein");
            // this is all the same as before
            cb.set("hello", 0, "world");
            cb.get("hello");
            Map<String, Object> manyThings = cb.getBulk(Collection<String> keys);
            /* accessing a view */
            View view = cb.getView("design_document", "my_view");
            Query query = new Query();
            query.getRange("abegin", "theend");
    11. COUCHBASE SERVER 2.0 IN PRODUCTION: DEV/OPS LIFECYCLE
        (Diagram: Client setup, Deploy, Sizing)
    12. Couchbase Server 2.0 in production: Sizing
        Sizing question: How many nodes do I need?
        Considerations:
        • RAM
        • Disk
        • Network
        • Data Safety
    13. Sizing
        Sizing question: How many nodes do I need?
        Considerations: RAM
        • Metadata Active+Replica
        • Working set
        • Buffer/overhead
    14. Sizing
        500,000 documents to begin with:
        - 20 bytes in length
        - average document size of 2k
        - metadata of about 150 bytes per key
        = ~1 GB to store active data, an extra 1 GB to store replica data
        Adding in some headroom: give 3 GB RAM to Couchbase to start and grow with
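
The arithmetic behind this slide can be checked with a short sketch. It assumes that "20 bytes in length" refers to the key, that "2k" means 2,048 bytes, and that there is exactly one replica copy; the variable names are made up for illustration.

```javascript
// Figures from the slide. The reading of each figure is an assumption.
const docCount = 500000;
const keyBytes = 20;         // assumed: "20 bytes in length" is the key length
const valueBytes = 2 * 1024; // "average document size of 2k"
const metadataBytes = 150;   // "about 150 bytes per key"
const replicaCopies = 1;     // assumed: one replica, as in "an extra 1 GB"

const GIB = 1024 ** 3;

// RAM needed to hold every document, its key and its metadata in memory.
const perDocBytes = keyBytes + valueBytes + metadataBytes; // 2,218 bytes
const activeBytes = docCount * perDocBytes;                // 1,109,000,000 bytes
const totalBytes = activeBytes * (1 + replicaCopies);      // 2,218,000,000 bytes

// The slide's suggested starting quota.
const quotaBytes = 3 * GIB;
const headroomBytes = quotaBytes - totalBytes;

console.log(`active:   ${(activeBytes / GIB).toFixed(2)} GiB`);
console.log(`total:    ${(totalBytes / GIB).toFixed(2)} GiB`);
console.log(`headroom: ${(headroomBytes / GIB).toFixed(2)} GiB`);
```

Active data comes to roughly 1 GB and active plus replica to roughly 2 GB, which is why a 3 GB quota leaves room to grow.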
    15. Sizing
        Sizing question: How many nodes do I need?
        Considerations: Disk
        • Sustained write rate
        • Index generation (space and IO)
        • Append-only B-Tree
        • Compaction
        • Rebalance capacity
    16. Sizing
        Sizing question: How many nodes do I need?
        Considerations: Network
        • Client traffic
        • Replication
        • Rebalancing
    17. Sizing
        Sizing question: How many nodes do I need?
        Considerations: Data Safety
        • 1 node = single point of failure (bad)
        • 2 nodes = 1 replica copy (better)
        • 3+ nodes = 1 replica copy AND data/load distribution
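
Why more nodes soften the blow of a failure can be shown with simple arithmetic. The sketch below assumes active data and load are spread evenly across nodes and that the failed node's share is divided evenly among the survivors; `loadIncreaseAfterOneFailure` is a hypothetical helper, not a Couchbase API.

```javascript
// Fractional increase in load on each surviving node after one node fails.
// Before the failure each node carries 1/n of the load; afterwards each of
// the n - 1 survivors carries 1/(n - 1), an increase of 1/(n - 1).
function loadIncreaseAfterOneFailure(nodes) {
  if (!Number.isInteger(nodes) || nodes < 2) {
    throw new RangeError("need at least 2 nodes to survive a failure");
  }
  return 1 / (nodes - 1);
}

for (const n of [2, 3, 5, 13]) {
  const pct = (loadIncreaseAfterOneFailure(n) * 100).toFixed(1);
  console.log(`${n} nodes: +${pct}% load per surviving node`);
}
```

Under these assumptions a 2-node cluster doubles the load on the survivor, while a larger cluster absorbs the same failure with a much smaller bump per node.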
    18. Sizing
        Sizing question: How many nodes do I need?
        Considerations:
        • RAM
        • Disk
        • Network
        • Data Safety
    19. Client-side Deployment
        (Diagram: two ways for the Farm Town Wars app code on an application server to reach Couchbase Server, joined by "OR")
        • Couchbase client-side (“smart”) library
        • Couchbase Moxi (Couchbase proxy)
        • Diagram labels: Application server, Farm Town Wars, App Code, Couchbase PHP Client Library, Couchbase Java Client library, ports 11210 and 8092 (Query API), Couchbase Server
    20. COUCHBASE SERVER 2.0 IN PRODUCTION: DEV/OPS LIFECYCLE
        (Diagram: Monitor, Grow, Maintain, Upgrade, Backup/Restore, Failures)
    21. Monitoring
        IMMENSE amount of information available:
        • Real-time traffic graphs
        • REST API accessible
        • Per bucket, per node and aggregate statistics
        • Application and inter-node traffic
        • RAM <-> Disk
        • Inter-system timing
    22. (Slide contains no transcribed text)
    23. Growth
        Going from 1 hundred users to 2 million users…
        • RAM usage is growing:
            - Ejecting data to and fetching data from disk
            - Resident item ratios decreasing (might impact failover)
            - Cache miss ratio increases
        • Disk write queue grows higher than usual
        Need to add a few more nodes...
        Now we have more RAM and more disk throughput without any downtime (Demo)
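
The symptoms listed on this slide can be turned into a simple check that flags when a cluster may need more nodes. This is only a sketch: the field names on the stats object and every threshold value are hypothetical and would need to be tuned against a real workload.

```javascript
// Hypothetical thresholds; the right values depend on the application.
const defaultLimits = {
  minResidentRatio: 0.8,      // share of items held in RAM
  maxCacheMissRatio: 0.05,    // share of reads that go to disk
  maxDiskWriteQueue: 1000000, // items waiting to be persisted
};

// Returns the list of growth signals present in one stats snapshot.
function growthSignals(stats, limits = defaultLimits) {
  const signals = [];
  if (stats.residentRatio < limits.minResidentRatio) {
    signals.push("resident item ratio is low");
  }
  if (stats.cacheMissRatio > limits.maxCacheMissRatio) {
    signals.push("cache miss ratio is high");
  }
  if (stats.diskWriteQueue > limits.maxDiskWriteQueue) {
    signals.push("disk write queue is higher than usual");
  }
  return signals;
}

// Made-up snapshots for illustration.
const healthy = { residentRatio: 0.97, cacheMissRatio: 0.01, diskWriteQueue: 20000 };
const strained = { residentRatio: 0.55, cacheMissRatio: 0.12, diskWriteQueue: 4000000 };

console.log(growthSignals(healthy));
console.log(growthSignals(strained));
```

A snapshot that raises several signals at once is a stronger hint to add nodes than any single metric on its own.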
    24. General Maintenance
        • Persistence is using CouchDB technology:
            - Append-only B-tree
            - Reliability and data integrity
            - Constantly growing on-disk files
        • Compaction is the answer:
            - Automatic
            - “Fragmentation” data
            - Scheduled compaction
            - Partial compaction via per-vbucket database (saves disk space and time)
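
Why an append-only file keeps growing, and what compaction recovers, can be illustrated with a toy model. The `AppendOnlyFile` class and the 30% trigger below are hypothetical; the sketch tracks sizes only and does not model the real on-disk B-tree.

```javascript
// Toy append-only store: every write appends, so older versions of a key
// stay in the file as stale bytes until compaction.
class AppendOnlyFile {
  constructor() {
    this.fileBytes = 0;    // total bytes appended so far
    this.live = new Map(); // key -> size of the latest version
  }

  write(key, sizeBytes) {
    this.fileBytes += sizeBytes;
    this.live.set(key, sizeBytes);
  }

  liveBytes() {
    let total = 0;
    for (const size of this.live.values()) total += size;
    return total;
  }

  // Share of the file occupied by stale versions.
  fragmentation() {
    return this.fileBytes === 0 ? 0 : 1 - this.liveBytes() / this.fileBytes;
  }

  // Rewrite the file keeping only the latest version of each key.
  compact() {
    this.fileBytes = this.liveBytes();
  }
}

const file = new AppendOnlyFile();
file.write("a", 100);
file.write("b", 100);
file.write("a", 100); // supersedes the first version of "a"
file.write("a", 100); // supersedes it again

const fragmentationBefore = file.fragmentation();
const COMPACTION_THRESHOLD = 0.3; // hypothetical trigger
if (fragmentationBefore > COMPACTION_THRESHOLD) file.compact();

console.log(fragmentationBefore, file.fileBytes, file.fragmentation());
```

In this run half of the file is stale before compaction, and compaction shrinks the file to the size of the live data.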
    25. General Maintenance
        • Backup/Restore:
            - Backup is as simple as a file-level copy (thanks CouchDB!)
            - Server will automatically “warmup” from disk files upon reboot
        • Upgrade:
            - Add nodes of new version, rebalance…
            - Remove nodes of old version, rebalance…
            - Done!
            - No disruption
            - Upgrade from existing Membase 1.7 installations to Couchbase 2.0
    26. Failures
        • Failures happen:
            - Hardware
            - Network
            - Bugs
        • Failover to replica data for immediate access
        • Remove and rebalance “malfunctioning” node (Demo)
    27. COUCHBASE SERVER 2.0 IN PRODUCTION: DEV/OPS LIFECYCLE
        (Diagram: Client setup, Initial Setup, Test, Deploy, View Development, Sizing, Monitor, Grow, Maintain, Upgrade, Backup/Restore, Failures)
    28. THANK YOU! Q&A
