CouchConf-SF-Couchbase-Performance-Tuning
Published in: Technology
  1. Tuning Couchbase
     Tim Smith, Engineer
  2. Introduction
     I am:
     ● tim@couchbase.com
     ● http://github.com/couchtim
     ● Support Engineer
     ● Sales Engineer
  3. Introduction
     You are:
     ● Using Membase
     ● Using CouchDB
     ● Using it in production
     ● 100ms response
     ● 2ms response
     ● Smarter than I am
  4. Simple, Fast, Elastic*
     Membase
     ● 5-minute cluster setup
     ● Memcached API
     ● Memcached-fast responses (working set)
     ● Saturate network with minimal CPU
     ● Pleasant admin UI
     ● Rebalance on the fly
  5. Simple, Fast*, Elastic
     CouchDB
     ● Webophilic: JSON, RESTy HTTP, Javascript
     ● Append-only, crash-only, MVCC (hide the fancy bits)
     ● Bidi replication, /_changes
     ● Replication & app-level sharding scale out
     ● Your data. Everywhere.
  6. Simple*, Fast, Elastic
     Couchbase Server 2.0
     ● Auto-sharding, clustered elasticity
     ● Caching, predictable low-latency
     ● Scatter-gather, incremental map/reduce
     ● Rich hands-on-your-data features
  7. API Combo
     Couchbase Server 2.0
     ● Both memcached binary protocol
     ● And CouchDB HTTP protocol
     ● SDK provides consistent interface
     ● Optionally synchronous persistence
  8. Lots to Learn…
     We welcome your discoveries, ingenious solutions and feedback.
     But we’re not starting from scratch!
     Best practices from Membase and CouchDB still apply.
     ● Hardware and system resources
     ● Client and API usage
     ● Data modeling
  9. Hardware
     Membase:
     ● RAM, RAM, RAM!
     ● Fast disk (throughput) helpful for write-intensive applications, and disk-heavy ops (rebalance)
     ● Network bandwidth may become an issue
     ● Adding more nodes can help with all three
  10. Hardware: RAM
     Proper cluster sizing is #1
     ● Couchbase.org wiki: Sizing Guidelines
     ● Main variables include total number of items, size of working set, replicas and per-item overhead
     ● Under-provisioning reduces elasticity
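
The sizing variables on this slide can be combined into a back-of-the-envelope estimate. The sketch below is an illustration only, not the formula from the Sizing Guidelines: the function name, the 120-byte per-item overhead and the 0.7 headroom factor are all assumptions chosen for the example.

```python
def estimate_cluster_ram(num_items, avg_key_bytes, avg_value_bytes,
                         working_set_fraction, replicas,
                         per_item_overhead_bytes=120, headroom=0.7):
    """Rough RAM estimate in bytes. Every default value is an assumption."""
    copies = 1 + replicas
    # Keys and per-item metadata are counted for every copy of every item.
    metadata = num_items * copies * (avg_key_bytes + per_item_overhead_bytes)
    # Only the values in the working set are assumed to stay resident.
    values = num_items * copies * working_set_fraction * avg_value_bytes
    # Divide by the headroom factor so the cluster is not sized to 100% full.
    return (metadata + values) / headroom


if __name__ == "__main__":
    needed = estimate_cluster_ram(50_000_000, 30, 2048, 0.2, 1)
    print(f"roughly {needed / 2**30:.0f} GiB across the cluster")
```

Measure real key sizes, value sizes and working-set fraction before trusting any number this produces.
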
  11. Hardware: Disk
     Fast disk not too important…until it is
     ● Rebalance can move a lot of data around
     ● Especially when disk > RAM
     ● Warm-up time after node restart
     ● Under-provisioning reduces elasticity
     ● More nodes spread out the I/O
     ● SSD, RAID, the usual stuff
  12. Hardware
     CouchDB:
     ● CPU usage can be significant with view updates, replication filters, formatting via /_list and /_show
     ● Fast disk helpful for many applications, and disk-heavy ops (compaction)
     ● Separate data and view on different filesystems to improve I/O
     ● RAM can’t hurt
  13. Hardware: Cloud
     Cloud hosting brings variability
     ● Disk bandwidth can occasionally drop
     ● Even identical instances may perform differently
     ● Large instances more reliable
     ● More instances provide redundancy
     ● Best bang-for-the-buck still an open question
  14. Configuration
     Membase client
     ● Use a Membase-aware smart client (spymemcached for Java, Enyim for C#)
       ● Or, run moxi on the client host
       ● Minimizes network hops, preserves bandwidth
     ● Value compression (often automatic)
  15. Configuration
     CouchDB client
     ● Caching, Etag / If-None-Match
     ● Compression, Accept-Encoding: gzip
     ● Keep-Alive
     ● By the way, Couchbase Single Server has some killer performance increases (coming soon to an Apache CouchDB release)
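
One way to picture the client-side caching and compression points is a small cache that remembers the last ETag per URL. This is a minimal sketch: `EtagCache` and its method names are hypothetical, and sending the request is left to whatever HTTP library you use (connection reuse depends on that library, not on the header alone).

```python
import gzip


class EtagCache:
    """Hypothetical helper: remembers the last ETag and body seen per URL."""

    def __init__(self):
        self._entries = {}  # url -> (etag, decoded body)

    def request_headers(self, url):
        # Ask for gzip and a persistent connection on every request.
        headers = {"Accept-Encoding": "gzip", "Connection": "keep-alive"}
        entry = self._entries.get(url)
        if entry is not None:
            # Make the GET conditional so an unchanged doc costs a 304.
            headers["If-None-Match"] = entry[0]
        return headers

    def store_response(self, url, status, response_headers, raw_body):
        if status == 304:
            # Not modified: serve the body cached earlier.
            return self._entries[url][1]
        body = raw_body
        if response_headers.get("Content-Encoding") == "gzip":
            body = gzip.decompress(raw_body)
        etag = response_headers.get("ETag")
        if etag is not None:
            self._entries[url] = (etag, body)
        return body
```

Call `request_headers(url)` before each GET and pass the status, headers and raw body of the reply to `store_response`.
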
  16. API Usage
     Memcached API
     ● Binary protocol
     ● Multi-get and multi-set
     ● Incr, decr, append, prepend
     ● TTL expiration, get-and-touch
  17. API Usage
     CouchDB API
     ● HEAD vs. GET, ?limit=1
     ● ?startkey, not ?skip
     ● Use built-in reduce functions: _sum, _count, _stats; write views in Erlang
     ● Keep view index size in mind: emit just what you need
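
The "?startkey, not ?skip" advice is usually applied by fetching one row more than the page size and using that extra row as the start of the next page. The helpers below are a sketch of that pattern; the function names are made up for illustration, and the rows are assumed to look like the `rows` array of a CouchDB view response.

```python
import json
from urllib.parse import urlencode


def page_query(page_size, startkey=None, startkey_docid=None):
    """Build the query string for one page of view results."""
    # Ask for one extra row; it tells us where the next page begins.
    params = {"limit": page_size + 1}
    if startkey is not None:
        params["startkey"] = json.dumps(startkey)  # view keys are JSON
    if startkey_docid is not None:
        params["startkey_docid"] = startkey_docid  # plain string, not JSON
    return urlencode(params)


def split_page(rows, page_size):
    """Return (rows for this page, (key, docid) of the next page or None)."""
    page = rows[:page_size]
    if len(rows) > page_size:
        extra = rows[page_size]
        return page, (extra["key"], extra["id"])
    return page, None
```

Unlike `?skip=N`, this does not make the server walk past N rows on every request, so deep pages cost about the same as the first one.
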
  18. API Usage
     CouchDB API
     ● Use ?group_level to aggregate over structured keys
     ● Emit null, and use ?include_docs to get more data (faster view generation)
     ● Emit more data, so ?include_docs isn’t needed (avoid random I/O on query)
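
To see what `?group_level=N` does with array keys, it can help to simulate it locally. The sketch below mimics a `_sum` reduce over keys such as `[year, month, day]`; it is plain Python for illustration, not CouchDB code, and the function name is an assumption. It covers N of 1 or more only.

```python
from collections import defaultdict


def reduce_with_group_level(rows, group_level):
    """Sum values over rows whose keys share the first group_level elements."""
    totals = defaultdict(int)
    for key, value in rows:
        # Truncate the structured key to its first group_level elements.
        totals[tuple(key[:group_level])] += value
    return sorted((list(prefix), total) for prefix, total in totals.items())


if __name__ == "__main__":
    emitted = [([2011, 7, 1], 3), ([2011, 7, 2], 4), ([2011, 8, 1], 5)]
    print(reduce_with_group_level(emitted, 2))  # per-month totals
    print(reduce_with_group_level(emitted, 1))  # per-year totals
```

One view emitting `[year, month, day]` can therefore answer daily, monthly and yearly questions just by changing the query parameter.
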
  19. Modeling: Doc Size
     Bundle related info into one document
     ● Fewer items → less caching overhead
     ● Reduce number of requests clients make
     ● Promotes server-side processing with _show functions
     ● More context available for flexible maps
  20. Modeling: Doc Size
     Break up serial items to separate docs
     ● E.g., comments, events, other “feeds”
     ● Each entry is self-described
     ● Avoids write contention on a container
     ● Avoid read/write of container contents just to make a small addition
     ● May be gathered with map/reduce view
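
As an illustration of the "separate docs" model, each comment below is its own self-describing document, and a function standing in for a map view gathers them per post. The field names (`type`, `post_id`, `created_at`) and both function names are assumptions for the example, not a prescribed schema.

```python
import uuid


def new_comment_doc(post_id, author, text, created_at):
    """Build one self-describing comment document."""
    return {
        "_id": uuid.uuid4().hex,  # own id, so adding a comment never rewrites the post
        "type": "comment",
        "post_id": post_id,
        "author": author,
        "text": text,
        "created_at": created_at,
    }


def comments_by_post(docs):
    """Stand-in for a map function that emits [post_id, created_at] per comment."""
    rows = [([d["post_id"], d["created_at"]], d)
            for d in docs if d.get("type") == "comment"]
    # A view index keeps rows sorted by key; sorting here imitates that.
    return sorted(rows, key=lambda row: row[0])
```

Two writers adding comments at the same time create two independent documents, so neither one hits a revision conflict on a shared container.
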
  21. Modeling: Key Size
     Use short key values
     ● At the clustering layer, all keys are kept in RAM, tracked for replicas, etc.
     ● 255 bytes max length, but prefer short keys
     ● At CouchDB layer, id is likewise used in many places, and short ids are more efficient
     ● Semantic keys
  22. Modeling: Indexes
     Consider other index types
     ● Full-text integration
     ● Geo-spatial (can be used for non-spatial data, too)
     ● Hadoop connector w/ Couchbase Server (via TAP)
  23. Modeling: K/V Tricks
     Non-obvious models in key/value space
     ● Example: level of indirection to “remove” a bunch of keys without knowing their keys:
       ● Define a master key, e.g. obj_rev: 3
       ● Define subordinate attribute keys with the master value in the key name, e.g. obj_foo-3, obj_bar-3
       ● Increment obj_rev, and rely on TTL to reap stale attribute items
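
The indirection trick can be sketched against a tiny in-memory stand-in for a memcached-style store. `TtlStore` and the helper functions are hypothetical names for this example; a real client would supply `set`, `get` and `incr` with equivalent behavior. The key names follow the slide (`obj_rev`, `obj_foo-3`).

```python
import time


class TtlStore:
    """In-memory stand-in for a key/value store with TTL expiration."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expiry time or None)
        self._clock = clock

    def set(self, key, value, ttl=None):
        expires = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires = item
        if expires is not None and self._clock() >= expires:
            del self._data[key]  # expired: reap it lazily on access
            return None
        return value

    def incr(self, key, delta=1):
        value, expires = self._data[key]
        self._data[key] = (value + delta, expires)
        return value + delta


def attr_key(store, obj, attr):
    # The current master revision is baked into every attribute key.
    return f"{obj}_{attr}-{store.get(obj + '_rev')}"


def set_attr(store, obj, attr, value, ttl=3600):
    store.set(attr_key(store, obj, attr), value, ttl)


def get_attr(store, obj, attr):
    return store.get(attr_key(store, obj, attr))


def invalidate(store, obj):
    # One increment makes every old attribute key unreachable at once.
    store.incr(obj + "_rev")
```

After `invalidate`, the old items still occupy memory until their TTL passes, so the TTL should be short enough to bound that waste.
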
  24. Diagnostic Stats
     Monitoring Couchbase Server
     ● Ops/sec
     ● RAM usage vs. high/low water marks
     ● Growth of RAM usage (mem_used)
     ● Growth of metadata usage (ep_overhead)
  25. Diagnostic Stats
     Monitoring Couchbase Server
     ● RAM ejections for active/replica data (*eject*)
     ● Cache miss ratio (get_hits vs. ep_bg_fetched)
     ● Disk write queue size (ep_queue_size + flusher_todo)
     ● Disk space available
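
Two of these figures are derived from pairs of raw counters. The sketch below assumes the stats have already been collected into a dict of numbers; how you fetch them is out of scope here. Defining the miss ratio as `ep_bg_fetched / get_hits` is one reading of the slide, and the `ep_flusher_todo` key for the slide's "flusher_todo" is likewise an assumption.

```python
def cache_miss_ratio(stats):
    """Fraction of successful gets that needed a background fetch from disk."""
    hits = stats["get_hits"]
    return stats["ep_bg_fetched"] / hits if hits else 0.0


def disk_write_queue(stats):
    """Items waiting to be persisted: queued plus already handed to the flusher."""
    return stats["ep_queue_size"] + stats["ep_flusher_todo"]


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    sample = {"get_hits": 1000, "ep_bg_fetched": 50,
              "ep_queue_size": 200, "ep_flusher_todo": 30}
    print(f"miss ratio: {cache_miss_ratio(sample):.1%}")
    print(f"disk write queue: {disk_write_queue(sample)} items")
```

Both numbers matter more as trends than as single readings: a steadily growing write queue or miss ratio suggests the disk or the RAM quota is falling behind.
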
  26. Diagnostic Stats
     Error condition stats
     ● Disk write errors (*failed*)
     ● Uptime resets
     ● Out of memory conditions (*oom*)
     ● Swap usage
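
The wildcard patterns on this slide map naturally onto glob matching over stat names. A minimal sketch, assuming the stats are available as a dict of name to number; the function names and the stat names in the example are illustrative sample data, not an exhaustive list.

```python
from fnmatch import fnmatchcase

ERROR_PATTERNS = ("*failed*", "*oom*")


def nonzero_error_stats(stats, patterns=ERROR_PATTERNS):
    """Return the stats matching any error pattern whose value is not zero."""
    return {name: value
            for name, value in sorted(stats.items())
            if value and any(fnmatchcase(name, p) for p in patterns)}


def uptime_reset(previous_uptime, current_uptime):
    """An uptime lower than the last sample means the process restarted."""
    return current_uptime < previous_uptime


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    sample = {"ep_item_flush_failed": 0, "ep_item_commit_failed": 2,
              "ep_oom_errors": 1, "ep_tmp_oom_errors": 0, "get_hits": 99}
    print(nonzero_error_stats(sample))
    print(uptime_reset(5000, 30))
```
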
  27. What’d I Miss
     Before questions, I want assertions.
     That is, you’re smarter than I am
     ● …and you’ve got more experience
     ● What’s the most important tip you know?
     ● What mistakes did I make?
  28. Thank You
     Tim Smith
     tim@couchbase.com
     http://github.com/couchtim
     @couchtim