CouchConf-SF-Couchbase-Performance-Tuning
 

Usage Rights

© All Rights Reserved


CouchConf-SF-Couchbase-Performance-Tuning Presentation Transcript

  • 1. Tuning Couchbase
    Tim Smith, Engineer
  • 2. Introduction
    I am:
    ● tim@couchbase.com
    ● http://github.com/couchtim
    ● Support Engineer
    ● Sales Engineer
  • 3. Introduction
    You are:
    ● Using Membase
    ● Using CouchDB
    ● Using it in production
    ● 100ms response
    ● 2ms response
    ● Smarter than I am
  • 4. Simple, Fast, Elastic*
    Membase
    ● 5-minute cluster setup
    ● Memcached API
    ● Memcached-fast responses (working set)
    ● Saturate network with minimal CPU
    ● Pleasant admin UI
    ● Rebalance on the fly
  • 5. Simple, Fast*, Elastic
    CouchDB
    ● Webophilic: JSON, RESTy HTTP, JavaScript
    ● Append-only, crash-only, MVCC (hide the fancy bits)
    ● Bidi replication, /_changes
    ● Replication & app-level sharding scale out
    ● Your data. Everywhere.
  • 6. Simple*, Fast, Elastic
    Couchbase Server 2.0
    ● Auto-sharding, clustered elasticity
    ● Caching, predictable low-latency
    ● Scatter-gather, incremental map/reduce
    ● Rich hands-on-your-data features
  • 7. API Combo
    Couchbase Server 2.0
    ● Both memcached binary protocol
    ● And CouchDB HTTP protocol
    ● SDK provides consistent interface
    ● Optionally synchronous persistence
  • 8. Lots to Learn…
    We welcome your discoveries, ingenious solutions and feedback.
    But we’re not starting from scratch!
    Best practices from Membase and CouchDB still apply.
    ● Hardware and system resources
    ● Client and API usage
    ● Data modeling
  • 9. Hardware
    Membase:
    ● RAM, RAM, RAM!
    ● Fast disk (throughput) helpful for write-intensive applications, and disk-heavy ops (rebalance)
    ● Network bandwidth may become an issue
    ● Adding more nodes can help with all three
  • 10. Hardware—RAM
    Proper cluster sizing is #1
    ● Couchbase.org wiki: Sizing Guidelines
    ● Main variables include total number of items, size of working set, replicas and per-item overhead
    ● Under-provisioning reduces elasticity
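The sizing variables named on this slide can be combined into a rough back-of-the-envelope estimate. The sketch below is an illustrative simplification, not the formula from the Sizing Guidelines: the function name, the `headroom` factor, and every number passed to it are assumptions, and the real per-item overhead depends on the server version.

```python
def estimate_cluster_ram(num_items, avg_key_bytes, avg_value_bytes,
                         working_set_fraction, replicas,
                         per_item_overhead_bytes, headroom=0.3):
    """Rough RAM estimate in bytes for the whole cluster (hypothetical helper).

    Assumes keys and per-item metadata stay in RAM for every copy of every
    item, while only the working set's values need to stay resident.
    """
    copies = 1 + replicas  # active copy plus replica copies
    metadata = num_items * copies * (avg_key_bytes + per_item_overhead_bytes)
    resident_values = num_items * copies * working_set_fraction * avg_value_bytes
    # headroom is an assumed safety margin, not a documented constant
    return (metadata + resident_values) * (1 + headroom)


if __name__ == "__main__":
    total = estimate_cluster_ram(
        num_items=10_000_000, avg_key_bytes=20, avg_value_bytes=1000,
        working_set_fraction=0.2, replicas=1,
        per_item_overhead_bytes=100,  # placeholder value, check the wiki
    )
    print(f"{total / 2**30:.1f} GiB across the cluster")
```

Dividing the result by the RAM quota of one node gives a first guess at node count; leaving spare capacity beyond that is what keeps rebalance and failover practical.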
  • 11. Hardware—Disk
    Fast disk not too important… until it is
    ● Rebalance can move a lot of data around
    ● Especially when disk > RAM
    ● Warm-up time after node restart
    ● Under-provisioning reduces elasticity
    ● More nodes spread out the I/O
    ● SSD, RAID, the usual stuff
  • 12. Hardware
    CouchDB:
    ● CPU usage can be significant with view updates, replication filters, formatting via /_list and /_show
    ● Fast disk helpful for many applications, and disk-heavy ops (compaction)
    ● Separate data and view on different filesystems to improve I/O
    ● RAM can’t hurt
  • 13. Hardware—Cloud
    Cloud hosting brings variability
    ● Disk bandwidth can occasionally drop
    ● Even identical instances may perform differently
    ● Large instances more reliable
    ● More instances provide redundancy
    ● Best bang-for-the-buck still an open question
  • 14. Configuration
    Membase client
    ● Use a Membase-aware smart client (spymemcached for Java, Enyim for C#)
    ● Or, run moxi on the client host
    ● Minimizes network hops, preserves bandwidth
    ● Value compression (often automatic)
  • 15. Configuration
    CouchDB client
    ● Caching, ETag / If-None-Match
    ● Compression, Accept-Encoding: gzip
    ● Keep-Alive
    ● By the way, Couchbase Single Server has some killer performance increases (coming soon to an Apache CouchDB release)
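The three client-side points on this slide (conditional requests, compression, connection reuse) might look like the following minimal sketch. The helper names and the shape of the `cache` dict are assumptions made for illustration; `conn` is expected to behave like a standard-library `http.client.HTTPConnection`, which keeps the socket open between requests.

```python
import gzip


def conditional_headers(cached_etag=None):
    """Request headers asking for gzip and, if we hold a copy, revalidation."""
    headers = {"Accept-Encoding": "gzip"}
    if cached_etag is not None:
        headers["If-None-Match"] = cached_etag
    return headers


def fetch_doc(conn, path, cache):
    """GET `path`, reusing `conn` and a local {path: (etag, body)} cache.

    A 304 response means the cached body is still current, so no document
    body crosses the network.
    """
    cached = cache.get(path)
    conn.request("GET", path,
                 headers=conditional_headers(cached[0] if cached else None))
    resp = conn.getresponse()
    body = resp.read()  # always drain the response so the connection is reusable
    if resp.status == 304 and cached:
        return cached[1]
    if resp.getheader("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    etag = resp.getheader("ETag")
    if etag:
        cache[path] = (etag, body)
    return body


# Usage (host, port, and document path are placeholders):
#   import http.client
#   conn = http.client.HTTPConnection("localhost", 5984)
#   doc = fetch_doc(conn, "/mydb/mydoc", cache={})
```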
  • 16. API Usage
    Memcached API
    ● Binary protocol
    ● Multi-get and multi-set
    ● Incr, decr, append, prepend
    ● TTL expiration, get-and-touch
  • 17. API Usage
    CouchDB API
    ● HEAD vs. GET, ?limit=1
    ● ?startkey, not ?skip
    ● Use built-in reduce functions: _sum, _count, _stats; write views in Erlang
    ● Keep view index size in mind—emit just what you need
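The "?startkey, not ?skip" advice is usually applied to pagination: ask for one extra row per page and use that row as the start of the next page, so the server seeks directly into the index instead of walking past skipped rows. A minimal sketch of building such queries follows; the function names are hypothetical, and the row shape (`key`, `id`) follows CouchDB view results.

```python
import json
from urllib.parse import urlencode


def page_query(page_size, start_row=None):
    """Query string for one page of a view, fetching one look-ahead row."""
    params = {"limit": page_size + 1}
    if start_row is not None:
        # View keys are JSON values, so startkey must be JSON-encoded.
        params["startkey"] = json.dumps(start_row["key"])
        # startkey_docid disambiguates rows that share the same key.
        params["startkey_docid"] = start_row["id"]
    return urlencode(params)


def split_page(rows, page_size):
    """Return (rows to display, first row of the next page or None)."""
    next_start = rows[page_size] if len(rows) > page_size else None
    return rows[:page_size], next_start
```

The first request uses `page_query(n)`; each later request passes the look-ahead row returned by `split_page` as `start_row`.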
  • 18. API Usage
    CouchDB API
    ● Use ?group_level to aggregate over structured keys
    ● Emit null, and use ?include_docs to get more data (faster view generation)
    ● Emit more data, so ?include_docs isn’t needed (avoid random I/O on query)
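To make the ?group_level point concrete, the sketch below imitates in plain Python what a `_sum` reduce returns when a view emits array keys such as `[year, month, day]`. It is a local simulation for illustration only: the function name is hypothetical, and Python's list ordering stands in for CouchDB's key collation, which differs for mixed types.

```python
from itertools import groupby


def reduce_by_group_level(rows, group_level):
    """Sum values of (key, value) rows grouped by the first `group_level`
    elements of each array key, as a _sum reduce would."""
    def key_prefix(row):
        return row[0][:group_level]

    ordered = sorted(rows, key=lambda row: row[0])
    return [(prefix, sum(value for _, value in group))
            for prefix, group in groupby(ordered, key=key_prefix)]


if __name__ == "__main__":
    emitted = [
        ([2011, 7, 29], 1),
        ([2011, 7, 30], 2),
        ([2011, 8, 1], 4),
        ([2012, 1, 1], 8),
    ]
    for level in (1, 2, 3):
        print(level, reduce_by_group_level(emitted, level))
```

One view with structured keys can therefore answer per-year, per-month, and per-day questions, which is cheaper than maintaining three separate indexes.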
  • 19. Modeling—Doc Size
    Bundle related info into one document
    ● Fewer items → less caching overhead
    ● Reduce number of requests clients make
    ● Promotes server-side processing with _show functions
    ● More context available for flexible maps
  • 20. Modeling—Doc Size
    Break up serial items to separate docs
    ● E.g., comments, events, other “feeds”
    ● Each entry is self-described
    ● Avoids write contention on a container
    ● Avoid read/write of container contents just to make a small addition
    ● May be gathered with map/reduce view
  • 21. Modeling—Key Size
    Use short key values
    ● At the clustering layer, all keys are kept in RAM, tracked for replicas, etc.
    ● 255 bytes max length, but prefer short keys
    ● At CouchDB layer, id is likewise used in many places, and short ids are more efficient
    ● Semantic keys
  • 22. Modeling—Indexes
    Consider other index types
    ● Full-text integration
    ● Geo-spatial (can be used for non-spatial data, too)
    ● Hadoop connector w/ Couchbase Server (via TAP)
  • 23. Modeling—K/V Tricks
    Non-obvious models in key/value space
    ● Example: level of indirection to “remove” a bunch of keys without knowing their keys:
    ● Define a master key, e.g. obj_rev: 3
    ● Define subordinate attribute keys with the master value in the key name, e.g. obj_foo-3, obj_bar-3
    ● Increment obj_rev, and rely on TTL to reap stale attribute items
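The indirection trick on this slide can be sketched against a tiny in-memory stand-in for a memcached-style client. `FakeCache`, the helper functions, and the `ATTR_TTL` value are all hypothetical; a real client's `set`, `get`, and `incr` calls would take their place. The key names follow the slide's `obj_rev` / `obj_foo-3` pattern.

```python
import time

ATTR_TTL = 3600  # assumed lifetime for attribute items, in seconds


class FakeCache:
    """In-memory stand-in for a memcached-style client (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._items = {}
        self._clock = clock

    def set(self, key, value, ttl=0):
        expires = self._clock() + ttl if ttl else None
        self._items[key] = (value, expires)

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        value, expires = item
        if expires is not None and self._clock() >= expires:
            del self._items[key]  # lazily reap expired items
            return None
        return value

    def incr(self, key, delta=1):
        value = self.get(key)
        if value is None:
            return None
        expires = self._items[key][1]
        self._items[key] = (value + delta, expires)
        return value + delta


def attr_key(cache, obj, attr):
    """Build the attribute key from the current master revision."""
    return f"{obj}_{attr}-{cache.get(obj + '_rev')}"


def set_attr(cache, obj, attr, value):
    cache.set(attr_key(cache, obj, attr), value, ttl=ATTR_TTL)


def get_attr(cache, obj, attr):
    return cache.get(attr_key(cache, obj, attr))


def invalidate_all(cache, obj):
    """'Remove' every attribute at once by bumping the master revision."""
    cache.incr(obj + "_rev")
```

After `invalidate_all`, readers compute keys with the new revision and simply miss; the old `obj_foo-3` style items are never deleted explicitly and disappear when their TTL runs out.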
  • 24. Diagnostic Stats
    Monitoring Couchbase Server
    ● Ops/sec
    ● RAM usage vs. high/low water marks
    ● Growth of RAM usage (mem_used)
    ● Growth of metadata usage (ep_overhead)
  • 25. Diagnostic Stats
    Monitoring Couchbase Server
    ● RAM ejections for active/replica data (*eject*)
    ● Cache miss ratio (get_hits vs. ep_bg_fetched)
    ● Disk write queue size (ep_queue_size + flusher_todo)
    ● Disk space available
  • 26. Diagnostic Stats
    Error condition stats
    ● Disk write errors (*failed*)
    ● Uptime resets
    ● Out of memory conditions (*oom*)
    ● Swap usage
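Several of the derived numbers on the monitoring slides are simple arithmetic over raw stats. The sketch below assumes the stats have already been collected into a dict of name to string value (as a memcached-style `stats` call returns them). The function name is hypothetical, the ratio of `ep_bg_fetched` to `get_hits` is one reading of the slide's "cache miss ratio", and the sample stat names in the example are illustrative rather than a definitive list.

```python
from fnmatch import fnmatch


def summarize_stats(stats):
    """Derive a few health indicators from a {name: value} stats dict."""
    get_hits = int(stats.get("get_hits", 0))
    bg_fetched = int(stats.get("ep_bg_fetched", 0))
    # Items waiting to be persisted, per the slide: queue plus flusher backlog.
    write_queue = (int(stats.get("ep_queue_size", 0))
                   + int(stats.get("flusher_todo", 0)))
    # Any non-zero counter matching the slide's wildcards deserves a look.
    # Assumes matching stats hold integer counters.
    alarms = {
        name: value for name, value in stats.items()
        if (fnmatch(name, "*failed*") or fnmatch(name, "*oom*"))
        and int(value) > 0
    }
    return {
        "bg_fetch_per_hit": bg_fetched / get_hits if get_hits else 0.0,
        "disk_write_queue": write_queue,
        "alarms": alarms,
    }


if __name__ == "__main__":
    sample = {
        "get_hits": "1000", "ep_bg_fetched": "50",
        "ep_queue_size": "7", "flusher_todo": "3",
        "ep_item_commit_failed": "2", "ep_oom_errors": "0",
    }
    print(summarize_stats(sample))
```

Because these are cumulative counters, the trend between two samples is usually more informative than a single reading.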
  • 27. What’d I Miss
    Before questions, I want assertions.
    That is, you’re smarter than I am
    ● …and you’ve got more experience
    ● What’s the most important tip you know?
    ● What mistakes did I make?
  • 28. Thank You
    Tim Smith
    tim@couchbase.com
    http://github.com/couchtim
    @couchtim