Couchbase forPython developersHotCode 2013
What is Couchbase?
Couchbase• http://www.couchbase.com• Open source NoSQL database technology for interactive web and mobile apps• Easy scala...
History.Membase• Started in June 2010 as Membase• Simplicity and speed of Memcached• But also provide storage,persistence ...
History.Merge with CouchOne• In February 2011 Membase merged with CouchOne• CouchOne company with many developers of Couch...
History.Couchbase 2.0• On December 2012 Couchbase 2.0 released• New JSON document store• Indexing and querying data• Incre...
In total• CouchBase is• Memcached interface for managing data• CouchDB based interface for indexing and querying data• Sca...
Why we need anotherNoSQL solution?
What we have?• Key-value cache in RAM• Redis• Memcached• Eventually-consistent key-value store• Cassandra• Dynamo• Riak
What we have?• Document store• CouchDB• MongoDB• {{ name }}DB
Why we need another?• Cause developers can• This is cool to be a NoSQL storage author• There should be a serious NoSQL sto...
CouchBase as cache storage
Couchbase is Memcached• All Memcached commands* work with Couchbase• set/add/replace• append/prepend• get/delete• incr/dec...
How it works?• Connect to Memcached socket using binary protocol• Authenticate with or without password• Send Memcached co...
Differences• Additional commands• lock/unlock• vBucketID value should present in each key request
What is vBucket?• http://dustin.github.io/2010/06/29/memcached-vbuckets.html• http://www.couchbase.com/docs/couchbase-manu...
vBucket mapping9 vbuckets,3 clustersif hash(KEY) == vB 8:read key from server C
vBucket mapping after new node added9 vbuckets,4 clustersif hash(KEY) == vB 8:read key from server D
CouchBase as document storage
Same between CouchDB and Couchbase• NoSQL document database• JSON as document format• Append-only file format• Same approac...
Differences• Couchbase cache-based database in terms of read/write performance• CouchDB disk-based database• Couchbase has...
In total• Couchbase is not CouchDB
Indexing and querying data• Say hello to views!• First you need to create design document (via REST API or admin UI)• Desi...
View map functions# Returns all document IDs from bucketfunction(doc, meta) {emit(meta.id, null);}# Returns only JSON docu...
Querying indexed data• Querying available via REST API• Can filter results by exact key• Or using range (startkey,endkey)• ...
View reduce functions• When you need data to be summarized or reduced• Reduce function could be defined in view orsend whil...
Overview of Python clients
couchbase < 0.9• Very slow official Python client• Supports Memcached commands and REST API• Not ready for any usage• ~800 ...
couchbase >= 0.9• Python client based on C libcouchbase library• Only supports Memcached commands• No REST API support• ~8...
couchbase >= 0.9from couchbase import Couchbasecouchbase = Couchbase()client = couchbase.connect(bucket, host, port, usern...
mcdonnell• Experimental Python client created by myself• Not open sourced yet :(• Supports Memcached commands and REST API...
mcdonnellfrom mcdonnell.bucket import Bucketclient = Bucket(‘couchbase://username:password@host:port/bucket’)client.set(‘k...
mcdonnellimport typesfrom mcdonnell.bucket import Bucketclient = Bucket(‘couchbase://username:password@host:port/bucket’)r...
How we use CouchBasein real life in GetGoing?
What we have?• 3 m2.xlarge EC2 instances• 45 GB RAM• 1.43 TB disk
For what reasons?• Search results cache (with ttl)• User cache (persistance)• Hotels data document storage (persistance)
Clustering,replications andother NoSQL buzzwords
Clustering• New node could be added/deleted via command line or REST API• After node added/deleted cluster should be rebal...
Cross datacenter replication• Could replicate to other Couchbase cluster• Could replicate to other storage,like ElasticSea...
Couchbase -> ElasticSearch replication• ElasticSearch is search engine built on top of Apache Lucene,same to Solr,butmore ...
Other• Couchbase has experimental geo views support• Couchbase has two editions: community and enterprise
Questions?I am Igor Davydenkohttp://igordavydenko.comhttp://github.com/playpauseandstop
Upcoming SlideShare
Loading in...5
×

Igor Davydenko

827

Published on

HOTCODE conference 2013, Kiev

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
827
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
0
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Igor Davydenko

  1. 1. Couchbase forPython developersHotCode 2013
  2. 2. What is Couchbase?
  3. 3. Couchbase• http://www.couchbase.com• Open source NoSQL database technology for interactive web and mobile apps• Easy scalability• Consistent high performance• Flexible data model
  4. 4. History.Membase• Started in June 2010 as Membase• Simplicity and speed of Memcached• But also provide storage,persistence and querying capabilities• Initially created by NorthScale with co-sponsors Zynga and NHN
  5. 5. History.Merge with CouchOne• In February 2011 Membase merged with CouchOne• CouchOne company with many developers of CouchDB• After merge company named Couchbase• In January 2012 Couchbase 1.8 released
  6. 6. History.Couchbase 2.0• On December 2012 Couchbase 2.0 released• New JSON document store• Indexing and querying data• Incremental MapReduce• Cross datacenter replication
  7. 7. In total• CouchBase is• Memcached interface for managing data• CouchDB based interface for indexing and querying data• Scalability (clustering,replications) out of the box
  8. 8. Why we need anotherNoSQL solution?
  9. 9. What we have?• Key-value cache in RAM• Redis• Memcached• Eventually-consistent key-value store• Cassandra• Dynamo• Riak
  10. 10. What we have?• Document store• CouchDB• MongoDB• {{ name }}DB
  11. 11. Why we need another?• Cause developers can• This is cool to be a NoSQL storage author• There should be a serious NoSQL storages for business• But really,I don’t know• I still prefer Redis + PostgreSQL more than Couchbase :)
  12. 12. CouchBase as cache storage
  13. 13. Couchbase is Memcached• All Memcached commands* work with Couchbase• set/add/replace• append/prepend• get/delete• incr/decr• touch• stats• *except of flush
  14. 14. How it works?• Connect to Memcached socket using binary protocol• Authenticate with or without password• Send Memcached command as request• Receive response from Memcached
  15. 15. Differences• Additional commands• lock/unlock• vBucketID value should present in each key request
  16. 16. What is vBucket?• http://dustin.github.io/2010/06/29/memcached-vbuckets.html• http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-introduction-architecture-vbuckets.html• vBucket is computed subset of all possible keys• vBucket system used for distributing data and for supporting replicas• vBucket mapping allows server know the fastest way for getting/setting data• Couchbase vBucket implementation differs from standard Memcachedimplementation
  17. 17. vBucket mapping9 vbuckets,3 clustersif hash(KEY) == vB 8:read key from server C
  18. 18. vBucket mapping after new node added9 vbuckets,4 clustersif hash(KEY) == vB 8:read key from server D
  19. 19. CouchBase as document storage
  20. 20. Same between CouchDB and Couchbase• NoSQL document database• JSON as document format• Append-only file format• Same approach for indexing and querying• CouchDB replication technology is base for Couchbase cross datacenter replication
  21. 21. Differences• Couchbase cache-based database in terms of read/write performance• CouchDB disk-based database• Couchbase has a built-in clustering system that allows data spread over nodes• CouchDB is a single node solution with peer-to-peer replication• Each solution has their admin interfaces (Couchbase server admin,Futon)
  22. 22. In total• Couchbase is not CouchDB
  23. 23. Indexing and querying data• Say hello to views!• First you need to create design document (via REST API or admin UI)• Design document is collection of views• Next you need to create view and provide map/reduce functions• View should contain map function and could contain reduce function• After Couchbase server indexing data by executing map functions and stores resultin sorted order
  24. 24. View map functions# Returns all document IDs from bucketfunction(doc, meta) {emit(meta.id, null);}# Returns only JSON document IDsfunction(doc, meta) {if (meta.type == “json”) {emit(meta.id, null);}}# Multiple returns in one viewfunction(doc, meta) {if (meta.type == “json”) {if (doc.name) {emit(doc.name, null);}if (doc.nameLong) {emit(doc.nameLong, null);}}}
  25. 25. Querying indexed data• Querying available via REST API• Can filter results by exact key• Or using range (startkey,endkey)• All key requests should use JSON format• Also can group results,reverse results,paginate results
  26. 26. View reduce functions• When you need data to be summarized or reduced• Reduce function could be defined in view orsend while querying view via REST API• Built-in functions• _count• _sum• _stats
  27. 27. Overview of Python clients
  28. 28. couchbase < 0.9• Very slow official Python client• Supports Memcached commands and REST API• Not ready for any usage• ~800 ops/sec on my machine
  29. 29. couchbase >= 0.9• Python client based on C libcouchbase library• Only supports Memcached commands• No REST API support• ~8000 ops/sec on my machine
  30. 30. couchbase >= 0.9from couchbase import Couchbasecouchbase = Couchbase()client = couchbase.connect(bucket, host, port, username, password)client.set(‘key’, ‘value’)assert client.get(‘key’).value == ‘value’from couchbase import FMT_JSONclient.set(‘doc’, {‘key’: ‘value’}, FMT_JSON)assert client.get(‘key’).value == {‘key’: ‘value’}
  31. 31. mcdonnell• Experimental Python client created by myself• Not open sourced yet :(• Supports Memcached commands and REST API• Has Django cache backend and Celery result backend• ~6000 ops/sec on my machine
  32. 32. mcdonnellfrom mcdonnell.bucket import Bucketclient = Bucket(‘couchbase://username:password@host:port/bucket’)client.set(‘key’, ‘value’)assert client.get(‘key’) == ‘value’from mcdonnell.constants import FLAG_JSONclient.set(‘doc’, {‘key’: ‘value’}, flags=FLAG_JSON)assert client.get(‘key’) == {‘key’: ‘value’}
  33. 33. mcdonnellimport typesfrom mcdonnell.bucket import Bucketclient = Bucket(‘couchbase://username:password@host:port/bucket’)results = client.view(‘design’, ‘view’)assert isinstance(results, types.GeneratorType)from mcdonnell.ddocs import DesignDocddoc = DesignDoc(‘design_name’)ddoc.views[‘view_name’] = map_functionddoc.upload()
  34. 34. How we use CouchBasein real life in GetGoing?
  35. 35. What we have?• 3 m2.xlarge EC2 instances• 45 GB RAM• 1.43 TB disk
  36. 36. For what reasons?• Search results cache (with ttl)• User cache (persistance)• Hotels data document storage (persistance)
  37. 37. Clustering,replications andother NoSQL buzzwords
  38. 38. Clustering• New node could be added/deleted via command line or REST API• After node added/deleted cluster should be rebalanced• Can auto-failover broken nodes,but only for 3+ total nodes in cluster
  39. 39. Cross datacenter replication• Could replicate to other Couchbase cluster• Could replicate to other storage,like ElasticSearch
  40. 40. Couchbase -> ElasticSearch replication• ElasticSearch is search engine built on top of Apache Lucene,same to Solr,butmore REST API friendly• Couchbase has official transport to replicate all cluster data to ElasticSearch• This enables fulltext search over cluster data
  41. 41. Other• Couchbase has experimental geo views support• Couchbase has two editions: community and enterprise
  42. 42. Questions?I am Igor Davydenkohttp://igordavydenko.comhttp://github.com/playpauseandstop

×