Couchbase presentation

Published in: Technology
  • In this session, we're shifting gears from development to production. I'm going to talk about how to operate Couchbase in production: how to care for and feed the system to maintain application uptime and performance. I will demo as much as time permits, since much of this is about practice.
  • Partial listing of companies with paid production deployments. Thousands more use the open source version.
  • In a typical architecture, stateless application servers sit behind a load balancer. As usage grows, you add app servers and update the load balancer, and the application scales out linearly in both cost and performance. The data tier, however, has a shared-everything architecture: at a minimum, these are shared-cache or shared-disk systems. To scale up you need expensive hardware, and even then performance hits a limit. So both cost and performance are non-linear with this approach.
  • Contrast that relational architecture with NoSQL systems: with a document model and auto-sharding, the database scales horizontally along with your app server tier, giving you the linear cost and performance you want.
  • Most of you are probably familiar with the table layout. A table is defined with a set of columns, and each record in the table conforms to the schema. If you wish to capture different data in the future, the table schema must be changed using the ALTER TABLE statement. Data is typically normalized to third normal form to reduce duplication, and large tables are split into smaller tables linked by foreign keys.
  • Example: a normalized schema with 2 tables. Foreign keys (links) connect the two. To get information about a specific user, you perform a JOIN across the two tables.
  • A single document contains aggregated info that would normally be distributed across tables. In real-world complex systems (like SAP), that info tends to be spread over tens, hundreds or even thousands of tables.
  • CAPI interface: the basic Couch API. Some of it (CRUD) goes through the caching layer; some of it (views) goes directly to Couch.
  • Not yet enabled in the current Developer Preview (DP); will be available in the Beta.

    1. Couchbase overview: Distributed Document Database. Sharon Barr, VP Engineering. Simple. Fast. Elastic.
    2. Couchbase NoSQL Leadership
       • Leading NoSQL database company
       • Open source development and business model
       • Document-oriented NoSQL database
       • Focused on interactive internet and mobile applications
       • Provides a more flexible, higher-performance, more scalable database than relational alternatives
       • Most mature, reliable and widely deployed solution: >5,000 paid production deployments worldwide
       • Headquarters in Silicon Valley (Mountain View, CA)
       • ~100 employees, including >50 in engineering/product
       • >80% of commits to Couchbase, memcached, Apache CouchDB
    3. Market Adoption – Customers: internet companies and enterprises. More than 300 customers and 5,000 production deployments worldwide.
    4. Couchbase Server (a.k.a. Membase): Simple. Fast. Elastic. NoSQL. Couchbase automatically distributes data across commodity servers. Built-in caching enables apps to read and write data with sub-millisecond latency. And with no schema to manage, Couchbase effortlessly accommodates changing data management requirements.
    5. Relational Technology Scales Up
       • Web/app server tier: the application scales out. Just add more commodity web servers. (Chart: system cost and application performance vs. users.)
       • Relational database: the RDBMS scales up. Get a bigger, more complex server. (Chart: system cost and application performance vs. users, marked "Won't scale beyond this point".)
       • Expensive and disruptive sharding; doesn't perform at web scale.
    6. Couchbase Server Scales Out Like App Tier
       • Web/app server tier: the application scales out. Just add more commodity web servers. (Chart: system cost and application performance vs. users.)
       • Couchbase distributed data store: the NoSQL database scales out. Cost and performance mirror the app tier. (Chart: system cost and application performance vs. users.)
       • Scaling out flattens the cost and performance curves.
    7. Couchbase Server Is The Complete Solution
       ✔ Easy Scalability: one-click scalability and no app changes.
       ✔ Consistent High Performance: sub-millisecond latency with high throughput for reads and writes.
       ✔ Always On 24x7x365: maintenance, upgrades and cluster resizing all online, without application downtime.
       ✔ Flexible Data Model: JSON document model with no fixed schema.
    8. #1 reason for users to move to NoSQL
    9. FLEXIBLE: NO SCHEMA
    10. Relational vs Document Data Model
       • Relational data model: highly-structured table organization with rigidly-defined data formats and record structure.
       • Document data model: collection of complex documents with arbitrary, nested data formats and varying "record" format.
    11. RDBMS Example: User Profile

        User Info
        KEY   First   Last     ZIP_id
        1     Frank   Weigel   2
        2     Ali     Dodson   2
        3     Mark    Azad     2
        4     Steve   Yen      3

        Address Info
        ZIP_id   CITY   STATE   ZIP
        1        DEN    CO      30303
        2        MV     CA      94040
        3        CHI    IL      60609
        4        NY     NY      10010

        To get info about a specific user, you perform a join across the two tables.
    12. Document Example: User Profile. All data in a single JSON document:
          {
            "ID": 1,
            "FIRST": "Frank",
            "LAST": "Weigel",
            "ZIP": "94040",
            "CITY": "MV",
            "STATE": "CA"
          }
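The relationship between the two tables on the RDBMS slide and the single document above can be sketched in a few lines of JavaScript. This is only an illustration: the rows are copied from the slide, while the lower-case field names and the `toDocument` helper are hypothetical names introduced here.

```javascript
// Rows copied from the "RDBMS Example: User Profile" slide.
const userInfo = [
  { key: 1, first: "Frank", last: "Weigel", zipId: 2 },
  { key: 2, first: "Ali", last: "Dodson", zipId: 2 },
  { key: 3, first: "Mark", last: "Azad", zipId: 2 },
  { key: 4, first: "Steve", last: "Yen", zipId: 3 },
];
const addressInfo = [
  { zipId: 1, city: "DEN", state: "CO", zip: "30303" },
  { zipId: 2, city: "MV", state: "CA", zip: "94040" },
  { zipId: 3, city: "CHI", state: "IL", zip: "60609" },
  { zipId: 4, city: "NY", state: "NY", zip: "10010" },
];

// Hypothetical helper: performs the join once and returns one
// self-contained document, shaped like the slide's JSON example.
function toDocument(userKey) {
  const user = userInfo.find((u) => u.key === userKey);
  if (!user) return null;
  const addr = addressInfo.find((a) => a.zipId === user.zipId);
  if (!addr) return null;
  return {
    ID: user.key,
    FIRST: user.first,
    LAST: user.last,
    ZIP: addr.zip,
    CITY: addr.city,
    STATE: addr.state,
  };
}

console.log(JSON.stringify(toDocument(1), null, 2));
```

In the relational layout the join runs on every read; in the document layout the joined result is what gets stored, so a read is a single key lookup.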
    13. Document database
       • JSON objects
       • Each document has an independent schema

          {
            "_id": "brewery_Cleveland_ChopHouse_and_Brewery",
            "_rev": "1-00000061480b50910000000000000000",
            "city": "Cleveland",
            "updated": "2010-07-22 20:00:20",
            "code": "44113",
            "name": "Cleveland ChopHouse and Brewery",
            "country": "United States",
            "phone": "1-216-623-0909",
            "state": "Ohio",
            "address": [ "824 West St.Clair Avenue" ],
            "geo": {
              "loc": [ "-81.6994", "41.4995" ],
              "accuracy": "ROOFTOP"
            },
            "$expiration": 0,
            "$flags": 0
          }

          {
            "_id": "beer_Double_Cream_Oatmeal_Stout",
            "_rev": "1-0000042ee19241b60000000000000000",
            "category": "North American Ale",
            "style": "American-Style Stout",
            "name": "Double Cream Oatmeal Stout",
            "updated": "2010-07-22 20:00:20",
            "brewery": "Olde Peninsula Brewpub and Restaurant",
            "$expiration": 0,
            "$flags": 0
          }
    14. PERFORMANCE: PREDICTABLE LATENCY
    15. Key results of Cisco and Solarflare Benchmark. Couchbase Server demonstrates:
       • Consistent sub-millisecond latency for mixed workload
       • High throughput
       • Linear scalability
       http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-708169.pdf
    16. Your secret weapon: sub-millisecond AND consistent latency. (Chart: latency in microseconds vs. object size in bytes.) Consistently low latencies in microseconds for varying document sizes with a mixed workload.
    17. Your secret weapon: linear scalability. (Chart: operations per second vs. number of servers in cluster.) High throughput with a 1.4 GB/sec data transfer rate using 4 servers; linear throughput scalability.
    18. SCALE
    19. Draw Something by OMGPOP
    20. Draw Something "goes viral" 3 weeks after launch. (Chart: Draw Something by OMGPOP, daily active users in millions, by date from 2/6 to 3/21.)
    21. As usage grew, game data went non-linear. (Same chart, annotated "Instagram (7.5M MAU in 5 wks)".) By March 19, there were over 30,000,000 downloads of the app, over 5,000 drawings being stored per second, over 2,200,000,000 drawings stored, over 105,000 database transactions per second, and over 3.3 terabytes of data stored.
    22. In contrast (used relational DB): The Simpsons: Tapped Out. (Chart: daily active users in millions, by date from 2/6 to 3/21; annotated "#2 Free app on iPad" and "#3 Free app on iPhone".)
    23. ALWAYS ONLINE
    24. Partitioning The Data – vbucket (internal shards) map
    25. Basic Operation – scale out
       • Docs distributed evenly across servers in the cluster
       • Each server stores both active & replica docs
       • Only one server active at a time
       • Client library provides app with simple interface to database
       • Cluster map provides map to which server doc is on
       • App never needs to know
       • App reads, writes, updates docs
       • Multiple app servers can access same document at same time

       (Diagram: APP SERVER 1 and APP SERVER 2, each with a Couchbase client library and cluster map, send Read/Write/Update requests to the Couchbase Server cluster. User-configured replica count = 1.)

                      SERVER 1          SERVER 2          SERVER 3
       Active docs    Doc 5, 2, 9       Doc 4, 7, 8       Doc 1, 3, 6
       Replica docs   Doc 4, 1, 8       Doc 6, 3, 2       Doc 7, 9, 5
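The lookup the client library performs (document key, then vBucket, then server) can be sketched as follows. This is a minimal illustration under stated assumptions, not the client's actual code: the vBucket count of 8, the round-robin map layout, and the `hashKey`, `buildClusterMap` and `locate` names are all hypothetical, and the hash is a simple stand-in rather than the one Couchbase uses.

```javascript
const NUM_VBUCKETS = 8; // illustrative value only
const servers = ["SERVER 1", "SERVER 2", "SERVER 3"];

// Cluster map: for each vBucket, which server holds the active copy
// and which holds the replica (replica count = 1, as on the slide).
function buildClusterMap(numVBuckets, numServers) {
  const map = [];
  for (let vb = 0; vb < numVBuckets; vb++) {
    const active = vb % numServers;
    const replica = (active + 1) % numServers; // always a different server
    map.push({ active, replica });
  }
  return map;
}

// Stand-in string hash (assumption; not Couchbase's hash function).
function hashKey(key) {
  let h = 0;
  for (let i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0; // keep within 32 bits
  }
  return h;
}

// The app only supplies a key; the map decides which server to contact.
function locate(key, clusterMap) {
  const vbucket = hashKey(key) % clusterMap.length;
  const entry = clusterMap[vbucket];
  return {
    vbucket,
    active: servers[entry.active],
    replica: servers[entry.replica],
  };
}

const clusterMap = buildClusterMap(NUM_VBUCKETS, servers.length);
console.log(locate("Doc 5", clusterMap));
```

Because the key-to-vBucket step never changes, only the vBucket-to-server table has to be updated when the cluster topology changes.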
    26. Add Nodes
       • Two servers added to cluster
       • One-click operation
       • Docs automatically rebalanced across cluster
       • Even distribution of docs
       • Minimum doc movement
       • Cluster map updated
       • App database calls now distributed over larger # of servers

       (Diagram: the same two app servers and a Couchbase Server cluster grown from SERVER 1 to SERVER 3 to SERVER 1 to SERVER 5, each server holding active and replica docs. User-configured replica count = 1.)
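"Even distribution" with "minimum doc movement" can be illustrated with a small greedy sketch. It is not Couchbase's rebalance algorithm; the `rebalance` function and the array representation (index = vBucket, value = owning server) are assumptions made for the example. It repeatedly moves one vBucket from the most loaded server to the least loaded one until their counts differ by at most one.

```javascript
// assignment[vb] = index of the server that owns vBucket vb (hypothetical layout).
function rebalance(assignment, serverCount) {
  const result = assignment.slice(); // leave the input untouched
  const counts = new Array(serverCount).fill(0);
  for (const s of result) counts[s]++;

  let moved = 0;
  while (true) {
    const busiest = counts.indexOf(Math.max(...counts));
    const idlest = counts.indexOf(Math.min(...counts));
    if (counts[busiest] - counts[idlest] <= 1) break; // balanced enough
    const vb = result.indexOf(busiest); // any vBucket on the busiest server
    result[vb] = idlest;
    counts[busiest]--;
    counts[idlest]++;
    moved++;
  }
  return { assignment: result, moved };
}

// Nine vBuckets spread over three servers, then two servers are added.
const before = [0, 1, 2, 0, 1, 2, 0, 1, 2];
const after = rebalance(before, 5);
console.log(after.assignment, "moved:", after.moved);
```

In this run only three of the nine vBuckets change owner; the rest stay where they were, which is the property the slide calls minimum doc movement.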
    27. Fail Over Node
       • App servers happily accessing docs on Server 3
       • Server fails
       • App server requests to Server 3 fail
       • Cluster detects server has failed
       • Promotes replicas of docs to active
       • Updates cluster map
       • App server requests for docs now go to appropriate server
       • Typically rebalance would follow

       (Diagram: two app servers and a five-server Couchbase Server cluster, SERVER 1 to SERVER 5, each holding active and replica docs; after Server 3 fails, its docs are served from promoted replicas on the remaining servers. User-configured replica count = 1.)
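The "promote replicas to active, update cluster map" step can be sketched as a pure function over a cluster map. The map shape (`{ active, replica }` per vBucket) and the `failOver` name are hypothetical; this illustrates the idea only and says nothing about how the cluster manager implements it.

```javascript
// Each entry describes one vBucket: where its active and replica copies live.
function failOver(clusterMap, failedServer) {
  return clusterMap.map((entry) => {
    if (entry.active === failedServer) {
      // Promote the replica; the replica slot stays empty until a rebalance.
      // If there was no replica, active becomes null (data unavailable).
      return { active: entry.replica, replica: null };
    }
    if (entry.replica === failedServer) {
      // The active copy is fine, but its replica is gone for now.
      return { active: entry.active, replica: null };
    }
    return { ...entry };
  });
}

const mapBefore = [
  { active: "SERVER 1", replica: "SERVER 2" },
  { active: "SERVER 3", replica: "SERVER 1" },
  { active: "SERVER 2", replica: "SERVER 3" },
];
const mapAfter = failOver(mapBefore, "SERVER 3");
console.log(mapAfter);
```

After failover the data is still served but some vBuckets have no replica, which is why the slide notes that a rebalance would typically follow.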
    28. Couchbase Server 2.0
       • Next major release of Couchbase Server
       • Currently in Developer Preview, approaching Beta and GA
       What's new:
       • New storage engine technology (append-only B-tree)
       • Indexing and querying
       • Incremental map reduce
       • Cross data center replication
       • Better memory management, large data sets, and other technological improvements
       • Fully backwards compatible with existing Couchbase Server
    29. Couchbase Server 2.0 Architecture (diagram)
       • Ports: 8092 (Couch View), 11211 (Memcapable 1.0), 11210 (Memcapable 2.0), 8091 (HTTP), 4369 (Erlang port mapper), 21100 - 21199 (Distributed Erlang)
       • Data Manager: Moxi, Memcached Interface, Couch API, Couchbase EP Engine (hash table cache, write/replica queues), Membase storage interface, CouchStore, distributed indexing, auto compaction
       • Cluster Manager (Erlang/OTP): REST management API/Web UI, vBucket state and replication manager, global singleton supervisor, rebalance orchestrator, configuration manager, node health monitor, process monitor, heartbeat
       • Diagram labels: "http", "on each node", "one per cluster"
    30. Couchbase Server 2.0 Architecture (same diagram, next build)
    31. Couchbase Server 2.0 Architecture (same diagram, next build)
    32. Indexing and querying
       • Built-in incremental map reduce
       • Map functions are written and executed in JavaScript (using Google's V8 engine)
       • Index is built incrementally as mutations stream in
       • Query in a scatter/gather fashion
    33. Map function
          function (doc) {
            if (doc.country && doc.state && doc.city) {
              emit([doc.country, doc.state, doc.city], 1);
            } else if (doc.country && doc.state) {
              emit([doc.country, doc.state], 1);
            } else if (doc.country) {
              emit([doc.country], 1);
            }
          }
       REST call: http://db1.couchbase.com:8092/beer-sample/_design/dev_beer/_view/by_location?limit=10
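To see what rows a map function like this produces, it can be run locally against a few documents. The harness below is a sketch under assumptions: `runMap` and the sample documents (other than the Cleveland brewery from slide 13) are made up for illustration, and `emit` is passed in as a parameter here, whereas in a Couchbase view it is provided by the server. Each condition is written with `&&` so that every listed field is checked; a comma-separated condition in JavaScript evaluates to its last operand only.

```javascript
// Map function with the same branching as the slide; emit is injected by the harness.
function byLocation(doc, emit) {
  if (doc.country && doc.state && doc.city) {
    emit([doc.country, doc.state, doc.city], 1);
  } else if (doc.country && doc.state) {
    emit([doc.country, doc.state], 1);
  } else if (doc.country) {
    emit([doc.country], 1);
  }
}

// Hypothetical local harness: collects every emitted (key, value) pair.
function runMap(mapFn, docs) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows;
}

const sampleDocs = [
  { _id: "brewery_Cleveland_ChopHouse_and_Brewery", country: "United States", state: "Ohio", city: "Cleveland" },
  { _id: "brewery_state_only", country: "United States", state: "Ohio" }, // made-up doc
  { _id: "brewery_country_only", country: "Belgium" },                    // made-up doc
  { _id: "beer_Double_Cream_Oatmeal_Stout", style: "American-Style Stout" }, // no location: emits nothing
];

console.log(runMap(byLocation, sampleDocs));
```

Documents with no location fields produce no rows at all, so the beer document never appears in this index.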
    34. Reduce functions
       • Built-in reduce functions
          • _count
          • _sum
          • _stats ({"sum": 1411, "count": 1411, "min": 1, "max": 1, "sumsqr": 1411})
       • Development procedure
          • Develop against a subset of the data
          • Build the index on the entire cluster
          • Promote a dev_ view to production
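What the three built-in reduces compute can be shown with plain JavaScript equivalents. These are illustrative re-implementations (the `count`, `sum` and `stats` names are chosen here), not the server's code. Fed 1411 emitted values of 1, as a map function emitting 1 per document would produce, `stats` yields the same figures as the `_stats` example on the slide.

```javascript
// Plain-JavaScript sketches of what _count, _sum and _stats compute.
function count(values) {
  return values.length;
}

function sum(values) {
  return values.reduce((acc, v) => acc + v, 0);
}

function stats(values) {
  if (values.length === 0) return null;
  let total = 0;
  let min = Infinity;
  let max = -Infinity;
  let sumsqr = 0;
  for (const v of values) {
    total += v;
    if (v < min) min = v;
    if (v > max) max = v;
    sumsqr += v * v; // sum of squares
  }
  return { sum: total, count: values.length, min, max, sumsqr };
}

// 1411 rows, each with the value 1.
const emitted = new Array(1411).fill(1);
console.log(count(emitted), sum(emitted), JSON.stringify(stats(emitted)));
```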
    35. Indexing and Querying
       • Indexing work is distributed amongst nodes
          • Large data set possible
          • Parallelize the effort
       • Each node has index for data stored on it
       • Queries combine the results from required nodes

       (Diagram: app servers with Couchbase client libraries and cluster maps send a query to the three-server cluster and receive a combined response. Active and replica docs are laid out as on the "Basic Operation" slide. User-configured replica count = 1.)
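The scatter/gather pattern in the bullets above can be sketched with in-memory data. Everything here is an assumption made for illustration: the per-node index contents, the `scatter`, `gather` and `query` names, and the synchronous calls standing in for network requests. Each node's rows are assumed to be already sorted by key, as an index would keep them.

```javascript
// Each node indexes only the documents it stores (rows pre-sorted by key).
const nodeIndexes = {
  "SERVER 1": [{ key: "Belgium", id: "doc5" }, { key: "United States", id: "doc2" }],
  "SERVER 2": [{ key: "Canada", id: "doc4" }, { key: "Germany", id: "doc7" }],
  "SERVER 3": [{ key: "Australia", id: "doc1" }, { key: "Japan", id: "doc3" }],
};

// Scatter: ask every node for its first `limit` rows.
function scatter(nodes, limit) {
  return Object.values(nodes).map((rows) => rows.slice(0, limit));
}

// Gather: merge the partial results, restore global key order, apply the limit.
function gather(partials, limit) {
  return partials
    .flat()
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0))
    .slice(0, limit);
}

function query(nodes, limit) {
  return gather(scatter(nodes, limit), limit);
}

console.log(query(nodeIndexes, 3).map((row) => row.key));
```

Asking each node for `limit` rows is enough because no node can contribute more than `limit` rows to the final answer.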
    36. Cross Data Center Replication
       • Data close to users
       • Multiple locations for disaster recovery
       • Independently managed clusters serving local data
       (Diagram: US, Europe and Asia data centers, linked by replication.)
    37. Couchbase and Hadoop Integration
       • Support large-scale analytics on application data by streaming data from Couchbase to Hadoop
          – Real-time integration using Flume
          – Batch integration using Sqoop
       • Examples
          – Various game statistics (e.g., monthly / daily / hourly rankings)
          – Analyze game patterns from users to enhance various game metrics
       (Diagram: memcached, Sqoop, TAP protocol listener/sender, engine interface, Couchbase Storage Engine.)
    38. Elastic Search integration
       • Use the cross data center interface
       • Agnostic to topology changes
       • De-duplication
       • Effective changes feed of the entire cluster
       (Diagram: three-server Couchbase Server cluster with active and replica docs, feeding a CROSS DATA CENTER CONNECTOR. User-configured replica count = 1.)
       Changes feed is consumed by the Elastic Search cluster, or any other consumer.
       http://blog.couchbase.com/couchbase-and-full-text-search-couchbase-transport-elastic-search
    39. Couchbase Client SDKs: Java client SDK, .Net SDK, PHP SDK, Ruby SDK, Python SDK. (Diagram: user code calls the Java client API, which uses a spymemcached connection and an HTTP couchDB connection to reach Couchbase Server.) Java example from the slide:
          CouchbaseClient cb = new CouchbaseClient(listURIs, "aBucket", "letmein");
          // this is all the same as before
          cb.set("hello", 0, "world");
          cb.get("hello");
          Map<String, Object> manyThings = cb.getBulk(keys); // keys is a Collection<String>
          // accessing a view
          View view = cb.getView("design_document", "my_view");
          Query query = new Query();
          query.setRange("abegin", "theend");
       http://www.couchbase.com/develop
    40. THANK YOU. COUCHBASE: SIMPLE, FAST, ELASTIC NOSQL. sharon@couchbase.com
