3. Couchbase NoSQL Leadership
Leading NoSQL database company
Open Source development & business model
Behind Couchbase open source project
Document-oriented NoSQL database
Focused on interactive internet and mobile applications
Provide more flexible, higher performance,
more scalable database than relational alternative
Most mature, reliable and widely deployed solution
>5,000 paid production deployments worldwide, over 350 customers
Headquarters in Silicon Valley (Mountain View, CA)
~100 employees including >60 in engineering/product
>80% of commits to Couchbase, memcached, Apache CouchDB
4. Agenda
What is a Document database
The document model
Couchbase Server
Couchbase NoSQL solution
12. Survey: The leading driver for NoSQL adoption
What is the biggest data management problem
driving your use of NoSQL in the coming year?
Lack of flexibility/rigid schemas 49%
Inability to scale out data 35%
High latency/low performance 29%
Costs 16%
All of these 12%
Other 11%
Source: Couchbase NoSQL Survey, December 2011, n=1351
14. Key Value vs. Document database
Pure Key-Value Database (Couchbase Server 1.8, current release)
10101001010100
100011110101100
010100010100011
110011000101010
010010010011001
101010100100011
101010101001010

Document Database (Couchbase Server 2.0, adds indexing/querying)
{
  "ID": 1,
  "FIRST": "Frank",
  "LAST": "Weigel",
  "ZIP": "94040",
  "CITY": "MV",
  "STATE": "CA"
}

Both Key-Value & Document Use-Cases Supported
15. Relational vs Document Data Model
Relational data model
Highly-structured table organization with rigidly-defined data formats and record structure.
[Diagram: a table with columns C1, C2, C3, C4]

Document data model
Collection of complex documents with arbitrary, nested data formats and varying “record” format.
[Diagram: a set of JSON documents]
16. RDBMS Example: User Profile
User Info
KEY  First  Last    ZIP_id
1    Frank  Weigel  2
2    Ali    Dodson  2
3    Mark   Azad    2
4    Steve  Yen     3

Address Info
ZIP_id  CITY  STATE  ZIP
1       DEN   CO     30303
2       MV    CA     94040
3       CHI   IL     60609
4       NY    NY     10010
To get info about specific user, you perform a join across two tables
17. Document Example: User Profile
{
  "ID": 1,
  "FIRST": "Frank",
  "LAST": "Weigel",
  "ZIP": "94040",
  "CITY": "MV",
  "STATE": "CA"
}
JSON
All data in a single document
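The relationship between the two relational tables and the single document can be sketched in a few lines of JavaScript. This is only an illustration: the rows are copied from the RDBMS example slide, while the helper name `toDocument` and the lower-case field names are assumptions made for the sketch, not part of any Couchbase API.

```javascript
// Rows copied from the "RDBMS Example: User Profile" slide.
const users = [
  { key: 1, first: "Frank", last: "Weigel", zipId: 2 },
  { key: 2, first: "Ali", last: "Dodson", zipId: 2 },
];
const addresses = [
  { zipId: 1, city: "DEN", state: "CO", zip: "30303" },
  { zipId: 2, city: "MV", state: "CA", zip: "94040" },
];

// Hypothetical helper: performs the join once and returns one document.
function toDocument(userKey) {
  const user = users.find((u) => u.key === userKey);
  if (!user) return null;
  const address = addresses.find((a) => a.zipId === user.zipId);
  if (!address) return null;
  return {
    ID: user.key,
    FIRST: user.first,
    LAST: user.last,
    ZIP: address.zip,
    CITY: address.city,
    STATE: address.state,
  };
}

const frankDoc = toDocument(1);
console.log(JSON.stringify(frankDoc, null, 2));
```

With the relational layout the join runs on every read; with the document layout its result is what gets stored.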
18. Making a Change Using RDBMS
User Table
User ID  First  Last    Zip    Country ID
1        Frank  Weigel  94040  001
2        Joe    Smith   94040  001
3        Ali    Dodson  94040  001
4        Sarah  Gorin   NW1    002
5        Bob    Young   30303  001
6        Nancy  Baker   10010  001
7        Ray    Jones   31311  001
8        Lee    Chen    V5V3M  008
...
50000    Doug   Moore   04252  001
50001    Mary   White   SW195  002
50002    Lisa   Clark   12425  001

Photo Table
User ID  Photo ID  Comment  Country ID
2        d043      NYC      001
2        b054      Bday     007
5        c036      Miami    001
7        d072      Sunset   133
5002     e086      Spain    133

Status Table
User ID  Status ID  Text     Country ID
1        a42        At conf  134
4        b26        excited  007
5        c32        hockey   008
12       d83        Go A’s   001
5000     e34        sailing  005

Affiliations Table
User ID  Affl ID  Affl Name  Country ID
2        a42      Cal        001
4        b96      USC        001
7        c14      UW         001
8        e22      Oxford     002

Country Table
Country ID  Country name
001         USA
002         UK
003         Argentina
004         Australia
005         Aruba
006         Austria
007         Brazil
008         Canada
009         Chile
...
130         Portugal
131         Romania
132         Russia
133         Spain
134         Sweden
19. Making the Same Change With Couchbase
{
  "ID": 1,
  "FIRST": "Frank",
  "LAST": "Weigel",
  "ZIP": "94040",
  "CITY": "MV",
  "STATE": "CA",
  "STATUS": {
    "TEXT": "At Conf",
    "GEO_LOC": "134"
  },
  "COUNTRY": "USA"
}
JSON
Just add information to a document
20. Document Databases
• Each record in the database is a self-describing document
• Each document has an independent structure
• Documents can be complex
• All databases require a unique key
• Documents are stored using JSON or XML or their derivatives
• Database can look into the documents
• Content can be indexed and queried

Example document (the first two values are cut off on the slide):
{
  "UUID": "21f7f8de-8051-5b89-86…",
  "Time": "2011-04-01T13:01:02.42…",
  "Server": "A2223E",
  "Calling Server": "A2213W",
  "Type": "E100",
  "Initiating User": "dsallings@spy.net",
  "Details": {
    "IP": "10.1.1.22",
    "API": "InsertDVDQueueItem",
    "Trace": "cleansed",
    "Tags": [
      "SERVER",
      "US-West",
      "API"
    ]
  }
}
22. Document modeling
When considering how to model data for a given application
• Think of a logical container for the data
• Think of how data groups together
Q
• Are these separate objects in the model layer?
• Are these objects accessed together?
• Do you need updates to these objects to be atomic?
• Are multiple people editing these objects concurrently?
23. Document Design Options
• One document that contains all related data
– Data is de-normalized
– Better performance and scale
– Eliminate client-side joins
• Separate documents for different object types with cross
references
– Data duplication is reduced
– Objects may not be co-located
– Transactions supported only on a document boundary
– Most document databases do not support joins or multi
document transactions
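The two design options can be compared with a small sketch. Everything here is an assumption for illustration: the in-memory `Map` stands in for a bucket, and the document IDs and the helper `loadBlogWithComments` are hypothetical rather than taken from a Couchbase SDK.

```javascript
// In-memory stand-in for a bucket; a real application would call a client SDK.
const bucket = new Map();

// Option 1: one de-normalized document that contains all related data.
bucket.set("blog_1_embedded", {
  title: "Hello",
  comments: [{ by: "ali", text: "Nice post" }],
});

// Option 2: separate documents per object type, linked by cross references.
bucket.set("blog_1", { title: "Hello", commentIds: ["comment_1"] });
bucket.set("comment_1", { by: "ali", text: "Nice post" });

// Client-side join: fetch the parent, then fetch every referenced document.
function loadBlogWithComments(blogId) {
  const blog = bucket.get(blogId);
  const comments = blog.commentIds.map((id) => bucket.get(id));
  return { title: blog.title, comments };
}

const joined = loadBlogWithComments("blog_1");
console.log(JSON.stringify(joined));
```

Option 1 needs a single lookup. Option 2 needs one lookup per referenced document, and an update that touches both the blog and a comment is no longer atomic.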
24. Document ID / Key selection
• Similar to primary keys in relational databases
• Documents are sharded based on the document ID
• ID based document lookup is extremely fast
• Usually an ID can only appear once in a bucket
Q
• Do you have a unique way of referencing objects?
• Are related objects stored in separate documents?
Options
• UUIDs, date-based IDs, numeric IDs
• Hand-crafted (human readable)
• Matching prefixes (for multiple related objects)
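As a rough sketch of the matching-prefix option, the helpers below derive the IDs of related documents from the ID of the main document. The `<Type>_<id>_<Suffix>` scheme and the function names are assumptions chosen for this example, not a convention the server enforces.

```javascript
// Hypothetical key scheme: "<Type>_<id>" for the main document and
// "<Type>_<id>_<Suffix>" for each document related to it.
function mainKey(type, id) {
  return `${type}_${id}`;
}

function relatedKey(type, id, suffix) {
  return `${mainKey(type, id)}_${suffix}`;
}

// All IDs for one user can be computed without a lookup or a query.
const userKeys = [
  mainKey("User", 1234),
  relatedKey("User", 1234, "Buddies"),
  relatedKey("User", 1234, "Messages"),
];
console.log(userKeys.join(", "));
```

Because the related IDs are computable, the application can go straight to ID-based lookups instead of storing cross references inside the main document.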
25. Example: Entities for a Blog
BLOG
• User profile
  – The main pointer into the user data
  – Blog entries
  – Badge settings, like a Twitter badge
• Blog posts
  – Contains the blogs themselves
• Blog comments
  – Comments from other users
27. Threaded Comments
• You can imagine how to take this to a threaded list
[Diagram labels: Blog, List, First comment, Reply to comment, More comments]
Advantages
• Only fetch the data when you need it
• For example, rendering part of a web page
• Spread the data and load across the entire cluster
29. Example 2 – Different object types
User class

    [Serializable]
    public class User
    {
        public long ID;
        public string Name;

        [NonSerialized]
        public List<User> Buddies;

        [NonSerialized]
        public List<Messages> Messages;

        [NonSerialized]
        public Dictionary<Game, List<Bet>> BetsByGame;
    }

Stored as key-value pairs

User
Key                     Value
User_1234               1234;Cheli;

Buddies
Key                     Value
User_1234_Buddies       User_5678, User_9876

Messages
Key                     Value
User_1234_Messages      Expire-> 9/9/9999, Message_1234, Message_5678

BetsByGame
Key                     Value
User_1234_BetsByGame    User_1234_BetsByGame_1, User_1234_BetsByGame_2
User_1234_BetsByGame_1  Bet_1234, Bet_2345
User_1234_BetsByGame_2  Bet_9876
31. Relational Technology Scales Up
Application Scales Out
Just add more commodity web servers
[Chart: Web/App Server Tier, System Cost and Application Performance vs. Users]

RDBMS Scales Up
Get a bigger, more complex server
[Chart: Relational Database, System Cost and Application Performance vs. Users, annotated “Won’t scale beyond this point”]

Expensive and disruptive sharding, doesn’t perform at web scale
32. Couchbase Server Scales Out Like App Tier
Application Scales Out
Just add more commodity web servers
[Chart: Web/App Server Tier, System Cost and Application Performance vs. Users]

NoSQL Database Scales Out
Cost and performance mirrors app tier
[Chart: Couchbase Distributed Data Store, System Cost and Application Performance vs. Users]

Scaling out flattens the cost and performance curves
33. Couchbase Server (a.k.a. Membase)
Simple. Fast. Elastic. NoSQL.
Couchbase automatically distributes data across commodity servers. Built-in caching enables
apps to read and write data with sub-millisecond latency. And with no schema to
manage, Couchbase effortlessly accommodates changing data management requirements.
34. Couchbase Server Is The Complete Solution
✔ Easy Scalability
One click scalability and no app changes.
✔ Consistent High Performance
Sub millisecond latency with high throughput for reads and writes.
✔ Always On 24x365
Maintenance, upgrades and cluster resizing all online without application downtime.
✔ Flexible Data Model
JSON document model with no fixed schema.
35. Use Case Examples
Web app or Use-case                     Couchbase Solution                                 Example Customer
Content and Metadata Management System  Couchbase document store + Elastic Search          McGraw-Hill…
Social Game or Mobile App               Couchbase stores game and player data              Zynga, OMGPOP…
Ad Targeting                            Couchbase stores user information for fast access  AOL…
User Profile Store                      Couchbase Server as a key-value store              TuneWiki…
Session Store                           Couchbase Server as a key-value store              Concur…
High Availability Caching Tier          Couchbase Server as a memcached tier replacement   Orbitz…
Chat/Messaging Platform                 Couchbase Server                                   DOCOMO…
36. #1 reason for users to move to NoSQL
38. Key results of Cisco and Solarflare Benchmark
Couchbase Server demonstrates
• Consistent sub-millisecond
latency for mixed workload
• High throughput
• Linear scalability
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/white_paper_c11-708169.pdf
39. Your secret weapon: Sub-millisecond AND consistent latency
Consistently low latencies in microseconds for varying document sizes with a mixed workload
[Chart: Latency (microseconds) vs. Object size (Bytes)]
40. Your secret weapon: Linear scalability
High throughput with 1.4 GB/sec data transfer rate using 4 servers
Linear throughput scalability
[Chart: Operations per second vs. Number of servers in cluster]
49. Basic Operation – scale out
• Docs distributed evenly across servers in the cluster
• Each server stores both active & replica docs
  – Only one server is active for a given doc at a time
• Client library provides app with simple interface to database
• Cluster map provides map to which server doc is on
  – App never needs to know
• App reads, writes, updates docs
• Multiple App Servers can access same document at same time

[Diagram: APP SERVER 1 and APP SERVER 2, each with a COUCHBASE CLIENT LIBRARY and a CLUSTER MAP, send Read/Write/Update requests to the COUCHBASE SERVER CLUSTER]

Server    Active Docs          Replica Docs
SERVER 1  Doc 5, Doc 2, Doc 9  Doc 4, Doc 1, Doc 8
SERVER 2  Doc 4, Doc 7, Doc 8  Doc 6, Doc 3, Doc 2
SERVER 3  Doc 1, Doc 3, Doc 6  Doc 7, Doc 9, Doc 5

User Configured Replica Count = 1
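The lookup that the client library performs with the cluster map can be sketched as follows. This is a simplified model under stated assumptions: the figure of 1024 shards comes from the speaker notes, the FNV-1a hash is only a stand-in (the sketch does not claim to reproduce the hash the real client libraries use), and the round-robin assignment of shards to servers is invented for the example.

```javascript
// Assumption: 1024 shards (vBuckets), the figure given in the speaker notes.
const NUM_VBUCKETS = 1024;
const servers = ["SERVER 1", "SERVER 2", "SERVER 3"];

// Stand-in hash (32-bit FNV-1a); not claimed to be the hash Couchbase uses.
function hashKey(key) {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Cluster map: shard number -> server that holds the active copy.
// Round-robin assignment is an assumption made for this sketch.
const clusterMap = Array.from(
  { length: NUM_VBUCKETS },
  (_, vb) => servers[vb % servers.length]
);

// The app only passes a document ID; the library resolves the server.
function serverFor(docId) {
  const vBucket = hashKey(docId) % NUM_VBUCKETS;
  return clusterMap[vBucket];
}

console.log(serverFor("Doc 5"));
```

Because the mapping is a pure function of the document ID and the current cluster map, every app server resolves the same ID to the same server without asking a coordinator.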
50. Add Nodes
• Two servers added to cluster
  – One-click operation
• Docs automatically rebalanced across cluster
  – Even distribution of docs
  – Minimum doc movement
• Cluster map updated
• App database calls now distributed over larger # of servers

[Diagram: APP SERVER 1 and APP SERVER 2, each with a COUCHBASE CLIENT LIBRARY and a CLUSTER MAP, send Read/Write/Update requests to a COUCHBASE SERVER CLUSTER that now has SERVER 1 to SERVER 5. The diagram also shows active Doc 3 and Doc 6 and replica Doc 7 and Doc 9 apart from the three original servers.]

Layout before the rebalance:
Server    Active Docs          Replica Docs
SERVER 1  Doc 5, Doc 2, Doc 9  Doc 4, Doc 1, Doc 8
SERVER 2  Doc 4, Doc 7, Doc 8  Doc 6, Doc 3, Doc 2
SERVER 3  Doc 1, Doc 3, Doc 6  Doc 7, Doc 9, Doc 5

User Configured Replica Count = 1
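One way to picture a rebalance that keeps doc movement to a minimum is to move only the shards that put a server over its fair share. The sketch below is an assumption-laden illustration, not the server's rebalance algorithm: the shard count of 1024 comes from the speaker notes, and the function `rebalance` and its quota rule are hypothetical.

```javascript
const SHARDS = 1024; // figure from the speaker notes
const before = ["SERVER 1", "SERVER 2", "SERVER 3"];
const after = [...before, "SERVER 4", "SERVER 5"];

// Starting map: shards spread round-robin over the original three servers.
const oldMap = Array.from({ length: SHARDS }, (_, s) => before[s % before.length]);

// Hypothetical rebalance: a shard moves only if its owner is over quota,
// and it always goes to the server that currently holds the fewest shards.
// Assumes every current owner is still present in `servers`.
function rebalance(map, servers) {
  const quota = Math.ceil(map.length / servers.length);
  const counts = new Map(servers.map((s) => [s, 0]));
  for (const owner of map) counts.set(owner, counts.get(owner) + 1);

  const next = map.slice();
  let moved = 0;
  for (let shard = 0; shard < next.length; shard++) {
    const owner = next[shard];
    if (counts.get(owner) <= quota) continue; // owner keeps this shard
    const dest = servers.reduce((a, b) => (counts.get(b) < counts.get(a) ? b : a));
    next[shard] = dest;
    counts.set(owner, counts.get(owner) - 1);
    counts.set(dest, counts.get(dest) + 1);
    moved++;
  }
  return { map: next, moved, counts };
}

const rebalanced = rebalance(oldMap, after);
console.log(rebalanced.moved, "shards moved");
```

Shards that stay put never leave their server, so clients only need the updated cluster map for the shards that moved.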
51. Fail Over Node
• App servers happily accessing docs on Server 3
• Server fails
• App server requests to server 3 fail
• Cluster detects server has failed
  – Promotes replicas of docs to active
  – Updates cluster map
• App server requests for docs now go to appropriate server
• Typically rebalance would follow

[Diagram: APP SERVER 1 and APP SERVER 2, each with a COUCHBASE CLIENT LIBRARY and a CLUSTER MAP, in front of a COUCHBASE SERVER CLUSTER of five servers. The diagram also shows Doc 3 and Doc 6 (active) and Doc 7 and Doc 9 (replica) as separate labels.]

Server    Active Docs   Replica Docs
SERVER 1  Doc 5, Doc 2  Doc 4, Doc 1
SERVER 2  Doc 4, Doc 7  Doc 6, Doc 3
SERVER 3  Doc 1, Doc 3  Doc 7, Doc 9
SERVER 4  Doc 9, Doc 8  Doc 5, Doc 2
SERVER 5  Doc 6         Doc 8

User Configured Replica Count = 1
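The promotion step can be illustrated with the five-server layout read from this slide's diagram. The function `failOver` and the data structure are assumptions made for this sketch and do not describe the cluster manager's actual implementation.

```javascript
// Active and replica doc numbers per server, as read from the slide's diagram.
const cluster = {
  "SERVER 1": { active: [5, 2], replica: [4, 1] },
  "SERVER 2": { active: [4, 7], replica: [6, 3] },
  "SERVER 3": { active: [1, 3], replica: [7, 9] },
  "SERVER 4": { active: [9, 8], replica: [5, 2] },
  "SERVER 5": { active: [6], replica: [8] },
};

// Hypothetical failover: drop the failed server and promote the replicas
// of its active docs on whichever surviving server holds them.
function failOver(nodes, failed) {
  const lost = nodes[failed].active;
  const survivors = {};
  for (const [name, node] of Object.entries(nodes)) {
    if (name === failed) continue;
    const promoted = node.replica.filter((doc) => lost.includes(doc));
    survivors[name] = {
      active: [...node.active, ...promoted],
      replica: node.replica.filter((doc) => !lost.includes(doc)),
    };
  }
  return survivors;
}

const afterFailover = failOver(cluster, "SERVER 3");
console.log(JSON.stringify(afterFailover));
```

After the promotion, Doc 1 and Doc 3 are active again but have no replica, and the replicas of Doc 7 and Doc 9 that lived on the failed server are gone, which is why a rebalance typically follows.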
52. New in Couchbase Server 2.0
JSON support
Indexing and Querying
Incremental Map Reduce
Cross data center replication
53. Additional Couchbase Server Features
Append-only storage layer
Online compaction
Better working set management
Reduce server warm-up time
Monitoring and admin API & UI
SDKs, documentation and examples for a variety of languages
54. Couchbase Server 2.0 Architecture
[Architecture diagram]

Ports and interfaces
8092           Couch View (Couch API)
11211          Memcapable 1.0 (Moxi)
11210          Memcapable 2.0 (Memcached Interface)
8091           HTTP (REST management API/Web UI)
4369           Erlang port mapper
21100 - 21199  Distributed Erlang

Data Manager
• Memcached Interface
• Couchbase EP Engine: hash table cache, write/replica queues
• Storage interface
• CouchStore
• Indexing
• Auto compaction

Cluster Manager (Erlang/OTP), with components marked “on each node” or “one per cluster”
• REST management API/Web UI
• vBucket state and replication manager
• Global singleton supervisor
• Rebalance orchestrator
• Configuration manager
• Node health monitor
• Process monitor
• Heartbeat

Other diagram labels: Membase, Distributed, CouchBase
57. Indexing and querying
• Built-in incremental map reduce
• Map functions are written in JavaScript and executed by Google’s V8 engine
• Index is built incrementally as mutations stream in
• Query in a scatter/gather fashion
58. Map function
• Map functions
function (doc) {
  // Emit the most specific location key the document supports.
  if (doc.country && doc.state && doc.city) {
    emit([doc.country, doc.state, doc.city], 1);
  } else if (doc.country && doc.state) {
    emit([doc.country, doc.state], 1);
  } else if (doc.country) {
    emit([doc.country], 1);
  }
}
REST call: http://db1.couchbase.com:8092/beer-sample/_design/dev_beer/_view/by_location?limit=10
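To see what a map function like this produces, it can be run outside the server with a stub `emit`. The sketch below makes several assumptions: the sample documents are invented, the stub simply collects rows in an array, the field checks use `&&` so that every listed field must be present, and the final grouping only imitates a count grouped on the first key element.

```javascript
// Stub for the server-provided emit(): collect rows in memory.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function: emits the most specific location key available.
function map(doc) {
  if (doc.country && doc.state && doc.city) {
    emit([doc.country, doc.state, doc.city], 1);
  } else if (doc.country && doc.state) {
    emit([doc.country, doc.state], 1);
  } else if (doc.country) {
    emit([doc.country], 1);
  }
}

// Invented sample documents; the last one has no location and emits nothing.
const docs = [
  { country: "United States", state: "California", city: "Mountain View" },
  { country: "United States", state: "California" },
  { country: "Belgium" },
  { name: "no location" },
];
docs.forEach(map);

// Imitation of a count grouped by the first element of the key.
const countByCountry = {};
for (const row of rows) {
  const country = row.key[0];
  countByCountry[country] = (countByCountry[country] || 0) + row.value;
}
console.log(countByCountry);
```

Emitting an array key is what makes grouping at different levels possible: the same index can answer counts per country, per state, or per city.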
59. Reduce functions
• Built in reduce functions
• _count
• _sum
• _stats ({“sum”: 1411, “count”: 1411, “min”: 1, “max”: 1, “sumsqr”:1411})
• Developing procedure
• Develop against a subset of the data
• Build the index on the entire cluster
• Promote a dev_ view to production
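The `_stats` output shown on the slide is what you would expect when 1411 rows each emit the value 1. A minimal sketch of the same arithmetic follows; the function name `stats` is an assumption, and it only mimics the shape of the built-in result rather than reproducing the server's implementation.

```javascript
// Computes the same five fields that the slide shows for _stats.
function stats(values) {
  return {
    sum: values.reduce((acc, v) => acc + v, 0),
    count: values.length,
    min: Math.min(...values),
    max: Math.max(...values),
    sumsqr: values.reduce((acc, v) => acc + v * v, 0),
  };
}

// 1411 emitted rows, each with the value 1, as in the slide's example.
const statsResult = stats(new Array(1411).fill(1));
console.log(JSON.stringify(statsResult));
```

With every value equal to 1, `sum`, `count`, and `sumsqr` coincide, which is why the slide shows 1411 three times.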
60. Indexing and Querying
• Indexing work is distributed amongst nodes
  – Large data set possible
  – Parallelize the effort
• Each node has index for data stored on it
• Queries combine the results from required nodes

[Diagram: APP SERVER 1 and APP SERVER 2, each with a COUCHBASE CLIENT LIBRARY and a CLUSTER MAP, send a Query to the cluster and receive a Response]

Server    Active Docs          Replica Docs
SERVER 1  Doc 5, Doc 2, Doc 9  Doc 4, Doc 1, Doc 8
SERVER 2  Doc 4, Doc 7, Doc 8  Doc 6, Doc 3, Doc 2
SERVER 3  Doc 1, Doc 3, Doc 6  Doc 7, Doc 9, Doc 5

User Configured Replica Count = 1
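The scatter/gather pattern can be sketched with three per-node indexes that mirror the active docs in the diagram. The function names and the range-query shape are assumptions for illustration; a real query goes through the view REST API rather than in-process arrays.

```javascript
// Each node indexes only the active docs stored on it (from the diagram).
const nodeIndexes = [
  ["Doc 2", "Doc 5", "Doc 9"], // SERVER 1
  ["Doc 4", "Doc 7", "Doc 8"], // SERVER 2
  ["Doc 1", "Doc 3", "Doc 6"], // SERVER 3
];

// Work done locally on one node: a range scan over its own index.
function queryNode(index, startKey, endKey) {
  return index.filter((key) => key >= startKey && key <= endKey);
}

// Scatter the query to every node, then gather and merge the partial results.
function query(startKey, endKey, limit) {
  const partials = nodeIndexes.map((index) => queryNode(index, startKey, endKey));
  return partials.flat().sort().slice(0, limit);
}

const queryResult = query("Doc 3", "Doc 7", 10);
console.log(queryResult);
```

Each node scans only its own share of the data, so the indexing and scan work is parallel, and the merge step is the only part that sees the combined result.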
61. Cross Data Center Replication
[Diagram: US DATA CENTER, EUROPE DATA CENTER, and ASIA DATA CENTER connected by replication links]
Data close to users
Multiple locations for disaster recovery
Independently managed clusters serving local data
62. XDCR: Cross Data Center Replication
• Replicate your Couchbase data across clusters
• Clusters may be spread across geos
• Configured on a per-bucket basis
• Supports unidirectional and bidirectional operation
• Application can read and write from both clusters
(active – active replication)
• Scales out linearly
• Different from intra-cluster replication
65. Elastic Search integration
• Use the cross data center interface
• Agnostic to topology changes
• De-duplication
• Effective changes feed of the entire cluster

[Diagram: COUCHBASE SERVER CLUSTER feeding a CROSS DATA CENTER CONNECTOR]

Server    Active Docs          Replica Docs
SERVER 1  Doc 5, Doc 2, Doc 9  Doc 4, Doc 1, Doc 8
SERVER 2  Doc 4, Doc 7, Doc 8  Doc 6, Doc 3, Doc 2
SERVER 3  Doc 1, Doc 3, Doc 6  Doc 7, Doc 9, Doc 5

User Configured Replica Count = 1

Changes feed to be consumed by Elastic Search cluster, or any other consumer
http://blog.couchbase.com/couchbase-and-full-text-search-couchbase-transport-elastic-search
66. Couchbase and Hadoop Integration
• Support large-scale analytics on application data by streaming data
from Couchbase to Hadoop
– Real-time integration using Flume
– Batch integration using Sqoop
• Examples
– Various game statistics (e.g., monthly / daily / hourly rankings)
– Analyze game patterns from users to enhance various game metrics
[Diagram labels: Sqoop, memcached, TAP protocol listener/sender, engine interface, Couchbase Storage Engine]
67. Couchbase Client SDKs
[Diagram: User Code calls the Java client API of the Java Client SDK, which reaches Couchbase Server over a spymemcached connection and an HTTP couchDB connection. .Net, PHP, Ruby, and Python SDKs are shown alongside.]

Java client example:

    CouchbaseClient cb = new CouchbaseClient(listURIs, "aBucket", "letmein");

    // this is all the same as before
    cb.set("hello", 0, "world");
    cb.get("hello");

    // keys is a Collection<String> of document IDs
    Map<String, Object> manyThings = cb.getBulk(keys);

    // accessing a view
    View view = cb.getView("design_document", "my_view");
    Query query = new Query();
    query.getRange("abegin", "theend");

http://www.couchbase.com/develop
Partial listing of companies with paid production deployments. Thousands more using open source.
Before we jump into evaluating NoSQL, let’s take a look at a sampling of NoSQL databases. Foundationally, every NoSQL database solution is a key-value store: a primary key identifies the record, and the value is just a blob. Document, column, and graph databases add more functionality, like indexing and querying, which is what you would expect from a database. Many would argue that memcached, an in-memory key-value store, was the precursor to all NoSQL databases. Redis is also an in-memory key-value store, but with a lot more operations on lists and sets. On the database side there is Membase, which is open source: key-value with persistence, replication for high availability, highly scalable with consistent sharding, and a built-in object-level cache that is memcached compatible. Couchbase is a descendant of Membase; it is a document database and uses JSON as the data model. It is horizontally scalable, replicates data for high availability, and includes a built-in cache for high throughput and low latency, and in addition it embeds CouchDB technology for indexing, querying, and incremental map reduce. MongoDB stores BSON and has master-slave replication, auto-sharding, and ad hoc query support that works best when used with indexes.
We ran a survey late last year; here are the results. We had 1300 respondents and tried to advertise it in a few different places to get a minimally biased set. Something in these results surprised us. The requirements for applications have changed over the years, particularly for interactive web apps. The need to support millions of users, in some cases within a matter of weeks, points to the need for a highly scalable data tier, so the scalability driver did not surprise us. What did surprise us was the schema flexibility requirement: the need for schema flexibility to rapidly create and push out updates to applications. In our first webinar, James Phillips walked through the NoSQL taxonomy and a comparison of key-value, document-oriented, and column-oriented databases. Most of these databases can scale out and don’t require schema definition. Today, I will focus on distributed document-oriented NoSQL database technology for the rest of this presentation. While biased, we at Couchbase believe that document-oriented databases give you the best balance between schema flexibility and performance, with MongoDB and Couchbase being the two most visible and widely adopted examples.
Most of you are probably familiar with the table layout. A table is defined with a set of columns, and each record in the table conforms to the schema. If you wish to capture different data in the future, the table schema must be changed using the ALTER TABLE statement. Typically, data is normalized to third normal form to reduce duplication. Large tables are split into smaller tables using foreign keys.
Example: a normalized schema with 2 tables. Foreign keys (links) connect the two. To get information about a specific user, you perform a JOIN across the two tables.
A single doc contains aggregated info that would normally be distributed across tables. Of course, in real-world complex systems (like SAP) the info tends to be spread out over tens, hundreds, or even thousands of tables.
The data is modeled for the application code and not for the database.
Document-oriented databases are in some ways extensions to key-value stores: you access the document based on keys, but you can also create indexes to ask specific questions of the content within the document. Data is stored as self-describing documents, and each document can have a different schema. A document can be simple, a list of attributes with numbers or strings as values, or it can have objects embedded within objects to form complex docs. You need a unique key / document ID to reference and access the document. Couchbase uses JSON; MongoDB uses BSON. In Couchbase, for querying, you first create views over data. Views are built using incremental map reduce. MongoDB supports ad hoc querying, but in most cases you need indexes. Sharding is used to scale horizontally across a cluster, and replication provides high availability in case of node failures.
This heavily depends on your application and use case. Are these separate objects in the ORM layer? Are these accessed together? What are the atomicity requirements? What are the concurrency requirements?
The simpler approach is to embed all related information into 1 document. Data is denormalized and almost represents a pre-computed join across tables. In contrast to this, you could split out objects into separate docs and include references in related objects. The join then needs to be processed client-side by the application. Most document databases currently do not support joins.
Key selection is very important. Keys are hard to change at a later point. IDs are similar to the primary key defined when a table is created. Lookups are extremely fast because clients know exactly which server the document belongs to, based on consistent hashing. IDs can appear only once per bucket. In Couchbase, a bucket is equivalent to a table or a collection. Selecting your ID depends on your document model as well. Questions. Options: UUIDs, hand-crafted. In some NoSQL database systems, data is sorted by ID. If you use prefixes for related objects, you can look up related objects faster. Selecting a clever ID can make your life a lot easier.
You have different entities within the application.
It has most of the fields you’d expect to have in a blog entry. The comment field is an array with embedded comment objects. It is easy to get all information about a blog. There are issues with this approach: you may not want to display all of the comments on a page, and some blogs may be very popular and have lots of comments, so you don’t want to get such a large amount of data from the database.
If you’re expecting a very large number of comments, or want to display them threaded, you can easily imagine doing so by extending the list technique discussed earlier. This makes the application more complicated but load is spread across the cluster.
As you see here, rather than storing comments inline, we can separate them to a comment list, and then from there to individual comments.
In a typical architecture, we have stateless application servers sitting behind a load balancer. As usage grows, you add additional app servers, update the load balancer, and scale out the application linearly on both aspects: cost and performance. But the data tier has a shared-everything architecture. At a minimum, these are shared-cache or shared-disk systems. So to scale up you need expensive hardware, and even from a performance perspective you hit a limit. Both cost and performance with this approach are non-linear.
Contrast that relational architecture with NoSQL systems: with a document model and auto-sharding, the database now scales horizontally along with your app server tier, giving you the linear cost and performance you want.
For those who don’t know what Draw Something is – it is a “social” game like Pictionary. Two players play. A player is presented with a list of three words, from which they pick one to draw. The other player then sees the drawing and has to guess the word. And it goes back and forth like that.
Well, the game launched on February 6, 2012. Like most new games, social media integration (Facebook in this case) makes it easy to both invite your friends to play, and to highlight that you are playing the game. This “social component” helps build popularity. A few weeks into its life, Draw Something began to get a lot of attention, including from celebrities who also used social media (Facebook, Twitter and Pinterest) to talk about their experience. One of the stars of Jersey Shore tweeted about the game in early March, kicking off the initial round of growth – to 1 million daily active users. Miley Cyrus tweeted about her Draw Something “addiction” on March 8 and growth accelerated – from over 4 million daily users. Two weeks later, at over 14 million daily active users, the company behind Draw Something was acquired by Zynga for a purported $200 plus million.
As user growth exploded, the data associated with the game expanded exponentially. By the time the company was acquired, there were over 5000 THOUSAND drawings EVERY SECOND being created and stored by Draw Something. Unprecedented growth – growth most systems would crumble under.
Unfortunately, not everyone prepares. On March 1, as Vinny and Pauly D of the Jersey Shore were tweeting about Draw Something, EA launched a game called The Simpsons: Tapped Out. Almost immediately the game charged to #2 on the iPad and #3 on the iPhone top free app lists. Growth started to follow the same trajectory as Draw Something! But the outcome couldn’t have been more different. While Draw Something continued to grow, EA was unable to keep up with the success of the game. Games were reportedly being “lost,” there was huge lag, and users were beginning to complain, loudly. Rather than praise on Twitter, there was a flood of negative reaction. EA was forced, just 4 days later, to pull the game from the App Store. As of the end of March 2012, it had still not returned. What a contrast.
JSON support: documents are natively stored as JSON, so when you build an app there is no conversion required. New doc viewing and editing capability. Indexing and querying: look inside your JSON, build views, and query for a key, for ranges, or to aggregate data. Incremental map reduce: powers indexing. Build complex views over your data. Great for real-time analytics. XDCR: replicate information from one cluster to another cluster.
All nodes are equal: there is a single node type, so it is easy to scale your cluster. No single point of failure. Every node manages some active data and some replica data. Data is distributed across the cluster using auto-sharding, and hence the load is also uniformly distributed. We have a fixed number of shards that a key gets hashed to: 1024 shards, distributed across the cluster. Replication within the cluster provides high availability. The number of replicas is configurable, with up to 3 replicas. With auto-failover or manual failover, replica information is immediately promoted to active. Add or remove multiple nodes at a time to grow and shrink your cluster.
CAPI interface – basic Couch API of which some goes through the caching layer (CRUD), some goes directly to Couch (Views)
Not yet enabled in current DP, will be available for Beta
Overview of what this feature is
Review existing Couchbase Server replication