Introduction to Couchbase Server
Perry Krug, Sr. Solutions Architect
Couchbase Server 2.0 is a high-performance, easy-to-scale, flexible document "NoSQL" database.
The NoSQL Promise
• Easy Scalability: grow the cluster without application changes and without downtime, with a single click
• Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput
• Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
• Flexible Data Model: JSON document model with no fixed schema
Couchbase Feature Set
• Flexible Data Model: JSON support, indexing/querying, incremental map-reduce
• Easy Scalability: "clone to grow" with auto-sharding, cross-data-center replication
• Consistent High Performance: built-in object-level cache
• Always On 24x365: zero-downtime maintenance, built-in data replication with auto-failover, management and monitoring UI, reliable persistence architecture
Couchbase Server Architecture
• Data Manager: object-managed cache, storage engine, and query engine; data access on ports 11210/11211, query API over http on port 8092
• Cluster Manager (Erlang/OTP): replication, rebalance, and shard state manager; REST management API and Web UI (admin console) on port 8091
Couchbase Operations
Client Interaction
[Diagram: web applications use the Couchbase client library to talk directly to every Couchbase Server node; arrows distinguish data flow, cluster management, and replication flow]
Write ('set') Operation
[Diagram: the app server sends Doc 1 to a Couchbase Server node; it lands in the managed cache, then flows into the replication queue (to other nodes) and the disk queue (to disk)]
View Processing and XDCR
[Diagram: as with a write, Doc 1 passes through the managed cache, replication queue, and disk queue; once on disk it is processed by the view engine and placed on the XDCR queue for replication to other clusters]
Disk Compaction
• Disk writes to data files and index are "append-only"
• On-disk size increases compared to actual stored data
• Compaction defragments data and index information
• Operates on a live bucket (no downtime)
• Both automatic and manual compaction available
• Compaction operates per shard on each node
Compaction
[Diagram: the initial file layout holds Doc A, Doc B, Doc C; updates are appended to the end of the file (Doc A', Doc D, Doc B', Doc A''), growing it; after compaction only the latest version of each document remains (Doc C, Doc D, Doc B', Doc A'')]
Read ('get') Operation
[Diagram: the app server issues a GET for Doc 1; the node returns it straight from the managed cache with no disk or replication-queue involvement]
Cache Ejection
[Diagram: as the managed cache fills with Doc 1 through Doc 6, documents that have already been persisted to disk are ejected from RAM to make room for new data]
Cache Miss
[Diagram: a GET for Doc 1 misses the managed cache; the document is fetched from disk, placed back into the cache, and returned to the app server]
Cluster Wide: Basic Operation
• Docs distributed evenly across servers
• Each server stores both active and replica docs; only one copy of a doc is active at a time
• Client library provides the app with a simple interface to the database
• Cluster map tells the client which server a doc is on; the app never needs to know
• App reads, writes, and updates docs; multiple app servers can access the same document at the same time
[Diagram: three servers with user-configured replica count = 1; active and replica vbuckets are spread across Servers 1-3, and each app server's Couchbase client library holds the cluster map]
Cluster Wide: Add Nodes to Cluster
• Two servers added in a one-click operation
• Docs automatically rebalanced across the cluster: even distribution of docs, minimum doc movement
• Cluster map updated
• App database calls now distributed over a larger number of servers
[Diagram: Servers 4 and 5 join the three-node cluster; active and replica docs are redistributed while the app servers continue reading and writing]
Cluster Wide: Fail Over Node
• App servers are accessing docs; requests to Server 3 fail
• Cluster detects the server has failed, promotes replicas of its docs to active, and updates the cluster map
• Requests for those docs now go to the appropriate servers
• Typically a rebalance would follow
[Diagram: Server 3 goes down; its documents are served from replicas promoted on the remaining servers]
Indexing and Querying
• Indexing work is distributed among the nodes
• Large data sets are possible; the effort is parallelized
• Each node has an index for the data stored on it
• Queries combine the results from the required nodes
[Diagram: a query is sent to each server's local index and the results are combined]
XDCR: Cross Data Center Replication
• Application can access both clusters (active-active replication)
• Scales out linearly
• Different from intra-cluster replication ("CP" versus "AP")
Full Text Search
Documents
Store & Retrieve Operations
• get (key): retrieve a document
• set (key, value): store a document, overwrites if it exists
• add (key, value): store a document, error/exception if it already exists
• replace (key, value): store a document, error/exception if it doesn't exist
• cas (key, value, cas): compare and swap, mutate the document only if it hasn't changed while executing this operation
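Because Couchbase speaks the memcached protocol on its data ports (11210/11211), these operations can be exercised with any memcached-compatible client. A minimal sketch in Python using the third-party python-memcached package, assuming a locally running node with a bucket reachable on port 11211; host, port, and key names are illustrative only:

```python
import json
import memcache

# Connect to the memcached-compatible data port of a Couchbase node
# (host/port are assumptions for a local, default-bucket setup).
client = memcache.Client(["127.0.0.1:11211"])

beer = {"type": "beer", "name": "Hoptimus Prime", "abv": 10.0}

client.set("beer_Hoptimus_Prime", json.dumps(beer))       # create or overwrite
client.add("beer_Hoptimus_Prime", json.dumps(beer))       # fails: key already exists
client.replace("beer_Hoptimus_Prime", json.dumps(beer))   # succeeds: key exists

doc = client.get("beer_Hoptimus_Prime")                   # JSON string, or None on a miss
print(json.loads(doc)["name"])
```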
Check and Set / Compare and Swap (CAS)
• Compares the supplied CAS value to validate a change to a value:
- Client gets the key and a checksum (cas_token)
- Client updates using the key and checksum
- If the checksum doesn't match, the update fails
• Client can only update if the key + CAS match
• Used when multiple clients access the same data
• First client with the correct CAS wins; subsequent client updates receive a CAS mismatch
[Diagram: Actor 1 and Actor 2 both try to update the same key on Couchbase Server; one update succeeds, the other receives a CAS mismatch]
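The usual pattern is a read-modify-write retry loop: fetch the value together with its CAS token, change it, and write it back conditionally. A sketch with the same python-memcached client as above (the cache_cas flag and the exact return values are assumptions about that client, not about Couchbase itself):

```python
import json
import memcache

client = memcache.Client(["127.0.0.1:11211"], cache_cas=True)

def add_favorite_color(key, color, retries=10):
    """Optimistically append to a list field, retrying on CAS mismatch."""
    for _ in range(retries):
        raw = client.gets(key)          # read the value and remember its CAS token
        if raw is None:
            return False                # key does not exist
        doc = json.loads(raw)
        doc.setdefault("favorite_colors", []).append(color)
        if client.cas(key, json.dumps(doc)):   # write only if unchanged since gets()
            return True                 # our CAS matched; this update won
        # someone else updated the key first; loop and try again
    return False

add_favorite_color("u::jasdeep@couchbase.com", "green")
```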
Document Driven
• Use JSON to store documents: replace serialized objects and custom structures
• Documents define a "record" of data
• Store/Update/Retrieve using the same protocol
• JSON is parsed by the server's View system
JSON Document Structure

meta (meta information, including the key; all keys are unique and kept in RAM):
{
  "id": "u::jasdeep@couchbase.com",
  "rev": "1-0002bce0000000000",
  "flags": 0,
  "expiration": 0,
  "type": "json"
}

document (the document value; the most recent copy is in RAM and persisted to disk):
{
  "uid": 123456,
  "firstname": "jasdeep",
  "lastname": "Jaitla",
  "age": 22,
  "favorite_colors": ["blue", "black"],
  "email": "jasdeep@couchbase.com"
}
A JSON Document
{
  "id": "beer_Hoptimus_Prime",
  "type": "beer",
  "abv": 10.0,
  "brewery": "Legacy Brewing Co.",
  "category": "North American Ale",
  "name": "Hoptimus Prime",
  "style": "Imperial or Double India Pale Ale"
}
Callouts: "id" is the primary key, "type" carries the type information, and "abv" is a float.
Other Documents and Document Relationships
{
  "id": "beer_Hoptimus_Prime",
  "type": "beer",
  "abv": 10.0,
  "brewery": "brewery_Legacy_Brewing_Co",
  "category": "North American Ale",
  "name": "Hoptimus Prime",
  "style": "Double India Pale Ale"
}
{
  "id": "brewery_Legacy_Brewing_Co",
  "type": "brewery",
  "name": "Legacy Brewing Co.",
  "address": "525 Canal Street, Reading, Pennsylvania, 19601 United States",
  "updated": "2010-07-22 20:00:20",
  "latitude": -75.928469,
  "longitude": 40.325725
}
The beer now references its brewery by document id rather than embedding the brewery's details.
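Resolving such a reference is simply a second key lookup by the application; there are no joins. A sketch continuing the memcached-compatible client example from above (key names come from the slide; host and port are assumptions):

```python
import json
import memcache

client = memcache.Client(["127.0.0.1:11211"])

beer = json.loads(client.get("beer_Hoptimus_Prime"))
# The "brewery" field holds the related document's key, so follow it with a second get.
brewery = json.loads(client.get(beer["brewery"]))
print(beer["name"], "is brewed by", brewery["name"])
```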
Simplicity of a Document-Oriented Datastore
• Schema is optional
- Technically, each document has an implicit schema
- Extend the schema at any time! Need a new field? Add it, and define a default for similar objects that may not have the field yet.
• Data is self-contained
- Documents more naturally model the world and the data structures around you
• Model data for your app/code instead of for the database
• Try to keep documents as small as possible (less than 1 MB)
• Group data together that belongs together, but split out portions that may have high levels of contention or are constantly growing
Views/Indexes/Queries
• Views create perspectives on a collection of documents
- Primary/secondary/tertiary/composite indexing
- Aggregations
• Use incremental map/reduce
- Map defines the relationship between fields in the documents and the output table
- Reduce provides a method for collating/summarizing
• Views materialize indexes
- Data writes are fast (no index work in the write path)
- Each index update incorporates all changes since the last update
- Documents are indexed eventually
- Views must be pre-materialized (ad-hoc querying is available via full-text indexing)
• Applications query the index
- Queries are eventually consistent with respect to documents
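Views are defined in design documents and are managed and queried over the HTTP query port (8092 in the architecture slide). The map function itself is JavaScript executed by the view engine; below is a sketch in Python that uploads a design document for the beer sample and then queries the resulting index. The design-document name, view name, bucket name, and use of the third-party requests library are illustrative assumptions. Since the index is pre-materialized, the same view can be queried with different keys, ranges, or grouping without any extra processing.

```python
import requests

VIEW_BASE = "http://127.0.0.1:8092/default"   # query port and bucket name are assumptions

# Design document with one view: index beers by category.
design_doc = {
    "views": {
        "by_category": {
            "map": """
                function (doc, meta) {
                    if (doc.type == "beer") {
                        emit(doc.category, doc.name);   // key -> value in the output table
                    }
                }
            """
        }
    }
}

# Upload (or replace) the design document.
requests.put(VIEW_BASE + "/_design/beer", json=design_doc)

# Query the materialized index for one category.
resp = requests.get(
    VIEW_BASE + "/_design/beer/_view/by_category",
    params={"key": '"North American Ale"', "stale": "false"},
)
for row in resp.json().get("rows", []):
    print(row["key"], "->", row["value"])
```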
Cluster Administration
Web Console
Backup
1) "cbbackup" is used to copy the data files from every server in the cluster, over the network, to a backup location
Restore
2) "cbrestore" (optionally with -a) is used to restore the backed-up data files into a live or different cluster
Upgrading
Two methods to upgrade a Couchbase Server cluster: in-place (offline) and rolling (online)
Sizing a Cluster
Sizing == performance
• Serve reads out of RAM
• Enough I/O for writes and disk operations
• Mitigate inevitable failures
[Diagram: reading data ("Give me document A" / "Here is document A") and writing data ("Please store document A" / "OK, I stored document A") between an application server and a Couchbase server]
How Many Nodes?
5 key factors determine the number of nodes needed:
1) RAM
2) Disk
3) CPU
4) Network
5) Data distribution/safety
[Diagram: application users connect to web application servers, which connect to the Couchbase servers]
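As a rough illustration of the RAM factor alone, the back-of-the-envelope arithmetic looks like the sketch below. Every number in it (document count, sizes, per-key metadata overhead, replica count, and headroom) is an assumed example, not a recommendation; the other four factors have to be sized separately.

```python
# Rough, illustrative RAM sizing sketch (all inputs are assumptions).
num_docs          = 50_000_000     # total documents
working_set_ratio = 0.20           # fraction that must stay cache-resident
avg_value_bytes   = 2_048          # average document size
avg_key_bytes     = 40
metadata_bytes    = 60             # assumed per-item key/metadata overhead kept in RAM
replicas          = 1              # user-configured replica count
headroom          = 0.70           # target fraction of bucket RAM actually used

# Metadata for every key is kept in RAM; values only for the working set.
metadata_ram = num_docs * (avg_key_bytes + metadata_bytes) * (1 + replicas)
value_ram    = num_docs * working_set_ratio * avg_value_bytes * (1 + replicas)
total_ram_gb = (metadata_ram + value_ram) / headroom / 1024**3

print(f"Approximate cluster RAM needed: {total_ram_gb:.0f} GB")
```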
DEMO
Couchbase is the Complete Solution
• Easy Scalability: grow the cluster without application changes and without downtime, with a single click
• Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput
• Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
• Flexible Data Model: JSON document model with no fixed schema
Thank you
Couchbase: NoSQL Document Database
Speaker Notes
  • As I mentioned, each Couchbase node is exactly the same. All nodes are broken down into two components: a data manager (on the left) and a cluster manager (on the right). It's important to realize that these are separate processes, specifically designed so that a node can continue serving its data even in the face of cluster problems like network disruption. The data manager is written in C and C++ and is responsible for the object caching layer, the persistence layer, and the querying engine. It is based on memcached and so provides a number of benefits: the very low lock contention of memcached allows extremely high throughput and low latencies, both to a small set of documents (or just one) and across millions of documents; being compatible with the memcached protocol means we are not only a drop-in replacement but also inherit support for automatic item expiration (TTL) and atomic increments; we've increased the maximum object size to 20 MB, but still recommend keeping objects much smaller; both binary objects and native JSON documents are supported; and all of the metadata for the documents and their keys is kept in RAM at all times. While this does add a bit of overhead per item, it also allows for extremely fast "miss" responses, which are critical to some applications: we don't have to scan a disk to know that we don't have some data. The cluster manager is based on Erlang/OTP, which was developed by Ericsson to manage hundreds or even thousands of distributed telco switches. This component is responsible for configuration, administration, process monitoring, statistics gathering, and the UI and REST interface. Note that no data manipulation is done through this interface.
  • When an application server or process starts up, it instantiates a Couchbase client object. This object takes a bit of configuration (language dependent) that includes one or more URLs for the Couchbase Server cluster. The client object makes a connection on port 8091 to one of the URLs in its list and receives the topology of the cluster (called a vbucket map); technically, a client connects to one bucket within the cluster. Using this map, the client library then sends data requests to the individual Couchbase Server nodes. In this way, every application server does the load balancing for us without the need for any routing or proxy process. Let's first look at the operations within each single node. Keep in mind that each node is completely independent of the others when it comes to taking in and serving data. Every operation (with the exception of queries) is only between a single application server and a single Couchbase node. All operations are atomic and there is no blocking or locking done by the database itself. Application requests are responded to as quickly as possible, which should mean sub-millisecond times depending on your network, unless a read is coming from disk; and any failure (except timeouts) is designed to be reported as quickly as possible: "fail fast".
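The bootstrap described above can be observed directly against the REST port: the bucket's configuration, including the vbucket map, is plain JSON. A sketch using the third-party requests library; the URL, default bucket name, and the exact response field names are assumptions based on the 2.0-era REST API.

```python
import requests

# Fetch the bucket configuration (including the vbucket map) from the REST port.
# Treat the endpoint and field names as assumptions; adjust for your cluster.
config = requests.get("http://127.0.0.1:8091/pools/default/buckets/default").json()

vbucket_map = config["vBucketServerMap"]
servers = vbucket_map["serverList"]          # e.g. ["10.0.0.1:11210", ...]
print(len(vbucket_map["vBucketMap"]), "vbuckets across", len(servers), "servers")
```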
  • First, let's see how a write operation on a single document is handled. 1. A set request comes in from the application. 2. Couchbase Server responds back that the key is written. 3. Couchbase Server then replicates the data, memory to memory, to one or more other nodes. 4. At the same time, the data is put into a write queue to be persisted to disk. Note that our primary form of high availability is getting the data off the node as quickly as possible; this is done from RAM to RAM and happens extremely quickly. The disk write process is always going to be a bit slower. We do everything asynchronously for the best performance, but also provide a separate operation the client can perform to wait for an item to be replicated and/or persisted to disk. It's a separate operation on a key-by-key basis, so the application developer can make the trade-off between performance and resiliency.
  • This slide has a click-by-click animation. 1. A set request comes in from the application. 2. Couchbase Server responds back that the key is written. 3. Couchbase Server then replicates the data to memory on the other nodes; at the same time, the data is put into a write queue to be persisted to disk. 4. Once it is on disk, the item is processed by the view engine and sent out over any configured XDCR link to one or more clusters.
  • The append-only file format puts all new/updated/deleted items at the end of the on-disk file. This gives better performance and reliability, and no more fragmentation, but it can leave invalidated data in the "back" of the file, so the data needs to be compacted. The compaction process operates incrementally on a per-vbucket basis and is controlled by both a fragmentation threshold and a time-of-day setting. It works by creating a new file with just the latest data and then switching over from the old file to the new one when complete.
  • Now let's look at what a read operation looks like. 1. A get request comes in from the application. 2. Assuming the data is actually present in the system and available in cache, it is returned right away without any interaction with other nodes or the disk. If a document is not in the node at all, an immediate "not found" message is returned to the application.
  • Now, as you fill up memory, some data that has already been written to disk will be ejected from RAM to make room for new data. Couchbase supports holding much more data than you have RAM available. It's important to size the RAM capacity appropriately for your working set: the portion of data your application is working with at any given point in time and needs very low latency, high throughput access to. In some applications this is the entire data set; in others it is much smaller. As RAM fills up, we use a "not recently used" algorithm to determine the best data to eject from cache.
  • Should a read now come in for one of those documents that has been ejected, it is copied back from disk into RAM and sent back to the application. The document then remains in RAM as long as there is space and it is being accessed.
  • Understanding those same operations, let's look at how this functions across a cluster. With a Couchbase Server cluster of three nodes, you can see that the documents are evenly distributed throughout the cluster. Additionally, the replica documents are also evenly distributed so that no replica document is on the same node as its active copy. This shows one replica copy, but the same logic applies when there are two or three. After the application server comes online and receives the vbucket map, all requests (read/write/update/delete) for a given document are sent to the node that is active for it. In this way, Couchbase ensures immediate and strong consistency: an application will always read its own writes. At no point is the replica data read, which would introduce inconsistency. We will see later what happens when a node fails and the replica data needs to be activated. The data is distributed (or "sharded") based upon a CRC32 hash of the key name, which creates a very even and random distribution of the data across all the nodes. Other systems shard based upon some user-generated value, which can lead to hot spots and imbalances within a cluster; we don't have those. By distributing the data evenly across the cluster and letting the clients load balance themselves, the load is also evenly distributed across all the nodes in the cluster, making them "active-active". Systems using "master-slave" configurations basically end up wasting processing power and hardware in the background. Although the diagram only shows a few "shards" of data, we actually use 1024 slices/shards/vbuckets. Technically this limits us to 1024 active nodes in a cluster, but it also has lots of benefits for smaller clusters: the data is sharded very granularly and can be moved and compacted as such. This allows the cluster to scale very evenly and linearly for more RAM, disk, network, and CPU.
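A minimal sketch of the key-to-shard mapping idea described above: hash the key with CRC32, map it onto the 1024 vbuckets, then look the vbucket up in the cluster map to find its active node. The exact bit manipulation and map format used by the real client libraries may differ; this is purely illustrative.

```python
import zlib

NUM_VBUCKETS = 1024  # fixed number of shards (vbuckets) per bucket

def vbucket_for_key(key: str) -> int:
    """Illustrative key -> vbucket mapping based on a CRC32 hash of the key name."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS

# A toy cluster map: vbucket id -> active server (real maps also list replica servers).
cluster_map = {vb: f"server-{vb % 3 + 1}" for vb in range(NUM_VBUCKETS)}

key = "beer_Hoptimus_Prime"
vb = vbucket_for_key(key)
print(f"{key} lives in vbucket {vb} on {cluster_map[vb]}")
```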
  • Now let's look at what happens when it comes time to add servers to the cluster. Starting with the same set of three nodes, we bring two more online. Note that you can add or remove multiple nodes at once before actually migrating any data; this helps greatly when you need to add or swap lots of nodes, since you don't have to move the data around multiple times. Once the administrator is ready, pressing the rebalance button moves some of the active data and some of the replica data to the new nodes. Despite what the animation shows, this is actually done incrementally, one shard (or vbucket) at a time, which not only means that load is immediately and incrementally transferred to the new nodes, but also that the process can be stopped at any point in the middle and leave the cluster in a stable, albeit unbalanced, state. This whole process is done online while the application is accessing data. There is an atomic switchover for each shard as it is moved, and the application continues reading and writing data from the original location until that happens. Any writes are synchronized to the new location before switching over, and the shard is also replicated (and optionally persisted) to ensure data safety. This same process can be used for software upgrades, hardware refreshes, and removing or swapping out misbehaving nodes.
  • http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-failover.html Finally, let's look at what happens when a node fails. Imagine the application is reading and writing to server 3. In reality it is sending requests to all the servers, but let's just focus on number 3. If that node goes down, some requests have to fail: some will already have been sent on the wire, and others may be sent before the failure is detected. It's important for your application to be prepared for some requests to fail, whether it's a problem with Couchbase or not. Once the failure is detected, the node can be failed over, either automatically by the cluster or manually by the administrator pressing a button or a script triggering our REST API. Once this happens, the replica data elsewhere in the cluster is made active, the client libraries are updated, and subsequent accesses are immediately directed at the other nodes. Notice that server 3 doesn't fail all of its data over to just one other server, which would disproportionately increase the load on that node; all of the other nodes in the cluster take on some of that data and traffic. Note also that the data on that node is not re-replicated; that would put undue load on an already degraded cluster and potentially lead to further failures. The failed node can now be rebooted or replaced and rebalanced back into the cluster. Our best practice is to return the cluster to full capacity before rebalancing, which will automatically recreate any missing replicas. There is no worry about that node bringing its potentially stale data back online: once failed over, the node is not allowed to return to the cluster without a rebalance.
  • Every node must be able to talk to every other node in each cluster; this has certain implications for cloud deployments: http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-xdcr-cloud.html
  • http://blog.couchbase.com/couchbase-and-full-text-search-couchbase-transport-elastic-search An Elasticsearch cluster is fed the documents from the Couchbase Server cluster. Elasticsearch indexes the fields (it is configurable which ones) and by default stores only references back to the document ids. The application does document access via the Couchbase Server cluster and uses the views and incremental map-reduce on the Couchbase cluster. For full-text queries it queries the Elasticsearch cluster directly (a simple HTTP and JSON interface); the full-text queries typically return the ids of the matching documents, and the documents are then retrieved from the Couchbase Server cluster. This way, the high-throughput document access always comes from the high-performance Couchbase cluster.
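A sketch of that query pattern in Python: ask Elasticsearch for matching document ids over its HTTP/JSON interface, then fetch the documents themselves from Couchbase. The index name, field name, and ports are assumptions, and the Couchbase fetch reuses the memcached-compatible client from the earlier examples.

```python
import json
import memcache
import requests

ES_SEARCH_URL = "http://127.0.0.1:9200/beer-sample/_search"   # index name/port assumed
cb = memcache.Client(["127.0.0.1:11211"])                      # Couchbase data port assumed

# 1) Full-text query against Elasticsearch; we only need the matching ids back.
query = {"query": {"match": {"description": "hoppy citrus"}}, "_source": False, "size": 10}
hits = requests.post(ES_SEARCH_URL, json=query).json()["hits"]["hits"]
doc_ids = [hit["_id"] for hit in hits]

# 2) Retrieve the actual documents from Couchbase, the system of record.
docs = cb.get_multi(doc_ids)
for key, raw in docs.items():
    print(key, "->", json.loads(raw).get("name"))
```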
  • 1. _ids, 2. ignore _rev, 3. data types
  • TODO: find a real brewery document. There is a different schema between this beer and its brewery. Obviously there is a relationship, but while a beer has an address, it's pretty temporary. A brewery probably technically has an alcohol-by-volume, but it's not commonly measured or tracked.
  • [FIXME] Title
  • http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-views.html Mention explicitly that since views and reductions are pre-materialized, the client can query the index in many different ways to get different results without requiring any extra processing.
  • http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-backup-restore.html
  • http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-backup-restore.html
  • Each one of these factors can determine the number of nodes: data sets, workload, etc. This is where you want to ask the customer what their specific hardware/infrastructure/data requirements are. Talk about commodity hardware or cloud instances, scale out versus scale up, etc.
  • Demo: walk through the UI; show statistics with a load running; add a node and rebalance; prove data is still available and the load is still running; show vbuckets moving from node to node; show views on a sample database; show the XDCR configuration.