Scale your Application Globally
using Couchbase & XDCR
Ilam Siva
Senior Product Manager
Couchbase has the ability to replicate your data across datacenters, offering a truly high-performance experience to a worldwide audience. Replication also provides resilience in the face of infrastructure failures.

In this webinar you will see:

• An overview of how cross datacenter replication (XDCR) works in Couchbase Server
• How you can use this feature to reduce risk in the face of infrastructure failures
• A live demo of XDCR setup in Couchbase

  • All nodes are equal (a single node type), so the cluster is easy to scale and has no single point of failure. Every node manages some active data and some replica data. Data, and therefore load, is distributed uniformly across the cluster using auto-sharding: keys are hashed to a fixed number of shards (1024), which are distributed across the cluster. Replication within the cluster provides high availability; the number of replicas is configurable, up to 3. With auto-failover or manual failover, replica data is immediately promoted to active. Add or remove multiple nodes at a time to grow and shrink your cluster.
  • As mentioned, each Couchbase node is exactly the same. Every node is broken down into two components: a data manager and a cluster manager. These are separate processes, specifically designed so that a node can continue serving its data even in the face of cluster problems like network disruption. The data manager, written in C and C++, is responsible for the object caching layer, the persistence layer, and the query engine. It is based on memcached and so provides a number of benefits: the very low lock contention of memcached allows extremely high throughput and low latency, whether against a small set of documents (or just one) or across millions of documents; compatibility with the memcached protocol means Couchbase is not only a drop-in replacement but also inherits support for automatic item expiration (TTL) and atomic increment; the maximum object size has been increased to 20 MB, though keeping objects much smaller is still recommended; and both binary objects and native JSON documents are supported. All of the metadata for documents and their keys is kept in RAM at all times. While this adds a bit of overhead per item, it allows extremely fast "miss" responses, which are critical to the operation of some applications: no disk scan is needed to know the data isn't there. The cluster manager is based on Erlang/OTP, which Ericsson developed to manage hundreds or even thousands of distributed telco switches. This component is responsible for configuration, administration, process monitoring, statistics gathering, and the UI and REST interface. Note that no data manipulation is done through this interface.
  • Web applications have global users, so keep the same data in multiple data centers and synchronized with cross datacenter replication. XDCR allows reads and writes of data in any data center (master-master, active-active) and gives local users low-latency access to local data (pin users to their local data center). Deal with a failing datacenter by routing traffic to any of the other data centers (disaster recovery). Conflict detection and simple resolution handle the case where the same revision of a document is modified concurrently in different datacenters (no customized policy at this point).
  • Overview of what this feature is
  • Review existing Couchbase Server replication
  • 1. A set request comes in from the application. 2. Couchbase Server responds that the key is written. 3. Couchbase Server then replicates the data out to memory on the other nodes. 4. At the same time, the data is put into a write queue to be persisted to disk.
  • Move data close to users. Read and write data in any datacenter. Multiple locations for disaster recovery. Independently managed clusters serving the same data.
  • XDCR replications are configured at the bucket level within your cluster. As shown here, bucket A is not replicated at all. Bucket B is configured with unidirectional replication from cluster 1 to cluster 2: changes made to bucket B on cluster 1 are propagated to cluster 2, but changes made to bucket B on cluster 2 are NOT propagated back to cluster 1. Bucket C is configured with bidirectional replication between clusters 1 and 2: changes made in either cluster are propagated to the other. Bucket-level configuration gives you the flexibility to customize the behavior for your specific use case.
  • Topology changes on both sides of the XDCR replication are supported

    1. Scale your Application Globally using Couchbase & XDCR
       Ilam Siva, Senior Product Manager
    2. Couchbase Open Source Project
       • Leading NoSQL database project focused on distributed database technology and the surrounding ecosystem
       • Supports both key-value and document-oriented use cases
       • All components are available under the Apache 2.0 Public License
       • Available as packaged software in both enterprise and community editions
    3. Couchbase Server
       • Easy Scalability: grow the cluster without application changes, without downtime, with a single click
       • Always On: 24x365, no downtime for software upgrades, hardware maintenance, etc.
       • Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput
       • Flexible Data Model: JSON document model with no fixed schema
    4. Additional Couchbase Server Features
       • Built-in clustering: all nodes equal
       • Append-only storage layer
       • Data replication with auto-failover
       • Online compaction
       • Zero-downtime maintenance
       • Monitoring and admin API & UI
       • Built-in managed cache
       • SDKs for a variety of languages
    5. Single Node: Couchbase Server Architecture
       • Data Manager: object-managed cache, storage engine, and query engine with query API (port 8092)
       • Cluster Manager (Erlang/OTP): admin console and REST management API / Web UI (port 8091, http); replication, rebalance, and shard state manager
       • Data access ports: 11210 / 11211
    6. Hash Partitioning
    7. Basic Operation
       • Docs are distributed evenly across servers
       • Each server stores both active and replica docs; only one copy of a doc is active at a time
       • The client library provides the app with a simple interface to the database
       • The cluster map tells the client which server a doc is on; the app never needs to know
       • The app reads, writes, and updates docs
       • Multiple app servers can access the same document at the same time
       (Diagram: APP SERVER 1 and APP SERVER 2, each with a Couchbase client library and cluster map, reading/writing to a three-server Couchbase cluster holding active and replica docs; user-configured replica count = 1)
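The hash partitioning and cluster-map routing described above can be sketched in a few lines of Python. This is a simplified illustration, not the real client: Couchbase clients do use a CRC32-based hash over 1024 vBuckets, but the exact bit manipulation and the map format here are assumptions.

```python
import zlib

NUM_VBUCKETS = 1024  # Couchbase uses a fixed number of vBuckets (shards)

def vbucket_for_key(key: str) -> int:
    """Hash a document key to a vBucket (simplified CRC32 mapping)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS

# Hypothetical cluster map for a 3-node cluster: vBucket -> active server.
# The real map also lists replica locations and is pushed to clients on rebalance.
cluster_map = {vb: "server{}".format(vb % 3 + 1) for vb in range(NUM_VBUCKETS)}

def route(key: str) -> str:
    """The client library consults the cluster map; the app never sees topology."""
    return cluster_map[vbucket_for_key(key)]

print(route("user::1001"))  # always the same server for a given key
```

When nodes are added or removed, only the cluster map changes; keys keep hashing to the same vBuckets, which is why the application needs no changes during a rebalance.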
    8. XDCR: Cross Datacenter Replication
       US DATA CENTER, EUROPE DATA CENTER, ASIA DATA CENTER
       (Image: http://blog.groosy.com/wp-content/uploads/2011/10/internet-map.jpg)
    9. Cross Datacenter Replication: The basics
       • Replicate your Couchbase data across clusters
       • Clusters may be spread across geos
       • Configured on a per-bucket basis
       • Supports unidirectional and bidirectional operation
       • Applications can read and write from both clusters (active-active replication)
       • Replication throughput scales out linearly
       • Different from intra-cluster replication
    10. Intra-cluster Replication
    11. Cross Datacenter Replication (XDCR)
    12. Single node: Couchbase Write Operation with XDCR
        (Diagram: a write from the app server lands in the Couchbase Server node's managed cache (Doc 1), then flows into the replication queue to other nodes, the disk queue to disk, and the XDCR engine to the other cluster)
    13. Internal Data Flow
        1. Document written to managed cache
        2. Document added to intra-cluster replication queue
        3. Document added to disk queue
        4. XDCR push-replicates to other clusters
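The four steps above can be modeled as a toy write path. This is a sketch of the data flow only; the class and queue names are hypothetical and say nothing about the real server's internals.

```python
from collections import deque

class WritePathSketch:
    """Toy model of the internal data flow: cache first, then three queues."""
    def __init__(self):
        self.managed_cache = {}           # step 1: document written to cache
        self.replication_queue = deque()  # step 2: intra-cluster replication
        self.disk_queue = deque()         # step 3: persistence to disk
        self.xdcr_queue = deque()         # step 4: push to remote clusters

    def write(self, key, doc):
        self.managed_cache[key] = doc      # the app gets its ack at this point
        self.replication_queue.append(key) # replicas on other nodes in the cluster
        self.disk_queue.append(key)        # local persistence
        self.xdcr_queue.append(key)        # XDCR push to other clusters

node = WritePathSketch()
node.write("doc1", {"city": "NYC"})
print(list(node.xdcr_queue))  # ['doc1']
```

The point of the separate queues is that the application's write is acknowledged from the cache, while replication, persistence, and XDCR all drain asynchronously.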
    14. XDCR in action
        (Diagram: two Couchbase Server clusters, one in the NYC data center and one in the SF data center, each with three active servers holding docs in RAM and on disk; documents written in one cluster are replicated to the other)
    15. XDCR Architecture
    16. Bucket-level XDCR
        (Diagram: Cluster 1 and Cluster 2 each hold buckets A, B, and C; replication is configured per bucket)
    17. Continuous Reliable Replication
        • All data mutations replicated to the destination cluster
        • Multiple streams round-robin across vBuckets in parallel (32 by default)
        • Automatic resume after network disruption
    18. Cluster Topology Aware
        • Automatically handles node addition and removal in source and destination clusters
    19. Efficient
        • Couchbase Server de-duplicates writes to disk
          - With multiple updates to the same document, only the last version is written to disk
          - Only this last change written to disk is passed to XDCR
        • Document revisions are compared between clusters prior to transfer
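The de-duplication idea can be illustrated with a dictionary-backed pending-write set: re-queuing a key overwrites the earlier pending version, so only the final one reaches disk (and therefore XDCR). A minimal sketch, not the actual queue implementation:

```python
# Pending writes keyed by document ID: queuing the same key again overwrites
# the earlier pending version, so only the last update is flushed to disk.
pending_writes = {}

def queue_write(doc_id, doc):
    pending_writes[doc_id] = doc

for version in range(1, 4):          # three rapid updates to the same doc
    queue_write("doc1", {"version": version})

print(pending_writes)  # {'doc1': {'version': 3}} - one disk write instead of three
```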
    20. Active-Active Conflict Resolution
        • Couchbase Server provides strong consistency at the document level within a cluster
        • XDCR provides eventual consistency across clusters
        • If a document is mutated on both clusters, both clusters will pick the same "winner"
        • In case of conflict, the document with the most updates is considered the "winner"
        (Diagram: Doc 1 on DC1 vs Doc 1 on DC2, with the copy that has more updates chosen as the winner)
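The "most updates wins" rule can be sketched as a deterministic comparison of revision metadata. The tuple layout below is an assumption for illustration; the point is that both clusters run the same comparison on the same metadata, so they independently agree on the winner.

```python
def pick_winner(rev_a, rev_b):
    """Each revision is (update_count, tiebreaker); the higher update count wins.
    The tiebreaker stands in for extra metadata that breaks ties identically on
    every cluster - its exact contents are an assumption in this sketch."""
    return rev_a if (rev_a[0], rev_a[1]) >= (rev_b[0], rev_b[1]) else rev_b

doc_on_dc1 = (3, "aa")  # updated 3 times in DC1
doc_on_dc2 = (2, "zz")  # updated 2 times in DC2

# Both data centers evaluate the same function, so both pick the same winner.
print(pick_winner(doc_on_dc1, doc_on_dc2))  # (3, 'aa')
print(pick_winner(doc_on_dc2, doc_on_dc1))  # (3, 'aa')
```

Because the comparison is a total order over the metadata, the outcome does not depend on which cluster evaluates it first, which is what makes active-active convergence possible.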
    21. Configuration and Monitoring
    22. STEP 1: Define Remote Cluster
    23. STEP 2: Start Replication
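Both configuration steps can also be driven through Couchbase Server's REST API instead of the admin console. A hedged sketch: the endpoints below are from the 2.x documentation, but the host names, credentials, and bucket names are placeholders you would replace, and the commands need a live pair of clusters to run against.

```shell
# STEP 1: define the remote cluster (run against the source cluster, port 8091)
curl -u Administrator:password http://source-cluster:8091/pools/default/remoteClusters \
  -d name=remote \
  -d hostname=destination-cluster:8091 \
  -d username=Administrator \
  -d password=password

# STEP 2: start a continuous replication from a local bucket to the remote one
curl -u Administrator:password http://source-cluster:8091/controller/createReplication \
  -d fromBucket=beer-sample \
  -d toCluster=remote \
  -d toBucket=beer-sample \
  -d replicationType=continuous
```

Treat this as a configuration sketch rather than something to copy verbatim; the admin console performs the same two calls under the hood.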
    24. Monitor Ongoing Replications
    25. Detailed Replication Progress
        • Source Cluster
        • Destination Cluster
    26. Demo!
    27. XDCR Topologies
    28. Unidirectional
        • Hot spare / Disaster Recovery
        • Development/Testing copies
    29. Bidirectional
        • Multiple Active Masters
        • Disaster Recovery
        • Datacenter Locality
    30. Chain
    31. Data aggregation
    32. Data propagation
    33. XDCR in the Cloud
        • Server Naming
          - Optimal configuration uses a DNS name that resolves to the internal address for intra-cluster communication and to the public address for inter-cluster communication
        • Security
          - XDCR traffic is not encrypted; plan topology accordingly
          - Consider 3rd-party Amazon VPN solutions
    34. Use Cases
    35. Scale your data globally
        • Data closer to your users is faster for your users
    36. Disaster Recovery
        • Ensure 24x7x365 data availability even if an entire data center goes down
    37. Development and Testing
        • Test code changes with actual production data without interrupting your production cluster
        • Give developers local databases with real data, easy to dispose of and recreate
        (Diagram: Test and Dev, Staging, Production)
    38. Impact of XDCR on the cluster
        Your clusters need to be sized for XDCR.
        • XDCR is CPU intensive
          - Configure the number of parallel streams based on your CPU capacity
          - Release 2.2 is a tremendous improvement in this regard; the XDCR version 2 protocol in the 2.2 release is strongly recommended for high performance and optimized resource usage
        • You are doubling your I/O usage
          - I/O capacity needs to be sized correctly
        • You will need more memory, particularly for bidirectional XDCR
          - Memory capacity needs to be sized correctly
    39. Additional Resources
        • Couchbase Server Manual: http://www.couchbase.com/docs/couchbase-manual2.0/couchbase-admin-tasks-xdcr.html
        • Getting Started with XDCR blog: http://blog.couchbase.com/cross-data-center-replicationstep-step-guide-amazon-aws
    40. Q&A
    41. Thank you
