Deep Dive into High Availability and Disaster Recovery Features in Couchbase Server 4.0: Couchbase Connect 2015

Join this demo-filled session to learn how to deliver continuously available, mission-critical apps across data centers. For today’s mission-critical apps, high availability is no longer a ‘nice to have’ but essential. Downtime and data loss are unacceptable because they result in lost revenue. In this session we will cover the wide array of high availability and disaster recovery features available in Couchbase Server.

Published in: Technology

  1. 1. DEEP DIVE INTO HIGH AVAILABILITY & DISASTER RECOVERY FEATURES IN COUCHBASE SERVER. Anil Kumar, Senior Product Manager, Couchbase
  2. 2. ©2015 Couchbase Inc. About Me: Anil Kumar, Sr. Product Manager, Couchbase (anil@couchbase.com, @anilkumar1129)
  3. 3. Next 40 minutes …
     • Part I - High Availability
       • Single-node architecture
       • Local data redundancy
       • Rebalance and failover
       • Node recovery
     • Part II - Disaster Recovery
       • Business continuity for “mission-critical” applications
       • Geo-redundancy
       • Backup-restore for the worst case scenario
     • Demo
     • Q & A
  4. 4. Part I - High Availability
  5. 5. Couchbase Server – Single-Node Architecture
     • Single node type is the foundation for a high-availability architecture
     • No single point of failure (SPOF)
     • Easy scalability
     [Diagram: Couchbase Server 1, 2, and 3, each an identical node with a Cluster Manager, Managed Cache, Storage, shards, and the Data, Index, and Query Services]
  6. 6. Intra-Cluster Replication – Data Redundancy
     Intra-cluster replication is the process of replicating data on multiple servers within a cluster in order to provide data redundancy.
     • RAM-to-RAM replication
     • Max of 4 copies of data in a cluster
     • Bandwidth optimized through deduplication
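The deduplication point can be illustrated with a toy replication queue: if a key is updated several times before the queue is drained, only its latest version needs to cross the network. This is a minimal sketch that assumes a per-key coalescing queue; the class and method names are hypothetical and do not describe Couchbase internals.

```python
from collections import OrderedDict


class ReplicationQueue:
    """Toy outbound queue that keeps only the newest pending value per key."""

    def __init__(self):
        self._pending = OrderedDict()  # key -> latest value not yet sent
        self.mutations_seen = 0

    def enqueue(self, key, value):
        self.mutations_seen += 1
        # A newer write supersedes any older pending write for the same key.
        self._pending.pop(key, None)
        self._pending[key] = value

    def drain(self):
        """Return the batch to send to replicas and clear the queue."""
        batch = list(self._pending.items())
        self._pending.clear()
        return batch


queue = ReplicationQueue()
for key, value in [("doc1", "v1"), ("doc2", "v1"), ("doc1", "v2"), ("doc1", "v3")]:
    queue.enqueue(key, value)

batch = queue.drain()
print(queue.mutations_seen, "mutations ->", len(batch), "items sent:", batch)
```

Four mutations collapse into two replicated items here, because the three writes to `doc1` are sent once with the final value.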
  7. 7. Write Operation – Data Redundancy
     • Caching based on memcached: the app gets an ACK when a write is successfully in RAM
       • Or RAM + Replicated
       • Or RAM + Persisted
       • Or RAM + Replicated + Persisted
     • DCP-based replication: writes are queued to other nodes
     • Couchstore-based storage: writes are queued for storage
     [Diagram: an application server writes DOC 1 to the managed cache; DCP then feeds the replication queue, the disk queue, and the indexer]
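The four acknowledgement options above amount to a predicate over the state of a write. The sketch below models that predicate in plain Python; `WriteState`, `is_acknowledged`, and their parameters are hypothetical names chosen for illustration and are not the SDK's API.

```python
from dataclasses import dataclass


@dataclass
class WriteState:
    in_ram: bool = False        # accepted into the active node's managed cache
    replicas_acked: int = 0     # replica nodes holding the write in RAM
    persisted: bool = False     # written to disk on the active node


def is_acknowledged(state, replicate_to=0, persist=False):
    """Return True once the write meets the requested durability level."""
    if not state.in_ram:
        return False
    if state.replicas_acked < replicate_to:
        return False
    if persist and not state.persisted:
        return False
    return True


state = WriteState(in_ram=True)
ram_only = is_acknowledged(state)                     # RAM
ram_replicated = is_acknowledged(state, replicate_to=1)  # RAM + Replicated, not yet met
state.replicas_acked = 1
state.persisted = True
strictest = is_acknowledged(state, replicate_to=1, persist=True)
print(ram_only, ram_replicated, strictest)
```

The stricter the requested level, the later the acknowledgement arrives, which is the latency-versus-safety trade-off the slide lists.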
  8. 8. Database Change Protocol – Data Redundancy
     DCP is a streaming replication protocol introduced in Couchbase Server 3.0
     • High-performance, stream-based protocol
     • Better resume-ability after blips and failures
     • Ordering
     • Consistent
     [Diagram: features fed by DCP: Intra-Cluster Replication, Cross Datacenter Replication, Incremental Rebalance, Incremental Backup & Restore, Incremental Map/Reduce Views, Global Secondary Indexes, Connectors (Kafka, Sqoop, Spark), and, in future, external streams for Change Data Capture (CDC)]
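Ordering and resume-ability can be pictured as a log of mutations tagged with increasing sequence numbers, where a consumer remembers the last number it applied and asks for everything after it. The following is a simplified model under that assumption, not the DCP wire protocol; `ChangeLog` and `Consumer` are hypothetical names.

```python
class ChangeLog:
    """Producer side: an append-only log with increasing sequence numbers."""

    def __init__(self):
        self._entries = []  # list of (seqno, key, value)

    def append(self, key, value):
        seqno = len(self._entries) + 1
        self._entries.append((seqno, key, value))
        return seqno

    def stream_from(self, last_seen_seqno):
        """Yield every mutation after last_seen_seqno, in order."""
        for entry in self._entries[last_seen_seqno:]:
            yield entry


class Consumer:
    """Consumer side: applies mutations and remembers where it stopped."""

    def __init__(self):
        self.last_seqno = 0
        self.data = {}

    def consume(self, log, limit=None):
        for count, (seqno, key, value) in enumerate(log.stream_from(self.last_seqno), 1):
            self.data[key] = value
            self.last_seqno = seqno
            if limit is not None and count >= limit:
                break  # simulate a dropped connection mid-stream


log = ChangeLog()
for i in range(1, 6):
    log.append(f"doc{i}", f"v{i}")

replica = Consumer()
replica.consume(log, limit=2)      # connection drops after two mutations
resumed_from = replica.last_seqno
log.append("doc1", "v1-updated")   # a new mutation arrives while disconnected
replica.consume(log)               # picks up after the last applied seqno
print(resumed_from, replica.last_seqno, replica.data["doc1"])
```

Because the consumer restarts from its last applied sequence number, a short interruption costs only the missed mutations rather than a full resynchronization.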
  9. 9. Auto-Tuning Shared Thread Pool - Durability
     • Efficient auto-tuning engine
       • Detects and allocates threads based on hardware resources
       • Pools threads for best resource utilization
     • Improved latency across the board
       • Faster rebalance
       • Faster node reactivation
       • Faster durability with writes & PersistTo
  10. 10. Rebalance Operation – Data Availability
      • Rebalance redistributes data partitions (data) around a cluster
        • When adding nodes
        • When removing nodes
        • When nodes have failed over
      • The aim is to bring a cluster back to optimal health
      • Data partitions are moved between nodes automatically
      • Rebalance happens on an active cluster
        • Allows you to expand or shrink without pausing your application
        • Client libraries automatically handle the rebalance and redistribute their requests accordingly
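The redistribution step can be sketched as computing a per-node quota and moving only the partitions that exceed it or whose owner has left. This is an illustrative toy, assuming one owner per partition and ignoring replicas; `rebalance` and its arguments are hypothetical names and the real rebalance algorithm is not described in the slides.

```python
from collections import Counter


def rebalance(partition_map, nodes):
    """Return (new_map, moves) with partitions spread evenly over `nodes`.

    partition_map: dict partition_id -> node name (current owner)
    nodes: list of node names that should own data after the rebalance
    """
    new_map = dict(partition_map)
    base, extra = divmod(len(new_map), len(nodes))
    # The first `extra` nodes may hold one partition more than the rest.
    quota = {node: base + (1 if i < extra else 0) for i, node in enumerate(nodes)}
    load = Counter()
    to_move = []
    for pid in sorted(new_map):
        owner = new_map[pid]
        if owner in quota and load[owner] < quota[owner]:
            load[owner] += 1          # partition stays where it is
        else:
            to_move.append(pid)       # owner removed or over quota
    moves = []
    for pid in to_move:
        target = next(n for n in nodes if load[n] < quota[n])
        moves.append((pid, new_map[pid], target))
        new_map[pid] = target
        load[target] += 1
    return new_map, moves


old_nodes = ["node1", "node2", "node3"]
current = {pid: old_nodes[pid % 3] for pid in range(12)}

# Adding node4: each node should end up with 3 of the 12 partitions.
new_map, moves = rebalance(current, old_nodes + ["node4"])
print(moves)
print(Counter(new_map.values()))
```

In this example only three of the twelve partitions change owner, and a client holding the updated map would route requests for those three to `node4`.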
  11. 11. Failover Operation – Fault Tolerance
      • Failover automatically switches over to the replicas for a given database
        • Gracefully under node maintenance
        • Immediately under auto-failover
      • Can be triggered manually through the Admin UI, REST, or CLI
      • Automatic failover in case of unplanned outages (system failures)
        • Can be configured through the Admin UI, REST, or CLI
        • Constraints in place to avoid “split-brain” and false positives
          • 30 second delay, multiple heartbeat “pings”
          • Clusters >= 3 nodes
          • Only one node down at a time
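The auto-failover constraints read naturally as a guard function that must pass before any node is failed over automatically. The sketch below encodes the three constraints from the slide; the function name, parameters, and defaults are hypothetical, and the real cluster manager's checks may be more involved.

```python
def may_auto_failover(cluster_size, unresponsive_seconds, nodes_down,
                      timeout_seconds=30, min_cluster_size=3):
    """Return True only if every constraint listed on the slide is satisfied."""
    if cluster_size < min_cluster_size:
        return False   # clusters smaller than 3 nodes are excluded
    if nodes_down != 1:
        return False   # more than one node down: leave it to an operator
    if unresponsive_seconds < timeout_seconds:
        return False   # may be a short blip; keep sending heartbeats
    return True


print(may_auto_failover(cluster_size=4, unresponsive_seconds=45, nodes_down=1))
print(may_auto_failover(cluster_size=2, unresponsive_seconds=45, nodes_down=1))
print(may_auto_failover(cluster_size=4, unresponsive_seconds=10, nodes_down=1))
print(may_auto_failover(cluster_size=4, unresponsive_seconds=45, nodes_down=2))
```

Only the first call passes; each of the others trips exactly one constraint, which is how the guard avoids acting on false positives.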
  12. 12. Automatic Failover – In Action
      • App servers are accessing shards
      • Requests to Server 3 fail
      • Cluster detects the server failure
        • Promotes replicas of shards to active
        • Updates the cluster map
      • Requests for docs now go to the appropriate server
      • Typically a rebalance would follow
      [Diagram: Servers 1 to 5, each holding active and replica shards; two application servers, each with a Couchbase client library and a cluster map]
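The sequence above (detect failure, promote replicas, update the cluster map, reroute requests) can be traced on a small in-memory cluster map. This is a minimal sketch with a made-up shard layout; the map structure and the function names `fail_over` and `node_for` are assumptions for illustration only.

```python
def fail_over(cluster_map, failed_node):
    """Promote a replica for every shard whose active copy was on failed_node.

    cluster_map: dict shard_id -> {"active": node, "replicas": [node, ...]}
    Returns a new map; the input is left unchanged.
    """
    updated = {}
    for shard, placement in cluster_map.items():
        active = placement["active"]
        replicas = [n for n in placement["replicas"] if n != failed_node]
        if active == failed_node:
            if not replicas:
                raise RuntimeError(f"shard {shard} has no surviving replica")
            active, replicas = replicas[0], replicas[1:]
        updated[shard] = {"active": active, "replicas": replicas}
    return updated


def node_for(cluster_map, shard):
    """What a client library would look up before sending a request."""
    return cluster_map[shard]["active"]


# Hypothetical layout, not the one drawn on the slide.
cluster_map = {
    1: {"active": "server3", "replicas": ["server1"]},
    2: {"active": "server1", "replicas": ["server2"]},
    3: {"active": "server3", "replicas": ["server2"]},
    4: {"active": "server2", "replicas": ["server3"]},
}

after = fail_over(cluster_map, "server3")
print(node_for(cluster_map, 1), "->", node_for(after, 1))
print(after)
```

After the failover, shards 1 and 3 are served from their former replicas and shard 4 has lost its replica, which is why a rebalance typically follows to restore redundancy.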
  13. 13. Node Recovery – Bring Cluster Back to Capacity
      • A failed node can be added back to the cluster:
        • Full recovery: add back the failed node as a fresh node
        • Delta node recovery: add back the failed node incrementally into the cluster, without having to rebuild the full node
  14. 14. Rack-Zone Awareness – Rack-Zone Availability
      • Grouping of servers into server groups so that each group is on a physically separate rack
      • Ensures that replica data partitions are not on the same rack as the primary partitions
      • Example:
        • Servers 1, 2, 3 on Rack 1
        • Servers 4, 5, 6 on Rack 2
        • Servers 7, 8, 9 on Rack 3
        • Cluster has 2 replicas (3 copies of data)
        • This is a balanced configuration
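The placement rule can be sketched with the nine-server example from the slide: for a partition whose active copy lives on a given server, choose replica servers only from the other racks. The function below is a hypothetical illustration; it additionally assumes one replica per remaining rack, which the slide does not state.

```python
def place_replicas(groups, active_node, num_replicas):
    """Pick replica nodes, each from a different group than the active copy.

    groups: dict group name -> list of server ids
    """
    active_group = next(g for g, servers in groups.items() if active_node in servers)
    other_groups = [g for g in groups if g != active_group]
    if num_replicas > len(other_groups):
        raise ValueError("not enough other groups for one replica per group")
    # Reuse the active node's position so load spreads inside each group.
    position = groups[active_group].index(active_node)
    replicas = []
    for group in other_groups[:num_replicas]:
        servers = groups[group]
        replicas.append(servers[position % len(servers)])
    return replicas


racks = {"Rack 1": [1, 2, 3], "Rack 2": [4, 5, 6], "Rack 3": [7, 8, 9]}
print(place_replicas(racks, active_node=2, num_replicas=2))
print(place_replicas(racks, active_node=7, num_replicas=2))
```

With two replicas and three racks, every partition ends up with one copy per rack, so losing an entire rack still leaves two copies available.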
  15. 15. Couchbase Server - MDS Architecture (NEW in 4.0)
      What is Multi-Dimensional Scalability? MDS is the architecture that enables independent scaling of data, query, and indexing workloads. It also provides isolation of services to minimize interference, with independent “zones” for the query service, index service, and data service.
      [Diagram: a Couchbase cluster with Index Service, Query Service, and Data Service zones; node1, node8]
  16. 16. Demo !!!
  17. 17. Part II – Disaster Recovery
  19. 19. Cross Datacenter Replication (XDCR)
      • Unidirectional replication
        • Hot spare and disaster recovery
        • Development and testing copies
      • Bidirectional replication
        • Datacenter locality
        • Multiple active masters
  20. 20. Cross Datacenter Replication (XDCR) using DCP
      • Continuously replicates data from the source cluster to remote clusters that can be spread across geographies
      • Supports unidirectional and bidirectional operations
      • Applications can read from and write to both clusters (active-active replication)
      • Automatically handles node addition and removal
      • Simplified administration via the Admin UI, REST APIs, and CLI
      • Pause-and-resume of XDCR streams
      • (NEW in 4.0) Filtering of data on replication streams
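Filtering a replication stream can be pictured as a predicate applied to each mutation before it is sent to the remote cluster. The sketch below assumes the filter is a regular expression matched against document keys; the slide says only that streams can be filtered, so treat both that assumption and the function name `filter_stream` as illustrative.

```python
import re


def filter_stream(mutations, key_pattern):
    """Yield only the mutations whose key matches the filter expression."""
    matcher = re.compile(key_pattern)
    for key, document in mutations:
        if matcher.search(key):
            yield key, document


mutations = [
    ("user::1", {"name": "Ada"}),
    ("order::7", {"total": 42}),
    ("user::2", {"name": "Lin"}),
]

# Replicate only user documents to the remote cluster.
replicated = list(filter_stream(mutations, r"^user::"))
print([key for key, _ in replicated])
```

A filter like this lets one replication stream carry a subset of a bucket, for example sending only the data a remote region actually needs.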
  21. 21. XDCR – Memory-based Using DCP
      [Diagram: an application server writes DOC 1 to the managed cache; from memory, DCP feeds intra-cluster replication, the disk queue, the indexer, and cross datacenter replication]
  22. 22. Backup & Restore – Oops Case
      • The cbbackup tool provides backup for a running cluster
        • Entire cluster, across all buckets
        • Single node, across all buckets
        • Single node, single bucket
        • Supports remote or local access
      • Incremental backups
        • Differential or cumulative
        • Only back up data that has changed since the last backup
        • Minimize resource and time consumption during backups
        • Enable more frequent backups
      • Restore the cluster to the point in time of a differential or cumulative backup
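The difference between the two incremental modes comes down to which earlier backup serves as the baseline. The sketch below assumes that a differential backup copies changes since the most recent backup of any kind, while a cumulative backup copies changes since the most recent full backup; the function and mode names are hypothetical and are not cbbackup options.

```python
def select_changes(change_log, last_full_seqno, last_backup_seqno, mode):
    """Return the mutations a backup run would copy.

    change_log: list of (seqno, key) in increasing seqno order
    mode: "full", "differential" (since the last backup of any kind),
          or "cumulative" (since the last full backup)
    """
    if mode == "full":
        start = 0
    elif mode == "differential":
        start = last_backup_seqno
    elif mode == "cumulative":
        start = last_full_seqno
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [(seqno, key) for seqno, key in change_log if seqno > start]


change_log = [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "b"), (6, "d")]

# A full backup ran at seqno 2 and a differential backup ran at seqno 4.
differential = select_changes(change_log, 2, 4, "differential")
cumulative = select_changes(change_log, 2, 4, "cumulative")
print(differential)
print(cumulative)
```

Differential runs stay small but a restore needs the full backup plus every differential after it; a cumulative run is larger but a restore needs only the full backup and the latest cumulative.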
  23. 23. Demo !!!
  24. 24. Deeper Dive into Architecture: THUR @1.00, Architecture Track. “Deep Dive into Cluster Manager in Couchbase Server 4.0”, Dave Finlay, Senior Director of Development, Couchbase
  25. 25. Deeper Dive into Architecture: THUR @10.30, Architecture Track. “Multi-Dimensional Scaling: A New Architecture for Scaling Big Data Application”, Anil Kumar, Senior Product Manager, Couchbase
  26. 26. Best Practices: THUR @5.15, Operations Track. “Best Practices: Enabling HA - DR for Mission Critical Production Systems”, Kirk Kirkconnell, Senior Solutions Engineer, Couchbase
  27. 27. Thank you.
  28. 28. Get Started with Couchbase Server 4.0: www.couchbase.com/beta; Get Trained on Couchbase: training.couchbase.com
