
Replication in the Wild


A tutorial-style technical presentation that covers the fundamental approaches to replication, along with their advantages, disadvantages, and how they compare with each other.

Published in: Engineering

  1. REPLICATION IN THE WILD
     Ensar Basri Kahveci
  2. REPLICATION
     - Putting a data set into multiple nodes.
     - Each replica has a full copy.
     - A few reasons for replication:
       - Performance
       - Availability and fault tolerance
     - Mostly used with partitioning.
  3. NOTHING FOR FREE!
     - Very easy to do when the data is immutable.
     - Problems start when we have multiple copies of the data and we want to update them.
     - Two main difficulties:
       - Handling updates
       - Handling failures
  4. The dangers of replication and a solution
     - Gray et al. [1] classify replication models by 2 parameters:
       - Where to make updates: primary copy or update anywhere
       - When to make updates: eagerly or lazily
  5. WHERE: PRIMARY COPY
     - There is a single replica managing the updates.
     - Concurrency control is easy.
     - No conflicts and no conflict-handling logic.
     - Updates are executed on the primary and the secondaries in the same order.
     - When the primary fails, a new primary is elected.
     - Ensuring a single and good primary is hard.
  6. WHERE: UPDATE ANYWHERE
     - Each replica can initiate a transaction to make an update.
     - Complex concurrency control.
     - Deadlocks or conflicts are possible.
     - In practice, there is also multi-leader replication.
  7. WHEN: EAGER REPLICATION
     - Synchronously updates all replicas as part of one atomic transaction.
     - Provides strong consistency.
     - Not very flexible. Degree of availability can degrade on node failures.
     - Consensus algorithms.
  8. WHEN: LAZY REPLICATION
     - Updates each replica with a separate transaction.
     - Updates can execute quite fast.
     - Degree of availability is high.
     - Eventual consistency.
     - Data copies can diverge.
     - Data loss or conflicts can occur.
  9. WHERE? WHEN?
     - EAGER + PRIMARY COPY: strong consistency, simple concurrency, slow, inflexible
     - EAGER + UPDATE ANYWHERE: strong consistency, complex concurrency, slow, expensive, deadlocks
     - LAZY + PRIMARY COPY: fast, eventual consistency, simple concurrency, inconsistency
     - LAZY + UPDATE ANYWHERE: fast, available, flexible, eventual consistency, inconsistency, conflicts
  10. WHERE? WHEN?
     - EAGER + PRIMARY COPY: Multi Paxos [5], etcd and Consul (Raft) [6], ZooKeeper (Zab) [7], Kafka
     - EAGER + UPDATE ANYWHERE: Paxos [5], Hazelcast Cluster State Change [12]
     - LAZY + PRIMARY COPY: Hazelcast, MongoDB, ElasticSearch, Redis
     - LAZY + UPDATE ANYWHERE: Dynamo [4], Cassandra, Riak
  11. PRIMARY COPY + EAGER REPLICATION
     - When the primary fails, secondaries are guaranteed to be up to date.
     - Raft, Kafka etc.
     - A majority approach can be used.
     - In Kafka, an in-sync replica set is maintained [11].
     - Secondaries can be used for reads.
  12. UPDATE ANYWHERE + EAGER REPLICATION
     - Each replica can initiate a new transaction.
     - Concurrent transactions can compete with each other.
     - Possibility of deadlocks.
     - In the basic Paxos algorithm, there is no designated leader.
  13. PRIMARY COPY + LAZY REPLICATION
     - The primary copy can execute updates fast.
     - Secondaries can fall behind the primary. This is called replication lag.
     - Replication lag can lead to data loss during leader failover, but there are still no conflicts.
     - Secondaries can be used for reads.
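
How replication lag turns into data loss on failover can be sketched with a toy model. This is only an illustration: the `LazyPrimaryCopy` class and its method names are hypothetical and do not correspond to any of the systems named in the slides.

```python
class LazyPrimaryCopy:
    """Toy model: one primary log and one secondary log, replicated asynchronously."""

    def __init__(self):
        self.primary = []
        self.secondary = []

    def write(self, value):
        # The primary applies the update without waiting for the secondary.
        self.primary.append(value)

    def replicate_one(self):
        # Ship the next not-yet-replicated update, preserving the primary's order.
        if len(self.secondary) < len(self.primary):
            self.secondary.append(self.primary[len(self.secondary)])

    def lag(self):
        # Number of updates the secondary has not received yet.
        return len(self.primary) - len(self.secondary)

    def failover(self):
        # Promote the secondary; updates it never received are lost.
        lost = self.primary[len(self.secondary):]
        self.primary = list(self.secondary)
        return lost


store = LazyPrimaryCopy()
for v in ["a", "b", "c"]:
    store.write(v)
store.replicate_one()
print("lag:", store.lag())
print("lost on failover:", store.failover())
```

Because all updates still flow through a single primary in a single order, the secondary's log is always a prefix of the primary's log: updates can be lost, but the copies never conflict.
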
  14. UPDATE ANYWHERE + LAZY REPLICATION
     - Dynamo-style [4] highly available databases.
     - Quorums
     - Concurrent updates are first-class citizens.
     - Possibility of conflicts
     - Avoiding, discarding, detecting & resolving conflicts
     - Eventual convergence
     - Write repair, read repair and anti-entropy
  15. QUORUMS
     - W + R > N
     - W = 3, R = 1, N = 3
     - W = 1, R = 3, N = 3
     - W = 2, R = 2, N = 3
     - If W or R is not met, consistency may be broken.
     - Sloppy quorums and hinted handoff.
     - Even if W and R are met, consistency can still be broken.
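
The W + R > N rule can be checked by brute force: it holds exactly when every possible write set shares at least one replica with every possible read set. The sketch below assumes fixed replica membership and is not taken from any particular database; `quorums_overlap` is a hypothetical helper name.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Return True if every write set of size w intersects every read set of size r."""
    replicas = range(n)
    return all(
        set(ws) & set(rs)  # an empty intersection is falsy
        for ws in combinations(replicas, w)
        for rs in combinations(replicas, r)
    )


for w, r in [(3, 1), (1, 3), (2, 2), (1, 1)]:
    print(f"N=3 W={w} R={r} overlap={quorums_overlap(3, w, r)} rule={w + r > 3}")
```

The three configurations from the slide all overlap, while W = 1, R = 1 does not. The overlap only guarantees that a read contacts at least one replica holding the latest acknowledged write; it says nothing about concurrent writes or sloppy quorums.
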
  16. Conflict-free replicated data types (CRDTs)
     - Special data types that achieve strong eventual consistency and monotonicity [2]
     - No conflicts
     - Merge function has 3 properties:
       - Commutative: A + B = B + A
       - Associative: A + (B + C) = (A + B) + C
       - Idempotent: f(f(x)) = f(x)
     - Riak Data Types [3]
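
As a minimal sketch of these merge properties, consider a state-based grow-only counter (G-Counter), one of the simplest CRDTs. The state is assumed to be a plain dict from replica id to that replica's local count; the function names are illustrative and are not Riak's API. For a binary merge, idempotence reads `merge(x, x) == x`.

```python
def increment(state: dict, replica: str, amount: int = 1) -> dict:
    """Return a new state in which `replica` has counted `amount` more events."""
    new_state = dict(state)
    new_state[replica] = new_state.get(replica, 0) + amount
    return new_state


def merge(a: dict, b: dict) -> dict:
    """Element-wise maximum: commutative, associative and idempotent."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def value(state: dict) -> int:
    """The counter value is the sum of all per-replica counts."""
    return sum(state.values())


# Two replicas count independently, then exchange state in either order.
a = increment({}, "replica-a", 2)
b = increment({}, "replica-b", 3)
print(value(merge(a, b)), value(merge(b, a)))
```

Because each replica only ever raises its own entry, taking the maximum per entry never loses an increment, no matter how often or in which order states are exchanged.
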
  17. DISCARDING CONFLICTS: LAST WRITE WINS
     - When 2 updates are concurrent, define an arbitrary order among them.
     - i.e., pretend that one of them is more recent.
     - Attach a timestamp to each write.
     - Cassandra uses physical timestamps [8], [9]
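
Last write wins can be sketched as a merge that keeps the write with the larger timestamp. The `Write` tuple and the tie-break on replica id are assumptions made for this illustration; they are not Cassandra's actual rules.

```python
from typing import NamedTuple


class Write(NamedTuple):
    timestamp: float  # physical clock reading attached by the writer
    replica_id: str   # hypothetical tie-breaker for equal timestamps
    value: str


def lww_merge(a: Write, b: Write) -> Write:
    """Keep the write that is 'more recent'; the other one is silently discarded."""
    return max(a, b, key=lambda w: (w.timestamp, w.replica_id))


# Two concurrent writes: neither saw the other, but one is declared the winner.
w1 = Write(100.0, "replica-a", "blue")
w2 = Write(100.5, "replica-b", "green")
print(lww_merge(w1, w2).value)
```

The merge always converges, but the losing update disappears without any error, and clock skew between writers decides which one that is.
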
  18. DETECTING CONFLICTS: VECTOR CLOCKS
     - In Dynamo paper [4], each update is done against a particular version of a data entry.
     - Multiple versions of a data entry can exist together.
     - Vector clocks [10] are used to track causality.
     - The system can determine the authoritative version: syntactic reconciliation
     - The system cannot reconcile multiple versions: semantic reconciliation
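
The causality check behind vector clocks can be sketched as a comparison of two clocks. Here a clock is assumed to be a dict from node id to counter, with missing entries treated as 0; the `compare` function is illustrative and is not Dynamo's implementation.

```python
def compare(a: dict, b: dict) -> str:
    """Classify clock a relative to clock b: 'equal', 'before', 'after' or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # b descends from a, so b supersedes a
    if b_le_a:
        return "after"       # a descends from b, so a supersedes b
    return "concurrent"      # neither descends from the other: a conflict


v1 = {"x": 1}
v2 = {"x": 2}           # written on top of v1
v3 = {"x": 1, "y": 1}   # also written on top of v1, via another node
print(compare(v1, v2), compare(v2, v3))
```

When one version is "before" the other, the system can drop the older one on its own, which matches syntactic reconciliation. A "concurrent" result means both versions must be kept and reconciled by other means, which matches semantic reconciliation.
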
  19. Resolving conflicts and EVENTUAL CONVERGENCE
     - Write repair
     - Read repair
     - Anti-entropy
     - Merkle trees
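
A rough sketch of how Merkle trees support anti-entropy: two replicas compare hashes of whole ranges and descend only into ranges whose hashes differ. The sketch assumes both replicas hold the same non-zero number of buckets in the same order, and it recomputes hashes on every call for brevity; a real system would store the tree and exchange only the hashes. The names `range_hash` and `diff` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def range_hash(items, lo, hi):
    """Hash of the subtree covering items[lo:hi]."""
    if hi - lo == 1:
        return h(repr(items[lo]).encode())
    mid = (lo + hi) // 2
    return h(range_hash(items, lo, mid) + range_hash(items, mid, hi))


def diff(a, b, lo=0, hi=None):
    """Return indices of buckets that differ, skipping subtrees whose hashes match."""
    if hi is None:
        hi = len(a)
    if range_hash(a, lo, hi) == range_hash(b, lo, hi):
        return []  # identical subtree: nothing to repair here
    if hi - lo == 1:
        return [lo]
    mid = (lo + hi) // 2
    return diff(a, b, lo, mid) + diff(a, b, mid, hi)


replica_a = [("k1", "v1"), ("k2", "v2"), ("k3", "v3"), ("k4", "v4")]
replica_b = [("k1", "v1"), ("k2", "v2"), ("k3", "stale"), ("k4", "v4")]
print(diff(replica_a, replica_b))
```

Only the buckets reported by `diff` need to be transferred between the replicas, which keeps the repair cost proportional to the divergence rather than to the data set size.
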
  20. Recap
     - We apply replication to make our systems performant and fault tolerant.
     - Replication suffers from core problems of distributed systems.
     - We can build many replication protocols that vary on the 2 dimensions we discussed.
     - No silver bullet. It is a world of trade-offs.
  21. REFERENCES
     [1] Gray, Jim, et al. "The dangers of replication and a solution." ACM SIGMOD Record 25.2 (1996): 173-182.
     [2] Shapiro, Marc, et al. "Conflict-free replicated data types." Symposium on Self-Stabilizing Systems. Springer, Berlin, Heidelberg, 2011.
     [3] http://docs.basho.com/riak/kv/2.2.0/learn/concepts/crdts/
     [4] DeCandia, Giuseppe, et al. "Dynamo: Amazon's highly available key-value store." ACM SIGOPS Operating Systems Review 41.6 (2007): 205-220.
     [5] Lamport, Leslie. "Paxos made simple." ACM SIGACT News 32.4 (2001): 18-25.
     [6] Ongaro, Diego, and John K. Ousterhout. "In Search of an Understandable Consensus Algorithm." USENIX Annual Technical Conference. 2014.
     [7] Hunt, Patrick, et al. "ZooKeeper: Wait-free Coordination for Internet-scale Systems." USENIX Annual Technical Conference. Vol. 8. 2010.
     [8] http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks
     [9] https://aphyr.com/posts/299-the-trouble-with-timestamps
     [10] Raynal, Michel, and Mukesh Singhal. "Logical time: Capturing causality in distributed systems." Computer 29.2 (1996): 49-56.
     [11] http://kafka.apache.org/documentation.html#replication
     [12] http://docs.hazelcast.org/docs/latest/manual/html-single/index.html#managing-cluster-and-member-states
  22. THANKS! Any questions?
