
Replication in the Wild - Warsaw Cloud Native Meetup - May 2017

1. REPLICATION IN THE WILD
Ensar Basri Kahveci

2. Hello!
Ensar Basri Kahveci
Distributed Systems Engineer @ Hazelcast
website: basrikahveci.com
linkedin: basrikahveci
twitter & github: metanet

3.
- Leading open source Java IMDG
- Distributed Java collections, JCache, HD store, …
- Distributed computations and messaging
- Embedded or client-server deployment
- Integration modules & cloud friendly
- Highly available, scalable, elastic

4. REPLICATION
- Putting a data set into multiple nodes.
- Each replica has a full copy.
- A few reasons for replication:
  - Performance
  - Availability

5. REPLICATION + PARTITIONING
- Mostly used with partitioning.
- Two partitions: P1, P2
- Two replicas for each partition.

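The partitioning scheme on this slide can be sketched in a few lines. This is a minimal illustration, not Hazelcast's actual partition assignment; the node names, the hash-based key mapping, and the "consecutive nodes" replica placement are all assumptions for the example.

```python
# Hypothetical cluster layout: 2 partitions, 2 replicas each, as on the slide.
NODES = ["node1", "node2", "node3", "node4"]
PARTITIONS = 2
REPLICAS = 2

def partition_of(key: str) -> int:
    """Map a key to exactly one partition by hashing."""
    return hash(key) % PARTITIONS

def replicas_of(partition: int) -> list:
    """Place a partition on consecutive nodes; the first one acts as primary."""
    return [NODES[(partition + i) % len(NODES)] for i in range(REPLICAS)]

# Every key lands on one partition, and that partition lives on REPLICAS nodes.
for p in range(PARTITIONS):
    print(f"P{p + 1} -> {replicas_of(p)}")
```

Any placement rule works as long as it is deterministic, so every node can compute where a key lives without coordination.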
6. NOTHING FOR FREE!
- Very easy to do when the data is immutable.
- Two main difficulties:
  - Handling updates,
  - Handling failures.

7. THE DANGERS OF REPLICATION AND A SOLUTION
- Gray et al. [1] classify replication models by two parameters:
  - Where to make updates: primary copy or update anywhere
  - When to make updates: eagerly or lazily

8. WHERE: PRIMARY COPY
- There is a single replica managing the updates.
- No conflicts and no conflict-handling logic.
- Implies sticky availability.
- When the primary fails, a new primary is elected.

9. WHERE: UPDATE ANYWHERE
- Each replica can initiate a transaction to make an update.
- Complex concurrency control.
- Deadlocks or conflicts are possible.
- In practice, there is also multi-leader replication.

10. WHEN: EAGER REPLICATION
- Synchronously updates all replicas as part of one atomic transaction.
- Strong consistency.
- Level of availability can degrade on node failures.
- Consensus algorithms

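The eager model above can be sketched with in-memory replicas. This is a deliberately simplified assumption: real systems wrap the update in an atomic commit protocol such as 2PC or a consensus round; here the `Replica` class and the all-or-nothing rule are illustrative only.

```python
class Replica:
    """Toy replica: an in-memory map plus an up/down flag."""
    def __init__(self):
        self.data = {}
        self.up = True

def eager_write(replicas, key, value):
    """Apply the update to every replica, or reject it without applying."""
    if not all(r.up for r in replicas):
        return False  # availability degrades when any replica is down
    for r in replicas:
        r.data[key] = value  # every copy is updated before the write returns
    return True

replicas = [Replica() for _ in range(3)]
assert eager_write(replicas, "x", 1)      # all replicas agree immediately
replicas[2].up = False
assert not eager_write(replicas, "x", 2)  # write rejected; consistency preserved
```

The trade-off on the slide is visible directly: no replica ever diverges, but a single node failure can block writes in this naive all-replicas variant (consensus systems relax this to a majority).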
11. WHEN: LAZY REPLICATION
- Updates each replica with a separate transaction.
- Updates can execute quite fast.
- High availability.
- Data copies can diverge.

12. WHERE? / WHEN?
- EAGER + PRIMARY COPY: 2PC [24], Multi Paxos [5], etcd and Consul (Raft) [6], ZooKeeper (Zab) [7], Kafka
- EAGER + UPDATE ANYWHERE: 2PC [24], Paxos [5], Hazelcast Cluster State Change [12], MySQL 5.7 Group Replication [23]
- LAZY + PRIMARY COPY: Hazelcast, MongoDB, ElasticSearch, Redis, Kafka
- LAZY + UPDATE ANYWHERE: Dynamo [4], Cassandra, Riak, Hazelcast Active-Active WAN Replication [22]

13. PRIMARY COPY + EAGER REPLICATION
- When the primary fails, secondaries are guaranteed to be up to date.
- Majority approach in consensus algorithms.
- Expensive; mostly used for storing metadata.
- In Kafka, an in-sync replica set [11] is maintained.

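The majority approach mentioned above rests on simple arithmetic: with N replicas, a write commits once floor(N/2) + 1 of them acknowledge it, so any two majorities must share at least one replica. A minimal sketch:

```python
def majority(n: int) -> int:
    """Smallest group that is guaranteed to overlap any other majority."""
    return n // 2 + 1

def committed(acks: int, n: int) -> bool:
    """A write is durable once a majority has acknowledged it."""
    return acks >= majority(n)

# With 5 replicas, 3 acks commit the write.
assert majority(5) == 3
assert committed(3, 5) and not committed(2, 5)

# Two majorities always intersect: 3 + 3 > 5, so a later leader election
# (which also needs a majority) must include a replica that saw the write.
assert majority(5) + majority(5) > 5
```

This overlap is what makes the slide's guarantee hold: after a primary failure, at least one member of any new majority already has every committed update.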
14. UPDATE ANYWHERE + EAGER REPLICATION
- Each replica can initiate a new transaction.
- Concurrent transactions can compete with each other.
- Possibility of races and deadlocks.
- Hazelcast Cluster State Change [12]

15. PRIMARY COPY + LAZY REPLICATION
- Hazelcast, Redis, ElasticSearch, Kafka ...
- The primary copy can execute updates fast.
- Secondaries can fall behind the primary; this is called replication lag.
- It can lead to data loss during leader failover, but still no conflicts.
- Secondaries can be used for reads.

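Replication lag and the failover data-loss scenario can be sketched concretely. The `Node` class and its replication-log shape are assumptions for illustration; real systems stream log entries asynchronously rather than copying slices.

```python
class Node:
    """Toy node holding an ordered replication log of (key, value) entries."""
    def __init__(self):
        self.log = []

    def apply_from(self, primary, upto):
        """Lazily copy log entries from the primary, possibly lagging behind."""
        self.log = primary.log[:upto]

primary, secondary = Node(), Node()
primary.log += [("x", 1), ("x", 2), ("x", 3)]  # already acknowledged to clients
secondary.apply_from(primary, upto=2)          # replication lag: 1 entry behind

lag = len(primary.log) - len(secondary.log)
assert lag == 1

# Primary crashes and the secondary is promoted: the write ("x", 3) is lost,
# but there are no conflicts -- there is still a single well-defined history.
new_primary = secondary
assert new_primary.log == [("x", 1), ("x", 2)]
```

This is exactly the trade-off on the slide: fast acknowledged writes and conflict-free histories, at the cost of possibly losing the tail of the log on failover.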
16. HAZELCAST: PRIMARY COPY + LAZY REPLICATION
- PRIMARY COPY: strong consistency on a stable cluster, sticky availability
- LAZY REPLICATION: high throughput, replication log

17. UPDATE ANYWHERE + LAZY REPLICATION
- Dynamo-style [4] highly available databases.
- Tunable quorums.
- Concurrent updates are first-class citizens.
- Possibility of conflicts:
  - Avoiding, discarding, detecting & resolving conflicts
- Eventual convergence:
  - Write repair, read repair and anti-entropy

18. TUNABLE QUORUMS
- W + R > N
- W = 3, R = 1, N = 3
- W = 2, R = 1, N = 3
- If W or R cannot be met:
  - Sloppy quorums and hinted handoff

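The W + R > N rule can be checked by brute force over a small cluster: whenever the inequality holds, every possible write set of size W overlaps every possible read set of size R, so a read always touches at least one replica holding the latest write. A small sketch of that check:

```python
from itertools import combinations

def quorums_intersect(n: int, w: int, r: int) -> bool:
    """Verify overlap for every write-set/read-set pair of the given sizes."""
    nodes = range(n)
    return all(set(ws) & set(rs)
               for ws in combinations(nodes, w)
               for rs in combinations(nodes, r))

assert quorums_intersect(3, 3, 1)      # W=3, R=1: write everywhere, read anywhere
assert quorums_intersect(3, 2, 2)      # W=2, R=2: classic majority overlap
assert not quorums_intersect(3, 2, 1)  # W=2, R=1: W + R = N, a read can miss the write
```

The two slide examples land on opposite sides of the rule: W=3, R=1, N=3 satisfies W + R > N, while W=2, R=1, N=3 does not, which is why the latter can return stale reads.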
19. CONCURRENT UPDATES
- Avoiding conflicts: CRDTs [2]
  - Strong eventual consistency
  - Riak Data Types [3]
- Discarding conflicts: last write wins
  - Physical timestamps in Cassandra [8], [9]
- Detecting conflicts: vector clocks [10]
  - Dynamo-style databases [4]

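The "avoiding conflicts" row can be made concrete with the simplest CRDT, a grow-only counter (G-counter): each replica increments only its own slot, and merging takes the element-wise maximum, so concurrent updates commute and all replicas converge without any conflict handling. A minimal sketch (the dict-based representation is an illustrative choice, not Riak's):

```python
def merge(a: dict, b: dict) -> dict:
    """Element-wise max: commutative, associative, and idempotent."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

def value(counter: dict) -> int:
    """The counter's value is the sum over all replicas' slots."""
    return sum(counter.values())

# Two replicas increment concurrently while disconnected from each other.
r1 = {"r1": 2}  # replica r1 incremented twice
r2 = {"r2": 1}  # replica r2 incremented once

# Merging in either order yields the same state: strong eventual consistency.
assert merge(r1, r2) == merge(r2, r1) == {"r1": 2, "r2": 1}
assert value(merge(r1, r2)) == 3
```

Because merge is commutative, associative, and idempotent, replicas can exchange states in any order, any number of times, and still agree, which is exactly the strong eventual consistency property the slide names.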
20. EVENTUAL CONVERGENCE
- Write repair
- Read repair
- Active anti-entropy
  - Merkle trees

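Merkle trees let anti-entropy find divergent key ranges cheaply: replicas compare root hashes first and only drill down where hashes differ, instead of shipping whole data sets. A toy sketch, assuming a tiny power-of-two number of leaf ranges (the key-range layout here is illustrative):

```python
import hashlib

def h(data: str) -> str:
    return hashlib.sha256(data.encode()).hexdigest()

def merkle(leaves):
    """Build hash levels bottom-up; returns [leaf_hashes, ..., [root]].
    Assumes a power-of-two number of leaves for simplicity."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def diff_leaves(a, b):
    """Indices of leaf ranges whose hashes differ between two replicas."""
    return [i for i, (x, y) in enumerate(zip(merkle(a)[0], merkle(b)[0])) if x != y]

replica_a = ["k0=1", "k1=2", "k2=3", "k3=4"]
replica_b = ["k0=1", "k1=9", "k2=3", "k3=4"]  # one divergent key range

# Root hashes differ, so the replicas know they are out of sync...
assert merkle(replica_a)[-1] != merkle(replica_b)[-1]
# ...and only leaf range 1 actually needs to be transferred.
assert diff_leaves(replica_a, replica_b) == [1]
```

If the roots match, the replicas are provably in sync and no data moves at all, which is what makes periodic active anti-entropy affordable.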
21. WHERE? / WHEN?
- EAGER + PRIMARY COPY: strong consistency, simple concurrency, slow, inflexible
- EAGER + UPDATE ANYWHERE: strong consistency, complex concurrency, expensive, deadlocks
- LAZY + PRIMARY COPY: fast, eventual consistency, simple concurrency, inconsistency
- LAZY + UPDATE ANYWHERE: flexible, high availability, eventual consistency, conflicts

22. RECAP
- We apply replication to make distributed systems performant, available and fault tolerant.
- Various replication protocols are built based on when and where to make updates.
- No silver bullet. It is a world of trade-offs.

23. WE ARE HIRING!
- Senior Java Developer: http://stackoverflow.com/jobs/129435/senior-java-developer-hazelcast
- Solution Architect: http://stackoverflow.com/jobs/131938/solutions-architect-hazelcast

24. REFERENCES
[1] Gray, Jim, et al. "The Dangers of Replication and a Solution." ACM SIGMOD Record 25.2 (1996): 173-182.
[2] Shapiro, Marc, et al. "Conflict-free Replicated Data Types." Symposium on Self-Stabilizing Systems. Springer, 2011.
[3] http://docs.basho.com/riak/kv/2.2.0/learn/concepts/crdts/
[4] DeCandia, Giuseppe, et al. "Dynamo: Amazon's Highly Available Key-value Store." ACM SIGOPS Operating Systems Review 41.6 (2007): 205-220.
[5] Lamport, Leslie. "Paxos Made Simple." ACM SIGACT News 32.4 (2001): 18-25.
[6] Ongaro, Diego, and John K. Ousterhout. "In Search of an Understandable Consensus Algorithm." USENIX Annual Technical Conference. 2014.
[7] Hunt, Patrick, et al. "ZooKeeper: Wait-free Coordination for Internet-scale Systems." USENIX Annual Technical Conference. Vol. 8. 2010.
[8] http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks
[9] https://aphyr.com/posts/299-the-trouble-with-timestamps
[10] Raynal, Michel, and Mukesh Singhal. "Logical Time: Capturing Causality in Distributed Systems." Computer 29.2 (1996): 49-56.
[11] http://kafka.apache.org/documentation.html#replication
[12] http://docs.hazelcast.org/docs/latest/manual/html-single/index.html#managing-cluster-and-member-states
[13] Brewer, Eric. "Towards Robust Distributed Systems." Proc. 19th Ann. ACM Symp. on Principles of Distributed Computing (PODC '00). ACM, 2000. 7-10.
[14] https://codahale.com/you-cant-sacrifice-partition-tolerance/
[15] http://blog.nahurst.com/visual-guide-to-nosql-systems
[16] http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
[17] https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/
[18] https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
[19] Gilbert, Seth, and Nancy Lynch. "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-tolerant Web Services." ACM SIGACT News 33.2 (2002): 51-59.
[20] https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
[21] https://henryr.github.io/cap-faq/
[22] http://docs.hazelcast.org/docs/3.7/manual/html-single/index.html#wan-replication
[23] https://dev.mysql.com/doc/refman/5.7/en/group-replication.html
[24] Gray, Jim. "Notes on Data Base Operating Systems." Operating Systems: An Advanced Course. Springer, 1978.

25. THANKS! Any questions?
