Data replication is a crucial component of distributed services deployed in a multi-data-center environment. The replication scheme needs to be carefully evaluated before it is implemented: a wrong design or misuse of replication in most cases ends in a major service outage.
To understand replication, it is necessary to understand the algorithms behind it. For this reason the session will start by explaining the algorithms most commonly used to deal with the trade-offs described by the CAP theorem (Consistency, Availability and Partition tolerance), such as consistent hashing, vector clocks, the gossip protocol, Paxos and Raft.
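As a taste of the first of these algorithms, here is a minimal sketch of a consistent hash ring. It is an illustration only, not taken from the talk or from any of the products mentioned: the `HashRing` class, the node names and the choice of MD5 with 64 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used here only to spread keys evenly, not for security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted hash positions on the ring
        self._owner = {}   # hash position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Place `vnodes` points for this node on the ring.
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # extremely unlikely collision: keep the first owner
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        # Drop every point that belongs to this node.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        # A key belongs to the first point clockwise from its own hash.
        if not self._points:
            raise LookupError("the ring has no nodes")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    ring = HashRing(["dc1-node", "dc2-node", "dc3-node"])
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add("dc4-node")
    moved = sum(1 for k in keys if ring.lookup(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after adding a node")
```

The point of the design is that adding or removing a server only remaps the keys that land on that server's points; every other key keeps its owner, which is what keeps rebalancing traffic between data centers bounded.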
The second part of the talk will analyze how products on the market implement replication (replication in action), with their advantages and disadvantages. The talk will cover distributed filesystems (Ceph, Tahoe-LAFS, XtreemFS, ...), distributed databases (native DB replication primitives and external tools like Tungsten), NoSQL stores (Riak, Cassandra, MongoDB, CouchDB) and frameworks for in-house solutions (beardb, open replication, ...). The talk will also show evaluation methods and the testing process for identifying the best solution for your environment.