Data replication is a crucial component of distributed services deployed in a multi-data-center environment. The replication scheme needs to be carefully evaluated before it is implemented: a wrong design or misuse in most cases ends in a major service outage.
To understand replication, you need to understand the algorithms behind it. For this reason the session will start by explaining the algorithms most commonly used to deal with the trade-offs of the CAP theorem (Consistency, Availability and Partition Tolerance), such as consistent hashing, vector clocks, the gossip protocol, Paxos and Raft.
The second part of the talk will analyze how products on the market implement replication (replication in action), with their advantages and disadvantages. It will cover distributed filesystems (Ceph, Tahoe-LAFS, XtreemFS, ...), distributed databases (DB replication primitives and external tools like Tungsten), NoSQL (Riak, Cassandra, MongoDB, CouchDB) and frameworks for in-house solutions (beardb, open replication, ...). The talk will also show the evaluation methods and testing process for identifying the best solution for your environment.
9. Beolink.org!
CAP theorem
According to Brewer’s CAP theorem, it is impossible for any distributed computer
system to simultaneously provide all three of Consistency, Availability and
Partition Tolerance.

You can’t have the three at the same time and get an acceptable latency.
CAP
ACID (RDBMS)
Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
Consistent: A transaction cannot leave the database in an inconsistent state.
Isolated: Transactions cannot interfere with each other.
Durable: Completed transactions persist, even when servers restart, etc.
- Strong consistency for transactions is the highest priority
- Pessimistic
- Complex mechanisms

BASE (NoSQL)
Basic Availability
Soft-state
Eventual consistency
- Availability and scaling are the highest priorities
- Weak consistency
- Optimistic
- Best effort
- Simple and fast
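The "Atomic" property can be illustrated with a small sketch using Python's built-in sqlite3 module. The `accounts` table, the account names and the `transfer` helper are hypothetical and exist only for this illustration; the point is that a failed transaction leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return False if the transaction was rolled back."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Hypothetical business rule: no overdrafts.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical in-memory database with two accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 500)` applies both updates inside the transaction, detects the negative balance, and rolls everything back, so `balances(conn)` is unchanged afterwards.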
Coordination Protocol
Consensus protocol
Paxos, Raft, etc.
Based on the state machine approach (a technique for converting an algorithm into a fault-tolerant, distributed implementation).

Epidemic (Gossip)
Epidemic: anybody can infect anyone else with equal probability.
Anti-entropy protocols assume that synchronization is performed on a fixed schedule: every node regularly chooses another node at random or by some rule and exchanges database contents, resolving differences.
Propagation time: O(log n) rounds
http://www.cis.cornell.edu/IAI/events/Gossip_Tutorial.pdf
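The O(log n) behaviour can be observed with a small simulation. This is a minimal sketch of push-style rumor spreading, a simplification of the anti-entropy exchange described above: it assumes synchronous rounds, no failures, and no actual database contents. The function names are hypothetical.

```python
import random

def pick_peer(rng, node, n):
    """Choose a peer other than `node`, uniformly at random among the n - 1 others."""
    peer = rng.randrange(n - 1)
    return peer if peer < node else peer + 1

def gossip_rounds(n, seed=0, max_rounds=1000):
    """Simulate push gossip among n nodes; return the rounds needed to inform all."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the update
    rounds = 0
    while len(informed) < n and rounds < max_rounds:
        # In each round, every informed node pushes the update to one random peer.
        targets = {pick_peer(rng, node, n) for node in sorted(informed)}
        informed |= targets
        rounds += 1
    return rounds
```

Because the informed set can at most double per round, at least log2(n) rounds are needed. Running `gossip_rounds` for growing values of n shows the round count growing slowly compared with n.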
Answer …no Answer
Block replication, file -> Distributed file system
Information -> RDBMS
Document, blog, session -> NoSQL
Content with a TTL over a 1m -> Caching system
Distributed Filesystem
DFS is a service that provides a single point of reference and
a logical tree structure for file system resources that may be
physically located anywhere on the network.

One significant responsibility of a file system is to ensure
that, regardless of the actions by programs accessing the
data, the structure remains consistent…
RDBMS
Properties of an RDBMS
• Quite simple from the application's point of view
• Data consistency

Depending on the solution
• Low partition tolerance
• Low scalability
• Limited high availability
Build a solution
• Split into pieces
• Track versions
• Transfer when needed
• Transfer the difference
• Use notifications when possible
• Move data close to the computation
• Move the master close to the write operations
• Split counters to avoid deadlock
• In HTTP, don’t forget the ETag and Last-Modified headers

openkad
open-chord
openReplica
Raft
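The "split into pieces" and "transfer the difference" guidelines can be sketched as follows. This is a simplified illustration using fixed-size blocks and SHA-256 hashes, not the rolling-checksum scheme of tools like rsync. All function names and the tiny block size are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny for illustration; real systems use far larger blocks

def split_blocks(data, size=BLOCK_SIZE):
    """Split a bytes object into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def block_hashes(data, size=BLOCK_SIZE):
    """Hash every block; this is the only thing the receiver needs to send."""
    return [hashlib.sha256(b).hexdigest() for b in split_blocks(data, size)]

def changed_blocks(local, remote_hashes, size=BLOCK_SIZE):
    """Return {index: block} for local blocks the remote side lacks or has stale."""
    delta = {}
    for i, block in enumerate(split_blocks(local, size)):
        digest = hashlib.sha256(block).hexdigest()
        if i >= len(remote_hashes) or remote_hashes[i] != digest:
            delta[i] = block
    return delta

def apply_delta(remote, delta, new_len, size=BLOCK_SIZE):
    """Rebuild the new content on the remote side from its old copy plus the delta."""
    blocks = split_blocks(remote, size)
    for i, block in sorted(delta.items()):
        if i < len(blocks):
            blocks[i] = block
        else:
            blocks.append(block)
    # Truncate in case the new content is shorter than the old one.
    return b"".join(blocks)[:new_len]
```

For example, if only the second block of a 16-byte object changes, `changed_blocks` returns a single 4-byte block, and `apply_delta` reconstructs the full new object from the old copy on the other side.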
Five pylons
Objects
• Separation between data and metadata
• Each element is marked with a revision
• Each element is marked with a hash

Cache
• Client side
• Callback/Notify
• Persistent

Transmission
• Parallel operation
• HTTP-like protocol
• Compression
• Transfer by difference

Distribution
• Resource discovery by DNS
• Data spread over a multi-node cluster
• Decentralized
• Independent clusters
• Data replication

Security
• Secure connection
• Client-side encryption
• Extended ACL
• Delegation/Federation
• Admin delegation
Build a solution
- Consistent hash
- ZMQ transport protocol
- Gossip protocol for failure detection
- Tunable trade-offs

Pisa: simple block data replication across a wide range of nodes
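The consistent hash item above can be illustrated with a generic hash ring. This is a minimal sketch and makes no claim about how Pisa itself is implemented. The `HashRing` class, the use of virtual nodes, and the SHA-256-based placement are all assumptions chosen for the example.

```python
import bisect
import hashlib

class HashRing:
    """A consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _point(key):
        # Map a string to a 64-bit position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node):
        # Each physical node owns several points to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._point(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def replicas(self, key, n=1):
        """Return up to n distinct nodes, walking clockwise from the key's point."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (self._point(key), ""))
        found = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found
```

With this layout, adding a node only moves the keys that now fall on the new node's points, and the `n` argument of `replicas` is one place where a tunable trade-off between redundancy and cost could be exposed.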