Replication and Consistency in Cassandra... What Does it All Mean? (Christopher Bradford, DataStax) | C* Summit 2016
The document discusses replication and consistency in Cassandra, starting from the CAP theorem and the trade-offs it describes between consistency, availability, and partition tolerance. It emphasizes configuring replication strategies at the keyspace level, including the simple and network topology strategies, addresses consistency levels for reads and writes, and explains how the system can maintain consistency during failures.
#4 I have contributed to multiple open source projects around distributed computing
#5 Today we are talking about Consistency & Replication. We want to dive into how these key components are implemented in Cassandra and what they mean to Developers and Operations.
#6 We’ll start with the CAP theorem. The CAP theorem says that it is impossible for a distributed system to simultaneously provide guarantees in all three of the following areas.
Autumn of 1998
#7 Consistency - Every read receives the most recent write or an error
#8 Availability - Every request receives a response
Note the data returned is not guaranteed to be the most recent
#9 Partition Tolerance – the system continues to operate despite arbitrary partitioning.
Wikipedia says partitions occur “due to network failures”; given my week, it is due to disk controller errors.
#10 Eric Brewer, creator of the CAP theorem, argues that the pick 2 statement was a bit misleading. Really system designers need to choose where to be flexible in the case of a partition.
Some systems prefer strict consistency, while Cassandra leans toward availability.
#11 Consistency may be flexible to meet the needs of the application!
#12 How are availability and partition tolerance achieved? Through replication
#13 Let’s look at how replication affects availability and partition tolerance. Assume in this environment we have a replication factor of 3. This means every piece of data written to our cluster is stored 3 times.
The client issues a write to a node in the cluster. This node is known as the coordinator for the duration of this request.
The coordinator hashes the PARTITION KEY portion of the PRIMARY KEY and uses a REPLICATION STRATEGY to determine where the replicas live and forwards the request to those nodes.
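To make the hashing step concrete, here is a minimal, hypothetical table (the name and columns are invented for illustration). Only the partition key, city, is run through the partitioner to produce a token that decides which replicas own the row; user_id is a clustering column that merely orders rows within that partition.

    CREATE TABLE users_by_city (
        -- partition key: hashed by the partitioner into a token,
        -- which determines the replica nodes for this row
        city     text,
        -- clustering column: orders rows inside a partition,
        -- it plays no part in replica placement
        user_id  uuid,
        name     text,
        PRIMARY KEY ((city), user_id)
    );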
#14 When one of the replicas is down the coordinator still writes to the other two replicas. This satisfies the availability portion of the CAP theorem.
A hint is then stored on the coordinator. Should the replica that was down come back up within a certain time period (default 3 hours), the hint will be replayed against that host. This helps get the node back in sync with the rest of the cluster. This satisfies the partition tolerance portion of the CAP theorem.
#15 Reads behave in a similar manner (although without the need for a hint).
The coordinator is aware that one of the replicas is down and responds with data from the two replicas that are still available.
#16 Replication is defined at the keyspace level. The number of replicas and placement strategy are supplied during keyspace creation.
The replication strategy is specified with the class parameter. Other parameters may also be required, but they are tied to the strategy being used.
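As a quick sketch (the keyspace name is invented for illustration), the class parameter names the strategy and the remaining map entries are strategy-specific. SimpleStrategy, covered next, takes a single replication_factor, while NetworkTopologyStrategy takes per-datacenter counts.

    -- 'replication_factor' is the strategy-specific parameter for SimpleStrategy:
    -- three copies of every row, placed around the ring
    CREATE KEYSPACE demo_ks
      WITH replication = {
        'class': 'SimpleStrategy',
        'replication_factor': 3
      };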
#17 Cassandra ships with a few replication strategies. Let’s look at those now.
#18 The SimpleStrategy is just that: simple. The replicas for a particular piece of data are placed on consecutive nodes moving around the ring.
#19 If the token hashes differently then it may look like this.
WARNING: Be very careful with SimpleStrategy in a multi-DC environment. Depending on the consistency level and load balancing policy, writes may throw exceptions because the replica hosts are not necessarily routable.
#20 Network Topology Strategy layers additional logic into the replica selection process. It utilizes topology information to place replicas in a manner which ensures an even higher level of reliability.
In our first example let’s assume every node is in the same DC and physical rack. The replica selection process looks similar to the SimpleStrategy; there is nothing too intense here.
Note you must use an appropriate snitch. The snitch defines how topology information is shared in the cluster. The GossipingPropertyFileSnitch uses the cassandra-rackdc.properties file to share topology info with peers, whereas the SimpleSnitch has hardcoded values.
#21 In this example we have two racks, the aptly named rack 1 and rack 2. Now the network topology strategy will try to protect against an entire rack failure by placing replicas on different racks. In fact it won’t even place a second replica on a node in the same rack until all other racks have been exhausted.
#22 In this example we have added a third rack. Now one replica will be placed in each rack. Because rack 1 contains only a single node, this leads to an imbalance in the cluster: that node will hold a replica of 100% of the data in the cluster.
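For reference, a NetworkTopologyStrategy keyspace might be defined like this (demo_ks, DC1 and DC2 are placeholder names; the datacenter names must match whatever your snitch reports). Note that racks never appear in the definition: the strategy spreads each datacenter’s replicas across racks on its own, using the topology information from the snitch.

    -- replica counts are supplied per datacenter
    CREATE KEYSPACE demo_ks
      WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'DC1': 3,
        'DC2': 3
      };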
#23 Remember how we discussed that the CAP theorem has evolved over time? That evolution is at the heart of Cassandra: consistency can be tuned on a per-query basis to enhance performance or to ensure consistency.
#24 We call this tunable consistency. Consistency can even be tuned depending on cluster health to ensure application availability. Let’s look at some of these levels.
#25 The consistency level is used to determine the number of replicas that must respond for any given query. Some levels are unique to the type of query being performed (read vs write). Others span datacenters and are not expected to be extremely performant.
#26 Discuss the advantages of each: speed, correctness, failure tolerance.
#27 Discuss the advantages of each: speed, correctness, failure tolerance. Note that even though we may return after one replica has acknowledged the write, we still attempt the write on the other replicas. Bring up immediate consistency.
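To make the immediate consistency point concrete, here is a small cqlsh sketch reusing the hypothetical demo_ks keyspace and users_by_city table from earlier (assume the table lives in that keyspace). With a replication factor of 3, QUORUM requires 2 replicas, and because a QUORUM write and a QUORUM read overlap (2 + 2 > 3), the read is guaranteed to see the acknowledged write.

    -- applies to every request issued in this cqlsh session
    CONSISTENCY QUORUM;

    INSERT INTO demo_ks.users_by_city (city, user_id, name)
    VALUES ('Atlanta', uuid(), 'Chris');

    -- overlapping quorums: this read will see the write above
    SELECT name FROM demo_ks.users_by_city WHERE city = 'Atlanta';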
#28 An exception is thrown! At this point you can alert the user to a potential issue or try the request with a different consistency level. This process may be automated with a RetryPolicy.
This is a big win for your application as you can stay available for users while alerting them that there may be a slight delay on changes.
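You can walk through the same fallback by hand in cqlsh (a driver RetryPolicy simply automates this decision); this sketch again uses the hypothetical table from earlier and assumes a replication factor of 3.

    -- QUORUM needs 2 live replicas; if 2 of the 3 are down the coordinator
    -- returns an unavailable error without attempting the read
    CONSISTENCY QUORUM;
    SELECT name FROM demo_ks.users_by_city WHERE city = 'Atlanta';

    -- fall back to a weaker level and retry: the response may be slightly
    -- stale, but the application stays available
    CONSISTENCY ONE;
    SELECT name FROM demo_ks.users_by_city WHERE city = 'Atlanta';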
#29 Cassandra has a number of anti-entropy processes to ensure nodes respond with the latest information.
We were first introduced to this earlier with hinted handoffs.
Should hints expire, a manual repair of the node will synchronize it with its neighbors.
There is also a read repair system in place. This takes the data returned by multiple replicas, compares them, and returns the latest value. A write is also issued which fixes any replicas that are out of date.
This was taken a step further by Netflix with the Cassandra tickler. This process performs a read at the ALL consistency level across every primary key value, forcing replicas to be brought in sync.
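The underlying trick is easy to try in cqlsh. This hedged sketch forces a single partition of the hypothetical table into sync; the tickler simply iterates the same kind of read over every primary key.

    -- a read at ALL touches every replica; any stale replica is fixed
    -- by the read repair write described above
    CONSISTENCY ALL;
    SELECT * FROM demo_ks.users_by_city WHERE city = 'Atlanta';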
#30 Replication
Controls where copies live
Set at the keyspace level
Imperative during both availability and partition situations
Consistency
Dictates trade-offs between performance and correctness
Achieves synchronization of replicas
Tuned per query via consistency levels
Both are core building blocks of Cassandra. Understand their usage and don’t be afraid to jump into the code. The classes that implement them are fairly simple and easy to reason about.