Automating Cassandra
Repairs
Radovan Zvoncek
zvo@spotify.com
github.com/spotify/cassandra-reaper
#CassandraSummit
About zvo
Likes pancakes
Does this for the 3rd time
Works at Spotify
Working at Spotify
Squads are autonomous
Responsible for their full stack
Including Cassandra
Cassandra
Node’s data
Replication
Running Cassandra
Requires many things
One of them is keeping data consistent
Otherwise it can get lost or reappear
Eventually
Eventual consistency
Read Repairs
Hinted Handoff
Anti-entropy Repair
Anti-entropy Repair
Coordinated process
Four steps:
1: Hash
2: Compare
3: Stream
4: Merge
Can go wild...
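The hash/compare/stream steps can be sketched in miniature. This is a toy illustration only, not Cassandra's actual Merkle-tree implementation: each replica hashes its partitions into per-range digests, and any range whose digests disagree is a range that must be streamed. The bucket count and hash choices here are assumptions made up for the example.

```python
import hashlib

def bucket_of(key, buckets):
    # Deterministically map a partition key to a token "bucket" (toy partitioner).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % buckets

def leaf_hashes(rows, buckets=4):
    # Step 1 (Hash): fold every partition's data into its bucket's digest.
    leaves = [hashlib.sha256() for _ in range(buckets)]
    for key in sorted(rows):
        leaves[bucket_of(key, buckets)].update(f"{key}={rows[key]}".encode())
    return [leaf.hexdigest() for leaf in leaves]

def ranges_to_stream(replica_a, replica_b, buckets=4):
    # Step 2 (Compare): mismatching digests mark the ranges to stream (step 3).
    a = leaf_hashes(replica_a, buckets)
    b = leaf_hashes(replica_b, buckets)
    return [i for i in range(buckets) if a[i] != b[i]]
```

The point of the tree of hashes is that two replicas only exchange digests up front; actual data (step 3, Stream) moves only for the ranges that disagree, after which it is merged in (step 4).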
Repair gone wild
Eats a lot of disk IO
● because of hashing all the data
Saturates the network
● because of streaming a lot of data around
Fills up the disk
● because of receiving all replicas, possibly from all other data centers
Causes a ton of compactions
● because of having to merge the received data
… one better be careful
Careful repair
All three intervals
● nodetool repair
Partitioner range
● nodetool repair -pr
This interval only (start & end tokens)
● nodetool repair -st <start_token> -et <end_token>
A part of an interval only
● requires splitting the ring into smaller intervals
Smaller intervals mean less data
Less data means fewer repairs gone wild
Smaller intervals also mean more intervals
More intervals mean more actual repairs
Repairs need to be babysat :(
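Splitting one node's token interval into subranges is simple arithmetic. A minimal sketch, assuming a numeric token range and a hypothetical keyspace name; real tools (including Reaper) compute the bounds from the cluster's actual partitioner and token assignments:

```python
def split_range(start_token, end_token, splits):
    # Cut one token interval into `splits` contiguous subranges.
    width = (end_token - start_token) // splits
    bounds = [start_token + i * width for i in range(splits)] + [end_token]
    return list(zip(bounds[:-1], bounds[1:]))

def repair_commands(start_token, end_token, splits, keyspace="my_keyspace"):
    # One subrange repair command per interval; `my_keyspace` is a placeholder.
    return [
        f"nodetool repair -st {st} -et {et} {keyspace}"
        for st, et in split_range(start_token, end_token, splits)
    ]
```

Each command repairs only a slice of the data, which is exactly what keeps a single repair from going wild; the price is that someone (or something) now has to run and watch many of them.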
The Spotify way
Feature teams are meant to do features
Not waste time operating their C* clusters
Cron-ing nodetool repair is no good
● mostly due to no feedback loop
This all led to the creation of the Reaper
The Reaper
REST(ish) service
Does a lot of JMX
Orchestrates repairs for you
The reaping
You:
curl http://reaper/cluster --data '{"seedHost": "my.cassandra.host.net"}'
The Reaper:
● Figures out cluster info (e.g. name, partitioner)
The reaping
You:
curl http://reaper/repair_run --data '{"clusterName": "myCluster"}'
The Reaper:
● Prepares repair intervals
The reaping
You:
curl -X PUT http://reaper/repair_run/42 -d state=RUNNING
The Reaper:
● Starts triggering repairs of repair intervals
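The three curl calls above form one small workflow: register the cluster, create a repair run, then flip it to RUNNING. A sketch that assembles the same calls as data (base URL, run id 42, and the payload shapes are taken from the slides; anything beyond that is an assumption, and actually sending them is left to your HTTP client of choice):

```python
import json

REAPER = "http://reaper"  # base URL from the slides; adjust for your deployment

def reaping_calls(seed_host, cluster_name, run_id):
    # The reaping workflow as (method, url, body) tuples, in order.
    return [
        ("POST", f"{REAPER}/cluster", json.dumps({"seedHost": seed_host})),
        ("POST", f"{REAPER}/repair_run", json.dumps({"clusterName": cluster_name})),
        ("PUT", f"{REAPER}/repair_run/{run_id}", "state=RUNNING"),
    ]
```

After the PUT, the Reaper takes over: it triggers repairs of the prepared intervals one by one, so no cron jobs and no babysitting.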
Reaper’s features
Carefulness - doesn’t kill a node
● checks for node load
● backs off after repairing an interval
Resilience - retries when things break
● because things break all the time
Parallelism - no idle nodes
● multiple small intervals in parallel
Scheduling - set up things only once
● regular full-ring repairs
Persistence - state saved somewhere
● a bit of extra resilience
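The carefulness and resilience points boil down to a retry loop with pauses between attempts. A toy sketch of that idea, not Reaper's actual code (function names and the linear back-off are made up for illustration):

```python
import time

def run_segment(repair_fn, max_attempts=3, backoff_s=1.0):
    """Run one repair segment, retrying on failure with a pause between
    attempts. Returns the attempt number that succeeded; re-raises the
    last error once max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            repair_fn()
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_s * attempt)  # back off before retrying
```

Because things break all the time, a failed segment is retried rather than failing the whole run; because state is persisted, a restarted Reaper can pick up where it left off.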
What we reaped
First repair done 2015-01-28
1,700 repairs since then, recently 90 per week
176,000 (16%) segments failed at least once
60 repair failures
Reaper’s Future
CASSANDRA-10070
Whatever is needed until then
Greatest benefit
Cassandra Reaper automates a very tedious
maintenance operation of Cassandra clusters
in a rather smart, efficient and careful manner
while requiring minimal Cassandra expertise
Thank you!
github.com/spotify/cassandra-reaper
#CassandraSummit
