1. In Search of an Understandable Consensus Algorithm
- Raft, 2014
Cloud Computing Paper Presentation
Yu-Cheng Lin
2017/05/15
2. What is the Problem?
• General problem
• How can a group of servers appear to be a single, highly reliable system even if some servers fail?
• Solution
• Consensus algorithm
• Existing, best-known solution => Paxos
• However:
• Hard to understand + hard to implement
• Proposed solution:
• The Raft algorithm presented in this paper.
3. How important is Raft?
• Software built on consensus algorithms: Chubby (Paxos), ZooKeeper (Zab)
• Teachings:
• All the links:
• https://raft.github.io/
6. Leader Election Problem
• If many followers become candidates at the same time, votes could
be split so that no candidate obtains a majority.
• Solution
• Randomized election timeouts to ensure that split votes are rare and that
they are resolved quickly.
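The effect of randomized timeouts can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the function names and the `message_delay_ms` parameter are hypothetical, and the 150-300 ms interval is the example range mentioned in the Raft paper.

```python
import random

# Example interval from the Raft paper; real deployments tune this.
ELECTION_TIMEOUT_MIN_MS = 150
ELECTION_TIMEOUT_MAX_MS = 300


def random_election_timeout(rng=random):
    """Pick a fresh election timeout uniformly from the fixed interval."""
    return rng.uniform(ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS)


def first_candidates(timeouts, message_delay_ms):
    """Return the servers that become candidates in the same round.

    A server becomes a candidate if its timeout fires before the vote
    request of the earliest candidate could reach it (hypothetical model:
    one fixed network delay for every message).
    """
    earliest = min(timeouts.values())
    return sorted(s for s, t in timeouts.items() if t < earliest + message_delay_ms)


# Timeouts that are spread out: only one candidate, so no split vote.
print(first_candidates({"s1": 150, "s2": 290, "s3": 200}, message_delay_ms=10))
# Timeouts that are close together: two candidates may split the vote.
print(first_candidates({"s1": 150, "s2": 290, "s3": 155}, message_delay_ms=10))
```

Because each server draws a new random timeout for every election, a split vote in one round is unlikely to repeat in the next.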
7. Log Replication
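• How does the leader decide that an entry is committed when no server crashes?
• Solution: Majority
• An entry is committed once the leader has replicated it on a majority of the servers.
• Example:
• 5 servers
• 3 servers form a majority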
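The majority rule can be written as a short Python sketch. The names `majority` and `highest_majority_index` are hypothetical, and the sketch deliberately ignores Raft's extra rule that a leader only commits entries from its own term by counting replicas.

```python
def majority(cluster_size):
    """Smallest number of servers that is more than half of the cluster."""
    return cluster_size // 2 + 1


def highest_majority_index(match_index):
    """Highest log index stored on a majority of servers.

    match_index holds, for every server including the leader, the highest
    log index known to be stored on that server.
    """
    ranked = sorted(match_index, reverse=True)
    return ranked[majority(len(match_index)) - 1]


print(majority(5))                               # 3
print(highest_majority_index([5, 5, 4, 2, 1]))   # 4: index 4 is on 3 of 5 servers
```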
8. Log Replication Problem
• A leader crash can leave the logs inconsistent!
• These inconsistencies can compound over a series of leader and follower
crashes.
• A follower may be missing entries that are present on the leader
• A follower may have extra entries that are not present on the leader
• How is log consistency restored after a leader crash?
• The leader handles inconsistencies by forcing the followers’ logs to duplicate its own.
• The leader must:
1. Find the latest log entry where the two logs agree.
2. Delete any entries in the follower’s log after that point.
3. Send the follower all of the leader’s entries after that point.
4. Keep track of nextIndex for every follower.
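The follower's side of steps 1-3 can be sketched as a simplified AppendEntries handler. This is an assumption-laden illustration: the log is modeled as a Python list of `(term, command)` tuples with 1-based indices, the function name and parameters are hypothetical, and the sketch truncates unconditionally, whereas a full implementation deletes entries only when they actually conflict.

```python
def append_entries(log, prev_index, prev_term, entries):
    """Simplified follower-side consistency check.

    log        -- follower log, list of (term, command); index 1 is log[0]
    prev_index -- index of the leader's entry just before the new ones
    prev_term  -- term of that entry
    entries    -- leader entries that follow prev_index
    Returns True if the follower accepted the entries.
    """
    # Reject if the follower has no entry at prev_index ...
    if prev_index > len(log):
        return False
    # ... or if the entry at prev_index has a different term.
    if prev_index > 0 and log[prev_index - 1][0] != prev_term:
        return False
    # The logs agree up to prev_index: drop everything after that point
    # and copy the leader's entries.
    del log[prev_index:]
    log.extend(entries)
    return True


leader = [(1, "a"), (1, "b"), (3, "c"), (3, "d")]
follower = [(1, "a"), (1, "b"), (2, "x")]      # extra entry from an old term

print(append_entries(follower, 3, 3, leader[3:]))  # False: terms differ at index 3
print(append_entries(follower, 2, 1, leader[2:]))  # True: logs agree at index 2
print(follower == leader)                          # True
```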
9. Log Replication Problem
• Problem: how does a newly elected leader find nextIndex for each follower?
• Solution:
1. The new leader initializes all nextIndex values to the index just after the last one in its log.
2. If a follower’s log is inconsistent with the leader’s, the leader decrements
nextIndex and retries the RPC.
3. Eventually nextIndex will reach a point where the leader’s and follower’s logs match.
4. Apply steps 2-4 from the previous slide.
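The nextIndex back-off can be simulated in a self-contained Python sketch. The function `sync_follower` is hypothetical: it plays both leader and follower in one loop, models logs as lists of `(term, command)` tuples with 1-based indices, and counts attempts instead of sending real RPCs.

```python
def sync_follower(leader_log, follower_log):
    """Bring follower_log in line with leader_log; return the attempt count."""
    # Step 1: start just after the leader's last entry.
    next_index = len(leader_log) + 1
    attempts = 0
    while True:
        attempts += 1
        prev_index = next_index - 1
        # Consistency check: does the follower hold the same entry
        # (same index, same term) just before next_index?
        matches = prev_index == 0 or (
            prev_index <= len(follower_log)
            and follower_log[prev_index - 1][0] == leader_log[prev_index - 1][0]
        )
        if matches:
            # Step 3 reached: delete the follower's later entries and
            # send everything the leader has from next_index onward.
            del follower_log[prev_index:]
            follower_log.extend(leader_log[prev_index:])
            return attempts
        # Step 2: mismatch, so decrement nextIndex and retry.
        next_index -= 1


leader = [(1, "a"), (1, "b"), (3, "c"), (3, "d")]
follower = [(1, "a"), (1, "b"), (2, "x")]
print(sync_follower(leader, follower))  # 3 attempts: nextIndex 5, 4, then 3
print(follower == leader)               # True
```

The loop always terminates, because at nextIndex 1 there is no preceding entry left to disagree on.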
10. Safety
• An additional restriction for abnormal cases.
• Problem:
• Without a restriction, a server could be elected leader even if its log lacks
some committed entries => it would overwrite the previous leader’s committed work
• Solution:
• The voter denies its vote if its own log is more up-to-date than that of the
candidate.
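• Definition of "more up-to-date"
1. If the last entries of the two logs have different terms, the log with the later term is more up-to-date.
2. If the last entries have the same term, the longer log is more up-to-date.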
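The voting rule reduces to comparing the term and index of each log's last entry. A minimal Python sketch follows; the function and parameter names are hypothetical.

```python
def candidate_is_up_to_date(cand_last_term, cand_last_index,
                            voter_last_term, voter_last_index):
    """Return True if the voter may grant its vote.

    The vote is denied only when the voter's own log is more up-to-date
    than the candidate's.
    """
    if cand_last_term != voter_last_term:
        # Rule 1: the log whose last entry has the later term wins.
        return cand_last_term > voter_last_term
    # Rule 2: same last term, so the longer log wins (ties are acceptable).
    return cand_last_index >= voter_last_index


print(candidate_is_up_to_date(3, 4, 2, 10))  # True: later term beats a longer log
print(candidate_is_up_to_date(3, 4, 3, 5))   # False: same term, voter's log is longer
```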
12. Result
• Deliverables: about 2,000 lines of open-source C++ code.
• Results from a user study demonstrate that Raft is easier for students
to learn than Paxos.
• Raft’s performance is similar to other consensus algorithms such as
Paxos.
13. Thank you