SlideShare a Scribd company logo
May 2016
Raft in details
Ivan Glushkov
ivan.glushkov@gmail.com
@gliush
Raft is a consensus algorithm for managing a
replicated log
Why
❖ Very important topic
❖ [Multi-]Paxos, the key player, is very difficult to
understand
❖ Paxos -> Multi-Paxos -> different approaches
❖ Different optimization, closed-source implementation
Raft goals
❖ Main goal - understandability:
❖ Decomposition: separated leader election, log
replication, safety, membership changes
❖ State space reduction
Replicated state machines
❖ The same state on each server
❖ Compute the state from replicated log
❖ Consensus algorithm: keep the replicated log consistent
Properties of consensus algorithm
❖ Never return an incorrect result: Safety
❖ Functional as long as any majority of the severs are
operational: Availability
❖ Do not depend on timing to ensure consistency (at
worst - unavailable)
Raft in a nutshell
❖ Elect a leader
❖ Give full responsibility for replicated log to leader:
❖ Leader accepts log entries from client
❖ Leader replicates log entries on other servers
❖ Tell servers when it is safe to apply changes on their
state
Raft basics: States
❖ Follower
❖ Candidate
❖ Leader
RequestVote RPC
❖ Become a candidate, if no communication from leader over
a timeout (random election timeout: e.g. 150-300ms)
❖ Spec: (term, candidateId, lastLogIndex, lastLogTerm) 

-> (term, voteGranted)
❖ Receiver:
❖ false if term < currentTerm
❖ true if not voted yet and term and log are Up-To-Date
❖ false
Leader Election
❖ Heartbeats from leader
❖ Election timeout
❖ Candidate results:
❖ It wins
❖ Another server wins
❖ No winner for a period of time -> new election
Leader Election Property
❖ Election Safety: at most one leader can be elected in a
given term
Log Replication
❖ log index
❖ log term
❖ log command
Log Replication
❖ Leader appends command from client to its log
❖ Leader issues AppendEntries RPC to replicate the entry
❖ Leader replicates entry to majority -> apply entry to
state -> “committed entry”
❖ Follower checks if entry is committed by server ->
commit it locally
Log Replication Properties
❖ Leader Append-Only: a leader never overwrites or
deletes entries in its log; it only appends new entries
❖ Log Matching: If two entries in different logs have the
same index and term, then they store the same
command.
❖ Log Matching: If two entries in different logs have the
same index and term, then the logs are identical in all
preceding entries.
Inconsistency
Solving Inconsistency
❖ Leader forces the followers’ logs to duplicate its own
❖ Conflicting entries in the follower logs will be
overwritten with entries from the Leader’s log:
❖ Find latest log where two servers agree
❖ Remove everything after this point
❖ Write new logs
Safety Property
❖ Leader Completeness: if a log entry is committed in a
given term, then that entry will be present in the logs of
the leaders for all higher-numbered terms
❖ State Machine Safety: if a server has applied a log entry
at a given index to its state machine, no other server will
ever apply a different log entry for the same index.
Safety Restriction
❖ Up-to-date: if the logs have last entries with different
terms, then the log with the later term is more up-to-
date. If the logs end with the same term, then whichever
log is longer is more up-to-date.
time
Follower and Candidate Crashes
❖ Retry indefinitely
❖ Raft RPC are idempotent
Raft Joint Consensus
❖ Two phase to change membership configuration
❖ In Joint Consensus state:
❖ replicate log entries to both configuration
❖ any server from either configuration may be a leader
❖ agreement (for election and commitment) requires
separate majorities from both configuration
Log compaction
❖ Independent

snapshotting
❖ Snapshot metadata
❖ InstallSnapshot RPC
Client Interaction
❖ Works with leader
❖ Leader respond to a request when it commits an entry
❖ Assign uniqueID to every command, leader stores latest
ID with response.
❖ Read-only - could be without writing to a log, possible
stale data. Or write “no-op” entry at start
Correctness
❖ TLA+ formal specification for consensus alg (400 LOC)
❖ Proof of linearizability in Raft using Coq (community)
Questions?

More Related Content

What's hot

HTTP2 and gRPC
HTTP2 and gRPCHTTP2 and gRPC
HTTP2 and gRPC
Guo Jing
 
Apache Kafka - Messaging System Overview
Apache Kafka - Messaging System OverviewApache Kafka - Messaging System Overview
Apache Kafka - Messaging System Overview
Dmitry Tolpeko
 
gRPC Design and Implementation
gRPC Design and ImplementationgRPC Design and Implementation
gRPC Design and Implementation
Varun Talwar
 
8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems
Dr Sandeep Kumar Poonia
 
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming ApplicationsMetrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
confluent
 
RPC: Remote procedure call
RPC: Remote procedure callRPC: Remote procedure call
RPC: Remote procedure call
Sunita Sahu
 
Introduction to OpenMP
Introduction to OpenMPIntroduction to OpenMP
Introduction to OpenMP
Akhila Prabhakaran
 
Chord Algorithm
Chord AlgorithmChord Algorithm
Chord Algorithm
Sijia Lyu
 
remote procedure calls
  remote procedure calls  remote procedure calls
remote procedure callsAshish Kumar
 
Replication in Distributed Systems
Replication in Distributed SystemsReplication in Distributed Systems
Replication in Distributed Systems
Kavya Barnadhya Hazarika
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
gRPC - RPC rebirth?
gRPC - RPC rebirth?gRPC - RPC rebirth?
gRPC - RPC rebirth?
Luís Barbosa
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
Zalando Technology
 
Distributed tracing 101
Distributed tracing 101Distributed tracing 101
Distributed tracing 101
Itiel Shwartz
 
Reactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring BootReactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring Boot
VMware Tanzu
 
DNS Security Presentation ISSA
DNS Security Presentation ISSADNS Security Presentation ISSA
DNS Security Presentation ISSA
Srikrupa Srivatsan
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
Mohammed Fazuluddin
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
Saroj Panyasrivanit
 
Unlocking the Power of Apache Flink: An Introduction in 4 Acts
Unlocking the Power of Apache Flink: An Introduction in 4 ActsUnlocking the Power of Apache Flink: An Introduction in 4 Acts
Unlocking the Power of Apache Flink: An Introduction in 4 Acts
HostedbyConfluent
 

What's hot (20)

HTTP2 and gRPC
HTTP2 and gRPCHTTP2 and gRPC
HTTP2 and gRPC
 
Apache Kafka - Messaging System Overview
Apache Kafka - Messaging System OverviewApache Kafka - Messaging System Overview
Apache Kafka - Messaging System Overview
 
gRPC Design and Implementation
gRPC Design and ImplementationgRPC Design and Implementation
gRPC Design and Implementation
 
8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems
 
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming ApplicationsMetrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
Metrics Are Not Enough: Monitoring Apache Kafka and Streaming Applications
 
RPC: Remote procedure call
RPC: Remote procedure callRPC: Remote procedure call
RPC: Remote procedure call
 
Introduction to OpenMP
Introduction to OpenMPIntroduction to OpenMP
Introduction to OpenMP
 
Chord Algorithm
Chord AlgorithmChord Algorithm
Chord Algorithm
 
remote procedure calls
  remote procedure calls  remote procedure calls
remote procedure calls
 
Replication in Distributed Systems
Replication in Distributed SystemsReplication in Distributed Systems
Replication in Distributed Systems
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
gRPC - RPC rebirth?
gRPC - RPC rebirth?gRPC - RPC rebirth?
gRPC - RPC rebirth?
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
Thread
ThreadThread
Thread
 
Distributed tracing 101
Distributed tracing 101Distributed tracing 101
Distributed tracing 101
 
Reactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring BootReactive Applications with Apache Pulsar and Spring Boot
Reactive Applications with Apache Pulsar and Spring Boot
 
DNS Security Presentation ISSA
DNS Security Presentation ISSADNS Security Presentation ISSA
DNS Security Presentation ISSA
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Unlocking the Power of Apache Flink: An Introduction in 4 Acts
Unlocking the Power of Apache Flink: An Introduction in 4 ActsUnlocking the Power of Apache Flink: An Introduction in 4 Acts
Unlocking the Power of Apache Flink: An Introduction in 4 Acts
 

Similar to Raft in details

Consensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_systemConsensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_system
Atin Mukherjee
 
Introduction to Raft algorithm
Introduction to Raft algorithmIntroduction to Raft algorithm
Introduction to Raft algorithm
muayyad alsadi
 
Manging scalability of distributed system
Manging scalability of distributed systemManging scalability of distributed system
Manging scalability of distributed systemAtin Mukherjee
 
Module: Mutable Content in IPFS
Module: Mutable Content in IPFSModule: Mutable Content in IPFS
Module: Mutable Content in IPFS
Ioannis Psaras
 
Coordination in distributed systems
Coordination in distributed systemsCoordination in distributed systems
Coordination in distributed systems
Andrea Monacchi
 
New awesome features in MySQL 5.7
New awesome features in MySQL 5.7New awesome features in MySQL 5.7
New awesome features in MySQL 5.7
Zhaoyang Wang
 
Unveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep DiveUnveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep Dive
Chieh (Jack) Yu
 
Top 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsaTop 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsa
hopesuresh
 
NFSv4 Replication for Grid Computing
NFSv4 Replication for Grid ComputingNFSv4 Replication for Grid Computing
NFSv4 Replication for Grid Computing
peterhoneyman
 
Loadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPSLoadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPS
Shrey Agarwal
 
Google File Systems
Google File SystemsGoogle File Systems
Google File SystemsAzeem Mumtaz
 
Distributed transactions
Distributed transactionsDistributed transactions
Distributed transactions
Aritra Das
 
Real-Time Streaming Protocol -QOS
Real-Time Streaming Protocol -QOSReal-Time Streaming Protocol -QOS
Real-Time Streaming Protocol -QOS
Jayaprakash Nagaruru
 
M|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for YouM|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for You
MariaDB plc
 
Thriftypaxos
ThriftypaxosThriftypaxos
Thriftypaxos
YongraeJo
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
Yahoo Developer Network
 
Raft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdfRaft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdf
jsathy97
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microservers
Jez Halford
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
Piotr Pelczar
 
ONOS Platform Architecture
ONOS Platform ArchitectureONOS Platform Architecture
ONOS Platform Architecture
OpenDaylight
 

Similar to Raft in details (20)

Consensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_systemConsensus algo with_distributed_key_value_store_in_distributed_system
Consensus algo with_distributed_key_value_store_in_distributed_system
 
Introduction to Raft algorithm
Introduction to Raft algorithmIntroduction to Raft algorithm
Introduction to Raft algorithm
 
Manging scalability of distributed system
Manging scalability of distributed systemManging scalability of distributed system
Manging scalability of distributed system
 
Module: Mutable Content in IPFS
Module: Mutable Content in IPFSModule: Mutable Content in IPFS
Module: Mutable Content in IPFS
 
Coordination in distributed systems
Coordination in distributed systemsCoordination in distributed systems
Coordination in distributed systems
 
New awesome features in MySQL 5.7
New awesome features in MySQL 5.7New awesome features in MySQL 5.7
New awesome features in MySQL 5.7
 
Unveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep DiveUnveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep Dive
 
Top 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsaTop 10 interview question and answers for mcsa
Top 10 interview question and answers for mcsa
 
NFSv4 Replication for Grid Computing
NFSv4 Replication for Grid ComputingNFSv4 Replication for Grid Computing
NFSv4 Replication for Grid Computing
 
Loadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPSLoadbalancing In-depth study for scale @ 80K TPS
Loadbalancing In-depth study for scale @ 80K TPS
 
Google File Systems
Google File SystemsGoogle File Systems
Google File Systems
 
Distributed transactions
Distributed transactionsDistributed transactions
Distributed transactions
 
Real-Time Streaming Protocol -QOS
Real-Time Streaming Protocol -QOSReal-Time Streaming Protocol -QOS
Real-Time Streaming Protocol -QOS
 
M|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for YouM|18 Choosing the Right High Availability Strategy for You
M|18 Choosing the Right High Availability Strategy for You
 
Thriftypaxos
ThriftypaxosThriftypaxos
Thriftypaxos
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
 
Raft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdfRaft_Diego_Ongaro.pdf
Raft_Diego_Ongaro.pdf
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microservers
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
ONOS Platform Architecture
ONOS Platform ArchitectureONOS Platform Architecture
ONOS Platform Architecture
 

More from Ivan Glushkov

Distributed tracing with erlang/elixir
Distributed tracing with erlang/elixirDistributed tracing with erlang/elixir
Distributed tracing with erlang/elixir
Ivan Glushkov
 
Kubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusKubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rus
Ivan Glushkov
 
Mystery Machine Overview
Mystery Machine OverviewMystery Machine Overview
Mystery Machine Overview
Ivan Glushkov
 
Hashicorp Nomad
Hashicorp NomadHashicorp Nomad
Hashicorp Nomad
Ivan Glushkov
 
Google Dataflow Intro
Google Dataflow IntroGoogle Dataflow Intro
Google Dataflow Intro
Ivan Glushkov
 
NewSQL overview, Feb 2015
NewSQL overview, Feb 2015NewSQL overview, Feb 2015
NewSQL overview, Feb 2015
Ivan Glushkov
 
Comparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulComparing ZooKeeper and Consul
Comparing ZooKeeper and Consul
Ivan Glushkov
 
fp intro
fp introfp intro
fp intro
Ivan Glushkov
 

More from Ivan Glushkov (8)

Distributed tracing with erlang/elixir
Distributed tracing with erlang/elixirDistributed tracing with erlang/elixir
Distributed tracing with erlang/elixir
 
Kubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rusKubernetes is not needed to 90 percents of the companies.rus
Kubernetes is not needed to 90 percents of the companies.rus
 
Mystery Machine Overview
Mystery Machine OverviewMystery Machine Overview
Mystery Machine Overview
 
Hashicorp Nomad
Hashicorp NomadHashicorp Nomad
Hashicorp Nomad
 
Google Dataflow Intro
Google Dataflow IntroGoogle Dataflow Intro
Google Dataflow Intro
 
NewSQL overview, Feb 2015
NewSQL overview, Feb 2015NewSQL overview, Feb 2015
NewSQL overview, Feb 2015
 
Comparing ZooKeeper and Consul
Comparing ZooKeeper and ConsulComparing ZooKeeper and Consul
Comparing ZooKeeper and Consul
 
fp intro
fp introfp intro
fp intro
 

Recently uploaded

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
ShamsuddeenMuhammadA
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaTop 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Yara Milbes
 
Enterprise Software Development with No Code Solutions.pptx
Enterprise Software Development with No Code Solutions.pptxEnterprise Software Development with No Code Solutions.pptx
Enterprise Software Development with No Code Solutions.pptx
QuickwayInfoSystems3
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
abdulrafaychaudhry
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
informapgpstrackings
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 

Recently uploaded (20)

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaTop 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
 
Enterprise Software Development with No Code Solutions.pptx
Enterprise Software Development with No Code Solutions.pptxEnterprise Software Development with No Code Solutions.pptx
Enterprise Software Development with No Code Solutions.pptx
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 

Raft in details

  • 1. May 2016 Raft in details Ivan Glushkov ivan.glushkov@gmail.com @gliush
  • 2. Raft is a consensus algorithm for managing a replicated log
  • 3. Why ❖ Very important topic ❖ [Multi-]Paxos, the key player, is very difficult to understand ❖ Paxos -> Multi-Paxos -> different approaches ❖ Different optimization, closed-source implementation
  • 4. Raft goals ❖ Main goal - understandability: ❖ Decomposition: separated leader election, log replication, safety, membership changes ❖ State space reduction
  • 5. Replicated state machines ❖ The same state on each server ❖ Compute the state from replicated log ❖ Consensus algorithm: keep the replicated log consistent
  • 6. Properties of consensus algorithm ❖ Never return an incorrect result: Safety ❖ Functional as long as any majority of the severs are operational: Availability ❖ Do not depend on timing to ensure consistency (at worst - unavailable)
  • 7. Raft in a nutshell ❖ Elect a leader ❖ Give full responsibility for replicated log to leader: ❖ Leader accepts log entries from client ❖ Leader replicates log entries on other servers ❖ Tell servers when it is safe to apply changes on their state
  • 8. Raft basics: States ❖ Follower ❖ Candidate ❖ Leader
  • 9. RequestVote RPC ❖ Become a candidate, if no communication from leader over a timeout (random election timeout: e.g. 150-300ms) ❖ Spec: (term, candidateId, lastLogIndex, lastLogTerm) 
 -> (term, voteGranted) ❖ Receiver: ❖ false if term < currentTerm ❖ true if not voted yet and term and log are Up-To-Date ❖ false
  • 10. Leader Election ❖ Heartbeats from leader ❖ Election timeout ❖ Candidate results: ❖ It wins ❖ Another server wins ❖ No winner for a period of time -> new election
  • 11. Leader Election Property ❖ Election Safety: at most one leader can be elected in a given term
  • 12. Log Replication ❖ log index ❖ log term ❖ log command
  • 13. Log Replication ❖ Leader appends command from client to its log ❖ Leader issues AppendEntries RPC to replicate the entry ❖ Leader replicates entry to majority -> apply entry to state -> “committed entry” ❖ Follower checks if entry is committed by server -> commit it locally
  • 14. Log Replication Properties ❖ Leader Append-Only: a leader never overwrites or deletes entries in its log; it only appends new entries ❖ Log Matching: If two entries in different logs have the same index and term, then they store the same command. ❖ Log Matching: If two entries in different logs have the same index and term, then the logs are identical in all preceding entries.
  • 16. Solving Inconsistency ❖ Leader forces the followers’ logs to duplicate its own ❖ Conflicting entries in the follower logs will be overwritten with entries from the Leader’s log: ❖ Find latest log where two servers agree ❖ Remove everything after this point ❖ Write new logs
  • 17. Safety Property ❖ Leader Completeness: if a log entry is committed in a given term, then that entry will be present in the logs of the leaders for all higher-numbered terms ❖ State Machine Safety: if a server has applied a log entry at a given index to its state machine, no other server will ever apply a different log entry for the same index.
  • 18. Safety Restriction ❖ Up-to-date: if the logs have last entries with different terms, then the log with the later term is more up-to- date. If the logs end with the same term, then whichever log is longer is more up-to-date. time
  • 19. Follower and Candidate Crashes ❖ Retry indefinitely ❖ Raft RPC are idempotent
  • 20. Raft Joint Consensus ❖ Two phase to change membership configuration ❖ In Joint Consensus state: ❖ replicate log entries to both configuration ❖ any server from either configuration may be a leader ❖ agreement (for election and commitment) requires separate majorities from both configuration
  • 21. Log compaction ❖ Independent
 snapshotting ❖ Snapshot metadata ❖ InstallSnapshot RPC
  • 22. Client Interaction ❖ Works with leader ❖ Leader respond to a request when it commits an entry ❖ Assign uniqueID to every command, leader stores latest ID with response. ❖ Read-only - could be without writing to a log, possible stale data. Or write “no-op” entry at start
  • 23. Correctness ❖ TLA+ formal specification for consensus alg (400 LOC) ❖ Proof of linearizability in Raft using Coq (community)