Clock-RSM: Low-Latency Inter-Datacenter
State Machine Replication Using Loosely
Synchronized Physical Clocks
Jiaqing Du, Daniele Sciascia, Sameh Elnikety
Willy Zwaenepoel, Fernando Pedone
EPFL, University of Lugano, Microsoft Research
Replicated State Machines (RSM)
• Strong consistency
– Execute same commands in same order
– Reach same state from same initial state
• Fault tolerance
– Store data at multiple replicas
– Failure masking / fast failover
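The strong-consistency bullets can be illustrated with a toy deterministic state machine. This is a minimal sketch, not taken from the slides: the key-value `apply` function and the command tuples are hypothetical, chosen only to show that the same commands in the same order yield the same state, while a different order may not.

```python
def apply(state, command):
    """Apply one deterministic command to a key-value state; return the new state."""
    op, key, value = command
    new_state = dict(state)
    if op == "put":
        new_state[key] = value
    elif op == "delete":
        new_state.pop(key, None)
    return new_state


def run(log, initial=None):
    """Execute a command log from an initial state, one command at a time."""
    state = dict(initial or {})
    for command in log:
        state = apply(state, command)
    return state


# Hypothetical command log shared by two replicas.
log = [("put", "x", 1), ("put", "y", 2), ("delete", "x", None), ("put", "x", 3)]

replica_a = run(log)
replica_b = run(log)

# Same commands, different order: the final state diverges.
reordered = run([log[3], log[0], log[1], log[2]])
```

Both replicas end in the same state because `apply` is deterministic and the log order is identical; the reordered run ends without the key `x`.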
Geo-Replication
[Figure: five data centers]
• High latency among replicas
• Messaging dominates replication latency
Leader-Based Protocols
• Order commands by a leader replica
• Require extra ordering messages at followers
[Figure: timeline for a Leader and a Follower, from client request to client reply, with the steps Ordering, Replication, Ordering]
High latency for geo replication
Clock-RSM
• Orders commands using physical clocks
• Overlaps ordering and replication
[Figure: timeline from client request to client reply with a single Ordering + Replication step]
Low latency for geo replication
Outline
• Clock-RSM
• Comparison with Paxos
• Evaluation
• Conclusion
Property and Assumption
• Provides linearizability
• Tolerates the failure of a minority of replicas
• Assumptions
– Asynchronous FIFO channels
– Non-Byzantine faults
– Loosely synchronized physical clocks
Protocol Overview
[Figure: Clock-RSM replicas take two client requests, assign cmd1.ts = Clock() and cmd2.ts = Clock(), exchange PrepOK messages, and every replica holds the same order cmd1, cmd2 before the client replies]
Major Message Steps
• Prep: Ask everyone to log a command
• PrepOK: Tell everyone after logging a command
[Figure: replicas R0 to R4 exchanging Prep and PrepOK messages. One client request is stamped cmd1.ts = 24, a second client request is stamped cmd2.ts = 23. Question posed on the slide: cmd1 committed?]
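A small sketch of how a replica might stamp a command with its physical clock and how every replica can then derive the same order from the timestamps. The names `Command` and `stamp` are hypothetical, `time.time_ns` stands in for the slide's `Clock()`, and breaking ties by the originating replica's id is an assumption that the slides do not state.

```python
import time
from dataclasses import dataclass, field


@dataclass(order=True, frozen=True)
class Command:
    ts: int                               # physical-clock timestamp taken at the originating replica
    origin: int                           # originating replica id (assumed tie-breaker)
    payload: str = field(compare=False)   # the command itself, ignored when ordering


def stamp(payload, origin, clock=time.time_ns):
    """Assign a timestamp when the client request arrives (cmd.ts = Clock())."""
    return Command(ts=clock(), origin=origin, payload=payload)


# Timestamps from the slide; the origins (R0 and R4) are assumed for illustration.
cmd1 = Command(ts=24, origin=0, payload="cmd1")
cmd2 = Command(ts=23, origin=4, payload="cmd2")

# Every replica sorts by (ts, origin), so all of them execute cmd2 before cmd1.
order = [c.payload for c in sorted([cmd1, cmd2])]
```

Because the order is a pure function of the timestamps, no replica has to ask a leader for a sequence number; it only has to learn which commands exist.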
Commit Conditions
• A command is committed if
– Replicated by a majority
– All commands ordered before are committed
• Wait until three conditions hold
C1: Majority replication
C2: Stable order
C3: Prefix replication
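The three conditions can be sketched as one check over a replica's local knowledge. This is an illustration under stated assumptions, not the authors' implementation: timestamps are plain integers with no ties, `acks` maps each known command timestamp to the set of replicas known to have logged it, and `latest_ts` holds the latest timestamp received from each other replica. All names are hypothetical.

```python
def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def is_committed(cmd_ts, acks, latest_ts, n, self_id):
    """Return True when C1, C2 and C3 all hold for the command stamped cmd_ts."""
    # C1: majority replication of the command itself.
    c1 = len(acks.get(cmd_ts, set())) >= majority(n)

    # C2: stable order. Every other replica has sent a greater timestamp, so
    # (with FIFO channels) no command with a smaller timestamp can still arrive.
    c2 = all(latest_ts.get(r, -1) > cmd_ts for r in range(n) if r != self_id)

    # C3: prefix replication. Every known command ordered before this one
    # is itself logged by a majority.
    c3 = all(len(who) >= majority(n) for ts, who in acks.items() if ts < cmd_ts)

    return c1 and c2 and c3


# State at R0 using the numbers shown on the slides (five replicas).
acks = {24: {0, 1, 2}, 23: {1, 2, 3}}
latest_ts = {1: 25, 2: 25, 3: 25, 4: 25}
committed = is_committed(24, acks, latest_ts, n=5, self_id=0)

# If R4 has only been heard from up to timestamp 23, cmd1 is not yet stable.
not_stable = is_committed(24, acks, {1: 25, 2: 25, 3: 25, 4: 23}, n=5, self_id=0)
```

C3 is only meaningful once C2 holds, because stability is what guarantees that `acks` already lists every command with a smaller timestamp.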
C1: Majority Replication
• More than half of the replicas log cmd1
[Figure: replicas R0 to R4; a client request at R0 is stamped cmd1.ts = 24; R0 sends Prep and receives PrepOK messages]
Replicated by R0, R1, R2
1 RTT: between R0 and majority
C2: Stable Order
• Replica knows all commands ordered before cmd1
– Receives a greater timestamp from every other replica
[Figure: replicas R0 to R4; a client request at R0 is stamped cmd1.ts = 24; timestamps 23 and 25 are shown on messages between replicas, carried by Prep / PrepOK / ClockTime messages]
0.5 RTT: between R0 and farthest peer
cmd1 is stable at R0
C3: Prefix Replication
• All commands ordered before cmd1 are replicated by a majority
[Figure: replicas R0 to R4; client requests stamped cmd1.ts = 24 and cmd2.ts = 23; Prep and PrepOK messages for cmd2]
cmd2 is replicated by R1, R2, R3
1 RTT: R4 to majority + majority to R0
Overlapping Steps
[Figure: replicas R0 to R4 from client request to client reply for cmd1.ts = 24; the Prep, Log(cmd1), and PrepOK messages of the Majority replication, Stable order, and Prefix replication steps overlap in time]
Latency of cmd1: about 1 RTT to majority
Commit Latency
Step                   Latency
Majority replication   1 RTT (majority1)
Stable order           0.5 RTT (farthest)
Prefix replication     1 RTT (majority2)

Overall latency =
MAX{ 1 RTT (majority1), 0.5 RTT (farthest), 1 RTT (majority2) }

If 0.5 RTT (farthest) < 1 RTT (majority),
then overall latency ≈ 1 RTT (majority).
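The formula above can be evaluated directly. A minimal sketch, assuming symmetric links and made-up RTT values in milliseconds; `majority_rtt` and `commit_latency` are hypothetical helper names, and the prefix-replication term is passed in rather than derived.

```python
def majority_rtt(peer_rtts):
    """RTT after which enough peers have replied to form a majority with the local replica."""
    n = len(peer_rtts) + 1      # peers plus the local replica
    acks_needed = n // 2        # peer replies required in addition to the local log
    if acks_needed == 0:
        return 0
    return sorted(peer_rtts)[acks_needed - 1]


def commit_latency(majority1_rtt, farthest_rtt, majority2_rtt):
    """Overall latency = MAX{1 RTT (majority1), 0.5 RTT (farthest), 1 RTT (majority2)}."""
    return max(majority1_rtt, 0.5 * farthest_rtt, majority2_rtt)


# Hypothetical RTTs (ms) from one replica to its four peers.
peer_rtts = [80, 90, 170, 120]
m1 = majority_rtt(peer_rtts)       # second-fastest peer completes the majority
farthest = max(peer_rtts)
latency = commit_latency(m1, farthest, majority2_rtt=100)
```

With these numbers the stable-order step (85 ms) hides behind the two majority steps, which is the case the slide's approximation describes.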
Topology Examples
[Figure: two example topologies of replicas R0 to R4, each with a client request at R0 and with the Majority1 and Farthest replicas marked]
Paxos 1: Multi-Paxos
• Single leader orders commands
– Logical clock: 0, 1, 2, 3, ...
[Figure: replicas R0 to R4 with R2 as Leader; a client request at a follower passes through Forward, Prep, PrepOK, and Commit messages before the client reply]
Latency at followers: 2 RTTs (leader & majority)
Paxos 2: Paxos-bcast
• Every replica broadcasts PrepOK
– Trades off message complexity for latency
[Figure: replicas R0 to R4 with R2 as Leader; a client request at a follower passes through Forward, Prep, and PrepOK messages before the client reply]
Latency at followers: 1.5 RTTs (leader & majority)
Clock-RSM vs. Paxos
• With realistic topologies, Clock-RSM has
– Lower latency at Paxos follower replicas
– Similar / slightly higher latency at Paxos leader
Protocol      Latency
Clock-RSM     All replicas: 1 RTT (majority),
              if 0.5 RTT (farthest) < 1 RTT (majority)
Paxos-bcast   Leader: 1 RTT (majority)
              Follower: 1.5 RTTs (leader & majority)
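The comparison can be tried on a concrete delay matrix. This is a rough model under loud assumptions, not the paper's analysis: `D` holds made-up symmetric one-way delays in milliseconds, the Clock-RSM estimate ignores the prefix-replication term, and the Paxos-bcast follower formula (forward to the leader, then wait for PrepOK from a majority) is one reading of "1.5 RTTs (leader & majority)". All function names are hypothetical.

```python
def kth_smallest(values, k):
    """Return the k-th smallest value (1-based)."""
    return sorted(values)[k - 1]


def clock_rsm_latency(d, i):
    """max(1 RTT to a majority, 0.5 RTT to the farthest peer) at replica i."""
    n = len(d)
    rtts = [2 * d[i][j] for j in range(n) if j != i]
    return max(kth_smallest(rtts, n // 2), max(d[i]))


def paxos_bcast_latency(d, i, leader):
    """Leader: 1 RTT to a majority. Follower: forward + Prep + broadcast PrepOK."""
    n = len(d)
    if i == leader:
        rtts = [2 * d[leader][j] for j in range(n) if j != leader]
        return kth_smallest(rtts, n // 2)
    # Time after the leader sends Prep until replica r's PrepOK reaches i;
    # r == leader and r == i both reduce to the one-way delay leader -> i.
    arrivals = [d[leader][r] + d[r][i] for r in range(n)]
    return d[i][leader] + kth_smallest(arrivals, n // 2 + 1)


# Hypothetical one-way delays (ms) between five replicas; replica 1 is the Paxos leader.
D = [
    [0, 40, 45, 85, 60],
    [40, 0, 40, 125, 85],
    [45, 40, 0, 90, 120],
    [85, 125, 90, 0, 35],
    [60, 85, 120, 35, 0],
]

follower_clock_rsm = clock_rsm_latency(D, 0)
follower_paxos = paxos_bcast_latency(D, 0, leader=1)
leader_clock_rsm = clock_rsm_latency(D, 1)
leader_paxos = paxos_bcast_latency(D, 1, leader=1)
```

In this made-up topology Clock-RSM is faster at the follower (replica 0), while at the leader (replica 1) the farthest-peer term makes Clock-RSM slower than Paxos-bcast, matching the direction of the two bullets above.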
Experiment Setup
• Replicated key-value store
• Deployed on Amazon EC2
– Regions: California (CA), Virginia (VA), Ireland (IR), Singapore (SG), Japan (JP)
Latency (1/2)
• All replicas serve client requests
Overlapping vs. Separate Steps
[Figure: two diagrams of the replicas CA, VA, IR, SG, JP, each with a client request; in the second diagram VA is marked (L)]
Clock-RSM latency: max of three
Paxos-bcast latency: sum of three
Latency (2/2)
• Paxos leader is changed to CA
Throughput
• Five replicas on a local cluster
• Message batching is key
Also in the Paper
• A reconfiguration protocol
• Comparison with Mencius
• Latency analysis of protocols
Conclusion
• Clock-RSM: low latency geo-replication
– Uses loosely synchronized physical clocks
– Overlaps ordering and replication
• Leader-based protocols can incur high latency