Hermes
A Fast, Fault-tolerant and Linearizable
Replication Protocol
Antonios Katsarakis, V. Gavrielatos, S. Katebzadeh,
A. Joshi*, B. Grot, V. Nagarajan, A. Dragojevic†
University of Edinburgh, *Intel, †Microsoft Research
hermes-protocol.com
Distributed datastores
In-memory with read/write API
Backbone of online services
Need:
- High performance
- Fault tolerance
Replication 101
Distributed Datastore
Mandates data replication
Typically 3 to 7 replicas
Consistency
- Weak: performance but nasty surprises
- Strong: programmable and intuitive
Reliable replication protocols
• Strong consistency even under faults
• Define actions to execute reads & writes
→ these determine a datastore’s performance
Can reliable protocols provide high performance?
Paxos
Gold standard
- Strong consistency and fault tolerance
Low performance
- Reads → inter-replica communication
- Writes → multiple RTTs over the network
Common-case performance (i.e., no faults)
as bad as worst-case (under faults)
State-of-the-art reliable protocols exploit
failure-free operation for performance
Performance of state-of-the-art protocols
ZAB (leader-based)
- Local reads from all replicas → Fast
- Writes serialize on the leader → Low throughput
CRAQ (chain from head to tail)
- Local reads from all replicas → Fast
- Writes traverse the length of the chain → High latency
Fast reads but poor write performance
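To make the latency point concrete, here is a rough hop-count model, not a measurement. It assumes a chain write is forwarded node by node from head to tail and acknowledged back along the chain, while a broadcast round reaches all replicas in parallel. The function names are hypothetical and only count sequential one-way message delays.

```python
def chain_write_hops(num_replicas: int) -> int:
    """Sequential one-way delays for a chain write (assumed model).

    The write is forwarded head -> tail, then acknowledged tail -> head.
    """
    if num_replicas < 1:
        raise ValueError("need at least one replica")
    return 2 * (num_replicas - 1)


def broadcast_write_hops(num_replicas: int) -> int:
    """Sequential one-way delays for one broadcast round (assumed model).

    One replica messages all others in parallel and waits for their replies,
    so the cost is one round trip regardless of group size.
    """
    if num_replicas < 1:
        raise ValueError("need at least one replica")
    return 0 if num_replicas == 1 else 2


# Compare the two models for the replica counts mentioned in the talk.
for n in (3, 5, 7):
    print(n, chain_write_hops(n), broadcast_write_hops(n))
```

Under these assumptions the chain cost grows linearly with the number of replicas, while the broadcast cost stays constant.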
Key protocol features for high performance
Goal: low-latency + high-throughput
Reads
- Local from all replicas
Writes
Fast
- Minimize network hops (avoid long latencies)
Decentralized
- No serialization points (avoid write serialization)
Fully concurrent
- Any replica can service a write
Local reads from all replicas
Fast, decentralized, fully concurrent writes
Existing replication protocols are deficient
Enter Hermes
Broadcast-based, invalidating replication protocol
Inspired by multiprocessor cache-coherence protocols
Fault-free operation:
1. Coordinator broadcasts Invalidations
- Coordinator is a replica servicing a write
- At this point, no stale reads can be served: strong consistency!
2. Followers Acknowledge invalidation
3. Coordinator broadcasts Validations
- All replicas can now serve reads for this object
Strongest consistency: Linearizability
Local reads from all replicas
→ valid objects = latest value
Figure: write(A=3) arrives at the Coordinator. States of A: Valid (V), Invalid (I). Invalidations set A to I at the Followers, the Followers reply with Acks, the Coordinator commits, and Validations set A back to V.
What about concurrent writes?
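The three fault-free steps above can be sketched in a few lines of Python. This is a minimal single-process illustration, not the authors' implementation: messages are plain method calls, and the class names, the `STALLED` marker and the `begin_write`/`finish_write` split are hypothetical. It also assumes the new value travels with the Validation, which is only one way to read this version of the protocol.

```python
from dataclasses import dataclass, field
from enum import Enum

STALLED = object()  # hypothetical marker: the read must wait for a Validation


class State(Enum):
    VALID = "V"
    INVALID = "I"


@dataclass
class Replica:
    node_id: int
    values: dict = field(default_factory=dict)
    states: dict = field(default_factory=dict)

    def on_invalidation(self, key):
        """Follower side of step 1: mark the object Invalid, then Ack (step 2)."""
        self.states[key] = State.INVALID
        return "ack"

    def on_validation(self, key, value):
        """Follower side of step 3: install the value and mark the object Valid."""
        self.values[key] = value
        self.states[key] = State.VALID

    def read(self, key):
        """Local read: only a Valid object may be returned."""
        if self.states.get(key, State.VALID) is State.INVALID:
            return STALLED
        return self.values.get(key)


def begin_write(coordinator, followers, key):
    """Step 1: invalidate locally and broadcast Invalidations; collect Acks."""
    coordinator.states[key] = State.INVALID
    return [f.on_invalidation(key) for f in followers]


def finish_write(coordinator, followers, key, value, acks):
    """Step 3: once every follower has Acked, commit and broadcast Validations."""
    if len(acks) != len(followers):
        raise RuntimeError("write cannot commit before all Acks arrive")
    coordinator.on_validation(key, value)
    for f in followers:
        f.on_validation(key, value)


replicas = [Replica(node_id=i, values={"A": 0}) for i in range(3)]
coordinator, followers = replicas[0], replicas[1:]

acks = begin_write(coordinator, followers, "A")
mid_write_reads = [r.read("A") for r in replicas]   # every replica stalls
finish_write(coordinator, followers, "A", 3, acks)
final_reads = [r.read("A") for r in replicas]       # every replica sees 3
```

Between the two calls no replica can return the old value of A, which is the property the slide calls out after step 1.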
Concurrent writes = challenge
Challenge
How to efficiently order concurrent writes to an object?
Solution
Store a logical timestamp (TS) along with each object
- Upon a write: coordinator increments TS and sends it with Invalidations
- Upon receiving Invalidation: a follower updates the object’s TS
- When two writes to the same object race: use node ID to order them
Figure: two replicas concurrently issue write(A=3) and write(A=1) and broadcast Inv(TS1) and Inv(TS4).
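The timestamp rule above can be illustrated with a small sketch. It assumes the TS is a (version, node ID) pair compared lexicographically, so that the node ID breaks ties between racing writes. The slides only say the node ID is used for ordering, so "higher node ID wins" is an assumption here, as are the names `Timestamp` and `ObjectReplica`.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    version: int   # logical counter, incremented on every write
    node_id: int   # ID of the coordinating replica, used to break ties


class ObjectReplica:
    """Per-object timestamp state held by one replica (hypothetical name)."""

    def __init__(self):
        self.ts = Timestamp(version=0, node_id=0)

    def start_write(self, my_node_id):
        """Coordinator: increment the TS and return it for the Invalidations."""
        self.ts = Timestamp(self.ts.version + 1, my_node_id)
        return self.ts

    def on_invalidation(self, ts):
        """Follower: adopt the TS only if it is newer than the local one."""
        if ts > self.ts:   # tuple comparison: version first, then node ID
            self.ts = ts


node1, node4, follower_a, follower_b = (ObjectReplica() for _ in range(4))

# Two writes to the same object race: both coordinators pick version 1.
ts1 = node1.start_write(my_node_id=1)
ts4 = node4.start_write(my_node_id=4)

# Invalidations arrive in different orders at different replicas.
follower_a.on_invalidation(ts1)
follower_a.on_invalidation(ts4)
follower_b.on_invalidation(ts4)
follower_b.on_invalidation(ts1)
node1.on_invalidation(ts4)
node4.on_invalidation(ts1)

winners = {r.ts for r in (node1, node4, follower_a, follower_b)}
```

Every replica settles on the same timestamp regardless of delivery order, without any central serialization point.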
Writes in Hermes
Broadcast + Invalidations + TS → high performance writes
1. Decentralized
- Fully distributed write ordering at endpoints
2. Fully concurrent
- Any replica can coordinate a write
- Writes to different objects proceed in parallel
3. Fast
- Writes commit in 1 RTT
- Writes never abort
Awesome! But what about fault tolerance?
Handling faults in Hermes
Problem
A failure in the middle of a write can
permanently leave a replica in Invalid state
Figure: write(A=3). The Coordinator broadcasts Inv(TS) and then fails; the Followers stay Invalid (I) when read(A) arrives.
Idea
Allow any Invalidated replica to
replay the write and unblock.
How?
Insight: replaying a write needs
- Write’s original TS (for ordering)
- Write value
TS is sent with the Invalidation, but the write value is not
Solution: send write value with Invalidation → Early value propagation
Figure: write(A=3). The Coordinator broadcasts Inv(3,TS) and then fails; a Follower that receives read(A) replays the write with Inv(3,TS), and on write completion the replicas become Valid (V).
Early value propagation enables write replays
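The replay idea can be sketched as follows. This is a simplified single-process illustration under stated assumptions, not the protocol as specified in the paper: the `alive` flag is a hypothetical stand-in for failure detection, which the slides do not cover, and all class and function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    node_id: int
    value: int = 0
    ts: tuple = (0, 0)     # (version, node ID) of the latest write seen
    valid: bool = True
    alive: bool = True     # hypothetical stand-in for failure detection

    def on_invalidation(self, value, ts):
        """Store the value and TS carried by the Invalidation, then Ack."""
        if ts > self.ts:
            self.value, self.ts, self.valid = value, ts, False
        return "ack"

    def on_validation(self, ts):
        """Become Valid again if the Validation matches the stored write."""
        if ts == self.ts:
            self.valid = True


def replay_write(replayer, replicas):
    """Re-broadcast the pending write with its original TS, then validate."""
    peers = [r for r in replicas if r.alive and r is not replayer]
    acks = [p.on_invalidation(replayer.value, replayer.ts) for p in peers]
    if len(acks) == len(peers):
        replayer.valid = True
        for p in peers:
            p.on_validation(replayer.ts)


coordinator, follower_1, follower_2 = (Replica(node_id=i) for i in range(3))
replicas = [coordinator, follower_1, follower_2]

# write(A=3): the coordinator broadcasts Inv(3, TS) and fails before validating.
write_ts = (1, coordinator.node_id)
for follower in (follower_1, follower_2):
    follower.on_invalidation(3, write_ts)
coordinator.alive = False

# A read at follower_1 finds the object Invalid, so follower_1 replays the write.
if not follower_1.valid:
    replay_write(follower_1, replicas)
```

Because the Invalidation already carried both the value and the original TS, the replaying follower needs nothing from the failed coordinator.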
Hermes recap
Broadcast + Invalidations + TS + early value propagation
Strong Consistency
- through cache-coherence (CC) inspired Invalidations
Fault-tolerance
- write replays via early value propagation
High Performance
- Local reads at all replicas
- High performance writes: Fast, Decentralized, Fully-distributed
Figure: write(A=3). The Coordinator broadcasts Inv(3,TS), replicas move from Valid (V) to Invalid (I), the write commits, and replicas return to V.
In the paper: protocol details, RMWs, other goodies
Evaluation
State-of-the-art hardware testbed
- 5 servers
- 2x 10 core Intel Xeon E5-2630v4 per server
- 56 Gb/s InfiniBand NICs
KVS Workload
- Uniform access distribution
- Million KV pairs: <8B keys, 32B values>
Evaluated protocols:
- ZAB
- CRAQ
- Hermes
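A workload with the shape described above could be generated along these lines. This is only a sketch: it reads "Million KV pairs" as one million keys, uses the 5% write ratio quoted in the results as an example parameter, and the names `make_request`, `NUM_KEYS`, `KEY_BYTES` and `VALUE_BYTES` are hypothetical rather than taken from the Hermes code base.

```python
import random

NUM_KEYS = 1_000_000   # assumption: "Million KV pairs" read as one million
KEY_BYTES = 8          # 8B keys
VALUE_BYTES = 32       # 32B values


def make_request(rng, write_ratio):
    """Return one (op, key, value) request with a uniformly chosen key."""
    key = rng.randrange(NUM_KEYS).to_bytes(KEY_BYTES, "big")
    if rng.random() < write_ratio:
        value = bytes(rng.getrandbits(8) for _ in range(VALUE_BYTES))
        return ("write", key, value)
    return ("read", key, None)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
requests = [make_request(rng, write_ratio=0.05) for _ in range(10_000)]
write_fraction = sum(op == "write" for op, _, _ in requests) / len(requests)
```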
Performance
Figure: Throughput (million requests/sec) and Write Latency (normalized to Hermes, 5% write ratio). Legend: high-perf. writes + local reads; conc. writes + local reads; local reads. Annotations: 4x and 40% on throughput, 6x on write latency.
Write performance matters even at low write ratios
Hermes: highest throughput & lowest latency
Conclusion
Hermes
Broadcast + Invalidations + TS + early value propagation
Strong consistency
Fault tolerance via write replays
High performance
- Local reads from all replicas
- High performance writes: Fast, Decentralized, Fully concurrent
hermes-protocol.com
- Code available
- TLA+ verification
Need reliability and performance? Choose Hermes!
Q&A

Hermes Reliable Replication Protocol - ASPLOS'20 Presentation

Hermes Reliable Replication Protocol - ASPLOS'20 Presentation

  • 1. Hermes: A Fast, Fault-tolerant and Linearizable Replication Protocol. Antonios Katsarakis, V. Gavrielatos, S. Katebzadeh, A. Joshi*, B. Grot, V. Nagarajan, A. Dragojevic† (University of Edinburgh, *Intel, †Microsoft Research). hermes-protocol.com
  • 2-7. Distributed datastores: in-memory with a read/write API; the backbone of online services. Need: high performance and fault tolerance, which mandates data replication.
  • 8-12. Replication 101: typically 3 to 7 replicas. Consistency: weak gives performance but nasty surprises; strong is programmable and intuitive. Reliable replication protocols provide strong consistency even under faults and define the actions to execute reads and writes → these determine a datastore's performance. Can reliable protocols provide high performance?
  • 13-16. Paxos: the gold standard for strong consistency and fault tolerance, but low performance: reads → inter-replica communication; writes → multiple RTTs over the network. Common-case performance (i.e., no faults) is as bad as worst-case performance (under faults). State-of-the-art reliable protocols exploit failure-free operation for performance.
  • 17-23. Performance of state-of-the-art protocols (figure legend: read, write, unicast, broadcast). ZAB (leader-based): local reads from all replicas → fast; writes serialize on the leader → low throughput. CRAQ (chain from head to tail): local reads from all replicas → fast; writes traverse the length of the chain → high latency. Summary: fast reads but poor write performance.
  • 24-29. Key protocol features for high performance. Goal: low latency + high throughput. Reads: local from all replicas. Writes: fast (minimize network hops, avoiding the long latencies of a head-to-tail chain); decentralized (no serialization points, avoiding write serialization on a leader); fully concurrent (any replica can service a write). In short: local reads from all replicas plus fast, decentralized, fully concurrent writes. Existing replication protocols are deficient.
  • 30-38. Enter Hermes: a broadcast-based, invalidating replication protocol inspired by multiprocessor cache-coherence protocols. States of an object A: Valid, Invalid. Fault-free operation for write(A=3): 1. The coordinator (the replica servicing the write) broadcasts Invalidations; at this point, no stale reads can be served (strong consistency). 2. Followers acknowledge the invalidation. 3. After the Acks, the coordinator commits and broadcasts Validations; all replicas can now serve reads for this object. Result: the strongest consistency, linearizability, with local reads from all replicas → valid objects = latest value. What about concurrent writes?
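The three fault-free steps above (Invalidation, Ack, Validation) can be sketched as a small in-process simulation. This is only an illustrative sketch, not the authors' implementation: the `Replica` class, the `write` helper, and the choice to carry the new value on the Validation message are assumptions made to keep the example self-contained, and message passing is replaced by direct method calls.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    VALID = "Valid"
    INVALID = "Invalid"


@dataclass
class Replica:
    """Hypothetical replica holding per-key values and Valid/Invalid states."""
    name: str
    value: dict = field(default_factory=dict)
    state: dict = field(default_factory=dict)

    def on_invalidation(self, key):
        # Step 1 (receiver side): stop serving local reads for this key.
        self.state[key] = State.INVALID
        # Step 2: acknowledge the invalidation to the coordinator.
        return "Ack"

    def on_validation(self, key, new_value):
        # Step 3 (receiver side): install the value and serve reads again.
        # Assumption of this sketch: the value arrives with the Validation.
        self.value[key] = new_value
        self.state[key] = State.VALID

    def read(self, key):
        # Local read: only a Valid object may be returned.
        if self.state.get(key) is not State.VALID:
            return None  # a real replica would stall the read instead
        return self.value[key]


def write(coordinator, followers, key, new_value):
    """Run one fault-free write coordinated by `coordinator`."""
    coordinator.state[key] = State.INVALID
    # 1. Broadcast Invalidations and 2. collect the Acks.
    acks = [f.on_invalidation(key) for f in followers]
    if not all(ack == "Ack" for ack in acks):
        raise RuntimeError("missing acknowledgment")
    # Commit locally once the followers have acknowledged.
    coordinator.value[key] = new_value
    coordinator.state[key] = State.VALID
    # 3. Broadcast Validations so every replica can serve reads again.
    for f in followers:
        f.on_validation(key, new_value)
```

Between the Invalidation and the Validation a follower returns no value for the key, which is how the sketch models "no stale reads can be served".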
  • 39-44. Concurrent writes = challenge. Challenge: how to efficiently order concurrent writes to an object, e.g. racing write(A=3) and write(A=1)? Solution: store a logical timestamp (TS) along with each object. Upon a write, the coordinator increments the TS and sends it with its Invalidations (Inv(TS1) and Inv(TS4) in the figure). Upon receiving an Invalidation, a follower updates the object's TS. When two writes to the same object race, use the node ID to order them. Broadcast + Invalidations + TS → high-performance writes.
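One way to picture the timestamp rule is a pair of a version counter and a node ID, compared lexicographically. The sketch below is a hedged illustration only: the `Timestamp` and `ObjectCopy` names are hypothetical, and letting the higher node ID win a tie is an assumption (the slides only say that the node ID orders racing writes).

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Timestamp:
    # Fields are compared in declaration order: version first, then node_id.
    version: int  # logical counter stored with the object
    node_id: int  # tie-breaker: ID of the coordinating replica (assumed order)


class ObjectCopy:
    """Hypothetical per-replica copy of one object, tracking only its TS."""

    def __init__(self):
        self.ts = Timestamp(0, 0)

    def next_ts(self, node_id):
        # Upon a write, the coordinator increments the TS.
        return Timestamp(self.ts.version + 1, node_id)

    def on_invalidation(self, ts):
        # A follower adopts the TS only if it is newer than what it has seen.
        if ts > self.ts:
            self.ts = ts
            return True
        return False


# Two coordinators (node IDs 1 and 4) race on the same object.
ts_from_node1 = ObjectCopy().next_ts(node_id=1)
ts_from_node4 = ObjectCopy().next_ts(node_id=4)

# Followers may receive the two Invalidations in different orders.
follower_x, follower_y = ObjectCopy(), ObjectCopy()
follower_x.on_invalidation(ts_from_node1)
follower_x.on_invalidation(ts_from_node4)
follower_y.on_invalidation(ts_from_node4)
follower_y.on_invalidation(ts_from_node1)

# Both followers converge on the same winner regardless of arrival order.
assert follower_x.ts == follower_y.ts
```

Because every replica applies the same comparison locally, no leader or extra round of messages is needed to agree on the order of the two writes.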
  • 45-49. Writes in Hermes (Broadcast + Invalidations + TS): 1. Decentralized: fully distributed write ordering at endpoints. 2. Fully concurrent: any replica can coordinate a write; writes to different objects proceed in parallel. 3. Fast: writes commit in 1 RTT; writes never abort. Awesome! But what about fault tolerance?
  • 50. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Solution: send write value with Invalidation à Early value propagation 60 Handling faults in Hermes
  • 51. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Solution: send write value with Invalidation à Early value propagation write(A=3) Coordinator Followers 61 Handling faults in Hermes
  • 52. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Solution: send write value with Invalidation à Early value propagation write(A=3) Coordinator Followers 62 Handling faults in Hermes Inv(TS) I I
  • 53. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Solution: send write value with Invalidation à Early value propagation write(A=3) Coordinator Followers 63 Handling faults in Hermes Inv(TS) Coordinator fails I I
  • 54. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Solution: send write value with Invalidation à Early value propagation write(A=3) Coordinator Followers 64 Handling faults in Hermes read(A) Inv(TS) Coordinator fails I I
  • 55. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Solution: send write value with Invalidation à Early value propagation write(A=3) Coordinator Followers 65 Handling faults in Hermes read(A) Inv(TS) Coordinator fails I I
  • 56. Handling faults in Hermes. Problem: a failure in the middle of a write can permanently leave a replica in Invalid state. Idea: allow any Invalidated replica to replay the write and unblock. Solution: send the write value with the Invalidation → early value propagation. Diagram labels: Coordinator, Followers; write(A=3); Inv(TS); I, I; Coordinator fails; read(A).
  • 57. Handling faults in Hermes. Problem: a failure in the middle of a write can permanently leave a replica in Invalid state. Idea: allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write, a replica needs (1) the write's original TS (for ordering) and (2) the write value. Solution: send the write value with the Invalidation → early value propagation. Diagram labels: Coordinator, Followers; write(A=3); Inv(TS); I, I; Coordinator fails; read(A).
  • 58. Handling faults in Hermes. Problem: a failure in the middle of a write can permanently leave a replica in Invalid state. Idea: allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write, a replica needs (1) the write's original TS (for ordering) and (2) the write value. The TS is sent with the Invalidation, but the write value is not. Solution: send the write value with the Invalidation → early value propagation. Diagram labels: Coordinator, Followers; write(A=3); Inv(TS); I, I; Coordinator fails; read(A).
  • 59. Handling faults in Hermes. Problem: a failure in the middle of a write can permanently leave a replica in Invalid state. Idea: allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write, a replica needs (1) the write's original TS (for ordering) and (2) the write value. The TS is sent with the Invalidation, but the write value is not. Solution: send the write value with the Invalidation → early value propagation. Diagram labels: Coordinator, Followers; write(A=3); Inv(3,TS); I, I; Coordinator fails.
  • 60. Handling faults in Hermes. Problem: a failure in the middle of a write can permanently leave a replica in Invalid state. Idea: allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write, a replica needs (1) the write's original TS (for ordering) and (2) the write value. The TS is sent with the Invalidation, but the write value is not. Solution: send the write value with the Invalidation → early value propagation. Diagram labels: Coordinator, Followers; write(A=3); Inv(3,TS); I, I; Coordinator fails; read(A).
  • 61. Handling faults in Hermes. Problem: a failure in the middle of a write can permanently leave a replica in Invalid state. Idea: allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write, a replica needs (1) the write's original TS (for ordering) and (2) the write value. The TS is sent with the Invalidation, but the write value is not. Solution: send the write value with the Invalidation → early value propagation. Diagram labels: Coordinator, Followers; write(A=3); Inv(3,TS); I, I; Coordinator fails; read(A); write replay; Inv(3,TS); completion; V, V.
  • 62. Handling faults in Hermes. Problem: a failure in the middle of a write can permanently leave a replica in Invalid state. Idea: allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write, a replica needs (1) the write's original TS (for ordering) and (2) the write value. The TS is sent with the Invalidation, but the write value is not. Solution: send the write value with the Invalidation → early value propagation. Diagram labels: Coordinator, Followers; write(A=3); Inv(3,TS); I, I; Coordinator fails; read(A); write replay; Inv(3,TS); completion; V, V. Early value propagation enables write replays.
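The write-replay idea above can be sketched as follows. This is a minimal, single-process illustration under stated assumptions, not the protocol's actual implementation: the class `Follower`, its methods, the `("ack", ts)` reply shape and the tuple-valued timestamp are all hypothetical, and failure detection and membership are left out (the set of live peers is passed in by hand).

```python
from dataclasses import dataclass


@dataclass
class Follower:
    name: str
    value: int = 0
    ts: tuple = (0, 0)      # assumed (version, coordinator id) pair
    state: str = "Valid"

    def on_inv(self, value, ts):
        """Inv carries both TS and value (early value propagation)."""
        if ts > self.ts:
            self.value, self.ts, self.state = value, ts, "Invalid"
        return ("ack", ts)

    def on_val(self, ts):
        if ts == self.ts:
            self.state = "Valid"

    def replay_write(self, live_peers):
        """Re-send the stored Inv with its ORIGINAL TS and value to live peers."""
        acks = [p.on_inv(self.value, self.ts) for p in live_peers]
        if all(a == ("ack", self.ts) for a in acks):
            # Every live replica now holds the write: complete it.
            self.state = "Valid"
            for p in live_peers:
                p.on_val(self.ts)

    def read(self, live_peers):
        """A read on an Invalid key triggers a replay instead of blocking forever."""
        if self.state == "Invalid":
            self.replay_write(live_peers)
        return self.value


# Scenario from the slides: the coordinator sends Inv(3, TS) for write(A=3)
# to both followers and then fails before completing the write.
f1, f2 = Follower("f1"), Follower("f2")
for f in (f1, f2):
    f.on_inv(3, (1, 0))

result = f1.read(live_peers=[f2])
print(result, f1.state, f2.state)
```

Because the replay reuses the original timestamp, a peer that already received the Invalidation treats it as a duplicate and just acknowledges it, while a peer that missed it applies it; either way the replicas end up agreeing on the same write.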
  • 63. Hermes recap: Broadcast + Invalidations + TS + early value propagation. Strong consistency: through CC-inspired Invalidations. Fault tolerance: write replays via early value propagation. High performance: local reads at all replicas; high-performance writes (fast, decentralized, fully distributed). Diagram labels: Coordinator, Followers; write(A=3); Inv(3,TS); commit; V, I, V, I, V.
  • 64. Hermes recap: Broadcast + Invalidations + TS + early value propagation. Strong consistency: through CC-inspired Invalidations. Fault tolerance: write replays via early value propagation. High performance: local reads at all replicas; high-performance writes (fast, decentralized, fully distributed). Diagram labels: Coordinator, Followers; write(A=3); Inv(3,TS); commit; V, I, V, I, V. In the paper: protocol details, RMWs, other goodies.
  • 65. Evaluation. State-of-the-art hardware testbed: 5 servers; 2x 10-core Intel Xeon E5-2630v4 per server; 56 Gb/s InfiniBand NICs. KVS workload: uniform access distribution; Million KV pairs: <8B keys, 32B values>. Evaluated protocols: ZAB, CRAQ, Hermes.
  • 66. Performance. [Figure: throughput in million requests/sec. Labels: high-perf. writes + local reads; conc. writes + local reads; local reads.]
  • 67. Performance. [Figure: throughput in million requests/sec. Labels: high-perf. writes + local reads; conc. writes + local reads; local reads. Annotations: 4x, 40%.]
  • 68. Performance. [Figure: throughput in million requests/sec. Labels: high-perf. writes + local reads; conc. writes + local reads; local reads. Annotations: 4x, 40%.] Write performance matters even at low write ratios.
  • 69. Performance. [Figure: throughput in million requests/sec. Labels: high-perf. writes + local reads; conc. writes + local reads; local reads. Annotations: 4x, 40%.] [Figure: write latency (normalized to Hermes). Label: 5% write ratio.] Write performance matters even at low write ratios.
  • 70. Performance. [Figure: throughput in million requests/sec. Labels: high-perf. writes + local reads; conc. writes + local reads; local reads. Annotations: 4x, 40%.] [Figure: write latency (normalized to Hermes). Label: 5% write ratio. Annotation: 6x.] Write performance matters even at low write ratios.
  • 71. Performance. [Figure: throughput in million requests/sec. Labels: high-perf. writes + local reads; conc. writes + local reads; local reads. Annotations: 4x, 40%.] [Figure: write latency (normalized to Hermes). Label: 5% write ratio. Annotation: 6x.] Write performance matters even at low write ratios. Hermes: highest throughput & lowest latency.
  • 72. Conclusion. Hermes: Broadcast + Invalidations + TS + early value propagation. Hermes-protocol.com. Code available. TLA+ verification. Q&A.
  • 73. Conclusion. Hermes: Broadcast + Invalidations + TS + early value propagation. Strong consistency. Fault tolerance via write replays. High performance: local reads from all replicas; high-performance writes (fast, decentralized, fully concurrent). Hermes-protocol.com. Code available. TLA+ verification. Q&A.
  • 75. Conclusion. Hermes: Broadcast + Invalidations + TS + early value propagation. Strong consistency. Fault tolerance via write replays. High performance: local reads from all replicas; high-performance writes (fast, decentralized, fully concurrent). Hermes-protocol.com. Code available. TLA+ verification. Q&A. Need reliability and performance? Choose Hermes!