SlideShare a Scribd company logo

Invalidation-Based Protocols for Replicated Datastores

PhD viva slides of Antonios Katsarakis. University of Edinburgh 2021.

1 of 53
Download to read offline
Invalidation-Based Protocols
for Replicated Datastores
Antonios Katsarakis
Doctor of Philosophy
T
H
E
U
N I V E R
S
I
T
Y
O
F
E
D I N B U
R
G
H
Data: in-memory, sharded across
servers within a datacenter (DC)
Offer a single-object read/write
or multi-object transactions API
Backbone of online services and
cloud applications
Must provide:
High performance
Fault tolerance
Distributed datastores
2
distributed datastore
Data: in-memory, sharded across
servers within a datacenter (DC)
Offer a single-object read/write
or multi-object transactions API
Backbone of online services and
cloud applications
Must provide:
High performance
Fault tolerance
Distributed datastores
2
distributed datastore
Data: in-memory, sharded across
servers within a datacenter (DC)
Offer a single-object read/write
or multi-object transactions API
Backbone of online services and
cloud applications
Must provide:
High performance
Fault tolerance
Distributed datastores
2
distributed datastore
Mandate data replication
Performance
a single node may not keep up with load
Fault tolerance
data remain available despite failures
Typically 3 to 7 replicas
Consistency
Weak: performance but nasty surprises
Strong: intuitive, broadest spectrum of apps
Replication protocols
- Strong consistency even under faults – if fault tolerant
- Define actions to execute reads/writes or transactions (txs)
à determine the datastore’s performance
Replication 101
3
…
… … …
replication protocol
Performance
a single node may not keep up with load
Fault tolerance
data remain available despite failures
Typically 3 to 7 replicas
Consistency
Weak: performance but nasty surprises
Strong: intuitive, broadest spectrum of apps
Replication protocols
- Strong consistency even under faults – if fault tolerant
- Define actions to execute reads/writes or transactions (txs)
à determine the datastore’s performance
Replication 101
3
…
… … …
replication protocol
Ad

Recommended

Hermes Reliable Replication Protocol - ASPLOS'20 Presentation
Hermes Reliable Replication Protocol -  ASPLOS'20 PresentationHermes Reliable Replication Protocol -  ASPLOS'20 Presentation
Hermes Reliable Replication Protocol - ASPLOS'20 PresentationAntonios Katsarakis
 
Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011sandeep_tata
 
Apache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateApache Hadoop 3.0 Community Update
Apache Hadoop 3.0 Community UpdateDataWorks Summit
 
Architectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop DistributionArchitectural Overview of MapR's Apache Hadoop Distribution
Architectural Overview of MapR's Apache Hadoop Distributionmcsrivas
 
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...
Технологии работы с дисковыми хранилищами и файловыми системами Windows Serve...Виталий Стародубцев
 
SD Big Data Monthly Meetup #4 - Session 2 - WANDisco
SD Big Data Monthly Meetup #4 - Session 2 - WANDiscoSD Big Data Monthly Meetup #4 - Session 2 - WANDisco
SD Big Data Monthly Meetup #4 - Session 2 - WANDiscoBig Data Joe™ Rossi
 
The Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for BeginnersThe Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for BeginnersEdelweiss Kammermann
 
Storage Primer
Storage PrimerStorage Primer
Storage Primersriramr
 

More Related Content

Similar to Invalidation-Based Protocols for Replicated Datastores

Knowledge share about scalable application architecture
Knowledge share about scalable application architectureKnowledge share about scalable application architecture
Knowledge share about scalable application architectureAHM Pervej Kabir
 
What a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdfWhat a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdfAerospike, Inc.
 
Distributed Systems: scalability and high availability
Distributed Systems: scalability and high availabilityDistributed Systems: scalability and high availability
Distributed Systems: scalability and high availabilityRenato Lucindo
 
Sharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual MachinesSharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual Machinesinside-BigData.com
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Chris Nauroth
 
Socket programming with php
Socket programming with phpSocket programming with php
Socket programming with phpElizabeth Smith
 
HTTP at your local BigCo
HTTP at your local BigCoHTTP at your local BigCo
HTTP at your local BigCopgriess
 
Data center disaster recovery.ppt
Data center disaster recovery.ppt Data center disaster recovery.ppt
Data center disaster recovery.ppt omalreda
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
[B4]deview 2012-hdfs
[B4]deview 2012-hdfs[B4]deview 2012-hdfs
[B4]deview 2012-hdfsNAVER D2
 
Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions Ceph Community
 
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCCal Henderson
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityConSanFrancisco123
 
Network and distributed systems
Network and distributed systemsNetwork and distributed systems
Network and distributed systemsSri Prasanna
 
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...VMworld
 
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...Clustrix
 
Creating customized openSUSE versions with SUSE Studio
Creating customized openSUSE versions with SUSE StudioCreating customized openSUSE versions with SUSE Studio
Creating customized openSUSE versions with SUSE Studioelliando dias
 
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711Dave Anselmi
 
MNPHP Scalable Architecture 101 - Feb 3 2011
MNPHP Scalable Architecture 101 - Feb 3 2011MNPHP Scalable Architecture 101 - Feb 3 2011
MNPHP Scalable Architecture 101 - Feb 3 2011Mike Willbanks
 

Similar to Invalidation-Based Protocols for Replicated Datastores (20)

Knowledge share about scalable application architecture
Knowledge share about scalable application architectureKnowledge share about scalable application architecture
Knowledge share about scalable application architecture
 
What a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdfWhat a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdf
 
Distributed Systems: scalability and high availability
Distributed Systems: scalability and high availabilityDistributed Systems: scalability and high availability
Distributed Systems: scalability and high availability
 
Sharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual MachinesSharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual Machines
 
Kafka & Hadoop in Rakuten
Kafka & Hadoop in RakutenKafka & Hadoop in Rakuten
Kafka & Hadoop in Rakuten
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
 
Socket programming with php
Socket programming with phpSocket programming with php
Socket programming with php
 
HTTP at your local BigCo
HTTP at your local BigCoHTTP at your local BigCo
HTTP at your local BigCo
 
Data center disaster recovery.ppt
Data center disaster recovery.ppt Data center disaster recovery.ppt
Data center disaster recovery.ppt
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
[B4]deview 2012-hdfs
[B4]deview 2012-hdfs[B4]deview 2012-hdfs
[B4]deview 2012-hdfs
 
Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions
 
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYCScalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
Scalable Web Architectures: Common Patterns and Approaches - Web 2.0 Expo NYC
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And Availability
 
Network and distributed systems
Network and distributed systemsNetwork and distributed systems
Network and distributed systems
 
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
VMworld 2015: The Future of Software- Defined Storage- What Does it Look Like...
 
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
 
Creating customized openSUSE versions with SUSE Studio
Creating customized openSUSE versions with SUSE StudioCreating customized openSUSE versions with SUSE Studio
Creating customized openSUSE versions with SUSE Studio
 
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
Scaling RDBMS on AWS- ClustrixDB @AWS Meetup 20160711
 
MNPHP Scalable Architecture 101 - Feb 3 2011
MNPHP Scalable Architecture 101 - Feb 3 2011MNPHP Scalable Architecture 101 - Feb 3 2011
MNPHP Scalable Architecture 101 - Feb 3 2011
 

More from Antonios Katsarakis

Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]
Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]
Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]Antonios Katsarakis
 
Hermes Reliable Replication Protocol - Poster
Hermes Reliable Replication Protocol - Poster Hermes Reliable Replication Protocol - Poster
Hermes Reliable Replication Protocol - Poster Antonios Katsarakis
 
Distributed Processing Frameworks
Distributed Processing FrameworksDistributed Processing Frameworks
Distributed Processing FrameworksAntonios Katsarakis
 
Spark Overview and Performance Issues
Spark Overview and Performance IssuesSpark Overview and Performance Issues
Spark Overview and Performance IssuesAntonios Katsarakis
 

More from Antonios Katsarakis (6)

Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]
Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]
Zeus: Locality-aware Distributed Transactions [Eurosys '21 presentation]
 
Hermes Reliable Replication Protocol - Poster
Hermes Reliable Replication Protocol - Poster Hermes Reliable Replication Protocol - Poster
Hermes Reliable Replication Protocol - Poster
 
Scale-out ccNUMA - Eurosys'18
Scale-out ccNUMA - Eurosys'18Scale-out ccNUMA - Eurosys'18
Scale-out ccNUMA - Eurosys'18
 
Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)
 
Distributed Processing Frameworks
Distributed Processing FrameworksDistributed Processing Frameworks
Distributed Processing Frameworks
 
Spark Overview and Performance Issues
Spark Overview and Performance IssuesSpark Overview and Performance Issues
Spark Overview and Performance Issues
 

Recently uploaded

21ST CENTURY LITERACY FROM TRADITIONAL TO MODERN
21ST CENTURY LITERACY FROM TRADITIONAL TO MODERN21ST CENTURY LITERACY FROM TRADITIONAL TO MODERN
21ST CENTURY LITERACY FROM TRADITIONAL TO MODERNRonnelBaroc
 
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro KozhevinFwdays
 
Enterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewEnterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewAshraf Fouad
 
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, GoogleISPMAIndia
 
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish GuptaBuilding Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish GuptaISPMAIndia
 
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...htrindia
 
10 things that helped me advance my career - PHP UK Conference 2024
10 things that helped me advance my career - PHP UK Conference 202410 things that helped me advance my career - PHP UK Conference 2024
10 things that helped me advance my career - PHP UK Conference 2024Thijs Feryn
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxInfosec
 
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Umar Saif
 
"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys Vasyliev"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys VasylievFwdays
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVARobert McDermott
 
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Product School
 
Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...Product School
 
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Product School
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...Neo4j
 
How to write an effective Cyber Incident Response Plan
How to write an effective Cyber Incident Response PlanHow to write an effective Cyber Incident Response Plan
How to write an effective Cyber Incident Response PlanDatabarracks
 
Imaging and Design for the Online Environment Part 1.pptx
Imaging and Design for the Online Environment Part 1.pptxImaging and Design for the Online Environment Part 1.pptx
Imaging and Design for the Online Environment Part 1.pptxPower Point
 
Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...
Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...
Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...Adrian Sanabria
 
AI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvementAI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvementMimmo Squillace
 
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...UiPathCommunity
 

Recently uploaded (20)

21ST CENTURY LITERACY FROM TRADITIONAL TO MODERN
21ST CENTURY LITERACY FROM TRADITIONAL TO MODERN21ST CENTURY LITERACY FROM TRADITIONAL TO MODERN
21ST CENTURY LITERACY FROM TRADITIONAL TO MODERN
 
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
"DevOps Practisting Platform on EKS with Karpenter autoscaling", Dmytro Kozhevin
 
Enterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewEnterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book Review
 
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
"The Transformative Power of AI and Open Challenges" by Dr. Manish Gupta, Google
 
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish GuptaBuilding Products That Think- Bhaskaran Srinivasan & Ashish Gupta
Building Products That Think- Bhaskaran Srinivasan & Ashish Gupta
 
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
 
10 things that helped me advance my career - PHP UK Conference 2024
10 things that helped me advance my career - PHP UK Conference 202410 things that helped me advance my career - PHP UK Conference 2024
10 things that helped me advance my career - PHP UK Conference 2024
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptx
 
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
 
"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys Vasyliev"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys Vasyliev
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
 
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
Cultivating Entrepreneurial Mindset in Product Management: Strategies for Suc...
 
Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...
 
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
Harnessing the Power of GenAI for Exceptional Product Outcomes by Booking.com...
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
 
How to write an effective Cyber Incident Response Plan
How to write an effective Cyber Incident Response PlanHow to write an effective Cyber Incident Response Plan
How to write an effective Cyber Incident Response Plan
 
Imaging and Design for the Online Environment Part 1.pptx
Imaging and Design for the Online Environment Part 1.pptxImaging and Design for the Online Environment Part 1.pptx
Imaging and Design for the Online Environment Part 1.pptx
 
Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...
Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...
Avoiding Bad Stats and the Benefits of Playing Trivia with Friends: PancakesC...
 
AI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvementAI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvement
 
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
 

Invalidation-Based Protocols for Replicated Datastores

  • 1. Invalidation-Based Protocols for Replicated Datastores Antonios Katsarakis Doctor of Philosophy T H E U N I V E R S I T Y O F E D I N B U R G H
  • 2. Data: in-memory, sharded across servers within a datacenter (DC) Offer a single-object read/write or multi-object transactions API Backbone of online services and cloud applications Must provide: High performance Fault tolerance Distributed datastores 2 distributed datastore
  • 3. Data: in-memory, sharded across servers within a datacenter (DC) Offer a single-object read/write or multi-object transactions API Backbone of online services and cloud applications Must provide: High performance Fault tolerance Distributed datastores 2 distributed datastore
  • 4. Data: in-memory, sharded across servers within a datacenter (DC) Offer a single-object read/write or multi-object transactions API Backbone of online services and cloud applications Must provide: High performance Fault tolerance Distributed datastores 2 distributed datastore Mandate data replication
  • 5. Performance a single node may not keep up with load Fault tolerance data remain available despite failures Typically 3 to 7 replicas Consistency Weak: performance but nasty surprises Strong: intuitive, broadest spectrum of apps Replication protocols - Strong consistency even under faults – if fault tolerant - Define actions to execute reads/writes or transactions (txs) à determine the datastore’s performance Replication 101 3 … … … … replication protocol
  • 6. Performance a single node may not keep up with load Fault tolerance data remain available despite failures Typically 3 to 7 replicas Consistency Weak: performance but nasty surprises Strong: intuitive, broadest spectrum of apps Replication protocols - Strong consistency even under faults – if fault tolerant - Define actions to execute reads/writes or transactions (txs) à determine the datastore’s performance Replication 101 3 … … … … replication protocol
  • 7. Performance a single node may not keep up with load Fault tolerance data remain available despite failures Typically 3 to 7 replicas Consistency Weak: performance but nasty surprises Strong: intuitive, broadest spectrum of apps Replication protocols - Strong consistency even under faults – if fault tolerant - Define actions to execute reads/writes or transactions (txs) à determine the datastore’s performance Replication 101 3 Can strongly consistent protocols offer fault tolerance and high performance? … … … … replication protocol
  • 8. Multiprocessor: coherence / HTM data replicated across caches Fault tolerance Performance via Invalidations (low-latency interconnect) - reads/writes: concurrency & speed - txs: fully exploit locality Strongly consistent replication 4 Datastores: replication protocols data replicated across nodes Fault tolerance Performance - reads/writes: sacrifice concurrency or speed - txs: cannot exploit locality Multiprocessor: coherence / HTM data replicated across caches Fault tolerance Performance via Invalidations (low-latency interconnect) - reads/writes: concurrency & speed - txs: fully exploit locality
  • 9. Multiprocessor: coherence / HTM data replicated across caches Fault tolerance Performance via Invalidations (low-latency interconnect) - reads/writes: concurrency & speed - txs: fully exploit locality Replication protocols inside a DC - Network: fast, remote direct memory access (RDMA) - Faults are rare within a replica group A server fails at most twice a year fault-free operation >> operation under faults Strongly consistent replication 4 Datastores: replication protocols data replicated across nodes Fault tolerance Performance - reads/writes: sacrifice concurrency or speed - txs: cannot exploit locality Multiprocessor: coherence / HTM data replicated across caches Fault tolerance Performance via Invalidations (low-latency interconnect) - reads/writes: concurrency & speed - txs: fully exploit locality
  • 10. Multiprocessor: coherence / HTM data replicated across caches Fault tolerance Performance via Invalidations (low-latency interconnect) - reads/writes: concurrency & speed - txs: fully exploit locality Replication protocols inside a DC - Network: fast, remote direct memory access (RDMA) - Faults are rare within a replica group A server fails at most twice a year fault-free operation >> operation under faults Strongly consistent replication 4 Datastores: replication protocols data replicated across nodes Fault tolerance Performance - reads/writes: sacrifice concurrency or speed - txs: cannot exploit locality Multiprocessor: coherence / HTM data replicated across caches Fault tolerance Performance via Invalidations (low-latency interconnect) - reads/writes: concurrency & speed - txs: fully exploit locality The common operation of replication protocols resembles the multiprocessor!
  • 11. Thesis overview 5 Adapting multiprocessor-inspired invalidating protocols to intra-DC replicated datastores enables: strong consistency, fault tolerance, high performance Primary contributions 4 invalidating protocols à 3 most common replication uses in datastores 1-slide summary N-slides 1-slide summary Scale-out ccNUMA [Eurosys’18] Galene protocol Performant read/write replication (skew) Zeus [Eurosys’21] Zeus ownership, Zeus reliable commit Replicated fault-tolerant distributed txs Hermes [ASPLOS’20] Hermes protocol Fast fault-tolerant read/write replication
  • 12. Performant read/write replication for skew 12 Many workloads exhibit skewed data accesses a few servers are overloaded, most are underutilized State-of-the-art skew mitigation distributes accesses across all servers & uses RDMA No locality: most requests need remote access à increased latency, bottlenecked by network b/w Symmetric caching all servers have a cache same hottest objects Throughput scales with numbers of servers Less network b/w: most requests served locally Challenge: efficiently keep caches consistent Existing protocols serialize writes @ physical point = hotspot Galene protocol invalidations + logical timestamps = fully distributed writes RDMA
  • 13. Performant read/write replication for skew 13 Many workloads exhibit skewed data accesses a few servers are overloaded, most are underutilized State-of-the-art skew mitigation distributes accesses across all servers & uses RDMA No locality: most requests need remote access à increased latency, bottlenecked by network b/w Symmetric caching all servers have a cache same hottest objects Throughput scales with numbers of servers Less network b/w: most requests served locally Challenge: efficiently keep caches consistent Existing protocols serialize writes @ physical point = hotspot Galene protocol invalidations + logical timestamps = fully distributed writes RDMA RDMA Symmetric caching and Galene
  • 14. Performant read/write replication for skew 14 Many workloads exhibit skewed data accesses a few servers are overloaded, most are underutilized State-of-the-art skew mitigation distributes accesses across all servers & uses RDMA No locality: most requests need remote access à increased latency, bottlenecked by network b/w Symmetric caching all servers have a cache same hottest objects Throughput scales with numbers of servers Less network b/w: most requests served locally Challenge: efficiently keep caches consistent Existing protocols serialize writes @ physical point = hotspot Galene protocol invalidations + logical timestamps = fully distributed writes RDMA RDMA Symmetric caching and Galene 100s millions ops/sec & up to 3x state-of-the-art!
  • 15. 15 Hmmm … Invalidating protocols good read/write performance when replicating under skew can maintain high read/write performance while providing fault tolerance? reliable = strongly consistent + fault tolerant 2nd primary contribution: Hermes!
  • 16. 16 Hmmm … Invalidating protocols good read/write performance when replicating under skew can maintain high read/write performance while providing fault tolerance? reliable = strongly consistent + fault tolerant 2nd primary contribution: Hermes! What is the issue of existing reliable protocols?
  • 17. Golden standard strong consistency and fault tolerance Low performance reads à inter-replica communication writes à multiple RTTs over the network Common-case performance (i.e., no faults) as bad as worst-case (under faults) 17 Paxos
  • 18. Golden standard strong consistency and fault tolerance Low performance reads à inter-replica communication writes à multiple RTTs over the network Common-case performance (i.e., no faults) as bad as worst-case (under faults) 18 Paxos State-of-the-art replication protocols exploit failure-free operation for performance
  • 19. 11 Performance of state-of-the-art protocols Leader ZAB replicas
  • 20. 20 Performance of state-of-the-art protocols Leader ZAB Leader Writes serialize on the leader à Low throughput Head Tail CRAQ Head Tail Writes traverse length of the chain à High latency write read bcast ucast Local reads form all replicas à Fast Local reads form all replicas à Fast
  • 21. 21 Performance of state-of-the-art protocols Leader ZAB Leader Writes serialize on the leader à Low throughput Head Tail CRAQ Head Tail Writes traverse length of the chain à High latency write read bcast ucast Fast reads but poor write performance Local reads form all replicas à Fast Local reads form all replicas à Fast
  • 22. 13 Goal: low latency + high throughput Reads Local from all replicas Writes Fast - Minimize network hops Decentralized - No serialization points Fully concurrent - Any replica can service a write Key protocol features for high performance Local reads from all replicas Head Tail Avoid long latencies
  • 23. 23 Goal: low-latency + high-throughput Reads Local from all replicas Writes Fast - Minimize network hops Decentralized - No serialization points Fully concurrent - Any replica can service Leader Avoid write serialization Key protocol features for high performance Local reads from all replicas
  • 24. 24 Goal: low-latency + high-throughput Reads Local from all replicas Writes Fast - Minimize network hops Decentralized - No serialization points Fully concurrent - Any replica can service a write Key protocol features for high performance Local reads from all replicas Fast, decentralized, fully concurrent writes
  • 25. 25 Goal: low-latency + high-throughput Reads Local from all replicas Writes Fast - Minimize network hops Decentralized - No serialization points Fully concurrent - Any replica can service a write Key protocol features for high performance Local reads from all replicas Fast, decentralized, fully concurrent writes Existing replication protocols are deficient
  • 26. Broadcast-based, invalidating replication protocol Inspired by multiprocessor cache-coherence protocols Fault-free operation: 1. Coordinator broadcasts Invalidations - Coordinator is a replica servicing a write Enter Hermes 26 States of A: Valid, Invalid write(A=3) Coordinator Followers I Invalidation I
  • 27. Broadcast-based, invalidating replication protocol Inspired by multiprocessor cache-coherence protocols Fault-free operation: 1. Coordinator broadcasts Invalidations - Coordinator is a replica servicing a write Enter Hermes 27 States of A: Valid, Invalid write(A=3) Coordinator Followers At this point, no stale reads can be served Strong consistency! I Invalidation I
  • 28. Broadcast-based, invalidating replication protocol Inspired by multiprocessor cache-coherence protocols Fault-free operation: 1. Coordinator broadcasts Invalidations 2. Followers Acknowledge invalidation 3. Coordinator broadcasts Validations - All replicas can now serve reads for this object Strongest consistency Linearizability Local reads from all replicas à valid objects = latest value Enter Hermes 28 States of A: Valid, Invalid write(A=3) Coordinator Followers V Validation V Ack Ack I Invalidation I V commit
  • 29. Broadcast-based, invalidating replication protocol Inspired by multiprocessor cache-coherence protocols Fault-free operation: 1. Coordinator broadcasts Invalidations 2. Followers Acknowledge invalidation 3. Coordinator broadcasts Validations - All replicas can now serve reads for this object Strongest consistency Linearizability Local reads from all replicas à valid objects = latest value Enter Hermes 29 States of A: Valid, Invalid write(A=3) Coordinator Followers What about concurrent writes? V Validation V Ack Ack I Invalidation I V commit
  • 30. Challenge How to efficiently order concurrent writes to an object? Solution Store a logical timestamp (TS) along with each object - Upon a write: coordinator increments TS and sends it with Invalidations - Upon receiving Invalidation: a follower updates the object’s TS - When two writes to the same object race: use node ID to order them Concurrent writes = challenge 30 write(A=3) write(A=1) Inv(TS1) Inv(TS4)
  • 31. Challenge How to efficiently order concurrent writes to an object? Solution Store a logical timestamp (TS) along with each object - Upon a write: coordinator increments TS and sends it with Invalidations - Upon receiving Invalidation: a follower updates the object’s TS - When two writes to the same object race: use node ID to order them Concurrent writes = challenge 31 write(A=3) write(A=1) Inv(TS1) Inv(TS4) Broadcast + Invalidations + TS à high performance writes
  • 32. 1. Decentralized Fully distributed write ordering at endpoints 2. Fully concurrent Any replica can coordinate a write Writes to different objects proceed in parallel 3. Fast Commit in 1 RTT Never abort Writes in Hermes 32 Broadcast + Invalidations + TS
  • 33. 1. Decentralized Fully distributed write ordering at endpoints 2. Fully concurrent Any replica can coordinate a write Writes to different objects proceed in parallel 3. Fast Commit in 1 RTT Never abort Writes in Hermes 33 Awesome! But what about fault tolerance? Broadcast + Invalidations + TS
  • 34. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Idea Allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write need - Write’s original TS (for ordering) - Write value TS sent with Invalidation, but write value is not Solution: send write value with Invalidation à Early value propagation write(A=3) Coordinator Followers 34 Handling faults in Hermes read(A) Inv(TS) Coordinator fails I I
  • 35. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Idea Allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write need - Write’s original TS (for ordering) - Write value TS sent with Invalidation, but write value is not Solution: send write value with Invalidation à early value propagation Handling faults in Hermes 35 Inv(3,TS) write(A=3) read(A) Coordinator fails I I Coordinator Followers
  • 36. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Idea Allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write need - Write’s original TS (for ordering) - Write value TS sent with Invalidation, but write value is not Solution: send write value with Invalidation à early value propagation V V Inv(3,TS) completion write replay read(A) Handling faults in Hermes 36 Inv(3,TS) write(A=3) Coordinator fails I I Coordinator Followers
  • 37. Problem A failure in the middle of a write can permanently leave a replica in Invalid state Idea Allow any Invalidated replica to replay the write and unblock. How? Insight: to replay a write need - Write’s original TS (for ordering) - Write value TS sent with Invalidation, but write value is not Solution: send write value with Invalidation à early value propagation V V Inv(3,TS) completion write replay read(A) Handling faults in Hermes 37 Inv(3,TS) write(A=3) Early value propagation enables write replays Coordinator fails I I Coordinator Followers
  • 38. Evaluation 38 Evaluated protocols: - ZAB - CRAQ - Hermes State-of-the-art hardware testbed - 5 servers - 56 Gb/s InfiniBand NICs - 2x 10 core Intel Xeon E5-2630v4 per server KVS Workload - Uniform access distribution - Million key-value pairs: <8B keys, 32B values>
  • 39. Performance 39 Throughput high-perf. writes + local reads conc. writes + local reads local reads 4x 40% 5% Write Ratio Write Latency (normalized to Hermes) Million requests / sec Write performance matters even at low write ratios 6x % Write Ratio
  • 40. Performance 40 Throughput high-perf. writes + local reads conc. writes + local reads local reads 4x 40% 5% Write Ratio Write Latency (normalized to Hermes) Million requests / sec Write performance matters even at low write ratios 6x Hermes: highest throughput & lowest latency % Write Ratio
  • 41. Strong Consistency through multiprocessor-inspired Invalidations Fault-tolerance write replays via early value propagation High Performance Local reads at all replicas High performance writes Fast Decentralized Fully concurrent Hermes recap 41 V I write(A=3) commit Coordinator Followers Inv(3,TS) V I V Broadcast + Invalidations + TS + early value propagation
  • 42. Strong Consistency through multiprocessor-inspired Invalidations Fault-tolerance write replays via early value propagation High Performance Local reads at all replicas High performance writes Fast Decentralized Fully concurrent Hermes recap 42 V I write(A=3) commit Coordinator Followers Inv(3,TS) V I V Broadcast + Invalidations + TS + early value propagation What about reliable txs? … 3rd primary contribution (1-slide)!
  • 43. Reliable replicated transactions 43 Many tx workloads exhibit locality in accesses State-of-the-art datastores rely on static sharding Reliable txs regardless of access pattern Objects randomly sharded on fixed nodes - remote accesses to execute - expensive distributed commit Zeus – locality-aware reliable txs: Each object: node owner = data + excl. write access changes dynamically Coordinator: becomes owner of all tx’s objects à single node commit Ownership stays with coordinator à future tx = local accesses Reliable ownership (1.5 RTT) alters replica placement, access levels Reliable commit - read-only txs: local from all replicas - fast write txs: pipelined, 1 RTT to commit distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] Adapted from FaSST [OSDI’16]
  • 44. Reliable replicated transactions 44 Many tx workloads exhibit locality in accesses State-of-the-art datastores rely on static sharding Reliable txs regardless of access pattern Objects randomly sharded on fixed nodes - remote accesses to execute - expensive distributed commit Zeus – locality-aware reliable txs: Each object: node owner = data + excl. write access changes dynamically Coordinator: becomes owner of all tx’s objects à single node commit Ownership stays with coordinator à future tx = local accesses Reliable ownership (1.5 RTT) alters replica placement, access levels Reliable commit - read-only txs: local from all replicas - fast write txs: pipelined, 1 RTT to commit distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] Adapted from FaSST [OSDI’16] costly txs, cannot exploit locality
  • 45. Reliable replicated transactions 45 Many tx workloads exhibit locality in accesses State-of-the-art datastores rely on static sharding Reliable txs regardless of access pattern Objects randomly sharded on fixed nodes - remote accesses to execute - expensive distributed commit Zeus – locality-aware reliable txs: Each object: node owner = data + excl. write access changes dynamically Coordinator: becomes owner of all tx’s objects à single node commit Ownership stays with coordinator à future tx = local accesses Reliable ownership (1.5 RTT) alters replica placement, access levels Reliable commit - read-only txs: local from all replicas - fast write txs: pipelined, 1 RTT to commit distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] Adapted from FaSST [OSDI’16] costly txs, cannot exploit locality
  • 46. Reliable replicated transactions 46 Many tx workloads exhibit locality in accesses State-of-the-art datastores rely on static sharding Reliable txs regardless of access pattern Objects randomly sharded on fixed nodes - remote accesses to execute - expensive distributed commit Zeus – locality-aware reliable txs: Each object: node owner = data + excl. write access changes dynamically Coordinator: becomes owner of all tx’s objects à single node commit Ownership stays with coordinator à future tx = local accesses Reliable ownership (1.5 RTT) alters replica placement, access levels Reliable commit - read-only txs: local from all replicas - fast write txs: pipelined, 1 RTT to commit distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] Adapted from FaSST [OSDI’16] costly txs, cannot exploit locality
  • 47. Reliable replicated transactions 47 Many tx workloads exhibit locality in accesses State-of-the-art datastores rely on static sharding Reliable txs regardless of access pattern Objects randomly sharded on fixed nodes - remote accesses to execute - expensive distributed commit Zeus – locality-aware reliable txs: Each object: node owner = data + excl. write access changes dynamically Coordinator: becomes owner of all tx’s objects à single node commit Ownership stays with coordinator à future tx = local accesses Reliable ownership (1.5 RTT) alters replica placement, access levels Reliable commit - read-only txs: local from all replicas - fast write txs: pipelined, 1 RTT to commit distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] Adapted from FaSST [OSDI’16] costly txs, cannot exploit locality
  • 48. Reliable replicated transactions 48 Many tx workloads exhibit locality in accesses State-of-the-art datastores rely on static sharding Reliable txs regardless of access pattern Objects randomly sharded on fixed nodes - remote accesses to execute - expensive distributed commit Zeus – locality-aware reliable txs: Each object: node owner = data + excl. write access changes dynamically Coordinator: becomes owner of all tx’s objects à single node commit Ownership stays with coordinator à future tx = local accesses Reliable ownership (1.5 RTT) alters replica placement, access levels Reliable commit - read-only txs: local from all replicas - fast write txs: pipelined, 1 RTT to commit 10s millions txs/sec & up to 2x state-of-the-art! distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] Adapted from FaSST [OSDI’16] costly txs, cannot exploit locality
  • 49. Reliable replicated transactions 49 Many tx workloads exhibit locality in accesses State-of-the-art datastores rely on static sharding Reliable txs regardless of access pattern Objects randomly sharded on fixed nodes - remote accesses to execute - expensive distributed commit Zeus – locality-aware reliable txs: Each object: node owner = data + excl. write access changes dynamically Coordinator: becomes owner of all tx’s objects à single node commit Ownership stays with coordinator à future tx = local accesses Reliable ownership (1.5 RTT) alters replica placement, access levels Reliable commit - read-only txs: local from all replicas - fast write txs: pipelined, 1 RTT to commit 10s millions txs/sec & up to 2x state-of-the-art! distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] distributed commit 1. tx: if (p) b++; remote accesses Adapted from FaSST [OSDI’16] Adapted from FaSST [OSDI’16] costly txs, cannot exploit locality Two Invalidating protocols!
  • 50. Thesis summary 50 Replicated datastores powered by multiprocessor-inspired invalidating protocols can deliver: strong consistency, fault tolerance, high performance 4 invalidating protocols à 3 most common replication uses in datastores - High performance (10s–100s M ops / sec) - Strong consistency under concurrency & faults (formally verified in TLA+) Scale-out ccNUMA [Eurosys’18] Hermes [ASPLOS’20] Zeus [Eurosys’21] Galene protocol Hermes protocol Zeus ownership, Zeus reliable commit Performant read/write replication for skew Fast reliable read/write replication Locality-aware reliable txs with dynamic sharding
  • 51. Thesis summary 51 Replicated datastores powered by multiprocessor-inspired invalidating protocols can deliver: strong consistency, fault tolerance, high performance 4 invalidating protocols à 3 most common replication uses in datastores - High performance (10s–100s M ops / sec) - Strong consistency under concurrency & faults (formally verified in TLA+) Scale-out ccNUMA [Eurosys’18] Hermes [ASPLOS’20] Zeus [Eurosys’21] Galene protocol Hermes protocol Zeus ownership, Zeus reliable commit Performant read/write replication for skew Fast reliable read/write replication Locality-aware reliable txs with dynamic sharding Is this the end ??
  • 52. Follow up research 52 • The L2AW theorem [to be submitted] • Hardware offloading • Replication across datacenters • Single-shot reliable writes from external clients • Non-blocking reconfiguration on node crashes …
  • 53. Follow up research 53 • The L2AW theorem [to be submitted] • Hardware offloading • Replication across datacenters • Single-shot reliable writes from external clients • Non-blocking reconfiguration on node crashes … Thank you! Questions?