SlideShare a Scribd company logo
Dr. Rebecca Bilbro August 2021
Beyond
Off-the-Shelf
Consensus
02
03
01
I have no idea 😳
Eventual consistency is good enough for us.
We require strong consistency.
Question #1
Think about the app you’re
currently working on.
How much consistency does
it require?
02
03
01
We have total visibility and control over the region(s)
where user data is stored and replicated.
We added that cookie banner; wasn’t that enough?
It’s on our roadmap, but we haven’t tackled it yet.
Question #2
Think about the app you’re
currently working on.
How concerned are you
about compliance with CCPA,
GDPR, LGPD, etc. ?
02
03
01
I think our frontend uses gettext and
CLDR.
We track geographic deployments and data replication
to guarantee consistent UX around the world.
We haven’t started thinking about global
markets yet.
Question #3
Think about the app you’re
currently working on.
How well does your app
support i18n/l10n?
02
01
04
Commercial Consensus
What is Consensus?
Clocks, quorums, and
distracted parliamentarians
etcd, Spanner, Aurora, and more
Growing Pains
The downsides of success
03
An API for Consensus
Consensus made open source
Table of contents
Dr. Rebecca Bilbro
● Founder & CTO, Rotational Labs, LLC
● Adjunct Faculty, Georgetown University
● Applied Text Analysis with Python, O’Reilly
● Creator & Maintainer, Scikit- Yellowbrick
@rebeccabilbro
What if data systems were
a little smarter?
rotational.io
What is Consensus?
01
Clocks, quorums, and distracted parliamentarians
Context: A Single Server
Picture a simple data storage
system where clients can:
- GET(key) → value
- PUT(key, value)
- DEL(key)
Context: A Single Server
The order of the operations
determines the responses the
system gives.
In a single server system, the
server can determine any
order it wants.
It is always consistent.
1 2
3
4
5
6
Consistency
The system responds to requests
predictably
Failure: A Single Server
Failure can occur for a lot of reasons,
from crashes to network outages and
is a routine.
In the single server system, when the
server fails the entire system
becomes unavailable.
Worse, data loss might occur for any
information stored on volatile
memory.
Context: A Distributed System
A system is distributed when it
contains more than one server that
must communicate.
If one server in the system fails, it
doesn’t necessarily become
unavailable because other servers
can answer requests.
If data is replicated we can also
avoid data loss.
Inconsistency
When multiple servers participate
in the system, they need to
communicate in order to ensure
they remain in the same state.
Communication takes time (latency)
and the more servers in the system,
the more time it takes to
synchronize.
PUT(x, 42)
ok
GET(x)
not found
Concurrency
Because of delays in
communication, it is possible for
two clients to perform operations
concurrently. In other words, from
the system’s perspective, they
happen at the same time.
The order of these operations
determines the response.
PUT(x, 42)
PUT(x, 27)
GET(x) → ?
Consistency
Levels
Strong Consistency: any GET request is guaranteed
to return the most recent PUT request.
Causal Consistency: any PUT request will only be
delayed by dependent keys, not all keys.
Eventual Consistency: in the absence of PUT
requests, the system will eventually become
consistent.
Monotonic Reads: every GET returns a value that is
more up to date.
Synchronization and ordering increase
the amount of time it takes to make
requests, however consistency can be
relaxed to improve performance.
Consistency
Levels
Strong Consistency: any GET request is guaranteed
to return the most recent PUT request.
Causal Consistency: any PUT request will only be
delayed by dependent keys, not all keys.
Eventual Consistency: in the absence of PUT
requests, the system will eventually become
consistent.
Monotonic Reads: every GET returns a value that is
more up to date.
Most data systems that exist today are
eventually consistent, preferring
performance over a low likelihood of
inconsistency.
Consistency
Levels
Strong Consistency: any GET request is guaranteed
to return the most recent PUT request.
Causal Consistency: any PUT request will only be
delayed by dependent keys, not all keys.
Eventual Consistency: in the absence of PUT
requests, the system will eventually become
consistent.
Monotonic Reads: every GET returns a value that is
more up to date.
However, many applications require
strong consistency in addition to
performance but have to rely on
centralized systems.
Distributed
Consensus
Fault Tolerant Decision Making for
a Network of Replicas.
?
Paxos Consensus
Every server is a state machine that can
apply commands in a single order.
Each server maintains a log of the
operations (e.g. GET or PUT), and
applies those operations when the
entry in the log is committed.
Committing entries happens by
majority vote as follows.
Paxos Consensus
When a client makes a request, the
server requests a slot in the log to apply
a command, the “prepare” phase.
Paxos Consensus
When a client makes a request, the
server requests a slot in the log to apply
a command, the “prepare” phase.
If the other servers have that spot free,
they will reserve the slot for the
requesting server.
Paxos Consensus
If a majority of servers responds to the
prepare phase, the originating server,
will send the command to be applied to
the log in that spot, the accept phase.
Paxos Consensus
If a majority of servers responds to the
prepare phase, the originating server,
will send the command to be applied to
the log in that spot, the accept phase.
If a majority of servers reply to the
accept phase, the entry in the log is
committed.
Paxos Consensus
Even if servers fail, so long as a majority
of servers are still running, decisions
can be made.
Once the server returns, it can be
brought back up to date by the other
servers.
Rules for responding to the prepare
and accept phases ensure that there
will only ever be one log order.
Leader Optimization: Raft & Multi-Paxos
An optimization where the prepare
phase is performed once, ahead of
time by electing a designated leader.
Heartbeats are used to determine if the
leader has died and a new leader must
be elected.
This results in faster responses to
clients, but small outages during
elections.
Ballot Optimization: Mencius
To avoid a prepare phase, slots are
granted to leaders in a predetermined
fashion (e.g. round-robin).
Clients tend to broadcast to multiple
leaders in order to apply the command
to the next available slot.
Good for dense workloads with
continuous accesses. Compaction and
forwarding also help manage “empty”
log slots.
Optimistic Consensus: Fast Paxos, ePaxos
Attempt to commit a command in the
first phase, known as the “fast path”.
During the fast path commit, conflict
detection is applied.
If a conflict is detected, then perform
regular 2 phase Paxos (slow path).
If conflicts are rare, most accesses will
be fast path. But conflicts require 3
communication phases.
Commercial Consensus
02
etcd, Spanner, Aurora, and more
2001: Lamport publishes “Paxos Made Simple”
2006: Google develops Chubby, a distributed locking
service based on Paxos, to rescue the GFS
2010: Apache Zookeeper offers an open source version of
Chubby using a Paxos-variant called ZAB
2013: CoreOS releases etcd, based on Raft, to manage a
cluster of Container Linux.
2014: Google launches k8s using etcd for the
configuration store.
Chubby, Zookeeper, etcd, etcetera
“ZooKeeper… got popular and became the de
facto coordination service for cloud computing
applications. However, since the bar on using
the ZooKeeper interface was so low, it has been
abused/misused by many applications.
When ZooKeeper is improperly used, it often
constituted the bottleneck in performance of
these applications and caused scalability
problems.”
Ailijiang, Charapko, Demirbas (2016)
Chubby, Zookeeper, etcd, etcetera
“Despite the increased choices and
specialization of Paxos protocols and
Paxos systems, the confusion remains
about the proper use cases of these
systems and about which systems are
more suitable for which tasks.”
Ailijiang, Charapko, Demirbas (2016)
Chubby, Zookeeper, etcd, etcetera
Globally Distributed Databases
more global = more users
But… scaling consensus is hard
● The larger the quorum, the slower
it is to respond to requests.
● Things get worse when you have
servers in different data centers:
○ Latency increases because of physical
limits.
○ Network partitions can cut off groups
of servers.
○ Servers respond more quickly to
colocated clients.
Commercial cloud is
designed to work best
here, so hopefully that’s
where your users are!
Legalities of Global
Systems
Systems are regulated
in the countries where
the data resides (where
the servers are).
Building a global app
now means navigating
the complex waters of
data compliance.
Growing Pains
03
The downsides of success
Case Studies
“The launch of the augmented reality game Pokémon Go
was an unmitigated disaster. Due to extremely overloaded
servers from the release’s extreme popularity, users could
not download the game, login, create avatars, or find
augmented reality artifacts in their locales.
The game world was hosted by a suite of Google Cloud
services, primarily backed by the Cloud Datastore, a
geographically distributed NoSQL database. Scaling the
application to millions of users therefore involved
provisioning extra capacity to the database by increasing
the number of shards as well as improving load balancing
and autoscaling of application logic run in Kubernetes
containers.”
Bengfort (2019)
21 February 2018: 501c3 Signal
Technology Foundation formed
4 Jan 2021: WhatsApp updates their
privacy policy re: data sharing with
Facebook
7 Jan 2021: Elon Musk tells his 60M
followers to switch to Signal
13 January 2021: Signal goes from
10M users to 50M users in under 24
hours, bringing the service down for
several days.
Dropbox Annual Revenue 2016-2020
“Between February and October of 2015,
Dropbox successfully relocated 90 percent of
an estimated 600 petabytes of its customer
data to its in-house network of data centers
dubbed Magic Pocket.”
“It was clear to us from the beginning that we’d
have to build everything from scratch,” wrote
Dropbox infrastructure VP Akhil Gupta on his
company blog in 2016, “since there’s nothing in
the open source community that’s proven to
work reliably at our scale. Few companies in the
world have the same requirements for scale of
storage as we do.”
Fulton (2020)
Different systems need
different consensus solutions
An API for Consensus
04
Consensus made open source
scikit-learn
Transformer
fit()
transform()
Estimator
fit()
predict()
X, y
X, y
ŷ
X′
Data Loader
Transformer
Transformer
Estimator
fit()
predict()
2007: Google Summer of Code project
2010: INRIA releases first open source
version
2013: Authors publish “API design for
machine learning software:
experiences from the scikit-learn
project”
2021: 2k contributors, 47k stars, used by
246k projects on Github, including
scikit- Yellowbrick!
The central insight of
scikit-Yellowbrick is that there is
no one “best” machine
learning model, only a set of
best practices for finding the
best model for a given dataset.
Concur: An API for Consensus Network
How do we send messages? What
does messaging imply?
Peer Management
Who’s in? How to reconfigure? How do
newcomers join?
Decision Making
How to vote? How to detect conflict?
Who’s the leader?
Execution
When is a decision final?
sys := &system.New{
QuorumSize: 7,
Network: &message.Stream{
Protocol: grpc,
Volume: broadcast
},
PeerManagement: dynamic,
DecisionMaking: LeadershipStrategy,
Execution: onCommit,
}
if err := sys.Validate(); err != nil {
return errors.New(“Invalid system: ”, err)
}
sys.Concur()
Want to contribute?
tinyurl.com/concurapi
Want to vent?
rebecca@rotational.io
CREDITS: This presentation template was created by Slidesgo,
including icons by Flaticon, and infographics & images by Freepik
Thank
You

More Related Content

What's hot

NATS + Docker meetup talk Oct - 2016
NATS + Docker meetup talk Oct - 2016NATS + Docker meetup talk Oct - 2016
NATS + Docker meetup talk Oct - 2016
wallyqs
 
Distributed Systems Concepts
Distributed Systems ConceptsDistributed Systems Concepts
Distributed Systems Concepts
Jordan Halterman
 
Developing a Globally Distributed Purging System
Developing a Globally Distributed Purging SystemDeveloping a Globally Distributed Purging System
Developing a Globally Distributed Purging System
Fastly
 
Process Management-Process Migration
Process Management-Process MigrationProcess Management-Process Migration
Process Management-Process Migration
MNM Jain Engineering College
 
Simple Solutions for Complex Problems
Simple Solutions for Complex ProblemsSimple Solutions for Complex Problems
Simple Solutions for Complex Problems
Tyler Treat
 
334839757 task-assignment
334839757 task-assignment334839757 task-assignment
334839757 task-assignment
sachinmore76
 
A Coherent Discussion About Performance
A Coherent Discussion About PerformanceA Coherent Discussion About Performance
A Coherent Discussion About PerformanceTheo Schlossnagle
 
Storm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paperStorm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paper
Karthik Ramasamy
 
Seminar Report on Google File System
Seminar Report on Google File SystemSeminar Report on Google File System
Seminar Report on Google File System
Vishal Polley
 
X-trace a pervasive network tracing framework
X-trace a pervasive network tracing frameworkX-trace a pervasive network tracing framework
X-trace a pervasive network tracing framework
ssuser804d54
 
Best practice-high availability-solution-geo-distributed-final
Best practice-high availability-solution-geo-distributed-finalBest practice-high availability-solution-geo-distributed-final
Best practice-high availability-solution-geo-distributed-final
Marco Tusa
 
What is active-active
What is active-activeWhat is active-active
What is active-active
Saif Ahmad
 
Types of Load distributing algorithm in Distributed System
Types of Load distributing algorithm in Distributed SystemTypes of Load distributing algorithm in Distributed System
Types of Load distributing algorithm in Distributed System
DHIVYADEVAKI
 
The Zen of High Performance Messaging with NATS (Strange Loop 2016)
The Zen of High Performance Messaging with NATS (Strange Loop 2016)The Zen of High Performance Messaging with NATS (Strange Loop 2016)
The Zen of High Performance Messaging with NATS (Strange Loop 2016)
wallyqs
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.
DECK36
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed Systems
Tyler Treat
 
DNS-OARC 34: Measuring DNS Flag Day 2020
DNS-OARC 34: Measuring DNS Flag Day 2020DNS-OARC 34: Measuring DNS Flag Day 2020
DNS-OARC 34: Measuring DNS Flag Day 2020
APNIC
 
All you didn't know about the CAP theorem
All you didn't know about the CAP theoremAll you didn't know about the CAP theorem
All you didn't know about the CAP theorem
Kanstantsin Hontarau
 
FIWARE Data Management in High Availability
FIWARE Data Management in High AvailabilityFIWARE Data Management in High Availability
FIWARE Data Management in High Availability
Federico Michele Facca
 

What's hot (20)

NATS + Docker meetup talk Oct - 2016
NATS + Docker meetup talk Oct - 2016NATS + Docker meetup talk Oct - 2016
NATS + Docker meetup talk Oct - 2016
 
Distributed Systems Concepts
Distributed Systems ConceptsDistributed Systems Concepts
Distributed Systems Concepts
 
DNS Cache White Paper
DNS Cache White PaperDNS Cache White Paper
DNS Cache White Paper
 
Developing a Globally Distributed Purging System
Developing a Globally Distributed Purging SystemDeveloping a Globally Distributed Purging System
Developing a Globally Distributed Purging System
 
Process Management-Process Migration
Process Management-Process MigrationProcess Management-Process Migration
Process Management-Process Migration
 
Simple Solutions for Complex Problems
Simple Solutions for Complex ProblemsSimple Solutions for Complex Problems
Simple Solutions for Complex Problems
 
334839757 task-assignment
334839757 task-assignment334839757 task-assignment
334839757 task-assignment
 
A Coherent Discussion About Performance
A Coherent Discussion About PerformanceA Coherent Discussion About Performance
A Coherent Discussion About Performance
 
Storm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paperStorm@Twitter, SIGMOD 2014 paper
Storm@Twitter, SIGMOD 2014 paper
 
Seminar Report on Google File System
Seminar Report on Google File SystemSeminar Report on Google File System
Seminar Report on Google File System
 
X-trace a pervasive network tracing framework
X-trace a pervasive network tracing frameworkX-trace a pervasive network tracing framework
X-trace a pervasive network tracing framework
 
Best practice-high availability-solution-geo-distributed-final
Best practice-high availability-solution-geo-distributed-finalBest practice-high availability-solution-geo-distributed-final
Best practice-high availability-solution-geo-distributed-final
 
What is active-active
What is active-activeWhat is active-active
What is active-active
 
Types of Load distributing algorithm in Distributed System
Types of Load distributing algorithm in Distributed SystemTypes of Load distributing algorithm in Distributed System
Types of Load distributing algorithm in Distributed System
 
The Zen of High Performance Messaging with NATS (Strange Loop 2016)
The Zen of High Performance Messaging with NATS (Strange Loop 2016)The Zen of High Performance Messaging with NATS (Strange Loop 2016)
The Zen of High Performance Messaging with NATS (Strange Loop 2016)
 
PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.PHP Backends for Real-Time User Interaction using Apache Storm.
PHP Backends for Real-Time User Interaction using Apache Storm.
 
From Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed SystemsFrom Mainframe to Microservice: An Introduction to Distributed Systems
From Mainframe to Microservice: An Introduction to Distributed Systems
 
DNS-OARC 34: Measuring DNS Flag Day 2020
DNS-OARC 34: Measuring DNS Flag Day 2020DNS-OARC 34: Measuring DNS Flag Day 2020
DNS-OARC 34: Measuring DNS Flag Day 2020
 
All you didn't know about the CAP theorem
All you didn't know about the CAP theoremAll you didn't know about the CAP theorem
All you didn't know about the CAP theorem
 
FIWARE Data Management in High Availability
FIWARE Data Management in High AvailabilityFIWARE Data Management in High Availability
FIWARE Data Management in High Availability
 

Similar to Beyond Off the-Shelf Consensus

Distributed Consensus: Making Impossible Possible [Revised]
Distributed Consensus: Making Impossible Possible [Revised]Distributed Consensus: Making Impossible Possible [Revised]
Distributed Consensus: Making Impossible Possible [Revised]
Heidi Howard
 
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGYDistributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
reginamutio48
 
Os solved question paper
Os solved question paperOs solved question paper
Os solved question paperAnkit Bhatnagar
 
Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016
Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016
Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016
DevOpsDays Tel Aviv
 
Designing distributed systems
Designing distributed systemsDesigning distributed systems
Designing distributed systems
Malisa Ncube
 
Hyperledger Consensus Algorithms
Hyperledger Consensus AlgorithmsHyperledger Consensus Algorithms
Hyperledger Consensus Algorithms
MabelOza12
 
Chapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.pptChapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.ppt
sirajmohammed35
 
Tef con2016 (1)
Tef con2016 (1)Tef con2016 (1)
Tef con2016 (1)
ggarber
 
End to-end arguments in system design
End to-end arguments in system designEnd to-end arguments in system design
End to-end arguments in system designnody111
 
MSB-Distributed systems goals
MSB-Distributed systems goalsMSB-Distributed systems goals
MSB-Distributed systems goals
MOHD. SHAHRUKH BHATI
 
data replication
data replicationdata replication
data replication
Hassanein Alwan
 
Operating system Interview Questions
Operating system Interview QuestionsOperating system Interview Questions
Operating system Interview Questions
Kuntal Bhowmick
 
Dos unit3
Dos unit3Dos unit3
Dos unit3
JebasheelaSJ
 
I0935053
I0935053I0935053
I0935053
IOSR Journals
 
Air traffic controller - Streams Processing meetup
Air traffic controller  - Streams Processing meetupAir traffic controller  - Streams Processing meetup
Air traffic controller - Streams Processing meetup
Ed Yakabosky
 
Enea Enabling Real-Time in Linux Whitepaper
Enea Enabling Real-Time in Linux WhitepaperEnea Enabling Real-Time in Linux Whitepaper
Enea Enabling Real-Time in Linux Whitepaper
Enea Software AB
 

Similar to Beyond Off the-Shelf Consensus (20)

Distributed Consensus: Making Impossible Possible [Revised]
Distributed Consensus: Making Impossible Possible [Revised]Distributed Consensus: Making Impossible Possible [Revised]
Distributed Consensus: Making Impossible Possible [Revised]
 
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGYDistributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
Distributed OPERATING SYSTEM FOR BACHELOR OF BUSINESS INFORMATION TECHNOLOGY
 
HA Summary
HA SummaryHA Summary
HA Summary
 
Os solved question paper
Os solved question paperOs solved question paper
Os solved question paper
 
Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016
Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016
Event-driven Infrastructure - Mike Place, SaltStack - DevOpsDays Tel Aviv 2016
 
Designing distributed systems
Designing distributed systemsDesigning distributed systems
Designing distributed systems
 
Hyperledger Consensus Algorithms
Hyperledger Consensus AlgorithmsHyperledger Consensus Algorithms
Hyperledger Consensus Algorithms
 
Chapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.pptChapter 6-Consistency and Replication.ppt
Chapter 6-Consistency and Replication.ppt
 
Tef con2016 (1)
Tef con2016 (1)Tef con2016 (1)
Tef con2016 (1)
 
End to-end arguments in system design
End to-end arguments in system designEnd to-end arguments in system design
End to-end arguments in system design
 
PNUTS
PNUTSPNUTS
PNUTS
 
Pnuts Review
Pnuts ReviewPnuts Review
Pnuts Review
 
Pnuts
PnutsPnuts
Pnuts
 
MSB-Distributed systems goals
MSB-Distributed systems goalsMSB-Distributed systems goals
MSB-Distributed systems goals
 
data replication
data replicationdata replication
data replication
 
Operating system Interview Questions
Operating system Interview QuestionsOperating system Interview Questions
Operating system Interview Questions
 
Dos unit3
Dos unit3Dos unit3
Dos unit3
 
I0935053
I0935053I0935053
I0935053
 
Air traffic controller - Streams Processing meetup
Air traffic controller  - Streams Processing meetupAir traffic controller  - Streams Processing meetup
Air traffic controller - Streams Processing meetup
 
Enea Enabling Real-Time in Linux Whitepaper
Enea Enabling Real-Time in Linux WhitepaperEnea Enabling Real-Time in Linux Whitepaper
Enea Enabling Real-Time in Linux Whitepaper
 

More from Rebecca Bilbro

Data Structures for Data Privacy: Lessons Learned in Production
Data Structures for Data Privacy: Lessons Learned in ProductionData Structures for Data Privacy: Lessons Learned in Production
Data Structures for Data Privacy: Lessons Learned in Production
Rebecca Bilbro
 
Conflict-Free Replicated Data Types (PyCon 2022)
Conflict-Free Replicated Data Types (PyCon 2022)Conflict-Free Replicated Data Types (PyCon 2022)
Conflict-Free Replicated Data Types (PyCon 2022)
Rebecca Bilbro
 
(Py)testing the Limits of Machine Learning
(Py)testing the Limits of Machine Learning(Py)testing the Limits of Machine Learning
(Py)testing the Limits of Machine Learning
Rebecca Bilbro
 
Anti-Entropy Replication for Cost-Effective Eventual Consistency
Anti-Entropy Replication for Cost-Effective Eventual ConsistencyAnti-Entropy Replication for Cost-Effective Eventual Consistency
Anti-Entropy Replication for Cost-Effective Eventual Consistency
Rebecca Bilbro
 
The Promise and Peril of Very Big Models
The Promise and Peril of Very Big ModelsThe Promise and Peril of Very Big Models
The Promise and Peril of Very Big Models
Rebecca Bilbro
 
PyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine LearningPyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine Learning
Rebecca Bilbro
 
EuroSciPy 2019: Visual diagnostics at scale
EuroSciPy 2019: Visual diagnostics at scaleEuroSciPy 2019: Visual diagnostics at scale
EuroSciPy 2019: Visual diagnostics at scale
Rebecca Bilbro
 
Visual diagnostics at scale
Visual diagnostics at scaleVisual diagnostics at scale
Visual diagnostics at scale
Rebecca Bilbro
 
Steering Model Selection with Visual Diagnostics: Women in Analytics 2019
Steering Model Selection with Visual Diagnostics: Women in Analytics 2019Steering Model Selection with Visual Diagnostics: Women in Analytics 2019
Steering Model Selection with Visual Diagnostics: Women in Analytics 2019
Rebecca Bilbro
 
A Visual Exploration of Distance, Documents, and Distributions
A Visual Exploration of Distance, Documents, and DistributionsA Visual Exploration of Distance, Documents, and Distributions
A Visual Exploration of Distance, Documents, and Distributions
Rebecca Bilbro
 
Words in space
Words in spaceWords in space
Words in space
Rebecca Bilbro
 
The Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistThe Incredible Disappearing Data Scientist
The Incredible Disappearing Data Scientist
Rebecca Bilbro
 
Camlis
CamlisCamlis
Learning machine learning with Yellowbrick
Learning machine learning with YellowbrickLearning machine learning with Yellowbrick
Learning machine learning with Yellowbrick
Rebecca Bilbro
 
Escaping the Black Box
Escaping the Black BoxEscaping the Black Box
Escaping the Black Box
Rebecca Bilbro
 
Data Intelligence 2017 - Building a Gigaword Corpus
Data Intelligence 2017 - Building a Gigaword CorpusData Intelligence 2017 - Building a Gigaword Corpus
Data Intelligence 2017 - Building a Gigaword Corpus
Rebecca Bilbro
 
Building a Gigaword Corpus (PyCon 2017)
Building a Gigaword Corpus (PyCon 2017)Building a Gigaword Corpus (PyCon 2017)
Building a Gigaword Corpus (PyCon 2017)
Rebecca Bilbro
 
Yellowbrick: Steering machine learning with visual transformers
Yellowbrick: Steering machine learning with visual transformersYellowbrick: Steering machine learning with visual transformers
Yellowbrick: Steering machine learning with visual transformers
Rebecca Bilbro
 
Visualizing the model selection process
Visualizing the model selection processVisualizing the model selection process
Visualizing the model selection process
Rebecca Bilbro
 
NLP for Everyday People
NLP for Everyday PeopleNLP for Everyday People
NLP for Everyday People
Rebecca Bilbro
 

More from Rebecca Bilbro (20)

Data Structures for Data Privacy: Lessons Learned in Production
Data Structures for Data Privacy: Lessons Learned in ProductionData Structures for Data Privacy: Lessons Learned in Production
Data Structures for Data Privacy: Lessons Learned in Production
 
Conflict-Free Replicated Data Types (PyCon 2022)
Conflict-Free Replicated Data Types (PyCon 2022)Conflict-Free Replicated Data Types (PyCon 2022)
Conflict-Free Replicated Data Types (PyCon 2022)
 
(Py)testing the Limits of Machine Learning
(Py)testing the Limits of Machine Learning(Py)testing the Limits of Machine Learning
(Py)testing the Limits of Machine Learning
 
Anti-Entropy Replication for Cost-Effective Eventual Consistency
Anti-Entropy Replication for Cost-Effective Eventual ConsistencyAnti-Entropy Replication for Cost-Effective Eventual Consistency
Anti-Entropy Replication for Cost-Effective Eventual Consistency
 
The Promise and Peril of Very Big Models
The Promise and Peril of Very Big ModelsThe Promise and Peril of Very Big Models
The Promise and Peril of Very Big Models
 
PyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine LearningPyData Global: Thrifty Machine Learning
PyData Global: Thrifty Machine Learning
 
EuroSciPy 2019: Visual diagnostics at scale
EuroSciPy 2019: Visual diagnostics at scaleEuroSciPy 2019: Visual diagnostics at scale
EuroSciPy 2019: Visual diagnostics at scale
 
Visual diagnostics at scale
Visual diagnostics at scaleVisual diagnostics at scale
Visual diagnostics at scale
 
Steering Model Selection with Visual Diagnostics: Women in Analytics 2019
Steering Model Selection with Visual Diagnostics: Women in Analytics 2019Steering Model Selection with Visual Diagnostics: Women in Analytics 2019
Steering Model Selection with Visual Diagnostics: Women in Analytics 2019
 
A Visual Exploration of Distance, Documents, and Distributions
A Visual Exploration of Distance, Documents, and DistributionsA Visual Exploration of Distance, Documents, and Distributions
A Visual Exploration of Distance, Documents, and Distributions
 
Words in space
Words in spaceWords in space
Words in space
 
The Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistThe Incredible Disappearing Data Scientist
The Incredible Disappearing Data Scientist
 
Camlis
CamlisCamlis
Camlis
 
Learning machine learning with Yellowbrick
Learning machine learning with YellowbrickLearning machine learning with Yellowbrick
Learning machine learning with Yellowbrick
 
Escaping the Black Box
Escaping the Black BoxEscaping the Black Box
Escaping the Black Box
 
Data Intelligence 2017 - Building a Gigaword Corpus
Data Intelligence 2017 - Building a Gigaword CorpusData Intelligence 2017 - Building a Gigaword Corpus
Data Intelligence 2017 - Building a Gigaword Corpus
 
Building a Gigaword Corpus (PyCon 2017)
Building a Gigaword Corpus (PyCon 2017)Building a Gigaword Corpus (PyCon 2017)
Building a Gigaword Corpus (PyCon 2017)
 
Yellowbrick: Steering machine learning with visual transformers
Yellowbrick: Steering machine learning with visual transformersYellowbrick: Steering machine learning with visual transformers
Yellowbrick: Steering machine learning with visual transformers
 
Visualizing the model selection process
Visualizing the model selection processVisualizing the model selection process
Visualizing the model selection process
 
NLP for Everyday People
NLP for Everyday PeopleNLP for Everyday People
NLP for Everyday People
 

Recently uploaded

Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.pptPROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
bhadouriyakaku
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
ClaraZara1
 
Technical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prismsTechnical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prisms
heavyhaig
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
awadeshbabu
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
camseq
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
ihlasbinance2003
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
Mukeshwaran Balu
 
bank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdfbank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdf
Divyam548318
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
SUTEJAS
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
Online aptitude test management system project report.pdf
Online aptitude test management system project report.pdfOnline aptitude test management system project report.pdf
Online aptitude test management system project report.pdf
Kamal Acharya
 
Low power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniquesLow power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniques
nooriasukmaningtyas
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
anoopmanoharan2
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
yokeleetan1
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
nooriasukmaningtyas
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Christina Lin
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
zwunae
 

Recently uploaded (20)

Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
 
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.pptPROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
PROJECT FORMAT FOR EVS AMITY UNIVERSITY GWALIOR.ppt
 
6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)6th International Conference on Machine Learning & Applications (CMLA 2024)
6th International Conference on Machine Learning & Applications (CMLA 2024)
 
Technical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prismsTechnical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prisms
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
[JPP-1] - (JEE 3.0) - Kinematics 1D - 14th May..pdf
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
 
bank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdfbank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdf
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
Online aptitude test management system project report.pdf
Online aptitude test management system project report.pdfOnline aptitude test management system project report.pdf
Online aptitude test management system project report.pdf
 
Low power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniquesLow power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniques
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
 
A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...A review on techniques and modelling methodologies used for checking electrom...
A review on techniques and modelling methodologies used for checking electrom...
 
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesHarnessing WebAssembly for Real-time Stateless Streaming Pipelines
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
 

Beyond Off the-Shelf Consensus

  • 1. Dr. Rebecca Bilbro August 2021 Beyond Off-the-Shelf Consensus
  • 2. 02 03 01 I have no idea 😳 Eventual consistency is good enough for us. We require strong consistency. Question #1 Think about the app you’re currently working on. How much consistency does it require?
  • 3. 02 03 01 We have total visibility and control over the region(s) where user data is stored and replicated. We added that cookie banner; wasn’t that enough? It’s on our roadmap, but we haven’t tackled it yet. Question #2 Think about the app you’re currently working on. How concerned are you about compliance with CCPA, GDPR, LGPD, etc. ?
  • 4. 02 03 01 I think our frontend uses gettext and CLDR. We track geographic deployments and data replication to guarantee consistent UX around the world. We haven’t started thinking about global markets yet. Question #3 Think about the app you’re currently working on. How well does your app support i18n/l10n?
  • 5. 02 01 04 Commercial Consensus What is Consensus? Clocks, quorums, and distracted parliamentarians etcd, Spanner, Aurora, and more Growing Pains The downsides of success 03 An API for Consensus Consensus made open source Table of contents
  • 6. Dr. Rebecca Bilbro ● Founder & CTO, Rotational Labs, LLC ● Adjunct Faculty, Georgetown University ● Applied Text Analysis with Python, O’Reilly ● Creator & Maintainer, Scikit- Yellowbrick @rebeccabilbro
  • 7. What if data systems were a little smarter? rotational.io
  • 8. What is Consensus? 01 Clocks, quorums, and distracted parliamentarians
  • 9. Context: A Single Server Picture a simple data storage system where clients can: - GET(key) → value - PUT(key, value) - DEL(key)
  • 10. Context: A Single Server The order of the operations determines the responses the system gives. In a single server system, the server can determine any order it wants. It is always consistent. 1 2 3 4 5 6
  • 11. Consistency The system responds to requests predictably
  • 12. Failure: A Single Server Failure can occur for a lot of reasons, from crashes to network outages and is a routine. In the single server system, when the server fails the entire system becomes unavailable. Worse, data loss might occur for any information stored on volatile memory.
  • 13. Context: A Distributed System A system is distributed when it contains more than one server that must communicate. If one server in the system fails, it doesn’t necessarily become unavailable because other servers can answer requests. If data is replicated we can also avoid data loss.
  • 14. Inconsistency When multiple servers participate in the system, they need to communicate in order to ensure they remain in the same state. Communication takes time (latency) and the more servers in the system, the more time it takes to synchronize. PUT(x, 42) ok GET(x) not found
  • 15. Concurrency Because of delays in communication, it is possible for two clients to perform operations concurrently. In other words, from the system’s perspective, they happen at the same time. The order of these operations determines the response. PUT(x, 42) PUT(x, 27) GET(x) → ?
  • 16. Consistency Levels Strong Consistency: any GET request is guaranteed to return the most recent PUT request. Causal Consistency: any PUT request will only be delayed by dependent keys, not all keys. Eventual Consistency: in the absence of PUT requests, the system will eventually become consistent. Monotonic Reads: every GET returns a value that is more up to date. Synchronization and ordering increase the amount of time it takes to make requests, however consistency can be relaxed to improve performance.
  • 17. Consistency Levels Strong Consistency: any GET request is guaranteed to return the most recent PUT request. Causal Consistency: any PUT request will only be delayed by dependent keys, not all keys. Eventual Consistency: in the absence of PUT requests, the system will eventually become consistent. Monotonic Reads: every GET returns a value that is more up to date. Most data systems that exist today are eventually consistent, preferring performance over a low likelihood of inconsistency.
  • 18. Consistency Levels Strong Consistency: any GET request is guaranteed to return the most recent PUT request. Causal Consistency: any PUT request will only be delayed by dependent keys, not all keys. Eventual Consistency: in the absence of PUT requests, the system will eventually become consistent. Monotonic Reads: every GET returns a value that is more up to date. However, many applications require strong consistency in addition to performance but have to rely on centralized systems.
  • 19. Distributed Consensus Fault Tolerant Decision Making for a Network of Replicas. ?
  • 20. Paxos Consensus Every server is a state machine that can apply commands in a single order. Each server maintains a log of the operations (e.g. GET or PUT), and applies those operations when the entry in the log is committed. Committing entries happens by majority vote as follows.
  • 21. Paxos Consensus When a client makes a request, the server requests a slot in the log to apply a command, the “prepare” phase.
  • 22. Paxos Consensus When a client makes a request, the server requests a slot in the log to apply a command, the “prepare” phase. If the other servers have that spot free, they will reserve the slot for the requesting server.
  • 23. Paxos Consensus If a majority of servers responds to the prepare phase, the originating server, will send the command to be applied to the log in that spot, the accept phase.
  • 24. Paxos Consensus If a majority of servers responds to the prepare phase, the originating server, will send the command to be applied to the log in that spot, the accept phase. If a majority of servers reply to the accept phase, the entry in the log is committed.
  • 25. Paxos Consensus Even if servers fail, so long as a majority of servers are still running, decisions can be made. Once the server returns, it can be brought back up to date by the other servers. Rules for responding to the prepare and accept phases ensure that there will only ever be one log order.
  • 26. Leader Optimization: Raft & Multi-Paxos An optimization where the prepare phase is performed once, ahead of time by electing a designated leader. Heartbeats are used to determine if the leader has died and a new leader must be elected. This results in faster responses to clients, but small outages during elections.
  • 27. Ballot Optimization: Mencius To avoid a prepare phase, slots are granted to leaders in a predetermined fashion (e.g. round-robin). Clients tend to broadcast to multiple leaders in order to apply the command to the next available slot. Good for dense workloads with continuous accesses. Compaction and forwarding also help manage “empty” log slots.
  • 28. Optimistic Consensus: Fast Paxos, ePaxos Attempt to commit a command in the first phase, known as the “fast path”. During the fast path commit, conflict detection is applied. If a conflict is detected, then perform regular 2 phase Paxos (slow path). If conflicts are rare, most accesses will be fast path. But conflicts require 3 communication phases.
  • 30. 2001: Lamport publishes “Paxos Made Simple” 2006: Google develops Chubby, a distributed locking service based on Paxos, to rescue the GFS 2010: Apache Zookeeper offers an open source version of Chubby using a Paxos-variant called ZAB 2013: CoreOS releases etcd, based on Raft, to manage a cluster of Container Linux. 2014: Google launches k8s using etcd for the configuration store. Chubby, Zookeeper, etcd, etcetera
  • 31. “ZooKeeper… got popular and became the de facto coordination service for cloud computing applications. However, since the bar on using the ZooKeeper interface was so low, it has been abused/misused by many applications. When ZooKeeper is improperly used, it often constituted the bottleneck in performance of these applications and caused scalability problems.” Ailijiang, Charapko, Demirbas (2016) Chubby, Zookeeper, etcd, etcetera
  • 32. “Despite the increased choices and specialization of Paxos protocols and Paxos systems, the confusion remains about the proper use cases of these systems and about which systems are more suitable for which tasks.” Ailijiang, Charapko, Demirbas (2016) Chubby, Zookeeper, etcd, etcetera
  • 34. more global = more users
  • 35. But… scaling consensus is hard ● The larger the quorum, the slower it is to respond to requests. ● Things get worse when you have servers in different data centers: ○ Latency increases because of physical limits. ○ Network partitions can cut off groups of servers. ○ Servers respond more quickly to colocated clients.
  • 36. Commercial cloud is designed to work best here, so hopefully that’s where your users are!
  • 37. Legalities of Global Systems Systems are regulated in the countries where the data resides (where the servers are). Building a global app now means navigating the complex waters of data compliance.
  • 40. “The launch of the augmented reality game Pokémon Go was an unmitigated disaster. Due to extremely overloaded servers from the release’s extreme popularity, users could not download the game, login, create avatars, or find augmented reality artifacts in their locales. The game world was hosted by a suite of Google Cloud services, primarily backed by the Cloud Datastore, a geographically distributed NoSQL database. Scaling the application to millions of users therefore involved provisioning extra capacity to the database by increasing the number of shards as well as improving load balancing and autoscaling of application logic run in Kubernetes containers.” Bengfort (2019)
  • 41. 21 February 2018: 501c3 Signal Technology Foundation formed 4 Jan 2021: WhatsApp updates their privacy policy re: data sharing with Facebook 7 Jan 2021: Elon Musk tells his 60M followers to switch to Signal 13 January 2021: Signal goes from 10M users to 50M users in under 24 hours, bringing the service down for several days.
  • 42. Dropbox Annual Revenue 2016-2020 “Between February and October of 2015, Dropbox successfully relocated 90 percent of an estimated 600 petabytes of its customer data to its in-house network of data centers dubbed Magic Pocket.” “It was clear to us from the beginning that we’d have to build everything from scratch,” wrote Dropbox infrastructure VP Akhil Gupta on his company blog in 2016, “since there’s nothing in the open source community that’s proven to work reliably at our scale. Few companies in the world have the same requirements for scale of storage as we do.” Fulton (2020)
  • 43. Different systems need different consensus solutions
  • 44. An API for Consensus 04 Consensus made open source
  • 45. scikit-learn Transformer fit() transform() Estimator fit() predict() X, y X, y ŷ X′ Data Loader Transformer Transformer Estimator fit() predict() 2007: Google Summer of Code project 2010: INRIA releases first open source version 2013: Authors publish “API design for machine learning software: experiences from the scikit-learn project” 2021: 2k contributors, 47k stars, used by 246k projects on Github, including scikit- Yellowbrick!
  • 46. The central insight of scikit-Yellowbrick is that there is no one “best” machine learning model, only a set of best practices for finding the best model for a given dataset.
  • 47. Concur: An API for Consensus Network How do we send messages? What does messaging imply? Peer Management Who’s in? How to reconfigure? How do newcomers join? Decision Making How to vote? How to detect conflict? Who’s the leader? Execution When is a decision final? sys := &system.New{ QuorumSize: 7, Network: &message.Stream{ Protocol: grpc, Volume: broadcast }, PeerManagement: dynamic, DecisionMaking: LeadershipStrategy, Execution: onCommit, } if err := sys.Validate(); err != nil { return errors.New(“Invalid system: ”, err) } sys.Concur()
  • 48. Want to contribute? tinyurl.com/concurapi Want to vent? rebecca@rotational.io
  • 49. CREDITS: This presentation template was created by Slidesgo, including icons by Flaticon, and infographics & images by Freepik Thank You