From Mainframe to Microservice 
An Introduction to 
Distributed Systems 
@tyler_treat 
Workiva
An Introduction to Distributed Systems 
❖ Building a foundation of understanding 
❖ Why distributed systems? 
❖ Universal fallacies 
❖ Characteristics and the CAP theorem 
❖ Common pitfalls 
❖ Digging deeper 
❖ Byzantine Generals Problem and consensus 
❖ Split-brain 
❖ Hybrid consistency models 
❖ Scaling shared data and CRDTs
“A distributed system is one in which the failure of 
a computer you didn't even know existed can 
render your own computer unusable.” 
–Leslie Lamport
Scale Up vs. Scale Out 
Vertical Scaling 
❖ Add resources to a node 
❖ Increases node capacity, load 
is unaffected 
❖ System complexity unaffected 
Horizontal Scaling 
❖ Add nodes to a cluster 
❖ Decreases load, capacity is 
unaffected 
❖ Gains availability and throughput at the cost of 
increased complexity
A distributed system 
is a collection of independent computers 
that behave as a single coherent system.
Why Distributed Systems? 
❖ Availability - serve every request 
❖ Fault Tolerance - resilient to failures 
❖ Throughput - parallel computation 
❖ Architecture - decoupled, focused services 
❖ Economics - scale-out becoming manageable/cost-effective
oh shit…
“You have to design distributed systems with the 
expectation of failure.” 
–Ken Arnold
Distributed systems engineers are 
the world’s biggest pessimists.
Universal Fallacy #1 
The network is reliable. 
❖ Message delivery is never guaranteed 
❖ Best effort 
❖ Is it worth it? 
❖ Resiliency/redundancy/failover
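One common way to cope with best-effort delivery is to retry with backoff. The sketch below is illustrative only: `call_with_retries`, its parameters, and the use of `ConnectionError` as the failure signal are assumptions, not something from the talk. Retrying is only safe when the operation is idempotent, since a "failed" call may in fact have been delivered.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    op: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run op(), retrying on ConnectionError with exponential backoff.

    Hypothetical helper: `sleep` is injectable so the backoff can be tested
    without actually waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return op()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # 0.1s, 0.2s, 0.4s, ...
    raise AssertionError("unreachable")
```

The caller still has to decide what to do when every attempt fails, which is where redundancy and failover come in.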
Universal Fallacy #2 
Latency is zero. 
❖ We cannot defy the laws of physics 
❖ LAN to WAN deteriorates quickly 
❖ Minimize network calls (batch) 
❖ Design asynchronous systems
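Batching, as suggested above, amortizes the per-call round trip over many items. A minimal sketch, assuming a hypothetical `batched` helper that groups items so each network call carries up to `size` of them:

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of up to `size` items; the last batch may be shorter."""
    if size < 1:
        raise ValueError("size must be >= 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the partial final batch
```

With a batch size of 100, sending 1,000 items costs 10 round trips instead of 1,000, at the price of larger messages (see Fallacy #3).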
Universal Fallacy #3 
Bandwidth is infinite. 
❖ Out of our control 
❖ Limit message sizes 
❖ Use message queueing
Universal Fallacy #4 
The network is secure. 
❖ Everyone is out to get you 
❖ Build in security from day 1 
❖ Multi-layered 
❖ Encrypt, pentest, train developers
Universal Fallacy #5 
Topology doesn’t change. 
❖ Network topology is dynamic 
❖ Don’t statically address hosts 
❖ Collection of services, not nodes 
❖ Service discovery
Universal Fallacy #6 
There is one administrator. 
❖ May integrate with third-party systems 
❖ “Is it our problem or theirs?” 
❖ Conflicting policies/priorities 
❖ Third parties constrain; weigh the risk
Universal Fallacy #7 
Transport cost is zero. 
❖ Monetary and practical costs 
❖ Building/maintaining a network is not 
trivial 
❖ The “perfect” system might be too costly
Universal Fallacy #8 
The network is homogeneous. 
❖ Networks are almost never homogeneous 
❖ Third-party integration? 
❖ Consider interoperability 
❖ Avoid proprietary protocols
These problems apply to LAN and WAN systems 
(single-data-center and cross-data-center) 
No one is safe.
“Anything that can go 
wrong will go wrong.” 
–Murphy’s Law
Characteristics of a Reliable Distributed System 
❖ Fault-tolerant - nodes can fail 
❖ Available - serve all the requests, all the time 
❖ Scalable - behave correctly with changing topologies 
❖ Consistent - state is coordinated across nodes 
❖ Secure - access is authenticated 
❖ Performant - it’s fast!
Distributed systems are 
all about trade-offs.
CAP Theorem 
❖ Formulated by Eric Brewer around 1998; 
presented as a conjecture in 2000 
❖ Impossible to guarantee 
all three: 
❖ Consistency 
❖ Availability 
❖ Partition tolerance
Consistency 
❖ Linearizable - there exists a total order of all state 
updates and each update appears atomic 
❖ E.g. mutexes make operations appear atomic 
❖ When operations are linearizable, we can assign a unique 
“timestamp” to each one (total order) 
❖ A system is consistent if every node shares the same 
total order 
❖ Consistency which is both global and instantaneous is 
impossible
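The mutex example can be made concrete on a single machine. This is a local sketch, not a distributed one: `AtomicCounter` and `hammer` are hypothetical names, and the point is only that a lock makes each increment appear atomic, so every operation receives a unique "timestamp" in one total order.

```python
import threading
from typing import List


class AtomicCounter:
    """Counter whose increments are serialized by a mutex."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> int:
        # The lock makes read-modify-write appear as one indivisible step.
        with self._lock:
            self._value += 1
            return self._value  # unique position in the total order


def hammer(counter: AtomicCounter, n_threads: int = 4, per_thread: int = 1000) -> List[int]:
    """Increment from several threads and collect every returned ticket."""
    results: List[List[int]] = [[] for _ in range(n_threads)]

    def worker(out: List[int]) -> None:
        for _ in range(per_thread):
            out.append(counter.increment())

    threads = [threading.Thread(target=worker, args=(results[i],)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [ticket for per_thread_out in results for ticket in per_thread_out]
```

Across nodes there is no shared lock to lean on, which is why agreeing on a single order requires coordination.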
Consistency 
❖ Eventual consistency - replicas allowed to diverge, 
eventually converge 
❖ Strong consistency - replicas can’t diverge; 
requires linearizability
Availability 
❖ Every request received by a non-failing node must be 
served 
❖ If a piece of data required for a request is unavailable, 
the system is unavailable 
❖ 100% availability is a myth
Partition Tolerance 
❖ A partition is a split in the network—many causes 
❖ Partition tolerance means partitions can happen 
❖ CA is easy when your network is perfectly reliable 
❖ Your network is not perfectly reliable
Common Pitfalls 
❖ Halting failure - machine stops 
❖ Network failure - network connection breaks 
❖ Omission failure - messages are lost 
❖ Timing failure - clock skew 
❖ Byzantine failure - arbitrary failure
Digging Deeper 
Exploring some higher-level concepts
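Byzantine Generals Problem 
❖ Consider a city under siege by two allied armies 
❖ Each army has a general 
❖ One general is the leader 
❖ Armies must agree when to attack 
❖ Must use messengers to communicate 
❖ Messengers can be captured by defenders 
❖ Strictly, this lost-messenger setup is the Two Generals 
Problem; the Byzantine variant adds generals who may be traitors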
Byzantine Generals Problem 
❖ Send 100 messages, attack no matter what 
❖ A might attack without B 
❖ Send 100 messages, wait for acks, attack if confident 
❖ B might attack without A 
❖ Messages have overhead 
❖ Can’t reliably reach a decision (provably impossible)
Distributed Consensus 
❖ Replace 2 generals with N 
generals 
❖ Nodes must agree on data 
value 
❖ Solutions: 
❖ Multi-phase commit 
❖ State replication
Two-Phase Commit 
❖ Blocking protocol 
❖ Coordinator waits for 
cohorts 
❖ Cohorts wait for 
commit/rollback 
❖ Can deadlock
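The two phases can be sketched in memory as follows. This is a simplified illustration under stated assumptions: `Cohort` and `two_phase_commit` are hypothetical names, everything runs in one process, and no messages are lost, so the blocking behavior described above (a cohort stuck waiting for a crashed coordinator) is not modeled.

```python
from typing import List


class Cohort:
    """A participant that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(cohorts: List[Cohort]) -> bool:
    """Coordinator: commit only if every cohort votes yes."""
    votes = [c.prepare() for c in cohorts]  # phase 1: collect all votes
    if all(votes):
        for c in cohorts:  # phase 2: unanimous yes, so commit everywhere
            c.commit()
        return True
    for c in cohorts:  # phase 2: any no vote aborts everywhere
        c.rollback()
    return False
```

In a real deployment each `prepare` and `commit` is a network call, and a cohort that has voted yes must hold its locks until it hears the outcome.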
Three-Phase Commit 
❖ Non-blocking 
protocol 
❖ Abort on timeouts 
❖ Susceptible to 
network partitions
State Replication 
❖ E.g. Paxos, Raft protocols 
❖ Elect a leader (coordinator) 
❖ All changes go through leader 
❖ Each change appends log entry 
❖ Each node has log replica
State Replication 
❖ Must have quorum (majority) 
to proceed 
❖ Commit once quorum acks 
❖ Quorums mitigate partitions 
❖ Logs allow state to be rebuilt
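The "commit once quorum acks" rule can be sketched as below. This is not Paxos or Raft: it omits leader election, terms, and log repair, and the names `Leader`, `propose`, and `quorum_size` are hypothetical. Followers are modeled as callables that return `True` to acknowledge an entry or raise `ConnectionError` when unreachable.

```python
from typing import Callable, List


def quorum_size(cluster_size: int) -> int:
    """Smallest majority of the cluster, e.g. 2 of 3 or 3 of 5."""
    return cluster_size // 2 + 1


class Leader:
    """Appends entries to its log and commits them once a majority acks."""

    def __init__(self, followers: List[Callable[[str], bool]]) -> None:
        self.followers = followers
        self.log: List[str] = []
        self.committed: List[str] = []

    def propose(self, entry: str) -> bool:
        self.log.append(entry)
        acks = 1  # the leader's own log counts toward the quorum
        for send in self.followers:
            try:
                if send(entry):
                    acks += 1
            except ConnectionError:
                pass  # unreachable follower: no ack from it
        if acks >= quorum_size(len(self.followers) + 1):
            self.committed.append(entry)
            return True
        return False  # entry stays uncommitted in the log
```

A leader cut off from the majority can still append entries but can never commit them, which is how quorums limit the damage of a partition.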
Split-Brain 
❖ Optimistic (AP) - let partitions work as usual 
❖ Pessimistic (CP) - quorum partition works, fence others
Hybrid Consistency Models 
❖ Weak == available, low latency, stale reads 
❖ Strong == fresh reads, less available, high latency 
❖ How do you choose a consistency model? 
❖ Hybrid models 
❖ Weaker models when possible (likes, followers, votes) 
❖ Stronger models when necessary 
❖ Tunable consistency models (Cassandra, Riak, etc.)
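In quorum-replicated stores, tunable consistency is often summarized by a rule of thumb: with N replicas, a read of R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > N. The helper below is a sketch of that arithmetic only; `read_overlaps_write` is a hypothetical name, and real systems add caveats this ignores.

```python
def read_overlaps_write(n: int, r: int, w: int) -> bool:
    """True if any R replicas must intersect any W replicas out of N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n
```

For N = 3, reading and writing at quorum (R = 2, W = 2) overlaps, while R = 1, W = 1 favors latency and availability and may return stale data.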
Scaling Shared Data 
❖ Sharing mutable data at large scale is difficult 
❖ Solutions: 
❖ Immutable data 
❖ Last write wins 
❖ Application-level conflict resolution 
❖ Causal ordering (e.g. vector clocks) 
❖ Distributed data types (CRDTs)
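As an illustration of causal ordering, a vector clock keeps one counter per node and compares clocks component by component. The functions below are a minimal sketch with hypothetical names; clocks are plain dicts, and a missing entry is treated as zero.

```python
from typing import Dict

Clock = Dict[str, int]


def tick(clock: Clock, node: str) -> Clock:
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a: Clock, b: Clock) -> Clock:
    """Element-wise maximum, used when a node receives another's clock."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def happened_before(a: Clock, b: Clock) -> bool:
    """True if a is <= b in every component and strictly less in at least one."""
    keys = a.keys() | b.keys()
    return all(a.get(k, 0) <= b.get(k, 0) for k in keys) and any(
        a.get(k, 0) < b.get(k, 0) for k in keys
    )


def concurrent(a: Clock, b: Clock) -> bool:
    """True if each clock is ahead of the other somewhere: a real conflict."""
    keys = a.keys() | b.keys()
    return any(a.get(k, 0) > b.get(k, 0) for k in keys) and any(
        b.get(k, 0) > a.get(k, 0) for k in keys
    )
```

Concurrent updates are exactly the ones that need a resolution policy, whether last write wins, application-level merging, or a CRDT.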
Scaling Shared Data 
Imagine a shared, global 
counter… 
“Get, add 1, and put” 
transaction will not 
scale
CRDT 
❖ Conflict-free Replicated Data Type 
❖ Convergent: state-based 
❖ Commutative: operations-based 
❖ E.g. distributed sets, lists, maps, counters 
❖ Update concurrently w/o writer coordination
CRDT 
❖ CRDTs always converge (provably) 
❖ Operations commute (order doesn’t matter) 
❖ Highly available, eventually consistent 
❖ Always reach consistent state 
❖ Drawbacks: 
❖ Requires knowledge of all clients 
❖ Must be associative, commutative, and idempotent
G-Counter
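A grow-only counter (G-Counter) is commonly described as one count per node, merged by taking the element-wise maximum. The class below is a minimal state-based sketch under that assumption; the class and method names are illustrative, not taken from the talk.

```python
from typing import Dict


class GCounter:
    """State-based grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {node_id: 0}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.node_id] += amount

    def value(self) -> int:
        # The global count is the sum of every node's local count.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # max is associative, commutative, and idempotent, so replicas
        # converge regardless of merge order or repetition.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)
```

Unlike the "get, add 1, and put" transaction, increments here never coordinate with other writers; replicas exchange state and merge whenever they can.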
CRDT 
❖ Add to set is associative, commutative, idempotent 
❖ add(“a”), add(“b”), add(“a”) => {“a”, “b”} 
❖ Adding and removing items is not 
❖ add(“a”), remove(“a”) => {} 
❖ remove(“a”), add(“a”) => {“a”} 
❖ CRDTs require interpretation of common data 
structures w/ limitations
Two-Phase Set 
❖ Use two sets, one for adding, one for removing 
❖ Elements can be added once and removed once 
❖ { 
“a”: [“a”, “b”, “c”], 
“r”: [“a”] 
} 
❖ => {“b”, “c”} 
❖ add(“a”), remove(“a”) => {“a”: [“a”], “r”: [“a”]} 
❖ remove(“a”), add(“a”) => {“a”: [“a”], “r”: [“a”]}
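The two-set idea above can be sketched directly. `TwoPhaseSet` and its method names are illustrative assumptions; as in the slide's example, `remove` records a tombstone unconditionally, so a removed element can never be re-added.

```python
from typing import Hashable, Set


class TwoPhaseSet:
    """Set CRDT built from an add-set and a remove-set (tombstones)."""

    def __init__(self) -> None:
        self.added: Set[Hashable] = set()
        self.removed: Set[Hashable] = set()

    def add(self, item: Hashable) -> None:
        self.added.add(item)

    def remove(self, item: Hashable) -> None:
        self.removed.add(item)  # tombstone: removal wins permanently

    def value(self) -> Set[Hashable]:
        return self.added - self.removed

    def merge(self, other: "TwoPhaseSet") -> None:
        # Set union is associative, commutative, and idempotent.
        self.added |= other.added
        self.removed |= other.removed
```

Both operation orders now yield the same state, at the cost of tombstones that grow without bound.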
Let’s Recap...
Distributed architectures allow us to build 
highly available, fault-tolerant systems.
We can't live in this fantasy land 
where everything works perfectly 
all of the time.
Shit happens — network partitions, 
hardware failure, GC pauses, 
latency, dropped packets…
Build resilient systems.
Design for failure.
kill -9
Consider the trade-off between 
consistency and availability.
Partition tolerance is not an option, 
it’s required. 
(if you’re building a distributed system)
Use weak consistency when possible, 
strong when necessary.
Sharing data at scale is hard, 
let’s go shopping. 
(or consider your options)
State is hell.
Further Readings 
❖ Jepsen series 
Kyle Kingsbury (aphyr) 
❖ A Comprehensive Study of Convergent and Commutative 
Replicated Data Types 
Shapiro et al. 
❖ In Search of an Understandable Consensus Algorithm 
Ongaro et al. 
❖ CAP Twelve Years Later 
Eric Brewer 
❖ Many, many more…
Thanks! 
@tyler_treat 
github.com/tylertreat 
bravenewgeek.com

Software development... for all? (keynote at ICSOFT'2024)Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)
miso_uam
 

Recently uploaded (20)

Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
 
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Final Course Know...
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Final Course Know...AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Final Course Know...
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Final Course Know...
 
dachnug51 - HCL Sametime 12 as a Software Appliance.pdf
dachnug51 - HCL Sametime 12 as a Software Appliance.pdfdachnug51 - HCL Sametime 12 as a Software Appliance.pdf
dachnug51 - HCL Sametime 12 as a Software Appliance.pdf
 
welcome to presentation on Google Apps
welcome to   presentation on Google Appswelcome to   presentation on Google Apps
welcome to presentation on Google Apps
 
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
 
Odoo E-commerce website development guides
Odoo E-commerce website development guidesOdoo E-commerce website development guides
Odoo E-commerce website development guides
 
11 Top Cross Browser Testing Tools to Know About.pdf
11 Top Cross Browser Testing Tools to Know About.pdf11 Top Cross Browser Testing Tools to Know About.pdf
11 Top Cross Browser Testing Tools to Know About.pdf
 
Break data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud ConnectorsBreak data silos with real-time connectivity using Confluent Cloud Connectors
Break data silos with real-time connectivity using Confluent Cloud Connectors
 
Folding Cheat Sheet #7 - seventh in a series
Folding Cheat Sheet #7 - seventh in a seriesFolding Cheat Sheet #7 - seventh in a series
Folding Cheat Sheet #7 - seventh in a series
 
Google ML-Kit - Understanding on-device machine learning
Google ML-Kit - Understanding on-device machine learningGoogle ML-Kit - Understanding on-device machine learning
Google ML-Kit - Understanding on-device machine learning
 
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) .pdf
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) .pdfAWS Cloud Practitioner Essentials (Second Edition) (Arabic) .pdf
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) .pdf
 
NYGGS 360: A Complete ERP for Construction Innovation
NYGGS 360: A Complete ERP for Construction InnovationNYGGS 360: A Complete ERP for Construction Innovation
NYGGS 360: A Complete ERP for Construction Innovation
 
Shivam Pandit working on Php Web Developer.
Shivam Pandit working on Php Web Developer.Shivam Pandit working on Php Web Developer.
Shivam Pandit working on Php Web Developer.
 
Top 10 Tips To Get Google AdSense For Your Website
Top 10 Tips To Get Google AdSense For Your WebsiteTop 10 Tips To Get Google AdSense For Your Website
Top 10 Tips To Get Google AdSense For Your Website
 
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Course Introducti...
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Course Introducti...AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Course Introducti...
AWS Cloud Practitioner Essentials (Second Edition) (Arabic) Course Introducti...
 
dachnug51 - All you ever wanted to know about domino licensing.pdf
dachnug51 - All you ever wanted to know about domino licensing.pdfdachnug51 - All you ever wanted to know about domino licensing.pdf
dachnug51 - All you ever wanted to know about domino licensing.pdf
 
Splunk_Remote_Work_Insights_Overview.pptx
Splunk_Remote_Work_Insights_Overview.pptxSplunk_Remote_Work_Insights_Overview.pptx
Splunk_Remote_Work_Insights_Overview.pptx
 
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
Vip Girls Call ServiCe Hyderabad 0000000000 Pooja Best High Class Hyderabad A...
 
NBFC Software: Optimize Your Non-Banking Financial Company
NBFC Software: Optimize Your Non-Banking Financial CompanyNBFC Software: Optimize Your Non-Banking Financial Company
NBFC Software: Optimize Your Non-Banking Financial Company
 
Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)
 

From Mainframe to Microservice: An Introduction to Distributed Systems

  • 1. From Mainframe to Microservice An Introduction to Distributed Systems @tyler_treat Workiva
  • 2. An Introduction to Distributed Systems ❖ Building a foundation of understanding ❖ Why distributed systems? ❖ Universal fallacies ❖ Characteristics and the CAP theorem ❖ Common pitfalls ❖ Digging deeper ❖ Byzantine Generals Problem and consensus ❖ Split-brain ❖ Hybrid consistency models ❖ Scaling shared data and CRDTs
  • 3. “A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” –Leslie Lamport
  • 4. Scale Up vs. Scale Out Vertical Scaling ❖ Add resources to a node ❖ Increases node capacity, load is unaffected ❖ System complexity unaffected Horizontal Scaling ❖ Add nodes to a cluster ❖ Decreases load, capacity is unaffected ❖ Availability and throughput w/ increased complexity
  • 5. A distributed system is a collection of independent computers that behave as a single coherent system.
  • 6. Why Distributed Systems? ❖ Availability: serve every request ❖ Fault Tolerance: resilient to failures ❖ Throughput: parallel computation ❖ Architecture: decoupled, focused services ❖ Economics: scale-out becoming manageable/cost-effective
  • 8. “You have to design distributed systems with the expectation of failure.” –Ken Arnold
  • 9. Distributed systems engineers are the world’s biggest pessimists.
  • 10. Universal Fallacy #1 The network is reliable. ❖ Message delivery is never guaranteed ❖ Best effort ❖ Is it worth it? ❖ Resiliency/redundancy/failover
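Since delivery is best effort, a common resiliency measure is to retry with backoff. The following is a minimal Python sketch, not something taken from the slides: `send_with_retry`, `make_flaky_send`, and the use of `ConnectionError` to signal a lost message are all assumptions made for illustration.

```python
import random
import time


def send_with_retry(send, message, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call send(message) until it succeeds or the attempts run out.

    Delivery becomes at-least-once: a retry can duplicate a message whose
    acknowledgement was lost, so the receiver should be idempotent.
    """
    for attempt in range(attempts):
        try:
            return send(message)
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up; the caller decides whether to fail over
            # Exponential backoff with jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))


def make_flaky_send(failures):
    """Hypothetical transport that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def send(message):
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError("simulated drop")
        return f"ack:{message}"

    return send, state
```

Retrying only reduces the chance of loss; it cannot guarantee delivery, which is why the final failure is re-raised for the caller to handle.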
  • 11. Universal Fallacy #2 Latency is zero. ❖ We cannot defy the laws of physics ❖ LAN to WAN deteriorates quickly ❖ Minimize network calls (batch) ❖ Design asynchronous systems
  • 12. Universal Fallacy #3 Bandwidth is infinite. ❖ Out of our control ❖ Limit message sizes ❖ Use message queueing
  • 13. Universal Fallacy #4 The network is secure. ❖ Everyone is out to get you ❖ Build in security from day 1 ❖ Multi-layered ❖ Encrypt, pentest, train developers
  • 14. Universal Fallacy #5 Topology doesn’t change. ❖ Network topology is dynamic ❖ Don’t statically address hosts ❖ Collection of services, not nodes ❖ Service discovery
  • 15. Universal Fallacy #6 There is one administrator. ❖ May integrate with third-party systems ❖ “Is it our problem or theirs?” ❖ Conflicting policies/priorities ❖ Third parties constrain; weigh the risk
  • 16. Universal Fallacy #7 Transport cost is zero. ❖ Monetary and practical costs ❖ Building/maintaining a network is not trivial ❖ The “perfect” system might be too costly
  • 17. Universal Fallacy #8 The network is homogeneous. ❖ Networks are almost never homogeneous ❖ Third-party integration? ❖ Consider interoperability ❖ Avoid proprietary protocols
  • 18. These problems apply to LAN and WAN systems (single-data-center and cross-data-center) No one is safe.
  • 19. “Anything that can go wrong will go wrong.” –Murphy’s Law
  • 21. Characteristics of a Reliable Distributed System ❖ Fault-tolerant: nodes can fail ❖ Available: serve all the requests, all the time ❖ Scalable: behave correctly with changing topologies ❖ Consistent: state is coordinated across nodes ❖ Secure: access is authenticated ❖ Performant: it’s fast!
  • 23. Distributed systems are all about trade-offs.
  • 24. CAP Theorem ❖ Formulated by Eric Brewer in 1998, presented at PODC in 2000 ❖ Impossible to guarantee all three at once: ❖ Consistency ❖ Availability ❖ Partition tolerance
  • 26. Consistency ❖ Linearizable - there exists a total order of all state updates and each update appears atomic ❖ E.g. mutexes make operations appear atomic ❖ When operations are linearizable, we can assign a unique “timestamp” to each one (total order) ❖ A system is consistent if every node shares the same total order ❖ Consistency which is both global and instantaneous is impossible
  • 27. Consistency ❖ Eventual consistency: replicas allowed to diverge, eventually converge ❖ Strong consistency: replicas can’t diverge; requires linearizability
  • 28. Availability ❖ Every request received by a non-failing node must be served ❖ If a piece of data required for a request is unavailable, the system is unavailable ❖ 100% availability is a myth
  • 29. Partition Tolerance ❖ A partition is a split in the network—many causes ❖ Partition tolerance means partitions can happen ❖ CA is easy when your network is perfectly reliable ❖ Your network is not perfectly reliable
  • 31. Common Pitfalls ❖ Halting failure - machine stops ❖ Network failure - network connection breaks ❖ Omission failure - messages are lost ❖ Timing failure - clock skew ❖ Byzantine failure - arbitrary failure
  • 32. Digging Deeper Exploring some higher-level concepts
  • 33. Byzantine Generals Problem (this two-army, lost-messenger form is commonly known as the Two Generals Problem) ❖ Consider a city under siege by two allied armies ❖ Each army has a general ❖ One general is the leader ❖ Armies must agree when to attack ❖ Must use messengers to communicate ❖ Messengers can be captured by defenders
  • 35. Byzantine Generals Problem ❖ Send 100 messages, attack no matter what ❖ A might attack without B ❖ Send 100 messages, wait for acks, attack if confident ❖ B might attack without A ❖ Messages have overhead ❖ Can’t reliably make a decision (provably impossible)
  • 36. Distributed Consensus ❖ Replace 2 generals with N generals ❖ Nodes must agree on data value ❖ Solutions: ❖ Multi-phase commit ❖ State replication
  • 37. Two-Phase Commit ❖ Blocking protocol ❖ Coordinator waits for cohorts ❖ Cohorts wait for commit/rollback ❖ Can deadlock
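The two phases can be sketched in a few lines of Python. This is an in-memory illustration only: the `Cohort` class and `two_phase_commit` function are hypothetical names, and the sketch leaves out the timeouts, durable logging, and coordinator failure that make the real protocol block.

```python
class Cohort:
    """Hypothetical participant that votes in phase 1 and applies in phase 2."""

    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote. After voting yes, a real cohort must block
        # until it hears the coordinator's decision.
        self.state = "prepared" if self.will_commit else "aborted"
        return self.will_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(cohorts):
    """Return True if every cohort committed, False if all rolled back."""
    # Phase 1 (voting): ask every cohort to prepare.
    votes = [cohort.prepare() for cohort in cohorts]
    decision = all(votes)
    # Phase 2 (completion): broadcast the single decision to everyone.
    for cohort in cohorts:
        if decision:
            cohort.commit()
        else:
            cohort.rollback()
    return decision
```

If the coordinator were to crash between the two phases, every prepared cohort would be left waiting, which is the blocking behavior the slide refers to.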
  • 38. Three-Phase Commit ❖ Non-blocking protocol ❖ Abort on timeouts ❖ Susceptible to network partitions
  • 39. State Replication ❖ E.g. Paxos, Raft protocols ❖ Elect a leader (coordinator) ❖ All changes go through leader ❖ Each change appends log entry ❖ Each node has log replica
  • 40. State Replication ❖ Must have quorum (majority) to proceed ❖ Commit once quorum acks ❖ Quorums mitigate partitions ❖ Logs allow state to be rebuilt
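The quorum rule from the two State Replication slides can be sketched as follows. This is a simplified illustration and not Paxos or Raft: the `Leader` and `Follower` classes are hypothetical, and leader election, terms, and log repair are omitted.

```python
class Follower:
    """Hypothetical follower; `reachable` simulates a network partition."""

    def __init__(self, reachable=True):
        self.reachable = reachable
        self.log = []

    def append(self, entry):
        if not self.reachable:
            return False  # no ack: message lost or node partitioned away
        self.log.append(entry)
        return True


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.log = []
        self.committed = []

    def propose(self, entry):
        """Append the entry and commit it only if a majority acks."""
        self.log.append(entry)
        # The leader counts its own log append as one ack.
        acks = 1 + sum(f.append(entry) for f in self.followers)
        cluster_size = 1 + len(self.followers)
        quorum = cluster_size // 2 + 1
        if acks >= quorum:
            self.committed.append(entry)
            return True
        return False
```

In a five-node cluster the quorum is three, so a leader that can reach two followers keeps committing, while a leader stranded with one follower cannot.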
  • 44. Split-Brain ❖ Optimistic (AP) - let partitions work as usual ❖ Pessimistic (CP) - quorum partition works, fence others
  • 45. Hybrid Consistency Models ❖ Weak == available, low latency, stale reads ❖ Strong == fresh reads, less available, high latency ❖ How do you choose a consistency model? ❖ Hybrid models ❖ Weaker models when possible (likes, followers, votes) ❖ Stronger models when necessary ❖ Tunable consistency models (Cassandra, Riak, etc.)
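Tunable consistency in Dynamo-style stores is often summarized by the rule R + W > N: with N replicas, a read that contacts R of them and a write acknowledged by W of them must share at least one replica. The helper below is a sketch of that arithmetic only; `quorums_overlap` is a hypothetical name and not an API of Cassandra or Riak.

```python
def quorums_overlap(n, r, w):
    """True when every read quorum intersects every write quorum (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n
```

With N = 3, choosing R = 1 and W = 1 favors latency and availability but allows stale reads, while R = 2 and W = 2 makes each read overlap the latest acknowledged write at the cost of waiting on more replicas.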
  • 46. Scaling Shared Data ❖ Sharing mutable data at large scale is difficult ❖ Solutions: ❖ Immutable data ❖ Last write wins ❖ Application-level conflict resolution ❖ Causal ordering (e.g. vector clocks) ❖ Distributed data types (CRDTs)
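As an illustration of causal ordering, here is a minimal vector clock sketch in Python. Clocks are plain dicts mapping a node name to a counter; the function names `increment`, `merge`, and `compare` are assumptions made for this example.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum, used when a node receives another node's clock."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

Two writes whose clocks compare as "concurrent" have no causal relationship, so the system has to fall back on one of the other strategies listed, such as last write wins or application-level resolution.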
  • 47. Scaling Shared Data Imagine a shared, global counter… “Get, add 1, and put” transaction will not scale
  • 48. CRDT ❖ Conflict-free Replicated Data Type ❖ Convergent: state-based ❖ Commutative: operations-based ❖ E.g. distributed sets, lists, maps, counters ❖ Update concurrently w/o writer coordination
  • 49. CRDT ❖ CRDTs always converge (provably) ❖ Operations commute (order doesn’t matter) ❖ Highly available, eventually consistent ❖ Always reach consistent state ❖ Drawbacks: ❖ Requires knowledge of all clients ❖ Must be associative, commutative, and idempotent
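The shared counter from the earlier slide is a common first CRDT example. Below is a sketch of a state-based grow-only counter (G-Counter), assuming each replica has a unique id; the class and method names are illustrative.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # max is associative, commutative, and idempotent, so replicas
        # converge regardless of merge order or repetition.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)
```

Each replica only ever writes its own slot, so increments need no coordination. The per-replica slots also show the first drawback listed: the state grows with the number of replicas that have ever written.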
  • 51. CRDT ❖ Add to set is associative, commutative, idempotent ❖ add(“a”), add(“b”), add(“a”) => {“a”, “b”} ❖ Adding and removing items is not ❖ add(“a”), remove(“a”) => {} ❖ remove(“a”), add(“a”) => {“a”} ❖ CRDTs require interpretation of common data structures w/ limitations
  • 52. Two-Phase Set ❖ Use two sets, one for adding, one for removing ❖ Elements can be added once and removed once ❖ { “a”: [“a”, “b”, “c”], “r”: [“a”] } ❖ => {“b”, “c”} ❖ add(“a”), remove(“a”) => {“a”: [“a”], “r”: [“a”]} ❖ remove(“a”), add(“a”) => {“a”: [“a”], “r”: [“a”]}
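The two-set structure above might be sketched in Python as follows. `TwoPhaseSet` and its method names are illustrative, and as on the slide, a remove is recorded even if the add has not been seen yet.

```python
class TwoPhaseSet:
    """State-based 2P-Set: an add set plus a remove (tombstone) set."""

    def __init__(self):
        self.added = set()
        self.removed = set()

    def add(self, item):
        self.added.add(item)

    def remove(self, item):
        # Record a tombstone; once removed, an item can never come back.
        self.removed.add(item)

    def value(self):
        return self.added - self.removed

    def merge(self, other):
        # Set union is associative, commutative, and idempotent.
        self.added |= other.added
        self.removed |= other.removed
```

Because both operations only grow a set, add-then-remove and remove-then-add leave identical state, which is what makes the order irrelevant. The cost is that a removed element cannot be added again.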
  • 55. Distributed architectures allow us to build highly available, fault-tolerant systems.
  • 56. We can't live in this fantasy land where everything works perfectly all of the time.
  • 59. Shit happens — network partitions, hardware failure, GC pauses, latency, dropped packets…
  • 63. Consider the trade-off between consistency and availability.
  • 64. Partition tolerance is not an option, it’s required. (if you’re building a distributed system)
  • 65. Use weak consistency when possible, strong when necessary.
  • 66. Sharing data at scale is hard, let’s go shopping. (or consider your options)
  • 68. Further Readings ❖ Jepsen series, Kyle Kingsbury (aphyr) ❖ A Comprehensive Study of Convergent and Commutative Replicated Data Types, Shapiro et al. ❖ In Search of an Understandable Consensus Algorithm, Ongaro and Ousterhout ❖ CAP Twelve Years Later, Eric Brewer ❖ Many, many more…