Put Your Thinking CAP On

Tomer Gabel, Consulting Engineer at Substrate Software Services
Put Your Thinking CAP On
Tomer Gabel, Wix
JDay Lviv, 2015
Credits
Originally a talk by Yoav Abrahami (Wix)
Based on “Call Me Maybe” by Kyle “Aphyr” Kingsbury
Brewer’s CAP Theorem
[Figure: the three CAP properties: Consistency, Availability and Partition Tolerance]
By Example
• I want this book!
– I add it to the cart
– Then continue browsing
• There’s only one copy in stock!
• … and someone else just bought it.
Consistency
Consistency: Defined
• In a consistent system: all participants see the same value at the same time
• “Do you have this book in stock?”
Consistency: Defined
• If our book store is an inconsistent system:
– Two customers may buy the book
– But there’s only one item in inventory!
• We’ve just violated a business constraint.
Availability
Availability: Defined
• An available system:
– Is reachable
– Responds to requests (within SLA)
• Availability does not guarantee success!
– The operation may fail
– “This book is no longer available”
Availability: Defined
• What if the system is unavailable?
– I complete the checkout
– And click on “Pay”
– And wait
– And wait some more
– And…
• Did I purchase the book or not?!
Partition Tolerance
Partition Tolerance: Defined
• Partition: one or more nodes are unreachable
• No practical system runs on a single node
• So all systems are susceptible!
[Figure: a cluster of five nodes, A through E]
“The Network is Reliable”
• Drops, delays, duplicates and reorders: all four happen in an IP network
• To a client, delays and drops are the same
• Perfect failure detection is provably impossible¹!
[Figure: messages between nodes A and B over time, illustrating drop, delay, duplicate and reorder]
¹ “Impossibility of Distributed Consensus with One Faulty Process”, Fischer, Lynch and Paterson
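Why a delay and a drop look the same to a client can be sketched in a few lines. This is a minimal model, not real networking code: the function `client_view` and its parameters are hypothetical names introduced only for illustration.

```python
from typing import Optional


def client_view(reply_after: Optional[float], timeout: float) -> str:
    """Return what the client observes.

    reply_after is the (hypothetical) time in seconds at which the reply
    would arrive, or None if the message was dropped by the network.
    """
    if reply_after is not None and reply_after <= timeout:
        return "reply"
    # A dropped message and a late message both end up here.
    return "timeout"


dropped = client_view(None, timeout=1.0)   # message never arrives
delayed = client_view(5.0, timeout=1.0)    # message arrives too late
on_time = client_view(0.2, timeout=1.0)    # message arrives in time
print(dropped, delayed, on_time)
```

From the client's side, the first two calls are indistinguishable: all it knows is that the timeout fired.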
Partition Tolerance: Reified
• External causes:
– Bad network config
– Faulty equipment
– Scheduled maintenance
• Even software causes partitions:
– Bad network config
– GC pauses
– Overloaded servers
• Plenty of war stories!
– Netflix
– Twilio
– GitHub
– Wix :-)
• Some hard numbers¹:
– 5.2 failed devices/day
– 59K lost packets/day
– Adding redundancy reduces the median impact of failures by only 40%
¹ “Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications”, Gill et al.
“Proving” CAP
In Pictures
• Let’s consider a simple system:
– Service A writes values
– Service B reads values
– Values are replicated between nodes
• These are “ideal” systems
– Bug-free, predictable
[Figure: service A attached to node 1, service B attached to node 2; both nodes hold V0]
In Pictures
• “Sunny day scenario”:
– A writes a new value V1
– The value is replicated to node 2
– B reads the new value
[Figure: V1 is written by A to node 1, replicated to node 2 and read by B]
In Pictures
• What happens if the network drops?
– A writes a new value V1
– Replication fails
– B still sees the old value
– The system is inconsistent
[Figure: node 1 holds V1 while node 2 still holds V0]
In Pictures
• A possible mitigation is synchronous replication
– A writes a new value V1
– Cannot replicate, so the write is rejected
– Both A and B still see V0
– The system is logically unavailable
[Figure: the write of V1 is rejected; both nodes still hold V0]
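The two partition scenarios above can be replayed with a toy model. This is a sketch under stated assumptions: the `Cluster` class, its `synchronous` flag and the `link_up` switch are hypothetical names, and the "nodes" are plain in-memory objects rather than real replicas.

```python
class Cluster:
    """Two replicas joined by a link that can be cut (illustrative model)."""

    def __init__(self, synchronous: bool):
        self.node1 = "V0"
        self.node2 = "V0"
        self.synchronous = synchronous
        self.link_up = True

    def write(self, value: str) -> bool:
        """Service A writes to node 1. Returns True if the write is accepted."""
        if self.synchronous and not self.link_up:
            return False  # cannot replicate, so the write is rejected
        self.node1 = value
        if self.link_up:
            self.node2 = value  # replication succeeds
        return True

    def read(self) -> str:
        """Service B reads from node 2."""
        return self.node2


# Asynchronous replication during a partition: available but inconsistent.
ap = Cluster(synchronous=False)
ap.link_up = False
ap_accepted = ap.write("V1")

# Synchronous replication during a partition: consistent but unavailable.
cp = Cluster(synchronous=True)
cp.link_up = False
cp_accepted = cp.write("V1")

print(ap_accepted, ap.node1, ap.read())
print(cp_accepted, cp.node1, cp.read())
```

In the first case the write is accepted and the nodes diverge; in the second the nodes agree, but only because the write was refused.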
What does it all mean?
The network is not reliable
• Distributed systems must handle partitions
• Any modern system runs on more than one node…
• … and is therefore distributed
• Ergo, you have to choose:
– Consistency over availability
– Availability over consistency
Granularity
• Real systems comprise many operations
– “Add book to cart”
– “Pay for the book”
• Each has different properties
• It’s a spectrum, not a binary choice!
[Figure: a spectrum between Consistency and Availability, with Shopping Cart and Checkout placed at different points]
CAP IN THE REAL WORLD
Kyle “Aphyr” Kingsbury
Breaking consistency guarantees since 2013
PostgreSQL
• Traditional RDBMS
– Transactional
– ACID compliant
• Primarily a CP system
– Writes against a master node
• “Not a distributed system”
– Except with a client at play!
PostgreSQL
• Writes are a simplified 2PC:
– Client votes to commit
– Server validates transaction
– Server stores changes
– Server acknowledges commit
– Client receives acknowledgement
[Figure: message flow between client, server and store]
PostgreSQL
• But what if the ack is never received?
• The commit is already stored…
• … but the client has no indication!
• The system is in an inconsistent state
[Figure: client, server and store; the client is left with a “?” instead of an acknowledgement]
PostgreSQL
• Let’s experiment!
• 5 clients write to a PostgreSQL instance
• We then drop the server from the network
• Results:
– 1000 writes
– 950 acknowledged
– 952 survivors
So what can we do?
1. Accept false-negatives
– May not be acceptable for your use case!
2. Use idempotent operations
3. Apply unique transaction IDs
– Query state after partition is resolved
• These strategies apply to any RDBMS
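The unique transaction ID strategy can be sketched as follows. This is a minimal in-memory model, not PostgreSQL client code: `Server`, `unreliable_commit`, `client_write` and the failure labels are all hypothetical names chosen for illustration.

```python
import uuid
from typing import Dict, Optional


class Server:
    """In-memory stand-in for a database server (illustrative only)."""

    def __init__(self) -> None:
        self.committed: Dict[str, str] = {}  # transaction ID -> payload

    def commit(self, txid: str, payload: str) -> None:
        self.committed[txid] = payload

    def has(self, txid: str) -> bool:
        return txid in self.committed


def unreliable_commit(server: Server, txid: str, payload: str,
                      failure: Optional[str]) -> bool:
    """Return True only if the client actually receives the ack."""
    if failure == "lost_request":
        return False              # the server never saw the write
    server.commit(txid, payload)
    return failure != "lost_ack"  # stored, but the ack may vanish


def client_write(server: Server, payload: str,
                 failure: Optional[str] = None) -> str:
    txid = str(uuid.uuid4())      # unique ID chosen by the client
    acked = unreliable_commit(server, txid, payload, failure)
    if not acked:
        # After the partition is resolved, query state before retrying.
        if not server.has(txid):
            server.commit(txid, payload)
    return txid


results = {}
for failure in (None, "lost_request", "lost_ack"):
    server = Server()
    txid = client_write(server, "buy book", failure)
    results[failure] = (server.has(txid), len(server.committed))
print(results)
```

Whichever message is lost, the write ends up stored exactly once, because the client can ask "did transaction X commit?" instead of guessing.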
MongoDB
• A document-oriented database
• Availability/scale via replica sets
– Client writes to a master node
– Master replicates writes to n replicas
• User-selectable consistency guarantees
MongoDB
• When a partition occurs:
– If the master is in the minority, it is demoted
– The majority promotes a new master…
– … selected by the highest optime
MongoDB
• The cluster “heals” after partition resolution:
– The “old” master rejoins the cluster
– Acknowledged minority writes are reverted!
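The effect of the rollback can be pictured with a deliberately simplified model. It assumes each node keeps an ordered log of writes and that the new master's log wins; the function `heal` and the list-based logs are illustrative assumptions, not MongoDB's actual mechanism.

```python
from typing import List, Tuple


def heal(old_master_log: List[str],
         new_master_log: List[str]) -> Tuple[List[str], List[str]]:
    """Return (log after healing, writes reverted on the old master)."""
    shared = 0
    for old_entry, new_entry in zip(old_master_log, new_master_log):
        if old_entry != new_entry:
            break
        shared += 1
    # Everything the old master accepted after the histories diverged is lost.
    reverted = old_master_log[shared:]
    return list(new_master_log), reverted


# w3 and w4 were acknowledged by the old master while it was in the minority.
old_master = ["w1", "w2", "w3", "w4"]
new_master = ["w1", "w2", "w5"]
healed_log, reverted = heal(old_master, new_master)
print(healed_log, reverted)
```

The clients that wrote w3 and w4 were told their writes succeeded, yet neither write survives.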
MongoDB
• Let’s experiment!
• Set up a 5-node MongoDB cluster
• 5 clients write to the cluster
• We then partition the cluster
• … and restore it to see what happens
MongoDB
• With write concern unacknowledged:
– Server does not ack writes (except TCP)
– The default prior to November 2012
• Results:
– 6000 writes
– 5700 acknowledged
– 3319 survivors
– 42% data loss!
MongoDB
• With write concern acknowledged:
– Server acknowledges writes (after store)
– The default guarantee
• Results:
– 6000 writes
– 5900 acknowledged
– 3692 survivors
– 37% data loss!
MongoDB
• With write concern replica acknowledged:
– Client specifies minimum replicas
– Server acks after writes to replicas
• Results:
– 6000 writes
– 5695 acknowledged
– 3768 survivors
– 33% data loss!
MongoDB
• With write concern majority:
– For an n-node cluster, requires acknowledgement from a majority of nodes (more than n/2)
– Also called “quorum”
• Results:
– 6000 writes
– 5700 acknowledged
– 5701 survivors
– No data loss
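The arithmetic behind a majority (quorum) is worth spelling out. The sketch below is plain arithmetic, not a MongoDB API; `majority` and `can_accept_writes` are hypothetical helper names.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def can_accept_writes(reachable: int, n: int) -> bool:
    """A side of a partition may accept writes only if it holds a majority."""
    return reachable >= majority(n)


cluster_size = 5
minority_side = can_accept_writes(2, cluster_size)
majority_side = can_accept_writes(3, cluster_size)
print(majority(cluster_size), minority_side, majority_side)
```

Because two disjoint groups of nodes can never both contain more than half of the cluster, at most one side of a partition can acknowledge a majority write.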
So what can we do?
1. Keep calm and carry on
– As Aphyr puts it, “not all applications need consistency”
– Have a reliable backup strategy
– … and make sure you drill restores!
2. Use write concern majority
– And take the performance hit
The prime suspects
• Aphyr’s Jepsen tests include:
– Redis
– Riak
– Zookeeper
– Kafka
– Cassandra
– RabbitMQ
– etcd (and consul)
– ElasticSearch
• If you’re considering them, go read his posts
• In fact, go read his posts regardless
http://aphyr.com/tags/jepsen
STRATEGIES FOR DISTRIBUTED SYSTEMS
Immutable Data
• Immutable (adj.): “Unchanging over time or unable to be changed.”
• Meaning:
– No deletes
– No updates
– No merge conflicts
– Replication is trivial
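One way to see why replication becomes trivial: if records are only ever added, merging two replicas is a set union. The sketch below assumes each record carries a unique ID; the `Event` type and the `replicate` function are hypothetical names for illustration.

```python
from typing import FrozenSet, NamedTuple


class Event(NamedTuple):
    event_id: str      # assumed to be globally unique
    description: str


def replicate(local: FrozenSet[Event],
              remote: FrozenSet[Event]) -> FrozenSet[Event]:
    """Merging immutable records is a union: nothing is updated or deleted."""
    return local | remote


replica_a = frozenset({Event("e1", "book added to cart")})
replica_b = frozenset({Event("e2", "book paid for")})

merged_ab = replicate(replica_a, replica_b)
merged_ba = replicate(replica_b, replica_a)
merged_again = replicate(merged_ab, replica_a)
print(sorted(event.event_id for event in merged_ab))
```

The order of replication does not matter and repeating it is harmless, so there is nothing to conflict over.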
Idempotence
• An idempotent operation:
– Can be applied one or more times with the same effect
• Enables retries
• Not always possible
– Side-effects are key
– Consider: payments
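The difference shows up as soon as a request is retried. In this minimal sketch, `set_quantity` and `add_one` are hypothetical cart operations invented for the example.

```python
from typing import Dict


def set_quantity(cart: Dict[str, int], item: str, quantity: int) -> Dict[str, int]:
    """Idempotent: applying it again leaves the cart unchanged."""
    return {**cart, item: quantity}


def add_one(cart: Dict[str, int], item: str) -> Dict[str, int]:
    """Not idempotent: every application changes the result."""
    return {**cart, item: cart.get(item, 0) + 1}


empty_cart: Dict[str, int] = {}

once = set_quantity(empty_cart, "book", 1)
retried_set = set_quantity(once, "book", 1)      # safe to retry

retried_add = add_one(add_one(empty_cart, "book"), "book")  # retry adds a second copy
print(once, retried_set, retried_add)
```

A payment behaves like `add_one`: retrying it blindly charges the customer twice, which is why it needs extra machinery such as a unique transaction ID.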
Eventual Consistency
• A design which prefers availability
• … but guarantees that clients will eventually see consistent reads
• Consider git:
– Always available locally
– Converges via push/pull
– Human conflict resolution
Eventual Consistency
• The system expects data to diverge
• … and includes mechanisms to regain convergence
– Partial ordering to minimize conflicts
– A merge function to resolve conflicts
Vector Clocks
• A technique for partial ordering
• Each node has a logical clock
– The clock increases on every write
– Track the last observed clocks for each item
– Include this vector on replication
• When neither the observed nor the inbound vector descends from the other, we have a conflict
• This lets us know when history diverged
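A compact sketch of the comparison, assuming a vector clock is a mapping from node name to counter. The helper names `increment`, `descends`, `compare` and `merge` are illustrative, not taken from any particular library.

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with this node's counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if a has seen everything b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "after"
    if b_ge_a:
        return "before"
    return "conflict"  # history diverged: neither descends from the other


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Clock to attach once a conflict has been resolved."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}


base = increment({}, "n1")        # first write, on node n1
left = increment(base, "n1")      # n1 writes again
right = increment(base, "n2")     # n2 writes concurrently
print(compare(left, base), compare(base, left), compare(left, right))
print(merge(left, right))
```

The clocks only detect that `left` and `right` diverged; deciding which value wins is still up to the merge function of the application.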
CRDTs
• Commutative Replicated Data Types¹ (the acronym is also expanded as “Convergent” or “Conflict-free”)
• A CRDT is a data structure that:
– Eventually converges to a consistent state
– Guarantees no conflicts on replication
¹ “A comprehensive study of Convergent and Commutative Replicated Data Types”, Shapiro et al.
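A grow-only counter is a small example of such a structure. The sketch below assumes each replica only ever increments its own slot and that merging takes the per-node maximum; the class name `GCounter` and its methods are illustrative.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: one slot per node, merged by taking the maximum."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # max() is commutative, associative and idempotent, so replicas
        # converge regardless of replication order or repetition.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a = GCounter("a")
b = GCounter("b")
a.increment()
a.increment()
b.increment()

a.merge(b)
b.merge(a)
a.merge(b)  # replaying a merge changes nothing

total_a, total_b = a.value(), b.value()
print(total_a, total_b)
```

Both replicas arrive at the same total without coordinating, and no merge can ever produce a conflict.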
CRDTs
• CRDTs provide specialized semantics:
– G-Counter: Monotonically increasing counter
– PN-Counter: Also supports decrements
– G-Set: A set that only supports adds
– 2P-Set: Supports removals, but a removed element can never be added again
• OR-Sets are particularly useful
– Keeps track of both additions and removals
– Can be used for shopping carts
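A shopping cart built on an OR-Set might look like the sketch below. It is one common formulation (every add gets a unique tag, a remove only covers the tags that replica has seen), written under that assumption; the class `ORSet` and its method names are illustrative.

```python
import uuid
from typing import Dict, Set


class ORSet:
    """Observed-remove set: concurrent add and remove resolve to 'add wins'."""

    def __init__(self) -> None:
        self.adds: Dict[str, Set[str]] = {}  # element -> unique add tags
        self.removed: Set[str] = set()       # tags that have been removed

    def add(self, element: str) -> None:
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element: str) -> None:
        # Only the add tags this replica has observed are removed.
        self.removed |= self.adds.get(element, set())

    def items(self) -> Set[str]:
        return {e for e, tags in self.adds.items() if tags - self.removed}

    def merge(self, other: "ORSet") -> None:
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.removed |= other.removed


cart_a, cart_b = ORSet(), ORSet()

cart_a.add("book")
cart_b.merge(cart_a)      # both replicas now see the book

cart_b.remove("book")     # replica B removes it...
cart_a.add("book")        # ...while replica A concurrently adds it again

cart_a.merge(cart_b)
cart_b.merge(cart_a)
items_a, items_b = cart_a.items(), cart_b.items()
print(items_a, items_b)
```

The concurrent add survives because its tag was never observed by the replica that issued the remove, so both carts converge on containing the book.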
Questions?
Complaints?
WE’RE DONE HERE!
Thank you for listening
tomer@tomergabel.com
@tomerg
http://il.linkedin.com/in/tomergabel
Aphyr’s “Call Me Maybe” blog posts:
http://aphyr.com/tags/jepsen
Put Your Thinking CAP On

  • 1. Put Your Thinking CAP On Tomer Gabel, Wix JDay Lviv, 2015
  • 2. Credits Originally a talk by Yoav Abrahami (Wix) Based on “Call Me Maybe” by Kyle “Aphyr” Kingsbury
  • 5. By Example • I want this book! – I add it to the cart – Then continue browsing • There’s only one copy in stock!
  • 6. By Example • I want this book! – I add it to the cart – Then continue browsing • There’s only one copy in stock! • … and someone else just bought it.
  • 8. Consistency: Defined • In a consistent system: All participants see the same value at the same time • “Do you have this book in stock?”
  • 9. Consistency: Defined • If our book store is an inconsistent system: – Two customers may buy the book – But there’s only one item in inventory! • We’ve just violated a business constraint.
  • 11. Availability: Defined • An available system: – Is reachable – Responds to requests (within SLA) • Availability does not guarantee success! – The operation may fail – “This book is no longer available”
  • 12. Availability: Defined • What if the system is unavailable? – I complete the checkout – And click on “Pay” – And wait – And wait some more – And… • Did I purchase the book or not?!
  • 14. Partition Tolerance: Defined • Partition: one or more nodes are unreachable • No practical system runs on a single node • So all systems are susceptible! (Diagram: nodes A, B, C, D and E.)
  • 15. “The Network is Reliable” • All four happen in an IP network • To a client, delays and drops are the same • Perfect failure detection is provably impossible [1]! (Diagram: messages between A and B over time, showing drop, delay, duplicate and reorder.) [1] “Impossibility of Distributed Consensus with One Faulty Process”, Fischer, Lynch and Paterson
  • 16. Partition Tolerance: Reified • External causes: – Bad network config – Faulty equipment – Scheduled maintenance • Even software causes partitions: – Bad network config – GC pauses – Overloaded servers • Plenty of war stories! – Netflix – Twilio – GitHub – Wix :-) • Some hard numbers [1]: – 5.2 failed devices/day – 59K lost packets/day – Adding redundancy only improves by 40% [1] “Understanding Network Failures in Data Centers: Measurement, Analysis, and Implications”, Gill et al.
  • 18. In Pictures • Let’s consider a simple system: – Service A writes values – Service B reads values – Values are replicated between nodes • These are “ideal” systems – Bug-free, predictable (Diagram: Node 1 holds V0, with service A; Node 2 holds V0, with service B.)
  • 19. In Pictures • “Sunny day scenario”: – A writes a new value V1 – The value is replicated to node 2 – B reads the new value (Diagram: V1 is written by A to Node 1, replicated to Node 2 and read by B.)
  • 20. In Pictures • What happens if the network drops? – A writes a new value V1 – Replication fails – B still sees the old value – The system is inconsistent (Diagram: Node 1 holds V1 while Node 2 still holds V0.)
  • 21. In Pictures • Possible mitigation is synchronous replication – A writes a new value V1 – Cannot replicate, so write is rejected – Both A and B still see V0 – The system is logically unavailable (Diagram: the write of V1 is rejected; both nodes still hold V0.)
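The three scenarios in slides 19 to 21 can be played out in a toy model. This is only a sketch under simplifying assumptions: the `Cluster` class, its `partitioned` flag and the `synchronous` switch are hypothetical names invented for illustration, and a real replication protocol is far more involved.

```python
class Cluster:
    """Toy two-node system: A writes to node 1, B reads from node 2."""

    def __init__(self, synchronous: bool) -> None:
        self.node1 = "V0"
        self.node2 = "V0"
        self.synchronous = synchronous
        self.partitioned = False  # True when node 1 cannot reach node 2

    def write(self, value: str) -> bool:
        """Service A writes a value; returns True if the write was accepted."""
        if self.partitioned:
            if self.synchronous:
                # Cannot replicate, so the write is rejected (consistent, unavailable).
                return False
            # Accept locally and skip replication (available, inconsistent).
            self.node1 = value
            return True
        # Sunny day: the value reaches both nodes.
        self.node1 = value
        self.node2 = value
        return True

    def read(self) -> str:
        """Service B reads from node 2."""
        return self.node2


# Asynchronous replication during a partition: the write succeeds, B sees stale data.
async_cluster = Cluster(synchronous=False)
async_cluster.partitioned = True
async_accepted = async_cluster.write("V1")
async_read = async_cluster.read()

# Synchronous replication during a partition: the write is refused, both nodes agree.
sync_cluster = Cluster(synchronous=True)
sync_cluster.partitioned = True
sync_accepted = sync_cluster.write("V1")
sync_read = sync_cluster.read()
```

In the asynchronous case node 1 holds V1 while B still reads V0; in the synchronous case nothing diverges, but A cannot make progress until the partition heals.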
  • 22. What does it all mean?
  • 23. The network is not reliable • Distributed systems must handle partitions • Any modern system runs on >1 nodes… • … and is therefore distributed • Ergo, you have to choose: – Consistency over availability – Availability over consistency
  • 24. Granularity • Real systems comprise many operations – “Add book to cart” – “Pay for the book” • Each has different properties • It’s a spectrum, not a binary choice! (Diagram: a spectrum between Consistency and Availability, with Shopping Cart and Checkout placed along it.)
  • 25. CAP IN THE REAL WORLD Kyle “Aphyr” Kingsbury Breaking consistency guarantees since 2013
  • 26. PostgreSQL • Traditional RDBMS – Transactional – ACID compliant • Primarily a CP system – Writes against a master node • “Not a distributed system” – Except with a client at play!
  • 27. PostgreSQL • Writes are a simplified 2PC: – Client votes to commit – Server validates transaction – Server stores changes – Server acknowledges commit – Client receives acknowledgement (Diagram: Client, Server, Store.)
  • 28. PostgreSQL • But what if the ack is never received? • The commit is already stored… • … but the client has no indication! • The system is in an inconsistent state (Diagram: Client, Server, Store, with the acknowledgement to the client marked “?”.)
  • 29. PostgreSQL • Let’s experiment! • 5 clients write to a PostgreSQL instance • We then drop the server from the network • Results: – 1000 writes – 950 acknowledged – 952 survivors
  • 30. So what can we do? 1. Accept false-negatives – May not be acceptable for your use case! 2. Use idempotent operations 3. Apply unique transaction IDs – Query state after partition is resolved • These strategies apply to any RDBMS
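Strategies 2 and 3 from slide 30 can be combined: the client generates a unique transaction ID up front, the write is keyed on that ID so a retry is harmless, and after the partition the client can ask whether the ID was stored. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for an RDBMS; the `orders` table and the function names are hypothetical. `INSERT OR IGNORE` is SQLite syntax; PostgreSQL expresses the same idea with `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3
import uuid


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (txn_id TEXT PRIMARY KEY, book TEXT NOT NULL)"
    )
    return conn


def place_order(conn: sqlite3.Connection, txn_id: str, book: str) -> None:
    """Idempotent write: repeating it with the same txn_id changes nothing."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO orders (txn_id, book) VALUES (?, ?)",
            (txn_id, book),
        )


def was_committed(conn: sqlite3.Connection, txn_id: str) -> bool:
    """Query state after the partition is resolved."""
    row = conn.execute(
        "SELECT 1 FROM orders WHERE txn_id = ?", (txn_id,)
    ).fetchone()
    return row is not None


conn = open_store()
txn_id = str(uuid.uuid4())  # generated by the client before the first attempt

place_order(conn, txn_id, "the-book")
# Suppose the acknowledgement was lost: the client simply retries with the same ID.
place_order(conn, txn_id, "the-book")

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
committed = was_committed(conn, txn_id)
```

The retry leaves a single row behind, and `was_committed` lets the client resolve the “did my commit go through?” question without guessing.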
  • 31. MongoDB • A document-oriented database • Availability/scale via replica sets – Client writes to a master node – Master replicates writes to n replicas • User-selectable consistency guarantees
  • 32. MongoDB • When a partition occurs: – If the master is in the minority, it is demoted – The majority promotes a new master… – … selected by the highest optime
  • 33. MongoDB • The cluster “heals” after partition resolution: – The “old” master rejoins the cluster – Acknowledged minority writes are reverted!
  • 34. MongoDB • Let’s experiment! • Set up a 5-node MongoDB cluster • 5 clients write to the cluster • We then partition the cluster • … and restore it to see what happens
  • 35. MongoDB • With write concern unacknowledged: – Server does not ack writes (except TCP) – The default prior to November 2012 • Results: – 6000 writes – 5700 acknowledged – 3319 survivors – 42% data loss!
  • 36. MongoDB • With write concern acknowledged: – Server acknowledges writes (after store) – The default guarantee • Results: – 6000 writes – 5900 acknowledged – 3692 survivors – 37% data loss!
  • 37. MongoDB • With write concern replica acknowledged: – Client specifies minimum replicas – Server acks after writes to replicas • Results: – 6000 writes – 5695 acknowledged – 3768 survivors – 33% data loss!
  • 38. MongoDB • With write concern majority: – For an n-node cluster, requires at least n/2 replicas – Also called “quorum” • Results: – 6000 writes – 5700 acknowledged – 5701 survivors – No data loss
  • 39. So what can we do? 1. Keep calm and carry on – As Aphyr puts it, “not all applications need consistency” – Have a reliable backup strategy – … and make sure you drill restores! 2. Use write concern majority – And take the performance hit
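Why write concern majority survives the partition comes down to simple arithmetic: two sides of a split cluster can never both hold a majority. The sketch below is illustrative only and does not use any MongoDB API; `majority` and `can_acknowledge` are hypothetical helpers, and node counts include the master itself.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes (master included) that forms a majority."""
    return cluster_size // 2 + 1


def can_acknowledge(reachable_nodes: int, cluster_size: int) -> bool:
    """A majority write is acknowledged only if enough nodes stored it."""
    return reachable_nodes >= majority(cluster_size)


# A 5-node cluster split into a 2-node side and a 3-node side.
cluster_size = 5
minority_side_ok = can_acknowledge(2, cluster_size)
majority_side_ok = can_acknowledge(3, cluster_size)
```

Only the 3-node side can acknowledge writes, so no write acknowledged during the partition lives solely on the side that is later rolled back.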
  • 40. The prime suspects • Aphyr’s Jepsen tests include: – Redis – Riak – Zookeeper – Kafka – Cassandra – RabbitMQ – etcd (and consul) – ElasticSearch • If you’re considering them, go read his posts • In fact, go read his posts regardless http://aphyr.com/tags/jepsen
  • 42. Immutable Data • Immutable (adj.): “Unchanging over time or unable to be changed.” • Meaning: – No deletes – No updates – No merge conflicts – Replication is trivial
  • 43. Idempotence • An idempotent operation: – Can be applied one or more times with the same effect • Enables retries • Not always possible – Side-effects are key – Consider: payments
  • 44. Eventual Consistency • A design which prefers availability • … but guarantees that clients will eventually see consistent reads • Consider git: – Always available locally – Converges via push/pull – Human conflict resolution
  • 45. Eventual Consistency • The system expects data to diverge • … and includes mechanisms to regain convergence – Partial ordering to minimize conflicts – A merge function to resolve conflicts
  • 46. Vector Clocks • A technique for partial ordering • Each node has a logical clock – The clock increases on every write – Track the last observed clocks for each item – Include this vector on replication • When observed and inbound vectors have no common ancestor, we have a conflict • This lets us know when history diverged
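A minimal sketch of the comparison step, assuming a vector clock is represented as a plain dictionary from node name to counter. The function names are hypothetical, and a “conflict” is modelled here as the usual vector-clock condition: neither clock has seen everything the other has seen.

```python
from typing import Dict

VectorClock = Dict[str, int]


def increment(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with the given node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if a has observed every write that b has observed."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify the relationship between two clocks."""
    a_ge, b_ge = descends(a, b), descends(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "after"
    if b_ge:
        return "before"
    return "concurrent"  # history diverged: a merge function is needed


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used once a conflict has been resolved."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


base = increment({}, "node1")          # first write, on node1
on_node1 = increment(base, "node1")    # node1 writes again
on_node2 = increment(base, "node2")    # node2 writes without seeing node1's update

linear = compare(on_node1, base)
diverged = compare(on_node1, on_node2)
merged = merge(on_node1, on_node2)
```

Here `on_node1` simply follows `base`, whereas `on_node1` and `on_node2` are concurrent, which is the signal that history diverged.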
  • 47. CRDTs • Commutative Replicated Data Types [1] • A CRDT is a data structure that: – Eventually converges to a consistent state – Guarantees no conflicts on replication [1] “A comprehensive study of Convergent and Commutative Replicated Data Types”, Shapiro et al.
  • 48. CRDTs • CRDTs provide specialized semantics: – G-Counter: Monotonically increasing counter – PN-Counter: Also supports decrements – G-Set: A set that only supports adds – 2P-Set: Supports removals but only once • OR-Sets are particularly useful – Keeps track of both additions and removals – Can be used for shopping carts
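Two of the types listed above can be sketched in a few lines. This is a simplified, state-based illustration rather than a production design: the class and method names are hypothetical, tombstones are never garbage-collected, and the OR-Set tags each addition with a random identifier so that a removal only affects additions it has actually observed.

```python
import uuid
from typing import Dict, Set, Tuple


class GCounter:
    """Grow-only counter: each node increments only its own slot."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum is commutative, associative and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


class ORSet:
    """Observed-remove set: tracks additions and removals as tagged pairs."""

    def __init__(self) -> None:
        self.adds: Set[Tuple[str, str]] = set()     # (element, unique tag)
        self.removes: Set[Tuple[str, str]] = set()  # tombstones for observed adds

    def add(self, element: str) -> None:
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element: str) -> None:
        # Only tags this replica has seen are tombstoned.
        self.removes |= {(e, tag) for (e, tag) in self.adds if e == element}

    def items(self) -> Set[str]:
        return {e for (e, _tag) in self.adds - self.removes}

    def merge(self, other: "ORSet") -> None:
        self.adds |= other.adds
        self.removes |= other.removes


# Two replicas count independently, then converge.
views_a, views_b = GCounter("a"), GCounter("b")
views_a.increment(2)
views_b.increment(3)
views_a.merge(views_b)
views_b.merge(views_a)

# A shopping cart replicated on two nodes.
cart_a, cart_b = ORSet(), ORSet()
cart_a.add("book")
cart_b.merge(cart_a)
cart_b.remove("book")   # replica b removes the book it has seen...
cart_a.add("book")      # ...while replica a concurrently adds it again
cart_a.merge(cart_b)
cart_b.merge(cart_a)
```

After the merges both counters read 5, and both carts contain the book: the concurrent re-add carries a tag the removal never observed, so it survives without any conflict to resolve.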
  • 50. WE’RE DONE HERE! Thank you for listening tomer@tomergabel.com @tomerg http://il.linkedin.com/in/tomergabel Aphyr’s “Call Me Maybe” blog posts: http://aphyr.com/tags/jepsen

Editor's Notes

  1. Image source: http://en.wikipedia.org/wiki/File:Seuss-cat-hat.gif
  2. Image source: http://en.wikipedia.org/wiki/File:Seuss-cat-hat.gif
  3. Photo source: http://pixabay.com/en/meerkat-zoo-animal-sand-desert-363051/
  4. Photo source: Unknown
  5. Image source: https://www.flickr.com/photos/framesofmind/8541529818/
  6. Image source: http://duelingcouches.blogspot.com/2008/12/patiently-waiting.html
  7. Image source: http://anapt.deviantart.com/art/together-157107893
  8. Image source: https://www.flickr.com/photos/infocux/8450190120/in/set-72157632701634780
  9. Image source: http://en.wikipedia.org/wiki/Great_Pyramid_of_Giza#mediaviewer/File:Kheops-Pyramid.jpg
  10. Image source: http://2.bp.blogspot.com/--VVPUQ06BaQ/TzmEacERFoI/AAAAAAAAEzE/e2QPIrRWQAg/s1600/washrinse.jpg
  11. Photo source: https://www.flickr.com/photos/luschei/1569384007