Eventually-Consistent Data Structures
Sean Cribbs
@seancribbs #CRDT
Berlin Buzzwords 2012
I work for Basho
We make Riak
Riak is Eventually Consistent
So are Voldemort and Cassandra
Eventual Consistency
Replicated
Loose coordination
Forward progression
Eventual is Good

   ✔ Fault-tolerant
   ✔ Highly available
   ✔ Low-latency
Consistency?
No clear winner!
Throw one out? (Cassandra)
Keep both? (Riak & Voldemort)
Conflicts!
A!     B!
Now what?
Semantic Resolution
• Your app knows the domain - use business rules to resolve
• Amazon Dynamo’s shopping cart
“Ad hoc approaches have proven brittle and error-prone”
Conflict-Free Replicated Data Types
Conflict-Free: resolves automatically toward a single value
Replicated: multiple independent copies
Data Types: useful abstractions
CRDT Flavors
• Convergent: State
 • Weak messaging requirements
• Commutative: Operations
 • Reliable broadcast required
 • Causal ordering sufficient
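As a rough illustration of why the two flavors place different demands on messaging, here is a minimal Python sketch. The names `merge_state` and `apply_op` are invented for this example and do not come from the talk or from any library. It assumes a counter-like value: a state-based merge absorbs a duplicated delivery, while a replayed operation changes the result, which is one reason the operation-based flavor leans on reliable broadcast.

```python
def merge_state(local, remote):
    """State-based (convergent) flavor: merge per-replica counts by taking the max."""
    return {r: max(local.get(r, 0), remote.get(r, 0))
            for r in sorted(local.keys() | remote.keys())}


def apply_op(total, delta):
    """Operation-based (commutative) flavor: apply a single increment operation."""
    return total + delta


# State-based: receiving the same state twice makes no difference.
local = {"a": 2}
incoming = {"a": 1, "b": 1}
merged_once = merge_state(local, incoming)
merged_twice = merge_state(merged_once, incoming)

# Operation-based: delivery order is irrelevant, but a duplicate is not.
ops = [1, 2, 3]
forward = 0
for op in ops:
    forward = apply_op(forward, op)
backward = 0
for op in reversed(ops):
    backward = apply_op(backward, op)
duplicated = apply_op(forward, ops[0])  # the first op delivered a second time

print(merged_once == merged_twice)  # True
print(forward == backward)          # True
print(duplicated == forward)        # False
```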
Convergent CRDTs
Commutative CRDTs
Registers
A place to put your stuff
Registers

• Last-Write-Wins (LWW-Register)
 • e.g. Columns in Cassandra
• Multi-Valued (MV-Register)
 • e.g. Objects (values) in Riak
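A Last-Write-Wins register can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Cassandra implements columns: the class name `LWWRegister`, the integer `timestamp` supplied by the writer, and the `replica` name used as a tie-breaker are all hypothetical choices made for the example. The MV-Register is not shown because it needs version vectors.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LWWRegister:
    value: Any = None
    timestamp: int = 0   # assumed: a clock value supplied by the writer
    replica: str = ""    # assumed tie-breaker when timestamps are equal

    def assign(self, value, timestamp, replica):
        """Return a new register holding the written value."""
        return LWWRegister(value, timestamp, replica)

    def merge(self, other):
        """Keep whichever write has the greater (timestamp, replica) pair."""
        return max(self, other, key=lambda r: (r.timestamp, r.replica))


empty = LWWRegister()
from_a = empty.assign("red", timestamp=10, replica="a")
from_b = empty.assign("blue", timestamp=12, replica="b")

# Merging in either direction picks the same winner.
print(from_a.merge(from_b).value)  # blue
print(from_b.merge(from_a).value)  # blue
```

Note that the concurrent write of "red" is silently discarded, which is the "throw one out" trade-off described earlier.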
Counters
 Keeping tabs
G-Counter
// Starts empty
[]

// A increments twice, forwarding state
[{a,1}] // == 1
[{a,2}] // == 2

                 // B increments
                 [{b,1}] // == 1

// Merging
[{a,2}, {b,1}]     [{a,1}, {b,1}]
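The walkthrough above can be replayed with a small Python sketch. The class `GCounter` and its method names are hypothetical, chosen for this example. It also assumes, as one reading of the right-hand column, that B merged only the first state A forwarded.

```python
class GCounter:
    """Grow-only counter: one non-negative count per replica."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, replica, amount=1):
        # Each replica only ever bumps its own entry.
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[replica] = self.counts.get(replica, 0) + amount

    def value(self):
        # The counter's value is the sum over all replicas.
        return sum(self.counts.values())

    def merge(self, other):
        # Converge by taking the per-replica maximum.
        keys = sorted(self.counts.keys() | other.counts.keys())
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                         for k in keys})


a = GCounter()
a.increment("a")
first_forwarded = GCounter(a.counts)   # snapshot of [{a,1}]
a.increment("a")                       # [{a,2}]

b = GCounter()
b.increment("b")                       # [{b,1}]

a_merged = a.merge(b)                  # [{a,2}, {b,1}]
b_merged = b.merge(first_forwarded)    # [{a,1}, {b,1}]
converged = b_merged.merge(a)          # B catches up once it sees [{a,2}]

print(a_merged.value())   # 3
print(b_merged.value())   # 2
print(converged.value())  # 3
```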
PN-Counter
// A PN-Counter
{
  P = [{a,10},{b,2}],
  N = [{a,1},{c,5}]
}
// == (10+2)-(1+5) == 12-6 == 6
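A minimal Python sketch of the same idea, treating the PN-Counter as a pair of grow-only maps: `P` for increments and `N` for decrements. The class `PNCounter` and the helper `merge_max` are hypothetical names for this example.

```python
def merge_max(x, y):
    """Per-replica maximum of two count maps."""
    return {r: max(x.get(r, 0), y.get(r, 0)) for r in sorted(x.keys() | y.keys())}


class PNCounter:
    def __init__(self, p=None, n=None):
        self.p = dict(p or {})  # increments per replica
        self.n = dict(n or {})  # decrements per replica

    def increment(self, replica, amount=1):
        self.p[replica] = self.p.get(replica, 0) + amount

    def decrement(self, replica, amount=1):
        # A decrement grows N; neither map ever shrinks.
        self.n[replica] = self.n.get(replica, 0) + amount

    def value(self):
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other):
        return PNCounter(merge_max(self.p, other.p), merge_max(self.n, other.n))


counter = PNCounter(p={"a": 10, "b": 2}, n={"a": 1, "c": 5})
print(counter.value())  # 6, matching (10+2)-(1+5)

counter.decrement("b")
print(counter.value())  # 5
```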
Sets
Members Only
G-Set
// Starts empty
{}

// A adds a and b, forwarding state
{a}
{a,b}

             // B adds c
             {c}

// Merging
{a,b,c}        {a,c}
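The G-Set merge is plain set union, which the following Python sketch replays. `GSet` is a hypothetical name for this example, and the right-hand column is read here as B having merged only A's first forwarded state `{a}`.

```python
class GSet:
    """Grow-only set: elements can be added but never removed."""

    def __init__(self, items=()):
        self.items = frozenset(items)

    def add(self, item):
        return GSet(self.items | {item})

    def merge(self, other):
        # Union is commutative, associative and idempotent.
        return GSet(self.items | other.items)


a1 = GSet().add("a")       # {a}, first state A forwards
a2 = a1.add("b")           # {a,b}
b1 = GSet().add("c")       # {c}

a_merged = a2.merge(b1)    # {a,b,c}
b_merged = b1.merge(a1)    # {a,c}

print(sorted(a_merged.items))                  # ['a', 'b', 'c']
print(sorted(b_merged.items))                  # ['a', 'c']
print(sorted(b_merged.merge(a_merged).items))  # ['a', 'b', 'c']
```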
2P-Set
// Starts empty
{A={},R={}}

// A adds a and b, forwarding state,
// removes a
{A={a}, R={}}   // == {a}
{A={a,b},R={}} // == {a,b}
{A={a,b},R={a}} // == {b}

              // B adds c
              {A={c},R={}} // == {c}
// Merging
{A={a,b,c},R={a}}   {A={a,c}, R={}}
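A Python sketch of the two-phase set follows, using the slide's `A` (added) and `R` (removed) sets. `TwoPSet` and its method names are hypothetical. The replay assumes B merged only A's first forwarded state, which is one reading of the right-hand column above.

```python
class TwoPSet:
    """Two-phase set: an add set and a remove set, both grow-only."""

    def __init__(self, added=(), removed=()):
        self.added = frozenset(added)
        self.removed = frozenset(removed)

    def add(self, item):
        return TwoPSet(self.added | {item}, self.removed)

    def remove(self, item):
        # Only elements currently visible can be removed.
        if item not in self.value():
            return self
        return TwoPSet(self.added, self.removed | {item})

    def value(self):
        return self.added - self.removed

    def merge(self, other):
        return TwoPSet(self.added | other.added, self.removed | other.removed)


a1 = TwoPSet().add("a")            # {A={a},R={}}
a3 = a1.add("b").remove("a")       # {A={a,b},R={a}}
b1 = TwoPSet().add("c")            # {A={c},R={}}

a_merged = a3.merge(b1)            # {A={a,b,c},R={a}}
b_merged = b1.merge(a1)            # {A={a,c},R={}}
converged = a_merged.merge(b_merged)

print(sorted(a_merged.value()))    # ['b', 'c']
print(sorted(b_merged.value()))    # ['a', 'c']
print(sorted(converged.value()))   # ['b', 'c']

# Once removed, an element cannot come back: the remove set always wins.
print(sorted(converged.add("a").value()))  # ['b', 'c']
```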
LWW-Element-Set
OR-Set
Graphs
G = (V,E), E ⊆ V×V
Use-Cases

• Social graph (OR-Set or a Graph)
• Web page visits (G-Counter)
• Shopping Cart (Modified OR-Set)
• “Like” button (U-Set)
Challenges: GC


• CRDTs are inefficient
• Synchronization may be required
Challenges: Responsibility
• Client
 • Erlang: mochi/statebox
 • Clojure: reiddraper/knockbox
 • Ruby: aphyr/meangirls
• Server
 • Very few options
Thanks

Editor's Notes

  4. In an eventually consistent system, you tend to have multiple copies of the same datum, which means that it’s replicated. These systems also tend to allow loose coordination and things like sloppy quorums, since they don’t require expensive multi-phase commit protocols. This also makes them resilient to network partitions. Eventually consistent systems must also include a means for state to move forward when staleness is detected. In Dynamo-like systems, this is usually done with read-repair, that is, writing the newer value to stale replicas when reading.
  5. While not as simple to understand as an ACID system, eventual consistency has many practical benefits. When encountering failures, especially network-related ones, the system can more often remain available for reads and writes. In the same vein, relying on dynamic participation in operations lends itself to systems with low, consistent latency, because only promptly-responding replicas need to be considered.
  6. Of course, the tradeoff for those benefits, thanks to the CAP theorem, is that you sacrifice strict consistency. There is no total ordering of events in the system, you have no transactions, and you have weak guarantees of delivery. This means it’s incredibly difficult to decide who wins when there are concurrent writes in the system. Neither of the usual solutions is ideal: the first is to throw one version out by applying an arbitrary ordering, usually a timestamp of sorts; the second is to keep both values around and let the user decide. These are the approaches of Cassandra and of Riak/Voldemort, respectively.
  8. So maybe you chose Riak or Voldemort, and you get write conflicts (Riak calls them siblings). Now that you’ve got both values, how does your application decide what the real state should be?
  9. One strategy, which I call “semantic resolution”, is to say that your application encodes the domain of the problem and so it can use business rules to resolve the conflict. This is the strategy implemented by the “shopping cart” described in the Amazon Dynamo paper. It merges toward the maximum quantity of each item in the cart; however, it exhibits some problems, namely that items that were removed from the cart can sometimes reappear! From Amazon’s point of view this is okay because it might encourage the customer to buy more, but it is a bewildering user experience!

     Fortunately, there is some interesting recent research about a more rigorous approach to eventual consistency.
  10. They are sometimes called Conflict-Free Replicated Data Types. This basically means that instead of strictly opaque values, the datastore provides useful abstract data structures. Since we’re in an eventually consistent system, the data structure is replicated to multiple locations, all of which act independently. But by far the most compelling part is that these data structures have the ability to resolve automatically toward a single value, given any number of conflicting values at individual replicas.
  13. The primary work on this research has been done by two researchers at INRIA and their colleagues in Portugal. Marc Shapiro also gave a great talk on the subject at Microsoft Research called “Strong Eventual Consistency” which you can easily find online.\n\nThe paper above is where I’ve gotten most of the content and diagrams, but I’ve tried to simplify the content so that we can get through it in 40 minutes. If you want the real thing, search for <title>, it’s free to download.\n
  14. There are two flavors of CRDTs as you might have noticed. They both provide the same conflict-free property, but differ in their implementation strategy.\n\nConvergent types are based on a local modification of state, followed by forwarding the resulting state downstream, where a merge operation is performed at other replicas. The state itself encodes all information needed to converge. They are great for systems with weak message delivery guarantees - for example, a Dynamo-style system. Convergent types can also be resolved in clients, which is helpful for systems that do not provide rich datatypes.\n\nCommutative types, on the other hand, replicate commutative operations rather than state, and tend to rely on systems with reliable broadcast (that assures operations reach all replicas). Operations are generally not required to have a total ordering -- a local causal ordering is sufficient.\n
  15. This diagram from the paper shows the basic format of a convergent, state-based CRDT. Note how the mutation is applied locally, and the resulting state is then forwarded downstream, where it is merged into the receiving replica’s state. As long as all replicas eventually receive states that include all mutations, they will converge on the same value.
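The state-based pattern can be sketched with a deliberately tiny type. This is an illustration only, not something from the talk or the paper: `MaxInt`, `update` and `merge` are hypothetical names, and the merge is simply `max()`, chosen because it is commutative, associative and idempotent, which is what lets replicas exchange states in any order, any number of times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaxInt:
    """Toy state-based CRDT (hypothetical): the state is one integer."""
    value: int = 0

    def update(self, n: int) -> "MaxInt":
        # Local mutation: the state can only move upward.
        return MaxInt(max(self.value, n))

    def merge(self, other: "MaxInt") -> "MaxInt":
        # Merge of two replica states; order and repetition do not matter.
        return MaxInt(max(self.value, other.value))


# Three replicas mutate independently, then exchange states.
a = MaxInt().update(3)
b = MaxInt().update(7)
c = MaxInt().update(5)

left = a.merge(b).merge(c)
right = c.merge(a.merge(b))
```

Both `left` and `right` end up in the same state regardless of the order in which the states were merged.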
  16. Again, commutative types forward operations to other replicas, not the state. Unlike the convergent type, they require a reliable broadcast channel: if an operation is not delivered, or is applied out of order at a replica, the states don’t converge. As long as functions f() and g() commute, state will converge.
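A small sketch of what "f() and g() commute" means in practice. The functions below are made up for illustration: `f` and `g` are additions, which commute, while `h` (a doubling) does not commute with `f`, so replicas applying them in different orders would diverge.

```python
from functools import reduce


def apply_all(state, ops):
    """Apply a sequence of operations (plain functions) to a state, in order."""
    return reduce(lambda s, op: op(s), ops, state)


def f(s):
    return s + 5  # commutes with g


def g(s):
    return s - 2  # commutes with f


def h(s):
    return s * 2  # does NOT commute with f


# Two replicas receive the same operations in different orders.
replica_1 = apply_all(0, [f, g])
replica_2 = apply_all(0, [g, f])

# With a non-commuting pair the delivery order changes the outcome.
diverged_1 = apply_all(1, [f, h])
diverged_2 = apply_all(1, [h, f])
```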
  17. A register is the simplest type of data structure: a memory cell storing an opaque value. It only supports two operations, “assign” and “value” (set and get). Concurrent updates will not commute (who should win?). We’ve seen this problem before.
  18. The two approaches to resolving concurrent updates are the same ones taken by Cassandra and Riak, respectively: Last-Write-Wins (called an LWW-Register) and Multi-Valued (called an MV-Register), which keeps all divergent values. For resolution, LWW tends to use timestamps with a reasonable guarantee of ordering (which is difficult in practice, but sufficient in some systems). MV, on the other hand, requires the more expensive version vector to resolve conflicts and produces the union of all divergent values (but it doesn’t behave like a set!).
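A minimal sketch of the LWW half of this, under stated assumptions: timestamps are passed in by the caller as integers, and ties are broken by comparing a replica name, which is a choice made here for determinism rather than something the talk specifies. The class and field names are hypothetical. The MV-Register is not shown because it needs a version vector per value.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LWWRegister:
    """Last-Write-Wins register sketch; the caller supplies timestamps."""
    value: Any = None
    timestamp: int = 0
    replica: str = ""

    def assign(self, value: Any, timestamp: int, replica: str) -> "LWWRegister":
        # An assignment is just a merge with a freshly stamped state.
        return self.merge(LWWRegister(value, timestamp, replica))

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep the state with the greater (timestamp, replica) pair.
        return max(self, other, key=lambda r: (r.timestamp, r.replica))


a = LWWRegister().assign("red", 10, "a")
b = LWWRegister().assign("blue", 12, "b")
winner = a.merge(b)

# Equal timestamps: the replica name decides, so both sides agree.
x = LWWRegister().assign("x", 5, "a")
y = LWWRegister().assign("y", 5, "b")
```

Note that the losing write ("red") is silently discarded, which is exactly the trade-off an MV-Register avoids by keeping both values.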
  19. Counters are simply integers that are replicated and support the increment and decrement operations. Counters are useful for things like tracking the number of logged-in users, or click-throughs on an advertisement.

      The simplest type of counter is a commutative, or operation-based, type. Since add and subtract are commutative, any delivery order is sufficient (ignoring over-/under-flow), provided each operation is delivered exactly once. The state-based counters are more interesting, so we’ll look at those.
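As a rough illustration of the operation-based counter, the sketch below replays the same increments and decrements in two different orders. `OpCounter` is a hypothetical name, and message delivery is simulated by a plain loop; a real system would rely on its broadcast layer to deliver each operation exactly once.

```python
class OpCounter:
    """Operation-based counter sketch: replicas exchange deltas, not state."""

    def __init__(self) -> None:
        self.value = 0

    def apply(self, delta: int) -> None:
        # Addition commutes, so delivery order does not matter.
        self.value += delta


ops = [+1, +1, -1, +5]

r1, r2 = OpCounter(), OpCounter()
for delta in ops:
    r1.apply(delta)
for delta in reversed(ops):
    r2.apply(delta)

# A duplicated delivery is NOT harmless: the delta is counted twice.
r3 = OpCounter()
for delta in ops + [+5]:
    r3.apply(delta)
```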
  20. A G-Counter only counts up and is basically a version vector (vector clock). Each replica increments only its own entry, and the value is computed by summing the counts of all replicas. Convergence is achieved by taking the maximum count for each replica. This is basically the Cassandra counters implementation.
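The description above maps fairly directly onto code. This is a sketch under assumptions: replica identifiers are plain strings and the state is a dict from replica id to count; the class and method names are illustrative, not taken from any particular datastore.

```python
class GCounter:
    """Grow-only counter sketch: one non-negative count per replica."""

    def __init__(self, replica_id: str, counts=None) -> None:
        self.replica_id = replica_id
        self.counts = dict(counts or {})

    def increment(self, n: int = 1) -> None:
        if n < 0:
            raise ValueError("a G-Counter can only grow")
        # A replica only ever touches its own entry.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Per-replica maximum, exactly like merging version vectors.
        replicas = self.counts.keys() | other.counts.keys()
        merged = {r: max(self.counts.get(r, 0), other.counts.get(r, 0))
                  for r in replicas}
        return GCounter(self.replica_id, merged)


a, b = GCounter("a"), GCounter("b")
a.increment()
a.increment()
b.increment(3)
merged = a.merge(b)
```

Merging the same state again changes nothing, which is why redelivered or stale states are harmless.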
  34. A PN-Counter is composed of two G-Counters: P for increments and N for decrements. The value is the difference between the values of the two G-Counters. The resolution is the pairwise resolution of the P and N counters.
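A sketch of that composition, with the two G-Counters represented as plain dicts to keep the block self-contained. The names (`PNCounter`, `_merge_max`) are hypothetical.

```python
def _merge_max(x: dict, y: dict) -> dict:
    """G-Counter merge: per-replica maximum of two count maps."""
    return {r: max(x.get(r, 0), y.get(r, 0)) for r in x.keys() | y.keys()}


class PNCounter:
    """PN-Counter sketch: P holds increments, N holds decrements."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.p = {}  # G-Counter of increments
        self.n = {}  # G-Counter of decrements

    def increment(self, k: int = 1) -> None:
        self.p[self.replica_id] = self.p.get(self.replica_id, 0) + k

    def decrement(self, k: int = 1) -> None:
        self.n[self.replica_id] = self.n.get(self.replica_id, 0) + k

    def value(self) -> int:
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other: "PNCounter") -> "PNCounter":
        # Resolve P against P and N against N, independently.
        merged = PNCounter(self.replica_id)
        merged.p = _merge_max(self.p, other.p)
        merged.n = _merge_max(self.n, other.n)
        return merged


a, b = PNCounter("a"), PNCounter("b")
a.increment(5)
b.decrement(2)
b.increment(1)
merged = a.merge(b)
```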
  35. Sets constitute one of the most basic data structures. Containers, Maps, and Graphs are all based on Sets. There are two operations, add and remove.
  36. Like a G-Counter, a G-Set only grows in size. That is, it doesn’t allow removal. Its merge operation is a simple set union, returning the maximal grouping without duplicates. Since add commutes with union, a G-Set can also be implemented as a commutative type. It’s not an incredibly useful data type on its own, but it can be part of another data structure.
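A G-Set is short enough to sketch in a few lines. The class below is illustrative only (hypothetical names), using an immutable `frozenset` as the state and union as the merge.

```python
class GSet:
    """Grow-only set sketch: add and merge, no remove."""

    def __init__(self, items=()) -> None:
        self.items = frozenset(items)

    def add(self, x) -> "GSet":
        return GSet(self.items | {x})

    def merge(self, other: "GSet") -> "GSet":
        # Set union is commutative, associative and idempotent.
        return GSet(self.items | other.items)

    def __contains__(self, x) -> bool:
        return x in self.items


a = GSet().add("apple").add("pear")
b = GSet().add("pear").add("plum")
merged = a.merge(b)
```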
  50. The second type of Set is a two-phase set (2P-Set), where a removed set member cannot be re-added. It is basically two G-Sets, one for add and one for remove. The removal set is sometimes called a tombstone set. To prevent spurious states (e.g. remove-before-add, making add have no effect), it has a precondition for remove that the local state must already contain the member.

      A special case of the 2P-Set is the U-Set. If the system can reasonably guarantee uniqueness, that is, the element will never be added again after removal, then the tombstone set is unnecessary. Uniqueness could be satisfied with a Lamport clock or a suitably large RNG space.
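A sketch of the two-phase set as a pair of grow-only sets. The names are hypothetical, and raising `KeyError` is just one way to express the remove precondition described above.

```python
class TwoPhaseSet:
    """2P-Set sketch: an add set plus a tombstone (remove) set."""

    def __init__(self, added=(), removed=()) -> None:
        self.added = frozenset(added)
        self.removed = frozenset(removed)

    def lookup(self, x) -> bool:
        return x in self.added and x not in self.removed

    def add(self, x) -> "TwoPhaseSet":
        return TwoPhaseSet(self.added | {x}, self.removed)

    def remove(self, x) -> "TwoPhaseSet":
        # Precondition: the local state must already contain the member.
        if not self.lookup(x):
            raise KeyError(x)
        return TwoPhaseSet(self.added, self.removed | {x})

    def merge(self, other: "TwoPhaseSet") -> "TwoPhaseSet":
        # Both halves are G-Sets, so each merges by union.
        return TwoPhaseSet(self.added | other.added,
                           self.removed | other.removed)

    def value(self) -> frozenset:
        return self.added - self.removed


a = TwoPhaseSet().add("x").add("y")
b = a.remove("x")
merged = a.merge(b)
readded = merged.add("x")  # the tombstone still masks "x"
```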
  66. Tag each element in A (the add set) and R (the remove set) with a timestamp. The greatest timestamp wins for each individual element. This could be implemented with Cassandra super-columns.

      Figure 12: LWW-element-Set; elements masked by one with a higher timestamp are elided (state-based)
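A possible sketch of the LWW-element-Set, under assumptions: timestamps are caller-supplied integers, and an element counts as present unless the remove set holds a strictly higher timestamp for it (so a tie favors the add). All names are illustrative.

```python
def _newest(x: dict, y: dict) -> dict:
    """Merge two element -> timestamp maps, keeping the greater timestamp."""
    merged = dict(x)
    for element, ts in y.items():
        merged[element] = max(ts, merged.get(element, ts))
    return merged


class LWWElementSet:
    """LWW-element-Set sketch: A and R map elements to timestamps."""

    def __init__(self, adds=None, removes=None) -> None:
        self.adds = dict(adds or {})        # A
        self.removes = dict(removes or {})  # R

    def add(self, x, ts: int) -> None:
        self.adds[x] = max(ts, self.adds.get(x, ts))

    def remove(self, x, ts: int) -> None:
        self.removes[x] = max(ts, self.removes.get(x, ts))

    def lookup(self, x) -> bool:
        if x not in self.adds:
            return False
        # Present unless masked by a remove with a higher timestamp.
        return not (x in self.removes and self.removes[x] > self.adds[x])

    def merge(self, other: "LWWElementSet") -> "LWWElementSet":
        return LWWElementSet(_newest(self.adds, other.adds),
                             _newest(self.removes, other.removes))


s1 = LWWElementSet()
s1.add("x", 1)
s2 = s1.merge(LWWElementSet())  # second replica starts from a copy
s2.remove("x", 2)
s1.add("x", 3)                  # later re-add on the first replica
merged = s1.merge(s2)
```

Unlike the 2P-Set, an element can come back after removal here, as long as the re-add carries a newer timestamp.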
  67. Tag each added element with a unique tag (without exposing the tags to the user). When removing an element, remove all of the tagged copies the replica has seen, and forward the remove operation downstream along with those tags. A state-based version would be based on the U-Set.
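One way to sketch the tagging idea is a state-based variant that keeps tombstones for removed (element, tag) pairs. This is an assumption-laden illustration, not the op-based protocol described above: tags come from `uuid.uuid4()`, tombstones are never collected, and the names are hypothetical.

```python
import uuid


class TaggedSet:
    """Sketch of a set whose adds carry hidden unique tags."""

    def __init__(self) -> None:
        self.added = set()    # (element, tag) pairs ever added
        self.removed = set()  # (element, tag) pairs that were removed

    def add(self, x) -> None:
        # Every add gets a fresh tag, so a re-add is a distinct pair.
        self.added.add((x, uuid.uuid4().hex))

    def remove(self, x) -> None:
        # Remove only the tagged copies this replica has observed.
        self.removed |= {(e, t) for (e, t) in self.added if e == x}

    def lookup(self, x) -> bool:
        return any(e == x for (e, _) in self.added - self.removed)

    def merge(self, other: "TaggedSet") -> "TaggedSet":
        merged = TaggedSet()
        merged.added = self.added | other.added
        merged.removed = self.removed | other.removed
        return merged


a = TaggedSet()
a.add("x")
b = a.merge(TaggedSet())  # replica b starts from a copy of a
b.remove("x")             # b removes the copy it has seen
a.add("x")                # concurrently, a adds "x" again
merged = a.merge(b)
```

The concurrent add survives the merge because its tag was never observed by the remove.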
  68. You might notice we’re going up in complexity here in terms of the types of data structures. Graphs are incredibly useful for many problems, but they also have a bunch of potential anomalies: concurrent adds and removes of vertices and edges may not converge, that is, global invariants can’t be guaranteed. This happens, for example, in a DAG or linked list where elements can be removed or added concurrently. Some anomalies may be removed by restricting the semantics, for example by making a graph add-only. I’m not going to go into detail about how Graphs are implemented, but a simple one is the 2P2P graph, based on a pair of 2P-Sets, one for vertices and one for edges. When a vertex is removed, the most reliable (and intuitive) solution is to remove all attached edges, so the 2P-Set paradigm works well for the components of a generic graph.
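A rough sketch of a graph built from two 2P-Sets. It is an assumption of this sketch, not a statement about the paper's exact specification, that removing a vertex only tombstones the vertex and that attached edges are hidden at lookup time; the class and method names are hypothetical.

```python
class TwoPTwoPGraph:
    """Graph sketch: a 2P-Set of vertices and a 2P-Set of edges."""

    def __init__(self) -> None:
        self.va, self.vr = set(), set()  # vertices added / removed
        self.ea, self.er = set(), set()  # edges added / removed

    def has_vertex(self, v) -> bool:
        return v in self.va and v not in self.vr

    def has_edge(self, u, v) -> bool:
        # An edge is visible only while both endpoints are present.
        return (self.has_vertex(u) and self.has_vertex(v)
                and (u, v) in self.ea and (u, v) not in self.er)

    def add_vertex(self, v) -> None:
        self.va.add(v)

    def add_edge(self, u, v) -> None:
        if not (self.has_vertex(u) and self.has_vertex(v)):
            raise KeyError((u, v))
        self.ea.add((u, v))

    def remove_vertex(self, v) -> None:
        if not self.has_vertex(v):
            raise KeyError(v)
        self.vr.add(v)  # attached edges disappear via has_edge

    def merge(self, other: "TwoPTwoPGraph") -> "TwoPTwoPGraph":
        merged = TwoPTwoPGraph()
        merged.va, merged.vr = self.va | other.va, self.vr | other.vr
        merged.ea, merged.er = self.ea | other.ea, self.er | other.er
        return merged


g1 = TwoPTwoPGraph()
g1.add_vertex("a")
g1.add_vertex("b")
g1.add_edge("a", "b")
g2 = g1.merge(TwoPTwoPGraph())  # second replica starts from a copy
g2.remove_vertex("b")
merged = g1.merge(g2)
```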
  73. CRDTs tend to create a lot of garbage: tombstones grow and internal structures become unbalanced. In general, garbage collection is extremely difficult to do without synchronization. Luckily, this doesn’t impact correctness, only efficiency and performance.
  74. Client: you have to come up with a common representation across languages, allocation of actor IDs is problematic, and you can only use state-based CRDTs.
      Server: no one really implements them yet (Cassandra’s counter has some anomalies).