Eventually-Consistent Data Structures

The document discusses eventually consistent data structures and conflict-free replicated data types (CRDTs). It provides examples of different types of CRDTs like registers, counters, sets, and graphs that can be used to build eventually consistent distributed systems. Challenges with CRDTs are that they can be inefficient and synchronization may still be required in some cases. The document promotes the use of CRDTs to provide availability, tolerance to network failures, and resolve conflicts automatically without centralized coordination.

Eventually-Consistent Data Structures
Sean Cribbs
@seancribbs #CRDT
Berlin Buzzwords 2012

Riak is Eventually Consistent
So are Voldemort and Cassandra

Eventual Consistency
• Replicated
• Loose coordination
• Forward progression

Eventual is Good

✔ Fault-tolerant
✔ Highly available
✔ Low-latency

Consistency?

No clear winner!
• Throw one out? (Cassandra)
• Keep both? (Riak & Voldemort)

Semantic Resolution
• Your app knows the domain - use business rules to resolve
• Amazon Dynamo’s shopping cart
“Ad hoc approaches have proven brittle and error-prone”

Conflict-Free Replicated Data Types
• useful abstractions
• multiple independent copies
• resolves automatically toward a single value

CRDT Flavors
• Convergent: State
  • Weak messaging requirements
• Commutative: Operations
  • Reliable broadcast required
  • Causal ordering sufficient
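
A state-based (convergent) CRDT can get by with weak messaging because its merge function is commutative, associative, and idempotent, so reordered or duplicated deliveries do not change the outcome. The following is a minimal Python sketch of that property, using set union as a stand-in merge. The names `merge` and `converge` are illustrative assumptions and do not come from the talk.

```python
from itertools import permutations


def merge(x, y):
    """Stand-in merge for a state-based CRDT: set union."""
    return x | y


def converge(deliveries):
    """Fold a sequence of received states into one replica state."""
    state = frozenset()
    for received in deliveries:
        state = merge(state, received)
    return state


# Three states as they might be forwarded by different replicas.
states = (frozenset("a"), frozenset("ab"), frozenset("c"))

# Deliver them in every possible order, re-delivering the first one each time.
outcomes = {converge(order + order[:1]) for order in permutations(states)}

# Every delivery order ends in the same state.
assert len(outcomes) == 1
```

An operation-based (commutative) CRDT ships operations instead of whole states, which is why it needs the stronger delivery guarantees listed above.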

Registers

• Last-Write Wins (LWW-Register)
  • e.g. Columns in Cassandra
• Multi-Valued (MV-Register)
  • e.g. Objects (values) in Riak
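
The two register flavors can be sketched in Python as follows. This is an illustration under stated assumptions, not how Cassandra or Riak implement them: the `actor` tie-breaker, the dict-based version vectors (assumed to hold no zero entries), and the names `LWWRegister`, `dominates`, and `mv_merge` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LWWRegister:
    value: Any
    timestamp: int
    actor: str  # hypothetical tie-breaker for equal timestamps

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep whichever write has the larger (timestamp, actor) pair.
        return max(self, other, key=lambda r: (r.timestamp, r.actor))


def dominates(a: dict, b: dict) -> bool:
    """True if version vector a has seen everything in b and differs from it."""
    return a != b and all(a.get(actor, 0) >= n for actor, n in b.items())


def mv_merge(xs: list, ys: list) -> list:
    """Merge two sibling lists of (value, version_vector) pairs.

    Values whose version vector is dominated by another are dropped;
    concurrent values are all kept as siblings.
    """
    candidates = xs + ys
    kept = []
    for value, vv in candidates:
        superseded = any(dominates(other, vv) for _, other in candidates)
        if not superseded and (value, vv) not in kept:
            kept.append((value, vv))
    return kept


# LWW: the later timestamp wins, whichever side merges.
older = LWWRegister("x", 1, "a")
newer = LWWRegister("y", 2, "b")
assert older.merge(newer) == newer.merge(older) == newer

# MV: concurrent writes are both kept for the reader to resolve.
siblings = mv_merge([("x", {"a": 1})], [("y", {"b": 1})])
assert len(siblings) == 2
```

The LWW register silently discards one of two concurrent writes, while the MV register hands both back to the reader.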

G-Counter
// Starts empty
[]

// A increments twice, forwarding state
[{a,1}] // == 1
[{a,2}] // == 2

// B increments
[{b,1}] // == 1

// Merging
[{a,2}, {b,1}] // == 3
[{a,1}, {b,1}] // == 2
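
The G-Counter walk-through above can be sketched in Python, assuming the state is a mapping from actor to count, the merge takes the per-actor maximum, and the value is the sum of all entries. The function names `g_increment`, `g_merge`, and `g_value` are illustrative only.

```python
def g_increment(state, actor, amount=1):
    """Return a new state with the actor's own entry increased."""
    new_state = dict(state)
    new_state[actor] = new_state.get(actor, 0) + amount
    return new_state


def g_merge(x, y):
    """Merge two states by taking the per-actor maximum."""
    return {actor: max(x.get(actor, 0), y.get(actor, 0))
            for actor in x.keys() | y.keys()}


def g_value(state):
    """The counter's value is the sum over all actors."""
    return sum(state.values())


# A increments twice, B increments once.
a1 = g_increment({}, "a")   # {"a": 1}
a2 = g_increment(a1, "a")   # {"a": 2}
b1 = g_increment({}, "b")   # {"b": 1}

merged = g_merge(a2, b1)
assert merged == {"a": 2, "b": 1} and g_value(merged) == 3

# Merging a stale state from A afterwards changes nothing.
assert g_merge(merged, a1) == merged
```

Because each actor only ever raises its own entry, taking the maximum never loses an increment.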

PN-Counter
// A PN-Counter
{
P = [{a,10},{b,2}],
N = [{a,1},{c,5}]
}
// == (10+2)-(1+5) == 12-6 == 6
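
The arithmetic above can be reproduced with a short Python sketch, assuming a PN-Counter is a pair of grow-only counters (`P` for increments, `N` for decrements) that are merged independently. The helper names are illustrative.

```python
def merge_max(x, y):
    """Per-actor maximum, as in a G-Counter merge."""
    return {actor: max(x.get(actor, 0), y.get(actor, 0))
            for actor in x.keys() | y.keys()}


def pn_value(counter):
    """Value is total increments minus total decrements."""
    return sum(counter["P"].values()) - sum(counter["N"].values())


def pn_merge(x, y):
    """Merge the P and N halves separately."""
    return {"P": merge_max(x["P"], y["P"]),
            "N": merge_max(x["N"], y["N"])}


counter = {"P": {"a": 10, "b": 2}, "N": {"a": 1, "c": 5}}
assert pn_value(counter) == 6  # (10+2)-(1+5)
```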

G-Set
// Starts empty
{}

// A adds a and b, forwarding state
{a}
{a,b}

// B adds c
{c}

// Merging
{a,b,c}
{a,c}
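
A grow-only set merges by set union, which the following Python sketch illustrates. The function names `gset_add` and `gset_merge` are assumptions for the example.

```python
def gset_add(state, element):
    """Adding is the only mutation a G-Set supports."""
    return state | {element}


def gset_merge(x, y):
    """Merging two replicas is plain set union."""
    return x | y


# A adds a and b, forwarding state after each step; B adds c.
a1 = gset_add(frozenset(), "a")
a2 = gset_add(a1, "b")
b1 = gset_add(frozenset(), "c")

assert gset_merge(a2, b1) == {"a", "b", "c"}
assert gset_merge(a1, b1) == {"a", "c"}
```

Which merged set a replica sees depends only on which of A's states it has received so far; once it has seen them all, it holds `{a,b,c}`.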

2P-Set
// Starts empty
{A={},R={}}

// A adds a and b, forwarding state,
// removes a
{A={a}, R={}} // == {a}
{A={a,b},R={}} // == {a,b}
{A={a,b},R={a}} // == {b}

// B adds c
{A={c},R={}} // == {c}

// Merging
{A={a,b,c},R={a}} // == {b,c}
{A={a,c}, R={}} // == {a,c}
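
The 2P-Set steps above can be sketched in Python, assuming the state is a pair of grow-only sets (adds `A`, removes `R`), each merged by union, with the visible value being `A` minus `R`. The `tp_` function names are illustrative.

```python
def tp_new():
    """Empty 2P-Set: (adds, removes)."""
    return (frozenset(), frozenset())


def tp_add(state, element):
    adds, removes = state
    return (adds | {element}, removes)


def tp_remove(state, element):
    adds, removes = state
    # Only an element that has been added can be removed.
    if element in adds:
        return (adds, removes | {element})
    return state


def tp_merge(x, y):
    """Union the add sets and the remove sets separately."""
    return (x[0] | y[0], x[1] | y[1])


def tp_value(state):
    adds, removes = state
    return adds - removes


# A adds a and b, then removes a; B adds c.
a = tp_remove(tp_add(tp_add(tp_new(), "a"), "b"), "a")
b = tp_add(tp_new(), "c")

merged = tp_merge(a, b)
assert tp_value(merged) == {"b", "c"}

# Once removed, an element cannot come back.
assert tp_value(tp_add(merged, "a")) == {"b", "c"}
```

The second assertion shows the main restriction of this design: the remove set acts as a permanent tombstone.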

Use-Cases

• Social graph (OR-Set or a Graph)
• Web page visits (G-Counter)
• Shopping Cart (Modified OR-Set)
• “Like” button (U-Set)

Challenges: GC

• CRDTs are inefficient
• Synchronization may be required

Challenges: Responsibility
• Client
  • Erlang: mochi/statebox
  • Clojure: reiddraper/knockbox
  • Ruby: aphyr/meangirls
• Server
  • Very few options

Eventually-Consistent Data Structures

1. Eventually-Consistent Data Structures Sean Cribbs @seancribbs #CRDT Berlin Buzzwords 2012

2. I work for Basho We make

3. Riak is Eventually Consistent So are Voldemort and Cassandra

4. Eventual Consistency • Replicated • Loose coordination • Forward progression

5. Eventual is Good ✔ Fault-tolerant ✔ Highly available ✔ Low-latency

8. Consistency? No clear winner! • Throw one out? (Cassandra) • Keep both? (Riak & Voldemort)

9. Conflicts! A! B! Now what?

11. Semantic Resolution • Your app knows the domain - use business rules to resolve • Amazon Dynamo’s shopping cart “Ad hoc approaches have proven brittle and error-prone”

15. Conflict-Free Replicated Data Types • useful abstractions • multiple independent copies • resolves automatically toward a single value

17. CRDT Flavors • Convergent: State • Weak messaging requirements • Commutative: Operations • Reliable broadcast required • Causal ordering sufficient

18. Convergent CRDTs

19. Commutative CRDTs

20. Registers A place to put your stuff

21. Registers • Last-Write Wins (LWW-Register) • e.g. Columns in Cassandra • Multi-Valued (MV-Register) • e.g. Objects (values) in Riak

22. Counters Keeping tabs

27. G-Counter // Starts empty [] // A increments twice, forwarding state [{a,1}] // == 1 [{a,2}] // == 2 // B increments [{b,1}] // == 1 // Merging [{a,2}, {b,1}] [{a,1}, {b,1}]

28. PN-Counter // A PN-Counter { P = [{a,10},{b,2}], N = [{a,1},{c,5}] } // == (10+2)-(1+5) == 12-6 == 6

29. Sets Members Only

34. G-Set // Starts empty {} // A adds a and b, forwarding state {a} {a,b} // B adds c {c} // Merging {a,b,c} {a,c}

42. 2P-Set // Starts empty {A={},R={}} // A adds a and b, forwarding state, // removes a {A={a}, R={}} // == {a} {A={a,b},R={}} // == {a,b} {A={a,b},R={a}} // == {b} // B adds c {A={c},R={}} // == {c} // Merging {A={a,b,c},R={a}} {A={a,c}, R={}}

43. LWW-Element-Set

44. OR-Set

45. G = (V,E) Graphs E⊆V×V

48. Use-Cases • Social graph (OR-Set or a Graph) • Web page visits (G-Counter) • Shopping Cart (Modified OR-Set) • “Like” button (U-Set)

49. Challenges: GC • CRDTs are inefficient • Synchronization may be required

50. Challenges: Responsibility • Client • Erlang: mochi/statebox • Clojure: reiddraper/knockbox • Ruby: aphyr/meangirls • Server • Very few options

51. Thanks

Editor's Notes

In an eventually consistent system, you tend to have multiple copies of the same datum, which means that it’s replicated. These systems also tend to allow loose coordination and things like sloppy quorums, since they don’t require expensive multi-phase commit protocols. This also makes them resilient to network partitions. Eventually consistent systems must also include a means for state to move forward when staleness is detected. In Dynamo-like systems, this is usually done with read-repair, that is, writing the newer value to stale replicas when reading.
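
Read-repair, as described in the note above, can be illustrated with a deliberately simplified Python sketch. It assumes each replica is a dict mapping a key to a `(version, value)` pair with a plain integer version; real Dynamo-like systems track causality with vector clocks, and the function name `read_with_repair` is hypothetical.

```python
def read_with_repair(replicas, key):
    """Read a key from all replicas and push the newest copy to stale ones.

    replicas: list of dicts mapping key -> (version, value).
    Returns the newest value, or None if no replica has the key.
    """
    responses = [replica[key] for replica in replicas if key in replica]
    if not responses:
        return None

    # Simplifying assumption: a larger integer version means a newer write.
    newest = max(responses, key=lambda versioned: versioned[0])

    # Repair: overwrite any replica that is missing the key or holds an older copy.
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest
    return newest[1]


replicas = [{"k": (2, "new")}, {"k": (1, "old")}, {}]
assert read_with_repair(replicas, "k") == "new"
assert all(replica["k"] == (2, "new") for replica in replicas)
```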
While not as simple to understand as an ACID system, eventual consistency has many practical benefits. When encountering failures, especially network-related ones, the system can more often remain available to reads and writes despite the failures. In the same vein, relying on dynamic participation in operations lends itself to systems with low, consistent latency because only promptly-responding replicas need to be considered.
Of course the tradeoff of those benefits, thanks to the CAP theorem, is that you sacrifice strict consistency. There is no total ordering of events in the system, you have no transactions, you have weak guarantees of delivery. This means it’s incredibly difficult to decide who wins when there are concurrent writes in the system. The solutions to the problem are both non-ideal, but they are generally: first, to throw one version out by applying an arbitrary ordering, usually a timestamp of sorts; second, to keep both values around and let the user decide. These are the approaches of Cassandra, and Riak/Voldemort respectively.
So maybe you chose Riak or Voldemort, and now you get write conflicts (Riak calls them siblings). Now that you’ve got both values, how does your application decide what the real state should be?
One strategy, which I call “semantic resolution”, is to say that your application encodes the domain of the problem and so it can use business rules to resolve the conflict. This is the strategy implemented by the “shopping cart” described in the Amazon Dynamo paper. It merges toward the maximum quantity of each item in the cart; however, it exhibits some problems, namely that items that were removed from the cart can sometimes reappear! From Amazon’s point of view this is okay because it might encourage the customer to buy more, but it is a bewildering user experience!

Fortunately, there is some interesting recent research about a more rigorous approach to eventual consistency.
The data structures that came out of this research are sometimes called Conflict-Free Replicated Data Types (CRDTs). This basically means that instead of strictly opaque values, the datastore provides useful abstract data structures. Since we’re in an eventually consistent system, the data structure is replicated to multiple locations, all of which act independently. But by far the most compelling part is that these data structures have the ability to resolve automatically toward a single value, given any number of conflicting values at individual replicas.
The primary work on this research has been done by two researchers at INRIA and their colleagues in Portugal. Marc Shapiro also gave a great talk on the subject at Microsoft Research called “Strong Eventual Consistency” which you can easily find online.

The paper above is where I’ve gotten most of the content and diagrams, but I’ve tried to simplify the content so that we can get through it in 40 minutes. If you want the real thing, search for <title>, it’s free to download.
There are two flavors of CRDTs, as you might have noticed. They both provide the same conflict-free property, but differ in their implementation strategy.

Convergent types are based on a local modification of state, followed by forwarding the resulting state downstream, where a merge operation is performed at other replicas. The state itself encodes all information needed to converge. They are great for systems with weak message delivery guarantees, for example, a Dynamo-style system. Convergent types can also be resolved in clients, which is helpful for systems that do not provide rich datatypes.

Commutative types, on the other hand, replicate commutative operations rather than state, and tend to rely on systems with reliable broadcast (which assures that operations reach all replicas). Operations are generally not required to have a total ordering; a local causal ordering is sufficient.
This diagram from the paper shows the basic format of a convergent, state-based CRDT. Note how the mutation is applied locally, then forwarded downstream as a merge operation. As long as all replicas eventually receive states that include all mutations, they will converge on the same value.
Again, commutative types forward operations to other replicas, not the state. Obviously, if an operation is not delivered, or is applied out of order locally, the states don’t converge. So, unlike the convergent type, a reliable broadcast channel is required. As long as functions f() and g() commute, state will converge.
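To make the "f() and g() commute" condition concrete, here is a minimal sketch. The helper `deliver` and the four operations are hypothetical names invented for illustration; they only model two replicas receiving the same operations in different orders.

```python
def deliver(state, operations):
    """Apply each operation to the state, in delivery order."""
    for operation in operations:
        state = operation(state)
    return state


# Two operations that commute: adding and subtracting fixed amounts.
def f(state):
    return state + 3


def g(state):
    return state - 1


# Each replica sees both operations exactly once, in a different order.
replica_1 = deliver(0, [f, g])
replica_2 = deliver(0, [g, f])
commuting_converges = replica_1 == replica_2


# Two operations that do NOT commute: plain assignment.
def assign_a(state):
    return "a"


def assign_b(state):
    return "b"


replica_3 = deliver(None, [assign_a, assign_b])
replica_4 = deliver(None, [assign_b, assign_a])
assignment_converges = replica_3 == replica_4
```

With the additive operations both replicas end at the same value; with plain assignment they disagree, which is why registers need an extra rule to pick a winner.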
A register is the simplest type of data structure: a memory cell storing an opaque value. It supports only two operations, “assign” and “value” (set and get). Concurrent updates will not commute (who should win?). We’ve seen this problem before.
The two approaches to resolving concurrent updates are the same ones taken by Cassandra and Riak, respectively: Last-Write-Wins (an LWW-Register) and Multi-Valued (an MV-Register), which keeps all divergent values. For resolution, LWW registers tend to use timestamps with a reasonable guarantee of ordering (which is difficult in practice, but sufficient in some systems). An MV-Register, on the other hand, requires the more expensive version vector to resolve conflicts and produces the union of all divergent values (but it doesn’t behave like a set!).
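A rough sketch of the LWW-Register idea, written as a state-based type. The class name, the integer timestamps, and the `replica_id` tie-breaker are assumptions made for this example; a real system has to decide where its timestamps come from and how ties are broken.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LWWRegister:
    """Illustrative last-write-wins register (not any datastore's API)."""
    value: Any = None
    timestamp: int = 0     # assumed: some reasonably ordered timestamp
    replica_id: str = ""   # assumed: tie-breaker for equal timestamps

    def assign(self, value, timestamp, replica_id):
        # A local write is just a merge with a freshly stamped state.
        return self.merge(LWWRegister(value, timestamp, replica_id))

    def merge(self, other):
        # Keep the state with the greater (timestamp, replica_id) pair.
        mine = (self.timestamp, self.replica_id)
        theirs = (other.timestamp, other.replica_id)
        return self if mine >= theirs else other


# Two replicas accept concurrent writes.
a = LWWRegister().assign("red", timestamp=10, replica_id="a")
b = LWWRegister().assign("blue", timestamp=12, replica_id="b")

# Whichever direction the states travel, both replicas pick the same winner.
winner_at_a = a.merge(b)
winner_at_b = b.merge(a)
```

Note that the write with the lower timestamp ("red") is silently discarded, which is exactly the "throw one out" tradeoff described earlier.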
Counters are simply integers that are replicated and support the increment and decrement operations. Counters are useful for things like tracking the number of logged-in users, or click-throughs on an advertisement.

The simplest type of counter is a commutative, or operation-based, type: since add and subtract commute, any delivery order is sufficient (ignoring overflow and underflow). The state-based counters are more interesting, so we’ll look at those.
A G-Counter only counts up and is basically a version vector (vector clock). Each replica increments only its own entry, and the value is computed by summing the counts of all replicas. Convergence is achieved by taking the maximum count for each replica. This is basically the Cassandra counters implementation.
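The G-Counter described above can be sketched in a few lines. The class and method names here are made up for illustration and are not taken from Cassandra or any other implementation.

```python
class GCounter:
    """Illustrative state-based grow-only counter."""

    def __init__(self, replica_id, counts=None):
        self.replica_id = replica_id
        # Maps replica id -> number of increments seen from that replica.
        self.counts = dict(counts or {})

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        # A replica only ever touches its own entry.
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def value(self):
        # The counter's value is the sum over all replicas.
        return sum(self.counts.values())

    def merge(self, other):
        # Take the per-replica maximum, like merging version vectors.
        merged = dict(self.counts)
        for replica, count in other.counts.items():
            merged[replica] = max(merged.get(replica, 0), count)
        return GCounter(self.replica_id, merged)


a = GCounter("a")
b = GCounter("b")
a.increment()
a.increment()
b.increment(3)

merged = a.merge(b)
total = merged.value()
# Receiving the same state twice changes nothing (merge is idempotent).
total_after_duplicate = merged.merge(b).value()
```

Because merge takes a maximum rather than adding, duplicated or reordered state messages are harmless, which is what makes this suitable for weak delivery guarantees.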
A PN-Counter is composed of two G-Counters: P for increments and N for decrements. The value is the difference between the values of the two G-Counters. The resolution is the pairwise resolution of the P and N counters.
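As a sketch of that composition, the two G-Counters can be kept as two per-replica maps. The names `PNCounter` and `_merge_max` are hypothetical, chosen only for this example.

```python
def _merge_max(left, right):
    """Per-replica maximum of two count maps (the G-Counter merge)."""
    return {
        replica: max(left.get(replica, 0), right.get(replica, 0))
        for replica in left.keys() | right.keys()
    }


class PNCounter:
    """Illustrative counter built from two grow-only maps, P and N."""

    def __init__(self, replica_id, p=None, n=None):
        self.replica_id = replica_id
        self.p = dict(p or {})  # increments per replica
        self.n = dict(n or {})  # decrements per replica

    def increment(self, amount=1):
        self.p[self.replica_id] = self.p.get(self.replica_id, 0) + amount

    def decrement(self, amount=1):
        # A decrement is recorded as growth of the N map.
        self.n[self.replica_id] = self.n.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.p.values()) - sum(self.n.values())

    def merge(self, other):
        # Merge P with P and N with N, each as a G-Counter.
        return PNCounter(
            self.replica_id,
            _merge_max(self.p, other.p),
            _merge_max(self.n, other.n),
        )


x = PNCounter("x")
y = PNCounter("y")
x.increment(5)
y.decrement(2)
y.increment(1)

merged_value = x.merge(y).value()
```

Neither map ever shrinks, so the merge stays a simple maximum even though the visible value can go down.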
Sets constitute one of the most basic data structures. Containers, Maps, and Graphs are all based on Sets. There are two operations, add and remove.
Like a G-Counter, a G-Set only grows in size. That is, it doesn’t allow removal. Its merge operation is a simple set union, returning the maximal grouping without duplicates. Since add commutes with union, a G-Set can also be implemented as a commutative type. It’s not an incredibly useful data type on its own, but it can be part of another data structure.
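A G-Set needs almost no code, which makes it a convenient place to check the three properties a state-based merge relies on. The function name `merge` and the sample replicas below are illustrative only.

```python
def merge(left, right):
    """G-Set merge: plain set union."""
    return left | right


replica_a = frozenset({"apple", "pear"})
replica_b = frozenset({"pear", "plum"})
replica_c = frozenset({"fig"})

# Order of merging does not matter (commutative).
commutative = merge(replica_a, replica_b) == merge(replica_b, replica_a)

# Grouping of merges does not matter (associative).
associative = (
    merge(merge(replica_a, replica_b), replica_c)
    == merge(replica_a, merge(replica_b, replica_c))
)

# Merging the same state again changes nothing (idempotent).
idempotent = merge(replica_a, replica_a) == replica_a

converged = merge(merge(replica_a, replica_b), replica_c)
```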
The second type of Set is a two-phase set (2P-Set), where a removed set member cannot be re-added. It is basically two G-Sets, one for add and one for remove. The removal set is sometimes called a tombstone set. To prevent spurious states (e.g. remove-before-add, making add have no effect), it has a precondition for remove that the local state must already contain the member.

A special case of the 2P-Set is the U-Set. If the system can reasonably guarantee uniqueness, that is, the element will never be added again after removal, then the tombstone set is unnecessary. Uniqueness could be satisfied with a Lamport clock or a suitably large RNG space.
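Here is one way the two-phase set could look as a state-based type. The class and method names are assumptions for this sketch; the remove precondition is enforced with an exception, which is just one possible choice.

```python
class TwoPhaseSet:
    """Illustrative 2P-Set: an add set plus a tombstone set."""

    def __init__(self, added=(), removed=()):
        self.added = frozenset(added)
        self.removed = frozenset(removed)  # tombstones

    def contains(self, element):
        return element in self.added and element not in self.removed

    def add(self, element):
        return TwoPhaseSet(self.added | {element}, self.removed)

    def remove(self, element):
        # Precondition: the local state must already contain the member.
        if not self.contains(element):
            raise KeyError(element)
        return TwoPhaseSet(self.added, self.removed | {element})

    def value(self):
        return self.added - self.removed

    def merge(self, other):
        # Both halves are G-Sets, so each merges by union.
        return TwoPhaseSet(
            self.added | other.added,
            self.removed | other.removed,
        )


base = TwoPhaseSet().add("x").add("y")
replica_1 = base.remove("x")   # one replica removes x
replica_2 = base.add("x")      # another "re-adds" x concurrently

merged = replica_1.merge(replica_2)
visible = merged.value()
```

After the merge only "y" is visible: once the tombstone for "x" exists, no later add can bring it back.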
Tag each element in the add set (A) and the remove set (R) with a timestamp. The greatest timestamp wins for each individual element. This could be implemented with Cassandra super-columns.

Figure 12: LWW-element-Set; elements masked by one with a higher timestamp are elided (state-based)
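A small sketch of the timestamped add and remove sets follows. The class name, the integer timestamps, and the rule that an add wins a timestamp tie are all assumptions of this example rather than anything prescribed by the notes.

```python
class LWWElementSet:
    """Illustrative LWW-element-Set (state-based)."""

    def __init__(self, adds=None, removes=None):
        self.adds = dict(adds or {})        # element -> latest add timestamp
        self.removes = dict(removes or {})  # element -> latest remove timestamp

    def add(self, element, timestamp):
        latest = self.adds.get(element, timestamp)
        self.adds[element] = max(latest, timestamp)

    def remove(self, element, timestamp):
        latest = self.removes.get(element, timestamp)
        self.removes[element] = max(latest, timestamp)

    def contains(self, element):
        if element not in self.adds:
            return False
        if element not in self.removes:
            return True
        # Assumption: on equal timestamps the add wins.
        return self.adds[element] >= self.removes[element]

    def value(self):
        return {element for element in self.adds if self.contains(element)}

    def merge(self, other):
        merged = LWWElementSet(self.adds, self.removes)
        for element, timestamp in other.adds.items():
            merged.add(element, timestamp)
        for element, timestamp in other.removes.items():
            merged.remove(element, timestamp)
        return merged


a = LWWElementSet()
a.add("x", 1)
a.add("y", 1)
b = a.merge(LWWElementSet())   # second replica starts from the same state

b.remove("x", 2)
b.remove("y", 5)
a.add("x", 3)                  # re-add of x with a later timestamp

merged = a.merge(b)
visible = merged.value()
```

Unlike a 2P-Set, an element can come back after removal here, as long as the later add carries a higher timestamp.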
Tag each added element uniquely (without exposing the tags). When removing, remove all tags seen locally for that element and forward the operation downstream with those tags. A state-based version would be based on the U-Set.
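The tagging scheme can be sketched in state-based form as below; the CRDT literature calls this design an Observed-Remove Set (OR-Set). The class name, the use of `uuid4` for tags, and keeping removed tags as tombstones are assumptions of this sketch, not a description of any particular implementation.

```python
import uuid


class ORSet:
    """Illustrative observed-remove set with hidden unique tags."""

    def __init__(self, added=(), removed=()):
        self.added = set(added)      # (element, tag) pairs
        self.removed = set(removed)  # tombstoned (element, tag) pairs

    def add(self, element):
        # Every add gets a fresh tag, so it is distinct from earlier adds.
        self.added.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Only the tags this replica has observed are removed.
        observed = {pair for pair in self.added if pair[0] == element}
        self.removed |= observed

    def value(self):
        # Tags are internal; callers only see the elements.
        return {element for element, _tag in self.added - self.removed}

    def merge(self, other):
        return ORSet(self.added | other.added, self.removed | other.removed)


a = ORSet()
a.add("x")
b = a.merge(ORSet())   # replica b starts from a's state

b.remove("x")          # b removes the tag it has seen
a.add("x")             # a concurrently adds x again, with a new tag

merged = a.merge(b)
visible = merged.value()
```

The concurrent add survives because its tag was never observed by the remove, so "x" is still visible after the merge.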
You might notice we’re going up in complexity here in terms of the types of data structures. Graphs are incredibly useful for many problems, but also have a bunch of potential anomalies within them: concurrent adds and removes of vertices and edges may not converge, that is, global invariants can’t be guaranteed, for example in a DAG or linked list where elements can be added or removed concurrently. Some anomalies may be removed by restricting the semantics, for example, making a graph add-only. I’m not going to go into detail about how Graphs are implemented, but a simple one is the 2P2P graph, based on a pair of 2P-Sets, one for vertices and one for edges. In the case where a vertex is removed, the most reliable (and intuitive) solution is to remove all attached edges, thus a 2P-Set paradigm works well for the components of a generic graph.
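A compact sketch of the 2P2P graph idea: one 2P-Set for vertices and one for edges. All names are hypothetical, and treating an edge as invisible once either endpoint is removed is this sketch's way of "removing all attached edges".

```python
class TwoPTwoPGraph:
    """Illustrative 2P2P graph: a 2P-Set of vertices and one of edges."""

    def __init__(self, va=(), vr=(), ea=(), er=()):
        self.va, self.vr = frozenset(va), frozenset(vr)  # vertices
        self.ea, self.er = frozenset(ea), frozenset(er)  # edges

    def has_vertex(self, v):
        return v in self.va and v not in self.vr

    def has_edge(self, u, v):
        # An edge is only visible while both endpoints still exist.
        live = (u, v) in self.ea and (u, v) not in self.er
        return live and self.has_vertex(u) and self.has_vertex(v)

    def add_vertex(self, v):
        return TwoPTwoPGraph(self.va | {v}, self.vr, self.ea, self.er)

    def remove_vertex(self, v):
        if not self.has_vertex(v):
            raise KeyError(v)
        return TwoPTwoPGraph(self.va, self.vr | {v}, self.ea, self.er)

    def add_edge(self, u, v):
        if not (self.has_vertex(u) and self.has_vertex(v)):
            raise KeyError((u, v))
        return TwoPTwoPGraph(self.va, self.vr, self.ea | {(u, v)}, self.er)

    def remove_edge(self, u, v):
        if not self.has_edge(u, v):
            raise KeyError((u, v))
        return TwoPTwoPGraph(self.va, self.vr, self.ea, self.er | {(u, v)})

    def merge(self, other):
        # Four grow-only sets, each merged by union.
        return TwoPTwoPGraph(
            self.va | other.va, self.vr | other.vr,
            self.ea | other.ea, self.er | other.er,
        )


base = TwoPTwoPGraph().add_vertex("a").add_vertex("b")
replica_1 = base.add_edge("a", "b")   # one replica connects a -> b
replica_2 = base.remove_vertex("b")   # another concurrently removes b

merged = replica_1.merge(replica_2)
edge_visible = merged.has_edge("a", "b")
```

After the merge the edge record still exists internally, but it is hidden because vertex "b" has a tombstone, so the graph never exposes a dangling edge.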
CRDTs tend to create a lot of garbage: tombstones grow and internal structures become unbalanced. In general, garbage collection is extremely difficult to do without synchronization. Luckily, this doesn’t impact correctness, only efficiency and performance.
Client: you have to come up with a common representation across languages, allocation of actor IDs is problematic, and you can only use state-based CRDTs.
Server: no one really implements them yet (Cassandra’s counter has some anomalies).
Editor's Notes