CAP Theorem - Theory, Implications and Practices


By Yoav Francis and Tomer Cagan, Concurrent and Distributed Computing Seminar, IDC 2012

Overview on the CAP Theorem, NoSQL and Distributed Databases

  • Atomic Data Object = think of the system as a single node that the user works against. This is transparent to the user, who expects the system to behave as if it ran on a single node.
  • including those in which messages are lost = Partition Tolerance
  • In an asynchronous network, a process (or algorithm) cannot tell whether a message reached the other side, since there is no bound on delivery time, and it must be able to continue without this knowledge. An algorithm that works correctly in this environment cannot differentiate it from an environment in which messages are lost, so it would work there as well, which contradicts Theorem 1. We cannot know whether messages are lost precisely because we are asynchronous; if we could know, we would not be asynchronous.
  • Skip. There is an internal synchronized clock, so timeouts and the like are possible. Here, if messages are not lost, such a system can be implemented.
  • CAP Theorem - Theory, Implications and Practices

    1. 1. CAP Theorem: Theory, Implications and Practices • Tomer Cagan, Yoav Francis • June 2012 • Seminar in Concurrent and Distributed Computing, Prof. Gadi Taubenfeld, 2012/2 • Interdisciplinary Center, Herzliya, Israel
    2. 2. Agenda • Survey • CAP - Quick Glance • Background and Motivation • Who needs it? • Why? • A little about consistency models • CAP Theorem • Brewer's conjecture and proof • Context (FLP – Consensus) • Implications • CAP in Practice - Living with CAP
    3. 3. Quick Survey • Do you know/use any of the following: • SQL? • ACID? • Database replication? • NoSQL? • NoSQL Types/Implementations? • Distributed Development? • Ever heard of CAP Theorem?
    4. 4. Purpose/Goals • CAP Theorem is at the base of developing distributed systems • Still, not everyone is aware of it. • We want to (goals) o introduce it (theorem) o understand what it means to us as developers (implications and criticism) o learn (CAP in practice): ▪ what others are doing ▪ what can be done
    5. 5. Brewer's CAP Theorem • Presented as a conjecture at PODC 2000 (Brewer's conjecture) • Formalized and proved in 2002 by Nancy Lynch and Seth Gilbert (MIT) • Consistency, Availability and Partition-Tolerance cannot all be achieved at the same time in a distributed system • There is a tradeoff between these 3 properties
    6. 6. CAP - Definition In simple terms: in an asynchronous network that performs as expected, where messages may be lost (partition-tolerance), it is impossible to implement a service that provides consistent data and responds eventually to every request (availability) under every pattern of message loss.
    7. 7. CAP - Definitions • Consistency • Data is consistent and the same for all nodes. • All the nodes in the system see the same state of the data. • Availability • Partition-tolerance
    8. 8. CAP - Definitions • Availability • Every request to a non-failing node should be processed and receive a response, whether it failed or succeeded. • Consistency • Partition-tolerance
    9. 9. CAP - Definitions • Partition-Tolerance • If some nodes crash / communication fails, service still performs as expected • Consistency • Availability
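The three definitions above can be made concrete with a toy model. The sketch below is not from the slides: `Replica`, `TwoNodeStore`, `Unavailable` and the `"CP"`/`"AP"` mode names are hypothetical, and the "network" is just a boolean flag. It only illustrates the choice a two-node register faces once the link between the nodes is cut.

```python
class Unavailable(Exception):
    """Raised when the CP-mode store refuses a request during a partition."""


class Replica:
    """One node holding a single register value (toy model)."""

    def __init__(self, name):
        self.name = name
        self.value = "v0"


class TwoNodeStore:
    """Two replicas of one register; `partitioned` simulates a cut link."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("A")
        self.b = Replica("B")
        self.partitioned = False

    def write(self, node, value):
        other = self.b if node is self.a else self.a
        if not self.partitioned:
            node.value = value
            other.value = value  # synchronous replication while the link is up
        elif self.mode == "CP":
            # Cannot replicate, so refuse: consistency kept, availability lost.
            raise Unavailable("peer unreachable, write refused")
        else:
            # AP: accept the write locally; the replicas now diverge.
            node.value = value

    def read(self, node):
        return node.value


ap = TwoNodeStore("AP")
ap.partitioned = True
ap.write(ap.a, "v1")
print(ap.read(ap.a), ap.read(ap.b))  # the two nodes disagree
```

In `"AP"` mode every request is answered but a reader on node B sees a stale value; in `"CP"` mode the replicas never disagree, at the price of rejecting writes while the partition lasts.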
    10. 10. Example: Slides from: Talk.pdf
    11. 11. - No Partitions
    12. 12. - No Partitions
    13. 13. - No Partitions
    14. 14. - No Partitions
    15. 15. - With Partitions
    16. 16. - With Partitions
    17. 17. A Step Back Background and Motivation
    18. 18. CAP Theorem: relevant to whom?
    19. 19. Little Background • RDBMS • Scalability o Vertical Scaling o Horizontal Scaling o Big data challenge • Consistency Model
    20. 20. RDBMS • Emerged in 1970 (initially a mess) • Standardized with SQL • Ubiquitous – widely used and understood • Supports transactions • High availability is achieved via Replication • Master – Master • Master – Slave • Synchronous/Asynchronous • But, in general scales vertically…
    21. 21. Scalability • Vertical (scale up) • Few (10s max) nodes • Grow by adding/replacing hardware on nodes • “Simple” to work with (less concurrency) • But, expensive to scale (huge nodes, expensive equipment, expensive dedicated storage solutions)
    22. 22. Scalability – cont. • Horizontal (scale out) • Many nodes (100s, 1000s) • Grow by adding nodes • Easy to grow – adding commodity servers is not expensive • Especially with Virtualization and the cloud. • But • More complex management • Harder to understand state and develop…
    23. 23. Large Scale RDBMS • We saw that it is expensive to scale • We know what it gives us: • Guarantees Atomicity, Consistency, Isolation and Durability • E.g. supports transactions • “You know what you will get” • But how well does it work in the case of: • A largely distributed environment? • Very large volumes of data?
    24. 24. Reminder – Consistency Model • We've talked about it before: • Linearizability, quasi-linearizability • Transactional memory • Etc. • When does a change take effect • What is the "contract" between the system and the developer • Will discuss more later...
    25. 25. Consistency Model in RDBMS • We are very used to RDBMS • Clear contract o Transactions • Ever thought about the consistency model? • Consistency Model in RDBMS is ACID
    26. 26. Definition – ACID • Atomicity • Of an operation (transaction): "all or nothing". If part fails, the entire transaction fails. • Consistency* • The database remains in a valid state after the transaction. • Means adhering to the database rules (keys, uniqueness, etc.) • Different from CAP's consistency definition. • Isolation • Two simultaneous transactions cannot interfere with one another (executed as if executed sequentially). • Durability • Once a transaction is committed, it remains so indefinitely, even after power loss or crash (no caching).
    27. 27. ACID in Dist. Systems • Works well in many (most?) large sites • Proved problematic in very big sites • How to guarantee ACID properties? • Atomicity requires more thought - e.g. two-phase commit (and 3-phase commit, Paxos…) • Isolation requires a transaction to hold all of its locks for its entire duration - high lock contention! • Complex • Prone to failure - the algorithm should handle it • Failure = outage during write. • Comes with high-overhead commits.
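The two-phase commit mentioned above can be sketched in a few lines. This is a minimal in-process illustration, not a production protocol: `Participant`, its `can_commit` flag and `two_phase_commit` are hypothetical names, and there is no logging, timeout handling or coordinator recovery, which is exactly where the real complexity and overhead come from.

```python
class Participant:
    """Hypothetical resource manager that votes on a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote. A real participant would persist its vote and
        # keep holding its locks until the coordinator's decision arrives.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if the transaction committed on every participant."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:                   # phase 2: global commit
            p.commit()
        return True
    for p in participants:                       # phase 2: global abort
        p.abort()
    return False


nodes = [Participant("orders"), Participant("billing", can_commit=False)]
print(two_phase_commit(nodes), [p.state for p in nodes])
```

A single "no" vote aborts the transaction everywhere, which gives atomicity; the cost is that every participant waits, holding locks, for the slowest one.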
    28. 28. ACID in Dist. Systems • Ensuring ACID properties comes with performance overhead • Sometimes it's mandatory, in order not to sacrifice data integrity. • In very large scale sites this adds up: • Google: 0.5 s added to response time = 20% decrease in traffic • Amazon: 100 ms added to response time = 1% drop in sales
    29. 29. Back to CAP • Vendors therefore came up with their own storage solutions, e.g. • Google BigTable (over Google File System) • Amazon DynamoDB • Facebook – hybrid (Cassandra, Hadoop) • Twitter – move from MySQL to Cassandra • These solutions, as a group, are dubbed "NoSQL"; more on this later... • A common approach is to relax the consistency requirements for higher availability (or lower latency) • Sacrifice ACID-compliance in order to achieve higher performance – this is in line with CAP.
    30. 30. Relaxed Consistency? Why do we need to give up consistency for availability? Can't we have both (at the same time)? Let's look more deeply into the CAP Theorem...
    31. 31. CAP - Model • Atomic Data Object • There must exist a total order of operations s.t. each operation looks as if it were completed at a single instant – equivalent to executing them on a single node. • Available Data Object • Every request received by a non-failing node must get a response (any algorithm used in the service must eventually terminate) • Partition Tolerance • Both of the above should tolerate partitions • Model partitions as messages delayed/lost in the network
    32. 32. CAP – Theorem 1 It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: • Availability • Atomic Consistency in all fair executions (including those in which messages are lost) Asynchronous, i.e. there is no clock, nodes make decisions based only on the messages received and local computation.
    33. 33. CAP – Corollary 1.1 It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: • Availability, in all fair executions • Atomic Consistency, in fair executions in which no messages are lost. Intuition • In the asynchronous model the algorithm cannot tell whether messages are lost or merely delayed • Thus an algorithm meeting this weaker requirement would also meet the requirements of Theorem 1, contradicting it.
    34. 34. CAP – Theorem 2 It is impossible in the partially synchronous network model to implement a read/write data object that guarantees the following properties: • Availability • Atomic Consistency in all fair executions (including those in which messages are lost) Partially synchronous, i.e. every node has a clock, and all clocks increase at the same rate. However, they are not synchronized.
    35. 35. - implications
    36. 36. Side note - context Does this look somewhat familiar? • Several processes • Asynchronous communication • Some processes may fail (partitioned) Impossibility of Distributed Consensus with One Faulty Process (FLP, 1985)
    37. 37. Side note - context CAP ~ Impossibility of guaranteeing both safety and liveness in an unreliable distributed system: Consistency ~ safety - every response sent to a client is correct Availability ~ liveness - every request eventually receives a response
    38. 38. Side note - context Actually, this is similar to (a case of) consensus in an asynchronous system with faulty processes (impossible). Consensus is more difficult to meet than the requirements of CAP: achieving agreement is (provably) harder than achieving CAP's consistency requirement. CAP also implies that it is impossible to achieve consensus in a system subject to partitions.
    39. 39. Side note - context Criticism: The reduction is not one-to-one Requirement/definition in CAP for availability is slightly different from the fail-stop assumption in FLP - Failed nodes still participate. Read more here: (It is important and practical to know these "theoretical" problems - Take the distributed algorithms course next year!)
    40. 40. Eventual Consistency - BASE Along with the CAP conjecture, Brewer suggested a new consistency model - BASE (Basically Available, Soft state, Eventual consistency) • The BASE model gives up on Consistency from the CAP Theorem. • This model is optimistic and accepts eventual consistency, in contrast to ACID. o Given enough time, all nodes will be consistent and every request will return the same response. • Brewer points out that ACID and BASE are two extremes and one can have a range of options (consistency models) in choosing the balance between consistency and availability.
    41. 41. Eventual Consistency & BASE • Basically Available - the system does guarantee availability, in terms of the CAP theorem. It is always available, but subsets of data may become unavailable for short periods of time. • Soft state - State of system may change over time, even without input. Data does not have to be consistent. • Eventual Consistency - System will become consistent eventually in the future. ACID, on the contrary, enforces consistency immediately after any operation.
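One common way to obtain the "eventually consistent" behaviour described above is last-writer-wins reconciliation between replicas. The sketch below is an assumption-laden illustration, not something the slides prescribe: `LwwReplica` and `sync` are hypothetical names, and timestamps are plain integers assumed to be unique per write.

```python
class LwwReplica:
    """Replica storing (timestamp, value) per key; the newest write wins."""

    def __init__(self):
        self.data = {}

    def put(self, key, value, timestamp):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def get(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def merge_from(self, other):
        # Anti-entropy: pull every entry and keep the newer one per key.
        for key, (timestamp, value) in other.data.items():
            self.put(key, value, timestamp)


def sync(a, b):
    """Exchange state in both directions so the replicas converge."""
    a.merge_from(b)
    b.merge_from(a)


r1, r2 = LwwReplica(), LwwReplica()
r1.put("cart", "book", timestamp=1)      # accepted by replica 1 only
r2.put("cart", "pen", timestamp=2)       # accepted by replica 2 only
print(r1.get("cart"), r2.get("cart"))    # soft state: replicas disagree
sync(r1, r2)
print(r1.get("cart"), r2.get("cart"))    # after sync both hold the same value
```

Both replicas stay available for writes throughout; agreement is only restored when they exchange state, and the older of two conflicting writes is silently discarded.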
    42. 42. CAP Implications/Perspectives CAP is very prominent in discussions about the development of large, distributed systems. A new “eco-system” of “CAP aware” solutions is available and has been made common by the rise of massive web services. In the years since it was introduced, the theorem has drawn some criticism.
    43. 43. Cannot omit Partition-Tolerance You can't really choose 2 out of 3: For a distributed (i.e., multi-node) system to not require partition-tolerance it would have to run on a network which is guaranteed to never drop messages (or even deliver them late) and whose nodes are guaranteed to never die. We do not work with these types of systems, simply because they don't exist.
    44. 44. CAP - Revisited CAP was devised and proved before the systems affected by it became prevalent. Brewer - "CAP Twelve Years Later: How the “Rules” Have Changed" (2012) o Discusses misconceptions, suggests different models of consistency [3] Gilbert, Lynch - "Perspectives on the CAP Theorem" (2012) o Revisit proof concepts and discuss practical implications [4]
    45. 45. CAP - Revisited • Formal model too restrictive o In the proof compared to the conjecture. o E.g. relax time constraints (here) • Partitions are guaranteed to happen: o Maybe call it PAC (here) o Discussion (here) • Ignoring latency o Maybe call it PACELC: if Partitioned, trade Availability against Consistency; Else, trade Latency against Consistency (here)
    46. 46. CAP in Practice • CAP's implications change the way we think about and develop distributed systems • When designing/developing such a system one should be aware of CAP's considerations • We will now explore practical examples and techniques
    47. 47. Give up Scale Develop as usual with ACID. Restricts the growth options. This is more a business/design decision. But some really don't need scale (small businesses with relatively small data or a limited number of users). Not so interesting for this discussion, so we will continue.
    48. 48. NoSQL - Give up Consistency • Coined in 1998 by Carlo Strozzi for his RDBMS that does not use the standard SQL interface • Usually, gives up consistency to achieve availability o Does not support joins (are expensive) o No constraints (PK-FK) (related to joins) o Denormalization • Re-coined by a Rackspace employee in 2009 to label all data stores that do not provide ACID. • Started by internet giants (Google, Amazon etc) and later released as open source • Many, many variants... o see for a list.
    49. 49. NoSQL • Not Only SQL (initially strictly NO but...) • Sacrifices ACID-Compliance in order to achieve higher performance (Use BASE) o Maintains eventual consistency instead. • Distributed and fault tolerant • Scalable, redundancy on other servers (If one fails we can recover) • Usually scales horizontally and manages big amounts of data. • Used when performance and real-time-ness is more important than consistency of data.
    50. 50. NoSQL • Does not use SQL - Data does not necessarily follow a schema - it is partitioned among many nodes. We cannot do join operations. • Optimized for retrieval and appending. Usually works in key-value record storage. • Useful when working with a lot of data, that does not require following the ACID model. o Maybe not the best idea for your next banking application.
    51. 51. NoSQL - Taxonomy • Key-value: store key-value pairs. Values can be list etc. • Column-oriented (or column family): key value with subgroups of "columns" within a value that can be retrieved as a block • Document-oriented: store structured documents (JSON, XML) • Graph Database: model around nodes and edges with attributes on both. See more/comparison: for-Java-Developers
    52. 52. NoSQL – Key-Value • Support Simple Operations o get o put o delete • Operations based on (access by) a primary key. • Due to consistency model (eventual) you may have duplicates etc. • Very fast • Examples: o DynamoDB (Amazon) o Berkeley DB o Voldemort o Many others...
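The get/put/delete interface above, with access by primary key only, can be sketched as follows. This is an illustrative toy and not the API of DynamoDB, Berkeley DB or Voldemort: `KeyValueStore` and `node_count` are hypothetical, and each "node" is just an in-memory dict chosen by hashing the key.

```python
import hashlib


class KeyValueStore:
    """Toy key-value store whose keys are spread over several 'nodes'."""

    def __init__(self, node_count=3):
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash of the key decides which node owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

    def delete(self, key):
        self._node_for(key).pop(key, None)


store = KeyValueStore()
store.put("user:42", {"name": "Dana", "langs": ["he", "en"]})
print(store.get("user:42"))
store.delete("user:42")
print(store.get("user:42"))  # None once the key is gone
```

Because every operation touches exactly one node, there is nothing to join or coordinate, which is a large part of why such stores are fast and easy to scale out.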
    53. 53. NoSQL - Column Oriented • Column-oriented systems still use tables but have no joins (joins must be handled within your application). Obviously, they store data by column as opposed to traditional row-oriented databases. This makes aggregations much easier. • Examples: o Hadoop/HBase (Apache) o Cassandra (Apache) o SimpleDB o BigTable (Google) • MapReduce used for retrieving (map) and aggregating (reduce) data.
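The map and reduce roles mentioned above can be illustrated in plain Python. This is a single-process sketch of the idea only, not Hadoop's API: `map_reduce`, the `mapper`/`reducer` callables and the sample `rows` are all made up for the example.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Group mapper output by key, then reduce each group to one value."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):   # map: emit (key, value) pairs
            groups[key].append(value)
    # reduce: collapse the list of values collected for each key
    return {key: reducer(key, values) for key, values in groups.items()}


rows = [
    {"country": "IL", "amount": 10},
    {"country": "US", "amount": 5},
    {"country": "IL", "amount": 7},
]

totals = map_reduce(
    rows,
    mapper=lambda row: [(row["country"], row["amount"])],
    reducer=lambda key, values: sum(values),
)
print(totals)  # {'IL': 17, 'US': 5}
```

In a real cluster the map calls run in parallel on the nodes that hold the data and only the grouped intermediate pairs travel over the network; the structure of the computation is the same.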
    54. 54. NoSQL - Document Oriented • Document-oriented systems store structured "documents" such as JSON or XML but have no joins (joins must be handled within your application). It's very easy to map data from object-oriented software to these systems. • Query the document with relatively familiar syntax (same as the document syntax) • Isolation at document level but not between documents o Can have a transaction on a document o Not so easy for many documents • Examples: o CouchDB o MongoDB o RavenDB (.NET)
    55. 55. NoSQL - Graph • Model nodes and relations o A node may have associated attributes o A node has relations to other nodes o Relations may have attributes as well • Replace joins with relationships o No more many-to-many tables = performance • Very useful and natural for networks: social, communication, biology • Often natively supports common graph operations o Shortest path o Diameter etc. • Examples: Neo4j...
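"Querying with the same syntax as the document" can be illustrated with a query-by-example matcher. The sketch below is a simplified illustration and not the query language of CouchDB, MongoDB or RavenDB: `matches`, `find` and the sample `posts` are hypothetical, and only exact-equality matching on (possibly nested) fields is supported.

```python
def matches(document, query):
    """True if every field in `query` equals the same field in `document`.

    Nested dicts in the query are matched recursively against nested
    dicts in the document.
    """
    for field, expected in query.items():
        if field not in document:
            return False
        actual = document[field]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(actual, expected):
                return False
        elif actual != expected:
            return False
    return True


def find(collection, query):
    """Return all documents in `collection` that match `query`."""
    return [doc for doc in collection if matches(doc, query)]


posts = [
    {"title": "CAP", "author": {"name": "Tomer", "city": "Herzliya"}, "tags": ["db"]},
    {"title": "BASE", "author": {"name": "Yoav", "city": "Herzliya"}, "tags": ["db"]},
]

# The query is itself a partial document.
print(find(posts, {"author": {"name": "Yoav"}}))
```

Note that the author is embedded in each post instead of being joined from another table, which is the denormalization trade-off described above.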
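The shortest-path operation listed above is typically a breadth-first search over the relationship edges. The sketch below shows the idea on a plain adjacency dict; it is not Neo4j's API, and `shortest_path` and the sample `friends` graph are made up for illustration.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path,
    or None if `goal` cannot be reached from `start`."""
    if start == goal:
        return [start]
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start.
                path = [goal]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
}
print(shortest_path(friends, "ann", "eve"))  # ['ann', 'bob', 'dan', 'eve']
```

In a relational schema each hop would be another self-join on a friendship table; following stored relationships directly is what makes this kind of traversal cheap in a graph store.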
    56. 56. NoSQL - Future Work on UnQL (unstructured) has begun - allows querying NoSQL DB's. • Queries collections instead of tables. • Queries documents instead of rows. • Superset of SQL (in SQL queries return same fields) • Does not support Data Definition (DDL) : CREATE TABLE etc. (but many times these stores are schema-less)
    57. 57. NoSQL - Summary • New classes of databases that are based on example work from internet giants • Usually relaxes consistency o but not always - read your manual • Not as trivial as using SQL o Transaction support may be limited and restricted in scope o No joins - need to care for it yourself o Design around questions and not "model" ▪ E.g. aggregated keys • Implement various consistency models (coming up) • And be careful with it!
    58. 58. Consistency Models • Give up (some) consistency in response to CAP • Several models that may fit different usage scenarios • Combine between models in application o Catalog with eventual consistency o Checkout/register with strong consistency • Next, look at some of the variants • We look from the client side (programmer) based on discussion in
    59. 59. Consistency Models - Examples Causal consistency. • Causal as in cause and effect (not "casual") • If process A has communicated to process B that it has updated a data item, a subsequent access by process B will return the updated value, and a write is guaranteed to supersede the earlier write. o Processes A and B are causally related • Access by process C that has no causal relationship to process A is subject to the normal eventual consistency rules. o A and C have no relation
    60. 60. Consistency Models - Examples Read-your-writes consistency. An important model in which process A, after it has updated a data item, always accesses the updated value and never sees an older value. This is a special case of the causal consistency model (A is causally related to itself). Session consistency. A practical version of the previous model. Within a session the system guarantees read-your-writes consistency (e.g. using a cache on the server). When the session terminates the data is stored. Guarantees do not carry over between sessions.
    61. 61. Consistency Models Monotonic read consistency. If a process has seen a particular value for the object, any subsequent access will never return a previous value. Good when data is relatively static; can even use a local cache. Monotonic write consistency. In this case the system guarantees to serialize the writes by the same process. Systems that do not guarantee this level of consistency are notoriously hard to program.
    62. 62. Segmentation - Smart Partitioning • No single uniform requirement o some aspects require strong consistency o others high availability. • Segmenting the system into components is an approach to circumventing CAP o each component provides different types of guarantees. • Overall guarantees neither consistency nor availability, yet ultimately each part of the service provides exactly what is needed. • Can be partitioned along various dimensions. o guarantees not always clear o specific to the given application and the particular partitioning scheme. o Thus, difficult but maybe necessary
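Whether two operations are causally related, as A and B are above and A and C are not, is often tracked with vector clocks. The slides do not name a mechanism, so treat this as one possible illustration: `happened_before`, `causally_related` and the three sample clocks are hypothetical.

```python
def happened_before(a, b):
    """True if vector clock `a` causally precedes vector clock `b`.

    Clocks are dicts mapping a process name to its event counter;
    a missing entry counts as 0.
    """
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    some_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and some_less


def causally_related(a, b):
    """True if one of the two events causally precedes the other."""
    return happened_before(a, b) or happened_before(b, a)


a_write = {"A": 1}            # A updates the item
b_write = {"A": 1, "B": 1}    # B heard about A's update, then wrote
c_write = {"C": 1}            # C wrote without hearing from A or B

print(happened_before(a_write, b_write))   # True: B must see A's update
print(causally_related(a_write, c_write))  # False: only eventual consistency applies
```

A causally consistent store has to order `a_write` before `b_write` on every replica, while it remains free to apply `c_write` in any position relative to them.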
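Session consistency can be sketched as a per-session cache of the session's own writes layered over a replica that may lag behind. The names here (`Session`, `own_writes`, and the `primary`/`replica` dicts standing in for servers) are hypothetical; this is one simple way to obtain the guarantee, not the method of any particular system.

```python
class Session:
    """Client session that guarantees read-your-writes within the session."""

    def __init__(self, primary, replica):
        self.primary = primary    # dict standing in for the primary copy
        self.replica = replica    # dict standing in for a lagging replica
        self.own_writes = {}      # writes made during this session

    def write(self, key, value):
        self.primary[key] = value
        self.own_writes[key] = value

    def read(self, key):
        # Prefer what this session wrote; otherwise ask the replica.
        if key in self.own_writes:
            return self.own_writes[key]
        return self.replica.get(key)


primary = {"name": "old"}
replica = {"name": "old"}     # has not yet received any new writes

mine = Session(primary, replica)
mine.write("name", "new")
print(mine.read("name"))      # new: the session sees its own write

other = Session(primary, replica)
print(other.read("name"))     # old: no guarantee across sessions
```

The second session illustrates the last point of the slide: the guarantee ends at the session boundary, and other clients still get plain eventual consistency.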
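Monotonic reads can be enforced on the client side by remembering the newest version seen per key. The sketch assumes, beyond what the slides say, that each replica exposes a version number with every value; `MonotonicReader` and the two sample replicas are hypothetical.

```python
class MonotonicReader:
    """Client that never returns a value older than one it already saw."""

    def __init__(self):
        self.last_seen = {}   # key -> (version, value)

    def read(self, key, replica):
        """`replica` maps key -> (version, value)."""
        candidate = replica.get(key)
        seen = self.last_seen.get(key)
        if candidate is not None and (seen is None or candidate[0] >= seen[0]):
            self.last_seen[key] = candidate
            return candidate[1]
        # The replica is behind what we have seen: serve the remembered value.
        return seen[1] if seen is not None else None


up_to_date = {"price": (2, 120)}
lagging = {"price": (1, 100)}

reader = MonotonicReader()
print(reader.read("price", up_to_date))  # 120
print(reader.read("price", lagging))     # still 120, not the older 100
```

This is the "local cache" idea from the slide: the remembered entry doubles as a floor on acceptable versions and as a fallback value when a replica lags.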
    63. 63. Partitioning - Examples • Data partitioning • Operational partitioning • Functional partitioning • User partitioning • Hierarchical partitioning
    64. 64. Partitioning - Examples Data partitioning • Different data may require different consistency and availability • Example: o Shopping cart - high availability, responsive, can sometimes suffer anomalies o Product information needs to be available; slight variation in inventory is tolerable o Checkout, billing, shopping records must be consistent...
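In code, data partitioning often ends up as an explicit table that maps each class of data to the consistency level it needs. The sketch below mirrors the shopping example above; the `Consistency` enum, the `POLICY` table, its key names and the strong default are assumptions made for illustration.

```python
from enum import Enum


class Consistency(Enum):
    EVENTUAL = "eventual"   # favour availability, tolerate anomalies
    STRONG = "strong"       # favour consistency, may refuse requests


# Hypothetical per-data-class policy for an online shop.
POLICY = {
    "cart": Consistency.EVENTUAL,
    "product_info": Consistency.EVENTUAL,
    "checkout": Consistency.STRONG,
    "billing": Consistency.STRONG,
}


def consistency_for(data_class):
    """Look up the policy; unknown data defaults to the safer choice."""
    return POLICY.get(data_class, Consistency.STRONG)


print(consistency_for("cart").value)      # eventual
print(consistency_for("billing").value)   # strong
```

Keeping the decision in one table makes the trade-off visible and reviewable, since it is as much a business decision as a technical one.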
    65. 65. Partitioning - Examples Operational partitioning • Each operation may require different balance between consistency and availability • Example o Reads - high availability o Writes - high consistency - lock when writing
    66. 66. Partitioning - Examples Hierarchical partitioning • Large global services with local "extensions" • Different locations in the hierarchy may use different consistency • Example o Local servers (better connected) guarantee more consistency and availability o Global servers experience more partitions and thus relax one of the requirements
    67. 67. Partitioning - Examples User partitioning • Try to keep related data close together to assure better performance • Minimize partitioning and thus get better consistency and availability • Less consistency between non-related data • E.g. keep a cluster of related Facebook users together.
    68. 68. Partitioning - Examples Functional partitioning • System consists of sub-services • Different sub-services provide balance according to requirements • The composition (whole system) is not always available and consistent but each part is assured to work well.
    69. 69. Best-effort availability • This means sacrificing availability for consistency • Still optimize to give as much availability as possible • Makes more sense when the network is more reliable
    70. 70. Summary • Introduced CAP • What it is, what it is made of and where it applies • Explored its properties and implications • You can't have it all • You can't really give up on partition tolerance • Saw some ways systems are designed around this concept in order to achieve their (business) goals
    71. 71. Summary • The last point is important – we have to understand the limitations we face and see how to achieve the requirements of the system while taking these limitations into account • Must be another consideration in the design phase • Technical • But also, functional/business decision
    72. 72. So, what to use? “And remember also, most people are not building facebook, they are building reservation systems, tracking systems, HR systems, finance systems, order entry systems, banking systems, etc - things where transactions are sort of important (lose my status update - no big deal, lose my $100 transfer and I'm sort of mad). There is room for a lot of things out there.” (from AskTom answer)
    73. 73. Questions ? • We are probably out of time…
    74. 74. Bibliography 1) Eric A Brewer, Toward Robust Distributed Systems, PODC 2000 2) Seth Gilbert, Nancy Lynch, Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services, 2002 3) Eric A Brewer, CAP Twelve Years Later - How the rules have changed, IEEE Computer Society Computer Magazine, Feb. 2012 4) Seth Gilbert, Nancy Lynch, Perspectives on the CAP Theorem, IEEE Computer Society Computer Magazine, Feb. 2012 5) Werner Vogels (CTO, Amazon), Eventually Consistent – Revisited, All Things Distributed blog, December 2008 6) Ivan Giangreco, CAP Theorem Talk, University Of Basel, Fall 2010 (part of Distributed Information System course) 7) Kaushik Sathupadi, A plain English introduction to CAP Theorem 8) Google : CAP Theorem