Consistency Tradeoffs in Modern Distributed Database System Design
Article by Daniel J. Abadi, Yale University

Presentation by Arinto Murdopo
Outline
•   CAP Theorem
•   What’s wrong with CAP?
•   Consistency/Latency Tradeoff
•   Type of Replication
•   PACELC
•   DDBS in PACELC metrics
•   Conclusion
CAP theorem


[Figure: the three CAP properties (Consistency, Availability, Partition Tolerance) and the pairwise combinations between them: CA, CP, AP]
What’s wrong with CAP?
[Figure: the CAP diagram (Consistency, Availability, Partition Tolerance; CA, CP, AP), annotated "Reduced Consistency, let’s justify it!"]

(?) Tolerant to network partition, high availability -> sacrifice consistency
What’s wrong with CAP?
The P in CAP is a combination of:
• Partition tolerance ~ commonly used as justification
• Existence of a network partition itself ~ often forgotten
Consistency/Latency Tradeoff
Modern Database Design
• Dynamo ~ Amazon
• Cassandra ~ Facebook
• Voldemort ~ LinkedIn
• PNUTS ~ Yahoo

Availability & Latency are critical!

High Availability -> Replication is needed
Replication is used -> Consistency/Latency Tradeoff occurs
Type of Replication
1. Data updates sent to all replicas at the same time
 a. Without preprocessing layer ~ favors latency
 b. With preprocessing layer ~ favors consistency

2. Data updates sent to an agreed-upon location first
 a. Synchronous ~ favors consistency
 b. Asynchronous ~ favors latency. Used by PNUTS.
 c. Combination ~ configurable
Type of Replication
3. Data updates sent to an arbitrary location first
• The location for a data item is not always the same
  a. Synchronous ~ favors consistency
  b. Asynchronous ~ favors latency
• Used by Dynamo, Cassandra, and Riak in combination with 2c
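The synchronous, asynchronous, and combined options above can be illustrated with a small sketch. This is a toy model, not how any of the named systems is implemented: the function `replicate`, the `WriteResult` type, the `wait_for` parameter, and the timing values are all hypothetical. It assumes each replica has a known time at which it has applied the update, and shows how the acknowledgement policy trades latency against the number of replicas that are guaranteed current.

```python
from dataclasses import dataclass


@dataclass
class WriteResult:
    ack_ms: float          # when the client is acknowledged
    replicas_at_ack: int   # replicas guaranteed to hold the write at that moment


def replicate(apply_ms, mode, wait_for=None):
    """Model one write (hypothetical helper, for illustration only).

    apply_ms[i] is a made-up time (ms after the client's request) at which
    replica i has applied the update; apply_ms[0] is the location that
    receives the update first.
    """
    n = len(apply_ms)
    if mode == "sync":
        # Acknowledge only after every replica has applied the update.
        return WriteResult(max(apply_ms), n)
    if mode == "async":
        # Acknowledge as soon as the first location has applied it.
        return WriteResult(apply_ms[0], 1)
    if mode == "combined":
        # Wait for the first location plus the fastest (wait_for - 1) others.
        if wait_for is None or not 1 <= wait_for <= n:
            raise ValueError("wait_for must be between 1 and the replica count")
        waited = [apply_ms[0]] + sorted(apply_ms[1:])[: wait_for - 1]
        return WriteResult(max(waited), wait_for)
    raise ValueError(f"unknown mode: {mode!r}")


times = [5.0, 20.0, 120.0]  # assumed values: local node, nearby node, remote node
for mode, k in [("sync", None), ("async", None), ("combined", 2)]:
    print(mode, replicate(times, mode, k))
```

With one slow remote replica, the synchronous write is as slow as that replica, the asynchronous write is as fast as the first location, and the combined mode sits in between depending on `wait_for`.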
PACELC
P ~ when there is a partition
• Trade off between A (availability) and C (consistency)

E ~ else (no partition)
• Trade off between L (latency) and C (consistency)
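A PACELC classification is usually written as a label such as PA/EL. The sketch below shows one way to build and read such a label; the helper names `pacelc_label` and `favored` are hypothetical and exist only for illustration.

```python
PRIORITIES = {"A": "availability", "C": "consistency", "L": "latency"}


def pacelc_label(on_partition: str, otherwise: str) -> str:
    """Build a label such as 'PA/EL' from the two choices."""
    if on_partition not in ("A", "C"):
        raise ValueError("on a partition the choice is A or C")
    if otherwise not in ("L", "C"):
        raise ValueError("without a partition the choice is L or C")
    return f"P{on_partition}/E{otherwise}"


def favored(label: str, partitioned: bool) -> str:
    """Return what a system with this label favors in the given situation."""
    p_part, e_part = label.split("/")
    letter = p_part[1] if partitioned else e_part[1]
    return PRIORITIES[letter]


label = pacelc_label("A", "L")
print(label, "->", favored(label, partitioned=True), "/", favored(label, partitioned=False))
```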
DDBS in PACELC metrics
                    P (partition)         E (else)
DDBS                A         C           L         C
Dynamo              v                     v
Cassandra           v                     v
Riak                v                     v
VoltDB/H-Store                v                     v
Megastore                     v                     v
MongoDB             v                               v
PNUTS                         v           v

Dynamo, Cassandra, and Riak have user-adjustable settings for the L/C tradeoff!
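The table can also be written down as data and queried. The sketch below encodes the rows exactly as shown on this slide (the default behaviour of each system); the names `PACELC_TABLE` and `systems_with` are hypothetical.

```python
# (choice on partition, choice otherwise) for each system, as in the table.
PACELC_TABLE = {
    "Dynamo": ("A", "L"),
    "Cassandra": ("A", "L"),
    "Riak": ("A", "L"),
    "VoltDB/H-Store": ("C", "C"),
    "Megastore": ("C", "C"),
    "MongoDB": ("A", "C"),
    "PNUTS": ("C", "L"),
}


def systems_with(on_partition=None, otherwise=None):
    """List systems matching the given choices; None means 'any'."""
    return [
        name
        for name, (p, e) in PACELC_TABLE.items()
        if on_partition in (None, p) and otherwise in (None, e)
    ]


print("Favor availability on partition:", systems_with(on_partition="A"))
print("Favor latency otherwise:", systems_with(otherwise="L"))
print("Consistency in both cases:", systems_with("C", "C"))
```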
Conclusion
• CAP is still important
• Exploring new metrics is good
• PACELC metrics are worth considering
