Dynamo: Amazon’s Highly
Available Key-value Store
Giuseppe DeCandia, Deniz Hastorun,
Madan Jampani, Gunavardhan Kakulapati,
Avinash Lakshman, Alex Pilchin, Swaminathan
Sivasubramanian, Peter Vosshall
and Werner Vogels
Motivation
 Build a distributed storage system:
 Scale
 Simple: key-value
 Highly available
 Guarantee Service Level Agreements (SLA)
System Assumptions and Requirements
 Query Model: simple read and write operations to a data
item that is uniquely identified by a key.
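 ACID Properties: Dynamo targets applications that tolerate
weaker consistency (the “C” in ACID) in exchange for high
availability. It provides no isolation guarantees and permits
only single-key updates.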
 Efficiency: latency requirements which are in general
measured at the 99.9th percentile of the distribution.
 Other Assumptions: operation environment is assumed
to be non-hostile and there are no security related requirements
such as authentication and authorization.
Service Level Agreements (SLA)
 Application can deliver its
functionality in a bounded
time: Every dependency in the
platform needs to deliver its
functionality with even tighter
bounds.
 Example: service guaranteeing
that it will provide a response within
300ms for 99.9% of its requests for a
peak client load of 500 requests per
second.
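The SLA example above can be checked mechanically. The following is a minimal sketch, assuming latencies are collected as a list of millisecond samples; the names `percentile` and `meets_sla` and the nearest-rank percentile definition are illustrative choices, not something the slides specify.

```python
import math
from fractions import Fraction


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it.

    pct is given as a string (e.g. "99.9") or Fraction so that the rank
    is computed exactly, without floating-point rounding.
    """
    if not samples:
        raise ValueError("no samples")
    fraction = Fraction(pct) / 100
    if not 0 < fraction <= 1:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(fraction * len(ordered))  # 1-based rank
    return ordered[rank - 1]


def meets_sla(latencies_ms, bound_ms=300.0, pct="99.9"):
    """True if the pct-th percentile latency is within the bound."""
    return percentile(latencies_ms, pct) <= bound_ms
```

With 1000 samples, the 99.9th percentile is the 999th smallest value, so a single slow outlier is tolerated but two are not.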
Figure: Service-oriented architecture of Amazon’s platform
Design Consideration
 Sacrifice strong consistency for availability
 Conflict resolution is executed during read
instead of write, i.e. “always writeable”.
 Other principles:
 Incremental scalability.
 Symmetry.
 Decentralization.
 Heterogeneity.
Summary of techniques used in Dynamo
and their advantages
Problem | Technique | Advantage
Partitioning | Consistent hashing | Incremental scalability
High availability for writes | Vector clocks with reconciliation during reads | Version size is decoupled from update rates.
Handling temporary failures | Sloppy quorum and hinted handoff | Provides high availability and durability guarantee when some of the replicas are not available.
Recovering from permanent failures | Anti-entropy using Merkle trees | Synchronizes divergent replicas in the background.
Membership and failure detection | Gossip-based membership protocol and failure detection | Preserves symmetry and avoids having a centralized registry for storing membership and node liveness information.
Partition Algorithm
 Consistent hashing: the output
range of a hash function is treated as a
fixed circular space or “ring”.
 “Virtual Nodes”: Each node can
be responsible for more than one
virtual node.
Advantages of using virtual nodes
 If a node becomes unavailable the
load handled by this node is evenly
dispersed across the remaining
available nodes.
 When a node becomes available
again, the newly available node
accepts a roughly equivalent
amount of load from each of the
other available nodes.
 The number of virtual nodes that a
node is responsible for can be decided
based on its capacity, accounting
for heterogeneity in the physical
infrastructure.
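The ring and its virtual nodes can be sketched as follows. This is a minimal illustration, not Dynamo's implementation: the class name `HashRing`, the `node#i` labelling of virtual nodes, and the in-memory sorted list are assumptions made for the example. MD5 is used to place labels and keys on the ring, and a key is owned by the first virtual node found when walking clockwise from the key's position.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Map a string to a 128-bit position on the ring using MD5."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


class HashRing:
    """Consistent-hash ring where each physical node owns several positions."""

    def __init__(self):
        self._positions = []  # sorted positions of all virtual nodes
        self._owner = {}      # position -> physical node name

    def add_node(self, node: str, vnodes: int) -> None:
        # More virtual nodes means a larger share of the key space,
        # which is how capacity differences can be expressed.
        for i in range(vnodes):
            pos = ring_position(f"{node}#{i}")
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first position clockwise of the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

Removing a node only reassigns the keys that node owned; every other key keeps its owner, which is the property that makes scaling incremental.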
Replication
 Each data item is
replicated at N hosts.
 “preference list”: The list of
nodes that are responsible
for storing a particular key.
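A preference list can be derived by walking the ring clockwise from the key's position and skipping virtual nodes whose physical host is already chosen, so that the N replicas land on distinct hosts. The sketch below assumes the ring is given as a sorted list of `(position, physical_node)` pairs; the function name and this representation are illustrative.

```python
import bisect


def preference_list(ring, key_position, n):
    """Return up to n distinct physical nodes for a key.

    ring: list of (position, physical_node) pairs sorted by position,
          one entry per virtual node (an assumed representation).
    key_position: the key's position on the ring.
    """
    if not ring:
        return []
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_right(positions, key_position)
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in chosen:  # skip further virtual nodes of the same host
            chosen.append(node)
            if len(chosen) == n:
                break
    return chosen
```

For example, with virtual nodes at positions 10 (A), 20 (A), 30 (B), and 40 (C), a key at position 5 with N = 3 yields A, B, C rather than A, A, B.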
Data Versioning
 A put() call may return to its caller before the
update has been applied at all the replicas
 A get() call may return many versions of the
same object.
 Challenge: an object having distinct version sub-histories,
which the system will need to reconcile in the future.
 Solution: uses vector clocks in order to capture causality
between different versions of the same object.
Vector Clock
 A vector clock is a list of (node, counter)
pairs.
 Every version of every object is associated
with one vector clock.
 If the counters on the first object’s clock are
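less than or equal to the corresponding counters
for all of the nodes in the second clock, then the
first is an ancestor of the second and can be
forgotten. Otherwise, the two versions are in
conflict and require reconciliation.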
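The ancestor test can be written down directly. This is a minimal sketch that models a vector clock as a dict from node name to counter; the function names `descends`, `increment`, and `reconcile` are assumptions for illustration.

```python
def descends(a, b):
    """True if clock a has seen every event recorded in clock b,
    i.e. b is an ancestor of (or equal to) a."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def increment(clock, node):
    """Return a copy of clock with the counter for node advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def reconcile(versions):
    """Drop every version whose clock is a strict ancestor of another's.

    versions: list of (clock, value) pairs. Versions that survive are
    concurrent with each other and must be merged by the application.
    """
    survivors = []
    for i, (clock, value) in enumerate(versions):
        dominated = any(
            j != i and other != clock and descends(other, clock)
            for j, (other, _) in enumerate(versions)
        )
        if not dominated:
            survivors.append((clock, value))
    return survivors
```

If node Sx writes twice and then Sy and Sz each update the second version independently, the clocks `{Sx: 2, Sy: 1}` and `{Sx: 2, Sz: 1}` do not descend from each other, so `reconcile` keeps both and discards the two older versions.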
Vector clock example
Execution of get() and put()
operations
A client can use one of two strategies to select a node:
1. Route its request through a generic load
balancer that will select a node based on
load information.
2. Use a partition-aware client library that
routes requests directly to the appropriate
coordinator nodes.
Sloppy Quorum
 R (or W) is the minimum number of nodes that
must participate in a successful read (or write)
operation.
 Setting R + W > N yields a quorum-like
system.
 In this model, the latency of a get (or put)
operation is dictated by the slowest of the R
(or W) replicas. For this reason, R and W are
usually configured to be less than N, to
provide better latency.
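The condition R + W > N matters because it forces every read set to share at least one replica with every write set. The sketch below verifies that by brute force for small N; the function names are illustrative, and it covers only the arithmetic of a strict quorum over a fixed set of N replicas, not the sloppy variant that uses whichever healthy nodes are available.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """The quorum condition from the slide."""
    return r + w > n


def always_intersect(n, r, w):
    """Brute force: does every R-subset of replicas share at least one
    member with every W-subset?"""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )
```

For N = 3, the setting R = 2, W = 2 always overlaps, while R = 1, W = 1 does not: a read may contact only a replica that the latest write never reached.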
Hinted handoff
 Assume N = 3. When A
is temporarily down or
unreachable during a
write, send the replica to D.
 D receives a hint that the
replica belongs to A, and
it will deliver the replica to A
once A has recovered.
 Again: “always writeable”
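The handoff described above can be sketched with in-memory nodes. Everything here is a simplified assumption for illustration: the `Node` class, the `up` flag standing in for failure detection, and the explicit `deliver_hints` call standing in for a periodic background scan.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}  # replicas this node owns
        self.hints = []  # (intended_owner_name, key, value) held for others


def write(key, value, preference, fallbacks):
    """Write to every node in the preference list; for each node that is
    down, hand the replica to the next live fallback along with a hint."""
    spare = iter(fallbacks)
    for owner in preference:
        if owner.up:
            owner.store[key] = value
            continue
        stand_in = next((n for n in spare if n.up), None)
        if stand_in is None:
            raise RuntimeError("no live fallback node available")
        stand_in.hints.append((owner.name, key, value))


def deliver_hints(holder, nodes_by_name):
    """Forward hinted replicas to owners that are back up; keep the rest."""
    remaining = []
    for owner_name, key, value in holder.hints:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.store[key] = value
        else:
            remaining.append((owner_name, key, value))
    holder.hints = remaining
```

The write succeeds even though A is down, and D holds the replica separately from its own data until it can hand it back.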
Other techniques
 Replica synchronization:
 Merkle hash tree.
 Membership and Failure Detection:
 Gossip
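For replica synchronization, a Merkle tree lets two replicas compare a single root hash and descend only into subtrees that differ. The sketch below is one possible arrangement, not Dynamo's: it assumes keys are grouped into a fixed, power-of-two number of buckets, values are strings, and SHA-256 is the hash; all function names are illustrative.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str, buckets: int) -> int:
    """Assign a key to one of the leaf buckets."""
    return int.from_bytes(_h(key.encode("utf-8")), "big") % buckets


def leaf_hashes(store, buckets):
    """One hash per bucket, covering that bucket's sorted key=value pairs."""
    groups = [[] for _ in range(buckets)]
    for key in sorted(store):
        groups[bucket_of(key, buckets)].append(f"{key}={store[key]}")
    return [_h("\n".join(group).encode("utf-8")) for group in groups]


def build(leaves):
    """Return tree levels: levels[0] is the leaves, levels[-1] is [root].
    len(leaves) must be a power of two."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([_h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_buckets(a, b, level=None, index=0):
    """Descend from the root, following only subtrees whose hashes differ."""
    if level is None:
        level = len(a) - 1
    if a[level][index] == b[level][index]:
        return []
    if level == 0:
        return [index]
    return (differing_buckets(a, b, level - 1, 2 * index)
            + differing_buckets(a, b, level - 1, 2 * index + 1))
```

When two replicas hold identical data, the comparison stops at the root; when one key differs, only the bucket containing that key needs to be exchanged.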
Implementation
 Java
 Local persistence component allows for
different storage engines to be plugged in:
 Berkeley Database (BDB) Transactional Data
Store: objects of tens of kilobytes
 MySQL: objects larger than tens of kilobytes
 BDB Java Edition, etc.
Evaluation
