Dynamo is a highly available key-value storage system built by Amazon to power its e-commerce platform. It uses consistent hashing to partition data across nodes in a ring topology and achieves high availability of writes through techniques like vector clocks, hinted handoff, and quorums. Dynamo provides a simple interface to store and retrieve data identified by unique keys. It operates at massive scale and keeps latency low despite failures by adopting an eventually consistent model.
Amazon Dynamo
1. Dynamo: Amazon’s Highly Available Key-Value Store
Farley Lai
University of Iowa
poyuan-lai@uiowa.edu
February 21, 2014
Farley Lai (UIOWA)
Amazon Dynamo (Big Data)
February 21, 2014
1 / 14
2. Motivation
MapReduce processes big data in a parallel and distributed fashion.
Dynamo forms the foundation of big data, namely, the storage.
Shopping Cart
Clients tend to insert and update items frequently but review the cart to
check out only at the end. Is it fun if the system always asks you to
retry a few minutes later whenever an item is inserted or updated in the
shopping cart?
3. SOA of Amazon’s Platform
4. Roles
Service Provider: Amazon
Service: Dynamo, the storage service
Customer: application/service vendors
Client: applications/services
User: human and/or bots
Service Level Agreements (SLA)
SLAs are contracts between service providers and customers that specify
the quality of service guaranteed under an expected distribution of client requests.
Example: a service guarantees that it will provide a response within
300 ms for 99.9% of its requests at a peak client load of 500 requests per
second.
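The SLA example above can be checked against measured latencies. The following is a minimal sketch, not Amazon's tooling: the function name meets_sla, its parameters, and the sample data are all hypothetical. It counts the share of requests answered within the limit and uses integer arithmetic to avoid rounding surprises at the 99.9% boundary.

```python
def meets_sla(latencies_ms, limit_ms=300, target_per_mille=999):
    """Return True if at least target_per_mille/1000 of the requests
    finished within limit_ms (defaults: 99.9% within 300 ms)."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    within = sum(1 for t in latencies_ms if t <= limit_ms)
    # Integer comparison: within / total >= target_per_mille / 1000
    return within * 1000 >= target_per_mille * len(latencies_ms)


# Hypothetical samples: 10,000 requests, almost all fast.
ok_samples = [20] * 9990 + [900] * 10    # exactly 99.9% within the limit
bad_samples = [20] * 9989 + [900] * 11   # one slow request too many

print(meets_sla(ok_samples))
print(meets_sla(bad_samples))
# The mean of bad_samples is still far below 300 ms, which is why
# the SLA is stated on a high percentile rather than on the average.
print(sum(bad_samples) / len(bad_samples))
```

The two sample sets differ by a single slow request, yet only the first one satisfies the SLA, while the average latency hides the difference.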
5. What is Dynamo?
A distributed key-value storage service built on a ring topology with
high availability for writes
eventual consistency
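Accepting writes at any time under eventual consistency means replicas can hold divergent versions of the same key. The sketch below shows, under stated assumptions, how vector clocks (the technique named in the summary) can tell whether one version supersedes another or whether the two are concurrent and need reconciliation. The function names and the node names Sx, Sy and Sz are hypothetical, and this is an illustration rather than Dynamo's actual implementation.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if version a has seen every update recorded in version b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify the relationship between two versions."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "concurrent"


def merge(a, b):
    """Element-wise maximum, used after the client reconciles the values."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


v1 = increment({}, "Sx")    # first write handled by Sx
v2 = increment(v1, "Sx")    # second write, again through Sx
v3 = increment(v2, "Sy")    # update of v2 handled by Sy
v4 = increment(v2, "Sz")    # another update of v2 handled by Sz

print(compare(v2, v1))      # v2 was derived from v1
print(compare(v3, v4))      # neither saw the other's update
reconciled = increment(merge(v3, v4), "Sx")
print(reconciled)
```

Versions v3 and v4 both derive from v2 but were written through different nodes, so neither clock dominates the other. Merging the clocks and incrementing the coordinating node's counter yields a version that supersedes both.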
6. Requirements and Assumptions
Requirements
Simple read/write to data items identified by unique keys
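ACID (atomicity, consistency, isolation and durability) is relaxed: weaker consistency is accepted in exchange for availability, with no isolation guarantees and only single-key updates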
SLA: latency constraints on the 99.9th percentile of the
distribution
Assumptions
Trusted environment and machines without security concerns
7. Problems, Techniques and Advantages
Problem | Technique | Advantage
Partitioning | Consistent hashing | Incremental scalability
High write availability | Vector clocks with conflict resolution | Version size is decoupled from update rates
Temporary failures | Sloppy quorum, hinted handoff | High availability and durability guarantee despite some unavailable replicas
Permanent failures | Merkle trees | Fast replica synchronization
Membership | Gossip protocol | Decentralized registry for storing membership and liveness info
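The partitioning technique listed above, consistent hashing, can be sketched as follows to show where incremental scalability comes from. This is a simplified illustration under stated assumptions, not Dynamo's implementation: the Ring class, its method names, and the node names are hypothetical, each node gets a single position on the ring (no virtual nodes), and MD5 is used only to spread positions.

```python
import bisect
import hashlib


def ring_position(value):
    """Map a string to a 128-bit position on the ring."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class Ring:
    def __init__(self, nodes=()):
        self._positions = []   # sorted node positions
        self._owners = {}      # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove(self, node):
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def preference_list(self, key, n=3):
        """First n distinct nodes met walking clockwise from the key."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        start = bisect.bisect_right(self._positions, ring_position(key))
        total = len(self._positions)
        return [self._owners[self._positions[(start + i) % total]]
                for i in range(min(n, total))]


ring = Ring(["node-a", "node-b", "node-c"])
keys = ["cart:%d" % i for i in range(1000)]
before = {k: ring.preference_list(k, n=1)[0] for k in keys}

ring.add("node-d")
after = {k: ring.preference_list(k, n=1)[0] for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Every key that changed owner now belongs to the new node only.
print(all(after[k] == "node-d" for k in moved))
```

Adding node-d takes over part of one neighbor's key range and leaves every other assignment untouched, so capacity can be added one node at a time. The list returned by preference_list with n greater than 1 gives the successive nodes on the ring that would hold replicas of the key.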