Amazon Dynamo

Big data class presentation of the Amazon Dynamo storage service.

Published in: Technology

  1. Dynamo: Amazon’s Highly Available Key-Value Store. Farley Lai, University of Iowa, poyuan-lai@uiowa.edu, February 21, 2014. Farley Lai (UIOWA), Amazon Dynamo (Big Data), February 21, 2014.
  2. Motivation. MapReduce processes big data in a parallel and distributed fashion. Dynamo provides the foundation of big data, namely the storage. Shopping cart: clients tend to insert and update items frequently but review the cart to check out only at the end. Is it fun for the system to ask you to retry in a few minutes whenever an item is inserted or updated in the shopping cart?
  3. SOA of Amazon’s Platform
  4. Roles. Service provider: Amazon. Service: Dynamo, the storage service. Customer: application/service vendors. Client: applications/services. User: humans and/or bots. Service Level Agreements (SLAs): SLAs are contracts signed by service providers and customers, specifying the quality of service guaranteed for a given client access distribution. Example: a service guarantees that it will respond within 300 ms for 99.9% of its requests at a peak client load of 500 requests per second.
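
The SLA example on slide 4 can be checked against measured latencies with a few lines of Python. This is only a sketch: the function names and the sample latencies are hypothetical, and the 300 ms bound and 99.9% target are taken from the slide's example.

```python
def fraction_within(latencies_ms, bound_ms):
    """Fraction of requests answered within bound_ms."""
    within = sum(1 for t in latencies_ms if t <= bound_ms)
    return within / len(latencies_ms)


def meets_sla(latencies_ms, bound_ms=300, required=0.999):
    """True if at least `required` of the requests finished within bound_ms."""
    return fraction_within(latencies_ms, bound_ms) >= required


# Hypothetical measurements: 10,000 requests, a handful of slow outliers.
good_day = [20] * 9995 + [900] * 5   # 99.95% within 300 ms
bad_day = [20] * 9980 + [900] * 20   # 99.80% within 300 ms

print(meets_sla(good_day))  # True
print(meets_sla(bad_day))   # False
```

Both sample sets have a mean latency well under 300 ms, which is why the SLA is stated on the tail of the distribution rather than on the average.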
  5. What is Dynamo? A distributed key-value storage service built on a ring topology, with high availability for writes and eventual consistency.
  6. Requirements and Assumptions. Requirements: simple reads and writes to data items identified by unique keys; ACID (atomicity, consistency, isolation and durability), where Dynamo relaxes consistency in favor of availability and provides no isolation guarantees; SLA: latency constraints on the 99.9th percentile of the distribution. Assumptions: a trusted environment, so the machines have no security concerns to handle.
  7. Problems, Techniques and Advantages (problem: technique; advantage). Partitioning: consistent hashing; incremental scalability. High write availability: vector clocks with conflict resolution; version size is decoupled from update rates. Temporary failures: sloppy quorum and hinted handoff; high availability and durability guarantee despite some unavailable replicas. Permanent failures: Merkle trees; fast replica synchronization. Membership: gossip protocol; avoids a centralized registry for storing membership and liveness info.
  8. Partitioning. Consistent hashing: (1) key space, (2) token assignment, (3) replication, (4) load distribution, (5) node availability, (6) node capacity.
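
The consistent hashing scheme outlined on slide 8 can be sketched as a sorted ring of tokens. This is a minimal illustration, not Dynamo's implementation: the `Ring` class, the `node#i` token naming, and the default of 8 tokens per node are assumptions made for the example. Hashing with MD5 follows the Dynamo paper, which hashes keys to 128-bit identifiers.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a 128-bit position on the ring using MD5."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hash ring with several tokens per node."""

    def __init__(self, tokens_per_node: int = 8):
        self.tokens_per_node = tokens_per_node
        self._tokens = []  # sorted list of (position, node) pairs

    def add_node(self, node: str) -> None:
        # Each physical node owns several positions (tokens) on the ring,
        # which spreads its load over many small key ranges.
        for i in range(self.tokens_per_node):
            bisect.insort(self._tokens, (ring_position(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._tokens = [t for t in self._tokens if t[1] != node]

    def preference_list(self, key: str, n: int = 3) -> list:
        """Walk clockwise from the key and collect n distinct nodes."""
        if not self._tokens:
            return []
        start = bisect.bisect_left(self._tokens, (ring_position(key), ""))
        chosen = []
        for offset in range(len(self._tokens)):
            node = self._tokens[(start + offset) % len(self._tokens)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == n:
                    break
        return chosen


ring = Ring()
for name in ["A", "B", "C", "D"]:
    ring.add_node(name)
print(ring.preference_list("cart:42", n=3))
```

Skipping tokens that belong to an already chosen node keeps the N replicas on distinct physical machines. Removing a node that is not in a key's preference list leaves that list unchanged, which is the incremental-scalability property the slides attribute to this technique.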
  9. Data Versioning. Operations: (1) read() ⇒ get(), (2) write() ⇒ put(), (3) conflict resolution, (4) vector clock.
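
The vector clock mechanism listed on slide 9 can be sketched with plain dictionaries that map a node name to a counter. The helper names (`increment`, `descends`, `merge`, `reconcile`) are hypothetical; the node names Sx, Sy and Sz and the sequence of versions mirror the worked example in the Dynamo paper.

```python
def increment(clock, node):
    """Return a copy of the clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def merge(a, b):
    """Element-wise maximum of two clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def reconcile(versions):
    """Drop versions that are strict ancestors of another version.

    versions is a list of (clock, value) pairs; the survivors are
    concurrent siblings that the client has to resolve.
    """
    survivors = []
    for i, (clock, value) in enumerate(versions):
        dominated = any(
            j != i and other != clock and descends(other, clock)
            for j, (other, _) in enumerate(versions)
        )
        if not dominated:
            survivors.append((clock, value))
    return survivors


d1 = increment({}, "Sx")   # first write, handled by Sx
d2 = increment(d1, "Sx")   # second write, handled by Sx
d3 = increment(d2, "Sy")   # update of d2 handled by Sy
d4 = increment(d2, "Sz")   # concurrent update of d2 handled by Sz

siblings = reconcile([(d2, "v2"), (d3, "v3"), (d4, "v4")])
print([value for _, value in siblings])   # ['v3', 'v4']

d5 = increment(merge(d3, d4), "Sx")       # client merges, Sx coordinates the write
print(d5 == {"Sx": 3, "Sy": 1, "Sz": 1})  # True
```

A get() that returns more than one sibling hands the conflict to the application, and the following put() carries the merged clock so that older versions can be discarded.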
  10. Sloppy Quorum. (1) R + W > N, for example R = 2, W = 2, N = 3; (2) latency.
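
The condition R + W > N on slide 10 can be checked by brute force: with a fixed set of N replicas, every read set of size R then shares at least one replica with every write set of size W. The sketch below models a strict quorum over a fixed replica set; a sloppy quorum that writes to stand-in nodes during failures does not give the same overlap guarantee. The function names and latency figures are hypothetical.

```python
from itertools import combinations


def quorums_always_overlap(n, r, w):
    """True if every read set of size r intersects every write set of size w."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def quorum_latency(replica_latencies_ms, quorum_size):
    """Latency of an operation that waits for quorum_size replies.

    The operation completes when the quorum_size-th fastest replica
    answers, so the slowest replica inside the quorum sets the latency.
    """
    return sorted(replica_latencies_ms)[quorum_size - 1]


print(quorums_always_overlap(3, 2, 2))  # True:  2 + 2 > 3
print(quorums_always_overlap(3, 1, 2))  # False: 1 + 2 = 3
print(quorum_latency([12, 95, 30], 2))  # 30
```

Lowering R or W shortens the wait for a quorum but shrinks or removes the overlap, which is the latency trade-off the slide points to.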
  11. Replica Synchronization. Figures: two Merkle hash tree diagrams.
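
The Merkle hash trees shown on slide 11 let two replicas compare a key range by exchanging hashes instead of data. The sketch below assumes both replicas hold the same number of leaves and that this number is a power of two; SHA-256, the function names, and the sample entries are choices made for the example, not details taken from the slides.

```python
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return the tree as a list of levels: leaf hashes first, root last."""
    level = [digest(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        # Each parent is the hash of its two children concatenated.
        level = [digest(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Descend from the root, following only subtrees whose hashes differ."""
    top = len(tree_a) - 1
    suspects = [0] if tree_a[top][0] != tree_b[top][0] else []
    for depth in range(top - 1, -1, -1):
        suspects = [
            child
            for parent in suspects
            for child in (2 * parent, 2 * parent + 1)
            if tree_a[depth][child] != tree_b[depth][child]
        ]
    return suspects


replica_a = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
replica_b = [b"k1=v1", b"k2=v2", b"k3=stale", b"k4=v4"]

print(differing_leaves(build_tree(replica_a), build_tree(replica_b)))  # [2]
print(differing_leaves(build_tree(replica_a), build_tree(replica_a)))  # []
```

When the roots match, the replicas are in sync after a single comparison; otherwise only the branches that differ are explored, so the data transferred is proportional to the divergence rather than to the size of the key range.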
  12. Evaluation: latency
  13. Evaluation: load balance
  14. Evaluation: write buffer
