Cs782 presentation group7
Published in: Technology
Transcript

  • 1. Consistent Hashing and the Dynamo Model
      Ai Ren, Yina Du, and Mingliang Sun (Group 7)
  • 2. Outline
      − Motivation & Objective
      − Key Ideas in Dynamo
      − Simulation Method & Result
      − Conclusion
  • 3. Motivation
      It is all about $!
      − Massive-scale data spread across hundreds of nodes
      − Commodity hardware infrastructure
      − Failure is the norm, not the exception
  • 4. Motivation - Availability
      An always-on experience for end users
      − How to handle failures transparently?
      − Parity checking or replication?
      − Strongly consistent or eventually consistent?
      − Conflict resolution: who and when?
  • 5. Motivation - Scalability
      $ matters!
      Poor performance means losing customers and money
      − Increase capacity easily and incrementally
      Over-provisioning means unnecessary cost
      − Decrease capacity easily and incrementally
  • 6. Objective
      The service is always available to customers with a guaranteed response time, no matter what, and achieves this with as little $ as possible
  • 7. Key Ideas
      − A fully decentralized DHT (Distributed Hash Table)
      − Consistent hashing
          − Natural partitioning and load balancing (division of labor)
          − Minimum data migration when a node joins or leaves
      − Replication for fault tolerance
          − Quorum techniques: R + W > N
      − Eventual (weak) consistency model
      − Conflict resolution
          − By the application, not Dynamo
          − When reading, not writing
  • 8. Simulation - Overview
      − A performance test (PT) tool for concurrent requests
          − Dynamo applications
          − Gathers and records results
      − A ring of services acting as Dynamo nodes
          − Replication and fault tolerance
      − A proxy sits between the PT tool and the ring
          − A simple service interface
          − Request randomness
          − Membership discovery
  • 9. Simulation - Availability
      − When a node leaves, the coordinating node uses the next available node on the ring
      − With node replacement, right after a node leaves the ring (fails), a new node joins the ring, keeping the number of nodes unchanged
      − System load increases gradually (from 100 to 200 requests / second)
      − 4 simulation cases
          − W=2, N=3 (R=2)
              − With node replacement (15 nodes)
              − Without node replacement (15 → 10 nodes)
          − W=3, N=3 (R=1)
              − With node replacement (15 nodes)
              − Without node replacement (15 → 10 nodes)
  • 10. Simulation - Availability
      − No failed requests were recorded in any of the cases; the service remains available when a node leaves (and joins)
      − With replacement nodes, the service level (throughput) is maintained
      − A W=2 setting gives better performance, while a W=3 setting provides better fault tolerance
  • 11. Simulation - Scalability
      − Scalability: more nodes → larger capacity
      − Incremental & dynamic scalability: no service interruption
      − System load increases gradually (from 100 to 200 requests / second)
      − 6 simulation cases
          − W=2, N=3 (R=2)
              − 10 nodes
              − From 10 to 15 nodes
              − 15 nodes
          − W=3, N=3 (R=1)
              − 10 nodes
              − From 10 to 15 nodes
              − 15 nodes
  • 12. Simulation - Scalability
      − A ring with more nodes provides greater capacity (throughput) than a ring with fewer nodes
      − Moreover, capacity (throughput) increases incrementally (dynamically) as more nodes join the ring, without incurring service interruption
      − The higher the W setting, the better the fault tolerance, but the worse the write performance
  • 13. Conclusion
      − With consistent hashing, the Dynamo model is able to provide great scalability and availability
      − Massive-scale data storage on a large cluster of commodity infrastructure is possible
      − A real application: the shopping cart on www.amazon.com
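
The key ideas on slide 7 and the node-departure behavior on slide 9 can be illustrated with a small sketch. It is not the group's simulation code: the names `Ring`, `ring_position`, `preference_list`, and `quorum_overlaps`, the use of MD5, and the `node-i` naming are all assumptions made for illustration, and each node gets a single ring position (no virtual nodes), which is a simplification.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string onto the ring. MD5 is an arbitrary, assumed choice."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical consistent-hashing ring, one position per node."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.join(node)

    def join(self, node: str) -> None:
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owner[pos] = node

    def leave(self, node: str) -> None:
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owner[pos]

    def preference_list(self, key: str, n: int = 3) -> list:
        """Return the first n nodes found walking clockwise from the key."""
        if not self._positions:
            return []
        start = bisect.bisect_right(self._positions, ring_position(key))
        size = len(self._positions)
        return [
            self._owner[self._positions[(start + i) % size]]
            for i in range(min(n, size))
        ]


def quorum_overlaps(r: int, w: int, n: int) -> bool:
    """R + W > N: every read set shares at least one replica with every write set."""
    return r + w > n


# Usage: a 15-node ring, as in the simulation cases, with N=3 replicas per key.
ring = Ring(f"node-{i}" for i in range(15))
replicas = ring.preference_list("cart:alice", n=3)

# When the first replica leaves, the next node on the ring takes its place;
# the two surviving replicas keep their data, so little needs to migrate.
ring.leave(replicas[0])
fallback = ring.preference_list("cart:alice", n=3)
```

In this sketch, a departure only changes the replica set of keys that were stored on the departed node, which is the "minimum data migration" property. Both settings from the slides, (R=2, W=2, N=3) and (R=1, W=3, N=3), satisfy `quorum_overlaps`.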
