Cs782 presentation group7

Published in: Technology

  1. Consistent Hashing and the Dynamo Model
     Ai Ren, Yina Du, and Mingliang Sun (Group 7)
  2. Outline
     • Motivation & Objective
     • Key Ideas in Dynamo
     • Simulation Method & Result
     • Conclusion
  3. Motivation
     It is all about $!
     − Massive-scale data on hundreds of nodes
     − Commodity hardware infrastructure
     − Failure is the norm, not the exception
  4. Motivation - Availability
     An always-on experience for end users
     − How to handle failures transparently?
     − Parity checking or replication?
     − Strongly consistent or eventually consistent?
     − Conflict resolution: who and when?
  5. Motivation - Scalability
     $ matters!
     • Poor performance means losing customers and money
       − Increase capacity easily and incrementally
     • Over-provisioning means unnecessary cost
       − Decrease capacity easily and incrementally
  6. Objective
     The service is always available to customers with a guaranteed response time, no matter what, and achieves this with as little $ as possible
  7. Key Ideas
     • A fully decentralized DHT (distributed hash table)
     • Consistent hashing
       − Natural partitioning and load balancing (division of labor)
       − Minimal data migration when a node joins or leaves
     • Replication for fault tolerance
       − Quorum technique: R + W > N
     • Eventual (weak) consistency model
     • Conflict resolution
       − By the application, not Dynamo
       − When reading, not writing
  8. Simulation - Overview
     • A performance test (PT) tool for concurrent requests
       − Dynamo applications
       − Gathers and records results
     • A ring of services acting as Dynamo nodes
       − Replication and fault tolerance
     • A proxy sits between the PT tool and the ring
       − A simple service interface
       − Request randomness
       − Membership discovery
  9. Simulation - Availability
     • When a node leaves, the coordinating node uses the next available node on the ring
     • With node replacement, a new node joins the ring right after a node leaves (fails), keeping the number of nodes unchanged
     • System load increases gradually (from 100 to 200 requests/second)
     • 4 simulation cases
       − W=2, N=3 (R=2)
         • With node replacement (15 nodes)
         • Without node replacement (15 → 10 nodes)
       − W=3, N=3 (R=1)
         • With node replacement (15 nodes)
         • Without node replacement (15 → 10 nodes)
  10. Simulation - Availability
     • No failed requests were recorded in any of the cases; the service remains available when a node leaves (and joins)
     • With replacement nodes, the service level (throughput) is maintained
     • A W=2 setting gives better performance, while a W=3 setting provides better fault tolerance
  11. Simulation - Scalability
     • Scalability: more nodes → larger capacity
     • Incremental and dynamic scalability: no service interruption
     • System load increases gradually (from 100 to 200 requests/second)
     • 6 simulation cases
       − W=2, N=3 (R=2)
         • 10 nodes
         • From 10 to 15 nodes
         • 15 nodes
       − W=3, N=3 (R=1)
         • 10 nodes
         • From 10 to 15 nodes
         • 15 nodes
  12. Simulation - Scalability
     • A ring with more nodes provides greater capacity (throughput) than a ring with fewer nodes
     • Moreover, capacity (throughput) increases incrementally (dynamically) as more nodes join the ring, without service interruption
     • The higher the W setting, the better the fault tolerance but the worse the write performance
  13. Conclusion
     • With consistent hashing, the Dynamo model provided both scalability and availability in the simulation
     • Massive-scale data storage on a large cluster of commodity infrastructure is possible
     • A real application: the shopping cart on www.amazon.com
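
The key ideas on slide 7 (consistent hashing, N-way replication along the ring, and the quorum rule R + W > N) can be illustrated with a short sketch. This is not the group's simulation code. The `Ring` class, the use of MD5, the virtual-node count, and the node names are all assumptions made for illustration only.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a point on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def quorum_overlaps(n: int, r: int, w: int) -> bool:
    """Quorum rule from slide 7: read and write sets intersect if R + W > N."""
    return r + w > n


class Ring:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # points per physical node (assumed value)
        self._points = []      # sorted list of (position, node)
        for node in nodes:
            self.join(node)

    def join(self, node: str) -> None:
        # Each node owns several points, which spreads load more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._points, (ring_position(f"{node}#{i}"), node))

    def leave(self, node: str) -> None:
        # Dropping a node's points hands its keys to the next nodes clockwise.
        self._points = [p for p in self._points if p[1] != node]

    def preference_list(self, key: str, n: int = 3) -> list:
        """Return the first n distinct nodes clockwise from the key."""
        start = bisect.bisect_right(self._points, (ring_position(key), ""))
        result = []
        for step in range(len(self._points)):
            node = self._points[(start + step) % len(self._points)][1]
            if node not in result:
                result.append(node)
                if len(result) == n:
                    break
        return result


if __name__ == "__main__":
    ring = Ring([f"node-{i}" for i in range(10)])
    print(ring.preference_list("cart:alice", n=3))
    print(quorum_overlaps(n=3, r=2, w=2))
```

In this sketch, a key's replicas are the first N distinct nodes found walking clockwise, so when a node leaves, the next node on the ring takes its place, as on slide 9. When a node joins, the only keys that change owner are the ones that move to the new node, which is the "minimal data migration" property.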
