
CS782 Presentation (Group 7)


  1. Consistent Hashing and the Dynamo Model
     Ai Ren, Yina Du, and Mingliang Sun (Group 7)
  2. Outline
     - Motivation & Objective
     - Key Ideas in Dynamo
     - Simulation Method & Result
     - Conclusion
  3. Motivation
     It is all about $!
     - Massive-scale data in hundreds of nodes
     - Commodity hardware infrastructure
     - Failure is the norm, not the exception
  4. Motivation - Availability
     Always-on experience for end users
     - How to handle failures transparently?
     - Parity checking or replication?
     - Strongly consistent or eventually consistent?
     - Conflict resolution: who and when?
  5. Motivation - Scalability
     $ matters!
     Poor performance means losing customers and money
     - Increase capacity easily and incrementally
     Over-provisioning means unnecessary cost
     - Decrease capacity easily and incrementally
  6. Objective
     The service is always available to customers with a guaranteed response time, no matter what, and this is achieved with as little $ as possible.
  7. Key Ideas
     A fully decentralized DHT (Distributed Hash Table)
     Consistent hashing (see the sketch after this slide)
     - Natural partitioning and load balancing (division of labor)
     - Minimal data migration when a node joins/leaves
     Replication for fault tolerance
     - Quorum technique: R + W > N
     Eventual (weak) consistency model
     Conflict resolution
     - By the application, not Dynamo
     - At read time, not write time
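(Not part of the original deck: a minimal Python sketch of the consistent-hashing ring described above, with virtual nodes and the N-node preference list used for replication. Names such as HashRing and preference_list are illustrative, not Dynamo's actual API.)

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Consistent-hashing ring with virtual nodes (illustrative sketch only)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node gets `vnodes` points on the ring for smoother load balancing.
        self.ring = sorted(
            (self._hash(f"{node}#{v}"), node) for node in nodes for v in range(vnodes)
        )
        self._points = [p for p, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def preference_list(self, key, n=3):
        """Walk clockwise from the key's hash and collect the first n distinct physical nodes."""
        i = bisect_right(self._points, self._hash(key))
        nodes = []
        while len(nodes) < n:
            node = self.ring[i % len(self.ring)][1]
            if node not in nodes:
                nodes.append(node)
            i += 1
        return nodes

ring = HashRing([f"node{i}" for i in range(10)])
print(ring.preference_list("cart:alice", n=3))   # the N=3 replicas responsible for this key
```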
  8. Simulation - Overview
     Performance test tool for concurrent requests
     - Drives the Dynamo application
     - Gathers and records results
     A ring of services acting as Dynamo nodes
     - Replication and fault tolerance
     A proxy sits between the PT tool and the ring (sketched after this slide)
     - A simple service interface
     - Request randomization
     - Membership discovery
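(Again not from the deck: a hypothetical sketch of the proxy role described on this slide, a thin layer that exposes one simple get/put interface, picks a node at random for each request, and refreshes its view of ring membership. All names here are invented for illustration.)

```python
import random

class Proxy:
    """Thin layer between the performance-test tool and the ring (hypothetical sketch)."""

    def __init__(self, discover_members):
        self.discover_members = discover_members     # callback returning the current node list
        self.members = discover_members()

    def refresh_membership(self):
        # Nodes may join or leave the ring during a run; re-discover them periodically.
        self.members = self.discover_members()

    def put(self, key, value):
        coordinator = random.choice(self.members)    # spread requests randomly across nodes
        return coordinator.put(key, value)

    def get(self, key):
        coordinator = random.choice(self.members)
        return coordinator.get(key)
```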
  9. Simulation - Availability
     When a node leaves, the coordinating node uses the next available node on the ring (see the write sketch below)
     With node replacement, a new node joins the ring right after a node leaves (fails), keeping the number of nodes unchanged
     System load increases gradually (from 100 to 200 requests/second)
     4 simulation cases
     - W=2, N=3 (R=2)
       - With node replacement (15 nodes)
       - Without node replacement (15 → 10 nodes)
     - W=3, N=3 (R=1)
       - With node replacement (15 nodes)
       - Without node replacement (15 → 10 nodes)
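(Illustrative only: one way the skip-over behaviour in the first bullet could look in code. When a node has left the ring, the coordinator passes over it and uses the next available node until W acknowledgements are collected; is_up and store stand in for real health checks and storage calls.)

```python
def quorum_write(key, value, ring_nodes, is_up, store, w=2):
    """Try nodes in clockwise ring order starting at the key's coordinator; a failed
    node is skipped and the next available node on the ring takes its place, so the
    write still reaches W replicas and the client request does not fail."""
    acks = 0
    for node in ring_nodes:
        if not is_up(node):
            continue                      # node has left the ring: move to the next one
        store(node, key, value)
        acks += 1
        if acks == w:
            return True                   # write quorum met, acknowledge the client
    return False                          # fewer than W nodes reachable

# Example: node1 has failed; the write is still acknowledged by node0 and node2.
nodes = [f"node{i}" for i in range(5)]
ok = quorum_write("cart:alice", {"item": 1}, nodes,
                  is_up=lambda n: n != "node1",
                  store=lambda n, k, v: None,          # stand-in for the real storage call
                  w=2)
print(ok)   # True
```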
  10. Simulation - Availability
     No failed requests were recorded in any case; the service remains available when a node leaves (and joins)
     With replacement nodes, the service level (throughput) is maintained
     A W=2 setting gives better performance, while a W=3 setting provides better fault tolerance (see the quorum check below)
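(A quick check that is not on the slide, using the standard quorum-intersection argument: in both configurations every read quorum overlaps the latest write quorum, so the difference between W=2 and W=3 is where the cost falls, write latency versus how many replicas hold each acknowledged write.)

```latex
% Any read quorum and write quorum share at least R + W - N replicas.
\[ |Q_R \cap Q_W| \;\ge\; R + W - N \]
\[ N=3,\ W=2,\ R=2:\quad 2 + 2 - 3 = 1 \]
\[ N=3,\ W=3,\ R=1:\quad 3 + 1 - 3 = 1 \]
```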
  11. Simulation - Scalability
     Scalability: more nodes → larger capacity
     Incremental & dynamic scalability: no service interruption (a key-migration check follows this slide)
     System load increases gradually (from 100 to 200 requests/second)
     6 simulation cases
     - W=2, N=3 (R=2)
       - 10 nodes
       - From 10 to 15 nodes
       - 15 nodes
     - W=3, N=3 (R=1)
       - 10 nodes
       - From 10 to 15 nodes
       - 15 nodes
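(Also not from the deck: a small self-contained check of why growing from 10 to 15 nodes is cheap with consistent hashing. Only keys whose clockwise successor is now one of the five new nodes change owner, roughly a third of them on average, instead of nearly all keys as with naive hash-mod-N placement.)

```python
import hashlib
from bisect import bisect_right

def owner(key, nodes):
    """Return the node whose ring position is the first one clockwise from the key's hash."""
    h = lambda s: int(hashlib.md5(s.encode()).hexdigest(), 16)
    ring = sorted((h(n), n) for n in nodes)
    points = [p for p, _ in ring]
    return ring[bisect_right(points, h(key)) % len(ring)][1]

keys = [f"cart:{i}" for i in range(10_000)]
small = [f"node{i}" for i in range(10)]
large = [f"node{i}" for i in range(15)]
moved = sum(owner(k, small) != owner(k, large) for k in keys)
print(f"{moved / len(keys):.0%} of keys changed owner")   # roughly a third; every moved key lands on a new node
```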
  12. Simulation - Scalability
     A ring with more nodes provides greater capacity (throughput) than a ring with fewer nodes
     Moreover, capacity (throughput) increases incrementally (dynamically) as more nodes join the ring, without incurring any service interruption
     The higher the W setting, the better the fault tolerance, but the worse the write performance
  13. Conclusion
     With consistent hashing, the Dynamo model provides great scalability and availability
     Massive-scale data storage on a large cluster of commodity infrastructure is possible
     A real application: the shopping cart on www.amazon.com
