Strategies for Distributed Data Storage

Published in: Technology, Business

  1. Cassandra: Strategies for Distributed Data Storage
  2. I: Fat Clients are Expensive; II: Availability vs. Consistency; III: Strategies for Eventual Consistency
  3. I: Fat Clients are Expensive
  4. In the Beginning... Simple: 1 web server, 1 database. [Diagram: Web -> Thin Data API -> DB]
  5. Your Data Grows... Move tables to different DBs. [Diagram: Web -> Data API -> DB (user), DB (item)]
  6. A table grows too large... Shard table by PK ranges. [Diagram: Web -> Data API -> DB shards item 0, item 1, item 2, ... with PK ranges [0, 10k), [10k, 20k), [20k, 30k)]
  7. Problem: Multiple Client Languages. [Diagram: python, ruby, and java clients, each with its own Data API]
  8. Are there other trade-offs?
  9. II: Availability vs. Consistency
  10. Why consistency vs. availability? CAP Theorem
  11. CAP Theorem: You can have at most two of these properties in a shared-data system: Consistency, Availability, Partition-Tolerance.
  12. Problem: Sharded DB Cluster Favors C over A. [Diagram: Web -> Data API -> DB shards. SPOF. No replication.]
  13. Slightly better with master-slave replication... [Diagram: Web -> Data API -> DB shard master and DB shard slave. Write: SPOF, bottlenecked. Read: replicated.]
  14. Availability Arguments: Avoid SPOFs; Distribute Writes to All Nodes in Replica Set
  15. Availability. Easy: Write. [Diagram: write via coord.; replica A holds value “x”; replicas B and C show no value]
  16. Availability. Harder: Consistency Across Replicas. [Diagram: coord.; replicas A, B, and C all hold value “x”]
  17. So, how do we achieve consistency?
  18. III: Strategies for Eventual Consistency
  19. I: Write-Related Strategies; II: Read-Related Strategies
  20. Write-Related Strategies: I: Hinted Hand-Off; II: Gossip
  21. I: Hinted Hand-Off
  22. Hinted Hand-Off Problem: Write to an Unavailable Node
  23. Hinted Hand-Off Solution: 1) “hinted” write to a live node; 2) deliver hints when node is reachable
  24. Hinted Hand-Off Step 1: “hinted” write to a live node. Case: part of replica set is available. [Diagram: target A is dead; coord. sends the “hinted” write to B, the nearest live replica; C is the third replica]
  25. Hinted Hand-Off Step 1: “hinted” write to a live node. Case: all replica nodes unreachable. [Diagram: target A, B, and C are all dead; coord. sends the “hinted” write to the closest node]
  26. Hinted Hand-Off Step 2: deliver hints when node is reachable. [Diagram: node delivers “hinted” writes to the replica target (now available)]
  27. How does a node learn when another node is available?
  28. II: Gossip
  29. Gossip Problem: Each node cannot scalably ping every other node. 8 nodes: 8² = 64; 100 nodes: 100² = 10,000.
  30. Gossip Solution: I: Anti-Entropy Gossip Protocol; II: Phi-Accrual Failure Detector
  31. Gossip: Anti-Entropy Gossip Protocol. [Diagram: two nodes]
  32. Gossip: Phi-Accrual Failure Detector. Dynamically adjusts its “suspicion” level of another node, based on inter-arrival times of gossip messages.
  33. Read-Related Strategies: I: Read-Repair; II: Anti-Entropy Service
  34. I: Read-Repair
  35. Read-Repair Problem: A Write Has Not Propagated to All Replicas
  36. Read-Repair Solution: Repair Outdated Replicas After Read
  37. Read-Repair Example: Quorum Read; Replication Factor: 3
  38. Read-Repair Steps: 1) do digest-based read (return if digests match); 2) otherwise, do full read and repair replicas
  39. Read-Repair Step 1: do digest-based read. One full read; other reads are digest. [Diagram: coord. requests a full read (F) from A and digests (D) from B and C]
  40. Read-Repair Step 1: do digest-based read. Wait for 2 replies (where one is full read). [Diagram: A replies with F and B replies with D to coord.; C has not replied]
  41. Read-Repair Step 1: do digest-based read. Return value to client (if all digests match). [Diagram: D == digest(F); coord. returns value to client]
  42. Read-Repair Step 2: do full read and repair replicas. Full read from all replicas. [Diagram: coord. requests a full read (F) from A, B, and C]
  43. Read-Repair Step 2: do full read and repair replicas. Wait for 2 replies. [Diagram: A and B reply with F to coord.]
  44. Read-Repair Step 2: do full read and repair replicas. Calculate newest value from replies. replica A: value “x”, timestamp t0; replica B: value “y”, timestamp t1; reconciled: value “y”, timestamp t1.
  45. Read-Repair Step 2: do full read and repair replicas. Return newest value to client. [Diagram: coord. returns reconciled value to client]
  46. Read-Repair Step 2: do full read and repair replicas. Calculate repair mutations for each replica: diff(reconciled value, replica value) = repair mutation. Repair for Replica A: diff(“y” @ t1, “x” @ t0) = “y” @ t1. Repair for Replica B: diff(“y” @ t1, “y” @ t1) = null.
  47. Read-Repair Step 2: do full read and repair replicas. Send repair mutation to each replica. [Diagram: coord. sends repair mutation (R) to A; B and C are shown without R]
  48. What about values that have not been read?
  49. II: Anti-Entropy Service
  50. Anti-Entropy Service Problem: How to Repair Unread Values
  51. Anti-Entropy Service Solution: 1) detect inconsistency via Merkle Trees; 2) repair inconsistent data
  52. Anti-Entropy Service: Merkle Tree. A tree where a node’s hash summarizes the hashes of its children. [Diagram: root A (root hash: summarizes its children’s hashes); nodes B, C (node hash: summarizes its children’s hashes); leaves D, E, F, G (leaf hash: hash of a data block)]
  53. Anti-Entropy Service Step 1: detect inconsistency. Create Merkle Trees on all replicas. [Diagram: A creates its local Merkle Tree and requests Merkle Tree creation from B and C]
  54. Anti-Entropy Service Step 1: detect inconsistency. Exchange Merkle Trees between replicas. [Diagram: A, B, and C exchange Merkle Trees across all replicas]
  55. Anti-Entropy Service Step 1: detect inconsistency. Compare local and remote Merkle Trees. [Diagram: Replica A and Replica B each hold a tree with root A, nodes B and C, and leaves D, E, F, G; nodes are marked as match or mismatch]
  56. Anti-Entropy Service Step 2: repair inconsistent data. Send repair to remote replica. [Diagram: A sends B the repair for data hashed by node F]
  57. Any Questions?
  58. More Information. Cassandra Site: http://cassandra.apache.org/ My email address: kakugawa@gmail.com
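
Slide 32 describes the phi-accrual failure detector only in words. The following is a minimal sketch of the idea, not Cassandra's actual implementation: it assumes gossip inter-arrival times follow an exponential distribution, so that phi = elapsed / (mean interval × ln 10). The class name `PhiAccrualDetector`, its methods, and the window size are hypothetical names chosen for illustration.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Toy phi-accrual detector over gossip message arrival times (illustrative only)."""

    def __init__(self, window_size=100):
        # Sliding window of recent inter-arrival times (window size is an assumption).
        self.intervals = deque(maxlen=window_size)
        self.last_arrival = None

    def heartbeat(self, now):
        """Record that a gossip message from the peer arrived at time `now`."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Suspicion level: grows the longer the peer stays silent."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_arrival
        # Assumed exponential model: P(silence longer than elapsed) = exp(-elapsed / mean),
        # and phi = -log10(P) = elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)

short_silence = detector.phi(4.0)   # one mean interval of silence
long_silence = detector.phi(13.0)   # ten mean intervals of silence
```

A caller would compare `phi` against a threshold of its own choosing and treat the peer as unreachable once the threshold is exceeded; because the mean adapts to observed arrivals, the same threshold tolerates slower links.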

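Slides 44 to 46 walk through reconciling replies and computing a repair mutation per replica. The sketch below reproduces that example under simplifying assumptions: each value is a single string with an integer timestamp, the newest timestamp wins, and a "mutation" is just the reconciled value itself. `Versioned`, `reconcile`, and `repair_mutation` are hypothetical names, not Cassandra APIs.

```python
from typing import NamedTuple


class Versioned(NamedTuple):
    value: str
    timestamp: int


def reconcile(replies):
    """Return the reply with the newest timestamp (first one wins on ties)."""
    return max(replies.values(), key=lambda v: v.timestamp)


def repair_mutation(reconciled, replica_value):
    """Return the write that brings a replica up to date, or None if it is current."""
    if replica_value is not None and replica_value.timestamp >= reconciled.timestamp:
        return None
    return reconciled


# Replies from the slide example: replica A has "x" @ t0, replica B has "y" @ t1.
replies = {"A": Versioned("x", 0), "B": Versioned("y", 1)}
newest = reconcile(replies)
repairs = {name: repair_mutation(newest, v) for name, v in replies.items()}
```

Here `repairs["A"]` carries “y” @ t1 and `repairs["B"]` is `None`, matching the two diff results on slide 46.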
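
Slides 52 to 56 describe detecting inconsistency by comparing Merkle Trees and repairing only the mismatched data. The following sketch shows one way such a comparison can work, assuming both replicas hash the same number of data blocks and that the count is a power of two. The helper names (`build_tree`, `differing_leaves`) and the use of SHA-256 are illustrative assumptions, not details taken from the slides.

```python
import hashlib


def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return the tree as a list of levels: leaves first, root last.

    Assumes len(blocks) is a power of two.
    """
    level = [sha(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Each parent hash summarizes the hashes of its two children.
        level = [sha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(local, remote, depth=None, index=0):
    """Walk both trees from the root, descending only into mismatched subtrees."""
    if depth is None:
        depth = len(local) - 1  # start at the root level
    if local[depth][index] == remote[depth][index]:
        return []  # identical subtree: nothing to repair below this node
    if depth == 0:
        return [index]  # mismatched leaf: this data block needs repair
    return (differing_leaves(local, remote, depth - 1, 2 * index)
            + differing_leaves(local, remote, depth - 1, 2 * index + 1))


# Four data blocks standing in for leaves D, E, F, G; only the third differs.
replica_a = build_tree([b"d", b"e", b"f-new", b"g"])
replica_b = build_tree([b"d", b"e", b"f-old", b"g"])
stale = differing_leaves(replica_a, replica_b)
```

With these inputs `stale` contains only index 2, the block in leaf F's position, so only that block would need to be sent as a repair; matching subtrees are skipped without comparing their leaves.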