Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam Overton

Sam Overton's talk from Cassandra Europe on March 28th 2012


  1. Highly Available: The Cassandra Distribution Model. Sam Overton, Cassandra Europe 2012
  2. Cassandra is:
     ● built for scalability
     ● built to tolerate failure
     In this talk:
     ● Cassandra distribution overview
     ● Partitioning and placement
     ● Replication
     ● Consistency
  4. Overview
     ● High availability
     ● Partition tolerant
     ● Tunable consistency
     ● Scalable
     ● Replication
     ● No single point of failure
  6. Partitioning and placement
     Should...
     ● Assign data to hosts
     ● Have no S.P.O.F for routing clients to data
     ● Balance load
     ● Allow scaling without moving too much data
  7. Consistent Hashing
  8. Consistent Hashing: (k2, v2), (k1, v1), (k3, v3)
  9. Consistent Hashing
     ● partitioner maps key to ring token
     ● hosts' tokens determine placement of keys
     ● and proportion of data assigned to each host
     ● each row is stored on one host
     ● wide rows can cause hot-spotting!
     So how does it scale?
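The key-to-host mapping on slide 9 can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's implementation: the token space size, the MD5-based `token_for` function, the `Ring` class and the host names are all assumptions made for the example.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key):
    """Partitioner: map a row key to a token on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class Ring:
    """Hypothetical ring: each host owns the range ending at its token."""

    def __init__(self, host_tokens):
        # Sort (token, host) pairs so the ring can be searched with bisect.
        self._entries = sorted((tok, host) for host, tok in host_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner(self, key):
        """Return the host with the first token at or after the key's token."""
        idx = bisect.bisect_left(self._tokens, token_for(key))
        # Past the largest token, wrap around to the first host on the ring.
        return self._entries[idx % len(self._entries)][1]


ring = Ring({"host-a": 0, "host-b": RING_SIZE // 3, "host-c": 2 * RING_SIZE // 3})
owners = {key: ring.owner(key) for key in ("k1", "k2", "k3")}
```

Because every host can hold the same sorted token list, any host can answer `owner(key)` locally, which is what removes the single point of failure for routing.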
  10. Consistent Hashing
  11. Consistent Hashing: Bootstrapping a new node
  12. Consistent Hashing: Range is transferred from old host to new host
  13. Consistent Hashing
  14. Consistent Hashing
  15. Consistent Hashing
  16. Consistent Hashing: Decommission is the reverse process
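The bootstrap and decommission slides can be checked with a small experiment. The sketch below assumes an MD5-based token function and made-up host names and tokens; it only demonstrates that adding one node moves a single range from a single old host.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key):
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(host_tokens, key):
    """Host with the first token at or after the key's token (wrapping)."""
    entries = sorted((tok, host) for host, tok in host_tokens.items())
    tokens = [tok for tok, _ in entries]
    idx = bisect.bisect_left(tokens, token_for(key))
    return entries[idx % len(entries)][1]


keys = [f"row-{i}" for i in range(1000)]
before = {"host-a": 0, "host-b": RING_SIZE // 3, "host-c": 2 * RING_SIZE // 3}

# Bootstrap: the new host-d takes a token halfway between host-a and host-b.
after = dict(before)
after["host-d"] = RING_SIZE // 6

moved = [k for k in keys if owner(before, k) != owner(after, k)]
donors = {owner(before, k) for k in moved}     # hosts that give up data
receivers = {owner(after, k) for k in moved}   # hosts that receive data
```

Only keys in the range now owned by `host-d` change hands, and they all come from `host-b`. Swapping `before` and `after` models decommissioning `host-d`: the same range flows back.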
  17. Consistent Hashing
  18. Consistent Hashing
      ● Tokens can be assigned manually, automatically or randomly
      ● Every node has full knowledge of placement
      ● Client connects to any node, max 1 hop to data
      ● Node status is gossiped
  19. Partitioners
      ● Converts a row key (from client data) into a token on the ring
      ● RandomPartitioner
      ● Order Preserving Partitioner
  20. Partitioners: Random Partitioner
      ● token = hash(key)
      ● good load balancing
      ● no range queries across row keys
  21. Partitioners: Order Preserving Partitioner
      ● token = key
      ● requires manual load balancing
      ● careful selection of tokens around the ring
      ● allows range queries across row keys
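The difference between the two partitioners comes down to whether ring order matches key order. A minimal sketch, with hypothetical function names and sample keys:

```python
import hashlib


def random_partitioner_token(key):
    """token = hash(key); MD5 is used here for illustration."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def order_preserving_token(key):
    """token = key, so ring order equals key order."""
    return key


keys = ["apple", "banana", "cherry", "damson"]

# Order in which the rows would appear when walking around the ring.
ring_order_random = sorted(keys, key=random_partitioner_token)
ring_order_ordered = sorted(keys, key=order_preserving_token)

# With order-preserving tokens, a key range is one contiguous arc of the ring.
range_b_to_d = [k for k in ring_order_ordered if "b" <= k < "d"]
```

With hashed tokens, neighbouring keys land at unrelated positions, which spreads load evenly but means a range of row keys is scattered over the whole ring.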
  22. Partitioners
      ● Get it right first time!
      ● Design data model for RP (RandomPartitioner)
      ● Custom partitioners are possible if necessary
  24. Replication
      ● For availability
      ● For redundancy
      ● Can increase read bandwidth
  25. Replication
      ● Replication Factor (RF) is number of copies of data
      ● Defined per-keyspace
      ● Can be changed (eg. if data becomes more/less valuable)
      ● Determines how many failures can be tolerated
  26. Replication Strategy
      ● Determines how replicas are assigned for each host
      ● Defined per keyspace (like RF)
      ● SimpleStrategy
      ● NetworkTopologyStrategy
      ● Custom strategies can be written
  27. Replication Strategy: Simple Strategy, eg. RF=3, with (k1, v1) and (k2, v2)
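SimpleStrategy places the first replica on the host that owns the key's token and the remaining replicas on the next hosts clockwise around the ring. A minimal sketch of that walk, using a made-up four-host ring with small integer tokens:

```python
import bisect


def simple_strategy_replicas(ring, key_token, rf):
    """Return rf hosts, walking clockwise from the owner of key_token.

    ring is a list of (token, host) pairs sorted by token.
    """
    tokens = [tok for tok, _ in ring]
    start = bisect.bisect_left(tokens, key_token)
    count = min(rf, len(ring))  # cannot place more replicas than hosts
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical ring; tokens are small numbers for readability.
ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]

replicas_k1 = simple_strategy_replicas(ring, 10, 3)  # token 10 is owned by B
replicas_k2 = simple_strategy_replicas(ring, 60, 3)  # token 60 is owned by D
```

Here `replicas_k1` is `["B", "C", "D"]` and `replicas_k2` wraps around the ring to give `["D", "A", "B"]`.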
  28. Replication Strategy: Network Topology Strategy
  29. Replication Strategy: Network Topology Strategy. Multi-datacentre support (DC1, DC2)
  30. Replication Strategy: Network Topology Strategy
  31. Snitches
      ● Enables routing of requests according to node proximity
      ● Used by replication strategy to determine rack and DC membership
      ● Custom snitches can be written
  32. Simple Snitch
      ● Every host is in the same rack & DC with equal proximity
      RackInferringSnitch
      ● Infers the rack & DC from IP address of host
      ● eg. 123.8.2.100: DC = 8, rack = 2, host = 100
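The rack-inferring rule on slide 32 reads the location straight out of the IPv4 octets. A minimal sketch; the function names and the proximity ranking are illustrative assumptions, not the snitch's actual interface:

```python
def infer_location(ip):
    """Split an IPv4 address into DC, rack and host from octets 2, 3 and 4."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    return {"dc": octets[1], "rack": octets[2], "host": octets[3]}


def proximity(a, b):
    """Lower is closer: 0 = same rack, 1 = same DC, 2 = different DC."""
    loc_a, loc_b = infer_location(a), infer_location(b)
    if loc_a["dc"] != loc_b["dc"]:
        return 2
    return 0 if loc_a["rack"] == loc_b["rack"] else 1


location = infer_location("123.8.2.100")
nearest_first = sorted(
    ["123.9.1.7", "123.8.3.5", "123.8.2.101"],
    key=lambda ip: proximity("123.8.2.100", ip),
)
```

Sorting candidate replicas by `proximity` puts the same-rack host first, then the same-DC host, then the remote one.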
  33. EC2Snitch
      ● DC = EC2 region
      ● Rack = EC2 availability zone
      Property file snitch
      ● Rack and DC membership read from configuration file
  34. DynamicSnitch
      ● Wraps each of the other snitches
      ● Records latency stats from read operations
      ● Avoids routing to slow hosts
      ● Configurable update intervals
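The idea behind the DynamicSnitch, recording read latencies and preferring faster hosts, can be sketched as follows. The `LatencyTracker` class, its window size and the plain average are assumptions for illustration; the real snitch's scoring is more involved.

```python
from collections import defaultdict, deque


class LatencyTracker:
    """Hypothetical tracker: keeps recent read latencies per host."""

    def __init__(self, window=100):
        # Each host keeps only its most recent `window` samples.
        self._samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, host, latency_ms):
        self._samples[host].append(latency_ms)

    def score(self, host):
        """Mean recent latency; hosts with no samples score 0.0."""
        samples = self._samples.get(host)
        if not samples:
            return 0.0
        return sum(samples) / len(samples)

    def sort_by_score(self, hosts):
        """Order candidate replicas from fastest to slowest."""
        return sorted(hosts, key=self.score)


tracker = LatencyTracker()
for host, latency in [("A", 50.0), ("B", 5.0), ("C", 20.0), ("A", 70.0)]:
    tracker.record(host, latency)

preferred = tracker.sort_by_score(["A", "B", "C"])
```

In this sketch an unmeasured host sorts first, so it gets tried and starts accumulating samples; that choice is part of the illustration, not a documented behaviour.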
  36. Consistency
      ● Replication and failures/partitions cause inconsistency
      ● Old versions of data can be returned
      Timestamps:
      ● Chosen by the client
      ● Can be used to avoid read-modify-write
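Client-chosen timestamps let replicas resolve conflicting versions without coordinating: the version with the highest timestamp wins. A minimal sketch, in which the `Cell` type and the tie-breaking rule are assumptions for illustration:

```python
from typing import NamedTuple


class Cell(NamedTuple):
    value: str
    timestamp: int  # chosen by the client, e.g. microseconds since the epoch


def reconcile(a, b):
    """Return the newer cell; on a tie this sketch simply keeps `a`."""
    return a if a.timestamp >= b.timestamp else b


old = Cell("v1", 1000)
new = Cell("v2", 2000)

# The result is the same whichever order the replicas see the writes in.
winner_forward = reconcile(old, new)
winner_reverse = reconcile(new, old)
```

Because the outcome does not depend on arrival order, a client can overwrite a value blindly instead of reading it, modifying it and writing it back.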
  37. Consistency
      ● Cassandra allows a trade-off between partition-tolerance and consistency
      For strong consistency:
      ● R + W > N
      ● Eg. with 5 replicas (RF = N = 5): write to 3, read from 3
      Replica versions shown: 1, 1, 1, 1, 1
  38. Consistency (same text as slide 37), write step. Replica versions shown: 2, 1, 2, 2, 1
  39. Consistency (same text as slide 37), read step. Replica versions shown: 2, 1, 2, 2, 1
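The condition R + W > N guarantees that every read set shares at least one replica with every write set, so a read always reaches a replica holding the latest write. The brute-force check below confirms this for small clusters; the function name is made up for the example.

```python
from itertools import combinations


def always_overlaps(n, w, r):
    """True if every choice of w written replicas meets every choice of r read replicas."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


strong = always_overlaps(5, 3, 3)  # R + W = 6 > 5
weak = always_overlaps(5, 2, 3)    # R + W = 5, not greater than N
```

With N = 5, writing to 3 and reading from 3 always overlaps, while writing to 2 and reading from 3 can miss: the write may land on two replicas and the read on the other three.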
  40. Consistency Level
      ● ANY (only for writes)
      ● ONE, TWO, THREE
      ● QUORUM (N/2 + 1)
      ● LOCAL_QUORUM
      ● ALL
      ● Relax strong consistency for partition tolerance
      ● To tolerate 1 node failure with strong consistency use RF=3 with CL=QUORUM
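The quorum arithmetic behind the last bullet is short enough to write out. The helper names below are hypothetical; the division is integer division, as in N/2 + 1 on the slide.

```python
def quorum(rf):
    """QUORUM = N/2 + 1, using integer division."""
    return rf // 2 + 1


def tolerated_failures(rf):
    """Replicas that can be down while QUORUM reads and writes still succeed."""
    return rf - quorum(rf)


def strongly_consistent(rf, read_cl, write_cl):
    """R + W > N for the given numeric consistency levels."""
    return read_cl + write_cl > rf


q3 = quorum(3)                      # 2
down_ok = tolerated_failures(3)     # 1
strong = strongly_consistent(3, quorum(3), quorum(3))  # 2 + 2 > 3
```

With RF=3, QUORUM is 2, so one replica can fail and both reads and writes at QUORUM still overlap. Note that RF=2 gives a quorum of 2 and therefore tolerates no failures.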
  41. Increasing Consistency
      ● Read repair
      ● Hinted hand-off
      ● Anti-entropy repair
  42. Read Repair
  43. Read Repair
  44. Read Repair
  45. Read Repair
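The read repair slides carry no text, so the following is only a sketch of the general idea: a read compares the versions returned by the replicas, answers with the newest, and writes that version back to any replica that was stale. Replicas are modelled as plain dicts of `key -> (value, timestamp)`; all names are assumptions.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and update stale replicas in place."""
    responses = [(replica, replica.get(key)) for replica in replicas]
    present = [cell for _, cell in responses if cell is not None]
    if not present:
        return None
    newest = max(present, key=lambda cell: cell[1])  # highest timestamp wins
    for replica, cell in responses:
        if cell != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[0]


replica_1 = {"k1": ("v2", 2)}
replica_2 = {"k1": ("v1", 1)}   # stale
replica_3 = {}                  # missed the write entirely

value = read_with_repair([replica_1, replica_2, replica_3], "k1")
```

After the call all three replicas hold `("v2", 2)`, so frequently read data converges without any separate maintenance step.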
  46. Hinted Hand-off (eg. RF=2): (k1, v1), (k1, v1)
  47. Hinted Hand-off (eg. RF=2): (k1, v1), (k1, v1); Write (k1, v2)
  48. Hinted Hand-off (eg. RF=2): (k1, v1), (k1, v1); Write (k1, v2)
  49. Hinted Hand-off (eg. RF=2): (k1, v1), (k1, v1); Write (k1, v2)
  50. Hinted Hand-off (eg. RF=2): (k1, v1), (k1, v1); Write (k1, v2); (k1, v2)
  51. Hinted Hand-off (eg. RF=2): (k1, v2), (k1, v1); (k1, v2)
  52. Hinted Hand-off (eg. RF=2): (k1, v2), (k1, v2); (k1, v2)
  53. Hinted Hand-off (eg. RF=2): (k1, v2), (k1, v2); (k1, v2)
  54. Hinted Hand-off
      ● Hinted writes do not count towards the chosen consistency level
      ● … except with CL=ANY which succeeds even if all replicas are down
      ● Don't rely on hints: hints cannot be read!
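The hinted hand-off sequence in the slides above (RF=2, one replica down during a write) can be replayed with a small model. The `Replica` and `Coordinator` classes are hypothetical, and timestamps are left out to keep the sketch short.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []  # (target replica, key, value) held for down replicas

    def write(self, key, value, cl):
        """cl is the number of acks required, or the string "ANY"."""
        acks = hinted = 0
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
                acks += 1
            else:
                self.hints.append((replica, key, value))
                hinted += 1
        if cl == "ANY":
            return acks + hinted >= 1  # a stored hint is enough
        return acks >= cl              # hints do not count

    def replay_hints(self):
        """Deliver hints to replicas that have come back up."""
        pending = []
        for replica, key, value in self.hints:
            if replica.up:
                replica.data[key] = value
            else:
                pending.append((replica, key, value))
        self.hints = pending


first, second = Replica("first"), Replica("second")
coordinator = Coordinator([first, second])
coordinator.write("k1", "v1", 2)

second.up = False
accepted_at_one = coordinator.write("k1", "v2", 1)
stale_before_replay = second.data["k1"]  # still "v1"; the hint is not readable

second.up = True
coordinator.replay_hints()
```

Until the hint is replayed, a read served by `second` would still see `v1`, which is why hints should not be relied on for consistency.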
  55. Anti-entropy repair
      ● Manual maintenance process
      ● Compares all data stored on a host with the replicas
      ● Differences are streamed to restore consistency
      ● Must be run every 10 days to ensure tombstones are replicated
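The compare-then-stream step can be sketched by hashing each replica's data in buckets and exchanging only the buckets whose digests differ. Cassandra builds Merkle trees for this comparison; the flat bucket list below is a simplified stand-in, and the bucket count, data layout of `key -> (value, timestamp)` and function names are all assumptions.

```python
import hashlib

BUCKETS = 4  # assumed number of comparison buckets


def bucket_of(key):
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % BUCKETS


def bucket_digests(data):
    """One digest per bucket, computed over that bucket's sorted contents."""
    groups = [[] for _ in range(BUCKETS)]
    for key in sorted(data):
        groups[bucket_of(key)].append(f"{key}={data[key]!r}")
    return [hashlib.md5("|".join(g).encode("utf-8")).hexdigest() for g in groups]


def repair(local, remote):
    """Sync mismatched buckets, keeping the cell with the highest timestamp."""
    pairs = zip(bucket_digests(local), bucket_digests(remote))
    mismatched = {i for i, (a, b) in enumerate(pairs) if a != b}
    for key in set(local) | set(remote):
        if bucket_of(key) in mismatched:
            cells = [d[key] for d in (local, remote) if key in d]
            newest = max(cells, key=lambda cell: cell[1])
            local[key] = remote[key] = newest
    return mismatched


local = {"k1": ("v2", 2), "k2": ("x", 1)}
remote = {"k1": ("v1", 1), "k2": ("x", 1), "k3": ("y", 5)}
repaired_buckets = repair(local, remote)
```

Only the buckets in `repaired_buckets` are exchanged; buckets whose digests already match are skipped, which keeps the amount of streamed data proportional to the differences.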
  56. fin.
