Cassandra EU 2012 - Highly Available: The Cassandra Distribution Model by Sam Overton

Sam Overton's talk from Cassandra Europe on March 28th 2012


Presentation Transcript

  • Highly Available: The Cassandra Distribution Model. Sam Overton, Cassandra Europe 2012
  • Cassandra is:
    ● built for scalability
    ● built to tolerate failure
    In this talk:
    ● Cassandra distribution overview
    ● Partitioning and placement
    ● Replication
    ● Consistency
  • Overview
    ● High availability
    ● Partition tolerant
    ● Tunable consistency
    ● Scalable
    ● Replication
    ● No single point of failure
  • Partitioning and placement should:
    ● Assign data to hosts
    ● Have no single point of failure (S.P.O.F.) for routing clients to data
    ● Balance load
    ● Allow scaling without moving too much data
  • Consistent Hashing (diagram: the pairs (k1, v1), (k2, v2) and (k3, v3) placed on the ring)
  • Consistent Hashing
    ● The partitioner maps each key to a ring token
    ● Hosts' tokens determine the placement of keys
    ● and the proportion of data assigned to each host
    ● Each row is stored on one host
    ● Wide rows can cause hot-spotting!
    So how does it scale?
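The mapping from key to token to host can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RING_SIZE`, `token_for`, `owner` and the host names are hypothetical, and the rule used here (a key belongs to the first host whose token is at or after the key's token, wrapping around the ring) is a simplification, not Cassandra's actual code.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space, chosen for illustration only


def token_for(key):
    """Map a row key to a ring token (here: MD5 reduced to the ring size)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(ring, key):
    """Return the host whose token is the first one at or after the key's token."""
    tokens = sorted(ring)
    idx = bisect.bisect_left(tokens, token_for(key))
    return ring[tokens[idx % len(tokens)]]  # wrap around past the last token


# Four hypothetical hosts with evenly spaced tokens.
ring = {i * RING_SIZE // 4: "host%d" % i for i in range(4)}
for key in ("k1", "k2", "k3"):
    print(key, "->", owner(ring, key))
```

Because the tokens are evenly spaced, each host is responsible for roughly a quarter of the token space; uneven tokens would give uneven shares, which is the "proportion of data" point on the slide.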
  • Consistent Hashing: bootstrapping a new node
  • Consistent Hashing: the range is transferred from the old host to the new host
  • Consistent Hashing: decommission is the reverse process
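To make the bootstrap and decommission slides concrete, here is a toy ring with integer tokens 1..100. The hosts A to E, their tokens and the `owner` helper are all assumptions for illustration; the point is only that adding one node moves a single range from one existing host.

```python
import bisect


def owner(tokens_to_host, token):
    """First host whose token is at or after the given token, wrapping around."""
    tokens = sorted(tokens_to_host)
    idx = bisect.bisect_left(tokens, token)
    return tokens_to_host[tokens[idx % len(tokens)]]


before = {25: "A", 50: "B", 75: "C", 100: "D"}  # toy ring over tokens 1..100
after = dict(before)
after[60] = "E"  # hypothetical new node bootstraps with token 60

# Tokens whose owner changes when E joins.
moved = [t for t in range(1, 101) if owner(before, t) != owner(after, t)]
print("tokens moved:", moved[0], "to", moved[-1])
print("old owner:", owner(before, 55), "new owner:", owner(after, 55))
```

Only the tokens between B's token and E's token change hands, and all of them come from C. Removing E from `after` restores `before`, which mirrors decommission being the reverse process.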
  • Consistent Hashing
    ● Tokens can be assigned manually, automatically or randomly
    ● Every node has full knowledge of placement
    ● Client connects to any node, max 1 hop to data
    ● Node status is gossiped
  • Partitioners
    ● Convert a row key (from client data) into a token on the ring
    ● RandomPartitioner
    ● Order Preserving Partitioner
  • Partitioners: Random Partitioner
    ● token = hash(key)
    ● good load balancing
    ● no range queries across row keys
  • Partitioners: Order Preserving Partitioner
    ● token = key
    ● requires manual load balancing
    ● careful selection of tokens around the ring
    ● allows range queries across row keys
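The difference between the two partitioners can be shown with a small sketch. The function names and sample keys are hypothetical, and MD5 is used here only as a stand-in for "hash(key)"; the details of the real partitioners differ.

```python
import hashlib


def random_partitioner_token(key):
    """token = hash(key): ring position is unrelated to key order."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def order_preserving_token(key):
    """token = key: ring position follows key order exactly."""
    return key


keys = [b"user:003", b"user:001", b"user:002"]

# Order in which the keys appear when walking the ring under each partitioner.
rp_order = sorted(keys, key=random_partitioner_token)
opp_order = sorted(keys, key=order_preserving_token)

print("random partitioner ring order:", rp_order)
print("order preserving ring order:  ", opp_order)
```

With `token = key`, walking the ring visits keys in sorted order, so a range of row keys is a contiguous range of tokens. With hashed tokens, adjacent keys are scattered around the ring, which balances load but rules out range queries across row keys.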
  • Partitioners
    ● Get it right first time!
    ● Design data model for RP
    ● Custom partitioners are possible if necessary
  • Replication
    ● For availability
    ● For redundancy
    ● Can increase read bandwidth
  • Replication
    ● Replication Factor (RF) is the number of copies of data
    ● Defined per keyspace
    ● Can be changed (e.g. if data becomes more/less valuable)
    ● Determines how many failures can be tolerated
  • Replication Strategy
    ● Determines how replicas are assigned for each host
    ● Defined per keyspace (like RF)
    ● SimpleStrategy
    ● NetworkTopologyStrategy
    ● Custom strategies can be written
  • Replication Strategy: SimpleStrategy (diagram: (k1, v1) and (k2, v2) replicated around the ring, e.g. RF=3)
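A rough sketch of SimpleStrategy-style placement: the first replica goes to the host that owns the token, and the remaining replicas go to the next distinct hosts walking clockwise around the ring. The toy ring, host names and the function `simple_strategy_replicas` are assumptions for illustration.

```python
import bisect


def simple_strategy_replicas(ring, token, rf):
    """Pick rf distinct hosts, starting at the token's owner and walking clockwise."""
    tokens = sorted(ring)
    start = bisect.bisect_left(tokens, token) % len(tokens)
    replicas = []
    for i in range(len(tokens)):
        host = ring[tokens[(start + i) % len(tokens)]]
        if host not in replicas:
            replicas.append(host)
        if len(replicas) == rf:
            break
    return replicas


ring = {0: "A", 25: "B", 50: "C", 75: "D"}  # toy ring, integer tokens
print(simple_strategy_replicas(ring, 30, rf=3))  # ['C', 'D', 'A']
print(simple_strategy_replicas(ring, 80, rf=3))  # ['A', 'B', 'C']
```

This placement ignores racks and datacentres entirely, which is the gap NetworkTopologyStrategy and the snitches address.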
  • Replication Strategy: NetworkTopologyStrategy. Multi-datacentre support (diagram: DC1, DC2)
  • Snitches
    ● Enable routing of requests according to node proximity
    ● Used by the replication strategy to determine rack and DC membership
    ● Custom snitches can be written
  • Simple Snitch
    ● Every host is in the same rack & DC with equal proximity
  • RackInferringSnitch
    ● Infers the rack & DC from the IP address of the host
    ● 123.8.2.100: 8 = DC, 2 = rack, 100 = host
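The rack-inferring rule from the slide's example address can be sketched as a small parser. The function name `infer_location` and the returned dictionary shape are hypothetical; only the octet-to-role mapping comes from the slide.

```python
def infer_location(ip):
    """Split an IPv4 address into the DC, rack and host parts used on the slide."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address, got %r" % ip)
    # Second octet = datacentre, third = rack, fourth = host.
    return {"dc": octets[1], "rack": octets[2], "host": octets[3]}


print(infer_location("123.8.2.100"))  # {'dc': '8', 'rack': '2', 'host': '100'}
```

This only works when the network's addressing plan actually follows that layout, which is why the other snitches exist.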
  • EC2Snitch
    ● DC = EC2 region
    ● Rack = EC2 availability zone
  • Property file snitch
    ● Rack and DC membership read from configuration file
  • DynamicSnitch
    ● Wraps each of the other snitches
    ● Records latency stats from read operations
    ● Avoids routing to slow hosts
    ● Configurable update intervals
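The idea of recording read latencies and preferring faster hosts can be sketched as follows. `LatencyRanker`, its sliding window and the use of a plain mean are all assumptions made for illustration; the real DynamicSnitch scoring is more involved.

```python
from collections import defaultdict, deque
from statistics import mean


class LatencyRanker:
    """Keeps recent read latencies per host and orders hosts fastest first."""

    def __init__(self, window=100):
        # Only the most recent `window` samples per host are kept.
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, host, latency_ms):
        self.samples[host].append(latency_ms)

    def sorted_by_latency(self, hosts):
        def score(host):
            seen = self.samples.get(host)
            return mean(seen) if seen else 0.0  # unknown hosts are tried first

        return sorted(hosts, key=score)


ranker = LatencyRanker()
for latency in (50, 70):
    ranker.record("host-a", latency)
ranker.record("host-b", 5)
print(ranker.sorted_by_latency(["host-a", "host-b"]))  # ['host-b', 'host-a']
```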
  • Consistency
    ● Replication and failures/partitions cause inconsistency
    ● Old versions of data can be returned
    Timestamps:
    ● Chosen by the client
    ● Can be used to avoid read-modify-write
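A minimal sketch of how client-chosen timestamps resolve divergent replicas: each stored value carries its timestamp, and the version with the highest timestamp wins. The `(timestamp, value)` tuple layout and the `reconcile` name are assumptions for illustration.

```python
def reconcile(versions):
    """Return the (timestamp, value) pair with the highest timestamp."""
    return max(versions, key=lambda version: version[0])


# Three replicas of the same column; one has seen a newer write.
replica_versions = [(100, "v1"), (105, "v2"), (100, "v1")]
print(reconcile(replica_versions))  # (105, 'v2')
```

Because the winner is decided by timestamp alone, a client can overwrite a value without first reading the old one, which is the sense in which timestamps help avoid read-modify-write.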
  • Consistency
    ● Cassandra allows a trade-off between partition-tolerance and consistency
    For strong consistency:
    ● R + W > N
    ● E.g. with 5 replicas (RF = N = 5): write to 3, read from 3
    (diagram in three steps: five replicas at version 1; after the write, three replicas at version 2; the read)
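The R + W > N rule says that every possible read set must share at least one replica with every possible write set. A brute-force check over small cluster sizes illustrates this; `always_overlap` is a hypothetical helper written for this sketch.

```python
from itertools import combinations


def always_overlap(n, w, r):
    """True if every set of w written replicas intersects every set of r read replicas."""
    replicas = range(n)
    return all(
        set(written) & set(read)
        for written in combinations(replicas, w)
        for read in combinations(replicas, r)
    )


print(always_overlap(5, 3, 3))  # True:  3 + 3 > 5
print(always_overlap(5, 2, 3))  # False: 2 + 3 = 5, a read can miss the write
```

With N = 5, writing to 3 and reading from 3 guarantees that at least one replica in the read set holds the latest write.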
  • Consistency Level
    ● ANY (only for writes)
    ● ONE, TWO, THREE
    ● QUORUM (N/2 + 1)
    ● LOCAL QUORUM
    ● ALL
    ● Relax strong consistency for partition tolerance
    ● To tolerate 1 node failure with strong consistency use RF=3 with CL=QUORUM
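The quorum arithmetic behind the last bullet can be worked through directly, reading N/2 as integer division. The helper names are hypothetical.

```python
def quorum(rf):
    """QUORUM = N/2 + 1, using integer division."""
    return rf // 2 + 1


def tolerated_failures(rf):
    """Replicas that can be down while QUORUM reads and writes still succeed."""
    return rf - quorum(rf)


for rf in (1, 2, 3, 5):
    print("RF=%d quorum=%d tolerated failures=%d" % (rf, quorum(rf), tolerated_failures(rf)))
```

With RF=3 the quorum is 2, so one replica can be down, and quorum reads plus quorum writes still satisfy R + W > N (2 + 2 > 3). With RF=2 the quorum is also 2, so no failure can be tolerated at QUORUM.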
  • Increasing Consistency
    ● Read repair
    ● Hinted hand-off
    ● Anti-entropy repair
  • Read Repair (diagram sequence)
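The Read Repair slides are diagrams only, so the following is a generic sketch of the technique rather than a transcription of the talk: on a read, the coordinator compares the replicas' versions, returns the newest, and writes it back to any replica that is stale. The in-memory dictionaries, the `(timestamp, value)` layout and `read_with_repair` are assumptions.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and push it to replicas that lack it."""
    responses = {name: store.get(key) for name, store in replicas.items()}
    present = [version for version in responses.values() if version is not None]
    if not present:
        return None
    newest = max(present, key=lambda version: version[0])  # highest timestamp wins
    for name, version in responses.items():
        if version != newest:
            replicas[name][key] = newest  # repair the stale or missing copy
    return newest[1]


replicas = {
    "A": {"k1": (2, "v2")},
    "B": {"k1": (1, "v1")},  # stale
    "C": {},                 # missed the write entirely
}
print(read_with_repair(replicas, "k1"))  # v2
print(replicas["B"], replicas["C"])
```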
  • Hinted Hand-off (diagram sequence, e.g. RF=2: both replicas hold (k1, v1); a write of (k1, v2) arrives; an extra copy of (k1, v2) appears; finally both replicas hold (k1, v2))
  • Hinted Hand-off
    ● Hinted writes do not count towards the chosen consistency level
    ● … except with CL=ANY, which succeeds even if all replicas are down
    ● Don't rely on hints: hints cannot be read!
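The three rules above can be illustrated with a toy in-memory model. Everything here is hypothetical (the `Cluster` class, `required_acks`, `allow_any`); it is a sketch of the behaviour described on the slide, not of Cassandra's implementation.

```python
class Cluster:
    """Toy replicas plus a coordinator-side hint store."""

    def __init__(self, replica_names):
        self.data = {name: {} for name in replica_names}
        self.up = {name: True for name in replica_names}
        self.hints = []  # (target replica, key, value); never served to reads

    def write(self, key, value, required_acks, allow_any=False):
        acks = hinted = 0
        for name in self.data:
            if self.up[name]:
                self.data[name][key] = value
                acks += 1
            else:
                self.hints.append((name, key, value))
                hinted += 1
        if allow_any:
            return acks + hinted >= 1  # CL=ANY: a stored hint is enough
        return acks >= required_acks   # hints do not count

    def replay_hints(self):
        pending, self.hints = self.hints, []
        for name, key, value in pending:
            if self.up[name]:
                self.data[name][key] = value
            else:
                self.hints.append((name, key, value))


cluster = Cluster(["A", "B"])  # RF=2
cluster.write("k1", "v1", required_acks=2)
cluster.up["B"] = False
ok_one = cluster.write("k1", "v2", required_acks=1)  # succeeds on A alone
ok_two = cluster.write("k1", "v2", required_acks=2)  # fails: the hint for B does not count
stale = cluster.data["B"]["k1"]                      # B still holds v1
cluster.up["B"] = True
cluster.replay_hints()
print(ok_one, ok_two, stale, cluster.data["B"]["k1"])
```

Until the hints are replayed, B's copy stays at the old value and nothing can read the hinted write from B, which is why the slide warns against relying on hints.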
  • Anti-entropy repair
    ● Manual maintenance process
    ● Compares all data stored on a host with the replicas
    ● Differences are streamed to restore consistency
    ● Must be run every 10 days to ensure tombstones are replicated
  • fin.