Cassandra
diegopacheco
@diego_pacheco
Diego Pacheco
http://cassandra.apache.org/
Why Apache Cassandra?
❏ Open Source
❏ Written in Java
❏ Scalability & High Availability
❏ Fault Tolerance
❏ Replication across multiple datacenters
❏ Async Masterless Replication
❏ No Single Point of Failure
❏ Based on Amazon Dynamo paper
❏ Created at Facebook, open sourced in 2008, Apache Incubator project since 2009
Battle Tested by
http://planetcassandra.org/apache-cassandra-use-cases/
Benchmark: 1 million writes per second
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
CAP: Consistency VS Availability
Cluster
Murmur3Partitioner
❏ Murmur3Partitioner
  ❏ Default
  ❏ 3-5x faster than RandomPartitioner
  ❏ Tokens are MurmurHash values of the partition key
  ❏ Uniform distribution
❏ RandomPartitioner
  ❏ Uniform distribution
  ❏ Tokens are MD5 hash values of the partition key
❏ ByteOrderedPartitioner
  ❏ Lexically ordered by key bytes
  ❏ Ordered partitions
  ❏ Not recommended:
    ❏ Difficult load balancing, hot spots, uneven load balancing across multiple tables.
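
How a hashing partitioner maps keys to nodes can be sketched in a few lines of Python. This is only an illustration, not Cassandra's code: it hashes with MD5 (as RandomPartitioner does) because MurmurHash is not in the Python standard library, and the node names, the evenly spaced tokens, and the `TokenRing` class are assumptions made up for the example.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # MD5 produces 128-bit values


def token_for(key):
    """Hash a partition key to a token, RandomPartitioner-style (MD5)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class TokenRing:
    """Hypothetical ring: each node owns the token range ending at its token."""

    def __init__(self, node_tokens):
        # Sort (token, node) pairs so the ring can be searched with bisect.
        self._ring = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def owner(self, key):
        # The owner is the first node whose token is >= the key's token,
        # wrapping around to the first node past the end of the ring.
        idx = bisect.bisect_left(self._tokens, token_for(key)) % len(self._ring)
        return self._ring[idx][1]


# Three made-up nodes with evenly spaced tokens.
ring = TokenRing({"node%d" % i: i * RING_SIZE // 3 for i in range(3)})

# Count how many of 3000 sample keys land on each node.
counts = {}
for i in range(3000):
    node = ring.owner("user:%d" % i)
    counts[node] = counts.get(node, 0) + 1
print(counts)
```

Because the hash spreads keys uniformly over the token range, each node should end up with roughly a third of the keys. A byte-ordered scheme would instead place adjacent keys on the same node, which is where the hot spots come from.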
Replication
❏ Concepts:
  ❏ Virtual Nodes: assign data ownership to physical machines
  ❏ Partitioner: distributes data across the cluster
  ❏ Replication Strategy: determines the replicas for each row of data
  ❏ Snitch: describes the topology, so the replication strategy knows where to place replicas
❏ Client writes to any node
❏ That node coordinates the write with the replicas
❏ Replication happens in parallel
❏ Replication Factor = number of nodes holding the same data, e.g. 3
❏ SimpleStrategy vs. NetworkTopologyStrategy
❏ Design: nodes, racks, and data centers are a great fit for cloud computing!
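
The interplay of virtual nodes, partitioner, and replication factor can be sketched roughly as follows. This is a simplified, SimpleStrategy-like placement (walk the ring clockwise and collect distinct nodes), not Cassandra's implementation: the node names, the four virtual nodes per machine, the MD5-derived tokens, and the `replicas_for` helper are all assumptions for illustration, and rack or data center awareness is ignored.

```python
import bisect
import hashlib


def token_for(text):
    """Derive a token from a string (MD5, as in RandomPartitioner)."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


def build_ring(nodes, vnodes_per_node):
    """Give every node several tokens (virtual nodes) and sort them into a ring."""
    return sorted(
        (token_for("%s-vnode%d" % (node, v)), node)
        for node in nodes
        for v in range(vnodes_per_node)
    )


def replicas_for(key, ring, replication_factor):
    """Walk clockwise from the key's token, collecting distinct physical nodes."""
    tokens = [tok for tok, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key))
    replicas = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:  # skip further vnodes of an already chosen node
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical machines
ring = build_ring(NODES, vnodes_per_node=4)

print(replicas_for("user:42", ring, replication_factor=3))
```

The first node in the returned list is the owner of the key's token range; the remaining entries are the next distinct machines on the ring. NetworkTopologyStrategy additionally consults the snitch so that replicas land on different racks and in the configured data centers.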
Replication
Consistency
❏ Tunable consistency for reads and writes
❏ Consistency vs. availability trade-offs:
  ❏ ONE, TWO, THREE
  ❏ QUORUM (majority = floor(N / 2) + 1, where N is the replication factor)
  ❏ LOCAL_QUORUM (majority in the local DC)
  ❏ EACH_QUORUM (majority in every DC)
  ❏ LOCAL_ONE
  ❏ ALL
  ❏ ANY (writes only)
❏ Disaster Recovery scenarios:
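❏ SERIAL (used by lightweight transactions, across all DCs)
❏ LOCAL_SERIAL (used by lightweight transactions, local DC only)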
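
The quorum arithmetic is easy to check by hand or in code. The sketch below assumes the usual rule of thumb that a read is guaranteed to see the latest acknowledged write when the read and write replica counts overlap (R + W > N); the function names are made up for this example.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


for rf in (1, 2, 3, 5):
    q = quorum(rf)
    print("RF=%d quorum=%d tolerates %d replica(s) down" % (rf, q, rf - q))

rf = 3
print(reads_see_latest_write(quorum(rf), quorum(rf), rf))  # QUORUM + QUORUM -> True
print(reads_see_latest_write(1, 1, rf))                    # ONE + ONE -> False
print(reads_see_latest_write(1, rf, rf))                   # ONE read + ALL write -> True
```

With a replication factor of 3, QUORUM needs 2 replicas, so one replica can be down without failing reads or writes. ONE on both sides is the most available combination but gives up the overlap guarantee.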
Reads and Index
❏ Partition key cache
  ❏ Off heap
  ❏ Configurable
❏ Row cache
  ❏ Off heap
  ❏ Configurable
❏ Secondary index
  ❏ Filters data in a table by a non-primary-key column
  ❏ ALLOW FILTERING can be problematic: a query may scan far more data than it returns
❏ Cassandra 3.4: SASI secondary index
  ❏ Better performance
  ❏ Memory-mapped B+ tree
  ❏ Can't be used with collections
Storage
❏ Log-Structured Merge Tree (not a B-tree)
❏ Avoids read-before-write
❏ Favors low latency
❏ Cassandra groups inserts/updates in memory and periodically syncs them to disk (sequential append).
❏ Immutable data
❏ Need a check before write? Use Lightweight Transactions.
❏ Writes: Commit Log -> Memtable -> Flush -> SSTable on disk
❏ All writes are versioned
❏ No in-place deletes: tombstones
❏ Reads: Bloom filter
  ❏ Off-heap structure kept per SSTable
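
The write path above can be sketched as a toy in-memory model. This is a deliberately simplified illustration under stated assumptions, not how Cassandra is implemented: the `TinyLSM` class, the counter used as a write timestamp, and the two-entry memtable limit are invented for the example, and compaction, Bloom filters, and real disk I/O are left out.

```python
import itertools

TOMBSTONE = object()  # marker written in place of a deleted value


class TinyLSM:
    """Toy model: commit log -> memtable -> flush -> immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every mutation
        self.memtable = {}     # key -> (timestamp, value), held in memory
        self.sstables = []     # flushed tables, oldest first, never modified
        self.memtable_limit = memtable_limit
        self._clock = itertools.count(1)  # stand-in for write timestamps

    def _write(self, key, value):
        ts = next(self._clock)
        self.commit_log.append((ts, key, value))  # 1. log for durability
        self.memtable[key] = (ts, value)          # 2. update in memory, no read needed
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                          # 3. flush when the memtable is full

    def put(self, key, value):
        self._write(key, value)

    def delete(self, key):
        # A delete is just another write: it records a tombstone.
        self._write(key, TOMBSTONE)

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def get(self, key):
        # Collect every version of the key and keep the newest timestamp.
        versions = [t[key] for t in self.sstables + [self.memtable] if key in t]
        if not versions:
            return None
        ts, value = max(versions, key=lambda v: v[0])
        return None if value is TOMBSTONE else value


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)    # memtable is full, so it is flushed to the first SSTable
db.put("a", 10)   # newer version of "a"; the old one stays in the first SSTable
db.delete("b")    # tombstone for "b"; triggers the second flush

print(db.get("a"))       # 10
print(db.get("b"))       # None
print(len(db.sstables))  # 2
```

Writes never look up the previous value, which is why the approach avoids read-before-write. The cost moves to the read side, where several versions of a key may have to be reconciled by timestamp.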
Java Driver
❏ Specific to Cassandra
❏ Prepared Statements
❏ Connection Pooling
❏ Reconnection Policies
❏ Load Balancing Policies
❏ Retry Policies
❏ Asynchronous I/O with Netty
❏ Native Protocol