©2013 DataStax Confidential. Do not distribute without consent.
@PatrickMcFadin
Patrick McFadin
Chief Evangelist/Solution Architect - DataStax
Cassandra 2.0: Better, Stronger, Faster
Thursday, October 3, 13
Five Years of Cassandra
[Timeline figure: axis ticks Jul-08, Jul-09, May-10, Feb-11, Dec-11, Oct-12, Jul-13; releases marked 0.1, 0.3, 0.6, 0.7, 1.0, 1.2, ... 2.0, plus DSE]
Cassandra 2.0 - Big new features
Lightweight transactions: the problem
Session 1
SELECT * FROM users
WHERE username = 'jbellis';
[empty resultset]
INSERT INTO users
(username, password)
VALUES ('jbellis', 'xdg44hh');
Session 2
SELECT * FROM users
WHERE username = 'jbellis';
[empty resultset]
INSERT INTO users
(username, password)
VALUES ('jbellis', '8dhh43k');
It’s a Race!
Who wins?
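The interleaving on this slide can be reproduced without a cluster. What follows is a minimal sketch, not Cassandra code: a plain in-memory map stands in for the users table, and the class and method names are hypothetical. Both sessions run their SELECT before either runs its INSERT, so both believe the name is free. In Cassandra the surviving row is decided by write timestamp; in this sketch it is simply the later put.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a HashMap stands in for the users table (assumption).
class RaceSketch {
    /** Replays the slide's interleaving and returns the password that ends up stored. */
    static String checkThenInsert() {
        Map<String, String> users = new HashMap<>();

        // Both sessions run their SELECT first and both see an empty result set.
        boolean session1SeesEmpty = !users.containsKey("jbellis");
        boolean session2SeesEmpty = !users.containsKey("jbellis");

        // Each session then inserts, trusting its own earlier read.
        if (session1SeesEmpty) users.put("jbellis", "xdg44hh");
        if (session2SeesEmpty) users.put("jbellis", "8dhh43k");

        // Session 1's row has been replaced without either session noticing.
        return users.get("jbellis");
    }

    public static void main(String[] args) {
        System.out.println("stored password: " + checkThenInsert());
    }
}
```

Neither session receives an error, which is the core of the problem: the check and the write are two separate steps with nothing tying them together.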
Why Locking Doesn’t Work
[Diagram: Client (locks) → request → Coordinator → internal request → Replica (X); the Coordinator stores a hint and returns a timeout response to the Client]
• Client locks
• Write times out
• Lock released
• Hint is replayed!!
Paxos
• Consensus algorithm
• All operations are quorum-based
• Each replica sends information about unfinished operations to the leader during prepare
• See "Paxos Made Simple" (Leslie Lamport)
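The prepare step described above is easier to follow in code. The following is a minimal single-decree Paxos sketch, assuming in-process acceptors and no network, durability, or commit phase; it is not Cassandra's implementation, and every class and method name here is hypothetical. It shows a new leader learning about an unfinished proposal during prepare and completing that proposal instead of its own.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative single-decree Paxos; not Cassandra's implementation.
class PaxosSketch {
    /** Reply to prepare: whether the ballot was promised, plus any proposal already accepted. */
    static final class Promise {
        final boolean promised;
        final long acceptedBallot;   // -1 when nothing has been accepted yet
        final String acceptedValue;  // null when nothing has been accepted yet

        Promise(boolean promised, long acceptedBallot, String acceptedValue) {
            this.promised = promised;
            this.acceptedBallot = acceptedBallot;
            this.acceptedValue = acceptedValue;
        }
    }

    /** One replica's Paxos state. */
    static final class Acceptor {
        private long promisedBallot = -1;
        private long acceptedBallot = -1;
        private String acceptedValue = null;

        Promise prepare(long ballot) {
            if (ballot <= promisedBallot) {
                return new Promise(false, -1, null); // stale ballot: reject
            }
            promisedBallot = ballot;
            // Report any unfinished (accepted but possibly uncommitted) proposal.
            return new Promise(true, acceptedBallot, acceptedValue);
        }

        boolean accept(long ballot, String value) {
            if (ballot < promisedBallot) {
                return false; // a newer ballot has been promised
            }
            promisedBallot = ballot;
            acceptedBallot = ballot;
            acceptedValue = value;
            return true;
        }
    }

    /** Runs one round as leader; returns the value a quorum accepted, or null. */
    static String propose(List<Acceptor> replicas, long ballot, String ownValue) {
        int quorum = replicas.size() / 2 + 1;
        int promises = 0;
        long highestSeen = -1;
        String value = ownValue;
        for (Acceptor replica : replicas) {
            Promise p = replica.prepare(ballot);
            if (!p.promised) continue;
            promises++;
            if (p.acceptedBallot > highestSeen) {
                highestSeen = p.acceptedBallot;
                value = p.acceptedValue; // the earlier proposal must be finished first
            }
        }
        if (promises < quorum) return null;

        int accepts = 0;
        for (Acceptor replica : replicas) {
            if (replica.accept(ballot, value)) accepts++;
        }
        return accepts >= quorum ? value : null;
    }

    /** An earlier leader reached only one of three replicas; a new leader takes over. */
    static String demo() {
        List<Acceptor> replicas = new ArrayList<>();
        for (int i = 0; i < 3; i++) replicas.add(new Acceptor());
        replicas.get(0).prepare(1);
        replicas.get(0).accept(1, "xdg44hh");   // unfinished operation
        return propose(replicas, 2, "8dhh43k"); // new leader, different value
    }

    public static void main(String[] args) {
        System.out.println("chosen: " + demo());
    }
}
```

In `demo()` the second leader proposes "8dhh43k" but the round settles on "xdg44hh", because one replica reported that value as already accepted under an earlier ballot.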
LWT: details
• 4 round trips vs 1 for normal updates
• Paxos state is durable
• Immediate consistency with no leader election or failover
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
LWT: Use with caution
• Great for 1% of your application
• Eventual consistency is your friend
• http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
Using LWT
• Don’t overwrite an existing record
INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', 'jbellis@datastax.com', ... )
IF NOT EXISTS;
• Only update record if condition is met
UPDATE USERS
SET email = 'jonathan@datastax.com', ...
WHERE username = 'jbellis'
IF email = 'jbellis@datastax.com';
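The two conditional statements on this slide behave like compare-and-set operations. As a rough analogy only, assuming a single-process `ConcurrentMap` in place of the users table (the class and method names are hypothetical), the sketch below returns a boolean in the same spirit as the `[applied]` column Cassandra returns for a conditional write.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative only: an in-memory map (username -> email) stands in for the users table.
class ConditionalWriteSketch {
    private final ConcurrentMap<String, String> emailByUser = new ConcurrentHashMap<>();

    /** Like INSERT ... IF NOT EXISTS: returns true only if no row existed. */
    boolean insertIfNotExists(String username, String email) {
        return emailByUser.putIfAbsent(username, email) == null;
    }

    /** Like UPDATE ... IF email = expected: returns true only if the condition held. */
    boolean updateEmailIf(String username, String expected, String replacement) {
        return emailByUser.replace(username, expected, replacement);
    }

    String emailOf(String username) {
        return emailByUser.get(username);
    }

    public static void main(String[] args) {
        ConditionalWriteSketch users = new ConditionalWriteSketch();
        System.out.println(users.insertIfNotExists("jbellis", "jbellis@datastax.com"));
        System.out.println(users.insertIfNotExists("jbellis", "other@example.com"));
        System.out.println(users.updateEmailIf("jbellis", "jbellis@datastax.com", "jonathan@datastax.com"));
        System.out.println(users.emailOf("jbellis"));
    }
}
```

The check and the write happen as one atomic step, so a second insert for the same username is refused rather than silently replacing the first.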
Triggers
CREATE TRIGGER <name> ON <table> USING <classname>;
DROP TRIGGER <name> ON [<keyspace>.]<table>;
• Executed on the coordinator before the mutation is applied
• Takes the original mutation and adds any new ones
• JARs deployed per server
Trigger implementation
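import java.nio.ByteBuffer;
import java.util.Collection;
import org.apache.cassandra.db.ColumnFamily;
import org.apache.cassandra.db.RowMutation;
import org.apache.cassandra.triggers.ITrigger;

public class MyTrigger implements ITrigger
{
    // Receives the partition key and the pending update; returns any extra mutations to apply.
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        ...
    }
}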
• You have to implement your own ITrigger (for now)
• Compile and deploy to each server
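The "original mutation plus any new ones" idea can be tried without Cassandra on the classpath. The sketch below is an assumption-laden stand-in: `Mutation`, `Trigger`, `AuditTrigger`, and `applyWithTrigger` are hypothetical simplified types, not the internal RowMutation and ColumnFamily classes a real ITrigger works with. It only illustrates the shape of the hook, where a trigger inspects a write and returns additional writes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Stand-in types only; a real trigger works on Cassandra's internal classes.
class TriggerSketch {
    /** Hypothetical, simplified mutation: one column written to one row of one table. */
    static final class Mutation {
        final String table, key, column, value;

        Mutation(String table, String key, String column, String value) {
            this.table = table;
            this.key = key;
            this.column = column;
            this.value = value;
        }
    }

    /** Hypothetical analogue of the trigger hook: given a write, return extra writes. */
    interface Trigger {
        Collection<Mutation> augment(Mutation update);
    }

    /** Example trigger: record every write to "users" in an "audit" table. */
    static final class AuditTrigger implements Trigger {
        public Collection<Mutation> augment(Mutation update) {
            if (!update.table.equals("users")) return Collections.emptyList();
            return Collections.singletonList(
                new Mutation("audit", update.key, "changed_column", update.column));
        }
    }

    /** Conceptually what happens on the coordinator: the original mutation plus the trigger's additions. */
    static List<Mutation> applyWithTrigger(Mutation original, Trigger trigger) {
        List<Mutation> all = new ArrayList<>();
        all.add(original);
        all.addAll(trigger.augment(original));
        return all;
    }

    public static void main(String[] args) {
        Mutation write = new Mutation("users", "jbellis", "email", "jbellis@datastax.com");
        for (Mutation m : applyWithTrigger(write, new AuditTrigger())) {
            System.out.println(m.table + " " + m.key + " " + m.column + " " + m.value);
        }
    }
}
```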
Experimental!
• Relies on internal RowMutation, ColumnFamily classes
• Not sandboxed. Be careful!
• Expect changes in 2.1
CQL Improvements
• ALTER DROP
  • Remove a field from a CQL table
• Conditional schema changes
  • Only execute if condition met
CREATE KEYSPACE IF NOT EXISTS ks
WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);
DROP KEYSPACE IF EXISTS ks;
ALTER TABLE users DROP address3;
CQL Improvements
• Aliases in SELECT
• Limit and TTL in prepared statements
SELECT event_id, dateOf(created_at) AS creation_date,
blobAsText(content) AS content
FROM timeline;
event_id | creation_date | content
-------------------------+--------------------------+----------------------
550e8400-e29b-41d4-a716 | 2013-07-26 10:44:33+0200 | Something happened!?
SELECT * FROM myTable LIMIT ?;
UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';
Cassandra 2.0 - Minor features
Query performance
• Hint when reading time series data
  • Time series slices find data faster
• Hybrid approach to Leveled Compaction under stress
  • Use size tiered until we catch up
  • Reduce read latency impact
• Off-heap memory speedup
  • Bytes moved on and off 10x faster
• Removal of row-level bloom filters
Server performance
• Single pass compaction
  • No more incremental compaction for large storage rows
• LMAX Disruptor on Thrift interface
  • Crazy fast and efficient concurrent threads. Faster HSHA
• Support for pluggable off-heap memory allocators
  • jemalloc support to start. Faster memory access.
• Bigger Level 0 file size
  • 5 MB was just too small. Now 160 MB
Removed features
• SuperColumns are gone!
  • Not the API, just the underlying implementation
• On-heap row cache
  • Row cache is no longer an option in the JVM
• Memory pressure relief valves - Gone from yaml
  • flush_largest_memtables_at
  • reduce_cache_sizes_at
  • reduce_cache_sizes_to
Operation Changes
• JDK 7 now required
• Vnodes are default
• Streaming overhaul
  • Control. Streams are grouped and broken into plans
  • Traceability. Each stream has an ID. Monitor each stream.
  • Performance. Streams are now pipelined. No waiting for ACK
Thank you!
Apache Cassandra 2.0 - Data model on fire
Next talk in my data model series!