Cassandra 2.0 better, faster, stronger

©2013 DataStax Confidential. Do not distribute without consent.
@PatrickMcFadin
Patrick McFadin
Chief Evangelist/Solution Architect - DataStax
Cassandra 2.0: Better, Stronger, Faster
Thursday, October 3, 2013

Five Years of Cassandra
[Figure: release timeline running from Jul-08 to Jul-13 (ticks at Jul-09, May-10, Feb-11, Dec-11, Oct-12, Jul-13), marking Cassandra 0.1, 0.3, 0.6, 0.7, 1.0, 1.2, ... 2.0, and DSE]

Cassandra 2.0 - Big new features

Lightweight transactions: the problem

Session 1
SELECT * FROM users
WHERE username = 'jbellis';
[empty resultset]
INSERT INTO users
(username, password)
VALUES ('jbellis', 'xdg44hh');

Session 2
SELECT * FROM users
WHERE username = 'jbellis';
[empty resultset]
INSERT INTO users
(username, password)
VALUES ('jbellis', '8dhh43k');

It's a race!
Who wins?
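
The race on this slide can be reproduced with a small in-memory model. This is only a sketch: `users`, `select_user`, and `insert_user` are hypothetical stand-ins for the table and the two statements, not a Cassandra or driver API. In this model the later write wins; in Cassandra the write with the higher timestamp wins, but the effect is the same: both sessions believe they created the account.

```python
# Hypothetical in-memory stand-in for the users table (not a Cassandra API).
users = {}

def select_user(username):
    """Return the stored row, or None for an empty result set."""
    return users.get(username)

def insert_user(username, password):
    """Unconditional write: whichever write lands last silently wins."""
    users[username] = {"password": password}

# Both sessions run their SELECT before either INSERT lands.
session1_sees = select_user("jbellis")
session2_sees = select_user("jbellis")

# Each session saw an empty result set, so each goes ahead and inserts.
if session1_sees is None:
    insert_user("jbellis", "xdg44hh")
if session2_sees is None:
    insert_user("jbellis", "8dhh43k")

# Neither session got an error, yet only one password survived.
winner = users["jbellis"]["password"]
```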

Why Locking Doesn't Work
[Diagram: Client (locks) sends a request to the Coordinator, which sends an internal request to the Replica. The internal request fails (X), the Coordinator keeps a hint, and the Client gets a timeout response.]
• Client locks
• Write times out
• Lock released
• Hint is replayed!!
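
The sequence in the bullets can be sketched with a toy coordinator that stores a hint when the replica is unreachable. Everything here is an assumption made for illustration: `Replica`, `Coordinator`, and the `threading.Lock` standing in for an external client-side lock are hypothetical and do not mirror Cassandra internals. The point is that the lock is released on timeout while the write is still pending.

```python
import threading

class Replica:
    """Hypothetical replica holding one value per key."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Coordinator:
    """Hypothetical coordinator that keeps a hint when the replica is down."""
    def __init__(self, replica):
        self.replica = replica
        self.replica_up = True
        self.hints = []

    def write(self, key, value):
        if not self.replica_up:
            self.hints.append((key, value))  # keep the write for later delivery
            raise TimeoutError("write timed out")
        self.replica.apply(key, value)

    def replay_hints(self):
        for key, value in self.hints:
            self.replica.apply(key, value)
        self.hints.clear()

replica = Replica()
coordinator = Coordinator(replica)
lock = threading.Lock()  # stands in for an external client-side lock

# Client A takes the lock, but its write times out.
coordinator.replica_up = False
with lock:
    try:
        coordinator.write("jbellis", "xdg44hh")
        a_result = "ok"
    except TimeoutError:
        a_result = "timeout"
# The lock is released here, although A's write still exists as a hint.

# Client B takes the lock, reads, and concludes the username is free.
coordinator.replica_up = True
with lock:
    b_sees = replica.data.get("jbellis")

# Later the hint is replayed: A's "failed" write appears after all.
coordinator.replay_hints()
final_value = replica.data["jbellis"]
```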

Paxos
• Consensus algorithm
• All operations are quorum-based
• Each replica sends information about unfinished operations to the leader during prepare
• See "Paxos Made Simple" (Leslie Lamport)
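
A minimal single-decision Paxos sketch may help make the bullets concrete. It is a textbook-style illustration, not Cassandra's implementation: the `Acceptor` class, the `propose` function, and the integer ballots are all assumptions. It shows the two points from the slide: every step needs a quorum, and a replica reports any unfinished proposal during prepare so the new leader can finish it.

```python
class Acceptor:
    """One replica's Paxos state for a single decision (hypothetical model)."""
    def __init__(self):
        self.promised = 0      # highest ballot this replica has promised
        self.accepted = None   # (ballot, value) accepted but possibly unfinished

    def prepare(self, ballot):
        """Promise to ignore lower ballots; report any unfinished proposal."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        """Accept the proposal unless a higher ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def propose(acceptors, ballot, value):
    """Run prepare and accept; return the chosen value, or None on failure."""
    quorum = len(acceptors) // 2 + 1

    replies = [a.prepare(ballot) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < quorum:
        return None

    # If a replica reported an unfinished proposal, finish the newest one
    # instead of pushing our own value.
    in_progress = [p for p in promises if p is not None]
    if in_progress:
        value = max(in_progress, key=lambda bv: bv[0])[1]

    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None


replicas = [Acceptor() for _ in range(3)]

# A first leader reaches only one replica with ballot 1, then stalls.
replicas[0].prepare(1)
replicas[0].accept(1, "xdg44hh")

# A second leader with ballot 2 must complete the unfinished proposal.
chosen = propose(replicas, 2, "8dhh43k")

# A stale leader reusing ballot 1 is rejected by every replica.
stale = propose(replicas, 1, "other")
```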

LWT: details
• 4 round trips vs 1 for normal updates
• Paxos state is durable
• Immediate consistency with no leader election or failover
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0

LWT: Use with caution
• Great for 1% of your application
• Eventual consistency is your friend
• http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis

Using LWT
• Don't overwrite an existing record
INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', 'jbellis@datastax.com', ... )
IF NOT EXISTS;
• Only update record if condition is met
UPDATE USERS
SET email = 'jonathan@datastax.com', ...
WHERE username = 'jbellis'
IF email = 'jbellis@datastax.com';
Triggers
CREATE TRIGGER <name> ON <table> USING <classname>;
DROP TRIGGER <name> ON [<keyspace>.]<table>;
• Executed on the coordinator before mutation
• Takes the original mutation and adds any new mutations
• Jars deployed per server

Trigger implementation
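import java.nio.ByteBuffer;
import java.util.Collection;
import org.apache.cassandra.db.ColumnFamily;
import org.apache.cassandra.db.RowMutation;
import org.apache.cassandra.triggers.ITrigger;

public class MyTrigger implements ITrigger
{
    // Called on the coordinator; returns extra mutations to apply with the update.
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        ...
    }
}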
• You have to implement your own ITrigger (for now)
• Compile and deploy to each server

Experimental!
• Relies on internal RowMutation, ColumnFamily classes
• Not sandboxed. Be careful!
• Expect changes in 2.1

CQL Improvements
• ALTER DROP
• Remove a field from a CQL table.
ALTER TABLE users DROP address3;
• Conditional schema changes
• Only execute if condition met
CREATE KEYSPACE IF NOT EXISTS ks
WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);
DROP KEYSPACE IF EXISTS ks;

CQL Improvements
• Aliases in SELECT
SELECT event_id, dateOf(created_at) AS creation_date,
blobAsText(content) AS content
FROM timeline;
 event_id                | creation_date            | content
-------------------------+--------------------------+----------------------
 550e8400-e29b-41d4-a716 | 2013-07-26 10:44:33+0200 | Something happened!?
• Limit and TTL in prepared statements
SELECT * FROM myTable LIMIT ?;
UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';

Cassandra 2.0 - Minor features

Query performance
• Hint when reading time series data
• Time series slices find data faster
• Hybrid approach to Leveled Compaction under stress
• Use size tiered until we catch up
• Reduce read latency impact
• Off-heap memory speedup
• Bytes moved on and off 10x faster
• Removal of row-level bloom filters

Server performance
• Single pass compaction
• No more incremental compaction for large storage rows
• LMAX Disruptor on Thrift interface
• Crazy fast and efficient concurrent threads. Faster HSHA
• Support for pluggable off-heap memory allocators
• JEMalloc support to start. Faster memory access.
• Bigger Level 0 file size
• 5 MB was just too small. Now 160 MB

Removed features
• SuperColumns are gone!
• Not the API, just the underlying implementation
• On-heap row cache
• Row cache is no longer an option in the JVM
• Memory pressure relief valves - Gone from yaml
• flush_largest_memtables_at
• reduce_cache_sizes_at
• reduce_cache_sizes_to

Operation Changes
• JDK 7 now required
• Vnodes are default
• Streaming overhaul
• Control. Streams are grouped and broken into plans
• Traceability. Each stream has an ID. Monitor each stream.
• Performance. Streams are now pipelined. No waiting for ACK

Thank you!
Apache Cassandra 2.0 - Data model on fire
Next talk in my data model series!
