CQL: SQL In Cassandra

CQL: SQL for Cassandra
Cassandra NYC
December 6, 2011

Eric Evans
eric@acunu.com
@jericevans, @acunu

● Overview, history, motivation
● Performance characteristics
● Coming soon (?)
● Drivers status

What?
● Cassandra Query Language
● aka CQL
● aka /ˈsēkwəl/
● Exactly like SQL (except where it's not)
● Introduced in Cassandra 0.8.0
● Ready for production use

SQL? Almost.

-- Inserts or updates
INSERT INTO Standard1 (KEY, col0, col1)
VALUES (key, value0, value1)
vs.
-- Inserts or updates
UPDATE Standard1
SET col0=value0, col1=value1 WHERE KEY=key

SQL? Almost.
-- Get columns for a row
SELECT col0,col1 FROM Standard1 WHERE KEY=key

-- Range of columns for a row
SELECT col0..colN
FROM Standard1 WHERE KEY=key

-- First 10 results from a range of columns
SELECT FIRST 10 col0..colN

-- Invert the sorting of results
SELECT REVERSED col0..colN
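As a rough mental model of the three slice forms above, a row can be pictured as a sorted map of column names. The sketch below is a toy stand-in, not Cassandra code: the class name, the `row()` helper, and the assumption that FIRST is applied after REVERSED are all illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ColumnSliceSketch {
    // Toy stand-in for one row: column names kept in sorted order.
    static NavigableMap<String, String> row() {
        NavigableMap<String, String> r = new TreeMap<>();
        for (int i = 0; i < 5; i++) {
            r.put("col" + i, "value" + i);
        }
        return r;
    }

    // Column names between start and end (both inclusive), optionally
    // reversed, capped at `first` results (assumed to apply after reversal).
    static List<String> slice(NavigableMap<String, String> row,
                              String start, String end,
                              int first, boolean reversed) {
        NavigableMap<String, String> range = row.subMap(start, true, end, true);
        if (reversed) {
            range = range.descendingMap();
        }
        List<String> names = new ArrayList<>();
        for (String name : range.keySet()) {
            if (names.size() == first) {
                break;
            }
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> r = row();
        // Like: SELECT col1..col3
        System.out.println(slice(r, "col1", "col3", Integer.MAX_VALUE, false));
        // Like: SELECT FIRST 2 REVERSED col1..col3 (in this toy model)
        System.out.println(slice(r, "col1", "col3", 2, true));
    }
}
```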

(Un)ease of use
// Build a single column: name, value, timestamp
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a ColumnOrSuperColumn, then in a Mutation
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);
// Row key -> column family name -> list of mutations
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);

CQL
INSERT INTO Standard1 (KEY, col0)
VALUES (key, value0)

Why? How about...
● Better stability guarantees
● Easier to use (you already know it)
● Better code readability / maintainability

Why? How about...
● Irritates the NoSQL purists
● (Still) irritates the SQL purists

Thrift RPC
// Build a single column: name, value, timestamp
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a ColumnOrSuperColumn, then in a Mutation
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);
List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);
// Row key -> column family name -> list of mutations
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);

CQL

INSERT INTO Standard1 (KEY, col0)
VALUES (key, value0)

Hotspot
Quoted string literals

UPDATE table SET 'name' = 'value'
WHERE KEY = 'somekey'

● Anything that appears between quotes
● Inlined Java constructs a StringBuilder to store
the contents (not fast)
● Incurred multiple times per statement
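To make the cost concrete, here is a minimal sketch of the kind of work described above: each quoted literal is copied character by character into a fresh StringBuilder. This is not the actual grammar code; the class and method names are hypothetical, and the handling of doubled single quotes as an escape is an assumption for illustration.

```java
public class StringLiteralSketch {
    // Strip the surrounding single quotes and collapse a doubled quote ('')
    // into one, copying the contents into a new StringBuilder.
    static String unquote(String literal) {
        int last = literal.length() - 1;
        if (last < 1 || literal.charAt(0) != '\'' || literal.charAt(last) != '\'') {
            throw new IllegalArgumentException("not a quoted literal: " + literal);
        }
        StringBuilder sb = new StringBuilder(literal.length() - 2);
        for (int i = 1; i < last; i++) {
            char c = literal.charAt(i);
            sb.append(c);
            // Skip the second quote of an escaped pair.
            if (c == '\'' && i + 1 < last && literal.charAt(i + 1) == '\'') {
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The UPDATE on the slide contains three literals,
        // so this work happens three times for one statement.
        String[] literals = {"'name'", "'value'", "'somekey'"};
        for (String literal : literals) {
            System.out.println(unquote(literal));
        }
    }
}
```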

Hotspot
Marshalling

UPDATE table SET 'clear' = 'abffaadd10'
WHERE KEY = 'acfe12ff'

Hotspot
Marshalling

ascii blob

● Terms are marshalled to bytes by type
● String.getBytes is slow (AsciiType)
● Hex conversion is faster (BytesType)
● Incurred multiple times per statement
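The two marshalling paths can be sketched side by side. This is an illustration only, not the AsciiType or BytesType source: the class and method names are hypothetical, and the terms are the ones from the UPDATE on the slide.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class MarshalSketch {
    // ascii term -> bytes, via String.getBytes (the slower path on the slide)
    static ByteBuffer fromAscii(String term) {
        return ByteBuffer.wrap(term.getBytes(StandardCharsets.US_ASCII));
    }

    // hex term -> bytes, two hex digits per byte (the blob path)
    static ByteBuffer fromHex(String term) {
        if (term.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string: " + term);
        }
        byte[] out = new byte[term.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(term.charAt(2 * i), 16);
            int lo = Character.digit(term.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + term);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return ByteBuffer.wrap(out);
    }

    public static void main(String[] args) {
        // One marshalling call per term, so three per statement here.
        System.out.println(fromAscii("clear").remaining() + " bytes");
        System.out.println(fromHex("abffaadd10").remaining() + " bytes");
        System.out.println(fromHex("acfe12ff").remaining() + " bytes");
    }
}
```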

Hotspot
Copying / Conversion

execute_cql_query(
ByteBuffer query, enum compression)
● Query is binary to support compression (is it worth it?)
● And don't forget the String → ByteBuffer conversion on
the client-side
● Incurred only once per statement!
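The round trip looks roughly like the sketch below: the client turns the query String into bytes (optionally compressed), and the server turns the bytes back into a String before parsing. The class, the `encode`/`decode` helpers, and the local `Compression` enum are stand-ins for illustration, not the Thrift-generated types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class QueryTransportSketch {
    // Local stand-in for the compression argument of execute_cql_query.
    enum Compression { NONE, GZIP }

    // Client side: String -> ByteBuffer, compressing if asked to.
    static ByteBuffer encode(String query, Compression compression) {
        byte[] raw = query.getBytes(StandardCharsets.UTF_8);
        if (compression == Compression.NONE) {
            return ByteBuffer.wrap(raw);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(bytes.toByteArray());
    }

    // Server side: ByteBuffer -> String, decompressing if needed.
    static String decode(ByteBuffer query, Compression compression) {
        byte[] data = new byte[query.remaining()];
        query.duplicate().get(data); // copy without moving the caller's position
        if (compression == Compression.NONE) {
            return new String(data, StandardCharsets.UTF_8);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cql = "UPDATE table SET 'name' = 'value' WHERE KEY = 'somekey'";
        for (Compression c : Compression.values()) {
            ByteBuffer wire = encode(cql, c);
            System.out.println(c + ": " + wire.remaining() + " bytes, round trip ok = "
                    + decode(wire, c).equals(cql));
        }
    }
}
```

For a statement this short, gzip framing overhead may well outweigh any saving, which is one way to read the "is it worth it?" on the slide.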

Achtung!
(These tests weren't perfect)

● Unneeded String → ByteBuffer → String
● No query compression implemented
● Co-located client and server

Insert 20M rows, 5 columns

       Avg rate           Avg latency
RPC    20,953/s           1.6ms
CQL    19,176/s (-8%)     1.7ms (+9%)

Insert 10M rows, 5 cols (indexed)

       Avg rate           Avg latency
RPC    9,850/s            5.3ms
CQL    9,290/s (-6%)      5.5ms (+4%)

Counts, 10M rows, 5 cols

       Avg rate           Avg latency
RPC    18,052/s           1.7ms
CQL    17,635/s (-2%)     1.7ms

Reading 20M rows, 5 cols

       Avg rate           Avg latency
RPC    22,726/s           2.0ms
CQL    20,272/s (-11%)    2.3ms (+10%)
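The rate percentages in these tables are the relative change of CQL against the RPC baseline, rounded to a whole percent. The sketch below reproduces them from the rates on the slides; the class and method names are hypothetical. The latency percentages are left out because the latencies shown are rounded to 0.1ms, so they cannot be recomputed reliably from the slides.

```java
public class RelativeChangeSketch {
    // Relative change of `measured` against `baseline`, as a whole percent.
    static long percentChange(double baseline, double measured) {
        return Math.round((measured - baseline) / baseline * 100.0);
    }

    public static void main(String[] args) {
        // (RPC rate, CQL rate) pairs in rows per second, taken from the slides.
        double[][] rates = {
            {20953, 19176}, // insert 20M rows
            {9850, 9290},   // insert 10M rows (indexed)
            {18052, 17635}, // counts
            {22726, 20272}, // reads
        };
        for (double[] pair : rates) {
            System.out.println(percentChange(pair[0], pair[1]) + "%");
        }
    }
}
```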

In Summary
Don't step over dollars to pick up pennies!

Roadmap
● Prepared statements (CASSANDRA-2475)
● Compound columns (CASSANDRA-2474)
● Custom transport / protocol (CASSANDRA-2478)
● Performance testing (CASSANDRA-2268)
● Schema introspection (CASSANDRA-2477)
● Multiget support (CASSANDRA-3069)

Drivers
● Hosted on Apache Extras (Google Code)
● Tagged cassandra and cql
● Licensed using Apache License 2.0
● Conforming to a standard for database
connectivity (if applicable)
● Coming soon, automated testing and
acceptance criteria

Drivers
Driver              Platform    Status
cassandra-jdbc      Java        Good
cassandra-dbapi2    Python      Good
cassandra-ruby      Ruby        New
cassandra-pdo       PHP         New
cassandra-node      Node.js     Good

http://code.google.com/a/apache-extras.org/hosting/search?q=label%3aCassandra
