CQL3 in depth

CQL3 in depth
Cassandra Conference in Tokyo, 11/29/2012

Yuki Morishita
Software Engineer@DataStax / Apache Cassandra Committer

©2012 DataStax
1

Agenda!
• Why CQL3?
• CQL3 walkthrough
• Deﬁning Schema
• Querying / Mutating Data
• New features
• Related topics
• Native transport

©2012 DataStax
2

Why CQL3?
©2012 DataStax
3

Cassandra Storage
create column family profiles
with key_validation_class = UTF8Type
and comparator = UTF8Type
and column_metadata = [
{column_name: first_name, validation_class: UTF8Type},
{column_name: last_name, validation_class: UTF8Type},
{column_name: year, validation_class: IntegerType}
];

row key columns values are validated by validation_class

nobu first_name Nobunaga
columns are sorted
last_name Oda
in comparator order
year 1582

©2012 DataStax
4

Thrift API
• Low level: get, get_slice, mutate...
• Directly exposes internal storage
structure
• Hard to change the signature of API

©2012 DataStax
5

Inserting data with Thrift
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);

Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);

List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

Map<String, List<Mutation>> cf = new HashMap<String, List<Mutation>>();
cf.put("Standard1", mutations);

Map<ByteBuffer, Map<String, List<Mutation>>> records
= new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
records.put(ByteBuffer.wrap("key".getBytes()), cf);

client.batch_mutate(records, consistencyLevel);

©2012 DataStax
6

... with Cassandra Query Language

INSERT INTO “Standard1” (key, name)
VALUES (“key”, “value”);

• Introduced in 0.8(CQL), updated in
1.0(CQL2)
• Syntax similar to SQL
• More extensible than Thrift API
©2012 DataStax
7

CQL2 Problems
• Almost 1 to 1 mapping to Thrift API, so
not compose with the row-oriented parts
of SQL
• No support for CompositeType

©2012 DataStax
8

CQL3
• Maps storage to a more natural rows-
and-columns representation using
CompositeType
• Wide rows are “transposed” and unpacked
into named columns
• beta in 1.1, default in 1.2
• New features
• Collection support

©2012 DataStax
9

CQL3 walkthrough
©2012 DataStax
10

Defining Keyspace
• Syntax is changed from CQL2

CREATE KEYSPACE my_keyspace WITH replication = {
'class': 'SimpleStrategy',
'replication_factor': 2
};

©2012 DataStax
11

Defining Static Column Family
• “Strict” schema definition (and it’s good
thing)
• You cannot add column arbitrary
• You need ALTER TABLE ... ADD column
first
• Columns are defined and sorted using
CompositeType comparator

©2012 DataStax
12

Defining Static Column Family

CREATE TABLE profiles (
user_id text PRIMARY KEY, user_id | first_name | last_name | year
first_name text, ---------+------------+-----------+------
last_name text,
year int nobu | Nobunaga | Oda | 1582
)

CompositeType(UTF8Type)
user_id values are validated by type deﬁnition

nobu :

first_name: Nobunaga
columns are sorted
last_name: Oda
in comparator order
year: 1582

©2012 DataStax
13

Defining Dynamic Column Family
• Then, how can we add columns
dynamically to our time series data like
we did before?
• Use compound key

©2012 DataStax
14

Compound key
CREATE TABLE comments (
article_id uuid,
posted_at timestamp,
author text,
content text,
PRIMARY KEY (article_id, posted_at)
)

CompositeType(DateType, UTF8Type)

article_id values are validated by type deﬁnition

550e8400-.. 1350499616:

1350499616:author yukim
columns are sorted
1350499616:content blah, blah, blah in comparator order,
ﬁrst by date, and then
1368499616: column name
1368499616:author yukim

1368499616:content well, well, well
...
©2012 DataStax
15

Changes worth noting
• Identiﬁers (keyspace/table/columns
names) are always case insensitive by
default
• Use double quote(“) to force case
• Compaction setting is now map type
CREATE TABLE test (
...
) WITH COMPACTION = {
'class': 'SizeTieredCompactionStrategy',
'min_threshold': 2,
'max_threshold': 4
};
©2012 DataStax
17

Changes worth noting
• system.schema_*
• All schema information are stored in system
Keyspace
• schema_keyspaces, schema_columnfamilies,
schema_columns
• system tables themselves are CQL3 schema
• CQL3 schema are not visible through
cassandra-cli’s ‘describe’ command.
• use cqlsh’s ‘describe columnfamily’
©2012 DataStax
18

More on CQL3 schema
• Thrift to CQL3 migration
• http://www.datastax.com/dev/blog/thrift-to-cql3

• For better understanding
• http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
• http://www.datastax.com/dev/blog/cql3-evolutions
• http://www.datastax.com/dev/blog/cql3-for-cassandra-experts

©2012 DataStax
19

Mutating Data

INSERT INTO example (id, name) VALUES (...)

UPDATE example SET f = ‘foo’ WHERE ...

DELETE FROM example WHERE ...

• No more USING CONSISTENCY
• Consistency level setting is moved to protocol
level
©2012 DataStax
20

Batch Mutate
BEGIN BATCH
INSERT INTO aaa (id, col) VALUES (...)
UPDATE bbb SET col1 = ‘val1’ WHERE ...
...
APPLY BATCH;

• Batches are atomic by default from 1.2
• does not mean mutations are isolated
(mutation within a row is isolated from 1.1)
• some performance penalty because of batch
log process
©2012 DataStax
21

Batch Mutate
• Use non atomic batch if you need
performance, not atomicity
BEGIN UNLOGGED BATCH
...
APPLY BATCH;

• More on dev blog
• http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2

©2012 DataStax
22

Querying Data

SELECT article_id, posted_at, author
FROM comments
WHERE
article_id >= ‘...’
ORDER BY posted_at DESC
LIMIT 100;

©2012 DataStax
23

Querying Data
• TTL/WRITETIME
• You can query TTL or write time of the column.

cqlsh:ks> SELECT WRITETIME(author) FROM comments;

writetime(author)
-------------------
1354146105288000

©2012 DataStax
24

Collection support
• Collection
• Set
• Unordered, no duplicates
• List
• Ordered, allow duplicates
• Map
• Keys and associated values

©2012 DataStax
25

Collection support

CREATE TABLE example (
id uuid PRIMARY KEY,
tags set<text>,
points list<int>,
attributes map<text, text>
);

• Collections are typed, but cannot be
nested(no list<list<text>>)
• No secondary index on collections
©2012 DataStax
26

Collection support

INSERT INTO example (id, tags, points, attributes)
VALUES (
‘62c36092-82a1-3a00-93d1-46196ee77204’,
{‘foo’, ‘bar’, ‘baz’}, // set
[100, 20, 93], // list
{‘abc’: ‘def’} // map
);

©2012 DataStax
27

Collection support
• Set
UPDATE example SET tags = tags + {‘qux’} WHERE ...
UPDATE example SET tags = tags - {‘foo’} WHERE ...

• List
UPDATE example SET points = points + [20, 30] WHERE ...
UPDATE example SET points = points - [100] WHERE ...

• Map
UPDATE example SET attributes[‘ghi’] = ‘jkl’ WHERE ...
DELETE attributes[‘abc’] FROM example WHERE ...

©2012 DataStax
28

Collection support

SELECT tags, points, attributes FROM example;

tags | points | attributes
-----------------+---------------+--------------
{baz, foo, bar} | [100, 20, 93] | {abc: def}

• You cannot retrieve item in collection
individually

©2012 DataStax
29

Collection support
• Each element in collection is internally
stored as one Cassandra column
• More on dev blog
• http://www.datastax.com/dev/blog/cql3_collections

©2012 DataStax
30

CQL3 in depth

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (8)

Similar to CQL3 in depth

Similar to CQL3 in depth (20)

More from Yuki Morishita

More from Yuki Morishita (12)

CQL3 in depth