CQL3 in depth. Cassandra Conference in Tokyo, 11/29/2012. Yuki Morishita, Software Engineer @ DataStax / Apache Cassandra Committer
Transcript of "CQL3 in depth"

  1. CQL3 in depth
     Cassandra Conference in Tokyo, 11/29/2012
     Yuki Morishita, Software Engineer @ DataStax / Apache Cassandra Committer
     ©2012 DataStax
  2. Agenda!
     • Why CQL3?
     • CQL3 walkthrough
       • Defining Schema
       • Querying / Mutating Data
       • New features
     • Related topics
       • Native transport
  3. Why CQL3?
  4. Cassandra Storage
       create column family profiles
         with key_validation_class = UTF8Type
         and comparator = UTF8Type
         and column_metadata = [
           {column_name: first_name, validation_class: UTF8Type},
           {column_name: last_name, validation_class: UTF8Type},
           {column_name: year, validation_class: IntegerType}
         ];
     Row key and columns:
       nobu | first_name: Nobunaga | last_name: Oda | year: 1582
     • Values are validated by validation_class
     • Columns are sorted in comparator order
  5. Thrift API
     • Low level: get, get_slice, mutate...
     • Directly exposes internal storage structure
     • Hard to change the signature of the API
  6. Inserting data with Thrift
       Column col = new Column(ByteBuffer.wrap("name".getBytes()));
       col.setValue(ByteBuffer.wrap("value".getBytes()));
       col.setTimestamp(System.currentTimeMillis());
       ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
       cosc.setColumn(col);
       Mutation mutation = new Mutation();
       mutation.setColumn_or_supercolumn(cosc);
       List<Mutation> mutations = new ArrayList<Mutation>();
       mutations.add(mutation);
       Map<String, List<Mutation>> cf = new HashMap<String, List<Mutation>>();
       cf.put("Standard1", mutations);
       Map<ByteBuffer, Map<String, List<Mutation>>> records =
           new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
       records.put(ByteBuffer.wrap("key".getBytes()), cf);
       client.batch_mutate(records, consistencyLevel);
  7. ... with Cassandra Query Language
       INSERT INTO "Standard1" (key, name) VALUES ('key', 'value');
     • Introduced in 0.8 (CQL), updated in 1.0 (CQL2)
     • Syntax similar to SQL
     • More extensible than the Thrift API
  8. CQL2 Problems
     • Almost a 1-to-1 mapping to the Thrift API, so it does not compose well with the row-oriented parts of SQL
     • No support for CompositeType
  9. CQL3
     • Maps storage to a more natural rows-and-columns representation using CompositeType
       • Wide rows are "transposed" and unpacked into named columns
     • Beta in 1.1, default in 1.2
     • New features
       • Collection support
  10. CQL3 walkthrough
  11. Defining Keyspace
      • Syntax has changed from CQL2
        CREATE KEYSPACE my_keyspace WITH replication = {
          'class': 'SimpleStrategy',
          'replication_factor': 2
        };
  12. Defining Static Column Family
      • "Strict" schema definition (and it's a good thing)
        • You cannot add columns arbitrarily
        • You need ALTER TABLE ... ADD column first
      • Columns are defined and sorted using a CompositeType comparator
  13. Defining Static Column Family
        CREATE TABLE profiles (
          user_id text PRIMARY KEY,
          first_name text,
          last_name text,
          year int
        )

         user_id | first_name | last_name | year
        ---------+------------+-----------+------
            nobu |   Nobunaga |       Oda | 1582

      Storage, with comparator CompositeType(UTF8Type); row key (user_id) and columns:
        nobu | : | first_name: Nobunaga | last_name: Oda | year: 1582
      • Values are validated by type definition
      • Columns are sorted in comparator order
  14. Defining Dynamic Column Family
      • Then how can we add columns dynamically to our time series data, as we did before?
        • Use a compound key
  15. Compound key
        CREATE TABLE comments (
          article_id uuid,
          posted_at timestamp,
          author text,
          content text,
          PRIMARY KEY (article_id, posted_at)
        )

      Storage, with comparator CompositeType(DateType, UTF8Type); row key (article_id) 550e8400-.. holds the columns:
        1350499616:
        1350499616:author   yukim
        1350499616:content  blah, blah, blah
        1368499616:
        1368499616:author   yukim
        1368499616:content  well, well, well
        ...
      • Values are validated by type definition
      • Columns are sorted in comparator order, first by date, and then by column name
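To make the "transposition" on this slide concrete, here is a small Python model (not Cassandra code) of how one CQL3 row of `comments` could be flattened into composite-named columns under its row key. The function name `to_storage_cells` and the in-memory representation are hypothetical, chosen only for illustration; the marker cell with an empty last component mirrors the bare `1350499616:` entry on the slide.

```python
def to_storage_cells(row, partition_key, clustering_key):
    """Flatten one CQL3 row into (row key, [((prefix, column), value), ...]).

    Illustrative model only: real Cassandra stores serialized bytes.
    """
    prefix = row[clustering_key]
    # Marker cell with an empty last component ("1350499616:" on the slide).
    cells = [((prefix, ""), "")]
    for name, value in row.items():
        if name not in (partition_key, clustering_key):
            cells.append(((prefix, name), value))
    # Composite names sort by the first component, then by column name.
    return row[partition_key], sorted(cells)


comment = {
    "article_id": "550e8400-..",
    "posted_at": 1350499616,
    "author": "yukim",
    "content": "blah, blah, blah",
}
row_key, cells = to_storage_cells(comment, "article_id", "posted_at")
for (posted_at, column), value in cells:
    print(f"{row_key} | {posted_at}:{column} = {value!r}")
```

Every comment on the same article lands in the same storage row, which is why a compound key gives the "dynamic columns" behaviour of a wide row.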
  16. Compound key
        cqlsh:ks> SELECT * FROM comments;

         article_id   | posted_at                | author | content
        --------------+--------------------------+--------+------------------
         550e8400-... | 1970-01-17 00:08:19+0900 | yukim  | blah, blah, blah
         550e8400-... | 1970-01-17 05:08:19+0900 | yukim  | well, well, well

        cqlsh:ks> SELECT * FROM comments WHERE posted_at >= '1970-01-17 05:08:19+0900';

         article_id   | posted_at                | author | content
        --------------+--------------------------+--------+------------------
         550e8400-... | 1970-01-17 05:08:19+0900 | yukim  | well, well, well
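The 1970 dates in this output are consistent with the integers from the previous slide (1350499616 and 1368499616) being read as milliseconds since the Unix epoch, which is how CQL interprets an integer timestamp. That reading is an editorial inference, not something the slides state. A minimal Python check, with the hypothetical helper `as_cql_timestamp` and a fixed +0900 offset to match the cqlsh output:

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))  # offset shown in the cqlsh output
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_cql_timestamp(value_ms):
    """Interpret an integer as milliseconds since the Unix epoch."""
    moment = EPOCH + timedelta(milliseconds=value_ms)
    return moment.astimezone(JST).strftime("%Y-%m-%d %H:%M:%S%z")


print(as_cql_timestamp(1350499616))         # read as milliseconds
print(as_cql_timestamp(1350499616 * 1000))  # same number read as seconds
```

Read as seconds, the same number falls in October 2012, so a client that holds epoch seconds would need to multiply by 1000 before writing.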
  17. Changes worth noting
      • Identifiers (keyspace/table/column names) are case-insensitive by default
        • Use double quotes (") to force case
      • Compaction setting is now a map type
        CREATE TABLE test (
          ...
        ) WITH COMPACTION = {
          'class': 'SizeTieredCompactionStrategy',
          'min_threshold': 2,
          'max_threshold': 4
        };
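The identifier rule above can be sketched in a few lines of Python. This is a model of the behaviour, not Cassandra's parser: unquoted identifiers are folded to lower case, while double-quoted ones keep their case. The helper name `normalize_identifier` is hypothetical.

```python
def normalize_identifier(identifier):
    """Return the name CQL3 would use for an identifier as written in a query."""
    quoted = (
        len(identifier) >= 2
        and identifier.startswith('"')
        and identifier.endswith('"')
    )
    if quoted:
        # Double quotes force the exact case.
        return identifier[1:-1]
    # Unquoted identifiers are case-insensitive: fold to lower case.
    return identifier.lower()


for written in ["Standard1", "STANDARD1", '"Standard1"']:
    print(written, "->", normalize_identifier(written))
```

One consequence is that a column family created through Thrift with a mixed-case name, such as Standard1, has to be double-quoted in CQL3 queries.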
  18. Changes worth noting
      • system.schema_*
        • All schema information is stored in the system keyspace
          • schema_keyspaces, schema_columnfamilies, schema_columns
        • System tables themselves are CQL3 schemas
      • CQL3 schemas are not visible through cassandra-cli's 'describe' command
        • Use cqlsh's 'describe columnfamily'
  19. More on CQL3 schema
      • Thrift to CQL3 migration
        • http://www.datastax.com/dev/blog/thrift-to-cql3
      • For better understanding
        • http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
        • http://www.datastax.com/dev/blog/cql3-evolutions
        • http://www.datastax.com/dev/blog/cql3-for-cassandra-experts
  20. Mutating Data
        INSERT INTO example (id, name) VALUES (...)
        UPDATE example SET f = 'foo' WHERE ...
        DELETE FROM example WHERE ...
      • No more USING CONSISTENCY
        • Consistency level setting is moved to the protocol level
  21. Batch Mutate
        BEGIN BATCH
          INSERT INTO aaa (id, col) VALUES (...)
          UPDATE bbb SET col1 = 'val1' WHERE ...
          ...
        APPLY BATCH;
      • Batches are atomic by default from 1.2
        • This does not mean mutations are isolated (mutations within a single row have been isolated since 1.1)
        • There is some performance penalty because of the batch log process
  22. Batch Mutate
      • Use a non-atomic batch if you need performance, not atomicity
        BEGIN UNLOGGED BATCH
          ...
        APPLY BATCH;
      • More on dev blog
        • http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2
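The difference between "atomic" and "isolated" on the two batch slides can be illustrated with a toy Python model. It is a deliberately simplified sketch and assumes only the general idea of a batch log (record the whole batch first, apply it, then discard the record, and replay leftovers after a failure); the class `Node`, its methods, and the `crash_after` switch are all hypothetical, and the real mechanism is described in the linked blog post.

```python
class Node:
    """Toy single-node model of a logged batch; not Cassandra's implementation."""

    def __init__(self):
        self.batchlog = {}  # batch id -> mutations not yet known to be applied
        self.data = {}      # (table, key) -> value

    def apply_logged_batch(self, batch_id, mutations, crash_after=None):
        # Record the whole batch before touching any data.
        self.batchlog[batch_id] = list(mutations)
        for i, (table, key, value) in enumerate(mutations):
            if crash_after is not None and i == crash_after:
                return False  # simulated failure in the middle of the batch
            self.data[(table, key)] = value
        # Everything was applied, so the log entry is no longer needed.
        del self.batchlog[batch_id]
        return True

    def replay_batchlog(self):
        # Re-apply any batch that was logged but never completed.
        for batch_id in list(self.batchlog):
            for table, key, value in self.batchlog.pop(batch_id):
                self.data[(table, key)] = value


node = Node()
batch = [("aaa", 1, "x"), ("bbb", 2, "y")]
completed = node.apply_logged_batch("b1", batch, crash_after=1)
partial_view = dict(node.data)  # a reader could observe this: no isolation
node.replay_batchlog()          # eventually the whole batch is applied
```

In this model the extra write to the log is the cost of atomicity, which is the trade-off an unlogged batch gives up.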
  23. Querying Data
        SELECT article_id, posted_at, author
        FROM comments
        WHERE article_id = '...'
        ORDER BY posted_at DESC
        LIMIT 100;
  24. Querying Data
      • TTL/WRITETIME
        • You can query the TTL or write time of a column.
          cqlsh:ks> SELECT WRITETIME(author) FROM comments;

           writetime(author)
          -------------------
            1354146105288000
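The WRITETIME value shown is, by Cassandra convention, a count of microseconds since the Unix epoch (unlike the millisecond values of the `timestamp` type). A short Python conversion, with the hypothetical helper `writetime_to_datetime`, makes the number readable:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def writetime_to_datetime(writetime_us):
    """Convert a WRITETIME value (microseconds since epoch) to a UTC datetime."""
    return EPOCH + timedelta(microseconds=writetime_us)


print(writetime_to_datetime(1354146105288000).isoformat())
```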
  25. Collection support
      • Collection
        • Set
          • Unordered, no duplicates
        • List
          • Ordered, allows duplicates
        • Map
          • Keys and associated values
  26. Collection support
        CREATE TABLE example (
          id uuid PRIMARY KEY,
          tags set<text>,
          points list<int>,
          attributes map<text, text>
        );
      • Collections are typed, but cannot be nested (no list<list<text>>)
      • No secondary index on collections
  27. Collection support
        INSERT INTO example (id, tags, points, attributes)
        VALUES (
          '62c36092-82a1-3a00-93d1-46196ee77204',
          {'foo', 'bar', 'baz'},  // set
          [100, 20, 93],          // list
          {'abc': 'def'}          // map
        );
  28. Collection support
      • Set
        UPDATE example SET tags = tags + {'qux'} WHERE ...
        UPDATE example SET tags = tags - {'foo'} WHERE ...
      • List
        UPDATE example SET points = points + [20, 30] WHERE ...
        UPDATE example SET points = points - [100] WHERE ...
      • Map
        UPDATE example SET attributes['ghi'] = 'jkl' WHERE ...
        DELETE attributes['abc'] FROM example WHERE ...
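For readers who think in terms of ordinary collections, the updates on this slide behave roughly like the following Python operations, starting from the values inserted on the previous slide. This is an analogy written by the editor, not driver code; in particular, list subtraction is modelled as removing every occurrence of the listed values.

```python
tags = {"foo", "bar", "baz"}
points = [100, 20, 93]
attributes = {"abc": "def"}

# Set: tags = tags + {'qux'}, then tags = tags - {'foo'}
tags = tags | {"qux"}
tags = tags - {"foo"}

# List: points = points + [20, 30] appends; points = points - [100]
# removes all elements equal to any of the listed values.
points = points + [20, 30]
points = [p for p in points if p not in [100]]

# Map: attributes['ghi'] = 'jkl', then DELETE attributes['abc']
attributes["ghi"] = "jkl"
del attributes["abc"]

print(tags, points, attributes)
```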
  29. Collection support
        SELECT tags, points, attributes FROM example;

         tags            | points        | attributes
        -----------------+---------------+--------------
         {baz, foo, bar} | [100, 20, 93] | {abc: def}

      • You cannot retrieve an individual item from a collection
  30. Collection support
      • Each element in a collection is internally stored as one Cassandra column
      • More on dev blog
        • http://www.datastax.com/dev/blog/cql3_collections
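A rough Python picture of "one column per element" follows. The layout is an assumption made for illustration: set elements and map keys are placed in the column name, and an integer position stands in for whatever internal ordering key a list uses. The function `collection_cells` is hypothetical; see the linked blog post for the actual encoding.

```python
def collection_cells(column, value):
    """Expand a collection value into one (name, value) cell per element."""
    if isinstance(value, (set, frozenset)):
        # The element itself is part of the cell name; the cell value is empty.
        return [((column, element), "") for element in value]
    if isinstance(value, dict):
        # The map key is part of the cell name; the cell value is the map value.
        return [((column, key), item) for key, item in value.items()]
    if isinstance(value, list):
        # Stand-in: the position replaces the real internal ordering key.
        return [((column, position), item) for position, item in enumerate(value)]
    raise TypeError(f"not a collection: {value!r}")


tag_cells = collection_cells("tags", {"foo", "bar", "baz"})
point_cells = collection_cells("points", [100, 20, 93])
attribute_cells = collection_cells("attributes", {"abc": "def"})
```

Seen this way, adding or removing a single element is a write to a single column, with no need to read the collection first.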
  31. Related topics
  32. Native Transport
      • CQL3 still goes through Thrift's execute_cql3_query API
      • Native Transport support introduces Cassandra's own binary protocol
        • Async IO, server event push, ...
        • http://www.datastax.com/dev/blog/binary-protocol
      • Try the DataStax Java native driver with C* 1.2 beta today!
        • https://github.com/datastax/java-driver
  33. Questions?
      Or contact me later if you have one
        yuki@datastax.com
        yukim (IRC, twitter)
      Now hiring talented engineers from all over the world!