C*ollege Credit: What's New in Apache Cassandra 1.2

Apache Cassandra Project Chair, Jonathan Ellis, looks at all the great improvements in Cassandra 1.2, including VNodes, Parallel Leveled Compaction, Collections, Atomic Batches and CQL3.

Published in: Technology

  1. Cassandra 1.2 • Jonathan Ellis • Project Chair, Apache Cassandra • CTO, DataStax • @spyced
  2. C* in a nutshell • Massively scalable • High performance • Reliable/Available ©2012 DataStax
  4. 1.2 • Concurrent schema changes • Virtual nodes • “Fat node” support • JBOD improvements • Off-heap bloom filters, compression metadata • Parallel leveled compaction • Atomic batches • CQL3 • Collections • Data dictionary • Tracing
  5. Concurrent schema changes. [Figure: two clients issue schema changes to the same Cassandra cluster; one runs CREATE TABLE X; ... DROP TABLE X; and the other runs CREATE TABLE Y; ... DROP TABLE Y;]
  6. Virtual nodes. [Figure, repeated on slides 6-8: a ring without vnodes divided into ranges A to F, beside a ring with vnodes divided into ranges A to P]
  9. Node rebuild without vnodes. [Figure: a ring without vnodes with ranges A to F, surrounded by Node 1 to Node 6, each holding three of the ranges]
  10. Node rebuild with vnodes. [Figure: a ring with vnodes with ranges A to P, surrounded by Node 1 to Node 6, each holding many small ranges from across the ring]
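
The two node rebuild slides contrast one range per node with many small ranges per node. A minimal Python sketch of that contrast follows. It is a simplification: it treats the owners of adjacent ranges as the nodes a rebuild would stream from, which is an assumption about the diagrams and not Cassandra's actual replication logic. The helper `ring_neighbours`, the node names, and the choice of 256 tokens per node are all illustrative.

```python
import random

def ring_neighbours(tokens_by_node, node):
    """Return the other nodes that own a range adjacent to one of `node`'s ranges."""
    ring = sorted((tok, owner) for owner, toks in tokens_by_node.items() for tok in toks)
    neighbours = set()
    for i, (_, owner) in enumerate(ring):
        if owner != node:
            continue
        neighbours.add(ring[i - 1][1])                # previous range (wraps at i == 0)
        neighbours.add(ring[(i + 1) % len(ring)][1])  # next range (wraps at the end)
    neighbours.discard(node)
    return neighbours

NODES = ["node%d" % n for n in range(1, 7)]
RING_SIZE = 2 ** 64  # illustrative token space

# One evenly spaced token per node (no vnodes).
single = {n: [i * RING_SIZE // len(NODES)] for i, n in enumerate(NODES)}

# Many randomly placed tokens per node (vnodes); 256 is an illustrative count.
rng = random.Random(12)
vnodes = {n: [rng.randrange(RING_SIZE) for _ in range(256)] for n in NODES}

single_sources = ring_neighbours(single, "node1")
vnode_sources = ring_neighbours(vnodes, "node1")
print(len(single_sources), "neighbouring nodes without vnodes")
print(len(vnode_sources), "neighbouring nodes with vnodes")
```

With a single token, node1 only ever borders its two ring neighbours. With many scattered tokens, its ranges border ranges owned by the rest of the cluster, so rebuild work can be spread over more machines.
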
  11. JBOD support. [Figure, slides 11-12: one Cassandra instance on top of four disks, HDD1 to HDD4; on slide 12 an X appears over one of the disks]
  13. Moving O(n) structures off-heap • Row (partition) bloom filter: 1-2GB per billion rows • Compression metadata: ~20GB per TB of compressed data
  14. On-heap/off-heap. [Figure: inside the Java process, the JVM's Java heap is on-heap and managed by GC; native memory is off-heap and not managed by GC]
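
The rules of thumb on slide 13 can be turned into a quick estimate. The sketch below is plain arithmetic on the slide's figures (1-2GB of bloom filter per billion rows, roughly 20GB of compression metadata per TB of compressed data). The function name and the sample workload are hypothetical.

```python
def off_heap_estimate_gb(rows, compressed_tb):
    """Return a (low, high) estimate in GB for the structures named on slide 13."""
    billions_of_rows = rows / 1e9
    bloom_low = 1.0 * billions_of_rows    # 1GB per billion rows
    bloom_high = 2.0 * billions_of_rows   # 2GB per billion rows
    compression = 20.0 * compressed_tb    # ~20GB per TB of compressed data
    return bloom_low + compression, bloom_high + compression

# Hypothetical node: 2 billion rows and 0.5TB of compressed data.
low, high = off_heap_estimate_gb(rows=2e9, compressed_tb=0.5)
print("%.1f to %.1f GB" % (low, high))  # prints "12.0 to 14.0 GB"
```

Memory of this size held on the Java heap would be managed by the garbage collector, which is the motivation the slides give for moving these structures to native memory.
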
  15. Batches. [Figure, slides 15-19: a client, a coordinator node, and three partition replicas; on slide 19 an X appears beside the coordinator node]
  20. Atomic batches. [Figure, slides 20-25: a client, a coordinator node, three partition replicas, and a batchlog node; on slides 24-25 an X appears beside the coordinator node]
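
Read as a sequence, the atomic batch slides describe a coordinator that records the batch on a batchlog node before writing to the partition replicas, so the batch can be replayed if the coordinator fails partway through. The Python sketch below simulates that flow in memory. The class names, the `fail_after` switch, and the single batchlog node are assumptions made for illustration, not Cassandra's implementation.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value  # writes are idempotent, so replaying is safe


class BatchlogNode:
    def __init__(self):
        self.pending = {}

    def store(self, batch_id, mutations):
        self.pending[batch_id] = list(mutations)

    def remove(self, batch_id):
        self.pending.pop(batch_id, None)

    def replay(self, replicas):
        """Re-apply every batch that was never confirmed as complete."""
        for batch_id, mutations in list(self.pending.items()):
            for replica_name, key, value in mutations:
                replicas[replica_name].apply(key, value)
            del self.pending[batch_id]


def coordinate(batch_id, mutations, replicas, batchlog, fail_after=None):
    """Write the batch; `fail_after=n` simulates a crash after n mutations."""
    batchlog.store(batch_id, mutations)       # 1. record the whole batch first
    for n, (replica_name, key, value) in enumerate(mutations):
        if fail_after is not None and n >= fail_after:
            return False                      # crash: batchlog entry remains
        replicas[replica_name].apply(key, value)
    batchlog.remove(batch_id)                 # 2. all applied, forget the batch
    return True


replicas = {"r1": Replica(), "r2": Replica(), "r3": Replica()}
batchlog = BatchlogNode()
mutations = [("r1", "a", 1), ("r2", "b", 2), ("r3", "c", 3)]

ok = coordinate("batch-1", mutations, replicas, batchlog, fail_after=1)
partial = [bool(r.data) for r in replicas.values()]   # [True, False, False]
batchlog.replay(replicas)
complete = [bool(r.data) for r in replicas.values()]  # [True, True, True]
```

Without the batchlog step, the simulated crash would leave only the first replica updated, which is the situation the plain batch slides end on.
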
  26. CQL: You got SQL in my NoSQL!
      CREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, birth_date int );
      CREATE INDEX ON users(state);
      SELECT * FROM users WHERE state='Texas' AND birth_date > 1950;
  27. Strictly “realtime” focused • No joins • No subqueries • No aggregation functions* or GROUP BY • Strictly limited ORDER BY
  28. songs
      create column family songs
      with key_validation_class = UUIDType
      and comparator = UTF8Type -- cell names are strings
      and column_metadata = [
        {column_name: title, validation_class: UTF8Type},
        {column_name: album, validation_class: UTF8Type},
        {column_name: artist, validation_class: UTF8Type},
        {column_name: data, validation_class: BytesType}
      ];
      a3e64f8f... | title: La Grange | artist: ZZ Top | album: Tres Hombres
      8a172618... | title: Moving in Stereo | artist: Fu Manchu | album: We Must Obey
      2b09185b... | title: Outside Woman Blues | artist: Back Door Slam | album: Roll Away
  29. CREATE TABLE songs ( id uuid PRIMARY KEY, title text, artist text, album text, data blob );
      id          | title               | artist         | album
      ------------+---------------------+----------------+--------------
      a3e64f8f... | La Grange           | ZZ Top         | Tres Hombres
      8a172618... | Moving in Stereo    | Fu Manchu      | We Must Obey
      2b09185b... | Outside Woman Blues | Back Door Slam | Roll Away
  30. song_tags
      create column family song_tags
      with key_validation_class = UUIDType
      and comparator = UTF8Type;
      a3e64f8f... | blues: | 1973:
      8a172618... | covers: | 2003:
  31. CREATE TABLE song_tags ( id uuid, tag_name text, PRIMARY KEY (id, tag_name) );
      a3e64f8f... | blues: | 1973:
      8a172618... | covers: | 2003:
      id          | tag_name
      ------------+----------
      a3e64f8f... | blues
      a3e64f8f... | 1973
      8a172618... | covers
      8a172618... | 2003
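
Slide 31 shows the same data two ways: as wide rows keyed by song id with one cell per tag, and as CQL rows of (id, tag_name). A small Python sketch of that mapping follows, assuming the wide rows are held in a dict in the order the slide lists them. The `to_cql_rows` helper is hypothetical, and the truncated ids are copied from the slide as-is.

```python
def to_cql_rows(wide_rows):
    """Flatten {row_key: [cell_name, ...]} into (id, tag_name) rows."""
    return [
        (row_key, cell_name)
        for row_key, cell_names in wide_rows.items()
        for cell_name in cell_names
    ]

# Wide rows as drawn on the slide: one storage row per song, one cell per tag.
wide_rows = {
    "a3e64f8f...": ["blues", "1973"],
    "8a172618...": ["covers", "2003"],
}

cql_rows = to_cql_rows(wide_rows)
for row in cql_rows:
    print(row)
```

Each cell of a storage row becomes its own CQL row, with the row key repeated in the id column and the cell name appearing as tag_name.
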
  32. Easier way to add tags
      ALTER TABLE songs ADD tags set<text>;
      id          | title               | artist         | album        | tags
      ------------+---------------------+----------------+--------------+----------------
      a3e64f8f... | La Grange           | ZZ Top         | Tres Hombres | {blues, 1973}
      8a172618... | Moving in Stereo    | Fu Manchu      | We Must Obey | {covers, 2003}
      2b09185b... | Outside Woman Blues | Back Door Slam | Roll Away    |
  33. playlists (repeated on slide 34)
      create column family playlists
      with key_validation_class = UUIDType
      and comparator = CompositeType(UTF8Type, UTF8Type, UTF8Type)
      and default_validation_class = UUIDType;
      62c36092... | La Grange, ZZ Top, Tres Hombres: a3e64f8f... | Moving in S..., Fu Manchu, We Must O...: 8a172618... | Outside Wo..., Back Door ..., Roll Away: 2b09185b...
  35. CREATE TABLE playlists ( id uuid, title text, album text, artist text, song_id uuid, PRIMARY KEY (id, title, album, artist) );
      62c36092... | La Grange, ZZ Top, Tres Hombres: a3e64f8f... | Moving in S..., Fu Manchu, We Must O...: 8a172618... | Outside Wo..., Back Door ..., Roll Away: 2b09185b...
      id          | title            | artist         | album        | song_id
      ------------+------------------+----------------+--------------+-------------
      62c36092... | La Grange        | ZZ Top         | Tres Hombres | a3e64f8f...
      62c36092... | Moving in Stereo | Fu Manchu      | We Must Obey | 8a172618...
      62c36092... | Outside Wo...    | Back Door Slam | Roll Away    | 2b09185b...
  36. Clustering (repeated on slide 37)
      CREATE TABLE timeline ( user_id uuid, tweet_id timeuuid, tweet_author uuid, tweet_body text, PRIMARY KEY (user_id, tweet_id) );
      SELECT * FROM timeline WHERE user_id = 'driftx';
      user_id | tweet_id   | tweet_author | tweet_body
      --------+------------+--------------+------------
      jbellis | 3290f9da.. | rbranson     | lorem
      jbellis | 3895411a.. | tjake        | ipsum
      ...     | ...        | ...          |
      driftx  | 3290f9da.. | rbranson     | lorem
      driftx  | 71b46a84.. | yzhang       | dolor
      ...     | ...        | ...          |
      yukim   | 3290f9da.. | rbranson     | lorem
      yukim   | e451dd42.. | tjake        | amet
      ...     | ...        | ...          |
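
The clustering slides group timeline rows by user_id, and the SELECT reads the rows for a single user. A rough in-memory model in Python follows, assuming rows are bucketed by partition key (user_id) and kept sorted by clustering key (tweet_id). Plain string ordering stands in for timeuuid ordering here, and the `Timeline` class is hypothetical.

```python
import bisect
from collections import defaultdict


class Timeline:
    """Toy model: one sorted list of (tweet_id, author, body) per user_id."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, user_id, tweet_id, author, body):
        # Keep each partition ordered by its clustering key as rows arrive.
        bisect.insort(self.partitions[user_id], (tweet_id, author, body))

    def select(self, user_id):
        # A single-partition read: no other user's rows are touched.
        return list(self.partitions.get(user_id, []))


timeline = Timeline()
timeline.insert("driftx", "71b46a84..", "yzhang", "dolor")
timeline.insert("jbellis", "3290f9da..", "rbranson", "lorem")
timeline.insert("driftx", "3290f9da..", "rbranson", "lorem")

rows = timeline.select("driftx")
for row in rows:
    print(row)
```

The rows for driftx come back together and already ordered by tweet_id, even though they were inserted out of order and interleaved with another user's row.
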
  38. Data dictionary (built up over slides 38-41)
      cqlsh:system> SELECT * FROM schema_keyspaces;
       keyspace_name | durable_writes | strategy_class | strategy_options
      ---------------+----------------+----------------+----------------------------
       keyspace1     | True           | SimpleStrategy | {"replication_factor":"1"}
       system        | True           | LocalStrategy  | {}
       system_traces | True           | SimpleStrategy | {"replication_factor":"1"}
      cqlsh:system> SELECT * FROM schema_columnfamilies WHERE keyspace_name='keyspace1' AND columnfamily_name='test';
      cqlsh:system> SELECT * FROM schema_columns WHERE keyspace_name='keyspace1' AND columnfamily_name='test';
  42. Data dictionary
      cqlsh:system> SELECT * FROM local;
      column            | value
      ------------------+---------------------------------------------
      key               | local
      bootstrapped      | COMPLETED
      cluster_name      | test
      cql_version       | 3.0.0
      data_center       | datacenter1
      gossip_generation | 1352846064
      partitioner       | org.apache.cassandra.dht.Murmur3Partitioner
      rack              | rack1
      release_version   | 1.2.0-beta2-SNAPSHOT
      ring_id           | 224c55d5-21b4-42b0-8969-afc0cc04e812
      thrift_version    | 19.35.0
      tokens            | {0}
      truncated_at      | null
  43. Data dictionary
      cqlsh:system> SELECT * FROM peers LIMIT 1;
      column          | value
      ----------------+--------------------------------------
      peer            | 127.0.0.3
      data_center     | datacenter1
      rack            | rack1
      release_version | 1.2.0-beta2-SNAPSHOT
      ring_id         | f6782327-ef8e-41cf-87b9-2edc287b1ffe
      rpc_address     | 127.0.0.3
      schema_version  | 915ed888-ddd0-3448-860c-582f4eea1bc6
      tokens          | {6148914691236517204}
  44. Request tracing
      cqlsh:foo> INSERT INTO bar (i, j) VALUES (6, 2);
      Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9
       activity                            | timestamp    | source    | source_elapsed
      -------------------------------------+--------------+-----------+----------------
       Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 | 540
       Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 | 779
       Message received from /127.0.0.1    | 00:02:37,016 | 127.0.0.2 | 63
       Applying mutation                   | 00:02:37,016 | 127.0.0.2 | 220
       Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 | 250
       Appending to commitlog              | 00:02:37,016 | 127.0.0.2 | 277
       Adding to memtable                  | 00:02:37,016 | 127.0.0.2 | 378
       Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 | 710
       Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 | 888
       Message received from /127.0.0.2    | 00:02:37,017 | 127.0.0.1 | 2334
       Processing response from /127.0.0.2 | 00:02:37,017 | 127.0.0.1 | 2550
  45. Tracing an antipattern (repeated on slides 46-47)
      CREATE TABLE queues ( id text, created_at timeuuid, value blob, PRIMARY KEY (id, created_at) );
      id      | created_at | value
      --------+------------+--------------
      myqueue | 3092e86f   | 9b0450d30de9
      myqueue | 0867f47c   | fc7aee5f6a66
      myqueue | 5fc74be0   | 668fdb3a2196
  48. cqlsh:foo> SELECT * FROM queues WHERE id = 'myqueue' ORDER BY created_at LIMIT 1;
      Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9
       activity                                 | timestamp    | source    | source_elapsed
      ------------------------------------------+--------------+-----------+----------------
       execute_cql3_query                       | 19:31:05,650 | 127.0.0.1 | 0
       Sending message to /127.0.0.3            | 19:31:05,651 | 127.0.0.1 | 541
       Message received from /127.0.0.1         | 19:31:05,651 | 127.0.0.3 | 39
       Executing single-partition query         | 19:31:05,652 | 127.0.0.3 | 943
       Acquiring sstable references             | 19:31:05,652 | 127.0.0.3 | 973
       Merging memtable contents                | 19:31:05,652 | 127.0.0.3 | 1020
       Merging data from memtables and sstables | 19:31:05,652 | 127.0.0.3 | 1081
       Read 1 live cells and 100000 tombstoned  | 19:31:05,686 | 127.0.0.3 | 35072
       Enqueuing response to /127.0.0.1         | 19:31:05,687 | 127.0.0.3 | 35220
       Sending message to /127.0.0.1            | 19:31:05,687 | 127.0.0.3 | 35314
       Message received from /127.0.0.3         | 19:31:05,687 | 127.0.0.1 | 36908
       Processing response from /127.0.0.3      | 19:31:05,688 | 127.0.0.1 | 37650
       Request complete                         | 19:31:05,688 | 127.0.0.1 | 38047
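
The trace line "Read 1 live cells and 100000 tombstoned" is the heart of the queue antipattern: consumed entries are deleted, yet the read still has to step over their tombstones to reach the first live entry. A minimal Python sketch of that cost follows, assuming a partition is a list of (created_at, value, deleted) tuples in clustering order. The `read_first_live` helper and the tuple layout are hypothetical.

```python
def read_first_live(partition):
    """Return (value, tombstones_scanned) for the first non-deleted entry."""
    tombstones_scanned = 0
    for created_at, value, deleted in partition:
        if deleted:
            tombstones_scanned += 1  # a tombstone still has to be read and skipped
            continue
        return value, tombstones_scanned
    return None, tombstones_scanned


# 100000 consumed (deleted) entries followed by one live entry, as in the trace.
partition = [(i, b"old", True) for i in range(100000)]
partition.append((100000, b"next", False))

value, scanned = read_first_live(partition)
print(value, scanned)  # prints: b'next' 100000
```

In the trace, that step accounts for the jump in source_elapsed from 1081 to 35072, so nearly all of the request time goes to skipping deleted entries.
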
