Cassandra v3.0 at Rakuten meet-up on 12/2/2015

Cassandra v3.0 session at the Cassandra Meet-up at Rakuten, Tokyo, Fall 2015. New functionality: JSON support, the new storage engine, materialized views, UDFs, UDAs, and more.

Published in: Technology

  1. Cassandra 3.0: Matt Kennedy (@thetweetofmatt), DataStax
  2. 3.0: Windows Support, JSON, UDF, Role-based authz, Commitlog Compression, New Storage Engine, New Hints Architecture, Materialized Views, DTCS, Message Coalescing
  3. Windows Support, DTCS, UDF, Role-based authz, Commitlog Compression, New Storage Engine, New Hints Architecture, Materialized Views, JSON, Message Coalescing. Releases: 2.2 (July), 3.0 (Released November 9)
  4. JSON
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        state text,
        birth_date int
      );
      INSERT INTO users (id, name, state, birth_date)
        VALUES (now(), 'Joe User', 'TX', 1982);
      INSERT INTO users JSON
        '{"id": "1a4f88e2-6dc8-4edd-9e16-a7ba9c941f8d",
          "name": "Joe User",
          "state": "TX",
          "birth_date": 1982}';
  5. Collections
      CREATE TABLE example (
        id int PRIMARY KEY,
        tupleval tuple<int, text>,
        numbers set<int>,
        words list<text>
      );
      INSERT INTO example (id, tupleval, numbers, words)
        VALUES (0, (1, 'foo'), {1, 2, 3, 6}, ['the', 'quick', 'brown', 'fox']);
      INSERT INTO example JSON
        '{"id": 0,
          "tupleval": [1, "foo"],
          "numbers": [1, 2, 3, 6],
          "words": ["the", "quick", "brown", "fox"]}';
  6. User-defined types
      CREATE TYPE address (number int, street text);
      CREATE TABLE users (
        id int PRIMARY KEY,
        street_address frozen<address>
      );
      INSERT INTO users (id, street_address)
        VALUES (1, {number: 123, street: 'Cassandra Ave'});
      INSERT INTO users JSON
        '{"id": 1,
          "street_address": {"number": 1,
                             "street": "Cassandra Ave"}}';
  7. Deeper Nesting
      CREATE TYPE address (
        street text,
        city text,
        zip_code int,
        phones set<text>
      );
      CREATE TABLE users (
        id uuid PRIMARY KEY,
        name text,
        addresses map<text, frozen<address>>
      );
  8. Deeper Nesting
      INSERT INTO users JSON
        '{"id": "0514e410-2a9f-11e5-a2cb-0800200c9a66",
          "name": "jellis",
          "addresses": {"home": {"street": "9920 Cassandra Ave",
                                 "city": "Austin",
                                 "zip_code": 78700,
                                 "phones": ["1238614789"]}}}';
  9. Role-based Authorization
      CREATE ROLE accounting;
      GRANT all ON invoices TO accounting;
      GRANT select ON expenses TO accounting;
      GRANT select ON payroll TO accounting;
      GRANT accounting TO josie;
      GRANT accounting TO jay;
  10. User-defined Functions (UDF)
      CREATE FUNCTION my_sin (input double)
        CALLED ON NULL INPUT
        RETURNS double LANGUAGE java
        AS '
          return input == null
            ? null
            : Double.valueOf(Math.sin(input.doubleValue()));
        ';
      SELECT key, my_sin(value) FROM my_table WHERE key IN (1, 2, 3);
  11. UDF Aggregation
      CREATE FUNCTION avgState (state frozen<tuple<bigint, int>>, i int)
        CALLED ON NULL INPUT
        RETURNS frozen<tuple<bigint, int>>
        LANGUAGE java AS '
          // (state[0] + i, state[1] + 1)
          state.setLong(0, state.getLong(0) + i.intValue());
          state.setInt(1, state.getInt(1) + 1);
          return state;
        ';
      CREATE FUNCTION avgFinal (state frozen<tuple<bigint, int>>)
        CALLED ON NULL INPUT
        RETURNS double
        LANGUAGE java AS '
          // Cast before dividing so the result is not truncated to an integer.
          double r = (double) state.getLong(0) / state.getInt(1);
          return Double.valueOf(r);
        ';
      CREATE AGGREGATE avg (int)
        SFUNC avgState
        STYPE tuple<bigint, int>
        FINALFUNC avgFinal
        INITCOND (0, 0);
  12. Commitlog Compression (chart: operations/s over time; series 2.2 and 2.1)
  13. DateTieredCompactionStrategy: Reads/s (chart series: DTCS, STCS, LCS; label: 2.5TB)
  14. DateTieredCompactionStrategy: Writes/s (chart series: DTCS, STCS, LCS)
  15. 3.0 Features
  16. New Storage Engine: Why???
  17. Pre 3.0
      • Clustering keys are repeated for each cell
      • Timestamps are repeated in each cell
      • TTLs are.. you get the idea
      • Rows are a bolted-on construct, only known by a convention
      • Lots of wasted space
      • Lots of repetition
  18. Storage in 3.0
      • Rows are a first-class entity
      • Timestamps and TTLs can be stored on the Row
      • Clustering keys are not repeated
      • Conversion to iterators for memory efficiency
  19. New Storage Engine (chart: bytes by workload)
  20. Pre 3.0
      • Clustering keys are repeated for each cell
      • Timestamps are repeated in each cell
      • TTLs are.. you get the idea
      • Rows are a bolted-on construct, only known by a convention
      • Lots of wasted space
      • Lots of repetition
  21. Storage Model - Logical View
      wsid         hour          temperature
      10010:99999  2005:12:1:10  -5.6
      10010:99999  2005:12:1:9   -5.1
      10010:99999  2005:12:1:8   -4.9
      10010:99999  2005:12:1:7   -5.3
      SELECT wsid, hour, temperature
        FROM raw_weather_data
        WHERE wsid = '10010:99999'
        AND year = 2005 AND month = 12 AND day = 1;
  22. Storage Model - Disk Layout
      Partition 10010:99999, one cell per clustering value:
      2005:12:1:10  -5.6  TTL  Timestamp
      2005:12:1:9   -5.1  TTL  Timestamp
      2005:12:1:8   -4.9  TTL  Timestamp
      2005:12:1:7   -5.3  TTL  Timestamp
      Annotations: Repeated column values; Repeated TTL and Timestamps
      SELECT wsid, hour, temperature
        FROM raw_weather_data
        WHERE wsid = '10010:99999'
        AND year = 2005 AND month = 12 AND day = 1;
  23. Storage in 3.0
      • Rows are a first-class entity
      • Timestamps and TTLs can be stored on the Row
      • Clustering keys are not repeated
      • Conversion to iterators for memory efficiency
  24. New Storage Engine (chart: bytes by workload)
  25. Hinted Handoff Improvements (diagram labels: client, p1, p1, p1, X)
  26. Hinted Handoff Improvements (diagram labels: client, p1, p1, p1, Hint)
  27. Hinted Handoff Improvements (diagram labels: client, p1, p1, p1, Handoff)
  28. SSTable-based Hints (diagram labels: Hint, Compacted, Tombstone, Memtable, Commitlog, SSTable)
  29. File-based Hints (diagram: runs of Hint entries grouped under .168.101, .168.104, and .168.112)
  30. File-based Hints (diagram: runs of Hint entries grouped under .168.104 and .168.112)
  31. 2.2 Hints vs 3.0 (chart: operations/s over time; series 3.0 and 2.2)
  32. Materialized Views
      CREATE TABLE songs (
        id uuid PRIMARY KEY,
        title text,
        album text,
        artist text
      );
      CREATE MATERIALIZED VIEW songs_by_album AS
        SELECT * FROM songs
        WHERE album IS NOT NULL
        PRIMARY KEY (album, id);
      SELECT * FROM songs_by_album
        WHERE album = 'Tres Hombres';
  33. Indexes
      CREATE TABLE songs (
        id uuid PRIMARY KEY,
        title text,
        album text,
        artist text
      );
      CREATE INDEX songs_by_album ON songs(album);
      SELECT * FROM songs
        WHERE album = 'Tres Hombres';
  34. Local Indexes (diagram: client and three tables)
      title       artist          album
      La Grange   ZZ Top          Tres Hombres
      title       artist          album
      Outside...  Back Door Slam  Roll Away
      title       artist          album
      Waitin...   ZZ Top          Tres Hombres
  35. Materialized Views (diagram: client and two tables)
      album         id
      Tres Hombres  a3e64f8f
      Tres Hombres  8a172618
      album         id
      Roll Away     2b09185b
  36. Stress: raw vs 1 MV vs 5 MV (chart: operations/s over time; series raw, 1 MV, 5 MV)
  37. mvbench, 4 denormalizations (chart: operations/s over time; series MV, Manual)
  38. Writing Materialized Views
  39. Beyond 3.0
  40. JBOD-aware vnodes (diagram labels: disk1, disk2, disk3, disk4)
  41. JBOD-aware vnodes (diagram labels: disk1, disk2, disk3, disk4)
  42. 3.x Development Process
  43. 3.0 releases (timeline labels: 3.0 rc1: Out now; 3.0 GA; November; December; 3.1; 3.2; January; February; 3.3)
  44. 3.0 releases (timeline labels: 3.0 rc1: Out now; 3.0 GA; October; December; 3.1; 3.0.1; 3.2; 3.0.2; January; February; 3.3; 3.0.3)
  45. Compatibility (diagram labels: 3.0, 3.1, 3.2)
  46. DataStax Training for Apache Cassandra™
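
The "Pre 3.0" and "Storage in 3.0" slides (17 and 18) argue that repeating the clustering key, timestamp, and TTL in every cell wastes space, and that storing them once per row avoids it. The toy model below is a rough sketch of that arithmetic in Python. It is not Cassandra's actual on-disk format: the byte sizes and the function names `pre_30_row_bytes` and `v30_row_bytes` are assumptions made up for illustration, and it assumes every cell in a row shares one timestamp and TTL.

```python
# Toy model only: the sizes below are illustrative assumptions,
# not Cassandra's real serialization format.
CLUSTERING_BYTES = 12  # assumed size of one serialized clustering key
TIMESTAMP_BYTES = 8    # assumed size of one write timestamp
TTL_BYTES = 4          # assumed size of one TTL
VALUE_BYTES = 8        # assumed size of one column value


def pre_30_row_bytes(columns: int) -> int:
    """Pre-3.0 style: every cell repeats clustering key, timestamp, and TTL."""
    per_cell = CLUSTERING_BYTES + TIMESTAMP_BYTES + TTL_BYTES + VALUE_BYTES
    return columns * per_cell


def v30_row_bytes(columns: int) -> int:
    """3.0 style: clustering key, timestamp, and TTL are stored once per row."""
    row_header = CLUSTERING_BYTES + TIMESTAMP_BYTES + TTL_BYTES
    return row_header + columns * VALUE_BYTES


if __name__ == "__main__":
    for columns in (1, 5, 20):
        old = pre_30_row_bytes(columns)
        new = v30_row_bytes(columns)
        print(f"{columns:>2} columns: pre-3.0 {old} bytes, 3.0 {new} bytes")
```

Under these assumed sizes the two layouts cost the same for a single-column row, and the gap widens as rows gain columns, which is consistent with the slides' point that the savings come from removing per-cell repetition.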

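The "UDF Aggregation" slide (11) builds an average from a state function (SFUNC), a state type (STYPE), and a final function (FINALFUNC). The sketch below mirrors that flow in plain Python to show how the pieces fit together. It runs outside Cassandra; the names `avg_state`, `avg_final`, and `aggregate`, the `(0, 0)` starting state, and the choice to return `None` for zero rows are assumptions for illustration, not Cassandra behavior.

```python
from typing import Iterable, Optional, Tuple

# (running sum, row count), playing the role of tuple<bigint, int>
State = Tuple[int, int]


def avg_state(state: State, i: int) -> State:
    """State function: fold one value into (sum, count)."""
    total, count = state
    return (total + i, count + 1)


def avg_final(state: State) -> Optional[float]:
    """Final function: turn the accumulated (sum, count) into an average."""
    total, count = state
    if count == 0:
        return None  # assumption: no rows means no result
    return total / count  # true division, so the fraction is kept


def aggregate(values: Iterable[int], initcond: State = (0, 0)) -> Optional[float]:
    """Drive the aggregate: start from initcond, apply avg_state per row."""
    state = initcond
    for value in values:
        state = avg_state(state, value)
    return avg_final(state)


if __name__ == "__main__":
    print(aggregate([1, 2]))  # 1.5
```

The example input `[1, 2]` is chosen because it exposes integer truncation: dividing the sum by the count with integer arithmetic would give 1 instead of 1.5.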