Victor Coustenoble
@vizanalytics
Cassandra 2.2 & 3.0
http://www.datastax.com/dev/blog/cassandra-2-2
Where did 2.2 come from?
Don't start Thrift rpc by default (CASSANDRA-9319)
New features
• 2.2
- JSON
- User defined functions
- User defined aggregates
- Other useful features
- http://docs.datastax.com/en/cassandra/2.2/cassandra/features.html
- http://www.datastax.com/dev/blog/cassandra-2-2
• 3.0
- New storage engine (8099)
- A new way to denormalise/duplicate: Materialized Views
So who’s taken some data out of C* and
serialised it as JSON?
Hello JSON
• CREATE TABLE user (username text PRIMARY KEY,
first_name text, last_name text, emails set<text>,
country text);
• INSERT INTO user JSON '{"username": "chbatey",
"first_name": "Christopher", "last_name": "Batey",
"emails": ["christopher.batey@datastax.com"]}';
Goodbye Serialisation!
JSON + User Defined Types
• CREATE TYPE movie (title text, time timestamp,
description text);
• ALTER TABLE user ADD movies set<frozen<movie>>;
• UPDATE user SET movies = {{ title:'Batman',
time:'2011-02-03T04:05:00+0000', description:
'This film rocks' }} where username = 'chbatey';
Out it comes
User Defined Functions
• Run code on the server (dangerous!)
- Disabled by default
• Java and JavaScript supported out of the box
• Any language that supports the Java Scripting API
(Java, JavaScript, Ruby, Python …)
UDF example
CREATE TABLE user (
username text primary key,
first_name text ,
last_name text ,
emails set<text> ,
country text);
Concat function
CREATE FUNCTION name ( first_name text, last_name text )
CALLED ON NULL INPUT
RETURNS text LANGUAGE java
AS 'return first_name + " " + last_name;';
cqlsh:twotwo> select name(first_name, last_name) FROM user;
twotwo.name(first_name, last_name)
------------------------------------
Victor Coustenoble
User Defined Aggregates
CREATE AGGREGATE average ( int )
SFUNC averageState
STYPE tuple<int,bigint>
FINALFUNC averageFinal
INITCOND (0, 0);
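• SFUNC: state function, called for every row; the state is passed between calls
• STYPE: type of the state, which is also the return type of the state function (CQL type)
• FINALFUNC: optional function called on the final state
• INITCOND: initial state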
State function (like a UDF)
CREATE FUNCTION averageState ( state tuple<int,bigint>, value int )
CALLED ON NULL INPUT
RETURNS tuple<int,bigint>
LANGUAGE java
AS '
if (value != null) {
state.setInt(0, state.getInt(0)+1);
state.setLong(1, state.getLong(1)+value.intValue());
}
return state;
';
Final function
CREATE FUNCTION averageFinal ( state tuple<int,bigint> )
CALLED ON NULL INPUT
RETURNS double
LANGUAGE java
AS '
if (state.getInt(0) == 0) return null;
double r = (double) state.getLong(1) / state.getInt(0);
return Double.valueOf(r);
';
• state tuple<int,bigint>: state type
• RETURNS double: overall return type
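The state and final functions above can be modelled in plain Java to see how the aggregate folds over rows. This is only a sketch: the class name AverageAggregateSketch and its State type are made up for illustration and are not Cassandra API. It also shows why the division needs a floating-point operand, since dividing a long by an int would otherwise truncate.

```java
// Plain-Java model of the average aggregate; hypothetical names, not Cassandra API code.
final class AverageAggregateSketch {

    // Stands in for STYPE tuple<int,bigint>: a row count and a running sum.
    static final class State {
        int count;
        long sum;
    }

    // Stands in for averageState: called once per row, null values are skipped.
    static State stateFunction(State state, Integer value) {
        if (value != null) {
            state.count += 1;
            state.sum += value;
        }
        return state;
    }

    // Stands in for averageFinal: null when no rows were seen, otherwise the mean.
    static Double finalFunction(State state) {
        if (state.count == 0) {
            return null;
        }
        // Cast before dividing so the result is not truncated to a whole number.
        return (double) state.sum / state.count;
    }

    // Folds the state function over all values, starting from INITCOND (0, 0).
    static Double average(Integer... values) {
        State state = new State();
        for (Integer value : values) {
            state = stateFunction(state, value);
        }
        return finalFunction(state);
    }

    public static void main(String[] args) {
        System.out.println(average(1, 2));
        System.out.println(average());
    }
}
```

In Cassandra itself the fold is driven by the server for each row that matches the query; the loop in average() only imitates that.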
Putting it all together
Customer events
CREATE AGGREGATE count_by_type(text)
SFUNC countEventTypes
STYPE map<text, int>
INITCOND {};
CREATE FUNCTION countEventTypes( state map<text, int>, type text )
CALLED ON NULL INPUT
RETURNS map<text, int>
LANGUAGE java AS '
Integer count = (Integer) state.get(type);
if (count == null) count = 1;
else count = count + 1;
state.put(type, count);
return state;
';
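As a rough illustration of what count_by_type computes, the same state function can be run over a list of event types in plain Java. The class name CountByTypeSketch and the event type values ("BUY", "WATCH") are assumptions for the example, not taken from the slides.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the count_by_type aggregate; hypothetical names, not Cassandra API code.
final class CountByTypeSketch {

    // Stands in for countEventTypes: bumps the counter for one event type.
    static Map<String, Integer> stateFunction(Map<String, Integer> state, String type) {
        Integer count = state.get(type);
        if (count == null) {
            count = 1;
        } else {
            count = count + 1;
        }
        state.put(type, count);
        return state;
    }

    // Folds the state function over every row, starting from INITCOND {}.
    static Map<String, Integer> countByType(String... eventTypes) {
        Map<String, Integer> state = new HashMap<>();
        for (String type : eventTypes) {
            state = stateFunction(state, type);
        }
        return state;
    }

    public static void main(String[] args) {
        // Hypothetical event types for one customer.
        System.out.println(countByType("BUY", "WATCH", "BUY"));
    }
}
```

There is no FINALFUNC here, so the final state map is itself the result of the aggregate.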
Customer events
Built in aggregates
• count
• max
• min
• avg
• sum
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
Built in time functions
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/TimeFcts.java
Built in aggregates in action
1/ “Materialised views” with Spark
2/ Pure C*
JSON, UDF and UDA available in DevCenter
Role-based access
Other bits and pieces…
• Compressed commit log
• Resumable bootstrapping
• New types
- smallint - short
- tinyint - byte
- date
- time
• Warnings now sent back to client
- batch too large
http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
New Storage Engine
• CASSANDRA-8099
• More efficient storage
• Aware of CQL structure
• Reduce sstable size
• Reduce memory used
• …
Customer events table
CREATE TABLE IF NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid,
event_type text,
PRIMARY KEY (customer_id, time));
CREATE INDEX ON customer_events (staff_id);
Indexes to the rescue?
customer_id | time                | staff_id
------------+---------------------+---------
chbatey     | 2015-03-03 08:52:45 | trevor
chbatey     | 2015-03-03 08:52:54 | trevor
chbatey     | 2015-03-03 08:53:11 | bill
chbatey     | 2015-03-03 08:53:18 | bill
rusty       | 2015-03-03 08:56:57 | bill
rusty       | 2015-03-03 08:57:02 | bill
rusty       | 2015-03-03 08:57:20 | trevor

staff_id | customer_id
---------+------------
trevor   | chbatey
trevor   | chbatey
bill     | chbatey
bill     | chbatey
bill     | rusty
bill     | rusty
trevor   | rusty
Secondary indexes are local
• The staff_id partition in the secondary index is not
distributed like a normal table
• The secondary index entries are only stored on the node
that contains the customer_id partition
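A small in-memory model can make the locality point concrete. It is only a sketch: LocalIndexSketch and Node are hypothetical names, and a real cluster routes queries through a coordinator rather than a loop. Each node indexes only the rows of the customer_id partitions it owns, so a lookup by staff_id has to visit every node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of node-local secondary indexes; hypothetical names, not Cassandra code.
final class LocalIndexSketch {

    // One node: its index covers only the customer_id partitions stored on it.
    static final class Node {
        final Map<String, List<String>> staffIndex = new HashMap<>();

        void write(String customerId, String staffId) {
            staffIndex.computeIfAbsent(staffId, k -> new ArrayList<>()).add(customerId);
        }
    }

    // A query by staff_id must ask every node, because any of them may hold matches.
    static List<String> customersServedBy(String staffId, List<Node> nodes) {
        List<String> result = new ArrayList<>();
        for (Node node : nodes) {
            result.addAll(node.staffIndex.getOrDefault(staffId, new ArrayList<>()));
        }
        return result;
    }

    // Two nodes holding the rows from the slides: A owns chbatey, B owns rusty.
    static List<Node> exampleCluster() {
        Node a = new Node();
        a.write("chbatey", "trevor");
        a.write("chbatey", "trevor");
        a.write("chbatey", "bill");
        a.write("chbatey", "bill");
        Node b = new Node();
        b.write("rusty", "bill");
        b.write("rusty", "bill");
        b.write("rusty", "trevor");
        List<Node> nodes = new ArrayList<>();
        nodes.add(a);
        nodes.add(b);
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(customersServedBy("bill", exampleCluster()));
    }
}
```

With two nodes the fan-out is small, but the number of nodes contacted grows with the cluster, which is the cost the slides warn about.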
Indexes to the rescue?
Node A (customer_id chbatey), local staff_id index
staff_id | customer_id
---------+------------
trevor   | chbatey
trevor   | chbatey
bill     | chbatey
bill     | chbatey

Node B (customer_id rusty), local staff_id index
staff_id | customer_id
---------+------------
bill     | rusty
bill     | rusty
trevor   | rusty

customer_events table
customer_id | time                | staff_id
------------+---------------------+---------
chbatey     | 2015-03-03 08:52:45 | trevor
chbatey     | 2015-03-03 08:52:54 | trevor
chbatey     | 2015-03-03 08:53:11 | bill
chbatey     | 2015-03-03 08:53:18 | bill
rusty       | 2015-03-03 08:56:57 | bill
rusty       | 2015-03-03 08:57:02 | bill
rusty       | 2015-03-03 08:57:20 | trevor

staff_id index
staff_id | customer_id
---------+------------
trevor   | chbatey
trevor   | chbatey
bill     | chbatey
bill     | chbatey
bill     | rusty
bill     | rusty
trevor   | rusty
Do-it-yourself index?
CREATE TABLE IF NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid,
event_type text,
PRIMARY KEY (customer_id, time));
CREATE TABLE IF NOT EXISTS customer_events_by_staff (
customer_id text,
staff_id text,
store_type text,
time timeuuid,
event_type text,
PRIMARY KEY (staff_id, time));
1.2 Logged batches
[Diagram: client, C, BATCH LOG and two BL-R nodes]
BL-R: Batch log replica
Pattern
• Write only:
- Duplicate with a different primary key
- (Optional) Logged batch for eventual consistency
• Full updates:
- No real difference
• Partial updates:
- No staff id in update?
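One reading of the partial-update bullet is sketched below with two in-memory maps standing in for customer_events and customer_events_by_staff. Everything here is hypothetical (the class DualWriteSketch, the string keys, the event type values); a real client would issue CQL statements instead. Inserts can write both copies directly, but an update that only carries the base table key has to read the base row first to learn the staff_id of the duplicate.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of a hand-maintained duplicate table; hypothetical names, not a driver API.
final class DualWriteSketch {

    // Row layout in both maps: {customer_id, time, staff_id, event_type}.
    final Map<String, String[]> byCustomer = new HashMap<>(); // key: customer_id/time
    final Map<String, String[]> byStaff = new HashMap<>();    // key: staff_id/time

    // Write only: all columns are known, so both copies can be written blindly.
    void insert(String customerId, String time, String staffId, String eventType) {
        String[] row = {customerId, time, staffId, eventType};
        byCustomer.put(customerId + "/" + time, row.clone());
        byStaff.put(staffId + "/" + time, row.clone());
    }

    // Partial update: the caller supplies no staff_id, so it must be read back first.
    boolean updateEventType(String customerId, String time, String newEventType) {
        String[] base = byCustomer.get(customerId + "/" + time);
        if (base == null) {
            return false;
        }
        base[3] = newEventType;
        String staffId = base[2]; // read-before-write to locate the duplicate row
        byStaff.get(staffId + "/" + time)[3] = newEventType;
        return true;
    }

    public static void main(String[] args) {
        DualWriteSketch store = new DualWriteSketch();
        store.insert("chbatey", "t1", "trevor", "BUY");
        store.updateEventType("chbatey", "t1", "REFUND");
        System.out.println(store.byStaff.get("trevor/t1")[3]);
    }
}
```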
Score Data Model
CREATE TABLE scores
(
user TEXT,
game TEXT,
year INT,
month INT,
day INT,
score INT,
PRIMARY KEY (user, game, year, month, day)
);
Materialized Views
CREATE MATERIALIZED VIEW alltimehigh AS
SELECT user FROM scores
WHERE game IS NOT NULL AND
score IS NOT NULL AND
user IS NOT NULL AND
year IS NOT NULL AND
month IS NOT NULL AND
day IS NOT NULL
PRIMARY KEY (game, score, user, year, month, day)
WITH CLUSTERING ORDER BY (score DESC);
Materialized Views
INSERT INTO scores (user, game, year, month, day, score) VALUES ('pcmanus', 'Coup', 2015, 05, 01, 4000);
INSERT INTO scores (user, game, year, month, day, score) VALUES ('jbellis', 'Coup', 2015, 05, 03, 1750);
INSERT INTO scores (user, game, year, month, day, score) VALUES ('yukim', 'Coup', 2015, 05, 03, 2250);
INSERT INTO scores (user, game, year, month, day, score) VALUES ('tjake', 'Coup', 2015, 05, 03, 500);
INSERT INTO scores (user, game, year, month, day, score) VALUES ('jmckenzie', 'Coup', 2015, 06, 01, 2000);
INSERT INTO scores (user, game, year, month, day, score) VALUES ('iamaleksey', 'Coup', 2015, 06, 01, 2500);
INSERT INTO scores (user, game, year, month, day, score) VALUES ('tjake', 'Coup', 2015, 06, 02, 1000);
INSERT INTO scores (user, game, year, month, day, score) VALUES ('pcmanus', 'Coup', 2015, 06, 02, 2000);
SELECT user, score FROM alltimehigh WHERE game = 'Coup';
user | score
-----------+-------
pcmanus | 4000
iamaleksey | 2500
yukim | 2250
jmckenzie | 2000
pcmanus | 2000
jbellis | 1750
tjake | 1000
tjake | 500
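The result order follows from the view's primary key: within the game partition, rows cluster by score descending, then by user, year, month and day ascending. A plain-Java sketch of that ordering over the eight inserted rows (ViewOrderSketch and Score are hypothetical names; the server maintains this order on write rather than sorting at read time):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of the view's clustering order; hypothetical names, not Cassandra code.
final class ViewOrderSketch {

    static final class Score {
        final String user;
        final int year, month, day, score;

        Score(String user, int year, int month, int day, int score) {
            this.user = user;
            this.year = year;
            this.month = month;
            this.day = day;
            this.score = score;
        }
    }

    // Orders one game partition: score DESC, then user, year, month, day ASC.
    static List<Score> viewOrder(List<Score> rows) {
        List<Score> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((Score s) -> s.score).reversed()
                .thenComparing(s -> s.user)
                .thenComparingInt(s -> s.year)
                .thenComparingInt(s -> s.month)
                .thenComparingInt(s -> s.day));
        return sorted;
    }

    // The rows inserted for game 'Coup', in insertion order.
    static List<Score> exampleRows() {
        List<Score> rows = new ArrayList<>();
        rows.add(new Score("pcmanus", 2015, 5, 1, 4000));
        rows.add(new Score("jbellis", 2015, 5, 3, 1750));
        rows.add(new Score("yukim", 2015, 5, 3, 2250));
        rows.add(new Score("tjake", 2015, 5, 3, 500));
        rows.add(new Score("jmckenzie", 2015, 6, 1, 2000));
        rows.add(new Score("iamaleksey", 2015, 6, 1, 2500));
        rows.add(new Score("tjake", 2015, 6, 2, 1000));
        rows.add(new Score("pcmanus", 2015, 6, 2, 2000));
        return rows;
    }

    public static void main(String[] args) {
        for (Score s : viewOrder(exampleRows())) {
            System.out.println(s.user + " | " + s.score);
        }
    }
}
```

The tie at 2000 is broken by user, which is why jmckenzie is listed before pcmanus.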
KillrWeather data model
Combining aggregates + MVs
How it works…
For more details
http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
https://issues.apache.org/jira/browse/CASSANDRA-6477
Fine print
• All primary key columns of the base table must be present in your view
• If any part of the view's primary key is NULL, the row won't
appear in the materialised view
• Performance will be a factor!
- More operations to complete (read-before-write,
consistency check …)
- Batch writes for MV
• Bad for low cardinality data (hot spot)
Conclusions
• We still denormalise and duplicate to achieve scalability
and performance
• We just let C* do it for us :)
Find Out More
• Documentation: http://www.datastax.com/docs
• Developer Blog: http://www.datastax.com/dev/blog
• Academy: https://academy.datastax.com
• Community Site: http://planetcassandra.org

Editor's Notes

  • #11 also a toJson and fromJson if you want individual fields
  • #17 User defined types??
  • #28 time is modelled as a long, nanoseconds since midnight
  • #29 time is modelled as a long, nanoseconds since midnight
  • #33 Looks good so far. The first problem, however, is that a single query can result in many partitions being queried. We know why this is bad.
  • #35 Each of the segments of the index table
  • #37 Start by writing it out to a batch log on 2 other replicas. Downsides: extra round trips, extra complexity, serial reads.