SlideShare a Scribd company logo
1 of 46
Victor Coustenoble
@vizanalytics
2.2 & 3.0
http://www.datastax.com/dev/blog/cassandra-2-2
Where did 2.2 come from?
Don't start Thrift rpc by default (CASSANDRA-9319)
New features
• 2.2
- JSON
- User defined functions
- User defined aggregates
- Other useful features
- http://docs.datastax.com/en/cassandra/2.2/cassandra/features.html
- http://www.datastax.com/dev/blog/cassandra-2-2
• 3.0
- New storage engine (8099)
- A new way to denormalise/duplicate : Materialized View
So who’s taken some data out of C* and
serialised it as JSON?
Hello JSON
• create TABLE user (username text primary key,
first_name text , last_name text , emails set<text> ,
country text);
• INSERT INTO user JSON '{"username": "chbatey",
"first_name":"Christopher", "last_name": "Batey",
“emails":["christopher.batey@datastax.com"]}';
Goodbye Serialisation!
JSON + User Defined Types
• CREATE TYPE movie (title text, time timestamp,
description text);
• ALTER TABLE user ADD movies set<frozen<movie>>;
• UPDATE user SET movies = {{ title:'Batman',
time:'2011-02-03T04:05:00+0000', description:
'This film rocks' }} where username = 'chbatey';
Out it comes
• Run code on the server !Dangerous!
- Disabled by default
• Java + Java Script supported out of the box
• Any language that supports the Java Scripting API
(Java, Javascript, Ruby, Python …)
User Defined Functions
UDF example
CREATE TABLE user (
username text primary key,
first_name text ,
last_name text ,
emails set<text> ,
country text);
Concat function
CREATE FUNCTION name ( first_name text, last_name text )
CALLED ON NULL INPUT
RETURNS text LANGUAGE java
AS ‘return first_name + " " + last_name;’;
cqlsh:twotwo> select name(first_name, last_name) FROM user;
twotwo.name(first_name, last_name)
------------------------------------
Victor Coustenoble
User Defined Aggregates
CREATE AGGREGATE average ( int )
SFUNC averageState
STYPE tuple<int,bigint>
FINALFUNC averageFinal
INITCOND (0, 0);
Called for every row
state passed between
Initial state
Return type (CQL)
Optional function called on
final state
State function (like a UDF)
CREATE FUNCTION averageState ( state tuple<int,bigint>, value int )
CALLED ON NULL INPUT
RETURNS tuple<int,bigint>
LANGUAGE java
AS '
if (value != null) {
state.setInt(0, state.getInt(0)+1);
state.setLong(1, state.getLong(1)+val.intValue());
}
return state;
';
Type Columns
Final function
CREATE FUNCTION averageFinal ( state tuple<int,bigint> )
CALLED ON NULL INPUT
RETURNS double
LANGUAGE java
AS '
if (state.getInt(0) == 0) return null;
double r = state.getLong(1) / state.getInt(0);
return Double.valueOf(r);
';
State typeOverall return type
Putting it all together
Customer events
CREATE AGGREGATE count_by_type(text)
SFUNC countEventTypes
STYPE map<text, int>
INITCOND {};
CREATE FUNCTION countEventTypes( state map<text, int>, type text )
CALLED ON NULL INPUT
RETURNS map<text, int>
LANGUAGE java AS '
Integer count = (Integer) state.get(type);
if (count == null) count = 1;
else count = count + 1; state.put(type, count);
return state; ';
Customer events
Built in aggregates
• count
• max
• min
• avg
• sum
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
Built in time functions
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/TimeFcts.java
Built in aggregates in action
1/ “Materialised views” with Spark
2/ Pure C*
2/ Pure C*
JSON, UDF and UDA available in DevCenter
Roles based Access
Other bits and pieces…
• Compressed commit log
• Resumable bootstrapping
• New types
- smallint - short
- tinyint - byte
- date
- time
• Warnings now sent back to client
- batch too large
http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
New Storage Engine
• CASSANDRA-8099
• More efficient storage
• Aware of CQL structure
• Reduce sstable size
• Reduce memory used
• …
Customer events table
CREATE TABLE if NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (customer_id, time))
create INDEX on customer_events (staff_id) ;
Indexes to the rescue?
customer_id time staff_id
chbatey 2015-03-03 08:52:45 trevor
chbatey 2015-03-03 08:52:54 trevor
chbatey 2015-03-03 08:53:11 bill
chbatey 2015-03-03 08:53:18 bill
rusty 2015-03-03 08:56:57
bill
rusty 2015-03-03 08:57:02
bill
rusty 2015-03-03 08:57:20 trevor
staff_id customer_id
trevor chbatey
trevor chbatey
bill chbatey
bill chbatey
bill rusty
bill rusty
trevor rusty
Secondary index are local
• The staff_id partition in the secondary index is not
distributed like a normal table
• The secondary index entries are only stored on the node
that contains the customer_id partition
Indexes to the rescue?
staff_id customer_id
trevor chbatey
trevor chbatey
bill chbatey
bill chbatey
staff_id customer_id
bill rusty
bill
rusty
trevor rusty
A B
chbatey rusty
customer_id time staff_id
chbatey 2015-03-03 08:52:45 trevor
chbatey 2015-03-03 08:52:54 trevor
chbatey 2015-03-03 08:53:11 bill
chbatey 2015-03-03 08:53:18 bill
rusty 2015-03-03 08:56:57
bill
rusty 2015-03-03 08:57:02
bill
rusty 2015-03-03 08:57:20 trevor
customer_events table
staff_id customer_id
trevor chbatey
trevor chbatey
bill chbatey
bill chbatey
bill rusty
bill
rusty
trevor rusty
staff_id index
Do it yourself index ?
CREATE TABLE if NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (customer_id, time))
CREATE TABLE if NOT EXISTS customer_events_by_staff (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (staff_id, time))
1.2 Logged batches
client
C
BATCH LOG
BL-R
BL-R
BL-R: Batch log replica
Pattern
• Write only:
- Duplicate with a different primary key
- (Optional) Logged batch for eventual consistency
• Full updates:
- No real difference
• Partial updates:
- No staff id in update?
Score Data Model
CREATE TABLE scores
(
user TEXT,
game TEXT,
year INT,
month INT,
day INT,
score INT,
PRIMARY KEY (user, game, year, month, day)
)
Materialized Views
CREATE MATERIALIZED VIEW alltimehigh AS
SELECT user FROM scores
WHERE game IS NOT NULL AND
score IS NOT NULL AND
user IS NOT NULL AND
year IS NOT NULL AND
month IS NOT NULL AND
day IS NOT NULL
PRIMARY KEY (game, score, user, year, month, day)
WITH CLUSTERING ORDER BY (score desc)
Materialized Views
INSERT INTO scores (user, game, year, month, day, score) VALUES ('pcmanus', 'Coup', 2015, 05, 01, 4000)
INSERT INTO scores (user, game, year, month, day, score) VALUES ('jbellis', 'Coup', 2015, 05, 03, 1750)
INSERT INTO scores (user, game, year, month, day, score) VALUES ('yukim', 'Coup', 2015, 05, 03, 2250)
INSERT INTO scores (user, game, year, month, day, score) VALUES ('tjake', 'Coup', 2015, 05, 03, 500)
INSERT INTO scores (user, game, year, month, day, score) VALUES ('jmckenzie', 'Coup', 2015, 06, 01, 2000)
INSERT INTO scores (user, game, year, month, day, score) VALUES ('iamaleksey', 'Coup', 2015, 06, 01, 2500)
INSERT INTO scores (user, game, year, month, day, score) VALUES ('tjake', 'Coup', 2015, 06, 02, 1000)
INSERT INTO scores (user, game, year, month, day, score) VALUES ('pcmanus', 'Coup', 2015, 06, 02, 2000)
SELECT user, score FROM alltimehigh WHERE game = 'Coup'
user | score
-----------+-------
pcmanus | 4000
iamaleksey | 2500
yukim | 2250
jmckenzie | 2000
pcmanus | 2000
jbellis | 1750
tjake | 1000
tjake | 500
KillrWeather data model
Combining aggregates + MVs
How it works…
http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views
https://issues.apache.org/jira/browse/CASSANDRA-6477
For more details
Fine print
• All Primary Key columns must be present in your view
• If the part of your primary key is NULL then it won't
appear in the materialised view
• Performance will be a factor!
- More operations to complete (read-before-write,
consistency check …)
- Batch writes for MV
• Bad for low cardinality data (hot spot)
Conclusions
• We still denormalise and duplicate to achieve scalability
and performance
• We just let C* do it for us :)
Find Out More
• Documentation: http://www.datastax.com/docs
• Developer Blog: http://www.datastax.com/dev/blog
• Academy: https://academy.datastax.com
• Community Site: http://planetcassandra.org

More Related Content

What's hot

Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced previewPatrick McFadin
 
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!Michaël Figuière
 
GreenDao Introduction
GreenDao IntroductionGreenDao Introduction
GreenDao IntroductionBooch Lin
 
Cassandra EU - Data model on fire
Cassandra EU - Data model on fireCassandra EU - Data model on fire
Cassandra EU - Data model on firePatrick McFadin
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized ViewsCarl Yeksigian
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingDataStax Academy
 
DataStax: An Introduction to DataStax Enterprise Search
DataStax: An Introduction to DataStax Enterprise SearchDataStax: An Introduction to DataStax Enterprise Search
DataStax: An Introduction to DataStax Enterprise SearchDataStax Academy
 
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenPostgresOpen
 
Cassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data ModelingCassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data ModelingDataStax Academy
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksScott Hernandez
 
Getting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETGetting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETTomas Jansson
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into CassandraDataStax
 
Slick: Bringing Scala’s Powerful Features to Your Database Access
Slick: Bringing Scala’s Powerful Features to Your Database Access Slick: Bringing Scala’s Powerful Features to Your Database Access
Slick: Bringing Scala’s Powerful Features to Your Database Access Rebecca Grenier
 
Enter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy SparkEnter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy SparkJon Haddad
 
Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, strongerPatrick McFadin
 

What's hot (20)

Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced preview
 
greenDAO
greenDAOgreenDAO
greenDAO
 
Apache Cassandra & Data Modeling
Apache Cassandra & Data ModelingApache Cassandra & Data Modeling
Apache Cassandra & Data Modeling
 
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!
 
MongoDB-SESSION03
MongoDB-SESSION03MongoDB-SESSION03
MongoDB-SESSION03
 
How to Use JSON in MySQL Wrong
How to Use JSON in MySQL WrongHow to Use JSON in MySQL Wrong
How to Use JSON in MySQL Wrong
 
GreenDao Introduction
GreenDao IntroductionGreenDao Introduction
GreenDao Introduction
 
Cassandra EU - Data model on fire
Cassandra EU - Data model on fireCassandra EU - Data model on fire
Cassandra EU - Data model on fire
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized Views
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
DataStax: An Introduction to DataStax Enterprise Search
DataStax: An Introduction to DataStax Enterprise SearchDataStax: An Introduction to DataStax Enterprise Search
DataStax: An Introduction to DataStax Enterprise Search
 
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres OpenJohn Melesky - Federating Queries Using Postgres FDW @ Postgres Open
John Melesky - Federating Queries Using Postgres FDW @ Postgres Open
 
Cassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data ModelingCassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data Modeling
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacks
 
Getting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETGetting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NET
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
 
Slick: Bringing Scala’s Powerful Features to Your Database Access
Slick: Bringing Scala’s Powerful Features to Your Database Access Slick: Bringing Scala’s Powerful Features to Your Database Access
Slick: Bringing Scala’s Powerful Features to Your Database Access
 
Green dao
Green daoGreen dao
Green dao
 
Enter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy SparkEnter the Snake Pit for Fast and Easy Spark
Enter the Snake Pit for Fast and Easy Spark
 
Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, stronger
 

Viewers also liked

Cassandra 3 new features 2016
Cassandra 3 new features 2016Cassandra 3 new features 2016
Cassandra 3 new features 2016Duyhai Doan
 
Quelles stratégies de Recherche avec Cassandra ?
Quelles stratégies de Recherche avec Cassandra ?Quelles stratégies de Recherche avec Cassandra ?
Quelles stratégies de Recherche avec Cassandra ?Victor Coustenoble
 
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...DataStax
 
DataStax Enterprise - La plateforme de base de données pour le Cloud
DataStax Enterprise - La plateforme de base de données pour le CloudDataStax Enterprise - La plateforme de base de données pour le Cloud
DataStax Enterprise - La plateforme de base de données pour le CloudVictor Coustenoble
 
DataStax Enterprise et Cas d'utilisation de Apache Cassandra
DataStax Enterprise et Cas d'utilisation de Apache CassandraDataStax Enterprise et Cas d'utilisation de Apache Cassandra
DataStax Enterprise et Cas d'utilisation de Apache CassandraVictor Coustenoble
 
Datastax Cassandra + Spark Streaming
Datastax Cassandra + Spark StreamingDatastax Cassandra + Spark Streaming
Datastax Cassandra + Spark StreamingVictor Coustenoble
 
Cassandra 3 new features @ Geecon Krakow 2016
Cassandra 3 new features  @ Geecon Krakow 2016Cassandra 3 new features  @ Geecon Krakow 2016
Cassandra 3 new features @ Geecon Krakow 2016Duyhai Doan
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseEric Evans
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseEric Evans
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search rideDuyhai Doan
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced CassandraEric Evans
 
Webinaire Business&Decision - Trifacta
Webinaire  Business&Decision - TrifactaWebinaire  Business&Decision - Trifacta
Webinaire Business&Decision - TrifactaVictor Coustenoble
 
Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Eric Evans
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Eric Evans
 
Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...DataStax Academy
 
DataStax et Apache Cassandra pour la gestion des flux IoT
DataStax et Apache Cassandra pour la gestion des flux IoTDataStax et Apache Cassandra pour la gestion des flux IoT
DataStax et Apache Cassandra pour la gestion des flux IoTVictor Coustenoble
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)Eric Evans
 

Viewers also liked (20)

Cassandra 3 new features 2016
Cassandra 3 new features 2016Cassandra 3 new features 2016
Cassandra 3 new features 2016
 
Quelles stratégies de Recherche avec Cassandra ?
Quelles stratégies de Recherche avec Cassandra ?Quelles stratégies de Recherche avec Cassandra ?
Quelles stratégies de Recherche avec Cassandra ?
 
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C...
 
DataStax Enterprise - La plateforme de base de données pour le Cloud
DataStax Enterprise - La plateforme de base de données pour le CloudDataStax Enterprise - La plateforme de base de données pour le Cloud
DataStax Enterprise - La plateforme de base de données pour le Cloud
 
DataStax Enterprise et Cas d'utilisation de Apache Cassandra
DataStax Enterprise et Cas d'utilisation de Apache CassandraDataStax Enterprise et Cas d'utilisation de Apache Cassandra
DataStax Enterprise et Cas d'utilisation de Apache Cassandra
 
Datastax Cassandra + Spark Streaming
Datastax Cassandra + Spark StreamingDatastax Cassandra + Spark Streaming
Datastax Cassandra + Spark Streaming
 
Apache Cassandra and Go
Apache Cassandra and GoApache Cassandra and Go
Apache Cassandra and Go
 
Cassandra 3 new features @ Geecon Krakow 2016
Cassandra 3 new features  @ Geecon Krakow 2016Cassandra 3 new features  @ Geecon Krakow 2016
Cassandra 3 new features @ Geecon Krakow 2016
 
Introduction spark
Introduction sparkIntroduction spark
Introduction spark
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search ride
 
Castle enhanced Cassandra
Castle enhanced CassandraCastle enhanced Cassandra
Castle enhanced Cassandra
 
Webinaire Business&Decision - Trifacta
Webinaire  Business&Decision - TrifactaWebinaire  Business&Decision - Trifacta
Webinaire Business&Decision - Trifacta
 
Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)Wikimedia Content API (Strangeloop)
Wikimedia Content API (Strangeloop)
 
Webinar Degetel DataStax
Webinar Degetel DataStaxWebinar Degetel DataStax
Webinar Degetel DataStax
 
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)Time Series Data with Apache Cassandra (ApacheCon EU 2014)
Time Series Data with Apache Cassandra (ApacheCon EU 2014)
 
Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...
 
DataStax et Apache Cassandra pour la gestion des flux IoT
DataStax et Apache Cassandra pour la gestion des flux IoTDataStax et Apache Cassandra pour la gestion des flux IoT
DataStax et Apache Cassandra pour la gestion des flux IoT
 
CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)CQL In Cassandra 1.0 (and beyond)
CQL In Cassandra 1.0 (and beyond)
 

Similar to Cassandra 2.2 & 3.0

Cassandra London - 2.2 and 3.0
Cassandra London - 2.2 and 3.0Cassandra London - 2.2 and 3.0
Cassandra London - 2.2 and 3.0Christopher Batey
 
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)Dan Robinson
 
Work in TDW
Work in TDWWork in TDW
Work in TDWsaso70
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014jbellis
 
Uncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony DavisUncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony DavisRed Gate Software
 
on SQL Managment studio(For the following exercise, use the Week 5.pdf
on SQL Managment studio(For the following exercise, use the Week 5.pdfon SQL Managment studio(For the following exercise, use the Week 5.pdf
on SQL Managment studio(For the following exercise, use the Week 5.pdfformaxekochi
 
Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015datastaxjp
 
SQL Server 2008 Portfolio
SQL Server 2008 PortfolioSQL Server 2008 Portfolio
SQL Server 2008 Portfoliolilredlokita
 
Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?
Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?
Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?Gabriela Ferrara
 
Starting from the database used in Project 1 (see the slightly cha.docx
Starting from the database used in Project 1 (see the slightly cha.docxStarting from the database used in Project 1 (see the slightly cha.docx
Starting from the database used in Project 1 (see the slightly cha.docxdessiechisomjj4
 
Greg Lewis SQL Portfolio
Greg Lewis SQL PortfolioGreg Lewis SQL Portfolio
Greg Lewis SQL Portfoliogregmlewis
 
Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Charles Givre
 
CMS Project Phase II InstructionsIn this phase, you will create t.docx
CMS Project Phase II InstructionsIn this phase, you will create t.docxCMS Project Phase II InstructionsIn this phase, you will create t.docx
CMS Project Phase II InstructionsIn this phase, you will create t.docxmary772
 
Database Implementation Final Document
Database Implementation Final DocumentDatabase Implementation Final Document
Database Implementation Final DocumentConor O'Callaghan
 
Introduction To Oracle Sql
Introduction To Oracle SqlIntroduction To Oracle Sql
Introduction To Oracle SqlAhmed Yaseen
 

Similar to Cassandra 2.2 & 3.0 (20)

2 Dundee - Cassandra-3
2 Dundee - Cassandra-32 Dundee - Cassandra-3
2 Dundee - Cassandra-3
 
Cassandra London - 2.2 and 3.0
Cassandra London - 2.2 and 3.0Cassandra London - 2.2 and 3.0
Cassandra London - 2.2 and 3.0
 
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)
Powering Heap With PostgreSQL And CitusDB (PGConf Silicon Valley 2015)
 
Work in TDW
Work in TDWWork in TDW
Work in TDW
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
 
Uncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony DavisUncovering SQL Server query problems with execution plans - Tony Davis
Uncovering SQL Server query problems with execution plans - Tony Davis
 
on SQL Managment studio(For the following exercise, use the Week 5.pdf
on SQL Managment studio(For the following exercise, use the Week 5.pdfon SQL Managment studio(For the following exercise, use the Week 5.pdf
on SQL Managment studio(For the following exercise, use the Week 5.pdf
 
Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015Cassandra v3.0 at Rakuten meet-up on 12/2/2015
Cassandra v3.0 at Rakuten meet-up on 12/2/2015
 
SQL Server 2008 Portfolio
SQL Server 2008 PortfolioSQL Server 2008 Portfolio
SQL Server 2008 Portfolio
 
Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?
Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?
Laracon EU 2018: OMG MySQL 8.0 is out! are we there yet?
 
Starting from the database used in Project 1 (see the slightly cha.docx
Starting from the database used in Project 1 (see the slightly cha.docxStarting from the database used in Project 1 (see the slightly cha.docx
Starting from the database used in Project 1 (see the slightly cha.docx
 
Writeable CTEs: The Next Big Thing
Writeable CTEs: The Next Big ThingWriteable CTEs: The Next Big Thing
Writeable CTEs: The Next Big Thing
 
My Portfolio
My PortfolioMy Portfolio
My Portfolio
 
My Portfolio
My PortfolioMy Portfolio
My Portfolio
 
Greg Lewis SQL Portfolio
Greg Lewis SQL PortfolioGreg Lewis SQL Portfolio
Greg Lewis SQL Portfolio
 
Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2
 
CMS Project Phase II InstructionsIn this phase, you will create t.docx
CMS Project Phase II InstructionsIn this phase, you will create t.docxCMS Project Phase II InstructionsIn this phase, you will create t.docx
CMS Project Phase II InstructionsIn this phase, you will create t.docx
 
Java script
Java scriptJava script
Java script
 
Database Implementation Final Document
Database Implementation Final DocumentDatabase Implementation Final Document
Database Implementation Final Document
 
Introduction To Oracle Sql
Introduction To Oracle SqlIntroduction To Oracle Sql
Introduction To Oracle Sql
 

More from Victor Coustenoble

Préparation de Données pour la Détection de Fraude
Préparation de Données pour la Détection de FraudePréparation de Données pour la Détection de Fraude
Préparation de Données pour la Détection de FraudeVictor Coustenoble
 
Préparation de Données dans le Cloud
Préparation de Données dans le CloudPréparation de Données dans le Cloud
Préparation de Données dans le CloudVictor Coustenoble
 
Préparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec TrifactaPréparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec TrifactaVictor Coustenoble
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraVictor Coustenoble
 
DataStax et Cassandra dans Azure au Microsoft Techdays
DataStax et Cassandra dans Azure au Microsoft TechdaysDataStax et Cassandra dans Azure au Microsoft Techdays
DataStax et Cassandra dans Azure au Microsoft TechdaysVictor Coustenoble
 
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupDataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupVictor Coustenoble
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataVictor Coustenoble
 
Lightning fast analytics with Cassandra and Spark
Lightning fast analytics with Cassandra and SparkLightning fast analytics with Cassandra and Spark
Lightning fast analytics with Cassandra and SparkVictor Coustenoble
 

More from Victor Coustenoble (9)

Préparation de Données pour la Détection de Fraude
Préparation de Données pour la Détection de FraudePréparation de Données pour la Détection de Fraude
Préparation de Données pour la Détection de Fraude
 
Préparation de Données dans le Cloud
Préparation de Données dans le CloudPréparation de Données dans le Cloud
Préparation de Données dans le Cloud
 
Préparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec TrifactaPréparation de Données Hadoop avec Trifacta
Préparation de Données Hadoop avec Trifacta
 
DataStax Enterprise BBL
DataStax Enterprise BBLDataStax Enterprise BBL
DataStax Enterprise BBL
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache Cassandra
 
DataStax et Cassandra dans Azure au Microsoft Techdays
DataStax et Cassandra dans Azure au Microsoft TechdaysDataStax et Cassandra dans Azure au Microsoft Techdays
DataStax et Cassandra dans Azure au Microsoft Techdays
 
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetupDataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
DataStax - Analytics on Apache Cassandra - Paris Tech Talks meetup
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
Lightning fast analytics with Cassandra and Spark
Lightning fast analytics with Cassandra and SparkLightning fast analytics with Cassandra and Spark
Lightning fast analytics with Cassandra and Spark
 

Recently uploaded

AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdfAzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdfryanfarris8
 
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...drm1699
 
Community is Just as Important as Code by Andrea Goulet
Community is Just as Important as Code by Andrea GouletCommunity is Just as Important as Code by Andrea Goulet
Community is Just as Important as Code by Andrea GouletAndrea Goulet
 
BusinessGPT - Security and Governance for Generative AI
BusinessGPT  - Security and Governance for Generative AIBusinessGPT  - Security and Governance for Generative AI
BusinessGPT - Security and Governance for Generative AIAGATSoftware
 
Lessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdfLessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdfSrushith Repakula
 
Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024Andreas Granig
 
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...Neo4j
 
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024SimonedeGijt
 
[GRCPP] Introduction to concepts (C++20)
[GRCPP] Introduction to concepts (C++20)[GRCPP] Introduction to concepts (C++20)
[GRCPP] Introduction to concepts (C++20)Dimitrios Platis
 
Encryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key ConceptsEncryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key Conceptsthomashtkim
 
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-CloudAlluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-CloudAlluxio, Inc.
 
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale IbridaUNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale IbridaNeo4j
 

Recently uploaded (20)

AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdfAzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
 
Abortion Pill Prices Aliwal North ](+27832195400*)[ 🏥 Women's Abortion Clinic...
Abortion Pill Prices Aliwal North ](+27832195400*)[ 🏥 Women's Abortion Clinic...Abortion Pill Prices Aliwal North ](+27832195400*)[ 🏥 Women's Abortion Clinic...
Abortion Pill Prices Aliwal North ](+27832195400*)[ 🏥 Women's Abortion Clinic...
 
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
Abortion Pills For Sale WhatsApp[[+27737758557]] In Birch Acres, Abortion Pil...
 
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
Abortion Pill Prices Mthatha (@](+27832195400*)[ 🏥 Women's Abortion Clinic In...
 
Community is Just as Important as Code by Andrea Goulet
Community is Just as Important as Code by Andrea GouletCommunity is Just as Important as Code by Andrea Goulet
Community is Just as Important as Code by Andrea Goulet
 
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
Abortion Clinic In Pongola ](+27832195400*)[ 🏥 Safe Abortion Pills In Pongola...
 
BusinessGPT - Security and Governance for Generative AI
BusinessGPT  - Security and Governance for Generative AIBusinessGPT  - Security and Governance for Generative AI
BusinessGPT - Security and Governance for Generative AI
 
Lessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdfLessons Learned from Building a Serverless Notifications System.pdf
Lessons Learned from Building a Serverless Notifications System.pdf
 
Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024Automate your OpenSIPS config tests - OpenSIPS Summit 2024
Automate your OpenSIPS config tests - OpenSIPS Summit 2024
 
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
CERVED e Neo4j su una nuvola, migrazione ed evoluzione di un grafo mission cr...
 
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
Wired_2.0_CREATE YOUR ULTIMATE LEARNING ENVIRONMENT_JCON_16052024
 
Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...
Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...
Abortion Clinic In Stanger ](+27832195400*)[ 🏥 Safe Abortion Pills In Stanger...
 
[GRCPP] Introduction to concepts (C++20)
[GRCPP] Introduction to concepts (C++20)[GRCPP] Introduction to concepts (C++20)
[GRCPP] Introduction to concepts (C++20)
 
Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...
Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...
Abortion Clinic in Bloemfontein [(+27832195400*)]🏥Safe Abortion Pills In Bloe...
 
Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...
Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...
Abortion Clinic In Johannesburg ](+27832195400*)[ 🏥 Safe Abortion Pills in Jo...
 
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
Abortion Pill Prices Germiston ](+27832195400*)[ 🏥 Women's Abortion Clinic in...
 
Encryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key ConceptsEncryption Recap: A Refresher on Key Concepts
Encryption Recap: A Refresher on Key Concepts
 
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-CloudAlluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
Alluxio Monthly Webinar | Simplify Data Access for AI in Multi-Cloud
 
Abortion Pill Prices Rustenburg [(+27832195400*)] 🏥 Women's Abortion Clinic i...
Abortion Pill Prices Rustenburg [(+27832195400*)] 🏥 Women's Abortion Clinic i...Abortion Pill Prices Rustenburg [(+27832195400*)] 🏥 Women's Abortion Clinic i...
Abortion Pill Prices Rustenburg [(+27832195400*)] 🏥 Women's Abortion Clinic i...
 
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale IbridaUNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
UNI DI NAPOLI FEDERICO II - Il ruolo dei grafi nell'AI Conversazionale Ibrida
 

Cassandra 2.2 & 3.0

  • 3. Where did 2.2 come from?
  • 4. Don't start Thrift rpc by default (CASSANDRA-9319)
  • 5. New features • 2.2 - JSON - User defined functions - User defined aggregates - Other useful features - http://docs.datastax.com/en/cassandra/2.2/cassandra/features.html - http://www.datastax.com/dev/blog/cassandra-2-2 • 3.0 - New storage engine (8099) - A new way to denormalise/duplicate : Materialized View
  • 6. So who’s taken some data out of C* and serialised it as JSON?
  • 7. Hello JSON • create TABLE user (username text primary key, first_name text , last_name text , emails set<text> , country text); • INSERT INTO user JSON '{"username": "chbatey", "first_name":"Christopher", "last_name": "Batey", “emails":["christopher.batey@datastax.com"]}';
  • 9. JSON + User Defined Types • CREATE TYPE movie (title text, time timestamp, description text); • ALTER TABLE user ADD movies set<frozen<movie>>; • UPDATE user SET movies = {{ title:'Batman', time:'2011-02-03T04:05:00+0000', description: 'This film rocks' }} where username = 'chbatey';
  • 11. • Run code on the server !Dangerous! - Disabled by default • Java + Java Script supported out of the box • Any language that supports the Java Scripting API (Java, Javascript, Ruby, Python …) User Defined Functions
  • 12. UDF example CREATE TABLE user ( username text primary key, first_name text , last_name text , emails set<text> , country text);
  • 13. Concat function CREATE FUNCTION name ( first_name text, last_name text ) CALLED ON NULL INPUT RETURNS text LANGUAGE java AS ‘return first_name + " " + last_name;’; cqlsh:twotwo> select name(first_name, last_name) FROM user; twotwo.name(first_name, last_name) ------------------------------------ Victor Coustenoble
  • 14. User Defined Aggregates CREATE AGGREGATE average ( int ) SFUNC averageState STYPE tuple<int,bigint> FINALFUNC averageFinal INITCOND (0, 0); Called for every row state passed between Initial state Return type (CQL) Optional function called on final state
  • 15. State function (like a UDF) CREATE FUNCTION averageState ( state tuple<int,bigint>, value int ) CALLED ON NULL INPUT RETURNS tuple<int,bigint> LANGUAGE java AS ' if (value != null) { state.setInt(0, state.getInt(0)+1); state.setLong(1, state.getLong(1)+val.intValue()); } return state; '; Type Columns
  • 16. Final function CREATE FUNCTION averageFinal ( state tuple<int,bigint> ) CALLED ON NULL INPUT RETURNS double LANGUAGE java AS ' if (state.getInt(0) == 0) return null; double r = state.getLong(1) / state.getInt(0); return Double.valueOf(r); '; State typeOverall return type
  • 17. Putting it all together
  • 18. Customer events CREATE AGGREGATE count_by_type(text) SFUNC countEventTypes STYPE map<text, int> INITCOND {}; CREATE FUNCTION countEventTypes( state map<text, int>, type text ) CALLED ON NULL INPUT RETURNS map<text, int> LANGUAGE java AS ' Integer count = (Integer) state.get(type); if (count == null) count = 1; else count = count + 1; state.put(type, count); return state; ';
  • 20. Built in aggregates • count • max • min • avg • sum https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
  • 21. Built in time functions https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/TimeFcts.java
  • 22. Built in aggregates in action
  • 26. JSON, UDF and UDA available in DevCenter
  • 28. Other bits and pieces… • Compressed commit log • Resumable bootstrapping • New types - smallint - short - tinyint - byte - date - time • Warnings now sent back to client - batch too large
  • 30. New Storage Engine • CASSANDRA-8099 • More efficient storage • Aware of CQL structure • Reduce sstable size • Reduce memory used • …
  • 31. Customer events table CREATE TABLE if NOT EXISTS customer_events ( customer_id text, staff_id text, store_type text, time timeuuid , event_type text, PRIMARY KEY (customer_id, time)) create INDEX on customer_events (staff_id) ;
  • 32. Indexes to the rescue? customer_id time staff_id chbatey 2015-03-03 08:52:45 trevor chbatey 2015-03-03 08:52:54 trevor chbatey 2015-03-03 08:53:11 bill chbatey 2015-03-03 08:53:18 bill rusty 2015-03-03 08:56:57 bill rusty 2015-03-03 08:57:02 bill rusty 2015-03-03 08:57:20 trevor staff_id customer_id trevor chbatey trevor chbatey bill chbatey bill chbatey bill rusty bill rusty trevor rusty
  • 33. Secondary index are local • The staff_id partition in the secondary index is not distributed like a normal table • The secondary index entries are only stored on the node that contains the customer_id partition
  • 34. Indexes to the rescue? staff_id customer_id trevor chbatey trevor chbatey bill chbatey bill chbatey staff_id customer_id bill rusty bill rusty trevor rusty A B chbatey rusty customer_id time staff_id chbatey 2015-03-03 08:52:45 trevor chbatey 2015-03-03 08:52:54 trevor chbatey 2015-03-03 08:53:11 bill chbatey 2015-03-03 08:53:18 bill rusty 2015-03-03 08:56:57 bill rusty 2015-03-03 08:57:02 bill rusty 2015-03-03 08:57:20 trevor customer_events table staff_id customer_id trevor chbatey trevor chbatey bill chbatey bill chbatey bill rusty bill rusty trevor rusty staff_id index
  • 35. Do it yourself index ? CREATE TABLE if NOT EXISTS customer_events ( customer_id text, staff_id text, store_type text, time timeuuid , event_type text, PRIMARY KEY (customer_id, time)) CREATE TABLE if NOT EXISTS customer_events_by_staff ( customer_id text, staff_id text, store_type text, time timeuuid , event_type text, PRIMARY KEY (staff_id, time))
  • 36. 1.2 Logged batches client C BATCH LOG BL-R BL-R BL-R: Batch log replica
  • 37. Pattern • Write only: - Duplicate with a different primary key - (Optional) Logged batch for eventual consistency • Full updates: - No real difference • Partial updates: - No staff id in update?
  • 38. Score Data Model CREATE TABLE scores ( user TEXT, game TEXT, year INT, month INT, day INT, score INT, PRIMARY KEY (user, game, year, month, day) )
  • 39. Materialized Views CREATE MATERIALIZED VIEW alltimehigh AS SELECT user FROM scores WHERE game IS NOT NULL AND score IS NOT NULL AND user IS NOT NULL AND year IS NOT NULL AND month IS NOT NULL AND day IS NOT NULL PRIMARY KEY (game, score, user, year, month, day) WITH CLUSTERING ORDER BY (score desc)
  • 40. Materialized Views INSERT INTO scores (user, game, year, month, day, score) VALUES ('pcmanus', 'Coup', 2015, 05, 01, 4000) INSERT INTO scores (user, game, year, month, day, score) VALUES ('jbellis', 'Coup', 2015, 05, 03, 1750) INSERT INTO scores (user, game, year, month, day, score) VALUES ('yukim', 'Coup', 2015, 05, 03, 2250) INSERT INTO scores (user, game, year, month, day, score) VALUES ('tjake', 'Coup', 2015, 05, 03, 500) INSERT INTO scores (user, game, year, month, day, score) VALUES ('jmckenzie', 'Coup', 2015, 06, 01, 2000) INSERT INTO scores (user, game, year, month, day, score) VALUES ('iamaleksey', 'Coup', 2015, 06, 01, 2500) INSERT INTO scores (user, game, year, month, day, score) VALUES ('tjake', 'Coup', 2015, 06, 02, 1000) INSERT INTO scores (user, game, year, month, day, score) VALUES ('pcmanus', 'Coup', 2015, 06, 02, 2000) SELECT user, score FROM alltimehigh WHERE game = 'Coup' user | score -----------+------- pcmanus | 4000 iamaleksey | 2500 yukim | 2250 jmckenzie | 2000 pcmanus | 2000 jbellis | 1750 tjake | 1000 tjake | 500
  • 44. Fine print • All Primary Key columns must be present in your view • If the part of your primary key is NULL then it won't appear in the materialised view • Performance will be a factor! - More operations to complete (read-before-write, consistency check …) - Batch writes for MV • Bad for low cardinality data (hot spot)
  • 45. Conclusions • We still denormalise and duplicate to achieve scalability and performance • We just let C* do it for us :)
  • 46. Find Out More • Documentation: http://www.datastax.com/docs • Developer Blog: http://www.datastax.com/dev/blog • Academy: https://academy.datastax.com • Community Site: http://planetcassandra.org

Editor's Notes

  1. also a toJson and fromJson if you want individual fields
  2. User defined types??
  3. time is modelled as a long, nanoseconds since midnight
  4. time is modelled as a long, nanoseconds since midnight
  5. Looks good so far. The first problem however is that a single query can result in many partitions being queries. We know why this is bad.
  6. Each of the segments of the index table
  7. Start by writing it out to a batch log on 2 other replicas Downside: Look at the extra round trips Extra complexity Serial reads