Cassandra
Data Modelling and
Queries with CQL3
By Markus Klems
(2013)
Source: http://geek-and-poke.com
Data Model
● Keyspace (Database)
● Column Family (Table)
● Keys and Columns
A column
column_name | value | timestamp
The timestamp field is used by Cassandra for conflict
resolution: “Last Write Wins”
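The "Last Write Wins" rule can be sketched in a few lines of Python. This is only an illustration: `Cell` and `reconcile` are hypothetical names, the e-mail values are made up, and the tie-break used here (keep the first argument) is an assumption that does not reflect Cassandra's actual tie-breaking.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """One column: name, value, and write timestamp (microseconds)."""
    name: str
    value: str
    timestamp: int


def reconcile(a: Cell, b: Cell) -> Cell:
    """Return the winning version of one column: the newer timestamp wins."""
    if a.name != b.name:
        raise ValueError("can only reconcile two versions of the same column")
    # Assumption: on equal timestamps this sketch simply keeps `a`.
    return a if a.timestamp >= b.timestamp else b


old = Cell("email", "otto@abc.de", 1384963716685000)
new = Cell("email", "otto@xyz.de", 1384963716697000)
winner = reconcile(old, new)
print(winner.value)
```

The order in which the two versions arrive does not matter; only the timestamps decide.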
Column family (Table)
partition key | columns ...
101 | email: ab@c.to  | name: otto | tel: 12345
103 | email: karl@a.b | name: karl | tel: 6789 | tel2: 12233
104 | name: linda
Table with standard PRIMARY KEY
CREATE TABLE messages (
msg_id timeuuid PRIMARY KEY,
author text,
body text
);
Table: Tweets
PRIMARY KEY = msg_id
msg_id | author | body
9990   | otto   | Hello World!
9991   | linda  | Hi, Otto
Table with compound PRIMARY KEY
CREATE TABLE timeline (
user_id uuid,
msg_id timeuuid,
author text,
body text,
PRIMARY KEY (user_id, msg_id)
);
“Wide-row” Table: Timeline
PRIMARY KEY = user_id + msg_id
partition key column | msg_id | author | body
103 | 9990 | otto  | Hello World!
103 | 9991 | linda | Hi @otto
Timeline Table is partitioned by user
and locally clustered by msg
Node A
partition key | msg_id | ...
103 | 9990 | ...
103 | 9994 | ...
104 | 9881 | ...
104 | 9999 | ...
Node B
partition key | msg_id | ...
211 | 8090 | ...
211 | 8555 | ...
211 | 9678 | ...
212 | 9877 | ...
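A toy model of "partitioned by user, clustered by msg", written in Python. It is a sketch under stated assumptions: the two-node list, the MD5-based `node_for` function, and the `Timeline` class are all hypothetical stand-ins, not Cassandra's partitioner or storage engine.

```python
import hashlib
from collections import defaultdict

NODES = ["Node A", "Node B"]  # hypothetical two-node cluster


def node_for(user_id: int) -> str:
    """Pick a node from a hash of the partition key (toy partitioner)."""
    digest = hashlib.md5(str(user_id).encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class Timeline:
    """Rows grouped by partition key and ordered by clustering key."""

    def __init__(self):
        # partition key (user_id) -> {clustering key (msg_id) -> body}
        self.partitions = defaultdict(dict)

    def insert(self, user_id: int, msg_id: int, body: str) -> None:
        self.partitions[user_id][msg_id] = body

    def read(self, user_id: int):
        """Return one partition's rows, sorted by msg_id."""
        return sorted(self.partitions[user_id].items())


timeline = Timeline()
timeline.insert(103, 9994, "second message")
timeline.insert(103, 9990, "first message")
print(node_for(103), timeline.read(103))
```

All rows of user 103 map to the same node because only `user_id` is hashed; `msg_id` only determines the order inside the partition.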
Comparison: RDBMS vs. Cassandra

RDBMS Data Design
Users Table
user_id | name | email
101 | otto | o@t.to
Tweets Table
tweet_id | author_id | body
9990 | 101 | Hello!
Followers Table
id | follows_id | followed_id
4321 | 104 | 101

Cassandra Data Design
Users Table
user_id | name | email
101 | otto | o@t.to
Tweets Table
tweet_id | author_id | name | body
9990 | 101 | otto | Hello!
Follows Table
user_id | follows_list
104 | [101,117]
Followed Table
id | followed_list
101 | [104,109]
Exercise: Launch 1 Cassandra Node
Data Center DC1 (Germany), node A
● Launch server A:
A~$ service cassandra start
● /etc/conf/cassandra.yaml:
seeds: A
rpc_address: 0.0.0.0
Exercise: Start CQL Shell
Data Center DC1 (Germany), node A
A~$ cqlsh
Intro: CLI, CQL2, CQL3
● CQL is “SQL for Cassandra”
● Cassandra CLI deprecated, CQL2 deprecated
● CQL3 is default since Cassandra 1.2
$ cqlsh
● Pipe scripts into cqlsh
$ cat cql_script | cqlsh

● Source files inside cqlsh
cqlsh> SOURCE '~/cassandra_training/cql3/01_create_keyspaces';
CQL3
● Create a keyspace
● Create a column family
● Insert data
● Alter schema
● Update data
● Delete data
● Apply batch operation
● Read data
● Secondary Index
● Compound Primary Key
● Collections
● Consistency level
● Time-To-Live (TTL)
● Counter columns
● sstable2json utility tool
Create a SimpleStrategy keyspace
● Create a keyspace with SimpleStrategy and
"replication_factor" option with value "3" like
this:
cqlsh> CREATE KEYSPACE <ksname>
WITH REPLICATION =
{'class':'SimpleStrategy',
'replication_factor':3};
Exercise: Create a SimpleStrategy keyspace
● Create a keyspace "simpledb" with SimpleStrategy and
replication factor 1.
cqlsh> CREATE KEYSPACE simpledb
WITH REPLICATION = {
'class' : 'SimpleStrategy',
'replication_factor' : 1 };
cqlsh> DESCRIBE KEYSPACE simpledb;
Create a NetworkTopologyStrategy
keyspace
Create a keyspace with NetworkTopologyStrategy and
strategy option "DC1" with a value of "1" and "DC2" with a
value of "2" like this:
cqlsh> CREATE KEYSPACE <ksname>
WITH REPLICATION = {
'class':'NetworkTopologyStrategy',
'DC1':1,
'DC2':2
};
Exercise: Create Table "users"
● Connect to the "twotter" keyspace.
cqlsh> USE twotter;

● Create new column family (Table) named "users".
cqlsh:twotter> CREATE TABLE users (
id int PRIMARY KEY,
name text,
email text
);
cqlsh:twotter> DESCRIBE TABLES;
cqlsh:twotter> DESCRIBE TABLE users;
*we use int instead of uuid in the exercises for the sake of readability
Exercise: Create Table "messages"
● Create a new Table named "messages" with the
attributes "posted_on", "user_id", "user_name", "body",
and a primary key that consists of "user_id" and
"posted_on".
cqlsh:twotter> CREATE TABLE messages (
posted_on bigint,
user_id int,
user_name text,
body text,
PRIMARY KEY (user_id, posted_on)
);
*we use bigint instead of timeuuid in the exercises for the sake of readability
Exercise: Insert data into Table
"users" of keyspace "twotter"
cqlsh:twotter>
INSERT INTO users(id, name, email)
VALUES (101, 'otto', 'otto@abc.de');
cqlsh:twotter> ... insert more records ...
cqlsh> SOURCE '~/cassandra_training/cql3/03_insert';
Exercise: Insert message records
cqlsh:twotter>
INSERT INTO messages (user_id, posted_on,
user_name, body)
VALUES (101, 1384895178, 'otto', 'Hello!');
cqlsh:twotter> SELECT * FROM messages;
Read data
cqlsh:twotter> SELECT * FROM users;
 id  | email        | name
-----+--------------+-------
 105 |      g@rd.de |  gerd
 104 | linda@abc.de | linda
 102 |         null |  jane
 106 | heinz@xyz.de | heinz
 101 |  otto@abc.de |  otto
 103 |         null |  karl
Update data
cqlsh:twotter> UPDATE users
SET email = 'jane@smith.org'
WHERE id = 102;
 id  | email          | name
-----+----------------+-------
 105 |        g@rd.de |  gerd
 104 |   linda@abc.de | linda
 102 | jane@smith.org |  jane
 106 |   heinz@xyz.de | heinz
 101 |    otto@abc.de |  otto
 103 |           null |  karl
Delete data
● Delete columns
cqlsh:twotter> DELETE email
FROM users
WHERE id = 105;

● Delete an entire row
cqlsh:twotter> DELETE FROM users
WHERE id = 106;
Delete data
 id  | email          | name
-----+----------------+-------
 105 |           null |  gerd
 104 |   linda@abc.de | linda
 102 | jane@smith.org |  jane
 101 |    otto@abc.de |  otto
 103 |           null |  karl
Batch operation
● Execute multiple mutations with a single operation
cqlsh:twotter>
BEGIN BATCH
INSERT INTO users(id, name, email)
VALUES(107, 'john','j@doe.net');
INSERT INTO users(id, name)
VALUES(108, 'michael');
UPDATE users
SET email = 'michael@abc.de'
WHERE id = 108;
DELETE FROM users WHERE id = 105;
APPLY BATCH;
Batch operation
 id  | email          | name
-----+----------------+---------
 107 |      j@doe.net |    john
 108 | michael@abc.de | michael
 104 |   linda@abc.de |   linda
 102 | jane@smith.org |    jane
 101 |    otto@abc.de |    otto
 103 |           null |    karl
Secondary Index
cqlsh:twotter>
CREATE INDEX name_index ON users(name);
cqlsh:twotter>
CREATE INDEX email_index ON users(email);
cqlsh:twotter> SELECT name, email FROM
users WHERE name = 'otto';
cqlsh:twotter> SELECT name, email FROM
users WHERE email = 'michael@abc.de';
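Conceptually, a secondary index maps column values back to primary keys. The Python sketch below shows only that idea; `build_index` and the in-memory `users` dict are hypothetical, and Cassandra's real index storage and its distribution across nodes are not modelled here.

```python
from collections import defaultdict

# A few rows from the "users" table, keyed by primary key (id).
users = {
    101: {"name": "otto", "email": "otto@abc.de"},
    102: {"name": "jane", "email": "jane@smith.org"},
    108: {"name": "michael", "email": "michael@abc.de"},
}


def build_index(rows, column):
    """Map each value of `column` to the set of primary keys holding it."""
    index = defaultdict(set)
    for key, row in rows.items():
        value = row.get(column)
        if value is not None:  # rows without the column are not indexed
            index[value].add(key)
    return index


name_index = build_index(users, "name")
email_index = build_index(users, "email")

# Equivalent in spirit to: SELECT name, email FROM users WHERE name = 'otto';
matches = [users[key] for key in sorted(name_index["otto"])]
print(matches)
```

Without such an index, a query on `name` would have to inspect every row, because rows are located by primary key only.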
Alter Table Schema
cqlsh:twotter>
ALTER TABLE users ADD password text;
cqlsh:twotter>
ALTER TABLE users
ADD password_reset_token text;

* Given its flexible schema, Cassandra’s CQL ALTER finishes much
quicker than RDBMS SQL ALTER where all existing records need to be
updated.
Alter Table Schema
 id  | email          | name    | password | password_reset_token
-----+----------------+---------+----------+----------------------
 107 |      j@doe.net |    john |     null |                 null
 108 | michael@abc.de | michael |     null |                 null
 104 |   linda@abc.de |   linda |     null |                 null
 102 | jane@smith.org |    jane |     null |                 null
 101 |    otto@abc.de |    otto |     null |                 null
 103 |           null |    karl |     null |                 null
Collections - Set
CQL3 introduces collections for storing complex data
structures, namely the following: set, list, and map. This is
the CQL way of modelling many-to-one relationships.
1. Let us add a set of "hobbies" to the Table "users".
cqlsh:twotter> ALTER TABLE users ADD hobbies set<text>;
cqlsh:twotter> UPDATE users
SET hobbies = hobbies + {'badminton','jazz'}
WHERE id = 101;
Collections - List
2. Now create a Table "followers" with a list of followers.
cqlsh:twotter> CREATE TABLE followers (
user_id int PRIMARY KEY,
followers list<text>);
cqlsh:twotter> INSERT INTO followers (
user_id, followers)
VALUES (101, ['willi','heinz']);
cqlsh:twotter> SELECT * FROM followers;
 user_id | followers
---------+--------------------
     101 | ['willi', 'heinz']
Collections - Map
3. Add a map to the Table "messages".
cqlsh:twotter>
ALTER TABLE messages
ADD comments map<text, text>;
cqlsh:twotter>
UPDATE messages
SET comments = comments + {'otto':'thx!'}
WHERE user_id = 103
AND posted_on = 1384895223;
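The `+` operator used in the three collection updates behaves roughly like union, append, and merge. The Python lines below mirror that behaviour with built-in types; this is an analogy only, and the extra follower 'karl' is a made-up value for illustration.

```python
# set<text>: "hobbies = hobbies + {...}" adds elements; duplicates are ignored.
hobbies = set()
hobbies = hobbies | {"badminton", "jazz"}
hobbies = hobbies | {"jazz"}  # adding an existing element changes nothing

# list<text>: "followers = followers + [...]" appends and keeps order.
followers = ["willi", "heinz"]
followers = followers + ["karl"]  # 'karl' is a hypothetical extra follower

# map<text, text>: "comments = comments + {...}" adds or overwrites keys.
comments = {}
comments = {**comments, "otto": "thx!"}

print(sorted(hobbies), followers, comments)
```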
Consistency Level
● Set the consistency level for all subsequent requests:
cqlsh:twotter> CONSISTENCY ONE;
cqlsh:twotter> CONSISTENCY QUORUM;
cqlsh:twotter> CONSISTENCY ALL;

● Show the current consistency level:
cqlsh:twotter> CONSISTENCY;
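How many replicas must answer at each of these levels depends on the keyspace's replication factor. A small Python sketch of the arithmetic, assuming a single data center; `quorum` and `required_replicas` are hypothetical helper names.

```python
def quorum(replication_factor: int) -> int:
    """A quorum is a majority of the replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_replicas(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for ONE, QUORUM, or ALL."""
    table = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return table[level.upper()]


for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_replicas(level, 3))
```

With replication factor 3, QUORUM reads and QUORUM writes each touch 2 replicas, so any read overlaps the latest write in at least one replica (2 + 2 > 3).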
Exercise: Consistency Level
● Set the consistency level to ANY and execute a SELECT
statement.
Bad Request: ANY ConsistencyLevel is only
supported for writes
Exercise: Time-To-Live (TTL)
● Insert a user record with a password reset
token with a 77 second TTL value.
cqlsh:twotter>
INSERT INTO users (id, name,
password_reset_token)
VALUES (109, 'timo', 'abc-xyz-123')
USING TTL 77;
Exercise: Time-To-Live (TTL)
● The previous INSERT statement applies the TTL to every
inserted column, so the entire user record expires after
77 seconds.
● This is what we actually wanted to do:
cqlsh:twotter> INSERT INTO users (id, name)
VALUES(110, 'anna');
cqlsh:twotter> UPDATE users USING TTL 77
SET password_reset_token =
'abc-xyz-123'
WHERE id = 110;
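The difference between the two approaches can be modelled with per-cell expiry times. The Python sketch below is a toy model, not Cassandra's storage format: `Row`, `write`, `read`, and the explicit `now` argument (seconds) are hypothetical, and the primary key is treated like any other cell.

```python
class Row:
    """Toy row: each cell stores (value, expiry time or None)."""

    def __init__(self):
        self.cells = {}

    def write(self, now, ttl=None, **columns):
        """Write columns at time `now`; with a TTL, each cell gets an expiry."""
        expires_at = None if ttl is None else now + ttl
        for name, value in columns.items():
            self.cells[name] = (value, expires_at)

    def read(self, now):
        """Return the live cells, or None if no cell is live any more."""
        live = {
            name: value
            for name, (value, expires_at) in self.cells.items()
            if expires_at is None or now < expires_at
        }
        return live or None


# INSERT ... USING TTL 77: every inserted cell expires together.
timo = Row()
timo.write(0, ttl=77, id=109, name="timo", password_reset_token="abc-xyz-123")

# INSERT without TTL, then UPDATE ... USING TTL 77 for the token only.
anna = Row()
anna.write(0, id=110, name="anna")
anna.write(0, ttl=77, password_reset_token="abc-xyz-123")

print(timo.read(78), anna.read(78))
```

After 78 seconds the first record is gone entirely, while the second keeps `id` and `name` and has lost only the token.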
Time-To-Live (TTL)
● Check the remaining TTL in seconds.
cqlsh:twotter> SELECT TTL(password_reset_token)
FROM users
WHERE id = 110;
Counter Columns
Create a Counter Column Table that counts
"upvote" and "downvote" events.
cqlsh:twotter> CREATE TABLE votes (
user_id int,
msg_created_on bigint,
upvote counter,
downvote counter,
PRIMARY KEY (user_id, msg_created_on)
);
Counter Columns
cqlsh:twotter>
UPDATE votes SET upvote = upvote + 1
WHERE user_id = 101
AND msg_created_on = 1234;
cqlsh:twotter>
UPDATE votes
SET downvote = downvote + 2
WHERE user_id = 101
AND msg_created_on = 1234;
cqlsh:twotter> SELECT * FROM votes;
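Counter columns are changed by deltas rather than overwritten. As a rough Python analogy (the `votes` mapping and `increment` helper are hypothetical), the two updates above leave the row with upvote = 1 and downvote = 2:

```python
from collections import Counter

# (user_id, msg_created_on, column) -> current count; missing keys read as 0
votes = Counter()


def increment(user_id, msg_created_on, column, delta):
    """Apply a delta to one counter, like SET upvote = upvote + 1."""
    votes[(user_id, msg_created_on, column)] += delta


increment(101, 1234, "upvote", 1)
increment(101, 1234, "downvote", 2)
print(votes[(101, 1234, "upvote")], votes[(101, 1234, "downvote")])
```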
sstable2json utility tool
$ sstable2json /var/lib/cassandra/data/twotter/users/*.db > *.json

X MB .db file -> 2X MB .json file
Exercise: sstable2json
● Insert a few records
● Flush the users column family to disk and create a json
representation of a *.db file.
Solution: sstable2json
$ nodetool flush twotter users
$ sstable2json /var/lib/cassandra/data/twotter/users/twotter-users-jb-1-Data.db > twotter-users.json
$ cat twotter-users.json
[{"key": "00000069","columns": [["","",
1384963716697000], ["email","g@rd.de",
1384963716697000], ["name","gerd",
1384963716697000]]},
{"key": "00000068","columns": [["","",
1384963716685000], ...
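The JSON output can be decoded with a few lines of Python. The sketch assumes that the key is the hex encoding of the int primary key (0x69 = 105, which matches gerd's id), that timestamps are microseconds since the Unix epoch, and that the column with the empty name is a row marker to skip. The `sample` string is the first record above, closed off by hand so that it parses; `decode_rows` and `write_time` are hypothetical helpers.

```python
import json
from datetime import datetime, timezone

# First record of the output above, completed into valid JSON for this sketch.
sample = """[{"key": "00000069", "columns": [["", "", 1384963716697000],
 ["email", "g@rd.de", 1384963716697000],
 ["name", "gerd", 1384963716697000]]}]"""


def decode_rows(text):
    """Turn sstable2json output into a list of {column: value} dicts."""
    rows = []
    for entry in json.loads(text):
        row = {"id": int(entry["key"], 16)}  # assumption: hex-encoded int key
        for column in entry["columns"]:
            name, value, _timestamp = column[:3]  # ignore any extra fields
            if name == "":
                continue  # assumption: empty-named column is a row marker
            row[name] = value
        rows.append(row)
    return rows


def write_time(micros):
    """Convert a microsecond write timestamp to a UTC datetime (whole seconds)."""
    return datetime.fromtimestamp(micros // 1_000_000, tz=timezone.utc)


print(decode_rows(sample), write_time(1384963716697000).date())
```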
CQL v3.1.0
(New in Cassandra 2.0)
● IF keyword
● Lightweight transactions (“Compare-And-Set”)
● Triggers (experimental!!)
● CQL paging support
● Drop column support
● SELECT column aliases
● Conditional DDL
● Index enhancements
● cqlsh COPY
IF Keyword
cqlsh> DROP KEYSPACE twotter;
cqlsh> DROP KEYSPACE twotter;
Bad Request: Cannot drop non existing
keyspace 'twotter'.
cqlsh> DROP KEYSPACE IF EXISTS twotter;
Lightweight Transactions
● Compare And Set (CAS)
● Example: without CAS, two users attempting
to create a unique user account in the same
cluster could overwrite each other’s work
with neither user knowing about it.

Source: http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/dml/dml_about_transactions_c.html
Lightweight Transactions
1. Register a new user
cqlsh:twotter>
INSERT INTO users (id, name, email)
VALUES (110, 'franz', 'fr@nz.de')
IF NOT EXISTS;

2. Perform a CAS reset of Franz’s email.
cqlsh:twotter>
UPDATE users
SET email = 'franz@gmail.com'
WHERE id = 110
IF email = 'fr@nz.de';
Exercise: Lightweight Transactions
● Perform a failing CAS e-mail reset:
cqlsh:twotter>
UPDATE users
SET email = 'franz@ABC.de'
WHERE id = 110
IF email = 'fr@nz.de';
 [applied] | email
-----------+-----------------
     False | franz@gmail.com
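The compare-and-set logic behind this result can be sketched in Python. This is a single-process illustration only: the `users` dict and `update_email_if` are hypothetical, and none of the coordination Cassandra performs between replicas is modelled.

```python
# State after registering the user with: INSERT ... IF NOT EXISTS
users = {110: {"name": "franz", "email": "fr@nz.de"}}


def update_email_if(user_id, expected, new):
    """Set email only if it currently equals `expected`.

    Returns (applied, current_email), similar to the [applied] column.
    """
    row = users[user_id]
    if row["email"] != expected:
        return False, row["email"]  # condition failed: nothing is written
    row["email"] = new
    return True, new


first = update_email_if(110, "fr@nz.de", "franz@gmail.com")
second = update_email_if(110, "fr@nz.de", "franz@ABC.de")
print(first, second)
```

The second call fails for the same reason as the update above: the stored e-mail is already 'franz@gmail.com', so the condition on 'fr@nz.de' no longer holds.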
Exercise: Lightweight Transactions
● Write a password reset method by using an
expiring password_reset_token column and
a CAS password update query.
Exercise: Lightweight Transactions
cqlsh:twotter>
UPDATE users USING TTL 77
SET password_reset_token = 'abc-xyz-123'
WHERE id = 110;
cqlsh:twotter>
UPDATE users
SET password = 'geheim!'
WHERE id = 110
IF password_reset_token = 'abc-xyz-123';
Create a Trigger
(experimental feature)
● Triggers are written in Java.
● Triggers are currently an experimental feature
in Cassandra 2.0. Use with caution!
cqlsh:twotter>
CREATE TRIGGER myTrigger ON users
USING 'org.apache.cassandra.triggers.InvertedIndex';

Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 

Recently uploaded (20)

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 

Apache Cassandra Lesson: Data Modelling and CQL3

  • 1. Cassandra Data Modelling and Queries with CQL3 By Markus Klems (2013)
  • 3. Data Model ● Keyspace (Database) ● Column Family (Table) ● Keys and Columns
  • 4. A column column_name value timestamp the timestamp field is used by Cassandra for conflict resolution: “Last Write Wins”
  • 5. Column family (Table): each row has a partition key followed by its columns. Row 101: email = ab@c.to, name = otto, tel = 12345. Row 103: email = karl@a.b, name = karl, tel = 6789, tel2 = 12233. Row 104: name = linda.
  • 6. Table with standard PRIMARY KEY CREATE TABLE messages ( msg_id timeuuid PRIMARY KEY, author text, body text );
  • 7. Table: Tweets (PRIMARY KEY = msg_id). Row 9990: author = otto, body = Hello World! Row 9991: author = linda, body = Hi, Otto
  • 8. Table with compound PRIMARY KEY CREATE TABLE timeline ( user_id uuid, msg_id timeuuid, author text, body text, PRIMARY KEY (user_id, msg_id) );
  • 9. “Wide-row” Table: Timeline (PRIMARY KEY = user_id + msg_id). Partition key 103 holds two clustered entries: msg_id 9990 (author = otto, body = Hello World!) and msg_id 9991 (author = linda, body = Hi @otto).
  • 10. Timeline Table is partitioned by user and locally clustered by msg. Node A: user 103 (msg_id 9990, 9994), user 104 (msg_id 9881, 9999). Node B: user 211 (msg_id 8090, 8555, 9678), user 212 (msg_id 9877).
  • 11. Comparison: RDBMS vs. Cassandra. RDBMS Data Design: Users Table (user_id, name, email): 101, otto, o@t.to. Tweets Table (tweet_id, author_id, body): 9990, 101, Hello! Followers Table (id, follows_id, followed_id): 4321, 104, 101. Cassandra Data Design: Users Table (user_id, name, email): 101, otto, o@t.to. Tweets Table (tweet_id, author_id, name, body): 9990, 101, otto, Hello! Follows Table (user_id, follows_list): 104, [101,117]. Followed Table (id, followed_list): 101, [104,109].
  • 12. Exercise: Launch 1 Cassandra Node. Data Center DC1 (Germany), launch server A: A~$ start service cassandra. Settings in /etc/conf/cassandra.yaml: seeds: A, rpc_address: 0.0.0.0
  • 13. Exercise: Start CQL Shell Data Center DC1 (Germany) A~$ cqlsh A
  • 14. Intro: CLI, CQL2, CQL3 ● CQL is “SQL for Cassandra” ● Cassandra CLI deprecated, CQL2 deprecated ● CQL3 is default since Cassandra 1.2 $ cqlsh ● Pipe scripts into cqlsh $ cat cql_script | cqlsh ● Source files inside cqlsh cqlsh> SOURCE '~/cassandra_training/cql3/01_create_keyspaces';
  • 15. CQL3 ● Create a keyspace ● Create a column family ● Insert data ● Alter schema ● Update data ● Delete data ● Apply batch operation ● Read data ● Secondary Index ● Compound Primary Key ● Collections ● Consistency level ● Time-To-Live (TTL) ● Counter columns ● sstable2json utility tool
  • 16. Create a SimpleStrategy keyspace ● Create a keyspace with SimpleStrategy and "replication_factor" option with value "3" like this: cqlsh> CREATE KEYSPACE <ksname> WITH REPLICATION = {'class':'SimpleStrategy', 'replication_factor':3};
  • 17. Exercise: Create a SimpleStrategy keyspace ● Create a keyspace "simpledb" with SimpleStrategy and replication factor 1.
  • 18. Exercise: Create a SimpleStrategy keyspace cqlsh> CREATE KEYSPACE simpledb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; cqlsh> DESCRIBE KEYSPACE simpledb;
  • 19. Create a NetworkTopologyStrategy keyspace Create a keyspace with NetworkTopologyStrategy and strategy option "DC1" with a value of "1" and "DC2" with a value of "2" like this: cqlsh> CREATE KEYSPACE <ksname> WITH REPLICATION = { 'class':'NetworkTopologyStrategy', 'DC1':1, 'DC2':2 };
  • 20. Exercise Create Table “users” ● Connect to the "twotter" keyspace. cqlsh> USE twotter; ● Create new column family (Table) named "users". cqlsh:twotter> CREATE TABLE users ( id int PRIMARY KEY, name text, email text ); cqlsh:twotter> DESCRIBE TABLES; cqlsh:twotter> DESCRIBE TABLE users; *we use int instead of uuid in the exercises for the sake of readability
  • 21. Exercise: Create Table "messages" ● Create a new Table named "messages" with the attributes "posted_on", "user_id", "user_name", "body", and a primary key that consists of "user_id" and "posted_on". cqlsh:twotter> CREATE TABLE messages ( posted_on bigint, user_id int, user_name text, body text, PRIMARY KEY (user_id, posted_on) ); *we use bigint instead of timeuuid in the exercises for the sake of readability
  • 22. Exercise: Insert data into Table "users" of keyspace "twotter" cqlsh:twotter> INSERT INTO users(id, name, email) VALUES (101, 'otto', 'otto@abc.de'); cqlsh:twotter> ... insert more records ... cqlsh> SOURCE '~/cassandra_training/cql3/03_insert';
  • 23. Exercise: Insert message records cqlsh:twotter> INSERT INTO messages (user_id, posted_on, user_name, body) VALUES (101, 1384895178, 'otto', 'Hello!'); cqlsh:twotter> SELECT * FROM messages;
  • 24. Read data cqlsh:twotter> SELECT * FROM users;
         id  | email        | name
        -----+--------------+-------
         105 | g@rd.de      | gerd
         104 | linda@abc.de | linda
         102 | null         | jane
         106 | heinz@xyz.de | heinz
         101 | otto@abc.de  | otto
         103 | null         | karl
  • 25. Update data cqlsh:twotter> UPDATE users SET email = 'jane@smith.org' WHERE id = 102;
         id  | email          | name
        -----+----------------+-------
         105 | g@rd.de        | gerd
         104 | linda@abc.de   | linda
         102 | jane@smith.org | jane
         106 | heinz@xyz.de   | heinz
         101 | otto@abc.de    | otto
         103 | null           | karl
  • 26. Delete data ● Delete columns cqlsh:twotter> DELETE email FROM users WHERE id = 105; ● Delete an entire row cqlsh:twotter> DELETE FROM users WHERE id = 106;
  • 27. Delete data
         id  | email          | name
        -----+----------------+-------
         105 | null           | gerd
         104 | linda@abc.de   | linda
         102 | jane@smith.org | jane
         101 | otto@abc.de    | otto
         103 | null           | karl
  • 28. Batch operation ● Execute multiple mutations with a single operation cqlsh:twotter> BEGIN BATCH INSERT INTO users(id, name, email) VALUES(107, 'john','j@doe.net') INSERT INTO users(id, name) VALUES(108, 'michael') UPDATE users SET email = 'michael@abc.de' WHERE id = 108 DELETE FROM users WHERE id = 105 APPLY BATCH;
  • 29. Batch operation
         id  | email          | name
        -----+----------------+---------
         107 | j@doe.net      | john
         108 | michael@abc.de | michael
         104 | linda@abc.de   | linda
         102 | jane@smith.org | jane
         101 | otto@abc.de    | otto
         103 | null           | karl
  • 30. Secondary Index cqlsh:twotter> CREATE INDEX name_index ON users(name); cqlsh:twotter> CREATE INDEX email_index ON users(email); cqlsh:twotter> SELECT name, email FROM users WHERE name = 'otto'; cqlsh:twotter> SELECT name, email FROM users WHERE email = 'michael@abc.de';
  • 31. Alter Table Schema cqlsh:twotter> ALTER TABLE users ADD password text; cqlsh:twotter> ALTER TABLE users ADD password_reset_token text; * Given its flexible schema, Cassandra’s CQL ALTER finishes much quicker than RDBMS SQL ALTER where all existing records need to be updated.
  • 32. Alter Table Schema
         id  | email          | name    | password | password_reset_token
        -----+----------------+---------+----------+----------------------
         107 | j@doe.net      | john    | null     | null
         108 | michael@abc.de | michael | null     | null
         104 | linda@abc.de   | linda   | null     | null
         102 | jane@smith.org | jane    | null     | null
         101 | otto@abc.de    | otto    | null     | null
         103 | null           | karl    | null     | null
  • 33. Collections - Set CQL3 introduces collections for storing complex data structures, namely the following: set, list, and map. This is the CQL way of modelling many-to-one relationships. 1. Let us add a set of "hobbies" to the Table "users". cqlsh:twotter> ALTER TABLE users ADD hobbies set<text>; cqlsh:twotter> UPDATE users SET hobbies = hobbies + {'badminton','jazz'} WHERE id = 101;
  • 34. Collections - List 2. Now create a Table "followers" with a list of followers. cqlsh:twotter> CREATE TABLE followers ( user_id int PRIMARY KEY, followers list<text>); cqlsh:twotter> INSERT INTO followers ( user_id, followers) VALUES (101, ['willi','heinz']); cqlsh:twotter> SELECT * FROM followers;
         user_id | followers
        ---------+--------------------
             101 | ['willi', 'heinz']
  • 35. Collections - Map 3. Add a map to the Table "messages". cqlsh:twotter> ALTER TABLE messages ADD comments map<text, text>; cqlsh:twotter> UPDATE messages SET comments = comments + {'otto':'thx!'} WHERE user_id = 103 AND posted_on = 1384895223;
  • 36. Consistency Level ● Set the consistency level for all subsequent requests: cqlsh:twotter> CONSISTENCY ONE; cqlsh:twotter> CONSISTENCY QUORUM; cqlsh:twotter> CONSISTENCY ALL; ● Show the current consistency level: cqlsh:twotter> CONSISTENCY;
  • 37. Exercise: Consistency Level ● Set the consistency level to ANY and execute a SELECT statement.
  • 38. Exercise: Consistency Level ● Set the consistency level to ANY and execute a SELECT statement. Bad Request: ANY ConsistencyLevel is only supported for writes
  • 39. Exercise: Time-To-Live (TTL) ● Insert a user record with a password reset token with a 77 second TTL value. cqlsh:twotter> INSERT INTO users (id, name, password_reset_token) VALUES (109, 'timo', 'abc-xyz-123') USING TTL 77;
  • 40. Exercise: Time-To-Live (TTL) ● The preceding INSERT statement puts the TTL on every column it writes, so the entire user record expires after 77 seconds. ● This is what we actually wanted to do: cqlsh:twotter> INSERT INTO users (id, name) VALUES(110, 'anna'); cqlsh:twotter> UPDATE users USING TTL 77 SET password_reset_token = 'abc-xyz-123' WHERE id = 110;
  • 41. Time-To-Live (TTL) ● Check the remaining TTL in seconds. cqlsh:twotter> SELECT TTL (password_reset_token) FROM users WHERE id = 110;
  • 42. Counter Columns Create a Counter Column Table that counts "upvote" and "downvote" events. cqlsh:twotter> CREATE TABLE votes ( user_id int, msg_created_on bigint, upvote counter, downvote counter, PRIMARY KEY (user_id, msg_created_on) );
  • 43. Counter Columns cqlsh:twotter> UPDATE votes SET upvote = upvote + 1 WHERE user_id = 101 AND msg_created_on = 1234; cqlsh:twotter> UPDATE votes SET downvote = downvote + 2 WHERE user_id = 101 AND msg_created_on = 1234; cqlsh:twotter> SELECT * FROM votes;
  • 44. sstable2json utility tool $ sstable2json /var/lib/cassandra/data/twotter/users/*.db > *.json (an X MB .db file becomes a roughly 2X MB .json file)
  • 45. Exercise: sstable2json ● Insert a few records ● Flush the users column family to disk and create a json representation of a *.db file.
  • 46. Solution: sstable2json $ nodetool flush twotter users $ sstable2json /var/lib/cassandra/data/twotter/users/twotter-users-jb-1-Data.db > twotter-users.json $ cat twotter-users.json [{"key": "00000069","columns": [["","", 1384963716697000], ["email","g@rd.de", 1384963716697000], ["name","gerd", 1384963716697000]]}, {"key": "00000068","columns": [["","", 1384963716685000], ...
  • 47. CQL v3.1.0 (New in Cassandra 2.0) ● IF keyword ● Lightweight transactions (“Compare-And-Set”) ● Triggers (experimental!!) ● CQL paging support ● Drop column support ● SELECT column aliases ● Conditional DDL ● Index enhancements ● cqlsh COPY
  • 48. IF Keyword cqlsh> DROP KEYSPACE twotter; cqlsh> DROP KEYSPACE twotter; Bad Request: Cannot drop non existing keyspace 'twotter'. cqlsh> DROP KEYSPACE IF EXISTS twotter;
  • 49. Lightweight Transactions ● Compare And Set (CAS) ● Example: without CAS, two users attempting to create a unique user account in the same cluster could overwrite each other’s work with neither user knowing about it. Source: http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/dml/dml_about_transactions_c.html
  • 50. Lightweight Transactions 1. Register a new user cqlsh:twotter> INSERT INTO users (id, name, email) VALUES (110, 'franz', 'fr@nz.de') IF NOT EXISTS; 2. Perform a CAS reset of Franz’s email. cqlsh:twotter> UPDATE users SET email = 'franz@gmail.com' WHERE id = 110 IF email = 'fr@nz.de';
  • 51. Exercise: Lightweight Transactions ● Perform a failing CAS e-mail reset: cqlsh:twotter> UPDATE users SET email = 'franz@ABC.de' ...
  • 52. Exercise: Lightweight Transactions ● Perform a failing CAS e-mail reset: cqlsh:twotter> UPDATE users SET email = 'franz@ABC.de' WHERE id = 110 IF email = 'fr@nz.de';
         [applied] | email
        -----------+-----------------
             False | franz@gmail.com
  • 53. Exercise: Lightweight Transactions ● Write a password reset method by using an expiring password_reset_token column and a CAS password update query.
  • 54. Exercise: Lightweight Transactions cqlsh:twotter> UPDATE users USING TTL 77 SET password_reset_token = 'abc-xyz-123' WHERE id = 110; cqlsh:twotter> UPDATE users SET password = 'geheim!' WHERE id = 110 IF password_reset_token = 'abc-xyz-123';
  • 55. Create a Trigger (experimental feature) ● Triggers are written in Java. ● Triggers are currently an experimental feature in Cassandra 2.0. Use with caution! cqlsh:twotter> CREATE TRIGGER myTrigger ON users USING 'org.apache.cassandra.triggers.InvertedIndex';
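
The password reset exercise from slides 53 and 54 combines two ideas: a TTL that applies only to the columns written by that statement, and a conditional (compare-and-set) update that is applied only if the stored value matches. The following is a minimal in-memory sketch of those two behaviours in Python. It is not a Cassandra client and does not reproduce Cassandra's distributed implementation of lightweight transactions. The names `ToyTable`, `Column`, `FakeClock`, `update`, and `update_if` are hypothetical and exist only for illustration.

```python
import time


class Column:
    """One cell: a value plus an optional absolute expiry time in seconds."""

    def __init__(self, value, expires_at=None):
        self.value = value
        self.expires_at = expires_at


class ToyTable:
    """Hypothetical in-memory stand-in for the 'users' table (not Cassandra)."""

    def __init__(self, clock=time.time):
        self.clock = clock  # injectable, so expiry can be shown without sleeping
        self.rows = {}      # primary key -> {column name -> Column}

    def read(self, key, name):
        """Return the column value, or None if it is missing or has expired."""
        col = self.rows.get(key, {}).get(name)
        if col is None:
            return None
        if col.expires_at is not None and self.clock() >= col.expires_at:
            return None
        return col.value

    def update(self, key, values, ttl=None):
        """Mimics UPDATE ... USING TTL: the TTL covers only the columns written here."""
        expires_at = None if ttl is None else self.clock() + ttl
        row = self.rows.setdefault(key, {})
        for name, value in values.items():
            row[name] = Column(value, expires_at)

    def update_if(self, key, values, expected):
        """Mimics UPDATE ... IF col = value: apply only if every condition holds."""
        for name, want in expected.items():
            if self.read(key, name) != want:
                return False  # corresponds to [applied] = False
        self.update(key, values)
        return True


class FakeClock:
    """Manually advanced clock used in place of time.time for the demo."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
users = ToyTable(clock)

# Slide 40: the name has no TTL, only the reset token expires after 77 seconds.
users.update(110, {"name": "anna"})
users.update(110, {"password_reset_token": "abc-xyz-123"}, ttl=77)

# Slide 54: a password change succeeds while the token is still live.
clock.now = 10
first = users.update_if(110, {"password": "geheim!"},
                        {"password_reset_token": "abc-xyz-123"})

# After 77 seconds the token is gone, so the same condition no longer holds.
clock.now = 100
second = users.update_if(110, {"password": "other"},
                         {"password_reset_token": "abc-xyz-123"})

print(first, second, users.read(110, "name"), users.read(110, "password"))
```

In this sketch the row itself survives the expiry of the token, which is the difference between the `INSERT ... USING TTL` on slide 39 and the `UPDATE ... USING TTL` on slide 40.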