Helsinki Cassandra Meetup #2: Introduction to CQL3 and DataModeling

Introduction to CQL and Data Modeling
Helsinki Cassandra Meetup
10th February 2014

©2014 DataStax Confidential. Do not distribute without consent.

Agenda
•  Introduction
•  CQL Basics
•  Data Modeling
•  Time Series/Sensor Data
•  Java Driver


2

About me
Johnny Miller
DataStax
Solutions Architect
www.datastax.com
@DataStax

@CyanMiller
https://www.linkedin.com/in/johnnymiller
jmiller@datastax.com

3

DataStax
•  Founded in April 2010
•  We drive Apache Cassandra™
•  400+ customers (20 of the Fortune 100)
•  200+ employees
•  Home to Apache Cassandra Chair & most committers
•  Headquartered in San Francisco Bay area
•  European headquarters established in London

Our Goal
To be the first and best database choice for online applications


4

DataStax
•  DataStax supports both the open source community and enterprises.

Open Source/Community
•  Apache Cassandra (employ Cassandra chair and 90+% of the committers)
•  DataStax Community Edition
•  DataStax OpsCenter
•  DataStax DevCenter
•  DataStax Drivers/Connectors
•  Online Documentation
•  Online Training
•  Mailing lists and forums

Enterprise Software
•  DataStax Enterprise Edition
•  Certified Cassandra
•  Built-in Analytics
•  Built-in Enterprise Search
•  Enterprise Security
•  DataStax OpsCenter
•  Expert Support
•  Consultative Help
•  Professional Training
5

Cassandra Adoption

Source http://db-engines.com/en/ranking, Feb 2014

6

A sample of Cassandra & DataStax Enterprise users


7

Why Good Data Modeling is Important
•  Cassandra is a highly available, highly scalable, & highly distributed
database, with no single point of failure
•  To achieve this, Cassandra is optimized for non-relational data models.
•  Joins do not function well on distributed databases.
•  Locking and transactions jam up distributed nodes
•  By modeling data properly for Cassandra you can avoid joins, locking, and
transactions for your application.


8

CQL Basics
YesCQL


9

CQL Basics
•  Cassandra Query Language
•  SQL–like language to query Cassandra
•  Limited predicates. Attempts to prevent bad queries
•  but, you can still get into trouble!
•  Keyspace – analogous to a schema.
•  Has various storage attributes.
•  The keyspace determines the replication factor (RF).
•  Table – looks like a SQL Table.
•  A table must have a Primary Key.
•  We can fully qualify a table as <keyspace>.<table>

10

DevCenter
•  DataStax DevCenter – a free, visual query tool for creating and running CQL statements against Cassandra
and DataStax Enterprise.


11

CQL Basics
•  Usual statements
•  CREATE / DROP / ALTER TABLE
•  SELECT
BUT
•  INSERT and UPDATE are similar to each other
•  If a row doesn’t exist, UPDATE will insert it, and if it exists, INSERT will
overwrite the columns it specifies.
•  Think of it as an UPSERT
•  Therefore we never get a key violation
•  For ordinary updates, Cassandra does not read the row before writing
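The upsert behaviour described above can be pictured with a small in-memory model. This is only a sketch: the class name, the map layout and the population figures are illustrative assumptions, and it says nothing about how Cassandra actually stores rows.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model of upsert semantics; not Cassandra's storage engine. */
final class UpsertSketch {
    // primary key -> column name -> value, for one hypothetical table
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    /** Models both INSERT and UPDATE: create the row if absent, then overwrite the given columns. */
    void write(String primaryKey, Map<String, ?> columns) {
        rows.computeIfAbsent(primaryKey, k -> new HashMap<>()).putAll(columns);
    }

    /** Returns the stored columns, or null when the key has never been written. */
    Map<String, Object> read(String primaryKey) {
        return rows.get(primaryKey);
    }

    public static void main(String[] args) {
        UpsertSketch cities = new UpsertSketch();
        // An "UPDATE" on a row that does not exist yet simply creates it.
        cities.write("Helsinki", Collections.singletonMap("population", 600000));
        // An "INSERT" on an existing key overwrites; no key violation is raised.
        cities.write("Helsinki", Collections.singletonMap("population", 610000));
        System.out.println(cities.read("Helsinki"));
    }
}
```

Both calls go through the same write path, which is the point of the slide: there is no separate "insert" and "update" behaviour to model.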


12

Creating a keyspace - Single Data Centre Consistency


13

Creating a keyspace - Multiple Data Centre Consistency


14

CQL Basics – creating a table
CREATE TABLE cities (
  city_name  varchar,
  elevation  int,
  population int,
  latitude   float,
  longitude  float,
  PRIMARY KEY (city_name)
);

•  We can visualize it this way:

•  city_name is the partition key
•  In this example, the partition key = primary key

15

CQL Basics – Composite Primary Key
The Primary Key
•  The key uniquely identifies a row.
•  A composite primary key consists of:
•  A partition key
•  One or more clustering columns
e.g. PRIMARY KEY (partition key, cluster columns, ...)

•  The partition key determines on which node the partition resides
•  Data is ordered in cluster column order within the partition


16

CQL Basics – Composite Primary Key
CREATE TABLE sporty_league (
  team_name   varchar,
  player_name varchar,
  jersey      int,
  PRIMARY KEY (team_name, player_name)
);


17

CQL Basics – Simple Select
SELECT * FROM sporty_league;

•  More than a few rows can be slow. (Limited to 10,000 rows by default)
•  Use the LIMIT keyword to choose fewer or more rows


18

CQL Basics - Simple Select on Partition Key and Cluster Columns
SELECT * FROM sporty_league WHERE team_name = 'Mighty Mutts';

SELECT * FROM sporty_league WHERE team_name = 'Mighty Mutts'
AND player_name = 'Lucky';


19

CQL Basics – Insert/Update
INSERT INTO sporty_league (team_name, player_name, jersey)
VALUES ('Mighty Mutts', 'Felix', 90);


20

CQL Basics - Ordering
•  Partition keys are not ordered, but the cluster columns are.
•  However, you can only order by a column if it’s a cluster column.
•  Data will be returned by default in the order of the clustering column.
•  You can also use the ORDER BY keyword – but only on the clustering column!
SELECT * FROM sporty_league
WHERE team_name = 'Mighty Mutts'
ORDER BY player_name DESC;
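One way to picture the ordering rules above is an unordered map of partitions, each holding a sorted map keyed by the clustering column. The sketch below is a conceptual model only; the class name and the jersey numbers other than Felix's are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Conceptual model: unordered partitions, rows sorted by clustering column inside each. */
final class ClusteringOrderSketch {
    // partition key (team_name) -> clustering column (player_name) -> jersey
    private final Map<String, NavigableMap<String, Integer>> partitions = new HashMap<>();

    void insert(String teamName, String playerName, int jersey) {
        partitions.computeIfAbsent(teamName, k -> new TreeMap<>()).put(playerName, jersey);
    }

    /** Rows of one partition in clustering order; descending models ORDER BY player_name DESC. */
    NavigableMap<String, Integer> select(String teamName, boolean descending) {
        NavigableMap<String, Integer> rows = partitions.getOrDefault(teamName, new TreeMap<>());
        return descending ? rows.descendingMap() : rows;
    }

    public static void main(String[] args) {
        ClusteringOrderSketch league = new ClusteringOrderSketch();
        league.insert("Mighty Mutts", "Lucky", 7);   // hypothetical jersey number
        league.insert("Mighty Mutts", "Felix", 90);
        league.insert("Mighty Mutts", "Rex", 12);    // hypothetical player
        System.out.println(league.select("Mighty Mutts", false).keySet());
        System.out.println(league.select("Mighty Mutts", true).keySet());
    }
}
```

Because only the inner map is sorted, the model can order rows within a team but has no cheap way to order across teams, which mirrors the restriction on ORDER BY.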


21

CQL Basics – Group By
•  We have already done this!
•  The partition key effectively names the columns for grouping.
•  The previous table contained all of the players grouped by their
team_name.


22

CQL Basics - Predicates
•  On the partition key: = and IN
•  On the cluster columns: <, <=, =, >=, >, IN


23

CQL Basics – Composite Partition Key
CREATE TABLE cities (
  city_name varchar,
  state     varchar,
  PRIMARY KEY ((city_name, state))
);

•  Each city and state combination gets its own partition

24

CQL Basics – Performance considerations
•  The best queries are in a single partition.
i.e. WHERE partition key = <something>
•  Each new partition requires a new disk seek.
•  Queries that span multiple partitions are s-l-o-w
•  Queries that span multiple cluster columns are fast


25

CQL Basics – Authentication and Authorisation
•  CQL supports creating users and granting them access to tables etc.
•  You need to enable authentication in the cassandra.yaml config file.
•  You can create, alter, drop and list users
•  You can then GRANT permissions to users accordingly – ALTER, AUTHORIZE, DROP, MODIFY, SELECT.


26

CQL Basics - Tracing
•  You can turn tracing on or off for queries with the TRACING ON | OFF
command.
•  This can help you understand what Cassandra is doing and identify any
performance problems.

•  http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2


27

CQL Basics – TTL
•  Expiring Columns, or Time to Live (TTL)
INSERT INTO users (id, first, last) VALUES ('abc123', 'abe', 'lincoln')
USING TTL 3600;
// Expires data in one hour


28

CQL Basics – Data Types


29

CQL Basics – Data Types: Collections
•  CQL supports having columns that contain collections of data.
•  The collection types include:
•  Set, List and Map.

CREATE TABLE collections_example (
  id int PRIMARY KEY,
  set_example set<text>,
  list_example list<text>,
  map_example map<int, text>
);

•  These data types are intended to support the type of 1-to-many relationships that can be modeled in
a relational DB e.g. a user has many email addresses.
•  Some performance considerations around collections.
•  Requires serialization so don’t go crazy!
•  Often more efficient to denormalise further rather than use collections if intending to store lots
of data.
•  Favour sets over lists – lists are not very performant

30

CQL Basics – Data Types: Counters
•  Stores a number that incrementally counts the occurrences of a particular
event or process.
UPDATE UserActions SET total = total + 2
WHERE user = 123 AND action = 'xyz';


31

CQL Basics - Lightweight Transactions
•  Introduced in Cassandra 2.0
•  DSE 4 will include Cassandra 2.0 (due soon…)
•  DSE 3.2 (current version) is using Cassandra 1.2
•  Uses the Paxos consensus protocol to obtain an agreement across the cluster.
•  Example:
INSERT INTO customer_account (customerID, customer_email)
VALUES ('LauraS', 'lauras@gmail.com')
IF NOT EXISTS;

UPDATE customer_account SET customer_email = 'laurass@gmail.com'
WHERE customerID = 'LauraS'
IF customer_email = 'lauras@gmail.com';

•  Great for 1% of your application – but not recommended to be used too much!
•  Eventual consistency is your friend:
http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis


32

Data Modeling

Query based and denormalised


33

Cassandra is not a relational database
•  Cassandra doesn’t work the same way as an RDBMS
•  Your data modeling approach won’t work the same way either
•  No foreign keys
•  No joins


34

Query-Driven Data Modeling
•  Start by addressing the queries that you will need to answer
•  Your data should be able to match it directly
•  Think about:
•  The actions your application needs to perform
•  How you want to access the data
•  What are the use cases?
•  What does the data look like?


35

Query-Driven Data Modeling contd.
•  What are you trying to retrieve
•  Does it need to be ordered?
•  Is there any nesting of data?
•  Do you need to group data?
•  Do you need to filter data?
•  Does data expire?
•  Does data need to be retrieved in chronological order?


36

Denormalisation
•  Combine table columns into a single view i.e. materialized view
•  We have to create a table that stores all the data that would be in the view
•  Remember - no joins in Cassandra!
Advantage:
•  Having the data stored in this manner greatly improves performance
•  Less seeking
•  Less network traffic
Disadvantage:
•  Data duplication
•  Different tables for different queries
•  You will use more disk space – but disks are cheap!

37

Avoid client-side joins
•  What is a client-side join?
•  Querying a table from Cassandra
•  Using the results from the first query to query a second table
•  Why avoid?
•  Degrades performance i.e. more I/O, seeks and traffic


38

Don’t be scared of writes
•  Cassandra is the fastest DB there is for writes.
•  Writing to multiple tables is not going to be slow!
•  3,000-5,000 writes/second/core e.g. 8 core server = 24k-40k writes per second!
•  < 1ms typical for most writes (varies based on hardware)


39

Performance
“In terms of scalability, there is a clear winner throughout our experiments. Cassandra
achieves the highest throughput for the maximum number of nodes in all experiments
with a linear increasing throughput.”
Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl, et al., August 2013, p. 10.
Benchmark paper presented at the Very Large Database Conference, 2013.
http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2013.pdf

Netflix Cloud Benchmark…
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

End Point independent NoSQL Benchmark
Highest in throughput vs MongoDB and HBase
Lowest in latency vs MongoDB and HBase
http://www.datastax.com/wp-content/uploads/2013/02/WP-Benchmarking-Top-NoSQL-Databases.pdf
40

One-to-many
•  Relationship without being relational
•  Example – Users have many videos
•  Wait? Where is the foreign key?


41

One-to-Many
•  Static table to store videos
•  UUID for unique video id
•  Add username to denormalize
CREATE TABLE videos (
  videoid uuid,
  videoname varchar,
  username varchar,
  description varchar,
  tags varchar,
  upload_date timestamp,
  PRIMARY KEY (videoid)
);

•  Lookup video by username
CREATE TABLE username_video_index (
  username varchar,
  videoid uuid,
  video_name varchar,
  PRIMARY KEY (username, videoid)
);

SELECT video_name FROM username_video_index
WHERE username = 'tcodd' AND videoid = '99051fe9'

•  Write in two tables at once for fast lookups

42

Many-to-many
•  Example - users and videos have many comments.


43

Many-to-many
•  Model both sides of the view
•  Insert both when comment is created
•  Materialized views from either side
CREATE TABLE comments_by_user (
  username varchar,
  videoid uuid,
  comment_ts timestamp,
  comment varchar,
  PRIMARY KEY (username, videoid)
);

CREATE TABLE comments_by_video (
  videoid uuid,
  username varchar,
  comment_ts timestamp,
  comment varchar,
  PRIMARY KEY (videoid, username)
);

DON’T BE AFRAID OF WRITES

44

Partition Key is not the same as a Primary Key
•  Within a table, a row is referenced by a partition key
•  This is either your primary key or the first part of a compound primary
key
Similarities
•  Partition key identifies a partition as being separate from other partitions
•  Must be unique within a table
Differences
•  Inserting a new record with a partition key that already exists doesn’t do
what you’re used to in a RDBMS i.e. No primary key violations
•  An INSERT using an existing partition key is allowed
•  As a consequence, INSERT and UPDATE act in the same way i.e. UPSERT

45

How to avoid UPSERTS
•  Guarantee that your primary keys are unique from one another
•  Use an appropriate natural key based on your data
•  Use a surrogate key for partition key
Risks with natural keys
•  Depending on the type of natural key that is used, there may still be an
increased risk of UPSERTs
•  Changing the datum used for a Natural Key requires a lot of overhead.
•  So why not use a sequence to generate a surrogate key?
•  You can’t – Cassandra doesn’t provide sequences!

46

What, no sequences?
•  Sequences are a handy feature in RDBMS for auto-creation of IDs for your data.
•  Guaranteed unique
•  E.g. INSERT INTO user (id, firstName, LastName) VALUES (seq.nextVal(), 'Ted', 'Codd')
•  Cassandra has no sequences!
•  Extremely difficult in a masterless distributed system
•  Requires a lock (perf killer)
•  What to do?
•  Use part of the data to create a unique key
•  Use a UUID


47

UUID
•  Universally Unique Identifier
•  128 bit number represented in character form e.g. 99051fe9-6a9c-46c2-b949-38ef78858dd0
•  Easily generated on the client
•  Version 1 has a timestamp component
•  Version 4 has no timestamp component
•  Faster to generate
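Generating a UUID on the client needs nothing beyond the Java standard library. The sketch below uses java.util.UUID, which produces version 4 (random) identifiers; the class and method names are illustrative, and the standard library has no generator for version 1 (time-based) UUIDs, so those would need a separate helper.

```java
import java.util.UUID;

/** Minimal sketch of client-side key generation with java.util.UUID. */
final class UuidSketch {
    /** Generates a random (version 4) UUID, e.g. for a videoid column. */
    static UUID newVideoId() {
        return UUID.randomUUID();
    }

    public static void main(String[] args) {
        UUID id = newVideoId();
        System.out.println(id + " version=" + id.version());

        // The canonical text form has five hyphen-separated groups and parses back to the same value.
        UUID parsed = UUID.fromString("99051fe9-6a9c-46c2-b949-38ef78858dd0");
        System.out.println(parsed + " version=" + parsed.version());
    }
}
```

No coordination between clients is required, which is why this approach suits a masterless cluster where sequences are impractical.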


48

Indexing
•  This gives you fast access to data
•  Secondary indexes != relational indexes


49

Adding an Index to a table
•  If you want to query on a column that is not part of your PK, you can
create an index:
CREATE INDEX ON <table>(<column>);
•  Then you can do a select:
SELECT * FROM product WHERE type = 'PC';
•  Avoid doing this
•  Not great for performance (although improvements are being made)
•  Much more efficient to model your data around the query i.e. roll your
own indexes!!


50

Keyword index example
•  Using the previous video example, users want to tag videos.
•  Video table defined as:
CREATE TABLE videos (
  videoid uuid,
  videoname varchar,
  username varchar,
  description varchar,
  tags varchar,
  PRIMARY KEY (videoid)
);

•  Now we can define an index for tagging videos
CREATE TABLE video_tag_index (
  tag varchar,
  videoid uuid,
  timestamp timestamp,
  PRIMARY KEY (tag, videoid)
);

Fast
Efficient
51

Partial word index example
•  Table:
CREATE TABLE email_index (
  domain varchar,
  user varchar,
  username varchar,
  PRIMARY KEY (domain, user)
);

•  User: jmiller, Email: jmiller@datastax.com
INSERT INTO email_index (domain, user, username)
VALUES ('@datastax.com', 'jmiller', 'jmiller');
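Before writing to email_index the client has to split the address into the domain (partition key) and the user part (clustering column). A minimal sketch of that split follows; the class and method names are hypothetical, and keeping the leading '@' on the domain simply follows the example row above.

```java
/** Sketch of deriving the email_index key columns from an email address. */
final class EmailIndexSketch {
    /** Returns {domain, user}; the domain keeps its leading '@' as in the slide's example row. */
    static String[] toIndexKey(String email) {
        int at = email.lastIndexOf('@');
        if (at <= 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("not an email address: " + email);
        }
        return new String[] { email.substring(at), email.substring(0, at) };
    }

    public static void main(String[] args) {
        String[] key = toIndexKey("jmiller@datastax.com");
        System.out.println("domain=" + key[0] + " user=" + key[1]);
    }
}
```

With the domain as partition key, all addresses for one domain land in the same partition and can be listed or range-scanned by user.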


52

Bitmap index
•  Multiple parts to a key
•  Create a truth table of the various combinations
•  However, inserts == the number of combinations


53

Bitmap index example
•  Find a car in a car park by variable combinations


54

Bitmap index example – Table definition
•  Make a table with three different key combinations
CREATE TABLE car_location_index (
  make varchar,
  model varchar,
  colour varchar,
  vehicle_id int,
  lot_id int,
  PRIMARY KEY ((make, model, colour), vehicle_id)
);


55

Bitmap index example – Adding records
•  We are pre-optimizing for 7 possible queries of the index on insert.
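INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
VALUES ('Ford', 'Mustang', 'Blue', 1234, 8675309);
INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
VALUES ('Ford', 'Mustang', '', 1234, 8675309);
INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
VALUES ('Ford', '', 'Blue', 1234, 8675309);
INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
VALUES ('Ford', '', '', 1234, 8675309);
INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
VALUES ('', 'Mustang', 'Blue', 1234, 8675309);
INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
VALUES ('', 'Mustang', '', 1234, 8675309);
INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
VALUES ('', '', 'Blue', 1234, 8675309);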
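The seven rows are every combination of the three attributes in which at least one value is kept (2^3 - 1 = 7). A client would normally generate them rather than write them by hand; the sketch below shows one way to do that with a bit mask. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: enumerate the key combinations a bitmap-style index needs per record. */
final class BitmapIndexSketch {
    /** Every combination with at least one attribute kept; dropped attributes become "". */
    static List<List<String>> combinations(String... attributes) {
        List<List<String>> result = new ArrayList<>();
        int n = attributes.length;
        // Each bit of the mask decides whether the attribute at that position is kept.
        for (int mask = (1 << n) - 1; mask >= 1; mask--) {
            List<String> row = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                boolean keep = (mask & (1 << (n - 1 - i))) != 0;
                row.add(keep ? attributes[i] : "");
            }
            result.add(row);
        }
        return result;
    }

    public static void main(String[] args) {
        for (List<String> key : combinations("Ford", "Mustang", "Blue")) {
            System.out.println(key); // one INSERT per printed combination
        }
    }
}
```

The number of inserts per record grows as 2^n - 1, so this pattern is only practical for a small number of indexed attributes.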


56

Bitmap - selecting
•  Different queries are now possible:


57

Time Series/Sensor Data


58

What is time series data?
•  Sensors
•  CPU, Network Card, Electronic Power Meter, Resource Utilization,
Weather
•  Clickstream data
•  Historical trends
•  Stock Ticker
•  Anything that varies on a temporal basis
•  Top Ten Most Popular Videos


59

Why Cassandra for time series data?
•  Cassandra is based on the BigTable storage model
•  One key row and lots of (variable) columns
•  Single layout on disk


60

Time Series Example
•  Storing weather data
•  One weather station
•  Temperature measurement every minute


61

Time Series Example – query data
•  Weather station id = Locality of single node


62

Time Series Example - Table
•  Data partitioned by weather station ID and time
•  Timestamp goes in the clustered column
•  Store the measurement as the non-clustered column(s)
•  Take advantage of partition clustering
CREATE TABLE temperature (
  weatherstation_id text,
  event_time timestamp,
  temperature text,
  PRIMARY KEY (weatherstation_id, event_time)
);

63

Time Series Example
•  Simple to insert:
INSERT INTO temperature (weatherstation_id, event_time, temperature)
VALUES ('1234abcd', '2013-12-11 07:01:00', '72F');

•  Simple to query
SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd'
AND event_time > '2013-04-03 07:01:00' AND event_time < '2013-04-03 07:04:00';


64

Time Series Example – Partitioning
•  With the previous table, you can end up with a very large row on 1 partition
i.e. PRIMARY KEY (weatherstation_id, event_time)
•  This would have to fit on 1 node.
•  Cassandra can store 2 billion columns per storage row.
•  The solution is to have a composite partition key to split things up:
CREATE TABLE temperature (
  weatherstation_id text,
  event_time timestamp,
  date text,
  temperature text,
  PRIMARY KEY ((weatherstation_id, date), event_time)
);
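With a (weatherstation_id, date) partition key, the client has to derive the date value from each reading's event time before it writes. A minimal sketch follows; the class name is hypothetical and the choice of UTC for the day boundary is an assumption, since the slides do not say which time zone the date refers to.

```java
import java.time.Instant;
import java.time.ZoneOffset;

/** Sketch: derive the per-day partition component from an event time. */
final class DayBucketSketch {
    /** Returns the day as yyyy-MM-dd, using UTC as the (assumed) day boundary. */
    static String dateBucket(Instant eventTime) {
        return eventTime.atZone(ZoneOffset.UTC).toLocalDate().toString();
    }

    public static void main(String[] args) {
        Instant eventTime = Instant.parse("2013-12-11T07:01:00Z");
        System.out.println(dateBucket(eventTime));
    }
}
```

Writers and readers must agree on the same rule, otherwise a query for a given day will look in the wrong partition.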

65

Time Series Example – reading and writing
INSERT INTO temperature (weatherstation_id, date, event_time, temperature)
VALUES ('1234abcd', '2013-12-11', '2013-12-11 07:01:00', '72F');

SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd'
AND date = '2013-12-11'
AND event_time > '2013-12-11 07:01:00' AND event_time < '2013-12-11 07:04:00';

66

Time Series Example – reverse ordering
•  Common pattern for time series data is rolling storage.
•  For example, we only want to show the last 10 temperature readings and older data is no
longer needed
•  On most DBs you would need some background job to purge the old data.
•  With Cassandra you can use TTLs!
CREATE TABLE temperature (
  weatherstation_id text,
  event_time timestamp,
  date text,
  temperature text,
  PRIMARY KEY ((weatherstation_id, date), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

•  As part of the table definition, WITH CLUSTERING ORDER BY (event_time DESC) is used to order the
data by the most recent first i.e. the data will be returned in this order.


67

Time Series Example – TTL’ing
INSERT INTO temperature (weatherstation_id, date, event_time, temperature)
VALUES ('1234abcd', '2013-12-11', '2013-12-11 07:01:00', '72F') USING TTL 20;

•  This data point will automatically be deleted after 20 seconds.
•  Eventually you will see all the data disappear.
SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd'
AND date = '2013-12-11'
AND event_time > '2013-12-11 07:01:00' AND event_time < '2013-12-11 07:04:00';


68

Time Series Bucket Example – mitigating spikes in data
•  In some situations, there might be a risk that you get an unforeseen volume
of sensor data for the partition key for your row.
•  The risk here is that your row will continue to grow and fill up the node.
•  The workaround here is to attempt to split your data across multiple nodes:
CREATE TABLE temperature (
  weatherstation_id text,
  event_time timestamp,
  date text,
  bucket_id int,
  temperature text,
  PRIMARY KEY ((weatherstation_id, date, bucket_id), event_time)
);


69

Time Series Bucket Example – reading and writing
•  Not so simple to insert. Client needs to generate a bucket id (often a
random number within a certain range):
INSERT INTO temperature (weatherstation_id, date, bucket_id, event_time, temperature)
VALUES ('1234abcd', '2013-12-11', 10, '2013-12-11 07:01:00', '72F');

•  Much more expensive to read.
The client will have to iterate through the range of random numbers,
execute a read for each and then merge and order the data in the client
SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd' AND date = '2013-12-11'
AND bucket_id = 10
AND event_time > '2013-12-11 07:01:00' AND event_time < '2013-12-11 07:04:00';


70

Time Series Bucket Example
•  Only do this as a last resort.
•  Reads become very expensive i.e. n reads, where n is the number of buckets
•  If you’re dealing with large volumes of data it can be hard work for the client
to merge and re-order things.
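The client-side work this implies can be sketched without any driver code: pick a random bucket on write, and on read combine the per-bucket results and re-sort them by event time. Everything named below (the bucket count of 16, the Reading class, the method names) is a hypothetical choice for illustration; the per-bucket lists stand in for the results of one query per bucket_id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Sketch of the client-side bookkeeping for bucketed time series partitions. */
final class BucketSketch {
    static final int BUCKET_COUNT = 16; // hypothetical range of bucket ids

    static final class Reading {
        final long eventTimeMillis;
        final String temperature;

        Reading(long eventTimeMillis, String temperature) {
            this.eventTimeMillis = eventTimeMillis;
            this.temperature = temperature;
        }
    }

    /** Write path: choose a random bucket id in [0, BUCKET_COUNT). */
    static int pickBucket() {
        return ThreadLocalRandom.current().nextInt(BUCKET_COUNT);
    }

    /** Read path: combine the results of one query per bucket and order them by event time. */
    static List<Reading> merge(List<List<Reading>> perBucket) {
        List<Reading> all = new ArrayList<>();
        for (List<Reading> bucket : perBucket) {
            all.addAll(bucket);
        }
        all.sort(Comparator.comparingLong((Reading r) -> r.eventTimeMillis));
        return all;
    }

    public static void main(String[] args) {
        List<Reading> bucket0 = Arrays.asList(new Reading(1000L, "72F"), new Reading(3000L, "74F"));
        List<Reading> bucket1 = Arrays.asList(new Reading(2000L, "73F"));
        for (Reading r : merge(Arrays.asList(bucket0, bucket1))) {
            System.out.println(r.eventTimeMillis + " " + r.temperature);
        }
    }
}
```

The merge holds every row in client memory before sorting, which is part of why the slides describe bucketing as a last resort.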


71

DataStax Native Java Driver


72

Features
•  Provides CQL3 access to Cassandra using Java
•  Utilizes Cassandra’s native protocol
•  Automatic routing of client requests
•  Configurable consistency policy
•  Automatic failover
•  Tracing support
•  Tunable policies
•  Load balancing
•  Reconnection
•  Consistency
•  Queries can be executed synchronously or asynchronously
•  Supports prepared statements
•  Non-blocking I/O

73

Cassandra clients - Drivers
•  DataStax drivers for Cassandra
•  Python
•  C++
•  Java
•  C#
•  And more on the way…
•  http://www.datastax.com/download/clientdrivers


74

Where to get it?
•  The latest release of the driver is available on Maven Central.
•  You can install it in your application using the following Maven dependency:
•  Documentation:
http://www.datastax.com/documentation/developer/java-driver
Javadoc: http://www.datastax.com/drivers/java/apidocs/index.html


75

Native Protocol
•  To use CQL via the client drivers, you must set the property
start_native_transport to true in the cassandra.yaml on every node.
•  This protocol is an extremely efficient way of integrating with Cassandra.
•  Supports synchronous and asynchronous requests
•  Use the corresponding native driver in your app.


76

CQL to Java Mappings
CQL3 Data Type    Java Type
ascii             java.lang.String
bigint            long
blob              java.nio.ByteBuffer
boolean           boolean
counter           long
decimal           java.math.BigDecimal
double            double
float             float
inet              java.net.InetAddress
int               int
list              java.util.List<T>
map               java.util.Map<K, V>
set               java.util.Set<T>
text              java.lang.String
timeuuid          java.util.UUID
uuid              java.util.UUID
varchar           java.lang.String
varint            java.math.BigInteger


77

Connecting to a Cluster
•  The Cluster class is your client app’s entry point for connecting to
Cassandra and getting back its metadata.
Cluster cluster =
    Cluster.builder().addContactPoints("10.158.02.40", "10.158.02.44").build();

•  You can pass in one or many node addresses to connect to.
•  Make sure to tidy up your cluster when you’re finished:
cluster.shutdown();


78

Connecting to a Keyspace
•  After connecting to the cluster, you create a Session on the keyspace you
want to interact with.
Session session = cluster.connect("akeyspace");
•  Make sure to tidy up after yourself:
session.shutdown();


79

Inserting Data
try {
    session.execute("INSERT INTO user (username, password) "
            + "VALUES ('user1', 'user1password');");
    session.execute("INSERT INTO user (username, password) "
            + "VALUES ('user2', 'user2password');");
} catch (NoHostAvailableException ex) {
    // None of the contact points could be reached
    System.out.println("No Host available");
}


80

Reading Data
try {
    ResultSet result = session.execute("SELECT password FROM user "
            + "WHERE username = 'user2';");
    if (result.isExhausted())
        return;
    Row user = result.one();
    System.out.println("Password is: " + user.getString("password"));
} catch (NoHostAvailableException ex) {
    System.out.println("No Host Available");
} catch (QueryValidationException ex) {
    // Raised for syntax errors, invalid queries and missing permissions
    System.out.println("Syntax error, invalid query or not authorized");
}


81

Prepared Statements
PreparedStatement statement = session.prepare(
        "INSERT INTO user (username, password) VALUES (?, ?);");
BoundStatement boundStatement = new BoundStatement(statement);
try {
    session.execute(boundStatement.bind("user4", "user4password"));
} catch (NoHostAvailableException ex) {
    System.out.println("Host Not Available");
} catch (QueryExecutionException ex) {
    // Raised when the query is valid but could not be executed, e.g. not enough replicas
    System.out.println("Requested consistency level not met");
} catch (QueryValidationException ex) {
    // Raised for syntax errors, invalid queries and missing permissions
    System.out.println("Syntax error, invalid query or not authorized");
}


82

Query Builder
Insert insert = QueryBuilder.insertInto("user")
        .value("username", "rcohen")
        .value("password", "mypassword");
session.execute(insert);
Query query = QueryBuilder
        .select()
        .all()
        .from("akeyspace", "user");
ResultSet rs = session.execute(query);
for (Row row : rs) {
    // Two left-aligned, tab-separated columns
    System.out.println(String.format("%-20s\t%-20s",
            row.getString("username"),
            row.getString("password")));
}


83

Consistency Level
SimpleStatement simpleStatement = new SimpleStatement(
        "SELECT * FROM USER WHERE username = 'user2';");
// This will show the default consistency level of ConsistencyLevel.ONE
System.out.println("Consistency Level for this request: "
        + simpleStatement.getConsistencyLevel());
// Now change the consistency level
simpleStatement.setConsistencyLevel(ConsistencyLevel.ALL);

•  You can also set the consistency level using the QueryBuilder
Query insert = QueryBuilder.insertInto("user")
        .value("username", "johnny")
        .value("password", "mypassword")
        .setConsistencyLevel(ConsistencyLevel.ALL);


84

Tracing
•  Tracing can help with debugging or analysing how Cassandra is handling
your queries.
Query insert = QueryBuilder.insertInto("simplex", "songs")
.value("id", UUID.randomUUID())
.value("title", "Golden Brown")
.value("album", "La Folie")
.value("artist", "The Stranglers")
.setConsistencyLevel(ConsistencyLevel.ONE).enableTracing();


85

Tracing
ResultSet results = getSession().execute(insert);
ExecutionInfo executionInfo = results.getExecutionInfo();
•  This ExecutionInfo object contains information on the hosts it attempted to communicate
with, the host it used and a QueryTrace object.
QueryTrace queryTrace = executionInfo.getQueryTrace();
•  With these two objects you can obtain quite a lot of detail on how your query performed


86

Tracing
Connected to cluster: xerxes
Simplex keyspace and schema created.
Host (queried): /127.0.0.1
Host (tried): /127.0.0.1
Trace id: 96ac9400-a3a5-11e2-96a9-4db56cdc5fe7
activity                            | timestamp    | source     | source_elapsed
------------------------------------+--------------+------------+---------------
Parsing statement                   | 12:17:16.736 | /127.0.0.1 |             28
Peparing statement                  | 12:17:16.736 | /127.0.0.1 |            199
Determining replicas for mutation   | 12:17:16.736 | /127.0.0.1 |            348
Sending message to /127.0.0.3       | 12:17:16.736 | /127.0.0.1 |            788
...
Acquiring switchLock read lock      | 12:17:16.736 | /127.0.0.1 |            828
Appending to commitlog              | 12:17:16.736 | /127.0.0.1 |            848
Adding to songs memtable            | 12:17:16.736 | /127.0.0.1 |            900
Message received from /127.0.0.1    | 12:17:16.737 | /127.0.0.2 |             34
...
Enqueuing response to /127.0.0.1    | 12:17:16.737 | /127.0.0.3 |            751
Enqueuing response to /127.0.0.1    | 12:17:16.738 | /127.0.0.2 |            950
...
Processing response from /127.0.0.3 | 12:17:16.738 | /127.0.0.1 |            345
Processing response from /127.0.0.2 | 12:17:16.738 | /127.0.0.1 |            377

87

OpsCenter


88

DataStax OpsCenter
•  DataStax OpsCenter is a browser-based, visual management and monitoring solution for Apache
Cassandra and DataStax Enterprise
•  Functionality is also exposed via HTTP APIs


89

Thank You

We power the big data
apps that transform business.


90
