Introduction to CQL and Data Modeling
Helsinki Cassandra Meetup
10th February 2014

©2014 DataStax Confidential. Do not distribute without consent.
Agenda
•  Introduction
•  CQL Basics
•  Data Modeling
•  Time Series/Sensor Data
•  Java Driver

About me
Johnny Miller
DataStax
Solutions Architect
www.datastax.com
@DataStax

@CyanMiller
https://www.linkedin.com/in/johnnymiller
jmiller@datastax.com
DataStax
•  Founded in April 2010
•  We drive Apache Cassandra™
•  400+ customers (20 of the Fortune 100)
•  200+ employees
•  Home to Apache Cassandra Chair & most committers
•  Headquartered in San Francisco Bay area
•  European headquarters established in London

Our Goal
To be the first and best database choice for online applications

DataStax
•  DataStax supports both the open source community and enterprises.

Open Source/Community
•  Apache Cassandra (employ Cassandra chair and 90+% of the committers)
•  DataStax Community Edition
•  DataStax OpsCenter
•  DataStax DevCenter
•  DataStax Drivers/Connectors
•  Online Documentation
•  Online Training
•  Mailing lists and forums

Enterprise Software
•  DataStax Enterprise Edition
•  Certified Cassandra
•  Built-in Analytics
•  Built-in Enterprise Search
•  Enterprise Security
•  DataStax OpsCenter
•  Expert Support
•  Consultative Help
•  Professional Training

Cassandra Adoption

Source http://db-engines.com/en/ranking, Feb 2014
A sample of Cassandra & DataStax Enterprise users

Why Good Data Modeling is Important
•  Cassandra is a highly available, highly scalable, & highly distributed
database, with no single point of failure
•  To achieve this, Cassandra is optimized for non-relational data models.
•  Joins do not function well on distributed databases.
•  Locking and transactions jam up distributed nodes
•  By modeling data properly for Cassandra you can avoid joins, locking, and
transactions for your application.

CQL Basics
YesCQL

CQL Basics
•  Cassandra Query Language
•  SQL-like language to query Cassandra
•  Limited predicates. Attempts to prevent bad queries, but you can still get into trouble!
•  Keyspace – analogous to a schema.
    •  Has various storage attributes.
    •  The keyspace determines the replication factor (RF).
•  Table – looks like a SQL table.
    •  A table must have a primary key.
    •  We can fully qualify a table as <keyspace>.<table>

DevCenter
•  DataStax DevCenter – a free, visual query tool for creating and running CQL statements against Cassandra
and DataStax Enterprise.

CQL Basics
•  Usual statements
    •  CREATE / DROP / ALTER TABLE
    •  SELECT
BUT
•  INSERT and UPDATE are similar to each other
•  If a row doesn't exist, UPDATE will insert it, and if it exists, INSERT will overwrite the columns it supplies.
•  Think of it as an UPSERT
•  Therefore we never get a key violation
•  For updates, Cassandra never reads the existing row first
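
The upsert behaviour can be pictured with a small in-memory analogy. This is only a sketch, not Cassandra code: the class UpsertSketch and its methods are hypothetical names, and a plain Java map stands in for the table.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory analogy of CQL upsert semantics (hypothetical class, not a Cassandra API). */
final class UpsertSketch {
    // Primary key -> row, where a row is column name -> value.
    private final Map<String, Map<String, Object>> rows = new HashMap<>();

    /**
     * Stands in for both INSERT and UPDATE: the row is created if it is
     * missing, otherwise the supplied columns overwrite the stored ones.
     * No existence check is made, so there is no key violation.
     */
    void write(String primaryKey, Map<String, ?> columns) {
        rows.computeIfAbsent(primaryKey, k -> new HashMap<>()).putAll(columns);
    }

    Map<String, Object> read(String primaryKey) {
        return rows.get(primaryKey);
    }

    int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        UpsertSketch table = new UpsertSketch();
        // An "UPDATE" against a missing row creates it.
        table.write("Springfield", Map.of("population", 30000));
        // An "INSERT" against an existing row overwrites the given columns.
        table.write("Springfield", Map.of("population", 30720, "elevation", 180));
        System.out.println(table.rowCount() + " row stored: " + table.read("Springfield"));
    }
}
```

Both calls succeed and leave a single row behind, which is the point of the upsert analogy.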

Creating a keyspace - Single Data Centre Consistency

Creating a keyspace - Multiple Data Centre Consistency

CQL Basics – creating a table
CREATE TABLE cities (
    city_name  varchar,
    elevation  int,
    population int,
    latitude   float,
    longitude  float,
    PRIMARY KEY (city_name)
);

•  We can visualize it this way:

•  city_name is the partition key
•  In this example, the partition key = primary key
CQL Basics – Composite Primary Key
The Primary Key
•  The key uniquely identifies a row.
•  A composite primary key consists of:
•  A partition key
•  One or more clustering columns
e.g. PRIMARY KEY (partition key, cluster columns, ...)

•  The partition key determines on which node the partition resides
•  Data is ordered in cluster column order within the partition
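
The two roles of a composite primary key can be sketched in plain Java. This is a toy model under stated assumptions: PartitionSketch, insert, rowsOf and nodeFor are hypothetical names, and the modulo placement is only illustrative (Cassandra uses a partitioner over a token ring, not a simple modulo).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of partition key + clustering column (hypothetical class). */
final class PartitionSketch {
    // Partition key -> rows of that partition, kept sorted by clustering value.
    private final Map<String, TreeMap<String, Integer>> partitions = new HashMap<>();

    void insert(String partitionKey, String clusteringValue, int payload) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>())
                  .put(clusteringValue, payload);
    }

    /** Clustering values of one partition, in clustering order. */
    List<String> rowsOf(String partitionKey) {
        TreeMap<String, Integer> rows = partitions.get(partitionKey);
        if (rows == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(rows.keySet());
    }

    /** Illustrative placement only: every row of a partition maps to the same node. */
    static int nodeFor(String partitionKey, int nodeCount) {
        return Math.floorMod(partitionKey.hashCode(), nodeCount);
    }

    public static void main(String[] args) {
        PartitionSketch table = new PartitionSketch();
        table.insert("Mighty Mutts", "Lucky", 7);
        table.insert("Mighty Mutts", "Buddy", 32);
        table.insert("Mighty Mutts", "Felix", 90);
        System.out.println(table.rowsOf("Mighty Mutts"));
        System.out.println("node " + nodeFor("Mighty Mutts", 4));
    }
}
```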

CQL Basics – Composite Primary Key
CREATE TABLE sporty_league (
    team_name   varchar,
    player_name varchar,
    jersey      int,
    PRIMARY KEY (team_name, player_name)
);

CQL Basics – Simple Select
SELECT * FROM sporty_league;

•  More than a few rows can be slow. (Limited to 10,000 rows by default)
•  Use the LIMIT keyword to choose fewer or more rows

CQL Basics - Simple Select on Partition Key and Cluster Columns
SELECT * FROM sporty_league WHERE team_name = 'Mighty Mutts';

SELECT * FROM sporty_league WHERE team_name = 'Mighty Mutts'
AND player_name = 'Lucky';

CQL Basics – Insert/Update
INSERT INTO sporty_league (team_name, player_name, jersey)
VALUES ('Mighty Mutts', 'Felix', 90);

CQL Basics - Ordering
•  Partition keys are not ordered, but the cluster columns are.
•  However, you can only order by a column if it's a cluster column.
•  Data will be returned by default in the order of the clustering column.
•  You can also use the ORDER BY keyword – but only on the clustering column!
SELECT * FROM sporty_league
WHERE team_name = 'Mighty Mutts'
ORDER BY player_name DESC;

CQL Basics – Group By
•  We have already done this!
•  The partition key effectively names the columns for grouping.
•  The previous table contained all of the players grouped by their
team_name.

CQL Basics - Predicates
•  On the partition key: = and IN
•  On the cluster columns: <, <=, =, >=, >, IN

CQL Basics – Composite Partition Key
CREATE TABLE cities (
    city_name varchar,
    state     varchar,
    PRIMARY KEY ((city_name, state))
);

•  Each city gets its own partition!

CQL Basics – Performance considerations
•  The best queries are in a single partition.
i.e. WHERE partition key = <something>
•  Each new partition requires a new disk seek.
•  Queries that span multiple partitions are s-l-o-w
•  Queries that span multiple cluster columns are fast

CQL Basics – Authentication and Authorisation
•  CQL supports creating users and granting them access to tables etc.
•  You need to enable authentication in the cassandra.yaml config file.
•  You can create, alter, drop and list users
•  You can then GRANT permissions to users accordingly – ALTER, AUTHORIZE, DROP, MODIFY, SELECT.

CQL Basics - Tracing
•  You can turn tracing on or off for queries with the TRACING ON | OFF command.
•  This can help you understand what Cassandra is doing and identify any performance problems.
•  http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2

CQL Basics – TTL
•  Expiring Columns, or Time to Live (TTL)
INSERT INTO users (id, first, last)
VALUES ('abc123', 'abe', 'lincoln') USING TTL 3600;
// Expires data in one hour

CQL Basics – Data Types

CQL Basics – Data Types: Collections
•  CQL supports having columns that contain collections of data.
•  The collection types include:
•  Set, List and Map.

CREATE TABLE collections_example (
    id           int PRIMARY KEY,
    set_example  set<text>,
    list_example list<text>,
    map_example  map<int, text>
);

•  These data types are intended to support the type of 1-to-many relationships that can be modeled in a relational DB e.g. a user has many email addresses.
•  Some performance considerations around collections.
    •  Requires serialization so don't go crazy!
    •  Often more efficient to denormalise further rather than use collections if intending to store lots of data.
    •  Favour sets over lists – lists are not very performant

CQL Basics – Data Types: Counters
•  Stores a number that incrementally counts the occurrences of a particular
event or process.
UPDATE UserActions SET total = total + 2
WHERE user = 123 AND action = 'xyz';

CQL Basics - Lightweight Transactions
•  Introduced in Cassandra 2.0
•  DSE 4 will include Cassandra 2.0 (due soon…)
•  DSE 3.2 (current version) is using Cassandra 1.2
•  Uses the Paxos consensus protocol to obtain an agreement across the cluster.
•  Example:
INSERT INTO customer_account (customerID, customer_email)
VALUES ('LauraS', 'lauras@gmail.com')
IF NOT EXISTS;

UPDATE customer_account SET customer_email = 'laurass@gmail.com'
WHERE customerID = 'LauraS'
IF customer_email = 'lauras@gmail.com';

•  Great for 1% of your application – but not recommended to be used too much!
•  Eventual consistency is your friend:
http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
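
The two conditional statements behave like compare-and-set operations. As a rough single-JVM analogy only (Cassandra reaches the same outcome across replicas through Paxos, which this sketch does not model), the JDK's ConcurrentMap offers the same two primitives. ConditionalWriteSketch and its method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Single-process analogy of IF NOT EXISTS / IF <condition> (hypothetical class). */
final class ConditionalWriteSketch {
    // customerID -> customer_email
    private final ConcurrentMap<String, String> emailByCustomer = new ConcurrentHashMap<>();

    /** Like INSERT ... IF NOT EXISTS: applied only when no row is present. */
    boolean insertIfNotExists(String customerId, String email) {
        return emailByCustomer.putIfAbsent(customerId, email) == null;
    }

    /** Like UPDATE ... IF customer_email = expected: applied only on a match. */
    boolean updateIfEquals(String customerId, String expected, String replacement) {
        return emailByCustomer.replace(customerId, expected, replacement);
    }

    String emailOf(String customerId) {
        return emailByCustomer.get(customerId);
    }

    public static void main(String[] args) {
        ConditionalWriteSketch accounts = new ConditionalWriteSketch();
        System.out.println(accounts.insertIfNotExists("LauraS", "lauras@gmail.com"));
        System.out.println(accounts.insertIfNotExists("LauraS", "other@gmail.com"));
        System.out.println(accounts.updateIfEquals("LauraS", "lauras@gmail.com", "laurass@gmail.com"));
    }
}
```

The second insert is rejected because the row already exists, which is the behaviour a plain upsert cannot give you.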

Data Modeling

Query based and denormalised

Cassandra is not a relational database
•  Cassandra doesn’t work the same way as an RDBMS
•  Your data modeling approach won’t work the same way either
•  No foreign keys
•  No joins

Query-Driven Data Modeling
•  Start by addressing the queries that you will need to answer
•  Your data model should map directly onto those queries
•  Think about:
•  The actions your application needs to perform
•  How you want to access the data
•  What are the use cases?
•  What does the data look like?

Query-Driven Data Modeling contd.
•  What are you trying to retrieve?
•  Does it need to be ordered?
•  Is there any nesting of data?
•  Do you need to group data?
•  Do you need to filter data?
•  Does data expire?
•  Does data need to be retrieved in chronological order?

Denormalisation
•  Combine table columns into a single view, i.e. a materialized view
•  We have to create a table that stores all the data that would be in the view
•  Remember - no joins in Cassandra!
Advantage:
•  Having the data stored in this manner greatly improves performance
•  Less seeking
•  Less network traffic
Disadvantage:
•  Data duplication
•  Different tables for different queries
•  You will use more disk space – but disks are cheap!

Avoid client-side joins
•  What is a client-side join?
•  Querying a table from Cassandra
•  Using the results from the first query to query a second table
•  Why avoid?
•  Degrades performance i.e. more I/O, seeks and traffic

Don’t be scared of writes
•  Cassandra is the fastest DB there is for writes.
•  Writing to multiple tables is not going to be slow!
•  3,000-5,000 writes/second/core, e.g. 8 core server = 24k-40k writes per second!
•  < 1ms typical for most writes (varies based on hardware)

Performance
"In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput."
Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl, et al., August 2013, p. 10.
Benchmark paper presented at the Very Large Database Conference, 2013.
http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2013.pdf

Netflix Cloud Benchmark…
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalabilityon.html

End Point independent NoSQL Benchmark
•  Highest in throughput vs MongoDB and HBase
•  Lowest in latency vs MongoDB and HBase
http://www.datastax.com/wp-content/uploads/2013/02/WP-Benchmarking-Top-NoSQLDatabases.pdf

One-to-many
•  Relationship without being relational
•  Example – Users have many videos
•  Wait? Where is the foreign key?

One-to-Many
CREATE TABLE videos (
    videoid     uuid,
    videoname   varchar,
    username    varchar,
    description varchar,
    tags        varchar,
    upload_date timestamp,
    PRIMARY KEY (videoid)
);

•  Static table to store videos
•  UUID for unique video id
•  Add username to denormalize

CREATE TABLE username_video_index (
    username    varchar,
    videoid     uuid,
    upload_date timestamp,
    video_name  varchar,
    PRIMARY KEY (username, videoid)
);

•  Lookup video by username

SELECT video_name FROM username_video_index
WHERE username = 'tcodd' AND videoid = '99051fe9'

Write in two tables at once for fast lookups

Many-to-many
•  Example - users and videos have many comments.

Many-to-many
•  Model both sides of the view
•  Insert both when comment is created
•  Materialized views from either side
CREATE TABLE comments_by_user (
    username   varchar,
    videoid    uuid,
    comment_ts timestamp,
    comment    varchar,
    PRIMARY KEY (username, videoid)
);

CREATE TABLE comments_by_video (
    videoid    uuid,
    username   varchar,
    comment_ts timestamp,
    comment    varchar,
    PRIMARY KEY (videoid, username)
);

DON'T BE AFRAID OF WRITES

Partition Key is not the same as a Primary Key
•  Within a table, a row is referenced by a partition key
•  This is either your primary key or the first part of a compound primary
key
Similarities
•  Partition key identifies a partition as being separate from other partitions
•  Must be unique within a table
Differences
•  Inserting a new record with a partition key that already exists doesn’t do
what you’re used to in a RDBMS i.e. No primary key violations
•  An INSERT using an existing partition key is allowed
•  As a consequence, INSERT and UPDATE act in the same way i.e. UPSERT
How to avoid UPSERTS
•  Guarantee that your primary keys are unique from one another
•  Use an appropriate natural key based on your data
•  Use a surrogate key for partition key
Risks with natural keys
•  Depending on the type of natural key that is used, there may still be an
increased risk of UPSERTs
•  Changing the datum used for a Natural Key requires a lot of overhead.
•  So why not use a sequence to generate a surrogate key?
•  You can't – Cassandra doesn't provide sequences!

What, no sequences?
•  Sequences are a handy feature in RDBMS for auto-creation of IDs for your data.
•  Guaranteed unique
•  E.g. INSERT INTO user (id, firstName, LastName) VALUES (seq.nextVal(), 'Ted', 'Codd')
•  Cassandra has no sequences!
•  Extremely difficult in a masterless distributed system
•  Requires a lock (perf killer)
•  What to do?
•  Use part of the data to create a unique key
•  Use a UUID

UUID
•  Universally Unique Identifier
•  128-bit number represented in character form e.g. 99051fe9-6a9c-46c2-b949-38ef78858dd0
•  Easily generated on the client
•  Version 1 has a timestamp component
•  Version 4 has no timestamp component
    •  Faster to generate
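
Generating a surrogate key on the client can be as small as the sketch below. It uses the JDK's java.util.UUID, whose randomUUID() produces version 4 UUIDs; the JDK has no generator for time-based version 1 UUIDs, so that variant is not shown. UuidSketch and newSurrogateKey are hypothetical names.

```java
import java.util.UUID;

/** Client-side surrogate key generation (hypothetical class name). */
final class UuidSketch {
    /** Returns a random (version 4) UUID; no coordination with the cluster is needed. */
    static UUID newSurrogateKey() {
        return UUID.randomUUID();
    }

    public static void main(String[] args) {
        UUID id = newSurrogateKey();
        System.out.println(id + " (version " + id.version() + ")");

        // The sample value from the slide parses as a version 4 UUID as well.
        UUID sample = UUID.fromString("99051fe9-6a9c-46c2-b949-38ef78858dd0");
        System.out.println(sample.version());
    }
}
```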

Indexing
•  This gives you fast access to data
•  Secondary indexes != relational indexes

Adding an Index to a table
•  If we want to query on a column that is not part of the PK, we can create an index:
CREATE INDEX ON <table>(<column>);
•  Then you can do a select:
SELECT * FROM product WHERE type = 'PC';
•  Avoid doing this
•  Not great for performance (although improvements are being made)
•  Much more efficient to model your data around the query i.e. roll your own indexes!!

Keyword index example
•  Using the previous video example, users want to tag videos.
•  Video table defined as:

CREATE TABLE videos (
    videoid     uuid,
    videoname   varchar,
    username    varchar,
    description varchar,
    tags        varchar,
    upload_date timestamp,
    PRIMARY KEY (videoid)
);

•  Now we can define an index for tagging videos

CREATE TABLE video_tag_index (
    tag       varchar,
    videoid   uuid,
    timestamp timestamp,
    PRIMARY KEY (tag, videoid)
);

Fast. Efficient.
Partial word index example
•  Table:
CREATE TABLE email_index (
    domain   varchar,
    user     varchar,
    username varchar,
    PRIMARY KEY (domain, user)
);

•  User: jmiller, Email: jmiller@datastax.com
INSERT INTO email_index (domain, user, username)
VALUES ('@datastax.com', 'jmiller', 'jmiller');
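
Before writing to email_index the client has to split the address into its two key parts. A minimal sketch of that step, assuming the domain is stored with its leading '@' as in the INSERT above; EmailIndexSketch and toIndexKey are hypothetical names.

```java
/** Splits an email address into the key parts of email_index (hypothetical helper). */
final class EmailIndexSketch {
    /**
     * Returns {domain, user}, where domain keeps its leading '@'.
     * Throws IllegalArgumentException when either part would be empty.
     */
    static String[] toIndexKey(String email) {
        int at = email.lastIndexOf('@');
        if (at <= 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("not an email address: " + email);
        }
        return new String[] { email.substring(at), email.substring(0, at) };
    }

    public static void main(String[] args) {
        String[] key = toIndexKey("jmiller@datastax.com");
        System.out.println("domain=" + key[0] + " user=" + key[1]);
    }
}
```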

Bitmap index
•  Multiple parts to a key
•  Create a truth table of the various combinations
•  However, inserts == the number of combinations
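
The cost noted in the last bullet is easy to see by enumerating the combinations. The sketch below lists every non-empty subset of the key parts, writing an empty string for each part that is left out; for n parts that gives 2^n - 1 rows to insert. BitmapIndexSketch and combinations are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates the key combinations a bitmap-style index needs (hypothetical class). */
final class BitmapIndexSketch {
    /** Every non-empty combination of the attributes, with "" for omitted parts. */
    static List<String[]> combinations(String... attributes) {
        int n = attributes.length;
        List<String[]> result = new ArrayList<>();
        // Bit i of mask decides whether attribute i is kept; mask 0 (nothing kept) is skipped.
        for (int mask = 1; mask < (1 << n); mask++) {
            String[] row = new String[n];
            for (int i = 0; i < n; i++) {
                row[i] = (mask & (1 << i)) != 0 ? attributes[i] : "";
            }
            result.add(row);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String[] row : combinations("Ford", "Mustang", "Blue")) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

With three attributes the loop yields seven rows, one per query shape the index is meant to answer.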

Bitmap index example
•  Find a car in a car park by variable combinations

Bitmap index example – Table definition
•  Make a table with a three-part partition key (make, model, colour)
CREATE TABLE car_location_index (
    make       varchar,
    model      varchar,
    colour     varchar,
    vehicle_id int,
    lot_id     int,
    PRIMARY KEY ((make, model, colour), vehicle_id)
);

Bitmap index example – Adding records
•  We are pre-optimizing for 7 possible queries of the index on insert.
1.  INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
    VALUES ('Ford', 'Mustang', 'Blue', 1234, 8675309);
2.  INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
    VALUES ('Ford', 'Mustang', '', 1234, 8675309);
3.  INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
    VALUES ('Ford', '', 'Blue', 1234, 8675309);
4.  INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
    VALUES ('Ford', '', '', 1234, 8675309);
5.  INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
    VALUES ('', 'Mustang', 'Blue', 1234, 8675309);
6.  INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
    VALUES ('', 'Mustang', '', 1234, 8675309);
7.  INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
    VALUES ('', '', 'Blue', 1234, 8675309);

Bitmap - selecting
•  Different queries are now possible:

Time Series/Sensor Data

What is time series data?
•  Sensors
•  CPU, Network Card, Electronic Power Meter, Resource Utilization,
Weather
•  Clickstream data
•  Historical trends
•  Stock Ticker
•  Anything that varies on a temporal basis
•  Top Ten Most Popular Videos

Why Cassandra for time series data?
•  Cassandra is based on the BigTable storage model
•  One row key and lots of (variable) columns
•  Single layout on disk

Time Series Example
•  Storing weather data
•  One weather station
•  Temperature measurement every minute

Time Series Example – query data
•  Weather station id = Locality of single node

Time Series Example - Table
•  Data partitioned by weather station ID and clustered by time
•  Timestamp goes in the clustered column
•  Store the measurement as the non-clustered column(s)
•  Take advantage of partition clustering
CREATE TABLE temperature (
    weatherstation_id text,
    event_time        timestamp,
    temperature       text,
    PRIMARY KEY (weatherstation_id, event_time)
);

Time Series Example
•  Simple to insert:
INSERT INTO temperature (weatherstation_id, event_time, temperature)
VALUES ('1234abcd', '2013-12-11 07:01:00', '72F');

•  Simple to query
SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd'
AND event_time > '2013-04-03 07:01:00'
AND event_time < '2013-04-03 07:04:00';

Time Series Example – Partitioning
•  With the previous table, you can end up with a very large row on 1 partition
i.e. PRIMARY KEY (weatherstation_id, event_time)
•  This would have to fit on 1 node.
•  Cassandra can store 2 billion columns per storage row.
•  The solution is to have a composite partition key to split things up:
CREATE TABLE temperature (
    weatherstation_id text,
    date              text,
    event_time        timestamp,
    temperature       text,
    PRIMARY KEY ((weatherstation_id, date), event_time)
);
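
With this layout the client has to derive the date part of the partition key from each event's timestamp before writing. A minimal sketch of that step, assuming days are cut in UTC and the date column holds a yyyy-MM-dd string; DateBucketSketch and dateBucket are hypothetical names.

```java
import java.time.Instant;
import java.time.ZoneOffset;

/** Derives the 'date' partition key component from an event time (hypothetical helper). */
final class DateBucketSketch {
    /** Returns the UTC calendar day of the event as yyyy-MM-dd (assumption: UTC day boundaries). */
    static String dateBucket(Instant eventTime) {
        return eventTime.atOffset(ZoneOffset.UTC).toLocalDate().toString();
    }

    public static void main(String[] args) {
        Instant eventTime = Instant.parse("2013-12-11T07:01:00Z");
        System.out.println(dateBucket(eventTime));
    }
}
```

All readings taken by one station on the same UTC day then share a partition, and a new partition starts at midnight.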
Time Series Example – reading and writing
•  Simple to insert:
INSERT INTO temperature (weatherstation_id, date, event_time, temperature)
VALUES ('1234abcd', '2013-12-11', '2013-12-11 07:01:00', '72F');

•  Simple to query
SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd'
AND date = '2013-12-11'
AND event_time > '2013-12-11 07:01:00'
AND event_time < '2013-12-11 07:04:00';

Time Series Example – reverse ordering
•  Common pattern for time series data is rolling storage.
•  For example, we only want to show the last 10 temperature readings and older data is no
longer needed
•  On most DBs you would need some background job to purge the old data.
•  With Cassandra you can use TTLs!
CREATE TABLE temperature (
    weatherstation_id text,
    date              text,
    event_time        timestamp,
    temperature       text,
    PRIMARY KEY ((weatherstation_id, date), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

•  As part of the table definition, WITH CLUSTERING ORDER BY (event_time DESC) is used to order the data by the most recent first, i.e. the data will be returned in this order.

Time Series Example – TTL’ing
•  Simple to insert:
INSERT INTO temperature (weatherstation_id, date, event_time, temperature)
VALUES ('1234abcd', '2013-12-11', '2013-12-11 07:01:00', '72F') USING TTL 20;

•  This data point will automatically be deleted after 20 seconds.
•  Eventually you will see all the data disappear.

•  Simple to query
SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd'
AND date = '2013-12-11'
AND event_time > '2013-12-11 07:01:00'
AND event_time < '2013-12-11 07:04:00';

Time Series Bucket Example – mitigating spikes in data
•  In some situations, there might be a risk that you get an unforeseen volume of sensor data for the partition key for your row.
•  The risk here is that your row will continue to grow and fill up the node.
•  The workaround here is to attempt to split your data across multiple nodes:
CREATE TABLE temperature (
    weatherstation_id text,
    date              text,
    bucket_id         int,
    event_time        timestamp,
    temperature       text,
    PRIMARY KEY ((weatherstation_id, date, bucket_id), event_time)
);

Time Series Bucket Example – reading and writing
•  Not so simple to insert. Client needs to generate a bucket id (often a random number within a certain range):
INSERT INTO temperature (weatherstation_id, date, bucket_id, event_time, temperature)
VALUES ('1234abcd', '2013-12-11', 10, '2013-12-11 07:01:00', '72F');

•  Much more expensive to read.
The client will have to iterate through the range of random numbers, execute a read for each and then merge and order the data in the client
SELECT temperature FROM temperature
WHERE weatherstation_id = '1234abcd' AND date = '2013-12-11'
AND bucket_id = 10
AND event_time > '2013-12-11 07:01:00'
AND event_time < '2013-12-11 07:04:00';

Time Series Bucket Example
•  Only do this as a last resort.
•  Reads become very expensive i.e. n reads, where n is the number of buckets
•  If you're dealing with large volumes of data it can be hard work for the client to merge and re-order things.
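
The client-side merge mentioned above can be sketched as follows. It assumes each bucket has already been read into its own list (one query per bucket_id) and only shows the merge and re-order step; BucketMergeSketch, Reading and merge are hypothetical names, and a long stands in for event_time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Merges per-bucket query results back into time order (hypothetical class). */
final class BucketMergeSketch {
    /** One reading; eventTimeMillis stands in for the event_time column. */
    static final class Reading {
        final long eventTimeMillis;
        final String temperature;

        Reading(long eventTimeMillis, String temperature) {
            this.eventTimeMillis = eventTimeMillis;
            this.temperature = temperature;
        }
    }

    /** Concatenates the results of all buckets and sorts them by event time, oldest first. */
    static List<Reading> merge(List<List<Reading>> perBucketResults) {
        List<Reading> merged = new ArrayList<>();
        for (List<Reading> bucket : perBucketResults) {
            merged.addAll(bucket);
        }
        merged.sort(Comparator.comparingLong(r -> r.eventTimeMillis));
        return merged;
    }

    public static void main(String[] args) {
        List<Reading> bucket0 = List.of(new Reading(1000L, "72F"), new Reading(3000L, "74F"));
        List<Reading> bucket1 = List.of(new Reading(2000L, "73F"));
        for (Reading r : merge(List.of(bucket0, bucket1))) {
            System.out.println(r.eventTimeMillis + " " + r.temperature);
        }
    }
}
```

The whole result set is held and sorted in client memory, which is why this pattern gets painful at large volumes.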

DataStax Native Java Driver

Features
•  Provides CQL3 access to Cassandra using Java
•  Utilizes Cassandra’s native protocol
•  Automatic routing of client requests
•  Configurable consistency policy
•  Automatic failover
•  Tracing support
•  Tunable policies
•  Load balancing
•  Reconnection
•  Consistency
•  Queries can be executed synchronously or asynchronously
•  Supports prepared statements
•  Non-blocking I/O
Cassandra clients - Drivers
•  DataStax drivers for Cassandra
•  Python
•  C++
•  Java
•  C#
•  And more on the way…
•  http://www.datastax.com/download/clientdrivers

Where to get it?
•  The latest release of the driver is available on Maven Central.
•  You can install it in your application using the following Maven dependency:
•  Documentation: http://www.datastax.com/documentation/developer/java-driver
•  Javadoc: http://www.datastax.com/drivers/java/apidocs/index.html

Native Protocol
•  To use CQL via the client drivers, you must set the property
start_native_transport to true in the cassandra.yaml on every node.
•  This protocol is an extremely efficient way of integrating with Cassandra.
•  Supports synchronous and asynchronous requests
•  Use the corresponding native driver in your app.

CQL to Java Mappings
CQL3 Data Type    Java Type
ascii             java.lang.String
bigint            long
blob              java.nio.ByteBuffer
boolean           boolean
counter           long
decimal           java.math.BigDecimal
double            double
float             float
inet              java.net.InetAddress
int               int
list              java.util.List<T>
map               java.util.Map<K, V>
set               java.util.Set<T>
text              java.lang.String
timeuuid          java.util.UUID
uuid              java.util.UUID
varchar           java.lang.String
varint            java.math.BigInteger

Connecting to a Cluster
•  The Cluster class is your client app's entry point for connecting to Cassandra and getting back its metadata.
Cluster cluster = Cluster.builder()
    .addContactPoints("10.158.02.40", "10.158.02.44")
    .build();

•  You can pass in one or many node addresses to connect to.
•  Make sure to tidy up your cluster after you're finished:
cluster.shutdown();

Connecting to a Keyspace
•  After connecting to the cluster, you create a Session on the keyspace you want to interact with.
Session session = cluster.connect("akeyspace");
•  Make sure to tidy up after yourself:
session.shutdown();

Inserting Data
try {
    // Note the trailing space before the concatenated VALUES clause.
    session.execute("INSERT INTO user (username, password) "
        + "VALUES ('user1', 'user1password');");
    session.execute("INSERT INTO user (username, password) "
        + "VALUES ('user2', 'user2password');");
} catch (NoHostAvailableException ex) {
    System.out.println("No Host available");
}

Reading Data
try {
    ResultSet result = session.execute("SELECT password FROM user "
        + "WHERE username = 'user2';");
    // Nothing to print if the query returned no rows.
    if (result.isExhausted())
        return;
    Row user = result.one();
    System.out.println("Password is: " + user.getString("password"));
} catch (NoHostAvailableException ex) {
    System.out.println("No Host Available");
} catch (QueryValidationException ex) {
    System.out.println("Syntax error, invalid query, not authorized");
}

Prepared Statements
PreparedStatement statement = session.prepare(
    "INSERT INTO user (username, password) VALUES (?, ?);");
BoundStatement boundStatement = new BoundStatement(statement);
try {
    session.execute(boundStatement.bind("user4", "user4password"));
} catch (NoHostAvailableException ex) {
    System.out.println("Host Not Available");
} catch (QueryExecutionException ex) {
    // Raised while executing a valid query, e.g. not enough replicas or a timeout.
    System.out.println("Requested consistency level not met");
} catch (QueryValidationException ex) {
    // Raised when the query itself is rejected.
    System.out.println("Syntax error, invalid query, not authorized");
}

Query Builder
Insert insert = QueryBuilder.insertInto("user")
    .value("username", "rcohen")
    .value("password", "mypassword");
session.execute(insert);

Query query = QueryBuilder
    .select()
    .all()
    .from("akeyspace", "user");
ResultSet rs = session.execute(query);
for (Row row : rs) {
    // Two left-aligned, tab-separated columns.
    System.out.println(String.format("%-20s\t%-20s",
        row.getString("username"),
        row.getString("password")));
}

Consistency Level
SimpleStatement simpleStatement = new SimpleStatement(
    "SELECT * FROM USER WHERE username = 'user2';");
// This will show the default consistency level of ConsistencyLevel.ONE
System.out.println("Consistency Level for this request: "
    + simpleStatement.getConsistencyLevel());
// Now change the consistency level
simpleStatement.setConsistencyLevel(ConsistencyLevel.ALL);

•  You can also set the consistency level using the QueryBuilder
Insert insert = QueryBuilder.insertInto("user")
    .value("username", "johnny")
    .value("password", "mypassword");
insert.setConsistencyLevel(ConsistencyLevel.ALL);

Tracing
•  Tracing can help with debugging or analysing how Cassandra is handling
your queries.
Query insert = QueryBuilder.insertInto("simplex", "songs")
.value("id", UUID.randomUUID())
.value("title", "Golden Brown")
.value("album", "La Folie")
.value("artist", "The Stranglers")
.setConsistencyLevel(ConsistencyLevel.ONE).enableTracing();

Tracing
ResultSet results = getSession().execute(insert);
ExecutionInfo executionInfo = results.getExecutionInfo();
•  This ExecutionInfo object contains information on the hosts it attempted to communicate
with, the host it used and a QueryTrace object.
QueryTrace queryTrace = executionInfo.getQueryTrace();
•  With these two objects you can obtain quite a lot of detail on how your query performed.

Tracing
Connected to cluster: xerxes
Simplex keyspace and schema created.
Host (queried): /127.0.0.1
Host (tried): /127.0.0.1
Trace id: 96ac9400-a3a5-11e2-96a9-4db56cdc5fe7

activity                            | timestamp    | source     | source_elapsed
------------------------------------+--------------+------------+---------------
Parsing statement                   | 12:17:16.736 | /127.0.0.1 |             28
Peparing statement                  | 12:17:16.736 | /127.0.0.1 |            199
Determining replicas for mutation   | 12:17:16.736 | /127.0.0.1 |            348
Sending message to /127.0.0.3       | 12:17:16.736 | /127.0.0.1 |            788
Sending message to /127.0.0.2       | 12:17:16.736 | /127.0.0.1 |            805
Acquiring switchLock read lock      | 12:17:16.736 | /127.0.0.1 |            828
Appending to commitlog              | 12:17:16.736 | /127.0.0.1 |            848
Adding to songs memtable            | 12:17:16.736 | /127.0.0.1 |            900
Message received from /127.0.0.1    | 12:17:16.737 | /127.0.0.2 |             34
Message received from /127.0.0.1    | 12:17:16.737 | /127.0.0.3 |             25
Acquiring switchLock read lock      | 12:17:16.737 | /127.0.0.2 |            672
Acquiring switchLock read lock      | 12:17:16.737 | /127.0.0.3 |            525
Appending to commitlog              | 12:17:16.737 | /127.0.0.2 |            692
Appending to commitlog              | 12:17:16.737 | /127.0.0.3 |            541
Adding to songs memtable            | 12:17:16.737 | /127.0.0.2 |            741
Adding to songs memtable            | 12:17:16.737 | /127.0.0.3 |            583
Enqueuing response to /127.0.0.1    | 12:17:16.737 | /127.0.0.3 |            751
Enqueuing response to /127.0.0.1    | 12:17:16.738 | /127.0.0.2 |            950
Message received from /127.0.0.3    | 12:17:16.738 | /127.0.0.1 |            178
Sending message to /127.0.0.1       | 12:17:16.738 | /127.0.0.2 |           1189
Message received from /127.0.0.2    | 12:17:16.738 | /127.0.0.1 |            249
Processing response from /127.0.0.3 | 12:17:16.738 | /127.0.0.1 |            345
Processing response from /127.0.0.2 | 12:17:16.738 | /127.0.0.1 |            377
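
One way to read a trace like the one above is to group its rows by source node and look at the largest source_elapsed value reported by each. The sketch below does this in plain Java with no driver dependency. TraceSummary, TraceEvent and sampleEvents are hypothetical names made up for this illustration, the sample rows are a hand-copied subset of the trace output, and source_elapsed is assumed to be in microseconds.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the DataStax Java driver.
class TraceSummary {

    /** One trace row: activity text, source node, and source_elapsed (assumed microseconds). */
    static final class TraceEvent {
        final String activity;
        final String source;
        final long sourceElapsedMicros;

        TraceEvent(String activity, String source, long sourceElapsedMicros) {
            this.activity = activity;
            this.source = source;
            this.sourceElapsedMicros = sourceElapsedMicros;
        }
    }

    /** Returns the largest source_elapsed seen for each source node, in first-seen order. */
    public static Map<String, Long> maxElapsedBySource(List<TraceEvent> events) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (TraceEvent event : events) {
            // Keep whichever value is larger: the stored one or the new one.
            result.merge(event.source, event.sourceElapsedMicros, Long::max);
        }
        return result;
    }

    /** A subset of the rows from the trace output, copied by hand. */
    public static List<TraceEvent> sampleEvents() {
        return Arrays.asList(
            new TraceEvent("Parsing statement", "/127.0.0.1", 28),
            new TraceEvent("Adding to songs memtable", "/127.0.0.1", 900),
            new TraceEvent("Message received from /127.0.0.1", "/127.0.0.2", 34),
            new TraceEvent("Message received from /127.0.0.1", "/127.0.0.3", 25),
            new TraceEvent("Enqueuing response to /127.0.0.1", "/127.0.0.3", 751),
            new TraceEvent("Sending message to /127.0.0.1", "/127.0.0.2", 1189),
            new TraceEvent("Processing response from /127.0.0.2", "/127.0.0.1", 377));
    }

    public static void main(String[] args) {
        // Print one line per node with its largest source_elapsed value.
        maxElapsedBySource(sampleEvents()).forEach(
            (source, micros) -> System.out.println(source + " -> " + micros));
    }
}
```

With a live cluster, the same summary could be fed from the events of the QueryTrace object instead of hard-coded rows.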
OpsCenter

DataStax OpsCenter
•  DataStax OpsCenter is a browser-based, visual management and monitoring solution for Apache
Cassandra and DataStax Enterprise
•  Functionality is also exposed via HTTP APIs

Thank You

We power the big data
apps that transform business.


More Related Content

What's hot

How jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStaxHow jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStaxDataStax
 
Transforms Document Management at Scale with Distributed Database Solution wi...
Transforms Document Management at Scale with Distributed Database Solution wi...Transforms Document Management at Scale with Distributed Database Solution wi...
Transforms Document Management at Scale with Distributed Database Solution wi...DataStax Academy
 
How to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph dataHow to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph dataDataStax
 
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...DataStax
 
Real-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackReal-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackAnirvan Chakraborty
 
Reporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & CassandraReporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & CassandraDataStax
 
Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6DataStax
 
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...DataStax
 
Announcing Spark Driver for Cassandra
Announcing Spark Driver for CassandraAnnouncing Spark Driver for Cassandra
Announcing Spark Driver for CassandraDataStax
 
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...DataStax
 
Managing Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveManaging Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveTesora
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7DataStax
 
Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...
Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...
Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...DataStax
 
Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016
Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016
Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016DataStax
 
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...DataStax
 
DataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra RockstarDataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra RockstarDataStax
 
Building and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxBuilding and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxDataStax
 
Keeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatKeeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatScyllaDB
 
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownCassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownDataStax
 
DataStax Enterprise in Practice (Field Notes)
DataStax Enterprise in Practice (Field Notes)DataStax Enterprise in Practice (Field Notes)
DataStax Enterprise in Practice (Field Notes)DataStax
 

What's hot (20)

How jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStaxHow jKool Analyzes Streaming Data in Real Time with DataStax
How jKool Analyzes Streaming Data in Real Time with DataStax
 
Transforms Document Management at Scale with Distributed Database Solution wi...
Transforms Document Management at Scale with Distributed Database Solution wi...Transforms Document Management at Scale with Distributed Database Solution wi...
Transforms Document Management at Scale with Distributed Database Solution wi...
 
How to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph dataHow to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph data
 
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
 
Real-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stackReal-time personal trainer on the SMACK stack
Real-time personal trainer on the SMACK stack
 
Reporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & CassandraReporting from the Trenches: Intuit & Cassandra
Reporting from the Trenches: Intuit & Cassandra
 
Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6
 
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
Making Every Drop Count: How i20 Addresses the Water Crisis with the IoT and ...
 
Announcing Spark Driver for Cassandra
Announcing Spark Driver for CassandraAnnouncing Spark Driver for Cassandra
Announcing Spark Driver for Cassandra
 
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
 
Managing Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveManaging Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack Trove
 
Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7Introducing DataStax Enterprise 4.7
Introducing DataStax Enterprise 4.7
 
Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...
Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...
Webinar: Bitcoins and Blockchains - Emerging Financial Services Trends and Te...
 
Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016
Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016
Data Modeling a Scheduling App (Adam Hutson, DataScale) | Cassandra Summit 2016
 
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
There are More Clouds! Azure and Cassandra (Carlos Rolo, Pythian) | C* Summit...
 
DataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra RockstarDataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra Rockstar
 
Building and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxBuilding and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStax
 
Keeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatKeeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter what
 
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd KnownCassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
Cassandra Community Webinar: MySQL to Cassandra - What I Wish I'd Known
 
DataStax Enterprise in Practice (Field Notes)
DataStax Enterprise in Practice (Field Notes)DataStax Enterprise in Practice (Field Notes)
DataStax Enterprise in Practice (Field Notes)
 

Viewers also liked

Marc s01 e02-crud-database
Marc s01 e02-crud-databaseMarc s01 e02-crud-database
Marc s01 e02-crud-databaseMongoDB
 
2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamskyData Con LA
 
Helsinki Cassandra Meetup #2: From Postgres to Cassandra
Helsinki Cassandra Meetup #2: From Postgres to CassandraHelsinki Cassandra Meetup #2: From Postgres to Cassandra
Helsinki Cassandra Meetup #2: From Postgres to CassandraBruno Amaro Almeida
 
Building Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion PipelinesBuilding Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion PipelinesArvind Prabhakar
 
Community Webinar: 15 Commandments of Cassandra DBAs
Community Webinar: 15 Commandments of Cassandra DBAsCommunity Webinar: 15 Commandments of Cassandra DBAs
Community Webinar: 15 Commandments of Cassandra DBAsDataStax
 
Migration Best Practices: From RDBMS to Cassandra without a Hitch
Migration Best Practices: From RDBMS to Cassandra without a HitchMigration Best Practices: From RDBMS to Cassandra without a Hitch
Migration Best Practices: From RDBMS to Cassandra without a HitchDataStax Academy
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakHakka Labs
 

Viewers also liked (8)

Marc s01 e02-crud-database
Marc s01 e02-crud-databaseMarc s01 e02-crud-database
Marc s01 e02-crud-database
 
2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky2014 bigdatacamp asya_kamsky
2014 bigdatacamp asya_kamsky
 
Helsinki Cassandra Meetup #2: From Postgres to Cassandra
Helsinki Cassandra Meetup #2: From Postgres to CassandraHelsinki Cassandra Meetup #2: From Postgres to Cassandra
Helsinki Cassandra Meetup #2: From Postgres to Cassandra
 
Building Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion PipelinesBuilding Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion Pipelines
 
Community Webinar: 15 Commandments of Cassandra DBAs
Community Webinar: 15 Commandments of Cassandra DBAsCommunity Webinar: 15 Commandments of Cassandra DBAs
Community Webinar: 15 Commandments of Cassandra DBAs
 
Migration Best Practices: From RDBMS to Cassandra without a Hitch
Migration Best Practices: From RDBMS to Cassandra without a HitchMigration Best Practices: From RDBMS to Cassandra without a Hitch
Migration Best Practices: From RDBMS to Cassandra without a Hitch
 
CouchDB Vs MongoDB
CouchDB Vs MongoDBCouchDB Vs MongoDB
CouchDB Vs MongoDB
 
Building a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe CrobakBuilding a Data Pipeline from Scratch - Joe Crobak
Building a Data Pipeline from Scratch - Joe Crobak
 

Similar to Helsinki Cassandra Meetup #2: Introduction to CQL3 and DataModeling

Introduction to CQL and Data Modeling with Apache Cassandra
Introduction to CQL and Data Modeling with Apache CassandraIntroduction to CQL and Data Modeling with Apache Cassandra
Introduction to CQL and Data Modeling with Apache CassandraJohnny Miller
 
Building better SQL Server Databases
Building better SQL Server DatabasesBuilding better SQL Server Databases
Building better SQL Server DatabasesColdFusionConference
 
Going native with Apache Cassandra
Going native with Apache CassandraGoing native with Apache Cassandra
Going native with Apache CassandraJohnny Miller
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019Dave Stokes
 
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into CassandraBrent Theisen
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraVictor Coustenoble
 
Slide presentation pycassa_upload
Slide presentation pycassa_uploadSlide presentation pycassa_upload
Slide presentation pycassa_uploadRajini Ramesh
 
xjtrutdctrd5454drxxresersestryugyufy6rythgfytfyt
xjtrutdctrd5454drxxresersestryugyufy6rythgfytfytxjtrutdctrd5454drxxresersestryugyufy6rythgfytfyt
xjtrutdctrd5454drxxresersestryugyufy6rythgfytfytWrushabhShirsat3
 
Presentation slides of Sequence Query Language (SQL)
Presentation slides of Sequence Query Language (SQL)Presentation slides of Sequence Query Language (SQL)
Presentation slides of Sequence Query Language (SQL)Punjab University
 
MySQL: Know more about open Source Database
MySQL: Know more about open Source DatabaseMySQL: Know more about open Source Database
MySQL: Know more about open Source DatabaseMahesh Salaria
 
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014Dave Stokes
 
Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014Antonios Chatzipavlis
 
Cassandra Tutorial
Cassandra Tutorial Cassandra Tutorial
Cassandra Tutorial Na Zhu
 
Build a modern data platform.pptx
Build a modern data platform.pptxBuild a modern data platform.pptx
Build a modern data platform.pptxIke Ellis
 
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...Caserta
 
Ten query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should knowTen query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should knowKevin Kline
 
Is SQLcl the Next Generation of SQL*Plus?
Is SQLcl the Next Generation of SQL*Plus?Is SQLcl the Next Generation of SQL*Plus?
Is SQLcl the Next Generation of SQL*Plus?Zohar Elkayam
 

Similar to Helsinki Cassandra Meetup #2: Introduction to CQL3 and DataModeling (20)

Introduction to CQL and Data Modeling with Apache Cassandra
Introduction to CQL and Data Modeling with Apache CassandraIntroduction to CQL and Data Modeling with Apache Cassandra
Introduction to CQL and Data Modeling with Apache Cassandra
 
Building better SQL Server Databases
Building better SQL Server DatabasesBuilding better SQL Server Databases
Building better SQL Server Databases
 
Going native with Apache Cassandra
Going native with Apache CassandraGoing native with Apache Cassandra
Going native with Apache Cassandra
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into Cassandra
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
BI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache CassandraBI, Reporting and Analytics on Apache Cassandra
BI, Reporting and Analytics on Apache Cassandra
 
Slide presentation pycassa_upload
Slide presentation pycassa_uploadSlide presentation pycassa_upload
Slide presentation pycassa_upload
 
unit-ii.pptx
unit-ii.pptxunit-ii.pptx
unit-ii.pptx
 
xjtrutdctrd5454drxxresersestryugyufy6rythgfytfyt
xjtrutdctrd5454drxxresersestryugyufy6rythgfytfytxjtrutdctrd5454drxxresersestryugyufy6rythgfytfyt
xjtrutdctrd5454drxxresersestryugyufy6rythgfytfyt
 
Presentation slides of Sequence Query Language (SQL)
Presentation slides of Sequence Query Language (SQL)Presentation slides of Sequence Query Language (SQL)
Presentation slides of Sequence Query Language (SQL)
 
MySQL: Know more about open Source Database
MySQL: Know more about open Source DatabaseMySQL: Know more about open Source Database
MySQL: Know more about open Source Database
 
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
MySQL Query Tuning for the Squeemish -- Fossetcon Orlando Sep 2014
 
Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014Columnstore indexes in sql server 2014
Columnstore indexes in sql server 2014
 
Cassandra Tutorial
Cassandra Tutorial Cassandra Tutorial
Cassandra Tutorial
 
Build a modern data platform.pptx
Build a modern data platform.pptxBuild a modern data platform.pptx
Build a modern data platform.pptx
 
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
Big Data Warehousing Meetup: Real-time Trade Data Monitoring with Storm & Cas...
 
Ten query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should knowTen query tuning techniques every SQL Server programmer should know
Ten query tuning techniques every SQL Server programmer should know
 
Is SQLcl the Next Generation of SQL*Plus?
Is SQLcl the Next Generation of SQL*Plus?Is SQLcl the Next Generation of SQL*Plus?
Is SQLcl the Next Generation of SQL*Plus?
 

Recently uploaded

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 

Helsinki Cassandra Meetup #2: Introduction to CQL3 and DataModeling

  • 1. Introduction to CQL and Data Modeling Helsinki Cassandra Meetup 10th February 2014 ©2014 DataStax Confidential. Do not distribute without consent.
  • 2. Agenda
      • Introduction
      • CQL Basics
      • Data Modeling
      • Time Series/Sensor Data
      • Java Driver
  • 3. About me
      Johnny Miller, Solutions Architect, DataStax
      www.datastax.com
      @DataStax, @CyanMiller
      https://www.linkedin.com/in/johnnymiller
      jmiller@datastax.com
  • 4. DataStax
      • Founded in April 2010
      • We drive Apache Cassandra™
      • 400+ customers (20 of the Fortune 100)
      • 200+ employees
      • Home to Apache Cassandra Chair & most committers
      • Headquartered in San Francisco Bay area
      • European headquarters established in London
      Our goal: to be the first and best database choice for online applications
  • 5. DataStax
      • DataStax supports both the open source community and enterprises.
      Open Source/Community
          • Apache Cassandra (employ Cassandra chair and 90+% of the committers)
          • DataStax Community Edition
          • DataStax OpsCenter
          • DataStax DevCenter
          • DataStax Drivers/Connectors
          • Online Documentation
          • Online Training
          • Mailing lists and forums
      Enterprise Software
          • DataStax Enterprise Edition
          • Certified Cassandra
          • Built-in Analytics
          • Built-in Enterprise Search
          • Enterprise Security
          • DataStax OpsCenter
          • Expert Support
          • Consultative Help
          • Professional Training
  • 6. Cassandra Adoption
      Source: http://db-engines.com/en/ranking, Feb 2014
  • 7. A sample of Cassandra & DataStax Enterprise users
  • 8. Why Good Data Modeling is Important
      • Cassandra is a highly available, highly scalable, & highly distributed database, with no single point of failure
      • To achieve this, Cassandra is optimized for non-relational data models.
          • Joins do not function well on distributed databases.
          • Locking and transactions jam up distributed nodes
      • By modeling data properly for Cassandra you can avoid joins, locking, and transactions for your application.
  • 9. CQL Basics: YesCQL
  • 10. CQL Basics
      • Cassandra Query Language
      • SQL-like language to query Cassandra
      • Limited predicates. Attempts to prevent bad queries
          • but you can still get into trouble!
      • Keyspace: analogous to a schema.
          • Has various storage attributes.
          • The keyspace determines the replication factor (RF).
      • Table: looks like a SQL table.
          • A table must have a primary key.
          • We can fully qualify a table as <keyspace>.<table>
  • 11. DevCenter
      • DataStax DevCenter: a free, visual query tool for creating and running CQL statements against Cassandra and DataStax Enterprise.
  • 12. CQL Basics
      • Usual statements
          • CREATE / DROP / ALTER TABLE
          • SELECT
      • BUT: INSERT and UPDATE are similar to each other
          • If a row doesn’t exist, UPDATE will insert it, and if it exists, INSERT will replace it.
          • Think of it as an UPSERT
          • Therefore we never get a key violation
          • For updates, Cassandra never reads
  • 13. Creating a keyspace - Single Data Centre Consistency
  • 14. Creating a keyspace - Multiple Data Centre Consistency
  • 15. CQL Basics – creating a table
        CREATE TABLE cities (
          city_name varchar,
          elevation int,
          population int,
          latitude float,
          longitude float,
          PRIMARY KEY (city_name)
        );
      • city_name is the partition key
      • In this example, the partition key = primary key
  • 16. CQL Basics – Composite Primary Key
      The Primary Key
      • The key uniquely identifies a row.
      • A composite primary key consists of:
          • A partition key
          • One or more clustering columns
        e.g. PRIMARY KEY (partition key, cluster columns, ...)
      • The partition key determines on which node the partition resides
      • Data is ordered in cluster column order within the partition
  • 17. CQL Basics – Composite Primary Key
        CREATE TABLE sporty_league (
          team_name varchar,
          player_name varchar,
          jersey int,
          PRIMARY KEY (team_name, player_name)
        );
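The primary-key rules on the slides above (partition key picks the node, clustering column orders rows inside the partition, INSERT and UPDATE both behave as an upsert) can be mimicked in plain Java. This is only a toy sketch, not how Cassandra is implemented: the class `PartitionModel`, its method names, the jersey numbers and the use of `String.hashCode()` as a stand-in for Cassandra's partitioner are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of a table with PRIMARY KEY (team_name, player_name). */
final class PartitionModel {
    // partition key -> rows of that partition, kept sorted by the clustering column
    private final Map<String, TreeMap<String, Integer>> partitions = new HashMap<>();
    private final int nodeCount;

    PartitionModel(int nodeCount) {
        this.nodeCount = nodeCount;
    }

    /** Stand-in for the partitioner (an assumption): hash the partition key to a node index. */
    int nodeFor(String teamName) {
        return Math.floorMod(teamName.hashCode(), nodeCount);
    }

    /** Models both INSERT and UPDATE: the value is written without reading first. */
    void upsert(String teamName, String playerName, int jersey) {
        partitions.computeIfAbsent(teamName, k -> new TreeMap<>()).put(playerName, jersey);
    }

    /** Models SELECT player_name ... WHERE team_name = ?; rows come back in clustering order. */
    List<String> playersOf(String teamName) {
        TreeMap<String, Integer> rows = partitions.get(teamName);
        return rows == null ? Collections.<String>emptyList() : new ArrayList<>(rows.keySet());
    }

    /** Returns the jersey stored for one row, or null if the row does not exist. */
    Integer jerseyOf(String teamName, String playerName) {
        TreeMap<String, Integer> rows = partitions.get(teamName);
        return rows == null ? null : rows.get(playerName);
    }

    public static void main(String[] args) {
        PartitionModel table = new PartitionModel(3);
        table.upsert("Mighty Mutts", "Lucky", 7);
        table.upsert("Mighty Mutts", "Felix", 90);
        table.upsert("Mighty Mutts", "Felix", 91); // same primary key: overwrites, no key violation
        System.out.println(table.playersOf("Mighty Mutts"));
        System.out.println(table.jerseyOf("Mighty Mutts", "Felix"));
    }
}
```

Keeping each partition in a sorted map is what makes "all players of one team, in name order" a single lookup, which is the access pattern the sporty_league table is designed for.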
  • 18. CQL Basics – Simple Select
        SELECT * FROM sporty_league;
      • More than a few rows can be slow. (Limited to 10,000 rows by default)
      • Use the LIMIT keyword to choose fewer or more rows
  • 19. CQL Basics - Simple Select on Partition Key and Cluster Columns
        SELECT * FROM sporty_league WHERE team_name = 'Mighty Mutts';
        SELECT * FROM sporty_league WHERE team_name = 'Mighty Mutts'
          AND player_name = 'Lucky';
  • 20. CQL Basics – Insert/Update
        INSERT INTO sporty_league (team_name, player_name, jersey)
        VALUES ('Mighty Mutts', 'Felix', 90);
  • 21. CQL Basics - Ordering
      • Partition keys are not ordered, but the cluster columns are.
      • However, you can only order by a column if it’s a cluster column.
      • Data will be returned by default in the order of the clustering column.
      • You can also use the ORDER BY keyword, but only on the clustering column!
        SELECT * FROM sporty_league
        WHERE team_name = 'Mighty Mutts'
        ORDER BY player_name DESC;
  • 22. CQL Basics – Group By
      • We have already done this!
      • The partition key effectively names the columns for grouping.
      • The previous table contained all of the players grouped by their team_name.
  • 23. CQL Basics - Predicates
      • On the partition key: = and IN
      • On the cluster columns: <, <=, =, >=, >, IN
  • 24. CQL Basics – Composite Partition Key
        CREATE TABLE cities (
          city_name varchar,
          state varchar,
          PRIMARY KEY ((city_name, state))
        );
      • Each city gets its own partition!
  • 25. CQL Basics – Performance considerations
      • The best queries are in a single partition, i.e. WHERE partition key = <something>
      • Each new partition requires a new disk seek.
      • Queries that span multiple partitions are s-l-o-w
      • Queries that span multiple cluster columns are fast
  • 26. CQL Basics – Authentication and Authorisation
      • CQL supports creating users and granting them access to tables etc.
      • You need to enable authentication in the cassandra.yaml config file.
      • You can create, alter, drop and list users
      • You can then GRANT permissions to users accordingly: ALTER, AUTHORIZE, DROP, MODIFY, SELECT.
  • 27. CQL Basics - Tracing
      • You can turn tracing on or off for queries with the TRACING ON | OFF command.
      • This can help you understand what Cassandra is doing and identify any performance problems.
      • http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
  • 28. CQL Basics – TTL
      • Expiring Columns, or Time to Live (TTL)
        INSERT INTO users (id, first, last)
        VALUES ('abc123', 'abe', 'lincoln')
        USING TTL 3600;  // Expires data in one hour
  • 29. CQL Basics – Data Types
  • 30. CQL Basics – Data Types: Collections
      • CQL supports having columns that contain collections of data.
      • The collection types include:
          • Set, List and Map.
        CREATE TABLE collections_example (
          id int PRIMARY KEY,
          set_example set<text>,
          list_example list<text>,
          map_example map<int, text>
        );
      • These data types are intended to support the type of 1-to-many relationships that can be modeled in a relational DB, e.g. a user has many email addresses.
      • Some performance considerations around collections.
          • Requires serialization so don’t go crazy!
          • Often more efficient to denormalise further rather than use collections if intending to store lots of data.
          • Favour sets over lists; lists are not very performant
  • 31. CQL Basics – Data Types: Counters
      • Stores a number that incrementally counts the occurrences of a particular event or process.
        UPDATE UserActions SET total = total + 2
        WHERE user = 123 AND action = 'xyz';
  • 32. CQL Basics - Lightweight Transactions
      • Introduced in Cassandra 2.0
          • DSE 4 will include Cassandra 2.0 (due soon…)
          • DSE 3.2 (current version) is using Cassandra 1.2
      • Uses the Paxos consensus protocol to obtain an agreement across the cluster.
      • Example:
        INSERT INTO customer_account (customerID, customer_email)
        VALUES ('LauraS', 'lauras@gmail.com')
        IF NOT EXISTS;
        UPDATE customer_account SET customer_email = 'laurass@gmail.com'
        WHERE customerID = 'LauraS'
        IF customer_email = 'lauras@gmail.com';
      • Great for 1% of your application, but not recommended to be used too much!
      • Eventual consistency is your friend: http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
  • 33. Data Modeling: Query based and denormalised
  • 34. Cassandra is not a relational database
      • Cassandra doesn’t work the same way as an RDBMS
      • Your data modeling approach won’t work the same way either
          • No foreign keys
          • No joins
  • 35. Query-Driven Data Modeling
      • Start by addressing the queries that you will need to answer
      • Your data should be able to match them directly
      • Think about:
          • The actions your application needs to perform
          • How you want to access the data
          • What are the use cases?
          • What does the data look like?
  • 36. Query-Driven Data Modeling contd.
      • What are you trying to retrieve?
          • Does it need to be ordered?
          • Is there any nesting of data?
          • Do you need to group data?
          • Do you need to filter data?
          • Does data expire?
          • Does data need to be retrieved in chronological order?
  • 37. Denormalisation
      • Combine table columns into a single view, i.e. a materialized view
          • we have to create a table that stores all the data that would be in the view
          • Remember: no joins in Cassandra!
      Advantage:
      • Having the data stored in this manner greatly improves performance
          • Less seeking
          • Less network traffic
      Disadvantage:
      • Data duplication
          • different tables for different queries
          • you will use more disk space, but disks are cheap!
  • 38. Avoid client-side joins
      • What is a client-side join?
          • Querying a table from Cassandra
          • Using the results from the first query to query a second table
      • Why avoid?
          • Degrades performance, i.e. more I/O, seeks and traffic
  • 39. Don’t be scared of writes
      • Cassandra is the fastest DB there is for writes.
      • Writing to multiple tables is not going to be slow!
      • 3,000-5,000 writes/second/core, e.g. an 8-core server = 24k-40k writes per second!
      • < 1ms typical for most writes (varies based on hardware)
  • 40. Performance
      “In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.”
      Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl, et al., August 2013, p. 10. Benchmark paper presented at the Very Large Database Conference, 2013.
      http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2013.pdf
      Netflix Cloud Benchmark…
      http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalabilityon.html
      End Point independent NoSQL Benchmark
      • Highest in throughput vs MongoDB and HBase
      • Lowest in latency vs MongoDB and HBase
      http://www.datastax.com/wp-content/uploads/2013/02/WP-Benchmarking-Top-NoSQLDatabases.pdf
  • 41. One-to-many
      • Relationship without being relational
      • Example: users have many videos
      • Wait? Where is the foreign key?
  • 42. One-to-Many
        CREATE TABLE videos (
          videoid uuid,
          videoname varchar,
          username varchar,
          description varchar,
          tags varchar,
          upload_date timestamp,
          PRIMARY KEY (videoid)
        );
        CREATE TABLE username_video_index (
          username varchar,
          videoid uuid,
          upload_date timestamp,
          video_name varchar,
          PRIMARY KEY (username, videoid)
        );
      • Static table to store videos
      • UUID for unique video id
      • Lookup video by username
      • Add username to denormalize
      • Write in two tables at once for fast lookups
        SELECT video_name FROM username_video_index
        WHERE username = 'tcodd' AND videoid = '99051fe9'
  • 43. Many-to-many
      • Example: users and videos have many comments.
  • 44. Many-to-many
      • Model both sides of the view
      • Insert both when a comment is created
      • Materialized views from either side
        CREATE TABLE comments_by_user (
          username varchar,
          videoid uuid,
          comment_ts timestamp,
          comment varchar,
          PRIMARY KEY (username, videoid)
        );
        CREATE TABLE comments_by_video (
          videoid uuid,
          username varchar,
          comment_ts timestamp,
          comment varchar,
          PRIMARY KEY (videoid, username)
        );
      DON’T BE AFRAID OF WRITES
  • 45. Partition Key is not the same as a Primary Key
      • Within a table, a row is referenced by a partition key
      • This is either your primary key or the first part of a compound primary key
      Similarities
      • Partition key identifies a partition as being separate from other partitions
      • Must be unique within a table
      Differences
      • Inserting a new record with a partition key that already exists doesn’t do what you’re used to in an RDBMS, i.e. no primary key violations
      • An INSERT using an existing partition key is allowed
      • As a consequence, INSERT and UPDATE act in the same way, i.e. UPSERT
  • 46. How to avoid UPSERTS
      • Guarantee that your primary keys are unique from one another
          • Use an appropriate natural key based on your data
          • Use a surrogate key for the partition key
      Risks with natural keys
      • Depending on the type of natural key that is used, there may still be an increased risk of UPSERTs
      • Changing the datum used for a natural key requires a lot of overhead.
      • So why not use a sequence to generate a surrogate key?
          • You can’t: Cassandra doesn’t provide sequences!
  • 47. What, no sequences?
      • Sequences are a handy feature in an RDBMS for auto-creation of IDs for your data.
          • Guaranteed unique
          • E.g. INSERT INTO user (id, firstName, LastName) VALUES (seq.nextVal(), 'Ted', 'Codd')
      • Cassandra has no sequences!
          • Extremely difficult in a masterless distributed system
          • Requires a lock (perf killer)
      • What to do?
          • Use part of the data to create a unique key
          • Use a UUID
  • 48. UUID
      • Universally Unique Identifier
      • 128 bit number represented in character form, e.g. 99051fe9-6a9c-46c2-b949-38ef78858dd0
      • Easily generated on the client
      • Version 1 has a timestamp component
      • Version 4 has no timestamp component
          • Faster to generate
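Generating a surrogate key on the client, as the UUID slide suggests, needs nothing beyond the JDK. A minimal sketch follows; the class name `VideoIds` and method `newVideoId` are made up for this example. `java.util.UUID.randomUUID()` produces a version 4 (random) UUID; the JDK itself has no built-in generator for version 1 (time-based) UUIDs.

```java
import java.util.UUID;

/** Client-side surrogate key generation; no coordination with the cluster is required. */
final class VideoIds {
    /** Returns a random (version 4) UUID suitable for a uuid column such as videoid. */
    static UUID newVideoId() {
        return UUID.randomUUID();
    }

    public static void main(String[] args) {
        UUID id = newVideoId();
        // Prints the 36-character text form and the UUID version number.
        System.out.println(id + " (version " + id.version() + ")");
    }
}
```

Because every client can mint IDs independently, no lock or central sequence is involved, which is the property the "no sequences" slide is after.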
  • 49. Indexing
      • This gives you fast access to data
      • Secondary indexes != relational indexes
  • 50. Adding an Index to a table
      • If you want to query on a column that is not part of your PK, you can create an index:
        CREATE INDEX ON <table>(<column>);
      • Then you can do a select:
        SELECT * FROM product WHERE type = 'PC';
      • Avoid doing this
          • Not great for performance (although improvements are being made)
          • Much more efficient to model your data around the query, i.e. roll your own indexes!
  • 51. Keyword index example
      • Now we can define an index for tagging videos
      • Using the previous video example, users want to tag videos.
      • Video table defined as:
        CREATE TABLE videos (
          videoid uuid,
          videoname varchar,
          username varchar,
          description varchar,
          tags varchar,
          upload_date timestamp,
          PRIMARY KEY (videoid)
        );
      • Tag index table:
        CREATE TABLE video_tag_index (
          tag varchar,
          videoid uuid,
          timestamp timestamp,
          PRIMARY KEY (tag, videoid)
        );
      • Fast
      • Efficient
  • 52. Partial word index example
      • Table:
        CREATE TABLE email_index (
          domain varchar,
          user varchar,
          username varchar,
          PRIMARY KEY (domain, user)
        );
      • User: jmiller, Email: jmiller@datastax.com
        INSERT INTO email_index (domain, user, username)
        VALUES ('@datastax.com', 'jmiller', 'jmiller');
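Before writing to email_index, the client has to split the address into the two key parts. A small sketch of that step, assuming (as in the INSERT above) that the domain keeps its leading '@'; the class `EmailIndexKey` and its `split` method are hypothetical helpers, not part of any driver.

```java
/** Splits an email address into the (domain, user) pair used as the email_index primary key. */
final class EmailIndexKey {
    /** Returns {domain, user}; the domain keeps its leading '@', matching the slide's INSERT. */
    static String[] split(String email) {
        int at = email.lastIndexOf('@');
        if (at <= 0 || at == email.length() - 1) {
            throw new IllegalArgumentException("not an email address: " + email);
        }
        return new String[] { email.substring(at), email.substring(0, at) };
    }

    public static void main(String[] args) {
        String[] key = split("jmiller@datastax.com");
        System.out.println("domain=" + key[0] + " user=" + key[1]);
    }
}
```

With the domain as partition key, "all users at @datastax.com" becomes a single-partition query.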
  • 53. Bitmap index
      • Multiple parts to a key
      • Create a truth table of the various combinations
      • However, inserts == the number of combinations
  • 54. Bitmap index example
      • Find a car in a car park by variable combinations
  • 55. Bitmap index example – Table definition
      • Make a table with three different key combinations
        CREATE TABLE car_location_index (
          make varchar,
          model varchar,
          colour varchar,
          vehicle_id int,
          lot_id int,
          PRIMARY KEY ((make, model, colour), vehicle_id)
        );
  • 56. Bitmap index example – Adding records
      • We are pre-optimizing for 7 possible queries of the index on insert.
        1. INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
           VALUES ('Ford', 'Mustang', 'Blue', 1234, 8675309);
        2. INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
           VALUES ('Ford', 'Mustang', '', 1234, 8675309);
        3. INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
           VALUES ('Ford', '', 'Blue', 1234, 8675309);
        4. INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
           VALUES ('Ford', '', '', 1234, 8675309);
        5. INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
           VALUES ('', 'Mustang', 'Blue', 1234, 8675309);
        6. INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
           VALUES ('', 'Mustang', '', 1234, 8675309);
        7. INSERT INTO car_location_index (make, model, colour, vehicle_id, lot_id)
           VALUES ('', '', 'Blue', 1234, 8675309);
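The seven inserts above are every non-empty combination of the three attributes, with an empty string standing for "not specified". A client would normally generate them rather than write them by hand. The sketch below shows one way to do that with a bit mask; the class `BitmapIndexKeys` and method `combinations` are illustrative names, and the order of the generated keys differs from the numbering on the slide.

```java
import java.util.ArrayList;
import java.util.List;

/** Generates the partition-key combinations needed for a bitmap-style index. */
final class BitmapIndexKeys {
    /** Returns every non-empty subset of the attributes, using "" for an omitted attribute. */
    static List<String[]> combinations(String... attributes) {
        List<String[]> keys = new ArrayList<>();
        int n = attributes.length;
        // Each bit of mask decides whether the attribute at that position is kept.
        for (int mask = 1; mask < (1 << n); mask++) {
            String[] key = new String[n];
            for (int i = 0; i < n; i++) {
                key[i] = (mask & (1 << i)) != 0 ? attributes[i] : "";
            }
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        for (String[] key : combinations("Ford", "Mustang", "Blue")) {
            System.out.println("('" + String.join("', '", key) + "')");
        }
    }
}
```

For n attributes this yields 2^n - 1 rows per vehicle, which is the write amplification the "inserts == the number of combinations" bullet warns about.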
  • 57. Bitmap - selecting
      • Different queries are now possible:
  • 58. Time Series/Sensor Data
  • 59. What is time series data?
      • Sensors
          • CPU, Network Card, Electronic Power Meter, Resource Utilization, Weather
      • Clickstream data
      • Historical trends
      • Stock Ticker
      • Anything that varies on a temporal basis
      • Top Ten Most Popular Videos
  • 60. Why Cassandra for time series data?
      • Cassandra is based on the BigTable storage model
      • One row key and lots of (variable) columns
      • Single layout on disk
  • 61. Time Series Example
      • Storing weather data
      • One weather station
      • Temperature measurement every minute
  • 62. Time Series Example – query data
      • Weather station id = Locality of single node
  • 63. Time Series Example - Table
      • Data partitioned by weather station ID and time
      • Timestamp goes in the clustered column
      • Store the measurement as the non-clustered column(s)
      • Take advantage of partition clustering
        CREATE TABLE temperature (
          weatherstation_id text,
          event_time timestamp,
          temperature text,
          PRIMARY KEY (weatherstation_id, event_time)
        );
  • 64. Time Series Example
      • Simple to insert:
        INSERT INTO temperature (weatherstation_id, event_time, temperature)
        VALUES ('1234abcd', '2013-12-11 07:01:00', '72F');
      • Simple to query:
        SELECT temperature FROM temperature
        WHERE weatherstation_id = '1234abcd'
        AND event_time > '2013-04-03 07:01:00' AND event_time < '2013-04-03 07:04:00';
  • 65. Time Series Example – Partitioning
      • With the previous table, you can end up with a very large row on 1 partition, i.e. PRIMARY KEY (weatherstation_id, event_time)
      • This would have to fit on 1 node.
      • Cassandra can store 2 billion columns per storage row.
      • The solution is to have a composite partition key to split things up:
        CREATE TABLE temperature (
          weatherstation_id text,
          date text,
          event_time timestamp,
          temperature text,
          PRIMARY KEY ((weatherstation_id, date), event_time)
        );
  • 66. Time Series Example – reading and writing
      • Simple to insert:
        INSERT INTO temperature (weatherstation_id, date, event_time, temperature)
        VALUES ('1234abcd', '2013-12-11', '2013-12-11 07:01:00', '72F');
      • Simple to query:
        SELECT temperature FROM temperature
        WHERE weatherstation_id = '1234abcd'
        AND date = '2013-12-11'
        AND event_time > '2013-12-11 07:01:00' AND event_time < '2013-12-11 07:04:00';
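With a (weatherstation_id, date) partition key, the client must derive the date value from each reading's timestamp before it writes or reads. A minimal sketch using java.time follows, under the assumption that days are cut in UTC (the slides do not say which time zone defines a day); `DayPartition` and `dateFor` are hypothetical names.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Derives the text value of the date partition-key column from an event timestamp. */
final class DayPartition {
    // ISO yyyy-MM-dd, evaluated in UTC (an assumption; pick the zone your readings use).
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ISO_LOCAL_DATE.withZone(ZoneOffset.UTC);

    /** Returns the day bucket, e.g. the date part of the event time in UTC. */
    static String dateFor(Instant eventTime) {
        return DAY.format(eventTime);
    }

    public static void main(String[] args) {
        System.out.println(dateFor(Instant.parse("2013-12-11T07:01:00Z")));
    }
}
```

A query that spans midnight touches two partitions, so the client would issue one SELECT per day returned by this function.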
  • 67. Time Series Example – reverse ordering
      • A common pattern for time series data is rolling storage.
      • For example, we only want to show the last 10 temperature readings and older data is no longer needed
      • On most DBs you would need some background job to purge the old data.
      • With Cassandra you can use TTLs!
        CREATE TABLE temperature (
          weatherstation_id text,
          date text,
          event_time timestamp,
          temperature text,
          PRIMARY KEY ((weatherstation_id, date), event_time)
        ) WITH CLUSTERING ORDER BY (event_time DESC);
      • As part of the table definition, WITH CLUSTERING ORDER BY (event_time DESC) is used to order the data by the most recent first, i.e. the data will be returned in this order.
  • 68. Time Series Example – TTL’ing
      • Simple to insert:
        INSERT INTO temperature (weatherstation_id, date, event_time, temperature)
        VALUES ('1234abcd', '2013-12-11', '2013-12-11 07:01:00', '72F') USING TTL 20;
      • This data point will automatically be deleted after 20 seconds.
      • Eventually you will see all the data disappear.
      • Simple to query:
        SELECT temperature FROM temperature
        WHERE weatherstation_id = '1234abcd'
        AND date = '2013-12-11'
        AND event_time > '2013-12-11 07:01:00' AND event_time < '2013-12-11 07:04:00';
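The combined effect of descending clustering order and TTL (newest readings first, old readings silently dropping out) can be illustrated with a toy in-memory model. This is a sketch of the observable behaviour only, not of Cassandra's storage engine; the class `RollingReadings`, its methods and the use of plain epoch seconds are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one partition with CLUSTERING ORDER BY (event_time DESC) and per-row TTL. */
final class RollingReadings {
    private static final class Cell {
        final String temperature;
        final long expiresAtSeconds;

        Cell(String temperature, long expiresAtSeconds) {
            this.temperature = temperature;
            this.expiresAtSeconds = expiresAtSeconds;
        }
    }

    // event time (epoch seconds) -> cell, newest event first
    private final TreeMap<Long, Cell> rows =
            new TreeMap<Long, Cell>(Comparator.<Long>reverseOrder());

    /** Models INSERT ... USING TTL ttlSeconds, executed at time nowSeconds. */
    void insert(long eventTimeSeconds, String temperature, long nowSeconds, long ttlSeconds) {
        rows.put(eventTimeSeconds, new Cell(temperature, nowSeconds + ttlSeconds));
    }

    /** Models SELECT temperature ... LIMIT limit at time nowSeconds; expired rows are skipped. */
    List<String> latest(int limit, long nowSeconds) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, Cell> e : rows.entrySet()) {
            if (out.size() == limit) {
                break;
            }
            if (e.getValue().expiresAtSeconds > nowSeconds) {
                out.add(e.getValue().temperature);
            }
        }
        return out;
    }
}
```

In this model a "last 10 readings" view is simply `latest(10, now)`; nothing has to run in the background to purge old rows, which is the point the slides make about TTLs.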
  • 69. Time Series Bucket Example – mitigating spikes in data
      • In some situations, there might be a risk that you get an unforeseen volume of sensor data for the partition key for your row.
      • The risk here is that your row will continue to grow and fill up the node.
      • The workaround here is to attempt to split your data across multiple nodes:
        CREATE TABLE temperature (
          weatherstation_id text,
          date text,
          bucket_id int,
          event_time timestamp,
          temperature text,
          PRIMARY KEY ((weatherstation_id, date, bucket_id), event_time)
        );
  • 70. Time Series Bucket Example – reading and writing
      • Not so simple to insert. The client needs to generate a bucket id (often a random number within a certain range):
        INSERT INTO temperature (weatherstation_id, date, bucket_id, event_time, temperature)
        VALUES ('1234abcd', '2013-12-11', 10, '2013-12-11 07:01:00', '72F');
      • Much more expensive to read. The client will have to iterate through the range of random numbers, execute a read for each and then merge and order the data in the client:
        SELECT temperature FROM temperature
        WHERE weatherstation_id = '1234abcd' AND date = '2013-12-11'
        AND bucket_id = 10
        AND event_time > '2013-12-11 07:01:00' AND event_time < '2013-12-11 07:04:00';
  • 71. Time Series Bucket Example
      • Only do this as a last resort.
      • Reads become very expensive, i.e. n reads, where n is the number of buckets
      • If you’re dealing with large volumes of data it can be hard work for the client to merge and re-order things.
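The client-side work described for bucketed tables (pick a random bucket on write, then merge and re-order the per-bucket results on read) might look roughly like this. It is a sketch under stated assumptions: the bucket count of 4, the class `BucketedReads` and its `Reading` type are invented for illustration, and the per-bucket result lists stand in for the results of one SELECT per bucket_id.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Client-side helpers for a table partitioned by (weatherstation_id, date, bucket_id). */
final class BucketedReads {
    /** Hypothetical bucket count; the slides do not prescribe one. */
    static final int BUCKETS = 4;

    /** One row as returned by a per-bucket query. */
    static final class Reading {
        final long eventTimeMillis;
        final String temperature;

        Reading(long eventTimeMillis, String temperature) {
            this.eventTimeMillis = eventTimeMillis;
            this.temperature = temperature;
        }
    }

    /** Write path: choose a random bucket_id in [0, BUCKETS) for the new row. */
    static int pickBucket() {
        return ThreadLocalRandom.current().nextInt(BUCKETS);
    }

    /** Read path: combine the results of one query per bucket and restore event-time order. */
    static List<Reading> merge(List<List<Reading>> perBucketResults) {
        List<Reading> all = new ArrayList<>();
        for (List<Reading> bucket : perBucketResults) {
            all.addAll(bucket);
        }
        all.sort(Comparator.comparingLong((Reading r) -> r.eventTimeMillis));
        return all;
    }
}
```

The merge step holds every returned row in client memory and sorts it, which is why the slides recommend bucketing only as a last resort.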
  • 72. DataStax Native Java Driver
  • 73. Features
      • Provides CQL3 access to Cassandra using Java
      • Utilizes Cassandra’s native protocol
      • Automatic routing of client requests
      • Configurable consistency policy
      • Automatic failover
      • Tracing support
      • Tunable policies
          • Load balancing
          • Reconnection
          • Consistency
      • Queries can be executed synchronously or asynchronously
      • Supports prepared statements
      • Non-blocking I/O
  • 74. Cassandra clients - Drivers
      • DataStax drivers for Cassandra
          • Python
          • C++
          • Java
          • C#
      • And more on the way…
      • http://www.datastax.com/download/clientdrivers
  • 75. Where to get it?
      • The latest release of the driver is available on Maven Central.
      • You can install it in your application using the following Maven dependency:
      • Documentation: http://www.datastax.com/documentation/developer/java-driver
      • Javadoc: http://www.datastax.com/drivers/java/apidocs/index.html
  • 76. Native Protocol
      • To use CQL via the client drivers, you must set the property start_native_transport to true in the cassandra.yaml on every node.
      • This protocol is an extremely efficient way of integrating with Cassandra.
      • Supports synchronous and asynchronous requests
      • Use the corresponding native driver in your app.
  • 77. CQL to Java Mappings
        CQL3 data type   Java type
        ascii            java.lang.String
        bigint           long
        blob             java.nio.ByteBuffer
        boolean          boolean
        counter          long
        decimal          java.math.BigDecimal
        double           double
        float            float
        inet             java.net.InetAddress
        int              int
        list             java.util.List<T>
        map              java.util.Map<K, V>
        set              java.util.Set<T>
        text             java.lang.String
        timeuuid         java.util.UUID
        uuid             java.util.UUID
        varchar          java.lang.String
        varint           java.math.BigInteger
  • 78. Connecting to a Cluster
      • The Cluster class is your client app’s entry point for connecting to Cassandra and getting back its metadata.
        Cluster cluster = Cluster.builder().addContactPoints("10.158.02.40", "10.158.02.44").build();
      • You can pass in one or many node addresses to connect to.
      • Make sure to tidy up your cluster after you’re finished:
        cluster.shutdown();
  • 79. Connecting to a Keyspace
      • After connecting to the cluster, you create a Session on the keyspace you want to interact with.
        Session session = cluster.connect("akeyspace");
      • Make sure to tidy up after yourself:
        session.shutdown();
  • 80. Inserting Data
        try {
            session.execute(
                "INSERT INTO user (username, password) " +
                "VALUES ('user1', 'user1password');");
            session.execute(
                "INSERT INTO user (username, password) " +
                "VALUES ('user2', 'user2password');");
        } catch (NoHostAvailableException ex) {
            System.out.println("No Host available");
        }
  • 81. Reading Data
        try {
            ResultSet result = session.execute(
                "SELECT password FROM user " +
                "WHERE username = 'user2';");
            if (result.isExhausted()) return;
            Row user = result.one();
            System.out.println("Password is: " + user.getString("password"));
        } catch (NoHostAvailableException ex) {
            System.out.println("No Host Available");
        } catch (QueryValidationException ex) {
            // Raised for syntax errors, invalid queries and unauthorized requests
            System.out.println("Syntax error, invalid query, or not authorized");
        }
  • 82. Prepared Statements
        PreparedStatement statement = session.prepare(
            "INSERT INTO user (username, password) " +
            "VALUES (?, ?);");
        BoundStatement boundStatement = new BoundStatement(statement);
        try {
            session.execute(boundStatement.bind("user4", "user4password"));
        } catch (NoHostAvailableException ex) {
            System.out.println("Host Not Available");
        } catch (QueryExecutionException ex) {
            // Valid query that could not be executed, e.g. timeout or too few replicas
            System.out.println("Requested consistency level not met");
        } catch (QueryValidationException ex) {
            // Query rejected before execution
            System.out.println("Syntax error, invalid query, or not authorized");
        }
  • 83. Query Builder Insert insert = QueryBuilder.insertInto("user”) .value("username", ”rcohen”) .value("password", ”mypassword"); session.execute(insert); Query query = QueryBuilder .select() .all() .from(”akeyspace", "user"); ResultSet rs = session.execute(query); for (Row row : rs) { System.out.println(String.format("%-20st%-20s", row.getString("username"), row.getString("password"))); } ©2014 DataStax Confidential. Do not distribute without consent. 83
  • 84. Consistency Level SimpleStatement simpleStatement = new SimpleStatement ( "SELECT * FROM USER WHERE username = 'user2’;”); // This will show the default consistency level of ConsistencyLevel.ONE System.out.println("Consistency Level for this request: ” +simpleStatement.getConsistencyLevel()); //Now change the consistency level simpleStatement.setConsistencyLevel(ConsistencyLevel.ALL); You can also set the consistency level using the QueryBuilder Insert insert = QueryBuilder.insertInto("user”) .value("username", ”johnny”) .value("password", ”mypassword") setConsistencyLevel(ConsistencyLevel.ALL); ©2014 DataStax Confidential. Do not distribute without consent. 84
  • 85. Tracing
    •  Tracing can help with debugging or analysing how Cassandra is handling your queries.
    Query insert = QueryBuilder.insertInto("simplex", "songs")
        .value("id", UUID.randomUUID())
        .value("title", "Golden Brown")
        .value("album", "La Folie")
        .value("artist", "The Stranglers")
        .setConsistencyLevel(ConsistencyLevel.ONE).enableTracing();
  • 86. Tracing
    ResultSet results = getSession().execute(insert);
    ExecutionInfo executionInfo = results.getExecutionInfo();
    •  This ExecutionInfo object contains information on the hosts the driver attempted to communicate with, the host it used, and a QueryTrace object.
    QueryTrace queryTrace = executionInfo.getQueryTrace();
    •  With these two objects you can obtain quite a lot of detail on how your query performed.
  • 87. Tracing
    Connected to cluster: xerxes
    Simplex keyspace and schema created.
    Host (queried): /127.0.0.1
    Host (tried): /127.0.0.1
    Trace id: 96ac9400-a3a5-11e2-96a9-4db56cdc5fe7
    activity                            | timestamp    | source     | source_elapsed
    ------------------------------------+--------------+------------+---------------
    Parsing statement                   | 12:17:16.736 | /127.0.0.1 |             28
    Preparing statement                 | 12:17:16.736 | /127.0.0.1 |            199
    Determining replicas for mutation   | 12:17:16.736 | /127.0.0.1 |            348
    Sending message to /127.0.0.3       | 12:17:16.736 | /127.0.0.1 |            788
    Sending message to /127.0.0.2       | 12:17:16.736 | /127.0.0.1 |            805
    Acquiring switchLock read lock      | 12:17:16.736 | /127.0.0.1 |            828
    Appending to commitlog              | 12:17:16.736 | /127.0.0.1 |            848
    Adding to songs memtable            | 12:17:16.736 | /127.0.0.1 |            900
    Message received from /127.0.0.1    | 12:17:16.737 | /127.0.0.2 |             34
    Message received from /127.0.0.1    | 12:17:16.737 | /127.0.0.3 |             25
    Acquiring switchLock read lock      | 12:17:16.737 | /127.0.0.2 |            672
    Acquiring switchLock read lock      | 12:17:16.737 | /127.0.0.3 |            525
    Appending to commitlog              | 12:17:16.737 | /127.0.0.2 |            692
    Appending to commitlog              | 12:17:16.737 | /127.0.0.3 |            541
    Adding to songs memtable            | 12:17:16.737 | /127.0.0.2 |            741
    Adding to songs memtable            | 12:17:16.737 | /127.0.0.3 |            583
    Enqueuing response to /127.0.0.1    | 12:17:16.737 | /127.0.0.3 |            751
    Enqueuing response to /127.0.0.1    | 12:17:16.738 | /127.0.0.2 |            950
    Message received from /127.0.0.3    | 12:17:16.738 | /127.0.0.1 |            178
    Sending message to /127.0.0.1       | 12:17:16.738 | /127.0.0.2 |           1189
    Message received from /127.0.0.2    | 12:17:16.738 | /127.0.0.1 |            249
    Processing response from /127.0.0.3 | 12:17:16.738 | /127.0.0.1 |            345
    Processing response from /127.0.0.2 | 12:17:16.738 | /127.0.0.1 |            377
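One way to read a trace like the one above is to look at the largest `source_elapsed` value reported by each node, which gives a rough idea of how long each node spent on the request. The sketch below does this for rows in the `activity | timestamp | source | source_elapsed` layout of the slide. `TraceSummary` is a hypothetical helper written for this example that parses plain text; it does not use the driver's QueryTrace API, and the sample rows are copied from the trace above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; it works on plain text rows.
final class TraceSummary {
    private TraceSummary() {}

    /** Returns the largest source_elapsed value seen for each source node. */
    static Map<String, Integer> maxElapsedBySource(String[] rows) {
        Map<String, Integer> maxBySource = new LinkedHashMap<>();
        for (String row : rows) {
            String[] columns = row.split("\\|");
            if (columns.length != 4) {
                continue; // skip header separators or malformed rows
            }
            String source = columns[2].trim();
            int elapsed = Integer.parseInt(columns[3].trim());
            // Keep the larger of the stored value and the new one
            maxBySource.merge(source, elapsed, Integer::max);
        }
        return maxBySource;
    }

    public static void main(String[] args) {
        // A few rows taken from the trace output on the slide
        String[] rows = {
            "Parsing statement | 12:17:16.736 | /127.0.0.1 | 28",
            "Adding to songs memtable | 12:17:16.736 | /127.0.0.1 | 900",
            "Message received from /127.0.0.1 | 12:17:16.737 | /127.0.0.2 | 34",
            "Sending message to /127.0.0.1 | 12:17:16.738 | /127.0.0.2 | 1189",
            "Enqueuing response to /127.0.0.1 | 12:17:16.737 | /127.0.0.3 | 751"
        };
        maxElapsedBySource(rows).forEach((source, elapsed) ->
            System.out.println(source + " -> " + elapsed));
    }
}
```

Note that `source_elapsed` is measured separately on each node, so values from different sources are not directly comparable as points on a single timeline.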
  • 88. OpsCenter
  • 89. DataStax OpsCenter
    •  DataStax OpsCenter is a browser-based, visual management and monitoring solution for Apache Cassandra and DataStax Enterprise.
    •  Functionality is also exposed via HTTP APIs.
  • 90. Thank You
    We power the big data apps that transform business.