Introduction to Cassandra
Nick Bailey
@nickmbailey

Monday, October 28, 13
Who am I?
©2012 DataStax
What’s DataStax?
On to the good stuff!
Why Cassandra?
Cluster Architecture
Node Architecture
Data Modeling
Wrap up

Why Cassandra?
Time for buzz words!

Big Data!
NoSQL!
Big Data
• Gartner: “...high-volume, high-velocity and high-variety...”
• 2 sides of ‘big data’
  • Analytics
  • Real-time
NoSQL
• A terrible label
• Covers a wide range of DBs
  • Cassandra
  • Redis
  • MongoDB
  • HBase
  • ...
Started by Facebook

Dynamo (Amazon) + Bigtable (Google)

Cassandra is great for...
• Massive, linear scaling (e.g. CERN hadron collider, Barracuda Networks)
• Extremely heavy writes (e.g. BlueMountain Capital – financial tick data)
• High availability (e.g. eBay, Eventbrite, Netflix, SoundCloud, HealthCare Anytime, Comcast, GoDaddy, Sony Entertainment Network)

http://techblog.netflix.com/2012/07/lessons-netflix-learned-from-aws-storm.html
One size does not fit all
Polyglot persistence

More Resources
• PlanetCassandra.org
• Blog
• 5 minute interviews

Cluster Architecture
Data Distribution
[Figure: token ring labeled 0, 25, 50, 75]
Hash_Function(Partition Key) >> Token
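The mapping from partition key to token to node can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 0..99 token space and the tokens 0, 25, 50, 75 come from the slide's ring, while the node names, the use of MD5, and the "first node token at or after the key's token, wrapping around" ownership rule are assumptions made for the sketch. Walking further clockwise to pick replicas models simple, topology-unaware placement and assumes the replication factor does not exceed the node count.

```python
import bisect
import hashlib

# Illustrative ring: tokens 0, 25, 50, 75 as on the slide; node names are hypothetical.
RING = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]
RING_SIZE = 100  # toy token space; real clusters use a far larger one


def token_for(partition_key: str) -> int:
    """Hash_Function(Partition Key) >> Token, here MD5 reduced to the toy ring."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for(token: int, replication_factor: int) -> list:
    """Start at the first node whose token is >= the key's token, then walk clockwise."""
    tokens = [t for t, _ in RING]
    start = bisect.bisect_left(tokens, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(replication_factor)]


def owner_of(token: int) -> str:
    """The first replica is the node that owns the token."""
    return replicas_for(token, 1)[0]


if __name__ == "__main__":
    key = "nickmbailey"
    token = token_for(key)
    print(key, "->", token, "->", replicas_for(token, 2))
```

Because the token depends only on the partition key, any node can compute where a row lives without asking a central directory.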
Replication

Failure Modes

Consistency Level
• Multiple options
  • ONE
  • QUORUM
  • ALL
  • LOCAL_QUORUM
  • ...
• Can be specified per request
Quorum
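As a rough sketch of the arithmetic behind QUORUM: a quorum is a majority of the replicas, and a read is guaranteed to overlap the latest acknowledged write whenever the number of replicas written plus the number read exceeds the replication factor. The function names below are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_replicas: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return write_acks + read_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print("RF", rf, "quorum", q)
    print("QUORUM write + QUORUM read overlap:", read_overlaps_write(rf, q, q))
    print("ONE write + ONE read overlap:", read_overlaps_write(rf, 1, 1))
```

With a replication factor of 3, QUORUM writes and reads tolerate one replica being down while still overlapping; ONE on both sides trades that guarantee for lower latency.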

Consistency
Write
CL: ONE

Consistency
Read
CL: ONE
Failure Types
• UnavailableException
  • Didn’t even try
• TimedOutException
  • Possible success or failure
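The difference between the two failure types can be sketched with a toy coordinator. This is a simplified model, not Cassandra's code: the exception classes, the `coordinate_write` function, and the boolean lists describing replica state are all hypothetical names introduced for the example.

```python
class UnavailableError(Exception):
    """Too few live replicas: the coordinator rejects the request without sending it."""


class TimedOutError(Exception):
    """The request was sent, but too few replicas answered in time."""


def coordinate_write(alive, acks_in_time, required):
    """alive[i]: replica i is believed up; acks_in_time[i]: it would acknowledge in time.

    Returns the indexes of the replicas that acknowledged the write.
    """
    if sum(alive) < required:
        # "Didn't even try": nothing has been written anywhere.
        raise UnavailableError("not enough live replicas for the consistency level")
    sent_to = [i for i, up in enumerate(alive) if up]
    acked = [i for i in sent_to if acks_in_time[i]]
    if len(acked) < required:
        # "Possible success or failure": the write went out and may have been applied.
        raise TimedOutError("too few acknowledgements before the timeout")
    return acked


if __name__ == "__main__":
    print(coordinate_write([True, True, True], [True, True, False], required=2))
```

The practical consequence is that a timed-out write may still become visible later, so a client that retries it should make sure the operation is safe to apply twice.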
Multi DC

Gossip
• Manages cluster state
  • Nodes up/down
  • Nodes joining/leaving
• Decentralized
Snitch
• Responsible for determining cluster topology
• Tracks node responsiveness
• Simple, PropertyFile, Ec2Snitch, etc...

Node Architecture
Write Path
[Figure: Write → Memtable (memory); commit log and SSTable (disk)]
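A minimal in-memory sketch of the write path named in the diagram: each write is appended to the commit log and applied to the memtable, and a full memtable is flushed to an immutable SSTable. The class name, the row-count flush threshold, and clearing the log on flush are simplifying assumptions for the example.

```python
class WritePathNode:
    """Toy single-node model of commit log, memtable, and SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only; on disk in a real node
        self.memtable = {}        # in memory
        self.sstables = []        # immutable once written; on disk in a real node
        self.memtable_limit = memtable_limit  # assumed threshold, counted in keys

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value            # then the in-memory structure
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out sorted by key, then start a fresh memtable.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()   # flushed data no longer needs the log


if __name__ == "__main__":
    node = WritePathNode(memtable_limit=2)
    node.write("b", 2)
    node.write("a", 1)
    print(node.sstables, node.memtable, node.commit_log)
```

Nothing on this path reads from disk or updates data in place, which is the usual explanation for why writes are cheap.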
Read Path
[Figure: Read → Memtable (memory); SSTables (disk)]
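The read side of the diagram can be sketched the same way: a read consults the memtable and every SSTable that may hold the key, and the value with the newest write timestamp wins. The function name and the `(timestamp, value)` tuple layout are assumptions for the example; real nodes also use structures such as bloom filters to skip SSTables, which this sketch leaves out.

```python
def read_key(key, memtable, sstables):
    """Each store maps key -> (write_timestamp, value). Return the newest value or None."""
    candidates = []
    if key in memtable:
        candidates.append(memtable[key])
    for sstable in sstables:
        if key in sstable:
            candidates.append(sstable[key])
    if not candidates:
        return None
    # Last write wins: pick the candidate with the highest timestamp.
    return max(candidates, key=lambda cell: cell[0])[1]


if __name__ == "__main__":
    memtable = {"a": (3, "new")}
    sstables = [{"a": (1, "old"), "b": (2, "bee")}, {"a": (2, "mid")}]
    print(read_key("a", memtable, sstables))
    print(read_key("b", memtable, sstables))
```

Since a read may have to merge several SSTables, data models that keep a query inside one partition tend to read faster.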
Data Modeling
CQL
Cassandra Query Language

Terminology
• Keyspace
• Table (Column Family)
• Row
• Column
• Partition Key
• Clustering Key (Optional)

For Example:
-- Either: SimpleStrategy with a single replication factor
CREATE KEYSPACE packagetracker WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
-- Or: NetworkTopologyStrategy with a replication factor per data center
CREATE KEYSPACE packagetracker WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 2, 'dc2' : 2 };
CREATE TABLE events (
    package_id text,
    status_timestamp timestamp,
    location text,
    notes text,
    PRIMARY KEY (package_id, status_timestamp)
);
Constructs

Basic Data Types
• blob
• int
• text
• bigint (64-bit long)
• uuid
• etc

More Data Modeling Constructs
• Collections
  • map, set, list
• Time to live (TTL)
• Counters
• Secondary Indexes
Approaching Data Modeling
• Model your queries, not your data
  • Optimize your data model for reads
• Don’t be afraid to denormalize
• You will get it wrong, iterate
An Example:
User Logins

The Query
What are the last 10 locations nickmbailey logged in from?
SELECT time, location FROM logins WHERE user = 'nickmbailey' ORDER BY time DESC LIMIT 10;

Partition Key: user
Clustering Key: time
Additional Columns: location

CREATE TABLE logins (
    user text,
    time timestamp,
    location text,
    PRIMARY KEY (user, time));
Partition key: User. Primary key: (User, Time).

 User        | Time                | Location
-------------+---------------------+----------------------
 nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
 nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
 jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia
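One way to picture why this query is cheap is to model the table as a map from partition key to rows kept sorted by clustering key. The sketch below is an in-memory analogy only; the `LoginsTable` class and its methods are hypothetical and not part of any driver.

```python
import bisect
from collections import defaultdict
from datetime import datetime


class LoginsTable:
    """Toy model of PRIMARY KEY (user, time): one sorted row list per user."""

    def __init__(self):
        self._partitions = defaultdict(list)  # user -> [(time, location), ...] ascending

    def insert(self, user, time, location):
        rows = self._partitions[user]
        index = bisect.bisect_left(rows, (time,))
        if index < len(rows) and rows[index][0] == time:
            rows[index] = (time, location)       # same primary key: overwrite (upsert)
        else:
            rows.insert(index, (time, location))  # keep rows ordered by clustering key

    def last_logins(self, user, limit=10):
        """Equivalent of WHERE user = ? ORDER BY time DESC LIMIT ?."""
        rows = self._partitions.get(user, [])
        return rows[::-1][:limit]


if __name__ == "__main__":
    table = LoginsTable()
    table.insert("nickmbailey", datetime(2013, 7, 19, 14, 49, 27), "Blacksburg, Virginia")
    table.insert("nickmbailey", datetime(2013, 7, 19, 9, 22, 18), "Austin, Texas")
    table.insert("jsmith", datetime(2013, 7, 20, 7, 59, 34), "Atlanta, Georgia")
    for time, location in table.last_logins("nickmbailey"):
        print(time, location)
```

The query touches a single partition and reads a contiguous slice of already-sorted rows, which is the access pattern the primary key was chosen for.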
Time-series data
• By far, the most common data model
• Event logs
• Metrics
• Sensor Data
• Etc

Another Query
When was the last time nickmbailey logged in from San Francisco, California?
SELECT time FROM logins WHERE user = 'nickmbailey' AND location = 'San Francisco, California';

 User        | Time                | Location
-------------+---------------------+---------------------------
 nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
 nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
 nickmbailey | 2013-07-19 14:49:27 | Austin, Texas
 nickmbailey | 2013-05-19 14:49:27 | Austin, Texas
 nickmbailey | 2013-04-19 14:49:27 | San Francisco, California
 ...         | ...                 | ...
 jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia
Another Query
When was the last time nickmbailey logged in from San Francisco, California?
SELECT time FROM logins_by_location WHERE user = 'nickmbailey' AND location = 'San Francisco, California';
CREATE TABLE logins_by_location (
    user text,
    time timestamp,
    location text,
    PRIMARY KEY (user, location));

 User        | Location                  | Time
-------------+---------------------------+---------------------
 nickmbailey | Austin, Texas             | 2013-07-19 09:22:18
 nickmbailey | Blacksburg, Virginia      | 2013-07-19 14:49:27
 nickmbailey | San Francisco, California | 2013-07-19 14:49:27
Denormalize
• Create materialized views of the same data to support different queries
• Storage space is cheap, Cassandra is fast
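A small sketch of what denormalizing means for the application: every login is written twice, once per query it has to serve. Plain dicts stand in for the two tables here, and the `record_login` helper is hypothetical. The second write simply overwrites the previous time for the same (user, location), mirroring an upsert, so the sketch assumes logins are recorded in time order.

```python
def record_login(logins, logins_by_location, user, time, location):
    """Write the same event to both tables, one per query pattern."""
    logins[(user, time)] = location             # serves "last N logins for a user"
    logins_by_location[(user, location)] = time  # serves "last login from a location"


if __name__ == "__main__":
    logins, by_location = {}, {}
    record_login(logins, by_location, "nickmbailey", "2013-04-19 14:49:27", "San Francisco, California")
    record_login(logins, by_location, "nickmbailey", "2013-05-19 14:49:27", "Austin, Texas")
    record_login(logins, by_location, "nickmbailey", "2013-07-19 09:22:18", "Austin, Texas")
    print(by_location[("nickmbailey", "Austin, Texas")])
```

The extra write costs storage and a second request, but each read becomes a direct lookup instead of a scan over the user's whole login history.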
Debugging your data model
cqlsh> tracing on;
Now tracing requests.
cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example');
Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9

 activity                            | timestamp    | source    | source_elapsed
-------------------------------------+--------------+-----------+----------------
 execute_cql3_query                  | 00:02:37,015 | 127.0.0.1 |              0
 Parsing statement                   | 00:02:37,015 | 127.0.0.1 |             81
 Preparing statement                 | 00:02:37,015 | 127.0.0.1 |            273
 Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 |            540
 Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 |            779
 Messsage received from /127.0.0.1   | 00:02:37,016 | 127.0.0.2 |             63
 Applying mutation                   | 00:02:37,016 | 127.0.0.2 |            220
 Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 |            250
 Appending to commitlog              | 00:02:37,016 | 127.0.0.2 |            277
 Adding to memtable                  | 00:02:37,016 | 127.0.0.2 |            378
 Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 |            710
 Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 |            888
A note on Transactions
• In general, you want to construct your data model to work around the need for them
• The latest version of Cassandra has ‘Compare and swap’
  • An implementation of Paxos
  • ...IF NOT EXISTS;
  • ...IF column1 = 'value';
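The two conditional forms can be read as "apply the change only if the current state matches, and report whether it did". The sketch below models just that client-visible behavior on a single in-memory dict; it does not model Paxos or replication, and both function names are hypothetical.

```python
def insert_if_not_exists(table, key, row):
    """Analogue of INSERT ... IF NOT EXISTS: returns True only if the row was created."""
    if key in table:
        return False
    table[key] = dict(row)
    return True


def update_if(table, key, column, expected, new_value):
    """Analogue of UPDATE ... IF column = expected: returns True only if applied."""
    row = table.get(key)
    if row is None or row.get(column) != expected:
        return False
    row[column] = new_value
    return True


if __name__ == "__main__":
    users = {}
    print(insert_if_not_exists(users, "nickmbailey", {"status": "new"}))
    print(insert_if_not_exists(users, "nickmbailey", {"status": "other"}))
    print(update_if(users, "nickmbailey", "status", "new", "active"))
    print(users)
```

In a cluster the check and the write must be agreed on by the replicas as one step, which is the part Paxos provides and the reason these statements cost more than ordinary writes.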
Try it out
CCM
• CCM - Cassandra Cluster Manager
  • https://github.com/pcmanus/ccm
• Warning: not lightweight
• Example:
  • ccm create test -v 2.0.1
  • ccm populate -n 3
  • ccm start
Clients
• Cqlsh
  • Bundled with Cassandra
• Drivers
  • java: https://github.com/datastax/java-driver
  • python: https://github.com/datastax/python-driver
  • .net: https://github.com/datastax/csharp-driver
  • and more: http://www.datastax.com/download/clientdrivers
Get Help
• IRC: #cassandra on freenode
• Mailing Lists
• Stack Overflow
• DataStax Docs
  • http://www.datastax.com/docs
Questions?
Peter Spielvogel
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 

Recently uploaded (20)

GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 

Introduction to Cassandra Basics

  • 1. Introduction to Cassandra Nick Bailey @nickmbailey Monday, October 28, 13
  • 2. Who am I? ©2012 DataStax
  • 4. On to the good stuff!
  • 5. Why Cassandra? / Cluster Architecture / Node Architecture / Data Modeling / Wrap up
  • 7. Time for buzz words! Big Data! NoSQL!
  • 8. Big Data
      • Gartner: “...high-volume, high-velocity and high-variety...”
      • 2 sides of ‘big data’
          • Analytics
          • Real-time
  • 9. NoSQL
      • A terrible label
      • Covers a wide range of DBs
          • Cassandra
          • Redis
          • MongoDB
          • HBase
          • ...
  • 10. Started by Facebook
  • 11. Dynamo (Amazon) + Big Table (Google)
  • 13. Cassandra is great for...
      • Massive, linear scaling (e.g. CERN hadron collider, Barracuda Networks)
      • Extremely heavy writes (e.g. BlueMountain Capital – financial tick data)
      • High availability (e.g. eBay, Eventbrite, Netflix, SoundCloud, HealthCare Anytime, Comcast, GoDaddy, Sony Entertainment Network)
  • 17. One size does not fit all: Polyglot persistence
  • 18. More Resources
      • PlanetCassandra.org
      • Blog
      • 5 minute interviews
  • 20. Data Distribution
      [Diagram: token ring with positions 0, 25, 50, 75]
      Hash_Function(Partition Key) >> Token
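A minimal sketch of the ring on this slide, assuming a toy token space of 0..99 and four hypothetical nodes placed at tokens 0, 25, 50 and 75. The node names and the MD5-based `token_for` helper are illustrative assumptions, not Cassandra's actual partitioner, and the ownership rule used here (each node owns the range that ends at its own token) is the usual convention rather than something stated on the slide.

```python
import bisect
import hashlib

# Hypothetical four-node ring using the token positions from the slide.
RING = [(0, "node_a"), (25, "node_b"), (50, "node_c"), (75, "node_d")]
TOKENS = [token for token, _ in RING]


def token_for(partition_key: str) -> int:
    """Hash a partition key onto the toy token space 0..99 (MD5 is a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % 100


def owner_of(token: int) -> str:
    """Return the first node whose token is >= the given token, wrapping around."""
    index = bisect.bisect_left(TOKENS, token) % len(RING)
    return RING[index][1]


if __name__ == "__main__":
    key = "nickmbailey"
    print(key, "->", token_for(key), "->", owner_of(token_for(key)))
```

Because the token depends only on the partition key, every node can compute where a row lives without consulting a central directory.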
  • 23. Consistency Level
      • Multiple options
          • ONE
          • QUORUM
          • ALL
          • LOCAL_QUORUM
          • ...
      • Can be specified per request
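The arithmetic behind these levels can be sketched as follows. `required_acks` is a hypothetical helper, not a driver API; it assumes the usual definition of a quorum as a strict majority of replicas, and for LOCAL_QUORUM it assumes the caller passes the replication factor of the local data center only.

```python
def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Number of replica responses the coordinator waits for (illustrative)."""
    if consistency_level == "ONE":
        return 1
    if consistency_level in ("QUORUM", "LOCAL_QUORUM"):
        # A quorum is a strict majority of the replicas being counted.
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


if __name__ == "__main__":
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, required_acks(level, 3))
```

With a replication factor of 3, QUORUM reads and QUORUM writes each touch 2 replicas, so any read overlaps the most recent successful write on at least one replica.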
  • 28. Failure Types
      • UnavailableException
          • Didn’t even try
      • TimedOutException
          • Possible success or failure
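The difference between the two failures can be sketched as a toy coordinator. The exception classes and `coordinate_write` are hypothetical names for illustration only; the real exceptions are raised by Cassandra and surfaced through the client driver.

```python
class Unavailable(Exception):
    """Too few replicas are known to be alive, so the write is never attempted."""


class TimedOut(Exception):
    """The write was sent but too few replicas answered in time."""


def coordinate_write(required: int, live_replicas: int, acks_before_timeout: int) -> str:
    """Classify the outcome of one write as seen by a toy coordinator."""
    if live_replicas < required:
        # Didn't even try: nothing was sent to any replica.
        raise Unavailable(f"need {required} replicas, only {live_replicas} alive")
    if acks_before_timeout < required:
        # Possible success or failure: some replicas may have applied the write.
        raise TimedOut(f"need {required} acks, got {acks_before_timeout}")
    return "ok"


if __name__ == "__main__":
    print(coordinate_write(required=2, live_replicas=3, acks_before_timeout=2))
```

A client can safely retry after the first failure, since nothing was written; after the second it has to treat the write as possibly applied.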
  • 30. Gossip
      • Manages cluster state
          • Nodes up/down
          • Nodes joining/leaving
      • Decentralized
  • 31. Snitch
      • Responsible for determining cluster topology
      • Tracks node responsiveness
      • Simple, PropertyFile, Ec2Snitch, etc...
  • 33. Write Path
      [Diagram: a write goes to the commit log (disk) and the Memtable (memory); Memtables are flushed to SSTables (disk)]
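A toy model of this write path, assuming a single node and a memtable that flushes after a fixed number of keys. `ToyNode` and `flush_threshold` are hypothetical names, and plain Python lists and dicts stand in for the on-disk commit log and SSTables; real flush triggers and commit log recycling are more involved.

```python
class ToyNode:
    """Single-node sketch: commit log append, memtable update, flush to SSTable."""

    def __init__(self, flush_threshold: int = 3):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # in-memory, latest value per key
        self.sstables = []     # immutable, sorted snapshots "on disk"
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # 1. Append to the commit log so the write survives a crash.
        self.commit_log.append((key, value))
        # 2. Apply the write to the in-memory memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable is "full" (a simplified trigger).
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out as a sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed data no longer needs the commit log for recovery.
        self.commit_log.clear()


if __name__ == "__main__":
    node = ToyNode(flush_threshold=2)
    node.write("b", 2)
    node.write("a", 1)
    print(node.sstables)
```

Every step is an append or an in-memory update, with no read-before-write, which is consistent with the deck's earlier claim that Cassandra handles extremely heavy writes.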
  • 36. CQL: Cassandra Query Language
  • 37. Terminology
      • Keyspace
      • Table (Column Family)
      • Row
      • Column
      • Partition Key
      • Clustering Key (Optional)
  • 38. For Example:
      CREATE KEYSPACE packagetracker WITH REPLICATION =
        { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

      -- or, with two replicas in each of two data centers:
      CREATE KEYSPACE packagetracker WITH REPLICATION =
        { 'class' : 'NetworkTopologyStrategy', 'dc1' : 2, 'dc2' : 2 };

      CREATE TABLE events (
        package_id text,
        status_timestamp timestamp,
        location text,
        notes text,
        PRIMARY KEY (package_id, status_timestamp)
      );
  • 40. Basic Data Types
      • blob
      • int
      • text
      • bigint (long)
      • uuid
      • etc
  • 41. More Data Modeling Constructs
      • Collections
          • map, set, list
      • Time to live (TTL)
      • Counters
      • Secondary Indexes
  • 42. Approaching Data Modeling
      • Model your queries, not your data
      • Optimize your data model for reads
      • Don’t be afraid to denormalize
      • You will get it wrong, iterate
  • 43. An Example: User Logins
  • 44. The Query
      What are the last 10 locations nickmbailey logged in from?
          SELECT time, location FROM logins
          WHERE user = 'nickmbailey'
          ORDER BY time DESC LIMIT 10;
  • 45-47. The Query (annotated)
      • Partition Key: user
      • Clustering Key: time
      • Additional Columns: location
  • 48. The Query (table definition)
      CREATE TABLE logins (
        user text,
        time timestamp,
        location text,
        PRIMARY KEY (user, time)
      );
  • 49. The Query (example rows)
      Partition key: User. Primary key: (User, Time).
      User        | Time                | Location
      ------------+---------------------+---------------------
      nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
      nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
      jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia
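The layout this primary key implies can be sketched in memory: one partition per user, with rows inside the partition kept sorted by time. This is an illustrative model under that assumption, not how Cassandra stores data on disk; `record_login` and `last_locations` are hypothetical helpers, and the rows are the ones from the slide.

```python
import bisect
from collections import defaultdict

# user (partition key) -> list of (time, location), kept sorted by time (clustering key)
logins = defaultdict(list)


def record_login(user: str, time: str, location: str) -> None:
    """Insert a row into the user's partition, preserving time order."""
    bisect.insort(logins[user], (time, location))


def last_locations(user: str, limit: int = 10):
    """Mimic ORDER BY time DESC LIMIT n within a single partition."""
    newest_first = logins.get(user, [])[::-1]
    return newest_first[:limit]


record_login("nickmbailey", "2013-07-19 14:49:27", "Blacksburg, Virginia")
record_login("nickmbailey", "2013-07-19 09:22:18", "Austin, Texas")
record_login("jsmith", "2013-07-20 07:59:34", "Atlanta, Georgia")

if __name__ == "__main__":
    for time, location in last_locations("nickmbailey"):
        print(time, location)
```

The query touches one partition and reads an already-sorted slice from it, which is why the table is shaped around the query rather than around the entities.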
  • 50. Time-series data
      • By far, the most common data model
      • Event logs
      • Metrics
      • Sensor Data
      • Etc
  • 51. Another Query
      When was the last time nickmbailey logged in from San Francisco, California?
          SELECT time FROM logins
          WHERE user = 'nickmbailey' AND location = 'San Francisco, California';
      User        | Time                | Location
      ------------+---------------------+--------------------------
      nickmbailey | 2013-07-19 09:22:18 | Austin, Texas
      nickmbailey | 2013-07-19 14:49:27 | Blacksburg, Virginia
      nickmbailey | 2013-07-19 14:49:27 | Austin, Texas
      nickmbailey | 2013-05-19 14:49:27 | Austin, Texas
      nickmbailey | 2013-04-19 14:49:27 | San Francisco, California
      ...         | ...                 | ...
      jsmith      | 2013-07-20 07:59:34 | Atlanta, Georgia
  • 52. Another Query
      When was the last time nickmbailey logged in from San Francisco, California?
          SELECT time FROM logins_by_location
          WHERE user = 'nickmbailey' AND location = 'San Francisco, California';
      CREATE TABLE logins_by_location (
        user text,
        time timestamp,
        location text,
        PRIMARY KEY (user, location)
      );
  • 53. Another Query (example rows)
      User        | Location                  | Time
      ------------+---------------------------+--------------------
      nickmbailey | Austin, Texas             | 2013-07-19 09:22:18
      nickmbailey | Blacksburg, Virginia      | 2013-07-19 14:49:27
      nickmbailey | San Francisco, California | 2013-07-19 14:49:27
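A sketch of what writing to both tables looks like from the application side, using plain dicts keyed by each table's primary key. `record_login_everywhere` is a hypothetical helper. The sketch relies on inserts behaving as upserts on the primary key, and it assumes logins are written in chronological order so that the most recent write for a (user, location) pair is also the latest login.

```python
# Each dict is keyed by the primary key of the corresponding table.
logins = {}              # (user, time)     -> location
logins_by_location = {}  # (user, location) -> time


def record_login_everywhere(user: str, time: str, location: str) -> None:
    """Write the same event to both tables, one per query it must serve."""
    logins[(user, time)] = location
    # Same (user, location) key as an earlier login: the row is overwritten,
    # so this table only ever holds one time per location.
    logins_by_location[(user, location)] = time


record_login_everywhere("nickmbailey", "2013-04-19 14:49:27", "San Francisco, California")
record_login_everywhere("nickmbailey", "2013-05-19 14:49:27", "Austin, Texas")
record_login_everywhere("nickmbailey", "2013-07-19 09:22:18", "Austin, Texas")

if __name__ == "__main__":
    print(logins_by_location[("nickmbailey", "San Francisco, California")])
```

The first table keeps the full history, while the second answers "when was the last login from X?" with a single primary-key lookup.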
  • 54. Denormalize
      • Create materialized views of the same data to support different queries
      • Storage space is cheap, Cassandra is fast
  • 55. Debugging your data model
      cqlsh> tracing on;
      Now tracing requests.
      cqlsh:foo> INSERT INTO test (a, b) VALUES (1, 'example');
      Tracing session: 4ad36250-1eb4-11e2-0000-fe8ebeead9f9

       activity                            | timestamp    | source    | source_elapsed
      -------------------------------------+--------------+-----------+----------------
       execute_cql3_query                  | 00:02:37,015 | 127.0.0.1 | 0
       Parsing statement                   | 00:02:37,015 | 127.0.0.1 | 81
       Preparing statement                 | 00:02:37,015 | 127.0.0.1 | 273
       Determining replicas for mutation   | 00:02:37,015 | 127.0.0.1 | 540
       Sending message to /127.0.0.2       | 00:02:37,015 | 127.0.0.1 | 779
       Messsage received from /127.0.0.1   | 00:02:37,016 | 127.0.0.2 | 63
       Applying mutation                   | 00:02:37,016 | 127.0.0.2 | 220
       Acquiring switchLock                | 00:02:37,016 | 127.0.0.2 | 250
       Appending to commitlog              | 00:02:37,016 | 127.0.0.2 | 277
       Adding to memtable                  | 00:02:37,016 | 127.0.0.2 | 378
       Enqueuing response to /127.0.0.1    | 00:02:37,016 | 127.0.0.2 | 710
       Sending message to /127.0.0.1       | 00:02:37,016 | 127.0.0.2 | 888
  • 56. A note on Transactions
      • In general, you want to construct your data model to work around the need for them
      • The latest version of Cassandra (2.0 at the time of this talk) has ‘Compare and swap’
          • An implementation of Paxos
          • ...IF NOT EXISTS;
          • ...IF column1 = 'value';
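The semantics of the two conditional clauses can be sketched against an in-memory dict. This models only the applied / not applied outcome a client sees; it runs in a single process and contains no Paxos, and `insert_if_not_exists` and `update_if` are hypothetical helper names.

```python
def insert_if_not_exists(table: dict, key, row: dict) -> bool:
    """Mimic INSERT ... IF NOT EXISTS: apply only when no row has this key."""
    if key in table:
        return False
    table[key] = dict(row)
    return True


def update_if(table: dict, key, column: str, expected, new_value) -> bool:
    """Mimic UPDATE ... IF column = expected: apply only when the condition holds."""
    row = table.get(key)
    if row is None or row.get(column) != expected:
        return False
    row[column] = new_value
    return True


if __name__ == "__main__":
    users = {}
    print(insert_if_not_exists(users, "nickmbailey", {"email": "a@example.com"}))
    print(insert_if_not_exists(users, "nickmbailey", {"email": "b@example.com"}))
```

In a real cluster the check and the write must be agreed on by a quorum of replicas, which is what the Paxos implementation provides and why these statements cost more than ordinary writes.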
  • 57. Try it out
  • 58. CCM
      • CCM - Cassandra Cluster Manager
      • https://github.com/pcmanus/ccm
      • Warning: not lightweight
      • Example:
          • ccm create test -v 2.0.1
          • ccm populate -n 3
          • ccm start
  • 59. Clients
      • Cqlsh
          • Bundled with Cassandra
      • Drivers
          • java: https://github.com/datastax/java-driver
          • python: https://github.com/datastax/python-driver
          • .net: https://github.com/datastax/csharp-driver
          • and more: http://www.datastax.com/download/clientdrivers
  • 60. Get Help
      • IRC: #cassandra on freenode
      • Mailing Lists
      • Stack Overflow
      • DataStax Docs
          • http://www.datastax.com/docs