OVERVIEW AND REAL WORLD 
APPLICATIONS 
Cassandra 
Jersey Shore Tech Meetup 
Nov 13, 2014
You Are Not Here… 
*** http://njhalloffame.org/ 
Agenda 
 Some Basic Concepts/Overview 
 New Developments In Cassandra 
 Basic Data Modeling Concepts 
 Materialized Views 
 Secondary Indexes 
 Counters 
 Time Series Data 
 Expiring Data
Cassandra High Level 
Cassandra's architecture is based on the combination 
of two technologies: 
 Google BigTable – Data Model 
 Amazon Dynamo – Distributed Architecture 
BTW – these mean the same thing -> 
Cassandra = C*
Architecture Basics & Terminology 
 Nodes are single instances of C*
 Cluster is a group of nodes
 Data is organized by keys (tokens), which are
distributed across the cluster
 Replication Factor (RF) determines how many copies
of each key are kept
 Data Center Aware – works well in multi-DC/EC2
etc.
 Consistency Level – powerful feature to tune
consistency vs. speed vs. availability.
C* Ring 
More Architecture 
 Information on who has what data and who is
available is transferred using gossip.
 No single point of failure (SPOF); every node can
service requests.
 Handles replication and downed nodes (within
reason)
CAP Theorem 
 Distributed Systems Law: 
 Consistency 
 Availability 
 Partition Tolerance 
(you can only really have two in a distributed system) 
 Cassandra is AP with Eventual Consistency
Consistency 
 Cassandra uses the concept of Tunable Consistency,
which makes it very powerful and flexible for system
needs.
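As a rough illustration of tuning consistency, the level can be switched per session from cqlsh. This is a minimal sketch: the replication factor of 3 mentioned in the comments is an assumption, not something the slides state. A quorum is floor(RF / 2) + 1 replicas, and when read replicas plus write replicas exceed RF, a read overlaps the latest write.

```cql
-- cqlsh shell commands (handled by cqlsh itself, not sent to the cluster as CQL).

-- Show the consistency level currently in effect for this session:
CONSISTENCY;

-- Require a majority of replicas to respond.
-- With an assumed replication factor of 3, a quorum is 2 replicas.
CONSISTENCY QUORUM;

-- Fastest option with the weakest guarantee: a single replica responds.
CONSISTENCY ONE;
```

Reading and writing at QUORUM trades some latency and availability for stronger consistency; ONE leans the other way.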
C* Persistence Model 
Read Path 
Write Path 
Data Model Architecture 
 Keyspace – container of column families (tables).
Defines the replication factor (RF), among other settings.
 Table – column family. Contains definition of
schema.
 Row – a “record” identified by a key
 Column – a key and a value
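The terms above map onto CQL roughly as follows. This is only a sketch: the keyspace name, table name, SimpleStrategy and the replication factor of 3 are illustrative assumptions, not taken from the talk.

```cql
-- Keyspace: container for tables; this is where the replication factor is set.
-- (Name and settings are assumptions for illustration.)
CREATE KEYSPACE IF NOT EXISTS meetup_demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

USE meetup_demo;

-- Table (column family): holds the schema definition.
CREATE TABLE IF NOT EXISTS kv_demo (
  id text PRIMARY KEY,  -- the key that identifies a row
  value text            -- a column: a name ("value") plus the stored value
);

-- Row: one record identified by its key.
INSERT INTO kv_demo (id, value) VALUES ('row-1', 'hello');
```

For a multi-data-center cluster, NetworkTopologyStrategy with a replica count per data center would be used in place of SimpleStrategy.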
Deletions 
 Distributed systems present a unique problem for
deletes. If Cassandra actually deleted the data while a node was
down, that node would miss the delete notice and would try
to recreate the record when it came back online. So…
 Tombstone – the data is replaced with a special
value called a tombstone, which works within the distributed
architecture
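A small sketch of what this looks like from CQL, using a hypothetical tombstone_demo table (the table and its columns are assumptions for illustration). The DELETE writes a tombstone, and the table's gc_grace_seconds setting controls how long tombstones are kept before compaction may purge them.

```cql
-- Hypothetical table for illustration only.
CREATE TABLE IF NOT EXISTS tombstone_demo (
  id text PRIMARY KEY,
  note text
);

INSERT INTO tombstone_demo (id, note) VALUES ('a', 'hello');

-- Writes a tombstone rather than removing the data in place.
DELETE FROM tombstone_demo WHERE id = 'a';

-- Tombstones are retained for gc_grace_seconds, which gives a replica
-- that was down time to come back and be repaired before the marker
-- is purged. 864000 seconds (10 days) is the default value.
ALTER TABLE tombstone_demo WITH gc_grace_seconds = 864000;
```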
Keys 
 Primary Key
 Partition Key – identifies a row
 Clustering Key – sorting within a row
 Using CQL these are defined together as a compound
(composite) key
 Compound keys are how you implement “wide
rows”, the COOL FEATURE!
Single Primary Key 
create table users (
user_id UUID PRIMARY KEY,
firstname text,
lastname text,
emailaddress text
);
** Cassandra Data Types
http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/cql_data_types_c.html
Compound Key 
create table users (
emailaddress text,
department text,
firstname text,
lastname text,
PRIMARY KEY (emailaddress, department)
);
 Partition Key plus Clustering Key
 emailaddress is partition key
 department is clustering key
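To see what this key definition allows, here is a short sketch that assumes the users table defined just above has been created. The sample values are made up for illustration.

```cql
-- Two rows in the same partition (same emailaddress), one per department.
INSERT INTO users (emailaddress, department, firstname, lastname)
VALUES ('tpeters@example.com', 'sales', 'tom', 'peters');

INSERT INTO users (emailaddress, department, firstname, lastname)
VALUES ('tpeters@example.com', 'support', 'tom', 'peters');

-- Partition key only: returns every row in the partition,
-- sorted by department.
SELECT * FROM users WHERE emailaddress = 'tpeters@example.com';

-- Partition key plus clustering key: returns a single row.
SELECT * FROM users
WHERE emailaddress = 'tpeters@example.com' AND department = 'sales';
```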
Compound Key 
create table users (
emailaddress text,
department text,
country text,
firstname text,
lastname text,
PRIMARY KEY ((emailaddress, department), country)
);
 Partition Key plus Clustering Key
 emailaddress and department together form the partition key
 country is clustering key
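With a composite partition key, both key columns are needed to locate a partition. A brief sketch, assuming the users table defined just above (with the composite partition key) has been created; the sample values are illustrative.

```cql
INSERT INTO users (emailaddress, department, country, firstname, lastname)
VALUES ('tpeters@example.com', 'sales', 'DE', 'tom', 'peters');

-- Both partition key columns must be supplied to find the partition.
SELECT * FROM users
WHERE emailaddress = 'tpeters@example.com' AND department = 'sales';

-- The clustering key can then narrow the result within that partition.
SELECT * FROM users
WHERE emailaddress = 'tpeters@example.com'
  AND department = 'sales'
  AND country = 'DE';
```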
New Rules 
 Writes Are Cheap 
 Denormalize All You Need 
 Model Your Queries, Not Data (understand access 
patterns) 
 Application Worries About Joins
What’s New In 2.0 
Conditional DDL
IF EXISTS or IF NOT EXISTS
Drop Column Support
ALTER TABLE users DROP lastname;
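A minimal sketch of both features together, using a hypothetical ddl_demo table so it does not collide with the users table in the other slides.

```cql
-- Hypothetical table for illustration only.
CREATE TABLE IF NOT EXISTS ddl_demo (
  id text PRIMARY KEY,
  firstname text,
  lastname text
);

-- Running the same statement again does not raise an error,
-- because the table already exists.
CREATE TABLE IF NOT EXISTS ddl_demo (
  id text PRIMARY KEY,
  firstname text,
  lastname text
);

-- Drop column support.
ALTER TABLE ddl_demo DROP lastname;

-- Dropping is also conditional: no error if the table is already gone.
DROP TABLE IF EXISTS ddl_demo;
```

Conditional DDL is mostly useful in setup scripts that may be run more than once.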
More New Stuff 
 Triggers
CREATE TRIGGER myTrigger
ON myTable
USING 'com.thejavaexperts.cassandra.updateevt';
 Lightweight Transactions (CAS) 
UPDATE users 
SET firstname = 'tim' 
WHERE emailaddress = 'tpeters@example.com' 
IF firstname = 'tom'; 
** Not like an ACID Transaction!!
CAS & Transactions 
 CAS – compare-and-set operations. A single,
atomic operation compares the value of a column in
the database and applies a modification depending
on the result of the comparison.
 Consider the performance hit. CAS is (was) considered
an anti-pattern.
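The two common forms of a lightweight transaction can be sketched as below. The cas_demo_users table is a hypothetical name chosen for illustration. In cqlsh, each conditional statement returns a row with an [applied] column showing whether the condition held.

```cql
-- Hypothetical table for illustration only.
CREATE TABLE IF NOT EXISTS cas_demo_users (
  emailaddress text PRIMARY KEY,
  firstname text
);

-- Insert only if no row with this key exists yet.
INSERT INTO cas_demo_users (emailaddress, firstname)
VALUES ('tpeters@example.com', 'tom')
IF NOT EXISTS;

-- Apply the update only if the current value matches the expected one.
UPDATE cas_demo_users
SET firstname = 'tim'
WHERE emailaddress = 'tpeters@example.com'
IF firstname = 'tom';
```

Each conditional write needs extra coordination between replicas, which is the performance hit mentioned above, so it is best kept for cases that really need it.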
Data Modeling… The Basics 
 With CQL, Cassandra now looks very familiar to RDBMS/SQL
users.
 CQL very nicely hides the underlying data storage model.
 You still have all the power of Cassandra; it is all in the
key definition.
RDBMS = model data
Cassandra = model access (queries)
Side-Note On Querying 
 Create table with compound key 
 Select using ALLOW FILTERING 
 Counts 
 Select using IN or =
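The steps in this demo can be sketched roughly as follows. The query_demo_users table and its data are assumptions for illustration. The ALLOW FILTERING query in particular should be verified on your own cluster, since its handling of clustering-key restrictions has varied across Cassandra versions.

```cql
-- Hypothetical table with a compound key.
CREATE TABLE IF NOT EXISTS query_demo_users (
  emailaddress text,
  department text,
  firstname text,
  lastname text,
  PRIMARY KEY (emailaddress, department)
);

INSERT INTO query_demo_users (emailaddress, department, firstname, lastname)
VALUES ('tpeters@example.com', 'sales', 'tom', 'peters');

INSERT INTO query_demo_users (emailaddress, department, firstname, lastname)
VALUES ('arogers@example.com', 'sales', 'alan', 'rogers');

-- Select using = on the partition key.
SELECT * FROM query_demo_users WHERE emailaddress = 'tpeters@example.com';

-- Select using IN on the partition key.
SELECT * FROM query_demo_users
WHERE emailaddress IN ('tpeters@example.com', 'arogers@example.com');

-- Restricting only the clustering key means scanning across partitions,
-- so Cassandra asks for an explicit ALLOW FILTERING.
-- Assumption: a Cassandra version that accepts this form.
SELECT * FROM query_demo_users WHERE department = 'sales' ALLOW FILTERING;

-- Counts.
SELECT COUNT(*) FROM query_demo_users;
```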
Batch Operations 
 Saves Network Roundtrips 
 Can contain INSERT, UPDATE, DELETE 
 Atomic by default (all or nothing) 
 Can use timestamp for specific ordering
Batch Operation Example 
BEGIN BATCH 
INSERT INTO users (emailaddress, firstname, lastname, country) values 
('brian.enochson@gmail.com', 'brian', 'enochson', 'USA'); 
INSERT INTO users (emailaddress, firstname, lastname, country) values 
('tpeters@example.com', 'tom', 'peters', 'DE'); 
INSERT INTO users (emailaddress, firstname, lastname, country) values 
('jsmith@example.com', 'jim', 'smith', 'USA'); 
INSERT INTO users (emailaddress, firstname, lastname, country) values 
('arogers@example.com', 'alan', 'rogers', 'USA'); 
DELETE FROM users WHERE emailaddress = 'jsmith@example.com'; 
APPLY BATCH; 
 select in cqlsh 
 List in cassandra-cli with timestamp
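To illustrate the point about timestamps, the sketch below gives each statement in a batch its own write timestamp and then reads the timestamp back with WRITETIME. The batch_demo_users table and the microsecond timestamp values are assumptions for illustration.

```cql
-- Hypothetical table for illustration only.
CREATE TABLE IF NOT EXISTS batch_demo_users (
  emailaddress text PRIMARY KEY,
  firstname text,
  lastname text,
  country text
);

BEGIN BATCH
  INSERT INTO batch_demo_users (emailaddress, firstname, lastname, country)
  VALUES ('jsmith@example.com', 'jim', 'smith', 'USA')
  USING TIMESTAMP 1415836800000000;

  INSERT INTO batch_demo_users (emailaddress, firstname, lastname, country)
  VALUES ('arogers@example.com', 'alan', 'rogers', 'USA')
  USING TIMESTAMP 1415836800000000;

  -- The later timestamp makes the delete win over the insert above.
  DELETE FROM batch_demo_users
  USING TIMESTAMP 1415836800000001
  WHERE emailaddress = 'jsmith@example.com';
APPLY BATCH;

-- Select in cqlsh, showing the write timestamp of a column.
SELECT emailaddress, firstname, WRITETIME(firstname)
FROM batch_demo_users;
```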
More Data Modeling… 
 No Joins 
 No Foreign Keys 
 No Third (or any other) Normal Form Concerns 
 Redundant Data Encouraged. Apps maintain 
consistency.
Secondary Indexes 
 Allow defining indexes to support access by columns other than the
partition key.
 Each node has a local index for its data.
 They have uses, but shouldn’t be used all the time
without consideration.
 We will look at alternatives.
Secondary Index Example 
 Create a table 
 Try to select with column not in PK 
 Add Secondary Index 
 Try select again. (maybe need to reinsert)
When to use? 
 Low Cardinality – small number of unique values 
 High Cardinality – high number of distinct values 
 Secondary Indexes are good for Low Cardinality. So 
country codes, department codes etc. Not email 
addresses.
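A sketch of indexing a low-cardinality column such as a country code. The index_demo_users table, the index name and the sample rows are assumptions for illustration.

```cql
-- Hypothetical table for illustration only.
CREATE TABLE IF NOT EXISTS index_demo_users (
  emailaddress text PRIMARY KEY,
  firstname text,
  lastname text,
  country text
);

INSERT INTO index_demo_users (emailaddress, firstname, lastname, country)
VALUES ('tpeters@example.com', 'tom', 'peters', 'DE');

INSERT INTO index_demo_users (emailaddress, firstname, lastname, country)
VALUES ('arogers@example.com', 'alan', 'rogers', 'USA');

-- Without an index this query is rejected as written, because country
-- is not part of the primary key:
-- SELECT * FROM index_demo_users WHERE country = 'USA';

-- Add a secondary index on the low-cardinality column.
CREATE INDEX IF NOT EXISTS index_demo_users_country
  ON index_demo_users (country);

-- The same query is now accepted.
SELECT * FROM index_demo_users WHERE country = 'USA';
```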
Materialized View 
 If you want full distribution, you can use what is called the
Materialized View pattern.
 Remember redundant data is fine.
 Model the queries
Materialized View Example 
 Show normal table with compound key and querying
limitations
 Create Materialized View table with a different
compound key to support alternate access.
 Selects use partition key.
 Secondary indexes are local, not distributed
 ALLOW FILTERING can cause performance issues
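The pattern described here is maintained by the application: the same data is written to a second table whose key matches the alternate query. A minimal sketch, with hypothetical table names and sample data:

```cql
-- Base table: lookup by email address.
CREATE TABLE IF NOT EXISTS mv_users_by_email (
  emailaddress text PRIMARY KEY,
  firstname text,
  lastname text,
  country text
);

-- "Materialized view" table: same data, keyed for lookup by country.
CREATE TABLE IF NOT EXISTS mv_users_by_country (
  country text,
  emailaddress text,
  firstname text,
  lastname text,
  PRIMARY KEY (country, emailaddress)
);

-- The application writes to both tables; a batch keeps them in step.
BEGIN BATCH
  INSERT INTO mv_users_by_email (emailaddress, firstname, lastname, country)
  VALUES ('arogers@example.com', 'alan', 'rogers', 'USA');
  INSERT INTO mv_users_by_country (country, emailaddress, firstname, lastname)
  VALUES ('USA', 'arogers@example.com', 'alan', 'rogers');
APPLY BATCH;

-- Each query now hits a partition key directly.
SELECT * FROM mv_users_by_email WHERE emailaddress = 'arogers@example.com';
SELECT * FROM mv_users_by_country WHERE country = 'USA';
```

Both selects are served from a single partition, with no secondary index and no ALLOW FILTERING.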
Counters 
 Updated in 2.1 and now work in a more distributed 
and accurate manner. 
 Table organization, example 
 How to update, view etc.
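A short sketch of a counter table and how it is updated and read. The page_view_counts table and its column names are assumptions for illustration.

```cql
-- Hypothetical counter table; every non-key column must be a counter.
CREATE TABLE IF NOT EXISTS page_view_counts (
  page_url text PRIMARY KEY,
  views counter
);

-- Counters are changed with UPDATE; counter columns are not set with INSERT.
UPDATE page_view_counts SET views = views + 1 WHERE page_url = '/home';
UPDATE page_view_counts SET views = views + 5 WHERE page_url = '/home';

-- View the current value.
SELECT page_url, views FROM page_view_counts WHERE page_url = '/home';
```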
Time Series Example…. 
 Time series table model.
 Need to consider interval for event frequency and
wide row size.
 Make what is tracked, plus the unit of time interval, the
partition key.
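One way to sketch this model is a table bucketed by day. The sensor_events table, its columns and the one-day interval are all assumptions for illustration. The right interval depends on how many events arrive and how wide each row may grow.

```cql
-- Hypothetical time series table; names and the daily bucket are assumptions.
CREATE TABLE IF NOT EXISTS sensor_events (
  sensor_id text,        -- what is tracked
  event_day text,        -- unit of interval, e.g. '2014-11-13'
  event_time timestamp,  -- orders events within the wide row
  reading double,
  PRIMARY KEY ((sensor_id, event_day), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

INSERT INTO sensor_events (sensor_id, event_day, event_time, reading)
VALUES ('sensor-1', '2014-11-13', '2014-11-13 19:00:00+0000', 21.5);

INSERT INTO sensor_events (sensor_id, event_day, event_time, reading)
VALUES ('sensor-1', '2014-11-13', '2014-11-13 19:05:00+0000', 21.7);

-- Newest events first for one sensor and one day.
SELECT event_time, reading FROM sensor_events
WHERE sensor_id = 'sensor-1' AND event_day = '2014-11-13'
LIMIT 10;
```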
Time Series Data 
 Due to its quick writing model Cassandra is suited 
for storing time series data. 
 The Cassandra wide row is a perfect fit for modeling 
time series / time based events. 
 Let’s look at an example….
Event Data 
 Notice the primary key and clustering key.
 Insert some data
 View in CQL, then in CLI as wide row
TTL – Self Expiring Data 
 Another technique is data that has a defined lifespan. 
 For instance session identifiers, temporary 
passwords etc. 
 For this Cassandra provides a Time To Live (TTL) 
mechanism.
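A minimal sketch of TTL in use, with a hypothetical user_sessions table. The table, the UUID value and the lifetimes in seconds are assumptions for illustration.

```cql
-- Hypothetical table for illustration only.
CREATE TABLE IF NOT EXISTS user_sessions (
  session_id uuid PRIMARY KEY,
  emailaddress text,
  temp_password text
);

-- The whole row expires one hour (3600 seconds) after the write.
INSERT INTO user_sessions (session_id, emailaddress, temp_password)
VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'tpeters@example.com', 'changeme')
USING TTL 3600;

-- A single column can be given its own, shorter lifetime.
UPDATE user_sessions USING TTL 300
SET temp_password = 'newtemp'
WHERE session_id = 62c36092-82a1-3a00-93d1-46196ee77204;

-- TTL() shows the seconds remaining for each column.
SELECT emailaddress, TTL(emailaddress), TTL(temp_password)
FROM user_sessions
WHERE session_id = 62c36092-82a1-3a00-93d1-46196ee77204;
```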
TTL Example… 
 Create table
 Insert data using TTL
 Can update a specific column with a TTL
 Show using selects.
Questions 
 http://www.thejavaexperts.net/ 
 Email: brian.enochson@gmail.com 
 Twitter: @benochso 
 G+: https://plus.google.com/+BrianEnochson

More Related Content

What's hot

Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandra
Wu Liang
 
Ado.Net Architecture
Ado.Net ArchitectureAdo.Net Architecture
Ado.Net Architecture
Umar Farooq
 
Data decomposition techniques
Data decomposition techniquesData decomposition techniques
Data decomposition techniques
Mohamed Ramadan
 
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory optionStar Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Franck Pachot
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Anyscale
 
06 linked list
06 linked list06 linked list
06 linked list
Rajan Gautam
 
Db2 faqs
Db2 faqsDb2 faqs
Db2 faqs
kapa rohit
 
Data structures "1" (Lectures 2015-2016)
Data structures "1" (Lectures 2015-2016) Data structures "1" (Lectures 2015-2016)
Data structures "1" (Lectures 2015-2016)
Ameer B. Alaasam
 
MySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersMySQL 8.0 Featured for Developers
MySQL 8.0 Featured for Developers
Dave Stokes
 

What's hot (10)

Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandra
 
Ado.Net Architecture
Ado.Net ArchitectureAdo.Net Architecture
Ado.Net Architecture
 
Data decomposition techniques
Data decomposition techniquesData decomposition techniques
Data decomposition techniques
 
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory optionStar Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
 
FractalTreeIndex
FractalTreeIndexFractalTreeIndex
FractalTreeIndex
 
Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...Easy, scalable, fault tolerant stream processing with structured streaming - ...
Easy, scalable, fault tolerant stream processing with structured streaming - ...
 
06 linked list
06 linked list06 linked list
06 linked list
 
Db2 faqs
Db2 faqsDb2 faqs
Db2 faqs
 
Data structures "1" (Lectures 2015-2016)
Data structures "1" (Lectures 2015-2016) Data structures "1" (Lectures 2015-2016)
Data structures "1" (Lectures 2015-2016)
 
MySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersMySQL 8.0 Featured for Developers
MySQL 8.0 Featured for Developers
 

Viewers also liked

Yimby and growing your audience from zero to lots
Yimby and growing your audience from zero to lotsYimby and growing your audience from zero to lots
Yimby and growing your audience from zero to lots
Jonathan Waddingham
 
Medical Information Workshop (23 Jan 2007 )
Medical Information Workshop (23 Jan 2007 )Medical Information Workshop (23 Jan 2007 )
Medical Information Workshop (23 Jan 2007 )rwakefor
 
Kathleen's Powerpoint Presentation in Sir Rey's Computer Class
Kathleen's Powerpoint Presentation in Sir Rey's Computer ClassKathleen's Powerpoint Presentation in Sir Rey's Computer Class
Kathleen's Powerpoint Presentation in Sir Rey's Computer Class
rey ayento
 
Postal De Nadal 2008 09 Manel Sons
Postal De Nadal 2008 09 Manel SonsPostal De Nadal 2008 09 Manel Sons
Postal De Nadal 2008 09 Manel Sons
manelagui
 
10th Ceepus – Biomedicine Students’ Council Summer Eng
10th Ceepus – Biomedicine Students’ Council Summer Eng10th Ceepus – Biomedicine Students’ Council Summer Eng
10th Ceepus – Biomedicine Students’ Council Summer Eng
SU07
 
Autotools
Autotools Autotools
Autotools
easychen
 
Presentatie Lizzy Jongma Masterclass Open Cultuur Data
Presentatie Lizzy Jongma Masterclass Open Cultuur DataPresentatie Lizzy Jongma Masterclass Open Cultuur Data
Presentatie Lizzy Jongma Masterclass Open Cultuur DataKennisland
 
Enabling co-­creation of e-services through virtual worlds
Enabling co-­creation of e-services through virtual worldsEnabling co-­creation of e-services through virtual worlds
Enabling co-­creation of e-services through virtual worlds
Thomas Kohler
 
Integrating PHP With System-i using Web Services
Integrating PHP With System-i using Web ServicesIntegrating PHP With System-i using Web Services
Integrating PHP With System-i using Web Services
Ivo Jansch
 
Social media and local government
Social media and local governmentSocial media and local government
Social media and local government
simonwakeman
 
Achievo ATK, an Open Source project
Achievo ATK, an Open Source projectAchievo ATK, an Open Source project
Achievo ATK, an Open Source project
Ivo Jansch
 
Ict4volunteering Mm
Ict4volunteering MmIct4volunteering Mm
Ict4volunteering Mm
havs
 
HTML5 - Um Ano Depois
HTML5 - Um Ano DepoisHTML5 - Um Ano Depois
HTML5 - Um Ano Depois
Elcio Ferreira
 
MapIt1418
MapIt1418MapIt1418
MapIt1418
Kennisland
 
Good Luck
Good LuckGood Luck
Good Luck
irGoogle
 
ICT Sustainability
ICT SustainabilityICT Sustainability
ICT Sustainability
havs
 
H1B 2017 Predictions: Will There Be A H-1B Lottery Again?
H1B 2017 Predictions: Will There Be A H-1B Lottery Again?H1B 2017 Predictions: Will There Be A H-1B Lottery Again?
H1B 2017 Predictions: Will There Be A H-1B Lottery Again?
VisaPro Immigration Services LLC
 
Publizitate Eraginkortasunaren Baliospena 5
Publizitate Eraginkortasunaren Baliospena 5Publizitate Eraginkortasunaren Baliospena 5
Publizitate Eraginkortasunaren Baliospena 5
katixa
 

Viewers also liked (20)

Yimby and growing your audience from zero to lots
Yimby and growing your audience from zero to lotsYimby and growing your audience from zero to lots
Yimby and growing your audience from zero to lots
 
Medical Information Workshop (23 Jan 2007 )
Medical Information Workshop (23 Jan 2007 )Medical Information Workshop (23 Jan 2007 )
Medical Information Workshop (23 Jan 2007 )
 
Kathleen's Powerpoint Presentation in Sir Rey's Computer Class
Kathleen's Powerpoint Presentation in Sir Rey's Computer ClassKathleen's Powerpoint Presentation in Sir Rey's Computer Class
Kathleen's Powerpoint Presentation in Sir Rey's Computer Class
 
Postal De Nadal 2008 09 Manel Sons
Postal De Nadal 2008 09 Manel SonsPostal De Nadal 2008 09 Manel Sons
Postal De Nadal 2008 09 Manel Sons
 
10th Ceepus – Biomedicine Students’ Council Summer Eng
10th Ceepus – Biomedicine Students’ Council Summer Eng10th Ceepus – Biomedicine Students’ Council Summer Eng
10th Ceepus – Biomedicine Students’ Council Summer Eng
 
Autotools
Autotools Autotools
Autotools
 
Presentatie Lizzy Jongma Masterclass Open Cultuur Data
Presentatie Lizzy Jongma Masterclass Open Cultuur DataPresentatie Lizzy Jongma Masterclass Open Cultuur Data
Presentatie Lizzy Jongma Masterclass Open Cultuur Data
 
Enabling co-­creation of e-services through virtual worlds
Enabling co-­creation of e-services through virtual worldsEnabling co-­creation of e-services through virtual worlds
Enabling co-­creation of e-services through virtual worlds
 
Integrating PHP With System-i using Web Services
Integrating PHP With System-i using Web ServicesIntegrating PHP With System-i using Web Services
Integrating PHP With System-i using Web Services
 
Social media and local government
Social media and local governmentSocial media and local government
Social media and local government
 
Achievo ATK, an Open Source project
Achievo ATK, an Open Source projectAchievo ATK, an Open Source project
Achievo ATK, an Open Source project
 
Ict4volunteering Mm
Ict4volunteering MmIct4volunteering Mm
Ict4volunteering Mm
 
IoF South West Conference
IoF South West ConferenceIoF South West Conference
IoF South West Conference
 
HTML5 - Um Ano Depois
HTML5 - Um Ano DepoisHTML5 - Um Ano Depois
HTML5 - Um Ano Depois
 
MapIt1418
MapIt1418MapIt1418
MapIt1418
 
Good Luck
Good LuckGood Luck
Good Luck
 
Visual image
Visual imageVisual image
Visual image
 
ICT Sustainability
ICT SustainabilityICT Sustainability
ICT Sustainability
 
H1B 2017 Predictions: Will There Be A H-1B Lottery Again?
H1B 2017 Predictions: Will There Be A H-1B Lottery Again?H1B 2017 Predictions: Will There Be A H-1B Lottery Again?
H1B 2017 Predictions: Will There Be A H-1B Lottery Again?
 
Publizitate Eraginkortasunaren Baliospena 5
Publizitate Eraginkortasunaren Baliospena 5Publizitate Eraginkortasunaren Baliospena 5
Publizitate Eraginkortasunaren Baliospena 5
 

Similar to Cassandra20141113

A Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in ParallelA Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in Parallel
Jenny Liu
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
YounesCharfaoui
 
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureData Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
Kent Graziano
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
Andrey Lomakin
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandraPL dream
 
Cassandra
CassandraCassandra
Cassandra
Bang Tsui Liou
 
Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)
zznate
 
Meetup cassandra for_java_cql
Meetup cassandra for_java_cqlMeetup cassandra for_java_cql
Meetup cassandra for_java_cql
zznate
 
Apache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis PriceApache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis Price
DataStax Academy
 
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLELA TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
Jenny Liu
 
Cassandra no sql ecosystem
Cassandra no sql ecosystemCassandra no sql ecosystem
Apache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machineryApache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machinery
Andrey Lomakin
 
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra GuruUse Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
Tim Callaghan
 
Vsam interview questions and answers.
Vsam interview questions and answers.Vsam interview questions and answers.
Vsam interview questions and answers.Sweta Singh
 
7. SQL.pptx
7. SQL.pptx7. SQL.pptx
7. SQL.pptx
chaitanya149090
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandrarantav
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
Lars Albertsson
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Applicationsupertom
 
SenchaCon 2016: The Once and Future Grid - Nige White
SenchaCon 2016: The Once and Future Grid - Nige WhiteSenchaCon 2016: The Once and Future Grid - Nige White
SenchaCon 2016: The Once and Future Grid - Nige White
Sencha
 

Similar to Cassandra20141113 (20)

A Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in ParallelA Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in Parallel
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureData Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
 
Apache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data modelApache Cassandra, part 1 – principles, data model
Apache Cassandra, part 1 – principles, data model
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)
 
Meetup cassandra for_java_cql
Meetup cassandra for_java_cqlMeetup cassandra for_java_cql
Meetup cassandra for_java_cql
 
Apache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis PriceApache Cassandra Data Modeling with Travis Price
Apache Cassandra Data Modeling with Travis Price
 
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLELA TALE of DATA PATTERN DISCOVERY IN PARALLEL
A TALE of DATA PATTERN DISCOVERY IN PARALLEL
 
Cassandra no sql ecosystem
Cassandra no sql ecosystemCassandra no sql ecosystem
Cassandra no sql ecosystem
 
Apache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machineryApache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machinery
 
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra GuruUse Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra Guru
 
Vsam interview questions and answers.
Vsam interview questions and answers.Vsam interview questions and answers.
Vsam interview questions and answers.
 
Cassandra data modelling best practices
Cassandra data modelling best practicesCassandra data modelling best practices
Cassandra data modelling best practices
 
7. SQL.pptx
7. SQL.pptx7. SQL.pptx
7. SQL.pptx
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandra
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Application
 
SenchaCon 2016: The Once and Future Grid - Nige White
SenchaCon 2016: The Once and Future Grid - Nige WhiteSenchaCon 2016: The Once and Future Grid - Nige White
SenchaCon 2016: The Once and Future Grid - Nige White
 

More from Brian Enochson

Hadoop20141125
Hadoop20141125Hadoop20141125
Hadoop20141125
Brian Enochson
 
Asbury Hadoop Overview
Asbury Hadoop OverviewAsbury Hadoop Overview
Asbury Hadoop Overview
Brian Enochson
 
Big Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and CassasdraBig Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and Cassasdra
Brian Enochson
 
NoSQL and MongoDB Introdction
NoSQL and MongoDB IntrodctionNoSQL and MongoDB Introdction
NoSQL and MongoDB Introdction
Brian Enochson
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandra
Brian Enochson
 
Cassandra Deep Diver & Data Modeling
Cassandra Deep Diver & Data ModelingCassandra Deep Diver & Data Modeling
Cassandra Deep Diver & Data Modeling
Brian Enochson
 

More from Brian Enochson (6)

Hadoop20141125
Hadoop20141125Hadoop20141125
Hadoop20141125
 
Asbury Hadoop Overview
Asbury Hadoop OverviewAsbury Hadoop Overview
Asbury Hadoop Overview
 
Big Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and CassasdraBig Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and Cassasdra
 
NoSQL and MongoDB Introdction
NoSQL and MongoDB IntrodctionNoSQL and MongoDB Introdction
NoSQL and MongoDB Introdction
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandra
 
Cassandra Deep Diver & Data Modeling
Cassandra Deep Diver & Data ModelingCassandra Deep Diver & Data Modeling
Cassandra Deep Diver & Data Modeling
 

Recently uploaded

Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Globus
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
varshanayak241
 
Cyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdfCyaniclab : Software Development Agency Portfolio.pdf
Cyaniclab : Software Development Agency Portfolio.pdf
Cyanic lab
 
Visitor Management System in India- Vizman.app
Visitor Management System in India- Vizman.appVisitor Management System in India- Vizman.app
Visitor Management System in India- Vizman.app
NaapbooksPrivateLimi
 
Software Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdfSoftware Testing Exam imp Ques Notes.pdf
Software Testing Exam imp Ques Notes.pdf
MayankTawar1
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Explore Modern SharePoint Templates for 2024
Explore Modern SharePoint Templates for 2024Explore Modern SharePoint Templates for 2024
Explore Modern SharePoint Templates for 2024
Sharepoint Designs
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
XfilesPro
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
Peter Caitens
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 

Recently uploaded (20)

Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
Cassandra20141113

  • 1. Cassandra: Overview and Real World Applications. Jersey Shore Tech Meetup, Nov 13, 2014
  • 2. You Are Not Here… *** http://njhalloffame.org/
  • 3. Agenda
      - Some Basic Concepts/Overview
      - New Developments in Cassandra
      - Basic Data Modeling Concepts
      - Materialized Views
      - Secondary Indexes
      - Counters
      - Time Series Data
      - Expiring Data
  • 4. Cassandra High Level
      Cassandra's architecture is based on the combination of two technologies:
      - Google BigTable: data model
      - Amazon Dynamo: distributed architecture
      Note: Cassandra and C* mean the same thing.
  • 5. Architecture Basics & Terminology
      - A node is a single instance of C*.
      - A cluster is a group of nodes.
      - Data is organized by keys (tokens), which are distributed across the cluster.
      - Replication Factor (RF) determines how many copies of each key are kept.
      - Data center aware: works well in multi-DC/EC2 deployments.
      - Consistency Level: a powerful feature for tuning consistency vs. speed vs. availability.
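
As a rough illustration of where the replication factor is set, a keyspace definition might look like the sketch below. The keyspace names `demo` and `demo_multi_dc` and the data center names `dc_east` and `dc_west` are made up for this example; real data center names have to match what the cluster itself reports.

```cql
-- Single data center: keep 3 copies of every row.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Multiple data centers: the replication factor is set per data center.
-- 'dc_east' and 'dc_west' are hypothetical names.
CREATE KEYSPACE IF NOT EXISTS demo_multi_dc
  WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_east': 3, 'dc_west': 2};
```
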
  • 7. More Architecture
      - Information on who has what data and who is available is exchanged using gossip.
      - No single point of failure (SPOF); every node can service requests.
      - Handles replication and downed nodes (within reason).
  • 8. CAP Theorem
      - Distributed systems law: of the three properties below, you can only really have two in a distributed system.
          - Consistency
          - Availability
          - Partition Tolerance
      - Cassandra is AP, with eventual consistency.
  • 9. Consistency
      - Cassandra uses the concept of Tunable Consistency: each request can choose how many replicas must respond, which makes it very powerful and flexible for system needs.
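
One way to see tunable consistency in action is the `CONSISTENCY` command in cqlsh (a shell command, not a CQL statement). The sketch below assumes a hypothetical `users` table keyed by `emailaddress`, in a keyspace with a replication factor of 3.

```cql
-- Favor speed and availability: a single replica must respond.
CONSISTENCY ONE;
SELECT firstname FROM users WHERE emailaddress = 'tpeters@example.com';

-- Favor consistency: a majority of replicas (2 of 3) must respond.
CONSISTENCY QUORUM;
SELECT firstname FROM users WHERE emailaddress = 'tpeters@example.com';
```

As a rule of thumb, a read sees the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor, for example QUORUM reads combined with QUORUM writes.
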
  • 13. Data Model Architecture
      - Keyspace: container of column families (tables). Defines the RF, among other settings.
      - Table: a column family. Contains the schema definition.
      - Row: a "record" identified by a key.
      - Column: a key and a value.
  • 15. Deletions
      - Distributed systems present a unique problem for deletes. If Cassandra actually deleted the data, and a node was down and did not receive the delete notice, that node would try to recreate the record when it came back online. So…
      - Tombstone: the data is replaced with a special value called a tombstone, which works within the distributed architecture.
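
A small sketch of what this means in practice, assuming a hypothetical `users` table keyed by `emailaddress`. The `gc_grace_seconds` table option controls how long tombstones are kept before compaction may discard them.

```cql
-- Writes a tombstone for the row instead of removing the data immediately.
DELETE FROM users WHERE emailaddress = 'jsmith@example.com';

-- Tombstones are kept for gc_grace_seconds, giving replicas that were down
-- time to learn about the delete. 864000 seconds (10 days) is the default.
ALTER TABLE users WITH gc_grace_seconds = 864000;
```
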
  • 16. Keys
      - Primary Key
          - Partition Key: identifies a row
          - Cluster Key: sorting within a row
      - Using CQL, these are defined together as a compound (composite) key.
      - Compound keys are how you implement "wide rows", the COOL FEATURE!
  • 17. Single Primary Key
      create table users (
        user_id UUID PRIMARY KEY,
        firstname text,
        lastname text,
        emailaddress text
      );
      ** Cassandra Data Types: http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/cql_data_types_c.html
  • 18. Compound Key
      create table users (
        emailaddress text,
        department text,
        firstname text,
        lastname text,
        PRIMARY KEY (emailaddress, department)
      );
      - Partition key plus cluster key
      - emailaddress is the partition key
      - department is the cluster key
  • 19. Compound Key
      create table users (
        emailaddress text,
        department text,
        country text,
        firstname text,
        lastname text,
        PRIMARY KEY ((emailaddress, department), country)
      );
      - Partition key plus cluster key
      - emailaddress and department together form the partition key
      - country is the cluster key
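
To make the effect of a two-column partition key concrete, here is a sketch of queries against the `users` table from slide 19. The department value is invented for illustration, and the exact error text for the rejected query depends on the Cassandra version.

```cql
-- Both partition key columns are given: the query goes to a single partition.
SELECT * FROM users
 WHERE emailaddress = 'tpeters@example.com' AND department = 'engineering';

-- The cluster key can narrow the result within that partition.
SELECT * FROM users
 WHERE emailaddress = 'tpeters@example.com' AND department = 'engineering'
   AND country = 'DE';

-- Expected to be rejected: only part of the partition key is supplied,
-- so the owning node cannot be determined from the key alone.
-- SELECT * FROM users WHERE emailaddress = 'tpeters@example.com';
```
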
  • 20. New Rules
      - Writes are cheap
      - Denormalize all you need
      - Model your queries, not your data (understand access patterns)
      - The application worries about joins
  • 21. What’s New In 2.0
      - Conditional DDL: IF EXISTS or IF NOT EXISTS
      - Drop column support:
          ALTER TABLE users DROP lastname;
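
A brief sketch of conditional DDL; the table name `audit_log` is hypothetical.

```cql
-- Safe to run repeatedly: no error if the table is already there.
CREATE TABLE IF NOT EXISTS audit_log (
  id uuid PRIMARY KEY,
  message text
);

-- Safe to run even if the table was never created.
DROP TABLE IF EXISTS audit_log;
```

This is mostly useful in setup and migration scripts that may be run more than once.
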
  • 22. More New Stuff
      - Triggers
          CREATE TRIGGER myTrigger ON myTable USING 'com.thejavaexperts.cassandra.updateevt';
      - Lightweight Transactions (CAS)
          UPDATE users SET firstname = 'tim' WHERE emailaddress = 'tpeters@example.com' IF firstname = 'tom';
      ** Not like an ACID transaction!!
  • 23. CAS & Transactions
      - CAS: compare-and-set operations. A single atomic operation compares the value of a column in the database and applies a modification depending on the result of the comparison.
      - Consider the performance hit. CAS is (was) considered an anti-pattern.
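
Besides the conditional UPDATE on slide 22, the other common form of compare-and-set is a conditional INSERT. A sketch, assuming the `users` table from slide 18 and an invented department value:

```cql
-- Only creates the row if no row with this primary key exists yet.
INSERT INTO users (emailaddress, department, firstname, lastname)
VALUES ('tpeters@example.com', 'engineering', 'tom', 'peters')
IF NOT EXISTS;
```

cqlsh reports the outcome in an `[applied]` column of the result. The condition is checked with a Paxos round among the replicas, which is where the performance hit comes from.
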
  • 24. Data Modeling… The Basics
      - With CQL, Cassandra now feels very familiar to RDBMS/SQL users.
      - CQL very nicely hides the underlying data storage model.
      - You still have all the power of Cassandra; it is all in the key definition.
      - RDBMS = model data; Cassandra = model access (queries).
  • 25. Side-Note On Querying
      - Create a table with a compound key
      - Select using ALLOW FILTERING
      - Counts
      - Select using IN or =
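
The demo steps above could look roughly like the following, assuming the `users` table from slide 18 (partition key `emailaddress`, cluster key `department`) and invented values. The exact rules for when ALLOW FILTERING is accepted differ between Cassandra versions, so treat this as a sketch.

```cql
-- Equality on the partition key: the efficient case.
SELECT * FROM users WHERE emailaddress = 'tpeters@example.com';

-- IN on the partition key: fetches several partitions in one statement.
SELECT * FROM users
 WHERE emailaddress IN ('tpeters@example.com', 'jsmith@example.com');

-- Restricting only the cluster key needs ALLOW FILTERING,
-- because every partition may have to be scanned.
SELECT * FROM users WHERE department = 'engineering' ALLOW FILTERING;

-- Counts rows; on a large table this touches the whole cluster.
SELECT COUNT(*) FROM users;
```
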
  • 26. Batch Operations
      - Save network round trips
      - Can contain INSERT, UPDATE, DELETE
      - Atomic by default (all or nothing)
      - Can use a timestamp for specific ordering
  • 27. Batch Operation Example
      BEGIN BATCH
        INSERT INTO users (emailaddress, firstname, lastname, country) VALUES ('brian.enochson@gmail.com', 'brian', 'enochson', 'USA');
        INSERT INTO users (emailaddress, firstname, lastname, country) VALUES ('tpeters@example.com', 'tom', 'peters', 'DE');
        INSERT INTO users (emailaddress, firstname, lastname, country) VALUES ('jsmith@example.com', 'jim', 'smith', 'USA');
        INSERT INTO users (emailaddress, firstname, lastname, country) VALUES ('arogers@example.com', 'alan', 'rogers', 'USA');
        DELETE FROM users WHERE emailaddress = 'jsmith@example.com';
      APPLY BATCH;
      - Select in cqlsh
      - List in cassandra-cli with timestamp
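
Statements in a batch share one timestamp by default, so the outcome of an insert and a delete on the same row does not follow the order in which they are written. A sketch of forcing the order with explicit timestamps (microseconds since the epoch; the values are arbitrary), assuming the same `users` table as the batch above:

```cql
BEGIN BATCH
  INSERT INTO users (emailaddress, firstname, lastname, country)
  VALUES ('jsmith@example.com', 'jim', 'smith', 'USA')
  USING TIMESTAMP 1415836800000000;

  -- Later timestamp, so the delete wins over the insert above.
  DELETE FROM users USING TIMESTAMP 1415836800000001
  WHERE emailaddress = 'jsmith@example.com';
APPLY BATCH;

-- WRITETIME shows the write timestamp stored with a column.
SELECT firstname, WRITETIME(firstname) FROM users
 WHERE emailaddress = 'tpeters@example.com';
```
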
  • 28. More Data Modeling…
      - No joins
      - No foreign keys
      - No third (or any other) normal form concerns
      - Redundant data is encouraged. Apps maintain consistency.
  • 29. Secondary Indexes
      - Indexes can be defined to allow access by columns other than the partition key.
      - Each node has a local index for its own data.
      - They have their uses, but should not be used everywhere without consideration.
      - We will look at alternatives.
  • 30. Secondary Index Example
      - Create a table
      - Try to select with a column that is not in the PK
      - Add a secondary index
      - Try the select again (may need to reinsert)
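
A minimal sketch of these steps. The table `employees` and the index name `employees_country_idx` are invented for this example.

```cql
CREATE TABLE IF NOT EXISTS employees (
  emailaddress text PRIMARY KEY,
  firstname text,
  lastname text,
  country text
);

-- Not accepted as-is without an index: country is not part of the primary key.
-- SELECT * FROM employees WHERE country = 'USA';

CREATE INDEX employees_country_idx ON employees (country);

-- Now allowed; each node answers from the local index over its own data.
SELECT * FROM employees WHERE country = 'USA';
```
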
  • 31. When to use?
      - Low cardinality: a small number of unique values
      - High cardinality: a high number of distinct values
      - Secondary indexes are good for low-cardinality columns such as country codes or department codes, not for email addresses.
  • 32. Materialized View
      - If you want full distribution, you can use what is called the Materialized View pattern.
      - Remember, redundant data is fine.
      - Model the queries.
  • 33. Materialized View Example
      - Show a normal table with a compound key and its querying limitations.
      - Create a materialized view table with a different compound key to support alternate access.
      - Selects use the partition key.
      - Secondary indexes are local, not distributed.
      - ALLOW FILTERING can cause performance issues.
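
A sketch of the pattern as a second, application-maintained table. The table names `users_by_email` and `users_by_country` are hypothetical; the point is that the same data is written twice, once per access path.

```cql
-- Base table: look users up by email address.
CREATE TABLE IF NOT EXISTS users_by_email (
  emailaddress text PRIMARY KEY,
  firstname text,
  lastname text,
  country text
);

-- "Materialized view" table: the same data, keyed for lookup by country.
CREATE TABLE IF NOT EXISTS users_by_country (
  country text,
  emailaddress text,
  firstname text,
  lastname text,
  PRIMARY KEY (country, emailaddress)
);

-- The application writes to both tables; a batch keeps them in step.
BEGIN BATCH
  INSERT INTO users_by_email (emailaddress, firstname, lastname, country)
  VALUES ('tpeters@example.com', 'tom', 'peters', 'DE');
  INSERT INTO users_by_country (country, emailaddress, firstname, lastname)
  VALUES ('DE', 'tpeters@example.com', 'tom', 'peters');
APPLY BATCH;

-- The alternate access path is now a plain partition key lookup.
SELECT * FROM users_by_country WHERE country = 'DE';
```
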
  • 34. Counters
      - Updated in 2.1; they now work in a more distributed and accurate manner.
      - Table organization, example
      - How to update, view, etc.
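
A sketch of a counter table and how it is updated and read. The table `page_views` and its columns are invented for illustration.

```cql
-- In a counter table, every column outside the primary key must be a counter.
CREATE TABLE IF NOT EXISTS page_views (
  page text PRIMARY KEY,
  views counter
);

-- Counters are never inserted; the first UPDATE creates the row.
UPDATE page_views SET views = views + 1 WHERE page = '/home';
UPDATE page_views SET views = views + 5 WHERE page = '/home';

SELECT views FROM page_views WHERE page = '/home';
```
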
  • 35. Time Series Example…
      - Time series table model.
      - Need to consider the interval for event frequency and the wide row size.
      - Make the thing being tracked, together with the unit of interval, the partition key.
  • 36. Time Series Data
      - Because of its fast write path, Cassandra is well suited to storing time series data.
      - The Cassandra wide row is a perfect fit for modeling time series / time-based events.
      - Let’s look at an example…
  • 37. Event Data
      - Notice the primary key and the cluster key.
      - Insert some data.
      - View in CQL, then in the CLI as a wide row.
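
One possible shape for such a time series table, as a sketch. The table `sensor_events`, its columns, and the choice of one partition per sensor per day are assumptions made for this example; the right interval depends on how often events arrive.

```cql
CREATE TABLE IF NOT EXISTS sensor_events (
  sensor_id text,
  day text,               -- interval bucket, e.g. '2014-11-13'
  event_time timestamp,
  reading double,
  PRIMARY KEY ((sensor_id, day), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

INSERT INTO sensor_events (sensor_id, day, event_time, reading)
VALUES ('sensor-1', '2014-11-13', '2014-11-13 09:00:00', 21.5);
INSERT INTO sensor_events (sensor_id, day, event_time, reading)
VALUES ('sensor-1', '2014-11-13', '2014-11-13 09:01:00', 21.7);

-- All events for one sensor and day sit in one wide row, newest first.
SELECT event_time, reading FROM sensor_events
 WHERE sensor_id = 'sensor-1' AND day = '2014-11-13'
   AND event_time >= '2014-11-13 09:00:00'
 LIMIT 10;
```

Bucketing by day caps the size of each wide row; a busier source might need an hourly bucket instead.
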
  • 38. TTL – Self Expiring Data
      - Another technique covers data that has a defined lifespan.
      - For instance: session identifiers, temporary passwords, etc.
      - For this, Cassandra provides a Time To Live (TTL) mechanism.
  • 39. TTL Example…
      - Create a table
      - Insert data using TTL
      - Can update a specific column with a TTL
      - Show using selects
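
These steps could look roughly like the sketch below. The `sessions` table, its columns, and the values are invented for illustration.

```cql
CREATE TABLE IF NOT EXISTS sessions (
  session_id uuid PRIMARY KEY,
  username text,
  temp_password text
);

-- All columns written by this INSERT expire after one hour.
INSERT INTO sessions (session_id, username, temp_password)
VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'tpeters', 'changeme')
USING TTL 3600;

-- Only this column is rewritten, with a shorter lifespan of five minutes.
UPDATE sessions USING TTL 300
   SET temp_password = 'changeme2'
 WHERE session_id = 62c36092-82a1-3a00-93d1-46196ee77204;

-- TTL() returns the remaining seconds before the column expires.
SELECT username, TTL(temp_password) FROM sessions
 WHERE session_id = 62c36092-82a1-3a00-93d1-46196ee77204;
```
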
  • 40. Questions
      - http://www.thejavaexperts.net/
      - Email: brian.enochson@gmail.com
      - Twitter: @benochso
      - G+: https://plus.google.com/+BrianEnochson