©2013 DataStax Confidential. Do not distribute without consent.
Jon Haddad, Luke Tillman
Technical Evangelists, DataStax
@rustyrazorblade, @LukeTillman
Introduction to Apache Cassandra
What is Apache Cassandra?
• Fast Distributed Database
• High Availability
• Linear Scalability
• Predictable Performance
• No SPOF
• Multi-DC
• Commodity Hardware
• Easy to manage operationally
Hash Ring
• No master / slave / replica sets
• No config servers, no ZooKeeper
• Data is partitioned around the ring
• Data is replicated to RF=N servers
• All nodes hold data and can answer queries (both reads & writes)
• Location of data on ring is determined by partition key
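The ring placement described above can be sketched in a few lines of Python. This is a simplified model, not Cassandra's implementation: it assumes one token per node, uses MD5 as a stand-in hash (Cassandra's default partitioner uses Murmur3), and the names `Ring`, `token`, and `replicas` are made up for illustration.

```python
import bisect
import hashlib


def token(key: str) -> int:
    # Stand-in hash for illustration; Cassandra's default partitioner is Murmur3.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class Ring:
    """Hypothetical ring with one token per node (real clusters often use many)."""

    def __init__(self, nodes, rf):
        self.rf = rf
        # Sort nodes by token so walking the list means walking the ring clockwise.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key):
        # Find the first node at or after the key's token, wrapping past the end.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.entries)
        # The next RF nodes clockwise hold the copies of this partition.
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(self.rf)]


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
ring = Ring(nodes, rf=3)
owners = ring.replicas("JCVD")
print(owners)
```

The same partition key always maps to the same RF nodes, so any node can work out where a partition lives without asking a master or a config server.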
CAP Tradeoffs
• Cassandra chooses Availability &
Partition Tolerance over Consistency
• Queries have tunable consistency level
• ALL, QUORUM, ONE
• Hinted Handoff to deal with failed nodes
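To make the tunable consistency levels concrete, here is a small sketch of how many replica acknowledgements each level needs for a given replication factor. The helper names are hypothetical. The overlap check uses the general rule that a read is guaranteed to see the latest write when read replicas plus write replicas exceed RF.

```python
def quorum(rf: int) -> int:
    # A quorum is a strict majority of the replicas.
    return rf // 2 + 1


def required_acks(level: str, rf: int) -> int:
    # Only the three levels named on the slide are modeled here.
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[level]


def overlaps(read_level: str, write_level: str, rf: int) -> bool:
    # If R + W > RF, every read set shares at least one replica with every write set.
    return required_acks(read_level, rf) + required_acks(write_level, rf) > rf


for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_acks(level, rf=3))
print(overlaps("QUORUM", "QUORUM", rf=3))
print(overlaps("ONE", "ONE", rf=3))
```

With RF=3, QUORUM reads and writes each need 2 replicas and always overlap, while ONE/ONE trades that guarantee for lower latency and higher availability.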
Data Modeling
Data Structures
• Like an RDBMS, Cassandra uses a Table to store data
• But that’s where the similarities end
• Partitions within tables
• Rows within partitions (or a single row)
• CQL to create tables & query data
• Partition keys determine where a partition
is found
• Clustering keys determine ordering of rows
within a partition
(Diagram: a keyspace contains tables, a table contains partitions, a partition contains rows)
Example: Single Row Partition
• Simple User system
• Identified by name (pk)
• 1 Row per partition
• This is familiar territory
name       age  job
jon        33   evangelist
luke       33   evangelist
old pete   108  retired
s. seagal  62   actor
JCVD       53   actor
cqlsh:demo> create table user
(name text primary key,
age int,
job text);
cqlsh:demo> select * from user where name = 'JCVD';
Example: Multiple Rows
• Comments on photos
• Comments are always selected by
the photo_id
• There are only 4 rows in 2 partitions
• In the real world, use UUIDs instead
of int for PK
photo_id  comment_id  user  comment
5         1           jon   hi
5         2           luke  oh hey
5         3           JCVD  AHHHHH!!!
6         4           jon   great pic
create table comment
( photo_id int,
comment_id int,
user text,
comment text,
primary key (photo_id, comment_id));
select * from comment where photo_id=5;
Partition with Clustering
photo_id | comment_id user comment | comment_id user comment | comment_id user comment
5        | 1 jon hi                | 2 luke oh hey           | 3 JCVD AHHHHH!!!
6        | 4 jon great pic         |                         |
• Multiple rows are transposed into a single partition
• Partitions vary in size
• Old terminology - "wide row"
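A rough in-memory model of this layout, using the comment rows from the example: the partition key (`photo_id`) selects a partition, and rows inside it come back ordered by the clustering key (`comment_id`). The dictionary layout and the `select_by_photo` helper are illustrative assumptions, not how Cassandra stores data on disk.

```python
from collections import defaultdict

# Rows arrive in arbitrary order: (photo_id, comment_id, user, comment).
rows = [
    (5, 3, "JCVD", "AHHHHH!!!"),
    (6, 4, "jon", "great pic"),
    (5, 1, "jon", "hi"),
    (5, 2, "luke", "oh hey"),
]

# partition key -> {clustering key -> remaining columns}
partitions = defaultdict(dict)
for photo_id, comment_id, user, comment in rows:
    partitions[photo_id][comment_id] = (user, comment)


def select_by_photo(photo_id):
    # Mirrors: select * from comment where photo_id = ?
    part = partitions.get(photo_id, {})
    # Rows within a partition are returned in clustering-key order.
    return [(cid, *part[cid]) for cid in sorted(part)]


print(select_by_photo(5))
print(len(partitions))
```

Four rows land in two partitions, and the partition for photo 5 holds three rows, which is why partitions vary in size.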
Model Tables to Answer Queries
• This is not 3NF!!
• We always query by partition key
• Create many tables aka
materialized views
• Manage in your app code
• Denormalize!!
user  age
jon   33
luke  33
JCVD  53

age  user  user
33   jon   luke
53   JCVD
CREATE TABLE age_to_user (
age int,
user text,
primary key (age, user)
);
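"Manage in your app code" means the application writes to every table that serves a query. The sketch below models the user table and age_to_user as plain Python dictionaries; `save_user` and `users_with_age` are hypothetical helpers, and a real application would issue the equivalent CQL writes through a driver.

```python
from collections import defaultdict

user_by_name = {}                # models the user table (partition key: name)
users_by_age = defaultdict(set)  # models age_to_user (partition key: age)


def save_user(name, age):
    # The app, not the database, keeps both tables in step.
    old_age = user_by_name.get(name)
    if old_age is not None and old_age != age:
        # Remove the stale entry from the old age partition.
        users_by_age[old_age].discard(name)
    user_by_name[name] = age
    users_by_age[age].add(name)


def users_with_age(age):
    # One partition lookup answers the query; no scan over all users.
    return sorted(users_by_age.get(age, ()))


for name, age in [("jon", 33), ("luke", 33), ("JCVD", 53)]:
    save_user(name, age)

print(users_with_age(33))
print(users_with_age(53))
```

Each extra table costs an extra write, which fits a database built for fast writes.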
CQL Data Types
Basic Types: text, int, decimal, blob, uuid, timeuuid, counter
Collections: map, list, set
Read the CQL documentation for the full list of types
Reads & Writes
The Write Path
• Writes can be sent to any node in the cluster; that node acts as the coordinator
• Writes are written to commit log, then to
memtable
• Every write includes a timestamp
• Memtable flushed to disk periodically
(sstable)
• New memtable is created in memory
• Deletes are actually a special write case,
called a “tombstone”
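The write path on a single node can be sketched as follows. This is a toy model under stated assumptions: the `Node` class is hypothetical, the memtable flushes after a fixed number of entries (`memtable_limit`) purely to keep the example small, and timestamps come from a simple counter.

```python
import itertools

TOMBSTONE = "<tombstone>"  # marker value standing in for a delete


class Node:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory: key -> (value, timestamp)
        self.sstables = []     # flushed tables, oldest first
        self.memtable_limit = memtable_limit  # assumed flush trigger
        self._clock = itertools.count(1)

    def write(self, key, value):
        ts = next(self._clock)                    # every write carries a timestamp
        self.commit_log.append((key, value, ts))  # 1. commit log first
        self.memtable[key] = (value, ts)          # 2. then the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just a write of a tombstone.
        self.write(key, TOMBSTONE)

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))  # written once, never edited
            self.memtable = {}                         # a new memtable takes over


node = Node(memtable_limit=2)
node.write("jon", "evangelist")
node.write("luke", "evangelist")  # second entry triggers a flush
node.delete("jon")                # tombstone lands in the new memtable
print(node.sstables, node.memtable)
```

Nothing is updated in place: the delete does not touch the flushed SSTable, it only records a newer tombstone for the same key.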
What is an SSTable?
• Immutable data file for row storage
• Deletes are written as tombstones
• Every write includes a timestamp of when it
was written
• A partition may be spread across multiple SSTables
• The same column can be in multiple SSTables
• Merged through compaction; only the value with the latest timestamp is kept
• Easy backups!
(Diagram: several SSTables merged into a single SSTable by compaction)
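The merge rule behind compaction (latest timestamp wins) fits in a short function. This is a minimal sketch: SSTables are modeled as dictionaries of key -> (value, timestamp), `compact` is a hypothetical name, and tombstones are kept in the output because this model does not cover when they become safe to drop.

```python
TOMBSTONE = None  # stand-in marker for a deleted value


def compact(sstables):
    """Merge several SSTables into one, keeping the newest cell per key."""
    merged = {}
    for table in sstables:
        for key, (value, ts) in table.items():
            # Keep a cell only if it is newer than what we have seen so far.
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    return merged


old = {"jon": ("evangelist", 1), "pete": ("retired", 2)}
newer = {"jon": ("engineer", 5)}
newest = {"pete": (TOMBSTONE, 7)}

compacted = compact([old, newer, newest])
print(compacted)
```

Because the decision depends only on timestamps, the result is the same no matter which order the SSTables are read in.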
The Read Path
• Any server may be queried; it acts as the coordinator
• The coordinator contacts nodes with the requested key
• On each node, data is pulled from SSTables and merged
• Consistency < ALL performs read repair in the background (read_repair_chance)
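A simplified view of the coordinator's job on a read: collect the cell from each replica, return the one with the latest timestamp, and push that cell to replicas that were behind. This sketch assumes every replica is contacted and repaired synchronously, which is a simplification of the background read repair mentioned above; `coordinate_read` and the replica dictionaries are hypothetical.

```python
def coordinate_read(key, replicas):
    """replicas: node name -> {key: (value, timestamp)}. Returns (value, repaired nodes)."""
    # Ask each replica for its copy of the cell (None if it has nothing).
    responses = {name: data.get(key) for name, data in replicas.items()}
    answered = [cell for cell in responses.values() if cell is not None]
    if not answered:
        return None, []
    # The cell with the latest timestamp wins.
    newest = max(answered, key=lambda cell: cell[1])
    # Any replica that returned something else is out of date.
    stale = sorted(name for name, cell in responses.items() if cell != newest)
    for name in stale:
        replicas[name][key] = newest  # read repair: bring the replica up to date
    return newest[0], stale


replicas = {
    "node-a": {"jon": ("engineer", 5)},
    "node-b": {"jon": ("evangelist", 1)},  # missed the later write
    "node-c": {},                          # missed both writes
}

value, repaired = coordinate_read("jon", replicas)
print(value, repaired)
```

After the read, all three replicas hold the newest cell, so a later read at any consistency level returns the same value.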
Analytics with Spark
Spark at a Glance
• Scala, Python, Java
• Hadoop alternative - batch analytics
• Distributed SQL
• Real time analytics via streaming
• Machine learning
• GraphX (in progress)
• Open source connector available
• Built into DSE
Summary
• How do I query my data if I can only
query by key?
• Denormalize!
• Create multiple views into your data
(multiple tables)
• Cassandra is built for fast writes
• Use fast writes to do as few reads as
possible
• Use Spark for advanced analytics and
real time analysis

Essentials of Automations: Exploring Attributes & Automation Parameters
Safe Software
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Precisely
 

Recently uploaded (20)

5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
 
Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving | Nameplate Manufacturing Process - 2024
Northern Engraving | Nameplate Manufacturing Process - 2024
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Artificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic WarfareArtificial Intelligence and Electronic Warfare
Artificial Intelligence and Electronic Warfare
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Y-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PPY-Combinator seed pitch deck template PP
Y-Combinator seed pitch deck template PP
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsConnector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
Essentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation ParametersEssentials of Automations: Exploring Attributes & Automation Parameters
Essentials of Automations: Exploring Attributes & Automation Parameters
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 

Cassandra Day Atlanta 2015: Introduction to Apache Cassandra & DataStax Enterprise

  • 1. Introduction to Apache Cassandra • Jon Haddad, Luke Tillman • Technical Evangelists, DataStax • @rustyrazorblade, @LukeTillman • ©2013 DataStax Confidential. Do not distribute without consent.
  • 2. What is Apache Cassandra? • Fast Distributed Database • High Availability • Linear Scalability • Predictable Performance • No SPOF • Multi-DC • Commodity Hardware • Easy to manage operationally
  • 3. Hash Ring • No master / slave / replica sets • No config servers, zookeeper • Data is partitioned around the ring • Data is replicated to RF=N servers • All nodes hold data and can answer queries (both reads & writes) • Location of data on ring is determined by partition key
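As a rough illustration of the Hash Ring slide, the sketch below places nodes on a ring by token and walks clockwise to pick RF replicas for a partition key. The `token` helper and the `Ring` class are hypothetical names, MD5 is only a stand-in hash, and each node gets a single token; Cassandra's real partitioner and token assignment differ.

```python
import bisect
import hashlib


def token(key: str) -> int:
    # Hypothetical stand-in for a partitioner: first 8 bytes of MD5 as an integer.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class Ring:
    """Toy token ring: one token per node, replicas chosen by walking clockwise."""

    def __init__(self, nodes, rf):
        self.rf = rf
        # Sort (token, node) pairs so the list order is the ring order.
        self.tokens = sorted((token(n), n) for n in nodes)

    def replicas(self, partition_key):
        # Find the first node whose token is >= the key's token, wrapping at the end.
        t = token(partition_key)
        start = bisect.bisect_left(self.tokens, (t, ""))
        count = min(self.rf, len(self.tokens))
        size = len(self.tokens)
        return [self.tokens[(start + i) % size][1] for i in range(count)]


ring = Ring(["node1", "node2", "node3", "node4"], rf=3)
owners = ring.replicas("jon")  # three distinct nodes; any of them can serve "jon"
```

Every node can compute the same answer from the key alone, which is why no master or config server is needed to locate data.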
  • 4. CAP Tradeoffs • Cassandra chooses Availability & Partition Tolerance over Consistency • Queries have tunable consistency level • ALL, QUORUM, ONE • Hinted Handoff to deal with failed nodes
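The tunable consistency levels on the CAP Tradeoffs slide come down to simple arithmetic on the replication factor. The sketch below assumes the usual definition of QUORUM as a majority of replicas; `replicas_required` and `read_sees_latest_write` are hypothetical helper names, not driver APIs.

```python
def replicas_required(level: str, rf: int) -> int:
    """Number of replicas that must acknowledge a request at a given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # strict majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {level}")


def read_sees_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    # If the read set and write set must overlap in at least one replica,
    # the read is guaranteed to touch a replica holding the latest write.
    return replicas_required(read_level, rf) + replicas_required(write_level, rf) > rf
```

With RF=3, QUORUM reads plus QUORUM writes overlap (2 + 2 > 3), while ONE plus ONE does not, which is the availability-for-consistency trade the slide refers to.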
  • 6. Data Structures • Like an RDBMS, Cassandra uses a Table to store data • But that’s where the similarities end • Partitions within tables • Rows within partitions (or a single row) • CQL to create tables & query data • Partition keys determine where a partition is found • Clustering keys determine ordering of rows within a partition • (Diagram labels: Keyspace, Table, Partition, Row)
  • 7. Example: Single Row Partition • Simple User system • Identified by name (pk) • 1 Row per partition • This is familiar territory

      name       age  job
      jon        33   evangelist
      luke       33   evangelist
      old pete   108  retired
      s. seagal  62   actor
      JCVD       53   actor

      cqlsh:demo> create table user (name text primary key, age int, job text);
      cqlsh:demo> select * from user WHERE name = 'JCVD';
  • 8. Example: Multiple Rows • Comments on photos • Comments are always selected by the photo_id • There are only 4 rows in 2 partitions • In the real world, use UUIDs instead of int for PK

      photo_id  comment_id  user  comment
      5         1           jon   hi
      5         2           luke  oh hey
      5         3           JCVD  AHHHHH!!!
      6         4           jon   great pic

      create table comment (photo_id int, comment_id int, user text, comment text, primary key (photo_id, comment_id));
      select * from comment where photo_id=5;
  • 9. Partition with Clustering • Multiple rows are transposed into a single partition • Partitions vary in size • Old terminology - "wide row"

      photo_id | comment_id  user  comment | comment_id  user  comment | comment_id  user  comment
      5        | 1           jon   hi      | 2           luke  oh hey  | 3           JCVD  AHHHHH!!!
      6        | 4           jon   great pic
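To make the transposition on the Partition with Clustering slide concrete, here is a minimal sketch that groups the four comment rows by partition key (`photo_id`) and orders each partition by clustering key (`comment_id`). The `to_partitions` function is a hypothetical illustration of the logical layout, not of Cassandra's on-disk format.

```python
from collections import defaultdict

# (photo_id, comment_id, user, comment), deliberately out of order.
rows = [
    (5, 1, "jon", "hi"),
    (6, 4, "jon", "great pic"),
    (5, 3, "JCVD", "AHHHHH!!!"),
    (5, 2, "luke", "oh hey"),
]


def to_partitions(rows):
    """Group rows by partition key and sort each partition by clustering key."""
    partitions = defaultdict(list)
    for photo_id, comment_id, user, comment in rows:
        partitions[photo_id].append((comment_id, user, comment))
    for partition in partitions.values():
        partition.sort(key=lambda row: row[0])  # clustering order
    return dict(partitions)


partitions = to_partitions(rows)
```

The result has two partitions: photo 5 holds three rows in `comment_id` order and photo 6 holds one, matching the "4 rows in 2 partitions" example.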
  • 10. Model Tables to Answer Queries • This is not 3NF!! • We always query by partition key • Create many tables aka materialized views • Manage in your app code • Denormalize!!

      user  age
      jon   33
      luke  33
      JCVD  53

      age  user  user
      33   jon   luke
      53   JCVD

      CREATE TABLE age_to_user (age int, user text, primary key (age, user));
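The "manage in your app code" bullet means the application writes every view of the data itself. The sketch below models the two tables from the slide as in-memory dictionaries; `UserStore` and its methods are hypothetical, and a real application would issue the corresponding CQL writes through a driver instead.

```python
from collections import defaultdict


class UserStore:
    """Toy model of two denormalized tables kept in sync by application code."""

    def __init__(self):
        self.user = {}                        # user table: name -> age
        self.age_to_user = defaultdict(set)   # age_to_user table: age -> names

    def save_user(self, name, age):
        # Remove the stale entry from the lookup table when the age changes.
        old_age = self.user.get(name)
        if old_age is not None and old_age != age:
            self.age_to_user[old_age].discard(name)
        # Write both tables so each query is answered by a single partition.
        self.user[name] = age
        self.age_to_user[age].add(name)

    def users_with_age(self, age):
        # Sorted to mimic clustering order on the user column.
        return sorted(self.age_to_user.get(age, ()))


store = UserStore()
store.save_user("jon", 33)
store.save_user("luke", 33)
store.save_user("JCVD", 53)
```

The cost of this design is an extra write per view, which fits the deck's point that writes are cheap and reads should touch as few partitions as possible.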
  • 11. CQL Data Types • Basic Types: text, int, decimal, uuid, timeuuid, counter, blob • Collections: map, list, set • Read the CQL documentation for the full list of types
  • 13. The Write Path • Writes are written to any node in the cluster (coordinator) • Writes are written to commit log, then to memtable • Every write includes a timestamp • Memtable flushed to disk periodically (sstable) • New memtable is created in memory • Deletes are actually a special write case, called a “tombstone”
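The steps on the Write Path slide can be sketched on a single node as follows. This is a simplified model under stated assumptions: the `Node` class, the `flush_threshold` parameter, and the counter used as a timestamp are all hypothetical, and commit log cleanup after a flush is not modeled.

```python
import itertools

TOMBSTONE = object()  # marker value written in place of deleted data


class Node:
    """Toy single-node write path: commit log, then memtable, then flush to an SSTable."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # append-only log, written first for durability
        self.memtable = {}     # in-memory table: key -> (value, timestamp)
        self.sstables = []     # flushed, immutable tables on "disk"
        self.flush_threshold = flush_threshold
        self._clock = itertools.count(1)  # stand-in for a write timestamp

    def write(self, key, value):
        ts = next(self._clock)
        self.commit_log.append((key, value, ts))
        self.memtable[key] = (value, ts)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def delete(self, key):
        # A delete is just a write of a tombstone.
        self.write(key, TOMBSTONE)

    def flush(self):
        # Freeze the memtable as a sorted, immutable tuple and start a new one.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


node = Node(flush_threshold=2)
node.write("jon", "evangelist")
node.delete("jon")               # overwrites the memtable entry with a tombstone
node.write("luke", "evangelist")  # second distinct key triggers a flush
```

Nothing is ever updated in place: each write is an append to the log and an insert into the memtable, which is what keeps writes fast.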
  • 14. What is an SSTable? • Immutable data file for row storage • Deletes are written as tombstones • Every write includes a timestamp of when it was written • Partition is spread across multiple SSTables • Same column can be in multiple SSTables • Merged through compaction, only latest timestamp is kept • Easy backups!
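The "only latest timestamp is kept" rule can be shown with a small merge function. In this sketch each SSTable is modeled as a dictionary from a hypothetical (partition key, column) pair to a (value, timestamp) pair; `compact` is an illustrative name, and tombstone purging is left out.

```python
def compact(sstables):
    """Merge SSTables, keeping the cell with the newest timestamp for each column."""
    merged = {}
    for table in sstables:
        for cell_key, (value, ts) in table.items():
            # Keep this cell only if it is newer than what has been seen so far.
            if cell_key not in merged or ts > merged[cell_key][1]:
                merged[cell_key] = (value, ts)
    return merged


older = {("jon", "age"): (32, 1), ("jon", "job"): ("engineer", 1)}
newer = {("jon", "age"): (33, 2)}
merged = compact([older, newer])
```

Because the timestamp decides the winner, the order in which SSTables are merged does not change the result.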
  • 15. The Read Path • Any server may be queried; it acts as the coordinator • Contacts nodes with the requested key • On each node, data is pulled from SSTables and merged • Consistency < ALL performs read repair in background (read_repair_chance)
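A coordinator's part in the Read Path slide can be sketched like this: collect each replica's (value, timestamp) for the key, return the newest, and push it to any replica that was stale or missing the data. The `coordinate_read` function and the dictionary-based replicas are hypothetical, and the repair here happens inline rather than in the background.

```python
def coordinate_read(replicas, key):
    """Return the newest value for key and repair replicas that are behind.

    replicas maps a node name to that node's data: key -> (value, timestamp).
    """
    responses = {name: data.get(key) for name, data in replicas.items()}
    answered = [cell for cell in responses.values() if cell is not None]
    if not answered:
        return None
    latest = max(answered, key=lambda cell: cell[1])  # newest timestamp wins
    for name, cell in responses.items():
        if cell != latest:
            replicas[name][key] = latest  # read repair
    return latest[0]


replicas = {
    "a": {"jon": (33, 2)},
    "b": {"jon": (32, 1)},  # stale copy
    "c": {},                # missed the write entirely
}
value = coordinate_read(replicas, "jon")
```

After the call, all three replicas hold the timestamp-2 value, so later reads at a lower consistency level are more likely to return current data.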
  • 17. Spark at a Glance • Scala, Python, Java • Hadoop alternative - batch analytics • Distributed SQL • Real time analytics via streaming • Machine learning • GraphX (in progress) • Open source connector available • Built into DSE
  • 18. Summary • How do I query my data if I can only query by key? • Denormalize! • Create multiple views into your data (multiple tables) • Cassandra is built for fast writes • Use fast writes to do as few reads as possible • Use Spark for advanced analytics and real time analysis