©2013 DataStax Confidential. Do not distribute without consent.
Christopher Batey
Staying in love with Cassandra
Anti patterns
• Client side joins
• Multi partition queries
• Batches
• Mutable data
• Secondary indices
• Retry policies?
• I’m going to be pedantic
• I’m assuming you want to scale
Distributed joins
Where’s the data?
Where do I go for the max?
Storing customer events
• Customer event
- customer_id - ChrisBatey
- staff_id - Charlie
- event_type - login, logout, add_to_basket, remove_from_basket
- time
• Store
- name
- store_type - Website, PhoneApp, Phone, Retail
- location
CREATE TABLE customer_events(
customer_id text,
staff_id text,
time timeuuid,
event_type text,
store_name text,
PRIMARY KEY ((customer_id), time));
CREATE TABLE store(
store_name text,
location text,
store_type text,
PRIMARY KEY (store_name));
Reading this
Read latest customer event
Read store information
One to one vs one to many
• This required possibly 6 out of 8 of our nodes to be up
• Imagine if there were multiple follow up queries
Multi-partition queries
• Client side joins normally end up falling into the more general anti-pattern of multi partition queries
Adding staff link
CREATE TABLE customer_events(
customer_id text,
staff set<text>,
time timeuuid,
event_type text,
store_name text,
PRIMARY KEY ((customer_id), time));
CREATE TABLE staff(
name text PRIMARY KEY,
favourite_colour text,
job_title text);
CREATE TABLE store(
store_name text PRIMARY KEY,
location text,
store_type text);
select * from customer_events where customer_id = 'chbatey'
limit 1
select * from staff where name in (staff1, staff2, staff3,
If any one of these fails, the whole query fails
Use a multi node cluster locally
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID
UN 102.27 KB 256 ? 15ad7694-3e76-4b74-aea0-fa3c0fa59532
UN 102.18 KB 256 ? cca7d0bb-e884-49f9-b098-e38fbe895cbc
UN 93.16 KB 256 ? 1f9737d3-c1b8-4df1-be4c-d3b1cced8e30
UN 102.1 KB 256 ? fe27b958-5d3a-4f78-9880-76cb7c9bead1
UN 93.18 KB 256 ? 66eb3f23-8889-44d6-a9e7-ecdd57ed61d0
UN 102.12 KB 256 ? e2e99a7b-c1fb-4f2a-9e4f-7a4666f8245e
Let’s see with a 6 node cluster
INSERT INTO staff (name, favourite_colour , job_title ) VALUES ( 'chbatey', 'red',
'Technical Evangelist' );
INSERT INTO staff (name, favourite_colour , job_title ) VALUES ( 'luket', 'red',
'Technical Evangelist' );
INSERT INTO staff (name, favourite_colour , job_title ) VALUES ( 'jonh', 'blue',
'Technical Evangelist' );
select * from staff where name in ('chbatey', 'luket', 'jonh');
Trace with CL ONE = 4 nodes used
Execute CQL3 query | 2015-02-02 06:39:58.759000 | | 0
Parsing select * from staff where name in ('chbatey', 'luket', 'jonh'); [SharedPool-Worker-1] | 2015-02-02
06:39:58.766000 | | 7553
Preparing statement [SharedPool-Worker-1] | 2015-02-02 06:39:58.768000 | | 9249
Executing single-partition query on staff [SharedPool-Worker-3] | 2015-02-02 06:39:58.773000 | | 14255
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.773001 | | 14756
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.773001 | | 14928
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.774000 | | 16035
Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.777000 | | 1156
Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.777001 | | 1681
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.778000 | | 1944
Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.778000 | | 1554
Processing response from / [SharedPool-Worker-3] | 2015-02-02 06:39:58.779000 | | 20762
Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.779000 | | 2425
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.779000 | | 21198
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.779000 | | 2639
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.779000 | | 21208
Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.780000 | | 304
Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.780001 | | 574
Executing single-partition query on staff [SharedPool-Worker-2] | 2015-02-02 06:39:58.781000 | | 4075
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.781000 | | 708
Enqueuing response to / [SharedPool-Worker-2] | 2015-02-02 06:39:58.781001 | | 4348
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.782000 | | 5371
Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.783000 | | 2463
Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.784000 | | 2905
Sending message to / [WRITE-/] | 2015-02-02 06:39:58.784001 | | 3160
Processing response from / [SharedPool-Worker-2] | 2015-02-02 06:39:58.785000 | | --
Request complete | 2015-02-02 06:39:58.782995 | | 23995
Denormalise with UDTs
CREATE TYPE store (name text, type text, postcode text);
CREATE TYPE staff (name text, fav_colour text, job_title text);

CREATE TABLE customer_events(
customer_id text,
time timeuuid,
event_type text,
store store,
staff set<staff>,
PRIMARY KEY ((customer_id), time));
User defined types
Cassandra 2.1 has landed in DSE 4.7!!!
Less obvious example
• A good pattern for time series: Bucketing
Adding a time bucket
CREATE TABLE customer_events_bucket(
customer_id text,
time_bucket text,
time timeuuid,
event_type text,
store store,
staff set<staff>,
PRIMARY KEY ((customer_id, time_bucket), time));
select * from customer_events_bucket where customer_id =
'chbatey' and time_bucket IN ('2015-01-01|0910:01',
'2015-01-01|0910:02', '2015-01-01|0910:03', '2015-01-01|
Often better as multiple async queries
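A minimal sketch of the "multiple async queries" idea, written against the JDK only. The lookup function is a hypothetical stand-in for whatever asynchronous call your driver offers, and the in-memory map stands in for the staff table; neither comes from the talk. The point is that every single-partition request is started before any result is awaited, so each key can be routed and can fail independently.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

final class AsyncFanOut {

    // Issues one asynchronous single-partition lookup per key and gathers the results.
    // 'lookup' is a hypothetical stand-in for an asynchronous driver call.
    static <K, V> List<V> fetchAll(List<K> keys, Function<K, CompletableFuture<V>> lookup) {
        // Start every request before waiting on any of them.
        List<CompletableFuture<V>> inFlight = keys.stream()
                .map(lookup)
                .collect(Collectors.toList());
        // join() blocks until each future completes; results keep the order of 'keys'.
        return inFlight.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // In-memory stand-in for the staff table, keyed by partition key.
        Map<String, String> staff = Map.of(
                "chbatey", "Technical Evangelist",
                "luket", "Technical Evangelist",
                "jonh", "Technical Evangelist");
        List<String> titles = fetchAll(
                List.of("chbatey", "luket", "jonh"),
                name -> CompletableFuture.supplyAsync(() -> staff.get(name)));
        System.out.println(titles);
    }
}
```

In a real application you would probably handle each future's failure separately instead of letting join() rethrow the first one, which is the main advantage over a single IN query.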
Tips for avoiding joins & multi gets
• Say no to client side joins by denormalising
- Much easier with UDTs
• When bucketing aim for at most two buckets for a query
• Get in the habit of reading trace + using a multi node cluster locally
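To make the bucketing tip concrete, here is a small sketch that works out which buckets a time range touches. The hourly bucket size and the "yyyy-MM-dd-HH" key format are assumptions chosen for illustration and are not the format shown on the slide. If a typical query range produces more than two entries, the bucket is probably too small for that access pattern.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

final class TimeBuckets {

    // Hypothetical bucket format: one bucket per UTC hour, e.g. "2015-01-01-09".
    private static final DateTimeFormatter HOUR_BUCKET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd-HH").withZone(ZoneOffset.UTC);

    // The bucket a single event falls into; this would be stored in time_bucket.
    static String bucketFor(Instant time) {
        return HOUR_BUCKET.format(time);
    }

    // Lists every bucket that the inclusive range [from, to] touches, oldest first.
    static List<String> bucketsBetween(Instant from, Instant to) {
        List<String> buckets = new ArrayList<>();
        Instant cursor = from.truncatedTo(ChronoUnit.HOURS);
        while (!cursor.isAfter(to)) {
            buckets.add(bucketFor(cursor));
            cursor = cursor.plus(1, ChronoUnit.HOURS);
        }
        return buckets;
    }

    public static void main(String[] args) {
        Instant from = Instant.parse("2015-01-01T09:50:00Z");
        Instant to = Instant.parse("2015-01-01T10:05:00Z");
        // A 15 minute range that crosses an hour boundary touches two buckets.
        System.out.println(bucketsBetween(from, to));
    }
}
```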
Unlogged Batches
• Unlogged batches
- Send many statements as one
Customer events table
CREATE TABLE if NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (customer_id, time))
Inserting events
INSERT INTO events.customer_events (customer_id, time , event_type , staff_id , store_type ) VALUES
( ?, ?, ?, ?, ?)
public void storeEvent(ConsistencyLevel consistencyLevel, CustomerEvent customerEvent) {

BoundStatement boundInsert = insertStatement.bind(



Batch insert - Good idea?
public void storeEvents(ConsistencyLevel consistencyLevel, CustomerEvent... events) {

BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.UNLOGGED);

for (CustomerEvent event : events) {
BoundStatement boundInsert = insertStatement.bind(

Not so much
Very likely to fail if nodes are down / overloaded
Gives far too much work for the coordinator
Ruins performance gains of token aware driver
Individual queries
Unlogged batch use case
public void storeEvents(String customerId, ConsistencyLevel consistencyLevel, CustomerEvent...
events) {

BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.UNLOGGED);


for (CustomerEvent event : events) {

BoundStatement boundInsert = insertStatement.bind(










ResultSet execute = session.execute(batchStatement);


Partition Key!!
Writing this
All for the same partition :)
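One way to keep unlogged batches to a single partition is to group the statements by partition key before building any batch. The sketch below shows only that grouping step, using the JDK alone. The Event class is a cut-down, hypothetical stand-in for the talk's CustomerEvent, and turning each group into a BatchStatement is left to the driver code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BatchGrouping {

    // Minimal stand-in for the talk's CustomerEvent; only the fields used here.
    static final class Event {
        final String customerId;
        final String eventType;

        Event(String customerId, String eventType) {
            this.customerId = customerId;
            this.eventType = eventType;
        }
    }

    // Groups events by partition key (customer_id) so that each group can be
    // sent as its own unlogged batch. Insertion order is preserved.
    static Map<String, List<Event>> groupByPartition(List<Event> events) {
        Map<String, List<Event>> groups = new LinkedHashMap<>();
        for (Event event : events) {
            groups.computeIfAbsent(event.customerId, id -> new ArrayList<>()).add(event);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("chbatey", "login"),
                new Event("rusty", "login"),
                new Event("chbatey", "add_to_basket"));
        groupByPartition(events).forEach((customerId, group) ->
                System.out.println(customerId + " -> " + group.size() + " statement(s) in one batch"));
    }
}
```

Each group can then be routed by a token aware driver to a replica for that partition, which is what a mixed-partition batch gives up.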
Logged Batches
• Once accepted the statements will eventually succeed
• Achieved by writing them to a distributed batchlog
• 30% slower than unlogged batches
Batch insert - Good idea?
public void storeEvents(ConsistencyLevel consistencyLevel, CustomerEvent... events) {

BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.LOGGED);

for (CustomerEvent event : events) {
BoundStatement boundInsert = insertStatement.bind(

Not so much
BL-R: Batch log replica
Use case?
CREATE TABLE if NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (customer_id, time))
CREATE TABLE if NOT EXISTS customer_events_by_staff (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (staff_id, time))
Storing events to both tables in a batch
public void storeEventLogged(ConsistencyLevel consistencyLevel, CustomerEvent customerEvent) {

BoundStatement boundInsertForCustomerId = insertByCustomerId.bind(customerEvent.getCustomerId(),

BoundStatement boundInsertForStaffId = insertByStaffId.bind(customerEvent.getCustomerId(),

BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.LOGGED);





ResultSet execute = session.execute(batchStatement);

Mutable data
Distributed mutable state :-(
create TABLE accounts(customer text PRIMARY KEY, balance_in_pence int);
Overly naive example
• Cassandra uses last write wins (no Vector clocks)
• Read before write is an anti-pattern - race conditions and latency
Take a page from Event sourcing
create table accounts_log(
customer text,
time timeuuid,
delta int,
primary KEY (customer, time));
All records for the same customer in the same storage row
Transactions ordered by TIMEUUID - UUIDs are your friends in distributed systems
• Avoid mutable state in a distributed system; favour an immutable log
• Can roll up and snapshot to avoid calculation getting too
• If you really have to, check out LWTs and update via CAS - this will be slower
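A small sketch of reading a balance from an immutable log, assuming the accounts_log shape above. The Entry class, the numeric sequence (a stand-in for timeuuid ordering) and the snapshot parameters are illustrative assumptions, not part of the talk. The balance is never updated in place; it is the snapshot value plus every delta recorded after the snapshot.

```java
import java.util.List;

final class AccountLog {

    // One immutable entry in the log: a signed change to the balance, in pence.
    static final class Entry {
        final long sequence;      // stand-in for the ordering a timeuuid provides
        final int deltaInPence;

        Entry(long sequence, int deltaInPence) {
            this.sequence = sequence;
            this.deltaInPence = deltaInPence;
        }
    }

    // Balance = snapshot balance + all deltas recorded after the snapshot.
    // Pass 0 and 0 when no snapshot exists yet.
    static long balance(long snapshotBalance, long snapshotSequence, List<Entry> log) {
        long balance = snapshotBalance;
        for (Entry entry : log) {
            if (entry.sequence > snapshotSequence) {
                balance += entry.deltaInPence;
            }
        }
        return balance;
    }

    public static void main(String[] args) {
        List<Entry> log = List.of(new Entry(1, 500), new Entry(2, -200), new Entry(3, 50));
        // Replaying the whole log and replaying from a snapshot taken at entry 2 agree.
        System.out.println(balance(0, 0, log));
        System.out.println(balance(300, 2, log));
    }
}
```

Because entries are only ever appended, two writers cannot overwrite each other's balance, and a retried append of the same entry key leaves the log unchanged.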
Secondary Indices
Customer events table
CREATE TABLE if NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (customer_id, time))
create INDEX on customer_events (staff_id) ;
Indexes to the rescue?
customer_events table
customer_id   time                  staff_id
chbatey       2015-03-03 08:52:45   trevor
chbatey       2015-03-03 08:52:54   trevor
chbatey       2015-03-03 08:53:11   bill
chbatey       2015-03-03 08:53:18   bill
rusty         2015-03-03 08:56:57   bill
rusty         2015-03-03 08:57:02   bill
rusty         2015-03-03 08:57:20   trevor
staff_id index
staff_id   customer_id
trevor     chbatey
trevor     chbatey
bill       chbatey
bill       chbatey
bill       rusty
bill       rusty
trevor     rusty
Indexes to the rescue?
Index entries stored with the chbatey partition
staff_id   customer_id
trevor     chbatey
trevor     chbatey
bill       chbatey
bill       chbatey
Index entries stored with the rusty partition
staff_id   customer_id
bill       rusty
bill       rusty
trevor     rusty
Do-it-yourself index
CREATE TABLE if NOT EXISTS customer_events (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (customer_id, time))
CREATE TABLE if NOT EXISTS customer_events_by_staff (
customer_id text,
staff_id text,
store_type text,
time timeuuid ,
event_type text,
PRIMARY KEY (staff_id, time))
Anti patterns summary
• Most anti patterns are very obvious in trace output
• Don’t test on small clusters, most problems only occur at
Understanding failures
(crazy retry policies)
Write timeout
Application C
Replication factor: 3
Write timeout
Retrying writes
• Cassandra does not roll back!
• R3 has the data and the coordinator has hinted for R1 and R2
Write timeout
• Received acknowledgements
• Required acknowledgements
• Consistency level
• CAS and Batches are more complicated:
- Timed out waiting for batch log replicas
- Written to batch log but timed out waiting for actual replica
- Will eventually be committed
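The cases above can be modelled as a small decision function. This is a sketch under stated assumptions: WriteKind and Decision are local enums invented for illustration, not a driver API, and the choices encode one conservative reading of the bullets rather than a recommended retry policy.

```java
final class WriteTimeoutDecision {

    // Local model of the kinds of write a timeout can report.
    // These names are assumptions for illustration, not a driver API.
    enum WriteKind { SIMPLE, COUNTER, BATCH_LOG, BATCH, CAS }

    enum Decision { RETRY, NO_RETRY_NEEDED, SURFACE_TO_CALLER }

    static Decision onWriteTimeout(WriteKind kind, boolean idempotent) {
        switch (kind) {
            case BATCH_LOG:
                // Timed out waiting for the batch log replicas: the batch is not
                // guaranteed to be applied, so it is commonly retried.
                return Decision.RETRY;
            case BATCH:
                // Already written to the batch log: it will eventually be committed.
                return Decision.NO_RETRY_NEEDED;
            case SIMPLE:
                // Some replicas may already hold the write (no rollback), so a
                // replay is only safe when the write is idempotent.
                return idempotent ? Decision.RETRY : Decision.SURFACE_TO_CALLER;
            default:
                // COUNTER and CAS: the outcome is unknown, so let the caller decide.
                return Decision.SURFACE_TO_CALLER;
        }
    }

    public static void main(String[] args) {
        for (WriteKind kind : WriteKind.values()) {
            System.out.println(kind + " -> " + onWriteTimeout(kind, true));
        }
    }
}
```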
Idempotent writes
• All writes are idempotent with the following exceptions:
- Counter updates
- List appends and prepends
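To see why a retry is harmless for an ordinary column write but not for these exceptions, the sketch below applies each kind of write twice to plain in-memory collections. The maps and list are stand-ins for a row, a counter column and a list column; this models the effect of a replay and is not how Cassandra stores data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RetryEffects {

    // Overwrite: a retry writes the same value again, so the state is unchanged.
    static void setColumn(Map<String, String> row, String column, String value) {
        row.put(column, value);
    }

    // Counter update: every application adds the delta again.
    static void increment(Map<String, Long> counters, String name, long delta) {
        counters.merge(name, delta, Long::sum);
    }

    // List append: every application adds another element.
    static void append(List<String> list, String value) {
        list.add(value);
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        Map<String, Long> counters = new HashMap<>();
        List<String> events = new ArrayList<>();

        // Apply each write twice, as a retry after a timeout would.
        for (int attempt = 0; attempt < 2; attempt++) {
            setColumn(row, "event_type", "login");
            increment(counters, "logins", 1);
            append(events, "login");
        }

        System.out.println(row);       // still a single value
        System.out.println(counters);  // counted twice
        System.out.println(events);    // appended twice
    }
}
```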
Cassandra as a Queue (tombstones)
Well documented anti-pattern
• Produce data in one DC
• Consume it once and delete from another DC
• Use Cassandra?
CREATE TABLE queues (
name text,
enqueued_at timeuuid,
payload blob,
PRIMARY KEY (name, enqueued_at)
);
SELECT enqueued_at, payload
FROM queues
WHERE name = 'queue-1'
activity | source | elapsed
execute_cql3_query | | 0
Parsing statement | | 48
Preparing statement | | 362
Message received from / | | 42
Sending message to / | | 718
Executing single-partition query on queues | | 145
Acquiring sstable references | | 158
Merging memtable contents | | 189
Merging data from memtables and 0 sstables | | 235
Read 1 live and 19998 tombstoned cells | | 251102
Enqueuing response to / | | 252976
Sending message to / | | 253052
Message received from / | | 324314
Not good :(
• Avoid lots of deletes + range queries in the same
• Avoid it with data modelling / query pattern:
- Move partition
- Add a start for your range: e.g. enqueued_at >
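A rough model of why giving the range a start helps. The sorted map below stands in for one queue partition ordered by enqueued_at, and an empty Optional stands in for a tombstone. Both are simplifying assumptions, and the long keys replace timeuuids for readability. Scanning from the beginning walks every tombstone before reaching live data, while starting after the last consumed position skips them.

```java
import java.util.NavigableMap;
import java.util.Optional;
import java.util.TreeMap;

final class QueueScan {

    // Counts how many cells a scan touches up to and including the first live one.
    // An empty Optional models a tombstone left behind by a delete.
    static int cellsReadUntilFirstLive(NavigableMap<Long, Optional<String>> cells) {
        int read = 0;
        for (Optional<String> cell : cells.values()) {
            read++;
            if (cell.isPresent()) {
                break;
            }
        }
        return read;
    }

    public static void main(String[] args) {
        NavigableMap<Long, Optional<String>> partition = new TreeMap<>();
        for (long position = 1; position <= 19998; position++) {
            partition.put(position, Optional.empty()); // consumed, then deleted
        }
        partition.put(19999L, Optional.of("payload"));

        long lastConsumed = 19998L;
        // No lower bound: every tombstone is read before the live cell.
        System.out.println(cellsReadUntilFirstLive(partition));
        // Lower bound after the last consumed position: tombstones are skipped.
        System.out.println(cellsReadUntilFirstLive(partition.tailMap(lastConsumed, false)));
    }
}
```

The consumer has to remember the last position it processed for this to work, which is the price of the bounded query.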

More Related Content

What's hot

Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
DataStax Academy
QB Into the Box 2018
QB Into the Box 2018QB Into the Box 2018
QB Into the Box 2018
Ortus Solutions, Corp
Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014
Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014
Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014
Cliff Seal
Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014
Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014
Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014
Cliff Seal
Cassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data ModelingCassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data Modeling
DataStax Academy
Service discovery and configuration provisioning
Service discovery and configuration provisioningService discovery and configuration provisioning
Service discovery and configuration provisioning
Source Ministry
Reactive programming in clojure script using reactjs wrappers
Reactive programming in clojure script using reactjs wrappersReactive programming in clojure script using reactjs wrappers
Reactive programming in clojure script using reactjs wrappers
Konrad Szydlo
5952 database systems administration (comp 1011.1)-cw1
5952   database systems administration (comp 1011.1)-cw15952   database systems administration (comp 1011.1)-cw1
5952 database systems administration (comp 1011.1)-cw1
Python in the database
Python in the databasePython in the database
Python in the database
Design Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & GilmourDesign Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & Gilmour
Piyush Verma
"Wix Engineering Media AI Photo Studio", Mykola Mykhailych
"Wix Engineering Media AI Photo Studio", Mykola Mykhailych"Wix Engineering Media AI Photo Studio", Mykola Mykhailych
"Wix Engineering Media AI Photo Studio", Mykola Mykhailych
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
Altinity Ltd
Presto in Treasure Data
Presto in Treasure DataPresto in Treasure Data
Presto in Treasure Data
Mitsunori Komatsu
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDBMongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
How to Use JSON in MySQL Wrong
How to Use JSON in MySQL WrongHow to Use JSON in MySQL Wrong
How to Use JSON in MySQL Wrong
Karwin Software Solutions LLC
Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...
Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...
Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...
Ruby Meditation
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming Patterns
Hao Chen
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQLPGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
MongoDB World 2018: Using Change Streams to Keep Up with Your Data
MongoDB World 2018: Using Change Streams to Keep Up with Your DataMongoDB World 2018: Using Change Streams to Keep Up with Your Data
MongoDB World 2018: Using Change Streams to Keep Up with Your Data

What's hot (20)

Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
Cassandra Day Atlanta 2015: Building Your First Application with Apache Cassa...
QB Into the Box 2018
QB Into the Box 2018QB Into the Box 2018
QB Into the Box 2018
Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014
Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014
Temporary Cache Assistance (Transients API): WordCamp Phoenix 2014
Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014
Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014
Temporary Cache Assistance (Transients API): WordCamp Birmingham 2014
Cassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data ModelingCassandra Day Chicago 2015: Advanced Data Modeling
Cassandra Day Chicago 2015: Advanced Data Modeling
Service discovery and configuration provisioning
Service discovery and configuration provisioningService discovery and configuration provisioning
Service discovery and configuration provisioning
Reactive programming in clojure script using reactjs wrappers
Reactive programming in clojure script using reactjs wrappersReactive programming in clojure script using reactjs wrappers
Reactive programming in clojure script using reactjs wrappers
5952 database systems administration (comp 1011.1)-cw1
5952   database systems administration (comp 1011.1)-cw15952   database systems administration (comp 1011.1)-cw1
5952 database systems administration (comp 1011.1)-cw1
Python in the database
Python in the databasePython in the database
Python in the database
Design Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & GilmourDesign Patterns in Micro-services architectures & Gilmour
Design Patterns in Micro-services architectures & Gilmour
"Wix Engineering Media AI Photo Studio", Mykola Mykhailych
"Wix Engineering Media AI Photo Studio", Mykola Mykhailych"Wix Engineering Media AI Photo Studio", Mykola Mykhailych
"Wix Engineering Media AI Photo Studio", Mykola Mykhailych
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
Presto in Treasure Data
Presto in Treasure DataPresto in Treasure Data
Presto in Treasure Data
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDBMongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
MongoDB .local Paris 2020: La puissance du Pipeline d'Agrégation de MongoDB
How to Use JSON in MySQL Wrong
How to Use JSON in MySQL WrongHow to Use JSON in MySQL Wrong
How to Use JSON in MySQL Wrong
Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...
Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...
Performance Optimization 101 for Ruby developers - Nihad Abbasov (ENG) | Ruby...
Go Programming Patterns
Go Programming PatternsGo Programming Patterns
Go Programming Patterns
Python database interfaces
Python database  interfacesPython database  interfaces
Python database interfaces
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQLPGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
PGConf APAC 2018 - Lightening Talk #2 - Centralizing Authorization in PostgreSQL
MongoDB World 2018: Using Change Streams to Keep Up with Your Data
MongoDB World 2018: Using Change Streams to Keep Up with Your DataMongoDB World 2018: Using Change Streams to Keep Up with Your Data
MongoDB World 2018: Using Change Streams to Keep Up with Your Data

Viewers also liked

NYC Cassandra Day - Java Intro
NYC Cassandra Day - Java IntroNYC Cassandra Day - Java Intro
NYC Cassandra Day - Java Intro
Christopher Batey
Think your software is fault-tolerant? Prove it!
Think your software is fault-tolerant? Prove it!Think your software is fault-tolerant? Prove it!
Think your software is fault-tolerant? Prove it!
Christopher Batey
Manchester Hadoop Meetup: Cassandra Spark internals
Manchester Hadoop Meetup: Cassandra Spark internalsManchester Hadoop Meetup: Cassandra Spark internals
Manchester Hadoop Meetup: Cassandra Spark internals
Christopher Batey
LJC: Microservices in the real world
LJC: Microservices in the real worldLJC: Microservices in the real world
LJC: Microservices in the real world
Christopher Batey
Docker and jvm. A good idea?
Docker and jvm. A good idea?Docker and jvm. A good idea?
Docker and jvm. A good idea?
Christopher Batey
IoT London July 2015
IoT London July 2015IoT London July 2015
IoT London July 2015
Christopher Batey
3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developers3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developers
Christopher Batey
Cassandra Day London: Building Java Applications
Cassandra Day London: Building Java ApplicationsCassandra Day London: Building Java Applications
Cassandra Day London: Building Java Applications
Christopher Batey
1 Dundee - Cassandra 101
1 Dundee - Cassandra 1011 Dundee - Cassandra 101
1 Dundee - Cassandra 101
Christopher Batey
Cassandra London - C* Spark Connector
Cassandra London - C* Spark ConnectorCassandra London - C* Spark Connector
Cassandra London - C* Spark Connector
Christopher Batey
2 Dundee - Cassandra-3
2 Dundee - Cassandra-32 Dundee - Cassandra-3
2 Dundee - Cassandra-3
Christopher Batey
Manchester Hadoop User Group: Cassandra Intro
Manchester Hadoop User Group: Cassandra IntroManchester Hadoop User Group: Cassandra Intro
Manchester Hadoop User Group: Cassandra Intro
Christopher Batey
Manchester Hadoop Meetup: Spark Cassandra Integration
Manchester Hadoop Meetup: Spark Cassandra IntegrationManchester Hadoop Meetup: Spark Cassandra Integration
Manchester Hadoop Meetup: Spark Cassandra Integration
Christopher Batey
Devoxx France: Fault tolerant microservices on the JVM with Cassandra
Devoxx France: Fault tolerant microservices on the JVM with CassandraDevoxx France: Fault tolerant microservices on the JVM with Cassandra
Devoxx France: Fault tolerant microservices on the JVM with Cassandra
Christopher Batey
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Paris Day Cassandra: Use case
Paris Day Cassandra: Use caseParis Day Cassandra: Use case
Paris Day Cassandra: Use case
Christopher Batey

Viewers also liked (17)

NYC Cassandra Day - Java Intro
NYC Cassandra Day - Java IntroNYC Cassandra Day - Java Intro
NYC Cassandra Day - Java Intro
Think your software is fault-tolerant? Prove it!
Think your software is fault-tolerant? Prove it!Think your software is fault-tolerant? Prove it!
Think your software is fault-tolerant? Prove it!
Manchester Hadoop Meetup: Cassandra Spark internals
Manchester Hadoop Meetup: Cassandra Spark internalsManchester Hadoop Meetup: Cassandra Spark internals
Manchester Hadoop Meetup: Cassandra Spark internals
LJC: Microservices in the real world
LJC: Microservices in the real worldLJC: Microservices in the real world
LJC: Microservices in the real world
Docker and jvm. A good idea?
Docker and jvm. A good idea?Docker and jvm. A good idea?
Docker and jvm. A good idea?
IoT London July 2015
IoT London July 2015IoT London July 2015
IoT London July 2015
3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developers3 Dundee-Spark Overview for C* developers
3 Dundee-Spark Overview for C* developers
Cassandra Day London: Building Java Applications
Cassandra Day London: Building Java ApplicationsCassandra Day London: Building Java Applications
Cassandra Day London: Building Java Applications
1 Dundee - Cassandra 101
1 Dundee - Cassandra 1011 Dundee - Cassandra 101
1 Dundee - Cassandra 101
Cassandra London - C* Spark Connector
Cassandra London - C* Spark ConnectorCassandra London - C* Spark Connector
Cassandra London - C* Spark Connector
2 Dundee - Cassandra-3
2 Dundee - Cassandra-32 Dundee - Cassandra-3
2 Dundee - Cassandra-3
Manchester Hadoop User Group: Cassandra Intro
Manchester Hadoop User Group: Cassandra IntroManchester Hadoop User Group: Cassandra Intro
Manchester Hadoop User Group: Cassandra Intro
Manchester Hadoop Meetup: Spark Cassandra Integration
Manchester Hadoop Meetup: Spark Cassandra IntegrationManchester Hadoop Meetup: Spark Cassandra Integration
Manchester Hadoop Meetup: Spark Cassandra Integration
Devoxx France: Fault tolerant microservices on the JVM with Cassandra
Devoxx France: Fault tolerant microservices on the JVM with CassandraDevoxx France: Fault tolerant microservices on the JVM with Cassandra
Devoxx France: Fault tolerant microservices on the JVM with Cassandra
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
The Best and Worst of Cassandra-stress Tool (Christopher Batey, The Last Pick...
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Load Testing Cassandra Applications (Ben Slater, Instaclustr) | C* Summit 2016
Paris Day Cassandra: Use case
Paris Day Cassandra: Use caseParis Day Cassandra: Use case
Paris Day Cassandra: Use case

Similar to Cassandra Day NYC - Cassandra anti patterns

Cassandra Day London 2015: Apache Cassandra Anti-Patterns
Cassandra Day London 2015: Apache Cassandra Anti-PatternsCassandra Day London 2015: Apache Cassandra Anti-Patterns
Cassandra Day London 2015: Apache Cassandra Anti-Patterns
DataStax Academy
Webinar Cassandra Anti-Patterns
Webinar Cassandra Anti-PatternsWebinar Cassandra Anti-Patterns
Webinar Cassandra Anti-Patterns
Christopher Batey
Reading Cassandra Meetup Feb 2015: Apache Spark
Reading Cassandra Meetup Feb 2015: Apache SparkReading Cassandra Meetup Feb 2015: Apache Spark
Reading Cassandra Meetup Feb 2015: Apache Spark
Christopher Batey
Cassandra Day London 2015: Getting Started with Apache Cassandra and Java
Cassandra Day London 2015: Getting Started with Apache Cassandra and JavaCassandra Day London 2015: Getting Started with Apache Cassandra and Java
Cassandra Day London 2015: Getting Started with Apache Cassandra and Java
DataStax Academy
Webinar: You made the move to Office 365—now what?
Webinar: You made the move to Office 365—now what?Webinar: You made the move to Office 365—now what?
Webinar: You made the move to Office 365—now what?
[JSS2015] Nouveautés SQL Server 2016:Sécurité,Temporal & Stretch Tables
[JSS2015] Nouveautés SQL Server 2016:Sécurité,Temporal & Stretch Tables[JSS2015] Nouveautés SQL Server 2016:Sécurité,Temporal & Stretch Tables
[JSS2015] Nouveautés SQL Server 2016:Sécurité,Temporal & Stretch Tables
Going native with Apache Cassandra
Going native with Apache CassandraGoing native with Apache Cassandra
Going native with Apache Cassandra
Johnny Miller
LA Cassandra Day 2015 - Cassandra for developers
LA Cassandra Day 2015  - Cassandra for developersLA Cassandra Day 2015  - Cassandra for developers
LA Cassandra Day 2015 - Cassandra for developers
Christopher Batey
Supercharging WordPress Development - Wordcamp Brighton 2019
Supercharging WordPress Development - Wordcamp Brighton 2019Supercharging WordPress Development - Wordcamp Brighton 2019
Supercharging WordPress Development - Wordcamp Brighton 2019
Adam Tomat
Light Weight Transactions Under Stress (Christopher Batey, The Last Pickle) ...
Light Weight Transactions Under Stress  (Christopher Batey, The Last Pickle) ...Light Weight Transactions Under Stress  (Christopher Batey, The Last Pickle) ...
Light Weight Transactions Under Stress (Christopher Batey, The Last Pickle) ...
Vienna Feb 2015: Cassandra: How it works and what it's good for!
Vienna Feb 2015: Cassandra: How it works and what it's good for!Vienna Feb 2015: Cassandra: How it works and what it's good for!
Vienna Feb 2015: Cassandra: How it works and what it's good for!
Christopher Batey
Jan 2015 - Cassandra101 Manchester Meetup
Jan 2015 - Cassandra101 Manchester MeetupJan 2015 - Cassandra101 Manchester Meetup
Jan 2015 - Cassandra101 Manchester Meetup
Christopher Batey
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Altinity Ltd
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valley
Patrick McFadin
How to Realize an Additional 270% ROI on Snowflake
How to Realize an Additional 270% ROI on SnowflakeHow to Realize an Additional 270% ROI on Snowflake
How to Realize an Additional 270% ROI on Snowflake
Datacubes in Apache Hive at ApacheCon
Datacubes in Apache Hive at ApacheConDatacubes in Apache Hive at ApacheCon
Datacubes in Apache Hive at ApacheCon
Using grouping sets in sql server 2008 tech republic
Using grouping sets in sql server 2008   tech republicUsing grouping sets in sql server 2008   tech republic
Using grouping sets in sql server 2008 tech republicKaing Menglieng

Cassandra Day NYC - Cassandra anti patterns

  • 1. ©2013 DataStax Confidential. Do not distribute without consent. @chbatey Christopher Batey Staying in love with Cassandra
  • 2. @chbatey Anti patterns • Client side joins • Multi partition queries • Batches • Mutable data • Secondary indices • Retry policies?
  • 3. @chbatey Disclaimer • I’m going to be pedantic • I’m assuming you want to scale
  • 6. @chbatey Storing customer events • Customer event - customer_id - ChrisBatey - staff_id - Charlie - event_type - login, logout, add_to_basket, remove_from_basket - time • Store - name - store_type - Website, PhoneApp, Phone, Retail - location
  • 7. @chbatey Model CREATE TABLE customer_events( customer_id text, staff_id text, time timeuuid, event_type text, store_name text, PRIMARY KEY ((customer_id), time)); CREATE TABLE store( store_name text, location text, store_type text, PRIMARY KEY (store_name));
  • 8. @chbatey Reading this • Replication factor 3, CL = QUORUM • Read latest customer event • Read store information
  • 9. @chbatey One to one vs one to many • This required possibly 6 out of 8 of our nodes to be up • Imagine if there were multiple follow up queries
  • 10. @chbatey Multi-partition queries • Client side joins normally end up falling into the more general anti-pattern of multi partition queries
  • 11. @chbatey Adding staff link CREATE TABLE customer_events( customer_id text, staff set<text>, time timeuuid, event_type text, store_name text, PRIMARY KEY ((customer_id), time)); CREATE TABLE staff( name text, favourite_colour text, job_title text, PRIMARY KEY (name)); CREATE TABLE store( store_name text, location text, store_type text, PRIMARY KEY (store_name));
  • 12. @chbatey Queries select * from customer_events where customer_id = 'chbatey' limit 1; select * from staff where name in (staff1, staff2, staff3, staff4); If any one of these lookups fails, the whole query fails
  • 13. @chbatey Use a multi node cluster locally
  • 14. @chbatey Use a multi node cluster locally
    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address  Load       Tokens  Owns  Host ID                               Rack
    UN           102.27 KB  256     ?     15ad7694-3e76-4b74-aea0-fa3c0fa59532  rack1
    UN           102.18 KB  256     ?     cca7d0bb-e884-49f9-b098-e38fbe895cbc  rack1
    UN           93.16 KB   256     ?     1f9737d3-c1b8-4df1-be4c-d3b1cced8e30  rack1
    UN           102.1 KB   256     ?     fe27b958-5d3a-4f78-9880-76cb7c9bead1  rack1
    UN           93.18 KB   256     ?     66eb3f23-8889-44d6-a9e7-ecdd57ed61d0  rack1
    UN           102.12 KB  256     ?     e2e99a7b-c1fb-4f2a-9e4f-7a4666f8245e  rack1
  • 15. @chbatey Let’s see with a 6 node cluster INSERT INTO staff (name, favourite_colour, job_title) VALUES ('chbatey', 'red', 'Technical Evangelist'); INSERT INTO staff (name, favourite_colour, job_title) VALUES ('luket', 'red', 'Technical Evangelist'); INSERT INTO staff (name, favourite_colour, job_title) VALUES ('jonh', 'blue', 'Technical Evangelist'); select * from staff where name in ('chbatey', 'luket', 'jonh');
  • 16. @chbatey Trace with CL ONE = 4 nodes used
    Execute CQL3 query | 2015-02-02 06:39:58.759000 | | 0
    Parsing select * from staff where name in ('chbatey', 'luket', 'jonh'); [SharedPool-Worker-1] | 2015-02-02 06:39:58.766000 | | 7553
    Preparing statement [SharedPool-Worker-1] | 2015-02-02 06:39:58.768000 | | 9249
    Executing single-partition query on staff [SharedPool-Worker-3] | 2015-02-02 06:39:58.773000 | | 14255
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.773001 | | 14756
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.773001 | | 14928
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.774000 | | 16035
    Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.777000 | | 1156
    Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.777001 | | 1681
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.778000 | | 1944
    Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.778000 | | 1554
    Processing response from / [SharedPool-Worker-3] | 2015-02-02 06:39:58.779000 | | 20762
    Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.779000 | | 2425
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.779000 | | 21198
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.779000 | | 2639
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.779000 | | 21208
    Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.780000 | | 304
    Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.780001 | | 574
    Executing single-partition query on staff [SharedPool-Worker-2] | 2015-02-02 06:39:58.781000 | | 4075
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.781000 | | 708
    Enqueuing response to / [SharedPool-Worker-2] | 2015-02-02 06:39:58.781001 | | 4348
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.782000 | | 5371
    Executing single-partition query on staff [SharedPool-Worker-1] | 2015-02-02 06:39:58.783000 | | 2463
    Enqueuing response to / [SharedPool-Worker-1] | 2015-02-02 06:39:58.784000 | | 2905
    Sending message to / [WRITE-/] | 2015-02-02 06:39:58.784001 | | 3160
    Processing response from / [SharedPool-Worker-2] | 2015-02-02 06:39:58.785000 | | --
    Request complete | 2015-02-02 06:39:58.782995 | | 23995
  • 17. @chbatey Denormalise with UDTs CREATE TYPE store (name text, type text, postcode text); CREATE TYPE staff (name text, fav_colour text, job_title text); CREATE TABLE customer_events( customer_id text, time timeuuid, event_type text, store frozen<store>, staff set<frozen<staff>>, PRIMARY KEY ((customer_id), time)); User defined types: Cassandra 2.1 has landed in DSE 4.7!!!
  • 18. @chbatey Less obvious example • A good pattern for time series: Bucketing
  • 19. @chbatey Adding a time bucket CREATE TABLE customer_events_bucket( customer_id text, time_bucket text, time timeuuid, event_type text, store frozen<store>, staff set<frozen<staff>>, PRIMARY KEY ((customer_id, time_bucket), time));
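The slides do not say how the time_bucket value is computed. A minimal Java sketch of one possibility follows, assuming one bucket per UTC hour and a `yyyy-MM-dd|HH` format. The class name `TimeBuckets`, its methods, and the bucket format are all hypothetical and only illustrate the idea.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives hourly time_bucket values for customer_events_bucket.
final class TimeBuckets {
    // Assumed bucket format (one bucket per UTC hour), e.g. "2015-01-01|09".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'|'HH").withZone(ZoneOffset.UTC);

    // Bucket that an event with the given timestamp is written to.
    static String bucketFor(Instant time) {
        return FORMAT.format(time);
    }

    // All buckets a read of the range [from, to] has to touch, oldest first.
    static List<String> bucketsBetween(Instant from, Instant to) {
        List<String> buckets = new ArrayList<>();
        Instant cursor = from.truncatedTo(ChronoUnit.HOURS);
        while (!cursor.isAfter(to)) {
            buckets.add(bucketFor(cursor));
            cursor = cursor.plus(1, ChronoUnit.HOURS);
        }
        return buckets;
    }

    public static void main(String[] args) {
        Instant from = Instant.parse("2015-01-01T09:10:00Z");
        Instant to = Instant.parse("2015-01-01T10:05:00Z");
        // A range that crosses one hour boundary touches two buckets.
        System.out.println(bucketsBetween(from, to));
    }
}
```

The bucket size is the main design choice: it should be large enough that a typical query touches only one or two buckets, and small enough that a single partition does not grow without bound.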
  • 20. @chbatey Queries select * from customer_events_bucket where customer_id = 'chbatey' and time_bucket IN ('2015-01-01|0910:01', '2015-01-01|0910:02', '2015-01-01|0910:03', '2015-01-01|0910:04') Often better as multiple async queries
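The "multiple async queries" alternative to IN can be sketched in plain Java as below. `queryBucket` is a hypothetical stand-in for an asynchronous driver call that reads a single (customer_id, time_bucket) partition; the class and method names are assumptions, not part of the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Sketch: one asynchronous single-partition read per bucket instead of one IN query.
final class BucketFanOut {
    // queryBucket is a hypothetical stand-in for an async driver call that reads
    // one (customer_id, time_bucket) partition and returns its rows.
    static <R> CompletableFuture<List<R>> readBuckets(
            List<String> buckets,
            Function<String, CompletableFuture<List<R>>> queryBucket) {
        // Start every query before waiting on any of them.
        List<CompletableFuture<List<R>>> inFlight = new ArrayList<>();
        for (String bucket : buckets) {
            inFlight.add(queryBucket.apply(bucket));
        }
        // Combine the results in bucket order once all queries have completed.
        return CompletableFuture
                .allOf(inFlight.toArray(new CompletableFuture<?>[0]))
                .thenApply(done -> {
                    List<R> rows = new ArrayList<>();
                    for (CompletableFuture<List<R>> f : inFlight) {
                        rows.addAll(f.join());
                    }
                    return rows;
                });
    }
}
```

In this sketch a single failed bucket still fails the combined future. The difference from IN is that each partition is read by its own request, so a caller could instead inspect each future separately and retry or skip only the bucket that failed.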
  • 21. @chbatey Tips for avoiding joins & multi gets • Say no to client side joins by denormalising - Much easier with UDTs • When bucketing aim for at most two buckets for a query • Get in the habit of reading trace + using a multi node cluster locally
  • 23. @chbatey Unlogged Batches • Unlogged batches - Send many statements as one
  • 24. @chbatey Customer events table CREATE TABLE IF NOT EXISTS customer_events ( customer_id text, staff_id text, store_type text, time timeuuid, event_type text, PRIMARY KEY (customer_id, time));
  • 25. @chbatey Inserting events INSERT INTO events.customer_events (customer_id, time, event_type, staff_id, store_type) VALUES (?, ?, ?, ?, ?) public void storeEvent(ConsistencyLevel consistencyLevel, CustomerEvent customerEvent) {
 BoundStatement boundInsert = insertStatement.bind( customerEvent.getCustomerId(), customerEvent.getTime(), customerEvent.getEventType(), customerEvent.getStaffId(), customerEvent.getStoreType() /* assumed getter for store_type */); boundInsert.setConsistencyLevel(consistencyLevel); session.execute(boundInsert); }
  • 26. @chbatey Batch insert - Good idea? public void storeEvents(ConsistencyLevel consistencyLevel, CustomerEvent... events) {
 BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.UNLOGGED);
 for (CustomerEvent event : events) { BoundStatement boundInsert = insertStatement.bind( event.getCustomerId(), event.getTime(), event.getEventType(), event.getStaffId(), event.getStoreType() /* assumed getter for store_type */);
 batchStatement.add(boundInsert); } session.execute(batchStatement); }
  • 27. @chbatey Not so much • Very likely to fail if nodes are down or overloaded • Gives far too much work to the coordinator • Ruins the performance gains of a token-aware driver
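  • 29. @chbatey Unlogged batch use case public void storeEvents(String customerId, ConsistencyLevel consistencyLevel, CustomerEvent... events) {
 BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.UNLOGGED);
 for (CustomerEvent event : events) {
 BoundStatement boundInsert = insertStatement.bind( customerId /* partition key, the same for every statement */, event.getTime(), event.getEventType(), event.getStaffId(), event.getStoreType() /* assumed getter for store_type */);
 batchStatement.add(boundInsert); }
 ResultSet execute = session.execute(batchStatement);
 } Partition Key!!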
  • 30. @chbatey Writing this client C All for the same partition :)
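When the incoming events belong to several customers, they can be split by partition key first, so that each unlogged batch stays within one partition. The following is a minimal sketch in plain Java; `PartitionGrouping` and the `partitionKeyOf` accessor are hypothetical names, and the driver calls that would build and execute each batch are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch: split a mixed list of events into per-partition groups, so that each
// unlogged batch only ever contains statements for a single partition key.
final class PartitionGrouping {
    // partitionKeyOf is an assumed accessor, e.g. CustomerEvent::getCustomerId.
    static <E> Map<String, List<E>> groupByPartition(
            List<E> events, Function<E, String> partitionKeyOf) {
        // LinkedHashMap keeps partitions in first-seen order.
        Map<String, List<E>> groups = new LinkedHashMap<>();
        for (E event : events) {
            groups.computeIfAbsent(partitionKeyOf.apply(event), k -> new ArrayList<>())
                  .add(event);
        }
        return groups;
    }
}
```

Each map entry would then become one unlogged batch, sent to a replica that owns that partition when the driver is token aware.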
  • 32. @chbatey Logged Batches • Once accepted the statements will eventually succeed • Achieved by writing them to a distributed batchlog • 30% slower than unlogged batches
  • 33. @chbatey Batch insert - Good idea? public void storeEvents(ConsistencyLevel consistencyLevel, CustomerEvent... events) {
 BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.LOGGED);
 for (CustomerEvent event : events) { BoundStatement boundInsert = insertStatement.bind( event.getCustomerId(), event.getTime(), event.getEventType(), event.getStaffId(), event.getStoreType() /* assumed getter for store_type */);
 batchStatement.add(boundInsert); } session.execute(batchStatement); }
  • 34. @chbatey Not so much [Diagram: client, C, batch log and two BL-R nodes; BL-R = batch log replica]
  • 35. @chbatey Use case? CREATE TABLE IF NOT EXISTS customer_events ( customer_id text, staff_id text, store_type text, time timeuuid, event_type text, PRIMARY KEY (customer_id, time)); CREATE TABLE IF NOT EXISTS customer_events_by_staff ( customer_id text, staff_id text, store_type text, time timeuuid, event_type text, PRIMARY KEY (staff_id, time));
  • 36. @chbatey Storing events to both tables in a batch public void storeEventLogged(ConsistencyLevel consistencyLevel, CustomerEvent customerEvent) {
 BoundStatement boundInsertForCustomerId = insertByCustomerId.bind(customerEvent.getCustomerId(), customerEvent.getTime(), customerEvent.getEventType(), customerEvent.getStaffId(), customerEvent.getStoreType() /* assumed getter for store_type */);
 BoundStatement boundInsertForStaffId = insertByStaffId.bind(customerEvent.getCustomerId(), customerEvent.getTime(), customerEvent.getEventType(), customerEvent.getStaffId(), customerEvent.getStoreType() /* assumed getter for store_type */);
 BatchStatement batchStatement = new BatchStatement(BatchStatement.Type.LOGGED); batchStatement.add(boundInsertForCustomerId); batchStatement.add(boundInsertForStaffId);
 ResultSet execute = session.execute(batchStatement); }
  • 38. @chbatey Distributed mutable state :-( create TABLE accounts(customer text PRIMARY KEY, balance_in_pence int); Overly naive example • Cassandra uses last write wins (no Vector clocks) • Read before write is an anti-pattern - race conditions and latency
  • 39. @chbatey Take a page from Event sourcing create table accounts_log( customer text, time timeuuid, delta int, primary KEY (customer, time)); All records for the same customer in the same storage row Transactions ordered by TIMEUUID - UUIDs are your friends in distributed systems
  • 40. @chbatey Tips • Avoid mutable state in a distributed system, favour an immutable log • Can roll up and snapshot to avoid the calculation getting too big • If you really have to, then check out LWTs and do it via CAS - this will be slower
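The log-plus-snapshot idea can be sketched in plain Java as below: the balance is never updated in place, it is derived from immutable deltas. The class `AccountLog` and the notion of a stored snapshot balance are assumptions for illustration; the deltas stand in for the `delta` column of accounts_log.

```java
import java.util.List;

// Sketch of the accounts_log idea: the balance is derived by summing immutable
// deltas, optionally starting from a previously rolled-up snapshot.
final class AccountLog {
    // snapshotBalanceInPence: balance at the time of the last roll-up (0 if none).
    // deltasSinceSnapshot: delta values logged after that roll-up, in time order.
    static long balance(long snapshotBalanceInPence, List<Integer> deltasSinceSnapshot) {
        long balance = snapshotBalanceInPence;
        for (int delta : deltasSinceSnapshot) {
            balance += delta;
        }
        return balance;
    }
}
```

For example, `AccountLog.balance(0, Arrays.asList(1000, -250, -100))` yields 650. Storing that value as a snapshot means a later read only needs the deltas written since, such as `AccountLog.balance(650, Arrays.asList(50))`.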
  • 42. @chbatey Customer events table CREATE TABLE if NOT EXISTS customer_events ( customer_id text, staff_id text, store_type text, time timeuuid , event_type text, PRIMARY KEY (customer_id, time)) create INDEX on customer_events (staff_id) ;
  • 43. @chbatey Indexes to the rescue?
    customer_id  time                 staff_id
    chbatey      2015-03-03 08:52:45  trevor
    chbatey      2015-03-03 08:52:54  trevor
    chbatey      2015-03-03 08:53:11  bill
    chbatey      2015-03-03 08:53:18  bill
    rusty        2015-03-03 08:56:57  bill
    rusty        2015-03-03 08:57:02  bill
    rusty        2015-03-03 08:57:20  trevor

    staff_id  customer_id
    trevor    chbatey
    trevor    chbatey
    bill      chbatey
    bill      chbatey
    bill      rusty
    bill      rusty
    trevor    rusty
  • 44. @chbatey Indexes to the rescue?
    A (chbatey):
    staff_id  customer_id
    trevor    chbatey
    trevor    chbatey
    bill      chbatey
    bill      chbatey

    B (rusty):
    staff_id  customer_id
    bill      rusty
    bill      rusty
    trevor    rusty

    customer_events table:
    customer_id  time                 staff_id
    chbatey      2015-03-03 08:52:45  trevor
    chbatey      2015-03-03 08:52:54  trevor
    chbatey      2015-03-03 08:53:11  bill
    chbatey      2015-03-03 08:53:18  bill
    rusty        2015-03-03 08:56:57  bill
    rusty        2015-03-03 08:57:02  bill
    rusty        2015-03-03 08:57:20  trevor

    staff_id index:
    staff_id  customer_id
    trevor    chbatey
    trevor    chbatey
    bill      chbatey
    bill      chbatey
    bill      rusty
    bill      rusty
    trevor    rusty
  • 45. @chbatey Do-it-yourself index CREATE TABLE IF NOT EXISTS customer_events ( customer_id text, staff_id text, store_type text, time timeuuid, event_type text, PRIMARY KEY (customer_id, time)); CREATE TABLE IF NOT EXISTS customer_events_by_staff ( customer_id text, staff_id text, store_type text, time timeuuid, event_type text, PRIMARY KEY (staff_id, time));
  • 46. @chbatey Anti patterns summary • Most anti patterns are very obvious in trace output • Don’t test on small clusters, most problems only occur at scale
  • 50. @chbatey Retrying writes • Cassandra does not roll back! • R3 has the data and the coordinator has hinted for R1 and R2
  • 51. @chbatey Write timeout • Received acknowledgements • Required acknowledgements • Consistency level • CAS and Batches are more complicated: - WriteType: SIMPLE, BATCH, UNLOGGED_BATCH, BATCH_LOG, CAS
  • 52. @chbatey Batches • BATCH_LOG - Timed out waiting for batch log replicas • BATCH - Written to batch log but timed out waiting for actual replica - Will eventually be committed
  • 53. @chbatey Idempotent writes • All writes are idempotent with the following exceptions: - Counter updates - List appends and prepends
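Why a retry is safe for some writes and not for others can be shown with a small in-memory illustration in Java. This is not Cassandra itself: the class `IdempotencyDemo` and its fields are hypothetical and only model the three kinds of write named above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration (not Cassandra itself) of why a retried write is safe
// for a plain column but not for a counter or a list append.
final class IdempotencyDemo {
    final Map<String, Integer> balance = new HashMap<>();  // plain column: overwrite
    final Map<String, Integer> counter = new HashMap<>();  // counter: increment
    final List<String> events = new ArrayList<>();         // list: append

    // Applying this twice leaves the same value as applying it once.
    void setBalance(String customer, int value) {
        balance.put(customer, value);
    }

    // Applying this twice adds the amount twice.
    void increment(String customer, int by) {
        counter.merge(customer, by, Integer::sum);
    }

    // Applying this twice leaves two copies of the element.
    void append(String event) {
        events.add(event);
    }
}
```

If a write times out and the client cannot tell whether it was applied, repeating `setBalance` is harmless, while repeating `increment` or `append` may double-apply the change.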
  • 54. @chbatey Anti patterns summary • Most anti patterns are very obvious in trace output • Don’t test on small clusters, most problems only occur at scale
  • 56. @chbatey Cassandra as a Queue (tombstones)
  • 57. @chbatey Well documented anti-pattern • patterns-queues-and-queue-like-datasets • modeling-around-deletes-or-using-cassandra-as-a-queue-even-when-you-know-better/
  • 58. @chbatey Requirements • Produce data in one DC • Consume it once and delete from another DC • Use Cassandra?
  • 59. @chbatey Datamodel CREATE TABLE queues ( name text, enqueued_at timeuuid, payload blob, PRIMARY KEY (name, enqueued_at) ); SELECT enqueued_at, payload FROM queues WHERE name = 'queue-1' LIMIT 1;
  • 60. @chbatey Trace
    activity                                   | source | elapsed
    -------------------------------------------+--------+--------
    execute_cql3_query                         |        | 0
    Parsing statement                          |        | 48
    Peparing statement                         |        | 362
    Message received from /                    |        | 42
    Sending message to /                       |        | 718
    Executing single-partition query on queues |        | 145
    Acquiring sstable references               |        | 158
    Merging memtable contents                  |        | 189
    Merging data from memtables and 0 sstables |        | 235
    Read 1 live and 19998 tombstoned cells     |        | 251102
    Enqueuing response to /                    |        | 252976
    Sending message to /                       |        | 253052
    Message received from /                    |        | 324314
    Not good :(
  • 61. @chbatey Tips • Avoid lots of deletes + range queries in the same partition • Avoid it with data modelling / query pattern: - Move partition - Add a start for your range: e.g. enqueued_at > 9d1cb818-9d7a-11b6-96ba-60c5470cbf0e
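The effect of adding a start to the range can be illustrated with a small in-memory model in Java. This is not Cassandra itself: the class `QueueScan` is hypothetical, `long` keys stand in for `enqueued_at` timeuuids, and a `false` value models a tombstone left behind by a consumed entry.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory model (not Cassandra itself) of one queue partition ordered by
// enqueued_at, where consumed entries remain as tombstones until they are purged.
final class QueueScan {
    // Value true = live cell, false = tombstone. Keys stand in for timeuuids.
    // Returns how many cells are visited before the first live one is found,
    // reading only cells with a key greater than startExclusive.
    static int cellsScanned(NavigableMap<Long, Boolean> partition, long startExclusive) {
        int scanned = 0;
        for (boolean live : partition.tailMap(startExclusive, false).values()) {
            scanned++;
            if (live) {
                break;
            }
        }
        return scanned;
    }

    public static void main(String[] args) {
        NavigableMap<Long, Boolean> partition = new TreeMap<>();
        for (long i = 1; i <= 1000; i++) {
            partition.put(i, false);      // already consumed and deleted
        }
        partition.put(1001L, true);       // the only live entry
        System.out.println(cellsScanned(partition, 0L));
        System.out.println(cellsScanned(partition, 1000L));
    }
}
```

A consumer that remembers the last `enqueued_at` it processed and passes it as the lower bound skips the tombstones in this model, which mirrors the `enqueued_at > ...` tip above.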