Christopher Batey

Technical Evangelist for Apache Cassandra
Cassandra: How it works and what it's good for!
Who am I?
• Technical Evangelist for Apache Cassandra
• Founder of Stubbed Cassandra
• Help out Apache Cassandra users
• DataStax
• Builds enterprise ready version of Apache Cassandra
• Previous: Cassandra backed apps at BSkyB
• Distributed databases: What and why?
• Cassandra use cases
• Replication
• Fault tolerance
• Read and write path
• Data modelling
• Java Driver
Distributed databases
It is a big world
• Relational
- Oracle, PostgreSQL
• Graph databases
- Neo4J, InfoGrid, Titan
• Key value
- DynamoDB
• Document stores
- MongoDB, Couchbase
• Columnar aka wide row
- Cassandra, HBase
Building a web app
Running multiple copies of your app
Still in one DC?
Handling hardware failure
•Master serves all writes
•Read from master and optionally slaves
• No master
• Read/write to any
• Consistency?
Decisions decisions… CAP theorem
Are these really that different??
Relational Database
Mongo, Redis
Highly Available Databases:
Voldemort, Cassandra
• If Partition: Trade off Availability vs Consistency
• Else: Trade off Latency vs Consistency
Cassandra use cases
Cassandra for Applications
Common use cases
• Ordered data such as time series
- Event stores
- Financial transactions
- Sensor data, e.g. IoT
• Non-functional requirements:
- Linear scalability
- High throughput durable writes
- Multi datacenter including active-active
- Analytics without ETL
Work loads
• High write and/or throughput
• Many small queries
• Distributed across the world (eventually consistent)
Cassandra deep dive
• Distributed masterless database (Dynamo)
• Column family data model (Google BigTable)
• Datacenter and rack aware
• Multi data centre replication built in from the start
• Analytics with Apache Spark
Dynamo 101
• The parts Cassandra took
- Consistent hashing
- Replication
- Strategies for replication
- Gossip
- Hinted handoff
- Anti-entropy repair
• And the parts it left behind
- Key/Value
- Vector clocks
Picking the right nodes
• You don’t want a full table scan on a 1000 node cluster!
• Dynamo to the rescue: Consistent Hashing
• Then the replication strategy takes over:
- Network topology
- Simple
Murmur3 Example
• Data:
Primary Key   Columns
jim           age: 36   car: ford   gender: M
carol         age: 37   car: bmw    gender: F
johnny        age: 12               gender: M
suzy          age: 10               gender: F
• Murmur3 Hash Values:
Primary Key   Murmur3 hash value
jim           350
carol         998
johnny        50
suzy          600
Real hash range: -9223372036854775808 to 9223372036854775807
Murmur3 Example
Four node cluster:
Node Murmur3 start range Murmur3 end range
A 0 249
B 250 499
C 500 749
D 750 999
Pictures are better
Murmur3 Example
Data is distributed as:
Node   Start range   End range   Primary Key   Hash value
A      0             249         johnny        50
B      250           499         jim           350
C      500           749         suzy          600
D      750           999         carol         998
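The lookup in the table above can be sketched in a few lines of Java. This is only a toy model of the slide's 0 to 999 ring, not Cassandra's partitioner: the class name `ToyTokenRing` and its methods are made up for illustration, and the hash values are taken from the table rather than computed with Murmur3.

```java
import java.util.TreeMap;

class ToyTokenRing {
    // Maps the first token of each range to the node that owns that range.
    private final TreeMap<Integer, String> rangeStarts = new TreeMap<>();

    void addNode(String node, int rangeStart) {
        rangeStarts.put(rangeStart, node);
    }

    // The owner is the node whose range start is the greatest one <= token.
    String ownerOf(int token) {
        if (rangeStarts.isEmpty() || token < rangeStarts.firstKey()) {
            throw new IllegalArgumentException("token outside the toy ring: " + token);
        }
        return rangeStarts.floorEntry(token).getValue();
    }

    // The four node cluster from the slide: A 0-249, B 250-499, C 500-749, D 750-999.
    static ToyTokenRing slideExample() {
        ToyTokenRing ring = new ToyTokenRing();
        ring.addNode("A", 0);
        ring.addNode("B", 250);
        ring.addNode("C", 500);
        ring.addNode("D", 750);
        return ring;
    }

    public static void main(String[] args) {
        ToyTokenRing ring = slideExample();
        System.out.println("johnny (50)  -> " + ring.ownerOf(50));
        System.out.println("jim    (350) -> " + ring.ownerOf(350));
        System.out.println("suzy   (600) -> " + ring.ownerOf(600));
        System.out.println("carol  (998) -> " + ring.ownerOf(998));
    }
}
```

A sorted map keeps the lookup logarithmic in the number of ranges, which is why no node has to scan the whole cluster to find where a key lives.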
Replication strategy
• Simple
- Give it to the next node in the ring
- Don’t use this in production
• NetworkTopology
- Every Cassandra node knows its DC and Rack
- Replicas won’t be put on the same rack unless Replication Factor > # of racks
- Unfortunately Cassandra can’t create servers and racks on the fly to fix this :(
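A minimal sketch of the Simple strategy, assuming the four node ring A, B, C, D from the Murmur3 example: start at the node that owns the token and walk clockwise until enough replicas are collected. `SimplePlacementSketch` is a hypothetical name, and the sketch ignores racks and data centres, which is exactly what NetworkTopology adds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SimplePlacementSketch {
    // Nodes in ring order, as in the four node example.
    static final List<String> RING_ORDER = Arrays.asList("A", "B", "C", "D");

    // Start at the primary owner and take the next nodes clockwise
    // until replicationFactor distinct nodes have been collected.
    static List<String> replicas(String primary, int replicationFactor) {
        int start = RING_ORDER.indexOf(primary);
        if (start < 0) {
            throw new IllegalArgumentException("unknown node: " + primary);
        }
        // There cannot be more replicas than nodes.
        int count = Math.min(replicationFactor, RING_ORDER.size());
        List<String> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            result.add(RING_ORDER.get((start + i) % RING_ORDER.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        // carol hashes to node D; with RF = 3 the walk wraps around the ring.
        System.out.println(replicas("D", 3));
    }
}
```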
CL = 1 We have replication!
Tunable Consistency
• Data is replicated N times
• Every query that you execute you give a consistency level
• Christos Kalantzis Eventual Consistency != Hopeful Consistency: http://
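The arithmetic behind tunable consistency is small enough to sketch. The two rules below are standard for Dynamo-style stores: a quorum is a majority of the replicas, and a read overlaps the latest acknowledged write when the read and write replica counts add up to more than N. The class and method names are illustrative only.

```java
class ConsistencySketch {
    // A quorum is a majority of the replicas: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // A read is guaranteed to touch at least one replica that took the latest
    // acknowledged write when write replicas + read replicas > replication factor.
    static boolean readsSeeLatestWrite(int replicationFactor, int writeReplicas, int readReplicas) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("quorum for RF=3: " + q);
        System.out.println("QUORUM write + QUORUM read overlap: " + readsSeeLatestWrite(rf, q, q));
        System.out.println("ONE write + ONE read overlap: " + readsSeeLatestWrite(rf, 1, 1));
    }
}
```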
Handling hardware failure
Async Replication
Load balancing
•Data centre aware policy
•Token aware policy
•Latency aware policy
•Whitelist policy
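To give a feel for what a data centre aware policy does, here is a hedged sketch: keep only the hosts in the local DC and hand them out round robin as coordinators. This is not the driver's implementation; `DcAwareRoundRobinSketch`, the host names and the DC names are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class DcAwareRoundRobinSketch {
    private final List<String> localHosts = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    // hosts is an array of {address, datacenter} pairs.
    DcAwareRoundRobinSketch(String localDc, String[][] hosts) {
        for (String[] host : hosts) {
            if (host[1].equals(localDc)) {
                localHosts.add(host[0]);
            }
        }
        if (localHosts.isEmpty()) {
            throw new IllegalArgumentException("no hosts in local DC " + localDc);
        }
    }

    // Cycles through the local hosts; remote hosts are never returned.
    String nextCoordinator() {
        int index = Math.floorMod(next.getAndIncrement(), localHosts.size());
        return localHosts.get(index);
    }

    public static void main(String[] args) {
        DcAwareRoundRobinSketch policy = new DcAwareRoundRobinSketch("dc1",
                new String[][] {{"n1", "dc1"}, {"n2", "dc2"}, {"n3", "dc1"}});
        for (int i = 0; i < 4; i++) {
            System.out.println(policy.nextCoordinator());
        }
    }
}
```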
But what happens when they come back?
• Hinted handoff to the rescue
• Coordinators keep writes for downed nodes for a configurable amount of time, default 3 hours
• Longer than that: run a repair
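The decision in the bullets above boils down to comparing the outage with the hint window. A small sketch, assuming the 3 hour default from the slide; the class and method names are hypothetical.

```java
import java.time.Duration;

class HintWindowSketch {
    // Default window quoted on the slide.
    static final Duration DEFAULT_HINT_WINDOW = Duration.ofHours(3);

    // Hints only cover writes made while the node was down for at most the window.
    static boolean hintsCoverOutage(Duration downtime, Duration hintWindow) {
        return downtime.compareTo(hintWindow) <= 0;
    }

    static String recoveryAction(Duration downtime) {
        return hintsCoverOutage(downtime, DEFAULT_HINT_WINDOW) ? "replay hints" : "run repair";
    }

    public static void main(String[] args) {
        System.out.println("down 30 minutes: " + recoveryAction(Duration.ofMinutes(30)));
        System.out.println("down 5 hours:    " + recoveryAction(Duration.ofHours(5)));
    }
}
```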
Anti entropy repair
• Not exciting but mandatory :)
• New in 2.1 - incremental repair <— awesome
Don’t forget to be social
• Each node talks to a few of the other nodes and shares what it knows
"Did you hear node 1 was down???"
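A toy version of that exchange: each node keeps the newest heartbeat version it has heard for every node, and a gossip round merges two such views by keeping the higher version. This is a simplification for illustration, not Cassandra's gossip protocol; `GossipSketch` and its fields are invented names.

```java
import java.util.HashMap;
import java.util.Map;

class GossipSketch {
    // What this node believes: node name -> newest heartbeat version it has heard of.
    final Map<String, Long> heartbeats = new HashMap<>();

    // Keep the higher of the known and the reported version.
    void recordHeartbeat(String node, long version) {
        heartbeats.merge(node, version, Long::max);
    }

    // One gossip round: afterwards both sides hold the newest version for every node.
    void gossipWith(GossipSketch peer) {
        peer.heartbeats.forEach(this::recordHeartbeat);
        this.heartbeats.forEach(peer::recordHeartbeat);
    }

    public static void main(String[] args) {
        GossipSketch a = new GossipSketch();
        GossipSketch b = new GossipSketch();
        a.recordHeartbeat("node1", 5);
        b.recordHeartbeat("node1", 3);
        b.recordHeartbeat("node2", 7);
        a.gossipWith(b);
        System.out.println("a knows: " + a.heartbeats);
        System.out.println("b knows: " + b.heartbeats);
    }
}
```

A node whose heartbeat version stops advancing in everyone's view is the one the others start to suspect is down.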
Scaling shouldn’t be hard
• Throw more nodes at a cluster
• Bootstrapping + joining the ring
• For large data sets this can take some time
Data modelling
You must denormalise
Your model should look like your queries
•Cassandra Query Language
-SQL like query language
•Keyspace – analogous to a schema
- The keyspace determines the RF (replication factor)
• Table – looks like a SQL table
CREATE TABLE scores (
  name text,
  score int,
  date timestamp,
  PRIMARY KEY (name, score)
);
INSERT INTO scores (name, score, date)
VALUES ('bob', 42, '2012-06-24');
INSERT INTO scores (name, score, date)
VALUES ('bob', 47, '2012-06-25');
SELECT date, score FROM scores WHERE name='bob' AND score >= 40;
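Why that range query is cheap can be shown with plain Java collections: the partition key picks one sorted map, and the clustering column is the sort key inside it. This is a mental model only, not how the storage engine is written; `ScoresPartitionSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class ScoresPartitionSketch {
    // partition key (name) -> rows sorted by clustering column (score) -> date
    private final Map<String, NavigableMap<Integer, String>> partitions = new HashMap<>();

    void insert(String name, int score, String date) {
        partitions.computeIfAbsent(name, k -> new TreeMap<Integer, String>()).put(score, date);
    }

    // In spirit: SELECT date, score FROM scores WHERE name = ? AND score >= ?
    NavigableMap<Integer, String> scoresAtLeast(String name, int minScore) {
        NavigableMap<Integer, String> rows = partitions.get(name);
        if (rows == null) {
            return new TreeMap<Integer, String>();
        }
        // One lookup for the partition, then a contiguous slice of the sorted rows.
        return rows.tailMap(minScore, true);
    }

    public static void main(String[] args) {
        ScoresPartitionSketch scores = new ScoresPartitionSketch();
        scores.insert("bob", 42, "2012-06-24");
        scores.insert("bob", 47, "2012-06-25");
        System.out.println(scores.scoresAtLeast("bob", 40));
    }
}
```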
Lots of types
• Universally Unique ID (UUID)
- 128 bit number represented in character form e.g.
• Easily generated on the client
- Version 1 has a timestamp component (TIMEUUID)
- Version 4 has no timestamp component
TIMEUUID data type supports Version 1 UUIDs
Generated using time (60 bits), a clock sequence number (14 bits), and MAC address (48 bits)
– CQL function ‘now()’ generates a new TIMEUUID
Time can be extracted from TIMEUUID
– CQL function dateOf() extracts the timestamp as a date
TIMEUUID values in clustering columns or in column names are ordered based on time
– DESC order on TIMEUUID lists most recent data first
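How the time is packed into a version 1 UUID can be sketched with `java.util.UUID`. The sketch below builds a version 1 UUID from a Unix timestamp and reads the time back, roughly what dateOf() does on the server. It leaves the clock sequence and node bits at zero, which a real TIMEUUID would not do, and the class and method names are assumptions for illustration.

```java
import java.util.UUID;

class TimeUuidSketch {
    // Number of 100 ns intervals between the UUID epoch (1582-10-15)
    // and the Unix epoch (1970-01-01).
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    // Builds a version 1 UUID for the given Unix time in milliseconds.
    // Clock sequence and node are left at zero to keep the example small.
    static UUID fromUnixMillis(long unixMillis) {
        long ts = unixMillis * 10_000 + UUID_EPOCH_OFFSET;
        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHigh = (ts >>> 48) & 0x0FFFL;
        long msb = (timeLow << 32) | (timeMid << 16) | (1L << 12) | timeHigh;
        long lsb = 0x8000000000000000L; // variant bits only
        return new UUID(msb, lsb);
    }

    // Extracts the Unix time in milliseconds from a version 1 UUID.
    static long toUnixMillis(UUID timeUuid) {
        return (timeUuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000;
    }

    public static void main(String[] args) {
        UUID id = fromUnixMillis(1416329524000L);
        System.out.println("version:   " + id.version());
        System.out.println("unix time: " + toUnixMillis(id));
    }
}
```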
CREATE TABLE videos (
  videoid uuid,
  userid uuid,
  name varchar,
  description varchar,
  location text,
  location_type int,
  preview_thumbnails map<text,text>,
  tags set<varchar>,
  added_date timestamp,
  PRIMARY KEY (videoid)
);
Data Model - User Defined Types
• Complex data in one place
• No multi-gets (multi-partitions)
• Nesting!
CREATE TYPE address (
  street text,
  city text,
  zip_code int,
  country text,
  cross_streets set<text>
);
Data Model - Updated
• We can embed video_metadata in videos
CREATE TYPE video_metadata (
  height int,
  width int,
  video_bit_rate set<text>,
  encoding text
);
CREATE TABLE videos (
  videoid uuid,
  userid uuid,
  name varchar,
  description varchar,
  location text,
  location_type int,
  preview_thumbnails map<text,text>,
  tags set<varchar>,
  metadata set <frozen<video_metadata>>,
  added_date timestamp,
  PRIMARY KEY (videoid)
);
Data Model - Storing JSON
{
  "productId": 2,
  "name": "Kitchen Table",
  "price": 249.99,
  "description": "Rectangular table with oak finish",
  "dimensions": {
    "units": "inches",
    "length": 50.0,
    "width": 66.0,
    "height": 32
  },
  "categories": {
    "Home Furnishings": {
      "catalogPage": 45,
      "url": "/home/furnishings"
    },
    "Kitchen Furnishings": {
      "catalogPage": 108,
      "url": "/kitchen/furnishings"
    }
  }
}
CREATE TYPE dimensions (
  units text,
  length float,
  width float,
  height float
);
CREATE TYPE category (
  catalogPage int,
  url text
);
CREATE TABLE product (
  productId int,
  name text,
  price float,
  description text,
  dimensions frozen <dimensions>,
  categories map <text, frozen <category>>,
  PRIMARY KEY (productId)
);
Tuple type
• Type to represent a
• Up to 256 different
CREATE TABLE tuple_table (
  id int PRIMARY KEY,  -- key column assumed; the key line is missing from the slide text
  three_tuple frozen <tuple<int, text, float>>,
  four_tuple frozen <tuple<int, text, float, inet>>,
  five_tuple frozen <tuple<int, text, float, inet, ascii>>
);
Counters
• Old has been around since 0.8
• Commit log replay changes counters
• Repair can change a counter
Time-to-Live (TTL)
TTL a row:
INSERT INTO users (id, first, last) VALUES ('abc123', 'catherine', 'cachart')
USING TTL 3600; // Expires data in one hour

TTL a column:
UPDATE users USING TTL 30 SET last = 'miller' WHERE id = 'abc123';
– TTL in seconds
– Can also set default TTL at a table level
– Expired columns/values automatically deleted
– With no TTL specified, columns/values never expire
– TTL is useful for automatic deletion
– Re-inserting the same row before it expires will overwrite TTL
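The TTL rules above can be modelled with a tiny in-memory store: every write records when it expires, reads hide expired values, and a re-insert replaces the old expiry. This is an illustration of the semantics only; `TtlSketch` is a made-up class, and time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

class TtlSketch {
    private static final class Cell {
        final String value;
        final long expiresAtSeconds;

        Cell(String value, long expiresAtSeconds) {
            this.value = value;
            this.expiresAtSeconds = expiresAtSeconds;
        }
    }

    private final Map<String, Cell> cells = new HashMap<>();

    // ttlSeconds <= 0 means no TTL: the value never expires.
    void write(String key, String value, long nowSeconds, long ttlSeconds) {
        long expiresAt = ttlSeconds > 0 ? nowSeconds + ttlSeconds : Long.MAX_VALUE;
        // Writing again replaces both the value and its expiry time.
        cells.put(key, new Cell(value, expiresAt));
    }

    // Returns null once the value has expired, as if it had been deleted.
    String read(String key, long nowSeconds) {
        Cell cell = cells.get(key);
        return (cell == null || nowSeconds >= cell.expiresAtSeconds) ? null : cell.value;
    }

    public static void main(String[] args) {
        TtlSketch users = new TtlSketch();
        users.write("abc123", "catherine", 0, 3600);
        System.out.println("after 10 minutes: " + users.read("abc123", 600));
        System.out.println("after 2 hours:    " + users.read("abc123", 7200));
    }
}
```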
Example Time: Customer event store
An example: Customer event store
• Customer event
- customer_id e.g ChrisBatey
- event_type e.g login, logout, add_to_basket,
remove_from_basket, buy_item
• Staff
- name e.g Charlie
- favourite_colour e.g red
• Store
- name
- type e.g Website, PhoneApp, Phone, Retail
• Get all events
• Get all events for a particular customer
• As above for a time slice
Modelling in a relational database
CREATE TABLE customer_events(
  customer_id text,
  staff_name text,
  time timeuuid,
  event_type text,
  store_name text,
  PRIMARY KEY (customer_id));

-- table name assumed from the entity list
CREATE TABLE store(
  name text,
  location text,
  store_type text,
  PRIMARY KEY (name));

-- table name assumed from the entity list
CREATE TABLE staff(
  name text,
  favourite_colour text,
  job_title text,
  PRIMARY KEY (name));
Modelling in Cassandra
CREATE TABLE customer_events(
customer_id text,
staff_id text,
time timeuuid,
store_type text,
event_type text,
tags map<text, text>,
PRIMARY KEY ((customer_id), time));
Partition Key: customer_id
Clustering Column(s): time
How it is stored on disk
customer_id   time                  event_type   store_type   tags
charles       2014-11-18 16:52:04   basket_add   online       {'item': 'coffee'}
charles       2014-11-18 16:53:00   basket_add   online       {'item': 'wine'}
charles       2014-11-18 16:53:09   logout       online       {}
chbatey       2014-11-18 16:52:21   login        online       {}
chbatey       2014-11-18 16:53:21   basket_add   online       {'item': 'coffee'}
chbatey       2014-11-18 16:54:00   basket_add   online       {'item': 'cheese'}
• DataStax (open source)
- C#, Java, C++, Python, Node, Ruby
- Very similar programming API
• Other open source
- Go
- Clojure
- Erlang
- Haskell
- Many more Java/Python drivers
- Perl
DataStax Java Driver
• Open source
Get all the events
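public List<CustomerEvent> getAllCustomerEvents() {
    return session.execute("select * from customers.customer_events")
            .all().stream()
            .map(mapCustomerEvent())
            .collect(Collectors.toList());
}

// Maps a driver Row to a CustomerEvent; the constructor argument order is assumed.
private Function<Row, CustomerEvent> mapCustomerEvent() {
    return row -> new CustomerEvent(
            row.getString("customer_id"),
            row.getUUID("time"),
            row.getString("staff_id"),
            row.getString("store_type"),
            row.getString("event_type"),
            row.getMap("tags", String.class, String.class));
}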
All events for a particular customer
private PreparedStatement getEventsForCustomer;

// Prepare once and reuse; the statement is parsed by the cluster a single time.
public void prepareStatements() {
    getEventsForCustomer =
            session.prepare("select * from customers.customer_events where customer_id = ?");
}

public List<CustomerEvent> getCustomerEvents(String customerId) {
    BoundStatement boundStatement = getEventsForCustomer.bind(customerId);
    return session.execute(boundStatement)
            .all().stream()
            .map(mapCustomerEvent())
            .collect(Collectors.toList());
}
Customer events for a time slice
public List<CustomerEvent> getCustomerEventsForTime(String customerId, long startTime,
                                                    long endTime) {
    // Query builder equivalent of a time slice on the clustering column.
    Select.Where getCustomers =
            QueryBuilder.select()
                    .all()
                    .from("customers", "customer_events")
                    .where(eq("customer_id", customerId))
                    .and(gt("time", UUIDs.startOf(startTime)))
                    .and(lt("time", UUIDs.endOf(endTime)));

    return session.execute(getCustomers).all().stream()
            .map(mapCustomerEvent())
            .collect(Collectors.toList());
}
Mapping API
@Table(keyspace = "customers", name = "customer_events")
public class CustomerEvent {

    @PartitionKey
    @Column(name = "customer_id")
    private String customerId;

    @ClusteringColumn
    private UUID time;

    @Column(name = "staff_id")
    private String staffId;

    @Column(name = "store_type")
    private String storeType;

    @Column(name = "event_type")
    private String eventType;

    private Map<String, String> tags;
    // ctr / getters etc
}
Mapping API

@Accessor
public interface CustomerEventDao {

    @Query("select * from customers.customer_events where customer_id = :customerId")
    Result<CustomerEvent> getCustomerEvents(@Param("customerId") String customerId);

    @Query("select * from customers.customer_events")
    Result<CustomerEvent> getAllCustomerEvents();

    @Query("select * from customers.customer_events where customer_id = :customerId "
            + "and time > minTimeuuid(:startTime) and time < maxTimeuuid(:endTime)")
    Result<CustomerEvent> getCustomerEventsForTime(@Param("customerId") String customerId,
                                                   @Param("startTime") long startTime,
                                                   @Param("endTime") long endTime);
}

public CustomerEventDao customerEventDao() {
    MappingManager mappingManager = new MappingManager(session);
    return mappingManager.createAccessor(CustomerEventDao.class);
}
Adding some type safety
public enum StoreType {
    // enum constants are not shown in the slide text
}

@Table(keyspace = "customers", name = "customer_events")
public class CustomerEvent {

    @PartitionKey
    @Column(name = "customer_id")
    private String customerId;

    @ClusteringColumn
    private UUID time;

    @Column(name = "staff_id")
    private String staffId;

    @Column(name = "store_type")
    @Enumerated(EnumType.STRING) // could be EnumType.ORDINAL
    private StoreType storeType;
    // other fields unchanged
}
User defined types
create TYPE store (name text, type text, postcode text) ;

CREATE TABLE customer_events_type(
customer_id text,
staff_id text,
time timeuuid,
store frozen<store>,
event_type text,
tags map<text, text>,
PRIMARY KEY ((customer_id), time));

Mapping user defined types
@UDT(keyspace = "customers", name = "store")
public class Store {
    private String name;
    private StoreType type;
    private String postcode;
    // getters etc
}

@Table(keyspace = "customers", name = "customer_events_type")
public class CustomerEventType {

    @PartitionKey
    @Column(name = "customer_id")
    private String customerId;

    @ClusteringColumn
    private UUID time;

    @Column(name = "staff_id")
    private String staffId;

    @Frozen
    private Store store;

    @Column(name = "event_type")
    private String eventType;

    private Map<String, String> tags;
}
@Query("select * from customers.customer_events_type")

Result<CustomerEventType> getAllCustomerEventsWithStoreType();
Query Tracing
Connected to cluster: xerxes
Simplex keyspace and schema created.
Host (queried): /
Host (tried): /
Trace id: 96ac9400-a3a5-11e2-96a9-4db56cdc5fe7
activity | timestamp | source | source_elapsed
Parsing statement | 12:17:16.736 | / | 28
Preparing statement | 12:17:16.736 | / | 199
Determining replicas for mutation | 12:17:16.736 | / | 348
Sending message to / | 12:17:16.736 | / | 788
Sending message to / | 12:17:16.736 | / | 805
Acquiring switchLock read lock | 12:17:16.736 | / | 828
Appending to commitlog | 12:17:16.736 | / | 848
Adding to songs memtable | 12:17:16.736 | / | 900
Message received from / | 12:17:16.737 | / | 34
Message received from / | 12:17:16.737 | / | 25
Acquiring switchLock read lock | 12:17:16.737 | / | 672
Acquiring switchLock read lock | 12:17:16.737 | / | 525
Other features
Light weight transactions
• Often referred to as compare and set (CAS)
INSERT INTO users (login, email, name, login_count)
VALUES ('jbellis', '', 'Jonathan Ellis', 1)
IF NOT EXISTS;
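Compare and set is easiest to picture on a single machine. The sketch below uses a `ConcurrentMap` as a local stand-in: `putIfAbsent` plays the role of an INSERT ... IF NOT EXISTS and `replace(key, expected, new)` the role of a conditional update. It says nothing about how Cassandra reaches agreement across replicas; `CompareAndSetSketch` and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CompareAndSetSketch {
    private final ConcurrentMap<String, String> users = new ConcurrentHashMap<>();

    // Returns true when the row was created, false when the login was already taken.
    boolean insertIfNotExists(String login, String name) {
        return users.putIfAbsent(login, name) == null;
    }

    // Applies the update only if the current value still matches the expected one.
    boolean updateIf(String login, String expectedName, String newName) {
        return users.replace(login, expectedName, newName);
    }

    String nameOf(String login) {
        return users.get(login);
    }

    public static void main(String[] args) {
        CompareAndSetSketch store = new CompareAndSetSketch();
        System.out.println("first insert applied:  " + store.insertIfNotExists("jbellis", "Jonathan Ellis"));
        System.out.println("second insert applied: " + store.insertIfNotExists("jbellis", "Someone Else"));
        System.out.println("stored name:           " + store.nameOf("jbellis"));
    }
}
```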
Batch Statements
BEGIN BATCH
  INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3b', 'second user');
  UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2';
  INSERT INTO users (userID, password) VALUES ('user3', 'ch@ngem3c');
  DELETE name FROM users WHERE userID = 'user2';
APPLY BATCH;

BATCH statement combines multiple INSERT, UPDATE, and DELETE statements into a single logical operation
• Atomic operation
- If any statement in the batch succeeds, all will
• No batch isolation
- Other “transactions” can read and write data being affected by a partially executed batch
Batch Statements with LWT
BEGIN BATCH
  UPDATE foo SET z = 1 WHERE x = 'a' AND y = 1;
  UPDATE foo SET z = 2 WHERE x = 'a' AND y = 2 IF t = 4;
APPLY BATCH;
Allows you to group multiple conditional updates in a batch as long as all those updates apply to the same partition

• Cassandra is a shared nothing masterless datastore
• Availability a.k.a up time is king
• Biggest hurdle is learning to model differently
• Modern drivers make it easy to work with
Thanks for listening
• Follow me on twitter @chbatey
• Cassandra + Fault tolerance posts a plenty:
• Cassandra resources:

Similar to Vienna Feb 2015: Cassandra: How it works and what it's good for!

Paris Day Cassandra: Use case
Paris Day Cassandra: Use caseParis Day Cassandra: Use case
Paris Day Cassandra: Use caseChristopher Batey
Slide presentation pycassa_upload
Slide presentation pycassa_uploadSlide presentation pycassa_upload
Slide presentation pycassa_uploadRajini Ramesh
Data Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkData Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkChristopher Batey
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into CassandraBrent Theisen
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valleyPatrick McFadin
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
Cassandra Summit 2014: Real Data Models of Silicon Valley
Cassandra Summit 2014: Real Data Models of Silicon ValleyCassandra Summit 2014: Real Data Models of Silicon Valley
Cassandra Summit 2014: Real Data Models of Silicon ValleyDataStax Academy
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015Patrick McFadin
pandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Pythonpandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for PythonWes McKinney
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraLuke Tillman
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Sparknickmbailey
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
Apache HAWQ Architecture
Apache HAWQ ArchitectureApache HAWQ Architecture
Apache HAWQ ArchitectureAlexey Grishchenko
Cassandra Tutorial
Cassandra TutorialCassandra Tutorial
Cassandra Tutorialmubarakss
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...DataStax Academy
Migrating To PostgreSQL
Migrating To PostgreSQLMigrating To PostgreSQL
Migrating To PostgreSQLGrant Fritchey
PostgreSQL - It's kind've a nifty database
PostgreSQL - It's kind've a nifty databasePostgreSQL - It's kind've a nifty database
PostgreSQL - It's kind've a nifty databaseBarry Jones

Similar to Vienna Feb 2015: Cassandra: How it works and what it's good for! (20)

Paris Day Cassandra: Use case
Paris Day Cassandra: Use caseParis Day Cassandra: Use case
Paris Day Cassandra: Use case
Slide presentation pycassa_upload
Slide presentation pycassa_uploadSlide presentation pycassa_upload
Slide presentation pycassa_upload
Data Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkData Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and Spark
Deep Dive into Cassandra
Vienna Feb 2015: Cassandra: How it works and what it's good for!

  • 1. @chbatey Christopher Batey
 Technical Evangelist for Apache Cassandra Cassandra: How it works and what it's good for!
  • 2. @chbatey Who am I? • Technical Evangelist for Apache Cassandra •Founder of Stubbed Cassandra •Help out Apache Cassandra users • DataStax •Builds enterprise ready version of Apache Cassandra • Previous: Cassandra backed apps at BSkyB
  • 3. @chbatey Overview • Distributed databases: What and why? • Cassandra use cases • Replication • Fault tolerance • Read and write path • Data modelling • Java Driver
  • 5. @chbatey It is a big world • Relational - Oracle, PostgreSQL • Graph databases - Neo4J, InfoGrid, Titan • Key value - DynamoDB • Document stores - MongoDB, Couchbase • Columnar aka wide row - Cassandra, HBase
  • 11. @chbatey Master/slave •Master serves all writes •Read from master and optionally slaves
  • 12. @chbatey Peer-to-Peer • No master • Read/write to any • Consistency?
  • 13. @chbatey Decisions decisions… CAP theorem Are these really that different?? Relational Database Mongo, Redis Highly Available Databases: Voldemort, Cassandra
  • 14. @chbatey PACELC • If Partition: Trade off Availability vs Consistency • Else: Trade off Latency vs Consistency • with-cap-and-yahoos-little.html
  • 17. @chbatey Common use cases •Ordered data such as time series -Event stores -Financial transactions -Sensor data e.g IoT
  • 18. @chbatey Common use cases •Ordered data such as time series -Event stores -Financial transactions -Sensor data e.g. IoT •Non functional requirements: -Linear scalability -High throughput durable writes -Multi datacenter including active-active -Analytics without ETL
  • 19. @chbatey Work loads • High write and/or read throughput • Many small queries • Distributed across the world (eventually consistent)
  • 21. @chbatey Cassandra • Distributed masterless database (Dynamo) • Column family data model (Google BigTable)
  • 22. @chbatey Datacenter and rack aware • Distributed masterless database (Dynamo) • Column family data model (Google BigTable) • Multi data centre replication built in from the start (diagram: Europe and USA datacenters)
  • 23. @chbatey Cassandra • Distributed masterless database (Dynamo) • Column family data model (Google BigTable) • Multi data centre replication built in from the start • Analytics with Apache Spark (diagram: Online and Analytics datacenters)
  • 25. @chbatey Dynamo 101 • The parts Cassandra took - Consistent hashing - Replication - Strategies for replication - Gossip - Hinted handoff - Anti-entropy repair • And the parts it left behind - Key/Value - Vector clocks
  • 26. @chbatey Picking the right nodes • You don’t want a full table scan on a 1000 node cluster! • Dynamo to the rescue: Consistent Hashing • Then the replication strategy takes over: - Network topology - Simple
  • 27. @chbatey Murmur3 Example • Data:
      jim     age: 36  car: ford  gender: M
      carol   age: 37  car: bmw   gender: F
      johnny  age: 12  gender: M
      suzy    age: 10  gender: F
    • Murmur3 Hash Values:
      Primary Key | Murmur3 hash value
      jim         | 350
      carol       | 998
      johnny      | 50
      suzy        | 600
    Real hash range: -9223372036854775808 to 9223372036854775807
  • 28. @chbatey Murmur3 Example Four node cluster:
      Node | Murmur3 start range | Murmur3 end range
      A    | 0                   | 249
      B    | 250                 | 499
      C    | 500                 | 749
      D    | 750                 | 999
  • 30. @chbatey Murmur3 Example Data is distributed as:
      Node | Start range | End range | Primary key | Hash value
      A    | 0           | 249       | johnny      | 50
      B    | 250         | 499       | jim         | 350
      C    | 500         | 749       | suzy        | 600
      D    | 750         | 999       | carol       | 998
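The lookup in the Murmur3 example can be sketched in a few lines of Java. This is only an illustration using the toy 0 to 999 hash range from the slides, not Cassandra's own code; the class name `ToyTokenRing` and its methods are hypothetical, and the hash values are taken from the table rather than computed with Murmur3.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of token ownership: each node owns the range that starts at its token.
public class ToyTokenRing {
    // Maps the first token of each range to the node that owns that range.
    private final TreeMap<Long, String> rangeStarts = new TreeMap<>();

    public void addNode(String node, long rangeStart) {
        rangeStarts.put(rangeStart, node);
    }

    // The owner is the node whose range start is the greatest one <= token.
    public String ownerOf(long token) {
        Map.Entry<Long, String> entry = rangeStarts.floorEntry(token);
        if (entry == null) {
            throw new IllegalArgumentException("token below the first range: " + token);
        }
        return entry.getValue();
    }

    // The four node cluster from the slides (ranges start at 0, 250, 500, 750).
    public static ToyTokenRing fourNodeExample() {
        ToyTokenRing ring = new ToyTokenRing();
        ring.addNode("A", 0);
        ring.addNode("B", 250);
        ring.addNode("C", 500);
        ring.addNode("D", 750);
        return ring;
    }

    public static void main(String[] args) {
        ToyTokenRing ring = fourNodeExample();
        // Hash values copied from the slide, not computed here.
        System.out.println("johnny (50)  -> " + ring.ownerOf(50));
        System.out.println("jim (350)    -> " + ring.ownerOf(350));
        System.out.println("suzy (600)   -> " + ring.ownerOf(600));
        System.out.println("carol (998)  -> " + ring.ownerOf(998));
    }
}
```

A sorted map keeps the lookup logarithmic in the number of ranges, which is why no node has to scan the cluster to find where a key lives.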
  • 32. @chbatey Replication strategy • Simple - Give it to the next node in the ring - Don’t use this in production • NetworkTopology - Every Cassandra node knows its DC and Rack - Replicas won’t be put on the same rack unless Replication Factor > # of racks - Unfortunately Cassandra can’t create servers and racks on the fly to fix this :(
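The "give it to the next node in the ring" rule of the Simple strategy could be sketched as below. This is a simplified illustration that ignores datacenters and racks (which is what NetworkTopology adds); the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy version of simple replica placement: walk the ring clockwise from the primary.
public class SimpleReplicaPlacement {

    // ring: nodes in token order; primaryIndex: position of the node owning the token.
    public static List<String> replicasFor(List<String> ring, int primaryIndex, int replicationFactor) {
        // A cluster cannot hold more distinct replicas than it has nodes.
        int copies = Math.min(replicationFactor, ring.size());
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            // Wrap around the end of the ring with the modulo.
            replicas.add(ring.get((primaryIndex + i) % ring.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("A", "B", "C", "D");
        // carol hashes to node D (index 3) in the earlier example; with RF = 3
        // the copies wrap around to the start of the ring.
        System.out.println(replicasFor(ring, 3, 3));
    }
}
```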
  • 36. @chbatey Tunable Consistency •Data is replicated N times •For every query you execute, you specify a consistency level -ALL -QUORUM -LOCAL_QUORUM -ONE • Christos Kalantzis Eventual Consistency != Hopeful Consistency: http://
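The arithmetic behind these levels is small enough to sketch. A quorum is a majority of the replicas, and a read is guaranteed to touch at least one replica that acknowledged the latest write when the number of replicas written plus the number read exceeds the replication factor. The class below is a hypothetical helper for that arithmetic only, not part of any driver.

```java
// Toy helper for reasoning about consistency levels; not a driver API.
public class ConsistencyMath {

    // Majority of the replicas: floor(RF / 2) + 1.
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must overlap every write set in at least one replica.
    public static boolean readOverlapsWrite(int replicationFactor, int writeReplicas, int readReplicas) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("QUORUM for RF=3 needs " + q + " replicas");
        System.out.println("QUORUM write + QUORUM read overlap: " + readOverlapsWrite(rf, q, q));
        System.out.println("ONE write + ONE read overlap: " + readOverlapsWrite(rf, 1, 1));
    }
}
```

With a replication factor of 3, QUORUM writes and reads tolerate one replica being down while still overlapping, which is a common reason to pick that combination.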
  • 38. @chbatey Load balancing •Data centre aware policy •Token aware policy •Latency aware policy •Whitelist policy APP APP Async Replication DC1 DC2
  • 42. @chbatey But what happens when they come back? • Hinted handoff to the rescue • Coordinators keep writes for downed nodes for a configurable amount of time, default 3 hours • Longer than that run a repair
  • 43. @chbatey Anti entropy repair • Not exciting but mandatory :) • New in 2.1 - incremental repair <— awesome
  • 44. @chbatey Don’t forget to be social • Each node talks to a few of the other nodes and shares information Did you hear node 1 was down???
  • 45. @chbatey Scaling shouldn’t be hard • Throw more nodes at a cluster • Bootstrapping + joining the ring • For large data sets this can take some time
  • 48. @chbatey Your model should look like your queries
  • 49. @chbatey CQL •Cassandra Query Language -SQL like query language •Keyspace – analogous to a schema - The keyspace determines the RF (replication factor) •Table – looks like a SQL Table CREATE TABLE scores ( name text, score int, date timestamp, PRIMARY KEY (name, score) ); INSERT INTO scores (name, score, date) VALUES ('bob', 42, '2012-06-24'); INSERT INTO scores (name, score, date) VALUES ('bob', 47, '2012-06-25'); SELECT date, score FROM scores WHERE name='bob' AND score >= 40;
  • 51. @chbatey UUID • Universal Unique ID - 128 bit number represented in character form e.g. 99051fe9-6a9c-46c2-b949-38ef78858dd0 • Easily generated on the client - Version 1 has a timestamp component (TIMEUUID) - Version 4 has no timestamp component
  • 52. @chbatey TIMEUUID • TIMEUUID data type supports Version 1 UUIDs • Generated using time (60 bits), a clock sequence number (14 bits), and MAC address (48 bits) – CQL function ‘now()’ generates a new TIMEUUID • Time can be extracted from TIMEUUID – CQL function dateOf() extracts the timestamp as a date • TIMEUUID values in clustering columns or in column names are ordered based on time – DESC order on TIMEUUID lists most recent data first © 2014 DataStax, All Rights Reserved.
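The difference between the two UUID versions can be seen with nothing but `java.util.UUID`. The sketch below is a rough client-side analogue of what `dateOf()` does, not the CQL implementation; the class name `UuidVersions` is hypothetical. It uses the version 4 example from the UUID slide and the version 1 trace id that appears in the query tracing slide.

```java
import java.time.Instant;
import java.util.UUID;

// Client-side look at UUID versions and the time embedded in a version 1 UUID.
public class UuidVersions {
    // 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
    private static final long UUID_EPOCH_OFFSET = 0x01b21dd213814000L;

    // Extracts the embedded time; only valid for version 1 (time based) UUIDs.
    public static Instant timeOf(UUID timeUuid) {
        // UUID.timestamp() throws UnsupportedOperationException for other versions.
        long hundredNanosSinceUnixEpoch = timeUuid.timestamp() - UUID_EPOCH_OFFSET;
        return Instant.ofEpochMilli(hundredNanosSinceUnixEpoch / 10_000);
    }

    public static void main(String[] args) {
        UUID fromSlide = UUID.fromString("99051fe9-6a9c-46c2-b949-38ef78858dd0");
        UUID traceId = UUID.fromString("96ac9400-a3a5-11e2-96a9-4db56cdc5fe7");

        System.out.println("slide example version: " + fromSlide.version());
        System.out.println("trace id version:      " + traceId.version());
        System.out.println("trace id time:         " + timeOf(traceId));
        System.out.println("randomUUID version:    " + UUID.randomUUID().version());
    }
}
```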
  • 53. @chbatey Collections CREATE TABLE videos ( videoid uuid, userid uuid, name varchar, description varchar, location text, location_type int, preview_thumbnails map<text,text>, tags set<varchar>, added_date timestamp, PRIMARY KEY (videoid) );
  • 54. @chbatey Data Model - User Defined Types • Complex data in one place • No multi-gets (multi-partitions) • Nesting! CREATE TYPE address ( street text, city text, zip_code int, country text, cross_streets set<text> );
  • 55. Data Model - Updated • We can embed video_metadata in videos CREATE TYPE video_metadata ( height int, width int, video_bit_rate set<text>, encoding text ); CREATE TABLE videos ( videoid uuid, userid uuid, name varchar, description varchar, location text, location_type int, preview_thumbnails map<text,text>, tags set<varchar>, metadata set <frozen<video_metadata>>, added_date timestamp, PRIMARY KEY (videoid) );
  • 56. Data Model - Storing JSON { "productId": 2, "name": "Kitchen Table", "price": 249.99, "description" : "Rectangular table with oak finish", "dimensions": { "units": "inches", "length": 50.0, "width": 66.0, "height": 32 }, "categories": { { "category" : "Home Furnishings" { "catalogPage": 45, "url": "/home/furnishings" }, { "category" : "Kitchen Furnishings" { "catalogPage": 108, "url": "/kitchen/furnishings" } } } CREATE TYPE dimensions ( units text, length float, width float, height float ); CREATE TYPE category ( catalogPage int, url text ); CREATE TABLE product ( productId int, name text, price float, description text, dimensions frozen <dimensions>, categories map <text, frozen <category>>, PRIMARY KEY (productId) );
  • 57. @chbatey Tuple type • Type to represent a group • Up to 256 different elements CREATE TABLE tuple_table ( id int PRIMARY KEY, three_tuple frozen <tuple<int, text, float>>, four_tuple frozen <tuple<int, text, float, inet>>, five_tuple frozen <tuple<int, text, float, inet, ascii>> );
  • 58. @chbatey Counters • Old: have been around since 0.8 • Commit log replay can change counters • Repair can change a counter
  • 59. @chbatey Time-to-Live (TTL) TTL a row: INSERT INTO users (id, first, last) VALUES ('abc123', 'catherine', 'cachart') USING TTL 3600; // Expires data in one hour
 TTL a column: UPDATE users USING TTL 30 SET last = 'miller' WHERE id = 'abc123'; – TTL in seconds – Can also set default TTL at a table level – Expired columns/values automatically deleted – With no TTL specified, columns/values never expire – TTL is useful for automatic deletion – Re-inserting the same row before it expires will overwrite TTL
  • 62. @chbatey An example: Customer event store • Customer event - customer_id e.g ChrisBatey - event_type e.g login, logout, add_to_basket, remove_from_basket, buy_item • Staff - name e.g Charlie - favourite_colour e.g red • Store - name - type e.g Website, PhoneApp, Phone, Retail
  • 63. @chbatey Requirements • Get all events • Get all events for a particular customer • As above for a time slice
  • 64. @chbatey Modelling in a relational database CREATE TABLE customer_events( customer_id text, staff_name text, time timeuuid, event_type text, store_name text, PRIMARY KEY (customer_id)); CREATE TABLE store( name text, location text, store_type text, PRIMARY KEY (name)); CREATE TABLE staff( name text, favourite_colour text, job_title text, PRIMARY KEY (name));
  • 65. Modelling in Cassandra CREATE TABLE customer_events( customer_id text, staff_id text, time timeuuid, store_type text, event_type text, tags map<text, text>, PRIMARY KEY ((customer_id), time)); Partition key: customer_id. Clustering column(s): time.
  • 66. How it is stored on disk
      customer_id | time                | event_type | store_type | tags
      charles     | 2014-11-18 16:52:04 | basket_add | online     | {'item': 'coffee'}
      charles     | 2014-11-18 16:53:00 | basket_add | online     | {'item': 'wine'}
      charles     | 2014-11-18 16:53:09 | logout     | online     | {}
      chbatey     | 2014-11-18 16:52:21 | login      | online     | {}
      chbatey     | 2014-11-18 16:53:21 | basket_add | online     | {'item': 'coffee'}
      chbatey     | 2014-11-18 16:54:00 | basket_add | online     | {'item': 'cheese'}
    One wide row per partition, with the cells of each event stored in time order:
      charles | event_type: basket_add, staff_id: n/a, store_type: online, tags:item: coffee
              | event_type: basket_add, staff_id: n/a, store_type: online, tags:item: wine
              | event_type: logout, staff_id: n/a, store_type: online
      chbatey | event_type: login, staff_id: n/a, store_type: online
              | event_type: basket_add, staff_id: n/a, store_type: online, tags:item: coffee
              | event_type: basket_add, staff_id: n/a, store_type: online, tags:item: cheese
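One way to picture this layout is as a map of sorted maps: the partition key selects a partition, and the clustering column keeps the rows inside it in order. The sketch below is an in-memory analogy only, assuming plain timestamps instead of TIMEUUIDs; `ToyPartitionStore` and its methods are hypothetical and say nothing about Cassandra's actual storage engine.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory analogy for PRIMARY KEY ((customer_id), time).
public class ToyPartitionStore {
    // partition key (customer_id) -> rows sorted by clustering column (time)
    private final Map<String, TreeMap<Instant, String>> partitions = new HashMap<>();

    public void insert(String customerId, Instant time, String eventType) {
        partitions.computeIfAbsent(customerId, k -> new TreeMap<>()).put(time, eventType);
    }

    // "Get all events for a particular customer": one partition, read in time order.
    public List<String> eventsFor(String customerId) {
        return new ArrayList<>(partitions.getOrDefault(customerId, new TreeMap<>()).values());
    }

    // "As above for a time slice": a contiguous slice of the sorted partition.
    // Both bounds are exclusive, mirroring time > start AND time < end.
    public List<String> eventsBetween(String customerId, Instant start, Instant end) {
        TreeMap<Instant, String> rows = partitions.getOrDefault(customerId, new TreeMap<>());
        return new ArrayList<>(rows.subMap(start, false, end, false).values());
    }

    public static void main(String[] args) {
        ToyPartitionStore store = new ToyPartitionStore();
        store.insert("chbatey", Instant.parse("2014-11-18T16:53:21Z"), "basket_add");
        store.insert("chbatey", Instant.parse("2014-11-18T16:52:21Z"), "login");
        store.insert("charles", Instant.parse("2014-11-18T16:53:09Z"), "logout");
        System.out.println(store.eventsFor("chbatey"));
    }
}
```

Both per-customer requirements become a lookup plus an ordered scan of one partition, which is the sense in which the model looks like the queries.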
  • 68. @chbatey Languages • DataStax (open source) - C#, Java, C++, Python, Node, Ruby - Very similar programming API • Other open source - Go - Clojure - Erlang - Haskell - Many more Java/Python drivers - Perl
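  • 70. @chbatey Get all the events
 public List<CustomerEvent> getAllCustomerEvents() {
     return session.execute("select * from customers.customer_events")
             .all().stream()
             .map(mapCustomerEvent())
             .collect(Collectors.toList());
 }

 private Function<Row, CustomerEvent> mapCustomerEvent() {
     // Constructor argument order is assumed to follow the CustomerEvent fields.
     return row -> new CustomerEvent(
             row.getString("customer_id"),
             row.getUUID("time"),
             row.getString("staff_id"),
             row.getString("store_type"),
             row.getString("event_type"),
             row.getMap("tags", String.class, String.class));
 }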
  • 71. @chbatey All events for a particular customer
 private PreparedStatement getEventsForCustomer;

 public void prepareStatements() {
     getEventsForCustomer = session.prepare("select * from customers.customer_events where customer_id = ?");
 }

 public List<CustomerEvent> getCustomerEvents(String customerId) {
     BoundStatement boundStatement = getEventsForCustomer.bind(customerId);
     return session.execute(boundStatement)
             .all().stream()
             // Mapping step assumed to reuse mapCustomerEvent() from the previous slide.
             .map(mapCustomerEvent())
             .collect(Collectors.toList());
 }
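  • 72. @chbatey Customer events for a time slice
 public List<CustomerEvent> getCustomerEventsForTime(String customerId, long startTime, long endTime) {
     // Assumes static imports of QueryBuilder.eq, QueryBuilder.gt and QueryBuilder.lt.
     Select.Where getCustomers = QueryBuilder.select()
             .all()
             .from("customers", "customer_events")
             .where(eq("customer_id", customerId))
             .and(gt("time", UUIDs.startOf(startTime)))
             .and(lt("time", UUIDs.endOf(endTime)));
     return session.execute(getCustomers).all().stream()
             // Mapping step assumed to reuse mapCustomerEvent() from slide 70.
             .map(mapCustomerEvent())
             .collect(Collectors.toList());
 }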
  • 73. @chbatey Mapping API @Table(keyspace = "customers", name = "customer_events")
 public class CustomerEvent {
 @Column(name = "customer_id")
 private String customerId;
 private UUID time;
 @Column(name = "staff_id")
 private String staffId;
 @Column(name = "store_type")
 private String storeType;
 @Column(name = "event_type")
 private String eventType;
 private Map<String, String> tags;
 // constructor / getters etc
 }

  • 74. @chbatey Mapping API @Accessor
 public interface CustomerEventDao {
 @Query("select * from customers.customer_events where customer_id = :customerId")
 Result<CustomerEvent> getCustomerEvents(String customerId);
 @Query("select * from customers.customer_events")
 Result<CustomerEvent> getAllCustomerEvents();
 @Query("select * from customers.customer_events where customer_id = :customerId and time > minTimeuuid(:startTime) and time < maxTimeuuid(:endTime)")
 Result<CustomerEvent> getCustomerEventsForTime(String customerId, long startTime, long endTime);
 }
 public CustomerEventDao customerEventDao() {
 MappingManager mappingManager = new MappingManager(session);
 return mappingManager.createAccessor(CustomerEventDao.class);
 }
  • 75. @chbatey Adding some type safety public enum StoreType {
 }
 @Table(keyspace = "customers", name = "customer_events")
 public class CustomerEvent {
 @Column(name = "customer_id")
 private String customerId;
 private UUID time;
 @Column(name = "staff_id")
 private String staffId;
 @Column(name = "store_type")
 @Enumerated(EnumType.STRING) // could be EnumType.ORDINAL
 private StoreType storeType;

  • 76. @chbatey User defined types create TYPE store (name text, type text, postcode text) ;
 CREATE TABLE customer_events_type( customer_id text, staff_id text, time timeuuid, store frozen<store>, event_type text, tags map<text, text>, PRIMARY KEY ((customer_id), time));

  • 77. @chbatey Mapping user defined types @UDT(keyspace = "customers", name = "store")
 public class Store {
 private String name;
 private StoreType type;
 private String postcode;
 // getters etc
 }
 @Table(keyspace = "customers", name = "customer_events_type")
 public class CustomerEventType {
 @Column(name = "customer_id")
 private String customerId;
 private UUID time;
 @Column(name = "staff_id")
 private String staffId;
 private Store store;
 @Column(name = "event_type")
 private String eventType;
 private Map<String, String> tags;

  • 78. @chbatey Mapping user defined types @UDT(keyspace = "customers", name = "store")
 public class Store {
 private String name;
 private StoreType type;
 private String postcode;
 // getters etc
 }
 @Table(keyspace = "customers", name = "customer_events_type")
 public class CustomerEventType {
 @Column(name = "customer_id")
 private String customerId;
 private UUID time;
 @Column(name = "staff_id")
 private String staffId;
 private Store store;
 @Column(name = "event_type")
 private String eventType;
 private Map<String, String> tags;
 @Query("select * from customers.customer_events_type")
 Result<CustomerEventType> getAllCustomerEventsWithStoreType();
  • 79. @chbatey ©2014 DataStax. Do not distribute without consent. Query Tracing
    Connected to cluster: xerxes
    Simplex keyspace and schema created.
    Host (queried): /
    Host (tried): /
    Trace id: 96ac9400-a3a5-11e2-96a9-4db56cdc5fe7
    activity                           | timestamp    | source | source_elapsed
    -----------------------------------+--------------+--------+---------------
    Parsing statement                  | 12:17:16.736 | /      | 28
    Preparing statement                | 12:17:16.736 | /      | 199
    Determining replicas for mutation  | 12:17:16.736 | /      | 348
    Sending message to /               | 12:17:16.736 | /      | 788
    Sending message to /               | 12:17:16.736 | /      | 805
    Acquiring switchLock read lock     | 12:17:16.736 | /      | 828
    Appending to commitlog             | 12:17:16.736 | /      | 848
    Adding to songs memtable           | 12:17:16.736 | /      | 900
    Message received from /            | 12:17:16.737 | /      | 34
    Message received from /            | 12:17:16.737 | /      | 25
    Acquiring switchLock read lock     | 12:17:16.737 | /      | 672
    Acquiring switchLock read lock     | 12:17:16.737 | /      | 525
  • 81. @chbatey Light weight transactions • Often referred to as compare and set (CAS) INSERT INTO USERS (login, email, name, login_count) values ('jbellis', '', 'Jonathan Ellis', 1) IF NOT EXISTS
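The compare-and-set idea behind `IF NOT EXISTS` has a familiar single-JVM analogue in `ConcurrentMap.putIfAbsent`. The sketch below is only that analogy: it shows the "first writer wins, later writers are told they lost" contract and does not model how Cassandra coordinates the decision across replicas. The class name `ToyCompareAndSet` and the email values are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Single-process analogy for INSERT ... IF NOT EXISTS.
public class ToyCompareAndSet {
    private final ConcurrentMap<String, String> emailsByLogin = new ConcurrentHashMap<>();

    // Returns true only for the first caller to claim the login.
    public boolean insertIfNotExists(String login, String email) {
        // putIfAbsent returns null when there was no previous value for the key.
        return emailsByLogin.putIfAbsent(login, email) == null;
    }

    public String emailFor(String login) {
        return emailsByLogin.get(login);
    }

    public static void main(String[] args) {
        ToyCompareAndSet users = new ToyCompareAndSet();
        System.out.println(users.insertIfNotExists("jbellis", "first@example.com"));
        System.out.println(users.insertIfNotExists("jbellis", "second@example.com"));
        System.out.println(users.emailFor("jbellis"));
    }
}
```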
  • 82. @chbatey Batch Statements BEGIN BATCH INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3b', 'second user'); UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2'; INSERT INTO users (userID, password) VALUES ('user3', 'ch@ngem3c'); DELETE name FROM users WHERE userID = 'user2'; APPLY BATCH;
 • BATCH statement combines multiple INSERT, UPDATE, and DELETE statements into a single logical operation
 • Atomic operation: if any statement in the batch succeeds, all will
 • No batch isolation: other “transactions” can read and write data being affected by a partially executed batch
  • 83. @chbatey Batch Statements with LWT BEGIN BATCH UPDATE foo SET z = 1 WHERE x = 'a' AND y = 1; UPDATE foo SET z = 2 WHERE x = 'a' AND y = 2 IF t = 4; APPLY BATCH; Allows you to group multiple conditional updates in a batch as long as all those updates apply to the same partition
  • 84. @chbatey Summary • Cassandra is a shared nothing masterless datastore • Availability a.k.a up time is king • Biggest hurdle is learning to model differently • Modern drivers make it easy to work with
  • 85. @chbatey Thanks for listening • Follow me on twitter @chbatey • Cassandra + Fault tolerance posts a plenty: • • Cassandra resources: