Real Data Models of Silicon Valley
Patrick McFadin
Chief Evangelist for Apache Cassandra
@PatrickMcFadin
It's been an epic year
I've had a ton of fun! 
• Traveling the world talking to people like you!
Stockholm, Warsaw, Melbourne, New York, Vancouver, Dublin
What's new? 
• 2.1 is out! 
• Amazing changes for performance and stability
Where are we going? 
• 3.0 is next. Just hold on…
KillrVideo.com 
• 2012 Summit
• Complete example for data modeling
www.killrvideos.com
[Page mock-up: Video Title, Description, Rating, Tags (Foo, Bar), Comments, Recommended, Upload New!, Username, Ads by Google]
*Cat drawing by goodrob13 on Flickr
It’s alive!!! 
• Hosted on Azure 
• Code on Github
Data Model - Revisited 
• Add in some 2.1 data models 
• Replace (or remove) some app code 
• Become a part of Cassandra OSS download
User Defined Types 
• Complex data in one place 
• No multi-gets (multi-partitions) 
• Nesting!
CREATE TYPE address (
street text,
city text,
zip_code int,
country text,
cross_streets set<text>
);
Before 
CREATE TABLE videos ( 
videoid uuid, 
userid uuid, 
name varchar, 
description varchar, 
location text, 
location_type int, 
preview_thumbnails map<text,text>, 
tags set<varchar>, 
added_date timestamp, 
PRIMARY KEY (videoid) 
); 
CREATE TABLE video_metadata ( 
video_id uuid PRIMARY KEY, 
height int, 
width int, 
video_bit_rate set<text>, 
encoding text 
); 
-- ? is a bind marker for the video's uuid
SELECT *
FROM videos
WHERE videoid = ?;
SELECT *
FROM video_metadata
WHERE video_id = ?;
Title: Introduction to Apache Cassandra
Description: A one hour talk on everything you need to know about a totally amazing database.
Playback rate: 480 720
In-application join
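The two lookups above amount to a join done by the application. A minimal Python sketch of that pattern, assuming in-memory dictionaries stand in for the two tables; the names VIDEOS, VIDEO_METADATA, fetch_video_page and the placeholder id are hypothetical, and the values are the ones shown on the slide:

```python
import uuid

# Placeholder id for illustration only.
VIDEO_ID = uuid.UUID("00000000-0000-0000-0000-000000000002")

# Stand-ins for the videos and video_metadata tables, keyed by video id.
VIDEOS = {
    VIDEO_ID: {
        "name": "Introduction to Apache Cassandra",
        "description": "A one hour talk on everything you need to know "
                       "about a totally amazing database.",
    }
}
VIDEO_METADATA = {
    VIDEO_ID: {"video_bit_rate": {"480", "720"}},
}


def fetch_video_page(video_id):
    """Join the two lookups in application code (two round trips)."""
    video = VIDEOS.get(video_id)                    # first query
    if video is None:
        return None
    metadata = VIDEO_METADATA.get(video_id, {})     # second query
    page = dict(video)
    page["playback_rates"] = sorted(metadata.get("video_bit_rate", set()))
    return page


page = fetch_video_page(VIDEO_ID)
print(page["name"], page["playback_rates"])
```

Every page view pays for both reads, and the merge logic lives in the application rather than in the data model.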
After 
• Now video_metadata is embedded in videos
CREATE TYPE video_metadata ( 
height int, 
width int, 
video_bit_rate set<text>, 
encoding text 
); 
CREATE TABLE videos ( 
videoid uuid, 
userid uuid, 
name varchar, 
description varchar, 
location text, 
location_type int, 
preview_thumbnails map<text,text>, 
tags set<varchar>, 
metadata set <frozen<video_metadata>>, 
added_date timestamp, 
PRIMARY KEY (videoid) 
);
Wait! Frozen?? 
• Staying out of technical debt
• 3.0 UDTs will not have to be frozen
• Applicable to User Defined Types and Tuples (wait for
Do you want to build a schema? 
Do you want to store some JSON?
Let’s store some JSON 
{
  "productId": 2,
  "name": "Kitchen Table",
  "price": 249.99,
  "description": "Rectangular table with oak finish",
  "dimensions": {
    "units": "inches",
    "length": 50.0,
    "width": 66.0,
    "height": 32
  },
  "categories": {
    "Home Furnishings": {
      "catalogPage": 45,
      "url": "/home/furnishings"
    },
    "Kitchen Furnishings": {
      "catalogPage": 108,
      "url": "/kitchen/furnishings"
    }
  }
}
CREATE TYPE dimensions ( 
units text, 
length float, 
width float, 
height float 
);
CREATE TYPE category ( 
catalogPage int, 
url text 
);
CREATE TABLE product ( 
productId int, 
name text, 
price float, 
description text, 
dimensions frozen <dimensions>, 
categories map <text, frozen <category>>, 
PRIMARY KEY (productId) 
);
Let’s store some JSON 
INSERT INTO product (productId, name, price, description, dimensions, categories)
VALUES (2, 'Kitchen Table', 249.99, 'Rectangular table with oak finish',
  {
    units: 'inches',
    length: 50.0,
    width: 66.0,
    height: 32
  },
  {
    'Home Furnishings': {
      catalogPage: 45,
      url: '/home/furnishings'
    },
    'Kitchen Furnishings': {
      catalogPage: 108,
      url: '/kitchen/furnishings'
    }
  }
);
dimensions frozen <dimensions>
categories map <text, frozen <category>>
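The step from the JSON document to the literals in the INSERT is mostly mechanical: UDT field names are written unquoted, while map keys are ordinary quoted text. A rough Python sketch of that mapping, for illustration only; the helper names scalar, udt_literal and map_of_udt_literal are hypothetical, and a real application would normally pass values through a driver's prepared statements instead of building strings:

```python
import json

DOC = json.loads("""
{
  "dimensions": {"units": "inches", "length": 50.0, "width": 66.0, "height": 32},
  "categories": {
    "Home Furnishings": {"catalogPage": 45, "url": "/home/furnishings"},
    "Kitchen Furnishings": {"catalogPage": 108, "url": "/kitchen/furnishings"}
  }
}
""")


def scalar(value):
    """Render a Python str or number as a CQL constant."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"   # CQL doubles single quotes
    return repr(value)


def udt_literal(fields):
    """Render a dict as a UDT literal: {field: value, ...} with unquoted names."""
    return "{" + ", ".join(f"{name}: {scalar(v)}" for name, v in fields.items()) + "}"


def map_of_udt_literal(entries):
    """Render a dict of dicts as a map literal whose values are UDT literals."""
    return "{" + ", ".join(
        f"{scalar(key)}: {udt_literal(value)}" for key, value in entries.items()
    ) + "}"


dimensions_cql = udt_literal(DOC["dimensions"])
categories_cql = map_of_udt_literal(DOC["categories"])
print(dimensions_cql)
print(categories_cql)
```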
Retrieving fields
Counters pt Deux 
• Since 0.8
• Commit log replay would change counters 
• Repair could change counters 
• Performance was inconsistent. Lots of GC
The good 
• Stable under load 
• No commit log replay issues 
• No repair weirdness
The bad 
• Still can’t delete/reset counters 
• Still needs to do a read before write.
Usage 
Wait for it… 
It’s the same! Carry on…
Static Fields 
• New as of 2.0.6 
• VERY specific, but useful 
• Thrift people will like this 
CREATE TABLE t ( 
k text, 
s text STATIC, 
i int, 
PRIMARY KEY (k, i) 
);
Why? 
CREATE TABLE weather ( 
id int, 
time timestamp, 
weatherstation_name text, 
temperature float, 
PRIMARY KEY (id, time) 
); 
ID = 1 (Partition Key / Storage Row Key), weatherstation_name stored in every row:
Partition Row 1: 2014-09-08 12:00:00 : name = SFO | 2014-09-08 12:00:00 : temp = 63.4
Partition Row 2: 2014-09-08 12:01:00 : name = SFO | 2014-09-08 12:01:00 : temp = 63.9
Partition Row 3: 2014-09-08 12:02:00 : name = SFO | 2014-09-08 12:02:00 : temp = 64.0
ID = 1 (Partition Key / Storage Row Key), weatherstation_name stored once as a static:
name = SFO
Partition Row 1: 2014-09-08 12:00:00 : temp = 63.4
Partition Row 2: 2014-09-08 12:01:00 : temp = 63.9
Partition Row 3: 2014-09-08 12:02:00 : temp = 64.0
CREATE TABLE weather ( 
id int, 
time timestamp, 
weatherstation_name text static, 
temperature float, 
PRIMARY KEY (id, time) 
);
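One way to picture what static changes is a toy in-memory model of a single weather partition. This is only a sketch of the behavior, not of Cassandra's storage engine; the WeatherPartition class and its methods are hypothetical names:

```python
from dataclasses import dataclass, field


@dataclass
class WeatherPartition:
    """Toy model of one partition of the weather table (id is the partition key)."""
    station_id: int
    weatherstation_name: str = None                    # static: one value per partition
    temperatures: dict = field(default_factory=dict)   # time (clustering key) -> temperature

    def insert(self, time, temperature, weatherstation_name=None):
        if weatherstation_name is not None:
            # Writing the static column changes it for every row in the partition.
            self.weatherstation_name = weatherstation_name
        self.temperatures[time] = temperature

    def rows(self):
        """Rows as a SELECT * presents them: the static value repeats on each row."""
        return [
            (self.station_id, time, self.weatherstation_name, temp)
            for time, temp in sorted(self.temperatures.items())
        ]


partition = WeatherPartition(station_id=1)
partition.insert("2014-09-08 12:00:00", 63.4, weatherstation_name="SFO")
partition.insert("2014-09-08 12:01:00", 63.9)
partition.insert("2014-09-08 12:02:00", 64.0)
for row in partition.rows():
    print(row)
```

The name is held once per partition, yet every row read back still shows it.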
Usage 
• Put a static at the end of the declaration
• Can’t be a part of the PRIMARY KEY:
CREATE TABLE video_event ( 
videoid uuid, 
userid uuid, 
preview_image_location text static, 
event varchar, 
event_timestamp timeuuid, 
video_timestamp bigint, 
PRIMARY KEY ((videoid,userid),event_timestamp,event) 
) WITH CLUSTERING ORDER BY (event_timestamp DESC,event ASC);
Tuples 
CREATE TABLE tuple_table ( 
id int PRIMARY KEY, 
three_tuple frozen <tuple<int, text, float>>, 
four_tuple frozen <tuple<int, text, float, inet>>, 
five_tuple frozen <tuple<int, text, float, inet, ascii>> 
); 
• A type that represents a group 
• Up to 256 different elements
Example Usage 
• Track a drone’s position 
• x, y, z in a 3D Cartesian 
CREATE TABLE drone_position ( 
droneId int, 
time timestamp, 
position frozen <tuple<float, float, float>>, 
PRIMARY KEY (droneId, time) 
);
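Because the position column is a frozen tuple, the whole value is written at once and a single element cannot be updated in place. A small Python sketch of that usage pattern, assuming a dictionary stands in for the drone_position table; Position, record_position and the sample coordinates are hypothetical:

```python
from typing import NamedTuple


class Position(NamedTuple):
    """x, y, z coordinates, mirroring tuple<float, float, float>."""
    x: float
    y: float
    z: float


# Stand-in for drone_position: (droneId, time) -> position
positions = {}


def record_position(drone_id, time, position):
    """Write the whole tuple for one (droneId, time) row."""
    positions[(drone_id, time)] = position


key = (7, "2014-09-10 09:00:00")          # illustrative values
record_position(*key, Position(1.0, 2.0, 30.0))

# To change only z, build a new tuple and write the whole value again.
corrected = positions[key]._replace(z=35.0)
record_position(*key, corrected)
print(positions[key])
```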
What about partition size? 
• A CQL partition is a logical projection of a storage row 
• Storage rows can have up to 2 billion cells 
• Each cell can hold up to 2 GB of data
How much is too much? 
• How many cells before performance degrades? 
• How many bytes per partition before it’s unmanageable?
• What is “practical”?
Old answer 
• 2011: Pre-Cassandra 1.2 (actually tested on 0.8)
• Aaron Morton, Cassandra MVP and Founder of The Last Pickle
Conclusion 
• Keep partition (storage row) length < 10k cells 
• Total size in bytes below 64 MB (Multi-pass compaction)
• Multiple hits to 64 KB page size will start to hurt
TL;DR - It’s a performance tunable
The tests revisited 
• Attempted to reproduce the same tests using CQL 
• Cassandra 2.1, 2.0 and 1.2 
• Tested partition sizes:
1. 100
2. 2114
3. 5,000
4. 10,000
5. 100,000
6. 1,000,000
7. 10,000,000
8. 100,000,000
9. 1,000,000,000
Results 
[Chart: mSec by cells per partition]
The new answer 
• Hundreds of thousands of cells is not a problem
• Hundreds of MB per partition is best operationally
• The issue to manage is operations
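To relate these guidelines to a concrete table, a back-of-the-envelope estimate helps. The sketch below uses the weather table with a static weatherstation_name and assumes one reading per minute, which is an assumption and not a figure from the talk; the cell count is a rough approximation that ignores storage overhead, and 100,000 is used as a stand-in for "hundreds of thousands":

```python
ROWS_PER_DAY = 24 * 60  # assumption: one reading per minute


def cells_per_partition(rows, regular_columns, static_columns=0):
    """Approximate cell count: one cell per regular column per row, plus statics."""
    return rows * regular_columns + static_columns


def days_until(cell_budget, regular_columns, static_columns=0):
    """Whole days of readings that fit in one partition under a cell budget."""
    cells_per_day = ROWS_PER_DAY * regular_columns
    return (cell_budget - static_columns) // cells_per_day


# weather table: temperature is the only regular column, the name is static.
one_day = cells_per_partition(ROWS_PER_DAY, regular_columns=1, static_columns=1)
old_limit_days = days_until(10_000, regular_columns=1, static_columns=1)
new_limit_days = days_until(100_000, regular_columns=1, static_columns=1)
print(one_day, old_limit_days, new_limit_days)
```

Under these assumptions a single station's partition passes the old 10k-cell guideline in under a week, which is why the partition key is usually extended with a time bucket once the readings accumulate.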
Thank You! 
Follow me on twitter for more 
@PatrickMcFadin
Cassandra Summit 2014
September 10 - 11 | #CassandraSummit

"ML in Production",Oleksandr Bagan
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 

Cassandra Summit 2014: Real Data Models of Silicon Valley

  • 1. Real Data Models of Silicon Valley Patrick McFadin Chief Evangelist for Apache Cassandra @PatrickMcFadin
  • 2. It's been an epic year
  • 3. I've had a ton of fun! • Traveling the world talking to people like you! Stockholm Warsaw Melbourne New York Vancouver Dublin
  • 4. What's new? • 2.1 is out! • Amazing changes for performance and stability
  • 5. Where are we going? • 3.0 is next. Just hold on…
  • 6. KillrVideo.com • 2012 Summit • Complete example for data modeling www.killrvideos.com [Site mockup: Video Title, Recommended, Ads by Google, Description, Comments, Upload New!, Username, Rating, Tags: Foo Bar] *Cat drawing by goodrob13 on Flickr
  • 7. It’s alive!!! • Hosted on Azure • Code on GitHub
  • 8. Data Model - Revisited • Add in some 2.1 data models • Replace (or remove) some app code • Become a part of Cassandra OSS download
  • 9. User Defined Types • Complex data in one place • No multi-gets (multi-partitions) • Nesting! CREATE TYPE address ( street text, city text, zip_code int, country text, cross_streets set<text> );
  • 10. Before CREATE TABLE videos ( videoid uuid, userid uuid, name varchar, description varchar, location text, location_type int, preview_thumbnails map<text,text>, tags set<varchar>, added_date timestamp, PRIMARY KEY (videoid) ); CREATE TABLE video_metadata ( video_id uuid PRIMARY KEY, height int, width int, video_bit_rate set<text>, encoding text ); SELECT * FROM videos WHERE videoId = 2; SELECT * FROM video_metadata WHERE video_id = 2; Title: Introduction to Apache Cassandra Description: A one hour talk on everything you need to know about a totally amazing database. Playback rate: 480 720 In-application join
  • 11. After • Now video_metadata is embedded in videos CREATE TYPE video_metadata ( height int, width int, video_bit_rate set<text>, encoding text ); CREATE TABLE videos ( videoid uuid, userid uuid, name varchar, description varchar, location text, location_type int, preview_thumbnails map<text,text>, tags set<varchar>, metadata set <frozen<video_metadata>>, added_date timestamp, PRIMARY KEY (videoid) );
  • 12. Wait! Frozen?? • Staying out of technical debt • 3.0 UDTs will not have to be frozen • Applicable to User Defined Types and Tuples (wait for Do you want to build a schema? Do you want to store some JSON?
  • 13. Let’s store some JSON { "productId": 2, "name": "Kitchen Table", "price": 249.99, "description": "Rectangular table with oak finish", "dimensions": { "units": "inches", "length": 50.0, "width": 66.0, "height": 32 }, "categories": { "Home Furnishings": { "catalogPage": 45, "url": "/home/furnishings" }, "Kitchen Furnishings": { "catalogPage": 108, "url": "/kitchen/furnishings" } } }
  • 14. Let’s store some JSON { "productId": 2, "name": "Kitchen Table", "price": 249.99, "description": "Rectangular table with oak finish", "dimensions": { "units": "inches", "length": 50.0, "width": 66.0, "height": 32 }, "categories": { "Home Furnishings": { "catalogPage": 45, "url": "/home/furnishings" }, "Kitchen Furnishings": { "catalogPage": 108, "url": "/kitchen/furnishings" } } } CREATE TYPE dimensions ( units text, length float, width float, height float );
  • 15. Let’s store some JSON { "productId": 2, "name": "Kitchen Table", "price": 249.99, "description": "Rectangular table with oak finish", "dimensions": { "units": "inches", "length": 50.0, "width": 66.0, "height": 32 }, "categories": { "Home Furnishings": { "catalogPage": 45, "url": "/home/furnishings" }, "Kitchen Furnishings": { "catalogPage": 108, "url": "/kitchen/furnishings" } } } CREATE TYPE dimensions ( units text, length float, width float, height float ); CREATE TYPE category ( catalogPage int, url text );
  • 16. Let’s store some JSON { "productId": 2, "name": "Kitchen Table", "price": 249.99, "description": "Rectangular table with oak finish", "dimensions": { "units": "inches", "length": 50.0, "width": 66.0, "height": 32 }, "categories": { "Home Furnishings": { "catalogPage": 45, "url": "/home/furnishings" }, "Kitchen Furnishings": { "catalogPage": 108, "url": "/kitchen/furnishings" } } } CREATE TYPE dimensions ( units text, length float, width float, height float ); CREATE TYPE category ( catalogPage int, url text ); CREATE TABLE product ( productId int, name text, price float, description text, dimensions frozen <dimensions>, categories map <text, frozen <category>>, PRIMARY KEY (productId) );
  • 17. Let’s store some JSON INSERT INTO product (productId, name, price, description, dimensions, categories) VALUES (2, 'Kitchen Table', 249.99, 'Rectangular table with oak finish', { units: 'inches', length: 50.0, width: 66.0, height: 32 }, { 'Home Furnishings': { catalogPage: 45, url: '/home/furnishings' }, 'Kitchen Furnishings': { catalogPage: 108, url: '/kitchen/furnishings' } } ); dimensions frozen <dimensions> categories map <text, frozen <category>>
  • 19. Counters pt Deux • Since 0.8 • Commit log replay would change counters • Repair could change counters • Performance was inconsistent. Lots of GC
  • 20. The good • Stable under load • No commit log replay issues • No repair weirdness
  • 21. The bad • Still can’t delete/reset counters • Still needs to do a read before write.
  • 22. Usage Wait for it… It’s the same! Carry on…
  • 23. Static Fields • New as of 2.0.6 • VERY specific, but useful • Thrift people will like this CREATE TABLE t ( k text, s text STATIC, i int, PRIMARY KEY (k, i) );
  • 24. Why? CREATE TABLE weather ( id int, time timestamp, weatherstation_name text, temperature float, PRIMARY KEY (id, time) ); Without static, partition ID = 1 (partition key, storage row key) repeats the name in every row: Partition Row 1: 2014-09-08 12:00:00 name SFO, temp 63.4; Partition Row 2: 2014-09-08 12:01:00 name SFO, temp 63.9; Partition Row 3: 2014-09-08 12:02:00 name SFO, temp 64.0. With static, partition ID = 1 stores name SFO once and each row keeps only its temperature: Partition Row 1: 2014-09-08 12:00:00 temp 63.4; Partition Row 2: 2014-09-08 12:01:00 temp 63.9; Partition Row 3: 2014-09-08 12:02:00 temp 64.0. CREATE TABLE weather ( id int, time timestamp, weatherstation_name text static, temperature float, PRIMARY KEY (id, time) );
  • 25. Usage • Put a static at the end of the declaration • Can’t be a part of: CREATE TABLE video_event ( videoid uuid, userid uuid, preview_image_location text static, event varchar, event_timestamp timeuuid, video_timestamp bigint, PRIMARY KEY ((videoid,userid),event_timestamp,event) ) WITH CLUSTERING ORDER BY (event_timestamp DESC,event ASC);
  • 26. Tuples CREATE TABLE tuple_table ( id int PRIMARY KEY, three_tuple frozen <tuple<int, text, float>>, four_tuple frozen <tuple<int, text, float, inet>>, five_tuple frozen <tuple<int, text, float, inet, ascii>> ); • A type that represents a group • Up to 256 different elements
  • 27. Example Usage • Track a drone’s position • x, y, z in a 3D Cartesian CREATE TABLE drone_position ( droneId int, time timestamp, position frozen <tuple<float, float, float>>, PRIMARY KEY (droneId, time) );
  • 28. What about partition size? • A CQL partition is a logical projection of a storage row • Storage rows can have up to 2 billion cells • Each cell can hold up to 2 GB of data
  • 29. How much is too much? • How many cells before performance degrades? • How many bytes per partition before it’s unmanageable? • What is “practical”?
  • 30. Old answer • 2011: Pre-Cassandra 1.2 (actually tested on 0.8) • Aaron Morton, Cassandra MVP and Founder of The Last Pickle
  • 31. Conclusion • Keep partition (storage row) length < 10k cells • Total size in bytes below 64 MB (Multi-pass compaction) • Multiple hits to 64 KB page size will start to hurt TL;DR - It’s a performance tunable
  • 32. The tests revisited • Attempted to reproduce the same tests using CQL • Cassandra 2.1, 2.0 and 1.2 • Tested partition sizes: 100; 2,114; 5,000; 10,000; 100,000; 1,000,000; 10,000,000; 100,000,000; 1,000,000,000
  • 33. Results [Chart: mSec versus cells per partition]
  • 34. The new answer • Hundreds of thousands of cells is not a problem • Hundreds of megabytes per partition is best operationally • The issue to manage is operations
  • 35. Thank You! Follow me on Twitter for more @PatrickMcFadin
  • 36. CASSANDRASUMMIT2014 September 10 - 11 | #CassandraSummit
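
The user defined type, static column, and tuple slides above can be exercised with a few statements. This is a minimal sketch, assuming a keyspace is already selected and that the videos table (slide 11), the static version of the weather table (slide 24), and the drone_position table (slide 27) exist exactly as declared in the transcript. The UUID, timestamps, and values are hypothetical placeholders, not data from the talk.

```cql
-- User defined type (slide 11): embed the metadata in the videos row.
-- The videoid below is a made-up placeholder.
INSERT INTO videos (videoid, name, metadata)
VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0,
        'Introduction to Apache Cassandra',
        { { height: 480, width: 640,
            video_bit_rate: {'480', '720'}, encoding: 'mp4' } });

-- One read returns the video and its metadata together.
SELECT name, metadata
FROM videos
WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;

-- Static column (slide 24): the station name is written once per partition.
INSERT INTO weather (id, weatherstation_name) VALUES (1, 'SFO');
INSERT INTO weather (id, time, temperature)
VALUES (1, '2014-09-08 12:00:00', 63.4);
INSERT INTO weather (id, time, temperature)
VALUES (1, '2014-09-08 12:01:00', 63.9);

-- Every row read back for id = 1 shows the same weatherstation_name.
SELECT time, weatherstation_name, temperature
FROM weather
WHERE id = 1;

-- Tuple (slide 27): x, y, z are written as one parenthesized literal.
INSERT INTO drone_position (droneId, time, position)
VALUES (1, '2014-09-08 12:00:00', (10.5, 20.0, 3.2));

SELECT time, position
FROM drone_position
WHERE droneId = 1;
```

The single SELECT on videos is what stands in for the in-application join on slide 10: the metadata comes back in the same row, so no second query against a separate video_metadata table is needed.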