Become a super modeler
Patrick McFadin @PatrickMcFadin
Senior Solutions Architect
DataStax
Thursday, May 16, 13
... the saga continues.
This is the second part of a data modeling series
Part 1: The data model is dead, long live the data model!
• Relational -> Cassandra topics
• Basic entity modeling
• one-to-many
• many-to-many
• Transaction-like modeling
Becoming a super modeler
• Data model is the key to happiness
• Successful deployments depend on it
• Not just a Cassandra problem...
Time series - Basic
CREATE TABLE temperature (
weatherstation_id text,
event_time timestamp,
temperature text,
PRIMARY KEY (weatherstation_id,event_time)
);
• Weather station collects temperature readings at regular intervals
• Each weather station is a row
• Each event is a new column in a wide row
Time series - Super!
• Every second? Row would be too big
• Order by access pattern
• Partition the rows by day
- One weather station by day
CREATE TABLE temperature_by_day (
weatherstation_id text,
date text,
event_time timestamp,
temperature text,
PRIMARY KEY ((weatherstation_id,date),event_time) -- compound row key
) WITH CLUSTERING ORDER BY (event_time DESC); -- reverse sort: last event, first on row
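The day partitioning above depends on the client deriving the date part of the row key from each event time. A minimal sketch of that step, assuming the text date column holds a UTC day formatted as YYYY-MM-DD (the slide does not specify a format) and using a hypothetical station id:

```python
from datetime import datetime, timezone


def partition_key(weatherstation_id, event_time):
    """Return the (weatherstation_id, date) pair used as the compound row key."""
    # Assumption: the day bucket is the UTC calendar day, formatted YYYY-MM-DD.
    day = event_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (weatherstation_id, day)


# Hypothetical station id and readings, for illustration only.
morning = datetime(2013, 5, 16, 7, 1, 0, tzinfo=timezone.utc)
evening = datetime(2013, 5, 16, 19, 30, 0, tzinfo=timezone.utc)
next_day = datetime(2013, 5, 17, 0, 0, 1, tzinfo=timezone.utc)

print(partition_key("1234ABCD", morning))
print(partition_key("1234ABCD", evening))
print(partition_key("1234ABCD", next_day))
```

Readings from the same station on the same day share one row; the first reading after midnight starts a new one.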
User model - basic
• Plain ole entity table
• One primary key
• Booooring
CREATE TABLE users (
username text PRIMARY KEY,
first_name text,
last_name text,
address1 text,
city text,
postal_code text,
last_login timestamp
);
Cassandra feature - Collections
• Collections give you three types:
- Set
- List
- Map
• Each allows dynamic updates
• Fully supported in CQL 3
• Requires serialization so don’t go crazy
CREATE TABLE collections_example (
id int PRIMARY KEY,
set_example set<text>,
list_example list<text>,
map_example map<int,text>
);
Cassandra Collections - Set
• Set is sorted by CQL type comparator
INSERT INTO collections_example (id, set_example)
VALUES(1, {'1-one', '2-two'});
set_example set<text>
(collection name: set_example; collection type: set; CQL type: text)
Cassandra Collections - Set Operations
• Adding an element to the set
UPDATE collections_example
SET set_example = set_example + {'3-three'} WHERE id = 1;
• After adding this element, it will sort to the beginning.
UPDATE collections_example
SET set_example = set_example + {'0-zero'} WHERE id = 1;
• Removing an element from the set
UPDATE collections_example
SET set_example = set_example - {'3-three'} WHERE id = 1;
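The effect of the three set operations can be modeled client-side. This is only a rough sketch: it uses a Python set read back in sorted order to stand in for a CQL set, and assumes that for these ASCII strings Python's string ordering matches the text comparator.

```python
def read_set(values):
    """Return the elements in sorted order, as a stand-in for reading a CQL set."""
    return sorted(values)


set_example = {"1-one", "2-two"}   # state after the initial INSERT
set_example |= {"3-three"}         # set_example + {'3-three'}
set_example |= {"0-zero"}          # set_example + {'0-zero'}
after_adds = read_set(set_example)

set_example -= {"3-three"}         # set_example - {'3-three'}
after_remove = read_set(set_example)

print(after_adds)
print(after_remove)
```

Although '0-zero' is added last, it is read back first, because order comes from the comparator and not from insertion.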
Cassandra Collections - List
• Ordered by insertion
list_example list<text>
(collection name: list_example; collection type: list; CQL type: text)
INSERT INTO collections_example (id, list_example)
VALUES(1, ['1-one', '2-two']);
Cassandra Collections - List Operations
• Adding an element to the end of a list
UPDATE collections_example
SET list_example = list_example + ['3-three'] WHERE id = 1;
• Adding an element to the beginning of a list
UPDATE collections_example
SET list_example = ['0-zero'] + list_example WHERE id = 1;
• Deleting an element from a list
UPDATE collections_example
SET list_example = list_example - ['3-three'] WHERE id = 1;
Cassandra Collections - Map
• Key and value
• Key is sorted by CQL type comparator
INSERT INTO collections_example (id, map_example)
VALUES(1, { 1 : 'one', 2 : 'two' });
map_example map<int,text>
(collection name: map_example; collection type: map; key CQL type: int; value CQL type: text)
Cassandra Collections - Map Operations
• Add an element to the map
UPDATE collections_example
SET map_example[3] = 'three' WHERE id = 1;
• Update an existing element in the map
UPDATE collections_example
SET map_example[3] = 'tres' WHERE id = 1;
• Delete an element in the map
DELETE map_example[3]
FROM collections_example WHERE id = 1;
User model - Super!
• Take boring user table and kick it up
• Great for static + some dynamic
• Takes advantage of row level isolation
CREATE TABLE user_with_location (
username text PRIMARY KEY,
first_name text,
last_name text,
address1 text,
city text,
postal_code text,
last_login timestamp,
location_by_date map<timeuuid,text>
);
Super user profile - Operations
• Adding new login locations to the map
// now() returns a timeuuid; toTimestamp() (Cassandra 2.2+, dateOf() in older releases) converts it for last_login
UPDATE user_with_location
SET last_login = toTimestamp(now()), location_by_date = location_by_date + {now() : '123.123.123.1'}
WHERE username='PatrickMcFadin';
• Adding new login locations to the map + TTL!
UPDATE user_with_location
USING TTL 2592000 // 30 Days
SET last_login = toTimestamp(now()), location_by_date = location_by_date + {now() : '123.123.123.1'}
WHERE username='PatrickMcFadin';
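TTL values are given in seconds, so the 30-day figure on this slide is plain arithmetic. A small sketch for checking it; the helper name ttl_seconds is made up for illustration:

```python
from datetime import timedelta


def ttl_seconds(days):
    """Convert a number of days into the seconds value that USING TTL expects."""
    return int(timedelta(days=days).total_seconds())


print(ttl_seconds(30))
```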
Indexing
• Indexing expresses application intent
• Fast access to specific queries
• Secondary indexes != relational indexes
• Use information you have. No pre-reads.
Goals:
1. Create row key for speed
2. Use wide rows for efficiency
Keyword index
• Use a word as a key
• Columns are the occurrence
• Ex: Index of tag words about videos
CREATE TABLE tag_index (
tag varchar,
videoid uuid,
timestamp timestamp,
PRIMARY KEY (tag, videoid)
);
Row key: tag; columns: VideoId1 .. VideoIdN
Fast
Efficient
Partial word index
• Where row size will be large
• Take one part for the key, the rest for the column name
CREATE TABLE email_index (
domain varchar,
user varchar,
username varchar,
PRIMARY KEY (domain, user)
);
INSERT INTO email_index (domain, user, username)
VALUES ('@relational.com','tcodd', 'tcodd');
User: tcodd Email: tcodd@relational.com
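The split of an email address into the two key parts happens in the application before the insert. A minimal sketch, assuming (as in the INSERT above) that the domain part keeps its leading '@'; the function name email_index_key is hypothetical:

```python
def email_index_key(email):
    """Split an email into (domain, user) for the email_index primary key."""
    user, sep, domain = email.rpartition("@")
    if not sep or not user or not domain:
        raise ValueError("not a usable email address: %r" % email)
    # The slide stores the domain with its leading '@'.
    return ("@" + domain, user)


print(email_index_key("tcodd@relational.com"))
```

All users of one domain land in the same row, with the user part as the column name.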
Partial word index - Super!
• Create partitions + partial indexes FTW
CREATE TABLE product_index (
store int,
part_number0_3 int,
part_number4_9 int,
count int,
PRIMARY KEY ((store,part_number0_3), part_number4_9)
);
INSERT INTO product_index (store,part_number0_3,part_number4_9,count)
VALUES (8675309,7079,748575,3);
SELECT count
FROM product_index
WHERE store = 8675309
AND part_number0_3 = 7079
AND part_number4_9 = 748575;
Compound row key!
Fast and efficient!
• Store #8675309 has 3 of part# 7079748575
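The column names part_number0_3 and part_number4_9 suggest a ten-digit part number cut after the fourth digit. A sketch of that split under that assumption; the helper split_part_number is hypothetical, and because the columns are ints, any leading zeros in either half would be lost:

```python
def split_part_number(part_number):
    """Split a ten-digit part number into (digits 0-3, digits 4-9) as ints."""
    if len(part_number) != 10 or not part_number.isdigit():
        raise ValueError("expected a ten-digit part number: %r" % part_number)
    # First four digits go into the row key, the remaining six into the column name.
    return int(part_number[:4]), int(part_number[4:])


print(split_part_number("7079748575"))
```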
Bit map index
• Multiple parts to a key
• Create a truth table of the different combinations
• Inserts == the number of combinations
- 3 fields? 7 options (Not going to use null choice)
- 4 fields? 15 options
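The option counts above are the non-empty subsets of the fields, which is 2^n - 1 for n fields. A short sketch that enumerates them; the fourth field name, year, is made up for illustration:

```python
from itertools import combinations


def key_combinations(fields):
    """Return every non-empty combination of the given fields, smallest first."""
    return [
        combo
        for size in range(1, len(fields) + 1)
        for combo in combinations(fields, size)
    ]


three = key_combinations(["make", "model", "color"])
four = key_combinations(["make", "model", "color", "year"])  # 'year' is hypothetical

print(len(three), len(four))
for combo in three:
    print("+".join(combo))
```

Each extra field roughly doubles the number of inserts per record, so the approach fits a small number of fields.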
Bit map index
• Find a car in a lot by variable combinations
 Make | Model | Color | Combination
------+-------+-------+------------------
      |       |   x   | Color
      |   x   |       | Model
      |   x   |   x   | Model+Color
  x   |       |       | Make
  x   |       |   x   | Make+Color
  x   |   x   |       | Make+Model
  x   |   x   |   x   | Make+Model+Color
Bit map index - Table create
• Make a table with three different key combos
CREATE TABLE car_location_index (
make varchar,
model varchar,
color varchar,
vehical_id int,
lot_id int,
PRIMARY KEY ((make,model,color),vehical_id)
);
Compound row key with three different options
Bit map index - Adding records
• Pre-optimize for 7 possible questions on insert
INSERT INTO car_location_index (make,model,color,vehical_id,lot_id)
VALUES ('Ford','Mustang','Blue',1234,8675309);
INSERT INTO car_location_index (make,model,color,vehical_id,lot_id)
VALUES ('Ford','Mustang','',1234,8675309);
INSERT INTO car_location_index (make,model,color,vehical_id,lot_id)
VALUES ('Ford','','Blue',1234,8675309);
INSERT INTO car_location_index (make,model,color,vehical_id,lot_id)
VALUES ('Ford','','',1234,8675309);
INSERT INTO car_location_index (make,model,color,vehical_id,lot_id)
VALUES ('','Mustang','Blue',1234,8675309);
INSERT INTO car_location_index (make,model,color,vehical_id,lot_id)
VALUES ('','Mustang','',1234,8675309);
INSERT INTO car_location_index (make,model,color,vehical_id,lot_id)
VALUES ('','','Blue',1234,8675309);
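Writing the seven inserts by hand is error-prone, so an application would probably generate the row keys. A sketch of one way to do that, using an empty string for each unused field as the inserts above do; the function name index_keys is hypothetical:

```python
from itertools import product


def index_keys(make, model, color):
    """Return every (make, model, color) key with at least one field filled in."""
    keys = []
    for use_make, use_model, use_color in product([True, False], repeat=3):
        if not (use_make or use_model or use_color):
            continue  # skip the all-empty ("null") choice
        keys.append((
            make if use_make else "",
            model if use_model else "",
            color if use_color else "",
        ))
    return keys


keys = index_keys("Ford", "Mustang", "Blue")
for key in keys:
    print(key)
```

Each tuple would become the (make, model, color) part of one INSERT, with the same vehical_id and lot_id repeated in all seven.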
Bit map index - Selecting records
• Different combinations now possible
SELECT vehical_id,lot_id
FROM car_location_index
WHERE make = 'Ford'
AND model = ''
AND color = 'Blue';
vehical_id | lot_id
------------+---------
1234 | 8675309
SELECT vehical_id,lot_id
FROM car_location_index
WHERE make = ''
AND model = ''
AND color = 'Blue';
vehical_id | lot_id
------------+---------
1234 | 8675309
8765 | 5551212
Feeling super yet?
• Use these skills. Save you they will.
• Don’t settle for boring data models
• Stay tuned for more!
• Final will be at the Cassandra Summit: June 11th
The world's next top data model
Be there!!!
Sony, eBay, Netflix, Intuit, Spotify... the list goes on. Don’t miss it.
Here is my discount code! Use it: PMcVIP
Bonus!
• DataStax Java Driver Preso - June 12th
• Download today!
https://github.com/datastax/java-driver
Thank You
Q&A
Thursday, May 16, 13

More Related Content

What's hot

Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache CalciteJordan Halterman
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...Databricks
 
Modularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache SparkModularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache SparkDatabricks
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsCloudera, Inc.
 
Geospatial Options in Apache Spark
Geospatial Options in Apache SparkGeospatial Options in Apache Spark
Geospatial Options in Apache SparkDatabricks
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
Flink Batch Processing and Iterations
Flink Batch Processing and IterationsFlink Batch Processing and Iterations
Flink Batch Processing and IterationsSameer Wadkar
 
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Josef A. Habdank
 
Native Support of Prometheus Monitoring in Apache Spark 3.0
Native Support of Prometheus Monitoring in Apache Spark 3.0Native Support of Prometheus Monitoring in Apache Spark 3.0
Native Support of Prometheus Monitoring in Apache Spark 3.0Databricks
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotFlink Forward
 
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache CalciteCost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache CalciteJulian Hyde
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in SparkDatabricks
 
Data Discovery at Databricks with Amundsen
Data Discovery at Databricks with AmundsenData Discovery at Databricks with Amundsen
Data Discovery at Databricks with AmundsenDatabricks
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymenthyeongchae lee
 
Spark And Cassandra: 2 Fast, 2 Furious
Spark And Cassandra: 2 Fast, 2 FuriousSpark And Cassandra: 2 Fast, 2 Furious
Spark And Cassandra: 2 Fast, 2 FuriousJen Aman
 
Deep dive into stateful stream processing in structured streaming by Tathaga...
Deep dive into stateful stream processing in structured streaming  by Tathaga...Deep dive into stateful stream processing in structured streaming  by Tathaga...
Deep dive into stateful stream processing in structured streaming by Tathaga...Databricks
 
Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1Databricks
 
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...Spark Summit
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 

What's hot (20)

Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache Calcite
 
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets with Jules ...
 
Modularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache SparkModularized ETL Writing with Apache Spark
Modularized ETL Writing with Apache Spark
 
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
 
Geospatial Options in Apache Spark
Geospatial Options in Apache SparkGeospatial Options in Apache Spark
Geospatial Options in Apache Spark
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
Flink Batch Processing and Iterations
Flink Batch Processing and IterationsFlink Batch Processing and Iterations
Flink Batch Processing and Iterations
 
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
 
Native Support of Prometheus Monitoring in Apache Spark 3.0
Native Support of Prometheus Monitoring in Apache Spark 3.0Native Support of Prometheus Monitoring in Apache Spark 3.0
Native Support of Prometheus Monitoring in Apache Spark 3.0
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache CalciteCost-based Query Optimization in Apache Phoenix using Apache Calcite
Cost-based Query Optimization in Apache Phoenix using Apache Calcite
 
Data Source API in Spark
Data Source API in SparkData Source API in Spark
Data Source API in Spark
 
Data Discovery at Databricks with Amundsen
Data Discovery at Databricks with AmundsenData Discovery at Databricks with Amundsen
Data Discovery at Databricks with Amundsen
 
Apache spark
Apache sparkApache spark
Apache spark
 
patroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deploymentpatroni-based citrus high availability environment deployment
patroni-based citrus high availability environment deployment
 
Spark And Cassandra: 2 Fast, 2 Furious
Spark And Cassandra: 2 Fast, 2 FuriousSpark And Cassandra: 2 Fast, 2 Furious
Spark And Cassandra: 2 Fast, 2 Furious
 
Deep dive into stateful stream processing in structured streaming by Tathaga...
Deep dive into stateful stream processing in structured streaming  by Tathaga...Deep dive into stateful stream processing in structured streaming  by Tathaga...
Deep dive into stateful stream processing in structured streaming by Tathaga...
 
Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1Deep Dive into the New Features of Apache Spark 3.1
Deep Dive into the New Features of Apache Spark 3.1
 
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
Improving Python and Spark Performance and Interoperability: Spark Summit Eas...
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 

Similar to Become a super modeler

Cassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerCassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerDataStax
 
Cassandra Data Modeling
Cassandra Data ModelingCassandra Data Modeling
Cassandra Data ModelingBen Knear
 
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Martin Loetzsch
 
Just in time (series) - KairosDB
Just in time (series) - KairosDBJust in time (series) - KairosDB
Just in time (series) - KairosDBVictor Anjos
 
PL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme PerformancePL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme PerformanceZohar Elkayam
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineeringJulian Hyde
 
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with ClickhouseWebinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with ClickhouseAltinity Ltd
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine LearningAmanBhalla14
 
Ten Reasons Why You Should Prefer PostgreSQL to MySQL
Ten Reasons Why You Should Prefer PostgreSQL to MySQLTen Reasons Why You Should Prefer PostgreSQL to MySQL
Ten Reasons Why You Should Prefer PostgreSQL to MySQLanandology
 
PerlApp2Postgresql (2)
PerlApp2Postgresql (2)PerlApp2Postgresql (2)
PerlApp2Postgresql (2)Jerome Eteve
 
ch03-parameters-objects.ppt
ch03-parameters-objects.pptch03-parameters-objects.ppt
ch03-parameters-objects.pptMahyuddin8
 
TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - Trivadis
TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - TrivadisTechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - Trivadis
TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - TrivadisTrivadis
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 

Similar to Become a super modeler (20)

Cassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super ModelerCassandra Community Webinar | Become a Super Modeler
Cassandra Community Webinar | Become a Super Modeler
 
Do You Have the Time
Do You Have the TimeDo You Have the Time
Do You Have the Time
 
Cassandra Data Modeling
Cassandra Data ModelingCassandra Data Modeling
Cassandra Data Modeling
 
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
Project A Data Modelling Best Practices Part II: How to Build a Data Warehouse?
 
Just in time (series) - KairosDB
Just in time (series) - KairosDBJust in time (series) - KairosDB
Just in time (series) - KairosDB
 
Sql analytic queries tips
Sql analytic queries tipsSql analytic queries tips
Sql analytic queries tips
 
Quick dive to pandas
Quick dive to pandasQuick dive to pandas
Quick dive to pandas
 
PL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme PerformancePL/SQL New and Advanced Features for Extreme Performance
PL/SQL New and Advanced Features for Extreme Performance
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineering
 
MariaDB Temporal Tables
MariaDB Temporal TablesMariaDB Temporal Tables
MariaDB Temporal Tables
 
mis4200notes4_2.ppt
mis4200notes4_2.pptmis4200notes4_2.ppt
mis4200notes4_2.ppt
 
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with ClickhouseWebinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
 
R programming & Machine Learning
R programming & Machine LearningR programming & Machine Learning
R programming & Machine Learning
 
Ten Reasons Why You Should Prefer PostgreSQL to MySQL
Ten Reasons Why You Should Prefer PostgreSQL to MySQLTen Reasons Why You Should Prefer PostgreSQL to MySQL
Ten Reasons Why You Should Prefer PostgreSQL to MySQL
 
Rdbms day3
Rdbms day3Rdbms day3
Rdbms day3
 
PerlApp2Postgresql (2)
PerlApp2Postgresql (2)PerlApp2Postgresql (2)
PerlApp2Postgresql (2)
 
Cassandra
CassandraCassandra
Cassandra
 
ch03-parameters-objects.ppt
ch03-parameters-objects.pptch03-parameters-objects.ppt
ch03-parameters-objects.ppt
 
TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - Trivadis
TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - TrivadisTechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - Trivadis
TechEvent 2019: Uses of Row Pattern Matching; Kim Berg Hansen - Trivadis
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 

More from Patrick McFadin

Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast DataPatrick McFadin
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!Patrick McFadin
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team ApachePatrick McFadin
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelinesPatrick McFadin
 
Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Patrick McFadin
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraPatrick McFadin
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache CassandraPatrick McFadin
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterprisePatrick McFadin
 
Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced previewPatrick McFadin
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraPatrick McFadin
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the firePatrick McFadin
 
Owning time series with team apache Strata San Jose 2015
Owning time series with team apache   Strata San Jose 2015Owning time series with team apache   Strata San Jose 2015
Owning time series with team apache Strata San Jose 2015Patrick McFadin
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and SparkPatrick McFadin
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataPatrick McFadin
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014Patrick McFadin
 
Making money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guideMaking money with open source and not losing your soul: A practical guide
Making money with open source and not losing your soul: A practical guidePatrick McFadin
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionPatrick McFadin
 
Time series with apache cassandra strata
Time series with apache cassandra   strataTime series with apache cassandra   strata
Time series with apache cassandra strataPatrick McFadin
 
Cassandra EU - Data model on fire
Cassandra EU - Data model on fireCassandra EU - Data model on fire
Cassandra EU - Data model on firePatrick McFadin
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesPatrick McFadin
 

More from Patrick McFadin (20)

Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast Data
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelines
 
Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.Help! I want to contribute to an Open Source project but my boss says no.
Help! I want to contribute to an Open Source project but my boss says no.
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and Cassandra
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache Cassandra
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced preview
 
Introduction to data modeling with apache cassandra



Become a super modeler

  • 1. Become a super modeler Patrick McFadin @PatrickMcFadin Senior Solutions Architect DataStax Thursday, May 16, 13
  • 2. Become a super modeler Patrick McFadin @PatrickMcFadin Senior Solutions Architect DataStax
  • 3. ... the saga continues. This is the second part of a data modeling series. Part 1: The data model is dead, long live the data model! • Relational -> Cassandra topics • Basic entity modeling • One-to-many • Many-to-many • Transaction-like modeling
  • 4. Becoming a super modeler • Data model is the key to happiness • Successful deployments depend on it • Not just a Cassandra problem...
  • 5. Time series - Basic CREATE TABLE temperature ( weatherstation_id text, event_time timestamp, temperature text, PRIMARY KEY (weatherstation_id,event_time) ); • Weather station collects temperature readings at regular intervals • Each weather station is a row • Each event is a new column in a wide row
  • 6. Time series - Super! • An event every second? The row would grow too big • Order by access pattern • Partition the rows by day - One partition per weather station per day CREATE TABLE temperature_by_day ( weatherstation_id text, date text, event_time timestamp, temperature text, PRIMARY KEY ((weatherstation_id,date),event_time) ) WITH CLUSTERING ORDER BY (event_time DESC); Compound row key. Reverse sort: last event, first on row.
  • 7. User model - basic • Plain ole entity table • One primary key • Booooring CREATE TABLE users ( username text PRIMARY KEY, first_name text, last_name text, address1 text, city text, postal_code text, last_login timestamp );
  • 8. Cassandra feature - Collections • Collections give you three types: - Set - List - Map • Each allows dynamic updates • Fully supported in CQL 3 • Requires serialization so don’t go crazy CREATE TABLE collections_example ( id int PRIMARY KEY, set_example set<text>, list_example list<text>, map_example map<int,text> );
  • 9. Cassandra Collections - Set • Set is sorted by CQL type comparator • Column definition: set_example set<text> (collection name: set_example, collection type: set, CQL type: text) INSERT INTO collections_example (id, set_example) VALUES (1, {'1-one', '2-two'});
  • 10. Cassandra Collections - Set Operations • Adding an element to the set: UPDATE collections_example SET set_example = set_example + {'3-three'} WHERE id = 1; • After adding this element, it will sort to the beginning: UPDATE collections_example SET set_example = set_example + {'0-zero'} WHERE id = 1; • Removing an element from the set: UPDATE collections_example SET set_example = set_example - {'3-three'} WHERE id = 1;
  • 11. Cassandra Collections - List • Ordered by insertion • Column definition: list_example list<text> (collection name: list_example, collection type: list, CQL type: text) INSERT INTO collections_example (id, list_example) VALUES (1, ['1-one', '2-two']);
  • 12. Cassandra Collections - List Operations • Adding an element to the end of a list: UPDATE collections_example SET list_example = list_example + ['3-three'] WHERE id = 1; • Adding an element to the beginning of a list: UPDATE collections_example SET list_example = ['0-zero'] + list_example WHERE id = 1; • Deleting an element from a list: UPDATE collections_example SET list_example = list_example - ['3-three'] WHERE id = 1;
  • 13. Cassandra Collections - Map • Key and value • Key is sorted by CQL type comparator • Column definition: map_example map<int,text> (collection name: map_example, collection type: map, key CQL type: int, value CQL type: text) INSERT INTO collections_example (id, map_example) VALUES (1, { 1 : 'one', 2 : 'two' });
  • 14. Cassandra Collections - Map Operations • Add an element to the map: UPDATE collections_example SET map_example[3] = 'three' WHERE id = 1; • Update an existing element in the map: UPDATE collections_example SET map_example[3] = 'tres' WHERE id = 1; • Delete an element in the map: DELETE map_example[3] FROM collections_example WHERE id = 1;
  • 15. User model - Super! • Take boring user table and kick it up • Great for static data plus some dynamic data • Takes advantage of row level isolation CREATE TABLE user_with_location ( username text PRIMARY KEY, first_name text, last_name text, address1 text, city text, postal_code text, last_login timestamp, location_by_date map<timeuuid,text> );
  • 16. Super user profile - Operations • Adding new login locations to the map: UPDATE user_with_location SET last_login = dateOf(now()), location_by_date = location_by_date + {now() : '123.123.123.1'} WHERE username='PatrickMcFadin'; • Adding new login locations to the map + TTL!: UPDATE user_with_location USING TTL 2592000 /* 30 days */ SET last_login = dateOf(now()), location_by_date = location_by_date + {now() : '123.123.123.1'} WHERE username='PatrickMcFadin';
  • 17. Indexing • Indexing expresses application intent • Fast access to specific queries • Secondary indexes != relational indexes • Use information you have. No pre-reads. Goals: 1. Create row key for speed 2. Use wide rows for efficiency
  • 18. Keyword index • Use a word as a key • Columns are the occurrences • Ex: Index of tag words about videos CREATE TABLE tag_index ( tag varchar, videoid uuid, timestamp timestamp, PRIMARY KEY (tag, videoid) ); Row layout: tag -> VideoId1 .. VideoIdN (fast, efficient)
  • 19. Partial word index • Where row size will be large • Take one part for the key, the rest for the column name CREATE TABLE email_index ( domain varchar, user varchar, username varchar, PRIMARY KEY (domain, user) ); INSERT INTO email_index (domain, user, username) VALUES ('@relational.com','tcodd', 'tcodd'); User: tcodd. Email: tcodd@relational.com
  • 20. Partial word index - Super! • Create partitions + partial indexes FTW CREATE TABLE product_index ( store int, part_number0_3 int, part_number4_9 int, count int, PRIMARY KEY ((store,part_number0_3), part_number4_9) ); INSERT INTO product_index (store,part_number0_3,part_number4_9,count) VALUES (8675309,7079,748575,3); SELECT count FROM product_index WHERE store = 8675309 AND part_number0_3 = 7079 AND part_number4_9 = 748575; Compound row key! Fast and efficient! • Store #8675309 has 3 of part# 7079748575
  • 21. Bit map index • Multiple parts to a key • Create a truth table of the different combinations • Inserts == the number of combinations (2^n - 1 for n fields) - 3 fields? 7 options (Not going to use null choice) - 4 fields? 15 options
  • 22. Bit map index • Find a car in a lot by variable combinations
      Make  Model  Color  Combination
                   x      Color
            x             Model
            x      x      Model+Color
      x                   Make
      x            x      Make+Color
      x     x             Make+Model
      x     x      x      Make+Model+Color
  • 23. Bit map index - Table create • Make a table with three different key combos CREATE TABLE car_location_index ( make varchar, model varchar, color varchar, vehical_id int, lot_id int, PRIMARY KEY ((make,model,color),vehical_id) ); Compound row key with three different options
  • 24. Bit map index - Adding records • Pre-optimize for 7 possible questions on insert INSERT INTO car_location_index (make,model,color,vehical_id,lot_id) VALUES ('Ford','Mustang','Blue',1234,8675309); INSERT INTO car_location_index (make,model,color,vehical_id,lot_id) VALUES ('Ford','Mustang','',1234,8675309); INSERT INTO car_location_index (make,model,color,vehical_id,lot_id) VALUES ('Ford','','Blue',1234,8675309); INSERT INTO car_location_index (make,model,color,vehical_id,lot_id) VALUES ('Ford','','',1234,8675309); INSERT INTO car_location_index (make,model,color,vehical_id,lot_id) VALUES ('','Mustang','Blue',1234,8675309); INSERT INTO car_location_index (make,model,color,vehical_id,lot_id) VALUES ('','Mustang','',1234,8675309); INSERT INTO car_location_index (make,model,color,vehical_id,lot_id) VALUES ('','','Blue',1234,8675309);
  • 25. Bit map index - Selecting records • Different combinations now possible
      SELECT vehical_id,lot_id FROM car_location_index WHERE make = 'Ford' AND model = '' AND color = 'Blue';
       vehical_id | lot_id
      ------------+---------
             1234 | 8675309
      SELECT vehical_id,lot_id FROM car_location_index WHERE make = '' AND model = '' AND color = 'Blue';
       vehical_id | lot_id
      ------------+---------
             1234 | 8675309
             8765 | 5551212
  • 26. Feeling super yet? • Use these skills. Save you they will. • Don’t settle for boring data models • Stay tuned for more! • Final part will be at the Cassandra Summit, June 11th: The world's next top data model
  • 27. Be there!!! Sony, eBay, Netflix, Intuit, Spotify... the list goes on. Don’t miss it. Here is my discount code! Use it: PMcVIP
  • 28. Bonus! • DataStax Java Driver Preso - June 12th • Download today! https://github.com/datastax/java-driver
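
A minimal sketch, in Python rather than CQL, of the insert pattern from the bit map index slides (21 to 24): for n indexed fields, one row is written for each of the 2^n - 1 non-empty field combinations, with unused fields stored as empty strings. The function name index_rows and the dict input are assumptions made for illustration; they do not appear in the slides.

```python
from itertools import product


def index_rows(attrs):
    """Return every key combination to write for one record.

    attrs maps column name -> value (e.g. make/model/color).
    A field left out of a combination is stored as '' (empty string),
    mirroring the INSERT statements on the slides. The combination with
    no fields at all (the "null choice") is skipped, so n fields give
    2**n - 1 rows.
    """
    names = list(attrs)
    rows = []
    for mask in product([True, False], repeat=len(names)):
        if not any(mask):
            continue  # skip the all-empty combination
        rows.append(tuple(attrs[n] if keep else "" for n, keep in zip(names, mask)))
    return rows


# Hypothetical record matching the slide example.
car = {"make": "Ford", "model": "Mustang", "color": "Blue"}
rows = index_rows(car)
for make, model, color in rows:
    print((make, model, color))
```

Each tuple corresponds to one INSERT into car_location_index with the same vehical_id and lot_id. Adding a fourth indexed field would raise the count to 15 rows per record, so the write cost doubles (plus one) with every extra field.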