Cassandra 3.0 advanced preview

Patrick McFadin @PatrickMcFadin
Chief Evangelist for Apache Cassandra

Intro
1. Developer friendly
2. Pay down a lot of technical debt
3. Push performance even higher

User Defined Functions
• Counter table
• User clicks on a number of stars
• rating_counter = How many clicks
• rating_total = Cumulative amount of stars
CREATE TABLE video_rating ( 
videoid uuid, 
rating_counter counter, 
rating_total counter, 
PRIMARY KEY (videoid) 
);

public long getRatingForVideo(UUID videoId) {
    BoundStatement bs = getRatingByVideoPreparedStatement.bind(videoId);
    ResultSet rs = session.execute(bs);
    Row row = rs.one();

    // No counters recorded for this video yet
    if (row == null) {
        return 0;
    }

    // Get the count and total rating for the video
    long total = row.getLong("rating_total");
    long count = row.getLong("rating_counter");

    // Divide the total by the count and return an average
    // (long division: any fractional part is dropped)
    return count == 0 ? 0 : total / count;
}

Application code?

CREATE OR REPLACE FUNCTION averageRating ( rating_counter counter, rating_total counter ) 
RETURNS Float 
LANGUAGE java 
AS ' 
return Float.valueOf(rating_total.floatValue() / rating_counter.floatValue()); 
';
Function Name
CQL Type
Object return type
Java Code

• Add to your CQL statement!
> SELECT averageRating(rating_counter, rating_total) AS avg 
FROM video_rating 
WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;
videoid | rating_counter | rating_total 
--------------------------------------+----------------+-------------- 
99051fe9-6a9c-46c2-b949-38ef78858dd0 | 3 | 12
avg 
----- 
4
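
The arithmetic inside the UDF body can be tried outside Cassandra. This is a minimal sketch, assuming the two counter arguments reach the Java body as boxed Long values (an assumption, not something the slide states); the class and method names are made up for illustration. It also shows why the application-code version, which divides two longs, loses the fraction.

```java
// Sketch only: mirrors the arithmetic of the averageRating UDF body.
class AverageRatingSketch {
    // Same expression as the UDF body: boxed inputs, boxed Float result.
    // Assumption: counter values arrive as java.lang.Long.
    static Float averageRating(Long ratingCounter, Long ratingTotal) {
        return Float.valueOf(ratingTotal.floatValue() / ratingCounter.floatValue());
    }

    // The application-code version from the earlier slide divides two longs.
    static long integerAverage(long ratingCounter, long ratingTotal) {
        return ratingTotal / ratingCounter;
    }

    public static void main(String[] args) {
        // Row from the slide: 12 stars over 3 clicks.
        System.out.println(averageRating(3L, 12L));
        // 13 stars over 3 clicks: long division drops the fraction, float division keeps it.
        System.out.println(integerAverage(3L, 13L));
        System.out.println(averageRating(3L, 13L));
    }
}
```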

User Defined Functions - Fine print
• “Pure” functions
• No access to anything outside the input parameters
• Return types are objects only, no primitives
• Method signature is determined by the parameter types
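
A rough illustration of the fine print, written as plain Java rather than as a registered UDF. The null and zero handling is an assumption added for the sketch, and `PureFunctionSketch` is a hypothetical name.

```java
import java.util.function.BiFunction;

class PureFunctionSketch {
    // Pure: the result depends only on the two arguments (no fields, I/O, or clock).
    // Boxed types in and out, so "no value" can be expressed as null.
    static final BiFunction<Long, Long, Float> AVERAGE = (counter, total) -> {
        if (counter == null || total == null || counter == 0L) {
            return null; // assumption: missing input or zero clicks means "no rating"
        }
        return total.floatValue() / counter.floatValue();
    };

    public static void main(String[] args) {
        System.out.println(AVERAGE.apply(3L, 12L));
        System.out.println(AVERAGE.apply(null, 12L));
    }
}
```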

User Defined Function Language Support
Primary Languages
• Java
• JavaScript
Optional Languages
• Scala
• Groovy
• Jython
• JRuby

JSON Support
• Table to store a video
• TYPE to store metadata
CREATE TYPE video_metadata ( 
height int, 
width int, 
video_bit_rate set<text>, 
encoding text 
);
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
description varchar,
location text,
location_type int,
preview_thumbnails map<text,text>,
tags set<varchar>,
metadata set<frozen<video_metadata>>,
added_date timestamp,
PRIMARY KEY (videoid)
);

JSON Support
INSERT INTO videos (videoid, name, userid, description, location, location_type,
preview_thumbnails, tags, added_date, metadata)
VALUES (49f64d40-7d89-4890-b910-dbf923563a33,'The World''s Next Top Data Model',
9761d3d7-7fbd-4269-9988-6cfd4e188678,
'Third in a three part series for Cassandra Data Modeling','http://www.youtube.com/watch?v=HdJlsOZVGwM',1,
{'YouTube':'http://www.youtube.com/watch?v=HdJlsOZVGwM'},{'cassandra','data model','examples','instruction'},'2013-06-11 11:00:00',
{{ height: 480, width: 640, encoding: 'MP4', video_bit_rate: {'1000kbs', '400kbs'}}});
Decompose into standard insert
OR!

JSON Support
INSERT INTO videos JSON 
'{ 
"videoid":"99051fe9-6a9c-46c2-b949-38ef78858dd0", 
"added_date":"2012-06-01 08:00:00.000", 
"description":"My cat likes to play the piano! So funny.", 
"location":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401", 
"location_type":1, 
"metadata":[ 
{ 
"height":480, 
"width":640, 
"video_bit_rate":[ 
"1000kbs", 
"400kbs" 
], 
"encoding":"MP4" 
} 
], 
"name":"My funny cat", 
"preview_thumbnails":{ 
"10":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401" 
}, 
"tags":[ 
"cats", 
"lol", 
"piano" 
], 
"userid":"d0f60aa8-54a9-4840-b70c-fe562b68842b" 
}';
One block of JSON
OR!
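
From application code, the single-block form comes down to wrapping a JSON document in one statement. A minimal sketch, assuming the document is already serialized to a string; `insertJson` is a hypothetical helper, not a driver API. Single quotes inside the document are doubled, the same escaping used in 'The World''s Next Top Data Model' earlier.

```java
class InsertJsonSketch {
    // Wraps a JSON document in an INSERT ... JSON statement.
    // Single quotes are doubled because the document sits inside a CQL string literal.
    static String insertJson(String table, String json) {
        return "INSERT INTO " + table + " JSON '" + json.replace("'", "''") + "';";
    }

    public static void main(String[] args) {
        String doc = "{\"videoid\":\"99051fe9-6a9c-46c2-b949-38ef78858dd0\",\"name\":\"My cat's video\"}";
        System.out.println(insertJson("videos", doc));
    }
}
```

With a driver, binding the document as a parameter instead of concatenating it would likely be the safer route; the string version here is only meant to show the statement's shape.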

JSON Support
INSERT INTO videos (videoid, name, userid, description, location, location_type, preview_thumbnails, tags,
added_date, metadata) 
VALUES (99051fe9-6a9c-46c2-b949-38ef78858dd0,'My funny cat',d0f60aa8-54a9-4840-b70c-fe562b68842b,  
'My cat likes to play the piano! So funny.','/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401',1, 
{'10':'/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401'},{'cats','piano','lol'},'2012-06-01 08:00:00', 
fromJson('
[{
"height":480,
"width":640,
"video_bit_rate":[
"1000kbs",
"400kbs"
],
"encoding":"MP4"
}]
')
);
Just a block at a time

Get JSON data
[json] 
------------------------------------------------------------------ 
'{ 
"videoid":"99051fe9-6a9c-46c2-b949-38ef78858dd0", 
"added_date":"2012-06-01 08:00:00.000", 
"description":"My cat likes to play the piano! So funny.", 
"location":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401", 
"location_type":1, 
"metadata":[ 
{
"height":480,
"width":640,
"video_bit_rate":[
"1000kbs",
"400kbs"
],
"encoding":"MP4" 
} 
], 
"name":"My funny cat", 
"preview_thumbnails":{ 
"10":"/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401" 
}, 
"tags":[ 
"cats", 
"lol", 
"piano" 
], 
"userid":"d0f60aa8-54a9-4840-b70c-fe562b68842b" 
}';
SELECT JSON * FROM videos;

Global Indexes
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
location text,
...
);
CREATE TABLE videos_by_tag ( 
tag text, 
videoid uuid, 
name text, 
preview_image_location text, 
tagged_date timestamp, 
PRIMARY KEY (tag, videoid) 
);
Application-maintained consistency
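
What "application-maintained" means in practice can be sketched with two in-memory maps standing in for videos and videos_by_tag. This is an illustration only, with hypothetical names and no Cassandra calls: every write to the base table has to be mirrored into the lookup table by the application, including cleanup of stale entries when tags change.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

class VideosByTagSketch {
    // Stand-ins for the two tables: videos (videoid -> tags) and videos_by_tag (tag -> videoids).
    final Map<UUID, Set<String>> videos = new HashMap<>();
    final Map<String, Set<UUID>> videosByTag = new HashMap<>();

    // Every write to videos must be mirrored into videos_by_tag by the application.
    void saveVideo(UUID videoId, Set<String> tags) {
        Set<String> oldTags = videos.put(videoId, new HashSet<>(tags));
        if (oldTags != null) {
            // Update case: remove lookup entries for tags the video no longer has.
            for (String tag : oldTags) {
                Set<UUID> ids = videosByTag.get(tag);
                if (!tags.contains(tag) && ids != null) {
                    ids.remove(videoId);
                }
            }
        }
        // Insert case: add one lookup entry per tag.
        for (String tag : tags) {
            videosByTag.computeIfAbsent(tag, t -> new HashSet<>()).add(videoId);
        }
    }

    Set<UUID> findByTag(String tag) {
        return videosByTag.getOrDefault(tag, Collections.emptySet());
    }
}
```

Forgetting either branch leaves the two structures out of step, which is the burden a Cassandra-managed index is meant to take off the application.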

Global Indexes
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
location text,
...
);
CREATE GLOBAL INDEX tags_index
ON videos (tag, videoid)
INCLUDE (name, added_date, preview_thumbnails);
CREATE TABLE videos_by_tag ( 
tag text, 
videoid uuid, 
name text, 
preview_image_location text, 
tagged_date timestamp, 
PRIMARY KEY (tag, videoid) 
);

Global Indexes
• Separate, Cassandra-managed table
• Inserts
• Updates

More Indexes!
• Partial Indexes - Postponed until 3.1
• Functional Indexes - using a UDF in an index
CREATE INDEX ON video_rating (averageRating(rating_counter, rating_total));

Hints to Raw Files
• Pre 3.0, hints were stored in a table
• Creates load on the entire write path
• …and read path
• …and compaction
CREATE TABLE system.hints ( 
target_id uuid, 
hint_id timeuuid, 
message_version int, 
mutation blob, 
PRIMARY KEY (target_id, hint_id, message_version) 
) WITH COMPACT STORAGE 
AND CLUSTERING ORDER BY (hint_id ASC, message_version ASC);

Hints to Raw Files
• Hints now written to a local file
• Replays direct from disk
• Bulk streamed to endpoints
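
The idea of appending hints to a local file and replaying them in order can be sketched in a few lines. The length-prefixed record layout and all names below are assumptions for illustration; this is not Cassandra's actual hint file format.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

class HintFileSketch {
    // Appends one hint as a length-prefixed record (hypothetical format).
    static void append(Path file, byte[] mutation) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
                Files.newOutputStream(file, StandardOpenOption.CREATE, StandardOpenOption.APPEND)))) {
            out.writeInt(mutation.length);
            out.write(mutation);
        }
    }

    // Replays hints sequentially, straight from the file, in the order written.
    static List<byte[]> replay(Path file) throws IOException {
        List<byte[]> hints = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break; // no more records
                }
                byte[] mutation = new byte[length];
                in.readFully(mutation);
                hints.add(mutation);
            }
        }
        return hints;
    }
}
```

Appends and sequential reads like these avoid the table's write path, read path, and compaction entirely, which is the point of the change.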

Windows Compatibility - The Problem
• Java file management on Windows is… different
• File deletes are not possible while a file is open
• Hard links - Broken
• Snapshots - Broken
• Memory Mapped I/O - Broken

Windows Compatibility - 3.0
• Re-tooling of critical file functions
• Extensive use of FILE_SHARE_DELETE from JDK7
• Launch now in PowerShell
• CCM now supports Windows
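
A small sketch of the behavior this relies on. As far as I know, files opened through the JDK7 NIO.2 APIs on Windows are opened with FILE_SHARE_DELETE by default, so they can be deleted while still open; treat that as an assumption to verify on your platform. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DeleteWhileOpenSketch {
    // Opens a file through NIO.2, deletes it while the channel is still open,
    // then keeps reading through the open channel. Returns the bytes read.
    static int deleteThenRead(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            Files.delete(file); // the step that older file APIs on Windows tend to reject
            ByteBuffer buffer = ByteBuffer.allocate(16);
            return channel.read(buffer);
        }
    }
}
```

Memory-mapped files are a separate case and are not covered by this sketch.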

Storage Engine Changes
• The now-infamous CASSANDRA-8099
• Technical debt from Thrift
• Move from Thrift-centric to CQL-centric storage

Pre 3.0 Storage Engine Format
PRIMARY KEY (userId,added_date,videoId)
[Diagram: a Partition Key followed by Clustering Columns; sample values 2005:12:1:10, 2005:12:1:9, 2005:12:1:8, 2005:12:1:7]

3.0 Format
• Partition header stores column names
• Row stores clustering values
• No duplicated values
[Diagram: Partition Header (Partition Key, Column Names), followed by Rows, each holding Clustering Values and Column Values]
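
To get a feel for "no duplicated values", here is a deliberately crude model that only counts the characters spent on names. It assumes the pre-3.0 layout repeats the clustering values and the column name in every cell, and that the 3.0 layout writes column names once per partition and clustering values once per row; real on-disk encoding is more involved, and the names below are made up.

```java
import java.util.Arrays;
import java.util.List;

class StorageLayoutSketch {
    // Pre-3.0 style: every cell name repeats the row's clustering values plus the column name.
    static int oldLayoutNameChars(List<String> clusteringValues, List<String> columnNames) {
        int chars = 0;
        for (String clustering : clusteringValues) {
            for (String column : columnNames) {
                chars += clustering.length() + column.length();
            }
        }
        return chars;
    }

    // 3.0 style: column names once in the partition header, clustering values once per row.
    static int newLayoutNameChars(List<String> clusteringValues, List<String> columnNames) {
        int chars = 0;
        for (String column : columnNames) {
            chars += column.length();
        }
        for (String clustering : clusteringValues) {
            chars += clustering.length();
        }
        return chars;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("2005:12:1:10", "2005:12:1:9");
        List<String> columns = Arrays.asList("name", "location");
        System.out.println(oldLayoutNameChars(rows, columns)); // 70
        System.out.println(newLayoutNameChars(rows, columns)); // 35
    }
}
```

The gap widens with more columns per row, since the old model multiplies rows by columns while the new one adds them.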

Commit Log Compression
• Pre 3.0 commit log writes
• Segments are compressed by time interval
• Higher throughput under heavy write load
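
A rough sketch of the idea: whatever has been buffered since the last sync is compressed as one chunk before it is written out. This uses the JDK's Deflater purely for illustration; the compressor Cassandra uses and its chunk framing are not shown here, and the names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class CommitLogCompressionSketch {
    // Compresses everything buffered since the last sync as one chunk.
    static byte[] compress(byte[] pending) {
        Deflater deflater = new Deflater();
        deflater.setInput(pending);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Reverses compress(); used when the log is replayed.
    static byte[] decompress(byte[] chunk) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(chunk);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && inflater.needsInput()) {
                break; // truncated chunk: stop rather than loop forever
            }
            out.write(buffer, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }
}
```

Fewer bytes per sync means less disk I/O for the same number of mutations, which is where the throughput gain under heavy writes would come from.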

Smaller but significant changes
• Direct buffer decompression of reads
• Avoiding memory allocation on Index Summary search
• Repair concurrency improvements
• Optimal CRC32 implementation selected at runtime
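
One way to picture a runtime-selected checksum: callers depend only on the `Checksum` interface, and a factory chosen at startup decides which implementation backs it. The sketch below always picks the JDK's `CRC32`; how Cassandra actually chooses is not shown, and `ChecksumSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

class ChecksumSketch {
    // Chosen once at startup; callers only ever see the Checksum interface.
    static final Supplier<Checksum> FACTORY = CRC32::new;

    static long checksum(byte[] data) {
        Checksum crc = FACTORY.get();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Standard CRC-32 check value for "123456789": cbf43926
        System.out.println(Long.toHexString(checksum(data)));
    }
}
```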

Role Based Access Control
• Expands on user-based auth in 1.2
• Requires internal auth to be enabled
CREATE ROLE supervisor; 
 
GRANT MODIFY ON user_credentials TO supervisor;

When will it ship?
Maybe June
When 8099 is finished, it ships

Thank you!
Questions
Follow me on Twitter for more
@PatrickMcFadin
