CASSANDRA SUMMIT 2016
CQL PERFORMANCE WITH APACHE CASSANDRA 3.0
Aaron Morton
@aaronmorton
CEO
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
How We Got Here
Storage Engine 3.0
Read Path
How We Got Here
Way back in 2011…
2011
Blog: Cassandra Query Plans
http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html
2012
Talk: Technical Deep Dive - Query Performance
https://www.youtube.com/watch?v=gomOKhMV0zc
2012
Explain Read & Write
performance in 45 minutes.
Skip Forward to 2016
Blog: Introduction To The Apache Cassandra 3.x Storage Engine
http://thelastpickle.com/blog/2016/03/04/introductiont-to-the-apache-cassandra-3-storage-engine.html
Skip Forward to 2016
“Why don’t I do another talk
about Cassandra
performance?”
Skip Forward to 2016
It was a busy 4 years…
Skip Forward to 2016
CQL 3, Collection Types,
UDTs, UDFs, UDAs,
Materialised Views, Triggers,
SASI,…
Skip Forward to 2016
Explain Read & Write
performance in 45 minutes.
So Let's Avoid
CQL 3, Collection Types,
UDTs, UDFs, UDAs,
Materialised Views, Triggers,
SASI,…
How We Got Here
Storage Engine 3.0
Read Path
High Level Storage Engine 3.0
Storage Engine 3.0 Files
Data.db
Index.db
Filter.db
Storage Engine 3.0 Files
CompressionInfo.db
Statistics.db
Digest.crc32
CRC.db
Summary.db
TOC.txt
CQL Recap
create table my_table (
partition_1 text,
cluster_1 text,
foo text,
bar text,
baz text,
PRIMARY KEY (partition_1, cluster_1)
);
CQL Recap
WARNING:
FAKE DATA AHEAD
CQL With Thrift Pre 3.0
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
CQL Pre 3.0
Clustering Keys Repeated
Column Names Repeated
Timestamps Repeated
Fixed Width Encoding
No Knowledge Of Row Contents
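A rough sketch of where the repetition comes from: before 3.0, each CQL row was flattened into one cell per column, and every cell name carried the clustering value again. The class and method names below are hypothetical and only mimic the fake data shown in the Thrift listing; this is not Cassandra code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LegacyCellNames {
    // Flatten one CQL row into pre-3.0 style cell names: a row marker
    // ("<clustering>:") followed by one cell per column ("<clustering>:<column>").
    static List<String> cellNames(String clustering, Map<String, String> columns) {
        List<String> names = new ArrayList<>();
        names.add(clustering + ":");
        for (String column : columns.keySet()) {
            names.add(clustering + ":" + column);
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("foo", "some foo");
        row.put("bar", "and bar");
        row.put("baz", "no baz");
        // Every name printed here repeats the clustering value "clust_a".
        for (String name : cellNames("clust_a", row)) {
            System.out.println(name);
        }
    }
}
```

With three regular columns the clustering value is written four times for a single row, and each of those cells also carries its own timestamp.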
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Metadata
Cell Presence
SerializationHeader
For each SSTable*.
Stored in each SSTable.
Held in memory.
SerializationHeader
public class SerializationHeader
{
    private final AbstractType<?> keyType;
    private final List<AbstractType<?>> clusteringTypes;
    private final PartitionColumns columns;
    private final EncodingStats stats;
    …
}
EncodingStats
Collected on the fly by the
Memtable.
EncodingStats
public class EncodingStats
{
    public final long minTimestamp;
    public final int minLocalDeletionTime;
    public final int minTTL;
    …
}
SerializationHeader
public class SerializationHeader
{
    public void writeTimestamp(long timestamp, DataOutputPlus out) throws IOException
    {
        out.writeUnsignedVInt(timestamp - stats.minTimestamp);
    }
    …
}
VIntCoding
public class VIntCoding
{
    public static void writeUnsignedVInt(long value, DataOutput output) throws IOException
    {
        int size = VIntCoding.computeUnsignedVIntSize(value);
        if (size == 1)
        {
            output.write((int)value);
            return;
        }
        output.write(VIntCoding.encodeVInt(value, size), 0, size);
    }
    …
}
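To see why delta encoding and variable int encoding work well together, here is a minimal standalone sketch. It assumes a vint layout in which the leading 1-bits of the first byte give the number of extra bytes that follow; `VIntSketch` and its methods are illustrative names rather than the Cassandra classes, and the timestamps are made up.

```java
class VIntSketch {
    // Bytes needed for an unsigned vint: sizes 1..8 carry 7 payload bits per
    // byte, and the 9-byte form carries a full 64 bits.
    static int unsignedVIntSize(long value) {
        int bits = 64 - Long.numberOfLeadingZeros(value | 1); // at least 1 bit
        return bits > 56 ? 9 : (bits + 6) / 7;
    }

    // Big-endian payload, with the count of extra bytes marked by leading
    // 1-bits in the first byte (assumed layout, see the lead-in).
    static byte[] encodeUnsignedVInt(long value) {
        int size = unsignedVIntSize(value);
        byte[] out = new byte[size];
        long remaining = value;
        for (int i = size - 1; i >= 0; i--) {
            out[i] = (byte) remaining;
            remaining >>>= 8;
        }
        int extraBytes = size - 1;
        out[0] |= (byte) (0xFF << (8 - extraBytes));
        return out;
    }

    public static void main(String[] args) {
        long minTimestamp = 1_470_000_000_000_000L; // hypothetical SSTable minimum (microseconds)
        long cellTimestamp = minTimestamp + 250;     // hypothetical cell timestamp
        System.out.println("fixed width long: " + Long.BYTES + " bytes");
        System.out.println("delta as vint:    "
                + unsignedVIntSize(cellTimestamp - minTimestamp) + " bytes");
    }
}
```

The delta is small because timestamps within one SSTable tend to sit close to the minimum recorded in EncodingStats, so a value that would take 8 bytes as a fixed-width long fits in 2 bytes here.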
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Metadata
Cell Presence
CQL With Thrift Pre 3.0
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
Storage Engine 3.0 Data.db
Storage Engine 3.0 Partition Header
Storage Engine 3.0 Row
Storage Engine 3.0 Clustering Block
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Cell Metadata
Cell Presence
CQL With Thrift Pre 3.0
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
Aggregated Cell Metadata
Only store the Cell Timestamp, TTL, and
Local Deletion Time if they differ from
the Row's.
Aggregated Cell Metadata
Simple Cell Component                       Byte Size
Flags                                       1
Optional Cell Timestamp (delta)             varint 1…n
Optional Cell Local Deletion Time (delta)   varint 1…n
Optional Cell TTL (delta)                   varint 1…n
Optional Cell Value                         See Below

Fixed Width Cell Value                      Byte Size
Value                                       1…n

Variable Width Cell Value                   Byte Size
Value Length                                varint 1…n
Value                                       1…n
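One way to picture the Flags byte and the optional fields: the writer compares each cell's metadata with the row's and only emits what differs. The sketch below is an illustration under assumptions. The flag names and bit values are hypothetical, and it treats TTL and Local Deletion Time as travelling together under a single flag.

```java
class CellFlagsSketch {
    // Hypothetical flag bits for illustration; real on-disk values may differ.
    static final int USE_ROW_TIMESTAMP = 0x08;
    static final int USE_ROW_TTL = 0x10;

    // Set a flag whenever the cell can reuse the row-level value.
    static int flags(long cellTimestamp, int cellTtl, long rowTimestamp, int rowTtl) {
        int flags = 0;
        if (cellTimestamp == rowTimestamp) flags |= USE_ROW_TIMESTAMP;
        if (cellTtl == rowTtl) flags |= USE_ROW_TTL;
        return flags;
    }

    // Number of optional metadata fields that still have to be written.
    static int metadataFieldsWritten(int flags) {
        int fields = 0;
        if ((flags & USE_ROW_TIMESTAMP) == 0) fields += 1; // timestamp delta
        if ((flags & USE_ROW_TTL) == 0) fields += 2;       // TTL and local deletion time deltas
        return fields;
    }

    public static void main(String[] args) {
        int same = flags(100L, 0, 100L, 0);
        int newer = flags(101L, 0, 100L, 0);
        System.out.println("same as row: " + metadataFieldsWritten(same) + " optional fields");
        System.out.println("newer cell:  " + metadataFieldsWritten(newer) + " optional fields");
    }
}
```

When all cells in a row were written by the same INSERT, every cell shares the row's metadata and only the single Flags byte is spent per cell.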
Apache Cassandra 3.0 Storage Engine
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Cell Metadata
Cell Presence
Cell Presence
The SSTable stores the list of Cells in this
SSTable.
Each Row stores a bitmap of the Cells in this
Row, referencing the SSTable's list.
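A small sketch of the idea, assuming bit i of the bitmap means "the i-th column in the SSTable's list is present in this row". The actual on-disk encoding may differ (for example in which bit value marks presence), and `CellPresenceSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CellPresenceSketch {
    // Build a bitmap of which SSTable-level columns appear in a row.
    static long presenceBitmap(List<String> sstableColumns, Set<String> rowColumns) {
        if (sstableColumns.size() > 64) {
            throw new IllegalArgumentException("sketch only handles up to 64 columns");
        }
        long bitmap = 0L;
        for (int i = 0; i < sstableColumns.size(); i++) {
            if (rowColumns.contains(sstableColumns.get(i))) {
                bitmap |= 1L << i;
            }
        }
        return bitmap;
    }

    // Recover the column names from the bitmap and the SSTable-level list.
    static List<String> columnsPresent(List<String> sstableColumns, long bitmap) {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < sstableColumns.size(); i++) {
            if ((bitmap & (1L << i)) != 0) {
                present.add(sstableColumns.get(i));
            }
        }
        return present;
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("bar", "baz", "foo");
        Set<String> row = new HashSet<>(Arrays.asList("foo", "bar"));
        long bitmap = presenceBitmap(header, row);
        System.out.println(Long.toBinaryString(bitmap) + " -> " + columnsPresent(header, bitmap));
    }
}
```

The point of the design is that column names are written once per SSTable, and a row only needs a few bits to say which of them it holds.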
Storage Engine 3.0 Row
Remember Where We Came From
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
How We Got Here
Storage Engine 3.0
Read Path
Read Paths
Ignoring Index Read paths.
Read Commands
PartitionRangeReadCommand
SinglePartitionReadCommand
AbstractClusteringIndexFilter
ClusteringIndexNamesFilter
(When we know the column names.)
ClusteringIndexSliceFilter
(When we do not know the column names.)
ClusteringIndexNamesFilter
When we know what
Columns to select, we know
when the search is over.
ClusteringIndexNamesFilter
1. Get Partition From Memtables.
2. Filter named columns into a temporary
result.
3. Select SSTables that may contain Partition
Key.
4. Order SSTables by descending max timestamp.
5. Read from SSTables in order.
Names Filter Short Circuits
If result has a Partition Deletion
newer than next SSTable max
timestamp.
Stop Search.
Names Filter Short Circuits
If all Columns have been read and the max
timestamp of the next SSTable is less than the
min timestamp of the selected Columns.
Stop Search.
Names Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.
Names Filter Short Circuits
If SSTable Cell not in search set.
Skip reading value.
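The ordering step and the "all Columns read" short circuit can be sketched roughly as follows. This is a simplified model, not the Cassandra read path: it ignores Memtables, Partition Deletions and clustering ranges, and every type and name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NamesFilterSketch {
    static class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    static class SSTable {
        final long maxTimestamp;
        final Map<String, Cell> cells = new HashMap<>(); // column name -> cell for the searched row
        SSTable(long maxTimestamp) { this.maxTimestamp = maxTimestamp; }
        SSTable with(String column, String value, long timestamp) {
            cells.put(column, new Cell(value, timestamp));
            return this;
        }
    }

    static class Result {
        final Map<String, Cell> cells = new HashMap<>();
        int sstablesRead;
    }

    static Result read(Set<String> wanted, List<SSTable> sstables) {
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());
        Result result = new Result();
        for (SSTable sstable : ordered) {
            // Short circuit: every wanted column is found and nothing in this
            // (or any older) SSTable can be newer than what we already hold.
            if (result.cells.keySet().containsAll(wanted)
                    && sstable.maxTimestamp < minTimestamp(result.cells)) {
                break;
            }
            result.sstablesRead++;
            for (String column : wanted) {
                Cell candidate = sstable.cells.get(column);
                if (candidate == null) continue; // cells outside the search set are never looked at
                Cell current = result.cells.get(column);
                if (current == null || candidate.timestamp > current.timestamp) {
                    result.cells.put(column, candidate);
                }
            }
        }
        return result;
    }

    static long minTimestamp(Map<String, Cell> cells) {
        long min = Long.MAX_VALUE;
        for (Cell cell : cells.values()) min = Math.min(min, cell.timestamp);
        return min;
    }
}
```

Because SSTables are visited newest first, the search can stop as soon as the oldest cell already selected is newer than anything the remaining SSTables could contain.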
ClusteringIndexSliceFilter
When we do not know which
columns to select, the search
ends when it is exhausted.
ClusteringIndexSliceFilter
Used with:
Distinct.
Not all clustering columns
restricted.
ClusteringIndexSliceFilter
1. Get Partition From Memtables.
2. Create Iterators for Partitions.
3. Select SSTables that may contain Partition
Key.
4. Order SSTables by descending max timestamp.
5. Create Iterators for SSTables in order.
Slice Filter Short Circuits
If SSTable max timestamp is before
max seen Partition Deletion
timestamp.
Stop Search.
Slice Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.
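The two slice filter short circuits can be modelled in a few lines. The sketch below only decides which SSTables would get an iterator. It assumes string clustering values compared lexically and a single closed slice, and it is a simplified illustration with hypothetical names rather than the actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SliceFilterSketch {
    static final long NO_DELETION = Long.MIN_VALUE;

    static class SSTable {
        final String name;
        final long maxTimestamp;
        final long partitionDeletion; // NO_DELETION when the partition is not deleted here
        final String minClustering;
        final String maxClustering;
        SSTable(String name, long maxTimestamp, long partitionDeletion,
                String minClustering, String maxClustering) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.partitionDeletion = partitionDeletion;
            this.minClustering = minClustering;
            this.maxClustering = maxClustering;
        }
    }

    static List<String> sstablesToRead(List<SSTable> candidates, String sliceStart, String sliceEnd) {
        List<SSTable> ordered = new ArrayList<>(candidates);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());
        long maxSeenDeletion = NO_DELETION;
        List<String> toRead = new ArrayList<>();
        for (SSTable sstable : ordered) {
            // Stop: everything in this and older SSTables is shadowed by a
            // Partition Deletion that has already been seen.
            if (sstable.maxTimestamp < maxSeenDeletion) break;
            maxSeenDeletion = Math.max(maxSeenDeletion, sstable.partitionDeletion);
            // Skip: the SSTable's clustering range does not overlap the slice.
            boolean overlaps = sstable.maxClustering.compareTo(sliceStart) >= 0
                    && sstable.minClustering.compareTo(sliceEnd) <= 0;
            if (!overlaps) continue;
            toRead.add(sstable.name);
        }
        return toRead;
    }
}
```

Unlike the names filter, there is no "all Columns found" exit here: without a known set of names the search only ends early on a Partition Deletion, otherwise it runs until the SSTables are exhausted.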
Thanks.
Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com

CQL performance with Apache Cassandra 3.0 (Aaron Morton, The Last Pickle) | C* Summit 2016

  • 1. CASSANDRA SUMMIT 2016 CQL PERFORMANCE WITH APACHE CASSANDRA 3.0 Aaron Morton @aaronmorton CEO Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • 2.
  • 3. How We Got Here Storage Engine 3.0 Read Path
  • 4. How We Got Here Way back in 2011…
  • 5. 2011 Blog: Cassandra Query Plans http://thelastpickle.com/blog/2011/07/04/ Cassandra-Query-Plans.html
  • 6. 2012 Talk:Technical Deep Dive - Query Performance https://www.youtube.com/watch? v=gomOKhMV0zc
  • 7. 2012 Explain Read & Write performance in 45 minutes.
  • 8. Skip Forward to 2016 Blog: Introduction To The Apache Cassandra 3.x Storage Engine http://thelastpickle.com/blog/2016/03/04/introductiont-to- the-apache-cassandra-3-storage-engine.html
  • 9. Skip Forward to 2016 “Why don’t I do another talk about Cassandra performance.”
  • 10. Skip Forward to 2016 It was a busy 4 years…
  • 11. Skip Forward to 2016 CQL 3, Collection Types, UDTs, UDF’s, UDA’s, MaterialisedViews,Triggers, SASI,…
  • 12. Skip Forward to 2016 Explain Read & Write performance in 45 minutes.
  • 13. So Lets Avoid CQL 3, Collection Types, UDTs, UDF’s, UDA’s, MaterialisedViews,Triggers, SASI,…
  • 14. How We Got Here Storage Engine 3.0 Read Path
  • 15. High Level Storage Engine 3.0
  • 16. Storage Engine 3.0 Files Data.db Index.db Filter.db
  • 17. Storage Engine 3.0 Files CompressionInfo.db Statistics.db Digest.crc32 CRC.db Summary.db TOC.txt
  • 18. CQL Recap create table my_table ( partition_1 text, cluster_1 text, foo text, bar text, baz text, PRIMARY KEY (partition_1, cluster_1) );
  • 20. CQL WithThrift Pre 3.0 [default@dev] list my_table; ------------------- RowKey: part_a => (column=clust_a:, value=, timestamp=1357…739000) => (column=clust_a:foo, value=some foo, timestamp=1357…739000) => (column=clust_a:bar, value=and bar, timestamp=1357…739000) => (column=clust_a:baz, value=no baz, timestamp=1357…739000) => (column=clust_b:, value=, timestamp=1357…739000) => (column=clust_b:foo, value=no foo, timestamp=1357…739000) => (column=clust_b:bar, value=no bar, timestamp=1357…739000) => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 21. CQL Pre 3.0 Clustering Keys Repeated Column Names Repeated Timestamps Repeated Fixed Width Encoding No Knowledge Of Row Contents
  • 22. Storage Engine 3.0 Improvements Delta Encoding Variable Int Encoding Clustering Written Once Aggregated Metadata Cell Presence
  • 23. SerializationHeader For each SSTable*. Stored in each SSTable. Held in memory.
  • 24. SerializationHeader public class SerializationHeader { private final AbstractType<?> keyType; private final List<AbstractType<?>> clusteringTypes; private final PartitionColumns columns; private final EncodingStats stats; … }
  • 25. EncodingStats Collected on the fly by the Memtable.
  • 26. EncodingStats
    public class EncodingStats {
        public final long minTimestamp;
        public final int minLocalDeletionTime;
        public final int minTTL;
        …
    }
  • 27. SerializationHeader
    public class SerializationHeader {
        public void writeTimestamp(long timestamp, DataOutputPlus out) throws IOException {
            out.writeUnsignedVInt(timestamp - stats.minTimestamp);
        }
        …
    }
  • 28. VIntCoding
    public class VIntCoding {
        public static void writeUnsignedVInt(long value, DataOutput output) throws IOException {
            int size = VIntCoding.computeUnsignedVIntSize(value);
            if (size == 1) {
                output.write((int)value);
                return;
            }
            output.write(VIntCoding.encodeVInt(value, size), 0, size);
        }
    }
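To see why delta encoding and variable-length integers work well together, here is a minimal sketch. It uses a generic base-128 varint (7 payload bits per byte), which is an assumption made for illustration and is not Cassandra's own `VIntCoding` byte layout. The class name, method names and timestamp values are all hypothetical.

```java
import java.io.ByteArrayOutputStream;

final class DeltaVarintSketch {
    // Generic base-128 varint: 7 payload bits per byte, high bit set means "more bytes follow".
    // This is NOT Cassandra's VIntCoding format, only an illustration of the idea.
    static byte[] writeUnsignedVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Reverses writeUnsignedVarint for a buffer that holds exactly one varint.
    static long readUnsignedVarint(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    // Delta encoding: store only the distance from the smallest timestamp in the SSTable.
    static byte[] writeTimestamp(long timestamp, long minTimestamp) {
        return writeUnsignedVarint(timestamp - minTimestamp);
    }

    // The reader needs the same minTimestamp (kept in the header) to rebuild the value.
    static long readTimestamp(byte[] bytes, long minTimestamp) {
        return minTimestamp + readUnsignedVarint(bytes);
    }
}
```

With a made-up minimum of 1_400_000_000_000_000 and a timestamp 5,000 microseconds later, the delta fits in 2 bytes in this sketch, while the full value needs 8.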
  • 29. Storage Engine 3.0 Improvements: Delta Encoding; Variable Int Encoding; Clustering Written Once; Aggregated Metadata; Cell Presence
  • 30. CQL With Thrift Pre 3.0
    [default@dev] list my_table;
    -------------------
    RowKey: part_a
    => (column=clust_a:, value=, timestamp=1357…739000)
    => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
    => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
    => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
    => (column=clust_b:, value=, timestamp=1357…739000)
    => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
    => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
    => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 32. Storage Engine 3.0 Partition Header
  • 34. Storage Engine 3.0 Clustering Block
  • 35. Storage Engine 3.0 Improvements: Delta Encoding; Variable Int Encoding; Clustering Written Once; Aggregated Cell Metadata; Cell Presence
  • 36. CQL With Thrift Pre 3.0
    [default@dev] list my_table;
    -------------------
    RowKey: part_a
    => (column=clust_a:, value=, timestamp=1357…739000)
    => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
    => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
    => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
    => (column=clust_b:, value=, timestamp=1357…739000)
    => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
    => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
    => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 37. Aggregated Cell Metadata: Only store the Cell Timestamp, TTL, and Local Deletion Time if they differ from the Row's.
  • 38. Aggregated Cell Metadata
    Simple Cell
      Component                                          Byte Size
      Flags                                              1
      Optional Cell Timestamp (delta) varint             1…n
      Optional Cell Local Deletion Time (delta) varint   1…n
      Optional Cell TTL (delta) varint                   1…n
      Optional Cell Value                                See Below
    Fixed Width Cell Value                               Byte Size
      Value                                              1…n
    Variable Width Cell Value                            Byte Size
      Value Length varint                                1…n
      Value                                              1…n
    Apache Cassandra 3.0 Storage Engine
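The flags byte is what makes the metadata fields optional: a bit in it can say "same as the Row", and the field is then left out of the cell. The following is a rough sketch of that idea only. The flag names and bit values are assumptions chosen for illustration, and Local Deletion Time is left out to keep it short.

```java
final class CellMetadataSketch {
    // Illustrative flag bits; names and values are assumptions for this sketch.
    static final int USE_ROW_TIMESTAMP = 0x08;
    static final int USE_ROW_TTL = 0x10;

    // Build the flags byte by comparing the cell's metadata with the row's.
    static int flagsFor(long cellTimestamp, int cellTtl, long rowTimestamp, int rowTtl) {
        int flags = 0;
        if (cellTimestamp == rowTimestamp) flags |= USE_ROW_TIMESTAMP;
        if (cellTtl == rowTtl) flags |= USE_ROW_TTL;
        return flags;
    }

    // Number of optional metadata fields that still have to be written for the cell.
    static int optionalFieldsWritten(int flags) {
        int written = 0;
        if ((flags & USE_ROW_TIMESTAMP) == 0) written++;
        if ((flags & USE_ROW_TTL) == 0) written++;
        return written;
    }
}
```

When every cell in a row was written by the same INSERT, both flags are set and each cell carries only its flags byte and its value.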
  • 39. Storage Engine 3.0 Improvements: Delta Encoding; Variable Int Encoding; Clustering Written Once; Aggregated Cell Metadata; Cell Presence
  • 40. Cell Presence: The SSTable stores a list of the Cells in this SSTable. Each Row stores a bitmap of the Cells in this Row, with reference to the SSTable's list.
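A small sketch of the bitmap idea, assuming a single `long` is enough to cover the columns. The class and method names are hypothetical, and the real on-disk encoding may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class CellPresenceSketch {
    // Bit i is set when the i-th column of the SSTable-level list has a cell in this row.
    static long encode(List<String> sstableColumns, Set<String> rowColumns) {
        if (sstableColumns.size() > 64) {
            throw new IllegalArgumentException("sketch supports at most 64 columns");
        }
        long bitmap = 0L;
        for (int i = 0; i < sstableColumns.size(); i++) {
            if (rowColumns.contains(sstableColumns.get(i))) bitmap |= 1L << i;
        }
        return bitmap;
    }

    // Recover the column names for a row from the bitmap plus the SSTable-level list.
    static List<String> decode(List<String> sstableColumns, long bitmap) {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < sstableColumns.size(); i++) {
            if ((bitmap & (1L << i)) != 0) present.add(sstableColumns.get(i));
        }
        return present;
    }
}
```

The column names are written once per SSTable; each row then pays a few bits instead of repeating every name.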
  • 42. Remember Where We Came From
    [default@dev] list my_table;
    -------------------
    RowKey: part_a
    => (column=clust_a:, value=, timestamp=1357…739000)
    => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
    => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
    => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
    => (column=clust_b:, value=, timestamp=1357…739000)
    => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
    => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
    => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 43. How We Got Here, Storage Engine 3.0, Read Path
  • 46. AbstractClusteringIndexFilter: ClusteringIndexNamesFilter (when we know the column names); ClusteringIndexSliceFilter (when we do not know the column names).
  • 47. ClusteringIndexNamesFilter When we know what Columns to select, we know when the search is over.
  • 48. ClusteringIndexNamesFilter 1. Get Partition from Memtables. 2. Filter named columns into a temporary result. 3. Select SSTables that may contain the Partition Key. 4. Order by max timestamp, descending. 5. Read from SSTables in order.
  • 49. Names Filter Short Circuits: If the result has a Partition Deletion newer than the next SSTable's max timestamp, stop the search.
  • 50. Names Filter Short Circuits: If all Columns have been read and the max timestamp of the next SSTable is less than the min timestamp of the selected Columns, stop the search.
  • 51. Names Filter Short Circuits: If the search clustering value is not within the clustering range of the SSTable, skip the SSTable.
  • 52. Names Filter Short Circuits: If an SSTable Cell is not in the search set, skip reading its value.
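The "all Columns read" short circuit can be sketched as follows. This is a simplified model, not the Cassandra read path: the `SSTable`, `Cell` and `Result` types are hypothetical, and Memtables, partition deletions and clustering ranges are left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NamesFilterSketch {
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    // Hypothetical in-memory stand-in for one SSTable's view of a single row.
    static final class SSTable {
        final long maxTimestamp;
        final Map<String, Cell> cells;
        SSTable(long maxTimestamp, Map<String, Cell> cells) {
            this.maxTimestamp = maxTimestamp;
            this.cells = cells;
        }
    }

    static final class Result {
        final Map<String, Cell> cells = new HashMap<>();
        int sstablesRead = 0;
    }

    static Result read(List<SSTable> sstables, Set<String> wanted) {
        // Newest SSTable first, judged by its max timestamp.
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        Result result = new Result();
        for (SSTable sstable : ordered) {
            if (result.cells.keySet().containsAll(wanted)) {
                long minSelected = Long.MAX_VALUE;
                for (Cell c : result.cells.values()) minSelected = Math.min(minSelected, c.timestamp);
                // Nothing in this (or any older) SSTable can be newer than what we hold.
                if (sstable.maxTimestamp < minSelected) break;
            }
            result.sstablesRead++;
            for (String name : wanted) {
                Cell candidate = sstable.cells.get(name);
                Cell current = result.cells.get(name);
                if (candidate != null && (current == null || candidate.timestamp > current.timestamp)) {
                    result.cells.put(name, candidate);
                }
            }
        }
        return result;
    }
}
```

The ordering is what makes the early exit safe: once the remaining SSTables are all older than every selected cell, none of them can change the answer.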
  • 53. ClusteringIndexSliceFilter When we do not know which columns to select, the search ends when it is exhausted.
  • 55. ClusteringIndexSliceFilter 1. Get Partition from Memtables. 2. Create Iterators for Partitions. 3. Select SSTables that may contain the Partition Key. 4. Order by max timestamp, descending. 5. Create Iterators for SSTables in order.
  • 56. Slice Filter Short Circuits: If the SSTable's max timestamp is before the max seen Partition Deletion timestamp, stop the search.
  • 57. Slice Filter Short Circuits: If the search clustering value is not within the clustering range of the SSTable, skip the SSTable.
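The two short circuits can be combined into one loop, sketched below under simplifying assumptions. The `SSTable` type is hypothetical, clustering values are plain strings, and the partition deletion timestamp is treated as if it were cheap metadata on each SSTable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SliceFilterSketch {
    static final long NO_DELETION = Long.MIN_VALUE;

    // Hypothetical summary of what one SSTable knows about the partition being read.
    static final class SSTable {
        final String name;
        final long maxTimestamp;
        final long partitionDeletion; // NO_DELETION when the partition is not deleted here
        final String minClustering;
        final String maxClustering;
        SSTable(String name, long maxTimestamp, long partitionDeletion,
                String minClustering, String maxClustering) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.partitionDeletion = partitionDeletion;
            this.minClustering = minClustering;
            this.maxClustering = maxClustering;
        }
    }

    // Returns the names of the SSTables that would get an iterator for the slice.
    static List<String> sstablesToIterate(List<SSTable> sstables, String sliceStart, String sliceEnd) {
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        long newestDeletion = NO_DELETION;
        List<String> selected = new ArrayList<>();
        for (SSTable sstable : ordered) {
            // Everything in this and all older SSTables is shadowed by the deletion: stop.
            if (sstable.maxTimestamp < newestDeletion) break;
            newestDeletion = Math.max(newestDeletion, sstable.partitionDeletion);

            // Skip SSTables whose clustering range cannot overlap the requested slice.
            boolean overlaps = sstable.minClustering.compareTo(sliceEnd) <= 0
                    && sstable.maxClustering.compareTo(sliceStart) >= 0;
            if (!overlaps) continue;
            selected.add(sstable.name);
        }
        return selected;
    }
}
```

Unlike the names filter, there is no "all columns found" exit here, so these two checks are the main ways a slice read avoids touching an SSTable.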
  • 59. Aaron Morton @aaronmorton Co-Founder & Principal Consultant www.thelastpickle.com