SlideShare a Scribd company logo
High Performance NoSQL Masterclass
Scaling for Performance
Felipe Cardeneti Mendes
High Performance NoSQL Masterclass
Felipe Cardeneti Mendes
● Solution Architect at ScyllaDB
● Published Author
● Linux and Open Source enthusiast
High Performance NoSQL Masterclass
Agenda
● About ScyllaDB
● Getting started with NoSQL databases
● Deployment and Production Readiness
● Observability Tips and Tricks
High Performance NoSQL Masterclass
About ScyllaDB
High Performance NoSQL Masterclass
ScyllaDB Database Architecture
Horizontal & Vertical Scaling
Built in C++
(no Java overhead)
System and Data
Center Aware
Sharding Per Core Shard-Aware Drivers
Auto-Performance
Tuning
Network
Processor NUMA
Storage
Unique Close-to-Metal Architecture
High Performance NoSQL Masterclass
Shard per Core
Threads Shards
High Performance NoSQL Masterclass
Asynchronous Architecture
Request Answer
Request Answer
Waiting for response
Time Savings
Synchronous
architecture
Asynchronous
architecture
High Performance NoSQL Masterclass
Specialized Cache
Cassandra ScyllaDB
Key
cache
Row
cache
Linux page cache
SSTables
Unified cache
SSTables
Complex Tuning
On-heap /
Off-heap
High Performance NoSQL Masterclass
Ecosystem Compatibility
+ CQL native protocol
+ JMX management protocol
+ Management command line
/REST
+ SSTable file format
+ Configuration file format
+ CQL language
High Performance NoSQL Masterclass
Getting Started with NoSQL
Databases
High Performance NoSQL Masterclass
Modern Business Challenges
Keep CapEx
& OpEx in check
Reduce complexity
Scale as the
data grows
Queries in milliseconds
Leverage massive
amounts of data
Predictable, consistent
performance
High Performance NoSQL Masterclass
Oh… The CAP Theorem
High Performance NoSQL Masterclass
Workload Types
High Performance NoSQL Masterclass
Workload Types
Decision Support
+ More complex queries - large amounts
+ Latency important but not critical
+ Seconds to hours
Fundamental business tasks
+ Simple queries
+ Latency critical
+ Milliseconds per transaction
OLAP
Time
Complexity
of
Query
Time
Complexity
of
Query
OLTP
High Performance NoSQL Masterclass
OLAP Characteristics
Main Characteristics
+ Bound Concurrency
+ Scans and aggregations
+ Rely on MapReduce paradigms
OLAP
Time
Complexity
of
Query
Examples
+ How many users are from the US?
+ How many Twitter posts happened in 2022?
+ Which devices haven’t communicated back
within the past 1 hour?
High Performance NoSQL Masterclass
OLTP Characteristics
Main Characteristics
+ Unbound Concurrency
+ Designed for speed and simplicity
+ Often user facing APIs
Examples
+ What’s the last time an user logged in?
+ What have been the last 10 temperature
measurements for a given device?
+ How many likes a given posting has?
Time
Complexity
of
Query
OLTP
High Performance NoSQL Masterclass
Why not both? Meet Workload Prioritization
100 shares
Ratio = 100:100 (1:1) means equal shares of
processing/resources to complete tasks
Ratio = 100:50 (2:1) means 2X as many shares of processing/resources
for Transactions to complete tasks compared to Analytics
100 shares
100 shares
50 shares
OLTP
OLAP
Which Task to Run
Wide Column Databases Write Path
LSM storage engine’s write path:
18
Writes
commit log
Wide Column Databases Write Path
LSM storage engine’s write path:
19
Writes
commit log
Wide Column Databases Write Path
LSM storage engine’s write path:
20
Writes
commit log
Wide Column Databases Write Path
LSM storage engine’s write path:
21
Writes
commit log
compaction
Wide Column Databases Write Path
LSM storage engine’s write path:
22
Writes
commit log
compaction
Wide Column Databases Write Path
LSM storage engine’s write path:
23
Writes
commit log
What is compaction?
LSM storage engine’s write path:
24
Hidden Gems
+ This technique of keeping sorted files and merging them is well-known and
often called Log-Structured Merge (LSM) Tree
+ Published in 1996, earliest popular application known is the Lucene search
engine, 1999
Characteristics
+ High performance write.
+ Immediately readable.
+ Reasonable performance for read.
What is a compaction strategy?
LSM storage engine’s write path:
25
▪ Which files to compact, and when?
▪ This is called the compaction strategy
▪ The goal of the strategy is low amplification:
○ Avoid read requests needing many sstables.
• read amplification
○ Avoid overwritten/deleted/expired data staying on disk.
○ Avoid excessive temporary disk space needs
• space amplification
○ Avoid compacting the same data again and again.
• write amplification
Which one to choose?
LSM storage engine’s write path:
26
Know your workload
High Performance NoSQL Masterclass
Deployment and Production
Readiness
It all starts with Data Modeling
LSM storage engine’s write path:
28
Do’s
+ Denormalize
+ Query oriented approach
+ High data distribution / cardinality
Dont’s
+ Create hotspots
+ Large partitions/rows/cells/collections/etc
+ Low cardinality tables/views/indexes
Test, test, test…!
LSM storage engine’s write path:
29
Unit Testing
+ Test your workload and access patterns in a Docker container
+ Use specialized stress tools to simulate workload
○ cassandra-stress
○ nosqlbench
○ YCSB
+ OBSERVE the results (more on that later)
Application Development
LSM storage engine’s write path:
30
Functional Testing
+ Use Prepared Statements
+ Configure your routing and load balancing policy correctly
○ DCAware and TokenAware policies
○ ShardAware Drivers if using ScyllaDB!
+ Make use of Asynchronous APIs
+ Ensure your client is NOT a bottleneck
+ Paging is important: Ensure you adjust it right
Test, test, test…!
LSM storage engine’s write path:
31
Readiness Testing
+ What’s the unreplicated data set size?
+ How many operations per second do I need to achieve?
○ Out of these, what are the reads vs writes distribution?
○ What’s the average payload size?
○ What are my latency requirements?
+ How many regions should it replicate to?
+ Do I need indexes or views to satisfy my queries?
+ What is/are the target deployment location(s)?
+ Is the use case growth predictable or unpredictable?
+ What are the data retention requirements?
+ Is the use case storage or CPU bound?
Oh mighty sizing…!
LSM storage engine’s write path:
32
A Sizing Exercise
+ 500k ops/sec with 1KB rows
+ 5TB data set size
+ P99 reads and writes < 10ms
+ Target deployment region: AWS
Simple Math
+ RF=3 / 6 nodes * 5TB = 2.5TB per node
+ 500k ops/sec / 12,5K ops/core = 40 physical cores
+ Result: 6 nodes of i4i.8xlarge
High Performance NoSQL Masterclass
Observability Tips and Tricks
High Performance NoSQL Masterclass
ScyllaDB Monitoring Architecture
High Performance NoSQL Masterclass
ScyllaDB Monitoring Architecture
High Performance NoSQL Masterclass
Keep in touch!
Felipe Cardeneti Mendes
Solutions Architect
ScyllaDB
felipemendes@scylladb.co
m
Find me on LinkedIn

More Related Content

What's hot

Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark Summit
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
YugabyteDB
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
Jose De La Rosa
 
Introducing Scylla Cloud
Introducing Scylla CloudIntroducing Scylla Cloud
Introducing Scylla Cloud
ScyllaDB
 
Cassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesCassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary Differences
ScyllaDB
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
Yoshinori Matsunobu
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
Codership Oy - Creators of Galera Cluster
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
Christian Johannsen
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
StreamNative
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
Databricks
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScale
MariaDB plc
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
PostgreSQL-Consulting
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
Altinity Ltd
 
Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses Consistency
ScyllaDB
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
DataStax
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Databricks
 
Common Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseCommon Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta Lakehouse
Databricks
 
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterMySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
Kenny Gryp
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
Cloudera, Inc.
 
How to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your NeedsHow to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your Needs
ScyllaDB
 

What's hot (20)

Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 
Introducing Scylla Cloud
Introducing Scylla CloudIntroducing Scylla Cloud
Introducing Scylla Cloud
 
Cassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary DifferencesCassandra vs. ScyllaDB: Evolutionary Differences
Cassandra vs. ScyllaDB: Evolutionary Differences
 
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
 
Introduction to Galera Cluster
Introduction to Galera ClusterIntroduction to Galera Cluster
Introduction to Galera Cluster
 
Apache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek BerlinApache Cassandra at the Geek2Geek Berlin
Apache Cassandra at the Geek2Geek Berlin
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
 
MariaDB MaxScale
MariaDB MaxScaleMariaDB MaxScale
MariaDB MaxScale
 
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
How does PostgreSQL work with disks: a DBA's checklist in detail. PGConf.US 2015
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses Consistency
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
 
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
Improving SparkSQL Performance by 30%: How We Optimize Parquet Pushdown and P...
 
Common Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta LakehouseCommon Strategies for Improving Performance on Your Delta Lakehouse
Common Strategies for Improving Performance on Your Delta Lakehouse
 
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & ClusterMySQL Database Architectures - InnoDB ReplicaSet & Cluster
MySQL Database Architectures - InnoDB ReplicaSet & Cluster
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
How to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your NeedsHow to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your Needs
 

Similar to Scaling for Performance

Experience sql server on l inux and docker
Experience sql server on l inux and dockerExperience sql server on l inux and docker
Experience sql server on l inux and docker
Bob Ward
 
Brk2051 sql server on linux and docker
Brk2051 sql server on linux and dockerBrk2051 sql server on linux and docker
Brk2051 sql server on linux and docker
Bob Ward
 
How SQL Server 2016 SP1 Changes the Game
How SQL Server 2016 SP1 Changes the GameHow SQL Server 2016 SP1 Changes the Game
How SQL Server 2016 SP1 Changes the Game
PARIKSHIT SAVJANI
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency Database
ScyllaDB
 
What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0
ScyllaDB
 
Gs08 modernize your data platform with sql technologies wash dc
Gs08 modernize your data platform with sql technologies   wash dcGs08 modernize your data platform with sql technologies   wash dc
Gs08 modernize your data platform with sql technologies wash dc
Bob Ward
 
The roadmap for sql server 2019
The roadmap for sql server 2019The roadmap for sql server 2019
The roadmap for sql server 2019
Javier Villegas
 
SQL Server It Just Runs Faster
SQL Server It Just Runs FasterSQL Server It Just Runs Faster
SQL Server It Just Runs Faster
Bob Ward
 
Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2
Amazon Web Services
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDB
I Goo Lee
 
Using Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” serviceUsing Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” service
DoKC
 
In-memory ColumnStore Index
In-memory ColumnStore IndexIn-memory ColumnStore Index
In-memory ColumnStore Index
SolidQ
 
AskTom: How to Make and Test Your Application "Oracle RAC Ready"?
AskTom: How to Make and Test Your Application "Oracle RAC Ready"?AskTom: How to Make and Test Your Application "Oracle RAC Ready"?
AskTom: How to Make and Test Your Application "Oracle RAC Ready"?
Markus Michalewicz
 
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Amazon Web Services
 
Tips to drive maria db cluster performance for nextcloud
Tips to drive maria db cluster performance for nextcloudTips to drive maria db cluster performance for nextcloud
Tips to drive maria db cluster performance for nextcloud
Severalnines
 
SQL Server 2008 Integration Services
SQL Server 2008 Integration ServicesSQL Server 2008 Integration Services
SQL Server 2008 Integration Services
Eduardo Castro
 
NoSQL_Night
NoSQL_NightNoSQL_Night
NoSQL_Night
Clarence J M Tauro
 
Brk3288 sql server v.next with support on linux, windows and containers was...
Brk3288 sql server v.next with support on linux, windows and containers   was...Brk3288 sql server v.next with support on linux, windows and containers   was...
Brk3288 sql server v.next with support on linux, windows and containers was...
Bob Ward
 
AWS glue technical enablement training
AWS glue technical enablement trainingAWS glue technical enablement training
AWS glue technical enablement training
Info Alchemy Corporation
 
Using Redgate, AKS and Azure to bring DevOps to your database
Using Redgate, AKS and Azure to bring DevOps to your databaseUsing Redgate, AKS and Azure to bring DevOps to your database
Using Redgate, AKS and Azure to bring DevOps to your database
Red Gate Software
 

Similar to Scaling for Performance (20)

Experience sql server on l inux and docker
Experience sql server on l inux and dockerExperience sql server on l inux and docker
Experience sql server on l inux and docker
 
Brk2051 sql server on linux and docker
Brk2051 sql server on linux and dockerBrk2051 sql server on linux and docker
Brk2051 sql server on linux and docker
 
How SQL Server 2016 SP1 Changes the Game
How SQL Server 2016 SP1 Changes the GameHow SQL Server 2016 SP1 Changes the Game
How SQL Server 2016 SP1 Changes the Game
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency Database
 
What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0
 
Gs08 modernize your data platform with sql technologies wash dc
Gs08 modernize your data platform with sql technologies   wash dcGs08 modernize your data platform with sql technologies   wash dc
Gs08 modernize your data platform with sql technologies wash dc
 
The roadmap for sql server 2019
The roadmap for sql server 2019The roadmap for sql server 2019
The roadmap for sql server 2019
 
SQL Server It Just Runs Faster
SQL Server It Just Runs FasterSQL Server It Just Runs Faster
SQL Server It Just Runs Faster
 
Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDB
 
Using Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” serviceUsing Kubernetes to deliver a “serverless” service
Using Kubernetes to deliver a “serverless” service
 
In-memory ColumnStore Index
In-memory ColumnStore IndexIn-memory ColumnStore Index
In-memory ColumnStore Index
 
AskTom: How to Make and Test Your Application "Oracle RAC Ready"?
AskTom: How to Make and Test Your Application "Oracle RAC Ready"?AskTom: How to Make and Test Your Application "Oracle RAC Ready"?
AskTom: How to Make and Test Your Application "Oracle RAC Ready"?
 
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
Understanding AWS Database Options (DAT201) | AWS re:Invent 2013
 
Tips to drive maria db cluster performance for nextcloud
Tips to drive maria db cluster performance for nextcloudTips to drive maria db cluster performance for nextcloud
Tips to drive maria db cluster performance for nextcloud
 
SQL Server 2008 Integration Services
SQL Server 2008 Integration ServicesSQL Server 2008 Integration Services
SQL Server 2008 Integration Services
 
NoSQL_Night
NoSQL_NightNoSQL_Night
NoSQL_Night
 
Brk3288 sql server v.next with support on linux, windows and containers was...
Brk3288 sql server v.next with support on linux, windows and containers   was...Brk3288 sql server v.next with support on linux, windows and containers   was...
Brk3288 sql server v.next with support on linux, windows and containers was...
 
AWS glue technical enablement training
AWS glue technical enablement trainingAWS glue technical enablement training
AWS glue technical enablement training
 
Using Redgate, AKS and Azure to bring DevOps to your database
Using Redgate, AKS and Azure to bring DevOps to your databaseUsing Redgate, AKS and Azure to bring DevOps to your database
Using Redgate, AKS and Azure to bring DevOps to your database
 

More from ScyllaDB

ScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDCScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDC
ScyllaDB
 
MongoDB to ScyllaDB: Technical Comparison and the Path to Success
MongoDB to ScyllaDB: Technical Comparison and the Path to SuccessMongoDB to ScyllaDB: Technical Comparison and the Path to Success
MongoDB to ScyllaDB: Technical Comparison and the Path to Success
ScyllaDB
 
Real-Time or Analytics Workloads... Why Not Both?
Real-Time or Analytics Workloads... Why Not Both?Real-Time or Analytics Workloads... Why Not Both?
Real-Time or Analytics Workloads... Why Not Both?
ScyllaDB
 
Real-Time Persisted Events at Supercell
Real-Time Persisted Events at  SupercellReal-Time Persisted Events at  Supercell
Real-Time Persisted Events at Supercell
ScyllaDB
 
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDBScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB
 
An All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS MarketAn All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS Market
ScyllaDB
 
Discover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched ContentDiscover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched Content
ScyllaDB
 
So You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental DowntimeSo You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental Downtime
ScyllaDB
 
ScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking ReplicationScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking Replication
ScyllaDB
 
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's TipsGetting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
ScyllaDB
 
ShareChat's Path to High-Performance NoSQL with ScyllaDB
ShareChat's Path to High-Performance NoSQL with ScyllaDBShareChat's Path to High-Performance NoSQL with ScyllaDB
ShareChat's Path to High-Performance NoSQL with ScyllaDB
ScyllaDB
 
Tracking Millions of Heartbeats on Zee's OTT Platform
Tracking Millions of Heartbeats on Zee's OTT PlatformTracking Millions of Heartbeats on Zee's OTT Platform
Tracking Millions of Heartbeats on Zee's OTT Platform
ScyllaDB
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
ScyllaDB
 
Real-Time Persisted Events at Supercell
Real-Time Persisted Events at  SupercellReal-Time Persisted Events at  Supercell
Real-Time Persisted Events at Supercell
ScyllaDB
 
Why We Chose ScyllaDB over DynamoDB for "User Watch Status"
Why We Chose ScyllaDB over DynamoDB for "User Watch Status"Why We Chose ScyllaDB over DynamoDB for "User Watch Status"
Why We Chose ScyllaDB over DynamoDB for "User Watch Status"
ScyllaDB
 
ScyllaDB Kubernetes Operator Goes Global
ScyllaDB Kubernetes Operator Goes GlobalScyllaDB Kubernetes Operator Goes Global
ScyllaDB Kubernetes Operator Goes Global
ScyllaDB
 
ScyllaDB Drivers: Optimizing Performance
ScyllaDB Drivers: Optimizing PerformanceScyllaDB Drivers: Optimizing Performance
ScyllaDB Drivers: Optimizing Performance
ScyllaDB
 
Cost-Efficient Stream Processing with RisingWave and ScyllaDB
Cost-Efficient Stream Processing with RisingWave and ScyllaDBCost-Efficient Stream Processing with RisingWave and ScyllaDB
Cost-Efficient Stream Processing with RisingWave and ScyllaDB
ScyllaDB
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
ScyllaDB
 

More from ScyllaDB (20)

ScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDCScyllaDB Real-Time Event Processing with CDC
ScyllaDB Real-Time Event Processing with CDC
 
MongoDB to ScyllaDB: Technical Comparison and the Path to Success
MongoDB to ScyllaDB: Technical Comparison and the Path to SuccessMongoDB to ScyllaDB: Technical Comparison and the Path to Success
MongoDB to ScyllaDB: Technical Comparison and the Path to Success
 
Real-Time or Analytics Workloads... Why Not Both?
Real-Time or Analytics Workloads... Why Not Both?Real-Time or Analytics Workloads... Why Not Both?
Real-Time or Analytics Workloads... Why Not Both?
 
Real-Time Persisted Events at Supercell
Real-Time Persisted Events at  SupercellReal-Time Persisted Events at  Supercell
Real-Time Persisted Events at Supercell
 
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDBScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
 
An All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS MarketAn All-Around Benchmark of the DBaaS Market
An All-Around Benchmark of the DBaaS Market
 
Discover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched ContentDiscover the Unseen: Tailored Recommendation of Unwatched Content
Discover the Unseen: Tailored Recommendation of Unwatched Content
 
So You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental DowntimeSo You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental Downtime
 
ScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking ReplicationScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking Replication
 
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's TipsGetting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
 
ShareChat's Path to High-Performance NoSQL with ScyllaDB
ShareChat's Path to High-Performance NoSQL with ScyllaDBShareChat's Path to High-Performance NoSQL with ScyllaDB
ShareChat's Path to High-Performance NoSQL with ScyllaDB
 
Tracking Millions of Heartbeats on Zee's OTT Platform
Tracking Millions of Heartbeats on Zee's OTT PlatformTracking Millions of Heartbeats on Zee's OTT Platform
Tracking Millions of Heartbeats on Zee's OTT Platform
 
A Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's ArchitectureA Deep Dive into ScyllaDB's Architecture
A Deep Dive into ScyllaDB's Architecture
 
Real-Time Persisted Events at Supercell
Real-Time Persisted Events at  SupercellReal-Time Persisted Events at  Supercell
Real-Time Persisted Events at Supercell
 
Why We Chose ScyllaDB over DynamoDB for "User Watch Status"
Why We Chose ScyllaDB over DynamoDB for "User Watch Status"Why We Chose ScyllaDB over DynamoDB for "User Watch Status"
Why We Chose ScyllaDB over DynamoDB for "User Watch Status"
 
ScyllaDB Kubernetes Operator Goes Global
ScyllaDB Kubernetes Operator Goes GlobalScyllaDB Kubernetes Operator Goes Global
ScyllaDB Kubernetes Operator Goes Global
 
ScyllaDB Drivers: Optimizing Performance
ScyllaDB Drivers: Optimizing PerformanceScyllaDB Drivers: Optimizing Performance
ScyllaDB Drivers: Optimizing Performance
 
Cost-Efficient Stream Processing with RisingWave and ScyllaDB
Cost-Efficient Stream Processing with RisingWave and ScyllaDBCost-Efficient Stream Processing with RisingWave and ScyllaDB
Cost-Efficient Stream Processing with RisingWave and ScyllaDB
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 

Recently uploaded

9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
saastr
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
operationspcvita
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
Fwdays
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Neo4j
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
Jason Yip
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
DianaGray10
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
AstuteBusiness
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
Ivo Velitchkov
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 

Recently uploaded (20)

9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
9 CEO's who hit $100m ARR Share Their Top Growth Tactics Nathan Latka, Founde...
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
 
"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota"Choosing proper type of scaling", Olena Syrota
"Choosing proper type of scaling", Olena Syrota
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
What is an RPA CoE? Session 1 – CoE Vision
What is an RPA CoE?  Session 1 – CoE VisionWhat is an RPA CoE?  Session 1 – CoE Vision
What is an RPA CoE? Session 1 – CoE Vision
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
Apps Break Data
Apps Break DataApps Break Data
Apps Break Data
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 

Scaling for Performance

  • 1. High Performance NoSQL Masterclass Scaling for Performance Felipe Cardeneti Mendes
  • 2. High Performance NoSQL Masterclass Felipe Cardeneti Mendes ● Solution Architect at ScyllaDB ● Published Author ● Linux and Open Source enthusiast
  • 3. High Performance NoSQL Masterclass Agenda ● About ScyllaDB ● Getting started with NoSQL databases ● Deployment and Production Readiness ● Observability Tips and Tricks
  • 4. High Performance NoSQL Masterclass About ScyllaDB
  • 5. High Performance NoSQL Masterclass ScyllaDB Database Architecture Horizontal & Vertical Scaling Built in C++ (no Java overhead) System and Data Center Aware Sharding Per Core Shard-Aware Drivers Auto-Performance Tuning Network Processor NUMA Storage Unique Close-to-Metal Architecture
  • 6. High Performance NoSQL Masterclass Shard per Core Threads Shards
  • 7. High Performance NoSQL Masterclass Asynchronous Architecture Request Answer Request Answer Waiting for response Time Savings Synchronous architecture Asynchronous architecture
  • 8. High Performance NoSQL Masterclass Specialized Cache Cassandra ScyllaDB Key cache Row cache Linux page cache SSTables Unified cache SSTables Complex Tuning On-heap / Off-heap
  • 9. High Performance NoSQL Masterclass Ecosystem Compatibility + CQL native protocol + JMX management protocol + Management command line /REST + SSTable file format + Configuration file format + CQL language
  • 10. High Performance NoSQL Masterclass Getting Started with NoSQL Databases
  • 11. High Performance NoSQL Masterclass Modern Business Challenges Keep CapEx & OpEx in check Reduce complexity Scale as the data grows Queries in milliseconds Leverage massive amounts of data Predictable, consistent performance
  • 12. High Performance NoSQL Masterclass Oh… The CAP Theorem
  • 13. High Performance NoSQL Masterclass Workload Types
  • 14. High Performance NoSQL Masterclass Workload Types Decision Support + More complex queries - large amounts + Latency important but not critical + Seconds to hours Fundamental business tasks + Simple queries + Latency critical + Milliseconds per transaction OLAP Time Complexity of Query Time Complexity of Query OLTP
  • 15. High Performance NoSQL Masterclass OLAP Characteristics Main Characteristics + Bound Concurrency + Scans and aggregations + Rely on MapReduce paradigms OLAP Time Complexity of Query Examples + How many users are from the US? + How many Twitter posts happened in 2022? + Which devices haven’t communicated back within the past 1 hour?
  • 16. High Performance NoSQL Masterclass OLTP Characteristics Main Characteristics + Unbound Concurrency + Designed for speed and simplicity + Often user facing APIs Examples + What’s the last time an user logged in? + What have been the last 10 temperature measurements for a given device? + How many likes a given posting has? Time Complexity of Query OLTP
  • 17. High Performance NoSQL Masterclass Why not both? Meet Workload Prioritization 100 shares Ratio = 100:100 (1:1) means equal shares of processing/resources to complete tasks Ratio = 100:50 (2:1) means 2X as many shares of processing/resources for Transactions to complete tasks compared to Analytics 100 shares 100 shares 50 shares OLTP OLAP Which Task to Run
  • 18. Wide Column Databases Write Path LSM storage engine’s write path: 18 Writes commit log
  • 19. Wide Column Databases Write Path LSM storage engine’s write path: 19 Writes commit log
  • 20. Wide Column Databases Write Path LSM storage engine’s write path: 20 Writes commit log
  • 21. Wide Column Databases Write Path LSM storage engine’s write path: 21 Writes commit log compaction
  • 22. Wide Column Databases Write Path LSM storage engine’s write path: 22 Writes commit log compaction
  • 23. Wide Column Databases Write Path LSM storage engine’s write path: 23 Writes commit log
  • 24. What is compaction? LSM storage engine’s write path: 24 Hidden Gems + This technique of keeping sorted files and merging them is well-known and often called Log-Structured Merge (LSM) Tree + Published in 1996, earliest popular application known is the Lucene search engine, 1999 Characteristics + High performance write. + Immediately readable. + Reasonable performance for read.
  • 25. What is a compaction strategy? LSM storage engine’s write path: 25 ▪ Which files to compact, and when? ▪ This is called the compaction strategy ▪ The goal of the strategy is low amplification: ○ Avoid read requests needing many sstables. • read amplification ○ Avoid overwritten/deleted/expired data staying on disk. ○ Avoid excessive temporary disk space needs • space amplification ○ Avoid compacting the same data again and again. • write amplification
  • 26. Which one to choose? LSM storage engine’s write path: 26 Know your workload
  • 27. High Performance NoSQL Masterclass Deployment and Production Readiness
  • 28. It all starts with Data Modeling LSM storage engine’s write path: 28 Do’s + Denormalize + Query oriented approach + High data distribution / cardinality Dont’s + Create hotspots + Large partitions/rows/cells/collections/etc + Low cardinality tables/views/indexes
  • 29. Test, test, test…! LSM storage engine’s write path: 29 Unit Testing + Test your workload and access patterns in a Docker container + Use specialized stress tools to simulate workload ○ cassandra-stress ○ nosqlbench ○ YCSB + OBSERVE the results (more on that later)
  • 30. Application Development LSM storage engine’s write path: 30 Functional Testing + Use Prepared Statements + Configure your routing and load balancing policy correctly ○ DCAware and TokenAware policies ○ ShardAware Drivers if using ScyllaDB! + Make use of Asynchronous APIs + Ensure your client is NOT a bottleneck + Paging is important: Ensure you adjust it right
  • 31. Test, test, test…! LSM storage engine’s write path: 31 Readiness Testing + What’s the unreplicated data set size? + How many operations per second do I need to achieve? ○ Out of these, what are the reads vs writes distribution? ○ What’s the average payload size? ○ What are my latency requirements? + How many regions should it replicate to? + Do I need indexes or views to satisfy my queries? + What is/are the target deployment location(s)? + Is the use case growth predictable or unpredictable? + What are the data retention requirements? + Is the use case storage or CPU bound?
  • 32. Oh mighty sizing…! LSM storage engine’s write path: 32 A Sizing Exercise + 500k ops/sec with 1KB rows + 5TB data set size + P99 reads and writes < 10ms + Target deployment region: AWS Simple Math + RF=3 / 6 nodes * 5TB = 2.5TB per node + 500k ops/sec / 12,5K ops/core = 40 physical cores + Result: 6 nodes of i4i.8xlarge
  • 33. High Performance NoSQL Masterclass Observability Tips and Tricks
  • 34. High Performance NoSQL Masterclass ScyllaDB Monitoring Architecture
  • 35. High Performance NoSQL Masterclass ScyllaDB Monitoring Architecture
  • 36. High Performance NoSQL Masterclass Keep in touch! Felipe Cardeneti Mendes Solutions Architect ScyllaDB felipemendes@scylladb.co m Find me on LinkedIn