This document discusses the attributes of a high-performance, low-latency database like ScyllaDB. It begins with introductions and an overview of ScyllaDB. It then summarizes how hardware has evolved over 20 years with more cores, memory, and faster disks. ScyllaDB was redesigned from first principles to take advantage of modern hardware, using an asynchronous, shared-nothing architecture with one shard per core. This allows it to achieve significantly higher performance than Cassandra. The document shows benchmark results demonstrating ScyllaDB's lower latencies and ability to scale to higher throughput. It also discusses how ScyllaDB uses workload prioritization to manage different types of workloads.
3. Agenda
+ About ScyllaDB
+ 20 years of hardware evolution in 5 minutes
+ Scylla - Designed for performance
+ Results
+ Workload Prioritization
+ Summary
4. About ScyllaDB
+ The Real-Time Big Data Database
+ Fully compatible with Apache Cassandra and Amazon DynamoDB
+ 10X the performance & low tail latency
+ Open Source, Enterprise, and Cloud options
+ Founded by the creators of the KVM hypervisor
+ HQs: Palo Alto, CA, USA; Herzliya, Israel; Warsaw, Poland
7. Why Scylla?
+ Available On-Prem, Cloud Hosted, or as Scylla Cloud
+ Best High Availability in the industry
+ Best Disaster Recovery in the industry
+ Best Scalability in the industry
+ Best Performance in the industry
+ Auto-tune — out-of-the-box performance
+ Fully compatible with Cassandra & DynamoDB
+ The power of Cassandra at the speed of Redis, and more
17. Sharding/partitioning
+ Common concept in distributed databases
+ Break the system into N non-interacting parts
+ Usually done by hash(partition_key) % N
+ Data/load may be unbalanced
+ Fact of life in distributed databases 🤷
+ Logical mapping of data shards to core shards
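The hash-based sharding on this slide can be sketched in a few lines. This is a hypothetical helper, not Scylla's implementation (Scylla maps murmur3 token ranges to shards); the function name and use of MD5 are assumptions for illustration:

```python
import hashlib

def shard_for(partition_key: str, num_shards: int) -> int:
    """Hypothetical helper: hash(partition_key) % N, as on the slide.
    (Scylla actually maps murmur3 token ranges to per-core shards.)"""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % num_shards

keys = [f"user-{i}" for i in range(10_000)]
counts = [0] * 8
for key in keys:
    counts[shard_for(key, 8)] += 1
print(counts)  # roughly, but not perfectly, balanced across the 8 shards
```

The per-shard counts come out close to, but not exactly, 10,000 / 8 — the "data/load may be unbalanced" fact of life noted above.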
19. Seastar
+ Open source framework, powering Scylla, Ceph,
Redpanda, ValuStor and more
+ A “mini operating system in userspace”
+ Task scheduler, I/O scheduler
+ Fully asynchronous - userspace coroutines
+ Direct I/O (bypasses the kernel page cache)
+ The app must implement its own caching
+ One thread per core, one shard per core
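The "fully asynchronous, cooperative" model can be illustrated with a toy sketch, assuming Python's asyncio loop stands in for Seastar's per-core reactor (the task names are illustrative, not Seastar APIs): a long-running job yields at explicit preemption points, so short latency-sensitive tasks run between its chunks instead of waiting for it to finish.

```python
import asyncio

async def compaction(chunks: int, log: list):
    """Long-running job that yields after each bounded chunk of work."""
    for i in range(chunks):
        log.append(f"compact-{i}")
        await asyncio.sleep(0)  # cooperative preemption point

async def query(qid: int, log: list):
    """Short, latency-sensitive task."""
    log.append(f"query-{qid}")

async def shard_main():
    log = []
    await asyncio.gather(compaction(3, log), query(1, log), query(2, log))
    return log

log = asyncio.run(shard_main())
print(log)  # queries run between compaction chunks, not after all of them
```

Because everything on a shard is cooperative, there are no locks: a task owns the core until it yields, which is the "(*) cooperative-preemption model" caveat on the next slide.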
20. Shard per Core
[Diagram: Traditional stack vs. SeaStar's sharded stack]
+ Traditional stack (Cassandra): kernel-scheduled threads, kernel TCP/IP, shared NIC queues; suffers from memory and lock contention, cache contention, and is NUMA unfriendly.
+ SeaStar's sharded stack (Scylla): one task scheduler, one set of queues, and one database shard per core; smp queues pass messages between shards; userspace TCP/IP over DPDK with its own NIC queue, so the kernel isn't involved. No contention (*), linear scaling, NUMA friendly.
(*) cooperative-preemption model within a shard
23. Thou shalt not block
[Diagram: Query, Commitlog, and Compaction each feed their own userspace queue into the I/O scheduler, which submits requests to the disk. Only up to the maximum useful disk concurrency is queued in the FS/device; everything beyond that waits in the userspace queues, so there are no uncontrolled kernel-side queues.]
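The idea can be sketched as a toy shard-local I/O scheduler (class names and MAX_CONCURRENCY are assumptions, not Scylla's implementation): keep at most the useful disk concurrency in flight, and hold everything else in userspace queues where the scheduler can still decide what goes next.

```python
from collections import deque

MAX_CONCURRENCY = 4  # "max useful disk concurrency" (assumed value)

class IOScheduler:
    """Toy shard-local I/O scheduler: classes queue in userspace,
    and only MAX_CONCURRENCY requests are ever in flight at the disk."""

    def __init__(self):
        self.in_flight = 0
        self.queues = {"query": deque(), "commitlog": deque(), "compaction": deque()}
        self.order = ["query", "commitlog", "compaction"]  # queries are latency-sensitive

    def submit(self, klass, req):
        self.queues[klass].append(req)
        self._dispatch()

    def complete(self):
        """Called when the disk finishes a request."""
        self.in_flight -= 1
        self._dispatch()

    def _dispatch(self):
        while self.in_flight < MAX_CONCURRENCY:
            klass = next((k for k in self.order if self.queues[k]), None)
            if klass is None:
                return
            self.queues[klass].popleft()  # request goes to the device
            self.in_flight += 1

sched = IOScheduler()
for i in range(6):
    sched.submit("compaction", i)   # 4 go to the disk, 2 wait in userspace
sched.submit("query", "q1")         # waits where the scheduler can see it
sched.complete()                    # one finishes: the query is sent next
print(sched.in_flight, len(sched.queues["query"]), len(sched.queues["compaction"]))
```

Because the compaction backlog waits in userspace rather than in the device, the query jumps ahead as soon as a slot frees up — exactly what would be impossible if everything had been pushed into kernel queues.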
25. Shard-aware I/O scheduler(s)
+ Each shard has independent scheduler
+ Capacity groups per NUMA zone
+ Shards grab capacity leases
Minimal, low-cost coordination between shards!
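The capacity-lease idea might be sketched like this, assuming a shared token pool per NUMA zone (names and numbers are made up, and a lock stands in for the atomic operations a real implementation would use). Shards grab capacity in batches, so cross-shard coordination happens once per lease rather than once per request:

```python
import threading

class CapacityGroup:
    """Shared token pool for all shards in one NUMA zone (hypothetical
    sketch; a lock stands in for atomics in a real implementation)."""

    def __init__(self, total_tokens: int):
        self.tokens = total_tokens
        self.lock = threading.Lock()

    def grab_lease(self, want: int) -> int:
        """Grant up to `want` tokens in one batch; 0 means the shard waits."""
        with self.lock:
            granted = min(want, self.tokens)
            self.tokens -= granted
            return granted

    def release(self, tokens: int):
        with self.lock:
            self.tokens += tokens

group = CapacityGroup(total_tokens=100)
lease_a = group.grab_lease(64)  # shard A grabs a batch of capacity
lease_b = group.grab_lease(64)  # shard B gets only what remains
print(lease_a, lease_b, group.tokens)
```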
27. The controllers - memtable
[Graphs: with the controller allocating ~50% of CPU to flushing — the percentage needed to keep the memtable buffers at a stable level — both throughput and total system CPU usage barely oscillate.]
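The controller can be thought of as a simple feedback loop: a toy sketch, assuming a made-up setpoint and gain rather than Scylla's actual tuning. When the dirty-buffer level is above target, flushing gets more CPU share; once the level is on target, the share stops changing, which is why the graphs barely oscillate.

```python
TARGET_DIRTY = 0.5  # setpoint: keep memtable buffers half full (assumed)
GAIN = 1.0          # feedback gain (assumed)

def next_share(dirty_fraction: float, current_share: float) -> float:
    """More dirty data than target -> give flushing more CPU; less -> back off."""
    error = dirty_fraction - TARGET_DIRTY
    return min(1.0, max(0.0, current_share + GAIN * error))

share = 0.2
for dirty in (0.9, 0.7, 0.5, 0.5):
    share = next_share(dirty, share)
print(round(share, 2))  # the share settles once the buffer level is on target
```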
38. Workload Prioritization: Different types of loads
■ OLTP
● Small work items
● Latency sensitive
● Involves a narrow portion of the data
■ OLAP
● Large work items
● Throughput oriented
● Performed on large amounts of data
39. Scheduler Basics - operation highlights
+ Shares are really all there is to it :)
+ Schedulers maintain fairness by optimizing ratios, not absolute throughput.
+ Schedulers only kick in when there is contention for the resource.
+ Schedulers are dynamic - you can change the number of shares in real time.
+ This limits the impact of one share-holder on another.
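The shares model can be sketched as a virtual-runtime scheduler (a toy model, not Seastar's actual scheduler): each group accumulates cost inversely proportional to its shares, and the group furthest behind runs next. This optimizes the ratio of work between groups, not their absolute throughput.

```python
import heapq

def schedule(groups: dict, rounds: int) -> dict:
    """groups maps name -> shares; returns units of work done per group."""
    SCALE = 1_000_000                       # integer virtual time, avoids float drift
    heap = [(0, name) for name in groups]   # (accumulated cost, name)
    heapq.heapify(heap)
    done = {name: 0 for name in groups}
    for _ in range(rounds):
        cost, name = heapq.heappop(heap)    # group furthest behind runs next
        done[name] += 1                     # run one unit of work
        heapq.heappush(heap, (cost + SCALE // groups[name], name))
    return done

# with 1000 vs 500 shares, the OLTP group gets twice the work under contention
result = schedule({"oltp": 1000, "olap": 500}, rounds=300)
print(result)
```

If only one group has pending work, it simply gets every round — matching the point above that schedulers only kick in when there is contention.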
41. How does it work?
[Diagram: internal workloads - memtable flush, compaction, query, repair, and commitlog - are queued into the Seastar scheduler, which multiplexes them onto NET, CPU, and SSD. Backlog and memory monitors adjust each workload's priority.]
42. How does it work? Workload Prioritization!
[Diagram: as before, but a service-level controller now also adjusts priorities in the Seastar scheduler, so user-defined service levels steer how NET, CPU, and SSD are shared.]
45. Configuring Workload Prioritization
1. Make users that generate the same workload part of the same group.
● Priorities are attached to groups or individual users.
2. Create a service level for the workload and set its shares:
● Shares determine the importance of the service level.
● Importance is always relative to other service levels.
3. Attach the service level to the group of users.
● This grants the shares to the group of users.
● From that point, the workload prioritization mechanism treats their requests according to their priorities.
46. Managing Workload Prioritization using CQL
1. Make users that generate the same workload part of the same group:
● CREATE ROLE super_high_priority;
● GRANT super_high_priority TO special_user;
2. Create a service level for the workload and set its shares:
● CREATE SERVICE_LEVEL 'important_load' WITH SHARES=1000;
3. Attach the service level to the group of users:
● ATTACH SERVICE_LEVEL 'important_load' TO 'super_high_priority';
48. Summary
+ Designed and built for modern hardware
+ Fully asynchronous, shared-nothing, shard-per-core architecture
+ Superior throughput and consistently low latency
+ Internal schedulers exposed to the user as Workload Prioritization