Cloudius Systems presents:
1,000,000 CQL operations
per server
Avi Kivity, CTO @ ScyllaDB
Cassandra Summit 2015
Capable of 1,000,000 operations per second
PER NODE
With predictable, low latencies
Compatible with Apache Cassandra
Scylla: A new NoSQL Database
THE POWER OF
CASSANDRA AT
THE SPEED OF REDIS
SCYLLA DB
THROUGHPUT
LATENCY
FULLY COMPATIBLE
❏ Uses Cassandra SSTables
❏ Use your existing drivers
❏ Use your existing CQL queries
❏ Use your existing cassandra.yaml
❏ Manage with nodetool or other JMX console
❏ Use your existing code with no change
❏ Copy over a complete Cassandra database
❏ Works with the Cassandra ecosystem (Spark etc.)
FULLY COMPATIBLE
WHAT WOULD YOU DO WITH 1 MILLION TPS?
Shrink your cluster by a factor of 10X
Handle 10X traffic spikes on Black Friday
Model your data instead of using Cassandra as K/V
Get the most out of your data - Run more queries
Maintain your clusters while serving
Stop using caches in front of the database
TECHNOLOGY:
HOW IT WORKS
SCYLLA IS QUITE DIFFERENT
Shard-per-core, no locks, no threads, zero-copy
Based on the Seastar C++ application framework
Efficient, unified DB cache (not using Linux page cache)
CQL-oriented storage engine
Exploit all hardware resources - NUMA, multiqueue NICs, etc
SCYLLA DB: ARCHITECTURE COMPARISON
Kernel
Cassandra
TCP/IPScheduler
queuequeuequeuequeuequeue
threads
NIC
Queues
Kernel
Traditional stack Scylla sharded stack
Memory
Lock contention
Cache contention
NUMA unfriendly
Application
TCP/IP
Task Scheduler
queuequeuequeuequeuequeuesmp queue
NIC
Queue
DPDK
Kernel
(isn’t
involved)
Userspace
Application
TCP/IP
Task Scheduler
queuequeuequeuequeuequeuesmp queue
NIC
Queue
DPDK
Kernel
(isn’t
involved)
Userspace
Application
TCP/IP
Task Scheduler
queuequeuequeuequeuequeuesmp queue
NIC
Queue
DPDK
Kernel
(isn’t
involved)
Userspace
Core
Database
TCP/IP
Task Scheduler
queuequeuequeuequeuequeuesmp queue
NIC
Queue
DPDK
Kernel
(isn’t
involved)
Userspace
No contention
Linear scaling
NUMA friendly
Scylla has its own task scheduler
Traditional stack Scylla stack
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise
Task
Promise
Task
Promise
Task
Promise
Task
CPU
Promise is a
pointer to
eventually
computed value
Task is a pointer
to a lambda
function
Scheduler
CPU
Scheduler
CPU
Scheduler
CPU
Scheduler
CPU
Scheduler
CPU
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread
Stack
Thread is a
function pointer
Stack is a byte
array from 64k to
megabytes
Context switch cost is
high. Large stacks pollutes
the caches No sharing, millions of
parallel events
Scylla Memory Management
Blasting out I/O operations
future<>
make_data_requests(digest_resolver_ptr resolver, targets_iterator begin,
targets_iterator end) {
return parallel_for_each(begin, end, [this, resolver =
std::move(resolver)] (gms::inet_address ep) {
return make_data_request(ep).then_wrapped([resolver, ep]
(future<foreign_ptr<lw_shared_ptr<query::result>>> f) {
try {
resolver->add_data(ep, f.get0());
} catch (...) {
resolver->error(ep, std::current_exception());
}
});
});
}
Unified cache
Cassandra Scylla
Key cache
Row cache
On-heap /
Off-heap
Linux page cache
SSTables
Unified cache
SSTables
Unified cache
Cassandra Scylla
Key cache
Row cache
On-heap /
Off-heap
Linux page cache
SSTables
Unified cache
SSTables
Tuning
Caching unparsed data
Parasitic rows
Page faults
❏ Implement missing CQL features
❏ Stabilize clustering
❏ Complete nodetool support
❏ Spark integration, other connectors
❏ Thrift support
❏ Authentication and encryption
Work in progress
❏ Open source @ github
❏ In Beta
❏ Try it out! RPM, Docker images available
❏ Live Demo @ ScyllaDB booth!
❏ → http://scylladb.com
OUTLOOK
❏ Multi-tenancy
❏ New protocols (redis, …)
❏ Vertical integration (solr, …)
FUTURE
SCYLLA, NoSQL GOES NATIVE
Thank you.
http://scylladb.com

Scylla: 1 Million CQL operations per second per server

  • 1.
  • 2.
    1,000,000 CQL operations perserver Avi Kivity, CTO @ ScyllaDB Cassandra Summit 2015
  • 3.
    Capable of 1,000,000operations per second PER NODE With predictable, low latencies Compatible with Apache Cassandra Scylla: A new NoSQL Database
  • 4.
    THE POWER OF CASSANDRAAT THE SPEED OF REDIS SCYLLA DB
  • 5.
  • 6.
  • 7.
    FULLY COMPATIBLE ❏ UsesCassandra SSTables ❏ Use your existing drivers ❏ Use your existing CQL queries ❏ Use your existing cassandra.yaml ❏ Manage with nodetool or other JMX console ❏ Use your existing code with no change ❏ Copy over a complete Cassandra database ❏ Works with the Cassandra ecosystem (Spark etc.)
  • 8.
  • 9.
    WHAT WOULD YOUDO WITH 1 MILLION TPS? Shrink your cluster by a factor of 10X Handle 10X traffic spikes on Black Friday Model your data instead of using Cassandra as K/V Get the most out of your data - Run more queries Maintain your clusters while serving Stop using caches in front of the database
  • 10.
  • 11.
    SCYLLA IS QUITEDIFFERENT Shard-per-core, no locks, no threads, zero-copy Based on the Seastar C++ application framework Efficient, unified DB cache (not using Linux page cache) CQL-oriented storage engine Exploit all hardware resources - NUMA, multiqueue NICs, etc
  • 12.
    SCYLLA DB: ARCHITECTURECOMPARISON Kernel Cassandra TCP/IPScheduler queuequeuequeuequeuequeue threads NIC Queues Kernel Traditional stack Scylla sharded stack Memory Lock contention Cache contention NUMA unfriendly Application TCP/IP Task Scheduler queuequeuequeuequeuequeuesmp queue NIC Queue DPDK Kernel (isn’t involved) Userspace Application TCP/IP Task Scheduler queuequeuequeuequeuequeuesmp queue NIC Queue DPDK Kernel (isn’t involved) Userspace Application TCP/IP Task Scheduler queuequeuequeuequeuequeuesmp queue NIC Queue DPDK Kernel (isn’t involved) Userspace Core Database TCP/IP Task Scheduler queuequeuequeuequeuequeuesmp queue NIC Queue DPDK Kernel (isn’t involved) Userspace No contention Linear scaling NUMA friendly
  • 13.
    Scylla has itsown task scheduler Traditional stack Scylla stack Promise Task Promise Task Promise Task Promise Task CPU Promise Task Promise Task Promise Task Promise Task CPU Promise Task Promise Task Promise Task Promise Task CPU Promise Task Promise Task Promise Task Promise Task CPU Promise Task Promise Task Promise Task Promise Task CPU Promise is a pointer to eventually computed value Task is a pointer to a lambda function Scheduler CPU Scheduler CPU Scheduler CPU Scheduler CPU Scheduler CPU Thread Stack Thread Stack Thread Stack Thread Stack Thread Stack Thread Stack Thread Stack Thread Stack Thread is a function pointer Stack is a byte array from 64k to megabytes Context switch cost is high. Large stacks pollutes the caches No sharing, millions of parallel events
  • 14.
  • 15.
    Blasting out I/Ooperations future<> make_data_requests(digest_resolver_ptr resolver, targets_iterator begin, targets_iterator end) { return parallel_for_each(begin, end, [this, resolver = std::move(resolver)] (gms::inet_address ep) { return make_data_request(ep).then_wrapped([resolver, ep] (future<foreign_ptr<lw_shared_ptr<query::result>>> f) { try { resolver->add_data(ep, f.get0()); } catch (...) { resolver->error(ep, std::current_exception()); } }); }); }
  • 16.
    Unified cache Cassandra Scylla Keycache Row cache On-heap / Off-heap Linux page cache SSTables Unified cache SSTables
  • 17.
    Unified cache Cassandra Scylla Keycache Row cache On-heap / Off-heap Linux page cache SSTables Unified cache SSTables Tuning Caching unparsed data Parasitic rows Page faults
  • 18.
    ❏ Implement missingCQL features ❏ Stabilize clustering ❏ Complete nodetool support ❏ Spark integration, other connectors ❏ Thrift support ❏ Authentication and encryption Work in progress
  • 19.
    ❏ Open source@ github ❏ In Beta ❏ Try it out! RPM, Docker images available ❏ Live Demo @ ScyllaDB booth! ❏ → http://scylladb.com OUTLOOK
  • 20.
    ❏ Multi-tenancy ❏ Newprotocols (redis, …) ❏ Vertical integration (solr, …) FUTURE
  • 21.
    SCYLLA, NoSQL GOESNATIVE Thank you. http://scylladb.com