ScyllaDB Open Source 5.0 is the latest evolution of our monstrously fast and scalable NoSQL database – powering instantaneous experiences with massive distributed datasets.
Join us to learn about ScyllaDB Open Source 5.0, which represents the first milestone in ScyllaDB V. ScyllaDB 5.0 introduces a host of functional, performance and stability improvements that resolve longstanding challenges of legacy NoSQL databases.
We’ll cover:
- New capabilities including a new IO model and scheduler, Raft-based schema updates, automated tombstone garbage collection, optimized reverse queries, and support for the latest AWS EC2 instances
- How ScyllaDB 5.0 fits into the evolution of ScyllaDB – and what to expect next
- The first look at benchmarks that quantify the impact of ScyllaDB 5.0's numerous optimizations
This will be an interactive session with ample time for Q & A – bring us your questions and feedback!
2. Brought to you by
VIRTUAL EVENT | OCTOBER 19 + 20
All Things Performance
The event for developers who care about P99 percentiles and high-performance, low-latency applications.
Register at p99conf.io
4. Tzach Livyatan
VP of Product, ScyllaDB
+ Leads the product team at ScyllaDB
+ Appreciates distributed system testing
+ Lives in Tel Aviv, father of two
5. Agenda
+ How did we get here?
+ ScyllaDB V Theme
+ Resilience
+ Performance
+ Ecosystem
+ What's Next?
6. The Database Built for Gamechangers
+ InfoWorld 2020 Technology of the Year!
+ Founded by designers of KVM Hypervisor
“ScyllaDB stands apart...It’s the rare product
that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the
power a customer will ever need, on workloads that other
databases can’t touch – and at a fraction of the cost of
an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+ Resolves challenges of legacy NoSQL databases
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ DBaaS/Cloud, Enterprise and Open Source solutions
+ Proven globally at scale
7. 400+ Gamechangers Leverage ScyllaDB
+ Seamless experiences across content + devices
+ Fast computation of flight pricing
+ Corporate fleet management
+ Real-time analytics
+ 2,000,000 SKU e-commerce management
+ Real-time location tracking for friends/family
+ Video recommendation management
+ IoT for industrial machines
+ Synchronize browser properties for millions
+ Threat intelligence service using JanusGraph
+ Real-time fraud detection across 6M transactions/day
+ Uber scale, mission critical chat & messaging app
+ Network security threat detection
+ Power ~50M X1 DVRs with billions of reqs/day
+ Precision healthcare via Edison AI
+ Inventory hub for retail operations
+ Property listings and updates
+ Unified ML feature store across the business
+ Cryptocurrency exchange app
+ Geography-based recommendations
+ Distributed storage for distributed ledger tech
+ Global operations: Avon, Body Shop + more
+ Predictable performance for on sale surges
+ GPS-based exercise tracking
8. Cloud infrastructure: The last ~10 years
[Chart: performance improvement over time; timeline marks 2008, 2012, 2015, 2022]
+ SSD: $2500/TB
+ Typical instance: 4 cores
+ SSD: $100/TB - 1000x faster, 10x cheaper
+ 96 core VMs - 20x more cores
+ 100Gbps NICs - 100x more throughput
+ 2000 CPU core systems and beyond
12. What is Raft?
(baby don't hurt me no more)
Picture from https://en.wikipedia.org/wiki/Raft_%28algorithm%29
13. Raft intro
Raft is a protocol for state machine replication.
What does it mean?
+ The majority of nodes have the same state
+ State transition happens in the same order on all
nodes
+ Cluster topology is part of the state
Raft paper: https://raft.github.io/raft.pdf
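To make "state machine replication" concrete, here is a minimal sketch. It is not ScyllaDB's implementation and not the Raft protocol itself (no leader election, terms, or log matching); it only illustrates the property in the bullets above: replicas that apply the same committed log in the same order end up in the same state. The names `Replica` and `apply_committed` and the command tuples are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy deterministic state machine: a key-value map plus an applied-entry counter."""
    state: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def apply_committed(self, log):
        # Apply only the entries this replica has not applied yet, strictly in log order.
        for op, key, value in log[self.applied:]:
            if op == "set":
                self.state[key] = value
            elif op == "del":
                self.state.pop(key, None)
            self.applied += 1


# Hypothetical committed log (in real Raft, an entry is committed once a majority stores it).
log = [("set", "a", 1), ("set", "b", 2), ("del", "a", None), ("set", "b", 3)]

fast, slow = Replica(), Replica()
fast.apply_committed(log)      # applies everything at once
slow.apply_committed(log[:2])  # lags behind...
slow.apply_committed(log)      # ...then catches up from where it stopped

print(fast.state == slow.state)
```

A replica that lags is not wrong, only behind: because every entry is applied in log order, catching up always converges to the same state.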
14. How Scylla Raft is special
Scylla Raft implements a number of important extensions:
+ Increased liveness for very large clusters (1000+ nodes)
+ Resilience against asymmetric network failures
+ Read and write support on all cluster nodes
+ Efficient multi-raft: every node can replicate many state machines
16. Consistency Model of Schema Changes
[Diagram: schema versions of one table as seen by Node A and Node B over time, diverging in a split brain]

id  first  last
1   John   Doe

id  first  last   email
1   John   Doe
2   Jenny  Smith  j@...

id  first  last   email  phone
1   John   Doe
2   Jenny  Smith  j@...  (867)

id  first  last   phone
1   John   Doe
2   Jenny  Smith  (867)
17. (In)consistency of Schema Changes
cqlsh:test> create table t (a int primary key);
----------------------------------------------- split ------------------------------------------
cqlsh:test> alter table t rename a to d;
Warning: schema version mismatch detected
cqlsh:test> insert into t (d) values (1);
Cannot execute this query as it might involve data filtering and thus
may have unpredictable performance.
cqlsh:test> insert into t (a) values (1);
Unknown identifier a
18. Schema Update with Raft
[Diagram: servers S1, S2, S3 sharing one Raft log]
Schema changes: CREATE TABLE t, ADD COLUMN b, CREATE INDEX t_i1
Raft log: INSERT INTO t, SET b = 2, SELECT b
Legend: schema fetch
19. Schema Update with Raft
[Diagram: three App clients; two issue ALTER TABLE at the same time; the result is safe serialization]
20. What is Topology?
Topology is defined as all of the following:
+ the set of nodes in the cluster
+ the location of those nodes in DCs and racks
+ the assignment of ownership of data to nodes
21. Token Metadata
[Diagram: token ring with ranges owned by node A, node B, node C in the order A, C, B, C, A, B]
Token metadata:
+ Each node has a set of tokens assigned during bootstrap (vnodes)
+ Combined, the tokens determine the primary owning replicas for key ranges
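How tokens map key ranges to owners can be sketched as a lookup on a sorted ring. This is a simplified illustration under stated assumptions, not ScyllaDB's code: the ring size, the token values, and the MD5-based `token_for` hash are all made up for the example, and only the primary owner is computed (no replication factor, DCs, or racks).

```python
import bisect
import hashlib

RING_SIZE = 4096  # toy token space; an assumption for illustration only

# Hypothetical vnode assignment: token -> owning node.
TOKENS = {
    100: "A", 900: "C", 1700: "B",
    2500: "C", 3300: "A", 4000: "B",
}
SORTED_TOKENS = sorted(TOKENS)


def token_for(key: str) -> int:
    # Stand-in hash, not the partitioner a real cluster uses.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


def primary_owner(token: int) -> str:
    # The owner is the node holding the first token >= the key's token,
    # wrapping around to the smallest token at the end of the ring.
    i = bisect.bisect_left(SORTED_TOKENS, token)
    return TOKENS[SORTED_TOKENS[i % len(SORTED_TOKENS)]]


print(primary_owner(token_for("user:42")))
```

Because every node answers this lookup from its own copy of the token metadata, two nodes holding different copies can disagree about who owns a key.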
22. Eventually (In)consistent Topology
[Diagram: node D is bootstrapping; node A, node B and node C each hold their own local view of the token metadata, which is propagated in gossip]
29. Raft for Tablets
+ Manageable number of Raft groups (~100,000)
+ No client-side timestamps
+ Provides isolation for ALL queries
+ Writes do not require a read
+ No need to repair
+ Strong consistency of materialized views
+ Linearizable, more powerful schema and topology changes
+ High Availability and partition tolerance of Cassandra are mostly unaffected
+ Tablets allow loaded shards to be split dynamically
Roadmap
32. What is RBNO?
+ Repair Based Node Operations (RBNO): use row-level repair as the underlying mechanism to sync data between nodes, instead of streaming
+ Single mechanism for all the node operations
  + Bootstrap / replace / rebuild / decommission / removenode / repair
33. Benefits of RBNO
Significant improvements on performance and data safety
+ Resumable
  + Resume from previous failed bootstrap operations
+ Consistency
  + Latest replica is guaranteed
+ Simplified
  + No need to run repair before or after node operations like replace and removenode
+ Unified
  + All node ops use the same underlying mechanism
34. Off-strategy compaction
Make compaction during node operations more efficient
+ What is it?
  + SSTables generated by node operations are kept in a separate data set
  + They are compacted together and integrated into the main set when the node operation is done
+ Benefits
  + Less compaction work during node operations
  + Node operations complete faster
37. Why scheduling at all?
+ Different components compete for limited resources (Reads, Writes, Admin)
+ They have different priorities
+ They have no idea how not to over-consume the resource
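One common way to arbitrate between competing classes is share-based (proportional) scheduling. The sketch below is a generic illustration of that idea, not ScyllaDB's I/O scheduler: the `IOClass` type, the `dispatch` function, and the share values are hypothetical, and it assumes every class always has requests waiting.

```python
from dataclasses import dataclass


@dataclass
class IOClass:
    name: str
    shares: int            # relative priority (hypothetical values below)
    consumed: float = 0.0  # cost units dispatched so far


def dispatch(classes, cost=1.0):
    # Pick the class that has used the least capacity relative to its shares,
    # so a class cannot take more than its portion however much it asks for.
    chosen = min(classes, key=lambda c: c.consumed / c.shares)
    chosen.consumed += cost
    return chosen.name


classes = [IOClass("reads", 4), IOClass("writes", 4), IOClass("admin", 2)]
counts = {c.name: 0 for c in classes}
for _ in range(1000):
    counts[dispatch(classes)] += 1

print(counts)
```

With these shares the dispatched requests split 4:4:2 between reads, writes and admin work. The classes themselves never need to know how much of the resource is safe to use; the scheduler enforces it.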
48. Latest Results i3 vs i4i - one node
i3.16xlarge vs i4i.16xlarge (64 vCPU servers)
50% Reads / 50% Writes
Latency tests with 50% of the max throughput
49. Latest Results i3 vs i4i - 3 node cluster
Big thanks to Michał Chojnowski for benchmarking all the new AWS instance types!
i3.16xlarge vs i4i.16xlarge (64 vCPU servers)
50% Reads / 50% Writes
Latency tests with 50% of the max throughput
67% better price/performance!
56. Intro: why Rust?
● Rust is gaining popularity very fast
$ curl -s https://www.p99conf.io | grep -ci rust
12
● Some of the existing customers are already giving our driver a try (including @ultrabug)
● The language is comparable to C++, but with a stricter compiler, better defaults, decent pattern matching and an interesting built-in async model
○ liveness issues are usually detected at compile time
○ the checks also work across threads (Send + Sync traits)
○ everything is const by default, mutable only if explicitly declared as such
○ coroutine-like await syntax is part of the language
57. Intro: why Tokio?
1. Most popular async runtime, which should translate to best adoption
2. It would be tempting to write the driver in a runtime-agnostic way, but it's hard:
a. not all APIs are well defined yet
b. Tokio offers quite complete support for TCP communication, timeouts and other useful abstractions
3. Very actively developed
60. WebAssembly
Binary format for expressing executable code, executed on a stack-based virtual machine.
Designed to be:
+ portable
+ easily embeddable
+ efficient
WebAssembly is binary, but it also specifies a standard human-readable format: WAT
(WebAssembly Text Format).
Roadmap
61. WebAssembly
Creating a user-defined function with Wasm is as easy as providing its source code
represented in WebAssembly Text Format:
CREATE FUNCTION fib(input bigint) RETURNS NULL ON NULL INPUT RETURNS bigint
LANGUAGE xwasm AS
'(module
  (func $fib (param $n i64) (result i64)
    (if
      (i64.lt_s (local.get $n) (i64.const 2))
      (return (local.get $n))
    )
    (i64.add
      (call $fib (i64.sub (local.get $n) (i64.const 1)))
      (call $fib (i64.sub (local.get $n) (i64.const 2)))
    )
  )
  (export "fib" (func $fib))
)';
cassandra@cqlsh:ks> SELECT n, fib(n) FROM numbers;
n | ks.fib(n)
---+-----------
1 | 1
2 | 1
3 | 2
4 | 3
5 | 5
6 | 8
7 | 13
8 | 21
9 | 34
(9 rows)
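For readers less used to WAT, the same recursion can be written as a short reference function. This is only a cross-check of the query output above, written in Python for readability; it is not how the UDF is defined in ScyllaDB.

```python
def fib(n: int) -> int:
    # Same logic as the WAT module: return n when n < 2,
    # otherwise fib(n - 1) + fib(n - 2).
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


for n in range(1, 10):
    print(n, fib(n))
```

For n from 1 to 9 this gives the same values as the `ks.fib(n)` column in the query result.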
62. User-defined functions
User-defined functions are a CQL feature that allows applying a custom function
to the query result rows.
cassandra@cqlsh:ks> SELECT id, inv(id), mult(id, inv(id)) FROM t;
id | ks.inv(id) | ks.mult(id, ks.inv(id))
----+------------+-------------------------
7 | 0.142857 | 1
1 | 1 | 1
0 | Infinity | NaN
4 | 0.25 | 1
(4 rows)
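The slide does not show how `inv` and `mult` are defined. Assuming they are a reciprocal and a product over doubles, the `Infinity` and `NaN` in the row for id 0 follow from IEEE 754 arithmetic. The sketch below mirrors that assumption in Python; the function bodies are guesses for illustration, not the UDF source.

```python
import math


def inv(x: float) -> float:
    # Assumed definition: reciprocal with IEEE 754 semantics (1/0 -> Infinity).
    # Python raises ZeroDivisionError instead, so the zero case is handled by hand.
    if x == 0:
        return math.copysign(math.inf, x)
    return 1.0 / x


def mult(a: float, b: float) -> float:
    # Assumed definition: plain multiplication; 0 * Infinity is NaN under IEEE 754.
    return a * b


for i in (7, 1, 0, 4):
    print(i, inv(i), mult(i, inv(i)))
```

The row for id 0 shows that a UDF is applied to each result row independently: one row producing `NaN` does not affect the others.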
Roadmap
63. Kubernetes Operator
+ Performance
+ Stability
+ Security
  + Image pull secrets added to CRD. Users may specify their private secure repository of Scylla images.
  + Scylla Manager Agent secret token
65. Built for Extreme Scale
+ Safe Cluster Level Operation
+ Repair Based Node Operations
+ Repair based Tombstone Garbage Collection
+ Unbucketed TWCS tables
+ Improved OOM Resistance
+ Gossip Free Node Operations
+ New I/O Model and Scheduler
+ I4i instances
+ CQL: Reversed Queries
+ Removed large partition penalty
+ Off-strategy compaction
+ Rust Driver (with C++ wrapper)
+ ARM Support
+ K8S support
+ Workload definitions per role
+ Alternator TTL
+ WASM UDF
+ Virtual Table for Configuration
Resilience Performance Ecosystem
66. What's Next?
+ ScyllaDB 5.0 updates will roll into the upcoming ScyllaDB Enterprise 2022.1 and Scylla Cloud
+ Safe schema and topology updates will roll into production
+ I4i will be available for Scylla Cloud
+ New Drivers roll out
+ Many more features (WASM UDF and more)
Roadmap
67. 28th of July 8AM-12PM PT
Half-day of free online training with some of our best engineers and experts
Register here.
68. Poll
How much data do you have under management in your transactional database?
69. Thank you for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/