ScyllaDB 5.0
Innovations for Extreme Scale
Tzach Livyatan, VP Product
Brought to you by
VIRTUAL EVENT | OCTOBER 19 + 20
All Things Performance
The event for developers who care about P99
percentiles and high-performance, low-latency
applications.
Register at p99conf.io
Poll
Where are you in your NoSQL adoption?
Tzach Livyatan
VP of Product, ScyllaDB
+ Leads the product team at ScyllaDB
+ Appreciates distributed system testing
+ Lives in Tel Aviv, father of two
Agenda
+ How did we get here?
+ ScyllaDB V Theme
+ Resilience
+ Performance
+ Ecosystem
+ What's Next?
+ Infoworld 2020 Technology of the Year!
+ Founded by designers of KVM Hypervisor
The Database Built for Gamechangers
“ScyllaDB stands apart...It’s the rare product
that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the
power a customer will ever need, on workloads that other
databases can’t touch – and at a fraction of the cost of
an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+ Resolves challenges of legacy NoSQL databases
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ DBaaS/Cloud, Enterprise and Open Source solutions
+ Proven globally at scale
400+ Gamechangers Leverage ScyllaDB
Seamless experiences across content + devices
Fast computation of flight pricing
Corporate fleet management
Real-time analytics
2,000,000 SKU e-commerce management
Real-time location tracking for friends/family
Video recommendation management
IoT for industrial machines
Synchronize browser properties for millions
Threat intelligence service using JanusGraph
Real-time fraud detection across 6M transactions/day
Uber scale, mission critical chat & messaging app
Network security threat detection
Power ~50M X1 DVRs with billions of reqs/day
Precision healthcare via Edison AI
Inventory hub for retail operations
Property listings and updates
Unified ML feature store across the business
Cryptocurrency exchange app
Geography-based recommendations
Distributed storage for distributed ledger tech
Global operations: Avon, Body Shop + more
Predictable performance for on sale surges
GPS-based exercise tracking
Cloud infrastructure: The last ~10 years
SSD: $2500/TB
Performance
improvement
2008 2012
Typical instance 4 cores
SSD $100/TB - 1000x faster, 10x cheaper
96 core VMs - 20x more cores
100Gbps NICs - 100x more throughput
2015 2022
2000 CPU core systems and
beyond
ScyllaDB Journey
Superior Performance
Full Compatibility: Apache Cassandra, AWS DynamoDB
K8s, CDC, Kafka and Spark support
Shard-aware drivers
ScyllaDB 5.x
Scylla Cloud
Enhanced security
Workload Prioritization
Enhanced performance (ICS)
ScyllaDB Enterprise 2022.x, based on Open Source 5.x
Enterprise
Open Source
Performance
Resilience
Ecosystem
ScyllaDB V Core Themes
Innovations for Extreme Scale
Strongly Consistent
Schema and Topology
Updates
Resilience
What is Raft
(baby don't hurt me no more)
Picture from https://en.wikipedia.org/wiki/Raft_%28algorithm%29
Raft intro
Raft is a protocol for state machine replication.
What does it mean?
+ The majority of nodes have the same state
+ State transition happens in the same order on all
nodes
+ Cluster topology is part of the state
Raft paper: https://raft.github.io/raft.pdf
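The two properties above (a majority holds the same state, and transitions are applied in the same order everywhere) can be pictured with a deliberately tiny sketch. It is not Raft: there are no terms, elections or log repair, and the `Node` and `replicate` names are hypothetical. It only shows that an entry counts as committed once a majority has stored it, and that applying committed entries in log order gives identical state on those nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One replica: an ordered log plus the state built by applying it."""
    name: str
    up: bool = True
    log: list = field(default_factory=list)
    state: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries already applied to state

    def append(self, entry) -> bool:
        # A node that is down cannot store or acknowledge the entry.
        if not self.up:
            return False
        self.log.append(entry)
        return True

    def apply_committed(self, commit_index: int) -> None:
        # Apply entries strictly in log order, up to the commit index.
        while self.applied < min(commit_index, len(self.log)):
            key, value = self.log[self.applied]
            self.state[key] = value
            self.applied += 1


def replicate(leader, followers, entry, commit_index):
    """Append on the leader and followers; commit only on a majority of acks."""
    leader.append(entry)
    acks = 1 + sum(f.append(entry) for f in followers)
    if acks > (1 + len(followers)) // 2:
        # Committing an entry also commits everything before it in the log.
        commit_index = len(leader.log)
    for node in [leader, *followers]:
        if node.up:
            node.apply_committed(commit_index)
    return commit_index


a, b, c = Node("A"), Node("B"), Node("C", up=False)  # C is unreachable
commit = 0
commit = replicate(a, [b, c], ("schema", "t(a)"), commit)
commit = replicate(a, [b, c], ("schema", "t(a, b)"), commit)
commit = replicate(a, [b, c], ("topology", ["A", "B", "C"]), commit)
print(commit, a.state == b.state)  # prints: 3 True
```

With one of three nodes down, every entry still reaches two nodes, so all three entries commit and A and B end up with the same schema and topology values.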
How Scylla Raft is special
Scylla Raft implements a number of important extensions:
+ Increased liveness for very large clusters (1000+ nodes)
+ Resilience against asymmetric network failures
+ Read and write support on all cluster nodes
+ Efficient multi-raft: every node can replicate many state machines
Schema Update with Gossip
[Figure: three App clients; two concurrent ALTER TABLE statements collide]
Consistency Model of Schema Changes
id first last
1 John Doe
Time
Node A: Node B:
id first last email
1 John Doe
2 Jenny Smith j@...
id first last email phone
1 John Doe
2 Jenny Smith j@... (867)
id first last phone
1 John Doe
2 Jenny Smith (867)
Split
brain
(In)consistency of Schema Changes
cqlsh:test> create table t (a int primary key);
----------------------------------------------- split ------------------------------------------
cqlsh:test> alter table t rename a to d;
Warning: schema version mismatch detected
cqlsh:test> insert into t (d) values (1);
Cannot execute this query as it might involve data filtering and thus
may have unpredictable performance.
cqlsh:test> insert into t (a) values (1);
Unknown identifier a
Schema Update with Raft
S1
S2
S3
CREATE TABLE t | ADD COLUMN b | CREATE INDEX t_i1
Raft log:
INSERT INTO t | SET b = 2 | SELECT b
- schema fetch
Schema Update with Raft
[Figure: three App clients; two concurrent ALTER TABLE statements are safely serialized]
What is Topology?
Topology is defined as all of the following:
the set of nodes in the cluster,
location of those nodes in DCs and racks,
and assignment of ownership of data to nodes
Token Metadata
[Figure: token ring with ranges owned by node A, node B and node C]
Token metadata:
+ Each node has a set of tokens assigned during bootstrap
(vnodes)
+ Tokens combined determine primary owning replicas for key
ranges
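As a rough illustration of how combined tokens determine the owning replicas, the sketch below sorts all tokens into a ring and walks clockwise from a key's token until it has collected the requested number of distinct nodes. The `TokenRing` class, the MD5-based `token_for` hash and the token values are assumptions made for this example; they are not ScyllaDB's partitioner or its internal data structures.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a partition key to a 64-bit token (stand-in hash, for illustration)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big")


class TokenRing:
    def __init__(self, tokens_by_node):
        # Flatten {node: [tokens]} into a sorted list of (token, node) pairs.
        self.ring = sorted(
            (token, node)
            for node, tokens in tokens_by_node.items()
            for token in tokens
        )
        self.tokens = [token for token, _ in self.ring]

    def replicas_for_token(self, token: int, rf: int):
        """Walk clockwise from the token, collecting rf distinct nodes."""
        # The first ring entry with a token >= the key's token is the primary owner.
        start = bisect.bisect_left(self.tokens, token)
        owners = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == rf:
                break
        return owners

    def replicas(self, key: str, rf: int):
        return self.replicas_for_token(token_for(key), rf)


# Two tokens per node, evenly spaced over the 64-bit token space.
step = 2**64 // 6
ring = TokenRing({
    "A": [0 * step, 3 * step],
    "B": [1 * step, 4 * step],
    "C": [2 * step, 5 * step],
})
print(ring.replicas_for_token(step + 1, rf=2))      # prints: ['C', 'A']
print(ring.replicas_for_token(5 * step + 1, rf=2))  # wraps around: ['A', 'B']
print(ring.replicas("user:42", rf=3))
```

A token past the last entry wraps to the start of the ring, which is why the second lookup returns node A first.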
Eventually (In)consistent Topology
[Figure: node A, node B, node C and bootstrapping node D; each node holds its own local view of the token metadata, which is propagated in gossip]
Growing a cluster - One by One
Growing a cluster with Strong
Consistency Roadmap
Raft for Tablets
+ Manageable number of Raft groups (~100,000)
+ No client-side timestamps
+ Provides isolation for ALL queries
+ Writes do not require a read
+ No need to repair
+ Strong consistency of materialized views
+ Linearizable, more powerful schema and topology changes
+ High Availability and partition tolerance of Cassandra are mostly unaffected
+ Tablets allow loaded shards to be split dynamically
Roadmap
Repair Based Node Operations (RBNO)
Resilience
Growing a cluster
What is RBNO
+ Use row level repair as the underlying mechanism to sync data between nodes instead of streaming
+ Single mechanism for all the node operations
+ Bootstrap / replace / rebuild / decommission / removenode / repair
Benefits of RBNO
Significant improvements on performance and data safety
+ Resumable
+ Resume from previous failed bootstrap operations
+ Consistency
+ Latest replica is guaranteed
+ Simplified
+ No need to run repair before or after node operations like replace and removenode
+ Unified
+ All node ops use the same underlying mechanism
Off-strategy compaction
Make compaction during node operations more efficient
+ What is it
+ SSTables generated by node operations are kept in a separate data set
+ Compact them together and integrate them into the main set when the node operation is done
+ Benefits
+ Less compaction work during node operations
+ Faster to complete node operations
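The idea can be sketched in a few lines: SSTables that arrive during a node operation are parked in a separate set, merged once at the end, and only then integrated. The `Table` class, the set names and the `(key, timestamp, value)` row layout are hypothetical simplifications for this example, not ScyllaDB's storage format.

```python
import heapq


def compact(runs):
    """Merge sorted runs of (key, timestamp, value); the newest write per key wins."""
    merged = []
    for key, ts, value in heapq.merge(*runs):
        if merged and merged[-1][0] == key:
            # Rows for the same key arrive in timestamp order, so the later one replaces.
            merged[-1] = (key, ts, value)
        else:
            merged.append((key, ts, value))
    return merged


class Table:
    def __init__(self):
        self.main_set = []         # sstables that serve reads and regular compaction
        self.maintenance_set = []  # sstables received during a node operation

    def receive(self, sstable):
        # Keep incoming sstables apart instead of compacting them one by one.
        self.maintenance_set.append(sstable)

    def finish_node_operation(self):
        # Compact the separate set together once, then integrate the result.
        if self.maintenance_set:
            self.main_set.append(compact(self.maintenance_set))
            self.maintenance_set = []


table = Table()
table.receive([("a", 1, "x"), ("c", 1, "y")])
table.receive([("a", 2, "z"), ("b", 1, "w")])
table.finish_node_operation()
print(table.main_set)  # prints: [[('a', 2, 'z'), ('b', 1, 'w'), ('c', 1, 'y')]]
```

The main set sees one merged SSTable instead of every intermediate file, which is where the reduced compaction work during the operation comes from in this toy model.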
Improved OOM Resistance
[Figure: two read-path diagrams. In both, a combined reader merges a memtable reader with a cache reader, and on cache miss a second combined reader merges sstable readers 1..N; the first diagram also includes a restricted reader.]
New I/O Scheduler
Performance
Why scheduling at all
+ Different components compete for limited resources (Reads, Writes, Admin)
+ They have different priorities
+ They have no idea how not to over-consume the resource
How does it work?
Diskplorer 1
Diskplorer 3 (AWS i3en.3xlarge)
Scheduler safety area
The new I/O Scheduler
+ Collect information about disks
+ Build a more accurate mathematical disk model
+ Embody the model into the I/O scheduler
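One way to picture "embodying the model into the scheduler" is a disk described by its measured read/write bandwidth and IOPS limits, with a mix of load admitted only while its normalized sum stays under a threshold. The linear form, the `DiskModel` name and every number below are assumptions for illustration; the slides do not spell out the actual model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskModel:
    # Hypothetical limits, of the kind a tool like Diskplorer might measure.
    max_read_bw: float     # bytes per second
    max_write_bw: float    # bytes per second
    max_read_iops: float
    max_write_iops: float

    def load(self, read_bw, write_bw, read_iops, write_iops) -> float:
        """Normalized load: 1.0 means the modeled disk is saturated."""
        return (
            read_bw / self.max_read_bw
            + write_bw / self.max_write_bw
            + read_iops / self.max_read_iops
            + write_iops / self.max_write_iops
        )

    def admits(self, read_bw, write_bw, read_iops, write_iops, limit=1.0) -> bool:
        """True while the requested mix stays inside the modeled safety area."""
        return self.load(read_bw, write_bw, read_iops, write_iops) <= limit


disk = DiskModel(max_read_bw=2e9, max_write_bw=1e9,
                 max_read_iops=400_000, max_write_iops=200_000)

# Each of the four terms contributes 0.25, so this mix sits exactly on the limit.
print(disk.admits(read_bw=5e8, write_bw=2.5e8, read_iops=1e5, write_iops=5e4))  # True
# Doubling the write bandwidth pushes the load to 1.25, outside the safety area.
print(disk.admits(read_bw=5e8, write_bw=5e8, read_iops=1e5, write_iops=5e4))    # False
```

A scheduler built on such a model can dispatch requests only while the mix is admitted, which keeps reads, writes and admin work from over-consuming the disk.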
Latency while replacing a node
Replace a node in Scylla 4.6
[Figure: P99 latency over time, with "New node added" and "Streaming completed" markers]
Replace a node in Scylla 5.1
[Figure: P99 latency over time, with "New node added" and "Streaming completed" markers]
New I4i Instances
Performance
I4 NVMe Storage
Latest Results I3 vs I4 - one node
i3.16xlarge vs i4i.16xlarge (64 vCPU servers)
50% Reads / 50% Writes
Latency tests with 50% of the max throughput
Latest Results I3 vs I4 - 3 node cluster
Big thanks to Michał Chojnowski for benchmarking all the new AWS instance types!
i3.16xlarge vs i4i.16xlarge (64 vCPU servers)
50% Reads / 50% Writes
Latency tests with 50% of the max throughput
67% better price/performance!
Large Partition?
Wide Partition Example
Removed large partition penalty
RAM
Disk
Reversed Queries
Petabyte Performance
Ecosystem
Scylla Rust Driver
Ecosystem
Intro: why Rust?
● Rust is gaining popularity very fast
$ curl -s https://www.p99conf.io | grep -ci rust
12
● Some of the existing customers are already giving our driver a try (including @ultrabug)
● The language is comparable to C++, but with a more strict compiler, better defaults,
decent pattern matching and an interesting built-in async model
○ lifetime issues are usually detected at compile time
○ the checks also work across threads (Send + Sync traits)
○ everything is const by default, mutable only if explicitly declared as such
○ coroutine-like await syntax is part of the language
Intro: why Tokio?
1. Most popular async runtime, which should translate to best adoption
2. It would be tempting to write the driver in a runtime-agnostic way, but it's hard:
a. not all APIs are well defined yet
b. Tokio offers quite complete support for TCP communication, timeouts and other useful
abstractions
3. Very actively developed
code: https://github.com/cvybhu/rust-driver-benchmarks
Rust Driver
Rust Inside
C++ Driver Python Driver PHP Driver Add your own
Roadmap
WebAssembly
Binary format for expressing executable code, executed on a stack-based virtual machine.
Designed to be:
+ portable
+ easily embeddable
+ efficient
WebAssembly is binary, but it also specifies a standard human-readable format: WAT
(WebAssembly Text Format).
Roadmap
WebAssembly
Creating a user-defined function with Wasm is as easy as providing its source code
represented in WebAssembly Text Format:
CREATE FUNCTION fib(input bigint) RETURNS NULL ON NULL INPUT RETURNS bigint
LANGUAGE xwasm AS
'(module
  (func $fib (param $n i64) (result i64)
    (if
      (i64.lt_s (local.get $n) (i64.const 2))
      (return (local.get $n))
    )
    (i64.add
      (call $fib (i64.sub (local.get $n) (i64.const 1)))
      (call $fib (i64.sub (local.get $n) (i64.const 2)))
    )
  )
  (export "fib" (func $fib))
)';
cassandra@cqlsh:ks> SELECT n, fib(n) FROM numbers;
n | ks.fib(n)
---+-----------
1 | 1
2 | 1
3 | 2
4 | 3
5 | 5
6 | 8
7 | 13
8 | 21
9 | 34
(9 rows)
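For readers less used to WAT, the same recursion reads as follows in Python. It is only a reading aid for the module above: Python integers do not wrap at 64 bits the way `i64` does, which makes no difference for the small inputs in the table.

```python
def fib(n: int) -> int:
    """Mirror of the WAT function: return n when n < 2, else fib(n-1) + fib(n-2)."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


# Reproduces the fib(n) column of the query result for n = 1..9.
for n in range(1, 10):
    print(n, fib(n))
```

The printed values for n = 1..9 are 1, 1, 2, 3, 5, 8, 13, 21, 34, matching the rows returned by the SELECT.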
User-defined functions
User-defined functions are a CQL feature that allows applying a custom function
to the query result rows.
cassandra@cqlsh:ks> SELECT id, inv(id), mult(id, inv(id)) FROM t;
id | ks.inv(id) | ks.mult(id, ks.inv(id))
----+------------+-------------------------
7 | 0.142857 | 1
1 | 1 | 1
0 | Infinity | NaN
4 | 0.25 | 1
(4 rows)
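The `Infinity` and `NaN` cells in the row for id 0 follow from IEEE 754 floating-point arithmetic. The slides do not show how `inv` and `mult` are defined, so the Python versions below are assumed definitions (a reciprocal and a product) chosen to reproduce the same behaviour.

```python
import math


def inv(x: float) -> float:
    """Assumed definition: 1/x, with 1/0 mapped to +Infinity as in IEEE 754."""
    # Python raises ZeroDivisionError for 1.0 / 0, so the zero case is handled explicitly.
    return math.inf if x == 0 else 1.0 / x


def mult(a: float, b: float) -> float:
    """Assumed definition: plain floating-point product."""
    return a * b


for row_id in (7, 1, 0, 4):
    print(row_id, inv(row_id), mult(row_id, inv(row_id)))
```

For id 0 the product is 0 times infinity, which IEEE 754 defines as NaN, so the function composition yields `NaN` rather than 1.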
Roadmap
Kubernetes Operator
+ Performance
+ Stability
+ Security
+ Image pull secrets added to CRD.
+ Users may specify their private secure repository of Scylla images.
Scylla Manager Agent
secret token
Built for Extreme Scale
+ Safe Cluster Level Operation
+ Repair Based Node Operations
+ Repair based Tombstone Garbage Collection
+ Unbucketed TWCS tables
+ Improved OOM Resistance
+ Gossip Free Node Operations
+ New I/O Model and Scheduler
+ I4i instances
+ CQL: Reversed Queries
+ Removed large partition penalty
+ Off-strategy compaction
+ Rust Driver (with C++ wrapper)
+ ARM Support
+ K8S support
+ Workload definitions per role
+ Alternator TTL
+ WASM UDF
+ Virtual Table for Configuration
Resilience Performance Ecosystem
What's Next?
+ ScyllaDB 5.0 updates will roll into upcoming ScyllaDB Enterprise 2022.1 and
Scylla Cloud
+ Safe schema and topology update will roll into production
+ I4i will be available for Scylla Cloud
+ New Drivers roll out
+ Many more features (WASM UDF and more)
Roadmap
28th of July 8AM-12PM PT
Half-day of free online training with
some of our best engineers and
experts
Register here.
Poll
How much data do you have under management in your
transactional database?
Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/

More Related Content

Similar to What’s New in ScyllaDB Open Source 5.0

Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...
Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...
Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...DevOps.com
 
Experience sql server on l inux and docker
Experience sql server on l inux and dockerExperience sql server on l inux and docker
Experience sql server on l inux and dockerBob Ward
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseScyllaDB
 
OpenEBS hangout #4
OpenEBS hangout #4OpenEBS hangout #4
OpenEBS hangout #4OpenEBS
 
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...
 The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ... The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...Josef Adersberger
 
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...QAware GmbH
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLNordic APIs
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesJosef Adersberger
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesQAware GmbH
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...StreamNative
 
How Development Teams Cut Costs with ScyllaDB.pdf
How Development Teams Cut Costs with ScyllaDB.pdfHow Development Teams Cut Costs with ScyllaDB.pdf
How Development Teams Cut Costs with ScyllaDB.pdfScyllaDB
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsStijn Decubber
 
Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes Elasticsearch
 
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and InfrastrctureRevolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and Infrastrcturesabnees
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...Chester Chen
 
Running a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesRunning a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesScyllaDB
 
OpenEBS Technical Workshop - KubeCon San Diego 2019
OpenEBS Technical Workshop - KubeCon San Diego 2019OpenEBS Technical Workshop - KubeCon San Diego 2019
OpenEBS Technical Workshop - KubeCon San Diego 2019MayaData Inc
 
Sheepdog Status Report
Sheepdog Status ReportSheepdog Status Report
Sheepdog Status ReportLiu Yuan
 
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...HostedbyConfluent
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 

Similar to What’s New in ScyllaDB Open Source 5.0 (20)

Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...
Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...
Running a Cost-Effective DynamoDB-Compatible Database on Managed Kubernetes S...
 
Experience sql server on l inux and docker
Experience sql server on l inux and dockerExperience sql server on l inux and docker
Experience sql server on l inux and docker
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency Database
 
OpenEBS hangout #4
OpenEBS hangout #4OpenEBS hangout #4
OpenEBS hangout #4
 
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...
 The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ... The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ...
 
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ...
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to Kubernetes
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to Kubernetes
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
How Development Teams Cut Costs with ScyllaDB.pdf
How Development Teams Cut Costs with ScyllaDB.pdfHow Development Teams Cut Costs with ScyllaDB.pdf
How Development Teams Cut Costs with ScyllaDB.pdf
 
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.jsTensorFlow meetup: Keras - Pytorch - TensorFlow.js
TensorFlow meetup: Keras - Pytorch - TensorFlow.js
 
Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes Bandwidth: Use Cases for Elastic Cloud on Kubernetes
Bandwidth: Use Cases for Elastic Cloud on Kubernetes
 
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and InfrastrctureRevolutionary Storage for Modern Databases, Applications and Infrastrcture
Revolutionary Storage for Modern Databases, Applications and Infrastrcture
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
 
Running a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesRunning a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes Services
 
OpenEBS Technical Workshop - KubeCon San Diego 2019
OpenEBS Technical Workshop - KubeCon San Diego 2019OpenEBS Technical Workshop - KubeCon San Diego 2019
OpenEBS Technical Workshop - KubeCon San Diego 2019
 
Sheepdog Status Report
Sheepdog Status ReportSheepdog Status Report
Sheepdog Status Report
 
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 

More from ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 

More from ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

What’s New in ScyllaDB Open Source 5.0

  • 1. ScyllaDB 5.0 Innovations for Extreme Scale Tzach Livyatan, VP Product
  • 2. Brought to you by VIRTUAL EVENT | OCTOBER 19 + 20 All Things Performance The event for developers who care about P99 percentiles and high-performance, low-latency applications. Register at p99conf.io
  • 3. Poll: Where are you in your NoSQL adoption?
  • 4. Tzach Livyatan, VP of Product, ScyllaDB + Leads the product team at ScyllaDB + Appreciates distributed system testing + Lives in Tel Aviv, father of two
  • 5. Agenda + How did we get here? + ScyllaDB V Theme + Resilience + Performance + Ecosystem + What's Next?
  • 6. + InfoWorld 2020 Technology of the Year! + Founded by designers of KVM Hypervisor The Database Built for Gamechangers “ScyllaDB stands apart...It’s the rare product that exceeds my expectations.” – Martin Heller, InfoWorld contributing editor and reviewer “For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.” – Adrian Bridgewater, Forbes senior contributor + Resolves challenges of legacy NoSQL databases + >5x higher throughput + >20x lower latency + >75% TCO savings + DBaaS/Cloud, Enterprise and Open Source solutions + Proven globally at scale
  • 7. +400 Gamechangers Leverage ScyllaDB Seamless experiences across content + devices Fast computation of flight pricing Corporate fleet management Real-time analytics 2,000,000 SKU e-commerce management Real-time location tracking for friends/family Video recommendation management IoT for industrial machines Synchronize browser properties for millions Threat intelligence service using JanusGraph Real time fraud detection across 6M transactions/day Uber scale, mission critical chat & messaging app Network security threat detection Power ~50M X1 DVRs with billions of reqs/day Precision healthcare via Edison AI Inventory hub for retail operations Property listings and updates Unified ML feature store across the business Cryptocurrency exchange app Geography-based recommendations Distributed storage for distributed ledger tech Global operations- Avon, Body Shop + more Predictable performance for on sale surges GPS-based exercise tracking
  • 8. Cloud infrastructure: The last ~10 years SSD: $2500/TB Performance improvement 2008 2012 Typical instance 4 cores SSD $100/TB - 1000x faster, 10x cheaper 96 core VMs - 20x more cores 100Gbps NICs - 100x more throughput 2015 2022 2000 CPU core systems and beyond
  • 9. ScyllaDB Journey Superior Performance Full Compatibility: Apache Cassandra AWS DynamoDB K8s CDC Kafka, Spark support Shard aware drivers ScyllaDB 5.x Scylla Cloud Enhance security Workload Prioritization Enhance performance (ICS) ScyllaDB Enterprise 2022.x Base on Open Source 5.x Enterprise Open Source
  • 10. Performance Resilience Ecosystem ScyllaDB V Core Themes Innovations for Extreme Scale
  • 11. Strongly Consistent Schema and Topology Updates Resilience
  • 12. What is Raft (baby don't hurt me no more) Picture from https://en.wikipedia.org/wiki/Raft_%28algorithm%29
  • 13. Raft intro Raft is a protocol for state machine replication. What does it mean? + The majority of nodes have the same state + State transition happens in the same order on all nodes + Cluster topology is part of the state Raft paper: https://raft.github.io/raft.pdf
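The idea of state machine replication on this slide can be illustrated with a toy model. The following is only a sketch under loud assumptions: the `Node` class, the `replicate` helper, and the `schema_version` key are hypothetical names, there is no leader election or term handling, and the whole log is shipped on every call. It is not how ScyllaDB or any Raft library is implemented; it only shows entries being committed by a majority and applied in log order.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A replica holding a copy of the log and the state derived from it."""
    name: str
    log: list = field(default_factory=list)
    state: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries already applied to `state`

    def apply_committed(self, commit_index: int) -> None:
        # Apply entries strictly in log order, never skipping ahead.
        while self.applied < commit_index:
            key, value = self.log[self.applied]
            self.state[key] = value
            self.applied += 1


def replicate(leader, followers, entry, reachable):
    """Append `entry` on the leader; commit only if a majority stores it."""
    leader.log.append(entry)
    acks = 1  # the leader counts itself
    for follower in followers:
        if follower.name in reachable:
            follower.log = list(leader.log)  # simplification: ship the full log
            acks += 1
    cluster_size = len(followers) + 1
    if acks * 2 <= cluster_size:
        return False  # no majority: the entry stays uncommitted
    commit_index = len(leader.log)
    leader.apply_committed(commit_index)
    for follower in followers:
        if follower.name in reachable:
            follower.apply_committed(commit_index)
    return True


a, b, c = Node("A"), Node("B"), Node("C")
ok1 = replicate(a, [b, c], ("schema_version", 1), reachable={"B", "C"})
ok2 = replicate(a, [b, c], ("schema_version", 2), reachable={"B"})
ok3 = replicate(a, [b, c], ("schema_version", 3), reachable=set())
print(ok1, ok2, ok3, a.state, b.state, c.state)
```

In this run the third entry is never applied because the leader alone is not a majority, and node C simply lags behind on an earlier prefix of the same log rather than diverging.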
  • 14. How Scylla Raft is special Scylla Raft implements a number of important extensions: + Increased liveness for very large clusters (1000+ nodes) + Resilience against asymmetric network failures + Read and write support on all cluster nodes + Efficient multi-raft: every node can replicate many state machines
  • 15. Schema Update with Gossip App App App ALTER TABLE ALTER TABLE Collision
  • 16. Consistency Model of Schema Changes id first last 1 John Doe Time Node A: Node B: id first last email 1 John Doe 2 Jenny Smith j@... id first last email phone 1 John Doe 2 Jenny Smith j@... (867) id first last phone 1 John Doe 2 Jenny Smith (867) Split brain
  • 17. (In)consistency of Schema Changes
      cqlsh:test> create table t (a int primary key);
      ------------------------- split -------------------------
      cqlsh:test> alter table t rename a to d;
      Warning: schema version mismatch detected
      cqlsh:test> insert into t (d) values (1);
      Cannot execute this query as it might involve data filtering and thus may have unpredictable performance.
      cqlsh:test> insert into t (a) values (1);
      Unknown identifier a
  • 18. Schema Update with Raft S1 S2 S3 CREATE TABLE t ADD COLUMN b CREATE INDEX t_i1 Raft log: INSERT INTO t SET b = 2 SELECT b - schema fetch
  • 19. Schema Update with Raft App App App ALTER TABLE ALTER TABLE Safe serialization
  • 20. What is Topology? Topology is defined as all of the following: the set of nodes in the cluster, location of those nodes in DCs and racks, and assignment of ownership of data to nodes
  • 21. Token Metadata node A node B node C A C B C A B Token metadata: + Each node has a set of tokens assigned during bootstrap (vnodes) + Tokens combined determine primary owning replicas for key ranges
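How a set of tokens determines the owning replicas for a key range can be sketched as a sorted ring lookup. This is a minimal illustration, assuming a hypothetical `TokenRing` class, small hand-picked integer tokens, and a simple "walk clockwise to the next distinct nodes" replica placement; real clusters use a hash partitioner and rack/DC-aware replication strategies that are not modeled here.

```python
import bisect


class TokenRing:
    """Maps a token to its owners, given the tokens assigned to each node."""

    def __init__(self, tokens_by_node):
        # Flatten {node: [tokens]} into a ring sorted by token.
        self._ring = sorted(
            (token, node)
            for node, tokens in tokens_by_node.items()
            for token in tokens
        )
        self._tokens = [token for token, _ in self._ring]

    def primary_owner(self, token):
        # A node owns the range (previous token, its own token];
        # tokens past the last entry wrap around to the first.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]

    def replicas(self, token, replication_factor):
        # Assumed placement: walk the ring clockwise, collecting distinct nodes.
        start = bisect.bisect_left(self._tokens, token)
        owners = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == replication_factor:
                break
        return owners


# Each node holds several tokens, as with vnodes.
ring = TokenRing({"A": [10, 40], "B": [20, 50], "C": [30, 60]})
print(ring.primary_owner(15), ring.primary_owner(61), ring.replicas(35, 2))
```

Because every node computes ownership from its own copy of this metadata, two nodes holding different copies can disagree about who owns a range, which is the problem the following slides address.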
  • 22. Eventually (In)consistent Topology node A node B node C Token metadata node D A B A B Bootstrapping node D A C B C A B A B A B local view local view in gossip
  • 23. Growing a cluster - One by One
  • 24. Growing a cluster - One by One
  • 25. Growing a cluster - One by One
  • 26. Growing a cluster - One by One
  • 27. Growing a cluster with Strong Consistency Roadmap
  • 28. Growing a cluster - with Strong Consistency Roadmap
  • 29. Raft for Tablets + Manageable number of Raft groups (~100,000) + No client-side timestamps + Provides isolation for ALL queries + Writes do not require a read + No need to repair + Strong consistency of materialized views + Linearizable, more powerful schema and topology changes + High Availability and partition tolerance of Cassandra are mostly unaffected + Tablets allow loaded shards to be split dynamically Roadmap
  • 30. Repair Based Node Operations (RBNO) Resilience
  • 32. What is RBNO? + Use row level repair as the underlying mechanism to sync data between nodes instead of streaming + Single mechanism for all the node operations + Bootstrap / replace / rebuild / decommission / removenode / repair
  • 33. Benefits of RBNO Significant improvements on performance and data safety + Resumable + Resume from previous failed bootstrap operations + Consistency + Latest replica is guaranteed + Simplified + No need to run repair before or after node operations like replace and removenode + Unified + All node ops use the same underlying mechanism
  • 34. Off-strategy compaction Make compaction during node operations more efficient + What is it + SSTables generated by node operations are kept in a separate data set + Compact them together and integrate them into the main set when the node operation is done + Benefits + Less compaction work during node operations + Faster to complete node operations
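The "separate data set, compact once at the end" idea can be sketched in a few lines. This is a toy model only: the `Table` class, its method names, and the `(key, timestamp, value)` row layout are assumptions made for illustration, and the merge keeps the row with the highest timestamp per key. It does not reflect ScyllaDB's actual SSTable format or compaction code.

```python
import heapq


class Table:
    def __init__(self):
        self.main_set = []     # sstables seen by the regular compaction strategy
        self.staging_set = []  # sstables produced by an ongoing node operation

    def add_from_node_operation(self, rows):
        # Rows are (key, timestamp, value); each sstable is kept sorted.
        self.staging_set.append(sorted(rows))

    def finish_node_operation(self):
        """Merge all staged sstables in one pass and publish the result."""
        if not self.staging_set:
            return
        merged = {}
        # heapq.merge yields rows ordered by (key, timestamp), so for each key
        # the last row seen carries the newest timestamp and overwrites the rest.
        for key, timestamp, value in heapq.merge(*self.staging_set):
            merged[key] = (timestamp, value)
        self.main_set.append([(k, ts, v) for k, (ts, v) in merged.items()])
        self.staging_set.clear()


table = Table()
table.add_from_node_operation([("k1", 1, "a"), ("k3", 1, "c")])
table.add_from_node_operation([("k1", 2, "a2"), ("k2", 1, "b")])
table.finish_node_operation()
print(table.main_set)
```

The main set receives a single compacted sstable instead of many small ones, so the regular strategy has less work to do while the node operation is running.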
  • 35. Improved OOM Resistance [Diagram of the read path: memtable reader, cache reader, restricted reader, combined readers, and sstable readers 1..N used on cache miss; shown once with and once without the restricted reader.]
  • 37. Why scheduling at all + Different components compete for limited resources (Reads, Writes, Admin) + They have different priorities + They have no idea how not to over-consume the resource
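One common way to arbitrate between components with different priorities is proportional-share scheduling, sketched below. Everything here is an assumption for illustration: the `FairQueue` class, the class names `reads` and `compaction`, the share values, and the unit request cost. It is not the scheduler described in the talk, only a small model of why a scheduler is needed when components cannot limit themselves.

```python
from collections import deque


class FairQueue:
    """Dispatches requests so each class gets capacity in proportion to its shares."""

    def __init__(self, shares):
        self.shares = dict(shares)
        self.queues = {name: deque() for name in self.shares}
        # Accumulated cost divided by shares; the lowest value goes next.
        self.consumed = {name: 0.0 for name in self.shares}

    def submit(self, cls, request, cost=1.0):
        self.queues[cls].append((request, cost))

    def dispatch(self):
        ready = [name for name, queue in self.queues.items() if queue]
        if not ready:
            return None
        cls = min(ready, key=lambda name: self.consumed[name])
        request, cost = self.queues[cls].popleft()
        # A class with more shares is charged less per request,
        # so it is selected more often.
        self.consumed[cls] += cost / self.shares[cls]
        return cls, request


queue = FairQueue({"reads": 3, "compaction": 1})
for i in range(8):
    queue.submit("reads", f"read-{i}")
    queue.submit("compaction", f"compact-{i}")

served = [queue.dispatch()[0] for _ in range(8)]
print(served.count("reads"), served.count("compaction"))
```

With both classes backlogged, the class holding three shares is served about three times as often as the class holding one, regardless of how aggressively either one submits work.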
  • 38. How does it work?
  • 40. Diskplorer 3 (AWS i3en.3xlarge)
  • 42. The new I/O Scheduler + Collect information about disks + Build a more accurate mathematical disk model + Embody the model into the I/O scheduler
  • 44. Replace a node in Scylla 4.6 New node added Streaming completed P99 Latency
  • 45. Replace a node in Scylla 5.1 New node added Streaming completed P99 Latency
  • 48. Latest Results I3 vs I4 - one node I3.16xlarge vs i4.16xlarge (64 vCPU servers) 50% Reads / 50% Writes Latency tests with 50% of the max throughput
  • 49. Latest Results I3 vs I4 - 3 node cluster Big thanks to Michał Chojnowski for benchmarking all the new AWS instances types! I3.16xlarge vs i4.16xlarge (64 vCPU servers) 50% Reads / 50% Writes Latency tests with 50% of the max throughput 67% better price/performance!
  • 51. Removed large partition penalty RAM Disk
  • 56. Intro: why Rust?
      ● Rust is gaining popularity very fast
          $ curl -s https://www.p99conf.io | grep -ci rust
          12
      ● Some of the existing customers are already giving our driver a try (including @ultrabug)
      ● The language is comparable to C++, but with a more strict compiler, better defaults, decent pattern matching and an interesting built-in async model
          ○ liveness issues are usually detected at compile time
          ○ the checks also work across threads (Send + Sync traits)
          ○ everything is const by default, mutable only if explicitly declared as such
          ○ coroutine-like await syntax is part of the language
  • 57. Intro: why Tokio? 1. Most popular async runtime, which should translate to best adoption 2. It would be tempting to write the driver in a runtime-agnostic way, but it's hard: a. not all APIs are well defined yet b. Tokio offers quite complete support for TCP communication, timeouts and other useful abstractions 3. Very actively developed
  • 59. Rust Inside C++ Driver Python Driver PHP Driver Add your own Roadmap
  • 60. WebAssembly Binary format for expressing executable code, executed on a stack-based virtual machine. Designed to be: + portable + easily embeddable + efficient WebAssembly is binary, but it also specifies a standard human-readable format: WAT (WebAssembly Text Format). Roadmap
  • 61. WebAssembly Creating a user-defined function with Wasm is as easy as providing its source code represented in WebAssembly Text Format:
      CREATE FUNCTION fib(input bigint)
      RETURNS NULL ON NULL INPUT
      RETURNS bigint
      LANGUAGE xwasm
      AS '(module
        (func $fib (param $n i64) (result i64)
          (if (i64.lt_s (local.get $n) (i64.const 2))
            (return (local.get $n))
          )
          (i64.add
            (call $fib (i64.sub (local.get $n) (i64.const 1)))
            (call $fib (i64.sub (local.get $n) (i64.const 2)))
          )
        )
        (export "fib" (func $fib))
      )';

      cassandra@cqlsh:ks> SELECT n, fib(n) FROM numbers;

       n | ks.fib(n)
      ---+-----------
       1 |         1
       2 |         1
       3 |         2
       4 |         3
       5 |         5
       6 |         8
       7 |        13
       8 |        21
       9 |        34
      (9 rows)
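For readers less familiar with the WebAssembly Text Format, the same recursion can be mirrored line for line in Python. This is only a reading aid, not part of the slide: the Python `fib` below follows the structure of the `$fib` function (return `n` when it is below 2, otherwise add the two recursive calls) and recomputes the values for `n` from 1 to 9.

```python
def fib(n: int) -> int:
    # Mirrors: (if (i64.lt_s n 2) (return n))
    if n < 2:
        return n
    # Mirrors: (i64.add (call $fib (n - 1)) (call $fib (n - 2)))
    return fib(n - 1) + fib(n - 2)


# Recompute the rows of the SELECT shown on the slide.
rows = [(n, fib(n)) for n in range(1, 10)]
for n, value in rows:
    print(f"{n} | {value}")
```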
  • 62. User-defined functions User-defined functions are a CQL feature that allows applying a custom function to the query result rows.
      cassandra@cqlsh:ks> SELECT id, inv(id), mult(id, inv(id)) FROM t;

       id | ks.inv(id) | ks.mult(id, ks.inv(id))
      ----+------------+-------------------------
        7 |   0.142857 |                       1
        1 |          1 |                       1
        0 |   Infinity |                     NaN
        4 |       0.25 |                       1
      (4 rows)
      Roadmap
  • 63. Kubernetes Operator + Performance + Stability + Security + Image pull secrets added to CRD. + Users may specify their private secure repository of Scylla images. Scylla Manager Agent secret token
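The slide does not show how `inv` and `mult` are defined. Assuming they are a floating-point reciprocal and a product, the `Infinity` and `NaN` cells for `id = 0` follow from IEEE 754 arithmetic, which the sketch below reproduces. Both function bodies are guesses made for illustration.

```python
import math


def inv(x: float) -> float:
    # Assumed definition: reciprocal with IEEE 754 semantics.
    # Python raises ZeroDivisionError for 1.0 / 0, so infinity is returned explicitly.
    if x == 0:
        return math.inf
    return 1.0 / x


def mult(a: float, b: float) -> float:
    # Assumed definition: plain product.
    return a * b


for row_id in (1, 0, 4):
    print(row_id, inv(row_id), mult(row_id, inv(row_id)))
```

The row for `id = 0` yields infinity for the reciprocal, and zero times infinity is NaN, matching the shape of the result table above.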
  • 64.
  • 65. Built for Extreme Scale + Safe Cluster Level Operation + Repair Based Node Operations + Repair based Tombstone Garbage Collection + Unbucketed TWCS tables + Improved OOM Resistance + Gossip Free Node Operations + New I/O Model and Scheduler + I4i instances + CQL: Reversed Queries + Removed large partition penalty + Off-strategy compaction + Rust Driver (with C++ wrapper) + ARM Support + K8S support + Workload definitions per role + Alternator TTL + WASM UDF + Virtual Table for Configuration Resilience Performance Ecosystem
  • 66. What's Next? + ScyllaDB 5.0 updates will roll into upcoming ScyllaDB Enterprise 2022.1 and Scylla Cloud + Safe schema and topology update will roll into production + I4i will be available for Scylla Cloud + New Drivers roll out + Many more features (WASM UDF and more) Roadmap
  • 67. 28th of July 8AM-12PM PT Half-day of free online training with some of our best engineers and experts Register here.
  • 68. Poll: How much data do you have under management in your transactional database?
  • 69. Thank you for joining us today. @scylladb scylladb/ slack.scylladb.com @scylladb company/scylladb/ scylladb/