Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Felipe Mendes, Solution Architect at ScyllaDB
The Database for Gamechangers
+ For data-intensive applications that require high throughput and predictable low latencies
+ Close-to-the-metal design takes full advantage of modern infrastructure
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ Compatible with Apache Cassandra and Amazon DynamoDB
+ DBaaS/Cloud, Enterprise and Open Source solutions
“ScyllaDB stands apart...It’s the rare product
that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the
power a customer will ever need, on workloads that other
databases can’t touch – and at a fraction of the cost of
an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+400 Gamechangers Leverage ScyllaDB
Seamless experiences across content + devices
Digital experiences at massive scale
Corporate fleet management
Real-time analytics
2,000,000 SKU e-commerce management
Video recommendation management
Threat intelligence service using JanusGraph
Real-time fraud detection across 6M transactions/day
Uber scale, mission critical chat & messaging app
Network security threat detection
Power ~50M X1 DVRs with billions of reqs/day
Precision healthcare via Edison AI
Inventory hub for retail operations
Property listings and updates
Unified ML feature store across the business
Cryptocurrency exchange app
Geography-based recommendations
Global operations: Avon, Body Shop + more
Predictable performance for on sale surges
GPS-based exercise tracking
Serving dynamic live streams at scale
Powering India's top social media platform
Personalized advertising to players
Distribution of game assets in Unreal Engine
Introductions
Felipe Mendes, Solution Architect at ScyllaDB
+ Published Author on Linux and Databases
+ Helps teams solve their most challenging problems
+ Years of experience with Linux and distributed systems
Agenda
+ (Near) Linear Scaling
+ Enter Real-life
+ ScyllaDB under Load
+ Crafting Your Success
+ Beyond Linear Scaling
(Near) Linear Scaling
Why is it important … And when you shouldn't care :-)
Linear Speedup
Main goal is to run programs faster
+ To a point…
+ Measured as speedup S(p) = T(1) / T(p): the run time on one processor divided by the run time on p processors
+ Reasons for sub-linear speedup:
+ Laws! (Amdahl's, Gustafson-Barsis)
+ Task Management
+ Communication & Synchronization
Figure: ideal, typical, and super-linear speedup curves (source: 15.2 Performance in Practice)
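The two laws named above can be put into numbers with a short sketch. It is illustrative only: the function names and the 5% serial fraction are assumptions, not values from the talk. Amdahl's law keeps the problem size fixed, so speedup is capped at 1 / serial_fraction; Gustafson-Barsis lets the problem grow with the number of workers.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Amdahl's law: fixed problem size, speedup = 1 / (s + (1 - s) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def gustafson_speedup(serial_fraction: float, workers: int) -> float:
    """Gustafson-Barsis: scaled problem size, speedup = N - s * (N - 1)."""
    return workers - serial_fraction * (workers - 1)


if __name__ == "__main__":
    serial = 0.05  # hypothetical: 5% of the work cannot be parallelized
    for n in (1, 2, 4, 8, 16, 64, 1024):
        print(f"{n:>5} workers  "
              f"Amdahl {amdahl_speedup(serial, n):6.2f}x  "
              f"Gustafson {gustafson_speedup(serial, n):8.2f}x")
```

With a 5% serial fraction, the Amdahl curve flattens below 20x no matter how many workers are added, which is the "to a point" caveat on the slide.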
Universal Scaling Law
Generalization of Amdahl’s Law discovered by Dr. Neil Gunther. As the number of users (N) increases, the system throughput (X) will:
+ Enjoy a period of near-linear scaling
+ Eventually saturate some resource, such that increasing N doesn’t increase X. This defines maxX
+ Possibly encounter a coordination cost that drives X down as N increases further
Figure: throughput (X) versus users (N), with the Linear, Saturation, and Retrograde regions and maxX marked
How Optimizely (Safely) Maximizes Database Concurrency
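The three regions can be reproduced with the usual closed form of the Universal Scaling Law, X(N) = λN / (1 + σ(N − 1) + κN(N − 1)). The formula is Gunther's standard formulation and does not appear on the slide; the parameter values below are made up for illustration and would normally be fitted to measured throughput.

```python
import math


def usl_throughput(n: int, lam: float, sigma: float, kappa: float) -> float:
    """Throughput at concurrency n.

    lam   -- throughput of a single user (the slope of the linear region)
    sigma -- contention penalty (causes saturation)
    kappa -- coordination penalty (causes the retrograde region)
    """
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def peak_concurrency(sigma: float, kappa: float) -> float:
    """Concurrency at which throughput peaks; only defined for kappa > 0."""
    return math.sqrt((1.0 - sigma) / kappa)


if __name__ == "__main__":
    lam, sigma, kappa = 1000.0, 0.02, 0.0001  # hypothetical fitted values
    for n in (1, 10, 50, 99, 200, 400):
        print(f"N={n:>3}  X={usl_throughput(n, lam, sigma, kappa):9.0f} ops/s")
    print(f"throughput peaks near N={peak_concurrency(sigma, kappa):.0f}")
```

Setting both sigma and kappa to zero gives the ideal linear curve; a non-zero kappa is what makes throughput fall once N grows past the peak.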
Linear Scaling – Good
Relevant for parallel programming, useful for measuring:
+ Database efficiency
+ Price-performance
+ Scalability
NoSQL Benchmark: MongoDB vs ScyllaDB
Linear Scaling – Bad
Doesn't account for:
+ Improvements Over Time
+ Application Semantics
+ Hotspots
+ Scaling Clients
+ Consistent Hashing Uneven Distribution
+ Communication Overhead
Figure: gossip propagation
More on propagating state (and image credits): Gnutella: an Intro to Gossip
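To make the communication overhead concrete, here is a toy push-gossip simulation. It is a simplified model, not ScyllaDB's gossip implementation: in each round every node that already knows the new state sends it to one randomly chosen peer. The function name and the cluster sizes are illustrative assumptions.

```python
import math
import random


def gossip_rounds(num_nodes: int, seed: int = 0) -> tuple[int, int]:
    """Return (rounds, messages) until every node has heard the update."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the new state
    rounds = 0
    messages = 0
    while len(informed) < num_nodes:
        # Each informed node pushes the state to one random peer.
        messages += len(informed)
        targets = {rng.randrange(num_nodes) for _ in range(len(informed))}
        informed |= targets
        rounds += 1
    return rounds, messages


if __name__ == "__main__":
    for n in (8, 64, 512, 4096):
        rounds, messages = gossip_rounds(n)
        print(f"{n:>5} nodes: {rounds} rounds, {messages} messages "
              f"(log2 n = {math.log2(n):.0f})")
```

The number of rounds grows roughly with the logarithm of the cluster size, but the number of messages grows with the cluster itself, and that is work the nodes do instead of serving requests.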
Enter Real-life
Overlooked considerations no one (dares) to tell you ;-)
Application Semantics
1,000,000 sensors, representing homes in an area
IOT
Social
DynamoDB: When to Move Out?
Consistent Hashing
Exercise: How much more traffic and
load does this node receive?
Alexys Jacob – Leveraging consistent hashing in your python applications
thelastpickle – The Impacts of Changing the Number of VNodes in Apache Cassandra
Avi Kivity's shard simulator
Figure panels: Bad; Better, but not perfect
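The exercise can be explored with a small hash ring. This is a minimal sketch, assuming SHA-256 as the ring hash and hypothetical node and key names; it is not the partitioner used by ScyllaDB or Cassandra. It prints the share of keys each node owns for different numbers of virtual nodes (vnodes).

```python
import bisect
import hashlib
from collections import Counter


def _token(value: str) -> int:
    """Map a string to a 64-bit position on the ring (stable across runs)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes):
    """Place `vnodes` tokens per node on the ring, sorted by token."""
    return sorted((_token(f"{node}#{i}"), node)
                  for node in nodes for i in range(vnodes))


def owner(ring, tokens, key):
    """The first token clockwise from the key's position owns the key."""
    idx = bisect.bisect_right(tokens, _token(key)) % len(ring)
    return ring[idx][1]


def load_share(nodes, vnodes, num_keys=20_000):
    """Fraction of `num_keys` synthetic keys owned by each node."""
    ring = build_ring(nodes, vnodes)
    tokens = [token for token, _ in ring]
    counts = Counter(owner(ring, tokens, f"key-{k}") for k in range(num_keys))
    return {node: counts[node] / num_keys for node in nodes}


if __name__ == "__main__":
    cluster = ["node-a", "node-b", "node-c", "node-d"]
    for vnodes in (1, 16, 256):
        shares = load_share(cluster, vnodes)
        print(vnodes, {node: f"{share:.1%}" for node, share in shares.items()})
```

An even split would give every node 25%. Comparing the printed shares against that figure answers the exercise for a given layout: whichever node owns the largest arc of the ring receives proportionally more traffic.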
Hotspots
Adding more nodes won't help
How Discord Stores Trillions of Messages
Scaling Clients
Challenges:
+ For a system serving X static clients, what's the max effective concurrency to set on a single client?
+ When scaling clients, how to coordinate them to avoid overwhelming a group of replicas?
Figure: Discord consistent hash-based routing (DB calls)
Figure: Netflix Adaptive Concurrency
Performance Under Load – Adaptive Concurrency Limits
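One way to approach the first challenge is to let each client discover its own limit instead of hard-coding one. The sketch below uses additive increase / multiplicative decrease (AIMD), a generic congestion-control technique. It is not the algorithm from the Netflix article, and the class name, defaults, and timeout signal are assumptions for illustration.

```python
class AimdLimit:
    """Client-side concurrency limit that adapts to observed latency."""

    def __init__(self, initial=8, min_limit=1, max_limit=256, backoff=0.5):
        self.limit = initial
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.backoff = backoff
        self.in_flight = 0

    def try_acquire(self) -> bool:
        """Admit a request only while in-flight work is below the limit."""
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        return True

    def release(self, latency_s: float, timeout_s: float, failed: bool = False) -> None:
        """Record the outcome of one request and adjust the limit."""
        self.in_flight -= 1
        if failed or latency_s > timeout_s:
            # Overload signal: cut the limit multiplicatively.
            self.limit = max(self.min_limit, int(self.limit * self.backoff))
        else:
            # Healthy response: probe for more capacity one slot at a time.
            self.limit = min(self.max_limit, self.limit + 1)


if __name__ == "__main__":
    limiter = AimdLimit(initial=4)
    if limiter.try_acquire():
        # ... send the request, measure its latency ...
        limiter.release(latency_s=0.003, timeout_s=0.050)
    print("current limit:", limiter.limit)
```

Each client adapts independently, so this addresses the single-client question only. Coordinating many clients against the same group of replicas, the second challenge, still needs routing or server-side protection.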
ScyllaDB Under Load
Quantifying the Performance Impact of a Shard-per-Core Architecture
ScyllaDB Architecture
Dor Laor on P99 CONF: Quantifying the Impact of Shard Per Core Architecture
Linear Scale Ingestion
Constant Time Ingestion
Figure: five successive 2X (doubling) steps
Figure: five successive 2X (doubling) steps
“Nodes must be small, in case they fail”
No they don’t! {Replace, Add, Remove} node at constant time
Compaction Scale
Figure: five successive 2X (doubling) steps
Crafting Your Success
Do's and Don'ts
In a Nutshell...
Database Performance At Scale
Run Real Tests
Benchmark tools prove you can get there, but:
+ Application semantics are unique
+ Access patterns are unique
+ Real-life tooling is also unique
+ Addressing all corner-cases is time-consuming or even impossible
+ Don't blindly assume that 2x the resources will handle 2x the load
Eliminate Noise
Avoid large deployments of small nodes
+ Go Big or Go Home!
+ Considerably reduces the overhead associated with
communication & synchronization
+ Less resource overcommitment
+ BUT, keep balance:
+ Account for inevitable failures
+ Leave room for unpredictability
Tune the client side
Understand your data flows:
+ Can multiple clients spam a single key?
+ What happens when scaling the number of
clients?
+ How is load balancing achieved?
Power of Two Choices load balancing
P99 CONF – Conquering Load Balancing: Experiences from ScyllaDB Drivers
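Power of Two Choices can be sketched in a few lines: sample two nodes at random and send the request to the less loaded one. The simulation below is a toy model with assumed names and sizes, not the logic of any ScyllaDB driver. Requests never complete in it, so "load" is simply the number of requests each node has received.

```python
import random


def pick_power_of_two(load: dict, rng: random.Random) -> str:
    """Sample two distinct nodes and return the one with less load."""
    a, b = rng.sample(list(load), 2)
    return a if load[a] <= load[b] else b


def simulate(num_nodes=10, num_requests=10_000, seed=42):
    """Return the busiest node's load under (random choice, power of two)."""
    rng = random.Random(seed)
    random_load = {f"node-{i}": 0 for i in range(num_nodes)}
    p2c_load = dict(random_load)
    for _ in range(num_requests):
        random_load[rng.choice(list(random_load))] += 1
        p2c_load[pick_power_of_two(p2c_load, rng)] += 1
    return max(random_load.values()), max(p2c_load.values())


if __name__ == "__main__":
    random_max, p2c_max = simulate()
    print(f"busiest node, random choice:     {random_max} requests")
    print(f"busiest node, power of two:      {p2c_max} requests")
    print("perfectly even split would be:   1000 requests")
```

The appeal of the technique is that it needs no global view: comparing just two candidates is enough to keep the busiest node close to the average.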
Beyond Linear Scaling
Unveiling Performance Insights
ScyllaDB in 2018
Ingestion time – Lower is better
ScyllaDB in 2023
Ingestion time – Lower is better
Linear Scale Ingestion (2023 vs 2018)
Constant Time Ingestion
Figure: five successive 2X (doubling) steps
Getting Even Faster
Time to execute, lower is better
Going Beyond – Tablets
Figure: tablets and their replicas, with per-table replication metadata
Why ScyllaDB is Moving to a New Replication Algorithm: Tablets
Going Beyond – ScyllaDB Enterprise
Throughput – Higher is better
Summary
There's much more to performance than Linear Scaling:
+ The Good and the Bad of Linear Scaling
+ Real-life situations impacting linear scalability
+ ScyllaDB Shard-Per-Core Architecture
+ Run Realistic Workloads
+ How ScyllaDB drives the meaning of 'performance'
Q&A
ScyllaDB Cloud
Start free trial
scylladb.com/cloud
Feb 14-15 | VIRTUAL EVENT
scylladb.com/summit
Virtual Workshop
January 25, 2024
scylladb.com/events
Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/

Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Beyond Linear Scaling: A New Path for Performance with ScyllaDB

  • 1. Felipe Mendes, Solution Architect at ScyllaDB Beyond Linear Scaling A New Path for Performance with ScyllaDB
  • 2. The Database for Gamechangers + For data-intensive applications that require high throughput and predictable low latencies + Close-to-the-metal design takes full advantage of modern infrastructure + >5x higher throughput + >20x lower latency + >75% TCO savings + Compatible with Apache Cassandra and Amazon DynamoDB + DBaaS/Cloud, Enterprise and Open Source solutions. “ScyllaDB stands apart...It’s the rare product that exceeds my expectations.” – Martin Heller, InfoWorld contributing editor and reviewer. “For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.” – Adrian Bridgewater, Forbes senior contributor
  • 3. +400 Gamechangers Leverage ScyllaDB: Seamless experiences across content + devices; Digital experiences at massive scale; Corporate fleet management; Real-time analytics; 2,000,000 SKU e-commerce management; Video recommendation management; Threat intelligence service using JanusGraph; Real time fraud detection across 6M transactions/day; Uber scale, mission critical chat & messaging app; Network security threat detection; Power ~50M X1 DVRs with billions of reqs/day; Precision healthcare via Edison AI; Inventory hub for retail operations; Property listings and updates; Unified ML feature store across the business; Cryptocurrency exchange app; Geography-based recommendations; Global operations - Avon, Body Shop + more; Predictable performance for on sale surges; GPS-based exercise tracking; Serving dynamic live streams at scale; Powering India's top social media platform; Personalized advertising to players; Distribution of game assets in Unreal Engine
  • 4. Introductions Felipe Mendes, Solution Architect at ScyllaDB + Published Author on Linux and Databases + Helps teams solve their most challenging problems + Years of experience with Linux and distributed systems
  • 5. Agenda + (Near) Linear Scaling + Enter Real-life + ScyllaDB under Load + Crafting Your Success + Beyond Linear Scaling
  • 6. (Near) Linear Scaling: Why is it important … And when you shouldn't care :-)
  • 7. Linear Speedup: Main goal is to run programs faster + To a point… + Measured as (formula on slide) + Reasons for sub-linear speedup: + Laws! (Amdahl's, Gustafson-Barsis) + Task Management + Communication & Synchronization. Figure: ideal, typical, and super-linear speedup curves (source: 15.2 Performance in Practice)
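
Slide 7 names Amdahl's and Gustafson-Barsis' laws as reasons for sub-linear speedup, but the formula after "Measured as" did not survive in the transcript. The sketch below assumes the standard textbook definitions (speedup as single-worker time divided by N-worker time). The function names and the 95% parallel fraction are illustrative choices, not taken from the talk.

```python
def measured_speedup(time_one_worker: float, time_n_workers: float) -> float:
    """Observed speedup: how many times faster the N-worker run finished."""
    return time_one_worker / time_n_workers


def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: fixed problem size, the serial part never shrinks."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def gustafson_speedup(parallel_fraction: float, workers: int) -> float:
    """Gustafson-Barsis' law: the problem grows with the number of workers."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * workers


if __name__ == "__main__":
    # Assumed workload: 95% of the work can run in parallel.
    for n in (1, 2, 4, 8, 16, 64):
        print(n, round(amdahl_speedup(0.95, n), 2), round(gustafson_speedup(0.95, n), 2))
```

Under Amdahl's law a 5% serial fraction caps the speedup just below 20x regardless of how many workers are added, which is one way to read the "To a point…" bullet.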
  • 8. Universal Scaling Law: Generalization of Amdahl’s Law discovered by Dr. Neil Gunther. As the number of users (N) increases, the system throughput (X) will: + Enjoy a period of near linear scaling + Eventually saturate some resource such that increasing N doesn’t increase X. This defines maxX + Possibly encounter a coordination cost that drives down X with further increasing N. Figure: throughput curve with Linear, Saturation, and Retrograde regions and maxX marked (source: How Optimizely (Safely) Maximizes Database Concurrency)
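
The three regions on slide 8 can be reproduced with the usual form of the Universal Scaling Law, X(N) = λN / (1 + σ(N − 1) + κN(N − 1)), where σ models contention and κ models coordination cost. The slide does not print the formula, so the parameter names and the sample values below are assumptions for illustration only.

```python
import math


def usl_throughput(n: int, lam: float, sigma: float, kappa: float) -> float:
    """Throughput X at concurrency n under the Universal Scaling Law.

    lam   - throughput of a single user (X at n = 1)
    sigma - contention (serialization) coefficient
    kappa - coherency (coordination) coefficient
    """
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak_concurrency(sigma: float, kappa: float) -> float:
    """Concurrency at which throughput peaks (the maxX point), for kappa > 0."""
    return math.sqrt((1.0 - sigma) / kappa)


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to make the three regions visible.
    lam, sigma, kappa = 1000.0, 0.05, 0.0005
    print("peak near N =", round(usl_peak_concurrency(sigma, kappa), 1))
    for n in (1, 10, 44, 100, 200):
        print(n, round(usl_throughput(n, lam, sigma, kappa)))
```

With κ set to 0 the expression reduces to Amdahl's law scaled by λ, so throughput saturates but never turns retrograde. Any κ greater than 0 eventually drives X down as N grows.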
  • 9. Linear Scaling – Good: Relevant for parallel programming, useful for measuring: + Database efficiency + Price-performance + Scalability. Source: NoSQL Benchmark: MongoDB vs ScyllaDB
  • 10. Linear Scaling – Bad: Doesn't account for: + Improvements Over Time + Application Semantics + Hotspots + Scaling Clients + Consistent Hashing Uneven Distribution + Communication Overhead. Figure: Gossip propagation. More on propagating state (and image credits): Gnutella: an Intro to Gossip
  • 11. Enter Real-life: Overlooked considerations no one (dares) to tell you ;-)
  • 12. Application Semantics: 1,000,000 sensors, representing homes in an area. Panels: IOT, Social. Source: DynamoDB: When to Move Out?
  • 13. Consistent Hashing. Exercise: How much more traffic and load does this node receive? Panels: Bad; Better, but not perfect. References: Alexys Jacob – Leveraging consistent hashing in your python applications; thelastpickle – The Impacts of Changing the Number of VNodes in Apache Cassandra; Avi Kivity's shard simulator
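
The exercise on slide 13 (how much more load does one node receive?) can be approximated with a small token-ring simulation. This is a minimal sketch and not ScyllaDB's or Cassandra's partitioner: MD5 stands in for the real hash function, and the node names, key names, and vnode counts are hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def _token(value: str) -> int:
    """Map a string to a 64-bit token (MD5 is only a stand-in hash here)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes):
    """Place `vnodes` tokens per node on the ring, sorted by token."""
    return sorted((_token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes))


def load_imbalance(nodes, vnodes, num_keys=20000):
    """Ratio between the busiest node's key count and the ideal even share."""
    ring = build_ring(nodes, vnodes)
    tokens = [token for token, _ in ring]
    counts = Counter()
    for k in range(num_keys):
        # A key belongs to the first token clockwise from its hash (with wrap-around).
        idx = bisect.bisect_right(tokens, _token(f"key-{k}")) % len(ring)
        counts[ring[idx][1]] += 1
    return max(counts.values()) / (num_keys / len(nodes))


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(6)]
    for vnodes in (1, 16, 256):
        print(vnodes, "vnodes -> busiest node carries",
              round(load_imbalance(nodes, vnodes), 2), "x the even share")
```

A ratio of 1.0 would mean a perfectly even split. Raising the number of tokens per node usually moves the ratio closer to 1.0 without ever reaching it, which matches the "Better, but not perfect" panel.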
  • 14. Hotspots: Adding more nodes won't help
  • 15. Scaling Clients. Challenges: + For a system serving X static clients, what's the max effective concurrency to set on a single client? + When scaling clients, how to coordinate them to avoid overwhelming a group of replicas? Figure labels: Discord consistent hash-based routing; DB Calls; Netflix Adaptive Concurrency. References: How Discord Stores Trillions of Messages; Performance Under Load – Adaptive Concurrency Limits
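
Slide 15 asks what concurrency a single client should use and points to adaptive concurrency limits. One common way to adapt a limit is additive-increase/multiplicative-decrease (AIMD), sketched below. This is a generic illustration, not Netflix's library or any ScyllaDB driver API: the class name, the latency target, and the backoff factor are all assumptions.

```python
class AimdLimit:
    """Client-side concurrency limit that grows slowly and backs off quickly."""

    def __init__(self, initial=10, min_limit=1, max_limit=1000,
                 backoff=0.5, latency_target_ms=10.0):
        self.limit = initial
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.backoff = backoff
        self.latency_target_ms = latency_target_ms

    def on_sample(self, latency_ms: float, dropped: bool = False) -> int:
        """Update the limit after one request completes and return the new value."""
        if dropped or latency_ms > self.latency_target_ms:
            # Overload signal (timeout, rejection, slow reply): cut multiplicatively.
            self.limit = max(self.min_limit, int(self.limit * self.backoff))
        else:
            # Healthy reply: probe for more capacity one slot at a time.
            self.limit = min(self.max_limit, self.limit + 1)
        return self.limit


if __name__ == "__main__":
    limiter = AimdLimit()
    for latency in (2.0, 3.0, 2.5, 40.0, 2.0):
        print(latency, "ms ->", limiter.on_sample(latency))
```

Each client adjusts independently from the latency it observes, so no central coordinator is needed when the number of clients changes. Whether that is enough to protect a specific group of replicas depends on the workload and should be tested.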
  • 16. ScyllaDB Under Load: Quantifying the Performance Impact of a Shard-per-Core Architecture
  • 17. ScyllaDB Architecture. Source: Dor Laor on P99 CONF: Quantifying the Impact of Shard Per Core Architecture
  • 18. Linear Scale Ingestion: Constant Time Ingestion. Figure labels: 2X, repeated five times
  • 19. “Nodes must be small, in case they fail” No they don’t! {Replace, Add, Remove} node in constant time. Figure labels: 2X, repeated five times
  • 20. Compaction Scale. Figure labels: 2X, repeated five times
  • 22. In a Nutshell... Database Performance At Scale
  • 23. Run Real Tests: Benchmark tools prove you can get there, but: + Application semantics are unique + Access patterns are unique + Real-life tooling is also unique + Addressing all corner-cases is time-consuming or even impossible + Don't just blindly assume 2x will give you 2x load
  • 24. Eliminate Noise: Avoid large deployments of small nodes + Go Big or Go Home! + Considerably reduces the overhead associated with communication & synchronization + Less resource overcommitment + BUT, keep balance: + Account for inevitable failures + Leave room for unpredictability
  • 25. Tune the client side: Understand your data flows: + Can multiple clients spam a single key? + What happens when scaling the number of clients? + How is load balancing achieved? Figure: Power of Two Choices load balancing. Source: P99 CONF – Conquering Load Balancing: Experiences from ScyllaDB Drivers
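
Slide 25 mentions Power of Two Choices load balancing. The idea is to sample two candidate nodes at random and send the request to the less loaded one. The simulation below is a simplified balls-into-bins sketch: load only ever increases, whereas a real driver would track in-flight requests. The node count, request count, and seed are arbitrary assumptions.

```python
import random


def busiest_node_load(num_nodes: int, num_requests: int, choices: int, seed: int = 42) -> int:
    """Assign requests to nodes and return the load of the busiest node.

    choices=1 is plain random selection; choices=2 is Power of Two Choices.
    """
    rng = random.Random(seed)
    load = [0] * num_nodes
    for _ in range(num_requests):
        # Sample `choices` candidates and pick the one with the least load so far.
        candidates = [rng.randrange(num_nodes) for _ in range(choices)]
        target = min(candidates, key=lambda node: load[node])
        load[target] += 1
    return max(load)


if __name__ == "__main__":
    nodes, requests = 100, 10_000  # even share would be 100 requests per node
    print("random pick     :", busiest_node_load(nodes, requests, choices=1))
    print("power of two    :", busiest_node_load(nodes, requests, choices=2))
```

Comparing just two candidates keeps the decision cheap while pulling the busiest node's load much closer to the even share than a purely random pick does.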
  • 26. Beyond Linear Scaling: Unveiling Performance Insights
  • 27. ScyllaDB in 2018. Chart: Ingestion time – Lower is better
  • 28. ScyllaDB in 2023. Chart: Ingestion time – Lower is better
  • 29. Linear Scale Ingestion (2023 vs 2018): Constant Time Ingestion. Figure labels: 2X, repeated five times
  • 30. Getting Even Faster. Chart: Time to execute, lower is better
  • 31. Going Beyond – Tablets. Diagram labels: tablet; tablet replica; replication metadata (per table). Source: Why ScyllaDB is Moving to a New Replication Algorithm: Tablets
  • 32. Going Beyond – ScyllaDB Enterprise. Chart: Throughput – Higher is better
  • 33. Summary: There's much more to performance beyond Linear Scale: + Goods and Bads of Linear Scaling + Real-life situations impacting linear scalability + ScyllaDB Shard-Per-Core Architecture + Run Realistic Workloads + How ScyllaDB drives the meaning of 'performance'
  • 34. Q&A ScyllaDB Cloud Start free trial scylladb.com/cloud Feb 14-15 | VIRTUAL EVENT scylladb.com/summit Virtual Workshop January 25, 2024 scylladb.com/events
  • 35. Thank you for joining us today. @scylladb scylladb/ slack.scylladb.com @scylladb company/scylladb/ scylladb/