Using ScyllaDB for
Real-Time Read-Heavy
Workloads
Felipe Cardeneti Mendes, Technical Director, ScyllaDB
Tim Koopmans, Product Experience, ScyllaDB
Poll
Where are you in your NoSQL adoption?
+ For data-intensive applications that require high
throughput and predictable low latencies
+ Close-to-the-metal design takes full advantage of
modern infrastructure
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ Compatible with Apache Cassandra and Amazon
DynamoDB
+ DBaaS/Cloud, Enterprise and Open Source
solutions
The Database for Gamechangers
“ScyllaDB stands apart...It’s the rare product
that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the
power a customer will ever need, on workloads that other
databases can’t touch – and at a fraction of the cost of
an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+400 Gamechangers Leverage ScyllaDB
Seamless experiences
across content + devices
Digital experiences at
massive scale
Corporate fleet
management
Real-time analytics
2,000,000 SKU e-commerce
management
Video recommendation
management
Threat intelligence service
using JanusGraph
Real time fraud detection
across 6M transactions/day
Uber scale, mission critical
chat & messaging app
Network security threat
detection
Power ~50M X1 DVRs with
billions of reqs/day
Precision healthcare via
Edison AI
Inventory hub for retail
operations
Property listings and
updates
Unified ML feature store
across the business
Cryptocurrency exchange
app
Geography-based
recommendations
Global operations- Avon,
Body Shop + more
Predictable performance for
on sale surges
GPS-based exercise
tracking
Serving dynamic live
streams at scale
Powering India's top
social media platform
Personalized
advertising to players
Distribution of game
assets in Unreal Engine
Presenters
Felipe Cardeneti Mendes, Technical Director
+ Puppy Lover
+ Open Source Enthusiast
+ ScyllaDB passionate!
Tim Koopmans, Product Experience
+ Rust devy
+ Marathon Swimmer
+ Love all things P99
Agenda
+ Characterizing Read-heavy workloads
+ Challenges and Tradeoffs
+ ScyllaDB Under Load
+ Best Practices
+ Success Stories
High throughput REAL-TIME data processing.
Characterizing
Read-Heavy Workloads
+ Commonly referred to as "read-mostly"
+ Workloads requiring a high volume of reads at very low response times
+ Challenges involve:
+ Scaling reads – Caches can become prohibitively expensive
+ Competing workloads – No coordination
+ Expensive queries – Aggregations, expensive filtering clauses
+ Performance over time – Dataset growth, changing access patterns (hotspots)
Real-time Read-Heavy?
+ 👥 Social
+ Feeds
+ Activity Timelines
+ Chat
+ ⛭ Recommendation / Personalization
+ Past Interactions
+ Collaborative & Content-based Filtering
+ User Profile (preferences, demographics…)
+ 🔑 Session / Profile Management
+ Authentication / Authorization
+ Security
+ User Experience
Commonly Seen Use Cases
+ 👕 Product Catalogs
+ Product Browsing / Views
+ Reviews
+ Inventory & Pricing Tracking
+ 🔔 Betting
+ Live Odds and Updates
+ Leaderboards
+ Real-time notifications
+ 🌐 Metadata Store / CDNs
+ Static assets
+ Caching layer
+ Content versioning
What happens during a read?
Challenges and
Tradeoffs
ScyllaDB Read Path
memtable
RAM
Disk
Read
cache
sstable
sstable
sstable
Flush
Merge
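The diagram above can be read roughly as follows: a read first checks the cache in RAM; on a miss, it merges the versions found in the memtable and the sstables, keeps the newest one, and populates the cache. The sketch below is a toy model under those assumptions only. `ToyReplica` and all of its methods are hypothetical names for illustration, not ScyllaDB's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplica:
    """Toy model of the read path in the diagram (hypothetical, not ScyllaDB code)."""
    cache: dict = field(default_factory=dict)     # key -> (timestamp, value), in RAM
    memtable: dict = field(default_factory=dict)  # recent writes, in RAM
    sstables: list = field(default_factory=list)  # immutable dicts standing in for disk

    def write(self, key, value, timestamp):
        self.memtable[key] = (timestamp, value)
        # Simplification: drop the cached entry so the toy cache never serves stale data.
        self.cache.pop(key, None)

    def flush(self):
        """Flush: the memtable becomes a new immutable sstable."""
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        """Return (value, source), where source is 'cache' or 'merged'."""
        if key in self.cache:
            return self.cache[key][1], "cache"
        # Merge: gather every version of the key and keep the newest timestamp.
        versions = [t[key] for t in [self.memtable, *self.sstables] if key in t]
        if not versions:
            return None, "merged"
        newest = max(versions, key=lambda v: v[0])
        self.cache[key] = newest  # populate the cache on the way out
        return newest[1], "merged"


replica = ToyReplica()
replica.write("user:1", "v1", timestamp=1)
replica.flush()
replica.write("user:1", "v2", timestamp=2)
replica.flush()
first = replica.read("user:1")   # merges two sstables
second = replica.read("user:1")  # served from cache
```

The first read has to consult both sstables, while the second is answered from the cache, which is the hot versus cold distinction discussed next.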
Hot versus Cold Reads
+ Cache items have unlimited fetch ceiling
+ Be mindful of your read:
+ Do I often retrieve a single key:value or scan a wide partition?
+ Is the data frequently accessed?
+ Will the read cause eviction of important items I need?
Read from cache Gone to disk
Cache Thrashing
Constant populations and evictions without fully taking advantage of caching
■ Commonly seen in heavy full-scans / Analytics
■ BYPASS CACHE
Image Credits: Yuri Kushch – Caching Strategies
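To make the thrashing effect concrete, here is a minimal sketch of a plain LRU cache hit by a large scan. It is a simulation under stated assumptions: `LRUCache` and the `bypass_cache` flag are hypothetical stand-ins for the cache and for the BYPASS CACHE clause named above, and the sizes are arbitrary.

```python
from collections import OrderedDict


class LRUCache:
    """Plain LRU cache used only for this simulation (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, bypass_cache=False):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return
        self.misses += 1
        if bypass_cache:
            return  # read goes to disk and leaves the cache untouched
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item


def hot_hits_after_scan(bypass_cache):
    """Warm 100 hot keys, run a 10,000-key scan, then count hits on the hot keys."""
    cache = LRUCache(capacity=100)
    hot_keys = [f"hot-{i}" for i in range(100)]
    for key in hot_keys:
        cache.get(key)
    for i in range(10_000):
        cache.get(f"scan-{i}", bypass_cache=bypass_cache)
    cache.hits = 0
    for key in hot_keys:
        cache.get(key)
    return cache.hits


thrashed = hot_hits_after_scan(bypass_cache=False)
protected = hot_hits_after_scan(bypass_cache=True)
```

In this model the scan without the bypass evicts every hot key, so all 100 follow-up reads miss. With the bypass, the hot set stays resident and all 100 reads hit.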
Inside ScyllaDB’s Internal Cache
Paging (internal and external)
R1
R2
R3
Client
Coordinator
Node
Data
Data
Quorum
Data + Digest
Digest
Tombstones
6 seconds!
+ Deletes are actually writes of a "tombstone marker"
+ Too many deletes slow down the read path
+ When you read:
+ Scans need to iterate through your deletes
+ Many deletes result in higher latencies
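The cost of scanning through deletes can be sketched with a toy partition in which deleted rows are kept as tombstone markers. This is an illustration only: `TOMBSTONE`, `scan_live_rows`, and the row counts are hypothetical, and real storage engines handle tombstones in more elaborate ways.

```python
TOMBSTONE = object()  # marker written in place of a deleted row's value


def scan_live_rows(partition, limit):
    """Return (live_rows, entries_visited) for a scan that wants `limit` live rows."""
    live_rows = []
    entries_visited = 0
    for key, value in partition:
        entries_visited += 1
        if value is TOMBSTONE:
            continue  # deleted row: still has to be read and skipped
        live_rows.append((key, value))
        if len(live_rows) == limit:
            break
    return live_rows, entries_visited


# Queue-like pattern: 10,000 deleted rows sit in front of 10 live ones.
with_deletes = [(i, TOMBSTONE) for i in range(10_000)]
with_deletes += [(i, "payload") for i in range(10_000, 10_010)]
without_deletes = [(i, "payload") for i in range(10)]

_, visited_with_deletes = scan_live_rows(with_deletes, limit=10)
_, visited_without_deletes = scan_live_rows(without_deletes, limit=10)
```

Both scans return the same 10 rows, but the first one visits 10,010 entries and the second only 10. That extra work is what shows up as higher read latency.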
ScyllaDB Under Load
Live Optimizing (or Worsening) Read Performance
Avoiding common mishaps
Best Practices
ScyllaDB Cache
■ Cache is LRU on rows
● Use BYPASS CACHE for analytical workloads
■ Efficient access & maintenance
● Thanks to replica collocation and design
CPU 0
CPU 1
CPU 2
CPU 3
SSTable index caching
■ The whole index can now be
cached in memory
■ Populated on access (read-through)
■ Evicted on memory pressure
■ Partition index summary still
non-evictable and always resident
RAM
Disk
Workload Prioritization
Different workloads require different priorities
■ Meet SLAs
■ Flexible Configuration
■ Adaptability to Changing Conditions
Heat-Weighted Load Balancing
+ Replica goes down and comes back up
+ Caches are cold.
+ Never sending requests to the node means caches never warm up.
+ Mathematically optimize the desired hit ratio so that caches warm up
while still keeping latencies down.
Restarted node. Cache misses are
initially high but deterministically go
down
Prepare your Queries
Ad-hoc, rare queries are the only excuse not to prepare statements.
R2
R3
Client
R1
Quorum
Both coordinator and replica, one less round-trip!
Keep parallelism HIGH
■ Low parallelism hurts ScyllaDB
● Fewer units will be working, database will not be efficient
■ Is there such a thing as too high?
Nope!
■ No need to guess:
● C = T x L
● Example: 200,000 requests/s at 1ms average latency:
■ C = 200,000 * 0.001
■ C = 200.
■ Driver settings:
● Number of connections x maximum requests per connection
● Remaining requests will be queued on the application side.
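The sizing rule above (C = T x L) is straightforward to turn into a small helper. The sketch below reproduces the slide's example and then derives a connection count. The function names and the figure of 64 requests per connection are assumptions for illustration, not driver defaults.

```python
import math


def required_concurrency(throughput_per_s, latency_s):
    """C = T x L: in-flight requests needed to sustain a target throughput."""
    # Round before taking the ceiling to absorb floating-point noise.
    return math.ceil(round(throughput_per_s * latency_s, 9))


def connections_needed(concurrency, max_requests_per_connection):
    """Smallest connection count whose combined capacity covers the concurrency."""
    return math.ceil(concurrency / max_requests_per_connection)


# Slide example: 200,000 requests/s at 1 ms average latency.
concurrency = required_concurrency(200_000, 0.001)

# Hypothetical driver setting: 64 in-flight requests allowed per connection.
connections = connections_needed(concurrency, max_requests_per_connection=64)
```

With these numbers the concurrency is 200 and four connections are enough, since 4 x 64 = 256 in-flight requests. Anything beyond that ceiling waits in the application's queue.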
ScyllaDB and Memcached
p99.999 < 1ms :-)
Success Stories
How ScyllaDB is being used among your peers!
The different GCP disk types each meet these requirements in
different ways. It would be all too convenient if we could combine
both disk types into one super-disk. Since our primary focus for disk
performance was low-latency reads, we would love to read from
GCP's Local SSDs (low latency) while still writing to Persistent Disks
(snapshotting, redundancy via replication). But is there a way to
create such a super-disk at the software level?
How Discord Supercharges Network Disks for Extreme Low Latency
This workload is quite performance sensitive, so getting quick
responses from our database is key. This approach saves us plenty of
headaches, and it performs really well. We’ve had this system
deployed within Epic for over a year, and are working with licensees
to get this deployed for them as well. It’s been serving us well to
allow people to work much more efficiently. [Even as] assets
continue to grow even larger, people can still work from home
Epic Games & Unreal Engine: Where ScyllaDB Comes Into Play
Just to illustrate that idea, let’s say we have a customer in London.
We will place a copy of our services (“a cell”) into that region. And all
of that customer’s interactions will be contained in that region,
ensuring that they always have low latency. We’ll place multiple
replicas of their data in that region. And will also place additional
replicas of their data in other regions. This becomes important later.
Worldwide Local Latency With ScyllaDB: The ZeroFlucs Strategy
Poll
How much data do you have under
management of your transactional
database?
Keep Learning
scylladb.com/category/engineering Register now at p99conf.io
Visit our blog for
more on ScyllaDB
engineering
Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/
