7 Reasons Not to Put an External Cache in Front of Your Database.pptx

Tomasz Grabiec (Tomek) Distinguished Engineer at ScyllaDB
Tzach Livyatan, VP Product at ScyllaDB
Moderated by: Erik Costlow, InfoQ Editor
7 Reasons Not to Put an External Cache in Front of Your Database

Introductions
Tomasz Grabiec, Distinguished Engineer at ScyllaDB
+ Years of experience with Linux and distributed systems
+ With ScyllaDB since the beginning of the project
+ An open source enthusiast
Tzach Livyatan, VP Product at ScyllaDB
+ Long career in development, system engineering and product management.
+ NoSQL enthusiast

Poll
Are you using cache in front of your DB?

The Database for Gamechangers
+ For data-intensive applications that require high throughput and predictable low latencies
+ Close-to-the-metal design takes full advantage of modern infrastructure
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ Compatible with Apache Cassandra and Amazon DynamoDB
+ DBaaS/Cloud, Enterprise and Open Source solutions
“ScyllaDB stands apart...It’s the rare product that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor

+400 Gamechangers Leverage ScyllaDB
Seamless experiences across content + devices
Digital experiences at massive scale
Corporate fleet management
Real-time analytics
2,000,000 SKU e-commerce management
Video recommendation management
Threat intelligence service using JanusGraph
Real time fraud detection across 6M transactions/day
Uber scale, mission critical chat & messaging app
Network security threat detection
Power ~50M X1 DVRs with billions of reqs/day
Precision healthcare via Edison AI
Inventory hub for retail operations
Property listings and updates
Unified ML feature store across the business
Cryptocurrency exchange app
Geography-based recommendations
Global operations: Avon, Body Shop + more
Predictable performance for on sale surges
GPS-based exercise tracking
Serving dynamic live streams at scale
Powering India's top social media platform
Personalized advertising to players
Distribution of game assets in Unreal Engine

Agenda
+ ScyllaDB Intro
+ Why latency is critical
+ Why placing a cache in front of a DB might be a bad idea
+ ScyllaDB caching design
+ Summary
+ Q&A

Why Latency Is Critical
And why ScyllaDB focuses on latency

Lower Consistent Latency -> Higher Revenue
When the insideline.com site was redesigned to reduce load times from nine seconds to 1.4 seconds, ad revenue increased three percent and page views per session went up 17 percent.
https://www.thinkwithgoogle.com/future-of-marketing/digital-transformation/the-google-gospel-of-speed-urs-hoelzle/
https://www.globaldots.com/resources/blog/latency-is-having-a-huge-negative-impact-on-ecommerce-companies
https://www.fastcompany.com/1825005/how-one-second-could-cost-amazon-16-billion-sales

Tail Latency - why you should care
[Diagram: User (Refresh) -> API Calls -> App Business Logic -> DB Calls -> DB]
The slowest 1% of DB responses dominate UX latency
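
The arithmetic behind this slide can be sketched as follows. It assumes that one user action fans out into n independent DB calls (the fan-out values are illustrative, not figures from the talk); the chance that at least one call lands in the slowest 1% is then 1 - 0.99^n.

```python
def prob_hits_tail(n_calls: int, tail_fraction: float = 0.01) -> float:
    """Probability that at least one of n independent DB calls is a tail response."""
    if n_calls < 0:
        raise ValueError("n_calls must be non-negative")
    if not 0.0 <= tail_fraction <= 1.0:
        raise ValueError("tail_fraction must be between 0 and 1")
    # Complement of "every call avoids the tail".
    return 1.0 - (1.0 - tail_fraction) ** n_calls


if __name__ == "__main__":
    # Hypothetical fan-out values, chosen only for illustration.
    for n in (1, 10, 50, 100):
        print(f"{n:>3} DB calls -> {prob_hits_tail(n):.1%} of user requests see a tail response")
```

With a fan-out of 100 calls, well over half of all user requests are expected to include at least one slowest-1% response, which is why P99 rather than average latency shapes the user experience.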

Deep dive into Low Latency Engineering
https://www.p99conf.io/

Why placing a cache in front of a DB might be a bad idea
And when it might be useful

Why do users put a cache in front of a DB?
Better latency for hot data

[Diagram: request paths with latency labels of >50ms and <1ms]

Type of caches
+ Embedded in App / External Process
+ External
+ External / Transparent (DAX)
+ Embedded in DB
[Diagram: request paths labeled >50ms; <1ms for the same AWS region; <1ms for ScyllaDB]

Problems with external cache
+ Additional latency
+ Additional cost
+ Decreases availability
+ Increases application complexity
+ Ruins the DB caching
+ Ignores DB knowledge
+ Reduces security

Additional latency
DB
<5 ms
<1ms
<1ms
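
A rough way to reason about the extra hop, as a sketch under stated assumptions: in a cache-aside read path every request pays the cache round trip, and a miss pays the DB round trip on top. The 1 ms and 5 ms values below reuse the slide's "<1ms" and "<5 ms" labels as upper bounds; the hit ratios are hypothetical.

```python
def expected_read_latency_ms(hit_ratio: float, cache_ms: float, db_ms: float) -> float:
    """Mean read latency when the cache is checked first and misses fall through to the DB."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Every read pays the cache hop; only misses also pay the DB hop.
    return cache_ms + (1.0 - hit_ratio) * db_ms


if __name__ == "__main__":
    for hit_ratio in (0.0, 0.5, 0.9, 1.0):  # hypothetical hit ratios
        latency = expected_read_latency_ms(hit_ratio, cache_ms=1.0, db_ms=5.0)
        print(f"hit ratio {hit_ratio:.0%}: ~{latency:.2f} ms per read")
```

In this model a miss is always slower than going to the database directly, so the cache only pays off when the hit ratio is high and the database itself is slow.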

Additional cost
DB
<5 ms
<1ms
<1ms

Decreases availability
External
HWLB
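
The availability cost can be sketched with the standard series-availability product. This assumes every request depends on each listed component (no fallback path when the cache is down) and that failures are independent; the 99.9% figures are hypothetical.

```python
from math import prod


def serial_availability(*components: float) -> float:
    """Availability of a request path that needs every listed component to be up."""
    for a in components:
        if not 0.0 <= a <= 1.0:
            raise ValueError("each availability must be between 0 and 1")
    return prod(components)


if __name__ == "__main__":
    db_only = serial_availability(0.999)           # hypothetical DB availability
    with_cache = serial_availability(0.999, 0.999)  # hypothetical cache tier added in series
    print(f"DB only:    {db_only:.4%}")
    print(f"Cache + DB: {with_cache:.4%}")
```

Any component added in series can only lower the product, so the combined path is never more available than the database alone.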

Application complexity
GET
Value
SELECT
Value
Update
Res
Is Nil?
ACK/NAK
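
The flow on this slide (GET, "Is Nil?", SELECT, Update) corresponds to the cache-aside pattern, which the application has to implement itself. A minimal sketch follows; the class name, the dict standing in for the cache, and the `load_from_db` callback are all hypothetical stand-ins, not a real client API.

```python
from typing import Callable, Dict, Optional


class CacheAsideReader:
    """Application-side read path: check the cache, fall back to the DB, then fill the cache."""

    def __init__(self, cache: Dict[str, str], load_from_db: Callable[[str], Optional[str]]) -> None:
        self.cache = cache                # stand-in for an external cache client
        self.load_from_db = load_from_db  # stand-in for a SELECT against the database

    def get(self, key: str) -> Optional[str]:
        value = self.cache.get(key)       # GET
        if value is not None:             # Is Nil?
            return value
        value = self.load_from_db(key)    # SELECT
        if value is not None:
            self.cache[key] = value       # Update the cache for later reads
        return value


if __name__ == "__main__":
    db = {"user:1": "alice"}
    reader = CacheAsideReader(cache={}, load_from_db=db.get)
    print(reader.get("user:1"))  # miss, loaded from the DB
    db["user:1"] = "bob"         # a write that bypasses the cache
    print(reader.get("user:1"))  # hit, served from the cache
```

The second read returns the cached value even though the database has changed: invalidation, error handling and consistency all become the application's responsibility.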

Ignoring DB knowledge - Data Modeling
The database holds a lot of context about the data and the workload that the cache is missing:
+ ScyllaDB is wide-column (Key-Key-Value), while a cache might be Key-Value only. Say goodbye to partition locality and efficient partition-level queries.
+ Structured data: Tables, User Defined Types…
+ Cache setting per table
+ Time To Live (TTL)
+ Materialized Views and Secondary Indexes
+ Much more…

Cache setting per table
CREATE TABLE caching (
    k int PRIMARY KEY,
    v1 int,
    v2 int,
) WITH caching = {'enabled': 'true'};

Time To Live (TTL) - Table Level Default
CREATE TABLE heartrate_ttl (
pet_chip_id uuid,
owner uuid,
time timestamp,
heart_rate int,
PRIMARY KEY (pet_chip_id, time))
WITH default_time_to_live = 604800;
▪ Powerful feature to remove data that is no longer needed
▪ ScyllaDB stores the TTL for each column value

Ignoring DB knowledge - Workload
The database holds a lot of context about the data and the workload that the cache is missing:
+ Workload prioritization
+ Timer per workload
+ Scan-resistant caching
+ Role-based access control
+ Lightweight Transactions
Security Risk!

BYPASS CACHE
SELECT * FROM users BYPASS CACHE;
SELECT name, occupation FROM users WHERE userid IN (199, 200, 207) BYPASS CACHE;
SELECT * FROM users WHERE birth_year = 1981 AND country = 'FR' ALLOW FILTERING BYPASS CACHE;

Workload prioritization
■ OLTP
● Small work items
● Latency sensitive
● Involves a narrow portion of the data
■ OLAP
● Large work items
● Throughput oriented
● Performed on large amounts of data

Data flow
[Diagram sequence, RAM above disk:
1. Write -> memtable (RAM)
2. Write -> memtable (RAM) and commitlog (disk)
3. The memtable is written to an sstable on disk while writes go to a new memtable
4. Read <- memtable (RAM) and multiple sstables (disk)]

Data flow
+ Read consistency is easy
+ Pin sstables and memtable
+ Thanks to collocation
+ ...but slow
[Diagram: Read <- memtable (RAM) and multiple sstables (disk)]

Data flow with cache
[Diagram: Read <- cache and memtable (RAM), backed by multiple sstables (disk)]

Buffer cache?
[Diagram: 4K sstable buffers on disk cached in RAM]

Why not buffer cache?
Inefficient use of memory:
+ Need to cache whole buffers to cache a single row
+ Access locality not likely if data set >> RAM
[Diagram: SSTable page (4K) vs. Row (300B)]
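
The memory cost can be worked out from the two sizes on the slide. The sketch assumes the worst case the bullets describe: hot rows are scattered, so each cached 4K page holds only one row the workload wants. The 1 GiB budget is hypothetical, and per-entry metadata overhead is ignored.

```python
PAGE_BYTES = 4096  # "SSTable page (4K)" from the slide
ROW_BYTES = 300    # "Row (300B)" from the slide


def useful_fraction(row_bytes: int, page_bytes: int) -> float:
    """Share of a cached page occupied by the one row that was actually wanted."""
    return row_bytes / page_bytes


def hot_rows_cached(ram_bytes: int, unit_bytes: int) -> int:
    """Number of hot rows that fit when each one costs unit_bytes of cache memory."""
    return ram_bytes // unit_bytes


if __name__ == "__main__":
    ram = 1 << 30  # hypothetical 1 GiB cache budget
    print(f"useful bytes per cached page: {useful_fraction(ROW_BYTES, PAGE_BYTES):.1%}")
    print(f"hot rows with page-granularity caching: {hot_rows_cached(ram, PAGE_BYTES):,}")
    print(f"hot rows with row-granularity caching:  {hot_rows_cached(ram, ROW_BYTES):,}")
```

Under these assumptions roughly 93% of the buffer cache holds data nobody asked for, and a row-granularity cache keeps over 13 times as many hot rows in the same memory.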

Poor negative caching:
+ Need to cache a whole data buffer to indicate absent data
[Diagram: SSTable page (4K) with an unknown ("?") entry]

Inefficient use of memory:
+ Redundant buffers due to LSM
+ A read may touch multiple SSTables
+ The negative caching problem is more pronounced
[Diagram: Read touching three sstables]

High CPU overhead for reads:
+ Reads need to merge data from multiple sstables
[Diagram: Read merging three sstables]
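
What the merge involves can be sketched with a toy model. Here each sstable is a list of (key, write_timestamp, value) cells sorted by key, and the newest timestamp wins; this cell layout is a simplification for illustration, not ScyllaDB's on-disk format.

```python
import heapq
from itertools import groupby
from operator import itemgetter
from typing import Iterator, List, Tuple

Cell = Tuple[int, int, str]  # (key, write_timestamp, value)


def merged_read(sstables: List[List[Cell]]) -> Iterator[Tuple[int, str]]:
    """Merge key-sorted sstables and keep the newest version of each key."""
    # k-way merge by key; every input must already be sorted by key.
    merged = heapq.merge(*sstables, key=itemgetter(0))
    for key, versions in groupby(merged, key=itemgetter(0)):
        newest = max(versions, key=itemgetter(1))  # last write wins
        yield key, newest[2]


if __name__ == "__main__":
    older = [(1, 10, "a"), (3, 10, "c")]
    newer = [(1, 20, "a2"), (2, 5, "b")]
    print(list(merged_read([older, newer])))
```

A buffer cache only removes the disk I/O from this path; the merge itself still runs on every read, which is the CPU overhead the slide refers to.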

High CPU overhead for reads:
+ SSTable format optimized for compact storage, not read speed
+ Parsing overhead:
+ Need to parse index buffers sequentially
+ Need to parse the data file

Premature cache eviction due to SSTable compaction:
+ SSTable compaction removes old files => buffer invalidation
+ Hurts read performance by incurring misses
[Diagram: sstables being compacted]

Cache structure
+ Object cache
+ Like memtable
+ Optimized for low CPU overhead
+ Fast reads
+ Row-granularity caching
+ Reflects data in all relevant SSTables for a given object (e.g. row)

Memory management
+ ScyllaDB reserves and manages most of the memory on a node
+ Small reserve for the OS
+ No use of Linux page cache (only direct I/O)
+ Cache uses all available free memory
+ Shrunk under pressure from memtable and other allocations
[Diagram: node memory split among memtable, cache, and other]

CPU sharding
CPU 0
CPU 1
CPU 2
CPU 3

Thread-per-core architecture
+ All processing in a single thread per CPU
+ Short tasks executed serially
+ Cooperative preemption
[Diagram: a queue of tasks]
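
The idea can be sketched with Python generators standing in for tasks. Each shard owns its data and a run queue; tasks run one at a time and give up control only at explicit yield points. The `Shard`, `increment` and `shard_for` names and the modulo routing are hypothetical, and this sketch drives the shards one after another in a single thread, whereas the slides describe one thread per CPU.

```python
from collections import deque
from typing import Deque, Dict, Generator, List

Task = Generator[None, None, None]


class Shard:
    """Private data plus a run queue, driven by exactly one thread."""

    def __init__(self) -> None:
        self.data: Dict[int, int] = {}
        self.run_queue: Deque[Task] = deque()

    def spawn(self, task: Task) -> None:
        self.run_queue.append(task)

    def run(self) -> None:
        """Run queued tasks round-robin until all of them have finished."""
        while self.run_queue:
            task = self.run_queue.popleft()
            try:
                next(task)                # runs until the task yields voluntarily
            except StopIteration:
                continue                  # task finished, do not requeue it
            self.run_queue.append(task)


def increment(shard: Shard, key: int, times: int) -> Task:
    for _ in range(times):
        # No lock needed: only this shard's thread ever touches shard.data.
        shard.data[key] = shard.data.get(key, 0) + 1
        yield                             # cooperative preemption point


def shard_for(key: int, shards: List[Shard]) -> Shard:
    return shards[key % len(shards)]      # hypothetical key-to-shard routing


if __name__ == "__main__":
    shards = [Shard() for _ in range(4)]
    for key in (1, 5, 2):
        owner = shard_for(key, shards)
        owner.spawn(increment(owner, key, 3))
        owner.spawn(increment(owner, key, 2))
    for shard in shards:
        shard.run()
    print([shard.data for shard in shards])
```

Because a task can only be interrupted where it yields, the read-modify-write in `increment` never interleaves with another task on the same shard.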

Cache coherency
+ Complex operations on data without dealing with concurrency
+ No locking or complex lock-free algorithms
+ Data structures and algorithms are simple
[Diagram: tasks operating on memtable and cache; Read]

Complex DQL/DML
SELECT * FROM table WHERE pk = 0 and ck >= 2;
DELETE FROM table WHERE pk = 0 and ck >= 2;

Range queries
SELECT * FROM table WHERE ... and ck >= 2;
[Diagram: cached rows 2 and 5. First build: "?" (is anything missing between them?). Second build: the range is marked with range continuity.]
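
One way to model range continuity is sketched below, as a simplified read-only toy rather than ScyllaDB's implementation. Each cached entry carries a flag saying that nothing is missing between it and the previous cached entry; a range query is served from the cache only when every gap it crosses is flagged. The class and method names are hypothetical, and writes and deletions are out of scope.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple


class PartitionCache:
    """Toy read-only cache for one partition, keyed by clustering key (ck)."""

    def __init__(self) -> None:
        self.rows: Dict[int, Optional[str]] = {}  # ck -> value; None marks a dummy bound
        self.continuous: Dict[int, bool] = {}     # True: no uncached row between the previous cached ck and this one
        self.tail_continuous = False              # True: no uncached row after the largest cached ck

    def populate_point(self, ck: int, value: str) -> None:
        """Cache one row from a point read; this says nothing about its neighbours."""
        self.rows[ck] = value
        self.continuous.setdefault(ck, False)

    def populate_range_from(self, start: int, disk_rows: List[Tuple[int, str]]) -> None:
        """Cache the complete disk result of `ck >= start` and record continuity."""
        if start not in self.rows:
            self.rows[start] = None               # dummy entry pins the lower bound
            self.continuous[start] = False
        for ck, value in disk_rows:
            self.rows[ck] = value
        for ck in self.rows:
            if ck > start:                        # every gap above start is now known to be complete
                self.continuous[ck] = True
        self.tail_continuous = True

    def read_from(self, start: int) -> Optional[List[Tuple[int, str]]]:
        """Answer `ck >= start` from the cache, or return None if disk must be read."""
        if not self.tail_continuous:
            return None
        keys = sorted(self.rows)
        tail = keys[bisect_left(keys, start):]
        for ck in tail:
            # An entry exactly at the bound needs no check of the gap before it.
            if ck != start and not self.continuous[ck]:
                return None
        return [(ck, self.rows[ck]) for ck in tail if self.rows[ck] is not None]


if __name__ == "__main__":
    cache = PartitionCache()
    cache.populate_point(2, "b")
    cache.populate_point(5, "e")
    print(cache.read_from(2))  # rows 2 and 5 are cached, but the gap between them is unknown
    cache.populate_range_from(2, [(2, "b"), (3, "c"), (5, "e")])
    print(cache.read_from(2))  # now answerable from the cache alone
```

The first lookup cannot be answered even though rows 2 and 5 are cached, which is the "?" case on the slide; after one range read populates the gap, later range queries are served without touching disk.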

Range deletions
DELETE FROM table WHERE ... and ck >= 2;
[Diagram: cached row 2; range continuity + tombstone]

ScyllaDB cache highlights
+ ScyllaDB has a fast cache
+ Efficient access & maintenance
+ Thanks to collocation with replica and design
+ Takes care of consistency guarantees
+ Handles complexities of data and query model

962 C* nodes to 78
60% TCO
95% latency
“By moving to ScyllaDB Enterprise software running on AWS EC2 infrastructure and on-premises, Comcast improved P99 latency by more than 95% and were able to rip out a UI cache layer.”

From Redis + Elasticsearch to ScyllaDB
<1ms P99
Zero downtime
TCO

From Redis to ScyllaDB for Data Stores, Fraud Detection, Ad Targeting
TCO
Speed of Redis
Scalability

From Redis to Cassandra to ScyllaDB Cloud
<1ms avg Latency
4-8 ms P99
Fault Tolerance

Moving to better price-performance
Throughput
P99 Read Performance
Infrastructure Savings
Challenges with Cassandra
+ Needed better throughput for reads
+ Too much time tuning for garbage collection
+ “Node sprawl” caused high infrastructure costs
ScyllaDB Solution
+ More reliable performance for consistently better customer experience
+ Less administration
+ Frictionless transition to ScyllaDB

Summary
+ Putting a cache in front of your DB might be counterproductive
+ A cache lacks the context the DB has for each data item
+ ScyllaDB’s internal cache is optimized to work in the ScyllaDB context with minimal overhead
+ Multiple customers have switched from a Cache+DB setup to ScyllaDB, reducing latency and increasing throughput with less HW

Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/
