How Incremental Compaction Reduces Your Storage Footprint

Benny Halevy, Core Storage Group Manager

Presenter
Benny Halevy, Core Storage Group Manager
■ Leads the storage software development team at ScyllaDB.
■ Working on operating systems and distributed file systems for
over 20 years.
■ Before Scylla, led software development for GSI Technology,
providing a hardware/software solution for deep learning and
similarity search using in-memory computing technology.
■ Previously co-founded Tonian (later acquired by Primary Data)
and led it as CTO, developing a distributed file server based on
the pNFS protocol delivering highly scalable performance and
dynamic, out-of-band data placement control.
■ Before Tonian, was lead architect at Panasas for the pNFS protocol.

Introduction
Log-structured Storage and Compaction Fundamentals

Log-structured Writes
■ Changes to the data are:
● First, recorded in memory (in the MemTable), then
● Flushed into SSTables.
■ Updates accumulate over time
● in different SSTables
● Keeping several versions of the same cell contributes to “space amplification”
[Figure: updates are recorded in the MemTable, which is flushed into SSTables]
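
The write path above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ScyllaDB's implementation: the class name `LogStructuredTable`, the `flush_threshold` parameter, and the logical clock are all hypothetical.

```python
class LogStructuredTable:
    """Toy write path: buffer updates in memory, flush to immutable sorted tables."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical limit, in number of keys
        self.memtable = {}                      # key -> (timestamp, value)
        self.sstables = []                      # oldest first; each is a sorted tuple
        self.clock = 0                          # logical write timestamp

    def write(self, key, value):
        self.clock += 1
        self.memtable[key] = (self.clock, value)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.memtable:
            # Sort by key and freeze: an SSTable is never modified after this point.
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}


table = LogStructuredTable(flush_threshold=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    table.write(key, value)

# Key "a" was written twice, and both versions now sit in different SSTables.
versions_of_a = sum(1 for sst in table.sstables for k, _ in sst if k == "a")
print(len(table.sstables), versions_of_a)
```

Nothing is updated in place: the second write to "a" lands in a newer SSTable while the older version remains on storage until compaction removes it.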

SSTables
■ Immutable
■ Contain changes to data
● A.k.a mutations
■ Sorted (“Sorted Strings Table”)
■ Have metadata, like:
● Index, Statistics, Filter
🛈 There is no static view of the database

Reading Data
■ Requires reading all relevant SSTables
● Applying the live mutations
● A Bloom filter is used to skip SSTables that cannot contain the key
■ Consolidating mutations from many
SSTables is expensive
● We call that “read amplification”
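
A minimal sketch of the read path, assuming a simplified model: each SSTable is a dict, a plain Python set stands in for the Bloom filter (a real Bloom filter is probabilistic and may report false positives, never false negatives), and a value of `None` marks a tombstone. The names `read` and `make_sstable` are hypothetical.

```python
def read(key, sstables):
    """Return (newest value or None, number of SSTables consulted).

    Each SSTable is a dict with:
      "filter": a set of keys, standing in for a real Bloom filter
      "rows":   key -> (timestamp, value); value None marks a tombstone
    """
    newest = None
    consulted = 0
    for sst in sstables:
        if key not in sst["filter"]:
            continue  # the filter rules this SSTable out, so it is never read
        consulted += 1
        candidate = sst["rows"].get(key)
        if candidate is not None and (newest is None or candidate[0] > newest[0]):
            newest = candidate  # keep the mutation with the latest timestamp
    value = newest[1] if newest else None
    return value, consulted


def make_sstable(rows):
    return {"filter": set(rows), "rows": rows}


sstables = [
    make_sstable({"a": (1, "old"), "b": (2, "x")}),
    make_sstable({"a": (3, "new")}),
    make_sstable({"c": (4, "y"), "b": (5, None)}),  # b is deleted by a tombstone
]

print(read("a", sstables))  # two SSTables had to be consulted for one key
```

The `consulted` count is the read amplification for that key: the more SSTables hold a version of it, the more work each read does.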

Why is Compaction Needed?
■ SSTables are immutable
● We can’t just keep writing updates
● Obsolete data needs to be deleted
● Reduce space amplification
■ Data may be scattered around
● We want to consolidate it
● Reduce read amplification

1. Compaction first selects a set of SSTables to process,
● based on the Compaction Strategy.
2. It then reads the SSTables, and
● writes the compacted output
● while eliminating overwritten, deleted, and expired data.
3. Eventually, when the output SSTables are sealed and safely persisted to storage,
● the input SSTables can finally be deleted.
🛈 Note that compaction requires temporary space, since the input SSTables must not be deleted until their compaction completes.
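
The three steps can be sketched as follows. This is an illustrative model only: SSTables are lists of `(key, timestamp, value)` tuples sorted by key, the selection made by the compaction strategy is assumed to be given, and the names `merge_sstables`, `compact`, and `storage` are hypothetical. Only overwrites are eliminated here.

```python
import heapq


def merge_sstables(inputs):
    """Merge sorted SSTables, keeping only the newest version of each key."""
    output = []
    for key, ts, value in heapq.merge(*inputs):
        if output and output[-1][0] == key:
            output[-1] = (key, ts, value)  # later timestamp sorts last, so it wins
        else:
            output.append((key, ts, value))
    return output


def compact(storage, selected):
    """Compact the SSTables named in 'selected'; return peak space used, in cells."""
    inputs = [storage[name] for name in selected]      # step 1: chosen by a strategy
    output = merge_sstables(inputs)                    # step 2: read and merge
    storage["output"] = output                         # step 3: seal the output...
    peak = sum(len(sst) for sst in storage.values())   # inputs and output coexist here
    for name in selected:
        del storage[name]                              # ...then delete the inputs
    return peak


storage = {
    "sst1": [("a", 1, "v1"), ("b", 2, "v1")],
    "sst2": [("a", 3, "v2"), ("c", 4, "v1")],
}
peak_cells = compact(storage, ["sst1", "sst2"])
print(storage["output"], peak_cells)
```

The peak is measured while both the inputs and the sealed output exist, which is the temporary space the note above refers to.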

■ Which mutations can be eliminated?
● Overwritten
● Expired (by TTL)
● Deleted (by tombstone / column deletion)
● Droppable tombstones
[Figure: example compaction of SSTables holding the cells a, a’, b, c and the tombstones !c, !d, !z]
● [a] is overwritten by [a’]
● [b] is newly written
● [c] is deleted by [!c]
● [!d] is a live tombstone
● [!z] is a droppable tombstone
🛈 Note that tombstones are kept around for gc_grace_seconds
until they are garbage-collected, to prevent data resurrection.
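
The elimination rules can be sketched as below. This is a deliberately simplified model, not the actual purge logic of any database: the `Cell` class and `surviving_cells` function are hypothetical, timestamps are plain seconds, and the compaction is assumed to cover every SSTable that could hold data shadowed by a tombstone. Production systems apply additional rules around expired data and partial compactions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    key: str
    write_time: int               # seconds; a simplified timestamp
    value: Optional[str] = None
    ttl: Optional[int] = None     # seconds to live, if any
    is_tombstone: bool = False


def surviving_cells(cells, now, gc_grace_seconds):
    """Return the cells that a compaction over all of 'cells' would keep."""
    newest = {}
    for cell in cells:
        current = newest.get(cell.key)
        if current is None or cell.write_time > current.write_time:
            newest[cell.key] = cell   # older versions are overwritten or deleted
    kept = []
    for cell in newest.values():
        if cell.is_tombstone:
            if now - cell.write_time > gc_grace_seconds:
                continue              # droppable tombstone
        elif cell.ttl is not None and now >= cell.write_time + cell.ttl:
            continue                  # expired by TTL
        kept.append(cell)
    return sorted(kept, key=lambda c: c.key)


cells = [
    Cell("a", 10, "a"), Cell("a", 20, "a'"), Cell("b", 30, "b"),
    Cell("c", 40, "c"), Cell("c", 950, is_tombstone=True),
    Cell("d", 960, is_tombstone=True),
    Cell("z", 100, is_tombstone=True),   # older than gc_grace_seconds
]
kept = surviving_cells(cells, now=1000, gc_grace_seconds=100)
labels = ["!" + c.key if c.is_tombstone else c.value for c in kept]
print(labels)
```

The tombstones for c and d survive because they are still within the grace period, while the one for z is old enough to be dropped.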

Legacy Compaction Strategies - STCS
There is a choice of compaction strategies, for different workloads.
ICS is based on the following two common strategies:
■ Size-Tiered Compaction Strategy (STCS)
● STCS organizes SSTables into tiers,
● based on their size,
● on an exponential scale
■ When compacting several SSTables
● A single SSTable is created
● It may be as large as the union of all of them
■ The output then usually moves to the next tier
● Or it may become much smaller due to deletes and expirations,
● potentially dropping to a lower tier.
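
A rough sketch of size tiering on an exponential scale. The parameters `base_size`, `growth`, and `min_threshold`, and both function names, are assumptions made for illustration; real STCS implementations group SSTables of similar size using their own configurable rules.

```python
def tier_of(size, base_size=1, growth=4):
    """Tier index on an exponential scale: each tier holds sizes up to growth x the last."""
    tier = 0
    limit = base_size
    while size > limit:
        limit *= growth
        tier += 1
    return tier


def pick_tier_to_compact(sizes, min_threshold=4):
    """Return (tier, sizes) for the lowest tier with enough SSTables, else None."""
    tiers = {}
    for size in sizes:
        tiers.setdefault(tier_of(size), []).append(size)
    for tier in sorted(tiers):
        if len(tiers[tier]) >= min_threshold:
            return tier, tiers[tier]
    return None


sizes = [1, 1, 1, 1, 4, 16]
tier, inputs = pick_tier_to_compact(sizes)

# With no overlap between the inputs, the output is as large as their union
# and lands in the next tier.
print(tier, tier_of(sum(inputs)))
# If most of the data was deleted or expired, the output can stay in a lower tier.
print(tier_of(1))
```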

STCS Space Amplification
■ STCS requires space of at least twice the data size
■ This is called space amplification
■ The main factors are:
● Temporary space: during compaction.
● Accumulation of updates and deletes
across different tiers

Legacy Compaction Strategies - LCS
Leveled Compaction Strategy (LCS)
■ Compaction is triggered when level “i” has more than 10^i SSTables
■ LCS picks one SSTable from level “i”, with size X, to compact
■ It then finds the roughly 10 SSTables in the next level
● overlapping with this SSTable
● and compacts all of them together
■ It writes the resulting run
● to the next level
● Run size is bounded by (1+10)*X
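
To see where "roughly 10" and the (1+10)*X bound come from, here is a small sketch under an idealized layout: SSTables are modeled as inclusive key ranges, both levels cover the same key space, and all SSTables have the same size. The function name `overlapping` and the concrete numbers (including X = 160) are assumptions for illustration.

```python
def overlapping(sstable, next_level):
    """Return SSTables in next_level whose key range intersects sstable's range.

    SSTables are modeled as inclusive (first_key, last_key) ranges.
    """
    first, last = sstable
    return [(lo, hi) for lo, hi in next_level if lo <= last and first <= hi]


FANOUT = 10  # the next level holds about 10x more SSTables

# Hypothetical layout: both levels cover keys 0..999 with equally sized SSTables,
# so each next-level SSTable covers a 10x narrower key range.
level_i = [(k, k + 99) for k in range(0, 1000, 100)]
level_next = [(k, k + 9) for k in range(0, 1000, 10)]

candidate = level_i[1]                       # covers keys 100..199
picked = overlapping(candidate, level_next)  # the SSTables rewritten along with it

X = 160                                      # SSTable size in MB, an assumed figure
run_size_bound = (1 + FANOUT) * X
print(len(picked), run_size_bound)
```

Promoting X worth of data means rewriting up to about 11 times X, which is the source of the higher write amplification in LCS.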

Legacy Compaction Strategies - LCS
■ While LCS limits space amplification, it results in higher write amplification.

Incremental Compaction Strategy

ICS In a Nutshell
■ We observed problems with legacy compaction strategies:
● STCS has high space amplification (and low write amplification)
● LCS has high write amplification (and low space amplification)
■ We wanted to benefit from both approaches
■ By borrowing SSTable Runs from LCS
■ And applying them over size-tiers
🛈 In effect, ICS merely replaces
● increasingly larger SSTables with
● increasingly longer SSTable Runs

SSTable Runs
■ Expansion of the SSTable concept
■ Comprised of a sorted set of SSTables
■ The SSTables are non-overlapping
● Those are called “Fragments”
[Figure: a run drawn as fragments a, b, ..., z next to a single large SSTable covering a through z]
🛈 A run is equivalent to
● a large SSTable
● split into several smaller SSTables

How Does ICS Work?
■ Remember that:
● Fragments are disjoint
● and sorted with respect to each other
■ So we scan the runs, fragment-by-fragment
■ and compact them incrementally
● While deleting exhausted SSTables as we go
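[Figure: two runs with fragments A, B, ..., Z and a, b, ..., z are compacted fragment by fragment into A+a, B+b, ..., with exhausted input fragments deleted along the way]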
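
The space benefit of compacting fragment by fragment can be sketched as below. This is a simplified model and not ScyllaDB's implementation: space is counted in cells, cells are `(key, timestamp)` pairs, and fragment i of both runs is assumed to cover the same key range. The function name `compact_runs_incrementally` is hypothetical.

```python
def compact_runs_incrementally(run_a, run_b):
    """Compact two runs fragment by fragment; return (output run, peak space in cells)."""
    live_inputs = sum(len(f) for f in run_a + run_b)
    output_run, written, peak = [], 0, live_inputs
    for frag_a, frag_b in zip(run_a, run_b):
        newest = {}
        for key, ts in frag_a + frag_b:
            if key not in newest or ts > newest[key]:
                newest[key] = ts                  # keep only the latest version
        out = sorted(newest.items())
        output_run.append(out)
        written += len(out)
        peak = max(peak, live_inputs + written)   # output sealed, inputs still present
        live_inputs -= len(frag_a) + len(frag_b)  # exhausted fragments deleted right away
    return output_run, peak


# Two runs of 10 fragments with 10 cells each; run_b overwrites every key of run_a.
run_a = [[(k, 1) for k in range(i * 10, i * 10 + 10)] for i in range(10)]
run_b = [[(k, 2) for k in range(i * 10, i * 10 + 10)] for i in range(10)]

output_run, incremental_peak = compact_runs_incrementally(run_a, run_b)

# Without incremental deletion, all inputs stay until the whole output is written.
input_cells = sum(len(f) for f in run_a + run_b)
all_at_once_peak = input_cells + sum(len(f) for f in output_run)
print(incremental_peak, all_at_once_peak)
```

In this model the temporary overhead is bounded by roughly one output fragment rather than by the size of the whole output.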

Case Study
Phases:
1. Write 500GB
2. Overwrite repeatedly
3. Compact
■ Clearly shows ICS’s improved space amplification
■ Most notably, the 2X major peak of STCS is gone!

Thank you
Any questions?
Stay in touch
Benny Halevy
bhalevy@scylladb.com
