Voron is a low-level key-value store that is transactional and uses MVCC. It uses a multi-layered structure with a root tree containing references to named trees, which can themselves contain multiple additional trees. This tree-of-trees structure allows for optimizations like storing single items directly in the parent tree. Voron focuses on very fast reads and writes through its use of memory mapping, journaling only committed changes, and handling I/O asynchronously. It learns from other stores like LevelDB but is optimized for performance through its novel tree architecture and low-level implementation.
5. Seeks are slow
0.01 ms – Compress 1 KB with Zippy
0.25 ms – Read 1 MB from memory
0.50 ms – Ping inside data center
10.0 ms – Disk seek
10.0 ms – Read 1 MB from network
30.0 ms – Read 1 MB from disk
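To make the gap concrete, the figures above can be plugged into a quick calculation. The sketch below compares fetching 1 MB as scattered pages, each costing one seek, against one seek followed by a sequential read. The 4 KB page size, the helper names, and the choice to ignore transfer time on the scattered reads are assumptions for illustration, not figures from the slides.

```python
# Latency figures taken from the list above, in milliseconds.
DISK_SEEK_MS = 10.0
READ_1MB_FROM_DISK_MS = 30.0

PAGE_SIZE_KB = 4  # assumed page size, for illustration only


def random_read_ms(total_kb: int) -> float:
    """Cost of reading total_kb as scattered pages: one seek per page.

    Transfer time per page is ignored, so this is a lower bound.
    """
    pages = total_kb // PAGE_SIZE_KB
    return pages * DISK_SEEK_MS


def sequential_read_ms(total_kb: int) -> float:
    """Cost of one seek followed by a sequential read of total_kb."""
    return DISK_SEEK_MS + READ_1MB_FROM_DISK_MS * (total_kb / 1024)


if __name__ == "__main__":
    print(random_read_ms(1024))      # 256 seeks
    print(sequential_read_ms(1024))  # 1 seek + 1 MB sequential read
```

Under these assumptions the scattered read is dominated entirely by seeks, which is the point of the slide: lay data out so that a commit needs few seeks.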
9. Modifying the tree
Find appropriate page to modify.
Get a scratch page, copy page to scratch page.
Register scratch page with the old page # in page translation table (PTT).
Modify the page as you wish.
On commit, the PTT becomes publicly visible.
All changed pages are written to journal file.
On rollback, revert to the previous PTT and release the scratch pages. Done.
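The steps on this slide can be sketched as a toy copy-on-write pager. This is a minimal illustration in Python, not Voron's implementation: `Pager`, `WriteTx`, the dict-based scratch area and the list-based journal are all hypothetical stand-ins.

```python
class Pager:
    """Toy model: data file pages plus scratch copies tracked by a PTT."""

    def __init__(self, data_pages):
        self.data = dict(data_pages)  # page number -> bytes in the data file
        self.scratch = {}             # scratch slot -> modified page copy
        self.ptt = {}                 # committed PTT: page number -> scratch slot
        self.journal = []             # (page number, bytes) of committed pages
        self.next_slot = 0

    def read(self, page_no, ptt=None):
        """Resolve a page through a PTT, falling back to the data file."""
        ptt = self.ptt if ptt is None else ptt
        if page_no in ptt:
            return self.scratch[ptt[page_no]]
        return self.data[page_no]

    def begin_write(self):
        return WriteTx(self)


class WriteTx:
    def __init__(self, pager):
        self.pager = pager
        self.ptt = dict(pager.ptt)  # private PTT; readers keep the old one
        self.dirty = {}             # page number -> scratch slot owned by this tx

    def modify(self, page_no, new_bytes):
        # Copy the current version, then overwrite its prefix with new_bytes.
        page = bytearray(self.pager.read(page_no, self.ptt))
        page[:len(new_bytes)] = new_bytes
        slot = self.dirty.get(page_no)
        if slot is None:
            # First touch in this tx: take a scratch slot and register it.
            slot = self.pager.next_slot
            self.pager.next_slot += 1
            self.dirty[page_no] = slot
            self.ptt[page_no] = slot
        self.pager.scratch[slot] = bytes(page)

    def commit(self):
        # All changed pages go to the journal, then the new PTT is published.
        for page_no, slot in sorted(self.dirty.items()):
            self.pager.journal.append((page_no, self.pager.scratch[slot]))
        self.pager.ptt = self.ptt

    def rollback(self):
        # The published PTT was never touched; just release scratch pages.
        for slot in self.dirty.values():
            del self.pager.scratch[slot]
```

Because the writer works on a private copy of the PTT, readers never observe a half-finished change: they see either the old table or the new one.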
11. Background
Find pages in scratch that have no one looking at older versions of them.
Copy to data file.
Clear the scratch space.
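A rough sketch of that background step, in Python. The data layout is an assumption made for illustration: scratch holds, per page, a list of `(tx_id, bytes)` versions ordered oldest first, and `oldest_reader_tx` is the snapshot id of the oldest reader still active (or `None` when there are no readers). The function name `flush_scratch` is hypothetical.

```python
def flush_scratch(data_file, scratch, oldest_reader_tx):
    """Copy scratch page versions that no reader still needs into the data file.

    A version written by transaction T may overwrite the data file only when
    every active reader's snapshot is at or after T; otherwise some reader
    still depends on the older copy in the data file.
    Returns the number of scratch versions released.
    """
    released = 0
    for page_no in list(scratch):
        versions = scratch[page_no]
        flushable = [
            v for v in versions
            if oldest_reader_tx is None or v[0] <= oldest_reader_tx
        ]
        if not flushable:
            continue
        # Only the newest flushable version matters for the data file.
        data_file[page_no] = flushable[-1][1]
        released += len(flushable)
        remaining = versions[len(flushable):]
        if remaining:
            scratch[page_no] = remaining
        else:
            del scratch[page_no]  # clear the scratch space for this page
    return released
```

Versions newer than the oldest reader stay in scratch until that reader finishes, which is what lets readers proceed without blocking the writer.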
12. How it works
The only I/O during a commit is a single compressed, write-through write of data to the journal.
Moving data to the data file is done asynchronously.
No need to call fsync().
Full & incremental backups.
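The "single compressed write per commit" idea can be illustrated with a small Python sketch. The record layout (length and page count header, then a zlib-compressed body of page number plus page bytes), the 4096-byte page size, and the function names are all assumptions for this example; an in-memory buffer stands in for the journal file, so real write-through behaviour is not modelled.

```python
import io
import struct
import zlib

PAGE_SIZE = 4096  # assumed page size, for illustration only


def write_commit(journal, pages):
    """Append one compressed record holding every page changed by a commit."""
    body = b"".join(
        struct.pack("<Q", page_no) + data
        for page_no, data in sorted(pages.items())
    )
    compressed = zlib.compress(body)
    record = struct.pack("<II", len(compressed), len(pages)) + compressed
    journal.write(record)  # the only write issued for this commit
    return len(record)


def read_commit(journal):
    """Read back one record, e.g. when replaying the journal on recovery."""
    size, count = struct.unpack("<II", journal.read(8))
    body = zlib.decompress(journal.read(size))
    stride = 8 + PAGE_SIZE
    pages = {}
    for i in range(count):
        chunk = body[i * stride:(i + 1) * stride]
        (page_no,) = struct.unpack("<Q", chunk[:8])
        pages[page_no] = chunk[8:]
    return pages


if __name__ == "__main__":
    journal = io.BytesIO()
    changed = {3: b"\x01" * PAGE_SIZE, 7: b"\x02" * PAGE_SIZE}
    written = write_commit(journal, changed)
    journal.seek(0)
    print(written, read_commit(journal) == changed)
```

Batching every changed page into one sequential record keeps the commit path free of seeks, and compression shrinks the amount of data that has to reach the disk before the commit can return.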
13. Missing the forest
Voron isn’t a B+ Tree system.
It doesn’t have a tree, it has trees. Plural.
<blink>Important</blink>
14. Falling trees
Single root tree
Contains many additional trees.
Tree is similar to a table.
Operations on tree:
Add(key, value)
Del(key, value)
Find(key) : value
Iterate() (Seek, Next, Prev)
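As a rough picture of what these operations look like from the caller's side, here is a minimal in-memory stand-in written in Python. It is not Voron's API: the class names, method names, and the sorted-list storage are assumptions, and delete takes only a key for simplicity.

```python
import bisect


class Tree:
    """Ordered key-value map backed by a sorted key list (illustrative only)."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order
        self._values = {}  # key -> value

    def add(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def delete(self, key):
        if key in self._values:
            del self._values[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def find(self, key):
        return self._values.get(key)

    def iterate(self):
        return TreeIterator(self)


class TreeIterator:
    def __init__(self, tree):
        self._tree = tree
        self._pos = -1

    def seek(self, key):
        """Position on the first key >= key; return True if one exists."""
        self._pos = bisect.bisect_left(self._tree._keys, key)
        return self._pos < len(self._tree._keys)

    def next(self):
        if self._pos + 1 >= len(self._tree._keys):
            return False
        self._pos += 1
        return True

    def prev(self):
        if self._pos <= 0:
            return False
        self._pos -= 1
        return True

    @property
    def current(self):
        key = self._tree._keys[self._pos]
        return key, self._tree._values[key]
```

Seeking to a key that is absent lands on the next larger key, which is the usual behaviour for range scans over an ordered tree.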
18. So, Voron has trees…
Root tree
Free Space tree
Contains references to named trees
Enough?
Tree of trees
MultiAdd, MultiDelete, MultiRead
19. Why multi trees?
Optimization – if a key has just 1 item (and no value), it can be stored directly in the parent tree.
Store multiple items for a single value.
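The single-item optimization can be sketched as follows, in Python. This is an illustrative model only: the class `MultiValueTree`, its method names, and the use of a sorted list as the "nested tree" are assumptions. An entry is kept inline in the parent while it holds one item and is promoted to a nested structure when a second item arrives.

```python
import bisect


class MultiValueTree:
    """Parent tree whose entries hold one inline item or a nested sorted list."""

    def __init__(self):
        self._entries = {}  # key -> inline item, or sorted list of items

    def multi_add(self, key, value):
        entry = self._entries.get(key)
        if entry is None:
            self._entries[key] = value  # one item: store inline in the parent
        elif isinstance(entry, list):
            i = bisect.bisect_left(entry, value)
            if i == len(entry) or entry[i] != value:
                entry.insert(i, value)
        elif entry != value:
            # Second distinct item: promote to a nested "tree".
            self._entries[key] = sorted([entry, value])

    def multi_delete(self, key, value):
        entry = self._entries.get(key)
        if isinstance(entry, list):
            if value in entry:
                entry.remove(value)
            if len(entry) == 1:
                self._entries[key] = entry[0]  # demote back to inline
        elif entry is not None and entry == value:
            del self._entries[key]

    def multi_read(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return []
        return list(entry) if isinstance(entry, list) else [entry]

    def is_inline(self, key):
        return key in self._entries and not isinstance(self._entries[key], list)
```

The common case of one item per key never pays for a nested tree, while keys with many items still get ordered storage.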
21. What does Voron do?
Opens up a lot of interesting scenarios.
We have far better control over persistence now.
Very low level (bits & bytes).
Very fast!
Concurrency benefits:
Reads
Writes*
* Yet Voron allows only a single writer!
22. What doesn't it do?
It isn’t about Linux. It can’t run on Linux*.
Need to implement:
PosixPureMemoryPager
PosixPageFileBackedMemoryMappedPager
PosixMemoryMapPager
Waiting for big Linux push post 3.0 release.
23. The cloud story…
Scratch / temp usage
Utilize fast local drives that can go away.
Slow I/O only holds us up at tx commit (and we optimized that).
24. Summary
Voron learned from LevelDB, LMDB, Esent.
Journal for Atomicity, Consistency & Durability.
MVCC for Consistency & Isolation.
Root tree, named trees, multi trees.