Seeks are slow
0.01 ms – Compress 1 KB with Zippy
0.25 ms – Read 1 MB from memory
0.50 ms – Ping inside data center
10.0 ms – Disk seek
10.0 ms – Read 1 MB from network
30.0 ms – Read 1 MB from disk
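A rough worked example of why the seek figure matters, using only the numbers listed above. The 4 KB page size, the 256-page workload, and the assumption that a sequential read pays for one initial seek are illustrative choices, not figures from the talk.

```python
# Latency figures copied from the list above, in milliseconds.
DISK_SEEK_MS = 10.0
READ_1MB_FROM_DISK_MS = 30.0


def scattered_read_ms(pages: int) -> float:
    """Cost when every page needs its own seek (transfer time ignored)."""
    return pages * DISK_SEEK_MS


def contiguous_read_ms(megabytes: float) -> float:
    """Cost of one seek followed by a sequential read (an assumption)."""
    return DISK_SEEK_MS + megabytes * READ_1MB_FROM_DISK_MS


# Hypothetical workload: 256 pages of 4 KB each, 1 MB in total.
PAGES = 256
scattered = scattered_read_ms(PAGES)
contiguous = contiguous_read_ms(1)

print(f"scattered:  {scattered:.0f} ms")              # 2560 ms
print(f"contiguous: {contiguous:.0f} ms")             # 40 ms
print(f"ratio:      {scattered / contiguous:.0f}x")   # 64x
```

Under these assumptions the same megabyte costs about 64 times more when it is scattered across the disk, which is the motivation for keeping writes sequential.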
Modifying the tree
Find appropriate page to modify.
Get a scratch page, copy page to scratch page.
Register scratch page with the old page number in the page translation table (PTT).
Modify the page as you wish.
On commit, the PTT becomes publicly visible.
All changed pages are written to journal file.
On rollback, revert to the previous PTT and release the scratch pages.
Find pages in scratch whose older versions no one is looking at anymore.
Copy them to the data file.
Clear the scratch space.
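The steps above can be sketched as a toy in-memory model. This is a minimal sketch, not Voron's code: the names `Store`, `WriteTx`, `page_for_write` and `flush` are hypothetical, pages are plain byte strings, and the flush is simplified to run only when no reader is open at all, rather than checking each page for older versions still in use.

```python
class Store:
    """Toy model of a data file, scratch space and page translation table."""

    def __init__(self, pages):
        self.data_file = {n: bytes(p) for n, p in pages.items()}
        self.scratch = {}        # scratch slot -> bytearray
        self.ptt = {}            # published PTT: page number -> scratch slot
        self.journal = []        # one entry per commit
        self.next_slot = 0
        self.open_readers = 0    # simplification: a plain counter

    def read(self, page_no, ptt=None):
        """Resolve a page through a PTT, falling back to the data file."""
        ptt = self.ptt if ptt is None else ptt
        if page_no in ptt:
            return bytes(self.scratch[ptt[page_no]])
        return self.data_file[page_no]

    def flush(self):
        """Copy committed scratch pages to the data file, then clear scratch."""
        if self.open_readers:
            return False         # someone may still need an older version
        for page_no, slot in self.ptt.items():
            self.data_file[page_no] = bytes(self.scratch[slot])
        self.ptt = {}
        self.scratch.clear()
        return True


class WriteTx:
    def __init__(self, store):
        self.store = store
        self.ptt = dict(store.ptt)   # private PTT, invisible until commit
        self.changed = {}            # page number -> scratch slot

    def page_for_write(self, page_no):
        """Return a mutable scratch copy of the page."""
        if page_no not in self.changed:
            slot = self.store.next_slot
            self.store.next_slot += 1
            # Copy the current page to a scratch page...
            current = self.store.read(page_no, self.ptt)
            self.store.scratch[slot] = bytearray(current)
            # ...and register it under the old page number.
            self.ptt[page_no] = slot
            self.changed[page_no] = slot
        return self.store.scratch[self.changed[page_no]]

    def commit(self):
        entry = {n: bytes(self.store.scratch[s])
                 for n, s in self.changed.items()}
        self.store.journal.append(entry)   # changed pages go to the journal
        self.store.ptt = self.ptt          # the new PTT becomes visible

    def rollback(self):
        for slot in self.changed.values():
            del self.store.scratch[slot]   # release scratch; old PTT stays


store = Store({0: b"root", 1: b"leaf"})
tx = WriteTx(store)
tx.page_for_write(1)[:] = b"LEAF"          # modify the scratch copy
before_commit = store.read(1)              # still the old page
tx.commit()
after_commit = store.read(1)               # now the new page
store.flush()                              # data file catches up
```

Readers that resolve pages through the published PTT never see a half-modified page, because the writer only touches scratch copies until commit swaps the table.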
How it works
The only I/O during a commit is a single compressed, write-through write of data to the journal.
Moving data to the data file is done asynchronously.
No need to call fsync().
Full & incremental backups.
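A minimal sketch of the "single compressed write" idea, under several assumptions: the entry layout, the 4 KB `PAGE_SIZE` and the function names are invented for illustration, `zlib` stands in for whatever compressor the engine uses, and `O_DSYNC` is used as the POSIX way to request write-through (it is skipped on platforms where Python does not expose it).

```python
import os
import struct
import zlib

PAGE_SIZE = 4096  # assumed page size


def open_journal(path):
    """Open the journal for appending, asking the OS for write-through."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    flags |= getattr(os, "O_DSYNC", 0)   # not available everywhere
    return os.open(path, flags, 0o600)


def write_journal_entry(fd, pages):
    """Write all changed pages of one commit with a single write call.

    pages maps page number -> PAGE_SIZE bytes.
    """
    body = b"".join(struct.pack("<Q", n) + data
                    for n, data in sorted(pages.items()))
    compressed = zlib.compress(body)
    header = struct.pack("<II", len(pages), len(compressed))
    buffer = header + compressed
    written = os.write(fd, buffer)       # the only I/O of the commit
    if written != len(buffer):
        raise OSError("short write to journal")


def read_journal(path):
    """Read back every entry, e.g. for recovery after a crash."""
    entries = []
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            count, size = struct.unpack("<II", header)
            body = zlib.decompress(f.read(size))
            entry = {}
            for i in range(count):
                offset = i * (8 + PAGE_SIZE)
                (page_no,) = struct.unpack_from("<Q", body, offset)
                entry[page_no] = body[offset + 8:offset + 8 + PAGE_SIZE]
            entries.append(entry)
    return entries
```

Because the file is opened with a write-through flag, the write call itself does not return until the data is durable, which is why no separate `fsync()` call appears in the commit path.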
Missing the forest
Voron isn’t a B+ Tree system.
It doesn’t have a tree, it has trees. Plural.
Single root tree
Contains many additional trees.
Tree is similar to a table.
Operations on tree:
Find(key) : value
Iterate() (Seek, Next, Prev)
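To make the shape of these operations concrete, here is a small sketch that models a tree as a sorted key list. It is not a B+ tree and not Voron's API: the `Tree` and `TreeIterator` classes, the `add` method and the `current` property are assumptions added so the example can run; only find and iterate with seek, next and prev come from the list above.

```python
import bisect


class Tree:
    """Sorted key/value collection standing in for one tree."""

    def __init__(self):
        self._keys = []      # kept sorted
        self._values = {}

    def add(self, key, value):
        # Hypothetical helper, needed only to populate the example.
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def find(self, key):
        """Return the value for key, or None if it is absent."""
        return self._values.get(key)

    def iterate(self):
        return TreeIterator(self)


class TreeIterator:
    def __init__(self, tree):
        self._tree = tree
        self._pos = -1

    def seek(self, key):
        """Move to the first key >= key; False if there is none."""
        self._pos = bisect.bisect_left(self._tree._keys, key)
        return self._pos < len(self._tree._keys)

    def next(self):
        if self._pos + 1 >= len(self._tree._keys):
            return False
        self._pos += 1
        return True

    def prev(self):
        if self._pos <= 0:
            return False
        self._pos -= 1
        return True

    @property
    def current(self):
        key = self._tree._keys[self._pos]
        return key, self._tree._values[key]


# The root tree holds other trees, much like a catalog of tables.
root = Tree()
users = Tree()
root.add(b"users", users)
for name in (b"carol", b"alice", b"bob"):
    users.add(name, name.upper())

it = root.find(b"users").iterate()
if it.seek(b"b"):
    print(it.current)   # (b'bob', b'BOB')
```

Looking up a named tree in the root tree and then iterating inside it is the two-level access pattern that the "tree is similar to a table" analogy suggests.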
What does Voron do?
Opens up a lot of interesting scenarios.
We have far better control over persistence now.
Very low level (bits & bytes).
* Yet Voron allows only a single writer!
What doesn't it do?
It isn’t about Linux. It can’t run on Linux*.
Need to implement:
Waiting for big Linux push post 3.0 release.
The cloud story…
Scratch / temp usage
Utilize fast local drives that can go away.
Slow I/O only holds us up for tx commit (and we optimized
Voron learned from LevelDB, LMDB, Esent.
Journal for Atomicity, Consistency & Durability.
MVCC for Consistency & Isolation.
Root tree, named trees, multi trees.