Indexing delight --thinking cap of fractal-tree indexes

  1. Indexing Delight: Thinking Cap of Fractal-tree Indexes. BohuTANG, 2012/12, overred.shuttler@gmail.com
  2. B-tree: invented in 1972, 40 years ago!
  3. B-tree. [Diagram: a tree of blocks, Block0 to Block5. In the file on disk the blocks are scattered: ... Block0 ... ... Block3 ... Block5 ...]
  4. B-tree Insert. Insert x. [Diagram: the tree and file layout of slide 3, with one seek marked.]
  5. B-tree Insert. Insert x. [Diagram: the same tree, now with two seeks marked.]
  6. B-tree Insert. Insert x. [Diagram: the same tree with two seeks marked.] Inserting one item causes many random seeks!
  7. B-tree Search. Search x. [Diagram: the same tree with two seeks marked.] Queries are fast: the I/O cost is O(log_B N).
  8. B-tree Conclusions
     ● Search: O(log_B N) block transfers.
     ● Insert: O(log_B N) block transfers (slow).
     ● B-tree range queries are slow.
     ● IMPORTANT: parent and child blocks are scattered across the disk.
  9. A Simplified Fractal-tree: the Cache Oblivious Lookahead Array (COLA), invented by MITers
  10. COLA. [Diagram: log_2 N levels of sorted arrays.] Binary search in one level: O(log_2 N), so searching every level costs O((log_2 N)^2).
  11. COLA (Using Fractional Cascading). [Diagram: log_2 N levels of sorted arrays.]
     ● Search: O(log_2 N) block transfers.
     ● Insert: O((1/B) log_2 N) amortized block transfers.
     ● Data is stored in log_2 N arrays of sizes 2, 4, 8, 16, ...
     ● Balanced Binary Search Tree
  12. COLA Conclusions
     ● Search: O(log_2 N) block transfers (using Fractional Cascading).
     ● Insert: O((1/B) log_2 N) amortized block transfers.
     ● Data is stored in log_2 N arrays of sizes 2, 4, 8, 16, ...
     ● Balanced Binary Search Tree
     ● Lookahead (prefetch), data-intensive!
     ● BUT the bottom level grows bigger and bigger, so merging becomes expensive.
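
The doubling arrays on the COLA slides can be sketched as below. This is a minimal in-memory illustration, not the authors' implementation: the class name `SimpleCOLA` is hypothetical, levels start at size 1 rather than 2, and fractional cascading is left out, so a lookup binary-searches every level.

```python
import bisect
from heapq import merge


class SimpleCOLA:
    """Hypothetical sketch: level k is either empty or holds exactly 2**k sorted keys."""

    def __init__(self):
        self.levels = []  # levels[k] is a sorted list of length 0 or 2**k

    def insert(self, key):
        carry = [key]
        for k in range(len(self.levels)):
            if not self.levels[k]:
                # Empty level: the carried run fits here exactly.
                self.levels[k] = carry
                return
            # Full level: merge it into the carry and continue one level down.
            carry = list(merge(self.levels[k], carry))
            self.levels[k] = []
        self.levels.append(carry)

    def contains(self, key):
        # No fractional cascading: binary search in each level in turn.
        for level in self.levels:
            i = bisect.bisect_left(level, key)
            if i < len(level) and level[i] == key:
                return True
        return False


cola = SimpleCOLA()
for key in [5, 3, 8, 1, 9]:
    cola.insert(key)
```

Inserts behave like incrementing a binary counter: most touch only the small top levels, and an occasional insert cascades into a large sequential merge, which is where the amortized insert bound comes from.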
  13. COLA vs B-tree
     ● Search: (log_2 N)/(log_B N) = log_2 B times slower than a B-tree (in theory).
     ● Insert: (log_B N)/((1/B) log_2 N) = B/(log_2 B) times faster than a B-tree (in theory).
     If B = 4KB: COLA search is 12 times slower than a B-tree, and COLA insert is 341 times faster than a B-tree.
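
The two ratios on the COLA vs B-tree slide are easy to check numerically. The sketch below assumes, as the slide appears to, that B = 4KB is treated as B = 4096 units per block; the function name `cola_vs_btree` is made up for illustration.

```python
import math


def cola_vs_btree(block_size):
    """Return (search slowdown, insert speedup) of a COLA relative to a B-tree."""
    log2_b = math.log2(block_size)
    search_slowdown = log2_b              # (log_2 N) / (log_B N) = log_2 B
    insert_speedup = block_size / log2_b  # (log_B N) / ((1/B) log_2 N) = B / log_2 B
    return search_slowdown, insert_speedup


slowdown, speedup = cola_vs_btree(4096)
print(f"search: {slowdown:.0f}x slower, insert: {speedup:.0f}x faster")
```

Both ratios are independent of N, which is why the slide can state them for a given block size alone.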
  14. LSM-tree
  15. LSM-tree. [Diagram: an in-memory buffer on top of levels of on-disk buffers.]
     ● Lazy insertion; data is sorted before it is written.
     ● Level i is the buffer of Level i+1.
     ● Search: O(log_B N) * O(log N)
     ● Insert: O((log_B N)/B)
  16. LSM-tree (Using Fractional Cascading). [Diagram: an in-memory buffer on top of levels of on-disk buffers.]
     ● Search: O(log_B N) (using FC)
     ● Insert: O((log_B N)/B)
     ● 0.618 Fractal-tree? But NOT cache-oblivious...
  17. LSM-tree (Merging). [Diagram: the same levels of buffers, with merges between adjacent levels.] A lot of I/O is wasted during merging! Like a headless fly flying... Zzz...
  18. Fractal-tree Indexes: just fractal. Patented by Tokutek...
  19. Fractal-tree Indexes. Search: O(log_B N). Insert: O((log_B N)/B) (amortized). Search costs the same as in a B-tree, but inserts are faster than in a B-tree.
  20. Fractal-tree Indexes (Block size). [Diagram: a tree of blocks.] B is 4MB...
  21. Fractal-tree Indexes (Block size). [Diagram: the same tree, with one block marked full.] B is 4MB...
  22. Fractal-tree Indexes (Block size). [Diagram: the same tree, with one block marked full.] B is 4MB...
  23. Fractal-tree Indexes (Block size). [Diagram: the full block expanded into a smaller tree of its own.] Fractal! A 4MB block costs one seek...
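
The "full" blocks on these slides suggest the usual buffered-tree idea: inserts collect in a node's buffer and move down in batches once the buffer fills. The sketch below illustrates that idea only, under loud assumptions: the class `Node`, its fields, and the tiny `buffer_cap` are all hypothetical, the tree shape is fixed (no splits), and a full buffer is flushed to every child at once. It is not Tokutek's implementation.

```python
import bisect


class Node:
    """Hypothetical buffered node; internal nodes route keys by sorted pivots."""

    def __init__(self, pivots=None, children=None, buffer_cap=4):
        self.pivots = pivots or []      # keys separating the children
        self.children = children or []  # empty list means this is a leaf
        self.buffer = {}                # pending key -> value messages
        self.buffer_cap = buffer_cap
        self.flushes = 0                # counts batched moves to the children

    def _child_for(self, key):
        return self.children[bisect.bisect_right(self.pivots, key)]

    def insert(self, key, value):
        self.buffer[key] = value
        # Leaves simply keep their data; internal nodes flush when full.
        if self.children and len(self.buffer) >= self.buffer_cap:
            self.flush()

    def flush(self):
        self.flushes += 1
        for key, value in sorted(self.buffer.items()):
            self._child_for(key).insert(key, value)
        self.buffer = {}

    def get(self, key):
        # Newer messages sit higher in the tree, so check the buffer first.
        if key in self.buffer:
            return self.buffer[key]
        if not self.children:
            return None
        return self._child_for(key).get(key)


root = Node(pivots=[100], children=[Node(), Node()], buffer_cap=4)
for key in [10, 150, 20, 160]:
    root.insert(key, str(key))
```

Four inserts here cost a single flush instead of four separate walks to the leaves; with a 4MB block the same batching is what spreads one seek over many inserts.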
  24. B^ε-tree: just a constant factor on the block fanout...
  25. B^ε-tree. [Plot: search speed (slow to fast) against insert speed (slow to fast); B-tree, ε=1/2, and AOF are marked along the optimal curve.]
  26. B^ε-tree
                       insert               search
      B-tree (ε=1)     O(log_B N)           O(log_B N)
      ε=1/2            O((log_B N)/√B)      O(log_B N)
      ε=0              O((log N)/B)         O(log N)
     If we want optimal point queries plus very fast inserts, we should choose ε=1/2.
  27. B^ε-tree. So, if the block size is B, the fanout should be √B.
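
The rows of the table can be derived from one assumption: a node of size B uses fanout B^ε and spends the rest of the block on buffering, so a flush moves about B / B^ε items per block transfer. The sketch below works the numbers under that assumption; `b_epsilon_costs` is a hypothetical helper, constants are ignored, and ε must be greater than 0.

```python
import math


def b_epsilon_costs(n, b, eps):
    """Return (fanout, search cost, amortized insert cost) in block transfers."""
    fanout = b ** eps
    height = math.log(n) / math.log(fanout)  # log_B N / eps
    search = height                          # one block per level
    insert = height / (b / fanout)           # each level costs fanout / B per item
    return fanout, search, insert


n, b = 2 ** 36, 4096
btree = b_epsilon_costs(n, b, 1.0)  # classic B-tree
half = b_epsilon_costs(n, b, 0.5)   # fanout sqrt(B)
print(btree)
print(half)
```

With these inputs the ε=1/2 tree is twice as tall as the B-tree (a constant factor on search), while its amortized insert cost is 32 times lower, which matches the slide's choice of √B as the fanout.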
  28. Cache Oblivious Data Structure: all of the above are JUST cache-oblivious data structures...
  29. Cache Oblivious Data Structure. Question: reading a sequence of k consecutive blocks at once is not much more expensive than reading a single block. How can we take advantage of this?
  30. Cache Oblivious Data Structure. My questions:
     Q1: With only 1MB of memory, how do you merge two sorted 64MB files into one sorted file?
     Q2: On most mechanical disks, reading several consecutive blocks costs about the same as reading a single block. How can Q1 take advantage of this?
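
One possible answer to Q1 and Q2 is a streaming two-way merge with large I/O buffers: only the current line of each input is held in memory, and each file is read and written in big sequential chunks. The sketch below assumes the files hold one integer per line; the function name and the 256KB buffer size (three buffers, under 1MB in total) are illustrative choices, and the bound covers buffer memory only, not the Python interpreter.

```python
import heapq


def merge_sorted_files(path_a, path_b, out_path, buffer_bytes=256 * 1024):
    """Merge two files of sorted integer lines while holding only small buffers."""
    with open(path_a, buffering=buffer_bytes) as a, \
         open(path_b, buffering=buffer_bytes) as b, \
         open(out_path, "w", buffering=buffer_bytes) as out:
        # heapq.merge is lazy: it pulls one line at a time from each input.
        for line in heapq.merge(a, b, key=int):
            # Make sure a final line without a newline does not glue to the next.
            out.write(line if line.endswith("\n") else line + "\n")
```

Larger buffers mean fewer, longer sequential reads per file, which is the advantage Q2 points at: the disk seeks once per buffer refill instead of once per block.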
  31. nessDB: you should agree that the VFS caches better than your own cache! https://github.com/shuttler/nessDB
  32. nessDB. [Diagram: a tree of blocks.] Each block is a small splittable tree.
  33. nessDB, what's going on? [Diagram: a tree of blocks.] From the line to the plane...
  34. Thanks! Most of the references are from Tokutek, MIT CSAIL, and Stony Brook. Drafted by BohuTANG using Google Drive, 2012/12/12