Your SlideShare is downloading. ×
Fractal tree-technology-and-the-art-of-indexing
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Fractal tree-technology-and-the-art-of-indexing

336
views

Published on

Published in: Technology, Business

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
336
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
2
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Fractal Tree® Technology Overview The Art of Indexing Martín Farach-Colton Co-founder & Chief Technology Officer
  • 2. The Art of Indexing ® Not all indexing is the same B-tree is the basis for almost all DB systems • Data structure invented in 1972 • Has not kept up with hardware trends ‣ Works poorly on modern rotational disks ‣ Works poorly on SSD Fractal Tree Indexes is the basis of TokuDB • Scales with hardware • Fast Indexing ➔ More Indexing ➔ Faster Queries • Great Compression • No Fragmentation • Reduced wear on SSDs
  • 3. How do Fractal Tree Indexes outperform B-trees?
  • 4. How do Fractal Tree Indexes outperform B-trees? First, some facts about storage systems
  • 5. The Art of Indexing ® Storage is quirky Hard disks are slow for random I/O but fast for sequential I/O Difference causes problems like fragmentation, ...
  • 6. The Art of Indexing ® Storage is quirky Hard disks are slow for random I/O but fast for sequential I/O SSDs are fast for random I/O but expensive for sequential. Garbage collection causes artefacts: increased wear, write cliffs... Difference causes problems like fragmentation, ...
  • 7. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os
  • 8. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation
  • 9. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear
  • 10. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear But you only get the goodies if each I/O has lots of new bytes
  • 11. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear But you only get the goodies if each I/O has lots of new bytes If you read a big block, change one byte, then write it, you get terrible wear problems
  • 12. The Art of Indexing ® Storage ♡ Big Reads and Writes Big I/Os Hard Disks: No Fragmentation SSDs: Less Garbage Collection & Wear Both: Better Compression But you only get the goodies if each I/O has lots of new bytes If you read a big block, change one byte, then write it, you get terrible wear problems
  • 13. The Art of Indexing ® Caching is great • But you can’t cache all your data • For stuff not in memory, you have to go to disk Goal: Do the best we can for the stuff on disk Data is big, RAM is Small Data RAM
  • 14. Now, What’s a B-tree? & a Fractal Tree Index
  • 15. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Search Tree
  • 16. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree
  • 17. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree Fetching pivots as a group saves I/Os
  • 18. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree Fetching pivots as a group saves I/Os Most leaves are on disk, not in RAM
  • 19. The Art of Indexing ® What’s a B-tree? B B B B... ... B B……………………………………………….…………. Each node has B pivots Search Tree Fetching pivots as a group saves I/Os Most leaves are on disk, not in RAM Inserting into a leaf that’s not in memory will require an I/O per insertion
  • 20. The Art of Indexing ® B-tree Delivery Service If fast memory is like walking across a room • Each update in a B-tree is a walking trip from ‣ New York
  • 21. The Art of Indexing ® B-tree Delivery Service If fast memory is like walking across a room • Each update in a B-tree is a walking trip from ‣ New York to St Louis
  • 22. The Art of Indexing ® B-tree Delivery Service If fast memory is like walking across a room • Each update in a B-tree is a walking trip from ‣ New York to St Louis Each item gets its own round trip!
  • 23. The Art of Indexing ® Real-world delivery Keep regional warehouses • Only move stuff when you can move a lot
  • 24. The Art of Indexing ® Real-world delivery Keep regional warehouses • Only move stuff when you can move a lot Each item gets moved several times
  • 25. The Art of Indexing ® Real-world delivery Keep regional warehouses • Only move stuff when you can move a lot Each item gets moved several times But each trip is vastly cheaper
  • 26. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers
  • 27. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers
  • 28. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers
  • 29. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 30. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 31. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 32. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 33. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 34. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 35. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 36. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 37. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 38. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 39. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 40. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 41. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 42. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 43. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 44. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive
  • 45. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive Flush a buffer when it fills
  • 46. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive Flush a buffer when it fills A flush might take an I/O, but it does lots of useful work
  • 47. The Art of Indexing ® Fractal Tree Indexes B B B BB Each node has pivots & Buffers Buffers fill as updates arrive Flush a buffer when it fills A flush might take an I/O, but it does lots of useful work More changes per write ➔ fewer changes for same write load ➔ less SSD wear
  • 48. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages
  • 49. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages But query follows root-leaf path
  • 50. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages But query follows root-leaf path So every query has the most up-to-date information
  • 51. The Art of Indexing ® Fractal Tree Indexes: Queries B B B BB Lots of buffers have messages But query follows root-leaf path So every query has the most up-to-date information Messages can be insert, update, delete
  • 52. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages
  • 53. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages
  • 54. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages
  • 55. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages Insertion into root is fast, changes are immediate
  • 56. The Art of Indexing ® Fractal Tree Indexes: Schema Changes B B B BB Schema Changes are broadcast messages Leaves get rewritten when broadcast message makes it that far Insertion into root is fast, changes are immediate Schema change message still on root- leaf path
  • 57. The Art of Indexing ® Analysis Delivery system gives goodies: • Messages get moved, but each I/O pays for a lot of movement • You get very fast inserts • You get Hot Schema Changes Each flush carries lots of useful information • So it’s worth it to make nodes big • No fragmentation, Better Compression • Much better wear on SSDs

×