Revisiting B+-trees

A talk about btree4j (https://github.com/myui/btree4j/), given at BDI on Mar 30, 2018.

Published in: Software

  1. Tutorial: Revisiting Disk-based B+-trees. Research Engineer, Treasure Data, Makoto YUI (@myui). 2018/3/30, BDI@Fujitsu.
  2. If you are a database researcher, you can at least write a B-tree, right?
  3. …
  5. https://github.com/myui/btree4j XML B+-Trees
  6. History of B+-Trees. Douglas Comer. 1979. Ubiquitous B-Tree. ACM Comput. Surv. 11, 2 (June 1979). bit.ly/bit-btree Bit’79
  7. B+-Trees CPU →
  8. B+-Trees • B
  9. B+-Trees • B • B Fanout
  10. B+-Trees key, pointer key, value, rightlink
  11. B+-Trees RDB value Fanout
  12. Btree4j (1/2) • B+-trees • Paging using LRU cache replacement policy / Freespace mgmt. • Key Value • Prefix B-trees (Rudolf Bayer and Karl Unterauer. "Prefix B-trees". ACM Trans. Database Syst. 2, 1 (March 1977), pp. 11-26) • 8 bytes Variable-bytes coding keys/values • value key • DB unique non-unique • Delete/Update • Prefix search, Range, wildcard LIKE • DB
  13. Btree4j (2/2) • Bulk-loading • Indexed File • Fanout leaf value RDB value
  14. Btree4j (2/2), Cons (Disclaimer) • 2006 (Java5) Xindice B-tree Modern Java → Lambda, Stream API Preconditions • read-most XML-DB read-write → OLTP PR ;-) 2-3 weeks (?)
  15. Why prefix B-trees? 1. Goetz Graefe. 2011. Modern B-Tree Techniques. Foundations and Trends in Databases 3, 4, pp. 203-402. 2. Douglas Comer. 1979. Ubiquitous B-Tree. ACM Comput. Surv. 11, 2 (June 1979).
  16. Prefix B-trees: computer electronic "e" → … BTreeNode#getSeparator() leaf split
  17. Btree4j Internal – Paged file • (Paged) File Header: 4k bytes • Page: 4k bytes • Page header: 127 bytes • https://xml.apache.org/xindice/dev/guide-internals.html
  18. Btree4j Internal – File header
  19. Btree4j Internal – Page header • next_page right link page in-memory LRU • Dirty Page • leftmost key, rightmost key • prefix: aaaa, aaabb, aaaccc, aaaddd "aaa" prefix a, bb, ccc, ddd fanout
  20. Btree4j Internal – Pages and records • 1 page = 4k bytes • btree4j Datapage • Fragmentation • Freespace
  21. Btree4j Internal – Overflow page
  22. Indexed file range scan
  23. https://github.com/apache/derby/tree/trunk/java/engine/org/apache/derby/impl/store/access/btree http://svn.apache.org/repos/asf/xml/xindice/trunk/java/src/org/apache/xindice/core/filer/ https://github.com/postgres/postgres/tree/master/src/backend/access/nbtree http://pages.cs.wisc.edu/~jignesh/cs564/schedule.html DB 3NF
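
Slide 16 pairs the keys computer and electronic with the separator "e" at a leaf split. The idea can be sketched roughly as below, assuming keys are plain Java strings ordered by String.compareTo. The class SeparatorSketch and the method shortestSeparator are hypothetical names chosen for illustration; this is not the code behind btree4j's BTreeNode#getSeparator().

```java
final class SeparatorSketch {

    /**
     * Returns the shortest string s with left < s <= right.
     * Assumption: left is the largest key staying in the left leaf and
     * right is the smallest key moving to the right leaf, so left < right.
     */
    static String shortestSeparator(String left, String right) {
        if (left.compareTo(right) >= 0) {
            throw new IllegalArgumentException("left must sort strictly before right");
        }
        int limit = Math.min(left.length(), right.length());
        int i = 0;
        // Skip the characters both keys have in common.
        while (i < limit && left.charAt(i) == right.charAt(i)) {
            i++;
        }
        // One character past the common prefix is enough to tell the keys apart.
        return right.substring(0, i + 1);
    }

    public static void main(String[] args) {
        System.out.println(shortestSeparator("computer", "electronic")); // prints "e"
    }
}
```

Storing only "e" in the parent instead of the full key "electronic" keeps index entries short, which lets more separators fit in a page and so raises the fanout.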

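Slide 19 lists the keys aaaa, aaabb, aaaccc, aaaddd with the shared prefix "aaa" and the remaining suffixes a, bb, ccc, ddd. A minimal sketch of that per-page prefix compression follows, assuming the keys of a page are sorted strings, so the prefix shared by the leftmost and rightmost key is shared by every key in between. PagePrefixSketch and its methods are hypothetical names for illustration, not btree4j's actual page layout code.

```java
import java.util.ArrayList;
import java.util.List;

final class PagePrefixSketch {

    /** Longest common prefix of two strings. */
    static String commonPrefix(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }

    /** Prefix shared by all keys of a sorted page: compare only leftmost and rightmost key. */
    static String pagePrefix(List<String> sortedKeys) {
        if (sortedKeys.isEmpty()) {
            return "";
        }
        return commonPrefix(sortedKeys.get(0), sortedKeys.get(sortedKeys.size() - 1));
    }

    /** Returns the keys with the shared prefix removed, in the same order. */
    static List<String> stripPrefix(List<String> keys, String prefix) {
        List<String> suffixes = new ArrayList<>();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                throw new IllegalArgumentException(key + " does not start with " + prefix);
            }
            suffixes.add(key.substring(prefix.length()));
        }
        return suffixes;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        keys.add("aaaa");
        keys.add("aaabb");
        keys.add("aaaccc");
        keys.add("aaaddd");
        String prefix = pagePrefix(keys);
        System.out.println(prefix);                    // prints "aaa"
        System.out.println(stripPrefix(keys, prefix)); // prints "[a, bb, ccc, ddd]"
    }
}
```

Keeping the prefix once per page and only the suffixes per entry means more entries fit in a fixed-size page, which is how the slide ties this technique to fanout.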