
# Best for b trees

Author: T.DineshRaaja

### Best for b trees

1. 1. Balanced Trees (B and B+)
2. 2. Completely Balanced Trees <ul><li>So far, we’ve always grown our trees from the root to the leaf nodes </li></ul><ul><li>Problem </li></ul><ul><ul><li>Unequal path lengths </li></ul></ul><ul><li>Goals </li></ul><ul><ul><li>A bounded maximum number of level traversals </li></ul></ul><ul><ul><li>Expand from binary to N-ary trees </li></ul></ul><ul><ul><ul><li>For a tree containing N nodes with M children per node, the height is roughly log base M of N, so a large M means a small height </li></ul></ul></ul>
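To make the effect of fan-out concrete, here is a minimal sketch. It assumes an idealized tree in which every node has exactly `fanout` children and every level is full; the function name `min_levels` is illustrative and not from the slides.

```python
def min_levels(n_nodes: int, fanout: int) -> int:
    """Smallest number of levels a tree with `fanout` children per node
    needs in order to hold at least `n_nodes` nodes (assumes fanout >= 2)."""
    levels = 0
    capacity = 0   # nodes held by the levels counted so far
    width = 1      # nodes on the next level to add (the root level has 1)
    while capacity < n_nodes:
        capacity += width
        width *= fanout
        levels += 1
    return levels


binary_levels = min_levels(1_000_000, 2)     # binary tree
wide_levels = min_levels(1_000_000, 100)     # 100-way tree
print(binary_levels, wide_levels)
```

Under these assumptions a million nodes need 20 levels in a binary tree but only 4 levels with 100 children per node.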
3. 3. B-Trees <ul><li>Balanced trees </li></ul><ul><li>Used for database index creation </li></ul><ul><ul><li>An on-disk data structure </li></ul></ul><ul><li>Nodes consist of </li></ul><ul><ul><li>N key values </li></ul></ul><ul><ul><li>N data pointers (pointing to entire data item with that key, perhaps in a sequential file) </li></ul></ul><ul><ul><li>N+1 node pointers (pointing to other B-tree nodes) </li></ul></ul><ul><li>Note </li></ul><ul><ul><li>Can calculate N given size of pointers and keys, block size </li></ul></ul>
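The note about calculating N can be sketched as follows. A node with N keys, N data pointers and N+1 node pointers must fit in one disk block, so N is the largest integer satisfying N·key + N·data_ptr + (N+1)·node_ptr ≤ block. The sizes used below (4096-byte block, 8-byte keys and pointers) are assumed values for illustration only, and the sketch ignores any per-node header.

```python
def max_keys_per_node(block_size: int, key_size: int,
                      data_ptr_size: int, node_ptr_size: int) -> int:
    """Largest N such that N keys, N data pointers and N + 1 node pointers
    fit in one block (no per-node header is accounted for)."""
    per_key = key_size + data_ptr_size + node_ptr_size
    # One node pointer is left over after pairing each key with its pointers.
    return (block_size - node_ptr_size) // per_key


n = max_keys_per_node(block_size=4096, key_size=8,
                      data_ptr_size=8, node_ptr_size=8)
print(n)
```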
4. 4. B-Trees <ul><li>Each node may contain a large number of keys </li></ul><ul><li>The number of subtrees of each node may be large </li></ul><ul><ul><li>An on-disk data structure </li></ul></ul><ul><li>Designed to branch out in a large number of directions </li></ul><ul><li>Contain lots of keys in each node </li></ul><ul><ul><li>So the height of the tree is relatively small </li></ul></ul><ul><li>Only a small number of nodes must be read from disk to retrieve an item </li></ul><ul><li>Node size is large (with lots of keys in the node), but a disk drive can usually read a fair amount of data at once. </li></ul>
5. 5. B-Tree Definition <ul><li>A multi-way tree of order m is an ordered tree where each node has at most m children </li></ul><ul><li>If k is the actual number of children in the node, </li></ul><ul><li> k - 1 is the number of keys in the node. </li></ul><ul><li>If the keys and subtrees are arranged in the fashion of a search tree, then this is called a multiway search tree of order m. </li></ul>
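A lookup in a multiway search tree can be sketched like this. The `Node` class and `search` function are illustrative names, not from the slides. A node with k children stores k - 1 sorted keys, and child `i` holds the keys that fall between key `i - 1` and key `i`.

```python
from bisect import bisect_left


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted list of k - 1 keys
        self.children = children or []    # k children, or empty for a leaf


def search(node, key):
    """Return True if `key` is stored anywhere in the tree rooted at `node`."""
    while True:
        i = bisect_left(node.keys, key)   # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:             # reached a leaf without finding it
            return False
        node = node.children[i]           # descend into the i-th subtree


# A small tree of order 4: the root has 3 keys and 4 children.
root = Node([10, 20, 30],
            [Node([5]), Node([15]), Node([25]), Node([35, 40])])
print(search(root, 25), search(root, 12))
```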
6. 6. Multi-way Search Tree of order of 4 Keys Pointers
7. 7. B-Trees
8. 8. B-Tree Properties <ul><li>A B-tree of order m is a multiway search tree of order m such that: </li></ul><ul><li>1. All leaves are on the bottom level. </li></ul><ul><li>2. All internal nodes (except perhaps the root node) have at least ⌈m / 2⌉ (nonempty) children, which keeps the tree bushy and balanced. </li></ul><ul><li>3. The root node can have as few as 2 children if it is an internal node, and can obviously have no children if the root node is a leaf (that is, the whole tree consists only of the root node). </li></ul>
9. 9. B-Tree Properties 4. Each leaf node (other than the root node if it is a leaf) must contain at least ⌈m / 2⌉ - 1 keys. Note: ⌈x⌉ is the ceiling function, whose value is the smallest integer that is greater than or equal to x. E.g., ⌈3⌉ = 3, ⌈3.34⌉ = 4, ⌈1.98⌉ = 2, ⌈5.001⌉ = 6.
10. 10. B-Tree Properties A B-tree is a fairly well-balanced tree since all leaf nodes must be at the bottom. Recall condition 2: all internal nodes (except perhaps the root node) have at least ⌈m / 2⌉ (nonempty) children, which keeps the tree bushy and balanced. This causes the tree to fan out, i.e., to have a shorter height.
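The occupancy limits implied by these properties can be computed directly from the order m. This is a small sketch; the function name `btree_limits` is illustrative.

```python
def btree_limits(m: int) -> dict:
    """Occupancy limits for non-root nodes of a B-tree of order m."""
    min_children = -(-m // 2)          # integer form of ceil(m / 2)
    return {
        "max_children": m,
        "max_keys": m - 1,
        "min_children": min_children,  # applies to internal nodes
        "min_keys": min_children - 1,
    }


limits = btree_limits(5)
print(limits)
```

For order 5 this gives at most 5 children and 4 keys, and at least 3 children and 2 keys.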
11. 11. B-Tree Insertion <ul><li>Insert keys in order in a single block until it fills </li></ul><ul><li>When a value must be added where there is no room, split the node into two nodes </li></ul><ul><ul><li>Smaller half (rounded down) of values go into the first node </li></ul></ul><ul><ul><li>Larger half (rounded down) of values go into the second node </li></ul></ul><ul><ul><li>Median value goes up into the parent node (a new one if the node that split was the root) </li></ul></ul><ul><ul><li>Node pointers around the median value point to the two nodes at the lower level </li></ul></ul>
12. 12. Insertion Example Insert the following letters into what is originally an empty B-tree of order 5: C N G A H E K Q M F W L T Z D P R X Y S. Order 5 means a maximum of 5 children and 4 keys. All nodes (except the root) must have a minimum of 2 keys. The first 4 letters go into the root in alphabetical order: A C G N.
13. 13. Insertion Example Insert H next. No room. Split into 2 nodes. Move the median item G up into a new root node. Insert E, K, Q next.
14. 14. Insertion Example C N G A H E K Q M F W L T Z D P R X Y S Inserting E, K, and Q doesn’t require splits. But inserting M does: the node splits into 2 and the median (M) moves up.
15. 15. Insertion Example C N G A H E K Q M F W L T Z D P R X Y S F, W, L, and T are then added without needing any split.
16. 16. Insertion Example C N G A H E K Q M F W L T Z D P R X Y S Adding Z requires the node to split. Move the median (T) up and split the node.
17. 17. Insertion Example Z
18. 18. C N G A H E K Q M F W L T Z D P R X Y S Inserting D requires a split; D, which is the median too, moves up. Insert P, R, X, Y without any splitting.
19. 19. C N G A H E K Q M F W L T Z D P R X Y S <ul><li>When S is inserted, the node must split, and Q (the median) goes up. </li></ul><ul><li>Q’s parent is full, so split further: M (the median of Q’s parent) </li></ul><ul><li>goes up. </li></ul>
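The insertion walkthrough above can be reproduced with a short sketch. It follows the split rule from the slides (insert into the leaf, split on overflow, push the median up), but the class and method names are illustrative, and duplicate keys and deletion are not handled.

```python
from bisect import bisect_left, insort


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []     # empty list means leaf


class BTree:
    def __init__(self, order):
        self.order = order                 # maximum children per node
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                          # the root split: grow a new root
            median, right = split
            self.root = Node([median], [self.root, right])

    def _insert(self, node, key):
        if not node.children:              # leaf: place key in sorted position
            insort(node.keys, key)
        else:
            i = bisect_left(node.keys, key)
            split = self._insert(node.children[i], key)
            if split:                      # child split: absorb its median
                median, right = split
                node.keys.insert(i, median)
                node.children.insert(i + 1, right)
        if len(node.keys) < self.order:    # at most order - 1 keys: no overflow
            return None
        mid = len(node.keys) // 2          # overflow: split around the median
        median = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return median, right


tree = BTree(order=5)
for letter in "CNGAHEKQMFWLTZDPRXYS":
    tree.insert(letter)

left, right = tree.root.children
print(tree.root.keys, left.keys, right.keys)
print([leaf.keys for leaf in left.children + right.children])
```

With the letters from the example, the root ends up holding M, with D G and Q T one level down, matching the final step of the walkthrough.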
20. 20. B Tree Deletion <ul><li>Delete H </li></ul><ul><ul><li>H is in a leaf node. Easy. </li></ul></ul><ul><ul><li>Move K and L over to the left: </li></ul></ul>
21. 21. B Tree Deletion <ul><li>Delete T (non-leaf): find T’s successor (i.e., W) and move it up to replace T </li></ul>
22. 22. B Tree Deletion <ul><li>Delete T </li></ul><ul><ul><li>In all cases, the deletion ends up removing a key from a leaf, which is easy if the leaf has extra keys </li></ul></ul>
23. 23. B Tree Deletion <ul><li>Delete R next </li></ul><ul><ul><li>R is in a leaf node. </li></ul></ul><ul><ul><li>Move S to R’s spot, move W to S’s spot </li></ul></ul><ul><ul><li>Move X up to W’s spot </li></ul></ul>
24. 24. B Tree Deletion
25. 25. B Tree Deletion Delete E next. This is very problematic: neither E’s leaf nor its siblings have extra keys. Combine the leaf with one of its two siblings, and move down the parent’s key that was between these two siblings.
26. 26. B Tree (Delete E)
27. 27. B Tree Deletion <ul><li>But node G must have at least ⌈5 / 2⌉ - 1 = 2 keys </li></ul><ul><li>G cannot “borrow” a key from its sibling (the QX node), which has no extra keys </li></ul><ul><li>Combine the siblings and the parent node into one (root) node </li></ul>
28. 28. B Tree Deletion
29. 29. B+ Trees
30. 30. B+-Tree <ul><li>Problem with B-Tree </li></ul><ul><ul><li>Somewhat unequal access times </li></ul></ul><ul><ul><li>Difficult to traverse index sequentially </li></ul></ul><ul><li>B+-Tree </li></ul><ul><ul><li>All data stored at lowest level (leaf nodes) </li></ul></ul><ul><ul><li>Some values duplicated in internal nodes for indexing </li></ul></ul><ul><ul><li>Cost: extra storage, duplication, two types of nodes </li></ul></ul><ul><ul><li>Benefits: sequential access across bottom level </li></ul></ul>
31. 31. B+-Trees <ul><ul><li>Variant of B-trees </li></ul></ul><ul><ul><li>All data stored at lowest level (leaf nodes) </li></ul></ul><ul><ul><li>All leaves are linked sequentially </li></ul></ul><ul><ul><li>Used as a dynamic indexing method in relational DBs </li></ul></ul><ul><ul><li>Contains index pages and data pages. </li></ul></ul><ul><ul><li>Root and non-leaf nodes are index pages. </li></ul></ul>
32. 32. B+-Tree Example <ul><ul><li>With 4 keys </li></ul></ul><ul><ul><li>Leaf nodes are linked to each other via doubly linked lists (not shown) </li></ul></ul>
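Because all data sits in linked leaves, a range query can walk the bottom level without going back up through the index. The sketch below shows only that leaf chain. The `Leaf` class, the helper names and the key values are assumptions for illustration, and the scan starts at the first leaf rather than descending the index to find its starting point.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys      # sorted keys stored in this leaf page
        self.prev = None      # doubly linked: previous leaf
        self.next = None      # doubly linked: next leaf


def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next, b.prev = b, a
    return leaves[0]


def range_scan(start_leaf, lo, hi):
    """Collect all keys k with lo <= k <= hi by following next pointers."""
    found = []
    leaf = start_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:        # keys are sorted, so nothing further can match
                return found
            if k >= lo:
                found.append(k)
        leaf = leaf.next
    return found


first = link([Leaf([5, 10, 15, 20]), Leaf([25, 28, 30]),
              Leaf([50, 55, 60, 65]), Leaf([75, 80, 85, 90])])
result = range_scan(first, 20, 55)
print(result)
```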
33. 33. B+-Tree Insertions <ul><ul><li>Key value determines placement </li></ul></ul><ul><ul><li>Three cases for insertion: </li></ul></ul>
34. 34. The insert algorithm for B+ Trees <ul><li>Leaf page full: No. Index page full: No. Action: place the record in sorted position in the appropriate leaf page. </li></ul><ul><li>Leaf page full: Yes. Index page full: No. Action: </li></ul><ul><ul><li>Split the leaf page. </li></ul></ul><ul><ul><li>Place the middle key in the index page in sorted order. </li></ul></ul><ul><ul><li>Left leaf page contains records with keys below the middle key. </li></ul></ul><ul><ul><li>Right leaf page contains records with keys equal to or greater than the middle key. </li></ul></ul>
35. 35. The insert algorithm for B+ Trees <ul><li>Leaf page full: Yes. Index page full: Yes. Action: </li></ul><ul><ul><li>Split the leaf page. </li></ul></ul><ul><ul><li>Records with keys < middle key go to the left leaf page. </li></ul></ul><ul><ul><li>Records with keys >= middle key go to the right leaf page. </li></ul></ul><ul><ul><li>Split the index page. </li></ul></ul><ul><ul><li>Keys < middle key go to the left index page. </li></ul></ul><ul><ul><li>Keys > middle key go to the right index page. </li></ul></ul><ul><ul><li>The middle key goes to the next (higher level) index. If the next level index page is full, continue splitting the index pages. </li></ul></ul>
36. 36. Insertion Example Adding record with Key 28:
37. 37. Insertion Example <ul><li>Adding record with Key 70: </li></ul><ul><li>1. It should go into the leaf containing 50 55 60 65, but that leaf is full. </li></ul><ul><li>2. Split the page as follows: left leaf 50 55; right leaf 60 65 70. </li></ul><ul><li>3. Middle key (60) is placed in the corresponding parent index page. </li></ul>
38. 38. Insert (Leaf full; index not) 70
39. 39. Insert (Leaf and index pages are full) Add 95: it belongs in the leaf 75 80 85 90, which is full. Split the leaf page into 2: 75 80 and 85 90 95. The middle key (85) goes up, but the parent is full, so split the parent: 25 50 60 75 85. The middle key (60) is made a new parent of the two resulting index pages.
40. 40. Insert (Leaf and index pages are full) 95
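The two split rules differ in one detail: a leaf split copies the middle key up and keeps it in the right leaf, while an index split moves the middle key up and keeps it in neither half. A minimal sketch, with illustrative function names and the middle position taken as `len(keys) // 2` (an assumption that matches the examples above):

```python
def split_leaf(keys):
    """Split an overflowing, sorted leaf page.
    Returns (left_keys, middle_key, right_keys); the middle key stays in
    the right leaf and a copy of it goes to the parent index page."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid:]


def split_index(keys):
    """Split an overflowing, sorted index page.
    Returns (left_keys, middle_key, right_keys); the middle key moves up
    and is kept in neither half."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


leaf_70 = split_leaf([50, 55, 60, 65, 70])     # adding key 70
leaf_95 = split_leaf([75, 80, 85, 90, 95])     # adding key 95
index_95 = split_index([25, 50, 60, 75, 85])   # parent after 85 comes up
print(leaf_70, leaf_95, index_95)
```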
41. 41. Rotation When a leaf node is full and its sibling is not, rotation reduces the number of page splits. E.g., add 70: previously, we split the 50 55 60 65 node and brought 60 up. Instead, move a record to its sibling:
42. 42. The delete algorithm for B+ Trees <ul><li>Leaf page below fill factor: No. Index page below fill factor: No. Action: delete the record from the leaf page. Arrange keys in ascending order to fill the void. If the key of the deleted record appears in the index page, use the next key to replace it. </li></ul><ul><li>Leaf page below fill factor: Yes. Index page below fill factor: No. Action: combine the leaf page and its sibling. Change the index page to reflect the change. </li></ul><ul><li>Leaf page below fill factor: Yes. Index page below fill factor: Yes. Action: </li></ul><ul><ul><li>Combine the leaf page and its sibling. </li></ul></ul><ul><ul><li>Adjust the index page to reflect the change. </li></ul></ul><ul><ul><li>Combine the index page with its sibling. Continue combining index pages until you reach a page with the correct fill factor or you reach the root page. </li></ul></ul>
43. 43. Deletion Example Delete 70: OK, since the leaf still meets the fill factor of 50% (the minimum number of records in a node)
44. 44. Deletion Example Now delete 25. Leaf: OK (fill factor satisfied). Index: not OK, so replace 25 with 28
45. 45. Deletion Example Delete 60: fill factor < 50%, so combine leaf pages and index pages
46. 46. Deletion Example