A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each node may contain a large number of keys. The number of subtrees of each node, then, may also be large. A B-tree is designed to branch out in this large number of directions and to contain a lot of keys in each node so that the height of the tree is relatively small. This means that only a small number of nodes must be read from disk to retrieve an item. The goal is to get fast access to the data, and with disk drives this means reading a very small number of records. Note that a large node size (with lots of keys in the node) also fits with the fact that with a disk drive one can usually read a fair amount of data at once.
Transcript
1.
B-Tree: An Analysis. By: Nikhil Sharma BE/8034/09
2.
Definition A B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in which more than two paths may diverge from a single node. A B-tree of order m (the maximum number of children for each node) is a tree which satisfies the following properties: Every node has at most m children. Every node (except root and leaves) has at least ⌈m/2⌉ children. The root has at least two children if it is not a leaf node. All leaves appear in the same level, and carry information. A non-leaf node with k children contains k−1 keys. Declaration in C (here for a node with at most 3 keys and 4 branches):
    typedef struct {
        int Count;          // number of keys stored in the current node
        ItemType Key[3];    // array to hold the 3 keys
        long Branch[4];     // array of fake pointers (record numbers)
    } NodeType;
3.
Order & Key of a B-Tree The following is an example of a B-tree of order 5. This means that (other than the root node) all internal nodes have at least 3 children (and hence at least 2 keys). Of course, the maximum number of children that a node can have is 5 (so that 4 is the maximum number of keys). In practice B-trees usually have orders a lot bigger than 5. The first row in each node shows the keys, while the second row shows the pointers to the child nodes.
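The arithmetic relating the order to the key and child limits can be sketched in C as follows. The names `OrderBounds` and `order_bounds` are hypothetical helpers introduced only for this illustration, and the restriction to m ≥ 3 is an assumption of the sketch.

```c
#include <assert.h>

/* Limits implied by the order m (maximum number of children per node). */
typedef struct {
    int max_children;  /* m */
    int min_children;  /* ceil(m / 2), for internal nodes other than the root */
    int max_keys;      /* m - 1 */
    int min_keys;      /* ceil(m / 2) - 1, for nodes other than the root */
} OrderBounds;

OrderBounds order_bounds(int m)
{
    OrderBounds b;

    assert(m >= 3);  /* assumption: smaller orders are not considered here */
    b.max_children = m;
    b.min_children = (m + 1) / 2;  /* integer form of ceil(m / 2) */
    b.max_keys = m - 1;
    b.min_keys = b.min_children - 1;
    return b;
}
```

For m = 5 this gives at least 3 children and 2 keys, and at most 5 children and 4 keys, matching the figures quoted on the slide.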
4.
Height of B-Tree If n ≥ 1, then for any n-key B-tree T of height h and minimum degree t ≥ 2, h ≤ log_t((n + 1)/2). The height of a B-tree with n keys is important because it bounds the number of disk accesses. The height of the tree is maximum when each node has the minimum number of subtree pointers, q = ⌈m/2⌉ (so q plays the role of t in the bound). Note: If the number of keys in a B-tree equals 2,000,000 (2 million) and m = 200, then q = 100 and the maximum height of the B-tree is log_100(1,000,000) = 3, whereas a balanced binary tree would have a height of about 20.
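As a rough check of the height bound, the following C sketch computes the largest h that satisfies t^h ≤ (n + 1)/2 using integer arithmetic only. The function name `max_height` is hypothetical and not part of the slides; it assumes the height is counted with the root at height 0, as in the bound above.

```c
#include <assert.h>

/* Largest h such that t^h <= (n + 1) / 2, i.e. floor(log_t((n + 1) / 2)). */
int max_height(unsigned long long n, unsigned long long t)
{
    unsigned long long limit;
    unsigned long long power = 1;  /* holds t^h */
    int h = 0;

    assert(n >= 1 && t >= 2);
    limit = (n + 1) / 2;
    /* power <= limit / t is the overflow-safe form of power * t <= limit */
    while (power <= limit / t) {
        power *= t;
        h++;
    }
    return h;
}
```

With n = 2,000,000 and t = 100 (the slide's m = 200), `max_height` returns 3, which agrees with the note on the slide.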
5.
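Search in a B-Tree Search in a B-tree is similar to search in a BST, except that in a B-tree we make a multiway branching decision at each node instead of the binary branching decision of a BST. Example tree: root [25 62]; second level [12 19] [32 39] [73 84]; leaves [3 5] [15 17] [21 23] [30 31] [34 37] [45 51] [69 71] [75 79] [90 94]. Search key 71: 71 > 62, so follow the third branch of the root to [73 84]; 71 < 73, so follow its first branch to the leaf [69 71], where the key is found.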
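The multiway branching decision can be sketched in C as below. This is a minimal in-memory illustration, not the disk-based layout of the declaration on slide 2: it assumes real pointers instead of record numbers, and the names `Node`, `btree_search`, `example_leaf`, `example_mid` and `example_root` are hypothetical. The static nodes reproduce the example tree with search key 71.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEYS 2  /* assumption: just large enough for the example tree */

typedef struct Node {
    int count;                         /* number of keys in use */
    int key[MAX_KEYS];                 /* keys in ascending order */
    struct Node *child[MAX_KEYS + 1];  /* all NULL in a leaf */
} Node;

/* The example tree, built from statically allocated nodes. */
Node example_leaf[9] = {
    {2, {3, 5}, {NULL}},   {2, {15, 17}, {NULL}}, {2, {21, 23}, {NULL}},
    {2, {30, 31}, {NULL}}, {2, {34, 37}, {NULL}}, {2, {45, 51}, {NULL}},
    {2, {69, 71}, {NULL}}, {2, {75, 79}, {NULL}}, {2, {90, 94}, {NULL}}
};
Node example_mid[3] = {
    {2, {12, 19}, {&example_leaf[0], &example_leaf[1], &example_leaf[2]}},
    {2, {32, 39}, {&example_leaf[3], &example_leaf[4], &example_leaf[5]}},
    {2, {73, 84}, {&example_leaf[6], &example_leaf[7], &example_leaf[8]}}
};
Node example_root = {
    2, {25, 62}, {&example_mid[0], &example_mid[1], &example_mid[2]}
};

/* Returns the node that contains target, or NULL if it is absent.
 * *visited is set to the number of nodes examined on the way down. */
const Node *btree_search(const Node *node, int target, int *visited)
{
    assert(visited != NULL);
    *visited = 0;
    while (node != NULL) {
        int i = 0;

        (*visited)++;
        /* Skip every key smaller than target: this picks the branch. */
        while (i < node->count && target > node->key[i])
            i++;
        if (i < node->count && node->key[i] == target)
            return node;
        node = node->child[i];  /* NULL when we fall off a leaf */
    }
    return NULL;
}
```

Searching `example_root` for 71 examines three nodes, one per level, which is the quantity that matters when each node is a separate disk read.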
6.
B-Tree Insert Operation Insertion in a B-tree is more complicated than in a BST. In a BST, keys are added in top-down fashion, which can result in an unbalanced tree. A B-tree is built bottom-up: keys are added in a leaf node; if the leaf node is full, another node is created, the keys are evenly distributed between the two nodes, and the middle key is promoted to the parent. If the parent is full, the process is repeated. A B-tree can also be built in top-down fashion using a pre-splitting technique.
7.
Basic Idea : Insertion Find the position for the key in the appropriate leaf node. Is the node full? No: insert the key in order and adjust pointers. Yes: split the node: • Create a new node • Move half of the keys from the full node to the new node and adjust pointers • Promote the median key (before split) to the parent. If the parent is full, split it in the same way. Split guarantees that each node has at least ⌈m/2⌉ − 1 keys.
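The split step can be sketched in C for a single full leaf. This is an illustration under stated assumptions rather than a complete insertion routine: it works on plain key arrays, assumes order 5 (at most 4 keys per node, via the hypothetical constant `LEAF_KEYS`), and leaves the insertion of the promoted key into the parent to the caller. The function name `split_leaf` is also hypothetical.

```c
#include <assert.h>

#define LEAF_KEYS 4  /* assumption: order 5, so a full node holds 4 keys */

/* Inserts new_key into a full, sorted leaf and splits the result.
 * left[] and right[] receive the two halves; the return value is the
 * median key, which the caller must insert into the parent. */
int split_leaf(const int full[LEAF_KEYS], int new_key,
               int left[], int *left_count,
               int right[], int *right_count)
{
    int all[LEAF_KEYS + 1];
    int mid = (LEAF_KEYS + 1) / 2;  /* index of the median among 5 keys */
    int i = 0, j;

    assert(left && right && left_count && right_count);

    /* Build the sorted sequence of the old keys plus the new key. */
    while (i < LEAF_KEYS && full[i] < new_key) {
        all[i] = full[i];
        i++;
    }
    all[i] = new_key;
    for (j = i; j < LEAF_KEYS; j++)
        all[j + 1] = full[j];

    /* Keys below the median stay, keys above it go to the new node. */
    *left_count = mid;
    for (j = 0; j < mid; j++)
        left[j] = all[j];
    *right_count = LEAF_KEYS - mid;
    for (j = 0; j < *right_count; j++)
        right[j] = all[mid + 1 + j];
    return all[mid];
}
```

For a full leaf holding 14 19 20 23 and the new key 16, the sketch yields the halves 14 16 and 20 23 and promotes 19, so both halves keep the minimum of 2 keys.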
8.
Cases in B-Tree Insert Operation In B-tree insertion we have the following cases: ◦ Case 1: The leaf node has room for the new key. ◦ Case 2: The leaf in which the key is to be placed is full. This case can lead to an increase in tree height.
9.
B-Tree Insert Operation Case 1: The leaf node has room for the new key. Example: insert 3. Tree: root [10 25]; leaves [5 8] [14 19 20 23] [32 38]. Find the appropriate leaf node for key 3, which is [5 8], and insert 3 in order, giving [3 5 8].
10.
B-Tree Insert Operation Case 2: The leaf in which the key is to be placed is full. Example: insert 16. Tree: root [10 25]; leaves [3 5 8] [14 19 20 23] [32 38]. Find the appropriate leaf node for key 16, which is [14 19 20 23]. There is no room for key 16 in the leaf node. Split node: create a new node and move keys to the new node, giving [14 16] and [20 23]. Move the median key 19 up and insert it in the parent node in order, giving the root [10 19 25].
11.
B-Tree Insert Operation Case 2: The leaf in which the key is to be placed is full, and this leads to an increase in tree height. (Figure: a node holding the keys 45 55 67 81.)
12.
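B-Tree Insert Operation Case 2: The height of the tree increases. Example: insert 16. Tree before insertion: root [45 55 67 81]; second level [13 27 33 38] [48 52] [57 61] [72 77] [86 92]; leaves under [13 27 33 38]: [9 12] [14 19 20 23] [29 31] [35 36] [41 42]. No room for key 16 in the leaf [14 19 20 23]: move the median key 19 up and split the node into [14 16] and [20 23]. Insert 19 in the parent node in order. No room for 19 in the parent [13 27 33 38]: split the parent node into [13 19] and [33 38] and move the median key 27 up. Insert 27 in its parent, the root, in order. No room for 27 in the root: split the node into [27 45] and [67 81]; the median key 55 becomes the new root, so the height of the tree increases by one.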
13.
B-Tree Delete Operation Deletion is analogous to insertion, but a little more complicated. Two major cases: ◦ Case 1: Deletion from a leaf node ◦ Case 2: Deletion from a non-leaf node. For Case 2, apply the delete-by-copy technique used in BST; this reduces Case 2 to Case 1. In delete by copy, the key to be deleted is replaced by the smallest key in its right subtree or the largest key in its left subtree (which is always in a leaf), and that key is then deleted from the leaf.
14.
B-Tree Delete Operation Leaf node deletion cases: ◦ After deletion the node is at least half full. ◦ After deletion underflow occurs: redistribute if a sibling has more than ⌈m/2⌉ − 1 keys; merge nodes if the siblings have only ⌈m/2⌉ − 1 keys each. Merging can lead to a decrease in tree height.
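The choice between the leaf deletion cases can be sketched in C as a small decision function. The names `DeleteAction` and `after_leaf_delete` are hypothetical, the node is assumed not to be the root, and a missing sibling is represented by a key count of -1 (a convention chosen only for this sketch).

```c
#include <assert.h>

typedef enum { NO_UNDERFLOW, REDISTRIBUTE, MERGE } DeleteAction;

/* Decides what to do after a key has been removed from a non-root leaf.
 * node_keys is the key count after the deletion; left_keys and right_keys
 * are the key counts of the adjacent siblings, or -1 if there is none. */
DeleteAction after_leaf_delete(int m, int node_keys,
                               int left_keys, int right_keys)
{
    int min_keys = (m + 1) / 2 - 1;  /* ceil(m / 2) - 1 */

    assert(m >= 3 && node_keys >= 0);
    if (node_keys >= min_keys)
        return NO_UNDERFLOW;  /* node is still at least half full */
    if (left_keys > min_keys || right_keys > min_keys)
        return REDISTRIBUTE;  /* some sibling can spare a key */
    return MERGE;             /* every sibling is at the minimum */
}
```

For order 5 the minimum is 2 keys, so a leaf left with 1 key next to a sibling with 4 keys is redistributed, while a leaf left with 1 key between two siblings with 2 keys each is merged.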
15.
B-Tree Delete Operation After deletion the node is at least half full (inverse of insertion case 1). Example: delete 3. Tree: root [10 25]; leaves [3 5 8] [14 19] [32 38 40 45]. Search for key 3. Key found, delete key 3. Move the other keys in the node to eliminate the gap, giving [5 8].
16.
B-Tree Delete Operation Underflow occurs; evenly redistribute the keys if the left or right sibling has more than ⌈m/2⌉ − 1 keys. Example: delete 14. Tree: root [10 25]; leaves [5 8] [14 19] [32 38 40 45]. Search for key 14 and delete it, leaving [19]. Underflow occurs: evenly redistribute the keys in the underflow node, in its sibling, and the separator key.
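Redistribution through the separator key can be sketched in C as follows. The slide does not say exactly how an even split is rounded, so this sketch makes an assumption: when the keys cannot be divided equally, the right sibling keeps the extra key. The function name `redistribute_from_right` is hypothetical, and only the case of borrowing from a right sibling is shown.

```c
#include <assert.h>

/* Moves keys from a right sibling into an underflowing node through the
 * parent's separator key. node[] must have room for the keys it gains.
 * Returns the new separator key for the parent. */
int redistribute_from_right(int node[], int *node_count,
                            int separator,
                            int sibling[], int *sibling_count)
{
    int total = *node_count + 1 + *sibling_count;  /* includes separator */
    int keep = (total - 1) / 2;       /* keys the left node should end with */
    int moved = keep - *node_count;   /* keys the left node has to gain */
    int new_separator;
    int i;

    assert(moved >= 1 && moved <= *sibling_count);

    /* The old separator comes down first, then the sibling's smallest keys. */
    node[(*node_count)++] = separator;
    for (i = 0; i < moved - 1; i++)
        node[(*node_count)++] = sibling[i];

    /* The next key of the sibling goes up as the new separator. */
    new_separator = sibling[moved - 1];

    /* Close the gap at the front of the sibling. */
    for (i = moved; i < *sibling_count; i++)
        sibling[i - moved] = sibling[i];
    *sibling_count -= moved;
    return new_separator;
}
```

For the underflowing node 19, separator 25 and right sibling 32 38 40 45, this rounding gives the nodes 19 25 and 38 40 45 with 32 as the new separator. A different rounding would move one more key and is equally valid.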
17.
B-Tree Delete Operation Underflow occurs and the left and right siblings each have ⌈m/2⌉ − 1 keys. Merge the underflow node and a sibling. Example: delete 25. Tree: root [10 32]; leaves [5 8] [19 25] [38 40]. After the deletion the node [19] underflows. Merge nodes: move the separator key down, move the keys to the underflow node and discard the sibling.
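The merge step can be sketched in C on plain key arrays. The slide does not state which sibling is used, so this sketch assumes the two nodes are passed in left-to-right order and the result is written into the left array; the function name `merge_nodes` and the `capacity` parameter are hypothetical. Removing the separator from the parent is left to the caller.

```c
#include <assert.h>

/* Merges two adjacent nodes: the keys of the left node, then the
 * separator key moved down from the parent, then the keys of the right
 * node. The result is stored in left[]; the new key count is returned. */
int merge_nodes(int left[], int left_count, int separator,
                const int right[], int right_count, int capacity)
{
    int n = left_count;
    int i;

    /* The merged node must not exceed the maximum number of keys. */
    assert(left_count + 1 + right_count <= capacity);

    left[n++] = separator;
    for (i = 0; i < right_count; i++)
        left[n++] = right[i];
    return n;
}
```

Merging 5 8 with the underflowing node 19 and the separator 10 gives 5 8 10 19, which fits exactly in a node of order 5. The parent is left with the single key 32.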
18.
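B-Tree Delete Operation Underflow occurs, height decreases after merging. Example: delete 21. Tree: root [70]; second level [8 32] [79 85]; leaves [3 5] [21 27] [47 66] [73 75 78] [81 83] [88 90 92]. After the deletion the leaf [27] underflows and its siblings are at the minimum, so merge nodes by moving the separator key and the keys in the sibling node to the underflow node. The parent [8 32] then loses a key and underflows in turn; merging it with its sibling [79 85] and the separator key 70 leaves a single node as the new root, so the height of the tree decreases by one.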
19.
B-Tree V/s Binary Tree Advantages: • Efficient in real-life problems where the number of records is very large (i.e. large datasets) • Frees up RAM, as the nodes are located in secondary memory • A B-tree reduces the depth of the tree, hence the desired record is located faster. Disadvantages: • The decision process at each node is more complicated in a B-tree • A sophisticated program is required to execute the operations in a B-tree. Fig. Comparison of linear growth rate vs. logarithmic growth rate.