Motivation
• When data is too large to fit in main memory, the
number of disk accesses becomes important.
• A disk access is orders of magnitude more expensive
than a typical computer instruction (because of
mechanical limitations).
• One disk access costs about as much as 200,000
instructions.
• The number of disk accesses will dominate the
running time.
Motivation Cont..
• Secondary memory (disk) is divided into equal-sized
blocks (typical sizes are 512, 2048, 4096 or
8192 bytes).
• The basic I/O operation transfers the contents of
one disk block to/from main memory.
• Our goal is to devise a multiway search tree that
will minimize file accesses (by exploiting disk
block read).
m-ary Trees
• A node contains multiple keys.
• Order of subtrees is based on parent node keys
• If each node has m children and there are n keys,
then the tree has about log_m n levels, so a search
visits about log_m n nodes.
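[Figure: a node holding keys K1, K2, K3, K4, etc. Subtree T1 holds keys K < K1, subtree T2 holds keys K1 < K < K2, subtree T3 follows K2, and so on.]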
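The effect of the branching factor on tree height can be checked with a little arithmetic. The sketch below is only an illustration: the function name `min_levels` is hypothetical, and it assumes a completely full m-ary search tree, where every node has m - 1 keys and m children, so that h levels hold m^h - 1 keys.

```python
def min_levels(n_keys: int, m: int) -> int:
    """Smallest number of levels h such that a completely full m-ary
    search tree (m - 1 keys per node) can hold n_keys keys.
    A full tree with h levels holds m**h - 1 keys."""
    levels = 0
    capacity = 0
    while capacity < n_keys:
        levels += 1
        capacity = m ** levels - 1
    return levels


# One million keys: a binary tree needs 20 levels,
# while a tree with 256 children per node needs only 3.
binary_levels = min_levels(1_000_000, 2)
wide_levels = min_levels(1_000_000, 256)
```

If each level costs one disk access, the wider node cuts the accesses per search from 20 to 3 under these assumptions.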
Searching m-ary Trees
A generalized symmetric-order traversal (SOT) visits all
keys in ascending order.
for (i = 1; i <= m-1; i++) {
    visit subtree to left of k_i
    visit k_i
}
visit subtree to right of k_(m-1)
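The traversal above can be written out as a runnable sketch. The `Node` class and the `in_order` name are assumptions made for this illustration; a node stores a sorted list of keys and, unless it is a leaf, one more child than it has keys.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    keys: List[str]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def in_order(node: Node) -> Iterator[str]:
    """Yield every key in the subtree rooted at node in ascending order."""
    for i, key in enumerate(node.keys):
        if node.children:
            yield from in_order(node.children[i])   # subtree to the left of key i
        yield key
    if node.children:
        yield from in_order(node.children[-1])      # subtree to the right of the last key


# Hypothetical example tree: root [G], children [C] and [I | M].
tree = Node(["G"], [
    Node(["C"], [Node(["A"]), Node(["D", "E"])]),
    Node(["I", "M"], [Node(["H"]), Node(["J", "K"]), Node(["N", "O"])]),
])
visited = list(in_order(tree))
```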
B-Trees & Efficiency
• Used by the Mac, NTFS, and OS/2 file systems for file structure.
• Allow insertion and deletion in a tree structure
while keeping the log_m n property, where m is the
order of the tree.
• The idea is to leave some key slots open, so that in
most cases a new key is inserted into available
space.
– Less dynamic than our typical binary tree
– Ideal for disk-based operations.
Definition of a B-Tree
• Def: A B-tree of order m is a tree with the following
properties:
– The root has at least 2 children, unless it is a leaf.
– No node in the tree has more than m children.
– Every node except the root and the leaves has at
least ⌈m/2⌉ children.
– All leaves appear at the same level.
– An internal node with k children contains exactly k-1
keys.
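The five properties can be turned into a small structural check. This is a sketch under assumptions: the `Node` class and the `is_valid_btree` name are hypothetical, the minimum number of children is taken as the ceiling of m/2, and neither key ordering nor the number of keys in a leaf is checked, because the definition above does not list them.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    keys: List[str]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def is_valid_btree(root: Node, m: int) -> bool:
    """Check the structural B-tree properties for order m."""
    leaf_depths = set()

    def check(node: Node, depth: int, is_root: bool) -> bool:
        if not node.children:                 # a leaf: only record its level
            leaf_depths.add(depth)
            return True
        k = len(node.children)
        if len(node.keys) != k - 1:           # k children need exactly k - 1 keys
            return False
        if k > m:                             # no more than m children
            return False
        min_children = 2 if is_root else math.ceil(m / 2)
        if k < min_children:
            return False
        return all(check(c, depth + 1, False) for c in node.children)

    # All leaves must sit on the same level.
    return check(root, 0, True) and len(leaf_depths) == 1


good = Node(["G", "K"], [
    Node(["C"], [Node(["A"]), Node(["D", "E"])]),
    Node(["I"], [Node(["H"]), Node(["J"])]),
    Node(["M"], [Node(["L"]), Node(["N", "O"])]),
])
# Leaves on different levels: [A] hangs directly under the root.
bad = Node(["G"], [Node(["A"]), Node(["I"], [Node(["H"]), Node(["J"])])])
```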
Insertion
• Insert k_i into a B-tree of order m.
– Find the insertion point (in a leaf) by doing a search.
– If there is room, enter k_i.
– Else, promote the middle key to the parent and split the
node into two nodes around the middle key.
• If the splitting backs up to the root, then
– Make a new root containing the middle key.
• Note: the tree grows upward from the leaves, so balance
is always maintained.
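A minimal sketch of these insertion steps follows. The names `Node`, `_insert` and `insert` are assumptions for illustration, keys are kept in sorted Python lists, a node may hold at most m - 1 keys, and duplicate keys are silently ignored (a choice made here, not something the slides specify).

```python
import bisect


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # an empty list marks a leaf


def _insert(node, key, m):
    """Insert key below node. Return (middle_key, new_right_node) if node
    had to split, otherwise None."""
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return None                       # duplicate: ignored in this sketch
    if not node.children:
        node.keys.insert(i, key)          # insertion point is in a leaf
    else:
        promoted = _insert(node.children[i], key, m)
        if promoted is None:
            return None
        mid_key, right = promoted         # child split: absorb the promoted key
        node.keys.insert(i, mid_key)
        node.children.insert(i + 1, right)
    if len(node.keys) <= m - 1:
        return None                       # there was room
    mid = len(node.keys) // 2             # overflow: split around the middle key
    mid_key = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return mid_key, right


def insert(root, key, m):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key, m)
    if promoted is None:
        return root
    mid_key, right = promoted             # split reached the root: grow upward
    return Node([mid_key], [root, right])
```

Because `insert` returns the root, a caller would write `root = insert(root, key, m)`; the root changes only when a split propagates all the way up.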
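Insertion Example
• Starting tree (order 3): root [G]; children [C] and [I | M]; leaves [A], [D | E], [H], [J | K], [N | O].
• L is inserted into the above tree: the leaf [J | K] overflows, so K is promoted and the parent becomes [I | K | M] with leaves [H], [J], [L], [N | O].
• K is promoted again, this gives the new tree: root [G | K]; children [C], [I], [M]; leaves [A], [D | E], [H], [J], [L], [N | O].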
Deletion
• If the entry to be deleted is not in a leaf, swap it
with its successor (or predecessor) under the
natural order of the keys. Then delete the entry
from the leaf.
• If the leaf contains more than the minimum number of
entries, then one can be deleted with no further
action.
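The successor swap can be sketched as follows. The `Node` class and both function names are hypothetical; the sketch only handles the swap and the removal from the leaf, and leaves any repair of a deficient leaf to the caller.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # an empty list marks a leaf


def swap_with_successor(node, i):
    """Swap node.keys[i] with its in-order successor and return the leaf
    that now holds the original key."""
    leaf = node.children[i + 1]           # successor is in the subtree right of key i
    while leaf.children:
        leaf = leaf.children[0]           # keep going left until a leaf is reached
    node.keys[i], leaf.keys[0] = leaf.keys[0], node.keys[i]
    return leaf


def delete_internal_key(node, i):
    """Delete key i of an internal node by way of a leaf deletion."""
    key = node.keys[i]
    leaf = swap_with_successor(node, i)
    leaf.keys.remove(key)                 # the caller must check for deficiency
    return leaf


# Hypothetical small tree: root [C]; leaves [A] and [D | E].
root = Node(["C"], [Node(["A"]), Node(["D", "E"])])
leaf = delete_internal_key(root, 0)
```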
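Deletion Example 1
• Starting tree: root [C]; leaves [A] and [D | E].
• Delete D: D is removed from its leaf, giving root [C]; leaves [A] and [E].
• Delete C (from the starting tree): successor D is promoted and C is deleted, giving root [D]; leaves [A] and [E].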
Deletion Cont...
• If the node contains the minimum number of
entries, consider its two immediate siblings (the
neighboring children of the same parent node):
• If one of these siblings has more than the minimum
number of entries, then redistribute one entry from
this sibling to the parent node, and one entry
from the parent to the deficient node.
– This is a rotation, which balances the nodes.
– Note: all nodes must comply with the minimum entry
restriction.
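The rotation can be sketched as two small helpers, one per direction. The `Node` class and the function names are assumptions for illustration; `i` is the index of the deficient child within `parent.children`, and the caller is assumed to have already checked that the chosen sibling has more than the minimum number of entries.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # an empty list marks a leaf


def rotate_from_left(parent, i):
    """Child i is deficient; borrow through the parent from child i - 1."""
    child, left = parent.children[i], parent.children[i - 1]
    child.keys.insert(0, parent.keys[i - 1])     # separator moves down
    parent.keys[i - 1] = left.keys.pop()         # sibling's largest key moves up
    if left.children:                            # internal nodes hand over a subtree too
        child.children.insert(0, left.children.pop())


def rotate_from_right(parent, i):
    """Child i is deficient; borrow through the parent from child i + 1."""
    child, right = parent.children[i], parent.children[i + 1]
    child.keys.append(parent.keys[i])            # separator moves down
    parent.keys[i] = right.keys.pop(0)           # sibling's smallest key moves up
    if right.children:
        child.children.append(right.children.pop(0))


# Hypothetical case: A was just deleted from root [C]; leaves [A] and [D | E].
parent = Node(["C"], [Node([]), Node(["D", "E"])])
rotate_from_right(parent, 0)
```

After the call the tree reads root [D]; leaves [C] and [E], so every node again meets the minimum.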
Deletion Cont...
• If both immediate siblings have exactly the
minimum number of entries, then merge the
deficient node with one of its immediate siblings
and one entry from the parent node.
• If this leaves the parent node with too few entries,
then the process is propagated upward.
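The merge step can be sketched as a single helper. The `Node` class and the name `merge_children` are hypothetical; the helper merges child `i`, the separator key `i` and child `i + 1` of a parent, and it leaves the check for a now-deficient parent to the caller, which is where upward propagation would happen.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # an empty list marks a leaf


def merge_children(parent, i):
    """Merge child i, parent key i and child i + 1 into child i."""
    left, right = parent.children[i], parent.children[i + 1]
    left.keys.append(parent.keys.pop(i))   # one entry comes down from the parent
    left.keys.extend(right.keys)
    left.children.extend(right.children)   # empty for leaves
    del parent.children[i + 1]
    return left


# Hypothetical order-3 tree after H was deleted from its leaf.
empty_leaf = Node([])
inner = Node(["I"], [empty_leaf, Node(["J"])])
root = Node(["G", "K"], [
    Node(["C"], [Node(["A"]), Node(["D", "E"])]),
    inner,
    Node(["M"], [Node(["L"]), Node(["N", "O"])]),
])
merge_children(inner, 0)      # leaf becomes [I | J]; inner node is left empty
merge_children(root, 1)       # propagate: empty node + K + [M] become [K | M]
```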
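Deletion Example 3
• Starting tree: root [G | K]; children [C], [I], [M]; leaves [A], [D | E], [H], [J], [L], [N | O].
• Delete H: the leaf that held H is now empty, so the node is deficient. Its only sibling [J] has the minimum number of entries.
• Combine the deficient leaf with the key I from the parent and the sibling [J], giving the leaf [I | J].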
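Deletion Example 3 Cont..
• After the merge: root [G | K]; children [C], an empty node, [M]; leaves [A], [D | E], [I | J], [L], [N | O]. The empty internal node is now deficient.
• The deficient node is combined with 1 key from the parent (K) and its sibling [M], giving [K | M]: root [G]; children [C] and [K | M]; leaves [A], [D | E], [I | J], [L], [N | O].
• Node G is legal, so propagation up the tree stops.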
Review of Deletions
• All deletions take place in leaf nodes.
– To delete an internal key, swap it with its successor or
predecessor, which is in a leaf.
– Then delete it from the leaf.
• Deficient nodes are legalized by:
– Rotation with a sibling and the parent.
OR
– Combining with a key from the parent and a sibling,
– then propagating up the tree until a legal node is
encountered.
End Notes
Studies have shown that on average there are about
1/(⌈m/2⌉ - 1) splits per insertion.
– E.g.
• For a 2-3 tree (m = 3) there is about 1 split per insertion.
• For a 10-ary tree there is about 1/4 split per insertion.
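The two figures can be reproduced from the formula. The function name `avg_splits_per_insert` is hypothetical, and the sketch assumes m/2 is rounded up, which is what makes the 2-3 tree case come out to 1; it is only meaningful for m of at least 3.

```python
import math
from fractions import Fraction


def avg_splits_per_insert(m: int) -> Fraction:
    """Estimated average number of splits per insertion for order m,
    using 1 / (ceil(m / 2) - 1). Requires m >= 3."""
    return Fraction(1, math.ceil(m / 2) - 1)


two_three = avg_splits_per_insert(3)    # 2-3 tree
ten_ary = avg_splits_per_insert(10)     # 10-ary tree
wide = avg_splits_per_insert(256)       # a disk-block-sized node, as an assumption
```

Under this estimate, larger nodes split far less often: for m = 256 the value is 1/127.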