Data Structures - Trees

Published in Technology, Business
Transcript

  • 1. Deletion
    • Delete a node x as in an ordinary binary search tree. Note that the last node deleted is a leaf.
    • Then trace the path from the new leaf towards the root.
    • For each node x encountered, check whether the heights of left(x) and right(x) differ by at most 1. If yes, proceed to parent(x). If not, perform an appropriate rotation at x. There are 4 cases, as in the case of insertion.
    • For deletion, after we perform a rotation at x, we may still have to perform a rotation at some ancestor of x. Thus, we must continue to trace the path until we reach the root.
  • 2. Deletion
    • On closer examination: the single rotations for deletion can be divided into 4 cases (instead of 2 cases)
      • Two cases for rotate with left child
      • Two cases for rotate with right child
  • 3. Single rotations in deletion: rotate with left child
    • In both figures, a node is deleted in subtree C, causing its height to drop to h. The height of y is h+2.
    • When the height of subtree A is h+1, the height of B can be h or h+1. Fortunately, the same single rotation can correct both cases.
  • 4. Single rotations in deletion: rotate with right child
    • In both figures, a node is deleted in subtree A, causing its height to drop to h. The height of y is h+2.
    • When the height of subtree C is h+1, the height of B can be h or h+1. A single rotation can correct both cases.
  • 5. Rotations in deletion
    • There are 4 cases for single rotations, but we do not need to distinguish among them.
    • There are exactly two cases for double rotations (as in the case of insertion)
    • Therefore, we can reuse exactly the same procedure as for insertion to determine which rotation to perform
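The deletion procedure on slides 1 to 5 can be sketched as follows. This is a minimal illustration, not the slides' own code: the names `Node`, `rebalance`, `insert`, and `delete` are hypothetical, and the sketch assumes the rotation is chosen by comparing subtree heights with `>=`, so that the extra single-rotation cases that arise only in deletion (subtree B of height h or h+1) are both handled by a single rotation. The recursion unwinds from the removed node back to the root, which corresponds to tracing the path upward.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 0

def height(n):
    return n.height if n else -1          # an empty subtree has height -1

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def rotate_with_left_child(y):
    x = y.left
    y.left, x.right = x.right, y          # x becomes the root of this subtree
    update(y); update(x)
    return x

def rotate_with_right_child(y):
    x = y.right
    y.right, x.left = x.left, y
    update(y); update(x)
    return x

def rebalance(n):
    """Restore the AVL property at n; shared by insert and delete."""
    update(n)
    if height(n.left) - height(n.right) > 1:
        # '>=' also covers the deletion-only case where both grandchildren tie
        if height(n.left.left) >= height(n.left.right):
            return rotate_with_left_child(n)
        n.left = rotate_with_right_child(n.left)      # double rotation
        return rotate_with_left_child(n)
    if height(n.right) - height(n.left) > 1:
        if height(n.right.right) >= height(n.right.left):
            return rotate_with_right_child(n)
        n.right = rotate_with_left_child(n.right)     # double rotation
        return rotate_with_right_child(n)
    return n

def insert(n, key):
    if n is None:
        return Node(key)
    if key < n.key:
        n.left = insert(n.left, key)
    elif key > n.key:
        n.right = insert(n.right, key)
    return rebalance(n)

def delete(n, key):
    if n is None:
        return None
    if key < n.key:
        n.left = delete(n.left, key)
    elif key > n.key:
        n.right = delete(n.right, key)
    elif n.left and n.right:
        succ = n.right                     # in-order successor replaces the key
        while succ.left:
            succ = succ.left
        n.key = succ.key
        n.right = delete(n.right, succ.key)
    else:
        return n.left or n.right           # zero or one child: splice the node out
    return rebalance(n)                    # every ancestor on the path is rechecked
```

Because `rebalance` runs at every node on the way back up, a rotation at x does not stop the checks at the ancestors of x.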
  • 6. B+-Trees
  • 7. Dictionary for secondary storage
    • The AVL tree is an excellent dictionary structure when the entire structure fits into main memory.
      • Following or updating a pointer requires only a memory cycle.
    • When the data becomes so large that it cannot fit into main memory, the performance of the AVL tree may deteriorate rapidly.
      • Following or updating a pointer requires accessing the disk once.
      • Traversing from the root to a leaf may need log_2 n disk accesses.
        • When n = 1048576 = 2^20, we need 20 disk accesses. For a disk spinning at 7200 rpm, assuming roughly one rotation (1/120 s) per access, this takes about 0.166 seconds. 10 searches will take more than 1 second! This is far too slow.
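The arithmetic behind these figures can be reproduced with a short calculation. This is only a back-of-the-envelope sketch; it assumes each disk access costs about one full platter rotation, which is the reading that yields the 0.166 s figure.

```python
import math

n = 2 ** 20                      # number of keys in the AVL tree
rpm = 7200                       # disk rotation speed from the slide
seconds_per_access = 60 / rpm    # assumption: one full rotation per access

accesses = math.log2(n)          # root-to-leaf path length in a balanced binary tree
search_time = accesses * seconds_per_access
ten_searches = 10 * search_time

print(accesses, round(search_time, 3), round(ten_searches, 2))
```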
  • 8. B+ Tree
    • Since the processor is much faster than the disk, it pays to execute more CPU instructions in order to minimize the number of disk accesses.
    • Idea: allow a node in a tree to have many children.
    • If each internal node in the tree has M children, the height of the tree would be log_M n instead of log_2 n.
      • For example, if M = 20, then log_20 2^20 < 5.
    • Thus, we can speed up the search significantly.
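To see how quickly the number of levels shrinks as the fan-out M grows, the following sketch counts the levels needed to reach n = 2^20 items. The helper `levels_needed` is a hypothetical name, and M = 200 is an illustrative value not taken from the slides. Integer arithmetic is used on purpose to avoid floating-point rounding in the logarithm.

```python
def levels_needed(n, fanout):
    """Smallest number of levels h such that fanout**h >= n."""
    levels, reach = 0, 1
    while reach < n:
        reach *= fanout
        levels += 1
    return levels

n = 2 ** 20
heights = {M: levels_needed(n, M) for M in (2, 20, 200)}
print(heights)   # {2: 20, 20: 5, 200: 3}
```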
  • 9. B+ Tree
    • In practice, it is impossible to keep the same number of children per internal node.
    • A B+-tree of order M ≥ 3 is an M-ary tree with the following properties:
      • Each internal node has at most M children
      • Each internal node, except the root, has between ⌈M/2⌉ - 1 and M - 1 keys
        • this guarantees that the tree does not degenerate into a binary tree
      • The keys at each node are ordered
      • The root is either a leaf or has between 1 and M - 1 keys
      • The data items are stored at the leaves. All leaves are at the same depth. Each leaf has between ⌈L/2⌉ - 1 and L - 1 data items, for some L (usually L << M, but we will assume M = L in most examples)
  • 10. Example
    • Here, M=L=5
    • Records are stored at the leaves, but we only show the keys here
    • At the internal nodes, only keys (also called separating keys) and pointers to children are stored
  • 11. A B+ tree with M=L=4
    • We can still talk about left and right child pointers
    • E.g. the left child pointer of N is the same as the right child pointer of J
    • We can also talk about the left subtree and right subtree of a key in internal nodes
  • 12. B+ Tree
    • Which keys are stored at the internal nodes?
    • There are several ways to do it. Different books adopt different conventions.
    • We will adopt the following convention:
      • key i in an internal node is the smallest key in its (i+1)th subtree (i.e. the right subtree of key i)
    • Even following this convention, there is no unique B+-tree for the same set of records.
  • 13. B+ tree
    • Each internal node/leaf is designed to fit into one I/O block of data. An I/O block can usually hold quite a lot of data. Hence, an internal node can keep a lot of keys, i.e., M is large. This implies that the tree has only a few levels, so only a few disk accesses are needed for a search, insertion, or deletion.
    • The B+-tree is a popular structure used in commercial databases. To further speed up the search, the first one or two levels of the B+-tree are usually kept in main memory.
    • The disadvantage of the B+-tree is that most nodes will have fewer than M-1 keys most of the time. This could lead to severe space wastage. Thus, it is not a good dictionary structure for data in main memory.
    • The textbook calls the tree a B-tree instead of a B+-tree. In some other textbooks, B-tree refers to the variant where the actual records are kept at internal nodes as well as at the leaves. Such a scheme is not practical: keeping actual records at the internal nodes limits the number of keys stored there, and thus increases the number of tree levels.
  • 14. Searching
    • Suppose that we want to search for the key K. The path traversed is shown in bold.
  • 15. Searching
    • Let x be the input search key.
    • Start the search at the root
    • If we encounter an internal node v, search (linear search or binary search) for x among the keys stored at v
      • If x < K_min at v, follow the left child pointer of K_min
      • If K_i ≤ x < K_(i+1) for two consecutive keys K_i and K_(i+1) at v, follow the left child pointer of K_(i+1)
      • If x ≥ K_max at v, follow the right child pointer of K_max
    • If we encounter a leaf v, we search (linear search or binary search) for x among the keys stored at v. If found, we return the entire record; otherwise, report not found.
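The search procedure above can be sketched in a few lines. The classes `Internal` and `Leaf` and the function `search` are hypothetical names used for illustration; the sketch assumes an internal node with k keys stores k+1 child pointers in order, and it uses binary search through `bisect_right` from the Python standard library. The number of keys that are ≤ x is exactly the index of the child to follow, which covers all three cases on the slide.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys, records):
        self.keys, self.records = keys, records          # parallel, sorted by key

class Internal:
    def __init__(self, keys, children):
        self.keys, self.children = keys, children        # len(children) == len(keys) + 1

def search(node, x):
    """Return the record stored under key x, or None if x is absent."""
    while isinstance(node, Internal):
        # bisect_right counts the keys <= x:
        #   0          -> x < K_min, leftmost child
        #   i          -> K_i <= x < K_(i+1), left child of K_(i+1)
        #   len(keys)  -> x >= K_max, rightmost child
        node = node.children[bisect_right(node.keys, x)]
    i = bisect_right(node.keys, x) - 1
    if i >= 0 and node.keys[i] == x:
        return node.records[i]
    return None

# A tiny two-level tree; each separating key is the smallest key of its right subtree.
root = Internal(
    [7, 12],
    [Leaf([1, 4], ["r1", "r4"]),
     Leaf([7, 9], ["r7", "r9"]),
     Leaf([12, 15], ["r12", "r15"])],
)
print(search(root, 9), search(root, 8))   # r9 None
```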
  • 16. Insertion
    • Suppose that we want to insert a key K and its associated record.
    • Search for the key K using the search procedure
    • This will bring us to a leaf x.
    • Insert K into x
      • Splitting of nodes (instead of the rotations used in AVL trees) maintains the properties of B+-trees [next slide]
  • 17. Insertion into a leaf
    • If leaf x contains fewer than M-1 keys, then insert K into x (at the correct position in node x)
    • If x is already full (i.e. contains M-1 keys), split x:
      • Cut x off from its parent
      • Insert K into x, pretending x has space for K. Now x has M keys.
      • After inserting K, split x into 2 new leaves x_L and x_R, with x_L containing the ⌊M/2⌋ smallest keys, and x_R containing the remaining ⌈M/2⌉ keys. Let J be the minimum key in x_R
      • Make a copy of J to be the parent of x_L and x_R, and insert the copy together with its child pointers into the old parent of x.
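A minimal sketch of the leaf insertion and split follows. It works on plain sorted lists of keys rather than full nodes, `insert_into_leaf` is a hypothetical name, and it assumes the left leaf keeps the ⌊M/2⌋ smallest keys while the right leaf takes the rest. The returned separator is a copy of the smallest key of the right leaf, which the caller would insert into the old parent.

```python
from bisect import insort

def insert_into_leaf(keys, k, M):
    """Insert k into a leaf holding at most M-1 keys.

    Returns (left_keys, right_keys, separator); right_keys and separator
    are None when no split was needed.
    """
    keys = list(keys)
    insort(keys, k)                  # pretend the leaf has room for k
    if len(keys) <= M - 1:
        return keys, None, None
    mid = M // 2                     # floor(M/2) smallest keys stay on the left
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]     # separator J is copied up, not moved

print(insert_into_leaf([1, 3, 5, 7], 4, M=5))   # ([1, 3], [4, 5, 7], 4)
```

Note that J stays in the right leaf as well as going up to the parent, since records live only at the leaves.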
  • 18. Inserting into a non-full leaf
  • 19. Splitting a leaf: inserting T
  • 20. Cont’d
  • 21.
    • Two disk accesses to write the two leaves, one disk access to update the parent
    • For L=32, two leaves with 16 and 17 items are created. We can perform 15 more insertions without another split
  • 22. Another example:
  • 23. Cont’d => Need to split the internal node
  • 24. Splitting an internal node
    • To insert a key K into a full internal node x:
    • Cut x off from its parent
    • Insert K and its left and right child pointers into x, pretending there is space. Now x has M keys.
    • Split x into 2 new internal nodes x_L and x_R, with x_L containing the (⌈M/2⌉ - 1) smallest keys, and x_R containing the ⌊M/2⌋ largest keys. Note that the (⌈M/2⌉)th key J is not placed in x_L or x_R
    • Make J the parent of x_L and x_R, and insert J together with its child pointers into the old parent of x.
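The internal-node split can be sketched as below, again on plain lists. `split_internal` is a hypothetical name; the sketch assumes the overfull node already holds M keys and M+1 child pointers, and that the ⌈M/2⌉th key is the one pushed up. Unlike the leaf split, J is moved to the parent rather than copied.

```python
def split_internal(keys, children, M):
    """Split an overfull internal node (M keys, M+1 children).

    Returns ((left_keys, left_children), J, (right_keys, right_children)).
    """
    assert len(keys) == M and len(children) == M + 1
    j = (M + 1) // 2 - 1                   # 0-based index of the ceil(M/2)-th key
    J = keys[j]                            # moves up; kept in neither half
    left = (keys[:j], children[:j + 1])    # ceil(M/2) - 1 smallest keys
    right = (keys[j + 1:], children[j + 1:])   # floor(M/2) largest keys
    return left, J, right

left, J, right = split_internal([10, 20, 30, 40, 50], list("abcdef"), M=5)
print(left, J, right)
```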
  • 25. Example: splitting internal node
  • 26. Cont’d
  • 27. Termination
    • Splitting continues up the tree as long as we encounter full internal nodes
    • If the split internal node x does not have a parent (i.e. x is the root), then create a new root containing the key J and its two children
  • 28. Deletion
    • To delete a key target, we find it at a leaf x, and remove it.
    • Two situations to worry about:
      • (1) target is a key in some internal node (needs to be replaced, according to our convention)
      • (2) After deleting target from leaf x, x contains fewer than ⌈M/2⌉ - 1 keys (needs to borrow from a sibling or merge nodes)
  • 29. Situation (1)
    • By our convention, target can appear in at most one ancestor y of x as a key. Moreover, we must have visited node y and seen target in it when we searched down the tree. So after deleting from node x, we can access y directly and replace target by the new smallest key in x.
  • 30. Situation (2): handling leaves with too few keys
    • Suppose we delete the record with key target from a leaf.
    • Let u be the leaf that has ⌈M/2⌉ - 2 keys (too few)
    • Let v be a sibling of u
    • Let k be the key in the parent of u and v that separates the pointers to u and v.
    • There are two cases
  • 31. Handling leaves with too few keys
    • Case 1: v contains ⌈M/2⌉ keys or more and v is the right sibling of u
      • Move the leftmost record from v to u
      • Set the key in the parent of u that separates u and v to be the new smallest key in v
    • Case 2: v contains ⌈M/2⌉ keys or more and v is the left sibling of u
      • Move the rightmost record from v to u
      • Set the key in the parent of u that separates u and v to be the new smallest key in u
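Both borrowing cases can be sketched on plain key lists. The function names are hypothetical; `u` and `v` stand for the sorted key lists of the two leaves, `parent_keys` for the key list of their parent, and `sep` for the index of the key separating them. The sketch assumes the caller has already checked that v has at least ⌈M/2⌉ keys.

```python
def borrow_from_right_leaf(u, v, parent_keys, sep):
    """Case 1: v is the right sibling of u."""
    u.append(v.pop(0))            # leftmost record of v moves to the end of u
    parent_keys[sep] = v[0]       # separator becomes the new smallest key in v

def borrow_from_left_leaf(u, v, parent_keys, sep):
    """Case 2: v is the left sibling of u."""
    u.insert(0, v.pop())          # rightmost record of v moves to the front of u
    parent_keys[sep] = u[0]       # separator becomes the new smallest key in u

# M = 5, so a leaf needs at least ceil(5/2) - 1 = 2 keys.
u, v, parent_keys = [5], [8, 9, 10], [8]
borrow_from_right_leaf(u, v, parent_keys, 0)
print(u, v, parent_keys)          # [5, 8] [9, 10] [9]
```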
  • 32. Deletion example: want to delete 15
  • 33. Want to delete 9
  • 34. Want to delete 10
  • 35.  
  • 36.  
  • 37. Merging two leaves
    • If no sibling leaf with at least ⌈M/2⌉ keys exists, then merge two leaves.
    • Case (1): Suppose that the right sibling v of u contains exactly ⌈M/2⌉ - 1 keys. Merge u and v
      • Move the keys in u to v
      • Remove the pointer to u at parent
      • Delete the separating key between u and v from the parent of u
  • 38. Merging two leaves
    • Case (2): Suppose that the left sibling v of u contains exactly ⌈M/2⌉ - 1 keys. Merge u and v
      • Move the keys in u to v
      • Remove the pointer to u at parent
      • Delete the separating key between u and v from the parent of u
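The two merge cases can be sketched as one helper working on the parent's lists. `merge_leaves` is a hypothetical name; `children` is the parent's list of leaves (each a sorted key list) and `i` is the position of the underfull leaf u. As a simplifying assumption, the sketch merges with the right sibling whenever one exists and falls back to the left sibling otherwise.

```python
def merge_leaves(parent_keys, children, i):
    """Merge the underfull leaf children[i] (u) into a sibling v, in place."""
    u = children[i]
    if i + 1 < len(children):         # case (1): v is the right sibling
        v = children[i + 1]
        v[:0] = u                     # keys of u are smaller, so they go first
        del parent_keys[i]            # separator between u and v
    else:                             # case (2): v is the left sibling
        v = children[i - 1]
        v.extend(u)                   # keys of u are larger, so they go last
        del parent_keys[i - 1]        # separator between v and u
    del children[i]                   # remove the pointer to u

# M = 5: the middle leaf has 1 key (too few) and its right sibling has exactly 2.
parent_keys, children = [10, 20], [[3, 5], [12], [20, 25]]
merge_leaves(parent_keys, children, 1)
print(parent_keys, children)          # [10] [[3, 5], [12, 20, 25]]
```

Deleting a key from the parent may in turn leave the parent with too few keys, which is the situation the following slides handle.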
  • 39. Example: want to delete 12
  • 40. Cont’d u v
  • 41. Cont’d
  • 42. Cont’d too few keys! …
  • 43. Deleting a key in an internal node
    • Suppose we remove a key from an internal node u, and u has fewer than ⌈M/2⌉ - 1 keys afterwards.
    • Case (1): u is a root
      • If u is empty, then remove u and make its child the new root
  • 44. Deleting a key in an internal node
    • Case (2): the right sibling v of u has ⌈M/2⌉ keys or more
      • Move the separating key between u and v in the parent of u and v down to u.
      • Make the leftmost child of v the rightmost child of u
      • Move the leftmost key in v to become the separating key between u and v in the parent of u and v.
    • Case (2), mirror image: the left sibling v of u has ⌈M/2⌉ keys or more
      • Move the separating key between u and v in the parent of u and v down to u.
      • Make the rightmost child of v the leftmost child of u
      • Move the rightmost key in v to become the separating key between u and v in the parent of u and v.
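Case (2) for internal nodes can be sketched as follows. The function names are hypothetical, and each node is represented as a plain dict with a `"keys"` list and a `"children"` list, which is an assumption made only to keep the example short. In contrast to the leaf case, the separating key travels down into u and a key of v travels up to replace it.

```python
def borrow_from_right_internal(u, v, parent_keys, sep):
    """v is the right sibling of u; sep indexes their separating key."""
    u["keys"].append(parent_keys[sep])            # separator moves down to u
    u["children"].append(v["children"].pop(0))    # leftmost child of v joins u
    parent_keys[sep] = v["keys"].pop(0)           # leftmost key of v moves up

def borrow_from_left_internal(u, v, parent_keys, sep):
    """v is the left sibling of u; sep indexes their separating key."""
    u["keys"].insert(0, parent_keys[sep])         # separator moves down to u
    u["children"].insert(0, v["children"].pop())  # rightmost child of v joins u
    parent_keys[sep] = v["keys"].pop()            # rightmost key of v moves up

u = {"keys": [10], "children": ["a", "b"]}
v = {"keys": [30, 40, 50], "children": ["c", "d", "e", "f"]}
parent_keys = [20]
borrow_from_right_internal(u, v, parent_keys, 0)
print(u, v, parent_keys)
```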
  • 45. … continue from previous example u v case 2
  • 46. Cont’d
  • 47.
    • Case (3): every sibling v of u contains exactly ⌈M/2⌉ - 1 keys
      • Move the separating key between u and v in the parent of u and v down to u.
      • Move the keys and child pointers in u to v
      • Remove the pointer to u at parent.
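Case (3) can be sketched in the same style. `merge_internal` is a hypothetical name and nodes are again plain dicts with `"keys"` and `"children"` lists, an assumption made for brevity. The sketch prefers the right sibling when one exists; pulling the separator down and concatenating is equivalent to moving it into u first and then moving u's contents into v.

```python
def merge_internal(parent, i):
    """Merge the underfull internal node parent['children'][i] into a sibling."""
    u = parent["children"][i]
    if i + 1 < len(parent["children"]):           # v is the right sibling
        v = parent["children"][i + 1]
        sep = parent["keys"].pop(i)               # separator moves down
        v["keys"][:0] = u["keys"] + [sep]
        v["children"][:0] = u["children"]
    else:                                         # v is the left sibling
        v = parent["children"][i - 1]
        sep = parent["keys"].pop(i - 1)           # separator moves down
        v["keys"] += [sep] + u["keys"]
        v["children"] += u["children"]
    del parent["children"][i]                     # remove the pointer to u

# M = 5: u has 1 key (too few), its only sibling has exactly 2.
u = {"keys": [20], "children": ["a", "b"]}
v = {"keys": [70, 80], "children": ["c", "d", "e"]}
parent = {"keys": [50], "children": [u, v]}
merge_internal(parent, 0)
print(parent["keys"], v["keys"], v["children"])
```

In this example the parent ends up with no keys; if it is the root, Case (1) applies and its single child becomes the new root.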
  • 48. Example: want to delete 5
  • 49. Cont’d u v
  • 50. Cont’d
  • 51. Cont’d u v case 3
  • 52. Cont’d
  • 53. Cont’d