Binary Search Trees

Published in: Education, Technology
  • Key extends Comparable: not actually inheritance. It means that whatever generic type Key the client specifies must implement the Comparable interface
  • Use GrowingTree WebStart application to demo
  • Here depth of node = depth of node containing the key (or bottom of tree where search ends)
  • Search, then insert
    Use GrowingTree WebStart application to demo
    iterative version more cumbersome, but not difficult
  • Here depth of node = depth of node containing the key (or bottom of tree where search ends)
    Iterative version possible, but more cumbersome
  • variance = O(1) for both propositions
    worst case = keys inserted in ascending order
    Reference: The Height of a Random Binary Search Tree by Bruce Reed, JACM 2003
  • Approach. Mimic recursive inorder traversal.
  • The rotate methods return the resulting subtree. The third pointer changed is the pointer of the parent.
  • exact same code as before, except for rotations
  • Note: root-inserting N keys in order results in the same tree as standard insertion of them in reverse order
  • The 4.31107 ln N term is actually 4.31107 ln N - 1.95 ln ln N. The second term might have an impact for moderate values of N
  • Note 4.31107 ln N ~ 2.99 lg N
  • right-left* means right-left-left-left-left-left until you find the next largest
    Hibbard = person credited with deletion strategy
    Conjecture (that I think is still unresolved): symmetric version of Hibbard deletion leads to O(log N) performance
    Note figure contains duplicate keys...
  • Two roots have a battle. Which one should win?
  • Just because X lost the battle with I doesn't mean that X should be at bottom of tree. Now it gets to throwdown with subtree rooted at R to see who becomes the right child
  • an illustration created by Ellen Kim '10, inspired by randomized BST lecture
  • Binary Search Trees

    1. Binary Search Trees
       ‣ basic implementations
       ‣ randomized BSTs
       ‣ deletion in BSTs
       References: Algorithms in Java, Chapter 12; Intro to Programming, Section 4.4. http://www.cs.princeton.edu/algs4
       Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 License.
       Algorithms in Java, 4th Edition · Robert Sedgewick and Kevin Wayne · Copyright © 2008 · September 2, 2009 2:44:27 AM
    2. Elementary implementations: summary

       implementation     worst case          average case           ordered      operations
                          search    insert    search hit   insert    iteration?   on keys
       unordered array    N         N         N/2          N         no           equals()
       unordered list     N         N         N/2          N         no           equals()
       ordered array      lg N      N         lg N         N/2       yes          compareTo()
       ordered list       N         N         N/2          N/2       yes          compareTo()

       Challenge. Efficient implementations of search and insert.
    3. ‣ binary search tree ‣ randomized BSTs ‣ deletion in BSTs
    4. Binary search trees
       Def. A BST is a binary tree in symmetric order.
       A binary tree is either:
       • Empty.
       • A key-value pair and two disjoint binary trees.
       Symmetric order. Every node's key is:
       • Larger than all keys in its left subtree.
       • Smaller than all keys in its right subtree.
       [Figure: Binary search tree. A node (key, val, left, right) whose left link leads to a BST with smaller keys and whose right link leads to a BST with larger keys; example keys: it, best, was, the, of, times.]
    5. BST representation
       A BST is a reference to a root node.
       A Node comprises four fields:
       • A Key and a Value.
       • A reference to the left subtree (smaller keys) and the right subtree (larger keys).

           private class Node
           {
              private Key key;
              private Value val;
              private Node left;
              private Node right;
           }

       Key and Value are generic types; Key is Comparable.
       [Figure: example BST holding the key-value pairs it:2, best:1, was:2, the:1, of:1, times:1; unused links are null.]
    6. BST implementation (skeleton)

           public class BST<Key extends Comparable<Key>, Value>
           {
              private Node root;                     // instance variable

              private class Node
              {
                 private Key key;
                 private Value val;
                 private Node left, right;           // instance variables
                 public Node(Key key, Value val)
                 { this.key = key; this.val = val; }
              }

              public void put(Key key, Value val)
              { /* see next slides */ }

              public Value get(Key key)
              { /* see next slides */ }
           }
    7. BST search
       Get. Return value corresponding to given key, or null if no such key.
       [Figure: Searching in a BST.
        Successful search for a node with key "the": "the" is after "it" so go to the right; "the" is before "was" so go to the left; success!
        Unsuccessful search for a node with key "times": "times" is after "it" so go to the right; "times" is before "was" so go to the left; "times" is after "the" but the right link is null, so the BST has no node having that key.]
    8. BST search: Java implementation
       Get. Return value corresponding to given key, or null if no such key.

           public Value get(Key key)
           {
              Node x = root;
              while (x != null)
              {
                 int cmp = key.compareTo(x.key);
                 if      (cmp  < 0) x = x.left;
                 else if (cmp  > 0) x = x.right;
                 else if (cmp == 0) return x.val;
              }
              return null;
           }

       Running time. Proportional to depth of node.
    9. BST insert
       Put. Associate value with key.
       [Figure: Inserting a new node into a BST. Insert "times": "times" is after "it" so go to the right; "times" is before "was" so go to the left; "times" is after "the" so it goes on the right.]
    10. BST insert: Java implementation
        Put. Associate value with key.

            public void put(Key key, Value val)
            { root = put(root, key, val); }

            // concise, but tricky, recursive code; read carefully!
            private Node put(Node x, Key key, Value val)
            {
               if (x == null) return new Node(key, val);
               int cmp = key.compareTo(x.key);
               if      (cmp  < 0) x.left  = put(x.left,  key, val);
               else if (cmp  > 0) x.right = put(x.right, key, val);
               else if (cmp == 0) x.val = val;
               return x;
            }

        Running time. Proportional to depth of node.
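The get and put fragments from slides 6 to 10 can be assembled into one small program. The following is a minimal sketch, not the authors' full class: the class name SimpleBST and the word-count driver in main are assumptions made for illustration.

```java
// Minimal sketch of the BST symbol table from the slides.
// The class name SimpleBST and the main() driver are hypothetical.
class SimpleBST<Key extends Comparable<Key>, Value> {
    private Node root;

    private class Node {
        Key key; Value val; Node left, right;
        Node(Key key, Value val) { this.key = key; this.val = val; }
    }

    // Iterative search: go left or right by comparison until the key or a null link is found.
    public Value get(Key key) {
        Node x = root;
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if      (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else              return x.val;
        }
        return null;
    }

    public void put(Key key, Value val) { root = put(root, key, val); }

    // Recursive insert: returns the (possibly new) root of the subtree.
    private Node put(Node x, Key key, Value val) {
        if (x == null) return new Node(key, val);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = put(x.left,  key, val);
        else if (cmp > 0) x.right = put(x.right, key, val);
        else              x.val   = val;
        return x;
    }

    public static void main(String[] args) {
        SimpleBST<String, Integer> st = new SimpleBST<>();
        String[] words = "it was the best of times it was the worst of times".split(" ");
        for (String w : words) {
            Integer c = st.get(w);
            st.put(w, c == null ? 1 : c + 1);   // frequency count
        }
        System.out.println("times -> " + st.get("times"));
        System.out.println("worst -> " + st.get("worst"));
        System.out.println("age -> " + st.get("age"));
    }
}
```

Each recursive call in put returns the subtree root, which is why the caller reassigns x.left or x.right: that assignment is what links a newly created node into the tree.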
    11. BST construction example
        [Figure: Constructing a BST. The tree after each successive key insertion, over the keys it, was, the, best, of, times, worst.]
    12. BST insertion: visualization
        Ex. Insert keys in random order.
    14. Correspondence between BSTs and quicksort partitioning
        [Figure: BST over the keys K E S C I Q T A E M R U L P U O.]
        Remark. Correspondence is 1-1 if no duplicate keys.
    15. Tree shape
        • Many BSTs correspond to same input data.
        • Cost of search/insert is proportional to depth of node.
        [Figure: best case, typical case, and worst case BSTs over the keys it, was, the, best, of, times, worst. Caption: Typical BSTs constructed from randomly ordered keys.]
        Remark. Tree shape depends on order of insertion.
    16. BSTs: mathematical analysis
        Proposition. If keys are inserted in random order, the expected number of compares for a search/insert is ~ 2 ln N.
        Pf. 1-1 correspondence with quicksort partitioning.
        Proposition. [Reed, 2003] If keys are inserted in random order, expected height of tree is ~ 4.311 ln N - 1.953 ln ln N.
        But… Worst-case for search/insert/height is N (but occurs with exponentially small chance when keys are inserted in random order).
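To relate these natural-logarithm bounds to the lg N entries used in the summary tables, a change of base is enough. The short derivation below uses only ln 2 ≈ 0.6931; the results are rounded.

```latex
\[
  2\ln N \;=\; (2\ln 2)\,\lg N \;\approx\; 1.39\,\lg N,
  \qquad
  4.311\ln N \;=\; (4.311\ln 2)\,\lg N \;\approx\; 2.99\,\lg N .
\]
```

The first value is what the summary tables list (truncated) as 1.38 lg N; the second is the leading term of the expected height.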
    17. ST implementations: summary

        implementation     guarantee           average case             ordered      operations
                           search    insert    search hit   insert      iteration?   on keys
        unordered array    N         N         N/2          N           no           equals()
        unordered list     N         N         N/2          N           no           equals()
        ordered array      lg N      N         lg N         N/2         yes          compareTo()
        ordered list       N         N         N/2          N/2         yes          compareTo()
        BST                N         N         1.38 lg N    1.38 lg N   ?            compareTo()

        Next challenge. Ordered iteration.
    18. Inorder traversal
        Traversing the tree inorder yields keys in ascending order.

            public void show()
            { show(root); }

            private void show(Node x)
            {
               if (x == null) return;
               show(x.left);
               StdOut.println(x.key + " " + x.val);
               show(x.right);
            }

        [Figure: Recursive inorder traversal of a binary search tree. Smaller keys in order, then the key, then larger keys in order, giving all keys in order.]
        To implement an iterator: need a non-recursive version.
    19. Non-recursive inorder traversal
        To process a node:
        • Follow left links until empty (pushing onto stack).
        • Pop and process.
        • Follow right link (push onto stack).
        [Figure: BST with root E; B and S are children of E; A and C are children of B; I is the child of S; H and N are children of I.]

        recursive calls    output    stack contents
        visit(E)                     E
        visit(B)                     E B
        visit(A)                     E B A
        print A            A         E B
        print B            B         E
        visit(C)                     E C
        print C            C         E
        print E            E         -
        visit(S)                     S
        visit(I)                     S I
        visit(H)                     S I H
        print H            H         S I
        print I            I         S
        visit(N)                     S N
        print N            N         S
        print S            S         -
    20. Inorder iterator: Java implementation

            public Iterator<Key> iterator()
            { return new Inorder(); }

            private class Inorder implements Iterator<Key>
            {
               private Stack<Node> stack = new Stack<Node>();

               // go down left spine and push all keys onto stack
               private void pushLeft(Node x)
               {
                  while (x != null)
                  { stack.push(x); x = x.left; }
               }

               Inorder()
               { pushLeft(root); }

               public boolean hasNext()
               { return !stack.isEmpty(); }

               public Key next()
               {
                  Node x = stack.pop();
                  pushLeft(x.right);
                  return x.key;
               }
            }
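The stack-based iterator can be tried on the example tree from the traversal trace (root E, keys inserted as E B S A C I H N). This is a minimal sketch under assumptions: the class name IterableBST and the key-only add method are hypothetical, and java.util.ArrayDeque is swapped in for the Stack class used on the slide.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch of a key-only BST with a non-recursive inorder iterator.
// IterableBST and add() are hypothetical names; ArrayDeque replaces the slide's Stack.
class IterableBST<Key extends Comparable<Key>> implements Iterable<Key> {
    private Node root;

    private class Node {
        Key key; Node left, right;
        Node(Key key) { this.key = key; }
    }

    public void add(Key key) { root = add(root, key); }

    private Node add(Node x, Key key) {
        if (x == null) return new Node(key);
        int cmp = key.compareTo(x.key);
        if      (cmp < 0) x.left  = add(x.left, key);
        else if (cmp > 0) x.right = add(x.right, key);
        return x;
    }

    public Iterator<Key> iterator() { return new Inorder(); }

    private class Inorder implements Iterator<Key> {
        private final Deque<Node> stack = new ArrayDeque<>();

        Inorder() { pushLeft(root); }

        // Push x and every node on its left spine.
        private void pushLeft(Node x) {
            while (x != null) { stack.push(x); x = x.left; }
        }

        public boolean hasNext() { return !stack.isEmpty(); }

        public Key next() {
            if (stack.isEmpty()) throw new NoSuchElementException();
            Node x = stack.pop();      // smallest key not yet returned
            pushLeft(x.right);         // its successors live in the right subtree
            return x.key;
        }
    }

    public static void main(String[] args) {
        IterableBST<String> t = new IterableBST<>();
        for (String k : "E B S A C I H N".split(" ")) t.add(k);
        StringBuilder sb = new StringBuilder();
        for (String k : t) sb.append(k).append(' ');
        System.out.println(sb.toString().trim());   // prints "A B C E H I N S"
    }
}
```

At any moment the stack holds exactly the nodes whose left subtrees have been fully visited but which have not themselves been returned, which mirrors the "stack contents" column of the trace.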
    21. ST implementations: summary

        implementation     guarantee           average case             ordered      operations
                           search    insert    search hit   insert      iteration?   on keys
        unordered array    N         N         N/2          N           no           equals()
        unordered list     N         N         N/2          N           no           equals()
        ordered array      lg N      N         lg N         N/2         yes          compareTo()
        ordered list       N         N         N/2          N/2         yes          compareTo()
        BST                N         N         1.38 lg N    1.38 lg N   yes          compareTo()

        Next challenge. Guaranteed efficiency for search and insert.
    22. Searching challenge 3 (revisited)
        Problem. Frequency counts in “Tale of Two Cities”.
        Assumptions. Book has 135,000+ words; about 10,000 distinct words.
        Which searching method to use?
        1) Unordered array.
        2) Unordered linked list.
        3) Ordered array with binary search.
        4) Need better method, all too slow.
        5) Doesn’t matter much, all fast enough.
        6) BSTs.
           insertion cost < 10000 * 1.38 * lg 10000 < 0.2 million
           lookup cost < 135000 * 1.38 * lg 10000 < 2.5 million
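The two bounds for option 6 can be checked by hand. The worked arithmetic below assumes lg 10000 = 4 lg 10 ≈ 13.29 and rounds the products.

```latex
\[
\begin{aligned}
  \text{insertion cost} &\approx 10{,}000 \times 1.38 \times 13.29 \approx 1.83\times 10^{5} \;<\; 0.2\ \text{million},\\
  \text{lookup cost}    &\approx 135{,}000 \times 1.38 \times 13.29 \approx 2.48\times 10^{6} \;<\; 2.5\ \text{million}.
\end{aligned}
\]
```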
    23. ‣ basic implementations ‣ randomized BSTs ‣ deletion in BSTs
    24. Rotation in BSTs
        Two fundamental operations to rearrange nodes in a tree.
        • Maintain symmetric order.
        • Local transformations (change just 3 pointers).
        • Basis for advanced BST algorithms.
        Strategy. Use rotations on insert to adjust tree shape to be more balanced.
        [Figure: h = rotL(u) turns the tree rooted at u (left subtree A, right child v with subtrees B and C) into the tree rooted at v (left child u with subtrees A and B, right subtree C); h = rotR(v) is the inverse.]
        Key point. No change to BST search code (!)
    25. Rotation in BSTs
        Two fundamental operations to rearrange nodes in a tree.
        • Easier done than said.
        • Raise some nodes, lower some others.
        [Figure: examples root = rotL(A) and a.right = rotR(S).]

            private Node rotL(Node h)
            {
               Node v = h.right;
               h.right = v.left;
               v.left = h;
               return v;
            }

            private Node rotR(Node h)
            {
               Node u = h.left;
               h.left = u.right;
               u.right = h;
               return u;
            }
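A quick way to see that rotations maintain symmetric order is to rotate a small tree and compare inorder traversals before and after. The sketch below is illustrative only: the class name RotationDemo, the int keys, and the sample() helper are assumptions, with keys 1 to 5 standing in for A, u, B, v, C.

```java
// Sketch: rotations rearrange nodes but keep the inorder (symmetric) key order.
// RotationDemo, sample() and the int keys are hypothetical.
class RotationDemo {
    static class Node {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node rotL(Node h) { Node v = h.right; h.right = v.left; v.left = h; return v; }
    static Node rotR(Node h) { Node u = h.left;  h.left = u.right; u.right = h; return u; }

    static String inorder(Node x) {
        return x == null ? "" : inorder(x.left) + x.key + " " + inorder(x.right);
    }

    // u = 2 with left subtree A = 1 and right child v = 4, whose subtrees are B = 3 and C = 5.
    static Node sample() {
        Node u = new Node(2), v = new Node(4);
        u.left = new Node(1); u.right = v;
        v.left = new Node(3); v.right = new Node(5);
        return u;
    }

    public static void main(String[] args) {
        Node h = sample();
        System.out.println("root " + h.key + ", inorder " + inorder(h));
        h = rotL(h);   // v = 4 becomes the root; B = 3 moves under u = 2
        System.out.println("root " + h.key + ", inorder " + inorder(h));
        h = rotR(h);   // undo: u = 2 is the root again
        System.out.println("root " + h.key + ", inorder " + inorder(h));
    }
}
```

The root changes from 2 to 4 and back while the inorder sequence stays 1 2 3 4 5, which is why the search code needs no change. The caller must store the returned node in whatever link used to point at h; that is the third pointer mentioned in the notes.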
    26. BST root insertion
        Insert a node and make it the new root.
        • Insert node at bottom, as in standard BST.
        • Rotate inserted node to the root.
        • Compact recursive implementation.
        [Figure: insert G.]

            // tricky recursive code; read very carefully!
            private Node putRoot(Node x, Key key, Value val)
            {
               if (x == null) return new Node(key, val);
               int cmp = key.compareTo(x.key);
               if (cmp < 0)
               {
                  x.left = putRoot(x.left, key, val);
                  x = rotR(x);
               }
               else if (cmp > 0)
               {
                  x.right = putRoot(x.right, key, val);
                  x = rotL(x);
               }
               else if (cmp == 0) x.val = val;
               return x;
            }
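Root insertion can be exercised on the key sequence from the construction example (A S E R C H I N G X M P L). The following is a sketch under assumptions: the class name RootInsertDemo and the key-only char nodes are hypothetical simplifications of the key-value code above.

```java
// Sketch of root insertion on char keys (keys only, no values).
// RootInsertDemo is a hypothetical name used for illustration.
class RootInsertDemo {
    static class Node {
        char key; Node left, right;
        Node(char key) { this.key = key; }
    }

    static Node rotL(Node h) { Node v = h.right; h.right = v.left; v.left = h; return v; }
    static Node rotR(Node h) { Node u = h.left;  h.left = u.right; u.right = h; return u; }

    // Insert at the bottom, then rotate the new node up one level per returning call.
    static Node putRoot(Node x, char key) {
        if (x == null) return new Node(key);
        if (key < x.key)      { x.left  = putRoot(x.left,  key); x = rotR(x); }
        else if (key > x.key) { x.right = putRoot(x.right, key); x = rotL(x); }
        return x;
    }

    static String inorder(Node x) {
        return x == null ? "" : inorder(x.left) + x.key + inorder(x.right);
    }

    public static void main(String[] args) {
        Node root = null;
        for (char c : "ASERCHINGXMPL".toCharArray()) {
            root = putRoot(root, c);
            System.out.println("inserted " + c + ", root is now " + root.key);
        }
        System.out.println(inorder(root));   // prints "ACEGHILMNPRSX"
    }
}
```

After every insertion the most recent key sits at the root, and the inorder sequence remains sorted because each step is a rotation.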
    27. BST root insertion: construction
        Ex. A S E R C H I N G X M P L
        Why bother?
        • Recently inserted keys are near the top (better for some clients).
        • Basis for randomized BST.
    28. Randomized BSTs (Roura, 1996)
        Intuition. If tree is random, height is logarithmic.
        Fact. Each node in a random tree is equally likely to be the root.
        Idea. Since new node should be the root with probability 1/(N+1), make it the root (via root insertion) with probability 1/(N+1), and apply idea recursively.

            private Node put(Node x, Key key, Value val)
            {
               if (x == null) return new Node(key, val);
               int cmp = key.compareTo(x.key);
               if (cmp == 0) { x.val = val; return x; }        // no rotations if key in table
               if (StdRandom.bernoulli(1.0 / (x.N + 1.0)))     // root insert with the right probability
                  return putRoot(x, key, val);
               if      (cmp < 0) x.left  = put(x.left,  key, val);
               else if (cmp > 0) x.right = put(x.right, key, val);
               x.N++;                                          // maintain count of nodes in subtree rooted at x
               return x;
            }
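The slide code relies on a subtree count N in each node, which the rotations and root insertion must also keep up to date. Below is a self-contained sketch of one way to do that; it is not the authors' implementation. Assumptions: int keys without values, java.util.Random in place of StdRandom, counts recomputed by a hypothetical fix() helper, and a contains() check so that duplicate keys never change the tree.

```java
import java.util.Random;

// Sketch of randomized BST insertion on int keys.
// RandomizedBSTSketch, fix(), contains() and isOrdered() are hypothetical names.
class RandomizedBSTSketch {
    static class Node {
        int key, n = 1;              // n = number of nodes in the subtree rooted here
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private final Random rnd;
    Node root;

    RandomizedBSTSketch(long seed) { rnd = new Random(seed); }

    static int size(Node x) { return x == null ? 0 : x.n; }
    static void fix(Node x) { x.n = 1 + size(x.left) + size(x.right); }

    // Rotations as on the slides, plus recomputing the two counts that change.
    static Node rotL(Node h) { Node v = h.right; h.right = v.left; v.left = h; fix(h); fix(v); return v; }
    static Node rotR(Node h) { Node u = h.left;  h.left = u.right; u.right = h; fix(h); fix(u); return u; }

    static Node putRoot(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key)      { x.left  = putRoot(x.left,  key); x = rotR(x); }
        else if (key > x.key) { x.right = putRoot(x.right, key); x = rotL(x); }
        return x;
    }

    boolean contains(int key) {
        Node x = root;
        while (x != null) {
            if (key < x.key) x = x.left;
            else if (key > x.key) x = x.right;
            else return true;
        }
        return false;
    }

    void put(int key) { if (!contains(key)) root = put(root, key); }

    private Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if (rnd.nextInt(x.n + 1) == 0) return putRoot(x, key);   // probability 1/(N+1)
        if (key < x.key) x.left = put(x.left, key);
        else             x.right = put(x.right, key);
        fix(x);
        return x;
    }

    static int height(Node x) { return x == null ? 0 : 1 + Math.max(height(x.left), height(x.right)); }

    static boolean isOrdered(Node x, long lo, long hi) {
        if (x == null) return true;
        return lo < x.key && x.key < hi && isOrdered(x.left, lo, x.key) && isOrdered(x.right, x.key, hi);
    }

    public static void main(String[] args) {
        RandomizedBSTSketch t = new RandomizedBSTSketch(1L);
        for (int i = 0; i < 1024; i++) t.put(i);     // ascending order: worst case for a plain BST
        System.out.println("size = " + size(t.root) + ", height = " + height(t.root));
    }
}
```

With ascending insertions a standard BST would have height 1024; the randomized version should report a height close to the logarithmic estimate from the analysis, though the exact value depends on the seed.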
    29. Randomized BST: construction
        Ex. Insert 15 keys in ascending order.
    30. Randomized BST construction: visualization
        Ex. Insert 500 keys in random order.
    32. Randomized BST: analysis
        Proposition. Randomized BSTs have the same distribution as BSTs under random insertion order, no matter in what order keys are inserted.
        [Figure: Typical BSTs constructed from randomly ordered keys.]
        • Expected height is ~ 4.31107 ln N.
        • Average search cost is ~ 2 ln N.
        • Exponentially small chance of bad balance.
        Implementation cost. Need to maintain subtree size in each node.
    33. ST implementations: summary

        implementation     guarantee           average case             ordered      operations
                           search    insert    search hit   insert      iteration?   on keys
        unordered array    N         N         N/2          N           no           equals()
        unordered list     N         N         N/2          N           no           equals()
        ordered array      lg N      N         lg N         N/2         yes          compareTo()
        ordered list       N         N         N/2          N/2         yes          compareTo()
        BST                N         N         1.38 lg N    1.38 lg N   yes          compareTo()
        randomized BST     3 lg N    3 lg N    1.38 lg N    1.38 lg N   yes          compareTo()

        Bottom line. Randomized BSTs provide the desired guarantee (probabilistic, with exponentially small chance of linear time).
        Bonus. Randomized BSTs also support delete (!)
    34. ‣ basic implementations ‣ randomized BSTs ‣ deletion in BSTs
    35. BST deletion: lazy approach
        To remove a node with a given key:
        • Set its value to null.
        • Leave key in tree to guide searches (but don't consider it equal to search key).
        [Figure: delete I from the BST with keys E, A, S, C, I, H, R, N; the node for I stays in place as a tombstone.]
        Cost. O(log N') per insert, search, and delete, where N' is the number of elements ever inserted in the BST.
        Unsatisfactory solution. Tombstone overload.
    37. BST deletion: Hibbard deletion
        To remove a node from a BST:
        • Zero children: just remove it.
        • One child: pass the child up.
        • Two children: find the next largest node using right-left*, swap with next largest, remove as above.
        [Figure: the three cases: zero children, one child, two children.]
        Unsatisfactory solution. Not symmetric, code is clumsy.
        Surprising consequence. Trees not random (!) ⇒ sqrt(N) per op.
        Longstanding open problem. Simple and efficient delete for BSTs.
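The slides describe Hibbard deletion without code. The three cases above might be sketched as follows; this is a conventional formulation written for illustration, not taken from the deck. The class name HibbardDemo, the int keys, and the min/deleteMin helpers are assumptions.

```java
// Sketch of Hibbard deletion on int keys.
// HibbardDemo, min() and deleteMin() are hypothetical names.
class HibbardDemo {
    static class Node {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key)      x.left  = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        return x;
    }

    // Leftmost node of a non-empty subtree.
    static Node min(Node x) { while (x.left != null) x = x.left; return x; }

    // Unlink the leftmost node and return the new subtree root.
    static Node deleteMin(Node x) {
        if (x.left == null) return x.right;
        x.left = deleteMin(x.left);
        return x;
    }

    static Node delete(Node x, int key) {
        if (x == null) return null;
        if (key < x.key)      x.left  = delete(x.left, key);
        else if (key > x.key) x.right = delete(x.right, key);
        else {
            if (x.right == null) return x.left;    // zero children, or only a left child
            if (x.left == null)  return x.right;   // only a right child: pass it up
            Node t = x;
            x = min(t.right);                      // next largest: right, then left*
            x.right = deleteMin(t.right);          // remove it from its old position
            x.left = t.left;                       // it takes over the deleted node's place
        }
        return x;
    }

    static String inorder(Node x) {
        return x == null ? "" : inorder(x.left) + x.key + " " + inorder(x.right);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {50, 30, 70, 20, 40, 60, 80}) root = put(root, k);
        root = delete(root, 20);   // zero children
        root = delete(root, 30);   // one child (40)
        root = delete(root, 50);   // two children: successor 60 becomes the root
        System.out.println("root " + root.key + ", inorder " + inorder(root));
    }
}
```

The asymmetry the slide complains about is visible in the two-children case: the replacement always comes from the right subtree.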
    38. Randomized BST deletion
        To delete a node containing a given key:
        • Find the node containing the key.
        • Remove the node.
        • Join its two subtrees to make a tree.
        Ex. Delete S.
        [Figure: BST with keys E, A, S, C, I, X, H, R, N.]
    39. Randomized BST deletion
        To delete a node containing a given key:
        • Find the node containing the key.
        • Remove the node.
        • Join its two subtrees to make a tree.

            private Node remove(Node x, Key key)
            {
               if (x == null) return null;
               int cmp = key.compareTo(x.key);
               if      (cmp  < 0) x.left  = remove(x.left,  key);
               else if (cmp  > 0) x.right = remove(x.right, key);
               else if (cmp == 0) return join(x.left, x.right);
               return x;
            }

        Ex. Delete S.
        [Figure: with S removed from the BST (remaining keys E, A, C, I, X, H, R, N), join these two subtrees: the one rooted at I and the one rooted at X.]
    40. Randomized BST join
        To join two subtrees with all keys in one less than all keys in the other:
        • Maintain counts of nodes in subtrees a and b.
        • With probability |a|/(|a|+|b|):
          - root = root of a
          - left subtree = left subtree of a
          - right subtree = join b and right subtree of a
        • With probability |b|/(|a|+|b|) do the symmetric operations.
        [Figure: to join the subtree rooted at I (nodes I, H, R, N) with the subtree X, make I the root with probability 4/5 and recursively join the right subtree of I with X to form the new right subtree of I.]
    41. Randomized BST join
        To join two subtrees with all keys in one less than all keys in the other:
        • Maintain counts of nodes in subtrees a and b.
        • With probability |a|/(|a|+|b|):
          - root = root of a
          - left subtree = left subtree of a
          - right subtree = join b and right subtree of a
        • With probability |b|/(|a|+|b|) do the symmetric operations.

            private Node join(Node a, Node b)
            {
               if (a == null) return b;
               if (b == null) return a;
               // subtree counts (N) must also be updated; omitted here
               if (StdRandom.bernoulli((double) a.N / (a.N + b.N)))
               { a.right = join(a.right, b); return a; }
               else
               { b.left = join(a, b.left); return b; }
            }

        [Figure: to join the subtree rooted at R (nodes R, N) with X, make R the root with probability 2/3.]
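The remove and join fragments can be combined with explicit count maintenance and run on the example tree (keys E A S C I X H R N, delete S). This is a sketch under assumptions, not the authors' code: the class name RandomizedDeleteDemo, the char keys, the fix() helper, the fixed seed, and the plain (non-randomized) put used only to build the example tree are all hypothetical, and java.util.Random stands in for StdRandom.

```java
import java.util.Random;

// Sketch of randomized BST deletion via join, on char keys.
// RandomizedDeleteDemo, fix() and the plain put() used to build the example are hypothetical.
class RandomizedDeleteDemo {
    static class Node {
        char key; int n = 1;         // n = number of nodes in the subtree rooted here
        Node left, right;
        Node(char key) { this.key = key; }
    }

    static final Random RND = new Random(42);

    static int size(Node x) { return x == null ? 0 : x.n; }
    static void fix(Node x) { x.n = 1 + size(x.left) + size(x.right); }

    // All keys in a are smaller than all keys in b.
    static Node join(Node a, Node b) {
        if (a == null) return b;
        if (b == null) return a;
        if (RND.nextInt(a.n + b.n) < a.n) {        // probability |a| / (|a| + |b|)
            a.right = join(a.right, b); fix(a); return a;
        } else {                                   // probability |b| / (|a| + |b|)
            b.left = join(a, b.left); fix(b); return b;
        }
    }

    static Node remove(Node x, char key) {
        if (x == null) return null;
        if (key < x.key)      x.left  = remove(x.left, key);
        else if (key > x.key) x.right = remove(x.right, key);
        else return join(x.left, x.right);
        fix(x);
        return x;
    }

    // Plain insertion, used here only to build the example tree.
    static Node put(Node x, char key) {
        if (x == null) return new Node(key);
        if (key < x.key)      x.left  = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        fix(x);
        return x;
    }

    static String inorder(Node x) {
        return x == null ? "" : inorder(x.left) + x.key + inorder(x.right);
    }

    public static void main(String[] args) {
        Node root = null;
        for (char c : "EASCIXHRN".toCharArray()) root = put(root, c);
        System.out.println(inorder(root) + " (" + size(root) + " nodes)");
        root = remove(root, 'S');
        System.out.println(inorder(root) + " (" + size(root) + " nodes)");
    }
}
```

Whichever way the random choices fall, the inorder sequence loses only S and the count drops from 9 to 8; the choices affect only which node ends up where S used to be.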
    42. Randomized BST deletion
        To delete a node containing a given key:
        • Find the node containing the key.
        • Remove the node.
        • Join its two subtrees to make a tree.
        [Figure: the tree before and after deleting S; I takes the place of S.]
        Proposition. Tree still random after delete (!).
        Bottom line. Logarithmic guarantee for search/insert/delete.
    43. ST implementations: summary

        implementation     guarantee                    average case                         ordered      operations
                           search   insert   delete     search hit   insert      delete      iteration?   on keys
        unordered array    N        N        N          N/2          N           N/2         no           equals()
        unordered list     N        N        N          N/2          N           N/2         no           equals()
        ordered array      lg N     N        N          lg N         N/2         N/2         yes          compareTo()
        ordered list       N        N        N          N/2          N/2         N/2         yes          compareTo()
        BST                N        N        N          1.38 lg N    1.38 lg N   ?           yes          compareTo()
        randomized BST     3 lg N   3 lg N   3 lg N     1.38 lg N    1.38 lg N   1.38 lg N   yes          compareTo()

        Bottom line. Randomized BSTs provide the desired guarantee (probabilistic, with exponentially small chance of linear time).
        Next lecture. Can we do better?
