CS 1501
Balanced Trees
2
The Searching Problem (still)
Given a collection of keys C, determine whether or not C
contains a specific key k
● In the worst case
○ (Unless otherwise specified, I'm always talking worst case)
● Because the BST rules couldn't guarantee the tree would be
balanced
● Could we design a different tree with other rules that would
guarantee a balanced result no matter the insertion order?
We couldn't get an O(lg n) runtime for BST
3
● In addition to BST Nodes like we saw last time
○ 2-nodes
● Let's build a tree that has nodes with 3 children and 2 keys!
○ 3-nodes
2-3 trees
4
[Figure: a 3-node holding keys 10 and 50, with three children: keys < 10, 10 < keys < 50, keys > 50]
● We can now build trees from the leaves up instead of from
the root down
○ Consider inserting 10, then 50, then 25 into a 2-3 tree:
How does this help us?
5
10 50
10 50
25
● We can now build trees from the leaves up instead of from
the root down
○ Consider inserting 10, then 50, then 25 into a 2-3 tree:
How does this help us?
6
10 50
25
● How would contains('H') proceed?
Searching a 2-3 tree
7
R
M
E J
A C H L
X
P S Z
Q
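Searching a 2-3 tree can be sketched as follows. This is a minimal illustration, not the course's implementation: the class name Node23, its keys/children lists, and the small example tree are all assumptions made for this sketch.

```python
class Node23:
    """Hypothetical 2-3 tree node: 1 key (2-node) or 2 keys (3-node)."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)            # sorted, length 1 or 2
        self.children = children or []    # empty for a leaf, else len(keys) + 1


def contains(node, key):
    """Walk down from node, choosing the child whose key range covers key."""
    while node is not None:
        if key in node.keys:
            return True
        if not node.children:             # reached a leaf without a match
            return False
        i = 0                             # count the keys smaller than `key`
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        node = node.children[i]
    return False


# A 3-node root (10 50) with one leaf per key range (example data only).
root = Node23([10, 50], [Node23([5]), Node23([25, 30]), Node23([60])])
print(contains(root, 30), contains(root, 40))
```

Each step does a constant amount of work per node, so the cost is proportional to the number of levels visited.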
● How would put('K') proceed?
Inserting into a 2-3 tree
8
R
M
E J
A C H L
X
P S Z
Q
● How would put('D') proceed?
After put('K')
9
R
M
E J
A C H L
X
S Z
K
Want D here, have to split the temporary 4-node into 2-nodes
C is parent with A and D as children
Push C up
Want C here, need to split again, E is parent, C and J as children
Push E up
P Q
put('D')
10
R
M
E J
A C H L
X
S Z
K P Q
D
C
E
● How would put('O') proceed?
After put('D')
11
R
M
C J
A D H L
X
S Z
K
E
P Q
After put('O')
12
R
M
C J
A D H L
X
S Z
K
E
P
Q
O
● In general, there are 6 possible cases for splitting a temporary
4-node in a 2-3 tree:
How to split in a 2-3 tree
13
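One way to sketch put() with splitting is shown below. It is an illustration under assumptions, not the textbook's code: Node23, _insert, and put are hypothetical names, and a node is allowed to hold 3 keys for a moment (the temporary 4-node) before it is split and its middle key is pushed up.

```python
class Node23:
    """Hypothetical 2-3 tree node with 1 or 2 keys (3 only temporarily)."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []


def _insert(node, key):
    """Insert key below node. Return None, or (left, mid, right) if node split."""
    if key in node.keys:
        return None                              # duplicate key: nothing to do
    i = sum(1 for k in node.keys if key > k)     # slot / child index for key
    if not node.children:
        node.keys.insert(i, key)                 # leaf: key joins this node
    else:
        split = _insert(node.children[i], key)
        if split is None:
            return None
        left, mid, right = split                 # child split: absorb its middle key
        node.keys.insert(i, mid)
        node.children[i:i + 1] = [left, right]
    if len(node.keys) < 3:
        return None
    # Temporary 4-node: split into two 2-nodes and push the middle key up.
    left = Node23(node.keys[:1], node.children[:2])
    right = Node23(node.keys[2:], node.children[2:])
    return left, node.keys[1], right


def put(root, key):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return Node23([key])
    split = _insert(root, key)
    if split is None:
        return root
    left, mid, right = split
    return Node23([mid], [left, right])          # root split: height grows by 1


root = None
for k in [10, 50, 25]:
    root = put(root, k)
print(root.keys, [c.keys for c in root.children])
```

Returning the split pieces to the caller keeps every case in one place, at the cost of not spelling out the 6 positional cases individually.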
● Search will be O(h)
○ Where h is the height of the tree
○ So how tall can the tree grow?
■ When can the height grow?
● When we split the root
● Increases the length of all paths from root to leaf by 1
● No other transformations increase the height of the tree
● All paths from root to leaf are the same length!
○ Meaning a 2-3 tree will always be perfectly balanced
regardless of insertion order!
■ h is O(lg n)
● Search is O(lg n)
Runtimes?
14
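The O(lg n) height claim can be checked with a little arithmetic. The sketch below assumes height is counted in links from the root to a leaf; the function names are made up for illustration. A perfectly balanced tree of height h holds the fewest keys when every node is a 2-node and the most when every node is a 3-node.

```python
def min_keys(h):
    # All 2-nodes: a perfect binary tree with 2^(h+1) - 1 nodes, 1 key each.
    return 2 ** (h + 1) - 1


def max_keys(h):
    # All 3-nodes: (3^(h+1) - 1) / 2 nodes, 2 keys each.
    return 3 ** (h + 1) - 1


def height_bounds(n):
    """Lower and upper bound on the height of a 2-3 tree holding n >= 1 keys."""
    lo = 0
    while max_keys(lo) < n:          # too short to hold n keys
        lo += 1
    hi = lo
    while min_keys(hi + 1) <= n:     # a taller tree would still fit in n keys
        hi += 1
    return lo, hi


print(height_bounds(10 ** 6))  # (12, 18)
```

Since n >= 2^(h+1) - 1, the height is at most lg(n + 1) - 1, which is where the O(lg n) bound comes from.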
● May need to search all the way down the tree to find the
appropriate leaf to add to
○ O(lg n)
● If inserting into a 3-node, need to split it
○ Runtime for a split?
● If we split and the parent is also a 3-node, need to split that
○ How many possible splits?
● See Proposition F in Section 3.3 of the text
What about insert runtime?
15
● Implementing this tree will be tricky
○ Consider the node object:
■ Can have a variable number of keys and children
■ Do we implement 2 different node classes?
■ When do we need to check what type of node we're
working with?
■ Much more complicated than a BST node
● … but does it have to be?
So what's the catch?
16
Implementing 3-nodes with binary tree nodes
17
E J
A C H L
E
J
A
C H
L
● Specifically, we'll be looking at
Sedgewick's left-leaning red-black BST
approach
Red-Black BSTs
18
● Red links bind together two 2-nodes to
represent 3-nodes
● Black links bind together the 2-3 tree
● Can store the color as a node attribute
○ Red links point to red nodes
○ Black links point to black nodes
Implementing 3-nodes with binary tree nodes
19
E J
A C H L
E
J
A
C H
L
E
J
A
C H
L
● Have red and black links and satisfy the following three
restrictions:
○ Red links lean left.
○ No node has two red links connected to it.
○ The tree has perfect black balance
■ Every path from the root to a leaf link has the same
number of black links
● Black links are the links of a 2-3 tree, which we already
determined to be perfectly balanced
○ Hence, red-black BSTs will be perfectly balanced
according to their black links
Left-leaning red-black BST rules
20
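The three restrictions can be turned into a small checker. This is a sketch under assumptions: it stores color as a boolean on the node, treats null links as black, and uses hypothetical helper names (is_red, black_height, is_valid_llrb). It also requires a black root, which the later insert slides enforce.

```python
class Node:
    def __init__(self, key, is_red):
        self.key = key
        self.left = None
        self.right = None
        self.is_red = is_red   # color of the link from the parent


def is_red(node):
    return node is not None and node.is_red   # null links count as black


def black_height(node):
    """Black links from node down to every null link, or -1 if a rule is broken."""
    if node is None:
        return 0
    if is_red(node.right):                     # red links must lean left
        return -1
    if is_red(node) and is_red(node.left):     # two red links on one node
        return -1
    lh = black_height(node.left)
    rh = black_height(node.right)
    if lh == -1 or lh != rh:                   # perfect black balance
        return -1
    return lh + (0 if node.is_red else 1)


def is_valid_llrb(root):
    return not is_red(root) and black_height(root) != -1


good = Node(15, False)
good.left = Node(10, True)      # red link leaning left: a 3-node (10 15)
bad = Node(10, False)
bad.right = Node(15, True)      # red link leaning right: not allowed
print(is_valid_llrb(good), is_valid_llrb(bad))
```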
Searching a red-black BST
21
E
J
A
C H L
Root
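Searching a red-black BST is ordinary BST search: link colors are ignored. The sketch below builds the tree by hand from one reading of the figure (J at the root, a red link to E, and a red link from C to A); that layout and the contains name are assumptions for illustration.

```python
class Node:
    def __init__(self, key, is_red):
        self.key = key
        self.left = None
        self.right = None
        self.is_red = is_red


def contains(root, key):
    """Plain BST search; is_red is never consulted."""
    cur = root
    while cur is not None:
        if key < cur.key:
            cur = cur.left
        elif key > cur.key:
            cur = cur.right
        else:
            return True
    return False


# 2-3 tree (E J) with children (A C), H, L written as a red-black BST.
root = Node('J', False)
root.left = Node('E', True)
root.right = Node('L', False)
root.left.left = Node('C', False)
root.left.left.left = Node('A', True)
root.left.right = Node('H', False)

print(contains(root, 'H'), contains(root, 'K'))
```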
● First key added goes in a black node
Inserting into a red-black BST
22
● If the next key is less, just make the left child red
10
● What if the next key would be greater?
10
5
● Rotating left is one of the 3 main operations we'll be using to
balance inserts into a red-black BST
Make the right child red then rotate the parent left
23
Rotate left
10
15
15
10
● Assuming we have the following node class:
class Node:
    def __init__(self, key, is_red):
        self.key = key
        self.left = None
        self.right = None
        self.is_red = is_red  # storing color as a boolean
● We can implement rotate_left() as:
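def rotate_left(cur):
    x = cur.right            # right child becomes the new subtree root
    cur.right = x.left       # x's left subtree moves under cur
    x.left = cur
    x.is_red = cur.is_red    # x takes over cur's link color
    cur.is_red = True        # the link from x down to cur is now red
    return x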
Implementing rotate left
24
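To see the rotation at work, the sketch below repeats the slide's Node and rotate_left and applies them to the 10/15 example: 15 is first attached as a red right child of 10, then the parent is rotated left. The caller has to re-attach the returned node, here by reassigning root.

```python
class Node:
    def __init__(self, key, is_red):
        self.key = key
        self.left = None
        self.right = None
        self.is_red = is_red


def rotate_left(cur):
    x = cur.right
    cur.right = x.left
    x.left = cur
    x.is_red = cur.is_red
    cur.is_red = True
    return x


root = Node(10, False)
root.right = Node(15, True)     # right-leaning red link: not allowed
root = rotate_left(root)        # now 15 is on top with a red link down to 10

print(root.key, root.is_red)              # 15 False
print(root.left.key, root.left.is_red)    # 10 True
```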
● Regardless of the next key (key<10, 10<key<15, key>15) the
next insert will be tricky
○ The easiest of these tricky cases will be a key > 15
What about the next insert?
25
15
10
● Consider the equivalent 2-3 tree:
Inserting to the right of the root black node
26
15
10
● What should the 2-3 tree look like after inserting 25?
○ Convert that to a red-black BST to see what our goal is
10 15
10 25
15
10 25
15
● Just like with inserting into a single black node, we're going
to descend the tree like BST insert, then add a new red link
So how do we get there?
27
15
10
15
10 25
● Now, to get this looking like our target, we'll use another
operation: the color flip
● Use color flipping on nodes with 2 red children
○ Set both child nodes to black
○ Set the node itself to red
Color flip
28
15
10 25
15
10 25
● But now the root of our tree is red!
○ Even though in this case, the root is representing a 2-node,
and hence, should be black
○ After every insert, set the root to black
■ So why bother setting the current node to red as part of a
color flip??
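A color flip is three assignments. The sketch below assumes the boolean-color Node from the rotate left slide; flip_colors is a name chosen for this example. In a larger tree the red parent is what passes the split's middle key up a level, which is why the node is set to red even though the root gets reset to black here.

```python
class Node:
    def __init__(self, key, is_red):
        self.key = key
        self.left = None
        self.right = None
        self.is_red = is_red


def flip_colors(node):
    """Split a temporary 4-node: both children go black, the node goes red."""
    node.left.is_red = False
    node.right.is_red = False
    node.is_red = True


root = Node(15, False)
root.left = Node(10, True)
root.right = Node(25, True)     # two red children: a temporary 4-node

flip_colors(root)
root.is_red = False             # after every insert, set the root to black

print(root.is_red, root.left.is_red, root.right.is_red)  # False False False
```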
● What if the new key is less than the first two?
○ E.g., insert 5 into:
Back to inserting into single 3-node
29
15
10
15
10
5
● Here, we use our third operation: rotate right
○ Apply to the root, then color flip
15
10
5
15
10
5
Don't forget
to set root
to black!
10
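Rotate right is the mirror image of rotate left. The sketch below is an assumed implementation written by symmetry with rotate_left, not code taken from the slides; it replays the insert-5 example, then applies the color flip and the root fix.

```python
class Node:
    def __init__(self, key, is_red):
        self.key = key
        self.left = None
        self.right = None
        self.is_red = is_red


def rotate_right(cur):
    x = cur.left             # left child becomes the new subtree root
    cur.left = x.right       # x's right subtree moves under cur
    x.right = cur
    x.is_red = cur.is_red    # x takes over cur's link color
    cur.is_red = True        # the link from x down to cur is now red
    return x


root = Node(15, False)
root.left = Node(10, True)
root.left.left = Node(5, True)      # two red links in a row on the left

root = rotate_right(root)           # 10 on top, 5 and 15 both red children

# Color flip on the new root, then set the root back to black.
root.left.is_red = False
root.right.is_red = False
root.is_red = False

print(root.key, root.left.key, root.right.key)  # 10 5 15
```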
● Insert 12 into:
Now the middle case
30
15
10
15
10
12
● Rotating the node containing 10 left will get us in a similar
situation to where we started in the previous example!
15
12
10
● After 6 slides, you might be able to follow this graphic from
the textbook:
And now…
31
● Start off the same:
○ Search down the tree, insert below the appropriate leaf node
■ Will either be a stand-in for a 2-node or a 3-node
● We've covered all cases for inserting into either
■ May end up passing a red link up the tree
● From color flips
■ Need to apply rotations and color flips back up the tree
● The results of several inserts can be found on page 440 of
the text
○ Trace through the inserts on your own to make sure you
understand how this process is applied to larger trees!
What about larger trees?
32
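Putting the three operations together, insertion into a larger tree can be sketched as a recursive BST insert that applies the fix-ups on the way back up. This follows the general shape of Sedgewick's left-leaning red-black insert, but the helper names (is_red, _put, put) and the boolean color field are assumptions carried over from the earlier slides.

```python
class Node:
    def __init__(self, key, is_red):
        self.key = key
        self.left = None
        self.right = None
        self.is_red = is_red


def is_red(node):
    return node is not None and node.is_red


def rotate_left(cur):
    x = cur.right
    cur.right = x.left
    x.left = cur
    x.is_red = cur.is_red
    cur.is_red = True
    return x


def rotate_right(cur):
    x = cur.left
    cur.left = x.right
    x.right = cur
    x.is_red = cur.is_red
    cur.is_red = True
    return x


def flip_colors(node):
    node.is_red = True
    node.left.is_red = False
    node.right.is_red = False


def _put(node, key):
    if node is None:
        return Node(key, True)                 # new keys hang off a red link
    if key < node.key:
        node.left = _put(node.left, key)
    elif key > node.key:
        node.right = _put(node.right, key)
    # Fix-ups on the way back up the tree.
    if is_red(node.right) and not is_red(node.left):
        node = rotate_left(node)               # right-leaning red link
    if is_red(node.left) and is_red(node.left.left):
        node = rotate_right(node)              # two red links in a row
    if is_red(node.left) and is_red(node.right):
        flip_colors(node)                      # split the temporary 4-node
    return node


def put(root, key):
    root = _put(root, key)
    root.is_red = False                        # after every insert, root is black
    return root


root = None
for k in [10, 15, 25]:
    root = put(root, k)
print(root.key, root.left.key, root.right.key)  # 15 10 25
```

Ordering the three checks this way lets the middle case (rotate left, then rotate right, then flip) fall out of a single pass.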
Removing a key from the tree
33
● Has a 1-to-1 mapping with 2-3 trees, so guaranteed to have
logarithmic height
● This means that our operations will be O(lg n)
○ Worst case!
○ Specifically:
■ search, insertion, finding the minimum, finding the
maximum, floor, ceiling, rank, select, delete the minimum,
delete the maximum, delete, and range count
○ Refer to Proposition I from Section 3.3 of the text
Performance
34
