SlideShare a Scribd company logo
1 of 618
Download to read offline
Elementary Algorithms 
Larry LIU Xinyu 1 
July 25, 2014 
1Larry LIU Xinyu 
Version: 0.6180339887498949 
Email: liuxinyu95@gmail.com
2
Contents 
I Preface 5 
0.1 Why? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 
0.2 The smallest free ID problem, the power of algorithms . . . . . . 7 
0.2.1 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . 8 
0.2.2 Improvement 2, Divide and Conquer . . . . . . . . . . . . 9 
0.2.3 Expressiveness vs. Performance . . . . . . . . . . . . . . . 10 
0.3 The number puzzle, power of data structure . . . . . . . . . . . . 12 
0.3.1 The brute-force solution . . . . . . . . . . . . . . . . . . . 12 
0.3.2 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . 12 
0.3.3 Improvement 2 . . . . . . . . . . . . . . . . . . . . . . . . 15 
0.4 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 18 
0.5 Structure of the contents . . . . . . . . . . . . . . . . . . . . . . . 18 
0.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 
II Trees 23 
1 Binary search tree, the `hello world' data structure 25 
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 
1.2 Data Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 
1.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 
1.4 Traversing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 
1.5 Querying a binary search tree . . . . . . . . . . . . . . . . . . . . 33 
1.5.1 Looking up . . . . . . . . . . . . . . . . . . . . . . . . . . 33 
1.5.2 Minimum and maximum . . . . . . . . . . . . . . . . . . . 34 
1.5.3 Successor and predecessor . . . . . . . . . . . . . . . . . . 34 
1.6 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 
1.7 Randomly build binary search tree . . . . . . . . . . . . . . . . . 40 
1.8 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 
2 The evolution of insertion sort 43 
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 
2.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 
2.3 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 
2.4 Improvement 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 
2.5 Final improvement by binary search tree . . . . . . . . . . . . . . 49 
2.6 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 
3
4 CONTENTS 
3 Red-black tree, not so complex as it was thought 53 
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 
3.1.1 Exploit the binary search tree . . . . . . . . . . . . . . . . 53 
3.1.2 How to ensure the balance of the tree . . . . . . . . . . . 54 
3.1.3 Tree rotation . . . . . . . . . . . . . . . . . . . . . . . . . 56 
3.2 De
nition of red-black tree . . . . . . . . . . . . . . . . . . . . . 58 
3.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 
3.4 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 
3.5 Imperative red-black tree algorithm ? . . . . . . . . . . . . . . . 71 
3.6 More words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 
4 AVL tree 77 
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 
4.1.1 How to measure the balance of a tree? . . . . . . . . . . . 77 
4.2 De
nition of AVL tree . . . . . . . . . . . . . . . . . . . . . . . . 77 
4.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 
4.3.1 Balancing adjustment . . . . . . . . . . . . . . . . . . . . 82 
4.3.2 Pattern Matching . . . . . . . . . . . . . . . . . . . . . . . 86 
4.4 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 
4.5 Imperative AVL tree algorithm ? . . . . . . . . . . . . . . . . . . 88 
4.6 Chapter note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 
5 Trie and Patricia 95 
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 
5.2 Integer Trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 
5.2.1 De
nition of integer Trie . . . . . . . . . . . . . . . . . . . 96 
5.2.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 
5.2.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 
5.3 Integer Patricia . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 
5.3.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 
5.3.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 
5.3.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 
5.4 Alphabetic Trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 
5.4.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 
5.4.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 
5.4.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 
5.5 Alphabetic Patricia . . . . . . . . . . . . . . . . . . . . . . . . . . 113 
5.5.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 
5.5.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 
5.5.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 
5.6 Trie and Patricia applications . . . . . . . . . . . . . . . . . . . . 121 
5.6.1 E-dictionary and word auto-completion . . . . . . . . . . 121 
5.6.2 T9 input method . . . . . . . . . . . . . . . . . . . . . . . 125 
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 
6 Sux Tree 133 
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 
6.2 Sux trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 
6.2.1 Node transfer and sux link . . . . . . . . . . . . . . . . 135 
6.2.2 On-line construction . . . . . . . . . . . . . . . . . . . . . 136
CONTENTS 5 
6.3 Sux Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 
6.3.1 On-line construction . . . . . . . . . . . . . . . . . . . . . 141 
6.4 Sux tree applications . . . . . . . . . . . . . . . . . . . . . . . . 150 
6.4.1 String/Pattern searching . . . . . . . . . . . . . . . . . . . 150 
6.4.2 Find the longest repeated sub-string . . . . . . . . . . . . 152 
6.4.3 Find the longest common sub-string . . . . . . . . . . . . 153 
6.4.4 Find the longest palindrome . . . . . . . . . . . . . . . . . 155 
6.4.5 Others . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 
6.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 156 
7 B-Trees 159 
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 
7.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 
7.2.1 Splitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 
7.3 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 
7.3.1 Merge before delete method . . . . . . . . . . . . . . . . . 168 
7.3.2 Delete and
x method . . . . . . . . . . . . . . . . . . . . 176 
7.4 Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 
7.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 183 
III Heaps 187 
8 Binary Heaps 189 
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 
8.2 Implicit binary heap by array . . . . . . . . . . . . . . . . . . . . 189 
8.2.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 
8.2.2 Heapify . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 
8.2.3 Build a heap . . . . . . . . . . . . . . . . . . . . . . . . . 192 
8.2.4 Basic heap operations . . . . . . . . . . . . . . . . . . . . 194 
8.2.5 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 
8.3 Leftist heap and Skew heap, the explicit binary heaps . . . . . . 201 
8.3.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 
8.3.2 Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 
8.3.3 Basic heap operations . . . . . . . . . . . . . . . . . . . . 204 
8.3.4 Heap sort by Leftist Heap . . . . . . . . . . . . . . . . . . 206 
8.3.5 Skew heaps . . . . . . . . . . . . . . . . . . . . . . . . . . 206 
8.4 Splay heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 
8.4.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 
8.4.2 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 
8.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 215 
9 From grape to the world cup, the evolution of selection sort 219 
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 
9.2 Finding the minimum . . . . . . . . . . . . . . . . . . . . . . . . 221 
9.2.1 Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 
9.2.2 Grouping . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 
9.2.3 performance of the basic selection sorting . . . . . . . . . 224 
9.3 Minor Improvement . . . . . . . . . . . . . . . . . . . . . . . . . 225 
9.3.1 Parameterize the comparator . . . . . . . . . . . . . . . . 225
6 CONTENTS 
9.3.2 Trivial
ne tune . . . . . . . . . . . . . . . . . . . . . . . 226 
9.3.3 Cock-tail sort . . . . . . . . . . . . . . . . . . . . . . . . . 227 
9.4 Major improvement . . . . . . . . . . . . . . . . . . . . . . . . . 231 
9.4.1 Tournament knock out . . . . . . . . . . . . . . . . . . . . 231 
9.4.2 Final improvement by using heap sort . . . . . . . . . . . 239 
9.5 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 
10 Binomial heap, Fibonacci heap, and pairing heap 243 
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 
10.2 Binomial Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 
10.2.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 
10.2.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 248 
10.3 Fibonacci Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . 258 
10.3.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 258 
10.3.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 260 
10.3.3 Running time of pop . . . . . . . . . . . . . . . . . . . . . 269 
10.3.4 Decreasing key . . . . . . . . . . . . . . . . . . . . . . . . 271 
10.3.5 The name of Fibonacci Heap . . . . . . . . . . . . . . . . 273 
10.4 Pairing Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 
10.4.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 
10.4.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 276 
10.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 282 
IV Queues and Sequences 285 
11 Queue, not so simple as it was thought 287 
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287 
11.2 Queue by linked-list and circular buer . . . . . . . . . . . . . . 288 
11.2.1 Singly linked-list solution . . . . . . . . . . . . . . . . . . 288 
11.2.2 Circular buer solution . . . . . . . . . . . . . . . . . . . 291 
11.3 Purely functional solution . . . . . . . . . . . . . . . . . . . . . . 294 
11.3.1 Paired-list queue . . . . . . . . . . . . . . . . . . . . . . . 294 
11.3.2 Paired-array queue - a symmetric implementation . . . . 296 
11.4 A small improvement, Balanced Queue . . . . . . . . . . . . . . . 298 
11.5 One more step improvement, Real-time Queue . . . . . . . . . . 300 
11.6 Lazy real-time queue . . . . . . . . . . . . . . . . . . . . . . . . . 307 
11.7 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 310 
12 Sequences, The last brick 313 
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 
12.2 Binary random access list . . . . . . . . . . . . . . . . . . . . . . 314 
12.2.1 Review of plain-array and list . . . . . . . . . . . . . . . . 314 
12.2.2 Represent sequence by trees . . . . . . . . . . . . . . . . . 314 
12.2.3 Insertion to the head of the sequence . . . . . . . . . . . . 316 
12.3 Numeric representation for binary random access list . . . . . . . 322 
12.3.1 Imperative binary access list . . . . . . . . . . . . . . . . 324 
12.4 Imperative paired-array list . . . . . . . . . . . . . . . . . . . . . 327 
12.4.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 327 
12.4.2 Insertion and appending . . . . . . . . . . . . . . . . . . . 328
CONTENTS 7 
12.4.3 random access . . . . . . . . . . . . . . . . . . . . . . . . 328 
12.4.4 removing and balancing . . . . . . . . . . . . . . . . . . . 329 
12.5 Concatenate-able list . . . . . . . . . . . . . . . . . . . . . . . . . 331 
12.6 Finger tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 
12.6.1 De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 
12.6.2 Insert element to the head of sequence . . . . . . . . . . . 337 
12.6.3 Remove element from the head of sequence . . . . . . . . 340 
12.6.4 Handling the ill-formed
nger tree when removing . . . . 341 
12.6.5 append element to the tail of the sequence . . . . . . . . . 346 
12.6.6 remove element from the tail of the sequence . . . . . . . 347 
12.6.7 concatenate . . . . . . . . . . . . . . . . . . . . . . . . . . 349 
12.6.8 Random access of
nger tree . . . . . . . . . . . . . . . . 354 
12.7 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 365 
V Sorting and Searching 369 
13 Divide and conquer, Quick sort vs. Merge sort 371 
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 
13.2 Quick sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 
13.2.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 372 
13.2.2 Strict weak ordering . . . . . . . . . . . . . . . . . . . . . 373 
13.2.3 Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . 374 
13.2.4 Minor improvement in functional partition . . . . . . . . 377 
13.3 Performance analysis for quick sort . . . . . . . . . . . . . . . . . 379 
13.3.1 Average case analysis ? . . . . . . . . . . . . . . . . . . . 380 
13.4 Engineering Improvement . . . . . . . . . . . . . . . . . . . . . . 383 
13.4.1 Engineering solution to duplicated elements . . . . . . . . 383 
13.5 Engineering solution to the worst case . . . . . . . . . . . . . . . 390 
13.6 Other engineering practice . . . . . . . . . . . . . . . . . . . . . . 394 
13.7 Side words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 
13.8 Merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 
13.8.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 396 
13.9 In-place merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . 403 
13.9.1 Naive in-place merge . . . . . . . . . . . . . . . . . . . . . 403 
13.9.2 in-place working area . . . . . . . . . . . . . . . . . . . . 404 
13.9.3 In-place merge sort vs. linked-list merge sort . . . . . . . 409 
13.10Nature merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . 411 
13.11Bottom-up merge sort . . . . . . . . . . . . . . . . . . . . . . . . 416 
13.12Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 
13.13Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 
14 Searching 423 
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 
14.2 Sequence search . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 
14.2.1 Divide and conquer search . . . . . . . . . . . . . . . . . . 424 
14.2.2 Information reuse . . . . . . . . . . . . . . . . . . . . . . . 444 
14.3 Solution searching . . . . . . . . . . . . . . . . . . . . . . . . . . 471 
14.3.1 DFS and BFS . . . . . . . . . . . . . . . . . . . . . . . . . 471 
14.3.2 Search the optimal solution . . . . . . . . . . . . . . . . . 507
8 CONTENTS 
14.4 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535 
VI Appendix 539 
Appendices 
A Lists 541 
A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 
A.2 List De
nition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 
A.2.1 Empty list . . . . . . . . . . . . . . . . . . . . . . . . . . . 542 
A.2.2 Access the element and the sub list . . . . . . . . . . . . . 542 
A.3 Basic list manipulation . . . . . . . . . . . . . . . . . . . . . . . . 543 
A.3.1 Construction . . . . . . . . . . . . . . . . . . . . . . . . . 543 
A.3.2 Empty testing and length calculating . . . . . . . . . . . . 544 
A.3.3 indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545 
A.3.4 Access the last element . . . . . . . . . . . . . . . . . . . 546 
A.3.5 Reverse indexing . . . . . . . . . . . . . . . . . . . . . . . 547 
A.3.6 Mutating . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 
A.3.7 sum and product . . . . . . . . . . . . . . . . . . . . . . . 559 
A.3.8 maximum and minimum . . . . . . . . . . . . . . . . . . . 563 
A.4 Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567 
A.4.1 mapping and for-each . . . . . . . . . . . . . . . . . . . . 567 
A.4.2 reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573 
A.5 Extract sub-lists . . . . . . . . . . . . . . . . . . . . . . . . . . . 575 
A.5.1 take, drop, and split-at . . . . . . . . . . . . . . . . . . . 575 
A.5.2 breaking and grouping . . . . . . . . . . . . . . . . . . . . 577 
A.6 Folding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582 
A.6.1 folding from right . . . . . . . . . . . . . . . . . . . . . . . 582 
A.6.2 folding from left . . . . . . . . . . . . . . . . . . . . . . . 584 
A.6.3 folding in practice . . . . . . . . . . . . . . . . . . . . . . 587 
A.7 Searching and matching . . . . . . . . . . . . . . . . . . . . . . . 588 
A.7.1 Existence testing . . . . . . . . . . . . . . . . . . . . . . . 588 
A.7.2 Looking up . . . . . . . . . . . . . . . . . . . . . . . . . . 588 
A.7.3
nding and
ltering . . . . . . . . . . . . . . . . . . . . . 589 
A.7.4 Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . 592 
A.8 zipping and unzipping . . . . . . . . . . . . . . . . . . . . . . . . 594 
A.9 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 597 
GNU Free Documentation License 601 
1. APPLICABILITY AND DEFINITIONS . . . . . . . . . . . . . . . 601 
2. VERBATIM COPYING . . . . . . . . . . . . . . . . . . . . . . . . 603 
3. COPYING IN QUANTITY . . . . . . . . . . . . . . . . . . . . . . 603 
4. MODIFICATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . 604 
5. COMBINING DOCUMENTS . . . . . . . . . . . . . . . . . . . . . 605 
6. COLLECTIONS OF DOCUMENTS . . . . . . . . . . . . . . . . . 606 
7. AGGREGATION WITH INDEPENDENT WORKS . . . . . . . . 606 
8. TRANSLATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606 
9. TERMINATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 
10. FUTURE REVISIONS OF THIS LICENSE . . . . . . . . . . . . 607
CONTENTS 9 
11. RELICENSING . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 
ADDENDUM: How to use this License for your documents . . . . . . 608
10 CONTENTS
Part I 
Preface 
11
Elementary Algorithms 13 
0.1 Why? 
`Are algorithms useful?'. Some programmers say that they seldom use any 
serious data structures or algorithms in real work such as commercial application 
development. Even when they need some of them, they have already been 
provided by libraries. For example, the C++ standard template library (STL) 
provides sort and selection algorithms as well as the vector, queue, and set data 
structures. It seems that knowing about how to use the library as a tool is quite 
enough. 
Instead of answering this question directly, I would like to say algorithms 
and data structures are critical in solving `interesting problems', the usefulness 
of the problem set aside. 
Let's start with two problems that looks like they can be solved in a brute-force 
way even by a fresh programmer. 
0.2 The smallest free ID problem, the power of 
algorithms 
This problem is discussed in Chapter 1 of Richard Bird's book [1]. It's common 
that applications and systems use ID (identi
er) to manage objects and entities. 
At any time, some IDs are used, and some of them are available for use. When 
some client tries to acquire a new ID, we want to always allocate it the smallest 
available one. Suppose IDs are non-negative integers and all IDs in use are kept 
in a list (or an array) which is not ordered. For example: 
[18, 4, 8, 9, 16, 1, 14, 7, 19, 3, 0, 5, 2, 11, 6] 
How can you
nd the smallest free ID, which is 10, from the list? 
It seems the solution is quite easy even without any serious algorithms. 
1: function Min-Free(A) 
2: x   0 
3: loop 
4: if x =2 
A then 
5: return x 
6: else 
7: x   x + 1 
Where the =2 
is realized like below. 
1: function `=2 
'(x;X) 
2: for i   1 to jXj do 
3: if x = X[i] then 
4: return False 
5: return True 
Some languages provide handy tools which wrap this linear time process. For 
example in Python, this algorithm can be directly translated as the following. 
def b r u t e f o r c e ( l s t ) : 
i = 0 
while True : 
i f i not in l s t :
14 Preface 
return i 
i = i + 1 
It seems this problem is trivial. However, There will be millions of IDs in a 
large system. The speed of this solution is poor in such case for it takes O(n2) 
time, where n is the length of the ID list. In my computer (2 Cores 2.10 GHz, 
with 2G RAM), a C program using this solution takes an average of 5.4 seconds 
to search a minimum free number among 100,000 IDs1. And it takes more than 
8 minutes to handle a million numbers. 
0.2.1 Improvement 1 
The key idea to improve the solution is based on a fact that for a series of n 
numbers x1; x2; :::; xn, if there are free numbers, some of the xi are outside the 
range [0; n); otherwise the list is exactly a permutation of 0; 1; :::; n  1 and n 
should be returned as the minimum free number. It means that max(xi)  n1. 
And we have the following fact. 
minfree(x1; x2; :::; xn)  n (1) 
One solution is to use an array of n + 1 
ags to mark whether a number in 
range [0; n] is free. 
1: function Min-Free(A) 
2: F   [False; False; :::; False] where jFj = n + 1 
3: for 8x 2 A do 
4: if x  n then 
5: F[x]   True 
6: for i   [0; n] do 
7: if F[i] = False then 
8: return i 
Line 2 initializes a 
ag array all of False values. This takes O(n) time. Then 
the algorithm scans all numbers in A and mark the relative 
ag to True if the 
value is less than n, This step also takes O(n) time. Finally, the algorithm 
performs a linear time search to
nd the
rst 
ag with False value. So the total 
performance of this algorithm is O(n). Note that we use n + 1 
ags instead of 
n 
ags to cover the special case that sorted(A) = [0; 1; 2; :::; n  1]. 
Although the algorithm only takes O(n) time, it needs extra O(n) spaces to 
store the 
ags. 
This solution is much faster than the brute force one. On my computer, 
the relevant Python program takes an average of 0.02 second when dealing with 
100,000 numbers. 
We haven't
ne tuned this algorithm yet. Observe that each time we have 
to allocate memory to create a n + 1 elements array of 
ags, and release the 
memory when
nished. The memory allocation and release is very expensive 
thus they cost us a lot of processing time. 
There are two ways in which we can improve on this solution. One is to 
allocate the 
ags array in advance and reuse it for all the calls of our function to
nd the smallest free number. The other is to use bit-wise 
ags instead of a 
ag 
array. The following is the C program based on these two minor improvements. 
1All programs can be downloaded along with this series posts.
0.2. THE SMALLEST FREE ID PROBLEM, THE POWER OF ALGORITHMS15 
#define N 1000000 // 1 mi l l i o n 
#define WORDLENGTH s izeof ( int )  8 
void s e t b i t (unsigned int  bi t s , unsigned int i )f 
b i t s [ i / WORDLENGTH] j= 1( i % WORDLENGTH) ; 
g 
int t e s t b i t (unsigned int  bi t s , unsigned int i )f 
return b i t s [ i /WORDLENGTH]  (1( i % WORDLENGTH) ) ; 
g 
unsigned int b i t s [N/WORDLENGTH+1] ; 
int min f r e e ( int  xs , int n)f 
int i , l en = N/WORDLENGTH+1; 
for ( i =0; il en ; ++i ) 
b i t s [ i ]=0; 
for ( i =0; in ; ++i ) 
i f ( xs [ i ]n) 
s e t b i t ( bi t s , xs [ i ] ) ; 
for ( i =0; i=n ; ++i ) 
i f ( ! t e s t b i t ( bi t s , i ) ) 
return i ; 
g 
This C program can handle 1,000,000 (1 million) IDs in just 0.023 second 
on my computer. 
The last for-loop can be further improved as seen below but this is just minor
ne-tuning. 
for ( i =0; ; ++i ) 
i f (~ b i t s [ i ] !=0 ) 
for ( j =0; ; ++j ) 
i f ( ! t e s t b i t ( bi t s , i WORDLENGTH+j ) ) 
return i WORDLENGTH+j ; 
0.2.2 Improvement 2, Divide and Conquer 
Although the above improvement is much faster, it costs O(n) extra spaces to 
keep a check list. if n is huge number this means a huge amount of space is 
wasted. 
The typical divide and conquer strategy is to break the problem into some 
smaller ones, and solve these to get the
nal answer. 
We can put all numbers xi  bn=2c as a sub-list A0 and put all the others 
as a second sub-list A00. Based on formula 1 if the length of A0 is exactly bn=2c, 
this means the
rst half of numbers are `full', which indicates that the minimum 
free number must be in A00 and so we'll need to recursively seek in the shorter 
list A00. Otherwise, it means the minimum free number is located in A0, which 
again leads to a smaller problem. 
When we search the minimum free number in A00, the conditions changes 
a little bit, we are not searching the smallest free number starting from 0, but
16 Preface 
actually from bn=2c +1 as the lower bound. So the algorithm is something like 
minfree(A; l; u), where l is the lower bound and u is the upper bound index of 
the element. 
Note that there is a trivial case, that if the number list is empty, we merely 
return the lower bound as the result. 
This divide and conquer solution can be formally expressed as a function : 
minfree(A) = search(A; 0; jAj  1) 
search(A; l; u) = 
8 
: 
l : A =  
search(A00;m + 1; u) : jA0j = m  l + 1 
search(A0; l;m) : otherwise 
where 
m = b l + u 
2 
c 
A0 = f8x 2 A ^ x  mg 
A00 = f8x 2 A ^ x  mg 
It is obvious that this algorithm doesn't need any extra space2. Each call 
performs O(jAj) comparison to build A0 and A00. After that the problem scale 
halves. So the time needed for this algorithm is T(n) = T(n=2) + O(n) which 
reduce to O(n). Another way to analyze the performance is by observing that 
the
rst call takes O(n) to build A0 and A00 and the second call takes O(n=2), and 
O(n=4) for the third... The total time is O(n+n=2+n=4+:::) = O(2n) = O(n) 
. 
In functional programming languages such as Haskell, partitioning a list has 
already been provided in the basic library and this algorithm can be translated 
as the following. 
import Data.List 
minFree xs = bsearch xs 0 (length xs - 1) 
bsearch xs l u j xs == [] = l 
j length as == m - l + 1 = bsearch bs (m+1) u 
j otherwise = bsearch as l m 
where 
m = (l + u) `div` 2 
(as, bs) = partition (m) xs 
0.2.3 Expressiveness vs. Performance 
Imperative language programmers may be concerned about the performance of 
this kind of implementation. For instance in this minimum free ID problem, the 
number of recursive calls is in O(lg n) , which means the stack size consumed 
is in O(lg n). It's not free in terms of space. But if we want to avoid that , we 
2Procedural programmer may note that it actually takes O(lg n) stack spaces for book- 
keeping. As we'll see later, this can be eliminated either by tail recursion optimization, for 
instance gcc -O2. or by manually changing the recursion to iteration
0.2. THE SMALLEST FREE ID PROBLEM, THE POWER OF ALGORITHMS17 
can eliminate the recursion by replacing it by an iteration 3 which yields the 
following C program. 
int min_free(int xs, int n){ 
int l=0; 
int u=n-1; 
while(n){ 
int m = (l + u) = 2; 
int right, left = 0; 
for(right = 0; right  n; ++ right) 
if(xs[right]  m){ 
swap(xs[left], xs[right]); 
++left; 
} 
if(left == m - l + 1){ 
xs = xs + left; 
n = n - left; 
l = m+1; 
} 
else{ 
n = left; 
u = m; 
} 
} 
return l; 
} 
This program uses a `quick-sort' like approach to re-arrange the array so that 
all the elements before left are less than or equal to m; while those between 
left and right are greater than m. This is shown in
gure 1. 
left right 
x[i]=m x[i]m ...?... 
Figure 1: Divide the array, all x[i]  m where 0  i  left; while all x[i]  m 
where left  i  right. The left elements are unknown. 
This program is fast and it doesn't need extra stack space. However, com-pared 
to the previous Haskell program, it's hard to read and the expressiveness 
decreased. We have to balance performance and expressiveness. 
3This is done automatically in most functional languages since our function is in tail 
recursive form which lends itself perfectly to this transformation
18 Preface 
0.3 The number puzzle, power of data structure 
If the
rst problem, to
nd the minimum free number, is a some what useful in 
practice, this problem is a `pure' one for fun. The puzzle is to
nd the 1,500th 
number, which only contains factor 2, 3 or 5. The
rst 3 numbers are of course 
2, 3, and 5. Number 60 = 223151, However it is the 25th number. Number 
21 = 203171, isn't a valid number because it contains a factor 7. The
rst 10 
such numbers are list as the following. 
2,3,4,5,6,8,9,10,12,15 
If we consider 1 = 203050, then 1 is also a valid number and it is the
rst 
one. 
0.3.1 The brute-force solution 
It seems the solution is quite easy without need any serious algorithms. We can 
check all numbers from 1, then extract all factors of 2, 3 and 5 to see if the left 
part is 1. 
1: function Get-Number(n) 
2: x   1 
3: i   0 
4: loop 
5: if Valid?(x) then 
6: i   i + 1 
7: if i = n then 
8: return x 
9: x   x + 1 
10: function Valid?(x) 
11: while x mod 2 = 0 do 
12: x   x=2 
13: while x mod 3 = 0 do 
14: x   x=3 
15: while x mod 5 = 0 do 
16: x   x=5 
17: if x = 1 then 
18: return True 
19: else 
20: return False 
This `brute-force' algorithm works for most small n. However, to
nd the 
1500th number (which is 859963392), the C program based on this algorithm 
takes 40.39 seconds in my computer. I have to kill the program after 10 minutes 
when I increased n to 15,000. 
0.3.2 Improvement 1 
Analysis of the above algorithm shows that modular and divide calculations 
are very expensive [2]. And they executed a lot in loops. Instead of checking a 
number contains only 2, 3, or 5 as factors, one alternative solution is to construct 
such number by these factors.
0.3. THE NUMBER PUZZLE, POWER OF DATA STRUCTURE 19 
We start from 1, and times it with 2, or 3, or 5 to generate rest numbers. 
The problem turns to be how to generate the candidate number in order? One 
handy way is to utilize the queue data structure. 
A queue data structure is used to push elements at one end, and pops them 
at the other end. So that the element be pushed
rst is also be popped out
rst. 
This property is called FIFO (First-In-First-Out). 
The idea is to push 1 as the only element to the queue, then we pop an 
element, times it with 2, 3, and 5, to get 3 new elements. We then push them 
back to the queue in order. Note that, the new elements may have already 
existed in the queue. In such case, we just drop the element. The new element 
may also smaller than the others in the queue, so we must put them to the 
correct position. Figure 2 illustrates this idea. 
1 
1*2=2 1*3=3 1*5=5 
2 3 5 
2*2=4 2*3=6 2*5=10 
3 4 5 6 1 0 
3*2=6 3*3=9 3*5=15 
4 5 6 9 1 0 1 5 
4*2=8 4*3=12 4*5=20 
Figure 2: First 4 steps of constructing numbers with a queue. 
1. Queue is initialized with 1 as the only element; 
2. New elements 2, 3, and 5 are pushed back; 
3. New elements 4, 6, and 10, are pushed back in order; 
4. New elements 9 and 15 are pushed back, element 6 already exists. 
This algorithm is shown as the following. 
1: function Get-Number(n) 
2: Q   NIL 
3: Enqueue(Q; 1) 
4: while n  0 do 
5: x   Dequeue(Q) 
6: Unique-Enqueue(Q; 2x) 
7: Unique-Enqueue(Q; 3x) 
8: Unique-Enqueue(Q; 5x) 
9: n   n  1 
10: return x 
11: function Unique-Enqueue(Q; x) 
12: i   0 
13: while i  jQj ^ Q[i]  x do 
14: i   i + 1 
15: if i  jQj ^ x = Q[i] then 
16: return 
17: Insert(Q; i; x)
20 Preface 
The insert function takes O(jQj) time to
nd the proper position and insert 
it. If the element has already existed, it just returns. 
A rough estimation tells that the length of the queue increase proportion to 
n, (Each time, we extract one element, and pushed 3 new, the increase ratio  
2), so the total running time is O(1 + 2 + 3 + ::: + n) = O(n2). 
Figure3 shows the number of queue access time against n. It is quadratic 
curve which re
ect the O(n2) performance. 
Figure 3: Queue access count v.s. n. 
The C program based on this algorithm takes only 0.016[s] to get the right 
answer 859963392. Which is 2500 times faster than the brute force solution. 
Improvement 1 can also be considered in recursive way. Suppose X is the 
in
nity series for all numbers which only contain factors of 2, 3, or 5. The 
following formula shows an interesting relationship. 
X = f1g [ f2x : 8x 2 Xg [ f3x : 8x 2 Xg [ f5x : 8x 2 Xg (2) 
Where we can de
ne [ to a special form so that all elements are stored 
in order as well as unique to each other. Suppose that X = fx1; x2; x3:::g, 
Y = fy1; y2; y3; :::g, X0 = fx2; x3; :::g and Y 0 = fy2; y3; :::g. We have 
X [ Y = 
8 
: 
X : Y =  
Y : X =  
fx1;X0 [ Y g : x1  y1 
fx1;X0 [ Y 0g : x1 = y1 
fy1;X [ Y 0g : x1  y1 
In a functional programming language such as Haskell, which supports lazy 
evaluation, The above in
nity series functions can be translate into the following 
program. 
ns = 1:merge (map (2) ns) (merge (map (3) ns) (map (5) ns)) 
merge [] l = l 
merge l [] = l 
merge (x:xs) (y:ys) j x y = x : merge xs (y:ys)
0.3. THE NUMBER PUZZLE, POWER OF DATA STRUCTURE 21 
j x ==y = x : merge xs ys 
j otherwise = y : merge (x:xs) ys 
By evaluate ns !! (n-1), we can get the 1500th number as below. 
ns !! (1500-1) 
859963392 
0.3.3 Improvement 2 
Considering the above solution, although it is much faster than the brute-force 
one, It still has some drawbacks. It produces many duplicated numbers and 
they are
nally dropped when examine the queue. Secondly, it does linear scan 
and insertion to keep the order of all elements in the queue, which degrade the 
ENQUEUE operation from O(1) to O(jQj). 
If we use three queues instead of using only one, we can improve the solution 
one step ahead. Denote these queues as Q2, Q3, and Q5, and we initialize them 
as Q2 = f2g, Q3 = f3g and Q5 = f5g. Each time we DEQUEUEed the smallest 
one from Q2, Q3, and Q5 as x. And do the following test: 
 If x comes from Q2, we ENQUEUE 2x, 3x, and 5x back to Q2, Q3, and 
Q5 respectively; 
 If x comes from Q3, we only need ENQUEUE 3x to Q3, and 5x to Q5; 
We needn't ENQUEUE 2x to Q2, because 2x have already existed in Q3; 
 If x comes from Q5, we only need ENQUEUE 5x to Q5; there is no need 
to ENQUEUE 3x, 5x to Q3, Q5 because they have already been in the 
queues; 
We repeatedly ENQUEUE the smallest one until we
nd the n-th element. 
The algorithm based on this idea is implemented as below. 
1: function Get-Number(n) 
2: if n = 1 then 
3: return 1 
4: else 
5: Q2   f2g 
6: Q3   f3g 
7: Q5   f5g 
8: while n  1 do 
9: x   min(Head(Q2), Head(Q3), Head(Q5)) 
10: if x = Head(Q2) then 
11: Dequeue(Q2) 
12: Enqueue(Q2; 2x) 
13: Enqueue(Q3; 3x) 
14: Enqueue(Q5; 5x) 
15: else if x = Head(Q3) then 
16: Dequeue(Q3) 
17: Enqueue(Q3; 3x) 
18: Enqueue(Q5; 5x) 
19: else 
20: Dequeue(Q5)
22 Preface 
2*min=4 3*min=6 5*min=10 
2 
3 5 
min=2 
4 
3*min=9 5*min=15 
3 6 5 1 0 
min=3 
2*min=8 3*min=12 5*min=20 
4 
6 9 5 1 0 1 5 
min=4 
8 
6 9 1 2 5 1 0 1 5 2 0 
min=5 
5*min=25 
Figure 4: First 4 steps of constructing numbers with Q2, Q3, and Q5. 
1. Queues are initialized with 2, 3, 5 as the only element; 
2. New elements 4, 6, and 10 are pushed back; 
3. New elements 9, and 15, are pushed back; 
4. New elements 8, 12, and 20 are pushed back; 
5. New element 25 is pushed back.
0.3. THE NUMBER PUZZLE, POWER OF DATA STRUCTURE 23 
21: Enqueue(Q5; 5x) 
22: n   n  1 
23: return x 
This algorithm loops n times, and within each loop, it extract one head 
element from the three queues, which takes constant time. Then it appends 
one to three new elements at the end of queues which bounds to constant time 
too. So the total time of the algorithm bounds to O(n). The C++ program 
translated from this algorithm shown below takes less than 1 s to produce the 
1500th number, 859963392. 
typedef unsigned long Integer; 
Integer get_number(int n){ 
if(n==1) 
return 1; 
queueInteger Q2, Q3, Q5; 
Q2.push(2); 
Q3.push(3); 
Q5.push(5); 
Integer x; 
while(n--  1){ 
x = min(min(Q2.front(), Q3.front()), Q5.front()); 
if(x==Q2.front()){ 
Q2.pop(); 
Q2.push(x2); 
Q3.push(x3); 
Q5.push(x5); 
} 
else if(x==Q3.front()){ 
Q3.pop(); 
Q3.push(x3); 
Q5.push(x5); 
} 
else{ 
Q5.pop(); 
Q5.push(x5); 
} 
} 
return x; 
} 
This solution can be also implemented in Functional way. We de
ne a func-tion 
take(n), which will return the
rst n numbers contains only factor 2, 3, or 
5. 
take(n) = f(n; f1g; f2g; f3g; f5g) 
Where 
f(n;X;Q2;Q3;Q5) = 
 
X : n = 1 
f(n  1;X [ fxg;Q0 
2;Q0 
3;Q0 
5) : otherwise 
x = min(Q21;Q31;Q51)
24 Preface 
0 
2;Q 
Q 
0 
3;Q 
0 
5 = 
8 
: 
fQ22;Q23; :::g [ f2xg;Q3 [ f3xg;Q5 [ f5xg : x = Q21 
Q2; fQ32;Q33; :::g [ f3xg;Q5 [ f5xg : x = Q31 
Q2;Q3; fQ52;Q53; :::g [ f5xg : x = Q51 
And these functional de
nition can be realized in Haskell as the following. 
ks 1 xs _ = xs 
ks n xs (q2, q3, q5) = ks (n-1) (xs++[x]) update 
where 
x = minimum $ map head [q2, q3, q5] 
update j x == head q2 = ((tail q2)++[x2], q3++[x3], q5++[x5]) 
j x == head q3 = (q2, (tail q3)++[x3], q5++[x5]) 
j otherwise = (q2, q3, (tail q5)++[x5]) 
takeN n = ks n [1] ([2], [3], [5]) 
Invoke `last takeN 1500' will generate the correct answer 859963392. 
0.4 Notes and short summary 
If review the 2 puzzles, we found in both cases, the brute-force solutions are so 
weak. In the
rst problem, it's quite poor in dealing with long ID list, while in 
the second problem, it doesn't work at all. 
The
rst problem shows the power of algorithms, while the second problem 
tells why data structure is important. There are plenty of interesting problems, 
which are hard to solve before computer was invented. With the aid of com-puter 
and programming, we are able to
nd the answer in a quite dierent way. 
Compare to what we learned in mathematics course in school, we haven't been 
taught the method like this. 
While there have been already a lot of wonderful books about algorithms, 
data structures and math, however, few of them provide the comparison between 
the procedural solution and the functional solution. From the above discussion, 
it can be found that functional solution sometimes is very expressive and they 
are close to what we are familiar in mathematics. 
This series of post focus on providing both imperative and functional algo-rithms 
and data structures. Many functional data structures can be referenced 
from Okasaki's book[6]. While the imperative ones can be founded in classic 
text books [2] or even in WIKIpedia. Multiple programming languages, includ-ing, 
C, C++, Python, Haskell, and Scheme/Lisp will be used. In order to make 
it easy to read by programmers with dierent background, pseudo code and 
mathematical function are the regular descriptions of each post. 
The author is NOT a native English speaker, the reason why this book is 
only available in English for the time being is because the contents are still 
changing frequently. Any feedback, comments, or criticizes are welcome. 
0.5 Structure of the contents 
In the following series of post, I'll
rst introduce about elementary data struc-tures 
before algorithms, because many algorithms need knowledge of data struc-tures 
as prerequisite.
0.6. APPENDIX 25 
The `hello world' data structure, binary search tree is the
rst topic; Then 
we introduce how to solve the balance problem of binary search tree. After 
that, I'll show other interesting trees. Trie, Patricia, sux trees are useful in 
text manipulation. While B-trees are commonly used in
le system and data 
base implementation. 
The second part of data structures is about heaps. We'll provide a gen-eral 
Heap de
nition and introduce about binary heaps by array and by explicit 
binary trees. Then we'll extend to K-ary heaps including Binomial heaps, Fi-bonacci 
heaps, and pairing heaps. 
Array and queues are considered among the easiest data structures typically, 
However, we'll show how dicult to implement them in the third part. 
As the elementary sort algorithms, we'll introduce insertion sort, quick sort, 
merge sort etc in both imperative way and functional way. 
The
nal part is about searching, besides the element searching, we'll also 
show string matching algorithms such as KMP. 
All the posts are provided under GNU FDL (Free document license), and 
programs are under GNU GPL. 
0.6 Appendix 
All programs provided along with this book are free for downloading. download 
position: http://sites.google.com/site/algoxy/home
26 Preface
Bibliography 
[1] Richard Bird. Pearls of functional algorithm design. Cambridge Univer-sity 
Press; 1 edition (November 1, 2010). ISBN-10: 0521513383 
[2] Jon Bentley. Programming Pearls(2nd Edition). Addison-Wesley Profes-sional; 
2 edition (October 7, 1999). ISBN-13: 978-0201657883 
[3] Chris Okasaki. Purely Functional Data Structures. Cambridge university 
press, (July 1, 1999), ISBN-13: 978-0521663502 
[4] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cliord 
Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. 
ISBN: 0262032937. 
27
28 BIBLIOGRAPHY
Part II 
Trees 
29
Chapter 1 
Binary search tree, the 
`hello world' data structure 
1.1 Introduction 
Arrays or lists are typically considered the `hello world' data structures. How-ever, 
we'll see they are not actually particularly easy to implement. In some 
procedural settings, arrays are the most elementary data structures, and it is 
possible to implement linked lists using arrays (see section 10.3 in [2]). On the 
other hand, in some functional settings, linked lists are the elementary building 
blocks used to create arrays and other data structures. 
Considering these factors, we start with Binary Search Trees (or BST) as the 
`hello world' data structure using an interesting problem Jon Bentley mentioned 
in `Programming Pearls' [2]. The problem is to count the number of times each 
word occurs in a large text. One solution in C++ is below: 
int main(int, char ){ 
mapstring, int dict; 
string s; 
while(cins) 
++dict[s]; 
mapstring, int::iterator it=dict.begin(); 
for(; it!=dict.end(); ++it) 
coutit!first: it!secondn; 
} 
And we can run it to produce the result using the following UNIX commands 
1. 
$ g++ wordcount.cpp -o wordcount 
$ cat bbe.txt | ./wordcount  wc.txt 
The map provided in the standard template library is a kind of balanced 
BST with augmented data. Here we use the words in the text as the keys and 
the number of occurrences as the augmented data. This program is fast, and 
1This is not a UNIX unique command, in Windows OS, it can be achieved by: 
type bbe:txtjwordcount:exe  wc:txt 
31
32CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
it re
ects the power of BSTs. We'll introduce how to implement BSTs in this 
section and show how to balance them in a later section. 
Before we dive into BSTs, let's
rst introduce the more general binary tree. 
Binary trees are recursively de
ned. BSTs are just one type of binary tree. 
A binary tree is usually de
ned in the following way. 
A binary tree is 
 either an empty node; 
 or a node containing 3 parts: a value, a left child which is a binary tree 
and a right child which is also a binary tree. 
Figure 1.1 shows this concept and an example binary tree. 
k 
L R 
(a) Concept of binary tree 
1 6 
4 1 0 
1 4 7 
2 8 1 
9 3 
(b) An example binary tree 
Figure 1.1: Binary tree concept and an example. 
A BST is a binary tree where the following applies to each node: 
 all the values in left child tree are less than the value of this node; 
 the value of this node is less than any values in its right child tree. 
Figure 1.2 shows an example of binary search tree. Comparing with Figure 
1.1, we can see the dierences in how keys are ordered between them.
1.2. DATA LAYOUT 33 
4 
3 8 
1 
2 
7 1 6 
1 0 
9 1 4 
Figure 1.2: A Binary search tree example. 
1.2 Data Layout 
Based on the recursive de
nition of BSTs, we can draw the data layout in 
procedural setting with pointers as in Figure 1.3. 
The node contains a
eld of key, which can be augmented with satellite data; 
a
eld contains a pointer to the left child and a
eld point to the right child. In 
order to back-track an ancestor easily, a parent
eld can be provided as well. 
In this post, we'll ignore the satellite data for simple illustration purpose. 
Based on this layout, the node of binary search tree can be de
ned in a proce-dural 
language, such as C++ as the following. 
templateclass T 
struct node{ 
node(T x):key(x), left(0), right(0), parent(0){} 
~node(){ 
delete left; 
delete right; 
} 
node left; 
node right; 
node parent; ==parent is optional, it's helpful for succ=pred 
T key; 
}; 
There is another setting, for instance in Scheme/Lisp languages, the ele-mentary 
data structure is linked-list. Figure 1.4 shows how a binary search tree 
node can be built on top of linked-list.
34CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
key + satellite data 
left 
right 
parent 
key + satellite data 
left 
right 
parent 
key + satellite data 
left 
right 
parent 
... ... ... ... 
Figure 1.3: Layout of nodes with parent
eld. 
key next 
left ... next 
right ... NIL 
Figure 1.4: Binary search tree node layout on top of linked list. Where `left...' 
and 'right ...' are either empty or binary search tree node composed in the same 
way.
1.3. INSERTION 35 
Because in pure functional setting, It's hard to use pointer for back tracking 
the ancestors, (and typically, there is no need to do back tracking, since we can 
provide top-down solution in recursive way) there is not `parent'
eld in such 
layout. 
For simpli
ed reason, we'll skip the detailed layout in the future, and only 
focus on the logic layout of data structures. For example, below is the de
nition 
of binary search tree node in Haskell. 
data Tree a = Empty 
j Node (Tree a) a (Tree a) 
1.3 Insertion 
To insert a key k (may be along with a value in practice) to a binary search tree 
T, we can follow a quite straight forward way. 
 If the tree is empty, then construct a leave node with key=k; 
 If k is less than the key of root node, insert it to the left child; 
 If k is greater than the key of root, insert it to the right child; 
There is an exceptional case that if k is equal to the key of root, it means it 
has already existed, we can either overwrite the data, or just do nothing. For 
simple reason, this case is skipped in this post. 
This algorithm is described recursively. It is so simple that is why we consider 
binary search tree is `hello world' data structure. Formally, the algorithm can 
be represented with a recursive function. 
insert(T; k) = 
8 
: 
node(; k; ) : T =  
node(insert(L; k);Key;R) : k  Key 
node(L;Key; insert(R; k)) : otherwise 
(1.1) 
Where 
L = left(T) 
R = right(T) 
Key = key(T) 
The node function creates a new node with given left sub-tree, a key and a 
right sub-tree as parameters.  means NIL or Empty. function left, right and 
key are access functions which can get the left sub-tree, right sub-tree and the 
key of a node. 
Translate the above functions directly to Haskell yields the following pro-gram. 
insert::(Ord a) ) Tree a ! a ! Tree a 
insert Empty k = Node Empty k Empty 
insert (Node l x r) k j k  x = Node (insert l k) x r 
j otherwise = Node l x (insert r k) 
This program utilized the pattern matching features provided by the lan-guage. 
However, even in functional settings without this feature, for instance, 
Scheme/Lisp, the program is still expressive.
36CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
(define (insert tree x) 
(cond ((null? tree) (list '() x '())) 
(( x (key tree)) 
(make-tree (insert (left tree) x) 
(key tree) 
(right tree))) 
(( x (key tree)) 
(make-tree (left tree) 
(key tree) 
(insert (right tree) x))))) 
It is possible to turn the algorithm completely into imperative way without 
recursion. 
1: function Insert(T; k) 
2: root   T 
3: x   Create-Leaf(k) 
4: parent   NIL 
5: while T6= NIL do 
6: parent   T 
7: if k  Key(T) then 
8: T   Left(T) 
9: else 
10: T   Right(T) 
11: Parent(x)   parent 
12: if parent = NIL then . tree T is empty 
13: return x 
14: else if k  Key(parent) then 
15: Left(parent)   x 
16: else 
17: Right(parent)   x 
18: return root 
19: function Create-Leaf(k) 
20: x   Empty-Node 
21: Key(x)   k 
22: Left(x)   NIL 
23: Right(x)   NIL 
24: Parent(x)   NIL 
25: return x 
Compare with the functional algorithm, it is obviously that this one is more 
complex although it is fast and can handle very deep tree. A complete C++ 
program and a python program are available along with this post for reference. 
1.4 Traversing 
Traversing means visiting every element one by one in a binary search tree. 
There are 3 ways to traverse a binary tree, pre-order tree walk, in-order tree 
walk, and post-order tree walk. The names of these traversing methods highlight 
the order of when we visit the root of a binary search tree.
1.4. TRAVERSING 37 
Since there are three parts in a tree, as left child, the root, which con-tains 
the key and satellite data, and the right child. If we denote them as 
(left; current; right), the three traversing methods are de
ned as the following. 
 pre-order traverse, visit current, then left,
nally right; 
 in-order traverse, visit left , then current,
nally right; 
 post-order traverse, visit left, then right,
nally current. 
Note that each visiting operation is recursive. And we see the order of 
visiting current determines the name of the traversing method. 
For the binary search tree shown in
gure 1.2, below are the three dierent 
traversing results. 
 pre-order traverse result: 4, 3, 1, 2, 8, 7, 16, 10, 9, 14; 
 in-order traverse result: 1, 2, 3, 4, 7, 8, 9, 10, 14, 16; 
 post-order traverse result: 2, 1, 3, 7, 9, 14, 10, 16, 8, 4; 
It can be found that the in-order walk of a binary search tree outputs the 
elements in increase order, which is particularly helpful. The de
nition of binary 
search tree ensures this interesting property, while the proof of this fact is left 
as an exercise of this post. 
In-order tree walk algorithm can be described as the following: 
 If the tree is empty, just return; 
 traverse the left child by in-order walk, then access the key,
nally traverse 
the right child by in-order walk. 
Translate the above description yields a generic map function 
map(f; T) = 
 
 : T =  
node(l0; k0; r0) : otherwise 
(1.2) 
where 
l0 = map(f; left(T)) 
r0 = map(f; right(T)) 
k0 = f(key(T)) 
If we only need access the key without create the transformed tree, we can 
realize this algorithm in procedural way lie the below C++ program. 
templateclass T, class F 
void in_order_walk(nodeT t, F f){ 
if(t){ 
in_order_walk(t!left, f); 
f(t!value); 
in_order_walk(t!right, f); 
} 
}
38CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
The function takes a parameter f, it can be a real function, or a function 
object, this program will apply f to the node by in-order tree walk. 
We can simpli
ed this algorithm one more step to de
ne a function which 
turns a binary search tree to a sorted list by in-order traversing. 
toList(T) = 
 
 : T =  
toList(left(T)) [ fkey(T)g [ toList(right(T)) : otherwise 
(1.3) 
Below is the Haskell program based on this de
nition. 
toList::(Ord a))Tree a ! [a] 
toList Empty = [] 
toList (Node l x r) = toList l ++ [x] ++ toList r 
This provides us a method to sort a list of elements. We can
rst build 
a binary search tree from the list, then output the tree by in-order traversing. 
This method is called as `tree sort'. Let's denote the list X = fx1; x2; x3; :::; xng. 
sort(X) = toList(fromList(X)) (1.4) 
And we can write it in function composition form. 
sort = toList:fromList 
Where function fromList repeatedly insert every element to a binary search 
tree. 
fromList(X) = foldL(insert; ;X) (1.5) 
It can also be written in partial application form like below. 
fromList = foldL insert  
For the readers who are not familiar with folding from left, this function can 
also be de
ned recursively as the following. 
fromList(X) = 
 
 : X =  
insert(fromList(fx2; x3; :::; xng); x1) : otherwise 
We'll intense use folding function as well as the function composition and 
partial evaluation in the future, please refer to appendix of this book or [6] [7] 
and [8] for more information. 
Exercise 1.1 
 Given the in-order traverse result and pre-order traverse result, can you re-construct 
the tree from these result and
gure out the post-order traversing 
result? 
Pre-order result: 1, 2, 4, 3, 5, 6; In-order result: 4, 2, 1, 5, 3, 6; Post-order 
result: ? 
 Write a program in your favorite language to re-construct the binary tree 
from pre-order result and in-order result.
1.5. QUERYING A BINARY SEARCH TREE 39 
 Prove why in-order walk output the elements stored in a binary search 
tree in increase order? 
 Can you analyze the performance of tree sort with big-O notation? 
1.5 Querying a binary search tree 
There are three types of querying for binary search tree, searching a key in the 
tree,
nd the minimum or maximum element in the tree, and
nd the predecessor 
or successor of an element in the tree. 
1.5.1 Looking up 
According to the de
nition of binary search tree, search a key in a tree can be 
realized as the following. 
 If the tree is empty, the searching fails; 
 If the key of the root is equal to the value to be found, the search succeed. 
The root is returned as the result; 
 If the value is less than the key of the root, search in the left child. 
 Else, which means that the value is greater than the key of the root, search 
in the right child. 
This algorithm can be described with a recursive function as below. 
lookup(T; x) = 
8 
: 
 : T =  
T : key(T) = x 
lookup(left(T); x) : x  key(T) 
lookup(right(T); x) : otherwise 
(1.6) 
In the real application, we may return the satellite data instead of the node 
as the search result. This algorithm is simple and straightforward. Here is a 
translation of Haskell program. 
lookup::(Ord a)) Tree a ! a ! Tree a 
lookup Empty _ = Empty 
lookup t@(Node l k r) x j k == x = t 
j x  k = lookup l x 
j otherwise = lookup r x 
If the binary search tree is well balanced, which means that almost all nodes 
have both non-NIL left child and right child, for N elements, the search algo-rithm 
takes O(lgN) time to perform. This is not formal de
nition of balance. 
We'll show it in later post about red-black-tree. If the tree is poor balanced, 
the worst case takes O(N) time to search for a key. If we denote the height of 
the tree as h, we can uniform the performance of the algorithm as O(h). 
The search algorithm can also be realized without using recursion in a pro-cedural 
manner. 
1: function Search(T; x) 
2: while T6= NIL^ Key(T)6= x do
40CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
3: if x  Key(T) then 
4: T   Left(T) 
5: else 
6: T   Right(T) 
7: return T 
Below is the C++ program based on this algorithm. 
templateclass T 
nodeT search(nodeT t, T x){ 
while(t  t!key!=x){ 
if(x  t!key) t=t!left; 
else t=t!right; 
} 
return t; 
} 
1.5.2 Minimum and maximum 
Minimum and maximum can be implemented from the property of binary search 
tree, less keys are always in left child, and greater keys are in right. 
For minimum, we can continue traverse the left sub tree until it is empty. 
While for maximum, we traverse the right. 
min(T) = 
 
key(T) : left(T) =  
min(left(T)) : otherwise 
(1.7) 
max(T) = 
 
key(T) : right(T) =  
max(right(T)) : otherwise 
(1.8) 
Both function bound to O(h) time, where h is the height of the tree. For the 
balanced binary search tree, min/max are bound to O(lgN) time, while they 
are O(N) in the worst cases. 
We skip translating them to programs, It's also possible to implement them 
in pure procedural way without using recursion. 
1.5.3 Successor and predecessor 
The last kind of querying, to
nd the successor or predecessor of an element 
is useful when a tree is treated as a generic container and traversed by using 
iterator. It will be relative easier to implement if parent of a node can be 
accessed directly. 
It seems that the functional solution is hard to be found, because there is no 
pointer like
eld linking to the parent node. One solution is to left `breadcrumbs' 
when we visit the tree, and use these information to back-track or even re-construct 
the whole tree. Such data structure, that contains both the tree and 
`breadcrumbs' is called zipper. please refer to [9] for details. 
However, If we consider the original purpose of providing succ/pred func-tion, 
`to traverse all the binary search tree elements one by one` as a generic 
container, we realize that they don't make signi
cant sense in functional settings 
because we can traverse the tree in increase order by mapT function we de
ned 
previously.
1.5. QUERYING A BINARY SEARCH TREE 41 
We'll meet many problems in this series of post that they are only valid in 
imperative settings, and they are not meaningful problems in functional settings 
at all. One good example is how to delete an element in red-black-tree[3]. 
In this section, we'll only present the imperative algorithm for
nding the 
successor and predecessor in a binary search tree. 
When
nding the successor of element x, which is the smallest one y that 
satis
es y  x, there are two cases. If the node with value x has non-NIL right 
child, the minimum element in right child is the answer; For example, in Figure 
1.2, in order to
nd the successor of 8, we search it's right sub tree for the 
minimum one, which yields 9 as the result. While if node x don't have right 
child, we need back-track to
nd the closest ancestors whose left child is also 
ancestor of x. In Figure 1.2, since 2 don't have right sub tree, we go back to its 
parent 1. However, node 1 don't have left child, so we go back again and reach 
to node 3, the left child of 3, is also ancestor of 2, thus, 3 is the successor of 
node 2. 
Based on this description, the algorithm can be given as the following. 
1: function Succ(x) 
2: if Right(x)6= NIL then 
3: return Min(Right(x)) 
4: else 
5: p   Parent(x) 
6: while p6= NIL and x = Right(p) do 
7: x   p 
8: p   Parent(p) 
9: return p 
The predecessor case is quite similar to the successor algorithm, they are 
symmetrical to each other. 
1: function Pred(x) 
2: if Left(x)6= NIL then 
3: return Max(Left(x)) 
4: else 
5: p   Parent(x) 
6: while p6= NIL and x = Left(p) do 
7: x   p 
8: p   Parent(p) 
9: return p 
Below are the Python programs based on these algorithms. They are changed 
a bit in while loop conditions. 
def succ(x): 
if x.right is not None: return tree_min(x.right) 
p = x.parent 
while p is not None and p.left != x: 
x = p 
p = p.parent 
return p 
def pred(x): 
if x.left is not None: return tree_max(x.left) 
p = x.parent
42CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
while p is not None and p.right != x: 
x = p 
p = p.parent 
return p 
Exercise 1.2 
 Can you
gure out how to iterate a tree as a generic container by using 
pred()/succ()? What's the performance of such traversing process in terms 
of big-O? 
 A reader discussed about traversing all elements inside a range [a; b]. In 
C++, the algorithm looks like the below code: 
for each(m:lower bound(12); m:upper bound(26); f); 
Can you provide the purely function solution for this problem? 
1.6 Deletion 
Deletion is another `imperative only' topic for binary search tree. This is because 
deletion mutate the tree, while in purely functional settings, we don't modify 
the tree after building it in most application. 
However, One method of deleting element from binary search tree in purely 
functional way is shown in this section. It's actually reconstructing the tree but 
not modifying the tree. 
Deletion is the most complex operation for binary search tree. this is because 
we must keep the BST property, that for any node, all keys in left sub tree are 
less than the key of this node, and they are all less than any keys in right sub 
tree. Deleting a node can break this property. 
In this post, dierent with the algorithm described in [2], A simpler one from 
SGI STL implementation is used.[6] 
To delete a node x from a tree. 
 If x has no child or only one child, splice x out; 
 Otherwise (x has two children), use minimum element of its right sub tree 
to replace x, and splice the original minimum element out. 
The simplicity comes from the truth that, the minimum element is stored in 
a node in the right sub tree, which can't have two non-NIL children. It ends up 
in the trivial case, the the node can be directly splice out from the tree. 
Figure 1.5, 1.6, and 1.7 illustrate these dierent cases when deleting a node 
from the tree. 
Based on this idea, the deletion can be de
ned as the below function. 
delete(T; x) = 
8 
: 
 : T =  
node(delete(L; x);K;R) : x  K 
node(L;K; delete(R; x)) : x  K 
R : x = K ^ L =  
L : x = K ^ R =  
node(L; y; delete(R; y)) : otherwise 
(1.9)
1.6. DELETION 43 
Tree 
x 
NIL NIL 
Figure 1.5: x can be spliced out. 
Tree 
x 
L NIL 
(a) Before delete x 
Tree 
L 
(b) After delete x 
x is spliced out, and replaced by its left child. 
Tree 
x 
NIL R 
(c) Before delete x 
Tree 
R 
(d) Before delete x 
x is spliced out, and replaced by its right child. 
Figure 1.6: Delete a node which has only one non-NIL child.
44CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
Tree 
x 
L R 
(a) Before delete x 
Tree 
min(R) 
L delete(R, min(R)) 
(b) After delete x 
x is replaced by splicing the minimum element from its right child. 
Figure 1.7: Delete a node which has both children. 
Where 
L = left(T) 
R = right(T) 
K = key(T) 
y = min(R) 
Translating the function to Haskell yields the below program. 
delete::(Ord a)) Tree a ! a ! Tree a 
delete Empty _ = Empty 
delete (Node l k r) x j x  k = (Node (delete l x) k r) 
j x  k = (Node l k (delete r x)) 
-- x == k 
j isEmpty l = r 
j isEmpty r = l 
j otherwise = (Node l k' (delete r k')) 
where k' = min r 
Function `isEmpty' is used to test if a tree is empty (). Note that the 
algorithm
rst performs search to locate the node where the element need be 
deleted, after that it execute the deletion. This algorithm takes O(h) time where 
h is the height of the tree. 
It's also possible to pass the node but not the element to the algorithm for 
deletion. Thus the searching is no more needed. 
The imperative algorithm is more complex because it need set the parent 
properly. The function will return the root of the result tree. 
1: function Delete(T; x) 
2: root   T
1.6. DELETION 45 
3: x0   x . save x 
4: parent   Parent(x) 
5: if Left(x) = NIL then 
6: x   Right(x) 
7: else if Right(x) = NIL then 
8: x   Left(x) 
9: else . both children are non-NIL 
10: y   Min(Right(x)) 
11: Key(x)   Key(y) 
12: Copy other satellite data from y to x 
13: if Parent(y)6= x then . y hasn't left sub tree 
14: Left(Parent(y))   Right(y) 
15: else . y is the root of right child of x 
16: Right(x)   Right(y) 
17: Remove y 
18: return root 
19: if x6= NIL then 
20: Parent(x)   parent 
21: if parent = NIL then . We are removing the root of the tree 
22: root   x 
23: else 
24: if Left(parent) = x0 then 
25: Left(parent)   x 
26: else 
27: Right(parent)   x 
28: Remove x0 
29: return root 
Here we assume the node to be deleted is not empty (otherwise we can simply 
returns the original tree). In other cases, it will
rst record the root of the tree, 
create copy pointers to x, and its parent. 
If either of the children is empty, the algorithm just splice x out. If it has 
two non-NIL children, we
rst located the minimum of right child, replace the 
key of x to y's, copy the satellite data as well, then splice y out. Note that there 
is a special case that y is the root node of x's left sub tree. 
Finally we need reset the stored parent if the original x has only one non- 
NIL child. If the parent pointer we copied before is empty, it means that we are 
deleting the root node, so we need return the new root. After the parent is set 
properly, we
nally remove the old x from memory. 
The relative Python program for deleting algorithm is given as below. Be-cause 
Python provides GC, we needn't explicitly remove the node from the 
memory. 
def tree_delete(t, x): 
if x is None: 
return t 
[root, old_x, parent] = [t, x, x.parent] 
if x.left is None: 
x = x.right 
elif x.right is None: 
x = x.left
46CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 
else: 
y = tree_min(x.right) 
x.key = y.key 
if y.parent != x: 
y.parent.left = y.right 
else: 
x.right = y.right 
return root 
if x is not None: 
x.parent = parent 
if parent is None: 
root = x 
else: 
if parent.left == old_x: 
parent.left = x 
else: 
parent.right = x 
return root 
Because the procedure seeks minimum element, it runs in O(h) time on a 
tree of height h. 
Exercise 1.3 
 There is a symmetrical solution for deleting a node which has two non-NIL 
children, to replace the element by splicing the maximum one out o the 
left sub-tree. Write a program to implement this solution. 
1.7 Randomly build binary search tree 
It can be found that all operations given in this post bound to O(h) time for a 
tree of height h. The height aects the performance a lot. For a very unbalanced 
tree, h tends to be O(N), which leads to the worst case. While for balanced 
tree, h close to O(lgN). We can gain the good performance. 
How to make the binary search tree balanced will be discussed in next post. 
However, there exists a simple way. Binary search tree can be randomly built as 
described in [2]. Randomly building can help to avoid (decrease the possibility) 
unbalanced binary trees. The idea is that before building the tree, we can call 
a random process, to shue the elements. 
Exercise 1.4 
 Write a randomly building process for binary search tree. 
1.8 Appendix 
All programs are provided along with this post. They are free for downloading. 
We provided C, C++, Python, Haskell, and Scheme/Lisp programs as example.
Bibliography 
[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cliord 
Stein. Introduction to Algorithms, Second Edition. ISBN:0262032937. 
The MIT Press. 2001 
[2] Jon Bentley. Programming Pearls(2nd Edition). Addison-Wesley Profes-sional; 
2 edition (October 7, 1999). ISBN-13: 978-0201657883 
[3] Chris Okasaki. Ten Years of Purely Functional Data Structures. 
http://okasaki.blogspot.com/2008/02/ten-years-of-purely-functional-data. 
html 
[4] SGI. Standard Template Library Programmer's Guide. 
http://www.sgi.com/tech/stl/ 
[5] http://en.literateprograms.org/Category:Binary search tree 
[6] http://en.wikipedia.org/wiki/Foldl 
[7] http://en.wikipedia.org/wiki/Function composition 
[8] http://en.wikipedia.org/wiki/Partial application 
[9] Miran Lipovaca. Learn You a Haskell for Great Good! A Beginner's 
Guide. the last chapter. No Starch Press; 1 edition April 2011, 400 pp. 
ISBN: 978-1-59327-283-8 
47
48 The evolution of insertion sort
Chapter 2 
The evolution of insertion 
sort 
2.1 Introduction 
In previous chapter, we introduced the 'hello world' data structure, binary 
search tree. In this chapter, we explain insertion sort, which can be think of 
the 'hello world' sorting algorithm 1. It's straightforward, but the performance 
is not as good as some divide and conqueror sorting approaches, such as quick 
sort and merge sort. Thus insertion sort is seldom used as generic sorting utility 
in modern software libraries. We'll analyze the problems why it is slow, and 
trying to improve it bit by bit till we reach the best bound of comparison based 
sorting algorithms, O(n lg n), by evolution to tree sort. And we
nally show the 
connection between the 'hello world' data structure and 'hello world' sorting 
algorithm. 
The idea of insertion sort can be vivid illustrated by a real life poker game[2]. 
Suppose the cards are shued, and a player starts taking card one by one. 
At any time, all cards in player's hand are well sorted. When the player 
gets a new card, he insert it in proper position according to the order of points. 
Figure 2.1 shows this insertion example. 
Based on this idea, the algorithm of insertion sort can be directly given as 
the following. 
function Sort(A) 
X    
for each x 2 A do 
Insert(X; x) 
return X 
It's easy to express this process with folding, which we mentioned in the 
chapter of binary search tree. 
insert = foldL insert  (2.1) 
1Some reader may argue that 'Bubble sort' is the easiest sort algorithm. Bubble sort isn't 
covered in this book as we don't think it's a valuable algorithm[1] 
49
50 CHAPTER 2. THE EVOLUTION OF INSERTION SORT 
Figure 2.1: Insert card 8 to proper position in a deck. 
Note that in above algorithm, we store the sorted result in X, so this isn't 
in-place sorting. It's easy to change it to in-place algorithm. 
function Sort(A) 
for i   2 to Length(A) do 
insert Ai to sorted sequence fA0 
1;A0 
2; :::;A0 
i1 
g 
At any time, when we process the i-th element, all elements before i have 
already been sorted. we continuously insert the current elements until consume 
all the unsorted data. This idea is illustrated as in
gure 9.3. 
insert 
... sorted elements ... x 
... unsorted elements ... 
Figure 2.2: The left part is sorted data, continuously insert elements to sorted 
part. 
We can
nd there is recursive concept in this de
nition. Thus it can be 
expressed as the following. 
sort(A) = 
 
 : A =  
insert(sort(fA2;A3; :::g);A1) : otherwise 
(2.2) 
2.2 Insertion 
We haven't answered the question about how to realize insertion however. It's 
a puzzle how does human locate the proper position so quickly. 
For computer, it's an obvious option to perform a scan. We can either scan 
from left to right or vice versa. However, if the sequence is stored in plain array, 
it's necessary to scan from right to left. 
function Sort(A) 
for i   2 to Length(A) do . Insert A[i] to sorted sequence A[1:::i  1]
2.2. INSERTION 51 
x   A[i] 
j   i  1 
while j  0 ^ x  A[j] do 
A[j + 1]   A[j] 
j   j  1 
A[j + 1]   x 
One may think scan from left to right is natural. However, it isn't as eect 
as above algorithm for plain array. The reason is that, it's expensive to insert an 
element in arbitrary position in an array. As array stores elements continuously. 
If we want to insert new element x in position i, we must shift all elements after 
i, including i+1; i+2; ::: one position to right. After that the cell at position i 
is empty, and we can put x in it. This is illustrated in
gure 2.3. 
x 
insert 
A[1] A[2] ... A[i-1] A[i] A[i+1] A[i+2] ... A[n-1] A[n] empty 
Figure 2.3: Insert x to array A at position i. 
If the length of array is n, this indicates we need examine the
rst i elements, 
then perform n  i + 1 moves, and then insert x to the i-th cell. So insertion 
from left to right need traverse the whole array anyway. While if we scan from 
right to left, we totally examine the last j = n  i + 1 elements, and perform 
the same amount of moves. If j is small (e.g. less than n=2), there is possibility 
to perform less operations than scan from left to right. 
Translate the above algorithm to Python yields the following code. 
def isort(xs): 
n = len(xs) 
for i in range(1, n): 
x = xs[i] 
j = i - 1 
while j  0 and x  xs[j]: 
xs[j+1] = xs[j] 
j = j - 1 
xs[j+1] = x 
It can be found some other equivalent programs, for instance the following 
ANSI C program. However this version isn't as eective as the pseudo code. 
void isort(Key xs, int n){ 
int i, j; 
for(i=1; in; ++i) 
for(j=i-1; j0  xs[j+1]  xs[j]; --j) 
swap(xs, j, j+1); 
}
52 CHAPTER 2. THE EVOLUTION OF INSERTION SORT 
This is because the swapping function, which can exchange two elements 
typically uses a temporary variable like the following: 
void swap(Key xs, int i, int j){ 
Key temp = xs[i]; 
xs[i] = xs[j]; 
xs[j] = temp; 
} 
So the ANSI C program presented above takes 3m times assignment, where 
m is the number of inner loops. While the pseudo code as well as the Python 
program use shift operation instead of swapping. There are n + 2 times assign-ment. 
We can also provide Insert() function explicitly, and call it from the general 
insertion sort algorithm in previous section. We skip the detailed realization here 
and left it as an exercise. 
All the insertion algorithms are bound to O(n), where n is the length of 
the sequence. No matter what dierence among them, such as scan from left 
or from right. Thus the over all performance for insertion sort is quadratic as 
O(n2). 
Exercise 2.1 
 Provide explicit insertion function, and call it with general insertion sort 
algorithm. Please realize it in both procedural way and functional way. 
2.3 Improvement 1 
Let's go back to the question, that why human being can
nd the proper position 
for insertion so quickly. We have shown a solution based on scan. Note the fact 
that at any time, all cards at hands have been well sorted, another possible 
solution is to use binary search to
nd that location. 
We'll explain the search algorithms in other dedicated chapter. Binary search 
is just brie
y introduced for illustration purpose here. 
The algorithm will be changed to call a binary search procedure. 
function Sort(A) 
for i   2 to Length(A) do 
x   A[i] 
p   Binary-Search(A[1:::i  1]; x) 
for j   i down to p do 
A[j]   A[j  1] 
A[p]   x 
Instead of scan elements one by one, binary search utilize the information 
that all elements in slice of array fA1; :::;Ai1g are sorted. Let's assume the 
order is monotonic increase order. To
nd a position j that satis
es Aj1  
x  Aj . We can
rst examine the middle element, for example, Abi=2c. If x is 
less than it, we need next recursively perform binary search in the
rst half of 
the sequence; otherwise, we only need search in last half. 
Every time, we halve the elements to be examined, this search process runs 
O(lg n) time to locate the insertion position. 
function Binary-Search(A; x)
2.4. IMPROVEMENT 2 53 
l   1 
u   1+ Length(A) 
while l  u do 
m   b l+u 
2 
c 
if Am = x then 
return m . Find a duplicated element 
else if Am  x then 
l   m + 1 
else 
u   m 
return l 
The improved insertion sort algorithm is still bound to O(n2), compare to 
previous section, which we use O(n2) times comparison and O(n2) moves, with 
binary search, we just use O(n lg n) times comparison and O(n2) moves. 
The Python program regarding to this algorithm is given below. 
def isort(xs): 
n = len(xs) 
for i in range(1, n): 
x = xs[i] 
p = binary_search(xs[:i], x) 
for j in range(i, p, -1): 
xs[j] = xs[j-1] 
xs[p] = x 
def binary_search(xs, x): 
l = 0 
u = len(xs) 
while l  u: 
m = (l+u)=2 
if xs[m] == x: 
return m 
elif xs[m]  x: 
l = m + 1 
else: 
u = m 
return l 
Exercise 2.2 
Write the binary search in recursive manner. You needn't use purely func-tional 
programming language. 
2.4 Improvement 2 
Although we improve the search time to O(n lg n) in previous section, the num-ber 
of moves is still O(n2). The reason of why movement takes so long time, is 
because the sequence is stored in plain array. The nature of array is continu-ously 
layout data structure, so the insertion operation is expensive. This hints 
us that we can use linked-list setting to represent the sequence. It can improve
54 CHAPTER 2. THE EVOLUTION OF INSERTION SORT 
the insertion operation from O(n) to constant time O(1). 
insert(A; x) = 
8 
: 
fxg : A =  
fxg [ A : x  A1 
fA1g [ insert(fA2;A3; :::Ang; x) : otherwise 
(2.3) 
Translating the algorithm to Haskell yields the below program. 
insert :: (Ord a) ) [a] ! a ! [a] 
insert [] x = [x] 
insert (y:ys) x = if x  y then x:y:ys else y:insert ys x 
And we can complete the two versions of insertion sort program based on 
the
rst two equations in this chapter. 
isort [] = [] 
isort (x:xs) = insert (isort xs) x 
Or we can represent the recursion with folding. 
isort = foldl insert [] 
Linked-list setting solution can also be described imperatively. Suppose 
function Key(x), returns the value of element stored in node x, and Next(x) 
accesses the next node in the linked-list. 
function Insert(L; x) 
p   NIL 
H   L 
while L6= NIL^ Key(L)  Key(x) do 
p   L 
L   Next(L) 
Next(x)   L 
if p6= NIL then 
H   x 
else 
Next(p)   x 
return H 
For example in ANSI C, the linked-list can be de
ned as the following. 
struct node{ 
Key key; 
struct node next; 
}; 
Thus the insert function can be given as below. 
struct node insert(struct node lst, struct node x){ 
struct node p, head; 
p = NULL; 
for(head = lst; lst  x!key  lst!key; lst = lst!next) 
p = lst; 
x!next = lst; 
if(!p) 
return x; 
p!next = x;
2.5. FINAL IMPROVEMENT BY BINARY SEARCH TREE 55 
return head; 
} 
Instead of using explicit linked-list such as by pointer or reference based 
structure. Linked-list can also be realized by another index array. For any 
array element Ai, Nexti stores the index of next element follows Ai. It means 
ANexti is the next element after Ai. 
The insertion algorithm based on this solution is given like below. 
function Insert(A;Next; i) 
j  ? 
while Nextj6= NIL ^ ANextj  Ai do 
j   Nextj 
Nexti   Nextj 
Nextj   i 
Here ? means the head of the Next table. And the relative Python program 
for this algorithm is given as the following. 
def isort(xs): 
n = len(xs) 
next = [-1](n+1) 
for i in range(n): 
insert(xs, next, i) 
return next 
def insert(xs, next, i): 
j = -1 
while next[j] != -1 and xs[next[j]]  xs[i]: 
j = next[j] 
next[j], next[i] = i, next[j] 
Although we change the insertion operation to constant time by using linked-list. 
However, we have to traverse the linked-list to
nd the position, which 
results O(n2) times comparison. This is because linked-list, unlike array, doesn't 
support random access. It means we can't use binary search with linked-list 
setting. 
Exercise 2.3 
 Complete the insertion sort by using linked-list insertion function in your 
favorate imperative programming language. 
 The index based linked-list return the sequence of rearranged index as 
result. Write a program to re-order the original array of elements from 
this result. 
2.5 Final improvement by binary search tree 
It seems that we drive into a corner. We must improve both the comparison 
and the insertion at the same time, or we will end up with O(n2) performance. 
We must use binary search, this is the only way to improve the comparison 
time to O(lg n). On the other hand, we must change the data structure, because 
we can't achieve constant time insertion at a position with plain array.
56 CHAPTER 2. THE EVOLUTION OF INSERTION SORT 
This remind us about our 'hello world' data structure, binary search tree. It 
naturally support binary search from its de
nition. At the same time, We can 
insert a new leaf in binary search tree in O(1) constant time if we already
nd 
the location. 
So the algorithm changes to this. 
function Sort(A) 
T    
for each x 2 A do 
T   Insert-Tree(T; x) 
return To-List(T) 
Where Insert-Tree() and To-List() are described in previous chapter 
about binary search tree. 
As we have analyzed for binary search tree, the performance of tree sort is 
bound to O(n lg n), which is the lower limit of comparison based sort[3]. 
2.6 Short summary 
In this chapter, we present the evolution process of insertion sort. Insertion sort 
is well explained in most textbooks as the
rst sorting algorithm. It has simple 
and straightforward idea, but the performance is quadratic. Some textbooks 
stop here, but we want to show that there exist ways to improve it by dierent 
point of view. We
rst try to save the comparison time by using binary search, 
and then try to save the insertion operation by changing the data structure to 
linked-list. Finally, we combine these two ideas and evolute insertion sort to 
tree sort.
Bibliography 
[1] http://en.wikipedia.org/wiki/Bubble sort 
[2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cliord 
Stein. Introduction to Algorithms, Second Edition. ISBN:0262032937. 
The MIT Press. 2001 
[3] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting 
and Searching (2nd Edition). Addison-Wesley Professional; 2 edition (May 
4, 1998) ISBN-10: 0201896850 ISBN-13: 978-0201896855 
57
58 Red black tree
Chapter 3 
Red-black tree, not so 
complex as it was thought 
3.1 Introduction 
3.1.1 Exploit the binary search tree 
We have shown the power of binary search tree by using it to count the occur-rence 
of every word in Bible. The idea is to use binary search tree as a dictionary 
for counting. 
One may come to the idea that to feed a yellow page book 1 to a binary 
search tree, and use it to look up the phone number for a contact. 
By modifying a bit of the program for word occurrence counting yields the 
following code. 
int main(int, char ){ 
ifstream f(yp.txt); 
mapstring, string dict; 
string name, phone; 
while(fname  fphone) 
dict[name]=phone; 
for(;;){ 
coutnname: ; 
cinname; 
if(dict.find(name)==dict.end()) 
coutnot found; 
else 
coutphone: dict[name]; 
} 
} 
This program works well. However, if you replace the STL map with the bi-nary 
search tree as mentioned previously, the performance will be bad, especially 
when you search some names such as Zara, Zed, Zulu. 
This is because the content of yellow page is typically listed in lexicographic 
order. Which means the name list is in increase order. If we try to insert a 
1A name-phone number contact list book 
59
60CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT 
sequence of number 1, 2, 3, ..., n to a binary search tree. We will get a tree like 
in Figure 3.1. 
1 
2 
3 
... 
n 
Figure 3.1: unbalanced tree 
This is a extreme unbalanced binary search tree. The looking up performs 
O(h) for a tree with height h. In balanced case, we bene
t from binary search 
tree by O(lgN) search time. But in this extreme case, the search time down-graded 
to O(N). It's no better than a normal link-list. 
Exercise 3.1 
 For a very big yellow page list, one may want to speed up the dictionary 
building process by two concurrent tasks (threads or processes). One task 
reads the name-phone pair from the head of the list, while the other one 
reads from the tail. The building terminates when these two tasks meet 
at the middle of the list. What will be the binary search tree looks like 
after building? What if you split the the list more than two and use more 
tasks? 
 Can you
nd any more cases to exploit a binary search tree? Please 
consider the unbalanced trees shown in
gure 3.2. 
3.1.2 How to ensure the balance of the tree 
In order to avoid such case, we can shue the input sequence by randomized 
algorithm, such as described in Section 12.4 in [2]. However, this method doesn't 
always work, for example the input is fed from user interactively, and the tree 
need to be built/updated after each input. 
There are many solutions people have ever found to make binary search tree 
balanced. Many of them rely on the rotation operations to binary search tree. 
Rotation operations change the tree structure while maintain the ordering of 
the elements. Thus it either improve or keep the balance property of the binary 
search tree.
3.1. INTRODUCTION 61 
n 
n-1 
n-2 
... 
1 
(a) 
1 
2 
n 
3 
n-1 
4 
... 
(b) 
m 
m-1 m+1 
m-2 
... 
1 
m+2 
... 
n 
(c) 
Figure 3.2: Some unbalanced trees
62CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT 
In this chapter, we'll
rst introduce about red-black tree, which is one of 
the most popular and widely used self-adjusting balanced binary search tree. In 
next chapter, we'll introduce about AVL tree, which is another intuitive solution; 
In later chapter about binary heaps, we'll show another interesting tree called 
splay tree, which can gradually adjust the the tree to make it more and more 
balanced. 
3.1.3 Tree rotation 
() 
X 
a Y 
b c 
(a) 
Y 
X c 
a b 
(b) 
Figure 3.3: Tree rotation, `rotate-left' transforms the tree from left side to right 
side, and `rotate-right' does the inverse transformation. 
Tree rotation is a kind of special operation that can transform the tree 
structure without changing the in-order traverse result. It based on the fact 
that for a speci
ed ordering, there are multiple binary search trees correspond 
to it. Figure 3.3 shows the tree rotation. For a binary search tree on the left 
side, left rotate transforms it to the tree on the right, and right rotate does the 
inverse transformation. 
Although tree rotation can be realized in procedural way, there exists quite 
simple function description if using pattern matching. 
rotateL(T) = 
 
node(node(a;X; b); Y; c) : pattern(T) = node(a;X; node(b; Y; c)) 
T : otherwise 
(3.1) 
rotateR(T) = 
 
node(a;X; node(b; Y; c)) : pattern(T) = node(node(a;X; b); Y; c)) 
T : otherwise 
(3.2) 
However, the pseudo code dealing imperatively has to set all
elds accord-ingly. 
1: function Left-Rotate(T; x) 
2: p   Parent(x) 
3: y   Right(x) . Assume y6= NIL 
4: a   Left(x) 
5: b   Left(y) 
6: c   Right(y)
3.1. INTRODUCTION 63 
7: Replace(x; y) 
8: Set-Children(x; a; b) 
9: Set-Children(y; x; c) 
10: if p = NIL then 
11: T   y 
12: return T 
13: function Right-Rotate(T; y) 
14: p   Parent(y) 
15: x   Left(y) . Assume x6= NIL 
16: a   Left(x) 
17: b   Right(x) 
18: c   Right(y) 
19: Replace(y; x) 
20: Set-Children(y; b; c) 
21: Set-Children(x; a; y) 
22: if p = NIL then 
23: T   x 
24: return T 
25: function Set-Left(x; y) 
26: Left(x)   y 
27: if y6= NIL then Parent(y)   x 
28: function Set-Right(x; y) 
29: Right(x)   y 
30: if y6= NIL then Parent(y)   x 
31: function Set-Children(x; L;R) 
32: Set-Left(x;L) 
33: Set-Right(x;R) 
34: function Replace(x; y) 
35: if Parent(x) = NIL then 
36: if y6= NIL then Parent(y)   NIL 
37: else if Left(Parent(x)) = x then Set-Left(Parent(x), y) 
38: elseSet-Right(Parent(x), y) 
39: Parent(x)   NIL 
Compare these pseudo codes with the pattern matching functions, the former 
focus on the structure states changing, while the latter focus on the rotation 
process. As the title of this chapter indicated, red-black tree needn't be so 
complex as it was thought. Most traditional algorithm text books use the classic 
procedural way to teach red-black tree, there are several cases need to deal and 
all need carefulness to manipulate the node
elds. However, by changing the 
mind to functional settings, things get intuitive and simple. Although there is 
some performance overhead. 
Most of the content in this chapter is based on Chris Okasaki's work in [2].
64CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT 
3.2 De
nition of red-black tree 
Red-black tree is a type of self-balancing binary search tree[3]. 2 By using color 
changing and rotation, red-black tree provides a very simple and straightforward 
way to keep the tree balanced. 
For a binary search tree, we can augment the nodes with a color
eld, a node 
can be colored either red or black. We call a binary search tree red-black tree 
if it satis
es the following 5 properties[2]. 
1. Every node is either red or black. 
2. The root is black. 
3. Every leaf (NIL) is black. 
4. If a node is red, then both its children are black. 
5. For each node, all paths from the node to descendant leaves contain the 
same number of black nodes. 
Why this 5 properties can ensure the red-black tree is well balanced? Because 
they have a key characteristic, the longest path from root to a leaf can't be as 
2 times longer than the shortest path. 
Please note the 4-th property, which means there won't be two adjacent red 
nodes. so the shortest path only contains black nodes, any paths longer than 
the shortest one has interval red nodes. According to property 5, all paths have 
the same number of black nodes, this
nally ensure there won't be any path is 
2 times longer than others[3]. Figure 3.4 shows an example red-black tree. 
1 3 
8 1 7 
1 1 1 
6 
1 5 2 5 
2 2 2 7 
Figure 3.4: An example red-black tree 
2Red-black tree is one of the equivalent form of 2-3-4 tree (see chapter B-tree about 2-3-4 
tree). That is to say, for any 2-3-4 tree, there is at least one red-black tree has the same data 
order.
3.3. INSERTION 65 
All read only operations such as search, min/max are as same as in binary 
search tree. While only the insertion and deletion are special. 
As we have shown in word occurrence example, many implementation of 
set or map container are based on red-black tree. One example is the C++ 
Standard library (STL)[6]. 
As mentioned previously, the only change in data layout is that there is color 
information augmented to binary search tree. This can be represented as a data
eld in imperative languages such as C++ like below. 
enum Color {Red, Black}; 
template class T 
struct node{ 
Color color; 
T key; 
node left; 
node right; 
node parent; 
}; 
In functional settings, we can add the color information in constructors, 
below is the Haskell example of red-black tree de
nition. 
data Color = R j B 
data RBTree a = Empty 
j Node Color (RBTree a) a (RBTree a) 
Exercise 3.2 
 Can you prove that a red-black tree with n nodes has height at most 
2 lg(n + 1)? 
3.3 Insertion 
Inserting a new node as what has been mentioned in binary search tree may 
cause the tree unbalanced. The red-black properties has to be maintained, so 
we need do some
xing by transform the tree after insertion. 
When we insert a new key, one good practice is to always insert it as a red 
node. As far as the new inserted node isn't the root of the tree, we can keep all 
properties except the 4-th one. that it may bring two adjacent red nodes. 
Functional and procedural implementation have dierent
xing methods. 
One is intuitive but has some overhead, the other is a bit complex but has 
higher performance. Most text books about algorithm introduce the later one. 
In this chapter, we focus on the former to show how easily a red-black tree 
insertion algorithm can be realized. The traditional procedural method will be 
given only for comparison purpose. 
As described by Chris Okasaki, there are total 4 cases which violate property 
4. All of them has 2 adjacent red nodes. However, they have a uniformed form 
after
xing[2] as shown in
gure 4.3. 
Note that this transformation will move the redness one level up. So this is a 
bottom-up recursive
xing, the last step will make the root node red. According
66CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT 
@ 
@ 
@R 
 
 
 
 
 
	 
@ 
@I 
@ 
z 
y D 
x C 
A B 
z 
x D 
A y 
B C 
x 
A y 
B z 
C D 
x 
A z 
y D 
B C 
y 
x z 
A B C D 
Figure 3.5: 4 cases for balancing a red-black tree after insertion
3.3. INSERTION 67 
to property 2, root is always black. Thus we need
nal
xing to revert the root 
color to black. 
Observing that the 4 cases and the
xed result have strong pattern features, 
the
xing function can be de
ned by using the similar method we mentioned in 
tree rotation. To avoid too long formula, we abbreviate Color as C, Black as 
B, and Red as R. 
balance(T) = 
 
node(R; node(B; A; x;B); y; node(B;C; z;D)) : match(T) 
T : otherwise 
(3.3) 
where function node() can construct a red-black tree node with 4 parameters 
as color, the left child, the key and the right child. Function match() can test 
if a tree is one of the 4 possible patterns as the following. 
match(T) = 
pattern(T) = node(B; node(R; node(R; A; x;B); y;C); z;D)_ 
pattern(T) = node(B; node(R; A; x; node(R;B; y;C); z;D))_ 
pattern(T) = node(B; A; x; node(R;B; y; node(R;C; z;D)))_ 
pattern(T) = node(B; A; x; node(R; node(R;B; y;C); z;D)) 
With function balance() de
ned, we can modify the previous binary search 
tree insertion functions to make it work for red-black tree. 
insert(T; k) = makeBlack(ins(T; k)) (3.4) 
where 
ins(T; k) = 
8 
: 
node(R; ; k; ) : T =  
balance(node(ins(L; k);Key;R)) : k  Key 
balance(node(L;Key; ins(R; k))) : otherwise 
(3.5) 
L;R;Key represent the left child, right child and the key of a tree. 
L = left(T) 
R = right(T) 
Key = key(T) 
Function makeBlack() is de
ned as the following, it forces the color of a 
non-empty tree to be black. 
makeBlack(T) = node(B; L;Key;R) (3.6) 
Summarize the above functions and use language supported pattern match-ing 
features, we can come to the following Haskell program. 
insert::(Ord a))RBTree a ! a ! RBTree a 
insert t x = makeBlack $ ins t where 
ins Empty = Node R Empty x Empty 
ins (Node color l k r) 
j x  k = balance color (ins l) k r 
j otherwise = balance color l k (ins r) --[3] 
makeBlack(Node _ l k r) = Node B l k r
68CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT 
balance::Color ! RBTree a ! a ! RBTree a ! RBTree a 
balance B (Node R (Node R a x b) y c) z d = 
Node R (Node B a x b) y (Node B c z d) 
balance B (Node R a x (Node R b y c)) z d = 
Node R (Node B a x b) y (Node B c z d) 
balance B a x (Node R b y (Node R c z d)) = 
Node R (Node B a x b) y (Node B c z d) 
balance B a x (Node R (Node R b y c) z d) = 
Node R (Node B a x b) y (Node B c z d) 
balance color l k r = Node color l k r 
Note that the 'balance' function is changed a bit from the original de
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms
Elementary algorithms

More Related Content

What's hot

Introduction to Programming Using Java v. 7 - David J Eck - Inglês
Introduction to Programming Using Java v. 7 - David J Eck - InglêsIntroduction to Programming Using Java v. 7 - David J Eck - Inglês
Introduction to Programming Using Java v. 7 - David J Eck - InglêsMarcelo Negreiros
 
Advanced cardiovascular exercites
Advanced cardiovascular exercitesAdvanced cardiovascular exercites
Advanced cardiovascular exercitesMaria Lopez
 
Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...
Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...
Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...ssuserd6b1fd
 
A practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinoldA practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinoldFaruqolayinkaSalako
 
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...ssuserd6b1fd
 
Basic ForTran Programming - for Beginners - An Introduction by Arun Umrao
Basic ForTran Programming - for Beginners - An Introduction by Arun UmraoBasic ForTran Programming - for Beginners - An Introduction by Arun Umrao
Basic ForTran Programming - for Beginners - An Introduction by Arun Umraossuserd6b1fd
 
Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...
Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...
Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...ssuserd6b1fd
 
Perl <b>5 Tutorial</b>, First Edition
Perl <b>5 Tutorial</b>, First EditionPerl <b>5 Tutorial</b>, First Edition
Perl <b>5 Tutorial</b>, First Editiontutorialsruby
 
Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...
Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...
Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...ssuserd6b1fd
 

What's hot (15)

Dsa
DsaDsa
Dsa
 
10.1.1.652.4894
10.1.1.652.489410.1.1.652.4894
10.1.1.652.4894
 
tutorial.pdf
tutorial.pdftutorial.pdf
tutorial.pdf
 
Introduction to Programming Using Java v. 7 - David J Eck - Inglês
Introduction to Programming Using Java v. 7 - David J Eck - InglêsIntroduction to Programming Using Java v. 7 - David J Eck - Inglês
Introduction to Programming Using Java v. 7 - David J Eck - Inglês
 
Advanced cardiovascular exercites
Advanced cardiovascular exercitesAdvanced cardiovascular exercites
Advanced cardiovascular exercites
 
Ns doc
Ns docNs doc
Ns doc
 
Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...
Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...
Notes and Description for Xcos Scilab Block Simulation with Suitable Examples...
 
A practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinoldA practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinold
 
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
 
Dsa
DsaDsa
Dsa
 
Basic ForTran Programming - for Beginners - An Introduction by Arun Umrao
Basic ForTran Programming - for Beginners - An Introduction by Arun UmraoBasic ForTran Programming - for Beginners - An Introduction by Arun Umrao
Basic ForTran Programming - for Beginners - An Introduction by Arun Umrao
 
Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...
Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...
Notes for C Programming for MCA, BCA, B. Tech CSE, ECE and MSC (CS) 1 of 5 by...
 
Na 20130603
Na 20130603Na 20130603
Na 20130603
 
Perl <b>5 Tutorial</b>, First Edition
Perl <b>5 Tutorial</b>, First EditionPerl <b>5 Tutorial</b>, First Edition
Perl <b>5 Tutorial</b>, First Edition
 
Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...
Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...
Notes of 8051 Micro Controller for BCA, MCA, MSC (CS), MSC (IT) & AMIE IEI- b...
 

Viewers also liked

Hr interview-questions
Hr interview-questionsHr interview-questions
Hr interview-questionssaurabh goel
 
Good and Bad STUDY HABITS
Good and Bad STUDY HABITSGood and Bad STUDY HABITS
Good and Bad STUDY HABITSsaurabh goel
 
Agile software development
Agile software development Agile software development
Agile software development saurabh goel
 
Abstract on Implementation of LEACH Protocol for WSN
Abstract on Implementation of LEACH Protocol for WSNAbstract on Implementation of LEACH Protocol for WSN
Abstract on Implementation of LEACH Protocol for WSNsaurabh goel
 
Private sector bank home loans
Private sector bank home loansPrivate sector bank home loans
Private sector bank home loanssaurabh goel
 
Sharon100questions
Sharon100questionsSharon100questions
Sharon100questionssaurabh goel
 
Advancements in Computer Graphics
Advancements in Computer GraphicsAdvancements in Computer Graphics
Advancements in Computer Graphicssaurabh goel
 

Viewers also liked (13)

Hr interview-questions
Hr interview-questionsHr interview-questions
Hr interview-questions
 
Art of war
Art of warArt of war
Art of war
 
Good and Bad STUDY HABITS
Good and Bad STUDY HABITSGood and Bad STUDY HABITS
Good and Bad STUDY HABITS
 
Maglev Trains
Maglev TrainsMaglev Trains
Maglev Trains
 
Business sutra
Business sutraBusiness sutra
Business sutra
 
Asian paints swot
Asian paints swotAsian paints swot
Asian paints swot
 
Agile software development
Agile software development Agile software development
Agile software development
 
LEACH Protocol
LEACH ProtocolLEACH Protocol
LEACH Protocol
 
Abstract on Implementation of LEACH Protocol for WSN
Abstract on Implementation of LEACH Protocol for WSNAbstract on Implementation of LEACH Protocol for WSN
Abstract on Implementation of LEACH Protocol for WSN
 
Private sector bank home loans
Private sector bank home loansPrivate sector bank home loans
Private sector bank home loans
 
Sharon100questions
Sharon100questionsSharon100questions
Sharon100questions
 
Advancements in Computer Graphics
Advancements in Computer GraphicsAdvancements in Computer Graphics
Advancements in Computer Graphics
 
Win interviews
Win interviewsWin interviews
Win interviews
 

Similar to Elementary algorithms

Algorithmic Problem Solving with Python.pdf
Algorithmic Problem Solving with Python.pdfAlgorithmic Problem Solving with Python.pdf
Algorithmic Problem Solving with Python.pdfEmily Smith
 
Python_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdf
Python_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdfPython_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdf
Python_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdfjankoabel2022
 
A_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdfA_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdfssuser7fcce2
 
A practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinoldA practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinoldthe_matrix
 
A Practical Introduction To Python Programming
A Practical Introduction To Python ProgrammingA Practical Introduction To Python Programming
A Practical Introduction To Python ProgrammingNat Rice
 
A_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdfA_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdfTariqSaeed80
 
Algorithms for programers
Algorithms for programersAlgorithms for programers
Algorithms for programersRicardo Garcia
 
javanotes5.pdf
javanotes5.pdfjavanotes5.pdf
javanotes5.pdfkmspega
 
An Introduction to MATLAB for Geoscientists.pdf
An Introduction to MATLAB for Geoscientists.pdfAn Introduction to MATLAB for Geoscientists.pdf
An Introduction to MATLAB for Geoscientists.pdfvishnuraj764102
 
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...valentincivil
 
Data struture and aligorism
Data struture and aligorismData struture and aligorism
Data struture and aligorismmbadhi barnabas
 
Haltermanpythonbook.pdf
Haltermanpythonbook.pdfHaltermanpythonbook.pdf
Haltermanpythonbook.pdfnebulaa2
 

Similar to Elementary algorithms (20)

Algorithmic Problem Solving with Python.pdf
Algorithmic Problem Solving with Python.pdfAlgorithmic Problem Solving with Python.pdf
Algorithmic Problem Solving with Python.pdf
 
main-moonmath.pdf
main-moonmath.pdfmain-moonmath.pdf
main-moonmath.pdf
 
Python_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdf
Python_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdfPython_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdf
Python_Programming_and_Numerical_Methods_A_Guide_for_Engineers_and.pdf
 
A_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdfA_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdf
 
A practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinoldA practical introduction_to_python_programming_heinold
A practical introduction_to_python_programming_heinold
 
A Practical Introduction To Python Programming
A Practical Introduction To Python ProgrammingA Practical Introduction To Python Programming
A Practical Introduction To Python Programming
 
A_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdfA_Practical_Introduction_to_Python_Programming_Heinold.pdf
A_Practical_Introduction_to_Python_Programming_Heinold.pdf
 
Die casting2
Die casting2Die casting2
Die casting2
 
data structures
data structuresdata structures
data structures
 
book.pdf
book.pdfbook.pdf
book.pdf
 
Algorithms for programers
Algorithms for programersAlgorithms for programers
Algorithms for programers
 
javanotes5.pdf
javanotes5.pdfjavanotes5.pdf
javanotes5.pdf
 
An Introduction to MATLAB for Geoscientists.pdf
An Introduction to MATLAB for Geoscientists.pdfAn Introduction to MATLAB for Geoscientists.pdf
An Introduction to MATLAB for Geoscientists.pdf
 
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...Ric walter (auth.) numerical methods and optimization  a consumer guide-sprin...
Ric walter (auth.) numerical methods and optimization a consumer guide-sprin...
 
Dsa
DsaDsa
Dsa
 
Dsa book
Dsa bookDsa book
Dsa book
 
Data struture and aligorism
Data struture and aligorismData struture and aligorism
Data struture and aligorism
 
Perltut
PerltutPerltut
Perltut
 
Haltermanpythonbook.pdf
Haltermanpythonbook.pdfHaltermanpythonbook.pdf
Haltermanpythonbook.pdf
 
pyspark.pdf
pyspark.pdfpyspark.pdf
pyspark.pdf
 

Recently uploaded

BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxfenichawla
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGSIVASHANKAR N
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 

Recently uploaded (20)

BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 

Elementary algorithms

  • 1. Elementary Algorithms Larry LIU Xinyu 1 July 25, 2014 1Larry LIU Xinyu Version: 0.6180339887498949 Email: liuxinyu95@gmail.com
  • 2. 2
  • 3. Contents I Preface 5 0.1 Why? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 0.2 The smallest free ID problem, the power of algorithms . . . . . . 7 0.2.1 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . 8 0.2.2 Improvement 2, Divide and Conquer . . . . . . . . . . . . 9 0.2.3 Expressiveness vs. Performance . . . . . . . . . . . . . . . 10 0.3 The number puzzle, power of data structure . . . . . . . . . . . . 12 0.3.1 The brute-force solution . . . . . . . . . . . . . . . . . . . 12 0.3.2 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . 12 0.3.3 Improvement 2 . . . . . . . . . . . . . . . . . . . . . . . . 15 0.4 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 18 0.5 Structure of the contents . . . . . . . . . . . . . . . . . . . . . . . 18 0.6 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 II Trees 23 1 Binary search tree, the `hello world' data structure 25 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 1.2 Data Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 1.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 1.4 Traversing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 1.5 Querying a binary search tree . . . . . . . . . . . . . . . . . . . . 33 1.5.1 Looking up . . . . . . . . . . . . . . . . . . . . . . . . . . 33 1.5.2 Minimum and maximum . . . . . . . . . . . . . . . . . . . 34 1.5.3 Successor and predecessor . . . . . . . . . . . . . . . . . . 34 1.6 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 1.7 Randomly build binary search tree . . . . . . . . . . . . . . . . . 40 1.8 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2 The evolution of insertion sort 43 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 2.3 Improvement 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 2.4 Improvement 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 2.5 Final improvement by binary search tree . . . . . . . . . . . . . . 49 2.6 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3
  • 4. 4 CONTENTS 3 Red-black tree, not so complex as it was thought 53 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.1.1 Exploit the binary search tree . . . . . . . . . . . . . . . . 53 3.1.2 How to ensure the balance of the tree . . . . . . . . . . . 54 3.1.3 Tree rotation . . . . . . . . . . . . . . . . . . . . . . . . . 56 3.2 De
  • 5. nition of red-black tree . . . . . . . . . . . . . . . . . . . . . 58 3.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 3.4 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.5 Imperative red-black tree algorithm ? . . . . . . . . . . . . . . . 71 3.6 More words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 4 AVL tree 77 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 4.1.1 How to measure the balance of a tree? . . . . . . . . . . . 77 4.2 De
  • 6. nition of AVL tree . . . . . . . . . . . . . . . . . . . . . . . . 77 4.3 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 4.3.1 Balancing adjustment . . . . . . . . . . . . . . . . . . . . 82 4.3.2 Pattern Matching . . . . . . . . . . . . . . . . . . . . . . . 86 4.4 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 4.5 Imperative AVL tree algorithm ? . . . . . . . . . . . . . . . . . . 88 4.6 Chapter note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 5 Trie and Patricia 95 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.2 Integer Trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 5.2.1 De
  • 7. nition of integer Trie . . . . . . . . . . . . . . . . . . . 96 5.2.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 5.2.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 5.3 Integer Patricia . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 5.3.1 De
  • 8. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 5.3.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 5.3.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 5.4 Alphabetic Trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 5.4.1 De
  • 9. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 5.4.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 5.4.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 5.5 Alphabetic Patricia . . . . . . . . . . . . . . . . . . . . . . . . . . 113 5.5.1 De
  • 10. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 5.5.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114 5.5.3 Look up . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 5.6 Trie and Patricia applications . . . . . . . . . . . . . . . . . . . . 121 5.6.1 E-dictionary and word auto-completion . . . . . . . . . . 121 5.6.2 T9 input method . . . . . . . . . . . . . . . . . . . . . . . 125 5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 6 Sux Tree 133 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 6.2 Sux trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 6.2.1 Node transfer and sux link . . . . . . . . . . . . . . . . 135 6.2.2 On-line construction . . . . . . . . . . . . . . . . . . . . . 136
  • 11. CONTENTS 5 6.3 Sux Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 6.3.1 On-line construction . . . . . . . . . . . . . . . . . . . . . 141 6.4 Sux tree applications . . . . . . . . . . . . . . . . . . . . . . . . 150 6.4.1 String/Pattern searching . . . . . . . . . . . . . . . . . . . 150 6.4.2 Find the longest repeated sub-string . . . . . . . . . . . . 152 6.4.3 Find the longest common sub-string . . . . . . . . . . . . 153 6.4.4 Find the longest palindrome . . . . . . . . . . . . . . . . . 155 6.4.5 Others . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 6.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 156 7 B-Trees 159 7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 7.2 Insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 7.2.1 Splitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 7.3 Deletion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 7.3.1 Merge before delete method . . . . . . . . . . . . . . . . . 168 7.3.2 Delete and
  • 12. x method . . . . . . . . . . . . . . . . . . . . 176 7.4 Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 7.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 183 III Heaps 187 8 Binary Heaps 189 8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 8.2 Implicit binary heap by array . . . . . . . . . . . . . . . . . . . . 189 8.2.1 De
  • 13. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 8.2.2 Heapify . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 8.2.3 Build a heap . . . . . . . . . . . . . . . . . . . . . . . . . 192 8.2.4 Basic heap operations . . . . . . . . . . . . . . . . . . . . 194 8.2.5 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 8.3 Leftist heap and Skew heap, the explicit binary heaps . . . . . . 201 8.3.1 De
  • 14. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 8.3.2 Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 8.3.3 Basic heap operations . . . . . . . . . . . . . . . . . . . . 204 8.3.4 Heap sort by Leftist Heap . . . . . . . . . . . . . . . . . . 206 8.3.5 Skew heaps . . . . . . . . . . . . . . . . . . . . . . . . . . 206 8.4 Splay heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 8.4.1 De
  • 15. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 8.4.2 Heap sort . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 8.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 215 9 From grape to the world cup, the evolution of selection sort 219 9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 9.2 Finding the minimum . . . . . . . . . . . . . . . . . . . . . . . . 221 9.2.1 Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 9.2.2 Grouping . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 9.2.3 performance of the basic selection sorting . . . . . . . . . 224 9.3 Minor Improvement . . . . . . . . . . . . . . . . . . . . . . . . . 225 9.3.1 Parameterize the comparator . . . . . . . . . . . . . . . . 225
  • 16. 6 CONTENTS 9.3.2 Trivial
  • 17. ne tune . . . . . . . . . . . . . . . . . . . . . . . 226 9.3.3 Cock-tail sort . . . . . . . . . . . . . . . . . . . . . . . . . 227 9.4 Major improvement . . . . . . . . . . . . . . . . . . . . . . . . . 231 9.4.1 Tournament knock out . . . . . . . . . . . . . . . . . . . . 231 9.4.2 Final improvement by using heap sort . . . . . . . . . . . 239 9.5 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 10 Binomial heap, Fibonacci heap, and pairing heap 243 10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 10.2 Binomial Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 10.2.1 De
  • 18. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 10.2.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 248 10.3 Fibonacci Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . 258 10.3.1 De
  • 19. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 258 10.3.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 260 10.3.3 Running time of pop . . . . . . . . . . . . . . . . . . . . . 269 10.3.4 Decreasing key . . . . . . . . . . . . . . . . . . . . . . . . 271 10.3.5 The name of Fibonacci Heap . . . . . . . . . . . . . . . . 273 10.4 Pairing Heaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 10.4.1 De
  • 20. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 10.4.2 Basic heap operations . . . . . . . . . . . . . . . . . . . . 276 10.5 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 282 IV Queues and Sequences 285 11 Queue, not so simple as it was thought 287 11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287 11.2 Queue by linked-list and circular buer . . . . . . . . . . . . . . 288 11.2.1 Singly linked-list solution . . . . . . . . . . . . . . . . . . 288 11.2.2 Circular buer solution . . . . . . . . . . . . . . . . . . . 291 11.3 Purely functional solution . . . . . . . . . . . . . . . . . . . . . . 294 11.3.1 Paired-list queue . . . . . . . . . . . . . . . . . . . . . . . 294 11.3.2 Paired-array queue - a symmetric implementation . . . . 296 11.4 A small improvement, Balanced Queue . . . . . . . . . . . . . . . 298 11.5 One more step improvement, Real-time Queue . . . . . . . . . . 300 11.6 Lazy real-time queue . . . . . . . . . . . . . . . . . . . . . . . . . 307 11.7 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 310 12 Sequences, The last brick 313 12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 12.2 Binary random access list . . . . . . . . . . . . . . . . . . . . . . 314 12.2.1 Review of plain-array and list . . . . . . . . . . . . . . . . 314 12.2.2 Represent sequence by trees . . . . . . . . . . . . . . . . . 314 12.2.3 Insertion to the head of the sequence . . . . . . . . . . . . 316 12.3 Numeric representation for binary random access list . . . . . . . 322 12.3.1 Imperative binary access list . . . . . . . . . . . . . . . . 324 12.4 Imperative paired-array list . . . . . . . . . . . . . . . . . . . . . 327 12.4.1 De
  • 21. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 327 12.4.2 Insertion and appending . . . . . . . . . . . . . . . . . . . 328
  • 22. CONTENTS 7 12.4.3 random access . . . . . . . . . . . . . . . . . . . . . . . . 328 12.4.4 removing and balancing . . . . . . . . . . . . . . . . . . . 329 12.5 Concatenate-able list . . . . . . . . . . . . . . . . . . . . . . . . . 331 12.6 Finger tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 12.6.1 De
  • 23. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 12.6.2 Insert element to the head of sequence . . . . . . . . . . . 337 12.6.3 Remove element from the head of sequence . . . . . . . . 340 12.6.4 Handling the ill-formed
  • 24. nger tree when removing . . . . 341 12.6.5 append element to the tail of the sequence . . . . . . . . . 346 12.6.6 remove element from the tail of the sequence . . . . . . . 347 12.6.7 concatenate . . . . . . . . . . . . . . . . . . . . . . . . . . 349 12.6.8 Random access of
  • 25. nger tree . . . . . . . . . . . . . . . . 354 12.7 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 365 V Sorting and Searching 369 13 Divide and conquer, Quick sort vs. Merge sort 371 13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 13.2 Quick sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 13.2.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 372 13.2.2 Strict weak ordering . . . . . . . . . . . . . . . . . . . . . 373 13.2.3 Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . 374 13.2.4 Minor improvement in functional partition . . . . . . . . 377 13.3 Performance analysis for quick sort . . . . . . . . . . . . . . . . . 379 13.3.1 Average case analysis ? . . . . . . . . . . . . . . . . . . . 380 13.4 Engineering Improvement . . . . . . . . . . . . . . . . . . . . . . 383 13.4.1 Engineering solution to duplicated elements . . . . . . . . 383 13.5 Engineering solution to the worst case . . . . . . . . . . . . . . . 390 13.6 Other engineering practice . . . . . . . . . . . . . . . . . . . . . . 394 13.7 Side words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 13.8 Merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 13.8.1 Basic version . . . . . . . . . . . . . . . . . . . . . . . . . 396 13.9 In-place merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . 403 13.9.1 Naive in-place merge . . . . . . . . . . . . . . . . . . . . . 403 13.9.2 in-place working area . . . . . . . . . . . . . . . . . . . . 404 13.9.3 In-place merge sort vs. linked-list merge sort . . . . . . . 409 13.10Nature merge sort . . . . . . . . . . . . . . . . . . . . . . . . . . 411 13.11Bottom-up merge sort . . . . . . . . . . . . . . . . . . . . . . . . 416 13.12Parallelism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 13.13Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 14 Searching 423 14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 14.2 Sequence search . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 14.2.1 Divide and conquer search . . . . . . . . . . . . . . . . . . 424 14.2.2 Information reuse . . . . . . . . . . . . . . . . . . . . . . . 444 14.3 Solution searching . . . . . . . . . . . . . . . . . . . . . . . . . . 471 14.3.1 DFS and BFS . . . . . . . . . . . . . . . . . . . . . . . . . 471 14.3.2 Search the optimal solution . . . . . . . . . . . . . . . . . 507
  • 26. 8 CONTENTS 14.4 Short summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535 VI Appendix 539 Appendices A Lists 541 A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 A.2 List De
  • 27. nition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 A.2.1 Empty list . . . . . . . . . . . . . . . . . . . . . . . . . . . 542 A.2.2 Access the element and the sub list . . . . . . . . . . . . . 542 A.3 Basic list manipulation . . . . . . . . . . . . . . . . . . . . . . . . 543 A.3.1 Construction . . . . . . . . . . . . . . . . . . . . . . . . . 543 A.3.2 Empty testing and length calculating . . . . . . . . . . . . 544 A.3.3 indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545 A.3.4 Access the last element . . . . . . . . . . . . . . . . . . . 546 A.3.5 Reverse indexing . . . . . . . . . . . . . . . . . . . . . . . 547 A.3.6 Mutating . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 A.3.7 sum and product . . . . . . . . . . . . . . . . . . . . . . . 559 A.3.8 maximum and minimum . . . . . . . . . . . . . . . . . . . 563 A.4 Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567 A.4.1 mapping and for-each . . . . . . . . . . . . . . . . . . . . 567 A.4.2 reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573 A.5 Extract sub-lists . . . . . . . . . . . . . . . . . . . . . . . . . . . 575 A.5.1 take, drop, and split-at . . . . . . . . . . . . . . . . . . . 575 A.5.2 breaking and grouping . . . . . . . . . . . . . . . . . . . . 577 A.6 Folding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582 A.6.1 folding from right . . . . . . . . . . . . . . . . . . . . . . . 582 A.6.2 folding from left . . . . . . . . . . . . . . . . . . . . . . . 584 A.6.3 folding in practice . . . . . . . . . . . . . . . . . . . . . . 587 A.7 Searching and matching . . . . . . . . . . . . . . . . . . . . . . . 588 A.7.1 Existence testing . . . . . . . . . . . . . . . . . . . . . . . 588 A.7.2 Looking up . . . . . . . . . . . . . . . . . . . . . . . . . . 588 A.7.3
  • 29. ltering . . . . . . . . . . . . . . . . . . . . . 589 A.7.4 Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . 592 A.8 zipping and unzipping . . . . . . . . . . . . . . . . . . . . . . . . 594 A.9 Notes and short summary . . . . . . . . . . . . . . . . . . . . . . 597 GNU Free Documentation License 601 1. APPLICABILITY AND DEFINITIONS . . . . . . . . . . . . . . . 601 2. VERBATIM COPYING . . . . . . . . . . . . . . . . . . . . . . . . 603 3. COPYING IN QUANTITY . . . . . . . . . . . . . . . . . . . . . . 603 4. MODIFICATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . 604 5. COMBINING DOCUMENTS . . . . . . . . . . . . . . . . . . . . . 605 6. COLLECTIONS OF DOCUMENTS . . . . . . . . . . . . . . . . . 606 7. AGGREGATION WITH INDEPENDENT WORKS . . . . . . . . 606 8. TRANSLATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606 9. TERMINATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 10. FUTURE REVISIONS OF THIS LICENSE . . . . . . . . . . . . 607
  • 30. CONTENTS 9 11. RELICENSING . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 ADDENDUM: How to use this License for your documents . . . . . . 608
  • 33.
  • 34. Elementary Algorithms 13 0.1 Why? `Are algorithms useful?'. Some programmers say that they seldom use any serious data structures or algorithms in real work such as commercial application development. Even when they need some of them, they have already been provided by libraries. For example, the C++ standard template library (STL) provides sort and selection algorithms as well as the vector, queue, and set data structures. It seems that knowing about how to use the library as a tool is quite enough. Instead of answering this question directly, I would like to say algorithms and data structures are critical in solving `interesting problems', the usefulness of the problem set aside. Let's start with two problems that looks like they can be solved in a brute-force way even by a fresh programmer. 0.2 The smallest free ID problem, the power of algorithms This problem is discussed in Chapter 1 of Richard Bird's book [1]. It's common that applications and systems use ID (identi
  • 35. er) to manage objects and entities. At any time, some IDs are used, and some of them are available for use. When some client tries to acquire a new ID, we want to always allocate it the smallest available one. Suppose IDs are non-negative integers and all IDs in use are kept in a list (or an array) which is not ordered. For example: [18, 4, 8, 9, 16, 1, 14, 7, 19, 3, 0, 5, 2, 11, 6] How can you
  • 36. nd the smallest free ID, which is 10, from the list? It seems the solution is quite easy even without any serious algorithms. 1: function Min-Free(A) 2: x 0 3: loop 4: if x =2 A then 5: return x 6: else 7: x x + 1 Where the =2 is realized like below. 1: function `=2 '(x;X) 2: for i 1 to jXj do 3: if x = X[i] then 4: return False 5: return True Some languages provide handy tools which wrap this linear time process. For example in Python, this algorithm can be directly translated as the following. def b r u t e f o r c e ( l s t ) : i = 0 while True : i f i not in l s t :
  • 37. 14 Preface return i i = i + 1 It seems this problem is trivial. However, There will be millions of IDs in a large system. The speed of this solution is poor in such case for it takes O(n2) time, where n is the length of the ID list. In my computer (2 Cores 2.10 GHz, with 2G RAM), a C program using this solution takes an average of 5.4 seconds to search a minimum free number among 100,000 IDs1. And it takes more than 8 minutes to handle a million numbers. 0.2.1 Improvement 1 The key idea to improve the solution is based on a fact that for a series of n numbers x1; x2; :::; xn, if there are free numbers, some of the xi are outside the range [0; n); otherwise the list is exactly a permutation of 0; 1; :::; n 1 and n should be returned as the minimum free number. It means that max(xi) n1. And we have the following fact. minfree(x1; x2; :::; xn) n (1) One solution is to use an array of n + 1 ags to mark whether a number in range [0; n] is free. 1: function Min-Free(A) 2: F [False; False; :::; False] where jFj = n + 1 3: for 8x 2 A do 4: if x n then 5: F[x] True 6: for i [0; n] do 7: if F[i] = False then 8: return i Line 2 initializes a ag array all of False values. This takes O(n) time. Then the algorithm scans all numbers in A and mark the relative ag to True if the value is less than n, This step also takes O(n) time. Finally, the algorithm performs a linear time search to
  • 39. rst ag with False value. So the total performance of this algorithm is O(n). Note that we use n + 1 ags instead of n ags to cover the special case that sorted(A) = [0; 1; 2; :::; n 1]. Although the algorithm only takes O(n) time, it needs extra O(n) spaces to store the ags. This solution is much faster than the brute force one. On my computer, the relevant Python program takes an average of 0.02 second when dealing with 100,000 numbers. We haven't
  • 40. ne tuned this algorithm yet. Observe that each time we have to allocate memory to create a n + 1 elements array of ags, and release the memory when
  • 41. nished. The memory allocation and release is very expensive thus they cost us a lot of processing time. There are two ways in which we can improve on this solution. One is to allocate the ags array in advance and reuse it for all the calls of our function to
  • 42. nd the smallest free number. The other is to use bit-wise ags instead of a ag array. The following is the C program based on these two minor improvements. 1All programs can be downloaded along with this series posts.
  • 43. 0.2. THE SMALLEST FREE ID PROBLEM, THE POWER OF ALGORITHMS15 #define N 1000000 // 1 mi l l i o n #define WORDLENGTH s izeof ( int ) 8 void s e t b i t (unsigned int bi t s , unsigned int i )f b i t s [ i / WORDLENGTH] j= 1( i % WORDLENGTH) ; g int t e s t b i t (unsigned int bi t s , unsigned int i )f return b i t s [ i /WORDLENGTH] (1( i % WORDLENGTH) ) ; g unsigned int b i t s [N/WORDLENGTH+1] ; int min f r e e ( int xs , int n)f int i , l en = N/WORDLENGTH+1; for ( i =0; il en ; ++i ) b i t s [ i ]=0; for ( i =0; in ; ++i ) i f ( xs [ i ]n) s e t b i t ( bi t s , xs [ i ] ) ; for ( i =0; i=n ; ++i ) i f ( ! t e s t b i t ( bi t s , i ) ) return i ; g This C program can handle 1,000,000 (1 million) IDs in just 0.023 second on my computer. The last for-loop can be further improved as seen below but this is just minor
  • 44. ne-tuning. for ( i =0; ; ++i ) i f (~ b i t s [ i ] !=0 ) for ( j =0; ; ++j ) i f ( ! t e s t b i t ( bi t s , i WORDLENGTH+j ) ) return i WORDLENGTH+j ; 0.2.2 Improvement 2, Divide and Conquer Although the above improvement is much faster, it costs O(n) extra spaces to keep a check list. if n is huge number this means a huge amount of space is wasted. The typical divide and conquer strategy is to break the problem into some smaller ones, and solve these to get the
  • 45. nal answer. We can put all numbers xi bn=2c as a sub-list A0 and put all the others as a second sub-list A00. Based on formula 1 if the length of A0 is exactly bn=2c, this means the
  • 46. rst half of numbers are `full', which indicates that the minimum free number must be in A00 and so we'll need to recursively seek in the shorter list A00. Otherwise, it means the minimum free number is located in A0, which again leads to a smaller problem. When we search the minimum free number in A00, the conditions changes a little bit, we are not searching the smallest free number starting from 0, but
  • 47. 16 Preface actually from bn=2c +1 as the lower bound. So the algorithm is something like minfree(A; l; u), where l is the lower bound and u is the upper bound index of the element. Note that there is a trivial case, that if the number list is empty, we merely return the lower bound as the result. This divide and conquer solution can be formally expressed as a function : minfree(A) = search(A; 0; jAj 1) search(A; l; u) = 8 : l : A = search(A00;m + 1; u) : jA0j = m l + 1 search(A0; l;m) : otherwise where m = b l + u 2 c A0 = f8x 2 A ^ x mg A00 = f8x 2 A ^ x mg It is obvious that this algorithm doesn't need any extra space2. Each call performs O(jAj) comparison to build A0 and A00. After that the problem scale halves. So the time needed for this algorithm is T(n) = T(n=2) + O(n) which reduce to O(n). Another way to analyze the performance is by observing that the
  • 48. rst call takes O(n) to build A0 and A00 and the second call takes O(n=2), and O(n=4) for the third... The total time is O(n+n=2+n=4+:::) = O(2n) = O(n) . In functional programming languages such as Haskell, partitioning a list has already been provided in the basic library and this algorithm can be translated as the following. import Data.List minFree xs = bsearch xs 0 (length xs - 1) bsearch xs l u j xs == [] = l j length as == m - l + 1 = bsearch bs (m+1) u j otherwise = bsearch as l m where m = (l + u) `div` 2 (as, bs) = partition (m) xs 0.2.3 Expressiveness vs. Performance Imperative language programmers may be concerned about the performance of this kind of implementation. For instance in this minimum free ID problem, the number of recursive calls is in O(lg n) , which means the stack size consumed is in O(lg n). It's not free in terms of space. But if we want to avoid that , we 2Procedural programmer may note that it actually takes O(lg n) stack spaces for book- keeping. As we'll see later, this can be eliminated either by tail recursion optimization, for instance gcc -O2. or by manually changing the recursion to iteration
  • 49. 0.2. THE SMALLEST FREE ID PROBLEM, THE POWER OF ALGORITHMS17 can eliminate the recursion by replacing it by an iteration 3 which yields the following C program. int min_free(int xs, int n){ int l=0; int u=n-1; while(n){ int m = (l + u) = 2; int right, left = 0; for(right = 0; right n; ++ right) if(xs[right] m){ swap(xs[left], xs[right]); ++left; } if(left == m - l + 1){ xs = xs + left; n = n - left; l = m+1; } else{ n = left; u = m; } } return l; } This program uses a `quick-sort' like approach to re-arrange the array so that all the elements before left are less than or equal to m; while those between left and right are greater than m. This is shown in
  • 50. gure 1. left right x[i]=m x[i]m ...?... Figure 1: Divide the array, all x[i] m where 0 i left; while all x[i] m where left i right. The left elements are unknown. This program is fast and it doesn't need extra stack space. However, com-pared to the previous Haskell program, it's hard to read and the expressiveness decreased. We have to balance performance and expressiveness. 3This is done automatically in most functional languages since our function is in tail recursive form which lends itself perfectly to this transformation
  • 51. 18 Preface 0.3 The number puzzle, power of data structure If the
  • 53. nd the minimum free number, is a some what useful in practice, this problem is a `pure' one for fun. The puzzle is to
  • 54. nd the 1,500th number, which only contains factor 2, 3 or 5. The
  • 55. rst 3 numbers are of course 2, 3, and 5. Number 60 = 223151, However it is the 25th number. Number 21 = 203171, isn't a valid number because it contains a factor 7. The
  • 56. rst 10 such numbers are list as the following. 2,3,4,5,6,8,9,10,12,15 If we consider 1 = 203050, then 1 is also a valid number and it is the
  • 57. rst one. 0.3.1 The brute-force solution It seems the solution is quite easy without need any serious algorithms. We can check all numbers from 1, then extract all factors of 2, 3 and 5 to see if the left part is 1. 1: function Get-Number(n) 2: x 1 3: i 0 4: loop 5: if Valid?(x) then 6: i i + 1 7: if i = n then 8: return x 9: x x + 1 10: function Valid?(x) 11: while x mod 2 = 0 do 12: x x=2 13: while x mod 3 = 0 do 14: x x=3 15: while x mod 5 = 0 do 16: x x=5 17: if x = 1 then 18: return True 19: else 20: return False This `brute-force' algorithm works for most small n. However, to
  • 58. nd the 1500th number (which is 859963392), the C program based on this algorithm takes 40.39 seconds in my computer. I have to kill the program after 10 minutes when I increased n to 15,000. 0.3.2 Improvement 1 Analysis of the above algorithm shows that modular and divide calculations are very expensive [2]. And they executed a lot in loops. Instead of checking a number contains only 2, 3, or 5 as factors, one alternative solution is to construct such number by these factors.
  • 59. 0.3. THE NUMBER PUZZLE, POWER OF DATA STRUCTURE 19 We start from 1, and times it with 2, or 3, or 5 to generate rest numbers. The problem turns to be how to generate the candidate number in order? One handy way is to utilize the queue data structure. A queue data structure is used to push elements at one end, and pops them at the other end. So that the element be pushed
  • 60. rst is also be popped out
  • 61. rst. This property is called FIFO (First-In-First-Out). The idea is to push 1 as the only element to the queue, then we pop an element, times it with 2, 3, and 5, to get 3 new elements. We then push them back to the queue in order. Note that, the new elements may have already existed in the queue. In such case, we just drop the element. The new element may also smaller than the others in the queue, so we must put them to the correct position. Figure 2 illustrates this idea. 1 1*2=2 1*3=3 1*5=5 2 3 5 2*2=4 2*3=6 2*5=10 3 4 5 6 1 0 3*2=6 3*3=9 3*5=15 4 5 6 9 1 0 1 5 4*2=8 4*3=12 4*5=20 Figure 2: First 4 steps of constructing numbers with a queue. 1. Queue is initialized with 1 as the only element; 2. New elements 2, 3, and 5 are pushed back; 3. New elements 4, 6, and 10, are pushed back in order; 4. New elements 9 and 15 are pushed back, element 6 already exists. This algorithm is shown as the following. 1: function Get-Number(n) 2: Q NIL 3: Enqueue(Q; 1) 4: while n 0 do 5: x Dequeue(Q) 6: Unique-Enqueue(Q; 2x) 7: Unique-Enqueue(Q; 3x) 8: Unique-Enqueue(Q; 5x) 9: n n 1 10: return x 11: function Unique-Enqueue(Q; x) 12: i 0 13: while i jQj ^ Q[i] x do 14: i i + 1 15: if i jQj ^ x = Q[i] then 16: return 17: Insert(Q; i; x)
  • 62. 20 Preface The insert function takes O(jQj) time to
  • 63. nd the proper position and insert it. If the element has already existed, it just returns. A rough estimation tells that the length of the queue increase proportion to n, (Each time, we extract one element, and pushed 3 new, the increase ratio 2), so the total running time is O(1 + 2 + 3 + ::: + n) = O(n2). Figure3 shows the number of queue access time against n. It is quadratic curve which re ect the O(n2) performance. Figure 3: Queue access count v.s. n. The C program based on this algorithm takes only 0.016[s] to get the right answer 859963392. Which is 2500 times faster than the brute force solution. Improvement 1 can also be considered in recursive way. Suppose X is the in
  • 64. nity series for all numbers which only contain factors of 2, 3, or 5. The following formula shows an interesting relationship. X = f1g [ f2x : 8x 2 Xg [ f3x : 8x 2 Xg [ f5x : 8x 2 Xg (2) Where we can de
  • 65. ne [ to a special form so that all elements are stored in order as well as unique to each other. Suppose that X = fx1; x2; x3:::g, Y = fy1; y2; y3; :::g, X0 = fx2; x3; :::g and Y 0 = fy2; y3; :::g. We have X [ Y = 8 : X : Y = Y : X = fx1;X0 [ Y g : x1 y1 fx1;X0 [ Y 0g : x1 = y1 fy1;X [ Y 0g : x1 y1 In a functional programming language such as Haskell, which supports lazy evaluation, The above in
  • 66. nity series functions can be translate into the following program. ns = 1:merge (map (2) ns) (merge (map (3) ns) (map (5) ns)) merge [] l = l merge l [] = l merge (x:xs) (y:ys) j x y = x : merge xs (y:ys)
  • 67. 0.3. THE NUMBER PUZZLE, POWER OF DATA STRUCTURE 21 j x ==y = x : merge xs ys j otherwise = y : merge (x:xs) ys By evaluate ns !! (n-1), we can get the 1500th number as below. ns !! (1500-1) 859963392 0.3.3 Improvement 2 Considering the above solution, although it is much faster than the brute-force one, It still has some drawbacks. It produces many duplicated numbers and they are
  • 68. nally dropped when examine the queue. Secondly, it does linear scan and insertion to keep the order of all elements in the queue, which degrade the ENQUEUE operation from O(1) to O(jQj). If we use three queues instead of using only one, we can improve the solution one step ahead. Denote these queues as Q2, Q3, and Q5, and we initialize them as Q2 = f2g, Q3 = f3g and Q5 = f5g. Each time we DEQUEUEed the smallest one from Q2, Q3, and Q5 as x. And do the following test: If x comes from Q2, we ENQUEUE 2x, 3x, and 5x back to Q2, Q3, and Q5 respectively; If x comes from Q3, we only need ENQUEUE 3x to Q3, and 5x to Q5; We needn't ENQUEUE 2x to Q2, because 2x have already existed in Q3; If x comes from Q5, we only need ENQUEUE 5x to Q5; there is no need to ENQUEUE 3x, 5x to Q3, Q5 because they have already been in the queues; We repeatedly ENQUEUE the smallest one until we
  • 69. nd the n-th element. The algorithm based on this idea is implemented as below. 1: function Get-Number(n) 2: if n = 1 then 3: return 1 4: else 5: Q2 f2g 6: Q3 f3g 7: Q5 f5g 8: while n 1 do 9: x min(Head(Q2), Head(Q3), Head(Q5)) 10: if x = Head(Q2) then 11: Dequeue(Q2) 12: Enqueue(Q2; 2x) 13: Enqueue(Q3; 3x) 14: Enqueue(Q5; 5x) 15: else if x = Head(Q3) then 16: Dequeue(Q3) 17: Enqueue(Q3; 3x) 18: Enqueue(Q5; 5x) 19: else 20: Dequeue(Q5)
  • 70. 22 Preface 2*min=4 3*min=6 5*min=10 2 3 5 min=2 4 3*min=9 5*min=15 3 6 5 1 0 min=3 2*min=8 3*min=12 5*min=20 4 6 9 5 1 0 1 5 min=4 8 6 9 1 2 5 1 0 1 5 2 0 min=5 5*min=25 Figure 4: First 4 steps of constructing numbers with Q2, Q3, and Q5. 1. Queues are initialized with 2, 3, 5 as the only element; 2. New elements 4, 6, and 10 are pushed back; 3. New elements 9, and 15, are pushed back; 4. New elements 8, 12, and 20 are pushed back; 5. New element 25 is pushed back.
  • 71. 0.3. THE NUMBER PUZZLE, POWER OF DATA STRUCTURE 23 21: Enqueue(Q5; 5x) 22: n n 1 23: return x This algorithm loops n times, and within each loop, it extract one head element from the three queues, which takes constant time. Then it appends one to three new elements at the end of queues which bounds to constant time too. So the total time of the algorithm bounds to O(n). The C++ program translated from this algorithm shown below takes less than 1 s to produce the 1500th number, 859963392. typedef unsigned long Integer; Integer get_number(int n){ if(n==1) return 1; queueInteger Q2, Q3, Q5; Q2.push(2); Q3.push(3); Q5.push(5); Integer x; while(n-- 1){ x = min(min(Q2.front(), Q3.front()), Q5.front()); if(x==Q2.front()){ Q2.pop(); Q2.push(x2); Q3.push(x3); Q5.push(x5); } else if(x==Q3.front()){ Q3.pop(); Q3.push(x3); Q5.push(x5); } else{ Q5.pop(); Q5.push(x5); } } return x; } This solution can be also implemented in Functional way. We de
  • 72. ne a func-tion take(n), which will return the
  • 73. rst n numbers contains only factor 2, 3, or 5. take(n) = f(n; f1g; f2g; f3g; f5g) Where f(n;X;Q2;Q3;Q5) = X : n = 1 f(n 1;X [ fxg;Q0 2;Q0 3;Q0 5) : otherwise x = min(Q21;Q31;Q51)
  • 74. 24 Preface 0 2;Q Q 0 3;Q 0 5 = 8 : fQ22;Q23; :::g [ f2xg;Q3 [ f3xg;Q5 [ f5xg : x = Q21 Q2; fQ32;Q33; :::g [ f3xg;Q5 [ f5xg : x = Q31 Q2;Q3; fQ52;Q53; :::g [ f5xg : x = Q51 And these functional de
  • 75. nition can be realized in Haskell as the following. ks 1 xs _ = xs ks n xs (q2, q3, q5) = ks (n-1) (xs++[x]) update where x = minimum $ map head [q2, q3, q5] update j x == head q2 = ((tail q2)++[x2], q3++[x3], q5++[x5]) j x == head q3 = (q2, (tail q3)++[x3], q5++[x5]) j otherwise = (q2, q3, (tail q5)++[x5]) takeN n = ks n [1] ([2], [3], [5]) Invoke `last takeN 1500' will generate the correct answer 859963392. 0.4 Notes and short summary If review the 2 puzzles, we found in both cases, the brute-force solutions are so weak. In the
  • 76. rst problem, it's quite poor in dealing with long ID list, while in the second problem, it doesn't work at all. The
  • 77. rst problem shows the power of algorithms, while the second problem tells why data structure is important. There are plenty of interesting problems, which are hard to solve before computer was invented. With the aid of com-puter and programming, we are able to
  • 78. nd the answer in a quite dierent way. Compare to what we learned in mathematics course in school, we haven't been taught the method like this. While there have been already a lot of wonderful books about algorithms, data structures and math, however, few of them provide the comparison between the procedural solution and the functional solution. From the above discussion, it can be found that functional solution sometimes is very expressive and they are close to what we are familiar in mathematics. This series of post focus on providing both imperative and functional algo-rithms and data structures. Many functional data structures can be referenced from Okasaki's book[6]. While the imperative ones can be founded in classic text books [2] or even in WIKIpedia. Multiple programming languages, includ-ing, C, C++, Python, Haskell, and Scheme/Lisp will be used. In order to make it easy to read by programmers with dierent background, pseudo code and mathematical function are the regular descriptions of each post. The author is NOT a native English speaker, the reason why this book is only available in English for the time being is because the contents are still changing frequently. Any feedback, comments, or criticizes are welcome. 0.5 Structure of the contents In the following series of post, I'll
  • 79. rst introduce about elementary data struc-tures before algorithms, because many algorithms need knowledge of data struc-tures as prerequisite.
  • 80. 0.6. APPENDIX 25 The `hello world' data structure, binary search tree is the
  • 81. rst topic; Then we introduce how to solve the balance problem of binary search tree. After that, I'll show other interesting trees. Trie, Patricia, sux trees are useful in text manipulation. While B-trees are commonly used in
  • 82. le system and data base implementation. The second part of data structures is about heaps. We'll provide a gen-eral Heap de
  • 83. nition and introduce about binary heaps by array and by explicit binary trees. Then we'll extend to K-ary heaps including Binomial heaps, Fi-bonacci heaps, and pairing heaps. Array and queues are considered among the easiest data structures typically, However, we'll show how dicult to implement them in the third part. As the elementary sort algorithms, we'll introduce insertion sort, quick sort, merge sort etc in both imperative way and functional way. The
  • 84. nal part is about searching, besides the element searching, we'll also show string matching algorithms such as KMP. All the posts are provided under GNU FDL (Free document license), and programs are under GNU GPL. 0.6 Appendix All programs provided along with this book are free for downloading. download position: http://sites.google.com/site/algoxy/home
  • 86. Bibliography [1] Richard Bird. Pearls of functional algorithm design. Cambridge Univer-sity Press; 1 edition (November 1, 2010). ISBN-10: 0521513383 [2] Jon Bentley. Programming Pearls(2nd Edition). Addison-Wesley Profes-sional; 2 edition (October 7, 1999). ISBN-13: 978-0201657883 [3] Chris Okasaki. Purely Functional Data Structures. Cambridge university press, (July 1, 1999), ISBN-13: 978-0521663502 [4] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cliord Stein. Introduction to Algorithms, Second Edition. The MIT Press, 2001. ISBN: 0262032937. 27
  • 89.
  • 90. Chapter 1 Binary search tree, the `hello world' data structure 1.1 Introduction Arrays or lists are typically considered the `hello world' data structures. How-ever, we'll see they are not actually particularly easy to implement. In some procedural settings, arrays are the most elementary data structures, and it is possible to implement linked lists using arrays (see section 10.3 in [2]). On the other hand, in some functional settings, linked lists are the elementary building blocks used to create arrays and other data structures. Considering these factors, we start with Binary Search Trees (or BST) as the `hello world' data structure using an interesting problem Jon Bentley mentioned in `Programming Pearls' [2]. The problem is to count the number of times each word occurs in a large text. One solution in C++ is below: int main(int, char ){ mapstring, int dict; string s; while(cins) ++dict[s]; mapstring, int::iterator it=dict.begin(); for(; it!=dict.end(); ++it) coutit!first: it!secondn; } And we can run it to produce the result using the following UNIX commands 1. $ g++ wordcount.cpp -o wordcount $ cat bbe.txt | ./wordcount wc.txt The map provided in the standard template library is a kind of balanced BST with augmented data. Here we use the words in the text as the keys and the number of occurrences as the augmented data. This program is fast, and 1This is not a UNIX unique command, in Windows OS, it can be achieved by: type bbe:txtjwordcount:exe wc:txt 31
  • 91. 32CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE it re ects the power of BSTs. We'll introduce how to implement BSTs in this section and show how to balance them in a later section. Before we dive into BSTs, let's
  • 92. rst introduce the more general binary tree. Binary trees are recursively de
  • 93. ned. BSTs are just one type of binary tree. A binary tree is usually de
  • 94. ned in the following way. A binary tree is either an empty node; or a node containing 3 parts: a value, a left child which is a binary tree and a right child which is also a binary tree. Figure 1.1 shows this concept and an example binary tree. k L R (a) Concept of binary tree 1 6 4 1 0 1 4 7 2 8 1 9 3 (b) An example binary tree Figure 1.1: Binary tree concept and an example. A BST is a binary tree where the following applies to each node: all the values in left child tree are less than the value of this node; the value of this node is less than any values in its right child tree. Figure 1.2 shows an example of binary search tree. Comparing with Figure 1.1, we can see the dierences in how keys are ordered between them.
  • 95. 1.2. DATA LAYOUT 33 4 3 8 1 2 7 1 6 1 0 9 1 4 Figure 1.2: A Binary search tree example. 1.2 Data Layout Based on the recursive de
  • 96. nition of BSTs, we can draw the data layout in procedural setting with pointers as in Figure 1.3. The node contains a
  • 97. eld of key, which can be augmented with satellite data; a
  • 98. eld contains a pointer to the left child and a
  • 99. eld point to the right child. In order to back-track an ancestor easily, a parent
  • 100. eld can be provided as well. In this post, we'll ignore the satellite data for simple illustration purpose. Based on this layout, the node of binary search tree can be de
  • 101. ned in a proce-dural language, such as C++ as the following. templateclass T struct node{ node(T x):key(x), left(0), right(0), parent(0){} ~node(){ delete left; delete right; } node left; node right; node parent; ==parent is optional, it's helpful for succ=pred T key; }; There is another setting, for instance in Scheme/Lisp languages, the ele-mentary data structure is linked-list. Figure 1.4 shows how a binary search tree node can be built on top of linked-list.
  • 102. 34CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE key + satellite data left right parent key + satellite data left right parent key + satellite data left right parent ... ... ... ... Figure 1.3: Layout of nodes with parent
  • 103. eld. key next left ... next right ... NIL Figure 1.4: Binary search tree node layout on top of linked list. Where `left...' and 'right ...' are either empty or binary search tree node composed in the same way.
  • 104. 1.3. INSERTION 35 Because in pure functional setting, It's hard to use pointer for back tracking the ancestors, (and typically, there is no need to do back tracking, since we can provide top-down solution in recursive way) there is not `parent'
  • 105. eld in such layout. For simpli
  • 106. ed reason, we'll skip the detailed layout in the future, and only focus on the logic layout of data structures. For example, below is the de
  • 107. nition of binary search tree node in Haskell. data Tree a = Empty j Node (Tree a) a (Tree a) 1.3 Insertion To insert a key k (may be along with a value in practice) to a binary search tree T, we can follow a quite straight forward way. If the tree is empty, then construct a leave node with key=k; If k is less than the key of root node, insert it to the left child; If k is greater than the key of root, insert it to the right child; There is an exceptional case that if k is equal to the key of root, it means it has already existed, we can either overwrite the data, or just do nothing. For simple reason, this case is skipped in this post. This algorithm is described recursively. It is so simple that is why we consider binary search tree is `hello world' data structure. Formally, the algorithm can be represented with a recursive function. insert(T; k) = 8 : node(; k; ) : T = node(insert(L; k);Key;R) : k Key node(L;Key; insert(R; k)) : otherwise (1.1) Where L = left(T) R = right(T) Key = key(T) The node function creates a new node with given left sub-tree, a key and a right sub-tree as parameters. means NIL or Empty. function left, right and key are access functions which can get the left sub-tree, right sub-tree and the key of a node. Translate the above functions directly to Haskell yields the following pro-gram. insert::(Ord a) ) Tree a ! a ! Tree a insert Empty k = Node Empty k Empty insert (Node l x r) k j k x = Node (insert l k) x r j otherwise = Node l x (insert r k) This program utilized the pattern matching features provided by the lan-guage. However, even in functional settings without this feature, for instance, Scheme/Lisp, the program is still expressive.
  • 108. 36CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE (define (insert tree x) (cond ((null? tree) (list '() x '())) (( x (key tree)) (make-tree (insert (left tree) x) (key tree) (right tree))) (( x (key tree)) (make-tree (left tree) (key tree) (insert (right tree) x))))) It is possible to turn the algorithm completely into imperative way without recursion. 1: function Insert(T; k) 2: root T 3: x Create-Leaf(k) 4: parent NIL 5: while T6= NIL do 6: parent T 7: if k Key(T) then 8: T Left(T) 9: else 10: T Right(T) 11: Parent(x) parent 12: if parent = NIL then . tree T is empty 13: return x 14: else if k Key(parent) then 15: Left(parent) x 16: else 17: Right(parent) x 18: return root 19: function Create-Leaf(k) 20: x Empty-Node 21: Key(x) k 22: Left(x) NIL 23: Right(x) NIL 24: Parent(x) NIL 25: return x Compare with the functional algorithm, it is obviously that this one is more complex although it is fast and can handle very deep tree. A complete C++ program and a python program are available along with this post for reference. 1.4 Traversing Traversing means visiting every element one by one in a binary search tree. There are 3 ways to traverse a binary tree, pre-order tree walk, in-order tree walk, and post-order tree walk. The names of these traversing methods highlight the order of when we visit the root of a binary search tree.
  • 109. 1.4. TRAVERSING 37 Since there are three parts in a tree, as left child, the root, which con-tains the key and satellite data, and the right child. If we denote them as (left; current; right), the three traversing methods are de
  • 110. ned as the following. pre-order traverse, visit current, then left,
  • 111. nally right; in-order traverse, visit left , then current,
  • 112. nally right; post-order traverse, visit left, then right,
  • 113. nally current. Note that each visiting operation is recursive. And we see the order of visiting current determines the name of the traversing method. For the binary search tree shown in
  • 114. gure 1.2, below are the three dierent traversing results. pre-order traverse result: 4, 3, 1, 2, 8, 7, 16, 10, 9, 14; in-order traverse result: 1, 2, 3, 4, 7, 8, 9, 10, 14, 16; post-order traverse result: 2, 1, 3, 7, 9, 14, 10, 16, 8, 4; It can be found that the in-order walk of a binary search tree outputs the elements in increase order, which is particularly helpful. The de
  • 115. nition of binary search tree ensures this interesting property, while the proof of this fact is left as an exercise of this post. In-order tree walk algorithm can be described as the following: If the tree is empty, just return; traverse the left child by in-order walk, then access the key,
  • 116. nally traverse the right child by in-order walk. Translate the above description yields a generic map function map(f; T) = : T = node(l0; k0; r0) : otherwise (1.2) where l0 = map(f; left(T)) r0 = map(f; right(T)) k0 = f(key(T)) If we only need access the key without create the transformed tree, we can realize this algorithm in procedural way lie the below C++ program. templateclass T, class F void in_order_walk(nodeT t, F f){ if(t){ in_order_walk(t!left, f); f(t!value); in_order_walk(t!right, f); } }
  • 117. 38CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE The function takes a parameter f, it can be a real function, or a function object, this program will apply f to the node by in-order tree walk. We can simpli
  • 118. ed this algorithm one more step to de
  • 119. ne a function which turns a binary search tree to a sorted list by in-order traversing. toList(T) = : T = toList(left(T)) [ fkey(T)g [ toList(right(T)) : otherwise (1.3) Below is the Haskell program based on this de
  • 120. nition. toList::(Ord a))Tree a ! [a] toList Empty = [] toList (Node l x r) = toList l ++ [x] ++ toList r This provides us a method to sort a list of elements. We can
  • 121. rst build a binary search tree from the list, then output the tree by in-order traversing. This method is called as `tree sort'. Let's denote the list X = fx1; x2; x3; :::; xng. sort(X) = toList(fromList(X)) (1.4) And we can write it in function composition form. sort = toList:fromList Where function fromList repeatedly insert every element to a binary search tree. fromList(X) = foldL(insert; ;X) (1.5) It can also be written in partial application form like below. fromList = foldL insert For the readers who are not familiar with folding from left, this function can also be de
  • 122. ned recursively as the following. fromList(X) = : X = insert(fromList(fx2; x3; :::; xng); x1) : otherwise We'll intense use folding function as well as the function composition and partial evaluation in the future, please refer to appendix of this book or [6] [7] and [8] for more information. Exercise 1.1 Given the in-order traverse result and pre-order traverse result, can you re-construct the tree from these result and
  • 123. gure out the post-order traversing result? Pre-order result: 1, 2, 4, 3, 5, 6; In-order result: 4, 2, 1, 5, 3, 6; Post-order result: ? Write a program in your favorite language to re-construct the binary tree from pre-order result and in-order result.
  • 124. 1.5. QUERYING A BINARY SEARCH TREE 39 Prove why in-order walk output the elements stored in a binary search tree in increase order? Can you analyze the performance of tree sort with big-O notation? 1.5 Querying a binary search tree There are three types of querying for binary search tree, searching a key in the tree,
  • 125. nd the minimum or maximum element in the tree, and
  • 126. nd the predecessor or successor of an element in the tree. 1.5.1 Looking up According to the de
  • 127. nition of binary search tree, search a key in a tree can be realized as the following. If the tree is empty, the searching fails; If the key of the root is equal to the value to be found, the search succeed. The root is returned as the result; If the value is less than the key of the root, search in the left child. Else, which means that the value is greater than the key of the root, search in the right child. This algorithm can be described with a recursive function as below. lookup(T; x) = 8 : : T = T : key(T) = x lookup(left(T); x) : x key(T) lookup(right(T); x) : otherwise (1.6) In the real application, we may return the satellite data instead of the node as the search result. This algorithm is simple and straightforward. Here is a translation of Haskell program. lookup::(Ord a)) Tree a ! a ! Tree a lookup Empty _ = Empty lookup t@(Node l k r) x j k == x = t j x k = lookup l x j otherwise = lookup r x If the binary search tree is well balanced, which means that almost all nodes have both non-NIL left child and right child, for N elements, the search algo-rithm takes O(lgN) time to perform. This is not formal de
  • 128. nition of balance. We'll show it in later post about red-black-tree. If the tree is poor balanced, the worst case takes O(N) time to search for a key. If we denote the height of the tree as h, we can uniform the performance of the algorithm as O(h). The search algorithm can also be realized without using recursion in a pro-cedural manner. 1: function Search(T; x) 2: while T6= NIL^ Key(T)6= x do
  • 129. 40CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE 3: if x Key(T) then 4: T Left(T) 5: else 6: T Right(T) 7: return T Below is the C++ program based on this algorithm. templateclass T nodeT search(nodeT t, T x){ while(t t!key!=x){ if(x t!key) t=t!left; else t=t!right; } return t; } 1.5.2 Minimum and maximum Minimum and maximum can be implemented from the property of binary search tree, less keys are always in left child, and greater keys are in right. For minimum, we can continue traverse the left sub tree until it is empty. While for maximum, we traverse the right. min(T) = key(T) : left(T) = min(left(T)) : otherwise (1.7) max(T) = key(T) : right(T) = max(right(T)) : otherwise (1.8) Both function bound to O(h) time, where h is the height of the tree. For the balanced binary search tree, min/max are bound to O(lgN) time, while they are O(N) in the worst cases. We skip translating them to programs, It's also possible to implement them in pure procedural way without using recursion. 1.5.3 Successor and predecessor The last kind of querying, to
  • 130. nd the successor or predecessor of an element is useful when a tree is treated as a generic container and traversed by using iterator. It will be relative easier to implement if parent of a node can be accessed directly. It seems that the functional solution is hard to be found, because there is no pointer like
  • 131. eld linking to the parent node. One solution is to left `breadcrumbs' when we visit the tree, and use these information to back-track or even re-construct the whole tree. Such data structure, that contains both the tree and `breadcrumbs' is called zipper. please refer to [9] for details. However, If we consider the original purpose of providing succ/pred func-tion, `to traverse all the binary search tree elements one by one` as a generic container, we realize that they don't make signi
  • 132. cant sense in functional settings because we can traverse the tree in increase order by mapT function we de
  • 134. 1.5. QUERYING A BINARY SEARCH TREE 41 We'll meet many problems in this series of post that they are only valid in imperative settings, and they are not meaningful problems in functional settings at all. One good example is how to delete an element in red-black-tree[3]. In this section, we'll only present the imperative algorithm for
  • 135. nding the successor and predecessor in a binary search tree. When
  • 136. nding the successor of element x, which is the smallest one y that satis
  • 137. es y x, there are two cases. If the node with value x has non-NIL right child, the minimum element in right child is the answer; For example, in Figure 1.2, in order to
  • 138. nd the successor of 8, we search it's right sub tree for the minimum one, which yields 9 as the result. While if node x don't have right child, we need back-track to
  • 139. nd the closest ancestors whose left child is also ancestor of x. In Figure 1.2, since 2 don't have right sub tree, we go back to its parent 1. However, node 1 don't have left child, so we go back again and reach to node 3, the left child of 3, is also ancestor of 2, thus, 3 is the successor of node 2. Based on this description, the algorithm can be given as the following. 1: function Succ(x) 2: if Right(x)6= NIL then 3: return Min(Right(x)) 4: else 5: p Parent(x) 6: while p6= NIL and x = Right(p) do 7: x p 8: p Parent(p) 9: return p The predecessor case is quite similar to the successor algorithm, they are symmetrical to each other. 1: function Pred(x) 2: if Left(x)6= NIL then 3: return Max(Left(x)) 4: else 5: p Parent(x) 6: while p6= NIL and x = Left(p) do 7: x p 8: p Parent(p) 9: return p Below are the Python programs based on these algorithms. They are changed a bit in while loop conditions. def succ(x): if x.right is not None: return tree_min(x.right) p = x.parent while p is not None and p.left != x: x = p p = p.parent return p def pred(x): if x.left is not None: return tree_max(x.left) p = x.parent
  • 140. 42CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE while p is not None and p.right != x: x = p p = p.parent return p Exercise 1.2 Can you
  • 141. gure out how to iterate a tree as a generic container by using pred()/succ()? What's the performance of such traversing process in terms of big-O? A reader discussed about traversing all elements inside a range [a; b]. In C++, the algorithm looks like the below code: for each(m:lower bound(12); m:upper bound(26); f); Can you provide the purely function solution for this problem? 1.6 Deletion Deletion is another `imperative only' topic for binary search tree. This is because deletion mutate the tree, while in purely functional settings, we don't modify the tree after building it in most application. However, One method of deleting element from binary search tree in purely functional way is shown in this section. It's actually reconstructing the tree but not modifying the tree. Deletion is the most complex operation for binary search tree. this is because we must keep the BST property, that for any node, all keys in left sub tree are less than the key of this node, and they are all less than any keys in right sub tree. Deleting a node can break this property. In this post, dierent with the algorithm described in [2], A simpler one from SGI STL implementation is used.[6] To delete a node x from a tree. If x has no child or only one child, splice x out; Otherwise (x has two children), use minimum element of its right sub tree to replace x, and splice the original minimum element out. The simplicity comes from the truth that, the minimum element is stored in a node in the right sub tree, which can't have two non-NIL children. It ends up in the trivial case, the the node can be directly splice out from the tree. Figure 1.5, 1.6, and 1.7 illustrate these dierent cases when deleting a node from the tree. Based on this idea, the deletion can be de
  • 142. ned as the below function. delete(T; x) = 8 : : T = node(delete(L; x);K;R) : x K node(L;K; delete(R; x)) : x K R : x = K ^ L = L : x = K ^ R = node(L; y; delete(R; y)) : otherwise (1.9)
  • 143. 1.6. DELETION 43 Tree x NIL NIL Figure 1.5: x can be spliced out. Tree x L NIL (a) Before delete x Tree L (b) After delete x x is spliced out, and replaced by its left child. Tree x NIL R (c) Before delete x Tree R (d) Before delete x x is spliced out, and replaced by its right child. Figure 1.6: Delete a node which has only one non-NIL child.
  • 144. 44CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE Tree x L R (a) Before delete x Tree min(R) L delete(R, min(R)) (b) After delete x x is replaced by splicing the minimum element from its right child. Figure 1.7: Delete a node which has both children. Where L = left(T) R = right(T) K = key(T) y = min(R) Translating the function to Haskell yields the below program. delete::(Ord a)) Tree a ! a ! Tree a delete Empty _ = Empty delete (Node l k r) x j x k = (Node (delete l x) k r) j x k = (Node l k (delete r x)) -- x == k j isEmpty l = r j isEmpty r = l j otherwise = (Node l k' (delete r k')) where k' = min r Function `isEmpty' is used to test if a tree is empty (). Note that the algorithm
  • 145. rst performs search to locate the node where the element need be deleted, after that it execute the deletion. This algorithm takes O(h) time where h is the height of the tree. It's also possible to pass the node but not the element to the algorithm for deletion. Thus the searching is no more needed. The imperative algorithm is more complex because it need set the parent properly. The function will return the root of the result tree. 1: function Delete(T; x) 2: root T
  • 146. 1.6. DELETION 45 3: x0 x . save x 4: parent Parent(x) 5: if Left(x) = NIL then 6: x Right(x) 7: else if Right(x) = NIL then 8: x Left(x) 9: else . both children are non-NIL 10: y Min(Right(x)) 11: Key(x) Key(y) 12: Copy other satellite data from y to x 13: if Parent(y)6= x then . y hasn't left sub tree 14: Left(Parent(y)) Right(y) 15: else . y is the root of right child of x 16: Right(x) Right(y) 17: Remove y 18: return root 19: if x6= NIL then 20: Parent(x) parent 21: if parent = NIL then . We are removing the root of the tree 22: root x 23: else 24: if Left(parent) = x0 then 25: Left(parent) x 26: else 27: Right(parent) x 28: Remove x0 29: return root Here we assume the node to be deleted is not empty (otherwise we can simply returns the original tree). In other cases, it will
  • 147. rst record the root of the tree, create copy pointers to x, and its parent. If either of the children is empty, the algorithm just splice x out. If it has two non-NIL children, we
  • 148. rst located the minimum of right child, replace the key of x to y's, copy the satellite data as well, then splice y out. Note that there is a special case that y is the root node of x's left sub tree. Finally we need reset the stored parent if the original x has only one non- NIL child. If the parent pointer we copied before is empty, it means that we are deleting the root node, so we need return the new root. After the parent is set properly, we
  • 149. nally remove the old x from memory. The relative Python program for deleting algorithm is given as below. Be-cause Python provides GC, we needn't explicitly remove the node from the memory. def tree_delete(t, x): if x is None: return t [root, old_x, parent] = [t, x, x.parent] if x.left is None: x = x.right elif x.right is None: x = x.left
  • 150. 46CHAPTER 1. BINARY SEARCH TREE, THE `HELLOWORLD' DATA STRUCTURE else: y = tree_min(x.right) x.key = y.key if y.parent != x: y.parent.left = y.right else: x.right = y.right return root if x is not None: x.parent = parent if parent is None: root = x else: if parent.left == old_x: parent.left = x else: parent.right = x return root Because the procedure seeks minimum element, it runs in O(h) time on a tree of height h. Exercise 1.3 There is a symmetrical solution for deleting a node which has two non-NIL children, to replace the element by splicing the maximum one out o the left sub-tree. Write a program to implement this solution. 1.7 Randomly build binary search tree It can be found that all operations given in this post bound to O(h) time for a tree of height h. The height aects the performance a lot. For a very unbalanced tree, h tends to be O(N), which leads to the worst case. While for balanced tree, h close to O(lgN). We can gain the good performance. How to make the binary search tree balanced will be discussed in next post. However, there exists a simple way. Binary search tree can be randomly built as described in [2]. Randomly building can help to avoid (decrease the possibility) unbalanced binary trees. The idea is that before building the tree, we can call a random process, to shue the elements. Exercise 1.4 Write a randomly building process for binary search tree. 1.8 Appendix All programs are provided along with this post. They are free for downloading. We provided C, C++, Python, Haskell, and Scheme/Lisp programs as example.
  • 151. Bibliography [1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cliord Stein. Introduction to Algorithms, Second Edition. ISBN:0262032937. The MIT Press. 2001 [2] Jon Bentley. Programming Pearls(2nd Edition). Addison-Wesley Profes-sional; 2 edition (October 7, 1999). ISBN-13: 978-0201657883 [3] Chris Okasaki. Ten Years of Purely Functional Data Structures. http://okasaki.blogspot.com/2008/02/ten-years-of-purely-functional-data. html [4] SGI. Standard Template Library Programmer's Guide. http://www.sgi.com/tech/stl/ [5] http://en.literateprograms.org/Category:Binary search tree [6] http://en.wikipedia.org/wiki/Foldl [7] http://en.wikipedia.org/wiki/Function composition [8] http://en.wikipedia.org/wiki/Partial application [9] Miran Lipovaca. Learn You a Haskell for Great Good! A Beginner's Guide. the last chapter. No Starch Press; 1 edition April 2011, 400 pp. ISBN: 978-1-59327-283-8 47
  • 152. 48 The evolution of insertion sort
  • 153. Chapter 2 The evolution of insertion sort 2.1 Introduction In previous chapter, we introduced the 'hello world' data structure, binary search tree. In this chapter, we explain insertion sort, which can be think of the 'hello world' sorting algorithm 1. It's straightforward, but the performance is not as good as some divide and conqueror sorting approaches, such as quick sort and merge sort. Thus insertion sort is seldom used as generic sorting utility in modern software libraries. We'll analyze the problems why it is slow, and trying to improve it bit by bit till we reach the best bound of comparison based sorting algorithms, O(n lg n), by evolution to tree sort. And we
  • 154. nally show the connection between the 'hello world' data structure and 'hello world' sorting algorithm. The idea of insertion sort can be vivid illustrated by a real life poker game[2]. Suppose the cards are shued, and a player starts taking card one by one. At any time, all cards in player's hand are well sorted. When the player gets a new card, he insert it in proper position according to the order of points. Figure 2.1 shows this insertion example. Based on this idea, the algorithm of insertion sort can be directly given as the following. function Sort(A) X for each x 2 A do Insert(X; x) return X It's easy to express this process with folding, which we mentioned in the chapter of binary search tree. insert = foldL insert (2.1) 1Some reader may argue that 'Bubble sort' is the easiest sort algorithm. Bubble sort isn't covered in this book as we don't think it's a valuable algorithm[1] 49
  • 155. 50 CHAPTER 2. THE EVOLUTION OF INSERTION SORT Figure 2.1: Insert card 8 to proper position in a deck. Note that in above algorithm, we store the sorted result in X, so this isn't in-place sorting. It's easy to change it to in-place algorithm. function Sort(A) for i 2 to Length(A) do insert Ai to sorted sequence fA0 1;A0 2; :::;A0 i1 g At any time, when we process the i-th element, all elements before i have already been sorted. we continuously insert the current elements until consume all the unsorted data. This idea is illustrated as in
  • 156. gure 9.3. insert ... sorted elements ... x ... unsorted elements ... Figure 2.2: The left part is sorted data, continuously insert elements to sorted part. We can
  • 157. nd there is recursive concept in this de
  • 158. nition. Thus it can be expressed as the following. sort(A) = : A = insert(sort(fA2;A3; :::g);A1) : otherwise (2.2) 2.2 Insertion We haven't answered the question about how to realize insertion however. It's a puzzle how does human locate the proper position so quickly. For computer, it's an obvious option to perform a scan. We can either scan from left to right or vice versa. However, if the sequence is stored in plain array, it's necessary to scan from right to left. function Sort(A) for i 2 to Length(A) do . Insert A[i] to sorted sequence A[1:::i 1]
  • 159. 2.2. INSERTION 51 x A[i] j i 1 while j 0 ^ x A[j] do A[j + 1] A[j] j j 1 A[j + 1] x One may think scan from left to right is natural. However, it isn't as eect as above algorithm for plain array. The reason is that, it's expensive to insert an element in arbitrary position in an array. As array stores elements continuously. If we want to insert new element x in position i, we must shift all elements after i, including i+1; i+2; ::: one position to right. After that the cell at position i is empty, and we can put x in it. This is illustrated in
  • 160. gure 2.3. x insert A[1] A[2] ... A[i-1] A[i] A[i+1] A[i+2] ... A[n-1] A[n] empty Figure 2.3: Insert x to array A at position i. If the length of array is n, this indicates we need examine the
  • 161. rst i elements, then perform n i + 1 moves, and then insert x to the i-th cell. So insertion from left to right need traverse the whole array anyway. While if we scan from right to left, we totally examine the last j = n i + 1 elements, and perform the same amount of moves. If j is small (e.g. less than n=2), there is possibility to perform less operations than scan from left to right. Translate the above algorithm to Python yields the following code. def isort(xs): n = len(xs) for i in range(1, n): x = xs[i] j = i - 1 while j 0 and x xs[j]: xs[j+1] = xs[j] j = j - 1 xs[j+1] = x It can be found some other equivalent programs, for instance the following ANSI C program. However this version isn't as eective as the pseudo code. void isort(Key xs, int n){ int i, j; for(i=1; in; ++i) for(j=i-1; j0 xs[j+1] xs[j]; --j) swap(xs, j, j+1); }
  • 162. 52 CHAPTER 2. THE EVOLUTION OF INSERTION SORT This is because the swapping function, which can exchange two elements typically uses a temporary variable like the following: void swap(Key xs, int i, int j){ Key temp = xs[i]; xs[i] = xs[j]; xs[j] = temp; } So the ANSI C program presented above takes 3m times assignment, where m is the number of inner loops. While the pseudo code as well as the Python program use shift operation instead of swapping. There are n + 2 times assign-ment. We can also provide Insert() function explicitly, and call it from the general insertion sort algorithm in previous section. We skip the detailed realization here and left it as an exercise. All the insertion algorithms are bound to O(n), where n is the length of the sequence. No matter what dierence among them, such as scan from left or from right. Thus the over all performance for insertion sort is quadratic as O(n2). Exercise 2.1 Provide explicit insertion function, and call it with general insertion sort algorithm. Please realize it in both procedural way and functional way. 2.3 Improvement 1 Let's go back to the question, that why human being can
  • 163. nd the proper position for insertion so quickly. We have shown a solution based on scan. Note the fact that at any time, all cards at hands have been well sorted, another possible solution is to use binary search to
  • 164. nd that location. We'll explain the search algorithms in other dedicated chapter. Binary search is just brie y introduced for illustration purpose here. The algorithm will be changed to call a binary search procedure. function Sort(A) for i 2 to Length(A) do x A[i] p Binary-Search(A[1:::i 1]; x) for j i down to p do A[j] A[j 1] A[p] x Instead of scan elements one by one, binary search utilize the information that all elements in slice of array fA1; :::;Ai1g are sorted. Let's assume the order is monotonic increase order. To
  • 165. nd a position j that satis
  • 166. es Aj1 x Aj . We can
  • 167. rst examine the middle element, for example, Abi=2c. If x is less than it, we need next recursively perform binary search in the
  • 168. rst half of the sequence; otherwise, we only need search in last half. Every time, we halve the elements to be examined, this search process runs O(lg n) time to locate the insertion position. function Binary-Search(A; x)
  • 169. 2.4. IMPROVEMENT 2 53 l 1 u 1+ Length(A) while l u do m b l+u 2 c if Am = x then return m . Find a duplicated element else if Am x then l m + 1 else u m return l The improved insertion sort algorithm is still bound to O(n2), compare to previous section, which we use O(n2) times comparison and O(n2) moves, with binary search, we just use O(n lg n) times comparison and O(n2) moves. The Python program regarding to this algorithm is given below. def isort(xs): n = len(xs) for i in range(1, n): x = xs[i] p = binary_search(xs[:i], x) for j in range(i, p, -1): xs[j] = xs[j-1] xs[p] = x def binary_search(xs, x): l = 0 u = len(xs) while l u: m = (l+u)=2 if xs[m] == x: return m elif xs[m] x: l = m + 1 else: u = m return l Exercise 2.2 Write the binary search in recursive manner. You needn't use purely func-tional programming language. 2.4 Improvement 2 Although we improve the search time to O(n lg n) in previous section, the num-ber of moves is still O(n2). The reason of why movement takes so long time, is because the sequence is stored in plain array. The nature of array is continu-ously layout data structure, so the insertion operation is expensive. This hints us that we can use linked-list setting to represent the sequence. It can improve
  • 170. 54 CHAPTER 2. THE EVOLUTION OF INSERTION SORT the insertion operation from O(n) to constant time O(1). insert(A; x) = 8 : fxg : A = fxg [ A : x A1 fA1g [ insert(fA2;A3; :::Ang; x) : otherwise (2.3) Translating the algorithm to Haskell yields the below program. insert :: (Ord a) ) [a] ! a ! [a] insert [] x = [x] insert (y:ys) x = if x y then x:y:ys else y:insert ys x And we can complete the two versions of insertion sort program based on the
  • 171. rst two equations in this chapter. isort [] = [] isort (x:xs) = insert (isort xs) x Or we can represent the recursion with folding. isort = foldl insert [] Linked-list setting solution can also be described imperatively. Suppose function Key(x), returns the value of element stored in node x, and Next(x) accesses the next node in the linked-list. function Insert(L; x) p NIL H L while L6= NIL^ Key(L) Key(x) do p L L Next(L) Next(x) L if p6= NIL then H x else Next(p) x return H For example in ANSI C, the linked-list can be de
  • 172. ned as the following. struct node{ Key key; struct node next; }; Thus the insert function can be given as below. struct node insert(struct node lst, struct node x){ struct node p, head; p = NULL; for(head = lst; lst x!key lst!key; lst = lst!next) p = lst; x!next = lst; if(!p) return x; p!next = x;
  • 173. 2.5. FINAL IMPROVEMENT BY BINARY SEARCH TREE 55 return head; } Instead of using explicit linked-list such as by pointer or reference based structure. Linked-list can also be realized by another index array. For any array element Ai, Nexti stores the index of next element follows Ai. It means ANexti is the next element after Ai. The insertion algorithm based on this solution is given like below. function Insert(A;Next; i) j ? while Nextj6= NIL ^ ANextj Ai do j Nextj Nexti Nextj Nextj i Here ? means the head of the Next table. And the relative Python program for this algorithm is given as the following. def isort(xs): n = len(xs) next = [-1](n+1) for i in range(n): insert(xs, next, i) return next def insert(xs, next, i): j = -1 while next[j] != -1 and xs[next[j]] xs[i]: j = next[j] next[j], next[i] = i, next[j] Although we change the insertion operation to constant time by using linked-list. However, we have to traverse the linked-list to
  • 174. nd the position, which results O(n2) times comparison. This is because linked-list, unlike array, doesn't support random access. It means we can't use binary search with linked-list setting. Exercise 2.3 Complete the insertion sort by using linked-list insertion function in your favorate imperative programming language. The index based linked-list return the sequence of rearranged index as result. Write a program to re-order the original array of elements from this result. 2.5 Final improvement by binary search tree It seems that we drive into a corner. We must improve both the comparison and the insertion at the same time, or we will end up with O(n2) performance. We must use binary search, this is the only way to improve the comparison time to O(lg n). On the other hand, we must change the data structure, because we can't achieve constant time insertion at a position with plain array.
  • 175. 56 CHAPTER 2. THE EVOLUTION OF INSERTION SORT This remind us about our 'hello world' data structure, binary search tree. It naturally support binary search from its de
  • 176. nition. At the same time, We can insert a new leaf in binary search tree in O(1) constant time if we already
  • 177. nd the location. So the algorithm changes to this. function Sort(A) T for each x 2 A do T Insert-Tree(T; x) return To-List(T) Where Insert-Tree() and To-List() are described in previous chapter about binary search tree. As we have analyzed for binary search tree, the performance of tree sort is bound to O(n lg n), which is the lower limit of comparison based sort[3]. 2.6 Short summary In this chapter, we present the evolution process of insertion sort. Insertion sort is well explained in most textbooks as the
  • 178. rst sorting algorithm. It has simple and straightforward idea, but the performance is quadratic. Some textbooks stop here, but we want to show that there exist ways to improve it by dierent point of view. We
  • 179. rst try to save the comparison time by using binary search, and then try to save the insertion operation by changing the data structure to linked-list. Finally, we combine these two ideas and evolute insertion sort to tree sort.
  • 180. Bibliography [1] http://en.wikipedia.org/wiki/Bubble sort [2] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Cliord Stein. Introduction to Algorithms, Second Edition. ISBN:0262032937. The MIT Press. 2001 [3] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd Edition). Addison-Wesley Professional; 2 edition (May 4, 1998) ISBN-10: 0201896850 ISBN-13: 978-0201896855 57
  • 181. 58 Red black tree
  • 182. Chapter 3 Red-black tree, not so complex as it was thought 3.1 Introduction 3.1.1 Exploit the binary search tree We have shown the power of binary search tree by using it to count the occur-rence of every word in Bible. The idea is to use binary search tree as a dictionary for counting. One may come to the idea that to feed a yellow page book 1 to a binary search tree, and use it to look up the phone number for a contact. By modifying a bit of the program for word occurrence counting yields the following code. int main(int, char ){ ifstream f(yp.txt); mapstring, string dict; string name, phone; while(fname fphone) dict[name]=phone; for(;;){ coutnname: ; cinname; if(dict.find(name)==dict.end()) coutnot found; else coutphone: dict[name]; } } This program works well. However, if you replace the STL map with the bi-nary search tree as mentioned previously, the performance will be bad, especially when you search some names such as Zara, Zed, Zulu. This is because the content of yellow page is typically listed in lexicographic order. Which means the name list is in increase order. If we try to insert a 1A name-phone number contact list book 59
  • 183. 60CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT sequence of number 1, 2, 3, ..., n to a binary search tree. We will get a tree like in Figure 3.1. 1 2 3 ... n Figure 3.1: unbalanced tree This is a extreme unbalanced binary search tree. The looking up performs O(h) for a tree with height h. In balanced case, we bene
  • 184. t from binary search tree by O(lgN) search time. But in this extreme case, the search time down-graded to O(N). It's no better than a normal link-list. Exercise 3.1 For a very big yellow page list, one may want to speed up the dictionary building process by two concurrent tasks (threads or processes). One task reads the name-phone pair from the head of the list, while the other one reads from the tail. The building terminates when these two tasks meet at the middle of the list. What will be the binary search tree looks like after building? What if you split the the list more than two and use more tasks? Can you
  • 185. nd any more cases to exploit a binary search tree? Please consider the unbalanced trees shown in
  • 186. gure 3.2. 3.1.2 How to ensure the balance of the tree In order to avoid such case, we can shue the input sequence by randomized algorithm, such as described in Section 12.4 in [2]. However, this method doesn't always work, for example the input is fed from user interactively, and the tree need to be built/updated after each input. There are many solutions people have ever found to make binary search tree balanced. Many of them rely on the rotation operations to binary search tree. Rotation operations change the tree structure while maintain the ordering of the elements. Thus it either improve or keep the balance property of the binary search tree.
  • 187. 3.1. INTRODUCTION 61 n n-1 n-2 ... 1 (a) 1 2 n 3 n-1 4 ... (b) m m-1 m+1 m-2 ... 1 m+2 ... n (c) Figure 3.2: Some unbalanced trees
  • 188. 62CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT In this chapter, we'll
  • 189. rst introduce about red-black tree, which is one of the most popular and widely used self-adjusting balanced binary search tree. In next chapter, we'll introduce about AVL tree, which is another intuitive solution; In later chapter about binary heaps, we'll show another interesting tree called splay tree, which can gradually adjust the the tree to make it more and more balanced. 3.1.3 Tree rotation () X a Y b c (a) Y X c a b (b) Figure 3.3: Tree rotation, `rotate-left' transforms the tree from left side to right side, and `rotate-right' does the inverse transformation. Tree rotation is a kind of special operation that can transform the tree structure without changing the in-order traverse result. It based on the fact that for a speci
  • 190. ed ordering, there are multiple binary search trees correspond to it. Figure 3.3 shows the tree rotation. For a binary search tree on the left side, left rotate transforms it to the tree on the right, and right rotate does the inverse transformation. Although tree rotation can be realized in procedural way, there exists quite simple function description if using pattern matching. rotateL(T) = node(node(a;X; b); Y; c) : pattern(T) = node(a;X; node(b; Y; c)) T : otherwise (3.1) rotateR(T) = node(a;X; node(b; Y; c)) : pattern(T) = node(node(a;X; b); Y; c)) T : otherwise (3.2) However, the pseudo code dealing imperatively has to set all
  • 191. elds accord-ingly. 1: function Left-Rotate(T; x) 2: p Parent(x) 3: y Right(x) . Assume y6= NIL 4: a Left(x) 5: b Left(y) 6: c Right(y)
  • 192. 3.1. INTRODUCTION 63 7: Replace(x; y) 8: Set-Children(x; a; b) 9: Set-Children(y; x; c) 10: if p = NIL then 11: T y 12: return T 13: function Right-Rotate(T; y) 14: p Parent(y) 15: x Left(y) . Assume x6= NIL 16: a Left(x) 17: b Right(x) 18: c Right(y) 19: Replace(y; x) 20: Set-Children(y; b; c) 21: Set-Children(x; a; y) 22: if p = NIL then 23: T x 24: return T 25: function Set-Left(x; y) 26: Left(x) y 27: if y6= NIL then Parent(y) x 28: function Set-Right(x; y) 29: Right(x) y 30: if y6= NIL then Parent(y) x 31: function Set-Children(x; L;R) 32: Set-Left(x;L) 33: Set-Right(x;R) 34: function Replace(x; y) 35: if Parent(x) = NIL then 36: if y6= NIL then Parent(y) NIL 37: else if Left(Parent(x)) = x then Set-Left(Parent(x), y) 38: elseSet-Right(Parent(x), y) 39: Parent(x) NIL Compare these pseudo codes with the pattern matching functions, the former focus on the structure states changing, while the latter focus on the rotation process. As the title of this chapter indicated, red-black tree needn't be so complex as it was thought. Most traditional algorithm text books use the classic procedural way to teach red-black tree, there are several cases need to deal and all need carefulness to manipulate the node
  • 193. elds. However, by changing the mind to functional settings, things get intuitive and simple. Although there is some performance overhead. Most of the content in this chapter is based on Chris Okasaki's work in [2].
  • 194. 64CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT 3.2 De
  • 195. nition of red-black tree Red-black tree is a type of self-balancing binary search tree[3]. 2 By using color changing and rotation, red-black tree provides a very simple and straightforward way to keep the tree balanced. For a binary search tree, we can augment the nodes with a color
  • 196. eld, a node can be colored either red or black. We call a binary search tree red-black tree if it satis
  • 197. es the following 5 properties[2]. 1. Every node is either red or black. 2. The root is black. 3. Every leaf (NIL) is black. 4. If a node is red, then both its children are black. 5. For each node, all paths from the node to descendant leaves contain the same number of black nodes. Why this 5 properties can ensure the red-black tree is well balanced? Because they have a key characteristic, the longest path from root to a leaf can't be as 2 times longer than the shortest path. Please note the 4-th property, which means there won't be two adjacent red nodes. so the shortest path only contains black nodes, any paths longer than the shortest one has interval red nodes. According to property 5, all paths have the same number of black nodes, this
  • 198. nally ensure there won't be any path is 2 times longer than others[3]. Figure 3.4 shows an example red-black tree. 1 3 8 1 7 1 1 1 6 1 5 2 5 2 2 2 7 Figure 3.4: An example red-black tree 2Red-black tree is one of the equivalent form of 2-3-4 tree (see chapter B-tree about 2-3-4 tree). That is to say, for any 2-3-4 tree, there is at least one red-black tree has the same data order.
  • 199. 3.3. INSERTION 65 All read only operations such as search, min/max are as same as in binary search tree. While only the insertion and deletion are special. As we have shown in word occurrence example, many implementation of set or map container are based on red-black tree. One example is the C++ Standard library (STL)[6]. As mentioned previously, the only change in data layout is that there is color information augmented to binary search tree. This can be represented as a data
  • 200. eld in imperative languages such as C++ like below. enum Color {Red, Black}; template class T struct node{ Color color; T key; node left; node right; node parent; }; In functional settings, we can add the color information in constructors, below is the Haskell example of red-black tree de
  • 201. nition. data Color = R j B data RBTree a = Empty j Node Color (RBTree a) a (RBTree a) Exercise 3.2 Can you prove that a red-black tree with n nodes has height at most 2 lg(n + 1)? 3.3 Insertion Inserting a new node as what has been mentioned in binary search tree may cause the tree unbalanced. The red-black properties has to be maintained, so we need do some
  • 202. xing by transform the tree after insertion. When we insert a new key, one good practice is to always insert it as a red node. As far as the new inserted node isn't the root of the tree, we can keep all properties except the 4-th one. that it may bring two adjacent red nodes. Functional and procedural implementation have dierent
  • 203. xing methods. One is intuitive but has some overhead, the other is a bit complex but has higher performance. Most text books about algorithm introduce the later one. In this chapter, we focus on the former to show how easily a red-black tree insertion algorithm can be realized. The traditional procedural method will be given only for comparison purpose. As described by Chris Okasaki, there are total 4 cases which violate property 4. All of them has 2 adjacent red nodes. However, they have a uniformed form after
  • 205. gure 4.3. Note that this transformation will move the redness one level up. So this is a bottom-up recursive
  • 206. xing, the last step will make the root node red. According
  • 207. 66CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT @ @ @R @ @I @ z y D x C A B z x D A y B C x A y B z C D x A z y D B C y x z A B C D Figure 3.5: 4 cases for balancing a red-black tree after insertion
  • 208. 3.3. INSERTION 67 to property 2, root is always black. Thus we need
  • 209. nal
  • 210. xing to revert the root color to black. Observing that the 4 cases and the
  • 211. xed result have strong pattern features, the
  • 213. ned by using the similar method we mentioned in tree rotation. To avoid too long formula, we abbreviate Color as C, Black as B, and Red as R. balance(T) = node(R; node(B; A; x;B); y; node(B;C; z;D)) : match(T) T : otherwise (3.3) where function node() can construct a red-black tree node with 4 parameters as color, the left child, the key and the right child. Function match() can test if a tree is one of the 4 possible patterns as the following. match(T) = pattern(T) = node(B; node(R; node(R; A; x;B); y;C); z;D)_ pattern(T) = node(B; node(R; A; x; node(R;B; y;C); z;D))_ pattern(T) = node(B; A; x; node(R;B; y; node(R;C; z;D)))_ pattern(T) = node(B; A; x; node(R; node(R;B; y;C); z;D)) With function balance() de
  • 214. ned, we can modify the previous binary search tree insertion functions to make it work for red-black tree. insert(T; k) = makeBlack(ins(T; k)) (3.4) where ins(T; k) = 8 : node(R; ; k; ) : T = balance(node(ins(L; k);Key;R)) : k Key balance(node(L;Key; ins(R; k))) : otherwise (3.5) L;R;Key represent the left child, right child and the key of a tree. L = left(T) R = right(T) Key = key(T) Function makeBlack() is de
  • 215. ned as the following, it forces the color of a non-empty tree to be black. makeBlack(T) = node(B; L;Key;R) (3.6) Summarize the above functions and use language supported pattern match-ing features, we can come to the following Haskell program. insert::(Ord a))RBTree a ! a ! RBTree a insert t x = makeBlack $ ins t where ins Empty = Node R Empty x Empty ins (Node color l k r) j x k = balance color (ins l) k r j otherwise = balance color l k (ins r) --[3] makeBlack(Node _ l k r) = Node B l k r
  • 216. 68CHAPTER 3. RED-BLACK TREE, NOT SO COMPLEX AS ITWAS THOUGHT balance::Color ! RBTree a ! a ! RBTree a ! RBTree a balance B (Node R (Node R a x b) y c) z d = Node R (Node B a x b) y (Node B c z d) balance B (Node R a x (Node R b y c)) z d = Node R (Node B a x b) y (Node B c z d) balance B a x (Node R b y (Node R c z d)) = Node R (Node B a x b) y (Node B c z d) balance B a x (Node R (Node R b y c) z d) = Node R (Node B a x b) y (Node B c z d) balance color l k r = Node color l k r Note that the 'balance' function is changed a bit from the original de