5. Hash tables
Keys and values
O(1) average-case lookup
Hash function
– Trade-off: good (spreads keys uniformly) vs. fast (cheap to compute)
Clustering: a poor hash function piles keys into the same buckets
Databases
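A minimal Python sketch of a chained hash table (class and method names are mine; Python's built-in hash() stands in for the hash function):

class HashTable:
    def __init__(self, n_buckets=64):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # the hash function maps any key to one of the buckets
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key):
        # O(1) on average; degrades toward O(n) if keys cluster in one bucket
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)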
6. Selection sort :-(
O(n²)
Algorithm (sketch below):
– Find the minimum value in the unsorted portion
– Swap it with the value in the 1st position
– Repeat from the 2nd position down
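A minimal in-place Python sketch (function name is mine):

def selection_sort(items):
    for i in range(len(items)):
        # find the minimum of the unsorted tail items[i:]
        min_idx = i
        for j in range(i + 1, len(items)):
            if items[j] < items[min_idx]:
                min_idx = j
        # swap it into position i, growing the sorted prefix
        items[i], items[min_idx] = items[min_idx], items[i]
    return items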
7. Insertion sort :-)
O(n²)
O(1) space
Great for small numbers of elements
(becomes relevant later)
Algorithm (sketch below):
– Take the next element from the unsorted portion and insert it into place in the sorted portion
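A minimal in-place Python sketch (function name is mine):

def insertion_sort(items):
    for i in range(1, len(items)):
        current = items[i]  # next element from the unsorted portion
        j = i - 1
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]  # shift larger sorted elements right
            j -= 1
        items[j + 1] = current       # drop it into place
    return items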
8. Bubble sort :-(
O(n²)
Algorithm (sketch below):
– Iterate through the list, comparing each element with the next and swapping them if out of order
– Optimization: go one step fewer each pass (n-1, n-2, ...), since the largest element has already bubbled to the end
Moves big values quickly (they bubble straight to the end), small values slowly (one step per pass)
Totally useless?
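A minimal in-place Python sketch, including the shrinking-pass optimization from above (function name is mine):

def bubble_sort(items):
    n = len(items)
    for end in range(n - 1, 0, -1):  # each pass stops one step earlier
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:  # common extra tweak: a pass with no swaps means sorted
            break
    return items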
9. Merge sort :-)
O(n log n)
Requires O(n) extra space
Parallelizable
Algorithm (sketch below):
– Break the list into 2 sublists
– Sort each sublist recursively
– Merge the sorted sublists
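A minimal Python sketch (function name is mine); the two independent recursive calls are what makes it parallelizable, and the merged list is the O(n) extra space:

def merge_sort(items):
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])   # the two halves are independent,
    right = merge_sort(items[mid:])  # so they could be sorted in parallel
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:      # <= keeps the sort stable
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])          # one side may have leftovers
    merged.extend(right[j:])
    return merged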
10. Quick sort :-)
Average O(n log n), worst O(n²)
O(n) extra space (can be optimized to O(log n) with in-place partitioning)
Algorithm (sketch below):
– Pick a pivot
– Put all x < pivot in less, all x >= pivot in more
– Recurse on less and more, then concatenate less, pivot, and more
Advantages also come from cache locality and register use (the single pivot comparison can keep the pivot in a register)
Variations: use a fat pivot (group all elements equal to the pivot together)
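A minimal Python sketch of the out-of-place version described above (function name is mine):

def quick_sort(items):
    if len(items) <= 1:
        return items
    pivot = items[0]  # naive pivot choice; sorted input triggers the O(n²) worst case
    less = [x for x in items[1:] if x < pivot]
    more = [x for x in items[1:] if x >= pivot]
    # "fat pivot" variant: collect x == pivot into a third list instead
    return quick_sort(less) + [pivot] + quick_sort(more)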
13. Trees
Almost like linked lists!
Traverse: pre-order vs. post-order vs. in-order (sketch below)
Node, edge, sibling/parent/child, leaf
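A minimal Python sketch of the three traversal orders on a binary node (class and function names are mine):

class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def pre_order(node):   # node first, then children
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)

def in_order(node):    # left child, node, right child
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)

def post_order(node):  # children first, then node
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]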
14. Binary trees
0, 1, or 2 children per node
Binary Search Tree: a binary tree where every value under node.left_child < node.value and every value under node.right_child >= node.value
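A minimal Python sketch of BST insert and search, reusing the Node class from the traversal sketch above (function names are mine):

def bst_insert(node, value):
    if node is None:
        return Node(value)
    if value < node.value:
        node.left = bst_insert(node.left, value)
    else:  # duplicates follow the >= rule and go right
        node.right = bst_insert(node.right, value)
    return node

def bst_search(node, value):
    while node is not None:
        if value == node.value:
            return True
        node = node.left if value < node.value else node.right
    return False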
15. Balanced binary trees
Minimizes the depth (number of levels) of nodes
Compare with a degenerate ("bad") binary tree, which decays into a linked list and makes lookup O(n)
Advantages:
– Lookup, insertion, removal: O(log n)
Disadvantages:
– Overhead to maintain balance
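The balancing overhead boils down to rotations; a minimal Python sketch of a single left rotation (function name is mine; Node as in the BST sketch above):

def rotate_left(node):
    # pull the right child up over a right-heavy node;
    # the in-order sequence, and hence the BST property, is preserved
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot  # new root of this subtree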
16. Heaps (binary)
Complete: all leaves are at depth n or n-1, filled toward the left
Max-heap: node.value >= child.value (reversed for a min-heap)
In a binary min/max heap (sketch below):
– Insert = O(log n) .. add to bottom, bubble up
– deleteMax = O(log n) .. move last element to root and bubble down
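A minimal array-backed max-heap sketch in Python (function names are mine; the standard library's heapq provides the min-heap version of this):

def heap_insert(heap, value):
    heap.append(value)  # add to bottom
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]  # bubble up
        i = (i - 1) // 2

def heap_delete_max(heap):
    heap[0], heap[-1] = heap[-1], heap[0]  # move last element to the root
    top = heap.pop()                       # the old root, i.e. the max
    i = 0
    while True:                            # bubble down
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(heap) and heap[child] > heap[largest]:
                largest = child
        if largest == i:
            return top
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest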
18. Why bother?
Tries (pronounced "trees"; sketch below)
– Position determines the key
– Great for lots of short words
– Prefix matching
But..
– Long strings mean long chains of nodes..
– Comparatively complex algorithms
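A minimal dict-based trie sketch in Python (function names and the "$" end-of-word marker are mine):

def trie_insert(root, word):
    node = root
    for ch in word:
        node = node.setdefault(ch, {})  # the character's position determines the key
    node["$"] = True                    # mark that a complete word ends here

def trie_has_prefix(root, prefix):
    node = root
    for ch in prefix:
        if ch not in node:
            return False
        node = node[ch]
    return True                         # prefix matching is just a walk down

Usage: after trie = {}; trie_insert(trie, "car"); trie_insert(trie, "cat"), the call trie_has_prefix(trie, "ca") returns True.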