# Sorting2

Data Structures Sorting 2



4. Heap: array implementation
    - (Figure: a min heap stored level by level in an array as [1, 3, 2, 7, 5, 6, 4, 8, 10, 9].)
    - Is it a good idea to store arbitrary binary trees as arrays? They may have many empty spaces!
5. Array implementation
    - (Figure: a six-element heap and its array layout in slots 1..6.)
    - The root node is A[1].
    - The left child of A[j] is A[2j].
    - The right child of A[j] is A[2j + 1].
    - The parent of A[j] is A[j/2] (note: integer divide).
    - Need to estimate the maximum size of the heap.
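
The index arithmetic above can be checked with a tiny sketch (Python; slot 0 is left unused so the array is effectively 1-based, matching the slide's formulas):

```python
# 1-based array heap: slot 0 is unused so the slide's formulas apply directly.
def left(j):
    return 2 * j          # left child of A[j]

def right(j):
    return 2 * j + 1      # right child of A[j]

def parent(j):
    return j // 2         # integer divide, as the slide notes

# A min heap stored from index 1: every parent is <= its children.
heap = [None, 1, 2, 3, 5, 6, 4]
```

For instance, `heap[parent(6)]` is 3, the parent of the value 4 stored at index 6.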
6. Heapsort
    1. Build a binary heap of N elements: the minimum element is at the top of the heap.
    2. Perform N DeleteMin operations: the elements are extracted in sorted order.
    3. Record these elements in a second array and then copy the array back.
7. Heapsort: running time analysis
    1. Build a binary heap of N elements: repeatedly inserting N elements takes O(N log N) time (there is a more efficient O(N) way).
    2. Perform N DeleteMin operations: each DeleteMin takes O(log N), so O(N log N) in total.
    3. Record these elements in a second array and then copy the array back: O(N).
    - Total: O(N log N).
    - Uses an extra array.
8. Heapsort: no extra storage
    - After each deleteMin, the size of the heap shrinks by 1.
    - We can use the last cell just freed up to store the element that was just deleted.
    - After the last deleteMin, the array will contain the elements in decreasing sorted order.
    - To sort the elements in decreasing order, use a min heap.
    - To sort the elements in increasing order, use a max heap (the parent holds a larger element than the child).
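
A minimal sketch of this in-place scheme in Python (0-based indexing here, so the children of A[i] are A[2i + 1] and A[2i + 2]; `sift_down` is an illustrative helper name, not from the slides):

```python
def sift_down(a, i, n):
    # Restore the max-heap property for a[i] within a[0..n-1].
    while 2 * i + 1 < n:
        c = 2 * i + 1                       # left child (0-based)
        if c + 1 < n and a[c + 1] > a[c]:   # pick the larger child
            c += 1
        if a[i] >= a[c]:
            break
        a[i], a[c] = a[c], a[i]
        i = c

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):     # build a max heap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]         # deleteMax into the freed last cell
        sift_down(a, 0, end)                # heap now occupies a[0..end-1]
```

Because a max heap is used, each deleted maximum lands in the cell just freed, so the array ends up sorted in increasing order with no extra storage.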
9. Heapsort
    - Sort in increasing order: use a max heap.
    - (Figure: Delete 97.)
10. Heapsort: a complete example
    - (Figure: Delete 16, then Delete 14.)
11. Example (cont'd)
    - (Figure: Delete 10, Delete 9, Delete 8.)
12. Example (cont'd)
    - (Figure: remaining deletions.)
13. Lower bound for sorting, radix sort
14. Lower Bound for Sorting
    - Mergesort and heapsort: worst-case running time is O(N log N).
    - Are there better algorithms?
    - Goal: prove that any sorting algorithm based only on comparisons takes Ω(N log N) comparisons in the worst case (worst-case input) to sort N elements.
15. Lower Bound for Sorting
    - Suppose we want to sort N distinct elements.
    - How many possible orderings do we have for N elements?
    - We can have N! possible orderings (e.g., the sorted output for a, b, c can be a b c, b a c, a c b, c a b, c b a, or b c a).
16. Lower Bound for Sorting
    - Any comparison-based sorting process can be represented as a binary decision tree.
    - Each node represents a set of possible orderings, consistent with all the comparisons that have been made.
    - The tree edges are the results of the comparisons.
17. (Figure: decision tree for an Algorithm X sorting three elements a, b, c.)
18. Lower Bound for Sorting
    - A different algorithm would have a different decision tree.
    - (Figure: decision tree for Insertion Sort on 3 elements.)
19. Lower Bound for Sorting
    - The worst-case number of comparisons used by the sorting algorithm equals the depth of the deepest leaf.
    - The average number of comparisons used equals the average depth of the leaves.
    - A decision tree to sort N elements must have N! leaves.
    - A binary tree of depth d has at most 2^d leaves, so the tree must have depth at least log2(N!).
    - Therefore, any sorting algorithm based only on comparisons between elements requires at least ⌈log2(N!)⌉ comparisons in the worst case.
20. Lower Bound for Sorting
    - Any sorting algorithm based on comparisons between elements requires Ω(N log N) comparisons.
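
The log2(N!) bound above is itself Ω(N log N); one way to see it is to keep only the larger half of the factorial's factors:

```latex
\log_2(N!) = \sum_{i=1}^{N} \log_2 i
          \ge \sum_{i=\lceil N/2 \rceil}^{N} \log_2 i
          \ge \frac{N}{2} \log_2 \frac{N}{2}
           = \Omega(N \log N)
```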
21. Linear time sorting
    - Can we do better (a linear time algorithm) if the input has special structure (e.g., uniformly distributed, or every number can be represented by d digits)? Yes.
    - Counting sort, radix sort.
22. Counting Sort
    - Assume N integers to be sorted, each in the range 1 to M.
    - Define an array B[1..M] and initialize all entries to 0: O(M).
    - Scan through the input list A[i], inserting A[i] into B[A[i]]: O(N).
    - Scan B once, reading out the nonzero integers: O(M).
    - Total time: O(M + N); if M is O(N), the total time is O(N).
    - Can be bad if the range is very big, e.g. M = O(N^2).
    - Example: N = 7, M = 9; sort 8 1 9 5 2 6 3. Output: 1 2 3 5 6 8 9.
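
A minimal Python sketch of this scheme (assuming distinct values in 1..M, as on the slide; the function name is my own):

```python
def counting_sort_distinct(A, M):
    # B[v] records whether value v occurs in A (values distinct, in 1..M).
    B = [False] * (M + 1)                        # O(M) initialization
    for x in A:                                  # O(N) scan of the input
        B[x] = True
    return [v for v in range(1, M + 1) if B[v]]  # O(M) read-out
```

On the slide's example this returns [1, 2, 3, 5, 6, 8, 9].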
23. Counting sort
    - What if we have duplicates? Make B an array of pointers.
    - Each position in the array has 2 pointers, head and tail: tail points to the end of a linked list, and head points to the beginning.
    - A[j] is inserted at the end of the list B[A[j]].
    - Again, array B is sequentially traversed and each nonempty list is printed out.
    - Time: O(M + N).
24. Counting sort
    - Example: M = 9; sort 8 5 1 5 9 5 6 2 7. Output: 1 2 5 5 5 6 7 8 9.
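
A sketch of the duplicate-friendly version in Python, with a list append standing in for the tail pointer (the function name is my own):

```python
def counting_sort(A, M):
    # B[v] is a FIFO list of all occurrences of value v, so equal keys
    # keep their input order (the sort is stable).
    B = [[] for _ in range(M + 1)]
    for x in A:
        B[x].append(x)         # insert at the end of the list B[A[j]]
    out = []
    for bucket in B:           # traverse B sequentially
        out.extend(bucket)     # print out each nonempty list
    return out                 # total time O(M + N)
```

Stability does not matter for plain integers, but it is exactly the property radix sort will rely on below.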
25. Radix Sort
    - Extra information: every integer can be represented by at most k digits, d1 d2 ... dk, where the di are digits in base r.
    - d1: most significant digit; dk: least significant digit.
26. Radix Sort: algorithm
    - Sort by the least significant digit first (counting sort): numbers with the same digit go to the same bin.
    - Reorder all the numbers: the numbers in bin 0 precede the numbers in bin 1, which precede the numbers in bin 2, and so on.
    - Sort by the next least significant digit.
    - Continue this process until the numbers have been sorted on all k digits.
27. Radix Sort
    - Least-significant-digit-first.
    - Example: 275, 087, 426, 061, 509, 170, 677, 503.
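
A sketch of the whole procedure in Python, runnable on the slide's example (k = 3 digits, base r = 10; the names are my own):

```python
def radix_sort(A, k, r=10):
    # k passes of a stable bucket sort, least significant digit first.
    for d in range(k):
        buckets = [[] for _ in range(r)]
        for x in A:
            buckets[(x // r ** d) % r].append(x)  # digit d from the right
        A = [x for b in buckets for x in b]       # bin 0, then bin 1, ...
    return A
```

Each pass appends in scan order, so the pass is stable; that is what makes the earlier (less significant) passes survive as tie-breakers.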
29. Radix Sort
    - Does it work?
    - Clearly, if the most significant digits of a and b differ and a < b, then a finally comes before b.
    - If the most significant digits of a and b are the same, and the second most significant digit of b is less than that of a, then b comes before a: each pass is stable, so among keys tied on the current digit, the order produced by the earlier (less significant) passes is preserved.
30. Radix Sort
    - Example 2: sorting cards.
    - 2 digits for each card: d1 d2.
    - d1 = suit (clubs, diamonds, hearts, spades): base 4.
    - d2 = A, 2, 3, ..., J, Q, K: base 13.
31. (Code slide: a radix sort implementation; only the comments survive: "// base 10", "// FIFO", "// d times of counting sort", "// scan A[i], put into correct slot", "// re-order back to original array".)
32. Radix Sort: running time
    - Increasing the base r decreases the number of passes.
    - k passes over the numbers (i.e., k counting sorts, each with range 0..r).
    - Each pass takes O(N + r).
    - Total: O(Nk + rk); if r and k are constants, this is O(N).
33. Quicksort
34. Introduction
    - Fastest known sorting algorithm in practice.
    - Average case: O(N log N); worst case: O(N^2), but the worst case seldom happens.
    - Another divide-and-conquer recursive algorithm, like mergesort.
35. Quicksort
    - Divide step: pick any element (pivot) v in S, and partition S - {v} into two disjoint groups S1 = {x in S - {v} | x <= v} and S2 = {x in S - {v} | x >= v}.
    - Conquer step: recursively sort S1 and S2.
    - Combine step: the sorted S1, followed by v, followed by the sorted S2.
36. Example: Quicksort (figure).
37. Example: Quicksort, continued (figure).
38. Pseudocode
    - Input: an array A[p..r].

          Quicksort(A, p, r) {
              if (p < r) {
                  q = Partition(A, p, r)   // q is the position of the pivot element
                  Quicksort(A, p, q-1)
                  Quicksort(A, q+1, r)
              }
          }
39. Partitioning
    - The key step of the quicksort algorithm.
    - Goal: given the picked pivot, partition the remaining elements into two smaller sets.
    - There are many ways to implement it, and even the slightest deviations may cause surprisingly bad results.
    - We will learn an easy and efficient partitioning strategy here; how to pick a pivot will be discussed later.
40. Partitioning Strategy
    - Want to partition an array A[left .. right].
    - First, get the pivot element out of the way by swapping it with the last element (swap the pivot and A[right]).
    - Let i start at the first element and j start at the next-to-last element (i = left, j = right - 1).
    - (Figure: pivot 6 swapped to the end of the example array; i and j at the two ends.)
41. Partitioning Strategy
    - Want to have A[p] <= pivot for p < i, and A[p] >= pivot for p > j.
    - While i < j: move i right, skipping over elements smaller than the pivot; move j left, skipping over elements greater than the pivot.
    - When both i and j have stopped: A[i] >= pivot and A[j] <= pivot.
42. Partitioning Strategy
    - When i and j have stopped and i is to the left of j, swap A[i] and A[j]: the large element is pushed to the right and the small element is pushed to the left.
    - After swapping: A[i] <= pivot and A[j] >= pivot.
    - Repeat the process until i and j cross.
43. Partitioning Strategy
    - When i and j have crossed, swap A[i] and the pivot.
    - Result: A[p] <= pivot for p < i, and A[p] >= pivot for p > i.
44. Small arrays
    - For very small arrays, quicksort does not perform as well as insertion sort.
    - How small depends on many factors, such as the time spent making a recursive call, the compiler, etc.
    - Do not use quicksort recursively for small arrays; instead, use a sorting algorithm that is efficient for small arrays, such as insertion sort.
45. Picking the Pivot
    - Use the first element as pivot: if the input is random, ok; but if the input is presorted (or in reverse order), all the elements go into S2 (or S1), and this happens consistently throughout the recursive calls, resulting in O(N^2) behavior (we analyze this case later).
    - Choose the pivot randomly: generally safe, but random number generation can be expensive.
46. Picking the Pivot
    - Use the median of the array: partitioning always cuts the array roughly in half, giving an optimal quicksort (O(N log N)).
    - However, the exact median is hard to find (e.g., sorting the array to pick the value in the middle).
47. Pivot: median of three
    - We will use the median of three: compare just three elements, the leftmost, the rightmost, and the center.
    - Swap these elements if necessary so that A[left] = smallest, A[right] = largest, and A[center] = median of the three.
    - Pick A[center] as the pivot.
    - Swap A[center] and A[right - 1] so that the pivot is at the second-to-last position (why?).
48. Pivot: median of three (example)
    - A[left] = 2, A[center] = 13, A[right] = 6.
    - Swap A[center] and A[right]; choose A[center] as the pivot; swap the pivot and A[right - 1].
    - Note we only need to partition A[left + 1, ..., right - 2]. Why?
49. Main Quicksort Routine (code figure): choose the pivot, partition, recurse, and handle small arrays.
50. Partitioning Part
    - Works only if the pivot is picked as the median of three, so that A[left] <= pivot and A[right] >= pivot.
    - Thus, we only need to partition A[left + 1, ..., right - 2].
    - j will not run past the left end, because A[left] <= pivot.
    - i will not run past the right end, because A[right - 1] = pivot.
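
Assembling the pieces (median-of-three, the i/j scan, the small-array cutoff) gives the following sketch; the cutoff value 10 and the helper names are illustrative choices, not from the slides:

```python
CUTOFF = 10  # below this size, fall back to insertion sort

def insertion_sort(a, left, right):
    # Sort a[left..right] in place.
    for i in range(left + 1, right + 1):
        x, j = a[i], i - 1
        while j >= left and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x

def median3(a, left, right):
    # Order a[left] <= a[center] <= a[right], then hide the pivot at right - 1.
    center = (left + right) // 2
    if a[center] < a[left]:
        a[left], a[center] = a[center], a[left]
    if a[right] < a[left]:
        a[left], a[right] = a[right], a[left]
    if a[right] < a[center]:
        a[center], a[right] = a[right], a[center]
    a[center], a[right - 1] = a[right - 1], a[center]
    return a[right - 1]

def quicksort(a, left=0, right=None):
    if right is None:
        right = len(a) - 1
    if right - left + 1 <= CUTOFF:
        insertion_sort(a, left, right)
        return
    pivot = median3(a, left, right)
    i, j = left, right - 1
    while True:
        i += 1
        while a[i] < pivot:      # cannot run past right - 1: a[right-1] = pivot
            i += 1
        j -= 1
        while a[j] > pivot:      # cannot run past left: a[left] <= pivot
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            break
    a[i], a[right - 1] = a[right - 1], a[i]  # restore the pivot into place
    quicksort(a, left, i - 1)
    quicksort(a, i + 1, right)
```

Note how median3 leaves A[left] <= pivot and A[right - 1] = pivot, which is exactly what keeps the two inner scans from running off the ends.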
51. Quicksort Faster than Mergesort
    - Both quicksort and mergesort take O(N log N) in the average case; why is quicksort faster?
    - Quicksort's inner loop consists of an increment/decrement (by 1, which is fast), a test, and a jump.
    - There is no extra juggling as in mergesort.
52. Analysis
    - Assumptions: a random pivot (no median-of-three partitioning) and no cutoff for small arrays.
    - Running time: pivot selection is constant time, O(1); partitioning is linear time, O(N); plus the running time of the two recursive calls.
    - T(N) = T(i) + T(N - i - 1) + cN, where c is a constant and i is the number of elements in S1.
53. Worst-Case Analysis
    - What will be the worst case? The pivot is the smallest element, all the time.
    - The partition is always unbalanced.
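
Plugging i = 0 into the recurrence T(N) = T(i) + T(N - i - 1) + cN and expanding:

```latex
T(N) = T(N-1) + cN
     = T(N-2) + c(N-1) + cN
     = \cdots
     = T(1) + c\sum_{i=2}^{N} i
     = O(N^2)
```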
54. Best-Case Analysis
    - What will be the best case? The partition is perfectly balanced.
    - The pivot is always in the middle (the median of the array).
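
With a perfect split, i = N/2, so T(N) = 2T(N/2) + cN; dividing through by N telescopes:

```latex
\frac{T(N)}{N} = \frac{T(N/2)}{N/2} + c
             = \frac{T(N/4)}{N/4} + 2c
             = \cdots
             = \frac{T(1)}{1} + c\log_2 N
\quad\Longrightarrow\quad
T(N) = N\,T(1) + cN\log_2 N = O(N\log N)
```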
55. Average-Case Analysis
    - Assume each of the possible sizes for S1 is equally likely; this assumption is valid for our pivoting (median-of-three) and partitioning strategy.
    - On average, the running time is O(N log N) (covered in comp271).
56. Topological Sort
    - Topological sort is an algorithm for a directed acyclic graph.
    - It can be thought of as a way to linearly order the vertices so that the linear order respects the ordering relations implied by the arcs.
    - For example: 0, 1, 2, 5, 9; 0, 4, 5, 9; 0, 6, 3, 7 ?
    - (Figure: a DAG on vertices 0..9.)
57. Topological Sort
    - Idea: a starting point must have zero indegree! If no such vertex exists, the graph would not be acyclic.
    - A vertex with zero indegree is a task that can start right away, so we can output it first in the linear order.
    - If a vertex i is output, then its outgoing arcs (i, j) are no longer useful, since task j does not need to wait for i anymore; so remove all of i's outgoing arcs.
    - With vertex i removed, the new graph is still a directed acyclic graph, so repeat steps 1-2 until no vertex is left.
58. Topological Sort (code figure): find all starting points and enqueue them; for each dequeued vertex, reduce indegree(w) of each neighbor w and place new start vertices on the queue Q.
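
A sketch of the queue-based algorithm in Python (often called Kahn's algorithm); the adjacency list below is reconstructed from the worked example that follows, so treat it as an assumption:

```python
from collections import deque

def topological_sort(n, adj):
    # Count incoming arcs for every vertex.
    indegree = [0] * n
    for u in adj:
        for v in adj[u]:
            indegree[v] += 1
    q = deque(v for v in range(n) if indegree[v] == 0)  # all starting points
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj.get(u, []):   # remove u's outgoing arcs
            indegree[v] -= 1
            if indegree[v] == 0:   # v becomes a new start point
                q.append(v)
    return order                   # len(order) < n would indicate a cycle

# Graph reconstructed from the example slides (vertices 0..9).
adj = {0: [6, 1, 4], 1: [2], 2: [5, 7], 3: [7], 4: [5],
       5: [9], 6: [3, 7], 7: [8], 8: [9]}
```

Run on this graph, the FIFO queue reproduces the trace in the following slides, ending with the output 0 6 1 4 3 2 5 7 8 9.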
59. Example: start. Q = { 0 }. OUTPUT: 0
60. Example: dequeue 0, Q = { }; remove 0's arcs and decrement the indegrees of 0's neighbors.
61. Example: dequeue 0, Q = { 6, 1, 4 }; enqueue all new starting points. OUTPUT: 0
62. Example: dequeue 6, Q = { 1, 4 }; remove 6's arcs and adjust its neighbors' indegrees. OUTPUT: 0 6
63. Example: dequeue 6, Q = { 1, 4, 3 }; enqueue the new start 3. OUTPUT: 0 6
64. Example: dequeue 1, Q = { 4, 3 }; adjust the indegrees of 1's neighbors. OUTPUT: 0 6 1
65. Example: dequeue 1, Q = { 4, 3, 2 }; enqueue the new start 2. OUTPUT: 0 6 1
66. Example: dequeue 4, Q = { 3, 2 }; adjust the indegrees of 4's neighbors. OUTPUT: 0 6 1 4
67. Example: dequeue 4, Q = { 3, 2 }; no new start points found. OUTPUT: 0 6 1 4
68. Example: dequeue 3, Q = { 2 }; adjust 3's neighbors. OUTPUT: 0 6 1 4 3
69. Example: dequeue 3, Q = { 2 }; no new start points found. OUTPUT: 0 6 1 4 3
70. Example: dequeue 2, Q = { }; adjust 2's neighbors. OUTPUT: 0 6 1 4 3 2
71. Example: dequeue 2, Q = { 5, 7 }; enqueue 5 and 7. OUTPUT: 0 6 1 4 3 2
73. Example: dequeue 5, Q = { 7 }; adjust 5's neighbors. OUTPUT: 0 6 1 4 3 2 5
74. Example: dequeue 5, Q = { 7 }; no new starts. OUTPUT: 0 6 1 4 3 2 5
75. Example: dequeue 7, Q = { }; adjust 7's neighbors. OUTPUT: 0 6 1 4 3 2 5 7
76. Example: dequeue 7, Q = { 8 }; enqueue 8. OUTPUT: 0 6 1 4 3 2 5 7
77. Example: dequeue 8, Q = { }; adjust the indegrees of 8's neighbors. OUTPUT: 0 6 1 4 3 2 5 7 8
78. Example: dequeue 8, Q = { 9 }; enqueue 9. Dequeue 9, Q = { }; stop, 9 has no neighbors. OUTPUT: 0 6 1 4 3 2 5 7 8 9
79. Example: final OUTPUT: 0 6 1 4 3 2 5 7 8 9. Is the output topologically correct?
80. Topological Sort: Complexity
    - We never visit a vertex more than once.
    - For each vertex, we examine all of its outgoing edges: Σ outdegree(v) = m, where the sum is over all vertices (not per vertex).
    - So, the running time is O(n + m).